With AI image generators now close to photorealism, attention is shifting from the generation of pixels to the generation of spaces. Various developers are working on AI models that can generate explorable 3D worlds rather than merely flat images or video – the kind of assets that would normally require the use of 3D modelling software.
One of those is Echo. Its developer, SpAItial AI, says the model can generate a single, coherent and editable 3D space that you can move through freely. It reckons the tool has the potential to unlock 3D design, simulation, digital twins and game environment workflows starting from just a photo or a line of text.
Echo can turn both text prompts and images into explorable 3D worlds by predicting a geometry-grounded 3D scene at metric scale. What makes it different to some of the other attempts at 3D world generators is that every new view, depth map and interaction comes from the same underlying world, not independent “hallucinations”, the developer says.
Once generated, the world is interactive in real time. Users can apparently control the camera, explore from any angle and render instantly, even on low-end hardware, directly in the browser.
“High-quality 3D world exploration is no longer gated by expensive equipment. Under the hood, Echo infers a physically grounded 3D representation and converts it into a renderable format,” SpAItial AI says.
Echo also enables scene editing and restyling without breaking the generated world. The demo video shows the style of a 3D space transforming to 'Frozen', 'Rococo' and 'Cyber Rustic' looks while retaining the general layout.
Users can also change materials, remove or add objects, and explore design variations, all while preserving global 3D consistency.
For the web demo, SpAItial used 3D Gaussian Splatting for fast rendering, but it says the representation is flexible and can easily be adapted.
SpAItial says upcoming versions will expand Echo's capabilities to permit full prompt-based scene manipulation, allowing users to add, remove, rearrange or restyle objects. Further ahead, the plan is to add dynamics and physical reasoning over the underlying representation.
This will enable scenes that feature physics-based behavior, opening the door to interactive simulations, robotics testing, and richer digital twin applications, the developer says.

Joe is a regular freelance journalist and editor at Creative Bloq. He writes news, features and buying guides and keeps track of the best equipment and software for creatives, from video editing programs to monitors and accessories. A veteran news writer and photographer, he now works as a project manager at the London and Buenos Aires-based design, production and branding agency Hermana Creatives. There he manages a team of designers, photographers and video editors who specialise in producing visual content and design assets for the hospitality sector. He also dances Argentine tango.
