Google DeepMind has revealed Genie 3, its latest foundation world model, which the AI lab says represents a crucial stepping stone on the path to artificial general intelligence, or human-like intelligence.

“Genie 3 is the first real-time interactive general purpose world model,” Shlomi Fruchter, a research director at DeepMind, said during a press briefing. “It goes beyond narrow world models that existed before. It’s not specific to any particular environment. It can generate both photo-realistic and imaginary worlds, and everything in between.”

Genie 3, which is still in research preview and not publicly available, builds on both its predecessor Genie 2 – which can generate new environments for agents – and DeepMind’s latest video generation model Veo 3 – which exhibits a deep understanding of physics. 

Image Credits: Google DeepMind

With a simple text prompt, Genie 3 can generate multiple minutes – up from 10 to 20 seconds in Genie 2 – of diverse, interactive, 3D environments at 24 frames per second with a resolution of 720p. The model also features “promptable world events,” or the ability to use a prompt to change the generated world.

Perhaps most importantly, Genie 3’s simulations stay physically consistent over time because the model can remember what it previously generated – an emergent capability that DeepMind researchers didn’t explicitly program into the model.

Fruchter said that while Genie 3 clearly has implications for educational experiences and new generative media like gaming or prototyping creative concepts, its real unlock will manifest in training agents for general purpose tasks, which he said is essential to reaching AGI. 

“We think world models are key on the path to AGI, specifically for embodied agents, where simulating real world scenarios is particularly challenging,” Jack Parker-Holder, a research scientist on DeepMind’s open-endedness team, said during a briefing.

Genie 3 is designed to solve that bottleneck. Like Veo, it doesn’t rely on a hard-coded physics engine. Instead, it teaches itself how the world works – how objects move, fall, and interact – by remembering what it has generated and reasoning over long time horizons. 

“The model is auto-regressive, meaning it generates one frame at a time,” Fruchter told TechCrunch in a separate interview. “It has to look back at what was generated before to decide what’s going to happen next. That’s a key part of the architecture.”

That memory creates consistency in its simulated worlds, and that consistency allows it to develop a kind of intuitive grasp of physics, similar to how humans understand that a glass teetering on the edge of a table is about to fall, or that they should duck to avoid a falling object.
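The frame-by-frame loop Fruchter describes can be illustrated with a toy sketch. This is not Genie 3's implementation, which DeepMind has not published in this article; the names `rollout`, `toy_next_frame`, and `context_len` are hypothetical, and a "frame" here is just a one-number position rather than an image. The sketch only shows the general auto-regressive pattern: each new frame is produced from a window of previously generated frames plus the latest user action.

```python
from collections import deque
from typing import Callable, List, Sequence

Frame = List[float]  # toy stand-in for an image: a single position value


def rollout(
    next_frame: Callable[[Sequence[Frame], float], Frame],
    first_frame: Frame,
    actions: Sequence[float],
    context_len: int = 4,  # hypothetical memory window, not a Genie 3 setting
) -> List[Frame]:
    """Generate frames one at a time, feeding earlier outputs back in."""
    history = deque([first_frame], maxlen=context_len)  # bounded memory
    frames = [first_frame]
    for action in actions:
        frame = next_frame(list(history), action)  # look back, then predict
        history.append(frame)  # the new frame becomes part of the context
        frames.append(frame)
    return frames


def toy_next_frame(history: Sequence[Frame], action: float) -> Frame:
    """Assumed toy dynamics: keep the velocity seen in past frames, add the action."""
    last = history[-1][0]
    prev = history[-2][0] if len(history) > 1 else last
    velocity = last - prev  # inferred purely from what was generated before
    return [last + velocity + action]


if __name__ == "__main__":
    # One push at the start; the object keeps moving because the model
    # "remembers" its earlier frames.
    print(rollout(toy_next_frame, [0.0], [1, 0, 0]))
    # prints [[0.0], [1.0], [2.0], [3.0]]
```

In this toy, the object keeps moving after a single push only because the predictor looks back at earlier frames, which is a loose analogy for how memory could support consistent motion.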

This ability to simulate coherent, physically plausible environments over time makes Genie 3 much more than a generative model. It becomes an ideal training ground for general-purpose agents. Not only can it generate endless, diverse worlds to explore, but it also has the potential to push agents to their limits – forcing them to adapt, struggle, and learn from their own experience in a way that mirrors how humans learn in the real world. 

Currently, the range of actions an agent can take is still limited. For example, the promptable world events allow for a wide range of environmental interventions, but they’re not necessarily performed by the agent itself. Similarly, it’s still difficult to accurately model complex interactions between multiple independent agents in a shared environment. Genie 3 can also only support a few minutes of continuous interaction, when hours would be necessary for proper training. 

Still, Genie 3 represents a compelling step forward in teaching agents to go beyond reacting to inputs so they can plan, explore, seek out uncertainty, and improve through trial and error – the kind of self-driven, embodied learning that’s key in moving towards general intelligence.

“We haven’t really had a Move 37 moment for embodied agents yet, where they can actually take novel actions in the real world,” Parker-Holder said, referring to the legendary moment in the 2016 game of Go between DeepMind’s AI agent AlphaGo and world champion Lee Sedol, in which AlphaGo played an unconventional and brilliant move that became symbolic of AI’s ability to discover new strategies beyond human understanding.

“But now, we can potentially usher in a new era,” he said. 


