“We basically have passed the point of doing real fundamental LLM research,” Castricato said. “Now it’s just applications.”
The researcher quit his doctoral studies at Brown University and started a new company, called Overworld. Its ambition is in its name: AI that can understand and navigate a world, not just words.
At the heart of world model research is the idea that AI can’t be truly intelligent if it can only read a book. It also needs to read the room.
“Where language models learn the statistical structure of text, world models learn the statistical structure of space and time: how light falls on a surface, how a garden looks from an angle no camera has captured, how objects respond to force and follow the laws of physics,” wrote Li, founder of the San Francisco startup World Labs, in an essay published this month.
“World model is quickly becoming a buzzword,” LeCun said on a recent “Unsupervised Learning” podcast. He said he views it as something that enables an AI agent “to predict the consequences of its own actions.”
Chatbots can’t pick up a coffee mug, notes Martial Hebert, dean of computer science at Carnegie Mellon University.
“There’s all the geometry of the world, the dynamic of how I move my hand, the physical interaction of the contact with the cup,” Hebert said. “This is much more complex than just predicting the next word in a sentence.”
For scientists like Hebert, who has spent more than four decades researching robotics, the most useful application for world models is as a faster and cheaper path to “physical AI” — another tech industry buzzword.
“Some people may have different definitions, but physical and embodied AI are kind of the evolution of what we used to call robotics,” Hebert said in an interview. Some of the AI advances that have made chatbots so useful can also be applied to building AI with a broad enough awareness of its environment to work like a robot’s brain, he said.
“In your body and spinal cord you have a very general model of how to balance, how to walk around, and you can adapt to your knee hurting in the morning, so you now walk a little differently,” he said. “You don’t need to think about that. You have a general model somewhere in your nervous system and brain that allows your body to adapt very quickly.”
Smarter robots aren’t the only endgame for world models. Castricato started Overworld last year, and the tiny Rhode Island-based startup is now building video game worlds where a scene, such as a spooky forest, can adapt as a virtual character moves through it and interacts with the objects in it.
“There’s no other world model where you can just walk through doors or where you can interact with a detailed environment like this,” he said in an interview. “We optimize for interaction above anything else.”
While the near-term applications aren’t as readily apparent as AI coding tools, world model makers are attracting interest from venture capitalists like Steve Jang, co-founder and managing partner at Kindred Ventures.
The firm is investing in Overworld and other world model-focused companies, including Causal Labs, which is building AI models for weather prediction, and Extropic, which is building specialized computer chips suited to world models.
“I think that the future is many different types of models with many different philosophies and architectures,” Jang said. “I don’t think that it’ll be one large, dense model to rule them all.”
In her recent essay, Li sought to create a “taxonomy of world models” to help sort out the confusion about the competing visions.
“A video model that produces gorgeous but physically impossible flames, a language model improvising a playable game, and a physics engine that faithfully simulates combustion all go by the same name,” she wrote.
She divided world models into three categories. The most commercially viable today are “renderers” that prioritize the visual fidelity of the virtual worlds they create but can’t be trusted to teach robots much.
Then there are “simulators,” which create virtual training grounds that faithfully represent the physical structure of a world, and “planners,” which try to predict what an AI agent or robot should do in an unstructured world.
“A robot that can plan is a robot that can work, and the entire industry is racing to be the one that gets there first,” she wrote.



