In 1950, when computing was little more than automated arithmetic and simple logic, Alan Turing asked a question that still reverberates today: can machines think? It took remarkable imagination to see what he saw, which is that intelligence might someday be built rather than born. That insight later launched a relentless scientific quest called artificial intelligence. Twenty-five years into my own career in AI, I still find myself inspired by Turing’s vision. But how close are we? The answer isn’t simple.
Today, leading AI technologies such as large language models (LLMs) have begun to transform how we access and work with abstract knowledge. Yet they remain wordsmiths in the dark, eloquent but inexperienced, knowledgeable but ungrounded.
For humans, spatial intelligence is the scaffolding upon which our cognition is built. It’s at work when we passively observe or actively seek to create. It drives our reasoning and planning, even on the most abstract topics. And it’s essential to the way we interact—verbally or physically, with our peers or with the environment itself. When machines are endowed with this ability, it will transform how we create and interact with real and virtual worlds—revolutionizing storytelling, robotics, scientific discovery, and beyond. This is AI’s next frontier, and why 2025 was such a pivotal year.
The candid truth is that AI’s spatial capabilities remain far from the human level. But tremendous progress has indeed been made. Multimodal LLMs, trained with voluminous multimedia data in addition to textual data, have introduced some basics of spatial awareness, and today’s AI can analyze pictures, answer questions about them, and generate hyperrealistic images and short videos.
But there is much more to be done. Building spatially intelligent AI requires something even more ambitious than LLMs: world models, a new type of generative model that can understand, reason about, generate, and interact with semantically, physically, geometrically, and dynamically complex worlds, whether virtual or real. These capabilities are far beyond the reach of today’s LLMs.
This technology is still nascent, yet exciting progress is underway. The applications of spatial intelligence span varying timelines. Creative tools are emerging now—World Labs’ Marble already puts these capabilities in creators’ and storytellers’ hands. Robotics represents an ambitious mid-term horizon as we refine the loop between perception and action. The most transformative scientific applications will take longer but promise a profound impact on human flourishing.
For the first time in history, we’re poised to build machines that we can rely on as true partners in the greatest challenges we face, whether accelerating how we understand diseases in the lab or supporting us in our most vulnerable moments of sickness, injury, or age. We’re on the cusp of technology that elevates the aspects of life we care about most. This is a vision of deeper, richer, more empowered lives. Almost a half billion years after nature unleashed the first glimmers of spatial intelligence in ancestral animals, we’re lucky enough to find ourselves among the generation of technologists who may soon endow machines with the same capability, and privileged enough to harness those capabilities for the benefit of people everywhere.
Adapted from Li’s essay “From Words to Worlds: Spatial Intelligence is AI’s Next Frontier.”