world models

meta’s-star-ai-scientist-yann-lecun-plans-to-leave-for-own-startup

Meta’s star AI scientist Yann LeCun plans to leave for own startup

A different approach to AI

LeCun founded Meta’s Fundamental AI Research lab, known as FAIR, in 2013 and has served as the company’s chief AI scientist ever since. He is one of three researchers who won the 2018 Turing Award for pioneering work on deep learning and convolutional neural networks. After leaving Meta, LeCun will remain a professor at New York University, where he has taught since 2003.

LeCun has previously argued that large language models like Llama that Zuckerberg has put at the center of his strategy are useful, but they will never be able to reason and plan like humans, increasingly appearing to contradict his boss’s grandiose AI vision for developing “superintelligence.”

For example, in May 2024, when an OpenAI researcher discussed the need to control ultra-intelligent AI, LeCun responded on X by writing that before urgently figuring out how to control AI systems much smarter than humans, researchers need to have the beginning of a hint of a design for a system smarter than a house cat.

Mark Zuckerberg once believed the “metaverse” was the future and renamed his company because of it. Credit: Facebook

Within FAIR, LeCun has instead focused on developing world models that can truly plan and reason. Over the past year, though, Meta’s AI research groups have seen growing tension and mass layoffs as Zuckerberg has shifted the company’s AI strategy away from long-term research and toward the rapid deployment of commercial products.

Over the summer, Zuckerberg hired Alexandr Wang to lead a new superintelligence team at Meta, paying $14.3 billion to hire the 28-year-old founder of data-labeling startup Scale AI and acquire a 49 percent interest in his company. LeCun, who had previously reported to Chief Product Officer Chris Cox, now reports to Wang, which seems like a sharp rebuke of LeCun’s approach to AI.

Zuckerberg also personally handpicked an exclusive team called TBD Lab to accelerate the development of the next iteration of large language models, luring staff from rivals such as OpenAI and Google with astonishingly large $100 to $250 million pay packages. As a result, Zuckerberg has come under growing pressure from Wall Street to show that his multibillion-dollar investment in becoming an AI leader will pay off and boost revenue. But if it turns out like his previous pivot to the metaverse, Zuckerberg’s latest bet could prove equally expensive and unfruitful.

Meta’s star AI scientist Yann LeCun plans to leave for own startup Read More »

big-ai-firms-pump-money-into-world-models-as-llm-advances-slow

Big AI firms pump money into world models as LLM advances slow

Runway, a video generation start-up that has deals with Hollywood studios, including Lionsgate, launched a product last month that uses world models to create gaming settings, with personalized stories and characters generated in real time.

“Traditional video methods [are a] brute-force approach to pixel generation, where you’re trying to squeeze motion in a couple of frames to create the illusion of movement, but the model actually doesn’t really know or reason about what’s going on in that scene,” said Cristóbal Valenzuela, chief executive officer at Runway.

Previous video-generation models had physics that were unlike the real world, he added, which general-purpose world model systems help to address.

To build these models, companies need to collect a huge amount of physical data about the world.

San Francisco-based Niantic has mapped 10 million locations, gathering information through games including Pokémon Go, which has 30 million monthly players interacting with a global map.

Niantic ran Pokémon Go for nine years and, even after the game was sold to US-based Scopely in June, its players still contribute anonymized data through scans of public landmarks to help build its world model.

“We have a running start at the problem,” said John Hanke, chief executive of Niantic Spatial, as the company is now called following the Scopely deal.

Both Niantic and Nvidia are working on filling gaps by getting their world models to generate or predict environments. Nvidia’s Omniverse platform creates and runs such simulations, assisting the $4.3 trillion tech giant’s push toward robotics and building on its long history of simulating real-world environments in video games.

Nvidia Chief Executive Jensen Huang has asserted that the next major growth phase for the company will come with “physical AI,” with the new models revolutionizing the field of robotics.

Some such as Meta’s LeCun have said this vision of a new generation of AI systems powering machines with human-level intelligence could take 10 years to achieve.

But the potential scope of the cutting-edge technology is extensive, according to AI experts. World models “open up the opportunity to service all of these other industries and amplify the same thing that computers did for knowledge work,” said Nvidia’s Lebaredian.

Additional reporting by Melissa Heikkilä in London and Michael Acton in San Francisco.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

Big AI firms pump money into world models as LLM advances slow Read More »

new-ai-model-turns-photos-into-explorable-3d-worlds,-with-caveats

New AI model turns photos into explorable 3D worlds, with caveats

Training with automated data pipeline

Voyager builds on Tencent’s earlier HunyuanWorld 1.0, released in July. Voyager is also part of Tencent’s broader “Hunyuan” ecosystem, which includes the Hunyuan3D-2 model for text-to-3D generation and the previously covered HunyuanVideo for video synthesis.

To train Voyager, researchers developed software that automatically analyzes existing videos to process camera movements and calculate depth for every frame—eliminating the need for humans to manually label thousands of hours of footage. The system processed over 100,000 video clips from both real-world recordings and the aforementioned Unreal Engine renders.

A diagram of the Voyager world creation pipeline.

A diagram of the Voyager world creation pipeline. Credit: Tencent

The model demands serious computing power to run, requiring at least 60GB of GPU memory for 540p resolution, though Tencent recommends 80GB for better results. Tencent published the model weights on Hugging Face and included code that works with both single and multi-GPU setups.

The model comes with notable licensing restrictions. Like other Hunyuan models from Tencent, the license prohibits usage in the European Union, the United Kingdom, and South Korea. Additionally, commercial deployments serving over 100 million monthly active users require separate licensing from Tencent.

On the WorldScore benchmark developed by Stanford University researchers, Voyager reportedly achieved the highest overall score of 77.62, compared to 72.69 for WonderWorld and 62.15 for CogVideoX-I2V. The model reportedly excelled in object control (66.92), style consistency (84.89), and subjective quality (71.09), though it placed second in camera control (85.95) behind WonderWorld’s 92.98. WorldScore evaluates world generation approaches across multiple criteria, including 3D consistency and content alignment.

While these self-reported benchmark results seem promising, wider deployment still faces challenges due to the computational muscle involved. For developers needing faster processing, the system supports parallel inference across multiple GPUs using the xDiT framework. Running on eight GPUs delivers processing speeds 6.69 times faster than single-GPU setups.

Given the processing power required and the limitations in generating long, coherent “worlds,” it may be a while before we see real-time interactive experiences using a similar technique. But as we’ve seen so far with experiments like Google’s Genie, we’re potentially witnessing very early steps into a new interactive, generative art form.

New AI model turns photos into explorable 3D worlds, with caveats Read More »