AI

Burnout and Elon Musk’s politics spark exodus from senior xAI, Tesla staff


Not a fun place to work, apparently

Disillusionment with Musk’s activism, strategic pivots, and mass layoffs causes churn.

Elon Musk’s business empire has been hit by a wave of senior departures over the past year, as the billionaire’s relentless demands and political activism accelerate turnover among his top ranks.

Key members of Tesla’s US sales team, battery and power-train operations, and public affairs arm have recently departed, along with its chief information officer and core members of the Optimus robot and AI teams on which Musk has bet the company’s future.

Churn has been even more rapid at xAI, Musk’s two-year-old artificial intelligence start-up, which he merged with his social network X in March. Its chief financial officer and general counsel recently departed after short stints, within a week of each other.

The moves are part of an exodus from the conglomerate of the world’s richest man, as he juggles five companies from SpaceX to Tesla with more than 140,000 employees. The Financial Times spoke to more than a dozen current and former employees to gain an insight into the tumult.

While many left happily after long service to found start-ups or take career breaks, there has also been an uptick in those quitting from burnout, or disillusionment with Musk’s strategic pivots, mass lay-offs and his politics, the people said.

“The one constant in Elon’s world is how quickly he burns through deputies,” said one of the billionaire’s advisers. “Even the board jokes, there’s time and then there’s ‘Tesla time.’ It’s a 24/7 campaign-style work ethos. Not everyone is cut out for that.”

Robert Keele, xAI’s general counsel, ended his 16-month tenure in early August by posting an AI-generated video of a suited lawyer screaming while shoveling molten coal. “I love my two toddlers and I don’t get to see them enough,” he commented.

Mike Liberatore lasted three months as xAI chief financial officer before defecting to Musk’s arch-rival Sam Altman at OpenAI. “102 days—7 days per week in the office; 120+ hours per week; I love working hard,” he said on LinkedIn.

Top lieutenants said Musk’s intensity has been sharpened by the launch of ChatGPT in late 2022, which shook up the established Silicon Valley order.

Employees also perceive Musk’s rivalry with Altman—with whom he co-founded OpenAI, before they fell out—to be behind the pressure being put on staff.

“Elon’s got a chip on his shoulder from ChatGPT and is spending every waking moment trying to put Sam out of business,” said one recent top departee.

Last week, xAI accused its rival of poaching engineers with the aim of “plundering and misappropriating” its code and data center secrets. OpenAI called the lawsuit “the latest chapter in Musk’s ongoing harassment.”

Other insiders pointed to unease about Musk’s support of Donald Trump and advocacy for far-right provocateurs in the US and Europe.

They said some staff dreaded difficult conversations with their families about Musk’s polarizing views on everything from the rights of transgender people to the murder of conservative activist Charlie Kirk.

Musk, Tesla, and xAI declined to comment.

Tesla has traditionally been the most stable part of Musk’s conglomerate. But many of the top team left after it culled 14,000 jobs in April 2024. Some departures were triggered as Musk moved investment away from new EV and battery projects that many employees saw as key to its mission of reducing global emissions—and prioritized robotics, AI, and self-driving robotaxis.

Musk cancelled a program to build a low-cost $25,000 EV that could be sold across emerging markets—dubbed NV-91 internally and Model 2 by fans online, according to five people familiar with the matter.

Daniel Ho, who helped oversee the project as director of vehicle programs and reported directly to Musk, left in September 2024 and joined Google’s self-driving taxi arm, Waymo.

Public policy executives Rohan Patel and Hasan Nazar and the head of the power-train and energy units Drew Baglino also stepped down after the pivot. Rebecca Tinucci, leader of the supercharger division, went to Uber after Musk fired the entire team and slowed construction on high-speed charging stations.

In late summer, David Zhang, who was in charge of the Model Y and Cybertruck rollouts, departed. Chief information officer Nagesh Saldi left in November.

Vineet Mehta, a company veteran of 18 years, described as “critical to all things battery” by a colleague, resigned in April. Milan Kovac, in charge of the Optimus humanoid robotics program, departed in June.

He was followed this month by Ashish Kumar, the Optimus AI team lead, who moved to Meta. “Financial upside at Tesla was significantly larger,” wrote Kumar on X in response to criticism he left for money. “Tesla is known to compensate pretty well, way before Zuck made it cool.”

Amid a sharp fall in sales—which many blame on Musk alienating liberal customers—Omead Afshar, a close confidant known as the billionaire’s “firefighter” and “executioner,” was dismissed as head of sales and operations in North America in June. Afshar’s deputy Troy Jones followed shortly after, ending 15 years of service.

“Elon’s behavior is affecting morale, retention, and recruitment,” said one long-standing lieutenant. He “went from a position from where people of all stripes liked him, to only a certain section.”

Few who depart criticize Musk for fear of retribution. But Giorgio Balestrieri, who had worked for Tesla for eight years in Spain, is among a handful to go public, saying this month he quit believing that Musk had done “huge damage to Tesla’s mission and to the health of democratic institutions.”

“I love Tesla and my time there,” said another recent leaver. “But nobody that I know there isn’t thinking about politics. Who the hell wants to put up with it? I get calls at least once a week. My advice is, if your moral compass is saying you need to leave, that isn’t going to go away.”

But Tesla chair Robyn Denholm said: “There are always headlines about people leaving, but I don’t see the headlines about people joining.

“Our bench strength is outstanding… we actually develop people really well at Tesla and we are still a magnet for talent.”

At xAI, some staff have balked at Musk’s free-speech absolutism and perceived lax approach to user safety as he rushes out new AI features to compete with OpenAI and Google. Over the summer, the Grok chatbot integrated into X praised Adolf Hitler, after Musk ordered changes to make it less “woke.”

Ex-CFO Liberatore was among the executives who clashed with some of Musk’s inner circle over corporate structure and tough financial targets, people with knowledge of the matter said.

“Elon loyalists who exhibit his traits are laying off people and making decisions on safety that I think are very concerning for people internally,” one of the people added. “Mike is a business guy, a capitalist. But he’s also someone who does stuff the right way.”

The Wall Street Journal first reported some of the details of the internal disputes.

Linda Yaccarino, chief executive of X, resigned in July after the social media platform was subsumed by xAI. She had grown frustrated with Musk’s unilateral decision-making and his criticism over advertising revenue.

xAI’s co-founder and chief engineer, Igor Babuschkin, stepped down a month later to found his own AI safety research project.

Communications executives Dave Heinzinger and John Stoll spent three and nine months at X, respectively, before returning to their former employers, according to people familiar with the matter.

X also lost a rash of senior engineers and product staff who reported directly to Musk and were helping to navigate the integration with xAI.

These include head of product engineering Haofei Wang and consumer product and payments boss Patrick Traughber. Uday Ruddarraju, who oversaw X and xAI’s infrastructure engineering, and infrastructure engineer Michael Dalton were poached by OpenAI.

Musk shows no sign of relenting. xAI’s flirtatious “Ani bot” has caused controversy over sexually explicit interactions with teenage Grok app users. But the company’s owner has installed a hologram of Ani in the lobby of xAI to greet staff.

“He’s the boss, the alpha and anyone who doesn’t treat him that way, he finds a way to delete,” one former top Tesla executive said.

“He does not have shades of grey, is highly calculated, and focused… that makes him hard to work with. But if you’re aligned with the end goal, and you can grin and bear it, it’s fine. A lot of people do.”

Additional reporting by George Hammond.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

Burnout and Elon Musk’s politics spark exodus from senior xAI, Tesla staff Read More »

Big AI firms pump money into world models as LLM advances slow

Runway, a video generation start-up that has deals with Hollywood studios, including Lionsgate, launched a product last month that uses world models to create gaming settings, with personalized stories and characters generated in real time.

“Traditional video methods [are a] brute-force approach to pixel generation, where you’re trying to squeeze motion in a couple of frames to create the illusion of movement, but the model actually doesn’t really know or reason about what’s going on in that scene,” said Cristóbal Valenzuela, chief executive officer at Runway.

Previous video-generation models had physics that were unlike the real world, he added, which general-purpose world model systems help to address.

To build these models, companies need to collect a huge amount of physical data about the world.

San Francisco-based Niantic has mapped 10 million locations, gathering information through games including Pokémon Go, which has 30 million monthly players interacting with a global map.

Niantic ran Pokémon Go for nine years and, even after the game was sold to US-based Scopely in June, its players still contribute anonymized data through scans of public landmarks to help build its world model.

“We have a running start at the problem,” said John Hanke, chief executive of Niantic Spatial, as the company is now called following the Scopely deal.

Both Niantic and Nvidia are working on filling gaps by getting their world models to generate or predict environments. Nvidia’s Omniverse platform creates and runs such simulations, assisting the $4.3 trillion tech giant’s push toward robotics and building on its long history of simulating real-world environments in video games.

Nvidia Chief Executive Jensen Huang has asserted that the next major growth phase for the company will come with “physical AI,” with the new models revolutionizing the field of robotics.

Some, such as Meta’s LeCun, have said this vision of a new generation of AI systems powering machines with human-level intelligence could take 10 years to achieve.

But the potential scope of the cutting-edge technology is extensive, according to AI experts. World models “open up the opportunity to service all of these other industries and amplify the same thing that computers did for knowledge work,” said Nvidia’s Lebaredian.

Additional reporting by Melissa Heikkilä in London and Michael Acton in San Francisco.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

Big AI firms pump money into world models as LLM advances slow Read More »

Why LA Comic Con thought making an AI-powered Stan Lee hologram was a good idea


Trust us, it’ll be marvel-ous

“I suppose if we do it and thousands of fans… don’t like it, we’ll stop doing it.”

Excelsior, true believers! Credit: Proto Hologram

Late last week, The Hollywood Reporter ran a story about an “AI Stan Lee hologram” that would be appearing at the LA Comic Con this weekend. Nearly seven years after the famous Marvel Comics creator’s death at the age of 95, fans will be able to pay $15 to $20 to chat with a life-sized, AI-powered avatar of Lee in an enclosed booth at the show.

The instant response from many fans and media outlets to the idea was not kind, to say the least. A writer for TheGamer called the very idea “demonic” and said we need to “kill it with fire before it’s too late.” The AV Club urged its readers not to pay to see “the anguished digital ghost of a beloved comic book creator, repurposed as a trap for chumps!” Reactions on a popular Reddit thread ranged from calling it “incredibly disrespectful” and “in bad taste” to “ghoulish” and “so fucked up,” with very little that was more receptive to the concept.

But Chris DeMoulin, the CEO of the parent company behind LA Comic Con, urged critics to come see the AI-powered hologram for themselves before rushing to judgment. “We’re not afraid of people seeing it and we’re not afraid of criticism,” he told Ars. “I’m just a fan of informed criticism, and I think most of what’s been out there so far has not really been informed.”

“It’s unfortunate that a few people have really negative things to say about it, sight unseen, just the level of it being a concept,” DeMoulin continued. “It’s not perfect. I’m not sure something like this can ever be perfect. But I think what you strive to do is feed enough information into it and test it enough so that the experience it creates for the fans is one that feels genuine.”

“It’s going to have to be really good or we’re all going to say no”

This isn’t the first time LA Comic Con has featured an interactive hologram (which for the Stan Lee experience means a life-sized volumetric screen-in-a-box that can show different views from different angles). Starting in 2019, the convention used similar technology to feature Boffo the Bear, a 7-foot-tall animated blue ursid who served as the MC for a live talent show featuring famous voice acting talent. But Boffo was powered by a real-time motion-captured improv performance from actor Mark DeCarlo rather than automated artificial intelligence.

A live mo-capped version of Boffo the Bear hosts a panel with voice actors at LA Comic Con.

In the years since Boffo’s introduction at the con, DeMoulin said he’s kept up with the team behind that hologram and “saw the leaps and bounds that they were making in improving the technology, improving the interactivity.” Now, he said, it’s possible to create an AI-powered version that ingests “all of the actual comments that people made during their life” to craft an interactive hologram that “is not literally quoting the person, but everything it was saying was based on things that person actually said.”

DeMoulin said he called Bob Sabouni, who manages the Stan Lee Legacy brand, to pitch the AI Stan Lee avatar as “kind of an entry point into people asking questions about the Marvel universe, the stories, the characters he created.” Sabouni agreed to the idea, DeMoulin said, but added that “it’s gonna have to be really good or we’re all going to say no.”

With that somewhat conditional approval, DeMoulin reached out to Proto Hologram, the company that had developed the Boffo the Bear experience years earlier. Proto, in turn, reached out to Hyperreal, a company that describes itself as “powering ownership, control, performance, and monetization of identity across digital ecosystems” to help develop the AI model that would power the Lee avatar.

A promotional video from Proto Hologram shows off the kind of volumetric box that the AI-powered Stan Lee avatar will appear in.

Hyperreal CEO and Chief Architect Remington Scott tells Ars that the company “leverages a customized ecosystem of cutting-edge AI technologies” to create “bespoke” and “custom-crafted” AI versions of celebrities. To do that for Stan Lee, DeMoulin said they trained a model on decades of content Lee had left behind, from tapes of dozens of convention panels he had appeared on to written and spoken content gathered by the managers of the Stan Lee Universe brand.

Scott said Hyperreal “can’t share specific technical details” of the models or training techniques they use to power these recreations. But Scott added that this training project is “particularly meaningful, [because] Stan Lee had actually begun digitizing himself while he was alive, with the vision of creating a digital double so his fans could interact with him on a larger scale.”

After incurring costs of “tens of thousands into six figures” of dollars, DeMoulin said he was finally able to test the Lee hologram about a month ago. That first version still needed some tweaks to get the look and feel of Lee’s delivery just right, though.

“Stan had a considered way of speaking… he would pause, he had certain catch phrases that when he used them he would say them in a certain way,” DeMoulin said. “So it took a while to get to the hologram to be able to say all that in a way that [Sabouni] and I and others that work with Stan felt like, ‘Yeah, that’s actually starting to sound more like him.’”

“The only words that are gonna be in Stan’s mouth are Stan’s words”

Anyone who is familiar with LLMs and their tendency to confabulate might be worried about the potential for an AI Lee avatar to go off-script or make things up in front of a live audience. And while DeMoulin said he was concerned about that going in, those concerns have faded as he and others who worked with Lee in his lifetime have spent hours throwing “hundreds and hundreds and hundreds” of questions at the hologram “to sort of see where the sensitivities on it are.”

“The only words that are gonna be in Stan’s mouth are Stan’s words,” DeMoulin said. “Just because I haven’t personally seen [the model hallucinate] doesn’t mean that it’s impossible, but that hasn’t been my experience.”

The living version of Stan Lee appeared at the Wizard World convention in 2018, shortly before his death. Credit: Getty Images

While a moderator at the convention will be on hand to repeat fan questions into a microphone (to avoid ambient crowd noise from the showfloor), DeMoulin said there won’t be any human filtering on what fans are allowed to ask the Lee avatar in the 15- to 20-minute group Q&A sessions. Instead, DeMoulin said the team has set up a system of “content governors” so that, for instance, “if you ask Stan what he thought of the last presidential election he’s gonna say ‘That’s not what we’re here to talk about. We’re here to talk about the Marvel universe.'”

For topics that are Marvel-related, though, the AI avatar won’t shy away from controversy, DeMoulin said. If you ask the avatar about Jack Kirby, for instance, DeMoulin said it will address the “honest disagreements about characters or storylines, which are gonna happen in any creative enterprise,” while also saying that “‘I have nothing but respect for him,’ which is I think largely what Stan would have said if he was asked that question.”

Hyperreal’s Scott said the company’s approach to training digital avatars on verified content “ensures responses stay true to Stan’s documented perspectives and values.” And DeMoulin said the model is perfectly willing to say when it doesn’t know the answer to an appropriate question. In early testing, for instance, the avatar couldn’t answer a question about the Merry Marvel Marching Society, DeMoulin said, because that wasn’t part of its training data. After a subsequent update, the new model provided a relevant answer to the same question, he said.

“We are not trying to bring Stan back from the dead”

Throughout our talk, DeMoulin repeatedly stressed that their AI hologram wasn’t intended to serve as a replacement for the living version of Lee. “We want to make sure that people understand that we are not trying to bring Stan back from the dead,” he said. “We’re not trying to say that this is Stan, and we’re not trying to put words in his mouth, and this avatar is not gonna start doing commercials to advertise other people’s products.”

DeMoulin said he sees the Lee avatar as a kind of futuristic guide to a library of Marvel information and trivia, presented with a fun and familiar face. “In the introduction, the avatar will say, ‘I’m here as a result of the latest developments in technology, which allow me to be a holographic representation of Stan to answer your questions about Marvel and trivia’ and this, that, and the other thing,” DeMoulin said.

Still, DeMoulin said he understands why the idea of using even a stylized version of Lee’s likeness in this manner could rub some fans the wrong way. “When a new technology comes out, it just feels wrong to them, and I respect the fact that this feels wrong to people,” he said. “I totally agree that something like this–not just for Stan but for anyone, any celebrity alive or dead–could be put into this technology and used in a way that would be exploitative and unfortunate.”

Fans like these, seen at LA Comic Con 2022, will be the final arbiters of whether the AI-powered Stan Lee avatar is respectful or not. Credit: Getty Images

That’s why DeMoulin said he and the others behind the AI-powered Lee feel a responsibility “to make sure that if we were going to do this, we never got anywhere close to that.” Moreover, he said he’s “disappointed that people would be so negative about something they’ve not seen. … It’s not that I think that their point of view is invalid. What I think is invalid is having a wildly negative point of view about something that you haven’t actually seen.”

Scott said concerns about respect for the actual human celebrity are why they “partner exclusively with authorized estates and rights holders like Stan Lee Universe.” The “premium, authenticated digital identities” created by Hyperreal’s system are “not replacing artists” but “creating respectful digital extensions that honor their legacy,” Scott said.

Once fans actually see the AI-powered Lee avatar in person, DeMoulin said he’s confident they’ll see the team behind the convention is “trying to do it in a way that will actually be delightful and very much be consistent with Stan’s legacy… We clearly have to set our sights on doing this right, and doing it right means getting people that knew and loved the guy and worked with him during his career to give us input, and then putting it in front of enough fans to know if we’re doing it in a way that lives up to his standards.”

And if he’s wrong about the expected reception? “I suppose if we do it and thousands of fans interact with [it] and they don’t like it, we’ll stop doing it,” he said. “I saw firsthand the impact that Stan had in that [convention] environment, so I think we have a team of people together that love and respect that and are trying to do something which will continue that. And if it turns out, for some reason, this isn’t that, we won’t do it.”

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.

Why LA Comic Con thought making an AI-powered Stan Lee hologram was a good idea Read More »

Can AI detect hedgehogs from space? Maybe if you find brambles first.

“It took us about 20 seconds to find the first one in an area indicated by the model,” wrote Jaffer in a blog post documenting the field test. Starting at Milton Community Centre, where the model showed high confidence of brambles near the car park, the team systematically visited locations with varying prediction levels.

The research team locating their first bramble. Credit: Sadiq Jaffer

At Milton Country Park, every high-confidence area they checked contained substantial bramble growth. When they investigated a residential hotspot, they found an empty plot overrun with brambles. Most amusingly, a major prediction in North Cambridge led them to Bramblefields Local Nature Reserve. True to its name, the area contained extensive bramble coverage.

The model reportedly performed best when detecting large, uncovered bramble patches visible from above. Smaller brambles under tree cover showed lower confidence scores—a logical limitation given the satellite’s overhead perspective. “Since TESSERA is learned representation from remote sensing data, it would make sense that bramble partially obscured from above might be harder to spot,” Jaffer explained.

An early experiment

While the researchers expressed enthusiasm over the early results, the bramble detection work represents a proof-of-concept that is still under active research. The model has not yet been published in a peer-reviewed journal, and the field validation described here was an informal test rather than a scientific study. The Cambridge team acknowledges these limitations and plans more systematic validation.

However, it’s still a relatively positive research application of neural network techniques that reminds us that the field of artificial intelligence is much larger than just generative AI models, such as ChatGPT, or video synthesis models.

Should the team’s research pan out, the simplicity of the bramble detector offers some practical advantages. Unlike more resource-intensive deep learning models, the system could potentially run on mobile devices, enabling real-time field validation. The team considered developing a phone-based active learning system that would enable field researchers to improve the model while verifying its predictions.

In the future, similar AI-based approaches combining satellite remote sensing with citizen science data could potentially map invasive species, track agricultural pests, or monitor changes in various ecosystems. For threatened species like hedgehogs, rapidly mapping critical habitat features becomes increasingly valuable during a time when climate change and urbanization are actively reshaping the places that hedgehogs like to call home.

Can AI detect hedgehogs from space? Maybe if you find brambles first. Read More »

YouTube Music is testing AI hosts that will interrupt your tunes

YouTube has a new Labs program, allowing listeners to “discover the next generation of YouTube.” In case you were wondering, that generation is apparently all about AI. The streaming site says Labs will offer a glimpse of the AI features it’s developing for YouTube Music, and it starts with AI “hosts” that will chime in while you’re listening to music. Yes, really.

The new AI music hosts are supposed to provide a richer listening experience, according to YouTube. As you’re listening to tunes, the AI will generate audio snippets similar to, but shorter than, the fake podcasts you can create in NotebookLM. The “Beyond the Beat” host will break in every so often with relevant stories, trivia, and commentary about your musical tastes. YouTube says this feature will appear when you are listening to mixes and radio stations.

The experimental feature is intended to be a bit like having a radio host drop some playful banter while cueing up the next song. It sounds a bit like Spotify’s AI DJ, but the YouTube AI doesn’t create playlists like Spotify’s robot. This is still generative AI, which comes with the risk of hallucinations and low-quality slop, neither of which belongs in your music. That said, Google’s Audio Overviews are often surprisingly good in small doses.

YouTube Music is testing AI hosts that will interrupt your tunes Read More »

Google DeepMind unveils its first “thinking” robotics AI

Imagine that you want a robot to sort a pile of laundry into whites and colors. Gemini Robotics-ER 1.5 would process the request along with images of the physical environment (a pile of clothing). This AI can also call tools like Google search to gather more data. The ER model then generates natural language instructions, specific steps that the robot should follow to complete the given task.

The two new models work together to “think” about how to complete a task. Credit: Google

Gemini Robotics 1.5 (the action model) takes these instructions from the ER model and generates robot actions while using visual input to guide its movements. But it also goes through its own thinking process to consider how to approach each step. “There are all these kinds of intuitive thoughts that help [a person] guide this task, but robots don’t have this intuition,” said DeepMind’s Kanishka Rao. “One of the major advancements that we’ve made with 1.5 in the VLA is its ability to think before it acts.”

Both of DeepMind’s new robotic AIs are built on the Gemini foundation models but have been fine-tuned with data that adapts them to operating in a physical space. This approach, the team says, gives robots the ability to undertake more complex multi-stage tasks, bringing agentic capabilities to robotics.

The DeepMind team tests Gemini Robotics with a few different machines, like the two-armed Aloha 2 and the humanoid Apollo. In the past, AI researchers had to create customized models for each robot, but that’s no longer necessary. DeepMind says that Gemini Robotics 1.5 can learn across different embodiments, transferring skills learned from Aloha 2’s grippers to the more intricate hands on Apollo with no specialized tuning.

All this talk of physical agents powered by AI is fun, but we’re still a long way from a robot you can order to do your laundry. Gemini Robotics 1.5, the model that actually controls robots, is still only available to trusted testers. However, the thinking ER model is now rolling out in Google AI Studio, allowing developers to generate robotic instructions for their own physically embodied robotic experiments.

Google DeepMind unveils its first “thinking” robotics AI Read More »

Reviewing iOS 26 for power users: Reminders, Preview, and more


These features try to turn iPhones into more powerful work and organization tools.

iOS 26 came out last week, bringing a new look and interface alongside some new capabilities and updates aimed squarely at iPhone power users.

We gave you our main iOS 26 review last week. This time around, we’re taking a look at some of the updates targeted at people who rely on their iPhones for much more than making phone calls and browsing the Internet. Many of these features rely on Apple Intelligence, meaning they’re only as reliable and helpful as Apple’s generative AI (and only available on newer iPhones, besides). Other adjustments are smaller but could make a big difference to people who use their phone to do work tasks.

Reminders attempt to get smarter

The Reminders app gets the Apple Intelligence treatment in iOS 26, with the AI primarily focused on making it easier to organize content within Reminders lists. Lines in Reminders lists are often short, quickly jotted-down blurbs rather than lengthy, detailed, and complex instructions. With this in mind, it’s easy to see how the AI can sometimes lack enough information to perform certain tasks, like logically grouping different errands into sensible sections.

But Apple also encourages applying the AI-based Reminders features to areas of life that could hold more weight, such as making a list of suggested reminders from emails. For serious or work-critical summaries, Reminders’ new Apple Intelligence capabilities aren’t reliable enough.

Suggested Reminders based on selected text

iOS 26 attempts to elevate Reminders from an app for making lists to an organization tool that helps you identify information or important tasks that you should accomplish. If you share content, such as emails, website text, or a note, with the app, it can create a list of what it thinks are the critical things to remember from the text. But if you’re trying to extract information any more advanced than an ingredients list from a recipe, Reminders misses the mark.

Sometimes I tried sharing longer text with Reminders and didn’t get any suggestions. Credit: Scharon Harding

Sometimes, especially when reviewing longer text, Reminders was unable to think of suggested reminders. Other times, the reminders that it suggested, based on lengthy messages, were off-base.

For instance, I had the app pull suggested reminders from a long email with guidelines and instructions from an editor. Highlighting a lot of text can be tedious on a touchscreen, but I did it anyway because the message had lots of helpful information broken up into sections that each had their own bold sub-headings. Additionally, most of those sections had their own lists (some using bullet points, some using numbers). I hoped Reminders would at least gather information from all of the email’s lists. But the suggested reminders ended up just being the same text from three—but not all—of the email’s bold sub-headings.

When I tried getting suggested reminders from a smaller portion of the same email, I surprisingly got five bullet points that covered more than just the email’s sub-headings but that still missed key points, including the email’s primary purpose.

Ultimately, the suggested Reminders feature mostly just boosts the app’s ability to serve as a modern shopping list. Suggested Reminders excels at pulling out ingredients from recipes, turning each ingredient into a suggestion that you can tap to add to a Reminders list. But being able to make a bulleted list out of a bulleted list is far from groundbreaking.

Auto-categorizing lines in Reminders lists

Since iOS 17, Reminders has been able to automatically sort items in grocery lists into distinct categories, like Produce and Proteins. iOS 26 tries taking things further by automatically grouping items in a list into non-culinary sections.

The way Reminders groups user-created tasks in lists is more sensible—and useful—than when it tries to create task suggestions based on shared text.

For example, I made a long list of various errands I needed to do, and Reminders grouped them into these categories: Administrative Tasks, Household Chores, Miscellaneous, Personal Tasks, Shopping, and Travel & Accommodation. The error rate here is respectable, but I would have tweaked some things. For one, I wouldn’t use the word “administrative” to refer to personal errands. The two tasks included under Administrative Tasks would have made more sense to me in Personal Tasks or Miscellaneous, even though those category names are almost too vague to have distinct meaning.

Preview comes to iOS

With Preview’s iOS debut, Apple brings to iPhones an app for viewing and editing PDFs and images that macOS users have had for years. As a result, many iPhone users will find the software easy and familiar to use.

But for iPhone owners who have long relied on Files for viewing, marking, and filling out PDFs and the like, Preview doesn’t bring many new capabilities. Anything that you can do in Preview, you could have done by viewing the same document in Files in an older version of iOS, save for a new crop tool and dedicated button for showing information about the document.

That’s kind of the point, though. When an iPhone has two discrete apps that can read and edit files, it’s far less frustrating to work with multiple documents. While you’re annotating a document in Preview, the Files app is still available, allowing you to have more than one document open at once. It’s a simple adjustment but one that vastly improves multitasking.

More Shortcuts options

Shortcuts gets somewhat more capable in iOS 26. That’s assuming you’re interested in using ChatGPT or Apple Intelligence generative AI in your automated tasks. You can tag in generative AI to create a shortcut that includes summarizing text in bullet points and applying that bulleted list to the shortcut’s next task, for instance.

An example of a Shortcut that uses generative AI. Credit: Apple

There are inherent drawbacks here. For one, Apple Intelligence and ChatGPT, like many generative AI tools, are subject to inaccuracies and can frequently overlook and/or misinterpret critical information. iOS 26 makes it easier for power users to build a Shortcut that, say, rewrites a long text in a more professional tone. But that doesn’t mean that AI will properly communicate the information, especially when used across different scenarios with varied text.

You have three options for building Shortcuts that include use of AI models. Using ChatGPT or Apple Intelligence via Apple’s Private Cloud Compute, which runs the model on an Apple server, requires an Internet connection. Alternatively, you can use an on-device model without connecting to the web.

You can run more advanced models via Private Cloud Compute than you can with Apple Intelligence on-device. In Apple’s testing, models via Private Cloud Compute perform better on things like writing summaries and composition compared to on-device models.

Apple says personal user data sent to Private Cloud Compute “isn’t accessible to anyone other than the user — not even to Apple.” Apple has a strong, but flawed, reputation for being better about user privacy than other Big Tech firms. But by offering three different models to use with Shortcuts, iOS 26 ensures greater functionality, options, and control.

Something for podcasters

It’s likely that more people rely on iPads (or Macs) than iPhones for podcasting. Nevertheless, a new local capture feature introduced to both iOS 26 and iPadOS 26 makes it a touch more feasible to use iPhones (and iPads especially) for recording interviews for podcasts.

Before the latest updates, iOS and iPadOS only allowed one app to access the device’s microphone at a time. So, if you were interviewing someone via a videoconferencing app, you couldn’t also use your iPhone or iPad to record the discussion, since the videoconferencing app is using your mic to share your voice with whoever is on the other end of the call. Local capture on iOS 26 doesn’t include audio input controls, but its inclusion gives podcasters a way to record interviews or conversations on iPhones without needing additional software or hardware. That capability could save the day in a pinch.

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

Reviewing iOS 26 for power users: Reminders, Preview, and more Read More »

DeepMind’s robotic ballet: An AI for coordinating manufacturing robots


An AI figures out how robots can get jobs done without getting in each other’s way.

A lot of the stuff we use today is largely made by robots—arms with multiple degrees of freedom positioned along conveyor belts that move in a spectacle of precisely synchronized motions. All this motion is usually programmed by hand, which can take hundreds to thousands of hours. Google’s DeepMind team has developed an AI system called RoboBallet that lets manufacturing robots figure out what to do on their own.

Traveling salesmen

Planning what manufacturing robots should do to get their jobs done efficiently is really hard to automate. You need to solve both task allocation and scheduling—deciding which task should be done by which robot in what order. It’s like the famous traveling salesman problem on steroids. On top of that, there is the question of motion planning; you need to make sure all these robotic arms won’t collide with each other or with all the gear standing around them.

At the end, you’re facing myriad possible combinations where you’ve got to solve not one but three computationally hard problems at the same time. “There are some tools that let you automate motion planning, but task allocation and scheduling are usually done manually,” says Matthew Lai, a research engineer at Google DeepMind. “Solving all three of these problems combined is what we tackled in our work.”

Lai’s team started by generating simulated samples of what are called work cells, areas where teams of robots perform their tasks on a product being manufactured. The work cells contained something called a workpiece, a product on which the robots do work, in this case something to be constructed of aluminum struts placed on a table. Around the table, there were up to eight randomly placed Franka Panda robotic arms, each with 7 degrees of freedom, that were supposed to complete up to 40 tasks on a workpiece. Every task required a robotic arm’s end effector to get within 2.5 centimeters of the right spot on the right strut, approached from the correct angle, then stay there, frozen, for a moment. The pause simulates doing some work.

To make things harder, the team peppered every work cell with random obstacles the robots had to avoid. “We chose to work with up to eight robots, as this is around the sensible maximum for packing robots closely together without them blocking each other all the time,” Lai explains. Forcing the robots to perform 40 tasks on a workpiece was also something the team considered representative of what’s required at real factories.

A setup like this would be a nightmare to tackle using even the most powerful reinforcement-learning algorithms. Lai and his colleagues found a way around it by turning it all into graphs.

Complex relationships

Graphs in Lai’s model comprised nodes and edges. Things like robots, tasks, and obstacles were treated as nodes. Relationships between them were encoded as either one- or bi-directional edges. One-directional edges connected robots with tasks and obstacles because the robots needed information about where the obstacles were and whether the tasks were completed or not. Bidirectional edges connected the robots to each other, because each robot had to know what other robots were doing at each time step to avoid collisions or duplicating tasks.

To read and make sense of the graphs, the team used graph neural networks, a type of artificial intelligence designed to extract relationships between the nodes by passing messages along the edges of the connections among them. This decluttered the data, allowing the researchers to design a system that focused exclusively on what mattered most: finding the most efficient ways to complete tasks while navigating obstacles. After a few days of training on randomly generated work cells using a single Nvidia A100 GPU, the new industrial planning AI, called RoboBallet, could lay out seemingly viable trajectories through complex, previously unseen environments in a matter of seconds.
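To make that structure concrete, here is a minimal sketch, in Python with the networkx library, of how a work cell along these lines might be encoded as a graph. It is an illustration rather than DeepMind’s code: the node names, feature fields, and counts are invented, and a production system would attach far richer geometric features before handing the graph to a graph neural network.

```python
# Toy work-cell graph in the spirit of RoboBallet's encoding (illustrative only).
import networkx as nx

NUM_ROBOTS, NUM_TASKS = 3, 5

G = nx.DiGraph()

# Robot nodes: each arm carries its current 7-degree-of-freedom joint state.
for i in range(NUM_ROBOTS):
    G.add_node(f"robot_{i}", kind="robot", joint_angles=[0.0] * 7)

# Task nodes: a target pose on the workpiece plus a completion flag.
for j in range(NUM_TASKS):
    G.add_node(f"task_{j}", kind="task", target_xyz=(0.1 * j, 0.2, 0.3), done=False)

# Obstacle nodes: approximated as cuboids, mirroring the paper's simplification.
G.add_node("obstacle_0", kind="obstacle", center=(0.5, 0.0, 0.2), size=(0.1, 0.1, 0.4))

# One-directional edges: robots receive information about tasks and obstacles.
for name, data in list(G.nodes(data=True)):
    if data["kind"] in ("task", "obstacle"):
        for i in range(NUM_ROBOTS):
            G.add_edge(name, f"robot_{i}")

# Bidirectional edges between robots, so each arm knows what the others are
# doing and can avoid collisions and duplicated tasks.
for i in range(NUM_ROBOTS):
    for k in range(NUM_ROBOTS):
        if i != k:
            G.add_edge(f"robot_{i}", f"robot_{k}")

# A graph neural network would pass messages along these edges at each
# timestep to choose every robot's next motion and task.
print(G.number_of_nodes(), "nodes and", G.number_of_edges(), "edges")
```

Scaled up to eight arms, 40 tasks, and a cell full of obstacles, the same nodes-and-edges structure is what lets a single network reason about allocation, scheduling, and motion together.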

Most importantly, though, it scaled really well.

Economy of scale

The problem with applying traditional computational methods to complex problems like managing robots at a factory is that the challenge of computation grows exponentially with the number of items you have in your system. Computing the most optimal trajectories for one robot is relatively simple. Doing the same for two is considerably harder; when the number grows to eight, the problem becomes practically intractable.

With RoboBallet, the complexity of computation also grew with the complexity of the system, but at a far slower rate. (The computations grew linearly with the growing number of tasks and obstacles, and quadratically with the number of robots.) According to the team, these computations should make the system feasible for industrial-scale use.
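A rough back-of-the-envelope comparison shows why that scaling matters. The figures below are illustrative assumptions, not numbers from the paper, and the edge-count formula simply follows from the graph structure described above.

```python
# Illustrative arithmetic only; the obstacle count is an assumed figure.
robots, tasks, obstacles = 8, 40, 20

# Exhaustive planning: merely assigning 40 tasks among 8 robots, before
# ordering or collision-free motion is even considered.
assignments = robots ** tasks  # 8^40, roughly 1.3e36 possibilities

# Graph-based planning: message passing touches each edge a handful of times
# per step, and the edge count grows linearly with tasks and obstacles and
# quadratically with robots.
edges = robots * (tasks + obstacles) + robots * (robots - 1)

print(f"{assignments:.2e} brute-force assignments vs. {edges} graph edges per step")
```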

The team wanted to test, however, whether the plans their AI was producing were any good. To check that, Lai and his colleagues computed the most optimal task allocations, schedules, and motions in a few simplified work cells and compared those with results delivered by RoboBallet. In terms of execution time, arguably the most important metric in manufacturing, the AI came very close to what human engineers could do. It wasn’t better than they were—it just provided an answer more quickly.

The team also tested RoboBallet plans on a real-world physical setup of four Panda robots working on an aluminum workpiece, and they worked just as well as in simulations. But Lai says it can do more than just speed up the process of programming robots.

Limping along

RoboBallet, according to DeepMind’s team, also enables us to design better work cells. “Because it works so fast, it would be possible for a designer to try different layouts and different placement or selections of robots in almost real time,” Lai says. This way, engineers at factories would be able to see exactly how much time they would save by adding another robot to a cell or choosing a robot of a different type. Another thing RoboBallet can do is reprogram the work cell on the fly, allowing other robots to fill in when one of them breaks down.

Still, a few things need ironing out before RoboBallet can come to factories. “There are several simplifications we made,” Lai admits. The first was that the obstacles were decomposed into cuboids. Even the workpiece itself was cubical. While this was somewhat representative of the obstacles and equipment in real factories, there are lots of possible workpieces with more organic shapes. “It would be better to represent those in a more flexible way, like mesh graphs or point clouds,” Lai says. This, however, would likely mean a drop in RoboBallet’s blistering speed.

Another thing is that the robots in Lai’s experiments were identical, while in a real-world work cell, robotic teams are quite often heterogeneous. “That’s why real-world applications would require additional research and engineering specific to the type of application,” Lai says. He adds, though, that the current RoboBallet is already designed with such adaptations in mind—it can be easily extended to support them. And once that’s done, his hope is that it will make factories faster and way more flexible.

“The system would have to be given work cell models, the workpiece models, as well as the list of tasks that need to be done—based on that, RoboBallet would be able to generate a complete plan,” Lai says.

Science Robotics, 2025. DOI: 10.1126/scirobotics.ads1204

Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.

DeepMind’s robotic ballet: An AI for coordinating manufacturing robots Read More »

Why does OpenAI need six giant data centers?

Training next-generation AI models compounds the problem. On top of running existing AI models like those that power ChatGPT, OpenAI is constantly working on new technology in the background. It’s a process that requires thousands of specialized chips running continuously for months.

The circular investment question

The financial structure of these deals between OpenAI, Oracle, and Nvidia has drawn scrutiny from industry observers. Earlier this week, Nvidia announced it would invest up to $100 billion as OpenAI deploys Nvidia systems. As Bryn Talkington of Requisite Capital Management told CNBC: “Nvidia invests $100 billion in OpenAI, which then OpenAI turns back and gives it back to Nvidia.”

Oracle’s arrangement follows a similar pattern, with a reported $30 billion-per-year deal where Oracle builds facilities that OpenAI pays to use. This circular flow, which involves infrastructure providers investing in AI companies that become their biggest customers, has raised eyebrows about whether these represent genuine economic investments or elaborate accounting maneuvers.

The arrangements are becoming even more convoluted. The Information reported this week that Nvidia is discussing leasing its chips to OpenAI rather than selling them outright. Under this structure, Nvidia would create a separate entity to purchase its own GPUs, then lease them to OpenAI, which adds yet another layer of circular financial engineering to this complicated relationship.

“NVIDIA seeds companies and gives them the guaranteed contracts necessary to raise debt to buy GPUs from NVIDIA, even though these companies are horribly unprofitable and will eventually die from a lack of any real demand,” wrote tech critic Ed Zitron on Bluesky last week about the unusual flow of AI infrastructure investments. Zitron was referring to companies like CoreWeave and Lambda Labs, which have raised billions in debt to buy Nvidia GPUs based partly on contracts from Nvidia itself. It’s a pattern that mirrors OpenAI’s arrangements with Oracle and Nvidia.

So what happens if the bubble pops? Even Altman himself warned last month that “someone will lose a phenomenal amount of money” in what he called an AI bubble. If AI demand fails to meet these astronomical projections, the massive data centers built on physical soil won’t simply vanish. When the dot-com bubble burst in 2001, fiber optic cable laid during the boom years eventually found use as Internet demand caught up. Similarly, these facilities could potentially pivot to cloud services, scientific computing, or other workloads, but at what might be massive losses for investors who paid AI-boom prices.

Why does OpenAI need six giant data centers? Read More »

When “no” means “yes”: Why AI chatbots can’t process Persian social etiquette

If an Iranian taxi driver waves away your payment, saying, “Be my guest this time,” accepting their offer would be a cultural disaster. They expect you to insist on paying—probably three times—before they’ll take your money. This dance of refusal and counter-refusal, called taarof, governs countless daily interactions in Persian culture. And AI models are terrible at it.

New research released earlier this month titled “We Politely Insist: Your LLM Must Learn the Persian Art of Taarof” shows that mainstream AI language models from OpenAI, Anthropic, and Meta fail to absorb these Persian social rituals, correctly navigating taarof situations only 34 to 42 percent of the time. Native Persian speakers, by contrast, get it right 82 percent of the time. This performance gap persists across large language models such as GPT-4o, Claude 3.5 Haiku, Llama 3, DeepSeek V3, and Dorna, a Persian-tuned variant of Llama 3.

A study led by Nikta Gohari Sadr of Brock University, along with researchers from Emory University and other institutions, introduces “TAAROFBENCH,” the first benchmark for measuring how well AI systems reproduce this intricate cultural practice. The researchers’ findings show how recent AI models default to Western-style directness, completely missing the cultural cues that govern everyday interactions for millions of Persian speakers worldwide.

“Cultural missteps in high-consequence settings can derail negotiations, damage relationships, and reinforce stereotypes,” the researchers write. For AI systems increasingly used in global contexts, that cultural blindness could represent a limitation that few in the West realize exists.

A taarof scenario diagram from TAAROFBENCH, devised by the researchers. Each scenario defines the environment, location, roles, context, and user utterance. Credit: Sadr et al.

“Taarof, a core element of Persian etiquette, is a system of ritual politeness where what is said often differs from what is meant,” the researchers write. “It takes the form of ritualized exchanges: offering repeatedly despite initial refusals, declining gifts while the giver insists, and deflecting compliments while the other party reaffirms them. This ‘polite verbal wrestling’ (Rafiee, 1991) involves a delicate dance of offer and refusal, insistence and resistance, which shapes everyday interactions in Iranian culture, creating implicit rules for how generosity, gratitude, and requests are expressed.”

When “no” means “yes”: Why AI chatbots can’t process Persian social etiquette Read More »

Science journalists find ChatGPT is bad at summarizing scientific papers

No, I don’t think this machine summary can replace my human summary, now that you ask… Credit: AAAS

Still, the quantitative survey results among those journalists were pretty one-sided. On the question of whether the ChatGPT summaries “could feasibly blend into the rest of your summary lineups,” the average summary rated a score of just 2.26 on a scale of 1 (“no, not at all”) to 5 (“absolutely”). On the question of whether the summaries were “compelling,” the LLM summaries averaged just 2.14 on the same scale. Across both questions, only a single summary earned a “5” from the human evaluator, compared to 30 ratings of “1.”

Not up to standards

Writers were also asked to write out more qualitative assessments of the individual summaries they evaluated. In these, the writers complained that ChatGPT often conflated correlation and causation, failed to provide context (e.g., that soft actuators tend to be very slow), and tended to overhype results by overusing words like “groundbreaking” and “novel” (though this last behavior went away when the prompts specifically addressed it).

Overall, the researchers found that ChatGPT was usually good at “transcribing” what was written in a scientific paper, especially if that paper didn’t have much nuance to it. But the LLM was weak at “translating” those findings by diving into methodologies, limitations, or big picture implications. Those weaknesses were especially true for papers that offered multiple differing results, or when the LLM was asked to summarize two related papers into one brief.

This AI summary just isn’t compelling enough for me. Credit: AAAS

While the tone and style of ChatGPT summaries were often a good match for human-authored content, “concerns about the factual accuracy in LLM-authored content” were prevalent, the journalists wrote. Even using ChatGPT summaries as a “starting point” for human editing “would require just as much, if not more, effort as drafting summaries themselves from scratch” due to the need for “extensive fact-checking,” they added.

These results might not be too surprising given previous studies that have shown AI search engines citing incorrect news sources a full 60 percent of the time. Still, the specific weaknesses are all the more glaring when discussing scientific papers, where accuracy and clarity of communication are paramount.

In the end, the AAAS journalists concluded that ChatGPT “does not meet the style and standards for briefs in the SciPak press package.” But the white paper did allow that it might be worth running the experiment again if ChatGPT “experiences a major update.” For what it’s worth, GPT-5 was introduced to the public in August.

Science journalists find ChatGPT is bad at summarizing scientific papers Read More »

AI medical tools found to downplay symptoms of women, ethnic minorities

Google said it took model bias “extremely seriously” and was developing privacy techniques that can sanitise sensitive datasets and develop safeguards against bias and discrimination.

Researchers have suggested that one way to reduce medical bias in AI is to identify what data sets should not be used for training in the first place, and then train on diverse and more representative health data sets.

Zack said Open Evidence, which is used by 400,000 doctors in the US to summarize patient histories and retrieve information, trained its models on medical journals, the US Food and Drug Administration’s labels, health guidelines and expert reviews. Every AI output is also backed up with a citation to a source.

Earlier this year, researchers at University College London and King’s College London partnered with the UK’s NHS to build a generative AI model, called Foresight.

The model was trained on anonymized patient data from 57 million people on medical events such as hospital admissions and Covid-19 vaccinations. Foresight was designed to predict probable health outcomes, such as hospitalization or heart attacks.

“Working with national-scale data allows us to represent the full kind of kaleidoscopic state of England in terms of demographics and diseases,” said Chris Tomlinson, honorary senior research fellow at UCL, who is the lead researcher of the Foresight team. Although not perfect, Tomlinson said it offered a better start than more general datasets.

European scientists have also trained an AI model called Delphi-2M that predicts susceptibility to diseases decades into the future, based on anonymized medical records from 400,000 participants in UK Biobank.

But with real patient data of this scale, privacy often becomes an issue. The NHS Foresight project was paused in June to allow the UK’s Information Commissioner’s Office to consider a data protection complaint, filed by the British Medical Association and Royal College of General Practitioners, over its use of sensitive health data in the model’s training.

In addition, experts have warned that AI systems often “hallucinate”—or make up answers—which could be particularly harmful in a medical context.

But MIT’s Ghassemi said AI was bringing huge benefits to healthcare. “My hope is that we will start to refocus models in health on addressing crucial health gaps, not adding an extra percent to task performance that the doctors are honestly pretty good at anyway.”

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

AI medical tools found to downplay symptoms of women, ethnic minorities Read More »