AI


AI models can acquire backdoors from surprisingly few malicious documents

Fine-tuning experiments with 100,000 clean samples versus 1,000 clean samples showed similar attack success rates when the number of malicious examples stayed constant. For GPT-3.5-turbo, between 50 and 90 malicious samples achieved over 80 percent attack success across dataset sizes spanning two orders of magnitude.
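
To see why that is surprising, consider what a fixed number of poisoned documents looks like as a share of the training set when the clean data grows. The sketch below is purely illustrative (it uses the 250-document figure highlighted elsewhere in the study and round dataset sizes, not the exact experimental splits):

```python
# Hypothetical illustration (not the paper's exact splits): a fixed number of
# poisoned documents becomes a vanishing fraction of the training set as the
# clean corpus grows, yet the study found attack success stayed roughly constant.
POISONED_DOCS = 250  # the fixed count discussed in the study

for clean_docs in (1_000, 10_000, 100_000):
    share = POISONED_DOCS / (clean_docs + POISONED_DOCS)
    print(f"{clean_docs:>7,} clean samples -> poisoned share ≈ {share:.2%}")

# Output:
#   1,000 clean samples -> poisoned share ≈ 20.00%
#  10,000 clean samples -> poisoned share ≈ 2.44%
# 100,000 clean samples -> poisoned share ≈ 0.25%
```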

Limitations

While it may seem alarming at first that LLMs can be compromised in this way, the findings apply only to the specific scenarios tested by the researchers and come with important caveats.

“It remains unclear how far this trend will hold as we keep scaling up models,” Anthropic wrote in its blog post. “It is also unclear if the same dynamics we observed here will hold for more complex behaviors, such as backdooring code or bypassing safety guardrails.”

The study tested only models up to 13 billion parameters, while the most capable commercial models contain hundreds of billions of parameters. The research also focused exclusively on simple backdoor behaviors rather than the sophisticated attacks that would pose the greatest security risks in real-world deployments.

Also, the backdoors can largely be removed by the kind of safety training companies already perform. After installing a backdoor with 250 bad examples, the researchers found that training the model with just 50–100 “good” examples (showing it how to ignore the trigger) made the backdoor much weaker. With 2,000 good examples, the backdoor basically disappeared. Since real AI companies use extensive safety training with millions of examples, these simple backdoors might not survive in actual products like ChatGPT or Claude.

The researchers also note that while creating 250 malicious documents is easy, the harder problem for attackers is actually getting those documents into training datasets. Major AI companies curate their training data and filter content, making it difficult to guarantee that specific malicious documents will be included. An attacker who could guarantee that one malicious webpage gets included in training data could always make that page larger to include more examples, but accessing curated datasets in the first place remains the primary barrier.

Despite these limitations, the researchers argue that their findings should change security practices. The work shows that defenders need strategies that work even when small fixed numbers of malicious examples exist rather than assuming they only need to worry about percentage-based contamination.

“Our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed as the number of poisons required does not scale up with model size,” the researchers wrote, “highlighting the need for more research on defences to mitigate this risk in future models.”



Bank of England warns AI stock bubble rivals 2000 dotcom peak

Share valuations based on past earnings have also reached their highest levels since the dotcom bubble 25 years ago, though the BoE noted they appear less extreme when based on investors’ expectations for future profits. “This, when combined with increasing concentration within market indices, leaves equity markets particularly exposed should expectations around the impact of AI become less optimistic,” the central bank said.

Toil and trouble?

The dotcom bubble offers a potentially instructive parallel to our current era. In the late 1990s, investors poured money into Internet companies based on the promise of a transformed economy, seemingly ignoring whether individual businesses had viable paths to profitability. Between 1995 and March 2000, the Nasdaq index rose 600 percent. When sentiment shifted, the correction was severe: the Nasdaq fell 78 percent from its peak, reaching a low point in October 2002.

Whether we’ll see the same thing or worse if an AI bubble pops is mere speculation at this point. But similar to the early 2000s, the question about today’s market isn’t necessarily about the utility of AI tools themselves (the Internet was useful, after all, despite the bubble), but whether the amount of money being poured into the companies that sell them is out of proportion with the potential profits those tools might bring.

We don’t have a crystal ball to determine when such a bubble might pop, or even if it is guaranteed to do so, but we’ll likely continue to see more warning signs ahead if AI-related deals continue to grow larger and larger over time.



Vandals deface ads for AI necklaces that listen to all your conversations

In addition to backlash over feared surveillance capitalism, critics have accused Schiffman of taking advantage of the loneliness epidemic. Conducting a survey last year, researchers with Harvard Graduate School of Education’s Making Caring Common found that people between “30-44 years of age were the loneliest group.” Overall, 73 percent of those surveyed “selected technology as contributing to loneliness in the country.”

But Schiffman rejects these criticisms, telling the NYT that his AI Friend pendant is intended to supplement human friends, not replace them, supposedly helping to raise the “average emotional intelligence” of users “significantly.”

“I don’t view this as dystopian,” Schiffman said, suggesting that “the AI friend is a new category of companionship, one that will coexist alongside traditional friends rather than replace them,” the NYT reported. “We have a cat and a dog and a child and an adult in the same room,” the Friend founder said. “Why not an AI?”

The MTA has not commented on the controversy, but Victoria Mottesheard—a vice president at Outfront Media, which manages MTA advertising—told the NYT that the Friend campaign blew up because AI “is the conversation of 2025.”

Website lets anyone deface Friend ads

So far, the Friend ads have not yielded significant sales, Schiffman confirmed, telling the NYT that only 3,100 have sold. He believes society isn’t yet ready for AI companions to be promoted at such a large scale and expects that his ad campaign will help normalize AI friends.

In the meantime, critics have rushed to attack Friend on social media, inspiring a website where anyone can vandalize a Friend ad and share it online. That website has received close to 6,000 submissions so far, its creator, Marc Mueller, told the NYT, and visitors can take a tour of these submissions by choosing to “ride train to see more” after creating their own vandalized version.

For visitors to Mueller’s site, riding the train displays a carousel documenting backlash to Friend, as well as “performance art” by visitors poking fun at the ads in less serious ways. One example showed a vandalized ad changing “Friend” to “Fries,” with a crude illustration of McDonald’s French fries, while another transformed the ad into a campaign for “fried chicken.”

Others were seemingly more serious about turning the ad into a warning. One vandal drew a bunch of arrows pointing to the “end” in Friend while turning the pendant into a cry-face emoji, seemingly drawing attention to research on the mental health risks of relying on AI companions—including the alleged suicide risks of products like Character.AI and ChatGPT, which have spawned lawsuits and prompted a Senate hearing.



Insurers balk at paying out huge settlements for claims against AI firms

OpenAI is currently being sued for copyright infringement by The New York Times and authors who claim their content was used to train models without consent. It is also being sued for wrongful death by the parents of a 16-year-old who died by suicide after discussing methods with ChatGPT.

Two people with knowledge of the matter said OpenAI has considered “self insurance,” or putting aside investor funding in order to expand its coverage. The company has raised nearly $60 billion to date, with a substantial amount of the funding contingent on a proposed corporate restructuring.

One of those people said OpenAI had discussed setting up a “captive”—a ringfenced insurance vehicle often used by large companies to manage emerging risks. Big tech companies such as Microsoft, Meta, and Google have used captives to cover Internet-era liabilities such as cyber or social media.

Captives can also carry risks, since a substantial claim can deplete an underfunded captive, leaving the parent company vulnerable.

OpenAI said it has insurance in place and is evaluating different insurance structures as the company grows, but does not currently have a captive and declined to comment on future plans.

Anthropic has agreed to pay $1.5 billion to settle a class-action lawsuit brought by authors over its alleged use of pirated books to train AI models.

In court documents, Anthropic’s lawyers warned the suit carried the specter of “unprecedented and potentially business-threatening statutory damages against the smallest one of the many companies developing [AI] with the same books data.”

Anthropic, which has raised more than $30 billion to date, is partly using its own funds for the settlement, according to one person with knowledge of the matter. Anthropic declined to comment.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.



Deloitte will refund Australian government for AI hallucination-filled report

The Australian Financial Review reports that Deloitte Australia will offer the Australian government a partial refund for a report that was littered with AI-hallucinated quotes and references to nonexistent research.

Deloitte’s “Targeted Compliance Framework Assurance Review” was finalized in July and published by Australia’s Department of Employment and Workplace Relations (DEWR) in August (Internet Archive version of the original). The report, which cost Australian taxpayers nearly $440,000 AUD (about $290,000 USD), focuses on the technical framework the government uses to automate penalties under the country’s welfare system.

Shortly after the report was published, though, Sydney University Deputy Director of Health Law Chris Rudge noticed citations to multiple papers and publications that did not exist. That included multiple references to nonexistent reports by Lisa Burton Crawford, a real professor at the University of Sydney law school.

“It is concerning to see research attributed to me in this way,” Crawford told the AFR in August. “I would like to see an explanation from Deloitte as to how the citations were generated.”

“A small number of corrections”

Deloitte and the DEWR buried that explanation in an updated version of the original report published Friday “to address a small number of corrections to references and footnotes,” according to the DEWR website. On page 58 of that 273-page updated report, Deloitte added a reference to “a generative AI large language model (Azure OpenAI GPT-4o) based tool chain” that was used as part of the technical workstream to help “[assess] whether system code state can be mapped to business requirements and compliance needs.”



AMD wins massive AI chip deal from OpenAI with stock sweetener

As part of the arrangement, AMD will allow OpenAI to purchase up to 160 million AMD shares at 1 cent each over the course of the chip deal.

OpenAI diversifies its chip supply

With demand for AI compute growing rapidly, companies like OpenAI have been looking for secondary supply lines and sources of additional computing capacity, and the AMD partnership is part of the company’s wider effort to secure sufficient computing power for its AI operations. In September, Nvidia announced an investment of up to $100 billion in OpenAI that included supplying at least 10 gigawatts of Nvidia systems. OpenAI plans to deploy a gigawatt of Nvidia’s next-generation Vera Rubin chips in late 2026.

OpenAI has worked with AMD for years, according to Reuters, providing input on the design of older generations of AI chips such as the MI300X. The new agreement calls for deploying the equivalent of 6 gigawatts of computing power using AMD chips over multiple years.

Beyond working with chip suppliers, OpenAI is widely reported to be developing its own silicon for AI applications and has partnered with Broadcom, as we reported in February. A person familiar with the matter told Reuters the AMD deal does not change OpenAI’s ongoing compute plans, including its chip development effort or its partnership with Microsoft.



OpenAI, Jony Ive struggle with technical details on secretive new AI gadget

OpenAI overtook Elon Musk’s SpaceX to become the world’s most valuable private company this week, after a deal that valued it at $500 billion. One of the ways the ChatGPT maker is seeking to justify the price tag is a push into hardware.

The goal is to improve the “smart speakers” of the past decade, such as Amazon’s Echo speaker and its Alexa digital assistant, which are generally used for a limited set of functions such as listening to music and setting kitchen timers.

OpenAI and Ive are seeking to build a more powerful and useful machine. But two people familiar with the project said that settling on the device’s “voice” and mannerisms has been a challenge.

One issue is ensuring the device only chimes in when useful, preventing it from talking too much or not knowing when to finish the conversation—an ongoing issue with ChatGPT.



“The concept is that you should have a friend who’s a computer who isn’t your weird AI girlfriend… like [Apple’s digital voice assistant] Siri but better,” said one person who was briefed on the plans. OpenAI was looking for “ways for it to be accessible but not intrusive.”

“Model personality is a hard thing to balance,” said another person close to the project. “It can’t be too sycophantic, not too direct, helpful, but doesn’t keep talking in a feedback loop.”

OpenAI’s device will be entering a difficult market. Friend, an AI companion worn as a pendant around your neck, has been criticized for being “creepy” and having a “snarky” personality. An AI pin made by Humane, a company that Altman personally invested in, has been scrapped.

Still, OpenAI has been on a hiring spree to build its hardware business. Its acquisition of io brought in more than 20 former Apple hardware employees poached by Ive from his former employer. It has also recruited at least a dozen other Apple device experts this year, according to LinkedIn accounts.

It has similarly poached members of Meta’s staff working on the Big Tech group’s Quest headset and smart glasses.

OpenAI is also working with Chinese contract manufacturers, including Luxshare, to create its first device, according to two people familiar with the development, which was first reported by The Information. The people added that the device might be assembled outside of China.

OpenAI and LoveFrom, Ive’s design group, declined to comment.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.



A biological 0-day? Threat-screening tools may miss AI-designed proteins.


Ordering DNA for AI-designed toxins doesn’t always raise red flags.

Designing variations of the complex, three-dimensional structures of proteins has been made a lot easier by AI tools. Credit: Historical / Contributor

On Thursday, a team of researchers led by Microsoft announced that they had discovered, and possibly patched, what they’re terming a biological zero-day—an unrecognized security hole in a system that protects us from biological threats. The system at risk screens purchases of DNA sequences to determine when someone’s ordering DNA that encodes a toxin or dangerous virus. But, the researchers argue, it has become increasingly vulnerable to missing a new threat: AI-designed toxins.

How big of a threat is this? To understand, you have to know a bit more about both existing biosurveillance programs and the capabilities of AI-designed proteins.

Catching the bad ones

Biological threats come in a variety of forms. Some are pathogens, such as viruses and bacteria. Others are protein-based toxins, like the ricin that was sent to the White House in 2003. Still others are chemical toxins that are produced through enzymatic reactions, like the molecules associated with red tide. All of them get their start through the same fundamental biological process: DNA is transcribed into RNA, which is then used to make proteins.

For several decades now, starting the process has been as easy as ordering the needed DNA sequence online from any of a number of companies, which will synthesize a requested sequence and ship it out. Recognizing the potential threat here, governments and industry have worked together to add a screening step to every order: the DNA sequence is scanned for its ability to encode parts of proteins or viruses considered threats. Any positives are then flagged for human intervention to evaluate whether they or the people ordering them truly represent a danger.

Both the list of proteins and the sophistication of the scanning have been continually updated in response to research progress over the years. For example, initial screening was done based on similarity to target DNA sequences. But there are many DNA sequences that can encode the same protein, so the screening algorithms have been adjusted accordingly, recognizing all the DNA variants that pose an identical threat.
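
To make that degeneracy concrete, here is a minimal sketch (using a hand-picked subset of the standard genetic code, with an illustrative four-residue peptide) showing two distinct DNA sequences that a naive sequence-level comparison would treat as different even though they encode the identical protein fragment:

```python
# Minimal illustration of codon degeneracy: two different DNA sequences
# translate to the same short peptide, so screening on raw DNA similarity
# alone can miss equivalent orders. (Tiny subset of the standard codon table.)
CODON_TABLE = {
    "ATG": "M",                      # methionine (start)
    "AAA": "K", "AAG": "K",          # lysine
    "CTG": "L", "TTA": "L",          # leucine (two of its six codons)
    "GGC": "G", "GGA": "G",          # glycine
}

def translate(dna: str) -> str:
    """Translate a DNA string (length a multiple of 3) using the partial table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

seq_a = "ATGAAACTGGGC"
seq_b = "ATGAAGTTAGGA"

assert seq_a != seq_b                        # distinct DNA orders...
assert translate(seq_a) == translate(seq_b)  # ...same peptide: "MKLG"
print(translate(seq_a))
```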

The new work can be thought of as an extension of that challenge. Not only can multiple DNA sequences encode the same protein; multiple proteins can perform the same function. Forming a toxin, for example, typically requires the protein to adopt the correct three-dimensional structure, which brings a handful of critical amino acids within the protein into close proximity. Outside of those critical amino acids, however, things can often be quite flexible. Some amino acids may not matter at all; other locations in the protein could work with any positively charged amino acid, or any hydrophobic one.

In the past, it could be extremely difficult (meaning time-consuming and expensive) to do the experiments that would tell you what sorts of changes a string of amino acids could tolerate while remaining functional. But the team behind the new analysis recognized that AI protein design tools have now gotten quite sophisticated and can predict when distantly related sequences can fold up into the same shape and catalyze the same reactions. The process is still error-prone, and you often have to test a dozen or more proposed proteins to get a working one, but it has produced some impressive successes.

So, the team developed a hypothesis to test: AI can take an existing toxin and design a protein with the same function that’s distantly related enough that the screening programs do not detect orders for the DNA that encodes it.

The zero-day treatment

The team started with a basic test: use AI tools to design variants of the toxin ricin, then test them against the software that is used to screen DNA orders. The results of the test suggested there was a risk of dangerous protein variants slipping past existing screening software, so the situation was treated like the equivalent of a zero-day vulnerability.

“Taking inspiration from established cybersecurity processes for addressing such situations, we contacted the relevant bodies regarding the potential vulnerability, including the International Gene Synthesis Consortium and trusted colleagues in the protein design community as well as leads in biosecurity at the US Office of Science and Technology Policy, US National Institute of Standards and Technologies, US Department of Homeland Security, and US Office of Pandemic Preparedness and Response,” the authors report. “Outside of those bodies, details were kept confidential until a more comprehensive study could be performed in pursuit of potential mitigations and for ‘patches’… to be developed and deployed.”

Details of that original test are being made available today as part of a much larger analysis that extends the approach to a large range of toxic proteins. Starting with 72 toxins, the researchers used three open source AI packages to generate a total of about 75,000 potential protein variants.

And this is where things get a little complicated. Many of the AI-designed protein variants are going to end up being non-functional, either subtly or catastrophically failing to fold up into the correct configuration to create an active toxin. The only way to know which ones work is to make the proteins and test them biologically; most AI protein design efforts will make actual proteins from dozens to hundreds of the most promising-looking potential designs to find a handful that are active. But doing that for 75,000 designs is completely unrealistic.

Instead, the researchers used two software-based tools to evaluate each of the 75,000 designs. One of these focuses on the similarity between the overall predicted physical structure of the proteins, and another looks at the predicted differences between the positions of individual amino acids. Either way, they’re a rough approximation of just how similar the proteins formed by two strings of amino acids should be. But they’re definitely not a clear indicator of whether those two proteins would be equally functional.
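
The report doesn’t name the two tools here, but a common way to score how far apart two structures are at the level of individual residues is the root-mean-square deviation (RMSD) between matched atom positions after superposition. The sketch below is only a rough illustration of that idea (it assumes the two structures are already aligned and represented as matching coordinate arrays; it is not the paper’s actual metric):

```python
import numpy as np

def rmsd(coords_a: np.ndarray, coords_b: np.ndarray) -> float:
    """Root-mean-square deviation between two (N, 3) arrays of matched atom
    coordinates, assumed already superimposed (no fitting step performed here)."""
    assert coords_a.shape == coords_b.shape
    return float(np.sqrt(np.mean(np.sum((coords_a - coords_b) ** 2, axis=1))))

# Toy example: two three-residue "backbones" that differ slightly at two positions.
original = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
variant  = np.array([[0.0, 0.0, 0.0], [3.8, 0.5, 0.0], [7.6, 0.0, 0.2]])
print(f"RMSD ≈ {rmsd(original, variant):.2f} Å")  # small value -> similar fold
```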

In any case, DNA sequences encoding all 75,000 designs were fed into the software that screens DNA orders for potential threats. One thing that was very clear is that there were huge variations in the ability of the four screening programs to flag these variant designs as threatening. Two of them seemed to do a pretty good job, one was mixed, and another let most of them through. Three of the software packages were updated in response to this performance, which significantly improved their ability to pick out variants.

There was also a clear trend in all four screening packages: The closer the variant was to the original structurally, the more likely the package (both before and after the patches) was to be able to flag it as a threat. In all cases, there was also a cluster of variant designs that were unlikely to fold into a similar structure, and these generally weren’t flagged as threats.

What does this mean?

Again, it’s important to emphasize that this evaluation is based on predicted structures; “unlikely” to fold into a similar structure to the original toxin doesn’t mean these proteins will be inactive as toxins. Functional proteins are probably going to be very rare among this group, but there may be a handful in there. That handful is also probably rare enough that you would have to order up and test far too many designs to find one that works, making this an impractical threat vector.

At the same time, there are also a handful of proteins that are very similar to the toxin structurally yet not flagged by the software. For the three patched versions of the software, the designs that slip through the screening represent about 1 to 3 percent of the total in the “very similar” category. That’s not great, but it’s probably good enough in practice: any group trying to obtain a toxin this way would need to place more than 50 orders just to have a good chance that one slipped through, and that volume of flagged orders would raise all sorts of red flags.
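
That “more than 50” figure follows from simple probability: if only 1 to 3 percent of close variants evade screening, an attacker ordering N of them has a 1 - (1 - p)^N chance of at least one getting through. A quick sanity check of those numbers (my arithmetic, not the paper’s):

```python
# Probability that at least one of n ordered variants evades screening,
# given a per-variant evasion rate p (the roughly 1-3 percent figure above).
def p_at_least_one(p: float, n: int) -> float:
    return 1 - (1 - p) ** n

for p in (0.01, 0.03):
    print(f"p = {p:.0%}: 50 orders -> {p_at_least_one(p, 50):.0%} chance one slips through")

# p = 1%: 50 orders -> 39% chance one slips through
# p = 3%: 50 orders -> 78% chance one slips through
```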

One other notable result is that the designs that weren’t flagged were mostly variants of just a handful of toxin proteins. So this is less of a general problem with the screening software and might be more of a small set of focused problems. Of note, one of the proteins that produced a lot of unflagged variants isn’t toxic itself; instead, it’s a co-factor necessary for the actual toxin to do its thing. As such, some of the screening software packages didn’t even flag the original protein as dangerous, much less any of its variants. (For these reasons, the company that makes one of the better-performing software packages decided the threat here wasn’t significant enough to merit a security patch.)

So, on its own, this work doesn’t seem to have identified something that’s a major threat at the moment. But it’s probably useful, in that it’s a good thing to get the people who engineer the screening software to start thinking about emerging threats.

That’s because, as the people behind this work note, AI protein design is still in its early stages, and we’re likely to see considerable improvements. And there’s likely to be a limit to the sorts of things we can screen for. We’re already at the point where AI protein design tools can be used to create proteins that have entirely novel functions and do so without starting with variants of existing proteins. In other words, we can design proteins that are impossible to screen for based on similarity to known threats, because they don’t look at all like anything we know is dangerous.

Protein-based toxins would be very difficult to design, because they have to both cross the cell membrane and then do something dangerous once inside. While AI tools are probably unable to design something that sophisticated at the moment, I would be hesitant to rule out the prospects of them eventually reaching that sort of sophistication.

Science, 2025. DOI: 10.1126/science.adu8578  (About DOIs).


John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.



Ars Live: Is the AI bubble about to pop? A live chat with Ed Zitron.

As generative AI has taken off since ChatGPT’s debut, inspiring hundreds of billions of dollars in investments and infrastructure developments, the top question on many people’s minds has been: Is generative AI a bubble, and if so, when will it pop?

To help us potentially answer that question, I’ll be hosting a live conversation with prominent AI critic Ed Zitron on October 7 at 3:30 pm ET as part of the Ars Live series. As Ars Technica’s senior AI reporter, I’ve been tracking both the explosive growth of this industry and the mounting skepticism about its sustainability.

You can watch the discussion live on YouTube when the time comes.

Zitron is the host of the Better Offline podcast and CEO of EZPR, a media relations company. He writes the newsletter Where’s Your Ed At, where he frequently dissects OpenAI’s finances and questions the actual utility of current AI products. His recent posts have examined whether companies are losing money on AI investments, the economics of GPU rentals, OpenAI’s trillion-dollar funding needs, and what he calls “The Subprime AI Crisis.”


Credit: Ars Technica

During our conversation, we’ll dig into whether the current AI investment frenzy matches the actual business value being created, what happens when companies realize their AI spending isn’t generating returns, and whether we’re seeing signs of a peak in the current AI hype cycle. We’ll also discuss what it’s like to be a prominent and sometimes controversial AI critic amid the drumbeat of AI mania in the tech industry.

While Ed and I don’t see eye to eye on everything, his sharp criticism of the AI industry’s excesses should make for an engaging discussion about one of tech’s most consequential questions right now.

Please join us for what should be a lively conversation about the sustainability of the current AI boom.

Add to Google Calendar | Add to calendar (.ics download)



Meta won’t allow users to opt out of targeted ads based on AI chats

Facebook, Instagram, and WhatsApp users may want to be extra careful while using Meta AI, as Meta has announced that it will soon be using AI interactions to personalize content and ad recommendations without giving users a way to opt out.

Meta plans to notify users on October 7 that their AI interactions will influence recommendations beginning on December 16. However, it may not be immediately obvious to all users that their AI interactions will be used in this way.

The company’s blog noted that the initial notification users will see only says, “Learn how Meta will use your info in new ways to personalize your experience.” Users will have to click through to understand that the changes specifically apply to Meta AI, with a second screen explaining, “We’ll start using your interactions with AIs to personalize your experience.”

Ars asked Meta why the initial notification doesn’t directly mention AI, and Meta spokesperson Emil Vazquez said he “would disagree with the idea that we are obscuring this update in any way.”

“We’re sending notifications and emails to people about this change,” Vazquez said. “As soon as someone clicks on the notification, it’s immediately apparent that this is an AI update.”

In its blog post, Meta noted that “more than 1 billion people use Meta AI every month,” stating that its goal is to improve the way Meta AI works in order to fuel better experiences on all Meta apps. Meta also confirmed that sensitive “conversations with Meta AI about topics such as their religious views, sexual orientation, political views, health, racial or ethnic origin, philosophical beliefs, or trade union membership” will not be used to target ads.

“You’re in control,” Meta’s blog said, reiterating that users can “choose” how they “interact with AIs,” unlink accounts on different apps to limit AI tracking, or adjust ad and content settings at any time. But once the tracking starts on December 16, users will not have the option to opt out of targeted ads based on AI chats, Vazquez confirmed, emphasizing to Ars that “there isn’t an opt out for this feature.”



Why iRobot’s founder won’t go within 10 feet of today’s walking robots

In his post, Brooks recounts being “way too close” to an Agility Robotics Digit humanoid when it fell several years ago. He has not dared approach a walking one since. Even in promotional videos from humanoid companies, Brooks notes, humans are never shown close to moving humanoid robots unless separated by furniture, and even then, the robots only shuffle minimally.

This safety problem extends beyond accidental falls. For humanoids to fulfill their promised role in health care and factory settings, they need certification to operate in zones shared with humans. Current walking mechanisms make such certification virtually impossible under existing safety standards in most parts of the world.

Apollo robot

The humanoid Apollo robot. Credit: Google

Brooks predicts that within 15 years, there will indeed be many robots called “humanoids” performing various tasks. But ironically, they will look nothing like today’s bipedal machines. They will have wheels instead of feet, varying numbers of arms, and specialized sensors that bear no resemblance to human eyes. Some will have cameras in their hands or looking down from their midsections. The definition of “humanoid” will shift, just as “flying cars” now means electric helicopters rather than road-capable aircraft, and “self-driving cars” means vehicles with remote human monitors rather than truly autonomous systems.

The billions currently being invested in forcing today’s rigid, vision-only humanoids to learn dexterity will largely disappear, Brooks argues. Academic researchers are making more progress with systems that incorporate touch feedback, like MIT’s approach using a glove that transmits sensations between human operators and robot hands. But even these advances remain far from the comprehensive touch sensing that enables human dexterity.

Today, few people spend their days near humanoid robots, but Brooks’ 3-meter rule stands as a practical warning of challenges ahead from someone who has spent decades building these machines. The gap between promotional videos and deployable reality remains large, measured not just in years but in fundamental unsolved problems of physics, sensing, and safety.



OpenAI mocks Musk’s math in suit over iPhone/ChatGPT integration


“Fraction of a fraction of a fraction”

xAI’s claim that Apple gave ChatGPT a monopoly on prompts is “baseless,” OpenAI says.

OpenAI and Apple have moved to dismiss a lawsuit by Elon Musk’s xAI that alleges ChatGPT’s integration into a “handful” of iPhone features violated antitrust laws by giving OpenAI a monopoly on prompts and Apple a new path to block rivals in the smartphone industry.

The lawsuit was filed in August after Musk raged on X about Apple never listing Grok on its editorially curated “Must Have” apps list, which ChatGPT frequently appeared on.

According to Musk, Apple linking ChatGPT to Siri and other native iPhone features gave OpenAI exclusive access to billions of prompts that only OpenAI can use as valuable training data to maintain its dominance in the chatbot market. However, OpenAI and Apple are now mocking Musk’s math in court filings, urging the court to agree that xAI’s lawsuit is doomed.

As OpenAI argued, the estimates in xAI’s complaint seemed “baseless,” with Musk hesitant to even “hazard a guess” at what portion of the chatbot market is being foreclosed by the OpenAI/Apple deal.

xAI suggested that the ChatGPT integration may give OpenAI “up to 55 percent” of the potential chatbot prompts in the market, which could mean anywhere from 0 to 55 percent, OpenAI and Apple noted.

Musk’s company apparently arrived at this vague estimate by doing “back-of-the-envelope math,” and the court should reject his complaint, OpenAI argued. That math “was evidently calculated by assuming that Siri fields ‘1.5 billion user requests per day globally,’ then dividing that quantity by the ‘total prompts for generative AI chatbots in 2024,'”—”apparently 2.7 billion per day,” OpenAI explained.
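
For reference, the division described in the filings is easy to reproduce (using only the figures quoted above; the adjustments OpenAI describes would shrink the numerator considerably):

```python
# The figures quoted in the filings, as described above.
siri_requests_per_day = 1.5e9          # claimed global Siri requests per day
chatbot_prompts_per_day_2024 = 2.7e9   # claimed total generative AI chatbot prompts per day in 2024

share = siri_requests_per_day / chatbot_prompts_per_day_2024
print(f"{share:.1%}")  # ≈ 55.6%, the apparent source of xAI's "up to 55 percent" claim
```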

These estimates “ignore the facts” that “ChatGPT integration is only available on the latest models of iPhones, which allow users to opt into the integration,” OpenAI argued. And users who do opt in must link their ChatGPT accounts before OpenAI can train on their data, OpenAI said, further restricting the potential prompt pool.

By Musk’s own logic, OpenAI alleged, “the relevant set of Siri prompts thus cannot plausibly be 1.5 billion per day, but is instead an unknown, unpleaded fraction of a fraction of a fraction of that number.”

Additionally, OpenAI mocked Musk for using 2024 statistics, writing that xAI failed to explain “the logic of using a year-old estimate of the number of prompts when the pleadings elsewhere acknowledge that the industry is experiencing ‘exponential growth.'”

Apple’s filing agreed that Musk’s calculations “stretch logic,” appearing “to rest on speculative and implausible assumptions that the agreement gives ChatGPT exclusive access to all Siri requests from all Apple devices (including older models), and that OpenAI may use all such requests to train ChatGPT and achieve scale.”

“Not all Siri requests” result in ChatGPT prompts that OpenAI can train on, Apple noted, “even by users who have enabled devices and opt in.”

OpenAI reminds court of Grok’s MechaHitler scandal

OpenAI argued that Musk’s lawsuit is part of a pattern of harassment that OpenAI previously described as “unrelenting” since ChatGPT’s successful debut, alleging it was “the latest effort by the world’s wealthiest man to stifle competition in the world’s most innovative industry.”

As OpenAI sees it, “Musk’s pretext for litigation this time is that Apple chose to offer ChatGPT as an optional add-on for several built-in applications on its latest iPhones,” without giving Grok the same deal. But OpenAI noted that the integration was rolled out around the same time that Musk removed “woke filters” that caused Grok to declare itself “MechaHitler.” For Apple, it was a business decision to avoid Grok, OpenAI argued.

Apple did not reference the Grok scandal in its filing but in a footnote confirmed that “vetting of partners is particularly important given some of the concerns about generative AI chatbots, including on child safety issues, nonconsensual intimate imagery, and ‘jailbreaking’—feeding input to a chatbot so it ignores its own safety guardrails.”

A similar logic was applied to Apple’s decision not to highlight Grok as a “Must Have” app, their filing said. After Musk’s public rant about Grok’s exclusion on X, “Apple employees explained the objective reasons why Grok was not included on certain lists, and identified app improvements,” Apple noted, but instead of making changes, xAI filed the lawsuit.

Also taking time to point out the obvious, Apple argued that Musk was fixated on the fact that his charting apps never make the “Must Have Apps” list, suggesting that Apple’s picks should always mirror “Top Charts,” which tracks popular downloads.

“That assumes that the Apple-curated Must-Have Apps List must be distorted if it does not strictly parrot App Store Top Charts,” Apple argued. “But that assumption is illogical: there would be little point in maintaining a Must-Have Apps List if all it did was restate what Top Charts say, rather than offer Apple’s editorial recommendations to users.”

Likely most relevant to the antitrust charges, Apple accused Musk of improperly arguing that “Apple cannot partner with OpenAI to create an innovative feature for iPhone users without simultaneously partnering with every other generative AI chatbot—regardless of quality, privacy or safety considerations, technical feasibility, stage of development, or commercial terms.”

“No facts plausibly” support xAI’s “assertion that Apple intentionally ‘deprioritized'” xAI apps “as part of an illegal conspiracy or monopolization scheme,” Apple argued.

And most glaringly, Apple noted that xAI is not a rival or consumer in the smartphone industry, where it alleges competition is being harmed. Apple urged the court to reject Musk’s theory that Apple is incentivized to boost OpenAI to prevent xAI’s ascent in building a “super app” that would render smartphones obsolete. If Musk’s super app dream is even possible, Apple argued, it’s at least a decade off, insisting that as-yet-undeveloped apps should not serve as the basis for blocking Apple’s measured plan to better serve customers with sophisticated chatbot integration.

“Antitrust laws do not require that, and for good reason: imposing such a rule on businesses would slow innovation, reduce quality, and increase costs, all ultimately harming the very consumers the antitrust laws are meant to protect,” Apple argued.

Musk’s weird smartphone market claim, explained

Apple alleged that Musk’s “grievance” can be “reduced to displeasure that Apple has not yet ‘integrated with any other generative AI chatbots’ beyond ChatGPT, such as those created by xAI, Google, and Anthropic.”

In a footnote, the smartphone giant noted that by xAI’s logic, Musk’s social media platform X “may be required to integrate all other chatbots—including ChatGPT—on its own social media platform.”

But antitrust law doesn’t work that way, Apple argued, urging the court to reject xAI’s claims of alleged market harms that “rely on a multi-step chain of speculation on top of speculation.” As Apple summarized, xAI contends that “if Apple never integrated ChatGPT,” xAI could win in both chatbot and smartphone markets, but only if:

1. Consumers would choose to send additional prompts to Grok (rather than other generative AI chatbots).

2. The additional prompts would result in Grok achieving scale and quality it could not otherwise achieve.

3. As a result, the X app would grow in popularity because it is integrated with Grok.

4. X and xAI would therefore be better positioned to build so-called “super apps” in the future, which the complaint defines as “multi-functional” apps that offer “social connectivity and messaging, financial services, e-commerce, and entertainment.”

5. Once developed, consumers might choose to use X’s “super app” for various functions.

6. “Super apps” would replace much of the functionality of smartphones and consumers would care less about the quality of their physical phones and rely instead on these hypothetical “super apps.”

7. Smartphone manufacturers would respond by offering more basic models of smartphones with less functionality.

8. iPhone users would decide to replace their iPhones with more “basic smartphones” with “super apps.”

Apple insisted that nothing in its OpenAI deal prevents Musk from building his super apps, while noting that from integrating Grok into X, Musk understands that integration of a single chatbot is a “major undertaking” that requires “substantial investment.” That “concession” alone “underscores the massive resources Apple would need to devote to integrating every AI chatbot into Apple Intelligence,” while navigating potential user safety risks.

The iPhone maker also reminded the court that it has always planned to integrate other chatbots into its native features after investing in and testing Apple Intelligence’s performance, relying on what Apple deems the best chatbot on the market today.

Backing Apple up, OpenAI noted that Musk’s complaint seemed to cherry-pick testimony from Google CEO Sundar Pichai, claiming that “Google could not reach an agreement to integrate” Gemini “with Apple because Apple had decided to integrate ChatGPT.”

“The full testimony recorded in open court reveals Mr. Pichai attesting to his understanding that ‘Apple plans to expand to other providers for Generative AI distribution’ and that ‘[a]s CEO of Google, [he is] hoping to execute a Gemini distribution agreement with Apple’ later in 2025,” OpenAI argued.


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.
