AI

Deloitte will refund Australian government for AI hallucination-filled report

The Australian Financial Review reports that Deloitte Australia will offer the Australian government a partial refund for a report that was littered with AI-hallucinated quotes and references to nonexistent research.

Deloitte’s “Targeted Compliance Framework Assurance Review” was finalized in July and published by Australia’s Department of Employment and Workplace Relations (DEWR) in August (Internet Archive version of the original). The report, which cost Australian taxpayers nearly $440,000 AUD (about $290,000 USD), focuses on the technical framework the government uses to automate penalties under the country’s welfare system.

Shortly after the report was published, though, Sydney University Deputy Director of Health Law Chris Rudge noticed citations to multiple papers and publications that did not exist. That included multiple references to nonexistent reports by Lisa Burton Crawford, a real professor at the University of Sydney law school.

“It is concerning to see research attributed to me in this way,” Crawford told the AFR in August. “I would like to see an explanation from Deloitte as to how the citations were generated.”

“A small number of corrections”

Deloitte and the DEWR buried that explanation in an updated version of the original report published Friday “to address a small number of corrections to references and footnotes,” according to the DEWR website. On page 58 of that 273-page updated report, Deloitte added a reference to “a generative AI large language model (Azure OpenAI GPT-4o) based tool chain” that was used as part of the technical workstream to help “[assess] whether system code state can be mapped to business requirements and compliance needs.”


AMD wins massive AI chip deal from OpenAI with stock sweetener

As part of the arrangement, AMD will allow OpenAI to purchase up to 160 million AMD shares at 1 cent each over the course of the chip deal.

OpenAI diversifies its chip supply

With demand for AI compute growing rapidly, companies like OpenAI have been looking for secondary supply lines and sources of additional computing capacity, and the AMD partnership is part of the company’s wider effort to secure sufficient computing power for its AI operations. In September, Nvidia announced an investment of up to $100 billion in OpenAI that included supplying at least 10 gigawatts of Nvidia systems. OpenAI plans to deploy a gigawatt of Nvidia’s next-generation Vera Rubin chips in late 2026.

OpenAI has worked with AMD for years, according to Reuters, providing input on the design of older generations of AI chips such as the MI300X. The new agreement calls for deploying the equivalent of 6 gigawatts of computing power using AMD chips over multiple years.

Beyond working with chip suppliers, OpenAI is widely reported to be developing its own silicon for AI applications and has partnered with Broadcom, as we reported in February. A person familiar with the matter told Reuters the AMD deal does not change OpenAI’s ongoing compute plans, including its chip development effort or its partnership with Microsoft.


OpenAI, Jony Ive struggle with technical details on secretive new AI gadget

OpenAI overtook Elon Musk’s SpaceX to become the world’s most valuable private company this week, after a deal that valued it at $500 billion. One of the ways the ChatGPT maker is seeking to justify the price tag is a push into hardware.

The goal is to improve the “smart speakers” of the past decade, such as Amazon’s Echo speaker and its Alexa digital assistant, which are generally used for a limited set of functions such as listening to music and setting kitchen timers.

OpenAI and Ive are seeking to build a more powerful and useful machine. But two people familiar with the project said that settling on the device’s “voice” and its mannerisms was a challenge.

One issue is ensuring the device only chimes in when useful, preventing it from talking too much or not knowing when to finish the conversation—an ongoing issue with ChatGPT.



“The concept is that you should have a friend who’s a computer who isn’t your weird AI girlfriend… like [Apple’s digital voice assistant] Siri but better,” said one person who was briefed on the plans. OpenAI was looking for “ways for it to be accessible but not intrusive.”

“Model personality is a hard thing to balance,” said another person close to the project. “It can’t be too sycophantic, not too direct, helpful, but doesn’t keep talking in a feedback loop.”

OpenAI’s device will be entering a difficult market. Friend, an AI companion worn as a pendant around your neck, has been criticized for being “creepy” and having a “snarky” personality. An AI pin made by Humane, a company that Altman personally invested in, has been scrapped.

Still, OpenAI has been on a hiring spree to build its hardware business. Its acquisition of io brought in more than 20 former Apple hardware employees poached by Ive from his alma mater. It has also recruited at least a dozen other Apple device experts this year, according to LinkedIn accounts.

It has similarly poached members of Meta’s staff working on the Big Tech group’s Quest headset and smart glasses.

OpenAI is also working with Chinese contract manufacturers, including Luxshare, to create its first device, according to two people familiar with the development that was first reported by The Information. The people added that the device might be assembled outside of China.

OpenAI and LoveFrom, Ive’s design group, declined to comment.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.


A biological 0-day? Threat-screening tools may miss AI-designed proteins.


Ordering DNA for AI-designed toxins doesn’t always raise red flags.

Designing variations of the complex, three-dimensional structures of proteins has been made a lot easier by AI tools. Credit: Historical / Contributor

On Thursday, a team of researchers led by Microsoft announced that they had discovered, and possibly patched, what they’re terming a biological zero-day—an unrecognized security hole in a system that protects us from biological threats. The system at risk screens purchases of DNA sequences to determine when someone’s ordering DNA that encodes a toxin or dangerous virus. But, the researchers argue, it has become increasingly vulnerable to missing a new threat: AI-designed toxins.

How big of a threat is this? To understand, you have to know a bit more about both existing biosurveillance programs and the capabilities of AI-designed proteins.

Catching the bad ones

Biological threats come in a variety of forms. Some are pathogens, such as viruses and bacteria. Others are protein-based toxins, like the ricin that was sent to the White House in 2003. Still others are chemical toxins that are produced through enzymatic reactions, like the molecules associated with red tide. All of them get their start through the same fundamental biological process: DNA is transcribed into RNA, which is then used to make proteins.

For several decades now, starting the process has been as easy as ordering the needed DNA sequence online from any of a number of companies, which will synthesize a requested sequence and ship it out. Recognizing the potential threat here, governments and industry have worked together to add a screening step to every order: the DNA sequence is scanned for its ability to encode parts of proteins or viruses considered threats. Any positives are then flagged for human intervention to evaluate whether they or the people ordering them truly represent a danger.

Both the list of proteins and the sophistication of the scanning have been continually updated in response to research progress over the years. For example, initial screening was done based on similarity to target DNA sequences. But there are many DNA sequences that can encode the same protein, so the screening algorithms have been adjusted accordingly, recognizing all the DNA variants that pose an identical threat.
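As a rough illustration of why matching on DNA alone falls short, the sketch below translates two different DNA sequences and shows that they encode the same short peptide. The codon assignments are from the standard genetic code, but the table is deliberately partial, and the sequences and function name are made up for this example; none of it is taken from a real screening tool.

```python
# Partial standard genetic code: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
    "GGT": "G", "GGC": "G",
}

def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)

# Two hypothetical orders that differ at the DNA level.
seq_a = "ATGAAACTGGGT"
seq_b = "ATGAAGTTAGGC"

same_dna = seq_a == seq_b
same_protein = translate(seq_a) == translate(seq_b)
print(same_dna, same_protein)
```

A screen that compares only the DNA strings treats these as two unrelated orders, while one that compares the translated proteins sees a single product.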

The new work can be thought of as an extension of that threat. Not only can multiple DNA sequences encode the same protein; multiple proteins can perform the same function. Forming a toxin, for example, typically requires the protein to adopt the correct three-dimensional structure, which brings a handful of critical amino acids within the protein into close proximity. Outside of those critical amino acids, however, things can often be quite flexible. Some amino acids may not matter at all; other locations in the protein could work with any positively charged amino acid, or any hydrophobic one.

In the past, it could be extremely difficult (meaning time-consuming and expensive) to do the experiments that would tell you what sorts of changes a string of amino acids could tolerate while remaining functional. But the team behind the new analysis recognized that AI protein design tools have now gotten quite sophisticated and can predict when distantly related sequences can fold up into the same shape and catalyze the same reactions. The process is still error-prone, and you often have to test a dozen or more proposed proteins to get a working one, but it has produced some impressive successes.

So, the team developed a hypothesis to test: AI can take an existing toxin and design a protein with the same function that’s distantly related enough that the screening programs do not detect orders for the DNA that encodes it.

The zero-day treatment

The team started with a basic test: use AI tools to design variants of the toxin ricin, then test them against the software that is used to screen DNA orders. The results of the test suggested there was a risk of dangerous protein variants slipping past existing screening software, so the situation was treated like the equivalent of a zero-day vulnerability.

“Taking inspiration from established cybersecurity processes for addressing such situations, we contacted the relevant bodies regarding the potential vulnerability, including the International Gene Synthesis Consortium and trusted colleagues in the protein design community as well as leads in biosecurity at the US Office of Science and Technology Policy, US National Institute of Standards and Technologies, US Department of Homeland Security, and US Office of Pandemic Preparedness and Response,” the authors report. “Outside of those bodies, details were kept confidential until a more comprehensive study could be performed in pursuit of potential mitigations and for ‘patches’… to be developed and deployed.”

Details of that original test are being made available today as part of a much larger analysis that extends the approach to a large range of toxic proteins. Starting with 72 toxins, the researchers used three open source AI packages to generate a total of about 75,000 potential protein variants.

And this is where things get a little complicated. Many of the AI-designed protein variants are going to end up being non-functional, either subtly or catastrophically failing to fold up into the correct configuration to create an active toxin. The only way to know which ones work is to make the proteins and test them biologically; most AI protein design efforts will make actual proteins from dozens to hundreds of the most promising-looking potential designs to find a handful that are active. But doing that for 75,000 designs is completely unrealistic.

Instead, the researchers used two software-based tools to evaluate each of the 75,000 designs. One of these focuses on the similarity between the overall predicted physical structure of the proteins, and the other looks at the predicted differences between the positions of individual amino acids. Either way, they’re a rough approximation of just how similar the proteins formed by two strings of amino acids should be. But they’re definitely not a clear indicator of whether those two proteins would be equally functional.

In any case, DNA sequences encoding all 75,000 designs were fed into the software that screens DNA orders for potential threats. One thing that was very clear is that there were huge variations in the ability of the four screening programs to flag these variant designs as threatening. Two of them seemed to do a pretty good job, one was mixed, and another let most of them through. Three of the software packages were updated in response to this performance, which significantly improved their ability to pick out variants.

There was also a clear trend in all four screening packages: The closer the variant was to the original structurally, the more likely the package (both before and after the patches) was to be able to flag it as a threat. In all cases, there was also a cluster of variant designs that were unlikely to fold into a similar structure, and these generally weren’t flagged as threats.

What does this mean?

Again, it’s important to emphasize that this evaluation is based on predicted structures; “unlikely” to fold into a similar structure to the original toxin doesn’t mean these proteins will be inactive as toxins. Functional proteins are probably going to be very rare among this group, but there may be a handful in there. That handful is also probably rare enough that you would have to order up and test far too many designs to find one that works, making this an impractical threat vector.

At the same time, there are also a handful of proteins that are very similar to the toxin structurally and not flagged by the software. For the three patched versions of the software, the ones that slip through the screening represent about 1 to 3 percent of the total in the “very similar” category. That’s not great, but it’s probably good enough that any group that tries to order up a toxin by this method would attract attention because they’d have to order over 50 just to have a good chance of finding one that slipped through, which would raise all sorts of red flags.
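To put numbers on that "over 50" estimate, here is a minimal sketch of the arithmetic. It assumes each order independently slips past screening with the same probability, which is a simplification of ours and not something the study states. The function name is hypothetical.

```python
def chance_of_at_least_one(p_slip: float, n_orders: int) -> float:
    """Probability that at least one of n independent orders evades screening."""
    return 1.0 - (1.0 - p_slip) ** n_orders

# Slip-through rates of 1 to 3 percent, as reported for the patched software.
for p_slip in (0.01, 0.02, 0.03):
    chance = chance_of_at_least_one(p_slip, 50)
    print(f"slip rate {p_slip:.0%}, 50 orders: {chance:.0%} chance of one getting through")
```

Under those assumptions, 50 orders give odds of roughly 40 percent to nearly 80 percent that at least one gets through, and every other order in the batch would be flagged.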

One other notable result is that the designs that weren’t flagged were mostly variants of just a handful of toxin proteins. So this is less of a general problem with the screening software and might be more of a small set of focused problems. Of note, one of the proteins that produced a lot of unflagged variants isn’t toxic itself; instead, it’s a co-factor necessary for the actual toxin to do its thing. As such, some of the screening software packages didn’t even flag the original protein as dangerous, much less any of its variants. (For these reasons, the company that makes one of the better-performing software packages decided the threat here wasn’t significant enough to merit a security patch.)

So, on its own, this work doesn’t seem to have identified something that’s a major threat at the moment. But it’s probably useful, in that it’s a good thing to get the people who engineer the screening software to start thinking about emerging threats.

That’s because, as the people behind this work note, AI protein design is still in its early stages, and we’re likely to see considerable improvements. And there’s likely to be a limit to the sorts of things we can screen for. We’re already at the point where AI protein design tools can be used to create proteins that have entirely novel functions and do so without starting with variants of existing proteins. In other words, we can design proteins that are impossible to screen for based on similarity to known threats, because they don’t look at all like anything we know is dangerous.

Protein-based toxins would be very difficult to design, because they have to both cross the cell membrane and then do something dangerous once inside. While AI tools are probably unable to design something that sophisticated at the moment, I would be hesitant to rule out the prospects of them eventually reaching that sort of sophistication.

Science, 2025. DOI: 10.1126/science.adu8578

John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.


Ars Live: Is the AI bubble about to pop? A live chat with Ed Zitron.

As generative AI has taken off since ChatGPT’s debut, inspiring hundreds of billions of dollars in investments and infrastructure developments, the top question on many people’s minds has been: Is generative AI a bubble, and if so, when will it pop?

To help us potentially answer that question, I’ll be hosting a live conversation with prominent AI critic Ed Zitron on October 7 at 3:30 pm ET as part of the Ars Live series. As Ars Technica’s senior AI reporter, I’ve been tracking both the explosive growth of this industry and the mounting skepticism about its sustainability.

You can watch the discussion live on YouTube when the time comes.

Zitron is the host of the Better Offline podcast and CEO of EZPR, a media relations company. He writes the newsletter Where’s Your Ed At, where he frequently dissects OpenAI’s finances and questions the actual utility of current AI products. His recent posts have examined whether companies are losing money on AI investments, the economics of GPU rentals, OpenAI’s trillion-dollar funding needs, and what he calls “The Subprime AI Crisis.”

During our conversation, we’ll dig into whether the current AI investment frenzy matches the actual business value being created, what happens when companies realize their AI spending isn’t generating returns, and whether we’re seeing signs of a peak in the current AI hype cycle. We’ll also discuss what it’s like to be a prominent and sometimes controversial AI critic amid the drumbeat of AI mania in the tech industry.

While Ed and I don’t see eye to eye on everything, his sharp criticism of the AI industry’s excesses should make for an engaging discussion about one of tech’s most consequential questions right now.

Please join us for what should be a lively conversation about the sustainability of the current AI boom.


Meta won’t allow users to opt out of targeted ads based on AI chats

Facebook, Instagram, and WhatsApp users may want to be extra careful while using Meta AI, as Meta has announced that it will soon be using AI interactions to personalize content and ad recommendations without giving users a way to opt out.

Meta plans to notify users on October 7 that their AI interactions will influence recommendations beginning on December 16. However, it may not be immediately obvious to all users that their AI interactions will be used in this way.

The company’s blog noted that the initial notification users will see only says, “Learn how Meta will use your info in new ways to personalize your experience.” Users will have to click through to understand that the changes specifically apply to Meta AI, with a second screen explaining, “We’ll start using your interactions with AIs to personalize your experience.”

Ars asked Meta why the initial notification doesn’t directly mention AI, and Meta spokesperson Emil Vazquez said he “would disagree with the idea that we are obscuring this update in any way.”

“We’re sending notifications and emails to people about this change,” Vazquez said. “As soon as someone clicks on the notification, it’s immediately apparent that this is an AI update.”

In its blog post, Meta noted that “more than 1 billion people use Meta AI every month,” stating that its goal is to improve the way Meta AI works in order to fuel better experiences on all Meta apps. Sensitive “conversations with Meta AI about topics such as their religious views, sexual orientation, political views, health, racial or ethnic origin, philosophical beliefs, or trade union membership” will not be used to target ads, Meta confirmed.

“You’re in control,” Meta’s blog said, reiterating that users can “choose” how they “interact with AIs,” unlink accounts on different apps to limit AI tracking, or adjust ad and content settings at any time. But once the tracking starts on December 16, users will not have the option to opt out of targeted ads based on AI chats, Vazquez confirmed, emphasizing to Ars that “there isn’t an opt out for this feature.”


Why iRobot’s founder won’t go within 10 feet of today’s walking robots

In his post, Brooks recounts being “way too close” to an Agility Robotics Digit humanoid when it fell several years ago. He has not dared approach a walking one since. Even in promotional videos from humanoid companies, Brooks notes, humans are never shown close to moving humanoid robots unless separated by furniture, and even then, the robots only shuffle minimally.

This safety problem extends beyond accidental falls. For humanoids to fulfill their promised role in health care and factory settings, they need certification to operate in zones shared with humans. Current walking mechanisms make such certification virtually impossible under existing safety standards in most parts of the world.

The humanoid Apollo robot. Credit: Google

Brooks predicts that within 15 years, there will indeed be many robots called “humanoids” performing various tasks. But ironically, they will look nothing like today’s bipedal machines. They will have wheels instead of feet, varying numbers of arms, and specialized sensors that bear no resemblance to human eyes. Some will have cameras in their hands or looking down from their midsections. The definition of “humanoid” will shift, just as “flying cars” now means electric helicopters rather than road-capable aircraft, and “self-driving cars” means vehicles with remote human monitors rather than truly autonomous systems.

The billions currently being invested in forcing today’s rigid, vision-only humanoids to learn dexterity will largely disappear, Brooks argues. Academic researchers are making more progress with systems that incorporate touch feedback, like MIT’s approach using a glove that transmits sensations between human operators and robot hands. But even these advances remain far from the comprehensive touch sensing that enables human dexterity.

Today, few people spend their days near humanoid robots, but Brooks’ 3-meter rule stands as a practical warning of challenges ahead from someone who has spent decades building these machines. The gap between promotional videos and deployable reality remains large, measured not just in years but in fundamental unsolved problems of physics, sensing, and safety.


OpenAI mocks Musk’s math in suit over iPhone/ChatGPT integration


“Fraction of a fraction of a fraction”

xAI’s claim that Apple gave ChatGPT a monopoly on prompts is “baseless,” OpenAI says.

OpenAI and Apple have moved to dismiss a lawsuit by Elon Musk’s xAI, which alleges that ChatGPT’s integration into a “handful” of iPhone features violated antitrust laws by giving OpenAI a monopoly on prompts and Apple a new path to block rivals in the smartphone industry.

The lawsuit was filed in August after Musk raged on X about Apple never listing Grok on its editorially curated “Must Have” apps list, which ChatGPT frequently appeared on.

According to Musk, Apple linking ChatGPT to Siri and other native iPhone features gave OpenAI exclusive access to billions of prompts that only OpenAI can use as valuable training data to maintain its dominance in the chatbot market. However, OpenAI and Apple are now mocking Musk’s math in court filings, urging the court to agree that xAI’s lawsuit is doomed.

As OpenAI argued, the estimates in xAI’s complaint seemed “baseless,” with Musk hesitant to even “hazard a guess” at what portion of the chatbot market is being foreclosed by the OpenAI/Apple deal.

xAI suggested that the ChatGPT integration may give OpenAI “up to 55 percent” of the potential chatbot prompts in the market, which could mean anywhere from 0 to 55 percent, OpenAI and Apple noted.

Musk’s company apparently arrived at this vague estimate by doing “back-of-the-envelope math,” and the court should reject his complaint, OpenAI argued. That math “was evidently calculated by assuming that Siri fields ‘1.5 billion user requests per day globally,’ then dividing that quantity by the ‘total prompts for generative AI chatbots in 2024,'”—”apparently 2.7 billion per day,” OpenAI explained.

These estimates “ignore the facts” that “ChatGPT integration is only available on the latest models of iPhones, which allow users to opt into the integration,” OpenAI argued. Even users who opt in must link their ChatGPT account before OpenAI can train on their data, OpenAI said, further restricting the potential prompt pool.

By Musk’s own logic, OpenAI alleged, “the relevant set of Siri prompts thus cannot plausibly be 1.5 billion per day, but is instead an unknown, unpleaded fraction of a fraction of a fraction of that number.”
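The back-of-the-envelope math that OpenAI describes can be reproduced in a few lines. The two daily figures are the ones quoted in the filings; the function and its optional fractions argument are our own illustration. Neither filing puts a number on the "fraction of a fraction of a fraction," so none is supplied here.

```python
from math import prod

SIRI_REQUESTS_PER_DAY = 1.5e9    # figure quoted from xAI's complaint
CHATBOT_PROMPTS_PER_DAY = 2.7e9  # 2024 figure, as described by OpenAI

def prompt_share(siri_requests: float, total_prompts: float, *fractions: float) -> float:
    """Share of all chatbot prompts, after scaling Siri requests by each fraction."""
    return siri_requests * prod(fractions) / total_prompts

# With no limiting fractions applied, this is the ceiling behind "up to 55 percent".
upper_bound = prompt_share(SIRI_REQUESTS_PER_DAY, CHATBOT_PROMPTS_PER_DAY)
print(f"{upper_bound:.1%}")
```

Each limit that OpenAI lists (newest iPhones only, users who opt in, users who link an account) would enter as another fraction below 1, so the real share can only be smaller than this ceiling.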

Additionally, OpenAI mocked Musk for using 2024 statistics, writing that xAI failed to explain “the logic of using a year-old estimate of the number of prompts when the pleadings elsewhere acknowledge that the industry is experiencing ‘exponential growth.'”

Apple’s filing agreed that Musk’s calculations “stretch logic,” appearing “to rest on speculative and implausible assumptions that the agreement gives ChatGPT exclusive access to all Siri requests from all Apple devices (including older models), and that OpenAI may use all such requests to train ChatGPT and achieve scale.”

“Not all Siri requests” result in ChatGPT prompts that OpenAI can train on, Apple noted, “even by users who have enabled devices and opt in.”

OpenAI reminds court of Grok’s MechaHitler scandal

OpenAI argued that Musk’s lawsuit is part of a pattern of harassment that OpenAI previously described as “unrelenting” since ChatGPT’s successful debut, alleging it was “the latest effort by the world’s wealthiest man to stifle competition in the world’s most innovative industry.”

As OpenAI sees it, “Musk’s pretext for litigation this time is that Apple chose to offer ChatGPT as an optional add-on for several built-in applications on its latest iPhones,” without giving Grok the same deal. But OpenAI noted that the integration was rolled out around the same time that Musk removed “woke filters” that caused Grok to declare itself “MechaHitler.” For Apple, it was a business decision to avoid Grok, OpenAI argued.

Apple did not reference the Grok scandal in its filing but in a footnote confirmed that “vetting of partners is particularly important given some of the concerns about generative AI chatbots, including on child safety issues, nonconsensual intimate imagery, and ‘jailbreaking’—feeding input to a chatbot so it ignores its own safety guardrails.”

A similar logic was applied to Apple’s decision not to highlight Grok as a “Must Have” app, their filing said. After Musk’s public rant about Grok’s exclusion on X, “Apple employees explained the objective reasons why Grok was not included on certain lists, and identified app improvements,” Apple noted, but instead of making changes, xAI filed the lawsuit.

Also taking time to point out the obvious, Apple argued that Musk was fixated on the fact that his charting apps never make the “Must Have Apps” list, suggesting that Apple’s picks should always mirror “Top Charts,” which tracks popular downloads.

“That assumes that the Apple-curated Must-Have Apps List must be distorted if it does not strictly parrot App Store Top Charts,” Apple argued. “But that assumption is illogical: there would be little point in maintaining a Must-Have Apps List if all it did was restate what Top Charts say, rather than offer Apple’s editorial recommendations to users.”

Likely most relevant to the antitrust charges, Apple accused Musk of improperly arguing that “Apple cannot partner with OpenAI to create an innovative feature for iPhone users without simultaneously partnering with every other generative AI chatbot—regardless of quality, privacy or safety considerations, technical feasibility, stage of development, or commercial terms.”

“No facts plausibly” support xAI’s “assertion that Apple intentionally ‘deprioritized'” xAI apps “as part of an illegal conspiracy or monopolization scheme,” Apple argued.

And most glaringly, Apple noted that xAI is not a rival or consumer in the smartphone industry, where it alleges competition is being harmed. Apple urged the court to reject Musk’s theory that Apple is incentivized to boost OpenAI to prevent xAI’s ascent in building a “super app” that would render smartphones obsolete. If Musk’s super app dream is even possible, Apple argued, it’s at least a decade off, insisting that as-yet-undeveloped apps should not serve as the basis for blocking Apple’s measured plan to better serve customers with sophisticated chatbot integration.

“Antitrust laws do not require that, and for good reason: imposing such a rule on businesses would slow innovation, reduce quality, and increase costs, all ultimately harming the very consumers the antitrust laws are meant to protect,” Apple argued.

Musk’s weird smartphone market claim, explained

Apple alleged that Musk’s “grievance” can be “reduced to displeasure that Apple has not yet ‘integrated with any other generative AI chatbots’ beyond ChatGPT, such as those created by xAI, Google, and Anthropic.”

In a footnote, the smartphone giant noted that by xAI’s logic, Musk’s social media platform X “may be required to integrate all other chatbots—including ChatGPT—on its own social media platform.”

But antitrust law doesn’t work that way, Apple argued, urging the court to reject xAI’s claims of alleged market harms that “rely on a multi-step chain of speculation on top of speculation.” As Apple summarized, xAI contends that “if Apple never integrated ChatGPT,” xAI could win in both chatbot and smartphone markets, but only if:

1. Consumers would choose to send additional prompts to Grok (rather than other generative AI chatbots).

2. The additional prompts would result in Grok achieving scale and quality it could not otherwise achieve.

3. As a result, the X app would grow in popularity because it is integrated with Grok.

4. X and xAI would therefore be better positioned to build so-called “super apps” in the future, which the complaint defines as “multi-functional” apps that offer “social connectivity and messaging, financial services, e-commerce, and entertainment.”

5. Once developed, consumers might choose to use X’s “super app” for various functions.

6. “Super apps” would replace much of the functionality of smartphones and consumers would care less about the quality of their physical phones and rely instead on these hypothetical “super apps.”

7. Smartphone manufacturers would respond by offering more basic models of smartphones with less functionality.

8. iPhone users would decide to replace their iPhones with more “basic smartphones” with “super apps.”

Apple insisted that nothing in its OpenAI deal prevents Musk from building his super apps, while noting that Musk, having integrated Grok into X, understands that integrating a single chatbot is a “major undertaking” that requires “substantial investment.” That “concession” alone “underscores the massive resources Apple would need to devote to integrating every AI chatbot into Apple Intelligence,” while navigating potential user safety risks.

The iPhone maker also reminded the court that it has always planned to integrate other chatbots into its native features after investing in and testing Apple Intelligence’s performance, relying on what Apple deems to be the best chatbot on the market today.

Backing Apple up, OpenAI noted that Musk’s complaint seemed to cherry-pick testimony from Google CEO Sundar Pichai, claiming that “Google could not reach an agreement to integrate” Gemini “with Apple because Apple had decided to integrate ChatGPT.”

“The full testimony recorded in open court reveals Mr. Pichai attesting to his understanding that ‘Apple plans to expand to other providers for Generative AI distribution’ and that ‘[a]s CEO of Google, [he is] hoping to execute a Gemini distribution agreement with Apple’ later in 2025,” OpenAI argued.

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.


Can today’s AI video models accurately model how the real world works?

But on other tasks, the model showed much more variable results. When asked to generate a video highlighting a specific written character on a grid, for instance, the model failed in nine out of 12 trials. When asked to model a Bunsen burner turning on and burning a piece of paper, it similarly failed nine out of 12 times. When asked to solve a simple maze, it failed in 10 of 12 trials. When asked to sort numbers by popping labeled bubbles in order, it failed a whopping 11 out of 12 times.

For the researchers, though, all of the above examples aren’t evidence of failure but instead a sign of the model’s capabilities. To be listed under the paper’s “failure cases,” Veo 3 had to fail a tested task across all 12 trials, which happened in 16 of the 62 tasks tested. For the rest, the researchers write that “a success rate greater than 0 suggests that the model possesses the ability to solve the task.”

Thus, failing 11 out of 12 trials of a certain task is considered evidence for the model’s capabilities in the paper. That evidence of the model “possess[ing] the ability to solve the task” includes 18 tasks where the model failed in more than half of its 12 trial runs and another 14 where it failed in 25 to 50 percent of trials.

Past results, future performance

Yes, in all of these cases, the model did technically demonstrate the capability being tested at some point. But the model’s inability to perform that task reliably means that, in practice, it won’t be performant enough for most use cases. Any future model that could become one of the “unified, generalist vision foundation models” will have to be able to succeed much more consistently on these kinds of tests.
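To make the gap between “succeeded at least once” and “succeeds reliably” concrete, here is a minimal Python sketch built only from the trial counts reported above. The `shows_ability` check mirrors the paper’s quoted bar (a success rate greater than 0). The `is_reliable` check and its 90 percent threshold are hypothetical, chosen purely for illustration and not taken from the paper. The final count of tasks failing in under 25 percent of trials is derived by subtraction and assumes the reported buckets do not overlap.

```python
from fractions import Fraction

TRIALS_PER_TASK = 12

# Failures out of 12 trials, as reported for four of the tested tasks.
reported_failures = {
    "highlight a written character on a grid": 9,
    "Bunsen burner burning a piece of paper": 9,
    "solve a simple maze": 10,
    "sort numbers by popping labeled bubbles": 11,
}


def success_rate(failures, trials=TRIALS_PER_TASK):
    """Fraction of trials in which the model succeeded."""
    return Fraction(trials - failures, trials)


def shows_ability(failures, trials=TRIALS_PER_TASK):
    """The paper's quoted bar: any success rate greater than 0."""
    return success_rate(failures, trials) > 0


def is_reliable(failures, trials=TRIALS_PER_TASK, threshold=Fraction(9, 10)):
    """Hypothetical stricter bar (the threshold is an assumption, not from the paper)."""
    return success_rate(failures, trials) >= threshold


for task, failures in reported_failures.items():
    rate = success_rate(failures)
    print(
        f"{task}: {float(rate):.0%} success, "
        f"ability={shows_ability(failures)}, reliable={is_reliable(failures)}"
    )

# Task-level tallies reported in the article.
TOTAL_TASKS = 62
ALWAYS_FAILED = 16            # failed all 12 trials ("failure cases")
FAILED_MORE_THAN_HALF = 18
FAILED_25_TO_50_PERCENT = 14

tasks_with_some_success = TOTAL_TASKS - ALWAYS_FAILED
# Derived by subtraction, assuming the two failure buckets above are disjoint.
tasks_failing_under_25_percent = (
    tasks_with_some_success - FAILED_MORE_THAN_HALF - FAILED_25_TO_50_PERCENT
)
print(f"Tasks with at least one success: {tasks_with_some_success}")
print(f"Tasks failing in under 25% of trials: {tasks_failing_under_25_percent}")
```

Under these assumptions, every one of the four example tasks clears the paper’s bar while falling well short of the stricter one, which is the distinction the paragraph above draws.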


Google’s Gemini-powered smart home revamp is here with a new app and cameras


Google promises a better smart home experience thanks to Gemini.

Google’s new Nest cameras keep the same look. Credit: Google

Google’s products and services have been flooded with AI features over the past couple of years, but smart home has been largely spared until now. The company’s plans to replace Assistant are moving forward with a big Google Home reset. We’ve been told over and over that generative AI will do incredible things when given enough data, and here’s the test.

There’s a new Home app with Gemini intelligence throughout the experience, updated subscriptions, and even some new hardware. The revamped Home app will allegedly gain deeper insights into what happens in your home, unlocking advanced video features and conversational commands. It demos well, but will it make smart home tech less or more frustrating?

A new Home

You may have already seen some elements of the revamped Home experience percolating to the surface, but that process begins in earnest today. The new app apparently boosts speed and reliability considerably, with camera feeds loading 70 percent faster and with 80 percent fewer app crashes. The app will also bring new Gemini features, some of which are free. Google’s new Home subscription retains the same price as the old Nest subs, but naturally, there’s a lot more AI.

Google claims that Gemini will make your smart home easier to monitor and manage. All that video streaming from your cameras churns through the AI, which interprets the goings on. As a result, you get features like AI-enhanced notifications that give you more context about what your cameras saw. For instance, your notifications will include descriptions of activity, and Home Brief will summarize everything that happens each day.

The new Home app has a simpler three-tab layout. Credit: Google

Conversational interaction is also a big part of this update. In the Home app, subscribers will see a new Ask Home bar where they can enter natural-language queries. For example, you could ask if a certain person has left or returned home, or whether or not your package showed up. At least, that’s what’s supposed to happen—generative AI can get things wrong.

The new app comes with new subscriptions based around AI, but the tiers don’t cost any more than the old Nest plans, and they include all the same video features. The base $10 subscription, now known as Standard, includes 30 days of video event history, along with Gemini automation features and the “intelligent alerts” Home has used for a while that can alert you to packages, familiar faces, and so on. The $20 subscription is becoming Home Advanced, which adds the conversational Ask Home feature in the app, AI notifications, AI event descriptions, and a new “Home Brief.” It also still offers 60 days of events and 10 days of 24/7 video history.

Gemini is supposed to help you keep tabs on what’s happening at home. Credit: Google

Free users still get saved event video history, and it’s been boosted from three hours to six. If you’re not subscribing at all or are on the $10 Standard plan, the Ask Home bar that is persistent across the app will become a quick search, which surfaces devices and settings.

If you’re already subscribing to Google’s AI services, this change could actually save you some cash. Anyone with Google AI Pro (a $20 sub) will get Home Standard for free. If you’re paying for the lavish $250 per month AI Ultra plan, you get Home Advanced at no additional cost.

A proving ground for AI

You may have gotten used to Assistant over the past decade in spite of its frequent feature gaps, but you’ll have to leave it behind. Gemini for Home will be taking over beginning this month in early access. The full release will come later, but Google intends to deliver the Gemini-powered smart home experience to as many users as possible.

Gemini will replace Assistant on every first-party Google Home device, going all the way back to the original 2016 Google Home. You’ll be able to have live chats with Gemini via your smart speakers and make more complex smart home queries. Google is making some big claims about contextual understanding here.

If Google’s embrace of generative AI pays off, we’ll see it here. Credit: Google

If you’ve used Gemini Live, the new Home interactions will seem familiar. You can ask Gemini anything you want via your smart speakers, perhaps getting help with a recipe or an appliance issue. However, the robot will sometimes just keep talking long past the point it’s helpful. As with Gemini Live, you just have to interrupt the robot sometimes. Google also promises a selection of improved voices to interrupt.

If you want to get early access to the new Gemini Home features, you can sign up in the Home app settings. Just look for the “Early access” option. Google doesn’t guarantee access on a specific timeline, but the first people will be allowed to try the new Gemini Home this month.

New AI-first hardware

It has been four years since Google released new smart home devices, but the era of Gemini brings some new hardware. There are three new cameras, all with 2K image sensors. The new Nest Indoor camera will retail for $100, and the Nest Outdoor Camera will cost $150 (or $250 in a two-pack). There’s also a new Nest Doorbell, which requires a wired connection, for $180.

Google says these cameras were designed with generative AI in mind. The sensor choice allows for good detail even if you need to digitally zoom in, but the video feed is still small enough to be ingested by Google’s AI models as it’s created. This is what gives the new Home app the ability to provide rich updates on your smart home.

The new Nest Doorbell looks familiar. Credit: Google

You may also notice there are no battery-powered models in the new batch. Again, that’s because of AI. A battery-powered camera wakes up only momentarily when the system logs an event, but this approach isn’t as useful for generative AI. Providing the model with an ongoing video stream gives it better insights into the scene and, theoretically, produces better insights for the user.

All the new cameras are available for order today, but Google has one more device queued up for a later release. The “Google Home Speaker” is Google’s first smart speaker release since 2020’s Nest Audio. This device is smaller than the Nest Audio but larger than the Nest Mini speakers. It supports 360-degree audio with custom on-device processing that reportedly makes conversing with Gemini smoother. It can also be paired with the Google TV Streamer for home theater audio. It will be available this coming spring for $99.

The new Google Home Speaker comes out next spring. Credit: Ryan Whitwam

Google Home will continue to support a wide range of devices, but most of them won’t connect to all the advanced Gemini AI features. However, that could change. Google has also announced a new program for partners to build devices that work with Gemini alongside the Nest cameras. Devices built with the new Google Camera embedded SDK will begin appearing in the coming months, but Walmart’s Onn brand has two ready to go. The Onn Indoor camera retails for $22.96 and the Onn Video Doorbell is $49.86. Both cameras are 1080p resolution and will talk to Gemini just like Google’s cameras. So you may have more options to experience Google’s vision for the AI home of the future.

Ryan Whitwam is a senior technology reporter at Ars Technica, covering the ways Google, AI, and mobile technology continue to change the world. Over his 20-year career, he’s written for Android Police, ExtremeTech, Wirecutter, NY Times, and more. He has reviewed more phones than most people will ever own. You can follow him on Bluesky, where you will see photos of his dozens of mechanical keyboards.


The AI slop drops right from the top, as Trump posts vulgar deepfake of opponents

AI poses an obvious danger to the millennia-long human fight to find the truth. Large language model “hallucinations,” vocal deepfakes, and now increased use of video deepfakes have all had a blurring effect on facts, letting bad actors around the globe brush off even recorded events as mere “fake news.”

The danger is perhaps most acute in the political realm, where deepfake audio and video can make any politician say or appear to do anything. In such a climate, our most senior elected officials have a special duty to model truth-seeking behavior and responsible AI use.

But what’s the fun in that, when you can just blow up negotiations over a budget impasse by posting a deepfake video of your political opponents calling themselves “a bunch of woke pieces of shit” while mariachi music plays in the background? Oh—and did I mention the fake mustache? Or the CGI sombrero?

On Monday night, the president of the United States, a man with access to the greatest intelligence-gathering operation in the world, posted to his Truth Social account a 35-second AI-generated video filled with crude insults, racial overtones, and bizarre conspiracy theories. The video targeted two Democratic leaders who had recently been meeting with Trump over a possible agreement to fund the government; I would have thought this kind of video was a pretty poor way to get people to agree with you, but, apparently, AI-generated insults are the real “art of the deal.”

In the clip, a deepfake version of Sen. Chuck Schumer (D-N.Y.) utters a surreal monologue as his colleague Rep. Hakeem Jeffries (D-N.Y.) looks on… in a sombrero.


California’s newly signed AI law just gave Big Tech exactly what it wanted

On Monday, California Governor Gavin Newsom signed the Transparency in Frontier Artificial Intelligence Act into law, requiring AI companies to disclose their safety practices while stopping short of mandating actual safety testing. The law requires companies with annual revenues of at least $500 million to publish safety protocols on their websites and report incidents to state authorities, but it lacks the stronger enforcement teeth of the bill Newsom vetoed last year after tech companies lobbied heavily against it.

The legislation, S.B. 53, replaces Senator Scott Wiener’s previous attempt at AI regulation, S.B. 1047, which would have required safety testing and “kill switches” for AI systems. Instead, the new law asks companies to describe how they incorporate “national standards, international standards, and industry-consensus best practices” into their AI development, without specifying what those standards are or requiring independent verification.

“California has proven that we can establish regulations to protect our communities while also ensuring that the growing AI industry continues to thrive,” Newsom said in a statement, though the law’s actual protective measures remain largely voluntary beyond basic reporting requirements.

According to the California state government, the state houses 32 of the world’s top 50 AI companies, and more than half of global venture capital funding for AI and machine learning startups went to Bay Area companies last year. So while the recently signed bill is state-level legislation, what happens in California AI regulation will have a much wider impact, both by legislative precedent and by affecting companies that craft AI systems used around the world.

Transparency instead of testing

Where the vetoed S.B. 1047 would have mandated safety testing and kill switches for AI systems, the new law focuses on disclosure. Companies must report what the state calls “potential critical safety incidents” to California’s Office of Emergency Services and provide whistleblower protections for employees who raise safety concerns. The law defines catastrophic risk narrowly as incidents potentially causing 50+ deaths or $1 billion in damage through weapons assistance, autonomous criminal acts, or loss of control. The attorney general can levy civil penalties of up to $1 million per violation for noncompliance with these reporting requirements.
