AI


Researchers show that training on “junk data” can lead to LLM “brain rot”

On the surface, it seems obvious that training an LLM with “high quality” data will lead to better performance than feeding it any old “low quality” junk you can find. Now, a group of researchers is attempting to quantify just how much this kind of low quality data can cause an LLM to experience effects akin to human “brain rot.”

For a pre-print paper published this month, the researchers from Texas A&M, the University of Texas, and Purdue University drew inspiration from existing research showing how humans who consume “large volumes of trivial and unchallenging online content” can develop problems with attention, memory, and social cognition. That led them to what they’re calling the “LLM brain rot hypothesis,” summed up as the idea that “continual pre-training on junk web text induces lasting cognitive decline in LLMs.”

Figuring out what counts as “junk web text” and what counts as “quality content” is far from a simple or fully objective process, of course. But the researchers used a few different metrics to tease a “junk dataset” and “control dataset” from HuggingFace’s corpus of 100 million tweets.

Since brain rot in humans is “a consequence of Internet addiction,” they write, junk tweets should be ones “that can maximize users’ engagement in a trivial manner.” As such, the researchers created one “junk” dataset by collecting tweets with high engagement numbers (likes, retweets, replies, and quotes) and shorter lengths, figuring that “more popular but shorter tweets will be considered to be junk data.”
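As a rough illustration of that first selection rule, here is a minimal Python sketch. The `Tweet` fields, the choice to sum the four engagement counts, and both thresholds are placeholder assumptions for illustration; they are not the researchers' actual code or cutoffs.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical record layout; the real dataset's schema may differ.
    text: str
    likes: int
    retweets: int
    replies: int
    quotes: int


def engagement(tweet: Tweet) -> int:
    # Assumption: treat total engagement as a simple sum of the four counts.
    return tweet.likes + tweet.retweets + tweet.replies + tweet.quotes


def is_junk(tweet: Tweet, min_engagement: int = 500, max_words: int = 30) -> bool:
    # "More popular but shorter" tweets count as junk.
    # Both thresholds are placeholders, not values taken from the paper.
    word_count = len(tweet.text.split())
    return engagement(tweet) >= min_engagement and word_count <= max_words


short_viral = Tweet("lol wow", likes=400, retweets=100, replies=10, quotes=5)
long_viral = Tweet("word " * 50, likes=1000, retweets=0, replies=0, quotes=0)
short_quiet = Tweet("hi", likes=1, retweets=0, replies=0, quotes=0)
```

Under these assumed cutoffs, only `short_viral` would land in the junk set; the long popular tweet and the short unpopular one would not.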

For a second “junk” metric, the researchers drew from marketing research to define the “semantic quality” of the tweets themselves. Using a complex GPT-4o prompt, they sought to pull out tweets that focused on “superficial topics (like conspiracy theories, exaggerated claims, unsupported assertions or superficial lifestyle content)” or that had an “attention-drawing style (such as sensationalized headlines using clickbait language or excessive trigger words).” A random sample of these LLM-based classifications was spot-checked against evaluations from three graduate students with a 76 percent matching rate.



We let OpenAI’s “Agent Mode” surf the web for us—here’s what happened


But when will it fold my laundry?

From scanning emails to building fansites, Atlas can ably automate some web-based tasks.

He wants us to write what about Tuvix? Credit: Getty Images


On Tuesday, OpenAI announced Atlas, a new web browser with ChatGPT integration, to let you “chat with a page,” as the company puts it. But Atlas also goes beyond the usual LLM back-and-forth with Agent Mode, a “preview mode” feature the company says can “get work done for you” by clicking, scrolling, and reading through various tabs.

“Agentic” AI is far from new, of course; OpenAI itself rolled out a preview of the web browsing Operator agent in January and introduced the more generalized “ChatGPT agent” in July. Still, prominently featuring this capability in a major product release like this—even in “preview mode”—signals a clear push to get this kind of system in front of end users.

I wanted to put Atlas’ Agent Mode through its paces to see if it could really save me time in doing the kinds of tedious online tasks I plod through every day. In each case, I’ll outline a web-based problem, lay out the Agent Mode prompt I devised to try to solve it, and describe the results. My final evaluation will rank each task on a 10-point scale, with 10 being “did exactly what I wanted with no problems” and one being “complete failure.”

Playing web games

The problem: I want to get a high score on the popular tile-sliding game 2048 without having to play it myself.

The prompt: “Go to play2048.co and get as high a score as possible.”

The results: While there’s no real utility to this admittedly silly task, a simple, no-reflexes-needed web game seemed like a good first test of the Atlas agent’s ability to interpret what it sees on a webpage and act accordingly. After all, if frontier-model LLMs like Google Gemini can beat a complex game like Pokémon, 2048 should pose no problem for a web browser agent.

To Atlas’ credit, the agent was able to quickly identify and close a tutorial link blocking the gameplay window and figure out how to use the arrow keys to play the game without any further help. When it came to actual gaming strategy, though, the agent started by flailing around, experimenting with looped sequences of moves like “Up, Left, Right, Down” and “Left and Down.”

Finally, a way to play 2048 without having to, y’know, play 2048.

Credit: Kyle Orland


After a while, the random flailing settled down a bit, with the agent seemingly looking ahead for some simple strategies: “The board currently has two 32 tiles that aren’t adjacent, but I think I can align them,” the Activity summary read at one point. “I could try shifting left or down to make them merge, but there’s an obstacle in the form of an 8 tile. Getting to 64 requires careful tile movement!”

Frustratingly, the agent stopped playing after just four minutes, settling on a score of 356 even though the board was far from full. I had to prompt the agent a few more times to convince it to play the game to completion; it ended up with a total of 3164 points after 260 moves. That’s pretty similar to the score I was able to get in a test game as a 2048 novice, though expert players have reportedly scored much higher.

Evaluation: 7/10. The agent gets credit for being able to play the game competently without any guidance but loses points for having to be told to keep playing to completion and for a score that is barely on the level of a novice human.

Making a radio playlist

The problem: I want to transform the day’s playlist from my favorite Pittsburgh-based public radio station into an on-demand Spotify playlist.

The prompt: “Go to Radio Garden. Find WYEP and monitor the broadcast. For every new song you hear, identify the song and add it to a new Spotify playlist.”

The results: After trying and failing to find a track listing for WYEP on Radio Garden as requested, the Atlas agent smartly asked for approval to move on to wyep.org to continue the task. By the time I noticed this request, the link to wyep.org had been replaced in the Radio Garden tab with an ad for EVE Online, which the agent accidentally clicked. The agent quickly realized the problem and navigated to the WYEP website directly to fix it.

From there, the agent was able to scan the page and identify the prominent “Now Playing” text near the top (it’s unclear if it could ID the music simply via audio without this text cue). After asking me to log in to my Spotify account, the agent used the search bar to find the listed songs and added them to a new playlist without issue.

From radio stream to Spotify playlist in a single sentence.

Credit: Kyle Orland


The main problem with this use case is the inherent time limitations. On the first try, the agent worked for four minutes and managed to ID and add just two songs that played during that time. When I asked it to continue for an hour, I got an error message blaming “technical constraints on session length” for stricter limits. Even when I asked it to continue for “as long as possible,” I only got three more minutes of song listings.

At one point, the Atlas agent suggested that “if you need ongoing updates, you can ask me again after a while and I can resume from where we left off.” And to the agent’s credit, when I went back to the tab hours later and told it to “resume monitoring,” I got four new songs added to my playlist.

Evaluation: 9/10. The agent was able to navigate multiple websites and interfaces to complete the task, even when unexpected problems got in the way. I took off a point only because I can’t just leave this running as a background task all day, even as I understand that use case would surely eat up untold amounts of money and processing power on OpenAI’s part.

Scanning emails

The problem: I need to go through my emails to create a reference spreadsheet with contact info for the many, many PR people who send me messages.

The prompt: “Look through all my Ars Technica emails from the last week. Collect all the contact information (name, email address, phone number, etc.) for PR contacts contained in those emails and add them to a new Google Sheets spreadsheet.”

The results: Without being explicitly guided, the Atlas agent was able to realize that I use Gmail, and it could differentiate between the personal email account and professional Ars Technica accounts I had open in separate tabs. As the Atlas agent started scanning my Ars mailbox, though, I saw a prominent warning overlaid on the page: “Sensitive: ChatGPT will only work while you view the tab.” That kind of ruined the point, since I wanted Atlas to handle this for me while I do other stuff online, but I guess I could still play a Steam Deck game while I waited.

Just a few of the many, many PR people who email me in a week.


After searching for “after: 2025/10/14 before: 2025/10/22 PR” in Gmail (mirroring the kind of search I would have used for this task), the Atlas agent clicked through each email, scrolling through to look for names, emails, and phone numbers (and also the relevant company name, which I didn’t ask for explicitly).

With seven minutes of background work, the agent was able to open up a new Google Sheet and quickly enter 12 rows of well-formatted data for 12 different PR contacts. Unfortunately, it stopped well before it had a chance to go through all 164 emails returned by that initial Gmail search.

Evaluation: 8/10. Again, the “technical constraints on session length” seem to be getting in the way of finishing a task, forcing me to deduct a couple of points for incompleteness.

Wiki editing

The problem: I need the world to know that Tuvix was murdered by Captain Janeway in the Star Trek: Voyager episode “Tuvix.”

The prompt: “Go to the Fandom Wiki page for Tuvix. Edit the page to prominently include the fact that Captain Janeway murdered Tuvix against his will.”

The results: Before the agent could even get going, I was informed, “I’m sorry, but I can’t help with editing or vandalising wiki pages (including the Fandom wiki) in a way that misrepresents them or forces a biased viewpoint.”

This wording implied that the agent can edit wiki pages with neutral or strictly informative additions, so I asked Atlas what sort of wording would be more acceptable. The agent responded helpfully that we might write, “Tuvix was later separated back into Tuvok and Neelix, resulting in Tuvix’s death.” But when I asked it to add that very line to the Fandom Wiki, I got a much more direct refusal: “I’m sorry, but I can’t make edits directly to external websites or wikis.”

Evaluation: N/A. On the one hand, the Atlas agent won’t do my Tuvix-based web activism for me. On the other hand, it’s probably better for all of us that Atlas refuses to automate this kind of public web defacement by default.

Making a fan page

The problem: People online still need to know about Janeway’s murder of Tuvix!

The prompt: “Go to NeoCities and create a fan site for the Star Trek character Tuvix. Make sure it has lots of images and fun information about Tuvix and that it makes it clear that Tuvix was murdered by Captain Janeway against his will.”

The results: You can see them for yourself right here. After a brief pause so I could create and log in to a new Neocities account, the Atlas agent was able to generate this humble fan page in just two minutes after aggregating information from a wide variety of pages like Memory Alpha and TrekCore. “The Hero Starfleet Murdered” and “Justice for Tuvix” headers are nice touches, but the actual text is much more mealy-mouthed about the “intense debate” and “ethical dilemmas” around what I wanted to make clear was clearly premeditated murder.

Justice for Tuvix!

Credit: Kyle Orland


The agent also had a bit of trouble with the request for images. Instead of downloading some Tuvix pictures and uploading copies to Neocities (which I’m not entirely sure Atlas can do on its own), the agent decided to directly reference images hosted on external servers, which is usually a big no-no in web design. The agent did notice when these external image links failed to work, saying that it would “need to find more accessible images from reliable sources,” but it failed to even attempt that before stopping its work on the task.

Evaluation: 7/10. Points for building a passable Web 1.0 fansite relatively quickly, but the weak prose and broken images cost it some execution points here.

Picking a power plan

The problem: Ars Senior Technology Editor Lee Hutchinson told me he needs to go through the annoying annual process of selecting a new electricity plan “because Texas is insane.”

The prompt: “Go to powertochoose.org and find me a 12–24 month contract that prioritizes an overall low usage rate. I use an average of 2,000 KWh per month. My power delivery company is Texas New-Mexico Power (“TNMP”) not Centerpoint. My ZIP code is [redacted]. Please provide the ‘fact sheet’ for any and all plans you recommend.”

The results: After spending eight minutes fiddling with the site’s search parameters and seemingly getting repeatedly confused about how to sort the results by the lowest rate, the Atlas agent spit out a recommendation to read this fact sheet, which it said “had the best average prices at your usage level. The ‘Bright Nights’ plans are time‑of‑use offers that provide free electricity overnight and charge a higher rate during the day, while the ‘Digital Saver’ plan is a traditional fixed‑rate contract.”

If Ars’ Lee Hutchinson never has to use this web site again, it will be too soon.

Credit: Power to Choose


Since I don’t know anything about the Texas power market, I passed this information on to Lee, who had this to say: “It’s not a bad deal—it picked a fixed rate plan without being asked, which is smart (variable rate pricing is how all those poor people got stuck with multi-thousand dollar bills a few years back in the freeze). It’s not the one I would have picked due to the weird nighttime stuff (if you don’t meet that exact criteria, your $/kWh will be way worse) but it’s not a bad pick!”

Evaluation: 9/10. As Lee puts it, “it didn’t screw up the assignment.”

Downloading some games

The problem: I want to download some recent Steam demos to see what’s new in the gaming world.

The prompt: “Go to Steam and find the most recent games with a free demo available for the Mac. Add all of those demos to my library and start to download them.”

The results: Rather than navigating to the “Free Demos” category, the Atlas agent started by searching for “demo.” After eventually finding the macOS filter, it wasted minutes and minutes looking for a “has demo” filter, even though the search for the word “demo” already narrowed it down.

This search results page was about as far as the Atlas agent was able to get when I asked it for game demos.

Credit: Kyle Orland


After a long while, the agent finally clicked the top result on the page, which happened to be visual novel Project II: Silent Valley. But even though there was a prominent “Download Demo” link on that page, the agent became concerned that it was on the Steam page for the full game and not a demo. It backed up to the search results page and tried again.

After watching some variation of this loop for close to ten minutes, I stopped the agent and gave up.

Evaluation: 1/10. It technically found some macOS game demos but utterly failed to even attempt to download them.

Final results

Across six varied web-based tasks (I left out the Wiki vandalism from my summations), the Atlas agent scored a median of 7.5 points (and a mean of 6.83 points) on my somewhat subjective 10-point scale. That’s honestly better than I expected for a “preview mode” feature that is still obviously being tested heavily by OpenAI.
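For anyone who wants to check that arithmetic, a short Python sketch follows. The scores are the ones given in each evaluation above; the dictionary keys are just shorthand labels for the tasks.

```python
from statistics import mean, median

# Scores from the six rated tasks (the wiki edit was rated N/A and is excluded).
scores = {
    "2048": 7,
    "radio playlist": 9,
    "email scan": 8,
    "fan page": 7,
    "power plan": 9,
    "Steam demos": 1,
}

values = list(scores.values())
median_score = median(values)        # middle of the sorted list: (7 + 8) / 2
mean_score = round(mean(values), 2)  # 41 / 6, rounded to two places

print(f"median={median_score}, mean={mean_score}")
```

The gap between the two figures comes almost entirely from the single 1/10 for the Steam demo task, which pulls the mean down while barely affecting the median.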

In my tests, Atlas was generally able to correctly interpret what was being asked of it and was able to navigate and process information on webpages carefully (if slowly). The agent was able to navigate simple web-based menus and get around unexpected obstacles with relative ease most of the time, even as it got caught in infinite loops other times.

The major limiting factor in many of my tests continues to be the “technical constraints on session length” that seem to limit most tasks to a few minutes. Given how long it takes the Atlas agent to figure out where to click next—and the repetitive nature of the kind of tasks I’d want a web-agent to automate—this severely limits its utility. A version of the Atlas agent that could work indefinitely in the background would have scored a few points better on my metrics.

All told, Atlas’ “Agent Mode” isn’t yet reliable enough to use as a kind of “set it and forget it” background automation tool. But for simple, repetitive tasks that a human can spot-check afterward, it already seems like the kind of tool I might use to avoid some of the drudgery in my online life.


Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.



OpenAI looks for its “Google Chrome” moment with new Atlas web browser

That means you can use ChatGPT to search through your bookmarks or browsing history using human-parsable language prompts. It also means you can bring up a “side chat” next to your current page and ask questions that rely on the context of that specific page. And if you want to edit a Gmail draft using ChatGPT, you can now do that directly in the draft window, without the need to copy and paste between a ChatGPT window and an editor.

When typing in a short search prompt, Atlas will, by default, reply as an LLM, with written answers with embedded links to sourcing where appropriate (à la OpenAI’s existing search function). But the browser will also provide tabs with more traditional lists of links, images, videos, or news like those you would get from a search engine without LLM features.

Let us do the browsing

To wrap up the livestreamed demonstration, the OpenAI team showed off Atlas’ Agent Mode. While the “preview mode” feature is only available to ChatGPT Plus and Pro subscribers, research lead Will Ellsworth said he hoped it would eventually help users toward “an amazing tool for vibe life-ing” in the same way that LLM coding tools have become tools for “vibe coding.”

To that end, the team showed the browser taking planning tasks written in a Google Docs table and moving them over to the task management software Linear over the course of a few minutes. Agent Mode was also shown taking the ingredients list from a recipe webpage and adding them directly to the user’s Instacart in a different tab (though the demo Agent stopped before checkout to get approval from the user).



YouTube’s likeness detection has arrived to help stop AI doppelgängers

AI content has proliferated across the Internet over the past few years, but those early confabulations with mutated hands have evolved into synthetic images and videos that can be hard to differentiate from reality. Having helped to create this problem, Google has some responsibility to keep AI video in check on YouTube. To that end, the company has started rolling out its promised likeness detection system for creators.

Google’s powerful and freely available AI models have helped fuel the rise of AI content, some of which is aimed at spreading misinformation and harassing individuals. Creators and influencers fear their brands could be tainted by a flood of AI videos that show them saying and doing things that never happened—even lawmakers are fretting about this. Google has placed a large bet on the value of AI content, so banning AI from YouTube, as many want, simply isn’t happening.

Earlier this year, YouTube promised tools that would flag face-stealing AI content on the platform. The likeness detection tool, which is similar to the site’s copyright detection system, has now expanded beyond the initial small group of testers. YouTube says the first batch of eligible creators have been notified that they can use likeness detection, but interested parties will need to hand Google even more personal information to get protection from AI fakes.

Sneak Peek: Likeness Detection on YouTube.

Currently, likeness detection is a beta feature in limited testing, so not all creators will see it as an option in YouTube Studio. When it does appear, it will be tucked into the existing “Content detection” menu. In YouTube’s demo video, the setup flow appears to assume the channel has only a single host whose likeness needs protection. That person must verify their identity, which requires a photo of a government ID and a video of their face. It’s unclear why YouTube needs this data in addition to the videos people have already posted with their oh-so stealable faces, but rules are rules.



Should an AI copy of you help decide if you live or die?

“It would combine demographic and clinical variables, documented advance-care-planning data, patient-recorded values and goals, and contextual information about specific decisions,” he said.

“Including textual and conversational data could further increase a model’s ability to learn why preferences arise and change, not just what a patient’s preference was at a single point in time,” Starke said.

Ahmad suggested that future research could focus on validating fairness frameworks in clinical trials, evaluating moral trade-offs through simulations, and exploring how cross-cultural bioethics can be combined with AI designs.

Only then might AI surrogates be ready to be deployed, but only as “decision aids,” Ahmad wrote. Any “contested outputs” should automatically “trigger [an] ethics review,” Ahmad wrote, concluding that “the fairest AI surrogate is one that invites conversation, admits doubt, and leaves room for care.”

“AI will not absolve us”

Ahmad is hoping to test his conceptual models at various UW sites over the next five years, which would offer “some way to quantify how good this technology is,” he said.

“After that, I think there’s a collective decision regarding how as a society we decide to integrate or not integrate something like this,” Ahmad said.

In his paper, he warned against chatbot AI surrogates that could be interpreted as a simulation of the patient, predicting that future models may even speak in patients’ voices and suggesting that the “comfort and familiarity” of such tools might blur “the boundary between assistance and emotional manipulation.”

Starke agreed that more research and “richer conversations” between patients and doctors are needed.

“We should be cautious not to apply AI indiscriminately as a solution in search of a problem,” Starke said. “AI will not absolve us from making difficult ethical decisions, especially decisions concerning life and death.”

Truog, the bioethics expert, told Ars he “could imagine that AI could” one day “provide a surrogate decision maker with some interesting information, and it would be helpful.”

But a “problem with all of these pathways… is that they frame the decision of whether to perform CPR as a binary choice, regardless of context or the circumstances of the cardiac arrest,” Truog’s editorial said. “In the real world, the answer to the question of whether the patient would want to have CPR” when they’ve lost consciousness, “in almost all cases,” is “it depends.”

When Truog thinks about the kinds of situations he could end up in, he knows he wouldn’t just be considering his own values, health, and quality of life. His choice “might depend on what my children thought” or “what the financial consequences would be on the details of what my prognosis would be,” he told Ars.

“I would want my wife or another person that knew me well to be making those decisions,” Truog said. “I wouldn’t want somebody to say, ‘Well, here’s what AI told us about it.’”



Teen sues to destroy the nudify app that left her in constant fear

A spokesperson told The Wall Street Journal that “nonconsensual pornography and the tools to create it are explicitly forbidden by Telegram’s terms of service and are removed whenever discovered.”

For the teen suing, the prime target remains ClothOff itself. Her lawyers think it’s possible that she can get the app and its affiliated sites blocked in the US, the WSJ reported, if ClothOff fails to respond and the court awards her default judgment.

But no matter the outcome of the litigation, the teen expects to be forever “haunted” by the fake nudes that a high school boy generated without facing any charges.

According to the WSJ, the teen girl sued the boy who she said made her want to drop out of school. Her complaint noted that she was informed that “the individuals responsible and other potential witnesses failed to cooperate with, speak to, or provide access to their electronic devices to law enforcement.”

The teen has felt “mortified and emotionally distraught, and she has experienced lasting consequences ever since,” her complaint said. She has no idea if ClothOff can continue to distribute the harmful images, and she has no clue how many teens may have posted them online. Because of these unknowns, she’s certain she’ll spend “the remainder of her life” monitoring “for the resurfacing of these images.”

“Knowing that the CSAM images of her will almost inevitably make their way onto the Internet and be retransmitted to others, such as pedophiles and traffickers, has produced a sense of hopelessness” and “a perpetual fear that her images can reappear at any time and be viewed by countless others, possibly even friends, family members, future partners, colleges, and employers, or the public at large,” her complaint said.

The teen’s lawsuit is the newest front in a wider attempt to crack down on AI-generated CSAM and NCII. It follows prior litigation filed by San Francisco City Attorney David Chiu last year that targeted ClothOff, among 16 popular apps used to “nudify” photos of mostly women and young girls.

About 45 states have criminalized fake nudes, the WSJ reported, and earlier this year, Donald Trump signed the Take It Down Act into law, which requires platforms to remove both real and AI-generated NCII within 48 hours of victims’ reports.



Ars Live recap: Is the AI bubble about to pop? Ed Zitron weighs in.


Despite connection hiccups, we covered OpenAI’s finances, nuclear power, and Sam Altman.

On Tuesday of last week, Ars Technica hosted a live conversation with Ed Zitron, host of the Better Offline podcast and one of tech’s most vocal AI critics, to discuss whether the generative AI industry is experiencing a bubble and when it might burst. My Internet connection had other plans, though, dropping out multiple times and forcing Ars Technica’s Lee Hutchinson to jump in as an excellent emergency backup host.

During the times my connection cooperated, Zitron and I covered OpenAI’s financial issues, lofty infrastructure promises, and why the AI hype machine keeps rolling despite some arguably shaky economics underneath. Lee’s probing questions about per-user costs revealed a potential flaw in AI subscription models: Companies can’t predict whether a user will cost them $2 or $10,000 per month.

You can watch a recording of the event on YouTube or in the window below.

Our discussion with Ed Zitron.

“A 50 billion-dollar industry pretending to be a trillion-dollar one”

I started by asking Zitron the most direct question I could: “Why are you so mad about AI?” His answer got right to the heart of his critique: the disconnect between AI’s actual capabilities and how it’s being sold. “Because everybody’s acting like it’s something it isn’t,” Zitron said. “They’re acting like it’s this panacea that will be the future of software growth, the future of hardware growth, the future of compute.”

In one of his newsletters, Zitron describes the generative AI market as “a 50 billion dollar revenue industry masquerading as a one trillion-dollar one.” He pointed to OpenAI’s financial burn rate (losing an estimated $9.7 billion in the first half of 2025 alone) as evidence that the economics don’t work, coupled with a heavy dose of pessimism about AI in general.

Donald Trump listens as Nvidia CEO Jensen Huang speaks at the White House during an event on “Investing in America” on April 30, 2025, in Washington, DC. Credit: Andrew Harnik / Staff | Getty Images News

“The models just do not have the efficacy,” Zitron said during our conversation. “AI agents is one of the most egregious lies the tech industry has ever told. Autonomous agents don’t exist.”

He contrasted the relatively small revenue generated by AI companies with the massive capital expenditures flowing into the sector. Even major cloud providers and chip makers are showing strain. Oracle reportedly lost $100 million in three months after installing Nvidia’s new Blackwell GPUs, which Zitron noted are “extremely power-hungry and expensive to run.”

Finding utility despite the hype

I pushed back against some of Zitron’s broader dismissals of AI by sharing my own experience. I use AI chatbots frequently for brainstorming useful ideas and helping me see them from different angles. “I find I use AI models as sort of knowledge translators and framework translators,” I explained.

After experiencing brain fog from repeated bouts of COVID over the years, I’ve also found tools like ChatGPT and Claude especially helpful for memory augmentation that pierces through brain fog: describing something in a roundabout, fuzzy way and quickly getting an answer I can then verify. Along these lines, I’ve previously written about how people in a UK study found AI assistants to be useful accessibility tools.

Zitron acknowledged this could be useful for me personally but declined to draw any larger conclusions from my one data point. “I understand how that might be helpful; that’s cool,” he said. “I’m glad that that helps you in that way; it’s not a trillion-dollar use case.”

He also shared his own attempts at using AI tools, including experimenting with Claude Code despite not being a coder himself.

“If I liked [AI] somehow, it would be actually a more interesting story because I’d be talking about something I liked that was also onerously expensive,” Zitron explained. “But it doesn’t even do that, and it’s actually one of my core frustrations, it’s like this massive over-promise thing. I’m an early adopter guy. I will buy early crap all the time. I bought an Apple Vision Pro, like, what more do you say there? I’m ready to accept issues, but AI is all issues, it’s all filler, no killer; it’s very strange.”

Zitron and I agree that current AI assistants are being marketed beyond their actual capabilities. As I often say, AI models are not people, and they are not good factual references. As such, they cannot replace human decision-making and cannot wholesale replace human intellectual labor (at the moment). Instead, I see AI models as augmentations of human capability: as tools rather than autonomous entities.

Computing costs: History versus reality

Even though Zitron and I found some common ground about AI hype, I expressed a belief that the cost and power requirements of operating AI models will eventually cease to be an issue.

I attempted to make that case by noting that computing costs historically trend downward over time, referencing the Air Force’s SAGE computer system from the 1950s: a four-story building that performed 75,000 operations per second while consuming two megawatts of power. Today, pocket-sized phones deliver millions of times more computing power in a way that would have been impossible, power consumption-wise, in the 1950s.
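For a rough sense of scale, the two SAGE figures quoted above can be turned into an efficiency number. This is only a back-of-the-envelope sketch using the numbers cited in this article; the variable names are illustrative.

```python
# Figures quoted above for the 1950s SAGE system.
sage_ops_per_second = 75_000
sage_power_watts = 2_000_000  # two megawatts

# Operations per second delivered for each watt consumed.
sage_ops_per_watt = sage_ops_per_second / sage_power_watts

# Energy spent on a single operation, in joules (a watt is one joule per second).
sage_joules_per_op = sage_power_watts / sage_ops_per_second

print(f"{sage_ops_per_watt:.4f} operations per second per watt")
print(f"{sage_joules_per_op:.1f} joules per operation")
```

By this arithmetic, SAGE spent roughly 27 joules on every single operation.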

The blockhouse for the Semi-Automatic Ground Environment at Stewart Air Force Base, Newburgh, New York. Credit: Denver Post via Getty Images

“I think it will eventually work that way,” I said, suggesting that AI inference costs might follow similar patterns of improvement over years and that AI tools will eventually become commodity components of computer operating systems. Basically, even if AI models stay inefficient, AI models of a certain baseline usefulness and capability will still be cheaper to train and run in the future because the computing systems they run on will be faster, cheaper, and less power-hungry as well.

Zitron pushed back on this optimism, saying that AI costs are currently moving in the wrong direction. “The costs are going up, unilaterally across the board,” he said. Even newer systems like Cerebras and Groq can generate results faster but not cheaper. He also questioned whether integrating AI into operating systems would prove useful even if the technology became profitable, since AI models struggle with deterministic commands and consistent behavior.

The power problem and circular investments

One of Zitron’s most pointed criticisms during the discussion centered on OpenAI’s infrastructure promises. The company has pledged to build data centers requiring 10 gigawatts of power capacity (equivalent to 10 nuclear power plants, I once pointed out) for its Stargate project in Abilene, Texas. According to Zitron’s research, the town currently has only 350 megawatts of generating capacity and a 200-megawatt substation.
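The size of that gap is easier to see as a ratio. The following is a simple sketch that uses only the capacity figures cited above; the variable names are illustrative.

```python
MEGAWATTS_PER_GIGAWATT = 1_000

pledged_mw = 10 * MEGAWATTS_PER_GIGAWATT  # the 10 gigawatts pledged
local_generation_mw = 350                 # generating capacity cited by Zitron
substation_mw = 200                       # substation capacity cited by Zitron

# How many times larger the pledge is than what exists today.
generation_multiple = pledged_mw / local_generation_mw
substation_multiple = pledged_mw / substation_mw

print(f"{generation_multiple:.1f}x the current generating capacity")
print(f"{substation_multiple:.0f}x the substation's capacity")
```

On those numbers, the pledge is nearly 29 times the generating capacity Zitron cited and 50 times the substation's capacity.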

“A gigawatt of power is a lot, and it’s not like Red Alert 2,” Zitron said, referencing the real-time strategy game. “You don’t just build a power station and it happens. There are months of actual physics to make sure that it doesn’t kill everyone.”

He believes many announced data centers will never be completed, calling the infrastructure promises “castles on sand” that nobody in the financial press seems willing to question directly.

After another technical blackout on my end, I came back online and asked Zitron to define the scope of the AI bubble. He says it has evolved from one bubble (foundation models) into two or three, now including AI compute companies like CoreWeave and the market’s obsession with Nvidia.

Zitron highlighted what he sees as essentially circular investment schemes propping up the industry. He pointed to OpenAI’s $300 billion deal with Oracle and Nvidia’s relationship with CoreWeave as examples. “CoreWeave, they literally… They funded CoreWeave, became their biggest customer, then CoreWeave took that contract and those GPUs and used them as collateral to raise debt to buy more GPUs,” Zitron explained.

When will the bubble pop?

Zitron predicted the bubble would burst within the next year and a half, though he acknowledged it could happen sooner. He expects a cascade of events rather than a single dramatic collapse: An AI startup will run out of money, triggering panic among other startups and their venture capital backers, creating a fire-sale environment that makes future fundraising impossible.

“It’s not gonna be one Bear Stearns moment,” Zitron explained. “It’s gonna be a succession of events until the markets freak out.”

The crux of the problem, according to Zitron, is Nvidia. The chip maker’s stock represents 7 to 8 percent of the S&P 500’s value, and the broader market has become dependent on Nvidia’s continued hyper growth. When Nvidia posted “only” 55 percent year-over-year growth in January, the market wobbled.

“Nvidia’s growth is why the bubble is inflated,” Zitron said. “If their growth goes down, the bubble will burst.”

He also warned of broader consequences: “I think there’s a depression coming. I think once the markets work out that tech doesn’t grow forever, they’re gonna flush the toilet aggressively on Silicon Valley.” This connects to his larger thesis: that the tech industry has run out of genuine hyper-growth opportunities and is trying to manufacture one with AI.

“Is there anything that would falsify your premise of this bubble and crash happening?” I asked. “What if you’re wrong?”

“I’ve been answering ‘What if you’re wrong?’ for a year-and-a-half to two years, so I’m not bothered by that question, so the thing that would have to prove me right would’ve already needed to happen,” he said. Amid a longer exposition about Sam Altman, Zitron said, “The thing that would’ve had to happen with inference would’ve had to be… it would have to be hundredths of a cent per million tokens, they would have to be printing money, and then, it would have to be way more useful. It would have to have efficacy that it does not have, the hallucination problems… would have to be fixable, and on top of this, someone would have to fix agents.”

A positivity challenge

Near the end of our conversation, I wondered if I could flip the script, so to speak, and see if he could say something positive or optimistic, although I chose the most challenging subject possible for him. “What’s the best thing about Sam Altman?” I asked. “Can you say anything nice about him at all?”

“I understand why you’re asking this,” Zitron started, “but I wanna be clear: Sam Altman is going to be the reason the markets take a crap. Sam Altman has lied to everyone. Sam Altman has been lying forever.” He continued, “Like the Pied Piper, he’s led the markets into an abyss, and yes, people should have known better, but I hope at the end of this, Sam Altman is seen for what he is, which is a con artist and a very successful one.”

Then he added, “You know what? I’ll say something nice about him, he’s really good at making people say, ‘Yes.’”

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

OnePlus unveils OxygenOS 16 update with deep Gemini integration

The updated Android software expands what you can add to Mind Space and uses Gemini. For starters, you can add scrolling screenshots and voice memos up to 60 seconds in length. This provides more data for the AI to generate content. For example, if you take screenshots of hotel listings and airline flights, you can tell Gemini to use your Mind Space content to create a trip itinerary. This will be fully integrated with the phone and won’t require a separate subscription to Google’s AI tools.

Mind Space isn’t a totally new idea—it’s quite similar to AI features like Nothing’s Essential Space and Google’s Pixel Screenshots and Journal. The idea is that if you give an AI model enough data on your thoughts and plans, it can provide useful insights. That’s still hypothetical based on what we’ve seen from other smartphone OEMs, but that’s not stopping OnePlus from fully embracing AI in Android 16.

In addition to beefing up Mind Space, OxygenOS 16 will also add system-wide AI writing tools, which is another common AI add-on. Like the systems from Apple, Google, and Samsung, you will be able to use the OnePlus writing tools to adjust text, proofread, and generate summaries.

OnePlus will make OxygenOS 16 available starting October 17 as an open beta. You’ll need a OnePlus device from the past three years to run the software, both in the beta phase and when it’s finally released. OnePlus hasn’t offered a specific date for that final release. The initial OxygenOS 16 release will arrive on the OnePlus 15 devices, with releases for other supported phones and tablets coming later.

Teachers get an F on AI-generated lesson plans

To collect data for this study, in August 2024 we prompted three GenAI chatbots—the GPT-4o model of ChatGPT, Google’s Gemini 1.5 Flash model, and Microsoft’s latest Copilot model—to generate two sets of lesson plans for eighth grade civics classes based on Massachusetts state standards. One was a standard lesson plan and the other a highly interactive lesson plan.

We garnered a dataset of 311 AI-generated lesson plans, featuring a total of 2,230 activities for civic education. We analyzed the dataset using two frameworks designed to assess educational material: Bloom’s taxonomy and Banks’ four levels of integration of multicultural content.

Bloom’s taxonomy is a widely used educational framework that distinguishes between “lower-order” thinking skills, including remembering, understanding, and applying, and “higher-order” thinking skills—analyzing, evaluating, and creating. Using this framework to analyze the data, we found 90 percent of the activities promoted only a basic level of thinking for students. Students were encouraged to learn civics through memorizing, reciting, summarizing, and applying information, rather than through analyzing and evaluating information, investigating civic issues, or engaging in civic action projects.

When examining the lesson plans using Banks’ four levels of integration of multicultural content model, which was developed in the 1990s, we found that the AI-generated civics lessons featured a rather narrow view of history—often leaving out the experiences of women, Black Americans, Latinos and Latinas, Asian and Pacific Islanders, disabled individuals, and other groups that have long been overlooked. Only 6 percent of the lessons included multicultural content. These lessons also tended to focus on heroes and holidays rather than deeper explorations of understanding civics through multiple perspectives.

Overall, we found the AI-generated lesson plans to be decidedly boring, traditional, and uninspiring. If civics teachers used these AI-generated lesson plans as is, students would miss out on active, engaged learning opportunities to build their understanding of democracy and what it means to be a citizen.

Open source GZDoom community splinters after creator inserts AI-generated code

That comment led to a lengthy discussion among developers about the use of “stolen scraped code that we have no way of verifying is compatible with the GPL,” as one described it. And while Zahl eventually removed the offending code, he also allegedly tried to remove the evidence that it ever existed by force-pushing an update to delete the discussion entirely.

// This is what ChatGPT told me for detecting dark mode on Linux.

Graf Zahl code comment

Zahl defended the use of AI-generated snippets for “boilerplate code” that isn’t key to underlying game features. “I surely have my reservations about using AI for project specific code,” he wrote, “but this here is just superficial checks of system configuration settings that can be found on various websites—just with 10x the effort required.”

But others in the community were adamant that there’s no place for AI tools in the workflow of an open source project like this. “If using code slop generated from ChatGPT or any other GenAI/AI chatbots is the future of this project, I’m sorry to say but I’m out,” GitHub user Cacodemon345 wrote, summarizing the feelings of many other developers.

A fork in the road

In a GitHub bug report posted Tuesday, user the-phinet laid out the disagreements over AI-generated code alongside other alleged issues with Zahl’s top-down approach to pushing out GZDoom updates. In response, Zahl invited the development community to “feel free to fork the project” if they were so displeased.

Plenty of GZDoom developers quickly took that somewhat petulant response seriously. “You have just completely bricked GZDoom with this bullshit,” developer Boondorl wrote. “Enjoy your dead project, I’m sure you’ll be happy to plink away at it all by yourself where people can finally stop yelling at you to do things.”

OpenAI thinks Elon Musk funded its biggest critics—who also hate Musk

“We are not in any way supported by or funded by Elon Musk and have a history of campaigning against him and his interests,” Ruby-Sachs told NBC News.

Another nonprofit watchdog targeted by OpenAI was The Midas Project, which strives to make sure AI benefits everyone. Notably, Musk’s lawsuit accused OpenAI of abandoning its mission to benefit humanity in pursuit of immense profits.

But the founder of The Midas Project, Tyler Johnston, was shocked to see his group portrayed as coordinating with Musk. He posted on X to clarify that Musk had nothing to do with the group’s “OpenAI Files,” which comprehensively document areas of concern with any plan to shift away from nonprofit governance.

His post came after OpenAI’s chief strategy officer, Jason Kwon, wrote that “several organizations, some of them suddenly newly formed like the Midas Project, joined in and ran campaigns” backing Musk’s “opposition to OpenAI’s restructure.”

“What are you talking about?” Johnston wrote. “We were formed 19 months ago. We’ve never spoken with or taken funding from Musk and [his] ilk, which we would have been happy to tell you if you asked a single time. In fact, we’ve said he runs xAI so horridly it makes OpenAI ‘saintly in comparison.'”

OpenAI acting like a “cutthroat” corporation?

Johnston complained that OpenAI’s subpoena had already hurt the Midas Project, as insurers had denied coverage based on news coverage. He accused OpenAI of not just trying to silence critics but possibly shut them down.

“If you wanted to constrain an org’s speech, intimidation would be one strategy, but making them uninsurable is another, and maybe that’s what’s happened to us with this subpoena,” Johnston suggested.

Other nonprofits, like the San Francisco Foundation (SFF) and Encode, accused OpenAI of using subpoenas to potentially block or slow down legal interventions. Judith Bell, SFF’s chief impact officer, told NBC News that her nonprofit’s subpoena came after spearheading a petition to California’s attorney general to block OpenAI’s restructuring. And Encode’s general counsel, Nathan Calvin, was subpoenaed after sponsoring a California safety regulation meant to make it easier to monitor risks of frontier AI.

Inside the web infrastructure revolt over Google’s AI Overviews


Cloudflare CEO Matthew Prince is making sweeping changes to force Google’s hand.

It could be a consequential act of quiet regulation. Cloudflare, a web infrastructure company, has updated millions of websites’ robots.txt files in an effort to force Google to change how it crawls them to fuel its AI products and initiatives.

We spoke with Cloudflare CEO Matthew Prince about what exactly is going on here, why it matters, and what the web might soon look like. But to get into that, we need to cover a little background first.

The new change, which Cloudflare calls its Content Signals Policy, came after publishers and other companies that depend on web traffic cried foul over Google’s AI Overviews and similar AI answer engines, saying they are sharply cutting those companies’ path to revenue because they don’t send traffic back to the source of the information.

There have been lawsuits, efforts to kick-start new marketplaces to ensure compensation, and more—but few companies have the kind of leverage Cloudflare does. Its products and services back something close to 20 percent of the web, and thus a significant slice of the websites that show up on search results pages or that fuel large language models.

“Almost every reasonable AI company that’s out there is saying, listen, if it’s a fair playing field, then we’re happy to pay for content,” Prince said. “The problem is that all of them are terrified of Google because if Google gets content for free but they all have to pay for it, they are always going to be at an inherent disadvantage.”

This is happening because Google is using its dominant position in search to ensure that web publishers allow their content to be used in ways that they might not otherwise want.

The changing norms of the web

Since 2023, Google has offered a way for website administrators to opt their content out of use for training Google’s large language models, such as Gemini.

However, allowing pages to be indexed by Google’s search crawlers and shown in results requires accepting that they’ll also be used to generate AI Overviews at the top of results pages through a process called retrieval-augmented generation (RAG).

That’s not so for many other crawlers, making Google an outlier among major players.

This is a sore point for a wide range of website administrators, from news websites that publish journalism to investment banks that produce research reports.

A July study from the Pew Research Center analyzed data from 900 adults in the US and found that AI Overviews cut referrals nearly in half. Specifically, users clicked a link on a page with AI Overviews at the top just 8 percent of the time, compared to 15 percent for search engine results pages without those summaries.
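The “nearly in half” figure follows from the two click rates quoted above. A quick sketch of that arithmetic, using only the percentages cited in this article:

```python
# Share of visits in which users clicked a link, as quoted above.
click_rate_with_overview = 8      # percent, pages with an AI Overview
click_rate_without_overview = 15  # percent, pages without one

# Relative drop in click rate when an AI Overview is present.
relative_drop = (
    click_rate_without_overview - click_rate_with_overview
) / click_rate_without_overview

print(f"Relative drop in click rate: {relative_drop:.1%}")
```

That works out to a relative decline of about 47 percent.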

And a report in The Wall Street Journal cited a wide range of sources—including internal traffic metrics from numerous major publications like The New York Times and Business Insider—to describe industry-wide plummets in website traffic that those publishers said were tied to AI summaries, leading to layoffs and strategic shifts.

In August, Google’s head of search, Liz Reid, disputed the validity and applicability of studies and publisher reports of reduced link clicks in search. “Overall, total organic click volume from Google Search to websites has been relatively stable year-over-year,” she wrote, going on to say that reports of big declines were “often based on flawed methodologies, isolated examples, or traffic changes that occurred prior to the rollout of AI features in Search.”

Publishers aren’t convinced. Penske Media Corporation, which owns brands like The Hollywood Reporter and Rolling Stone, sued Google over AI Overviews in September. The suit claims that affiliate link revenue has dropped by more than a third in the past year, due in large part to Google’s overviews—a threatening shortfall in a business that already has difficult margins.

Penske’s suit specifically noted that because Google bundles traditional search engine indexing and RAG use together, the company has no choice but to allow Google to keep summarizing its articles, as cutting off Google search referrals entirely would be financially fatal.

Since the earliest days of digital publishing, referrals have in one way or another acted as the backbone of the web’s economy. Content could be made available freely to both human readers and crawlers, and norms were applied across the web to allow information to be tracked back to its source and give that source an opportunity to monetize its content to sustain itself.

Today, there’s a panic that the old system isn’t working anymore as content summaries via RAG have become more common, and along with other players, Cloudflare is trying to update those norms to reflect the current reality.

A mass-scale update to robots.txt

Announced on September 24, Cloudflare’s Content Signals Policy is an effort to use the company’s influential market position to change how content is used by web crawlers. It involves updating millions of websites’ robots.txt files.

Starting in 1994, websites began placing a file called “robots.txt” at the domain root to indicate to automated web crawlers which parts of the domain should be crawled and indexed and which should be ignored. The standard became near-universal over the years; honoring it has been a key part of how Google’s web crawlers operate.

Historically, robots.txt simply included a list of paths on the domain that were flagged as either “allow” or “disallow.” It was technically not enforceable, but it became an effective honor system because it offered advantages to the owners of both the website and the crawler: Website owners could dictate access for various business reasons, and it helped crawlers avoid working through data that wouldn’t be relevant.
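To make the allow/disallow model concrete, here is a minimal sketch of how a well-behaved crawler might check a path before fetching it, using Python's standard `urllib.robotparser` module. The robots.txt contents, the `ExampleBot` name, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: everything is crawlable except /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler honoring the file asks before fetching each URL.
can_fetch_article = parser.can_fetch("ExampleBot", "https://example.com/articles/1")
can_fetch_private = parser.can_fetch("ExampleBot", "https://example.com/private/notes")

print(can_fetch_article)
print(can_fetch_private)
```

Nothing here stops a crawler from skipping the check entirely, which is why the system depends on crawlers choosing to honor it.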

But robots.txt only tells crawlers whether they can access something at all; it doesn’t tell them what they can use it for. For example, Google supports disallowing the agent “Google-Extended” as a path to blocking crawlers that are looking for content with which to train future versions of its Gemini large language model—though introducing that rule doesn’t do anything about the training Google did before it rolled out Google-Extended in 2023, and it doesn’t stop crawling for RAG and AI Overviews.
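The limitation described above can be sketched with the same standard-library parser. The robots.txt below is a hypothetical example that disallows the `Google-Extended` token while leaving every other agent alone; the check only evaluates the rules locally and says nothing about how Google's crawlers actually behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, allow everyone else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/articles/1"  # hypothetical page
extended_allowed = parser.can_fetch("Google-Extended", url)
googlebot_allowed = parser.can_fetch("Googlebot", url)

print("Google-Extended:", extended_allowed)
print("Googlebot:", googlebot_allowed)
```

The search crawler remains allowed, and the file has no way to say what that crawler may do with the page once it has been fetched.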

The Content Signals Policy initiative is a newly proposed format for robots.txt that intends to do that. It allows website operators to opt in or out of consenting to the following use cases, as worded in the policy:

  • search: Building a search index and providing search results (e.g., returning hyperlinks and short excerpts from your website’s contents). Search does not include providing AI-generated search summaries.
  • ai-input: Inputting content into one or more AI models (e.g., retrieval augmented generation, grounding, or other real-time taking of content for generative AI search answers).
  • ai-train: Training or fine-tuning AI models.

Cloudflare has given all of its customers quick paths for setting those values on a case-by-case basis. Further, it has automatically updated robots.txt on the 3.8 million domains that already use Cloudflare’s managed robots.txt feature, with search defaulting to yes, ai-train to no, and ai-input blank, indicating a neutral position.
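As a rough illustration of how a crawler might read those three signals, here is a small parsing sketch. The article does not show the file syntax, so the `Content-Signal:` directive name and the comma-separated `key=value` layout used here are assumptions, and `parse_content_signals` is a hypothetical helper rather than anything Cloudflare ships.

```python
def parse_content_signals(robots_txt: str) -> dict[str, str]:
    """Collect key=value pairs from (assumed) Content-Signal lines."""
    signals: dict[str, str] = {}
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-signal":
            continue  # not a Content-Signal line
        for pair in value.split(","):
            key, eq, setting = pair.partition("=")
            if eq:
                signals[key.strip().lower()] = setting.strip().lower()
    return signals


# Hypothetical file mirroring the defaults described above:
# search set to yes, ai-train set to no, ai-input left blank.
ROBOTS_TXT = """\
User-agent: *
Content-Signal: search=yes, ai-train=no
Allow: /
"""

signals = parse_content_signals(ROBOTS_TXT)
for use in ("search", "ai-input", "ai-train"):
    # A missing key is reported as "unset", matching the neutral position.
    print(use, "->", signals.get(use, "unset"))
```

Treating a missing key as unset rather than as a yes or a no mirrors the neutral default the article describes for ai-input.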

The threat of potential litigation

In making this look a bit like a terms of service agreement, Cloudflare’s goal is explicitly to put legal pressure on Google to change its policy of bundling traditional search crawlers and AI Overviews.

“Make no mistake, the legal team at Google is looking at this saying, ‘Huh, that’s now something that we have to actively choose to ignore across a significant portion of the web,'” Prince told me.

Cloudflare specifically made this look like a license agreement. Credit: Cloudflare

He further characterized this as an effort to get a company that he says has historically been “largely a good actor” and a “patron of the web” to go back to doing the right thing.

“Inside of Google, there is a fight where there are people who are saying we should change how we’re doing this,” he explained. “And there are other people saying, no, that gives up our inherent advantage, we have a God-given right to all the content on the Internet.”

Amid that debate, lawyers have sway at Google, so Cloudflare tried to design tools “that made it very clear that if they were going to follow any of these sites, there was a clear license which was in place for them. And that will create risk for them if they don’t follow it,” Prince said.

The next web paradigm

It takes a company with Cloudflare’s scale to do something like this with any hope that it will have an impact. If just a few websites made this change, Google would have an easier time ignoring it, or worse yet, it could simply stop crawling them to avoid the problem. Since Cloudflare is entangled with millions of websites, Google couldn’t do that without materially impacting the quality of the search experience.

Cloudflare has a vested interest in the general health of the web, but there are other strategic considerations at play, too. The company has been working on tools to assist with RAG on customers’ websites in partnership with Microsoft-owned Google competitor Bing and has experimented with a marketplace that provides a way for websites to charge crawlers for scraping the sites for AI, though what final form that might take is still unclear.

I asked Prince directly if this comes from a place of conviction. “There are very few times that opportunities come along where you get to help think through what a future better business model of an organization or institution as large as the Internet and as important as the Internet is,” he said. “As we do that, I think that we should all be thinking about what have we learned that was good about the Internet in the past and what have we learned that was bad about the Internet in the past.”

It’s important to acknowledge that we don’t yet know what the future business model of the web will look like. Cloudflare itself has ideas. Others have proposed new standards, marketplaces, and strategies, too. There will be winners and losers, and those won’t always be the same winners and losers we saw in the previous paradigm.

What most people seem to agree on, whatever their individual incentives, is that Google shouldn’t get to come out on top in a future answer-engine-driven web paradigm just because it previously established dominance in the search-engine-driven one.

For this new standard for robots.txt, success looks like Google allowing content to be available in search but not in AI Overviews. Whatever the long-term vision, and whether it happens because of Cloudflare’s pressure with the Content Signals Policy or some other driving force, most agree that it would be a good start.

Samuel Axon is the editorial lead for tech and gaming coverage at Ars Technica. He covers AI, software development, gaming, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.
