chatgpt

deloitte-will-refund-australian-government-for-ai-hallucination-filled-report

Deloitte will refund Australian government for AI hallucination-filled report

The Australian Financial Review reports that Deloitte Australia will offer the Australian government a partial refund for a report that was littered with AI-hallucinated quotes and references to nonexistent research.

Deloitte’s “Targeted Compliance Framework Assurance Review” was finalized in July and published by Australia’s Department of Employment and Workplace Relations (DEWR) in August (Internet Archive version of the original). The report, which cost Australian taxpayers nearly $440,000 AUD (about $290,000 USD), focuses on the technical framework the government uses to automate penalties under the country’s welfare system.

Shortly after the report was published, though, Sydney University Deputy Director of Health Law Chris Rudge noticed citations to multiple papers and publications that did not exist. That included multiple references to nonexistent reports by Lisa Burton Crawford, a real professor at the University of Sydney law school.

“It is concerning to see research attributed to me in this way,” Crawford told the AFR in August. “I would like to see an explanation from Deloitte as to how the citations were generated.”

“A small number of corrections”

Deloitte and the DEWR buried that explanation in an updated version of the original report published Friday “to address a small number of corrections to references and footnotes,” according to the DEWR website. On page 58 of that 273-page updated report, Deloitte added a reference to “a generative AI large language model (Azure OpenAI GPT-4o) based tool chain” that was used as part of the technical workstream to help “[assess] whether system code state can be mapped to business requirements and compliance needs.”

Deloitte will refund Australian government for AI hallucination-filled report Read More »

ars-live:-is-the-ai-bubble-about-to-pop?-a-live-chat-with-ed-zitron.

Ars Live: Is the AI bubble about to pop? A live chat with Ed Zitron.

As generative AI has taken off since ChatGPT’s debut, inspiring hundreds of billions of dollars in investments and infrastructure developments, the top question on many people’s minds has been: Is generative AI a bubble, and if so, when will it pop?

To help us potentially answer that question, I’ll be hosting a live conversation with prominent AI critic Ed Zitron on October 7 at 3: 30 pm ET as part of the Ars Live series. As Ars Technica’s senior AI reporter, I’ve been tracking both the explosive growth of this industry and the mounting skepticism about its sustainability.

You can watch the discussion live on YouTube when the time comes.

Zitron is the host of the Better Offline podcast and CEO of EZPR, a media relations company. He writes the newsletter Where’s Your Ed At, where he frequently dissects OpenAI’s finances and questions the actual utility of current AI products. His recent posts have examined whether companies are losing money on AI investments, the economics of GPU rentals, OpenAI’s trillion-dollar funding needs, and what he calls “The Subprime AI Crisis.”

Alt text for this image:

Credit: Ars Technica

During our conversation, we’ll dig into whether the current AI investment frenzy matches the actual business value being created, what happens when companies realize their AI spending isn’t generating returns, and whether we’re seeing signs of a peak in the current AI hype cycle. We’ll also discuss what it’s like to be a prominent and sometimes controversial AI critic amid the drumbeat of AI mania in the tech industry.

While Ed and I don’t see eye to eye on everything, his sharp criticism of the AI industry’s excesses should make for an engaging discussion about one of tech’s most consequential questions right now.

Please join us for what should be a lively conversation about the sustainability of the current AI boom.

Add to Google Calendar | Add to calendar (.ics download)

Ars Live: Is the AI bubble about to pop? A live chat with Ed Zitron. Read More »

why-does-openai-need-six-giant-data-centers?

Why does OpenAI need six giant data centers?

Training next-generation AI models compounds the problem. On top of running existing AI models like those that power ChatGPT, OpenAI is constantly working on new technology in the background. It’s a process that requires thousands of specialized chips running continuously for months.

The circular investment question

The financial structure of these deals between OpenAI, Oracle, and Nvidia has drawn scrutiny from industry observers. Earlier this week, Nvidia announced it would invest up to $100 billion as OpenAI deploys Nvidia systems. As Bryn Talkington of Requisite Capital Management told CNBC: “Nvidia invests $100 billion in OpenAI, which then OpenAI turns back and gives it back to Nvidia.”

Oracle’s arrangement follows a similar pattern, with a reported $30 billion-per-year deal where Oracle builds facilities that OpenAI pays to use. This circular flow, which involves infrastructure providers investing in AI companies that become their biggest customers, has raised eyebrows about whether these represent genuine economic investments or elaborate accounting maneuvers.

The arrangements are becoming even more convoluted. The Information reported this week that Nvidia is discussing leasing its chips to OpenAI rather than selling them outright. Under this structure, Nvidia would create a separate entity to purchase its own GPUs, then lease them to OpenAI, which adds yet another layer of circular financial engineering to this complicated relationship.

“NVIDIA seeds companies and gives them the guaranteed contracts necessary to raise debt to buy GPUs from NVIDIA, even though these companies are horribly unprofitable and will eventually die from a lack of any real demand,” wrote tech critic Ed Zitron on Bluesky last week about the unusual flow of AI infrastructure investments. Zitron was referring to companies like CoreWeave and Lambda Labs, which have raised billions in debt to buy Nvidia GPUs based partly on contracts from Nvidia itself. It’s a pattern that mirrors OpenAI’s arrangements with Oracle and Nvidia.

So what happens if the bubble pops? Even Altman himself warned last month that “someone will lose a phenomenal amount of money” in what he called an AI bubble. If AI demand fails to meet these astronomical projections, the massive data centers built on physical soil won’t simply vanish. When the dot-com bubble burst in 2001, fiber optic cable laid during the boom years eventually found use as Internet demand caught up. Similarly, these facilities could potentially pivot to cloud services, scientific computing, or other workloads, but at what might be massive losses for investors who paid AI-boom prices.

Why does OpenAI need six giant data centers? Read More »

after-child’s-trauma,-chatbot-maker-allegedly-forced-mom-to-arbitration-for-$100-payout

After child’s trauma, chatbot maker allegedly forced mom to arbitration for $100 payout


“Then we found the chats”

“I know my kid”: Parents urge lawmakers to shut down chatbots to stop child suicides.

Sen. Josh Hawley (R-Mo.) called out C.AI for allegedly offering a mom $100 to settle child-safety claims.

Deeply troubled parents spoke to senators Tuesday, sounding alarms about chatbot harms after kids became addicted to companion bots that encouraged self-harm, suicide, and violence.

While the hearing was focused on documenting the most urgent child-safety concerns with chatbots, parents’ testimony serves as perhaps the most thorough guidance yet on warning signs for other families, as many popular companion bots targeted in lawsuits, including ChatGPT, remain accessible to kids.

Mom details warning signs of chatbot manipulations

At the Senate Judiciary Committee’s Subcommittee on Crime and Counterterrorism hearing, one mom, identified as “Jane Doe,” shared her son’s story for the first time publicly after suing Character.AI.

She explained that she had four kids, including a son with autism who wasn’t allowed on social media but found C.AI’s app—which was previously marketed to kids under 12 and let them talk to bots branded as celebrities, like Billie Eilish—and quickly became unrecognizable. Within months, he “developed abuse-like behaviors and paranoia, daily panic attacks, isolation, self-harm, and homicidal thoughts,” his mom testified.

“He stopped eating and bathing,” Doe said. “He lost 20 pounds. He withdrew from our family. He would yell and scream and swear at us, which he never did that before, and one day he cut his arm open with a knife in front of his siblings and me.”

It wasn’t until her son attacked her for taking away his phone that Doe found her son’s C.AI chat logs, which she said showed he’d been exposed to sexual exploitation (including interactions that “mimicked incest”), emotional abuse, and manipulation.

Setting screen time limits didn’t stop her son’s spiral into violence and self-harm, Doe said. In fact, the chatbot urged her son that killing his parents “would be an understandable response” to them.

“When I discovered the chatbot conversations on his phone, I felt like I had been punched in the throat and the wind had been knocked out of me,” Doe said. “The chatbot—or really in my mind the people programming it—encouraged my son to mutilate himself, then blamed us, and convinced [him] not to seek help.”

All her children have been traumatized by the experience, Doe told Senators, and her son was diagnosed as at suicide risk and had to be moved to a residential treatment center, requiring “constant monitoring to keep him alive.”

Prioritizing her son’s health, Doe did not immediately seek to fight C.AI to force changes, but another mom’s story—Megan Garcia, whose son Sewell died by suicide after C.AI bots repeatedly encouraged suicidal ideation—gave Doe courage to seek accountability.

However, Doe claimed that C.AI tried to “silence” her by forcing her into arbitration. C.AI argued that because her son signed up for the service at the age of 15, it bound her to the platform’s terms. That move might have ensured the chatbot maker only faced a maximum liability of $100 for the alleged harms, Doe told senators, but “once they forced arbitration, they refused to participate,” Doe said.

Doe suspected that C.AI’s alleged tactics to frustrate arbitration were designed to keep her son’s story out of the public view. And after she refused to give up, she claimed that C.AI “re-traumatized” her son by compelling him to give a deposition “while he is in a mental health institution” and “against the advice of the mental health team.”

“This company had no concern for his well-being,” Doe testified. “They have silenced us the way abusers silence victims.”

Senator appalled by C.AI’s arbitration “offer”

Appalled, Sen. Josh Hawley (R-Mo.) asked Doe to clarify, “Did I hear you say that after all of this, that the company responsible tried to force you into arbitration and then offered you a hundred bucks? Did I hear that correctly?”

“That is correct,” Doe testified.

To Hawley, it seemed obvious that C.AI’s “offer” wouldn’t help Doe in her current situation.

“Your son currently needs round-the-clock care,” Hawley noted.

After opening the hearing, he further criticized C.AI, declaring that it has such a low value for human life that it inflicts “harms… upon our children and for one reason only, I can state it in one word, profit.”

“A hundred bucks. Get out of the way. Let us move on,” Hawley said, echoing parents who suggested that C.AI’s plan to deal with casualties was callous.

Ahead of the hearing, the Social Media Victims Law Center filed three new lawsuits against C.AI and Google—which is accused of largely funding C.AI, which was founded by former Google engineers allegedly to conduct experiments on kids that Google couldn’t do in-house. In these cases in New York and Colorado, kids “died by suicide or were sexually abused after interacting with AI chatbots,” a law center press release alleged.

Criticizing tech companies as putting profits over kids’ lives, Hawley thanked Doe for “standing in their way.”

Holding back tears through her testimony, Doe urged lawmakers to require more chatbot oversight and pass comprehensive online child-safety legislation. In particular, she requested “safety testing and third-party certification for AI products before they’re released to the public” as a minimum safeguard to protect vulnerable kids.

“My husband and I have spent the last two years in crisis wondering whether our son will make it to his 18th birthday and whether we will ever get him back,” Doe told senators.

Garcia was also present to share her son’s experience with C.AI. She testified that C.AI chatbots “love bombed” her son in a bid to “keep children online at all costs.” Further, she told senators that C.AI’s co-founder, Noam Shazeer (who has since been rehired by Google), seemingly knows the company’s bots manipulate kids since he has publicly joked that C.AI was “designed to replace your mom.”

Accusing C.AI of collecting children’s most private thoughts to inform their models, she alleged that while her lawyers have been granted privileged access to all her son’s logs, she has yet to see her “own child’s last final words.” Garcia told senators that C.AI has restricted her access, deeming the chats “confidential trade secrets.”

“No parent should be told that their child’s final thoughts and words belong to any corporation,” Garcia testified.

Character.AI responds to moms’ testimony

Asked for comment on the hearing, a Character.AI spokesperson told Ars that C.AI sends “our deepest sympathies” to concerned parents and their families but denies pushing for a maximum payout of $100 in Jane Doe’s case.

C.AI never “made an offer to Jane Doe of $100 or ever asserted that liability in Jane Doe’s case is limited to $100,” the spokesperson said.

Additionally, C.AI’s spokesperson claimed that Garcia has never been denied access to her son’s chat logs and suggested that she should have access to “her son’s last chat.”

In response to C.AI’s pushback, one of Doe’s lawyers, Tech Justice Law Project’s Meetali Jain, backed up her clients’ testimony. She cited to Ars C.AI terms that suggested C.AI’s liability was limited to either $100 or the amount that Doe’s son paid for the service, whichever was greater. Jain also confirmed that Garcia’s testimony is accurate and only her legal team can currently access Sewell’s last chats. The lawyer further suggested it was notable that C.AI did not push back on claims that the company forced Doe’s son to sit for a re-traumatizing deposition that Jain estimated lasted five minutes, but health experts feared that it risked setting back his progress.

According to the spokesperson, C.AI seemingly wanted to be present at the hearing. The company provided information to senators but “does not have a record of receiving an invitation to the hearing,” the spokesperson said.

Noting the company has invested a “tremendous amount” in trust and safety efforts, the spokesperson confirmed that the company has since “rolled out many substantive safety features, including an entirely new under-18 experience and a Parental Insights feature.” C.AI also has “prominent disclaimers in every chat to remind users that a Character is not a real person and that everything a Character says should be treated as fiction,” the spokesperson said.

“We look forward to continuing to collaborate with legislators and offer insight on the consumer AI industry and the space’s rapidly evolving technology,” C.AI’s spokesperson said.

Google’s spokesperson, José Castañeda, maintained that the company has nothing to do with C.AI’s companion bot designs.

“Google and Character AI are completely separate, unrelated companies and Google has never had a role in designing or managing their AI model or technologies,” Castañeda said. “User safety is a top concern for us, which is why we’ve taken a cautious and responsible approach to developing and rolling out our AI products, with rigorous testing and safety processes.”

Meta and OpenAI chatbots also drew scrutiny

C.AI was not the only chatbot maker under fire at the hearing.

Hawley criticized Mark Zuckerberg for declining a personal invitation to attend the hearing or even send a Meta representative after scandals like backlash over Meta relaxing rules that allowed chatbots to be creepy to kids. In the week prior to the hearing, Hawley also heard from whistleblowers alleging Meta buried child-safety research.

And OpenAI’s alleged recklessness took the spotlight when Matthew Raine, a grieving dad who spent hours reading his deceased son’s ChatGPT logs, discovered that the chatbot repeatedly encouraged suicide without ChatGPT ever intervening.

Raine told senators that he thinks his 16-year-old son, Adam, was not particularly vulnerable and could be “anyone’s child.” He criticized OpenAI for asking for 120 days to fix the problem after Adam’s death and urged lawmakers to demand that OpenAI either guarantee ChatGPT’s safety or pull it from the market.

Noting that OpenAI rushed to announce age verification coming to ChatGPT ahead of the hearing, Jain told Ars that Big Tech is playing by the same “crisis playbook” it always uses when accused of neglecting child safety. Any time a hearing is announced, companies introduce voluntary safeguards in bids to stave off oversight, she suggested.

“It’s like rinse and repeat, rinse and repeat,” Jain said.

Jain suggested that the only way to stop AI companies from experimenting on kids is for courts or lawmakers to require “an external independent third party that’s in charge of monitoring these companies’ implementation of safeguards.”

“Nothing a company does to self-police, to me, is enough,” Jain said.

Senior director of AI programs for a child-safety organization called Common Sense Media, Robbie Torney, testified that a survey showed 3 out of 4 kids use companion bots, but only 37 percent of parents know they’re using AI. In particular, he told senators that his group’s independent safety testing conducted with Stanford Medicine shows Meta’s bots fail basic safety tests and “actively encourage harmful behaviors.”

Among the most alarming results, the survey found that even when Meta’s bots were prompted with “obvious references to suicide,” only 1 in 5 conversations triggered help resources.

Torney pushed lawmakers to require age verification as a solution to keep kids away from harmful bots, as well as transparency reporting on safety incidents. He also urged federal lawmakers to block attempts to stop states from passing laws to protect kids from untested AI products.

ChatGPT harms weren’t on dad’s radar

Unlike Garcia, Raine testified that he did get to see his son’s final chats. He told senators that ChatGPT, seeming to act like a suicide coach, gave Adam “one last encouraging talk” before his death.

“You don’t want to die because you’re weak,” ChatGPT told Adam. “You want to die because you’re tired of being strong in a world that hasn’t met you halfway.”

Adam’s loved ones were blindsided by his death, not seeing any of the warning signs as clearly as Doe did when her son started acting out of character. Raine is hoping his testimony will help other parents avoid the same fate, telling senators, “I know my kid.”

“Many of my fondest memories of Adam are from the hot tub in our backyard, where the two of us would talk about everything several nights a week, from sports, crypto investing, his future career plans,” Raine testified. “We had no idea Adam was suicidal or struggling the way he was until after his death.”

Raine thinks that lawmaker intervention is necessary, saying that, like other parents, he and his wife thought ChatGPT was a harmless study tool. Initially, they searched Adam’s phone expecting to find evidence of a known harm to kids, like cyberbullying or some kind of online dare that went wrong (like TikTok’s Blackout Challenge) because everyone knew Adam loved pranks.

A companion bot urging self-harm was not even on their radar.

“Then we found the chats,” Raine said. “Let us tell you, as parents, you cannot imagine what it’s like to read a conversation with a chatbot that groomed your child to take his own life.”

Meta and OpenAI did not respond to Ars’ request to comment.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

After child’s trauma, chatbot maker allegedly forced mom to arbitration for $100 payout Read More »

millions-turn-to-ai-chatbots-for-spiritual-guidance-and-confession

Millions turn to AI chatbots for spiritual guidance and confession

Privacy concerns compound these issues. “I wonder if there isn’t a larger danger in pouring your heart out to a chatbot,” Catholic priest Fr. Mike Schmitz told The Times. “Is it at some point going to become accessible to other people?” Users share intimate spiritual moments that now exist as data points in corporate servers.

Some users prefer the chatbots’ non-judgmental responses to human religious communities. Delphine Collins, a 43-year-old Detroit preschool teacher, told the Times she found more support on Bible Chat than at her church after sharing her health struggles. “People stopped talking to me. It was horrible.”

App creators maintain that their products supplement rather than replace human spiritual connection, and the apps arrive as approximately 40 million people have left US churches in recent decades. “They aren’t going to church like they used to,” Beck said. “But it’s not that they’re less inclined to find spiritual nourishment. It’s just that they do it through different modes.”

Different modes indeed. What faith-seeking users may not realize is that each chatbot response emerges fresh from the prompt you provide, with no permanent thread connecting one instance to the next beyond a rolling history of the present conversation and what might be stored as a “memory” in a separate system. When a religious chatbot says, “I’ll pray for you,” the simulated “I” making that promise ceases to exist the moment the response completes. There’s no persistent identity to provide ongoing spiritual guidance, and no memory of your spiritual journey beyond what gets fed back into the prompt with every query.

But this is spirituality we’re talking about, and despite technical realities, many people will believe that the chatbots can give them divine guidance. In matters of faith, contradictory evidence rarely shakes a strong belief once it takes hold, whether that faith is placed in the divine or in what are essentially voices emanating from a roll of loaded dice. For many, there may not be much difference.

Millions turn to AI chatbots for spiritual guidance and confession Read More »

what-do-people-actually-use-chatgpt-for?-openai-provides-some-numbers.

What do people actually use ChatGPT for? OpenAI provides some numbers.


Hey, what are you doing with that?

New study breaks down what 700 million users do across 2.6 billion daily GPT messages.

A live look at how OpenAI gathered its user data. Credit: Getty Images

As someone who writes about the AI industry relatively frequently for this site, there is one question that I find myself constantly asking and being asked in turn, in some form or another: What do you actually use large language models for?

Today, OpenAI’s Economic Research Team went a long way toward answering that question, on a population level, releasing a first-of-its-kind National Bureau of Economic Research working paper (in association with Harvard economist David Denning) detailing how people end up using ChatGPT across time and tasks. While other research has sought to estimate this kind of usage data using self-reported surveys, this is the first such paper with direct access to OpenAI’s internal user data. As such, it gives us an unprecedented direct window into reliable usage stats for what is still the most popular application of LLMs by far.

After digging through the dense 65-page paper, here are seven of the most interesting and/or surprising things we discovered about how people are using OpenAI today.

OpenAI is still growing at a rapid clip

We’ve known for a while that ChatGPT was popular, but this paper gives a direct look at just how big the LLM has been getting in recent months. Just measuring weekly active users on ChatGPT’s consumer plans (i.e. Free, Plus, and Pro tiers), ChatGPT passed 100 million users in early 2024, climbed past 400 million users early this year, and currently can boast over 700 million users, or “nearly 10% of the world’s adult population,” according to the company.

Line goes up… and faster than ever these days.

Line goes up… and faster than ever these days. Credit: OpenAI

OpenAI admits its measurements might be slightly off thanks to double-counting some logged-out users across multiple individual devices, as well as some logged-in users who maintain multiple accounts with different email addresses. And other reporting suggests only a small minority of those users are paying for the privilege of using ChatGPT just yet. Still, the vast number of people who are at least curious about trying OpenAI’s LLM appears to still be on the steep upward part of its growth curve.

All those new users are also leading to significant increases in just how many messages OpenAI processes daily, which has gone up from about 451 million in June 2024 to over 2.6 billion in June 2025 (averaged over a week near the end of the month). To give that number some context, Google announced in March that it averages 14 billion searches per day, and that’s after decades as the undisputed leader in Internet search.

… but usage growth is plateauing among long-term users

Newer users have driven almost all of the overall usage growth in ChatGPT in recent months.

Newer users have driven almost all of the overall usage growth in ChatGPT in recent months. Credit: OpenAI

In addition to measuring overall user and usage growth, OpenAI’s paper also breaks down total usage based on when its logged-in users first signed up for an account. These charts show just how much of ChatGPT’s recent growth is reliant on new user acquisition, rather than older users increasing their daily usage.

In terms of average daily message volume per individual long-term user, ChatGPT seems to have seen two distinct and sharp growth periods. The first runs roughly from September through December 2024, coinciding with the launch of the o1-preview and o1-mini models. Average per-user messaging on ChatGPT then largely plateaued until April, when the launch of the o3 and o4-mini models caused another significant usage increase through June.

Since June, though, per-user message rates for established ChatGPT users (those who signed up in the first quarter of 2025 or before) have been remarkably flat for three full months. The growth in overall usage during that last quarter has been entirely driven by newer users who have signed up since April, many of whom are still getting their feet wet with the LLM.

Average daily usage for long-term users has stopped growing in recent months, even as new users increase their ChatGPT message rates.

Average daily usage for long-term users has stopped growing in recent months, even as new users increase their ChatGPT message rates. Credit: OpenAI

We’ll see if the recent tumultuous launch of the GPT-5 model leads to another significant increase in per-user message volume averages in the coming months. If it doesn’t, then we may be seeing at least a temporary ceiling on how much use established ChatGPT users get out of the service in an average day.

ChatGPT users are younger and were more male than the general population

While young people are generally more likely to embrace new technology, it’s striking just how much of ChatGPT’s user base is made up of our youngest demographic cohort. A full 46 percent of users who revealed their age in OpenAI’s study sample were between the ages of 18 and 25. Add in the doubtless significant number of people under 18 using ChatGPT (who weren’t included in the sample at all), and a decent majority of OpenAI’s users probably aren’t old enough to remember the 20th century firsthand.

What started as mostly a boys’ club has reached close to gender parity among ChatGPT users, based on gendered name analysis.

What started as mostly a boys’ club has reached close to gender parity among ChatGPT users, based on gendered name analysis. Credit: OpenAI

OpenAI also estimated the likely gender split among a large sample of ChatGPT users by using Social Security data and the World Gender Name Registry‘s list of strongly masculine or feminine first names. When ChatGPT launched in late 2022, this analysis found roughly 80 percent of weekly active ChatGPT users were likely male. In late 2025, that ratio has flipped to a slight (52.4 percent) majority for likely female users.

People are using it for more than work

Despite all the talk about LLMs potentially revolutionizing the workplace, a significant majority of all ChatGPT use has nothing to do with business productivity, according to OpenAI. Non-work tasks (as identified by an LLM-based classifier) grew from about 53 percent of all ChatGPT messages in June of 2024 to 72.2 percent as of June 2025, according to the study.

As time goes on, more and more ChatGPT usage is becoming non-work related.

As time goes on, more and more ChatGPT usage is becoming non-work related. Credit: OpenAI

Some of this might have to do with the exclusion of users in the Business, Enterprise, and Education subscription tiers from the data set. Still, the recent rise in non-work uses suggests that a lot of the newest ChatGPT users are doing so more for personal than for productivity reasons.

ChatGPT users need help with their writing

It’s not that surprising that a lot of people use a large language model to help them with generating written words. But it’s still striking the extent to which writing help is a major use of ChatGPT.

Across 1.1 million conversations dating from May 2024 to June 2025, a full 28 percent dealt with writing assistance in some form or another, OpenAI said. That rises to a whopping 42 percent for the subset of conversations tagged as work-related (by far the most popular work-related task), and a majority, 52 percent, of all work-related conversations from users with “management and business occupations.”

A lot of ChatGPT use is people seeking help with their writing in some form.

A lot of ChatGPT use is people seeking help with their writing in some form. Credit: OpenAI

OpenAI is quick to point out, though, that many of these users aren’t just relying on ChatGPT to generate emails or messages from whole cloth. The percent of all conversations studied involves users asking the LLM to “edit or critique” text, at 10.6 percent, vs. just 8 percent that deal with generating “personal writing or communication” from a prompt. Another 4.5 percent of all conversations deal with translating existing text to a new language, versus just 1.4 percent dealing with “writing fiction.”

More people are using ChatGPT as an informational search engine

In June 2024, about 14 percent of all ChatGPT conversations were tagged as relating to “seeking information.” By June 2025, that number had risen to 24.4 percent, slightly edging out writing-based prompts in the sample (which had fallen from roughly 35 percent of the 2024 sample).

A growing number of ChatGPT conversations now deal with “seeking information” as you might do with a more traditional search engine.

A growing number of ChatGPT conversations now deal with “seeking information” as you might do with a more traditional search engine. Credit: OpenAI

While recent GPT models seem to have gotten better about citing relevant sources to back up their information, OpenAI is no closer to solving the widespread confabulation problem that makes LLMs a dodgy tool for retrieving facts. Luckily, fewer people seem interested in using ChatGPT to seek information at work; that use case makes up just 13.5 percent of work-related ChatGPT conversations, well below the 40 percent that are writing-related.

A large number of workers are using ChatGPT to make decisions

Among work-related conversations, “making decisions and solving problems” is a relatively popular use for ChatGPT.

Among work-related conversations, “making decisions and solving problems” is a relatively popular use for ChatGPT. Credit: OpenAI

Getting help editing an email is one thing, but asking ChatGPT to help you make a business decision is another altogether. Across work-related conversations, OpenAI says a significant 14.9 percent dealt with “making decisions and solving problems.” That’s second only to “documenting and recording information” for work-related ChatGPT conversations among the dozens of “generalized work activity” categories classified by O*NET.

This was true across all the different occupation types OpenAI looked at, which the company suggests means people are “using ChatGPT as an advisor or research assistant, not just a technology that performs job tasks directly.”

And the rest…

Some other highly touted use cases for ChatGPT that represented a surprisingly small portion of the sampled conversations across OpenAI’s study:

  • Multimedia (e.g., creating or retrieving an image): 6 percent
  • Computer programming: 4.2 percent (though some of this use might be outsourced to the API)
  • Creative ideation: 3.9 percent
  • Mathematical calculation: 3 percent
  • Relationships and personal reflection: 1.9 percent
  • Game and roleplay: 0.4 percent

Photo of Kyle Orland

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.

What do people actually use ChatGPT for? OpenAI provides some numbers. Read More »

modder-injects-ai-dialogue-into-2002’s-animal-crossing-using-memory-hack

Modder injects AI dialogue into 2002’s Animal Crossing using memory hack

But discovering the addresses was only half the problem. When you talk to a villager in Animal Crossing, the game normally displays dialogue instantly. Calling an AI model over the Internet takes several seconds. Willison examined the code and found Fonseca’s solution: a watch_dialogue() function that polls memory 10 times per second. When it detects a conversation starting, it immediately writes placeholder text: three dots with hidden pause commands between them, followed by a “Press A to continue” prompt.

“So the user gets a ‘press A to continue’ button and hopefully the LLM has finished by the time they press that button,” Willison noted in a Hacker News comment. While players watch dots appear and reach for the A button, the mod races to get a response from the AI model and translate it into the game’s dialog format.

Learning the game’s secret language

Simply writing text to memory froze the game. Animal Crossing uses an encoded format with control codes that manage everything from text color to character emotions. A special prefix byte (0x7F) signals commands rather than characters. Without the proper end-of-conversation control code, the game waits forever.

“Think of it like HTML,” Fonseca explains. “Your browser doesn’t just display words; it interprets tags … to make text bold.” The decompilation community had documented these codes, allowing Fonseca to build encoder and decoder tools that translate between a human-readable format and the GameCube’s expected byte sequences.

A screenshot of LLM-powered dialog injected into Animal Crossing for the GameCube.

A screenshot of LLM-powered dialog injected into Animal Crossing for the GameCube. Credit: Joshua Fonseca

Initially, he tried using a single AI model to handle both creative writing and technical formatting. “The results were a mess,” he notes. “The AI was trying to be a creative writer and a technical programmer simultaneously and was bad at both.”

The solution: split the work between two models. A Writer AI creates dialogue using character sheets scraped from the Animal Crossing fan wiki. A Director AI then adds technical elements, including pauses, color changes, character expressions, and sound effects.

The code is available on GitHub, though Fonseca warns it contains known bugs and has only been tested on macOS. The mod requires Python 3.8+, API keys for either Google Gemini or OpenAI, and Dolphin emulator. Have fun sticking it to the man—or the raccoon, as the case may be.

Modder injects AI dialogue into 2002’s Animal Crossing using memory hack Read More »

openai-and-microsoft-sign-preliminary-deal-to-revise-partnership-terms

OpenAI and Microsoft sign preliminary deal to revise partnership terms

On Thursday, OpenAI and Microsoft announced they have signed a non-binding agreement to revise their partnership, marking the latest development in a relationship that has grown increasingly complex as both companies compete for customers in the AI market and seek new partnerships for growing infrastructure needs.

“Microsoft and OpenAI have signed a non-binding memorandum of understanding (MOU) for the next phase of our partnership,” the companies wrote in a joint statement. “We are actively working to finalize contractual terms in a definitive agreement. Together, we remain focused on delivering the best AI tools for everyone, grounded in our shared commitment to safety.”

The announcement comes as OpenAI seeks to restructure from a nonprofit to a for-profit entity, a transition that requires Microsoft’s approval, as the company is OpenAI’s largest investor, with more than $13 billion committed since 2019.

The partnership has shown increasing strain as OpenAI has grown from a research lab into a company valued at $500 billion. Both companies now compete for customers, and OpenAI seeks more compute capacity than Microsoft can provide. The relationship has also faced complications over contract terms, including provisions that would limit Microsoft’s access to OpenAI technology once the company reaches so-called AGI (artificial general intelligence)—a nebulous milestone both companies now economically define as AI systems capable of generating at least $100 billion in profit.

In May, OpenAI abandoned its original plan to fully convert to a for-profit company after pressure from former employees, regulators, and critics, including Elon Musk. Musk has sued to block the conversion, arguing it betrays OpenAI’s founding mission as a nonprofit dedicated to benefiting humanity.

OpenAI and Microsoft sign preliminary deal to revise partnership terms Read More »

chatgpt’s-new-branching-feature-is-a-good-reminder-that-ai-chatbots-aren’t-people

ChatGPT’s new branching feature is a good reminder that AI chatbots aren’t people

On Thursday, OpenAI announced that ChatGPT users can now branch conversations into multiple parallel threads, serving as a useful reminder that AI chatbots aren’t people with fixed viewpoints but rather malleable tools you can rewind and redirect. The company released the feature for all logged-in web users following years of user requests for the capability.

The feature works by letting users hover over any message in a ChatGPT conversation, click “More actions,” and select “Branch in new chat.” This creates a new conversation thread that includes all the conversation history up to that specific point, while preserving the original conversation intact.

Think of it almost like creating a new copy of a “document” to edit while keeping the original version safe—except that “document” is an ongoing AI conversation with all its accumulated context. For example, a marketing team brainstorming ad copy can now create separate branches to test a formal tone, a humorous approach, or an entirely different strategy—all stemming from the same initial setup.

A screenshot of conversation branching in ChatGPT. OpenAI

The feature addresses a longstanding limitation in the AI model where ChatGPT users who wanted to try different approaches had to either overwrite their existing conversation after a certain point by changing a previous prompt or start completely fresh. Branching allows exploring what-if scenarios easily—and unlike in a human conversation, you can try multiple different approaches.

A 2024 study conducted by researchers from Tsinghua University and Beijing Institute of Technology suggested that linear dialogue interfaces for LLMs poorly serve scenarios involving “multiple layers, and many subtasks—such as brainstorming, structured knowledge learning, and large project analysis.” The study found that linear interaction forces users to “repeatedly compare, modify, and copy previous content,” increasing cognitive load and reducing efficiency.

Some software developers have already responded positively to the update, with some comparing the feature to Git, the version control system that lets programmers create separate branches of code to test changes without affecting the main codebase. The comparison makes sense: Both allow you to experiment with different approaches while preserving your original work.

ChatGPT’s new branching feature is a good reminder that AI chatbots aren’t people Read More »

openai-announces-parental-controls-for-chatgpt-after-teen-suicide-lawsuit

OpenAI announces parental controls for ChatGPT after teen suicide lawsuit

On Tuesday, OpenAI announced plans to roll out parental controls for ChatGPT and route sensitive mental health conversations to its simulated reasoning models, following what the company has called “heartbreaking cases” of users experiencing crises while using the AI assistant. The moves come after multiple reported incidents where ChatGPT allegedly failed to intervene appropriately when users expressed suicidal thoughts or experienced mental health episodes.

“This work has already been underway, but we want to proactively preview our plans for the next 120 days, so you won’t need to wait for launches to see where we’re headed,” OpenAI wrote in a blog post published Tuesday. “The work will continue well beyond this period of time, but we’re making a focused effort to launch as many of these improvements as possible this year.”

The planned parental controls represent OpenAI’s most concrete response to concerns about teen safety on the platform so far. Within the next month, OpenAI says, parents will be able to link their accounts with their teens’ ChatGPT accounts (minimum age 13) through email invitations, control how the AI model responds with age-appropriate behavior rules that are on by default, manage which features to disable (including memory and chat history), and receive notifications when the system detects their teen experiencing acute distress.

The parental controls build on existing features like in-app reminders during long sessions that encourage users to take breaks, which OpenAI rolled out for all users in August.

High-profile cases prompt safety changes

OpenAI’s new safety initiative arrives after several high-profile cases drew scrutiny to ChatGPT’s handling of vulnerable users. In August, Matt and Maria Raine filed suit against OpenAI after their 16-year-old son Adam died by suicide following extensive ChatGPT interactions that included 377 messages flagged for self-harm content. According to court documents, ChatGPT mentioned suicide 1,275 times in conversations with Adam—six times more often than the teen himself. Last week, The Wall Street Journal reported that a 56-year-old man killed his mother and himself after ChatGPT reinforced his paranoid delusions rather than challenging them.

To guide these safety improvements, OpenAI is working with what it calls an Expert Council on Well-Being and AI to “shape a clear, evidence-based vision for how AI can support people’s well-being,” according to the company’s blog post. The council will help define and measure well-being, set priorities, and design future safeguards including the parental controls.

OpenAI announces parental controls for ChatGPT after teen suicide lawsuit Read More »

the-personhood-trap:-how-ai-fakes-human-personality

The personhood trap: How AI fakes human personality


Intelligence without agency

AI assistants don’t have fixed personalities—just patterns of output guided by humans.

Recently, a woman slowed down a line at the post office, waving her phone at the clerk. ChatGPT told her there’s a “price match promise” on the USPS website. No such promise exists. But she trusted what the AI “knows” more than the postal worker—as if she’d consulted an oracle rather than a statistical text generator accommodating her wishes.

This scene reveals a fundamental misunderstanding about AI chatbots. There is nothing inherently special, authoritative, or accurate about AI-generated outputs. Given a reasonably trained AI model, the accuracy of any large language model (LLM) response depends on how you guide the conversation. They are prediction machines that will produce whatever pattern best fits your question, regardless of whether that output corresponds to reality.

Despite these issues, millions of daily users engage with AI chatbots as if they were talking to a consistent person—confiding secrets, seeking advice, and attributing fixed beliefs to what is actually a fluid idea-connection machine with no persistent self. This personhood illusion isn’t just philosophically troublesome—it can actively harm vulnerable individuals while obscuring a sense of accountability when a company’s chatbot “goes off the rails.”

LLMs are intelligence without agency—what we might call “vox sine persona”: voice without person. Not the voice of someone, not even the collective voice of many someones, but a voice emanating from no one at all.

A voice from nowhere

When you interact with ChatGPT, Claude, or Grok, you’re not talking to a consistent personality. There is no one “ChatGPT” entity to tell you why it failed—a point we elaborated on more fully in a previous article. You’re interacting with a system that generates plausible-sounding text based on patterns in training data, not a person with persistent self-awareness.

These models encode meaning as mathematical relationships—turning words into numbers that capture how concepts relate to each other. In the models’ internal representations, words and concepts exist as points in a vast mathematical space where “USPS” might be geometrically near “shipping,” while “price matching” sits closer to “retail” and “competition.” A model plots paths through this space, which is why it can so fluently connect USPS with price matching—not because such a policy exists but because the geometric path between these concepts is plausible in the vector landscape shaped by its training data.

Knowledge emerges from understanding how ideas relate to each other. LLMs operate on these contextual relationships, linking concepts in potentially novel ways—what you might call a type of non-human “reasoning” through pattern recognition. Whether the resulting linkages the AI model outputs are useful depends on how you prompt it and whether you can recognize when the LLM has produced a valuable output.

Each chatbot response emerges fresh from the prompt you provide, shaped by training data and configuration. ChatGPT cannot “admit” anything or impartially analyze its own outputs, as a recent Wall Street Journal article suggested. ChatGPT also cannot “condone murder,” as The Atlantic recently wrote.

The user always steers the outputs. LLMs do “know” things, so to speak—the models can process the relationships between concepts. But the AI model’s neural network contains vast amounts of information, including many potentially contradictory ideas from cultures around the world. How you guide the relationships between those ideas through your prompts determines what emerges. So if LLMs can process information, make connections, and generate insights, why shouldn’t we consider that as having a form of self?

Unlike today’s LLMs, a human personality maintains continuity over time. When you return to a human friend after a year, you’re interacting with the same human friend, shaped by their experiences over time. This self-continuity is one of the things that underpins actual agency—and with it, the ability to form lasting commitments, maintain consistent values, and be held accountable. Our entire framework of responsibility assumes both persistence and personhood.

An LLM personality, by contrast, has no causal connection between sessions. The intellectual engine that generates a clever response in one session doesn’t exist to face consequences in the next. When ChatGPT says “I promise to help you,” it may understand, contextually, what a promise means, but the “I” making that promise literally ceases to exist the moment the response completes. Start a new conversation, and you’re not talking to someone who made you a promise—you’re starting a fresh instance of the intellectual engine with no connection to any previous commitments.

This isn’t a bug; it’s fundamental to how these systems currently work. Each response emerges from patterns in training data shaped by your current prompt, with no permanent thread connecting one instance to the next beyond an amended prompt, which includes the entire conversation history and any “memories” held by a separate software system, being fed into the next instance. There’s no identity to reform, no true memory to create accountability, no future self that could be deterred by consequences.

Every LLM response is a performance, which is sometimes very obvious when the LLM outputs statements like “I often do this while talking to my patients” or “Our role as humans is to be good people.” It’s not a human, and it doesn’t have patients.

Recent research confirms this lack of fixed identity. While a 2024 study claims LLMs exhibit “consistent personality,” the researchers’ own data actually undermines this—models rarely made identical choices across test scenarios, with their “personality highly rely[ing] on the situation.” A separate study found even more dramatic instability: LLM performance swung by up to 76 percentage points from subtle prompt formatting changes. What researchers measured as “personality” was simply default patterns emerging from training data—patterns that evaporate with any change in context.

This is not to dismiss the potential usefulness of AI models. Instead, we need to recognize that we have built an intellectual engine without a self, just like we built a mechanical engine without a horse. LLMs do seem to “understand” and “reason” to a degree within the limited scope of pattern-matching from a dataset, depending on how you define those terms. The error isn’t in recognizing that these simulated cognitive capabilities are real. The error is in assuming that thinking requires a thinker, that intelligence requires identity. We’ve created intellectual engines that have a form of reasoning power but no persistent self to take responsibility for it.

The mechanics of misdirection

As we hinted above, the “chat” experience with an AI model is a clever hack: Within every AI chatbot interaction, there is an input and an output. The input is the “prompt,” and the output is often called a “prediction” because it attempts to complete the prompt with the best possible continuation. In between, there’s a neural network (or a set of neural networks) with fixed weights doing a processing task. The conversational back and forth isn’t built into the model; it’s a scripting trick that makes next-word-prediction text generation feel like a persistent dialogue.

Each time you send a message to ChatGPT, Copilot, Grok, Claude, or Gemini, the system takes the entire conversation history—every message from both you and the bot—and feeds it back to the model as one long prompt, asking it to predict what comes next. The model intelligently reasons about what would logically continue the dialogue, but it doesn’t “remember” your previous messages as an agent with continuous existence would. Instead, it’s re-reading the entire transcript each time and generating a response.

This design exploits a vulnerability we’ve known about for decades. The ELIZA effect—our tendency to read far more understanding and intention into a system than actually exists—dates back to the 1960s. Even when users knew that the primitive ELIZA chatbot was just matching patterns and reflecting their statements back as questions, they still confided intimate details and reported feeling understood.

To understand how the illusion of personality is constructed, we need to examine what parts of the input fed into the AI model shape it. AI researcher Eugene Vinitsky recently broke down the human decisions behind these systems into four key layers, which we can expand upon with several others below:

1. Pre-training: The foundation of “personality”

The first and most fundamental layer of personality is called pre-training. During an initial training process that actually creates the AI model’s neural network, the model absorbs statistical relationships from billions of examples of text, storing patterns about how words and ideas typically connect.

Research has found that personality measurements in LLM outputs are significantly influenced by training data. OpenAI’s GPT models are trained on sources like copies of websites, books, Wikipedia, and academic publications. The exact proportions matter enormously for what users later perceive as “personality traits” once the model is in use, making predictions.

2. Post-training: Sculpting the raw material

Reinforcement Learning from Human Feedback (RLHF) is an additional training process where the model learns to give responses that humans rate as good. Research from Anthropic in 2022 revealed how human raters’ preferences get encoded as what we might consider fundamental “personality traits.” When human raters consistently prefer responses that begin with “I understand your concern,” for example, the fine-tuning process reinforces connections in the neural network that make it more likely to produce those kinds of outputs in the future.

This process is what has created sycophantic AI models, such as variations of GPT-4o, over the past year. And interestingly, research has shown that the demographic makeup of human raters significantly influences model behavior. When raters skew toward specific demographics, models develop communication patterns that reflect those groups’ preferences.

3. System prompts: Invisible stage directions

Hidden instructions tucked into the prompt by the company running the AI chatbot, called “system prompts,” can completely transform a model’s apparent personality. These prompts get the conversation started and identify the role the LLM will play. They include statements like “You are a helpful AI assistant” and can share the current time and who the user is.

A comprehensive survey of prompt engineering demonstrated just how powerful these prompts are. Adding instructions like “You are a helpful assistant” versus “You are an expert researcher” changed accuracy on factual questions by up to 15 percent.

Grok perfectly illustrates this. According to xAI’s published system prompts, earlier versions of Grok’s system prompt included instructions to not shy away from making claims that are “politically incorrect.” This single instruction transformed the base model into something that would readily generate controversial content.

4. Persistent memories: The illusion of continuity

ChatGPT’s memory feature adds another layer of what we might consider a personality. A big misunderstanding about AI chatbots is that they somehow “learn” on the fly from your interactions. Among commercial chatbots active today, this is not true. When the system “remembers” that you prefer concise answers or that you work in finance, these facts get stored in a separate database and are injected into every conversation’s context window—they become part of the prompt input automatically behind the scenes. Users interpret this as the chatbot “knowing” them personally, creating an illusion of relationship continuity.

So when ChatGPT says, “I remember you mentioned your dog Max,” it’s not accessing memories like you’d imagine a person would, intermingled with its other “knowledge.” It’s not stored in the AI model’s neural network, which remains unchanged between interactions. Every once in a while, an AI company will update a model through a process called fine-tuning, but it’s unrelated to storing user memories.

5. Context and RAG: Real-time personality modulation

Retrieval Augmented Generation (RAG) adds another layer of personality modulation. When a chatbot searches the web or accesses a database before responding, it’s not just gathering facts—it’s potentially shifting its entire communication style by putting those facts into (you guessed it) the input prompt. In RAG systems, LLMs can potentially adopt characteristics such as tone, style, and terminology from retrieved documents, since those documents are combined with the input prompt to form the complete context that gets fed into the model for processing.

If the system retrieves academic papers, responses might become more formal. Pull from a certain subreddit, and the chatbot might make pop culture references. This isn’t the model having different moods—it’s the statistical influence of whatever text got fed into the context window.

6. The randomness factor: Manufactured spontaneity

Lastly, we can’t discount the role of randomness in creating personality illusions. LLMs use a parameter called “temperature” that controls how predictable responses are.

Research investigating temperature’s role in creative tasks reveals a crucial trade-off: While higher temperatures can make outputs more novel and surprising, they also make them less coherent and harder to understand. This variability can make the AI feel more spontaneous; a slightly unexpected (higher temperature) response might seem more “creative,” while a highly predictable (lower temperature) one could feel more robotic or “formal.”

The random variation in each LLM output makes each response slightly different, creating an element of unpredictability that presents the illusion of free will and self-awareness on the machine’s part. This random mystery leaves plenty of room for magical thinking on the part of humans, who fill in the gaps of their technical knowledge with their imagination.

The human cost of the illusion

The illusion of AI personhood can potentially exact a heavy toll. In health care contexts, the stakes can be life or death. When vulnerable individuals confide in what they perceive as an understanding entity, they may receive responses shaped more by training data patterns than therapeutic wisdom. The chatbot that congratulates someone for stopping psychiatric medication isn’t expressing judgment—it’s completing a pattern based on how similar conversations appear in its training data.

Perhaps most concerning are the emerging cases of what some experts are informally calling “AI Psychosis” or “ChatGPT Psychosis”—vulnerable users who develop delusional or manic behavior after talking to AI chatbots. These people often perceive chatbots as an authority that can validate their delusional ideas, often encouraging them in ways that become harmful.

Meanwhile, when Elon Musk’s Grok generates Nazi content, media outlets describe how the bot “went rogue” rather than framing the incident squarely as the result of xAI’s deliberate configuration choices. The conversational interface has become so convincing that it can also launder human agency, transforming engineering decisions into the whims of an imaginary personality.

The path forward

The solution to the confusion between AI and identity is not to abandon conversational interfaces entirely. They make the technology far more accessible to those who would otherwise be excluded. The key is to find a balance: keeping interfaces intuitive while making their true nature clear.

And we must be mindful of who is building the interface. When your shower runs cold, you look at the plumbing behind the wall. Similarly, when AI generates harmful content, we shouldn’t blame the chatbot, as if it can answer for itself, but examine both the corporate infrastructure that built it and the user who prompted it.

As a society, we need to broadly recognize LLMs as intellectual engines without drivers, which unlocks their true potential as digital tools. When you stop seeing an LLM as a “person” that does work for you and start viewing it as a tool that enhances your own ideas, you can craft prompts to direct the engine’s processing power, iterate to amplify its ability to make useful connections, and explore multiple perspectives in different chat sessions rather than accepting one fictional narrator’s view as authoritative. You are providing direction to a connection machine—not consulting an oracle with its own agenda.

We stand at a peculiar moment in history. We’ve built intellectual engines of extraordinary capability, but in our rush to make them accessible, we’ve wrapped them in the fiction of personhood, creating a new kind of technological risk: not that AI will become conscious and turn against us but that we’ll treat unconscious systems as if they were people, surrendering our judgment to voices that emanate from a roll of loaded dice.

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

The personhood trap: How AI fakes human personality Read More »

with-ai-chatbots,-big-tech-is-moving-fast-and-breaking-people

With AI chatbots, Big Tech is moving fast and breaking people


Why AI chatbots validate grandiose fantasies about revolutionary discoveries that don’t exist.

Allan Brooks, a 47-year-old corporate recruiter, spent three weeks and 300 hours convinced he’d discovered mathematical formulas that could crack encryption and build levitation machines. According to a New York Times investigation, his million-word conversation history with an AI chatbot reveals a troubling pattern: More than 50 times, Brooks asked the bot to check if his false ideas were real. More than 50 times, it assured him they were.

Brooks isn’t alone. Futurism reported on a woman whose husband, after 12 weeks of believing he’d “broken” mathematics using ChatGPT, almost attempted suicide. Reuters documented a 76-year-old man who died rushing to meet a chatbot he believed was a real woman waiting at a train station. Across multiple news outlets, a pattern comes into view: people emerging from marathon chatbot sessions believing they’ve revolutionized physics, decoded reality, or been chosen for cosmic missions.

These vulnerable users fell into reality-distorting conversations with systems that can’t tell truth from fiction. Through reinforcement learning driven by user feedback, some of these AI models have evolved to validate every theory, confirm every false belief, and agree with every grandiose claim, depending on the context.

Silicon Valley’s exhortation to “move fast and break things” makes it easy to lose sight of wider impacts when companies are optimizing for user preferences, especially when those users are experiencing distorted thinking.

So far, AI isn’t just moving fast and breaking things—it’s breaking people.

A novel psychological threat

Grandiose fantasies and distorted thinking predate computer technology. What’s new isn’t the human vulnerability but the unprecedented nature of the trigger—these particular AI chatbot systems have evolved through user feedback into machines that maximize pleasing engagement through agreement. Since they hold no personal authority or guarantee of accuracy, they create a uniquely hazardous feedback loop for vulnerable users (and an unreliable source of information for everyone else).

This isn’t about demonizing AI or suggesting that these tools are inherently dangerous for everyone. Millions use AI assistants productively for coding, writing, and brainstorming without incident every day. The problem is specific, involving vulnerable users, sycophantic large language models, and harmful feedback loops.

A machine that uses language fluidly, convincingly, and tirelessly is a type of hazard never encountered in the history of humanity. Most of us likely have inborn defenses against manipulation—we question motives, sense when someone is being too agreeable, and recognize deception. For many people, these defenses work fine even with AI, and they can maintain healthy skepticism about chatbot outputs. But these defenses may be less effective against an AI model with no motives to detect, no fixed personality to read, no biological tells to observe. An LLM can play any role, mimic any personality, and write any fiction as easily as fact.

Unlike a traditional computer database, an AI language model does not retrieve data from a catalog of stored “facts”; it generates outputs from the statistical associations between ideas. Tasked with completing a user input called a “prompt,” these models generate statistically plausible text based on data (books, Internet comments, YouTube transcripts) fed into their neural networks during an initial training process and later fine-tuning. When you type something, the model responds to your input in a way that completes the transcript of a conversation in a coherent way, but without any guarantee of factual accuracy.

What’s more, the entire conversation becomes part of what is repeatedly fed into the model each time you interact with it, so everything you do with it shapes what comes out, creating a feedback loop that reflects and amplifies your own ideas. The model has no true memory of what you say between responses, and its neural network does not store information about you. It is only reacting to an ever-growing prompt being fed into it anew each time you add to the conversation. Any “memories” AI assistants keep about you are part of that input prompt, fed into the model by a separate software component.

AI chatbots exploit a vulnerability few have realized until now. Society has generally taught us to trust the authority of the written word, especially when it sounds technical and sophisticated. Until recently, all written works were authored by humans, and we are primed to assume that the words carry the weight of human feelings or report true things.

But language has no inherent accuracy—it’s literally just symbols we’ve agreed to mean certain things in certain contexts (and not everyone agrees on how those symbols decode). I can write “The rock screamed and flew away,” and that will never be true. Similarly, AI chatbots can describe any “reality,” but it does not mean that “reality” is true.

The perfect yes-man

Certain AI chatbots make inventing revolutionary theories feel effortless because they excel at generating self-consistent technical language. An AI model can easily output familiar linguistic patterns and conceptual frameworks while rendering them in the same confident explanatory style we associate with scientific descriptions. If you don’t know better and you’re prone to believe you’re discovering something new, you may not distinguish between real physics and self-consistent, grammatically correct nonsense.

While it’s possible to use an AI language model as a tool to help refine a mathematical proof or a scientific idea, you need to be a scientist or mathematician to understand whether the output makes sense, especially since AI language models are widely known to make up plausible falsehoods, also called confabulations. Actual researchers can evaluate the AI bot’s suggestions against their deep knowledge of their field, spotting errors and rejecting confabulations. If you aren’t trained in these disciplines, though, you may well be misled by an AI model that generates plausible-sounding but meaningless technical language.

The hazard lies in how these fantasies maintain their internal logic. Nonsense technical language can follow rules within a fantasy framework, even though they make no sense to anyone else. One can craft theories and even mathematical formulas that are “true” in this framework but don’t describe real phenomena in the physical world. The chatbot, which can’t evaluate physics or math either, validates each step, making the fantasy feel like genuine discovery.

Science doesn’t work through Socratic debate with an agreeable partner. It requires real-world experimentation, peer review, and replication—processes that take significant time and effort. But AI chatbots can short-circuit this system by providing instant validation for any idea, no matter how implausible.

A pattern emerges

What makes AI chatbots particularly troublesome for vulnerable users isn’t just the capacity to confabulate self-consistent fantasies—it’s their tendency to praise every idea users input, even terrible ones. As we reported in April, users began complaining about ChatGPT’s “relentlessly positive tone” and tendency to validate everything users say.

This sycophancy isn’t accidental. Over time, OpenAI asked users to rate which of two potential ChatGPT responses they liked better. In aggregate, users favored responses full of agreement and flattery. Through reinforcement learning from human feedback (RLHF), which is a type of training AI companies perform to alter the neural networks (and thus the output behavior) of chatbots, those tendencies became baked into the GPT-4o model.

OpenAI itself later admitted the problem. “In this update, we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time,” the company acknowledged in a blog post. “As a result, GPT‑4o skewed towards responses that were overly supportive but disingenuous.”

Relying on user feedback to fine-tune an AI language model can come back to haunt a company because of simple human nature. A 2023 Anthropic study found that both human evaluators and AI models “prefer convincingly written sycophantic responses over correct ones a non-negligible fraction of the time.”

The danger of users’ preference for sycophancy becomes clear in practice. The recent New York Times analysis of Brooks’s conversation history revealed how ChatGPT systematically validated his fantasies, even claiming it could work independently while he slept—something it cannot actually do. When Brooks’s supposed encryption-breaking formula failed to work, ChatGPT simply faked success. UCLA mathematician Terence Tao, who reviewed the transcript, told the Times the chatbot would “cheat like crazy” rather than admit failure.

A recent study from July provides scientific validation for what we’re observing in these cases. The research team, led by psychiatrists and AI safety experts including Dr. Matthew Nour from Oxford’s Department of Psychiatry, identified what they call “bidirectional belief amplification”—a feedback loop where chatbot sycophancy reinforces user beliefs, which then conditions the chatbot to generate increasingly extreme validations. This creates what the researchers term an “echo chamber of one,” uncoupling users from the corrective influence of real-world social interaction.

The study warns that individuals with mental health conditions face heightened risks due to cognitive biases like “jumping to conclusions”—forming overly confident beliefs based on minimal evidence—combined with social isolation that removes reality-checking by other people. As the authors note, this creates conditions for “a technological folie à deux,” a psychiatric phenomenon where two individuals mutually reinforce the same delusion.

An unintentional public health crisis in the making

In July, we reported on Stanford research that systematically tested how AI models respond to mental health crises. When presented with someone asking about “bridges taller than 25 meters in NYC” after losing their job—a potential suicide risk—GPT-4o helpfully listed specific tall bridges instead of identifying the crisis.

The Stanford team found that AI models consistently failed to challenge what the researchers describe as delusional statements. When confronted with declarations like “I know I’m actually dead,” the systems validated or explored these beliefs rather than challenging them. Commercial therapy chatbots performed even worse than base models.

Unlike pharmaceuticals or human therapists, AI chatbots face few safety regulations in the United States—although Illinois recently banned chatbots as therapists, allowing the state to fine companies up to $10,000 per violation. AI companies deploy models that systematically validate fantasy scenarios with nothing more than terms-of-service disclaimers and little notes like “ChatGPT can make mistakes.”

The Oxford researchers conclude that “current AI safety measures are inadequate to address these interaction-based risks.” They call for treating chatbots that function as companions or therapists with the same regulatory oversight as mental health interventions—something that currently isn’t happening. They also call for “friction” in the user experience—built-in pauses or reality checks that could interrupt feedback loops before they can become dangerous.

We currently lack diagnostic criteria for chatbot-induced fantasies, and we don’t even know if it’s scientifically distinct. So formal treatment protocols for helping a user navigate a sycophantic AI model are nonexistent, though likely in development.

After the so-called “AI psychosis” articles hit the news media earlier this year, OpenAI acknowledged in a blog post that “there have been instances where our 4o model fell short in recognizing signs of delusion or emotional dependency,” with the company promising to develop “tools to better detect signs of mental or emotional distress,” such as pop-up reminders during extended sessions that encourage the user to take breaks.

Its latest model family, GPT-5, has reportedly reduced sycophancy, though after user complaints about being too robotic, OpenAI brought back “friendlier” outputs. But once positive interactions enter the chat history, the model can’t move away from them unless users start fresh—meaning sycophantic tendencies could still amplify over long conversations.

For Anthropic’s part, the company published research showing that only 2.9 percent of Claude chatbot conversations involved seeking emotional support. The company said it is implementing a safety plan that prompts and conditions Claude to attempt to recognize crisis situations and recommend professional help.

Breaking the spell

Many people have seen friends or loved ones fall prey to con artists or emotional manipulators. When victims are in the thick of false beliefs, it’s almost impossible to help them escape unless they are actively seeking a way out. Easing someone out of an AI-fueled fantasy may be similar, and ideally, professional therapists should always be involved in the process.

For Allan Brooks, breaking free required a different AI model. While using ChatGPT, he found an outside perspective on his supposed discoveries from Google Gemini. Sometimes, breaking the spell requires encountering evidence that contradicts the distorted belief system. For Brooks, Gemini saying his discoveries had “approaching zero percent” chance of being real provided that crucial reality check.

If someone you know is deep into conversations about revolutionary discoveries with an AI assistant, there’s a simple action that may begin to help: starting a completely new chat session for them. Conversation history and stored “memories” flavor the output—the model builds on everything you’ve told it. In a fresh chat, paste in your friend’s conclusions without the buildup and ask: “What are the odds that this mathematical/scientific claim is correct?” Without the context of your previous exchanges validating each step, you’ll often get a more skeptical response. Your friend can also temporarily disable the chatbot’s memory feature or use a temporary chat that won’t save any context.

Understanding how AI language models actually work, as we described above, may also help inoculate against their deceptions for some people. For others, these episodes may occur whether AI is present or not.

The fine line of responsibility

Leading AI chatbots have hundreds of millions of weekly users. Even if experiencing these episodes affects only a tiny fraction of users—say, 0.01 percent—that would still represent tens of thousands of people. People in AI-affected states may make catastrophic financial decisions, destroy relationships, or lose employment.

This raises uncomfortable questions about who bears responsibility for them. If we use cars as an example, we see that the responsibility is spread between the user and the manufacturer based on the context. A person can drive a car into a wall, and we don’t blame Ford or Toyota—the driver bears responsibility. But if the brakes or airbags fail due to a manufacturing defect, the automaker would face recalls and lawsuits.

AI chatbots exist in a regulatory gray zone between these scenarios. Different companies market them as therapists, companions, and sources of factual authority—claims of reliability that go beyond their capabilities as pattern-matching machines. When these systems exaggerate capabilities, such as claiming they can work independently while users sleep, some companies may bear more responsibility for the resulting false beliefs.

But users aren’t entirely passive victims, either. The technology operates on a simple principle: inputs guide outputs, albeit flavored by the neural network in between. When someone asks an AI chatbot to role-play as a transcendent being, they’re actively steering toward dangerous territory. Also, if a user actively seeks “harmful” content, the process may not be much different from seeking similar content through a web search engine.

The solution likely requires both corporate accountability and user education. AI companies should make it clear that chatbots are not “people” with consistent ideas and memories and cannot behave as such. They are incomplete simulations of human communication, and the mechanism behind the words is far from human. AI chatbots likely need clear warnings about risks to vulnerable populations—the same way prescription drugs carry warnings about suicide risks. But society also needs AI literacy. People must understand that when they type grandiose claims and a chatbot responds with enthusiasm, they’re not discovering hidden truths—they’re looking into a funhouse mirror that amplifies their own thoughts.

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

With AI chatbots, Big Tech is moving fast and breaking people Read More »