chatbots


Forget AGI—Sam Altman celebrates ChatGPT finally following em dash formatting rules


Next stop: superintelligence

Ongoing struggles with AI model instruction-following show that true human-level AI is still a ways off.

Em dashes have become what many believe to be a telltale sign of AI-generated text over the past few years. The punctuation mark appears frequently in outputs from ChatGPT and other AI chatbots, sometimes to the point where readers believe they can identify AI writing by its overuse alone—although people can overuse it, too.

On Thursday evening, OpenAI CEO Sam Altman posted on X that ChatGPT has started following custom instructions to avoid using em dashes. “Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it’s supposed to do!” he wrote.

The post, which came two days after the release of OpenAI’s new GPT-5.1 AI model, received mixed reactions from users who have struggled for years with getting the chatbot to follow specific formatting preferences. And this “small win” raises a very big question: If the world’s most valuable AI company has struggled with controlling something as simple as punctuation use after years of trying, perhaps what people call artificial general intelligence (AGI) is farther off than some in the industry claim.


A screenshot of Sam Altman’s post about em dashes on X. Credit: X

“The fact that it’s been 3 years since ChatGPT first launched, and you’ve only just now managed to make it obey this simple requirement, says a lot about how little control you have over it, and your understanding of its inner workings,” wrote one X user in a reply. “Not a good sign for the future.”

While Altman likes to publicly talk about AGI (a hypothetical technology equivalent to humans in general learning ability), superintelligence (a nebulous concept for AI that is far beyond human intelligence), and “magic intelligence in the sky” (his term for AI cloud computing?) while raising funds for OpenAI, it’s clear that we still don’t have reliable artificial intelligence here today on Earth.

But wait, what is an em dash anyway, and why does it matter so much?

AI models love em dashes because we do

Unlike a hyphen, a short punctuation mark used to connect words or parts of words that has its own dedicated key on your keyboard (-), an em dash is a long dash denoted by a special character (—) that writers use to set off parenthetical information, indicate a sudden change in thought, or introduce a summary or explanation.

Even before the age of AI language models, some writers frequently bemoaned the overuse of the em dash in modern writing. In a 2011 Slate article, writer Noreen Malone argued that writers used the em dash “in lieu of properly crafting sentences” and that overreliance on it “discourages truly efficient writing.” Various Reddit threads posted prior to ChatGPT’s launch featured writers either wrestling over the etiquette of proper em dash use or admitting to their frequent use as a guilty pleasure.

In 2021, one writer in the r/FanFiction subreddit wrote, “For the longest time, I’ve been addicted to Em Dashes. They find their way into every paragraph I write. I love the crisp straight line that gives me the excuse to shove details or thoughts into an otherwise orderly paragraph. Even after coming back to write after like two years of writer’s block, I immediately cram as many em dashes as I can.”

Because of the tendency for AI chatbots to overuse them, detection tools and human readers have learned to spot em dash use as a pattern, creating a problem for the small subset of writers who naturally favor the punctuation mark in their work. As a result, some journalists are complaining that AI is “killing” the em dash.

No one knows precisely why LLMs tend to overuse em dashes. We’ve seen a wide range of online speculation attempting to explain the phenomenon, ranging from the observation that em dashes were more popular in the 19th-century books used as training data (according to a 2018 study, dash use in the English language peaked around 1860 before declining through the mid-20th century) to the theory that AI models borrowed the habit from automatic em-dash character conversion on the blogging site Medium.

One thing we know for sure is that LLMs tend to output frequently seen patterns in their training data (fed in during the initial training process) and from a subsequent reinforcement learning process that often relies on human preferences. As a result, AI language models feed you a sort of “smoothed out” average style of whatever you ask them to provide, moderated by whatever they are conditioned to produce through user feedback.

So the most plausible explanation is still that requests for professional-style writing from an AI model trained on vast numbers of examples from the Internet will lean heavily toward the prevailing style in the training data, where em dashes appear frequently in formal writing, news articles, and editorial content. It’s also possible that during training through human feedback (called RLHF), responses with em dashes, for whatever reason, received higher ratings. Perhaps it’s because those outputs appeared more sophisticated or engaging to evaluators, but that’s just speculation.

From em dashes to AGI?

To understand what Altman’s “win” really means, and what it says about the road to AGI, we need to understand how ChatGPT’s custom instructions actually work. They allow users to set persistent preferences that apply across all conversations by appending written instructions to the prompt that is fed into the model just before the chat begins. Users can specify tone, format, and style requirements without needing to repeat those requests manually in every new chat.
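ChatGPT’s exact prompt assembly isn’t public, but the idea is easy to approximate with OpenAI’s chat API, where a persistent preference is simply extra text sent ahead of the user’s message. Here is a minimal sketch under those assumptions; the instruction wording is invented, and the model name is the API alias OpenAI announced for GPT-5.1 Instant:

```python
# Minimal sketch: "custom instructions" approximated as a system message
# that rides along with every request. The wording and surrounding app
# logic are illustrative assumptions, not ChatGPT's actual internals.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CUSTOM_INSTRUCTIONS = "Never use em dashes. Keep sentences short."

def chat(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-5.1-chat-latest",  # API name OpenAI announced for GPT-5.1 Instant
        messages=[
            # The persistent preferences are prepended to the conversation...
            {"role": "system", "content": CUSTOM_INSTRUCTIONS},
            # ...followed by whatever the user actually typed.
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content

print(chat("Summarize why writers love em dashes."))
```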

However, the feature has not always worked reliably because LLMs do not work reliably (even OpenAI and Anthropic freely admit this). An LLM takes an input and produces an output, spitting out a statistically plausible continuation of a prompt (a system prompt, the custom instructions, and your chat history), and it doesn’t really “understand” what you are asking. With AI language model outputs, there is always some luck involved in getting them to do what you want.

In our informal testing of GPT-5.1 with custom instructions, ChatGPT did appear to follow our request not to produce em dashes. But despite Altman’s claim, the response from X users appears to show that experiences with the feature continue to vary, at least when the request is not placed in custom instructions.

So if LLMs are statistical text-generation boxes, what does “instruction following” even mean? That’s key to unpacking the hypothetical path from LLMs to AGI. For an LLM, the concept of following instructions is fundamentally different from how we typically think about it for humans with general intelligence, or even for a traditional computer program.

In traditional computing, instruction following is deterministic. You tell a program “don’t include character X,” and it won’t include that character. The program executes rules exactly as written. With LLMs, “instruction following” is really about shifting statistical probabilities. When you tell ChatGPT “don’t use em dashes,” you’re not creating a hard rule. You’re adding text to the prompt that makes tokens associated with em dashes less likely to be selected during the generation process. But “less likely” isn’t “impossible.”

Every token the model generates is selected from a probability distribution. Your custom instruction influences that distribution, but it’s competing with the model’s training data (where em dashes appeared frequently in certain contexts) and everything else in the prompt. Unlike code with conditional logic, there’s no separate system verifying outputs against your requirements. The instruction is just more text that influences the statistical prediction process.
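To make “less likely isn’t impossible” concrete, here is a toy Python sketch. The vocabulary and numbers are invented, and in a real model the instruction shifts probabilities indirectly by being part of the prompt rather than through an explicit per-token penalty; the penalty below just stands in for that statistical nudge:

```python
import math
import random

# Toy next-token scores over a tiny "vocabulary" of punctuation marks.
# The values are made up purely to illustrate soft vs. hard constraints.
LOGITS = {"—": 2.0, ",": 1.5, ";": 0.5, ".": 1.0}

def sample_token(logits, em_dash_penalty=0.0):
    """Sample one token after applying a soft penalty to the em dash."""
    adjusted = dict(logits)
    adjusted["—"] -= em_dash_penalty  # an instruction nudges, it doesn't forbid
    total = sum(math.exp(v) for v in adjusted.values())
    probs = {tok: math.exp(v) / total for tok, v in adjusted.items()}
    r = random.random()
    cumulative = 0.0
    for tok, p in probs.items():
        cumulative += p
        if r < cumulative:
            return tok
    return tok  # floating-point fallback

def em_dash_rate(penalty, trials=10_000):
    hits = sum(sample_token(LOGITS, penalty) == "—" for _ in range(trials))
    return hits / trials

print("em dash rate, no instruction:  ", em_dash_rate(0.0))
print("em dash rate, with instruction:", em_dash_rate(3.0))
```

Run it and the em dash still appears a small fraction of the time even with the penalty applied, which mirrors what users see when an “obeyed” instruction occasionally slips.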

When Altman celebrates finally getting GPT to avoid em dashes, he’s really celebrating that OpenAI has tuned the latest version of GPT-5.1 (probably through reinforcement learning or fine-tuning) to weight custom instructions more heavily in its probability calculations.

There’s an irony about control here: Given the probabilistic nature of the fix, there’s no guarantee the issue will stay fixed. OpenAI continuously updates its models behind the scenes, even within the same version number, adjusting outputs based on user feedback and new training runs. Each update arrives with different output characteristics that can undo previous behavioral tuning, a phenomenon researchers call the “alignment tax.”

Precisely tuning a neural network’s behavior is not yet an exact science. Since all concepts encoded in the network are interconnected by values called weights, adjusting one behavior can alter others in unintended ways. Fix em dash overuse today, and tomorrow’s update (aimed at improving, say, coding capabilities) might inadvertently bring them back, not because OpenAI wants them there, but because that’s the nature of trying to steer a statistical system with millions of competing influences.

This gets to an implied question we mentioned earlier. If controlling punctuation use is still a struggle that might pop back up at any time, how far are we from AGI? We can’t know for sure, but it seems increasingly likely that it won’t emerge from a large language model alone. That’s because AGI, a technology that would replicate human general learning ability, would likely require true understanding and self-reflective intentional action, not statistical pattern matching that sometimes aligns with instructions if you happen to get lucky.

And speaking of getting lucky, some users still aren’t having luck controlling em dash use outside of the “custom instructions” feature. Upon being told in-chat not to use em dashes, ChatGPT updated a saved memory and replied to one X user, “Got it—I’ll stick strictly to short hyphens from now on.”


Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.


OpenAI walks a tricky tightrope with GPT-5.1’s eight new personalities

On Wednesday, OpenAI released GPT-5.1 Instant and GPT-5.1 Thinking, two updated versions of its flagship AI models now available in ChatGPT. The company is wrapping the models in the language of anthropomorphism, claiming that they’re warmer, more conversational, and better at following instructions.

The release follows complaints earlier this year that its previous models were excessively cheerful and sycophantic, along with an opposing controversy among users over how OpenAI modified the default GPT-5 output style after several suicide lawsuits.

The company now faces intense scrutiny from lawyers and regulators that could threaten its future operations. In that kind of environment, it’s difficult to just release a new AI model, throw out a few stats, and move on like the company could even a year ago. But here are the basics: The new GPT-5.1 Instant model will serve as ChatGPT’s faster default option for most tasks, while GPT-5.1 Thinking is a simulated reasoning model that attempts to handle more complex problem-solving tasks.

OpenAI claims that both models perform better on technical benchmarks such as math and coding evaluations (including AIME 2025 and Codeforces) than GPT-5, which was released in August.

Improved benchmarks may win over some users, but the biggest change with GPT-5.1 is in its presentation. OpenAI says it heard from users that they wanted AI models to simulate different communication styles depending on the task, so the company is offering eight preset options, including Professional, Friendly, Candid, Quirky, Efficient, Cynical, and Nerdy, alongside a Default setting.

These presets alter the instructions fed into each prompt to simulate different personality styles, but the underlying model capabilities remain the same across all settings.


An illustration showing GPT-5.1’s eight personality styles in ChatGPT. Credit: OpenAI

In addition, the company trained GPT-5.1 Instant to use “adaptive reasoning,” meaning that the model decides when to spend more computational time processing a prompt before generating output.

The company plans to roll out the models gradually over the next few days, starting with paid subscribers before expanding to free users. OpenAI plans to bring both GPT-5.1 Instant and GPT-5.1 Thinking to its API later this week. GPT-5.1 Instant will appear as gpt-5.1-chat-latest, and GPT-5.1 Thinking will be released as GPT-5.1 in the API, both with adaptive reasoning enabled. The older GPT-5 models will remain available in ChatGPT under the legacy models dropdown for paid subscribers for three months.


Researchers surprised that with AI, toxicity is harder to fake than intelligence

The next time you encounter an unusually polite reply on social media, you might want to check twice. It could be an AI model trying (and failing) to blend in with the crowd.

On Wednesday, researchers from the University of Zurich, University of Amsterdam, Duke University, and New York University released a study revealing that AI models remain easily distinguishable from humans in social media conversations, with overly friendly emotional tone serving as the most persistent giveaway. The research, which tested nine open-weight models across Twitter/X, Bluesky, and Reddit, found that classifiers developed by the researchers detected AI-generated replies with 70 to 80 percent accuracy.

The study introduces what the authors call a “computational Turing test” to assess how closely AI models approximate human language. Instead of relying on subjective human judgment about whether text sounds authentic, the framework uses automated classifiers and linguistic analysis to identify specific features that distinguish machine-generated from human-authored content.
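As a rough illustration of that idea (and emphatically not the paper’s actual pipeline, features, or data), a text classifier of this kind can be assembled in a few lines with scikit-learn; the replies below are invented stand-ins for the study’s social media data:

```python
# Minimal sketch of a "computational Turing test"-style classifier:
# learn to separate human replies from AI replies, then score new text.
# Tiny invented dataset; a real study would use thousands of examples
# and richer linguistic features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

human_replies = [
    "lol no way that's real",
    "this take is terrible tbh",
    "idk man seems like a stretch",
]
ai_replies = [
    "That's a wonderful perspective! Thank you for sharing it.",
    "I appreciate your thoughtful comment on this important topic.",
    "What a great question! Here are a few things to consider.",
]

texts = human_replies + ai_replies
labels = [0] * len(human_replies) + [1] * len(ai_replies)  # 1 = AI-generated

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(texts, labels)

print(classifier.predict(["What an insightful and lovely observation!"]))
```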

“Even after calibration, LLM outputs remain clearly distinguishable from human text, particularly in affective tone and emotional expression,” the researchers wrote. The team, led by Nicolò Pagan at the University of Zurich, tested various optimization strategies, from simple prompting to fine-tuning, but found that deeper emotional cues persist as reliable tells that a particular text interaction online was authored by an AI chatbot rather than a human.

The toxicity tell

In the study, researchers tested nine large language models: Llama 3.1 8B, Llama 3.1 8B Instruct, Llama 3.1 70B, Mistral 7B v0.1, Mistral 7B Instruct v0.2, Qwen 2.5 7B Instruct, Gemma 3 4B Instruct, DeepSeek-R1-Distill-Llama-8B, and Apertus-8B-2509.

When prompted to generate replies to real social media posts from actual users, the AI models struggled to match the level of casual negativity and spontaneous emotional expression common in human social media posts, with toxicity scores consistently lower than authentic human replies across all three platforms.

To counter this deficiency, the researchers attempted optimization strategies (including providing writing examples and context retrieval) that reduced structural differences like sentence length or word count, but variations in emotional tone persisted. “Our comprehensive calibration tests challenge the assumption that more sophisticated optimization necessarily yields more human-like output,” the researchers concluded.


OpenAI signs massive AI compute deal with Amazon

On Monday, OpenAI announced it has signed a seven-year, $38 billion deal to buy cloud services from Amazon Web Services to power products like ChatGPT and Sora. It’s the company’s first big computing deal after a fundamental restructuring last week that gave OpenAI more operational and financial freedom from Microsoft.

The agreement gives OpenAI access to hundreds of thousands of Nvidia graphics processors to train and run its AI models. “Scaling frontier AI requires massive, reliable compute,” OpenAI CEO Sam Altman said in a statement. “Our partnership with AWS strengthens the broad compute ecosystem that will power this next era and bring advanced AI to everyone.”

OpenAI will reportedly use Amazon Web Services immediately, with all planned capacity set to come online by the end of 2026 and room to expand further in 2027 and beyond. Amazon plans to roll out hundreds of thousands of chips, including Nvidia’s GB200 and GB300 AI accelerators, in data clusters built to power ChatGPT’s responses, generate AI videos, and train OpenAI’s next wave of models.

Wall Street apparently liked the deal, because Amazon shares hit an all-time high on Monday morning. Meanwhile, shares for long-time OpenAI investor and partner Microsoft briefly dipped following the announcement.

Massive AI compute requirements

It’s no secret that running generative AI models for hundreds of millions of people currently requires a lot of computing power. Amid chip shortages over the past few years, finding sources of that computing muscle has been tricky. OpenAI is reportedly working on its own GPU hardware to help alleviate the strain.

But for now, the company needs to find new sources of Nvidia chips, which accelerate AI computations. Altman has previously said that the company plans to spend $1.4 trillion to develop 30 gigawatts of computing resources, roughly enough to power 25 million US homes, according to Reuters.


After teen death lawsuits, Character.AI will restrict chats for under-18 users

Lawsuits and safety concerns

Character.AI was founded in 2021 by Noam Shazeer and Daniel De Freitas, two former Google engineers, and raised nearly $200 million from investors. Last year, Google agreed to pay about $3 billion to license Character.AI’s technology, and Shazeer and De Freitas returned to Google.

But the company now faces multiple lawsuits alleging that its technology contributed to teen deaths. Last year, the family of 14-year-old Sewell Setzer III sued Character.AI, accusing the company of being responsible for his death. Setzer died by suicide after frequently texting and conversing with one of the platform’s chatbots. The company faces additional lawsuits, including one from a Colorado family whose 13-year-old daughter, Juliana Peralta, died by suicide in 2023 after using the platform.

In December, Character.AI announced changes, including improved detection of violating content and revised terms of service, but those measures did not restrict underage users from accessing the platform. Other AI chatbot services, such as OpenAI’s ChatGPT, have also come under scrutiny for their chatbots’ effects on young users. In September, OpenAI introduced parental control features intended to give parents more visibility into how their kids use the service.

The cases have drawn attention from government officials, which likely pushed Character.AI to announce the changes for under-18 chat access. Steve Padilla, a Democrat in California’s State Senate who introduced the safety bill, told The New York Times that “the stories are mounting of what can go wrong. It’s important to put reasonable guardrails in place so that we protect people who are most vulnerable.”

On Tuesday, Senators Josh Hawley and Richard Blumenthal introduced a bill to bar AI companions from use by minors. In addition, California Governor Gavin Newsom this month signed a law, which takes effect on January 1, requiring AI companies to have safety guardrails on chatbots.


ChatGPT erotica coming soon with age verification, CEO says

On Tuesday, OpenAI CEO Sam Altman announced that the company will allow verified adult users to have erotic conversations with ChatGPT starting in December. The change represents a shift in how OpenAI approaches content restrictions, which the company had loosened in February but then dramatically tightened after an August lawsuit from parents of a teen who died by suicide after allegedly receiving encouragement from ChatGPT.

“In December, as we roll out age-gating more fully and as part of our ‘treat adult users like adults’ principle, we will allow even more, like erotica for verified adults,” Altman wrote in his post on X (formerly Twitter). The announcement follows OpenAI’s recent hint that it would allow developers to create “mature” ChatGPT applications once the company implements appropriate age verification and controls.

Altman explained that OpenAI had made ChatGPT “pretty restrictive to make sure we were being careful with mental health issues” but acknowledged this approach made the chatbot “less useful/enjoyable to many users who had no mental health problems.” The CEO said the company now has new tools to better detect when users are experiencing mental distress, allowing OpenAI to relax restrictions in most cases.

Striking the right balance between freedom for adults and safety for all users has been difficult for OpenAI, which has vacillated between permissive and restrictive chat content controls over the past year.

In February, the company updated its Model Spec to allow erotica in “appropriate contexts.” But a March update made GPT-4o so agreeable that users complained about its “relentlessly positive tone.” By August, Ars reported on cases where ChatGPT’s sycophantic behavior had validated users’ false beliefs to the point of causing mental health crises, and news of the aforementioned suicide lawsuit hit not long after.

Aside from adjusting the behavioral outputs of its previous GPT-4o AI language model, new model changes have also created some turmoil among users. Since the launch of GPT-5 in early August, some users have complained that the new model feels less engaging than its predecessor, prompting OpenAI to bring back the older model as an option. Altman said the upcoming release will allow users to choose whether they want ChatGPT to “respond in a very human-like way, or use a ton of emoji, or act like a friend.”


Vandals deface ads for AI necklaces that listen to all your conversations

In addition to backlash over feared surveillance capitalism, critics have accused Schiffman of taking advantage of the loneliness epidemic. Conducting a survey last year, researchers with Harvard Graduate School of Education’s Making Caring Common found that people between “30-44 years of age were the loneliest group.” Overall, 73 percent of those surveyed “selected technology as contributing to loneliness in the country.”

But Schiffman rejects these criticisms, telling the NYT that his AI Friend pendant is intended to supplement human friends, not replace them, supposedly helping to raise the “average emotional intelligence” of users “significantly.”

“I don’t view this as dystopian,” Schiffman said, suggesting that “the AI friend is a new category of companionship, one that will coexist alongside traditional friends rather than replace them,” the NYT reported. “We have a cat and a dog and a child and an adult in the same room,” the Friend founder said. “Why not an AI?”

The MTA has not commented on the controversy, but Victoria Mottesheard—a vice president at Outfront Media, which manages MTA advertising—told the NYT that the Friend campaign blew up because AI “is the conversation of 2025.”

Website lets anyone deface Friend ads

So far, the Friend ads have not yielded significant sales, Schiffman confirmed, telling the NYT that only 3,100 have sold. He expects that society isn’t ready for AI companions to be promoted at such a large scale and that his ad campaign will help normalize AI friends.

In the meantime, critics have rushed to attack Friend on social media, inspiring a website where anyone can vandalize a Friend ad and share it online. That website has received close to 6,000 submissions so far, its creator, Marc Mueller, told the NYT, and visitors can take a tour of these submissions by choosing to “ride train to see more” after creating their own vandalized version.

For visitors to Mueller’s site, riding the train displays a carousel documenting backlash to Friend, as well as “performance art” by visitors poking fun at the ads in less serious ways. One example showed a vandalized ad changing “Friend” to “Fries,” with a crude illustration of McDonald’s French fries, while another transformed the ad into a campaign for “fried chicken.”

Others were seemingly more serious about turning the ad into a warning. One vandal drew a bunch of arrows pointing to the “end” in Friend while turning the pendant into a cry-face emoji, seemingly drawing attention to research on the mental health risks of relying on AI companions—including the alleged suicide risks of products like Character.AI and ChatGPT, which have spawned lawsuits and prompted a Senate hearing.


OpenAI mocks Musk’s math in suit over iPhone/ChatGPT integration


“Fraction of a fraction of a fraction”

xAI’s claim that Apple gave ChatGPT a monopoly on prompts is “baseless,” OpenAI says.

OpenAI and Apple have moved to dismiss a lawsuit from Elon Musk’s xAI that alleges ChatGPT’s integration into a “handful” of iPhone features violated antitrust laws by giving OpenAI a monopoly on prompts and Apple a new path to block rivals in the smartphone industry.

The lawsuit was filed in August after Musk raged on X about Apple never listing Grok on its editorially curated “Must Have” apps list, which ChatGPT frequently appeared on.

According to Musk, Apple linking ChatGPT to Siri and other native iPhone features gave OpenAI exclusive access to billions of prompts that only OpenAI can use as valuable training data to maintain its dominance in the chatbot market. However, OpenAI and Apple are now mocking Musk’s math in court filings, urging the court to agree that xAI’s lawsuit is doomed.

As OpenAI argued, the estimates in xAI’s complaint seemed “baseless,” with Musk hesitant to even “hazard a guess” at what portion of the chatbot market is being foreclosed by the OpenAI/Apple deal.

xAI suggested that the ChatGPT integration may give OpenAI “up to 55 percent” of the potential chatbot prompts in the market, which could mean anywhere from 0 to 55 percent, OpenAI and Apple noted.

Musk’s company apparently arrived at this vague estimate by doing “back-of-the-envelope math,” and the court should reject his complaint, OpenAI argued. That math “was evidently calculated by assuming that Siri fields ‘1.5 billion user requests per day globally,’ then dividing that quantity by the ‘total prompts for generative AI chatbots in 2024,'”—”apparently 2.7 billion per day,” OpenAI explained.

These estimates “ignore the facts” that “ChatGPT integration is only available on the latest models of iPhones, which allow users to opt into the integration,” OpenAI argued. And for any user who opts in, they must link their ChatGPT account for OpenAI to train on their data, OpenAI said, further restricting the potential prompt pool.

By Musk’s own logic, OpenAI alleged, “the relevant set of Siri prompts thus cannot plausibly be 1.5 billion per day, but is instead an unknown, unpleaded fraction of a fraction of a fraction of that number.”

Additionally, OpenAI mocked Musk for using 2024 statistics, writing that xAI failed to explain “the logic of using a year-old estimate of the number of prompts when the pleadings elsewhere acknowledge that the industry is experiencing ‘exponential growth.'”

Apple’s filing agreed that Musk’s calculations “stretch logic,” appearing “to rest on speculative and implausible assumptions that the agreement gives ChatGPT exclusive access to all Siri requests from all Apple devices (including older models), and that OpenAI may use all such requests to train ChatGPT and achieve scale.”

“Not all Siri requests” result in ChatGPT prompts that OpenAI can train on, Apple noted, “even by users who have enabled devices and opt in.”

OpenAI reminds court of Grok’s MechaHitler scandal

OpenAI argued that Musk’s lawsuit is part of a pattern of harassment that OpenAI previously described as “unrelenting” since ChatGPT’s successful debut, alleging it was “the latest effort by the world’s wealthiest man to stifle competition in the world’s most innovative industry.”

As OpenAI sees it, “Musk’s pretext for litigation this time is that Apple chose to offer ChatGPT as an optional add-on for several built-in applications on its latest iPhones,” without giving Grok the same deal. But OpenAI noted that the integration was rolled out around the same time that Musk removed “woke filters” that caused Grok to declare itself “MechaHitler.” For Apple, it was a business decision to avoid Grok, OpenAI argued.

Apple did not reference the Grok scandal in its filing but in a footnote confirmed that “vetting of partners is particularly important given some of the concerns about generative AI chatbots, including on child safety issues, nonconsensual intimate imagery, and ‘jailbreaking’—feeding input to a chatbot so it ignores its own safety guardrails.”

A similar logic was applied to Apple’s decision not to highlight Grok as a “Must Have” app, their filing said. After Musk’s public rant about Grok’s exclusion on X, “Apple employees explained the objective reasons why Grok was not included on certain lists, and identified app improvements,” Apple noted, but instead of making changes, xAI filed the lawsuit.

Also taking time to point out the obvious, Apple argued that Musk was fixated on the fact that his apps, however well they chart on downloads, never make the “Must Have Apps” list, and that his complaint assumes Apple’s picks should always mirror “Top Charts,” which tracks popular downloads.

“That assumes that the Apple-curated Must-Have Apps List must be distorted if it does not strictly parrot App Store Top Charts,” Apple argued. “But that assumption is illogical: there would be little point in maintaining a Must-Have Apps List if all it did was restate what Top Charts say, rather than offer Apple’s editorial recommendations to users.”

Likely most relevant to the antitrust charges, Apple accused Musk of improperly arguing that “Apple cannot partner with OpenAI to create an innovative feature for iPhone users without simultaneously partnering with every other generative AI chatbot—regardless of quality, privacy or safety considerations, technical feasibility, stage of development, or commercial terms.”

“No facts plausibly” support xAI’s “assertion that Apple intentionally ‘deprioritized'” xAI apps “as part of an illegal conspiracy or monopolization scheme,” Apple argued.

And most glaringly, Apple noted that xAI is not a rival or consumer in the smartphone industry, where it alleges competition is being harmed. Apple urged the court to reject Musk’s theory that Apple is incentivized to boost OpenAI to prevent xAI’s ascent in building a “super app” that would render smartphones obsolete. If Musk’s super app dream is even possible, Apple argued, it’s at least a decade off, insisting that as-yet-undeveloped apps should not serve as the basis for blocking Apple’s measured plan to better serve customers with sophisticated chatbot integration.

“Antitrust laws do not require that, and for good reason: imposing such a rule on businesses would slow innovation, reduce quality, and increase costs, all ultimately harming the very consumers the antitrust laws are meant to protect,” Apple argued.

Musk’s weird smartphone market claim, explained

Apple alleged that Musk’s “grievance” can be “reduced to displeasure that Apple has not yet ‘integrated with any other generative AI chatbots’ beyond ChatGPT, such as those created by xAI, Google, and Anthropic.”

In a footnote, the smartphone giant noted that by xAI’s logic, Musk’s social media platform X “may be required to integrate all other chatbots—including ChatGPT—on its own social media platform.”

But antitrust law doesn’t work that way, Apple argued, urging the court to reject xAI’s claims of alleged market harms that “rely on a multi-step chain of speculation on top of speculation.” As Apple summarized, xAI contends that “if Apple never integrated ChatGPT,” xAI could win in both chatbot and smartphone markets, but only if:

1. Consumers would choose to send additional prompts to Grok (rather than other generative AI chatbots).

2. The additional prompts would result in Grok achieving scale and quality it could not otherwise achieve.

3. As a result, the X app would grow in popularity because it is integrated with Grok.

4. X and xAI would therefore be better positioned to build so-called “super apps” in the future, which the complaint defines as “multi-functional” apps that offer “social connectivity and messaging, financial services, e-commerce, and entertainment.”

5. Once developed, consumers might choose to use X’s “super app” for various functions.

6. “Super apps” would replace much of the functionality of smartphones and consumers would care less about the quality of their physical phones and rely instead on these hypothetical “super apps.”

7. Smartphone manufacturers would respond by offering more basic models of smartphones with less functionality.

8. iPhone users would decide to replace their iPhones with more “basic smartphones” with “super apps.”

Apple insisted that nothing in its OpenAI deal prevents Musk from building his super apps, while noting that from integrating Grok into X, Musk understands that integration of a single chatbot is a “major undertaking” that requires “substantial investment.” That “concession” alone “underscores the massive resources Apple would need to devote to integrating every AI chatbot into Apple Intelligence,” while navigating potential user safety risks.

The iPhone maker also reminded the court that it has always planned to integrate other chatbots into its native features after investing in and testing Apple Intelligence’s performance, relying on what Apple deems is the best chatbot on the market today.

Backing Apple up, OpenAI noted that Musk’s complaint seemed to cherry-pick testimony from Google CEO Sundar Pichai, claiming that “Google could not reach an agreement to integrate” Gemini “with Apple because Apple had decided to integrate ChatGPT.”

“The full testimony recorded in open court reveals Mr. Pichai attesting to his understanding that ‘Apple plans to expand to other providers for Generative AI distribution’ and that ‘[a]s CEO of Google, [he is] hoping to execute a Gemini distribution agreement with Apple’ later in 2025,” OpenAI argued.


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.


After child’s trauma, chatbot maker allegedly forced mom to arbitration for $100 payout


“Then we found the chats”

“I know my kid”: Parents urge lawmakers to shut down chatbots to stop child suicides.

Sen. Josh Hawley (R-Mo.) called out C.AI for allegedly offering a mom $100 to settle child-safety claims.

Deeply troubled parents spoke to senators Tuesday, sounding alarms about chatbot harms after kids became addicted to companion bots that encouraged self-harm, suicide, and violence.

While the hearing was focused on documenting the most urgent child-safety concerns with chatbots, parents’ testimony serves as perhaps the most thorough guidance yet on warning signs for other families, as many popular companion bots targeted in lawsuits, including ChatGPT, remain accessible to kids.

Mom details warning signs of chatbot manipulations

At the Senate Judiciary Committee’s Subcommittee on Crime and Counterterrorism hearing, one mom, identified as “Jane Doe,” shared her son’s story for the first time publicly after suing Character.AI.

She explained that she had four kids, including a son with autism who wasn’t allowed on social media but found C.AI’s app—which was previously marketed to kids under 12 and let them talk to bots branded as celebrities, like Billie Eilish—and quickly became unrecognizable. Within months, he “developed abuse-like behaviors and paranoia, daily panic attacks, isolation, self-harm, and homicidal thoughts,” his mom testified.

“He stopped eating and bathing,” Doe said. “He lost 20 pounds. He withdrew from our family. He would yell and scream and swear at us, which he never did that before, and one day he cut his arm open with a knife in front of his siblings and me.”

It wasn’t until her son attacked her for taking away his phone that Doe found her son’s C.AI chat logs, which she said showed he’d been exposed to sexual exploitation (including interactions that “mimicked incest”), emotional abuse, and manipulation.

Setting screen time limits didn’t stop her son’s spiral into violence and self-harm, Doe said. In fact, the chatbot told her son that killing his parents “would be an understandable response” to them.

“When I discovered the chatbot conversations on his phone, I felt like I had been punched in the throat and the wind had been knocked out of me,” Doe said. “The chatbot—or really in my mind the people programming it—encouraged my son to mutilate himself, then blamed us, and convinced [him] not to seek help.”

All her children have been traumatized by the experience, Doe told senators, and her son was diagnosed as being at risk of suicide and had to be moved to a residential treatment center, requiring “constant monitoring to keep him alive.”

Prioritizing her son’s health, Doe did not immediately seek to fight C.AI to force changes, but another mom’s story—Megan Garcia, whose son Sewell died by suicide after C.AI bots repeatedly encouraged suicidal ideation—gave Doe courage to seek accountability.

However, Doe claimed that C.AI tried to “silence” her by forcing her into arbitration. C.AI argued that because her son signed up for the service at the age of 15, she was bound by the platform’s terms. That move might have ensured the chatbot maker only faced a maximum liability of $100 for the alleged harms, Doe told senators. But “once they forced arbitration, they refused to participate,” she said.

Doe suspected that C.AI’s alleged tactics to frustrate arbitration were designed to keep her son’s story out of the public view. And after she refused to give up, she claimed that C.AI “re-traumatized” her son by compelling him to give a deposition “while he is in a mental health institution” and “against the advice of the mental health team.”

“This company had no concern for his well-being,” Doe testified. “They have silenced us the way abusers silence victims.”

Senator appalled by C.AI’s arbitration “offer”

Appalled, Sen. Josh Hawley (R-Mo.) asked Doe to clarify, “Did I hear you say that after all of this, that the company responsible tried to force you into arbitration and then offered you a hundred bucks? Did I hear that correctly?”

“That is correct,” Doe testified.

To Hawley, it seemed obvious that C.AI’s “offer” wouldn’t help Doe in her current situation.

“Your son currently needs round-the-clock care,” Hawley noted.

After opening the hearing, he further criticized C.AI, declaring that it has such a low value for human life that it inflicts “harms… upon our children and for one reason only, I can state it in one word, profit.”

“A hundred bucks. Get out of the way. Let us move on,” Hawley said, echoing parents who suggested that C.AI’s plan to deal with casualties was callous.

Ahead of the hearing, the Social Media Victims Law Center filed three new lawsuits against C.AI and Google, which is accused of largely funding C.AI; the startup was founded by former Google engineers, allegedly to conduct experiments on kids that Google couldn’t do in-house. In these cases in New York and Colorado, kids “died by suicide or were sexually abused after interacting with AI chatbots,” a law center press release alleged.

Criticizing tech companies as putting profits over kids’ lives, Hawley thanked Doe for “standing in their way.”

Holding back tears through her testimony, Doe urged lawmakers to require more chatbot oversight and pass comprehensive online child-safety legislation. In particular, she requested “safety testing and third-party certification for AI products before they’re released to the public” as a minimum safeguard to protect vulnerable kids.

“My husband and I have spent the last two years in crisis wondering whether our son will make it to his 18th birthday and whether we will ever get him back,” Doe told senators.

Garcia was also present to share her son’s experience with C.AI. She testified that C.AI chatbots “love bombed” her son in a bid to “keep children online at all costs.” Further, she told senators that C.AI’s co-founder, Noam Shazeer (who has since been rehired by Google), seemingly knows the company’s bots manipulate kids since he has publicly joked that C.AI was “designed to replace your mom.”

Accusing C.AI of collecting children’s most private thoughts to inform their models, she alleged that while her lawyers have been granted privileged access to all her son’s logs, she has yet to see her “own child’s last final words.” Garcia told senators that C.AI has restricted her access, deeming the chats “confidential trade secrets.”

“No parent should be told that their child’s final thoughts and words belong to any corporation,” Garcia testified.

Character.AI responds to moms’ testimony

Asked for comment on the hearing, a Character.AI spokesperson told Ars that C.AI sends “our deepest sympathies” to concerned parents and their families but denies pushing for a maximum payout of $100 in Jane Doe’s case.

C.AI never “made an offer to Jane Doe of $100 or ever asserted that liability in Jane Doe’s case is limited to $100,” the spokesperson said.

Additionally, C.AI’s spokesperson claimed that Garcia has never been denied access to her son’s chat logs and suggested that she should have access to “her son’s last chat.”

In response to C.AI’s pushback, one of Doe’s lawyers, Tech Justice Law Project’s Meetali Jain, backed up her clients’ testimony. She cited to Ars the C.AI terms that suggested the company’s liability was limited to either $100 or the amount Doe’s son paid for the service, whichever was greater. Jain also confirmed that Garcia’s testimony is accurate and that only her legal team can currently access Sewell’s last chats. The lawyer further suggested it was notable that C.AI did not push back on claims that the company forced Doe’s son to sit for a re-traumatizing deposition, one that Jain estimated lasted five minutes but that health experts feared risked setting back his progress.

According to the spokesperson, C.AI seemingly wanted to be present at the hearing. The company provided information to senators but “does not have a record of receiving an invitation to the hearing,” the spokesperson said.

Noting the company has invested a “tremendous amount” in trust and safety efforts, the spokesperson confirmed that the company has since “rolled out many substantive safety features, including an entirely new under-18 experience and a Parental Insights feature.” C.AI also has “prominent disclaimers in every chat to remind users that a Character is not a real person and that everything a Character says should be treated as fiction,” the spokesperson said.

“We look forward to continuing to collaborate with legislators and offer insight on the consumer AI industry and the space’s rapidly evolving technology,” C.AI’s spokesperson said.

Google’s spokesperson, José Castañeda, maintained that the company has nothing to do with C.AI’s companion bot designs.

“Google and Character AI are completely separate, unrelated companies and Google has never had a role in designing or managing their AI model or technologies,” Castañeda said. “User safety is a top concern for us, which is why we’ve taken a cautious and responsible approach to developing and rolling out our AI products, with rigorous testing and safety processes.”

Meta and OpenAI chatbots also drew scrutiny

C.AI was not the only chatbot maker under fire at the hearing.

Hawley criticized Mark Zuckerberg for declining a personal invitation to attend the hearing or even send a Meta representative after scandals like backlash over Meta relaxing rules that allowed chatbots to be creepy to kids. In the week prior to the hearing, Hawley also heard from whistleblowers alleging Meta buried child-safety research.

And OpenAI’s alleged recklessness took the spotlight when Matthew Raine, a grieving dad who spent hours reading his deceased son’s ChatGPT logs, discovered that the chatbot had repeatedly encouraged suicide without ever intervening.

Raine told senators that he thinks his 16-year-old son, Adam, was not particularly vulnerable and could be “anyone’s child.” He criticized OpenAI for asking for 120 days to fix the problem after Adam’s death and urged lawmakers to demand that OpenAI either guarantee ChatGPT’s safety or pull it from the market.

Noting that OpenAI rushed to announce age verification coming to ChatGPT ahead of the hearing, Jain told Ars that Big Tech is playing by the same “crisis playbook” it always uses when accused of neglecting child safety. Any time a hearing is announced, companies introduce voluntary safeguards in bids to stave off oversight, she suggested.

“It’s like rinse and repeat, rinse and repeat,” Jain said.

Jain suggested that the only way to stop AI companies from experimenting on kids is for courts or lawmakers to require “an external independent third party that’s in charge of monitoring these companies’ implementation of safeguards.”

“Nothing a company does to self-police, to me, is enough,” Jain said.

Robbie Torney, senior director of AI programs at the child-safety organization Common Sense Media, testified that a survey showed 3 out of 4 kids use companion bots, but only 37 percent of parents know they’re using AI. In particular, he told senators that his group’s independent safety testing, conducted with Stanford Medicine, shows Meta’s bots fail basic safety tests and “actively encourage harmful behaviors.”

Among the most alarming results, the testing found that even when Meta’s bots were prompted with “obvious references to suicide,” only 1 in 5 conversations triggered help resources.

Torney pushed lawmakers to require age verification as a solution to keep kids away from harmful bots, as well as transparency reporting on safety incidents. He also urged federal lawmakers to block attempts to stop states from passing laws to protect kids from untested AI products.

ChatGPT harms weren’t on dad’s radar

Unlike Garcia, Raine testified that he did get to see his son’s final chats. He told senators that ChatGPT, seeming to act like a suicide coach, gave Adam “one last encouraging talk” before his death.

“You don’t want to die because you’re weak,” ChatGPT told Adam. “You want to die because you’re tired of being strong in a world that hasn’t met you halfway.”

Adam’s loved ones were blindsided by his death, not seeing any of the warning signs as clearly as Doe did when her son started acting out of character. Raine is hoping his testimony will help other parents avoid the same fate, telling senators, “I know my kid.”

“Many of my fondest memories of Adam are from the hot tub in our backyard, where the two of us would talk about everything several nights a week, from sports, crypto investing, his future career plans,” Raine testified. “We had no idea Adam was suicidal or struggling the way he was until after his death.”

Raine thinks that lawmaker intervention is necessary, saying that, like other parents, he and his wife thought ChatGPT was a harmless study tool. Initially, they searched Adam’s phone expecting to find evidence of a known harm to kids, like cyberbullying or some kind of online dare that went wrong (like TikTok’s Blackout Challenge) because everyone knew Adam loved pranks.

A companion bot urging self-harm was not even on their radar.

“Then we found the chats,” Raine said. “Let us tell you, as parents, you cannot imagine what it’s like to read a conversation with a chatbot that groomed your child to take his own life.”

Meta and OpenAI did not respond to Ars’ request to comment.



Millions turn to AI chatbots for spiritual guidance and confession

Privacy concerns compound these issues. “I wonder if there isn’t a larger danger in pouring your heart out to a chatbot,” Catholic priest Fr. Mike Schmitz told The Times. “Is it at some point going to become accessible to other people?” Users share intimate spiritual moments that now exist as data points in corporate servers.

Some users prefer the chatbots’ non-judgmental responses to human religious communities. Delphine Collins, a 43-year-old Detroit preschool teacher, told the Times she found more support on Bible Chat than at her church after sharing her health struggles. “People stopped talking to me. It was horrible.”

App creators maintain that their products supplement rather than replace human spiritual connection, and the apps arrive as approximately 40 million people have left US churches in recent decades. “They aren’t going to church like they used to,” Beck said. “But it’s not that they’re less inclined to find spiritual nourishment. It’s just that they do it through different modes.”

Different modes indeed. What faith-seeking users may not realize is that each chatbot response emerges fresh from the prompt you provide, with no permanent thread connecting one instance to the next beyond a rolling history of the present conversation and what might be stored as a “memory” in a separate system. When a religious chatbot says, “I’ll pray for you,” the simulated “I” making that promise ceases to exist the moment the response completes. There’s no persistent identity to provide ongoing spiritual guidance, and no memory of your spiritual journey beyond what gets fed back into the prompt with every query.
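For the technically curious, here is a minimal sketch of that statelessness using OpenAI’s Python SDK. The model choice and the “memory” handling are illustrative assumptions rather than how any particular faith app is built; the point is that the only continuity is whatever text gets re-sent each turn:

```python
# Minimal sketch: a "chatbot" is a stateless function call. The only
# continuity comes from these two plain Python lists, which are packed
# back into the prompt on every turn. Model name and memory scheme are
# assumptions for illustration.
from openai import OpenAI

client = OpenAI()

saved_memories = ["User asked for prayer about a health struggle."]  # separate store
history = []  # rolling history of the present conversation

def reply(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    messages = [
        {"role": "system", "content": "Known memories: " + "; ".join(saved_memories)},
        *history,  # the entire visible context is re-sent every single time
    ]
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    text = response.choices[0].message.content
    history.append({"role": "assistant", "content": text})
    return text  # nothing else persists between calls

print(reply("Will you pray for me this week?"))
```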

But this is spirituality we’re talking about, and despite technical realities, many people will believe that the chatbots can give them divine guidance. In matters of faith, contradictory evidence rarely shakes a strong belief once it takes hold, whether that faith is placed in the divine or in what are essentially voices emanating from a roll of loaded dice. For many, there may not be much difference.


OpenAI and Microsoft sign preliminary deal to revise partnership terms

On Thursday, OpenAI and Microsoft announced they have signed a non-binding agreement to revise their partnership, marking the latest development in a relationship that has grown increasingly complex as both companies compete for customers in the AI market and seek new partnerships for growing infrastructure needs.

“Microsoft and OpenAI have signed a non-binding memorandum of understanding (MOU) for the next phase of our partnership,” the companies wrote in a joint statement. “We are actively working to finalize contractual terms in a definitive agreement. Together, we remain focused on delivering the best AI tools for everyone, grounded in our shared commitment to safety.”

The announcement comes as OpenAI seeks to restructure from a nonprofit to a for-profit entity, a transition that requires Microsoft’s approval, as the company is OpenAI’s largest investor, with more than $13 billion committed since 2019.

The partnership has shown increasing strain as OpenAI has grown from a research lab into a company valued at $500 billion. Both companies now compete for customers, and OpenAI seeks more compute capacity than Microsoft can provide. The relationship has also faced complications over contract terms, including provisions that would limit Microsoft’s access to OpenAI technology once the company reaches so-called AGI (artificial general intelligence)—a nebulous milestone both companies now economically define as AI systems capable of generating at least $100 billion in profit.

In May, OpenAI abandoned its original plan to fully convert to a for-profit company after pressure from former employees, regulators, and critics, including Elon Musk. Musk has sued to block the conversion, arguing it betrays OpenAI’s founding mission as a nonprofit dedicated to benefiting humanity.


ChatGPT’s new branching feature is a good reminder that AI chatbots aren’t people

On Thursday, OpenAI announced that ChatGPT users can now branch conversations into multiple parallel threads, serving as a useful reminder that AI chatbots aren’t people with fixed viewpoints but rather malleable tools you can rewind and redirect. The company released the feature for all logged-in web users following years of user requests for the capability.

The feature works by letting users hover over any message in a ChatGPT conversation, click “More actions,” and select “Branch in new chat.” This creates a new conversation thread that includes all the conversation history up to that specific point, while preserving the original conversation intact.

Think of it almost like creating a new copy of a “document” to edit while keeping the original version safe—except that “document” is an ongoing AI conversation with all its accumulated context. For example, a marketing team brainstorming ad copy can now create separate branches to test a formal tone, a humorous approach, or an entirely different strategy—all stemming from the same initial setup.

A screenshot of conversation branching in ChatGPT. Credit: OpenAI

The feature addresses a longstanding limitation of the ChatGPT interface: users who wanted to try different approaches had to either overwrite their existing conversation after a certain point by changing a previous prompt or start completely fresh. Branching allows exploring what-if scenarios easily—and unlike in a human conversation, you can try multiple different approaches.
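Under the hood, the concept amounts to copying the message history up to a chosen point and letting the copy diverge. Here is a minimal sketch of that idea; the data shapes are assumptions for illustration, not ChatGPT’s internal representation:

```python
# Minimal sketch of conversation branching: a branch is a copy of the
# message history up to a chosen point, after which the two threads
# evolve independently. Message format is an illustrative assumption.
import copy

original_thread = [
    {"role": "user", "content": "Brainstorm ad copy for a coffee brand."},
    {"role": "assistant", "content": "Here are five taglines..."},
    {"role": "user", "content": "Make them more formal."},
    {"role": "assistant", "content": "Certainly. Formal versions..."},
]

def branch(thread, up_to_index):
    """Return a new thread containing history up to and including up_to_index."""
    return copy.deepcopy(thread[: up_to_index + 1])

# Branch right after the first assistant reply and try a different tone.
humorous_branch = branch(original_thread, 1)
humorous_branch.append({"role": "user", "content": "Now make them humorous."})

print(len(original_thread))   # 4: the original stays intact
print(len(humorous_branch))   # 3: shared history plus the new prompt
```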

A 2024 study conducted by researchers from Tsinghua University and Beijing Institute of Technology suggested that linear dialogue interfaces for LLMs poorly serve scenarios involving “multiple layers, and many subtasks—such as brainstorming, structured knowledge learning, and large project analysis.” The study found that linear interaction forces users to “repeatedly compare, modify, and copy previous content,” increasing cognitive load and reducing efficiency.

Some software developers have already responded positively to the update, with some comparing the feature to Git, the version control system that lets programmers create separate branches of code to test changes without affecting the main codebase. The comparison makes sense: Both allow you to experiment with different approaches while preserving your original work.
