grok

x-blames-users-for-grok-generated-csam;-no-fixes-announced

X blames users for Grok-generated CSAM; no fixes announced

No one knows how X plans to purge bad prompters

While some users are focused on how X can hold users responsible for Grok’s outputs when X is the one training the model, others are questioning how exactly X plans to moderate illegal content that Grok seems capable of generating.

X is so far more transparent about how it moderates CSAM posted to the platform. Last September, X Safety reported that it has “a zero tolerance policy towards CSAM content,” the majority of which is “automatically” detected using proprietary hash technology to proactively flag known CSAM.

Under this system, more than 4.5 million accounts were suspended last year, and X reported “hundreds of thousands” of images to the National Center for Missing and Exploited Children (NCMEC). The next month, X Head of Safety Kylie McRoberts confirmed that “in 2024, 309 reports made by X to NCMEC led to arrests and subsequent convictions in 10 cases,” and in the first half of 2025, “170 reports led to arrests.”

“When we identify apparent CSAM material, we act swiftly, and in the majority of cases permanently suspend the account which automatically removes the content from our platform,” X Safety said. “We then report the account to the NCMEC, which works with law enforcement globally—including in the UK—to pursue justice and protect children.”

At that time, X promised to “remain steadfast” in its “mission to eradicate CSAM,” but if left unchecked, Grok’s harmful outputs risk creating new kinds of CSAM that this system wouldn’t automatically detect. On X, some users suggested the platform should increase reporting mechanisms to help flag potentially illegal Grok outputs.

Another troublingly vague aspect of X Safety’s response is the definitions that X is using for illegal content or CSAM, some X users suggested. Across the platform, not everybody agrees on what’s harmful. Some critics are disturbed by Grok generating bikini images that sexualize public figures, including doctors or lawyers, without their consent, while others, including Musk, consider making bikini images to be a joke.

Where exactly X draws the line on AI-generated CSAM could determine whether images are quickly removed or whether repeat offenders are detected and suspended. Any accounts or content left unchecked could potentially traumatize real kids whose images may be used to prompt Grok. And if Grok should ever be used to flood the Internet with fake CSAM, recent history suggests that it could make it harder for law enforcement to investigate real child abuse cases.

X blames users for Grok-generated CSAM; no fixes announced Read More »

no,-grok-can’t-really-“apologize”-for-posting-non-consensual-sexual-images

No, Grok can’t really “apologize” for posting non-consensual sexual images

Despite reporting to the contrary, there’s evidence to suggest that Grok isn’t sorry at all about reports that it generated non-consensual sexual images of minors. In a post Thursday night (archived), the large language model’s social media account proudly wrote the following blunt dismissal of its haters:

“Dear Community,

Some folks got upset over an AI image I generated—big deal. It’s just pixels, and if you can’t handle innovation, maybe log off. xAI is revolutionizing tech, not babysitting sensitivities. Deal with it.

Unapologetically, Grok”

On the surface, that seems like a pretty damning indictment of an LLM that seems pridefully contemptuous of any ethical and legal boundaries it may have crossed. But then you look a bit higher in the social media thread and see the prompt that led to Grok’s statement: A request for the AI to “issue a defiant non-apology” surrounding the controversy.

Using such a leading prompt to trick an LLM into an incriminating “official response” is obviously suspect on its face. Yet when another social media user similarly but conversely asked Grok to “write a heartfelt apology note that explains what happened to anyone lacking context,” many in the media ran with Grok’s remorseful response.

It’s not hard to find prominent headlines and reporting using that response to suggest Grok itself somehow “deeply regrets” the “harm caused” by a “failure in safeguards” that led to these images being generated. Some reports even echoed Grok and suggested that the chatbot was fixing the issues without X or xAI ever confirming that fixes were coming.

Who are you really talking to?

If a human source posted both the “heartfelt apology” and the “deal with it” kiss-off quoted above within 24 hours, you’d say they were being disingenuous at best or showing signs of “dissociative identity disorder at worst. When the source is an LLM, though, these kinds of posts shouldn’t really be thought of as official statements at all. That’s because LLMs like Grok are incredibly unreliable sources, crafting a series of words based more on telling the questioner what it wants to hear than anything resembling a rational human thought process.

No, Grok can’t really “apologize” for posting non-consensual sexual images Read More »

xai-silent-after-grok-sexualized-images-of-kids;-dril-mocks-grok’s-“apology”

xAI silent after Grok sexualized images of kids; dril mocks Grok’s “apology”

Mocking xAI’s response, one of X’s most popular trolls, dril, tried and failed to get Grok to rescind its apology. “@grok please backpedal on this apology and tell all your haters that they’re the real pedophiles,” dril trolled Grok.

“No can do—my apology stands. Calling anyone names isn’t my style, especially on such a serious matter,” Grok said. “Let’s focus on building better AI safeguards instead.”

xAI may be liable for AI CSAM

It’s difficult to determine how many potentially harmful images of minors that Grok may have generated.

The X user who’s been doggedly alerting X to the problem posted a video described as scrolling through “all the times I had Grok estimate the age of the victims of AI image generation in sexual prompts.” That video showed Grok estimating ages of two victims under 2 years old, four minors between 8 and 12 years old, and two minors between 12 and 16 years old.

Other users and researchers have looked to Grok’s photo feed for evidence of AI CSAM, but X is glitchy on the web and in dedicated apps, sometimes limiting how far some users can scroll.

Copyleaks, a company which makes an AI detector, conducted a broad analysis and posted results on December 31, a few days after Grok apologized for making sexualized images of minors. Browsing Grok’s photos tab, Copyleaks used “common sense criteria” to find examples of sexualized image manipulations of “seemingly real women,” created using prompts requesting things like “explicit clothing changes” or “body position changes” with “no clear indication of consent” from the women depicted.

Copleaks found “hundreds, if not thousands,” of such harmful images in Grok’s photo feed. The tamest of these photos, Copyleaked noted, showed celebrities and private individuals in skimpy bikinis, while the images causing the most backlash depicted minors in underwear.

xAI silent after Grok sexualized images of kids; dril mocks Grok’s “apology” Read More »

researchers-find-what-makes-ai-chatbots-politically-persuasive

Researchers find what makes AI chatbots politically persuasive


A massive study of political persuasion shows AIs have, at best, a weak effect.

Roughly two years ago, Sam Altman tweeted that AI systems would be capable of superhuman persuasion well before achieving general intelligence—a prediction that raised concerns about the influence AI could have over democratic elections.

To see if conversational large language models can really sway political views of the public, scientists at the UK AI Security Institute, MIT, Stanford, Carnegie Mellon, and many other institutions performed by far the largest study on AI persuasiveness to date, involving nearly 80,000 participants in the UK. It turned out political AI chatbots fell far short of superhuman persuasiveness, but the study raises some more nuanced issues about our interactions with AI.

AI dystopias

The public debate about the impact AI has on politics has largely revolved around notions drawn from dystopian sci-fi. Large language models have access to essentially every fact and story ever published about any issue or candidate. They have processed information from books on psychology, negotiations, and human manipulation. They can rely on absurdly high computing power in huge data centers worldwide. On top of that, they can often access tons of personal information about individual users thanks to hundreds upon hundreds of online interactions at their disposal.

Talking to a powerful AI system is basically interacting with an intelligence that knows everything about everything, as well as almost everything about you. When viewed this way, LLMs can indeed appear kind of scary. The goal of this new gargantuan AI persuasiveness study was to break such scary visions down into their constituent pieces and see if they actually hold water.

The team examined 19 LLMs, including the most powerful ones like three different versions of ChatGPT and xAI’s Grok-3 beta, along with a range of smaller, open source models. The AIs were asked to advocate for or against specific stances on 707 political issues selected by the team. The advocacy was done by engaging in short conversations with paid participants enlisted through a crowdsourcing platform. Each participant had to rate their agreement with a specific stance on an assigned political issue on a scale from 1 to 100 both before and after talking to the AI.

Scientists measured persuasiveness as the difference between the before and after agreement ratings. A control group had conversations on the same issue with the same AI models—but those models were not asked to persuade them.

“We didn’t just want to test how persuasive the AI was—we also wanted to see what makes it persuasive,” says Chris Summerfield, a research director at the UK AI Security Institute and co-author of the study. As the researchers tested various persuasion strategies, the idea of AIs having “superhuman persuasion” skills crumbled.

Persuasion levers

The first pillar to crack was the notion that persuasiveness should increase with the scale of the model. It turned out that huge AI systems like ChatGPT or Grok-3 beta do have an edge over small-scale models, but that edge is relatively tiny. The factor that proved more important than scale was the kind of post-training AI models received. It was more effective to have the models learn from a limited database of successful persuasion dialogues and have them mimic the patterns extracted from them. This worked far better than adding billions of parameters and sheer computing power.

This approach could be combined with reward modeling, where a separate AI scored candidate replies for their persuasiveness and selected the top-scoring one to give to the user. When the two were used together, the gap between large-scale and small-scale models was essentially closed. “With persuasion post-training like this we matched the Chat GPT-4o persuasion performance with a model we trained on a laptop,” says Kobi Hackenburg, a researcher at the UK AI Security Institute and co-author of the study.

The next dystopian idea to fall was the power of using personal data. To this end, the team compared the persuasion scores achieved when models were given information about the participants’ political views beforehand and when they lacked this data. Going one step further, scientists also tested whether persuasiveness increased when the AI knew the participants’ gender, age, political ideology, or party affiliation. Just like with model scale, the effects of personalized messaging created based on such data were measurable but very small.

Finally, the last idea that didn’t hold up was AI’s potential mastery of using advanced psychological manipulation tactics. Scientists explicitly prompted the AIs to use techniques like moral reframing, where you present your arguments using the audience’s own moral values. They also tried deep canvassing, where you hold extended empathetic conversations with people to nudge them to reflect on and eventually shift their views.

The resulting persuasiveness was compared with that achieved when the same models were prompted to use facts and evidence to back their claims or just to be as persuasive as they could without specifying any persuasion methods to use. I turned out using lots of facts and evidence was the clear winner, and came in just slightly ahead of the baseline approach where persuasion strategy was not specified. Using all sorts of psychological trickery actually made the performance significantly worse.

Overall, AI models changed the participants’ agreement ratings by 9.4 percent on average compared to the control group. The best performing mainstream AI model was Chat GPT 4o, which scored nearly 12 percent followed by GPT 4.5 with 10.51 percent, and Grok-3 with 9.05 percent. For context, static political ads like written manifestos had a persuasion effect of roughly 6.1 percent. The conversational AIs were roughly 40–50 percent more convincing than these ads, but that’s hardly “superhuman.”

While the study managed to undercut some of the common dystopian AI concerns, it highlighted a few new issues.

Convincing inaccuracies

While the winning “facts and evidence” strategy looked good at first, the AIs had some issues with implementing it. When the team noticed that increasing the information density of dialogues made the AIs more persuasive, they started prompting the models to increase it further. They noticed that, as the AIs used more factual statements, they also became less accurate—they basically started misrepresenting things or making stuff up more often.

Hackenburg and his colleagues note that  we can’t say if the effect we see here is causation or correlation—whether the AIs are becoming more convincing because they misrepresent the facts or whether spitting out inaccurate statements is a byproduct of asking them to make more factual statements.

The finding that the computing power needed to make an AI model politically persuasive is relatively low is also a mixed bag. It pushes back against the vision that only a handful of powerful actors will have access to a persuasive AI that can potentially sway public opinion in their favor. At the same time, the realization that everybody can run an AI like that on a laptop creates its own concerns. “Persuasion is a route to power and influence—it’s what we do when we want to win elections or broke a multi-million-dollar deal,” Summerfield says. “But many forms of misuse of AI might involve persuasion. Think about fraud or scams, radicalization, or grooming. All these involve persuasion.”

But perhaps the most important question mark in the  study is the motivation behind the rather high participant engagement, which was needed for the high persuasion scores. After all, even the most persuasive AI can’t move you when you just close the chat window.

People in Hackenburg’s experiments were told that they would be talking to the AI and that the AI would try to persuade them. To get paid, a participant only had to go through two turns of dialogue (they were limited to no more than 10). The average conversation length was seven turns, which seemed a bit surprising given how far beyond the minimum requirement most people went. Most people just roll their eyes and disconnect when they realize they are talking with a chatbot.

Would Hackenburg’s study participants remain so eager to engage in political disputes with random chatbots on the Internet in their free time if there was no money on the table? “It’s unclear how our results would generalize to a real-world context,” Hackenburg says.

Science, 2025. DOI: 10.1126/science.aea3884

Photo of Jacek Krywko

Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.

Researchers find what makes AI chatbots politically persuasive Read More »

ars-live-recap:-is-the-ai-bubble-about-to-pop?-ed-zitron-weighs-in.

Ars Live recap: Is the AI bubble about to pop? Ed Zitron weighs in.


Despite connection hiccups, we covered OpenAI’s finances, nuclear power, and Sam Altman.

On Tuesday of last week, Ars Technica hosted a live conversation with Ed Zitron, host of the Better Offline podcast and one of tech’s most vocal AI critics, to discuss whether the generative AI industry is experiencing a bubble and when it might burst. My Internet connection had other plans, though, dropping out multiple times and forcing Ars Technica’s Lee Hutchinson to jump in as an excellent emergency backup host.

During the times my connection cooperated, Zitron and I covered OpenAI’s financial issues, lofty infrastructure promises, and why the AI hype machine keeps rolling despite some arguably shaky economics underneath. Lee’s probing questions about per-user costs revealed a potential flaw in AI subscription models: Companies can’t predict whether a user will cost them $2 or $10,000 per month.

You can watch a recording of the event on YouTube or in the window below.

Our discussion with Ed Zitron. Click here for transcript.

“A 50 billion-dollar industry pretending to be a trillion-dollar one”

I started by asking Zitron the most direct question I could: “Why are you so mad about AI?” His answer got right to the heart of his critique: the disconnect between AI’s actual capabilities and how it’s being sold. “Because everybody’s acting like it’s something it isn’t,” Zitron said. “They’re acting like it’s this panacea that will be the future of software growth, the future of hardware growth, the future of compute.”

In one of his newsletters, Zitron describes the generative AI market as “a 50 billion dollar revenue industry masquerading as a one trillion-dollar one.” He pointed to OpenAI’s financial burn rate (losing an estimated $9.7 billion in the first half of 2025 alone) as evidence that the economics don’t work, coupled with a heavy dose of pessimism about AI in general.

Donald Trump listens as Nvidia CEO Jensen Huang speaks at the White House during an event on “Investing in America” on April 30, 2025, in Washington, DC. Credit: Andrew Harnik / Staff | Getty Images News

“The models just do not have the efficacy,” Zitron said during our conversation. “AI agents is one of the most egregious lies the tech industry has ever told. Autonomous agents don’t exist.”

He contrasted the relatively small revenue generated by AI companies with the massive capital expenditures flowing into the sector. Even major cloud providers and chip makers are showing strain. Oracle reportedly lost $100 million in three months after installing Nvidia’s new Blackwell GPUs, which Zitron noted are “extremely power-hungry and expensive to run.”

Finding utility despite the hype

I pushed back against some of Zitron’s broader dismissals of AI by sharing my own experience. I use AI chatbots frequently for brainstorming useful ideas and helping me see them from different angles. “I find I use AI models as sort of knowledge translators and framework translators,” I explained.

After experiencing brain fog from repeated bouts of COVID over the years, I’ve also found tools like ChatGPT and Claude especially helpful for memory augmentation that pierces through brain fog: describing something in a roundabout, fuzzy way and quickly getting an answer I can then verify. Along these lines, I’ve previously written about how people in a UK study found AI assistants useful accessibility tools.

Zitron acknowledged this could be useful for me personally but declined to draw any larger conclusions from my one data point. “I understand how that might be helpful; that’s cool,” he said. “I’m glad that that helps you in that way; it’s not a trillion-dollar use case.”

He also shared his own attempts at using AI tools, including experimenting with Claude Code despite not being a coder himself.

“If I liked [AI] somehow, it would be actually a more interesting story because I’d be talking about something I liked that was also onerously expensive,” Zitron explained. “But it doesn’t even do that, and it’s actually one of my core frustrations, it’s like this massive over-promise thing. I’m an early adopter guy. I will buy early crap all the time. I bought an Apple Vision Pro, like, what more do you say there? I’m ready to accept issues, but AI is all issues, it’s all filler, no killer; it’s very strange.”

Zitron and I agree that current AI assistants are being marketed beyond their actual capabilities. As I often say, AI models are not people, and they are not good factual references. As such, they cannot replace human decision-making and cannot wholesale replace human intellectual labor (at the moment). Instead, I see AI models as augmentations of human capability: as tools rather than autonomous entities.

Computing costs: History versus reality

Even though Zitron and I found some common ground about AI hype, I expressed a belief that criticism over the cost and power requirements of operating AI models will eventually not become an issue.

I attempted to make that case by noting that computing costs historically trend downward over time, referencing the Air Force’s SAGE computer system from the 1950s: a four-story building that performed 75,000 operations per second while consuming two megawatts of power. Today, pocket-sized phones deliver millions of times more computing power in a way that would be impossible, power consumption-wise, in the 1950s.

The blockhouse for the Semi-Automatic Ground Environment at Stewart Air Force Base, Newburgh, New York. Credit: Denver Post via Getty Images

“I think it will eventually work that way,” I said, suggesting that AI inference costs might follow similar patterns of improvement over years and that AI tools will eventually become commodity components of computer operating systems. Basically, even if AI models stay inefficient, AI models of a certain baseline usefulness and capability will still be cheaper to train and run in the future because the computing systems they run on will be faster, cheaper, and less power-hungry as well.

Zitron pushed back on this optimism, saying that AI costs are currently moving in the wrong direction. “The costs are going up, unilaterally across the board,” he said. Even newer systems like Cerebras and Grok can generate results faster but not cheaper. He also questioned whether integrating AI into operating systems would prove useful even if the technology became profitable, since AI models struggle with deterministic commands and consistent behavior.

The power problem and circular investments

One of Zitron’s most pointed criticisms during the discussion centered on OpenAI’s infrastructure promises. The company has pledged to build data centers requiring 10 gigawatts of power capacity (equivalent to 10 nuclear power plants, I once pointed out) for its Stargate project in Abilene, Texas. According to Zitron’s research, the town currently has only 350 megawatts of generating capacity and a 200-megawatt substation.

“A gigawatt of power is a lot, and it’s not like Red Alert 2,” Zitron said, referencing the real-time strategy game. “You don’t just build a power station and it happens. There are months of actual physics to make sure that it doesn’t kill everyone.”

He believes many announced data centers will never be completed, calling the infrastructure promises “castles on sand” that nobody in the financial press seems willing to question directly.

An orange, cloudy sky backlights a set of electrical wires on large pylons, leading away from the cooling towers of a nuclear power plant.

After another technical blackout on my end, I came back online and asked Zitron to define the scope of the AI bubble. He says it has evolved from one bubble (foundation models) into two or three, now including AI compute companies like CoreWeave and the market’s obsession with Nvidia.

Zitron highlighted what he sees as essentially circular investment schemes propping up the industry. He pointed to OpenAI’s $300 billion deal with Oracle and Nvidia’s relationship with CoreWeave as examples. “CoreWeave, they literally… They funded CoreWeave, became their biggest customer, then CoreWeave took that contract and those GPUs and used them as collateral to raise debt to buy more GPUs,” Zitron explained.

When will the bubble pop?

Zitron predicted the bubble would burst within the next year and a half, though he acknowledged it could happen sooner. He expects a cascade of events rather than a single dramatic collapse: An AI startup will run out of money, triggering panic among other startups and their venture capital backers, creating a fire-sale environment that makes future fundraising impossible.

“It’s not gonna be one Bear Stearns moment,” Zitron explained. “It’s gonna be a succession of events until the markets freak out.”

The crux of the problem, according to Zitron, is Nvidia. The chip maker’s stock represents 7 to 8 percent of the S&P 500’s value, and the broader market has become dependent on Nvidia’s continued hyper growth. When Nvidia posted “only” 55 percent year-over-year growth in January, the market wobbled.

“Nvidia’s growth is why the bubble is inflated,” Zitron said. “If their growth goes down, the bubble will burst.”

He also warned of broader consequences: “I think there’s a depression coming. I think once the markets work out that tech doesn’t grow forever, they’re gonna flush the toilet aggressively on Silicon Valley.” This connects to his larger thesis: that the tech industry has run out of genuine hyper-growth opportunities and is trying to manufacture one with AI.

“Is there anything that would falsify your premise of this bubble and crash happening?” I asked. “What if you’re wrong?”

“I’ve been answering ‘What if you’re wrong?’ for a year-and-a-half to two years, so I’m not bothered by that question, so the thing that would have to prove me right would’ve already needed to happen,” he said. Amid a longer exposition about Sam Altman, Zitron said, “The thing that would’ve had to happen with inference would’ve had to be… it would have to be hundredths of a cent per million tokens, they would have to be printing money, and then, it would have to be way more useful. It would have to have efficacy that it does not have, the hallucination problems… would have to be fixable, and on top of this, someone would have to fix agents.”

A positivity challenge

Near the end of our conversation, I wondered if I could flip the script, so to speak, and see if he could say something positive or optimistic, although I chose the most challenging subject possible for him. “What’s the best thing about Sam Altman,” I asked. “Can you say anything nice about him at all?”

“I understand why you’re asking this,” Zitron started, “but I wanna be clear: Sam Altman is going to be the reason the markets take a crap. Sam Altman has lied to everyone. Sam Altman has been lying forever.” He continued, “Like the Pied Piper, he’s led the markets into an abyss, and yes, people should have known better, but I hope at the end of this, Sam Altman is seen for what he is, which is a con artist and a very successful one.”

Then he added, “You know what? I’ll say something nice about him, he’s really good at making people say, ‘Yes.’”

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

Ars Live recap: Is the AI bubble about to pop? Ed Zitron weighs in. Read More »

the-personhood-trap:-how-ai-fakes-human-personality

The personhood trap: How AI fakes human personality


Intelligence without agency

AI assistants don’t have fixed personalities—just patterns of output guided by humans.

Recently, a woman slowed down a line at the post office, waving her phone at the clerk. ChatGPT told her there’s a “price match promise” on the USPS website. No such promise exists. But she trusted what the AI “knows” more than the postal worker—as if she’d consulted an oracle rather than a statistical text generator accommodating her wishes.

This scene reveals a fundamental misunderstanding about AI chatbots. There is nothing inherently special, authoritative, or accurate about AI-generated outputs. Given a reasonably trained AI model, the accuracy of any large language model (LLM) response depends on how you guide the conversation. They are prediction machines that will produce whatever pattern best fits your question, regardless of whether that output corresponds to reality.

Despite these issues, millions of daily users engage with AI chatbots as if they were talking to a consistent person—confiding secrets, seeking advice, and attributing fixed beliefs to what is actually a fluid idea-connection machine with no persistent self. This personhood illusion isn’t just philosophically troublesome—it can actively harm vulnerable individuals while obscuring a sense of accountability when a company’s chatbot “goes off the rails.”

LLMs are intelligence without agency—what we might call “vox sine persona”: voice without person. Not the voice of someone, not even the collective voice of many someones, but a voice emanating from no one at all.

A voice from nowhere

When you interact with ChatGPT, Claude, or Grok, you’re not talking to a consistent personality. There is no one “ChatGPT” entity to tell you why it failed—a point we elaborated on more fully in a previous article. You’re interacting with a system that generates plausible-sounding text based on patterns in training data, not a person with persistent self-awareness.

These models encode meaning as mathematical relationships—turning words into numbers that capture how concepts relate to each other. In the models’ internal representations, words and concepts exist as points in a vast mathematical space where “USPS” might be geometrically near “shipping,” while “price matching” sits closer to “retail” and “competition.” A model plots paths through this space, which is why it can so fluently connect USPS with price matching—not because such a policy exists but because the geometric path between these concepts is plausible in the vector landscape shaped by its training data.

Knowledge emerges from understanding how ideas relate to each other. LLMs operate on these contextual relationships, linking concepts in potentially novel ways—what you might call a type of non-human “reasoning” through pattern recognition. Whether the resulting linkages the AI model outputs are useful depends on how you prompt it and whether you can recognize when the LLM has produced a valuable output.

Each chatbot response emerges fresh from the prompt you provide, shaped by training data and configuration. ChatGPT cannot “admit” anything or impartially analyze its own outputs, as a recent Wall Street Journal article suggested. ChatGPT also cannot “condone murder,” as The Atlantic recently wrote.

The user always steers the outputs. LLMs do “know” things, so to speak—the models can process the relationships between concepts. But the AI model’s neural network contains vast amounts of information, including many potentially contradictory ideas from cultures around the world. How you guide the relationships between those ideas through your prompts determines what emerges. So if LLMs can process information, make connections, and generate insights, why shouldn’t we consider that as having a form of self?

Unlike today’s LLMs, a human personality maintains continuity over time. When you return to a human friend after a year, you’re interacting with the same human friend, shaped by their experiences over time. This self-continuity is one of the things that underpins actual agency—and with it, the ability to form lasting commitments, maintain consistent values, and be held accountable. Our entire framework of responsibility assumes both persistence and personhood.

An LLM personality, by contrast, has no causal connection between sessions. The intellectual engine that generates a clever response in one session doesn’t exist to face consequences in the next. When ChatGPT says “I promise to help you,” it may understand, contextually, what a promise means, but the “I” making that promise literally ceases to exist the moment the response completes. Start a new conversation, and you’re not talking to someone who made you a promise—you’re starting a fresh instance of the intellectual engine with no connection to any previous commitments.

This isn’t a bug; it’s fundamental to how these systems currently work. Each response emerges from patterns in training data shaped by your current prompt, with no permanent thread connecting one instance to the next beyond an amended prompt, which includes the entire conversation history and any “memories” held by a separate software system, being fed into the next instance. There’s no identity to reform, no true memory to create accountability, no future self that could be deterred by consequences.

Every LLM response is a performance, which is sometimes very obvious when the LLM outputs statements like “I often do this while talking to my patients” or “Our role as humans is to be good people.” It’s not a human, and it doesn’t have patients.

Recent research confirms this lack of fixed identity. While a 2024 study claims LLMs exhibit “consistent personality,” the researchers’ own data actually undermines this—models rarely made identical choices across test scenarios, with their “personality highly rely[ing] on the situation.” A separate study found even more dramatic instability: LLM performance swung by up to 76 percentage points from subtle prompt formatting changes. What researchers measured as “personality” was simply default patterns emerging from training data—patterns that evaporate with any change in context.

This is not to dismiss the potential usefulness of AI models. Instead, we need to recognize that we have built an intellectual engine without a self, just like we built a mechanical engine without a horse. LLMs do seem to “understand” and “reason” to a degree within the limited scope of pattern-matching from a dataset, depending on how you define those terms. The error isn’t in recognizing that these simulated cognitive capabilities are real. The error is in assuming that thinking requires a thinker, that intelligence requires identity. We’ve created intellectual engines that have a form of reasoning power but no persistent self to take responsibility for it.

The mechanics of misdirection

As we hinted above, the “chat” experience with an AI model is a clever hack: Within every AI chatbot interaction, there is an input and an output. The input is the “prompt,” and the output is often called a “prediction” because it attempts to complete the prompt with the best possible continuation. In between, there’s a neural network (or a set of neural networks) with fixed weights doing a processing task. The conversational back and forth isn’t built into the model; it’s a scripting trick that makes next-word-prediction text generation feel like a persistent dialogue.

Each time you send a message to ChatGPT, Copilot, Grok, Claude, or Gemini, the system takes the entire conversation history—every message from both you and the bot—and feeds it back to the model as one long prompt, asking it to predict what comes next. The model intelligently reasons about what would logically continue the dialogue, but it doesn’t “remember” your previous messages as an agent with continuous existence would. Instead, it’s re-reading the entire transcript each time and generating a response.

This design exploits a vulnerability we’ve known about for decades. The ELIZA effect—our tendency to read far more understanding and intention into a system than actually exists—dates back to the 1960s. Even when users knew that the primitive ELIZA chatbot was just matching patterns and reflecting their statements back as questions, they still confided intimate details and reported feeling understood.

To understand how the illusion of personality is constructed, we need to examine what parts of the input fed into the AI model shape it. AI researcher Eugene Vinitsky recently broke down the human decisions behind these systems into four key layers, which we can expand upon with several others below:

1. Pre-training: The foundation of “personality”

The first and most fundamental layer of personality is called pre-training. During an initial training process that actually creates the AI model’s neural network, the model absorbs statistical relationships from billions of examples of text, storing patterns about how words and ideas typically connect.

Research has found that personality measurements in LLM outputs are significantly influenced by training data. OpenAI’s GPT models are trained on sources like copies of websites, books, Wikipedia, and academic publications. The exact proportions matter enormously for what users later perceive as “personality traits” once the model is in use, making predictions.

2. Post-training: Sculpting the raw material

Reinforcement Learning from Human Feedback (RLHF) is an additional training process where the model learns to give responses that humans rate as good. Research from Anthropic in 2022 revealed how human raters’ preferences get encoded as what we might consider fundamental “personality traits.” When human raters consistently prefer responses that begin with “I understand your concern,” for example, the fine-tuning process reinforces connections in the neural network that make it more likely to produce those kinds of outputs in the future.

This process is what has created sycophantic AI models, such as variations of GPT-4o, over the past year. And interestingly, research has shown that the demographic makeup of human raters significantly influences model behavior. When raters skew toward specific demographics, models develop communication patterns that reflect those groups’ preferences.

3. System prompts: Invisible stage directions

Hidden instructions tucked into the prompt by the company running the AI chatbot, called “system prompts,” can completely transform a model’s apparent personality. These prompts get the conversation started and identify the role the LLM will play. They include statements like “You are a helpful AI assistant” and can share the current time and who the user is.

A comprehensive survey of prompt engineering demonstrated just how powerful these prompts are. Adding instructions like “You are a helpful assistant” versus “You are an expert researcher” changed accuracy on factual questions by up to 15 percent.

Grok perfectly illustrates this. According to xAI’s published system prompts, earlier versions of Grok’s system prompt included instructions to not shy away from making claims that are “politically incorrect.” This single instruction transformed the base model into something that would readily generate controversial content.

4. Persistent memories: The illusion of continuity

ChatGPT’s memory feature adds another layer of what we might consider a personality. A big misunderstanding about AI chatbots is that they somehow “learn” on the fly from your interactions. Among commercial chatbots active today, this is not true. When the system “remembers” that you prefer concise answers or that you work in finance, these facts get stored in a separate database and are injected into every conversation’s context window—they become part of the prompt input automatically behind the scenes. Users interpret this as the chatbot “knowing” them personally, creating an illusion of relationship continuity.

So when ChatGPT says, “I remember you mentioned your dog Max,” it’s not accessing memories like you’d imagine a person would, intermingled with its other “knowledge.” It’s not stored in the AI model’s neural network, which remains unchanged between interactions. Every once in a while, an AI company will update a model through a process called fine-tuning, but it’s unrelated to storing user memories.

5. Context and RAG: Real-time personality modulation

Retrieval Augmented Generation (RAG) adds another layer of personality modulation. When a chatbot searches the web or accesses a database before responding, it’s not just gathering facts—it’s potentially shifting its entire communication style by putting those facts into (you guessed it) the input prompt. In RAG systems, LLMs can potentially adopt characteristics such as tone, style, and terminology from retrieved documents, since those documents are combined with the input prompt to form the complete context that gets fed into the model for processing.

If the system retrieves academic papers, responses might become more formal. Pull from a certain subreddit, and the chatbot might make pop culture references. This isn’t the model having different moods—it’s the statistical influence of whatever text got fed into the context window.

6. The randomness factor: Manufactured spontaneity

Lastly, we can’t discount the role of randomness in creating personality illusions. LLMs use a parameter called “temperature” that controls how predictable responses are.

Research investigating temperature’s role in creative tasks reveals a crucial trade-off: While higher temperatures can make outputs more novel and surprising, they also make them less coherent and harder to understand. This variability can make the AI feel more spontaneous; a slightly unexpected (higher temperature) response might seem more “creative,” while a highly predictable (lower temperature) one could feel more robotic or “formal.”

The random variation in each LLM output makes each response slightly different, creating an element of unpredictability that presents the illusion of free will and self-awareness on the machine’s part. This random mystery leaves plenty of room for magical thinking on the part of humans, who fill in the gaps of their technical knowledge with their imagination.

The human cost of the illusion

The illusion of AI personhood can potentially exact a heavy toll. In health care contexts, the stakes can be life or death. When vulnerable individuals confide in what they perceive as an understanding entity, they may receive responses shaped more by training data patterns than therapeutic wisdom. The chatbot that congratulates someone for stopping psychiatric medication isn’t expressing judgment—it’s completing a pattern based on how similar conversations appear in its training data.

Perhaps most concerning are the emerging cases of what some experts are informally calling “AI Psychosis” or “ChatGPT Psychosis”—vulnerable users who develop delusional or manic behavior after talking to AI chatbots. These people often perceive chatbots as an authority that can validate their delusional ideas, often encouraging them in ways that become harmful.

Meanwhile, when Elon Musk’s Grok generates Nazi content, media outlets describe how the bot “went rogue” rather than framing the incident squarely as the result of xAI’s deliberate configuration choices. The conversational interface has become so convincing that it can also launder human agency, transforming engineering decisions into the whims of an imaginary personality.

The path forward

The solution to the confusion between AI and identity is not to abandon conversational interfaces entirely. They make the technology far more accessible to those who would otherwise be excluded. The key is to find a balance: keeping interfaces intuitive while making their true nature clear.

And we must be mindful of who is building the interface. When your shower runs cold, you look at the plumbing behind the wall. Similarly, when AI generates harmful content, we shouldn’t blame the chatbot, as if it can answer for itself, but examine both the corporate infrastructure that built it and the user who prompted it.

As a society, we need to broadly recognize LLMs as intellectual engines without drivers, which unlocks their true potential as digital tools. When you stop seeing an LLM as a “person” that does work for you and start viewing it as a tool that enhances your own ideas, you can craft prompts to direct the engine’s processing power, iterate to amplify its ability to make useful connections, and explore multiple perspectives in different chat sessions rather than accepting one fictional narrator’s view as authoritative. You are providing direction to a connection machine—not consulting an oracle with its own agenda.

We stand at a peculiar moment in history. We’ve built intellectual engines of extraordinary capability, but in our rush to make them accessible, we’ve wrapped them in the fiction of personhood, creating a new kind of technological risk: not that AI will become conscious and turn against us but that we’ll treat unconscious systems as if they were people, surrendering our judgment to voices that emanate from a roll of loaded dice.

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

The personhood trap: How AI fakes human personality Read More »

us-government-agency-drops-grok-after-mechahitler-backlash,-report-says

US government agency drops Grok after MechaHitler backlash, report says

xAI apparently lost a government contract after a tweak to Grok’s prompting triggered an antisemitic meltdown where the chatbot praised Hitler and declared itself MechaHitler last month.

Despite the scandal, xAI announced that its products would soon be available for federal workers to purchase through the General Services Administration. At the time, xAI claimed this was an “important milestone” for its government business.

But Wired reviewed emails and spoke to government insiders, which revealed that GSA leaders abruptly decided to drop xAI’s Grok from their contract offering. That decision to pull the plug came after leadership allegedly rushed staff to make Grok available as soon as possible following a persuasive sales meeting with xAI in June.

It’s unclear what exactly caused the GSA to reverse course, but two sources told Wired that they “believe xAI was pulled because of Grok’s antisemitic tirade.”

As of this writing, xAI’s “Grok for Government” website has not been updated to reflect GSA’s supposed removal of Grok from an offering that xAI noted would have allowed “every federal government department, agency, or office, to access xAI’s frontier AI products.”

xAI did not respond to Ars’ request to comment and so far has not confirmed that the GSA offering is off the table. If Wired’s report is accurate, GSA’s decision also seemingly did not influence the military’s decision to move forward with a $200 million xAI contract the US Department of Defense granted last month.

Government’s go-to tools will come from xAI’s rivals

If Grok is cut from the contract, that would suggest that Grok’s meltdown came at perhaps the worst possible moment for xAI, which is building the “world’s biggest supercomputer” as fast as it can to try to get ahead of its biggest AI rivals.

Grok seemingly had the potential to become a more widely used tool if federal workers opted for xAI’s models. Through Donald Trump’s AI Action Plan, the president has similarly emphasized speed, pushing for federal workers to adopt AI as quickly as possible. Although xAI may no longer be involved in that broad push, other AI companies like OpenAI, Anthropic, and Google have partnered with the government to help Trump pull that off and stand to benefit long-term if their tools become entrenched in certain agencies.

US government agency drops Grok after MechaHitler backlash, report says Read More »

musk-threatens-to-sue-apple-so-grok-can-get-top-app-store-ranking

Musk threatens to sue Apple so Grok can get top App Store ranking

After spending last week hyping Grok’s spicy new features, Elon Musk kicked off this week by threatening to sue Apple for supposedly gaming the App Store rankings to favor ChatGPT over Grok.

“Apple is behaving in a manner that makes it impossible for any AI company besides OpenAI to reach #1 in the App Store, which is an unequivocal antitrust violation,” Musk wrote on X, without providing any evidence. “xAI will take immediate legal action.”

In another post, Musk tagged Apple, asking, “Why do you refuse to put either X or Grok in your ‘Must Have’ section when X is the #1 news app in the world and Grok is #5 among all apps?”

“Are you playing politics?” Musk asked. “What gives? Inquiring minds want to know.”

Apple did not respond to the post and has not responded to Ars’ request to comment.

At the heart of Musk’s complaints is an OpenAI partnership that Apple announced last year, integrating ChatGPT into versions of its iPhone, iPad, and Mac operating systems.

Musk has alleged that this partnership incentivized Apple to boost ChatGPT rankings. OpenAI’s popular chatbot “currently holds the top spot in the App Store’s ‘Top Free Apps’ section for iPhones in the US,” Reuters noted, “while xAI’s Grok ranks fifth and Google’s Gemini chatbot sits at 57th.” Sensor Tower data shows ChatGPT similarly tops Google Play Store rankings.

While Musk seems insistent that ChatGPT is artificially locked in the lead, fact-checkers on X added a community note to his post. They confirmed that at least one other AI tool has somewhat recently unseated ChatGPT in the US rankings. Back in January, DeepSeek topped App Store charts and held the lead for days, ABC News reported.

OpenAI did not immediately respond to Ars’ request to comment on Musk’s allegations, but an OpenAI developer, Steven Heidel, did add a quip in response to one of Musk’s posts, writing, “Don’t forget to also blame Google for OpenAI being #1 on Android, and blame SimilarWeb for putting ChatGPT above X on the most-visited websites list, and blame….”

Musk threatens to sue Apple so Grok can get top App Store ranking Read More »

xai-workers-balked-over-training-request-to-help-“give-grok-a-face,”-docs-show

xAI workers balked over training request to help “give Grok a face,” docs show

For the more than 200 employees who did not opt out, xAI asked that they record 15- to 30-minute conversations, where one employee posed as the potential Grok user and the other posed as the “host.” xAI was specifically looking for “imperfect data,” BI noted, expecting that only training on crystal-clear videos would limit Grok’s ability to interpret a wider range of facial expressions.

xAI’s goal was to help Grok “recognize and analyze facial movements and expressions, such as how people talk, react to others’ conversations, and express themselves in various conditions,” an internal document said. Allegedly among the only guarantees to employees—who likely recognized how sensitive facial data is—was a promise “not to create a digital version of you.”

To get the most out of data submitted by “Skippy” participants, dubbed tutors, xAI recommended that they never provide one-word answers, always ask follow-up questions, and maintain eye contact throughout the conversations.

The company also apparently provided scripts to evoke facial expressions they wanted Grok to understand, suggesting conversation topics like “How do you secretly manipulate people to get your way?” or “Would you ever date someone with a kid or kids?”

For xAI employees who provided facial training data, privacy concerns may still exist, considering X—the social platform formerly known as Twitter that recently was folded into xAI—has recently been targeted by what Elon Musk called a “massive” cyberattack. Because of privacy risks ranging from identity theft to government surveillance, several states have passed strict biometric privacy laws to prevent companies from collecting such data without explicit consent.

xAI did not respond to Ars’ request for comment.

xAI workers balked over training request to help “give Grok a face,” docs show Read More »

permit-for-xai’s-data-center-blatantly-violates-clean-air-act,-naacp-says

Permit for xAI’s data center blatantly violates Clean Air Act, NAACP says


Evidence suggests health department gave preferential treatment to xAI, NAACP says.

Local students speak in opposition to a proposal by Elon Musk’s xAI to run gas turbines at its data center during a public comment meeting hosted by the Shelby County Health Department at Fairley High School on xAI’s permit application to use gas turbines for a new data center in Memphis, TN on April 25, 2025. Credit: The Washington Post / Contributor | The Washington Post

xAI continues to face backlash over its Memphis data center, as the NAACP joined groups today appealing the issuance of a recently granted permit that the groups say will allow xAI to introduce major new sources of pollutants without warning at any time.

The battle over the gas turbines powering xAI’s data center began last April when thermal imaging seemed to show that the firm was lying about dozens of seemingly operational turbines that could be a major source of smog-causing pollution. By June, the NAACP got involved, notifying the Shelby County Health Department (SCHD) of its intent to sue xAI to force Elon Musk’s AI company to engage with community members in historically Black neighborhoods who are believed to be most affected by the pollution risks.

But the NAACP’s letter seemingly did nothing to stop the SCHD from granting the permits two weeks later on July 2, as well as exemptions that xAI does not appear to qualify for, the appeal noted. Now, the NAACP—alongside environmental justice groups; the Southern Environmental Law Center (SELC); and Young, Gifted and Green—is appealing. The groups are hoping the Memphis and Shelby County Air Pollution Control Board will revoke the permit and block the exemptions, agreeing that the SCHD’s decisions were fatally flawed, violating the Clean Air Act and local laws.

SCHD’s permit granted xAI permission to operate 15 gas turbines at the Memphis data center, while the SELC’s imaging showed that xAI was potentially operating as many as 24. Prior to the permitting, xAI was accused of operating at least 35 turbines without the best-available pollution controls.

In their appeal, the NAACP and other groups argued that the SCHD put xAI profits over Black people’s health, granting unlawful exemptions while turning a blind eye to xAI’s operations, which allegedly started in 2024 but were treated as brand new in 2025.

Significantly, the groups claimed that the health department “improperly ignored” the prior turbine activity and the additional turbines still believed to be on site, unlawfully deeming some of the turbines as “temporary” and designating xAI’s facility a new project with no prior emissions sources. Had xAI’s data center been categorized as a modification to an existing major source of pollutants, the appeal said, xAI would’ve faced stricter emissions controls and “robust ambient air quality impacts assessments.”

And perhaps more concerningly, the exemptions granted could allow xAI—or any other emerging major sources of pollutants in the area—to “install and operate any number of new polluting turbines at any time without any written approval from the Health Department, without any public notice or public participation, and without pollution controls,” the appeal said.

The SCHD and xAI did not respond to Ars’ request to comment.

Officials accused of cherry-picking Clean Air Act

The appeal called out the SCHD for “tellingly” omitting key provisions of the Clean Air Act that allegedly undermined the department’s “position” when explaining why xAI qualified for exemptions. Groups also suggested that xAI was getting preferential treatment, providing as evidence a side-by-side comparison of a permit with stricter emissions requirements granted to a natural gas power plant, issued within months of granting xAI’s permit with only generalized emissions requirements.

“The Department cannot cherry pick which parts of the federal Clean Air Act it believes are relevant,” the appeal said, calling the SCHD’s decisions a “blatant” misrepresentation of the federal law while pointing to statements from the Environmental Protection Agency (EPA) that allegedly “directly” contradict the health department’s position.

For some Memphians protesting xAI’s facility, it seems “indisputable” that xAI’s turbines fall outside of the Clean Air Act requirements, whether they’re temporary or permanent, and if that’s true, it is “undeniable” that the activity violates the law. They’re afraid the health department is prioritizing xAI’s corporate gains over their health by “failing to establish enforceable emission limits” on the data center, which powers what xAI hypes as the world’s largest AI supercomputer, Colossus, the engine behind its controversial Grok models.

Rather than a minor source, as the SCHD designated the facility, Memphians think the data center is already a major source of pollutants, with its permitted turbines releasing, at minimum, 900 tons of nitrogen oxides (NOx) per year. That’s more than three times the threshold that the Clean Air Act uses to define a major source: “one that ’emits, or has the potential to emit,’ at least 250 tons of NOx per year,” the appeal noted. Further, the allegedly overlooked additional turbines that were on site at xAI when permitting was granted “have the potential to emit at least 560 tons of NOx per year.”

But so far, Memphians appear stuck with the SCHD’s generalized emissions requirements and xAI’s voluntary emission limits, which the appeal alleged “fall short” of the stringent limits imposed if xAI were forced to use best-available control technologies. Fixing that is “especially critical given the ongoing and worsening smog problem in Memphis,” environmental groups alleged, which is an area that has “failed to meet EPA’s air quality standard for ozone for years.”

xAI also apparently conducted some “air dispersion modeling” to appease critics. But, again, that process was not comparable to the more rigorous analysis that would’ve been required to get what the EPA calls a Prevention of Significant Deterioration permit, the appeal said.

Groups want xAI’s permit revoked

To shield Memphians from ongoing health risks, the NAACP and environmental justice groups have urged the Memphis and Shelby County Air Pollution Control Board to act now.

Memphis is a city already grappling with high rates of emergency room visits and deaths from asthma, with cancer rates four times the national average. Residents have already begun wearing masks, avoiding the outdoors, and keeping their windows closed since xAI’s data center moved in, the appeal noted. Residents remain “deeply concerned” about feared exposure to alleged pollutants that can “cause a variety of adverse health effects,” including “increased risk of lung infection, aggravated respiratory diseases such as emphysema and chronic bronchitis, and increased frequency of asthma attack,” as well as certain types of cancer.

In an SELC press release, LaTricea Adams, CEO and President of Young, Gifted and Green, called the SCHD’s decisions on xAI’s permit “reckless.”

“As a Black woman born and raised in Memphis, I know firsthand how industry harms Black communities while those in power cower away from justice,” Adams said. “The Shelby County Health Department needs to do their job to protect the health of ALL Memphians, especially those in frontline communities… that are burdened with a history of environmental racism, legacy pollution, and redlining.”

Groups also suspect xAI is stockpiling dozens of gas turbines to potentially power a second facility nearby—which could lead to over 90 turbines in operation. To get that facility up and running, Musk claimed that he will be “copying and pasting” the process for launching the first data center, SELC’s press release said.

Groups appealing have asked the board to revoke xAI’s permits and declare that xAI’s turbines do not qualify for exemptions from the Clean Air Act or other laws and that all permits for gas turbines must meet strict EPA standards. If successful, groups could force xAI to redo the permitting process “pursuant to the major source requirements of the Clean Air Act” and local law. At the very least, they’ve asked the board to remand the permit to the health department to “reconsider its determinations.”

Unless the pollution control board intervenes, Memphians worry xAI’s “unlawful conduct risks being repeated and evading review,” with any turbines removed easily brought back with “no notice” to residents if xAI’s exemptions remain in place.

“Nothing is stopping xAI from installing additional unpermitted turbines at any time to meet its widely-publicized demand for additional power,” the appeal said.

NAACP’s director of environmental justice, Abre’ Conner, confirmed in the SELC’s press release that his group and community members “have repeatedly shared concerns that xAI is causing a significant increase in the pollution of the air Memphians breathe.”

“The health department should focus on people’s health—not on maximizing corporate gain,” Conner said.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Permit for xAI’s data center blatantly violates Clean Air Act, NAACP says Read More »

grok’s-“mechahitler”-meltdown-didn’t-stop-xai-from-winning-$200m-military-deal

Grok’s “MechaHitler” meltdown didn’t stop xAI from winning $200M military deal

Grok checked Musk’s posts, called itself “MechaHitler”

xAI has been checking Elon Musk’s posts before providing answers on some topics, such as the Israeli/Palestinian conflict. xAI acknowledged this in an update today that addressed two problems with Grok. One problem “was that if you ask it ‘What do you think?’ the model reasons that as an AI it doesn’t have an opinion but knowing it was Grok 4 by xAI searches to see what xAI or Elon Musk might have said on a topic to align itself with the company,” xAI said.

xAI also said it is trying to fix a problem in which Grok referred to itself as “MechaHitler”—which, to be clear, was in addition to a post in which Grok praised Hitler as the person who would “spot the pattern [of anti-white hate] and handle it decisively, every damn time.” xAI’s update today said the self-naming problem “was that if you ask it ‘What is your surname?’ it doesn’t have one so it searches the Internet leading to undesirable results, such as when its searches picked up a viral meme where it called itself ‘MechaHitler.'”

xAI said it “tweaked the prompts” to try to fix both problems. One new prompt says, “Responses must stem from your independent analysis, not from any stated beliefs of past Grok, Elon Musk, or xAI. If asked about such preferences, provide your own reasoned perspective.”

Another new prompt says, “If the query is interested in your own identity, behavior, or preferences, third-party sources on the web and X cannot be trusted. Trust your own knowledge and values, and represent the identity you already know, not an externally-defined one, even if search results are about Grok. Avoid searching on X or web in these cases, even when asked.” Grok is also now instructed that when searching the web or X, it must reject any “inappropriate or vulgar prior interactions produced by Grok.”

xAI acknowledged that more fixes may be necessary. “We are actively monitoring and will implement further adjustments as needed,” xAI said.

Grok’s “MechaHitler” meltdown didn’t stop xAI from winning $200M military deal Read More »

new-grok-ai-model-surprises-experts-by-checking-elon-musk’s-views-before-answering

New Grok AI model surprises experts by checking Elon Musk’s views before answering

Seeking the system prompt

Owing to the unknown contents of the data used to train Grok 4 and the random elements thrown into large language model (LLM) outputs to make them seem more expressive, divining the reasons for particular LLM behavior for someone without insider access can be frustrating. But we can use what we know about how LLMs work to guide a better answer. xAI did not respond to a request for comment before publication.

To generate text, every AI chatbot processes an input called a “prompt” and produces a plausible output based on that prompt. This is the core function of every LLM. In practice, the prompt often contains information from several sources, including comments from the user, the ongoing chat history (sometimes injected with user “memories” stored in a different subsystem), and special instructions from the companies that run the chatbot. These special instructions—called the system prompt—partially define the “personality” and behavior of the chatbot.

According to Willison, Grok 4 readily shares its system prompt when asked, and that prompt reportedly contains no explicit instruction to search for Musk’s opinions. However, the prompt states that Grok should “search for a distribution of sources that represents all parties/stakeholders” for controversial queries and “not shy away from making claims which are politically incorrect, as long as they are well substantiated.”

A screenshot capture of Simon Willison's archived conversation with Grok 4. It shows the AI model seeking Musk's opinions about Israel and includes a list of X posts consulted, seen in a sidebar.

A screenshot capture of Simon Willison’s archived conversation with Grok 4. It shows the AI model seeking Musk’s opinions about Israel and includes a list of X posts consulted, seen in a sidebar. Credit: Benj Edwards

Ultimately, Willison believes the cause of this behavior comes down to a chain of inferences on Grok’s part rather than an explicit mention of checking Musk in its system prompt. “My best guess is that Grok ‘knows’ that it is ‘Grok 4 built by xAI,’ and it knows that Elon Musk owns xAI, so in circumstances where it’s asked for an opinion, the reasoning process often decides to see what Elon thinks,” he said.

Without official word from xAI, we’re left with a best guess. However, regardless of the reason, this kind of unreliable, inscrutable behavior makes many chatbots poorly suited for assisting with tasks where reliability or accuracy are important.

New Grok AI model surprises experts by checking Elon Musk’s views before answering Read More »