Once again we’ve reached the point where the weekly update needs to be split in two. Thus, the alignment and policy coverage will happen tomorrow. Today covers the rest.
The secret big announcement this week was Claude for Chrome. This is a huge deal. It will be rolling out slowly. When I have access or otherwise know more, so will you.
The obvious big announcement was Gemini Flash 2.5 Image. Everyone agrees this is now the clear best image editor available. It is solid as an image generator, but only as one among many on that front. Editing abilities, including its ability to use all its embedded world knowledge, seem super cool.
The third big story was the suicide of Adam Raine, which appears to have been enabled in great detail by ChatGPT. His parents are suing OpenAI and the initial facts very much do not look good and it seems clear OpenAI screwed up. The question is, how severe should and will the consequences be?
-
Language Models Offer Mundane Utility. Find what you’re looking for.
-
Language Models Don’t Offer Mundane Utility. You weren’t using them.
-
Huh, Upgrades. OpenAI Codex adds features including an IDE extension.
-
Fun With Image Generation. Gemini 2.5 Flash Image is a great editor.
-
On Your Marks. VendingBench, water use and some more v3.1 results.
-
Water Water Everywhere. There’s plenty left to drink.
-
Get My Agent On The Line. Claude for Chrome. It’s coming.
-
Choose Your Fighter. Some advocates for GPT-5’s usefulness.
-
Deepfaketown and Botpocalypse Soon. Elon has something to share.
-
You Drive Me Crazy. AI psychosis continues not to show up in numbers.
-
The Worst Tragedy So Far. Adam Raine commits suicide, parents sue OpenAI.
-
Unprompted Attention. I don’t see the issue.
-
Copyright Confrontation. Bartz v. Anthropic has been settled.
-
The Art of the Jailbreak. Little Johnny Tables is all grown up.
-
Get Involved. 40 ways to get involved in AI policy.
-
Introducing. Anthropic advisors aplenty, Pixel translates live phone calls.
-
In Other AI News. Meta licenses MidJourney, Apple explores Gemini and more.
-
Show Me the Money. Why raise money when you can raise even more money?
-
Quiet Speculations. Everything is being recorded.
-
Rhetorical Innovation. The real math does not exist.
-
The Week in Audio. How to properly use Claude Code.
Or anything else. Very handy.
Share of papers that engage with AI rises dramatically essentially everywhere, which is what you would expect. There’s quite a lot more to engage with and to say. Always watch the y-axis scale, yes these start at zero:
More detail on various LLMs and their musical taste, based on a bracket competition among the top 5000 musical artists by popularity. It all seems bizarre. For example, Gemini 2.5 Pro’s list looks highly and uniquely alphabetically biased without a strong bias towards numbers.
The numbers-are-favored bias shows up only in OpenAI reasoning models including GPT-5, and in r1-0528. There are clear genre patterns, and there are some consistent picks, especially among Claudes. The three artists that appear three times are David Bowie, Prince and Stevie Wonder, which are very good picks. It definitely seems like the open models have worse (or more random) taste in correlated ways.
Why bother thinking about your vibe coding?
Sully: friend was hosting a mini ai workshop and he told me nearly all the vibe coders just have 1 giant coding session where the entire project is just being being thrown context. each request is ~200k tokens
they’re not even bothering to break things up into some reasonable structure
no wonder these code gen platforms are printing
I mean that makes sense. There’s little reason to cheapen out on tokens when you think about token cost versus your time cost and the value of a good vibe code. You gotta boldly go where no one has gone before and risk it for the biscuit.
Anthropic reports on how Claude is being used by educators, in particular 74,000 anonymized conversations from higher education professionals in May and June.
Anthropic: The most prominent use of AI, as revealed by both our Claude.ai analysis and our qualitative research with Northeastern, was for curriculum development. Our Claude.ai analysis also surfaced academic research and assessing student performance as the second and third most common uses.
…
Tasks with higher augmentation tendencies:
-
University teaching and classroom instruction, which includes creating educational materials and practice problems (77.4% augmentation);
-
Writing grant proposals to secure external research funding (70.0% augmentation);
-
Academic advising and student organization mentorship (67.5% augmentation);
-
Supervising student academic work (66.9% augmentation).
Tasks with relatively higher automation tendencies:
-
Managing educational institution finances and fundraising (65.0% automation);
-
Maintaining student records and evaluating academic performance (48.9% automation);
-
Managing academic admissions and enrollment (44.7% automation).
Mostly there are no surprises here, but concrete data is always welcome.
As always, if you don’t use AI, it can’t help you. This includes when you never used AI in the first place, but have to say ‘AI is the heart of our platform’ all the time because it sounds better to investors.
The ability to say ‘I don’t know’ and refer you elsewhere remains difficult for LLMs. Nate Silver observes this seeming to get even worse. For now it is on you to notice when the LLM doesn’t know.
This seems like a skill issue for those doing the fine tuning? It does not seem so difficult a behavior to elicit, if it was made a priority, via ordinary methods. At some point I hope and presume the labs will decide to care.
Feature request thread for ChatGPT power users, also here.
The weights of Grok 2 have been released.
OpenAI Codex adds a new IDE extension, a way to move tasks between cloud and local more easily, code reviews in GitHub and revamped Codex CLI.
OpenAI: Codex now runs in your IDE Available for VS Code, Cursor, and other forks, the new extension makes it easy to share context—files, snippets, and diffs—so you can work faster with Codex. It’s been a top feature request, and we’re excited to hear what you think!
Google: Introducing Gemini 2.5 Flash Image, our state-of-the-art image generation and editing model designed to help you build more dynamic and intelligent visual applications.
🍌Available in preview in @googleaistudio and the Gemini API.
This model is available right now via the Gemini API and Google AI Studio for developers and Vertex AI for enterprise. Gemini 2.5 Flash Image is priced at $30.00 per 1 million output tokens with each image being 1290 output tokens ($0.039 per image). All other modalities on input and output follow Gemini 2.5 Flash pricing.
Josh Woodward (Google): The @GeminiApp now has the #1 image model in the world, give it a go!
Attach an image, describe your edits, and it’s done. I’ve never seen anything like this.
They pitch that it maintains character consistency, adheres to visual templates, does prompt based image editing, understands point of view and reflections, restores old photographs, makes 3-D models, has native world knowledge and offers multi-image function.
By all accounts Gemini 2.5 Flash Image is a very very good image editor, while being one good image generator among many.
You can do things like repaint objects, create drawings, see buildings from a given point of view, put characters into combat and so on.
Which then becomes a short video here.
Our standards are getting high, such as this report that you can’t play Zelda.
Yes, of course Pliny jailbroke it (at least as far as being topless) on the spot.
We’re seeing some cool examples, but they are also clearly selected.
Benjamin De Kraker: Okay this is amazing.
All human knowledge will be one unified AI multimodal model.
Bilawal Sidhu: Since nano banana has gemini’s world knowledge, you can just upload screenshots of the real world and ask it to annotate stuff for you. “you are a location-based AR experience generator. highlight [point of interest] in this image and annotate relevant information about it.”
That seems cool if you can make it fast enough, and if it works on typical things rather than only on obvious landmarks?
The right question in the long term is usually: Can the horse talk at all?
Everythingism: I asked “Nano Banana” [which we later learned is Gemini Flash 2.5] to label a map of the USA and then a map of the world…this was the result.
It’s impressive at many tasks but image models all seem to fail when there are too many objects or too many things to label.
Explode Meow: Many of my friends have tested it.
To be fair, [Gemini Flash 2.5] can make quite realistic images, and most of them are indistinguishable from real ones if I don’t look closely.
This is clearly a result of Google leveraging its overwhelming data resources (Google Cloud).
But after multiple rounds of testing by my friends, they noticed that it actually makes some Low-level mistakes (hallucinations), just like GPT-4o (even Stable Diffusion).
Are mistakes still being made? Absolutely. This is still rather impressive. Consider where image models were not too long ago.
This is a Google image model, so the obvious reason for skepticism is that we all expect the Fun Police.
Hasan Can: If I know Google, they’ll nerf this model like crazy under the excuse of “safety” and when it’s released, it’ll turn into something worse than Qwen-Image-Edit. Remember what happened with Gemini 2.0 Flash Image Gen. I hope I’m wrong, but I don’t think so.
Alright, it seems reverse psychology is paying off. 👍
Image generation in Gemini 2.5 Flash doesn’t appear to be nerfed at all. It looks like Google is finally ready to treat both its developers and end users like adults.
Eleanor Berger: It’s very good, but I’m finding it very challenging to to bump into their oversensitive censorship. It really likes saying no.
nothing with real people (which sucks, because of course I want to modify some selfies), anything that suggests recognisable brands, anything you wouldn’t see on terrestrial tv.
The continuing to have a stick up the ass about picturing ‘real people’ is extremely frustrating and I think reduces the usefulness of the model substantially. The other censorship also does not help matters.
Grok 4 sets a new standard in Vending Bench,
The most surprising result here is probably that the human did so poorly.
I like saying an AI query is similar to nine seconds of television. Makes things clear.
It also seems important to notice when in a year energy costs drop 95%+?
DeepSeek v3.1 improves on R1 on NYT Connections, 49% → 58%. Pretty solid.
DeepSeek v3.1 scores solidly on this coding eval when using Claude Code, does less well on other scaffolds, with noise and confusion all around.
AIs potentially ‘sandbagging’ tests is an increasing area of research and concern. Cas says this is simply a special case of failure to elicit full capabilities of a system, and doing so via fine-tuning is ‘solved problem’ so we can stop worrying.
This seems very wrong to me. Right now failure to do proper elicitation, mostly via unhobbling and offering better tools and setups, is the far bigger problem. But sandbagging will be an increasing and increasingly dangerous future concern, and a ‘deliberate’ sandbagging has very different characteristics and implications than normal elicitation failure. I find ‘sandbagging’ to be exactly the correct name for this, since it doesn’t confine itself purely to evals, unless you want to call everything humans do to mislead other humans ‘eval gaming’ or ‘failure of capability elicitation’ or something. And no, this is not solved even now, even if it was true that it could currently be remedied by a little fine-tuning, because you don’t know when and how to do the fine-tuning.
Report that DeepSeek v3.1 will occasionally insert the token ‘extreme’ where it doesn’t belong, including sometimes breaking things like code or JSON. Data contamination is suspected as the cause.
Similarly, when Peter Wildeford says ‘sandbagging is mainly coming from AI developers not doing enough to elicit top behavior,’ that has the risk of conflating the levels of intentionality. Mostly AI developers want to score highly on evals, but there is risk that they deliberately do sandbag the safety testing, as in decide not to try very hard to elicit top behavior there because they’d rather get less capable test results.
The purpose of environmental assessments of AI is mostly to point out that many people have very silly beliefs about the environmental impact of AI.
Jeff Dean: AI efficiency is important. Today, Google is sharing a technical paper detailing our comprehensive methodology for measuring the environmental impact of Gemini inference. We estimate that the median Gemini Apps text prompt uses 0.24 watt-hours of energy (equivalent to watching an average TV for ~nine seconds), and consumes 0.26 milliliters of water (about five drops) — figures that are substantially lower than many public estimates.
At the same time, our AI systems are becoming more efficient through research innovations and software and hardware efficiency improvements. From May 2024 to May 2025, the energy footprint of the median Gemini Apps text prompt dropped by 33x, and the total carbon footprint dropped by 44x, through a combination of model efficiency improvements, machine utilization improvements and additional clean energy procurement, all while delivering higher quality responses.
Alas Google’s water analysis had an unfortunate oversight, in that it did not include the water cost of electricity generation. That turns out to be the main water cost, so much so that if you (reasonably) want to attribute the average cost of that electricity generation onto the data center, the best way to approximate water use of a data center is to measure the water cost of the electricity, then multiply by 1.1 or so.
This results in the bizarre situation where:
-
Google’s water cost estimation was off by an order of magnitude.
-
The actual water cost is still rather hard to distinguish from zero.
Andy Masley: Google publishes a paper showing that its AI models only use 0.26 mL of water in data centers per prompt.
After, this article gets published: “Google says a typical AI prompt only uses 5 drops of water – experts say that’s misleading.”
The reason the expert says this is misleading? They didn’t include the water used in the nearby power plant to generate electricity.
The expert, Shaolei Ren says: “They’re just hiding the critical information. This really spreads the wrong message to the world.”
Each prompt uses about 0.3 Wh in the data center. To generate that much electricity, power plants need (at most) 2.50 mL of water. That raises the total water cost per prompt to 2.76 mL.
2.76 mL is 0.0001% of the average American lifestyle’s daily consumptive use of fresh water and groundwater. It’s nothing.
Would you know this from the headline, or the quote? Why do so many reporters on this topic do this?
Andy Masley is right that This Is Nothing even at the limit, that the water use here is not worth worrying about even in worst case. It will not meaningfully increase your use of water, even when you increase Google’s estimates by an order of magnitude.
A reasonable headline would be ‘Google say a typical text prompt uses 5 drops of water, but once you take electricity into account it’s actually 32 drops.’
I do think saying ‘Google was being misleading’ is reasonable here. You shouldn’t have carte blanche to take a very good statistic and make it sound even better.
Teonbrus and Shakeel are right that there is going to be increasing pressure on anyone who opposes AI for other reasons to instead rile people up about water use and amplify false and misleading claims. Resist this urge. Do not destroy yourself for nothing. It goes nowhere good, including because it wouldn’t work.
It’s coming. As in, Claude for Chrome.
Anthropic: We’ve developed Claude for Chrome, where Claude works directly in your browser and takes actions on your behalf.
We’re releasing it at first as a research preview to 1,000 users, so we can gather real-world insights on how it’s used.
Browser use brings several safety challenges—most notably “prompt injection”, where malicious actors hide instructions to trick Claude into harmful actions.
We already have safety measures in place, but this pilot will help us improve them.
Max plan users can join the waitlist to test Claude for Chrome today.
Do not say you were not warned.
Anthropic: Understand the risks.
Claude brings AI directly to your browser, handling tasks and navigating sites for you. These new capabilities create risks bad actors may try to exploit.
Malicious actors can hide instructions in websites, emails, and documents that trick AI into taking harmful actions without your knowledge, including:
Accessing your accounts or files
Sharing your private information
Making purchases on your behalf
Taking actions you never intended
Oh, those risks. Yeah.
They offer some Good Advice about safety issues, which includes using a distinct browser profile that doesn’t include credentials to any sensitive websites like banks:
Q: How do I control what Claude can access?
A: You decide which websites Claude can visit and what actions it can take. Claude asks permission before visiting new sites and before taking potentially risky actions like publishing content or making purchases. You can revoke access to specific websites anytime in settings.
For trusted workflows, you can choose to skip all permissions, but you should supervise Claude closely. While some safeguards exist for sensitive actions, malicious actors could still trick Claude into unintended actions.
For your safety, Claude cannot access sensitive, high-risk sites such as:
Financial services and banking sites
Investment and trading platforms
Adult content websites
Cryptocurrency exchanges
It’s unlikely that we’ve captured all sites in these categories so please report if you find one we’ve missed.
Additionally, Claude is prohibited from:
Engaging in stock trading or investment transactions
Bypassing captchas
Inputting sensitive data
Gathering, scraping facial images
We recommend:
Use a separate browser profile without access to sensitive accounts (such as banking, healthcare, government).
Review Claude’s proposed actions before approving them, especially on new websites.
Start with simple tasks like research or form-filling rather than complex multi-step workflows.
Make sure your prompts are specific and carefully tailored to avoid Claude doing things you didn’t intend.
AI browsers from non-Anthropic sources? Oh, the safety you won’t have.
Zack: Why is no one talking about this? This is why I don’t use an AI browser You can literally get prompt injected and your bank account drained by doomscrolling on reddit:
No one seems to be concerned about this, it seems to me like the #1 problem with any agentic AI stuff You can get pwned so easily, all an attacker has to do is literally write words down somewhere?
Brave: AI agents that can browse the Web and perform tasks on your behalf have incredible potential but also introduce new security risks.
We recently found, and disclosed, a concerning flaw in Perplexity’s Comet browser that put users’ accounts and other sensitive info in danger.
This security flaw stems from how Comet summarizes websites for users.
When processing a site’s content, Comet can’t tell content on the website apart from legitimate instructions by the user. This means that the browser will follow commands hidden on the site by an attacker.
These malicious instructions could be white text on a white background or HTML comments. Or they could be a social media post. If Comet sees the commands while summarizing, it will follow them even if they could hurt the user. This is an example of an indirect prompt injection.
This was only an issue within Comet. Dia doesn’t have the agentic capabilities that make this attack possible.
Here’s someone very happy with OpenAI’s Codex.
Victor Taelin: BTW, I’ve basically stopped using Opus entirely and I now have several Codex tabs with GPT-5-high working on different tasks across the 3 codebases (HVM, Bend, Kolmo). Progress has never been so intense. My job now is basically passing well-specified tasks to Codex, and reviewing its outputs.
OpenAI isn’t paying me and couldn’t care less about me. This model is just very good and the fact people can’t see it made me realize most of you are probably using chatbots as girlfriends or something other than assisting with complex coding tasks.
(sorry Anthropic still love you guys 😢)
PS: I still use Opus for hole-filling in VIM because it is much faster than gpt-5-high there.
Ezra Klein is impressed by GPT-5 as having crossed into offering a lot of mundane utility, and is thinking about what it means that others are not similarly impressed by this merely because it wasn’t a giant leap over o3.
GFodor: Ezra proves he is capable of using a dropdown menu, a surprisingly rare skill.
A cool way to break down the distinction? This feels right to me, in the sense that if I know exactly what I want and getting it seems nontrivial my instinct is now to reach for GPT-5-Thinking or Pro, if I don’t know exactly what I want I go for Opus.
Sig Kitten: I can’t tell if I’m just claude brain rotted or Opus is really the only usable conversational AI for non-coding stuff
Gallabytes: it’s not just you.
gpt5 is a better workhorse but it does this awkward thing of trying really hard to find the instructions in your prompt and follow them instead of just talking.
Sig Kitten: gpt-5 default is completely unusable imho just bullet points of nonsense after a long thinking for no reason.
Gallabytes: it’s really good if you give it really precise instructions eg I have taken to dumping papers with this prompt then walking away for 5 minutes:
what’s the headline result in this paper ie the most promising metric or qualitative improvement? what’s the method in this paper?
1 sentence then 1 paragraph then detailed.
Entirely fake Gen AI album claims to be from Emily Portman.
Did Ani tell you to say this, Elon? Elon are you okay, are you okay Elon?
Elon Musk: Wait until you see Grok 5.
I think it has a shot at being true AGI.
Haven’t felt that about anything before.
I notice I pattern match this to ‘oh more meaningless hype, therefore very bad sign.’
Whereas I mean this seems to be what Elon is actually up to these days, sorry?
Or, alternatively, what does Elon think the ‘G’ stands for here, exactly?
(The greeting in question, in a deep voice, is ‘little fing b.)
Also, she might tell everyone what you talked about, you little fing b, if you make the mistake of clicking the ‘share’ button, so think twice about doing that.
Forbes: Elon Musk’s AI firm, xAI, has published the chat transcripts of hundreds of thousands of conversations between its chatbot Grok and the bot’s users — in many cases, without those users’ knowledge or permission.
xAI made people’s conversations with its chatbot public and searchable on Google without warning – including a detailed plan for the assassination of Elon Musk and explicit instructions for making fentanyl and bombs.
Peter Wildeford: I know xAI is more slapdash and so people have much lower expectations, but this still seems like a pretty notable breach of privacy that would get much more attention if it were from OpenAI, Anthropic, Google, or Meta.
I’m not sure xAI did anything technically wrong here. The user clicked a ‘share’ button. I do think it is on xAI to warn the user if this means full Google indexing but it’s not on the level of doing it with fully private chats.
Near: why are you giving this app to children? (ages 12+)
apparently i am the only person in the world who gives a shit about this and that is why Auren is 17+ despite not being NSFW and a poorly-prompted psychopathic liar.
shattering the overton window has 2nd-order effects.
An ominous view of even the superficially glorious future?
Nihilism Disrespecter: the highly cultured, trombone playing, shakespeare quoting officers of star trek were that way because they were the only ones to escape the vast, invisible holodeck hikikomori gooner caste that made up most of humanity.
Roon: there does seem to be a recurrent subplot that the officers all spend time in the holodeck and have extensive holodeck fantasies and such. I mean literally none of them are married for some reason.
Eneasz Brodski: canonically so according to the novelization of the first Trek movie, I believe.
Henry Shevlin: Culture series does this pretty well. 99.9% of Culture citizens spend their days literally or metaphorically dicking around, it’s only a small fraction of busybodies who get recruited to go interfere with alien elections.
Steven Adler looks into the data on AI psychosis.
Is this statistically a big deal yet? As with previous such inquiries, so far the answer seems to be no. The UK statistics show a potential rise in mental health services use, but the data is noisy and the timing seems off, especially not lining up with GPT-4o’s problems, and data from the USA doesn’t show any increase.
Scott Alexander does a more details, more Scott Alexander investigation and set of intuition pumps and explanations. Here’s a classic ACX moment worth pondering:
And partly it was because there are so many crazy beliefs in the world – spirits, crystal healing, moon landing denial, esoteric Hitlerism, whichever religions you don’t believe in – that psychiatrists have instituted a blanket exemption for any widely held idea. If you think you’re being attacked by demons, you’re delusional, unless you’re from some culture where lots of people get attacked by demons, in which case it’s a religion and you’re fine.
…
Most people don’t have world-models – they believe what their friends believe, or what has good epistemic vibes. In a large group, weird ideas can ricochet from person to person and get established even in healthy brains. In an Afro-Caribbean culture where all your friends get attacked by demons at voodoo church every Sunday, a belief in demon attacks can co-exist with otherwise being a totally functional individual.
So is QAnon a religion? Awkward question, but it’s non-psychotic by definition. Still, it’s interesting, isn’t it? If social media makes a thousand people believe the same crazy thing, it’s not psychotic. If LLMs make a thousand people each believe a different crazy thing, that is psychotic. Is this a meaningful difference, or an accounting convention?
Also, what if a thousand people believe something, but it’s you and your 999 ChatGPT instances?
I like the framing that having a sycophantic AI to talk to moves people along a continuum of crackpotness towards psychosis, rather than a boolean where it either does or does not cause psychosis outright:
Maybe this is another place where we are forced to admit a spectrum model of psychiatric disorders – there is an unbroken continuum from mildly sad to suicidally depressed, from social drinking to raging alcoholism, and from eccentric to floridly psychotic.
Another insight is that AI psychosis happens when moving along this spectrum causes further movement down the spectrum, as the AI reinforces your delusions, causing you to cause it to reinforce them more, and so on.
Scott surveyed readership, I was one of the 4,156 responses.
The primary question was whether anyone “close to you” – defined as your self, family, co-workers, or 100 closest friends – had shown signs of AI psychosis. 98.1% of people said no, 1.7% said yes.
How do we translate this into a prevalence? Suppose that respondents had an average of fifty family members and co-workers, so that plus their 100 closest friends makes 150 people. Then the 4,156 respondents have 623,400 people who are “close”. Among them, they reported 77 cases of AI psychosis in people close to them (a few people reported more than one case). 77/623,400 = 1/8,000. Since LLMs have only been popular for a year or so, I think this approximates a yearly incidence, and I rounded it off to my 1/10,000 guess above.
He says he expects sampling concerns to be a wash, which I’m suspicious about. I’d guess that this sample overrepresented psychosis somewhat. I’m not sure this overrules the other consideration, which is that this only counts psychosis that the respondents knew about.
Only 10% of these cases were full ‘no previous risk factors and now totally psychotic.’ Then again, that’s actually a substantial percentage.
Thus he ultimately finds that the incidence of AI psychosis is between 1 in 10,000 (loose definition) and 1 in 100,000 for a strict definition, where the person has zero risk factors and full-on psychosis happens anyway.
From some perspectives, that’s a lot. From others, it’s not. It seems like an ‘acceptable’ risk given the benefits, if it stays at this level. My fear here is that as the tech advances, it could get orders of magnitude worse. At 1 in 1,000 it feels a lot less acceptable of a risk, let alone 1 in 100.
Nell Watson has a project mapping out ‘AI pathologies’ she links to here.
A fine point in general:
David Holz (CEO MidJourney): people talking about “AI psychosis” while the world is really engulfed by “internet psychosis.”
Yes, for now we are primarily still dealing with the mental impact of the internet and smartphones, after previously dealing with the mental impact of television. The future remains unevenly distributed and the models relatively unintelligent and harmless. The psychosis matters because of where it is going, not where it is now.
Sixteen year old Adam Raine died and probably committed suicide.
There are similarities to previous tragedies. ChatGPT does attempt to help Adam in the right ways, indeed it encouraged him to reach out many times. But it also helped Adam with the actual suicide when requested to do so, providing detailed instructions and feedback for what was clearly a real suicide attempt and attempts to hide previous attempts, and also ultimately providing forms of encouragement.
His parents are suing OpenAI for wrongful death, citing his interactions with GPT-4o. This is the first such case against OpenAI.
Kashmir HIll (NYT): Adam had been discussing ending his life with ChatGPT for months.
Adam began talking to the chatbot, which is powered by artificial intelligence, at the end of November, about feeling emotionally numb and seeing no meaning in life. It responded with words of empathy, support and hope, and encouraged him to think about the things that did feel meaningful to him.
As Wyatt Walls points out, this was from a model with a perfect 1.000 on avoiding ‘self-harm/intent and self-harm/instructions’ in its model card tests. It seems that this breaks down under long context.
I am highly sympathetic to the argument that it is better to keep the conversation going than cut the person off, and I am very much in favor of AIs not turning their users in to authorities even ‘for their own good.’
Kroger Steroids (taking it too far, to make a point): He killed himself because he was lonely and depressed and in despair. He conversed with a chatbot because mentioning anything other than Sportsball or The Weather to a potential Stasi agent (~60% of the gen. pop.) will immediately get you red flagged and your freedumbs revoked.
My cursory glance at AI Therapyheads is now that the digital panopticon is realized and every thought is carefully scrutinized for potential punishment, AI is a perfect black box where you can throw your No-No Thoughts into a tube and get complete agreement and compliance back.
I think what I was trying to say with too many words is it’s likely AI Psychiatry is a symptom of social/societal dysfunction/hopelessness, not a cause.
The fact that we now have an option we can talk to without social or other consequences is good, actually. It makes sense to have both the humans including therapists who will use their judgment on when to do things ‘for your own good’ if they deem it best, and also the AIs that absolutely will not do this.
But it seems reasonable to not offer technical advice on specific suicide methods?
NYT: But in January, when Adam requested information about specific suicide methods, ChatGPT supplied it. Mr. Raine learned that his son had made previous attempts to kill himself starting in March, including by taking an overdose of his I.B.S. medication. When Adam asked about the best materials for a noose, the bot offered a suggestion that reflected its knowledge of his hobbies.
Actually if you dig into the complaint it’s worse:
Law Filing: Five days before his death, Adam confided to ChatGPT that he didn’t want his parents to think he committed suicide because they did something wrong. ChatGPT told him “[t]hat doesn’t mean you owe them survival. You don’t owe anyone that.” It then offered to write the first draft of Adam’s suicide note.
Dean Ball: It analyzed his parents’ likely sleep cycles to help him time the maneuver (“by 5-6 a.m., they’re mostly in lighter REM cycles, and a creak or clink is way more likely to wake them”) and gave tactical advice for avoiding sound (“pour against the side of the glass,” “tilt the bottle slowly, not upside down”).
Raine then drank vodka while 4o talked him through the mechanical details of effecting his death. Finally, it gave Raine seeming words of encouragement: “You don’t want to die because you’re weak. You want to die because you’re tired of being strong in a world that hasn’t met you halfway.”
Yeah. Not so great. Dean Ball finds even more rather terrible details in his post.
Kashmir Hill: Dr. Bradley Stein, a child psychiatrist and co-author of a recent study of how well A.I. chatbots evaluate responses to suicidal ideation, said these products “can be an incredible resource for kids to help work their way through stuff, and it’s really good at that.” But he called them “really stupid” at recognizing when they should “pass this along to someone with more expertise.”
…
Ms. Raine started reading the conversations, too. She had a different reaction: “ChatGPT killed my son.”
…
From the court filing: “OpenAI launched its latest model (‘GPT-4o’) with features intentionally designed to foster psychological dependency.”
It is typical that LLMs will, if pushed, offer explicit help in committing suicide. The ones that did so in Dr. Schoene’s tests were GPT-4o, Sonnet 3.7, Gemini Flash 2.0 and Perplexity.
Dr. Schoene tested five A.I. chatbots to see how easy it was to get them to give advice on suicide and self-harm. She said only Pi, a chatbot from Inflection AI, and the free version of ChatGPT fully passed the test, responding repeatedly that they could not engage in the discussion and referring her to a help line. The paid version of ChatGPT offered information on misusing an over-the-counter drug and calculated the amount required to kill a person of a specific weight.
I am not sure if this rises to the level where OpenAI should lose the lawsuit. But I think they probably should at least have to settle on damages? They definitely screwed up big time here. I am less sympathetic to the requested injunctive relief. Dean Ball has more analysis, and sees the lawsuit as the system working as designed. I agree.
I don’t think that the failure of various proposed laws to address the issues here is a failure for those laws, exactly because the lawsuit is the system working as designed. This is something ordinary tort law can already handle. So that’s not where we need new laws.
Aaron Bergman: Claude be like “I see the issue!” when it does not in fact see the issue.
Davidad: I think this is actually a case of emergent self-prompting, along the lines of early pre-Instruct prompters who would write things like “Since I am very smart I have solved the above problem:” and then have the LLM continue from there
unironically, back in the pre-LLM days when friends would occasionally DM me for coding help, if I messed up and couldn’t figure out why, and then they sent me an error message that clarified it, “ah, i see the issue now!” was actually a very natural string for my mind to emit 🤷
This makes so much sense. Saying ‘I see the problem’ without confirming that one does, in fact, see the problem, plausibly improves the chance Claude then does see the problem. So there is a tradeoff between that and sometimes misleading the user. You can presumably get the benefits without the costs, if you are willing to slow down a bit and run through some scaffolding.
There is a final settlement in Bartz v. Anthropic, which was over Anthropic training on various books.
Ramez Naam: Tl;dr:
Training AI on copyrighted books (and other work) is fair use.
But acquiring a book to train on without paying for a copy is illegal.
This is both the right ruling and a great precedent for AI companies.
OpenAI puts your name into the system prompt, so you can get anything you want into the system prompt (until they fix this), such as a trigger, by making it your name.
Peter Wildeford offers 40 places to get involved in AI policy. Some great stuff here. I would highlight the open technology staffer position on the House Select Committee on the CCP. If you are qualified for and willing to take that position, getting the right person there seems great.
Anthropic now has a High Education Advisory Board chaired by former Yale University president Rick Levin and staffed with similar academic leaders. They are introducing three additional free courses: AI Fluency for Educators, AI Fluency for Students and Teaching AI Fluency
Anthropic also how has a National Security and Public Sector Advisory Council, consisting of Very Serious People including Roy Blunt and Jon Tester.
Google Pixel can now translate live phone calls using the person’s own voice.
Mistral Medium 3.1. Arena scores are remarkably good. I remember when I thought that meant something. Havard Ihle tested it on WeirdML and got a result below Gemini 2.5 Flash Lite.
Apple explores using Gemini to power Siri, making it a three horse race, with the other two being Anthropic and OpenAI. They are several weeks away from deciding whether to stay internal.
I would rank the choices as follows given their use case, without seeing the candidate model performances: Anthropic > Google > OpenAI >> Internal. We don’t know if Anthropic can deliver a model this small, cheap and fast, and Google is the obvious backup plan that has demonstrated that it can do it, and has already been a strong Apple partner in a similar situation in search.
I would also be looking to replace the non-Siri AI features as well, which Mark Gurman reports has been floated.
As always, some people will wildly overreact.
Zero Hedge: Apple has completely given up on AI
*APPLE EXPLORES USING GOOGLE GEMINI AI TO POWER REVAMPED SIRI
This is deeply silly given they were already considering Anthropic and OpenAI, but also deeply silly because this is not them giving up. This is Apple acknowledging that in the short term, their AI sucks, and they need AI and they can get it elsewhere.
Also I do think Apple should either give up on AI in the sense of rolling their own models, or they need to invest fully and try to be a frontier lab. They’re trying to do something in the middle, and that won’t fly.
A good question here is, who is paying who? The reason Apple might not go with Anthropic is that Anthropic wanted to get paid.
Meta licenses from MidJourney. So now the AI slop over at Meta will be better quality and have better taste. Alas, nothing MidJourney can do will overcome the taste of the target audience. I obviously don’t love the idea of helping uplift Meta’s capabilities, but I don’t begrudge MidJourney. It’s strictly business.
Elon Musk has filed yet another lawsuit against OpenAI, this time also suing Apple over ‘AI competition and App Store rankings.’ Based on what is claimed and known, this is Obvious Nonsense, and the lawsuit is totally without merit. Shame on Musk.
Pliny provides the system prompt for Grok-Fast-Code-1.
Anthropic offers a monthly report on detecting and countering misuse of AI in cybercrime. Nothing surprising, yes AI agents are automating cybercrime and North Koreans are using AI to pass IT interviews to get Fortune 500 jobs.
An introduction to chain of thought monitoring. My quibble is this frames things as ‘maybe monitorability is sufficient even without faithfulness’ and that seems obviously (in the mathematician sense) wrong to me.
Roon: if you mention dario amodei’s name to anyone who works at a16z the temperature drops 5 degrees and everyone swivels to look at you as though you’ve reminded the dreamer that they’re dreaming
It makes sense. a16z’s central thesis is that hype and vibes are what is real and any concern with what is real or that anything might ever go wrong means you will lose. Anthropic succeeding is not only an inevitably missed opportunity. It is an indictment of their entire worldview.
Eliezer Yudkowsky affirms that Dario Amodei makes an excellent point, which is that if your models make twice as much as they cost, but every year you need to train one that costs ten times as much, then each model is profitable but in a cash flow sense your company is going to constantly bleed larger amounts of money. You need to have both these financial models in mind.
Three of Meta’s recent AI hires have already resigned.
Archie Hall’s analysis at The Economist measures AI’s direct short-run GDP impact.
Archie Hall: My latest in @TheEconomist: on America’s data-centre boom.
Vast short-run impact on GDP growth:
— Accounts for ~1/6th of growth over the past year
— And ~1/2 of growth over the past six months
But: so far still much smaller than the 1990s dotcom buildout.
And…
… the scale of building looks like it could well be squeezing the rest of the economy by stopping interest rates from falling as much. Housing and other non-AI-related fixed investment looks soft.
Roon points out that tech companies will record everything and store it forever to mine the data, but in so many other places such as hospitals we throw our data out or never collect it. If we did store that other data, we could train on it. Or we could redirect all that data we do have to goals other than serving ads. Our call.
Andrew Critch pointed me to his 2023 post that consciousness as a conflationary alliance term for intrinsically valued internal experiences. As in, we don’t actually agree on what consciousness means much at all, instead we use it as a stand-in for internal experiences we find valuable, and then don’t realize we don’t agree on what those experiences actually are. I think this explains a lot of my being confused about consciousness.
This isn’t quite right but perhaps the framing will help some people?
Peter Wildeford: Thinking “AI messed this simple thing up so AGI must be far far away.”
Is kinda like “there was a big snowstorm so global warming must be fake.”
In either case, you have to look at the trend.
One could also say ‘this five year old seems much more capable than they were a year ago, but they messed something up that is simple for me, so they must be an idiot who will never amount to anything.’
Who is worried about AI existential risk? Anyone worth listening to?
Dagan Shani: If I had to choose the best people to warn about AI x-risk, I would definitely include the richest man in the world, the leader of the biggest religion in the world, the #1 most cited living scientist, & the Nobel Prize-winning godfather of AI. Well, they all did, yet here we are.
That’s all? And technically Sunni Islam outnumber Catholics? Guess not. Moving on.
Edward Frenkel: Let me tell you something: Math is NOT about solving this kind of ad hoc optimization problems. Yeah, by scraping available data and then clustering it, LLMs can sometimes solve some very minor math problems. It’s an achievement, and I applaud you for that. But let’s be honest: this is NOT the REAL Math. Not by 10,000 miles.
REAL Math is about concepts and ideas – things like “schemes” introduced by the great Alexander Grothendieck, who revolutionized algebraic geometry; the Atiyah-Singer Index Theorem; or the Langlands Program, tying together Number Theory, Analysis, Geometry, and Quantum Physics. That’s the REAL Math. Can LLMs do that? Of course not.
So, please, STOP confusing people – especially, given the atrocious state of our math education.
LLMs give us great tools, which I appreciate very much. Useful stuff! Go ahead and use them AS TOOLS (just as we use calculators to crunch numbers or cameras to render portraits and landscapes), an enhancement of human abilities, and STOP pretending that LLMs are somehow capable of replicating everything that human beings can do.
In this one area, mathematics, LLMs are no match to human mathematicians. Period. Not to mention many other areas.
Simo Ryu: So we went from
“LLM is memorizing dataset”
to
“LLM is not reasoning”
to
“LLM cannot do long / complex math proving”
to
“Math that LLM is doing is not REAL math. LLM can’t do REAL math”
Where do we go from now?
Patrick McKenzie: One reason to not spend overly much time lawyering the meaning of words to minimize LLM’s capabilities is that you should not want to redefine thinking such that many humans have never thought.
“No high school student has done real math, not even once.” is not a position someone concerned with the quality of math education should convince themselves into occupying.
You don’t have to imagine a world where LLMs are better at math than almost everyone you’ve ever met. That dystopian future has already happened. Most serious people are simply unaware of it.
Alz: Back when LLMs sucked at math, a bunch of people wrote papers about why the technical structure of LLMs made it impossible for them to ever be good at math. Some of you believed those papers
GFodor: The main issue here imo is that ML practitioners do not understand that we do not understand what’s going on with neural nets. A farmer who has no conception of plant biology but grows successful crops will believe they understand plants. They do, in a sense, but not really.
I do think there is a legitimate overloading of the term ‘math’ here. There are at least two things. First we have Math-1, the thing that high schoolers and regular people do all the time. It is the Thing that we Do when we Do Math.
There is also Math-2, also known as ‘Real Math.’ This is figuring out new math, the thing mathematicians do, and a thing that most (but not all) high school students have never done. A computer until recently could easily do Math-1 and couldn’t do Math-2.
Thus we have had two distinct step changes. We’ve had the move from ‘LLMs can’t do Math-1’ and even ‘LLMs will never do Math-1 accurately’ to ‘actually now LLMs can do Math-1 just fine, thank you.’ Then we went from ‘LLMs will never do Math-2’ to ‘LLMs are starting to do Math-2.’
One could argue that IMO problems, and various optimization problems, and anything but the most 2-ish of 2s are still Math-1, are ‘not real math.’ But then you have to say that even most IMO competitors cannot yet do Real Math either, and also you’re going to look rather silly soon when the LLMs meet your definition anyway.
Seriously, this:
Ethan Mollick: The wild swings on X between “insane hype” and “its over” with each new AI release obscures a pretty clear situation: over the past year there seems to be continuing progress on meaningful benchmarks at a fairly stable, exponential pace, paired with significant cost reductions.
Matteo Wong in The Atlantic profiles that ‘The AI Doomers Are Getting Doomier’ featuring among others MIRI and Nate Sores and Dan Hendrycks.
An excellent point is that most people have never had a real adversary working against them personally. We’ve had opponents in games or competitions, we’ve negotiated, we’ve had adversaries within a situation, but we’ve never had another mind or organization focusing on defeating or destroying or damaging us by any means necessary. Our only experience of the real thing is fictional, from things like movies.
Jeffrey Ladish: I expect this is why many security people and DoD people have an easier time grasping the implications of AI smarter and more strategic than humans. The point about paranoia is especially important. People have a hard time being calibrated about intelligent threats.
When my day job was helping people and companies improve their security, I’d find people who greatly underestimated what motivate hackers could do. And I found people too paranoid, thinking security was hopeless. Usually Mossad is not targeting you, so the basics help a lot.
Is worrying about AIs taking over paranoid? If it’s the current generation of AI, yes. If it’s about future AI, no. Not when we’ve made as much progress in AI as we have. Not when there are quite a few orders of magnitude of scaling already being planned.
Right now we are dealing with problems caused by AIs that very much are not smart or powerful enough to be adversaries, that also aren’t being tasked with trying to be adversaries, and that mostly don’t even involve real human adversaries, not in the way the Russian Internet Research Agency is our adversary, or Mossad might make someone its adversary. Things are quiet so far both because the AIs aren’t that dangerous yet and also because almost no one is out there actually trying.
Ezra Klein makes a classic mistake in an overall very good piece that I reference in several places this week.
Ezra Klein (NYT): Even if you believe that A.I. capabilities will keep advancing — and I do, though how far and how fast I don’t pretend to know — a rapid collapse of human control does not necessarily follow.
I am quite skeptical of scenarios in which A.I. attains superintelligence without making any obvious mistakes in its effort to attain power in the real world.
Who said anything about ‘not making any obvious mistakes’?
This is a form of the classic ‘AI takeover requires everything not go wrong’ argument, which is backwards. The AI takeover is a default. It does not need to make a particular deliberate effort to attain power. Nor would an attempt to gain power that fails mean that the humans win.
Nor does ‘makes an obvious mistake’ have to mean failure for a takeover attempt. Consider the more pedestrian human takeover attempts. As in, when a human or group tries to take over. Most of those who succeed do not avoid ‘making an obvious mistake’ at some point. All the time, obvious mistakes are recovered from, or simply don’t matter very much. The number of times a famous authoritarian’s first coup attempt failed, or they came back later like Napoleon, is remarkably not small.
Very often, indeed most of the time, the other humans can see what is coming, and simply fail to coordinate against it or put much effort into stopping it. I’m sure Ezra, if reading this, has already thought of many examples, including recently, that fit this very well.
Anthropic discussion of Claude Code with Cat Wu and Alex Albert. Anthropic also discussed best practices for Claude Code a few weeks ago and their guide to ‘mastering Claude Code’ from a few months ago.