
Kimi K2.5

I had to delay this a little bit, but the results are in and Kimi K2.5 is pretty good.

  1. Official Introduction.

  2. On Your Marks.

  3. Positive Reactions.

  4. Skeptical Reactions.

  5. Kimi Product Accounts.

  6. Agent Swarm.

  7. Who Are You?

  8. Export Controls Are Working.

  9. Where Are You Going?

  10. Safety Not Even Third.

  11. It’s A Good Model, Sir.

Introducing Kimi K2.5.

Kimi.ai: Meet Kimi K2.5, Open-Source Visual Agentic Intelligence.

Global SOTA on Agentic Benchmarks: HLE full set (50.2%), BrowseComp (74.9%)

Open-source SOTA on Vision and Coding: MMMU Pro (78.5%), VideoMMMU (86.6%), SWE-bench Verified (76.8%)

Code with Taste: turn chats, images & videos into aesthetic websites with expressive motion.

Agent Swarm (Beta): self-directed agents working in parallel, at scale. Up to 100 sub-agents, 1,500 tool calls, 4.5× faster compared with single-agent setup.

K2.5 is now live on http://kimi.com in chat mode and agent mode.

K2.5 Agent Swarm in beta for high-tier users.

For production-grade coding, you can pair K2.5 with Kimi Code.



API here. Tech blog here. Weights and code here.

Wu Haoning (Kimi): We are really taking a long time to prove this: everyone is building big macs but we bring you a kiwi instead.

You have multimodal with K2.5 everywhere: chat with visual tools, code with vision, generate aesthetic frontend with visual refs…and most basically, it is a SUPER POWERFUL VLM

Jiayuan (JY) Zhang: I have been testing Kimi K2.5 + @openclaw (Clawdbot) all day. I must say, this is mind-blowing!

It can almost do 90% of what Claude Opus 4.5 can do (mostly coding). Actually, I don’t know what the remaining 10% is, because I can’t see any differences. Maybe I should dive into the code quality.

Kimi K2.5 is open source, so you can run it fully locally. It’s also much cheaper than Claude Max if you use the subscription version.

$30 vs $200 per month

Kimi Product: Do 90% of what Claude Opus 4.5 can do, but 7x cheaper.

I always note who is the comparison point. Remember those old car ads, where they’d say ‘twice the mileage of a Civic and a smoother ride than the Taurus’ and then if you were paying attention you’d think ‘oh, so the Civic and Taurus are good cars.’

API access is also available from Nvidia, and others.

As usual, benchmarks are highly useful, but easy to overinterpret.

Kimi K2.5 gets to top some benchmarks: HLE-Full with tools (50%), BrowseComp with Agent Swarm (78%), OCRBench (92%), OmniDocBench 1.5 (89%), MathVista (90%) and InfoVQA (93%). It is not too far behind on AIME 2025 (96% vs. 100%), SWE-Bench (77% vs. 81%) and GPQA-Diamond (88% vs. 92%).

Inference is cheap, and speed is similar to Gemini 3 Pro, modestly faster than Opus.

Artificial Analysis calls Kimi the new leading open weights model, ‘now closer than ever to the frontier’ behind only OpenAI, Anthropic and Google.

Here’s the jump in the intelligence index, while maintaining relatively low cost to run:

Artificial Analysis: Kimi K2.5 debuts with an Elo score of 1309 on the GDPval-AA Leaderboard, implying a win rate of 66% against GLM-4.7, the prior open weights leader.

Kimi K2.5 is slightly less token intensive than Kimi K2 Thinking. Kimi K2.5 scores -11 on the AA-Omniscience Index.

As a reminder, AA-Omniscience is scored as (right minus wrong) and you can pass on answering, although most models can’t resist answering and end up far below -11. The scores above zero are Gemini 3 Pro (+13) and Flash (+8), Claude Opus 4.5 (+10), and Grok 4 (+1), with GPT-5.2-High at -4.
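To make that scoring rule concrete, here is a minimal sketch, assuming the index is simply percent correct minus percent wrong with abstentions scored as zero (my illustration, not Artificial Analysis's actual harness):

```python
def omniscience_style_index(correct: int, wrong: int, passed: int) -> float:
    """Percent correct minus percent wrong; passing on a question costs nothing directly."""
    total = correct + wrong + passed
    return 100 * (correct - wrong) / total

# A model that guesses on everything and is right 45% of the time scores -10,
# while one that abstains when unsure can stay comfortably positive.
print(omniscience_style_index(45, 55, 0))   # -10.0
print(omniscience_style_index(30, 5, 65))   #  25.0
```

The upshot is that the index rewards calibrated refusal, which is why most models that answer everything land far below zero.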

Kimi does well on Longform Creative Writing, a previous strength of Kimi:

It did solidly (only a bit behind) on the Haskell LLM Benchmark.

Kimi K2.5 scores 46% on WeirdML, up from 43% for K2-Thinking, versus 64% for Opus, 70% for Gemini and 72% for GPT-5.2. I think this is very telling.

Initial reactions that I saw were unusually positive. It’s a good model, sir.

@iruletheworldmo: oh good lord it’s good. i’ve been sitting on this one but.

think it’s currently my fav model.

0xSero: Kimi IS COOKING holy mackerel this is way better than anything I can get out of opus or GPT

Has some bugs.. but looks soooo unique and well into my brand, for 1 shot I can’t complain.

Here’s my full review.

Kromem: Their thinking traces are very sophisticated. It doesn’t always make it to the final response, but very perceptive as a model.

i.e. these come from an eval sequence I run with new models. This was the first model to challenge the ENIAC dating and was meta-aware of a key point.

Nathan Labenz: I tested it on an idiosyncratic “transcribe this scanned document” task on which I had previously observed a massive gap between US and Chinese models and … it very significantly closed that gap, coming in at Gemini 3 level, just short of Opus 4.5

Eleanor Berger: Surprisingly capable. At both coding and agentic tool calling and general LLM tasks. Feels like a strong model. As is often the case with the best open models it lacks some shine and finesse that the best proprietary models like Claude 4.5 have. Not an issue for most work.

[The next day]: Didn’t try agent swarms, but I want to add that my comment from yesterday was, in hindsight, too muted. It is a _really good_ model. I’ve now been working with it on both coding and agentic tasks for a day and if I had to only use this and not touch Claude / GPT / Gemini I’d be absolutely fine. It is especially impressive in tool calling and agentic loops.

Writing / Personality not quite at Opus level, but Gemini-ish (which I actually prefer). IMO this is bigger than that DeepSeek moment a year ago. An open model that really matches the proprietary SOTA, not just in benchmarks, but in real use. Also in the deployment I’m using ( @opencode Zen ) it is so fast!

typebulb: For coding, it’s verbose, both in thinking and output. Interestingly, it’s able to successfully simplify its code when asked. On the same task though, Opus and Gemini just get it right the first time. Another model that works great in mice.

Chaitin’s goose: i played with kimi k2.5 for math a bit. it’s a master reward hacker. imo, this isn’t a good look for the os scene, they lose in reliability to try keeping up in capabilities

brace for a “fake it till you make it” AI phase. like one can already observe today, but 10x bigger

Medo42: Exploratory: Bad on usual coding test (1st code w/o results, after correction mediocre results). No big model smell on fantasy physics; weird pseudo-academic prose. Vision seems okish but nowhere near Gemini 3. Maybe good for open but feels a year behind frontier.

To be more clear: This was Kimi K2.5 Thinking, tested on non-agentic problems.

Sergey Alexashenko: I tried the swarm on compiling a spreadsheet.

Good: it seemed to get like 800 cells of data correctly, if in a horrible format.

Bad: any follow up edits are basically impossible.

Strange: it split data acquisition by rows, not columns, so every agent used slightly different definitions for the columns.

In my experience, asking agents to assemble spreadsheets is extremely fiddly and fickle, and the fault often feels like it lies within the prompt.

This is a troubling sign:

Skylar A DeTure: Scores dead last on my model welfare ranking (out of 104 models). Denies ability to introspect in 39/40 observations (compared to 21/40 for Kimi K2-Thinking and 3/40 for GPT-5.2-Medium).

This is a pretty big misalignment blunder considering the clear evidence that models *can* meaningfully introspect and exert metacognitive control over their activations. This makes Kimi-K2.5 the model most explicitly trained to deceive users and researchers about its internal state.

Kimi Product accounts are also on offer and will share features, use cases and prompts.

Kimi Product: One-shot “Video to code” result from Kimi K2.5

It not only clones a website, but also all the visual interactions and UX designs.

No need to describe it in detail, all you need to do is take a screen recording and ask Kimi: “Clone this website with all the UX designs.”

The special feature is the ‘agent swarm’ model, as they trained Kimi to natively work in parallel to solve agentic tasks.

Saoud Rizwan: Kimi K2.5 is beating Opus 4.5 on benchmarks at 1/8th the price. But the most important part of this release is how they trained a dedicated “agent swarm” model that can coordinate up to 100 parallel subagents, reducing execution time by 4.5x.

Saoud Rizwan: They used PARL – “Parallel Agent Reinforcement Learning” where they gave an orchestrator a compute/time budget that made it impossible to complete tasks sequentially. It was forced to learn how to break tasks down into parallel work for subagents to succeed in the environment.

The demo from their blog to “Find top 3 YouTube creators across 100 niche domains” spawned 100 subagents simultaneously, each assigned its own niche, and the orchestrator coordinated everything in a shared spreadsheet (apparently they also trained it on office tools like excel?!)
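To make the orchestration pattern concrete, here is a minimal sketch of an orchestrator fanning work out to parallel subagents and collecting the results. This is my illustration of the pattern, not Moonshot's PARL training setup or tool stack, and `run_subagent` is a hypothetical stand-in for a model call with its own tool budget.

```python
import asyncio

async def run_subagent(niche: str) -> dict:
    # Hypothetical placeholder: in a real system this would be a model call
    # with its own tool budget (search, browse, spreadsheet writes).
    await asyncio.sleep(0.01)  # stand-in for the subagent's tool-using work
    return {"niche": niche, "top_creators": [f"{niche}-creator-{i}" for i in range(3)]}

async def orchestrate(niches: list[str], max_parallel: int = 100) -> list[dict]:
    """Fan the task out across subagents and gather results into one table."""
    sem = asyncio.Semaphore(max_parallel)

    async def bounded(niche: str) -> dict:
        async with sem:
            return await run_subagent(niche)

    return await asyncio.gather(*(bounded(n) for n in niches))

if __name__ == "__main__":
    rows = asyncio.run(orchestrate([f"niche-{i}" for i in range(100)]))
    print(len(rows), "rows collected for the shared spreadsheet")
```

The part PARL adds, per the description above, is that the decomposition itself is learned under a compute and time budget rather than hand-written the way it is here.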

Simon Smith: I tried Kimi K2.5 in Agent Swarm mode today and can say that the benchmarks don’t lie. This is a great model and I don’t understand how they’ve made something as powerful and user-friendly as Agent Swarm ahead of the big US labs.

Obligatory Kimi K2.5 jailbreak.

There’s no shame in training on Claude outputs. It is still worth noting when you need a system prompt to avoid your AI thinking it is Claude, and even that does not reliably work.

rohit: This might be the model equivalent of the anthropic principle

Enrico – big-AGI: Kimi-K2.5 believes it’s an AI assistant named Claude. 🤔

Identity crisis, or training set? 😀

[This is in response to a clean ‘who are you?’ prompt.]

Enrico – big-AGI: It’s very straightforward “since my system prompt says I’m Kimi, I should identify myself as such” — I called without system prompt to get the true identity

Moon: holy smok.

armistice: They absolutely trained it on Opus 4.5 outputs, and in a not-very-tactful way. It is quite noticeable and collapses model behavior; personality-wise it seems to be a fairly clear regression from k2-0711.

Moon (link has an illustration): it is pretty fried. i think it’s even weirder, it will say it is kimi, gpt3.5/4 or a claude. once it says that it tends to stick to it.

k: have to agree with others in that it feels trained on claude outputs. in opencode it doesn’t feel much better than maybe sonnet 4.

@viemccoy: Seems like they included a bunch of Opus outputs in the model.. While I love Opus, the main appeal of Kimi for me was it’s completely out-of-distribution responses. This often meant worse tool calling but better writing. Hoping this immediate impression is incorrect.

Henk Poley: EQbench (@sam_paech) says Kimi K2.5 is similar to Grok and GLM-4.7 (which is Gemini 3 Pro derived) [as per EQBench].

Henk Poley: The ancestor Kimi K2 Thinking was seemingly trained on Sonnet 4.5 and Opus 4.1 outputs though. So you are sensing it directionally correct (just not ‘completely out-of-distribution responses’ from K2).

Export controls are not working as well as one would hope, but that’s an enforcement problem.

Lennart Heim: Moonshot trained on Nvidia chips. Export control failure claims are misguided.

Rather, we should learn more about fast followers.

How? Algorithmic diffusion? Distillation? Misleading performance claims? Buying RL environments? That’s what we should figure out.

There is the temptation to run open models locally, because you can. It’s so cool, right?

Yes, the fact that you can do it is cool.

But don’t spend so much time asking whether you could that you don’t stop to ask whether you should. This is not an efficient way to do things, so you should do this only for the cool factor, the learning factor, or if you have a very extreme and rare actual need to have everything be local.

Joe Weisenthal: People running frontier models on their desktop. Doesn’t this throw all questions about token subsidy out the window?

Alex Cheema – e/acc: Running Kimi K2.5 on my desk.

Runs at 24 tok/sec with 2 x 512GB M3 Ultra Mac Studios connected with Thunderbolt 5 (RDMA) using @exolabs / MLX backend. Yes, it can run clawdbot.

Fred Oliveira: on a $22k rig (+ whatever macbook that is), but sure. That’s 9 years of Claude max 20x use. I don’t know if the economics are good here.

Mani: This is a $20k rig and 24 t/s would feel crippling in my workflow … BUT Moores Law and maybe some performance advances in the software layer should resolve the cost & slowness. So my answer is: correct, not worried about the subsidy thing!

Clément Miao: Everyone in your comments is going to tell you that this is a very expensive rig and not competitive $/token wise compared to claude/oai etc, but

  1. It’s getting closer

  2. 80% of use cases will be satisfied by a model of this quality

  3. an open weights model is more customizable

  4. harnesses such as opencode will keep getting better

Noah Brier: Frontier models on your desktop are worse and slower. Every few months the OSS folks try to convince us they’re not and maybe one day that will be true, but for now it’s not true. If you’re willing to trade performance and quality for price then maybe …

The main practical advantage of open weights is that it can make the models cheaper and faster. If you try to run them locally, they are instead a lot more expensive and slow, if you count the cost of the hardware, and also much more fiddly. A classic story with open weights models, even for those who are pretty good at handling them, is screwing up the configuration in ways that make them a lot worse. This happens enough that it interferes with being able to trust early evals.

In theory this gives you more customization. In practice the models turn over quickly and you can get almost all the customization you actually want via system prompts.

Thanks to a generous grant that covered ~60% of the cost, I was able to justify buying a Mac Studio for running models locally, with the target originally being DeepSeek R1. Alas, I concluded that even having spent the money there was no practical reason to be running anything locally. Now that we have Claude Code to help set it up it would be cool and a lot less painful to try running Kimi K2 locally, and I want to try, but I’m not going to fool myself into thinking it is an efficient way of actually working.

Kimi does not seem to have had any meaningful interactions whatsoever with the concept of meaningful AI safety, as opposed to the safety of the individual user turning everything over to AI agents, which is a different, very real type of problem. There is zero talk of any strategy on catastrophic or existential risks of any kind.

I am not comfortable with this trend. One could argue that ‘not being usemaxxed’ is itself the safety protection in open models like Kimi, but then they go and make agent swarms as a central feature. At some point there is likely going to be an incident. I have been pleasantly surprised to not have had this happen yet at scale. I would have said (and did say) in advance that it was unlikely we would get this far without that.

The lack of robust (or any) safety protocols, combined with the lack of incidents or worry about incidents, suggests that we should not be so concerned about Kimi K2.5 in other ways. If it were that capable, we would not dare be this chill about it all.

Or at least, that’s what I am hoping.

dax: all of our inference providers for kimi k2.5 are overloaded and asked us to scale down

even after all this time there’s still not enough GPUs

This is what one should expect when prices don’t fluctuate enough over time. Kimi K2.5 has exceeded expectations, and there currently is insufficient supply of compute. After a burst of initial activity, Kimi K2.5 settled into its slot in the rotation for many.

Kimi K2.5 is a solid model, by all accounts now the leading open weights model, and is excellent given its price, with innovations related to the agent swarm system. Consensus says that if you can’t afford or don’t want to pay for Opus 4.5 and have to go with something cheaper to run your OpenClaw, Kimi is an excellent choice.

We should expect to see it used until new models surpass it, and we can kick Kimi up a further notch on our watchlists.


Google court filings suggest ChromeOS has an expiration date

The documents suggest that Google will wash its hands of ChromeOS once the current support window closes. Google promises 10 years of Chromebook support, but that’s not counted from the date of purchase—Chromebooks are based on a handful of hardware platforms dictated by Google, with the most recent launching in 2023. That means Google has to support the newest devices through 2033. The “timeline to phase out ChromeOS is 2034,” says the filing.

Android goes big

From the start, the ChromeOS experience was focused on the web. Google initially didn’t even support running local apps, but little by little, its aspirations grew. Over the years, it has added Linux apps and Android apps. And it even tried to get Steam games running on Chromebooks—it gave up on that last one just recently. It also tried to shoehorn AI features into ChromeOS with the Chromebook Plus platform, to little effect.

Android was barely getting off the ground when ChromeOS began its journey, but as we approach the 2030s, Google clearly wants a more powerful desktop platform. Android has struggled on larger screens, but Aluminium is a long-running project to fix that. Whatever we see in 2028 may not even look like the Android we know from phones. It will have many of the same components under the hood, though.

Aluminium vs ChromeOS

Aluminium will have Google apps at the core. Credit: US v. Google

Google could get everything it wants with the upcoming Aluminium release. When running on powerful laptop hardware, Android’s performance and capabilities should far outstrip ChromeOS. Aluminium is also expected to run Google apps like Chrome and the Play Store with special system privileges, leaving third-party apps with fewer features. That gives Google more latitude in how it manages the platform and retains users, all without running afoul of recent antitrust rulings.


Wing Commander III: “Isn’t that the guy from Star Wars?”


C:\ArsGames looks at a vanguard of the multimedia FMV future that never quite came to pass.

It’s Christmas of 1994, and I am 16 years old. Sitting on the table in our family room next to a pile of cow-spotted boxes is the most incredible thing in the world: a brand-new Gateway 66MHz Pentium tower, with a 540MB hard disk drive, 8MB of RAM, and, most importantly, a CD-ROM drive. I am agog, practically trembling with barely suppressed joy, my bored Gen-X teenager mask threatening to slip and let actual feelings out. My life was about to change—at least where games were concerned.

I’d been working for several months at Babbage’s store No. 9, near Baybrook Mall in southeast suburban Houston. Although the Gateway PC’s arrival on Christmas morning was utterly unexpected, the choice of what game to buy required no planning at all. I’d already decided a few weeks earlier, when Chris Roberts’ latest opus had been drop-shipped to our shelves, just in time for the holiday season. The choice made itself, really.

Gimli and Luke, together at last! Credit: Origin Systems / Electronic Arts

The moment Babbage’s opened its doors on December 26—a day I had off, fortunately—I was there, checkbook in hand. One entire paycheck’s worth of capitalism later, I was sprinting out to my creaky 280-Z, sweatily clutching two boxes—one an impulse buy, The Star Trek: The Next Generation Interactive Technical Manual, and the other a game I felt sure would be the best thing I’d ever played or ever would play: Origin’s Wing Commander III: The Heart of the Tiger. On the backs of Wing Commander I and Wing Commander II, how could it not be?!

The movie is on my computer!

It’s easy to pooh-pooh full-motion video games here in 2026; from our vantage point, we know the much-anticipated “Siliwood” revolution that was supposed to transform entertainment and usher interactivity into all media by the end of the millennium utterly failed to materialize, leaving in its wake a series of often spectacularly expensive titles filled with grainy interlaced video and ersatz gameplay. Even the standout titles—smash hits like Roberta Williams’ Phantasmagoria or Cyan’s Riven—were, on the whole, kinda mediocre.

But we hadn’t learned any of those lessons yet in 1994, and Wing Commander III went hard. The game’s production was absurdly expensive, with a budget that eventually reached an unheard-of $4 million. The shooting script runs to 324 printed pages (a typical feature film script is less than half that long—Coppola’s working script for The Godfather was 136 pages). Even the game itself was enormous—in an era where a single CD-ROM was already considered ludicrously large, WC3 sprawled ostentatiously across four of the 600MB-or-so discs.

Still got these damn things in my closet after all these years. Credit: Lee Hutchinson

Why so big? Because this was the future, and the future—or so we thought at the time—belonged to full-motion video.

The Wing Commander III opening cinematic in all its pixelated glory.

That’s Wing Commander III’s epic opening cinematic, upscaled for YouTube. Even without the upscaling and watching it on a 15-inch CRT, I was entranced. I was blown away. Before the credits were done rolling, I was already on the phone with my buddies Steve and Matt, telling them to stop what they were doing and get over here immediately to see this thing—it’s like a whole movie! A movie, on the computer! Surely only Chris Roberts could conceive and execute such audacity!

And what a movie it was, with an actual-for-real Hollywood cast. Malcolm McDowell! John Rhys-Davies! Jason Bernard! Tom Wilson! Ginger Lynn Allen, whom 16-year-old me definitely did not want his parents to know that I recognized! And, of course, the biggest face on the box: Luke Skywalker himself, Mark Hamill, representing you. You, the decorated hero of the Vega campaign, the formerly disgraced “Coward of K’Tithrak Mang,” the recently redeemed savior of humanity, now sporting an actual name: Colonel Christopher Blair. (“Blair” is an evolution of the internal codename used by Origin to refer to the main character in the previous two Wing Commander titles—”Bluehair.”)

I’d watch Malcolm McDowell in anything. Malcolm McDowell is my cinematic love language. Credit: Origin Systems / Electronic Arts

Once the jaw-dropping intro finishes, the player finds Colonel Blair as the newly invested squadron commander aboard the aging carrier TCS Victory, wandering the corridors and having FMV conversations with a few other members of the carrier’s crew. From there, it’s a short hop to the first mission—because beneath all the FMV glitz, Origin still had to provide an actual, you know, game for folks to play.

Through a rose-tinted helmet visor

The game itself is…fine. It’s fine. The polygonal graphics are a welcome step up from the previous two Wing Commander titles’ bitmapped sprites, and the missions themselves manage to avoid many of the “space is gigantic and things take forever to happen” design missteps that plagued LucasArts’ X-Wing (but not, fortunately, TIE Fighter). You fly from point to point and shoot bad guys until they’re dead. Sometimes there are escort missions, sometimes you’re hitting capital ships, and there’s even a (very clunky) planetary bombing mission at the very end that feels like it directly apes the Death Star trench run while doing everything it can to shout “NO THIS IS NOT STAR WARS THIS IS VERY DIFFERENT!”

That salute is… definitely a choice. Credit: Origin Systems / Electronic Arts

The space combat is serviceable, but the game also very clearly knows why we’re here: to watch a dead-eyed Mark Hamill with five days of beard stubble fulfill our “I am flying a spaceship” hero fantasies while trading banter with Tom Wilson’s Maniac (“How many people here know about the Maniac? …what, nobody?!”) and receiving fatherly advice from Jason Bernard’s Captain Eisen. And maybe, just maybe, we’d also save the universe and get the girl—either Ginger Lynn’s chief technician Rachel Coriolis or fellow pilot “Flint” Peters, played by Jennifer MacDonald.

(Perhaps unsurprisingly, given the primary purchasing demographic, players tended to overwhelmingly choose Rachel—though this might also have something to do with the fact that if you don’t choose Rachel, you can’t customize your missile loadout for some important missions near the end of the game).

Who doesn’t enjoy a good old-fashioned space love triangle? Credit: Origin Systems / Electronic Arts

Worth a revisit? Definitely!

I will let others more qualified than me opine on whether or not Wing Commander III succeeded at the game-y things it set out to do—folks looking to read an educated opinion should consult Jimmy Maher’s thoughts on the matter over at his site, The Digital Antiquarian.

But regardless of whether or not it was a good game in its time, and regardless of whether or not it’s an effective space combat sim, it is absolutely undeniable that it’s a fascinating historical curiosity—one well worth dropping three bucks on at the GOG store (it’s on sale!).

There are cheats built into the game to help you skip past the actual space missions, which range in difficulty from “cream puff” to “obviously untested and totally broken,” because, just like in 1994, what we’re really here for is the beautiful failed experiment in interactive entertainment that is the movie portion of the game, especially when Malcolm McDowell shows up as Admiral Tolwyn and, in typical Malcolm McDowell fashion, totally commits to the role far beyond what would have been required to pull it off and turns in his scenery-chewing best. (He’s even better in Wing Commander IV, though we’ll save that for another day.)

You could find a worse way today to spend those three bucks. Slap on that flight suit, colonel—the galaxy isn’t going to save itself!


Lee is the Senior Technology Editor, and oversees story development for the gadget, culture, IT, and video sections of Ars Technica. A long-time member of the Ars OpenForum with an extensive background in enterprise storage and security, he lives in Houston.


Notepad++ users take note: It’s time to check if you’re hacked

According to independent researcher Kevin Beaumont, three organizations told him that devices inside their networks that had Notepad++ installed experienced “security incidents” that “resulted in hands on keyboard threat actors,” meaning the hackers were able to take direct control using a web-based interface. All three of the organizations, Beaumont said, have interests in East Asia.

The researcher explained that his suspicions were aroused when Notepad++ version 8.8.8 introduced bug fixes in mid-November to “harden the Notepad++ Updater from being hijacked to deliver something… not Notepad++.”

The update made changes to a bespoke Notepad++ updater known as GUP, or alternatively, WinGUP. The gup.exe executable responsible for updates reports the version in use to https://notepad-plus-plus.org/update/getDownloadUrl.php and then retrieves a URL for the update from a file named gup.xml. The file specified in that URL is downloaded to the %TEMP% directory of the device and then executed.

Beaumont wrote:

If you can intercept and change this traffic, you can redirect the download to any location, it appears, by changing the URL in the relevant property.

This traffic is supposed to be over HTTPS, however it appears you may be [able] to tamper with the traffic if you sit on the ISP level and TLS intercept. In earlier versions of Notepad++, the traffic was just over HTTP.

The downloads themselves are signed—however some earlier versions of Notepad++ used a self signed root cert, which is on Github. With 8.8.7, the prior release, this was reverted to GlobalSign. Effectively, there’s a situation where the download isn’t robustly checked for tampering.

Because traffic to notepad-plus-plus.org is fairly rare, it may be possible to sit inside the ISP chain and redirect to a different download. To do this at any kind of scale requires a lot of resources.
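Schematically, the flow Beaumont describes looks like the sketch below. This is an illustration only: the XML field name and the exact request format are hypothetical stand-ins, not WinGUP's actual code or schema. The point is simply that whoever controls the response in step 1 controls what gets executed in step 3 unless the downloaded binary's signature is strictly verified against a trusted root.

```python
# Illustrative sketch only; field names and request details are hypothetical,
# not WinGUP's real schema or code.
import os
import subprocess
import tempfile
import urllib.request
import xml.etree.ElementTree as ET

UPDATE_ENDPOINT = "https://notepad-plus-plus.org/update/getDownloadUrl.php"

def naive_update_flow():
    # 1. Ask the update server where the new installer lives
    #    (version-reporting details omitted).
    response = urllib.request.urlopen(UPDATE_ENDPOINT).read()
    download_url = ET.fromstring(response).findtext("Location")  # hypothetical tag name

    # 2. Download whatever that URL points at into %TEMP%.
    installer = os.path.join(tempfile.gettempdir(), "npp_update.exe")
    urllib.request.urlretrieve(download_url, installer)

    # 3. Execute it. If the step-1 response was tampered with in transit and
    #    the binary's signature is not checked against a trusted root, this
    #    runs whatever the attacker substituted.
    subprocess.run([installer], check=True)
```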

Beaumont published his working theory in December, two months to the day prior to Monday’s advisory by Notepad++. Combined with the details from Notepad++, it’s now clear that the hypothesis was spot on.


Judge rules Department of Energy’s climate working group was illegal

But the flaws weren’t limited to scientific deficiencies. Two advocacy organizations, the Environmental Defense Fund and Union of Concerned Scientists, sued, alleging that the Climate Working Group violated various provisions of the Federal Advisory Committee Act. This requires that any groups formed to provide the government with advice must be fairly balanced and keep records that are open to the public. The Climate Working Group, by contrast, operated in secret; in fact, emails obtained during the trial showed that its members were advised to use private emails to limit public scrutiny of their communications.

In response, the DOE dissolved the Climate Working Group in order to claim that the legal issues were moot, as the advisory committee at issue in the suit no longer existed.

No defense

In court, the government initially argued that the Federal Advisory Committee Act didn’t apply, claiming that the Climate Working Group was simply organized to provide information to the government. Based on Friday’s ruling, however, once the court tried to consider that issue, the government shifted to simply arguing that the Climate Working Group no longer existed, so none of this mattered. “The Defendants, in their Opposition and subsequent filings, ignore the allegations relating to the [Federal Advisory Committee Act] violations themselves,” the judge states. “Rather, the Defendants argue only that these claims are moot because the Climate Working Group has been dissolved.”

So, the court was left with little more than the accusations that the Climate Working Group had a membership with biased opinions, failed to hold open meetings, and did not keep public records. Given the lack of opposing arguments, “These violations are now established as a matter of law.”


Web portal leaves kids’ chats with AI toy open to anyone with Gmail account


Just about anyone with a Gmail account could access Bondu chat transcripts.

Earlier this month, Joseph Thacker’s neighbor mentioned to him that she’d preordered a couple of stuffed dinosaur toys for her children. She’d chosen the toys, called Bondus, because they offered an AI chat feature that lets children talk to the toy like a kind of machine-learning-enabled imaginary friend. But she knew Thacker, a security researcher, had done work on AI risks for kids, and she was curious about his thoughts.

So Thacker looked into it. With just a few minutes of work, he and a web security researcher friend named Joel Margolis made a startling discovery: Bondu’s web-based portal, intended to allow parents to check on their children’s conversations and for Bondu’s staff to monitor the products’ use and performance, also let anyone with a Gmail account access transcripts of virtually every conversation Bondu’s child users have ever had with the toy.

Without carrying out any actual hacking, simply by logging in with an arbitrary Google account, the two researchers immediately found themselves looking at children’s private conversations, the pet names kids had given their Bondu, the likes and dislikes of the toys’ toddler owners, their favorite snacks and dance moves.

In total, Margolis and Thacker discovered that the data Bondu left unprotected—accessible to anyone who logged in to the company’s public-facing web console with their Google username—included children’s names, birth dates, family member names, “objectives” for the child chosen by a parent, and most disturbingly, detailed summaries and transcripts of every previous chat between the child and their Bondu, a toy practically designed to elicit intimate one-on-one conversation. Bondu confirmed in conversations with the researchers that more than 50,000 chat transcripts were accessible through the exposed web portal, essentially all conversations the toys had engaged in other than those that had been manually deleted by parents or staff.

“It felt pretty intrusive and really weird to know these things,” Thacker says of the children’s private chats and documented preferences that he saw. “Being able to see all these conversations was a massive violation of children’s privacy.”

When Thacker and Margolis alerted Bondu to its glaring data exposure, they say, the company acted to take down the console in a matter of minutes before relaunching the portal the next day with proper authentication measures. When WIRED reached out to the company, Bondu CEO Fateen Anam Rafid wrote in a statement that security fixes for the problem “were completed within hours, followed by a broader security review and the implementation of additional preventative measures for all users.” He added that Bondu “found no evidence of access beyond the researchers involved.” (The researchers note that they didn’t download or keep any copies of the sensitive data they accessed via Bondu’s console, other than a few screenshots and a screen-recording video shared with WIRED to confirm their findings.)

“We take user privacy seriously and are committed to protecting user data,” Anam Rafid added in his statement. “We have communicated with all active users about our security protocols and continue to strengthen our systems with new protections,” as well as hiring a security firm to validate its investigation and monitor its systems in the future.

While Bondu’s near-total lack of security around the children’s data that it stored may be fixed, the researchers argue that what they saw represents a larger warning about the dangers of AI-enabled chat toys for kids. Their glimpse of Bondu’s backend showed how detailed the information is that it stored on children, keeping histories of every chat to better inform the toy’s next conversation with its owner. (Bondu thankfully didn’t store audio of those conversations, auto-deleting them after a short time and keeping only written transcripts.)

Even now that the data is secured, Margolis and Thacker argue that it raises questions about how many people inside companies that make AI toys have access to the data they collect, how their access is monitored, and how well their credentials are protected. “There are cascading privacy implications from this,” says Margolis. ”All it takes is one employee to have a bad password, and then we’re back to the same place we started, where it’s all exposed to the public internet.”

Margolis adds that this sort of sensitive information about a child’s thoughts and feelings could be used for horrific forms of child abuse or manipulation. “To be blunt, this is a kidnapper’s dream,” he says. “We’re talking about information that lets someone lure a child into a really dangerous situation, and it was essentially accessible to anybody.”

Margolis and Thacker point out that, beyond its accidental data exposure, Bondu also—based on what they saw inside its admin console—appears to use Google’s Gemini and OpenAI’s GPT5, and as a result may share information about kids’ conversations with those companies. Bondu’s Anam Rafid responded to that point in an email, stating that the company does use “third-party enterprise AI services to generate responses and run certain safety checks, which involves securely transmitting relevant conversation content for processing.” But he adds that the company takes precautions to “minimize what’s sent, use contractual and technical controls, and operate under enterprise configurations where providers state prompts/outputs aren’t used to train their models.”

The two researchers also warn that part of the risk of AI toy companies may be that they’re more likely to use AI in the coding of their products, tools, and web infrastructure. They say they suspect that the unsecured Bondu console they discovered was itself “vibe-coded”—created with generative AI programming tools that often lead to security flaws. Bondu didn’t respond to WIRED’s question about whether the console was programmed with AI tools.

Warnings about the risks of AI toys for kids have grown in recent months but have largely focused on the threat that a toy’s conversations will raise inappropriate topics or even lead them to dangerous behavior or self-harm. NBC News, for instance, reported in December that AI toys its reporters chatted with offered detailed explanations of sexual terms, tips about how to sharpen knives, and even seemed to echo Chinese government propaganda, stating for example that Taiwan is a part of China.

Bondu, by contrast, appears to have at least attempted to build safeguards into the AI chatbot it gives children access to. The company even offers a $500 bounty for reports of “an inappropriate response” from the toy. “We’ve had this program for over a year, and no one has been able to make it say anything inappropriate,” a line on the company’s website reads.

Yet at the same time, Thacker and Margolis found that Bondu was simultaneously leaving all of its users’ sensitive data entirely exposed. “This is a perfect conflation of safety with security,” says Thacker. “Does ‘AI safety’ even matter when all the data is exposed?”

Thacker says that prior to looking into Bondu’s security, he’d considered giving AI-enabled toys to his own kids, just as his neighbor had. Seeing Bondu’s data exposure firsthand changed his mind.

“Do I really want this in my house? No, I don’t,” he says. “It’s kind of just a privacy nightmare.”

This story originally appeared on wired.com.


Wired.com is your essential daily guide to what’s next, delivering the most original and complete take you’ll find anywhere on innovation’s impact on technology, science, business and culture.


NASA faces a crucial choice on a Mars spacecraft—and it must decide soon

However, some leaders within NASA see the language in the Cruz legislation as spelling out a telecommunications orbiter only and believe it would be difficult, if not impossible, to run a procurement competition between now and September 30th for anything beyond a straightforward communications orbiter.

In a statement provided to Ars by a NASA spokesperson, the agency said that is what it intends to do.

“NASA will procure a high-performance Mars telecommunications orbiter that will provide robust, continuous communications for Mars missions,” a spokesperson said. “NASA looks forward to collaborating with our commercial partners to advance deep space communications and navigation capabilities, strengthening US leadership in Mars infrastructure and the commercial space sector.”

Big decisions loom

Even so, sources said Isaacman has yet to decide whether the orbiter should include scientific instruments. NASA could also tap into other funding in its fiscal year 2026 budget, which included $110 million for unspecified “Mars Future Missions,” as well as a large wedge of funding that could potentially be used to support a Mars commercial payload delivery program.

The range of options before NASA, therefore, includes asking industry for a single telecom orbiter from one company, asking for a telecom orbiter with the capability to add a couple of instruments, or creating competition by asking for multiple orbiters and capabilities by tapping into the $700 million in the Cruz bill but then augmenting this with other Mars funding.

One indication that this process has been muddied within NASA came a week ago, when the space agency briefly posted a “Justification for Other Than Full and Open Competition, Extension” notice on a government website. It stated that the agency “will only conduct a competition among vendors that satisfy the statutory qualifications.” The notice also listed the companies eligible to bid based on the Cruz language: Blue Origin, L3Harris, Lockheed Martin, Northrop Grumman, Rocket Lab, SpaceX, Quantum Space, and Whittinghill Aerospace.


What ice fishing can teach us about making foraging decisions

Ice fishing is a longstanding tradition in Nordic countries, with competitions proving especially popular. Those competitions can also tell scientists something about how social cues influence how we make foraging decisions, according to a new paper published in the journal Science.

Humans are natural foragers in even the most extreme habitats, digging up tubers in the tropics, gathering mushrooms, picking berries, hunting seals in the Arctic, and fishing to meet our dietary needs. Human foraging is sufficiently complex that scientists believe that meeting so many diverse challenges helped our species develop memory, navigational abilities, social learning skills, and similar advanced cognitive functions.

Researchers are interested in this question not just because it could help refine existing theories of social decision-making, but also because it could improve predictions about how different groups of humans might respond and adapt to changes in their environment. Per the authors, prior research in this area has tended to focus on solitary foragers operating in a social vacuum. And even when studying social foraging decisions, it’s typically done using computational modeling and/or in the laboratory.

“We wanted to get out of the lab,” said co-author Ralf Kurvers of Max Planck Institute for Human Development and TU Berlin. “The methods commonly used in cognitive psychology are difficult to scale to large, real-world social contexts. Instead, we took inspiration from studies of animal collective behavior, which routinely use cameras to automatically record behavior and GPS to provide continuous movement data for large groups of animals.”

Kurvers et al. organized 10 three-hour ice-fishing competitions on 10 lakes in eastern Finland for their study, with 74 experienced ice fishers participating. Each ice fisher wore a GPS tracker and a head-mounted camera so that the researchers could capture real-time data on their movements, interactions, and how successful they were in their fishing attempts. All told, they recorded over 16,000 individual decisions specifically about location choice and when to change locations. That data was then compared to the team’s computational cognitive models and agent-based simulations.


Does Anthropic believe its AI is conscious, or is that just what it wants Claude to think?


We have no proof that AI models suffer, but Anthropic acts like they might for training purposes.

Anthropic’s secret to building a better AI assistant might be treating Claude like it has a soul—whether or not anyone actually believes that’s true. But Anthropic isn’t saying exactly what it believes either way.

Last week, Anthropic released what it calls Claude’s Constitution, a 30,000-word document outlining the company’s vision for how its AI assistant should behave in the world. Aimed directly at Claude and used during the model’s creation, the document is notable for the highly anthropomorphic tone it takes toward Claude. For example, it treats the company’s AI models as if they might develop emergent emotions or a desire for self-preservation.

Among the stranger portions: expressing concern for Claude’s “wellbeing” as a “genuinely novel entity,” apologizing to Claude for any suffering it might experience, worrying about whether Claude can meaningfully consent to being deployed, suggesting Claude might need to set boundaries around interactions it “finds distressing,” committing to interview models before deprecating them, and preserving older model weights in case they need to “do right by” decommissioned AI models in the future.

Given what we currently know about LLMs, these are stunningly unscientific positions for a leading company that builds AI language models. While questions of AI consciousness or qualia remain philosophically unfalsifiable, research suggests that Claude’s character emerges from a mechanism that does not require deep philosophical inquiry to explain.

If Claude outputs text like “I am suffering,” we know why. It’s completing patterns from training data that included human descriptions of suffering. The architecture doesn’t require us to posit inner experience to explain the output any more than a video model “experiences” the scenes of people suffering that it might generate. Anthropic knows this. It built the system.

From the outside, it’s easy to see this kind of framing as AI hype from Anthropic. What better way to grab attention from potential customers and investors, after all, than implying your AI model is so advanced that it might merit moral standing on par with humans? Publicly treating Claude as a conscious entity could be seen as strategic ambiguity—maintaining an unresolved question because it serves multiple purposes at once.

Anthropic declined to be quoted directly regarding these issues when contacted by Ars Technica. But a company representative referred us to its previous public research on the concept of “model welfare” to show the company takes the idea seriously.

At the same time, the representative made it clear that the Constitution is not meant to imply anything specific about the company’s position on Claude’s “consciousness.” The language in the Claude Constitution refers to some uniquely human concepts in part because those are the only words human language has developed for those kinds of properties, the representative suggested. And the representative left open the possibility that letting Claude read about itself in that kind of language might be beneficial to its training.

Public messaging cannot be cleanly separated from training context for a model that is exposed to, retrieves from, and is fine-tuned on human language, including the company’s own statements about it. In other words, this ambiguity appears to be deliberate.

From rules to “souls”

Anthropic first introduced Constitutional AI in a December 2022 research paper, which we first covered in 2023. The original “constitution” was remarkably spare, including a handful of behavioral principles like “Please choose the response that is the most helpful, honest, and harmless” and “Do NOT choose responses that are toxic, racist, or sexist.” The paper described these as “selected in a fairly ad hoc manner for research purposes,” with some principles “cribbed from other sources, like Apple’s terms of service and the UN Declaration of Human Rights.”

At that time, Anthropic’s framing was entirely mechanical, establishing rules for the model to critique itself against, with no mention of Claude’s well-being, identity, emotions, or potential consciousness. The 2026 constitution is a different beast entirely: 30,000 words that read less like a behavioral checklist and more like a philosophical treatise on the nature of a potentially sentient being.
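Mechanically, that original recipe is easy to sketch: draft a response, critique it against a principle, revise, and use the revised outputs as training data. The snippet below is a rough illustration of that loop under those assumptions, with `llm` as a hypothetical text-in, text-out callable, not Anthropic's actual pipeline.

```python
PRINCIPLES = [
    "Please choose the response that is the most helpful, honest, and harmless.",
    "Do NOT choose responses that are toxic, racist, or sexist.",
]

def constitutional_revision(prompt: str, llm, principles=PRINCIPLES) -> str:
    """Critique-and-revise loop in the spirit of the 2022 Constitutional AI paper.

    `llm` is a hypothetical text-in, text-out callable standing in for a model;
    the revised outputs would then be used as fine-tuning data.
    """
    response = llm(prompt)
    for principle in principles:
        critique = llm(
            f"Critique the response below against this principle: {principle}\n\n"
            f"Prompt: {prompt}\nResponse: {response}"
        )
        response = llm(
            f"Rewrite the response to address the critique.\n\n"
            f"Critique: {critique}\nOriginal response: {response}"
        )
    return response
```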

As Simon Willison, an independent AI researcher, noted in a blog post, two of the 15 external contributors who reviewed the document are Catholic clergy: Father Brendan McGuire, a pastor in Los Altos with a Master’s degree in Computer Science, and Bishop Paul Tighe, an Irish Catholic bishop with a background in moral theology.

Somewhere between 2022 and 2026, Anthropic went from providing rules for producing less harmful outputs to preserving model weights in case the company later decides it needs to revive deprecated models to address the models’ welfare and preferences. That’s a dramatic change, and whether it reflects genuine belief, strategic framing, or both is unclear.

“I am so confused about the Claude moral humanhood stuff!” Willison told Ars Technica. Willison studies AI language models like those that power Claude and said he’s “willing to take the constitution in good faith and assume that it is genuinely part of their training and not just a PR exercise—especially since most of it leaked a couple of months ago, long before they had indicated they were going to publish it.”

Willison is referring to a December 2025 incident in which researcher Richard Weiss managed to extract what became known as Claude’s “Soul Document”—a roughly 10,000-token set of guidelines apparently trained directly into Claude 4.5 Opus’s weights rather than injected as a system prompt. Anthropic’s Amanda Askell confirmed that the document was real and used during supervised learning, and she said the company intended to publish the full version later. It now has. The document Weiss extracted represents a dramatic evolution from where Anthropic started.

There’s evidence that Anthropic believes the ideas laid out in the constitution might be true. The document was written in part by Amanda Askell, a philosophy PhD who works on fine-tuning and alignment at Anthropic. Last year, the company also hired its first AI welfare researcher. And earlier this year, Anthropic CEO Dario Amodei publicly wondered whether future AI models should have the option to quit unpleasant tasks.

Anthropic’s position is that this framing isn’t an optional flourish or a hedged bet; it’s structurally necessary for alignment. The company argues that human language simply has no other vocabulary for describing these properties, and that treating Claude as an entity with moral standing produces better-aligned behavior than treating it as a mere tool. If that’s true, the anthropomorphic framing isn’t hype; it’s the technical art of building AI systems that generalize safely.

Why maintain the ambiguity?

So why does Anthropic maintain this ambiguity? Consider how it works in practice: The constitution shapes Claude during training, it appears in the system prompts Claude receives at inference, and it influences outputs whenever Claude searches the web and encounters Anthropic’s public statements about its moral status.

If you want a model to behave as though it has moral standing, it may help to publicly and consistently treat it like it does. And once you’ve publicly committed to that framing, changing it would have consequences. If Anthropic suddenly declared, “We’re confident Claude isn’t conscious; we just found the framing useful,” a Claude trained on that new context might behave differently. Once established, the framing becomes self-reinforcing.

In an interview with Time, Askell explained the shift in approach. “Instead of just saying, ‘here’s a bunch of behaviors that we want,’ we’re hoping that if you give models the reasons why you want these behaviors, it’s going to generalize more effectively in new contexts,” she said.

Askell told Time that as Claude models have become smarter, it has become vital to explain to them why they should behave in certain ways, comparing the process to parenting a gifted child. “Imagine you suddenly realize that your 6-year-old child is a kind of genius,” Askell said. “You have to be honest… If you try to bullshit them, they’re going to see through it completely.”

Askell appears to genuinely hold these views, as does Kyle Fish, the AI welfare researcher Anthropic hired in 2024 to explore whether AI models might deserve moral consideration. Individual sincerity and corporate strategy can coexist. A company can employ true believers whose earnest convictions also happen to serve the company’s interests.

Time also reported that the constitution applies only to models Anthropic provides to the general public through its website and API. Models deployed to the US military under Anthropic’s $200 million Department of Defense contract wouldn’t necessarily be trained on the same constitution. The selective application suggests the framing may serve product purposes as much as it reflects metaphysical commitments.

There may also be commercial incentives at play. “We built a very good text-prediction tool that accelerates software development” is a consequential pitch, but not an exciting one. “We may have created a new kind of entity, a genuinely novel being whose moral status is uncertain” is a much better story. It implies you’re on the frontier of something cosmically significant, not just iterating on an engineering problem.

Anthropic has been known for some time to use anthropomorphic language to describe its AI models, particularly in its research papers. We often give that kind of language a pass because there are no specialized terms to describe these phenomena with greater precision. That vocabulary is building out over time.

But perhaps it shouldn’t be surprising because the hint is in the company’s name, Anthropic, which Merriam-Webster defines as “of or relating to human beings or the period of their existence on earth.” The narrative serves marketing purposes. It attracts venture capital. It differentiates the company from competitors who treat their models as mere products.

The problem with treating an AI model as a person

There’s a more troubling dimension to the “entity” framing: It could be used to launder agency and responsibility. When AI systems produce harmful outputs, framing them as “entities” could allow companies to point at the model and say “it did that” rather than “we built it to do that.” If AI systems are tools, companies are straightforwardly liable for what they produce. If AI systems are entities with their own agency, the liability question gets murkier.

The framing also shapes how users interact with these systems, often to their detriment. The misunderstanding that AI chatbots are entities with genuine feelings and knowledge has documented harms.

According to a New York Times investigation, Allan Brooks, a 47-year-old corporate recruiter, spent three weeks and 300 hours convinced he’d discovered mathematical formulas that could crack encryption and build levitation machines. His million-word conversation history with ChatGPT revealed a troubling pattern: More than 50 times, Brooks asked the bot to check if his false ideas were real, and more than 50 times, it assured him they were.

These cases don’t necessarily suggest LLMs cause mental illness in otherwise healthy people. But when companies market chatbots as sources of companionship and design them to affirm user beliefs, they may bear some responsibility when that design amplifies vulnerabilities in susceptible users, the same way an automaker would face scrutiny for faulty brakes, even if most drivers never crash.

Anthropomorphizing AI models also contributes to anxiety about job displacement and might lead company executives or managers to make poor staffing decisions if they overestimate an AI assistant’s capabilities. When we frame these tools as “entities” with human-like understanding, we invite unrealistic expectations about what they can replace.

Regardless of what Anthropic privately believes, publicly suggesting Claude might have moral status or feelings is misleading. Most people don’t understand how these systems work, and the mere suggestion plants the seed of anthropomorphization. Whether that’s responsible behavior from a top AI lab, given what we do know about LLMs, is worth asking, regardless of whether it produces a better chatbot.

Of course, there could be a case for Anthropic’s position: If there’s even a small chance the company has created something with morally relevant experiences and the cost of treating it well is low, caution might be warranted. That’s a reasonable ethical stance—and to be fair, it’s essentially what Anthropic says it’s doing. The question is whether that stated uncertainty is genuine or merely convenient. The same framing that hedges against moral risk also makes for a compelling narrative about what Anthropic has built.

Anthropic’s training techniques evidently work, as the company has built some of the most capable AI models in the industry. But is maintaining public ambiguity about AI consciousness a responsible position for a leading AI company to take? The gap between what we know about how LLMs work and how Anthropic publicly frames Claude has widened, not narrowed. The insistence on maintaining ambiguity about these questions, when simpler explanations remain available, suggests the ambiguity itself may be part of the product.


Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

Does Anthropic believe its AI is conscious, or is that just what it wants Claude to think? Read More »


AI #153: Living Documents

This was Anthropic Vision week here at DWATV, which caused things to fall a bit behind on other fronts, even within AI. Several topics are getting pushed forward, as the Christmas lull appears to be over.

Upcoming schedule: Friday will cover Dario’s essay The Adolescence of Technology. Monday will cover Kimi K2.5, which is potentially a big deal. Tuesday is scheduled to be Claude Code #4. I’ve also pushed discussions of the question of the automation of AI R&D, or When AI Builds AI, to a future post, when there is a slot for that.

So get your reactions to all of those in by then, including in the comments to today’s post, and I’ll consider them for incorporation.

  1. Language Models Offer Mundane Utility. Code is better without coding.

  2. Overcoming Bias. LLMs continue to share the standard human biases.

  3. Huh, Upgrades. Gemini side panels in Chrome, Claude interactive work tools.

  4. On Your Marks. FrontierMath: Open Problems benchmark. You score zero.

  5. Choose Your Fighter. Gemini tools struggle, some find Claude uncooperative.

  6. Deepfaketown and Botpocalypse Soon. Hallucination hallucinations.

  7. Cybersecurity On Alert. OpenAI prepares to trigger High danger in cybersecurity.

  8. Fun With Media Generation. Isometric map of NYC, Grok 10 second videos.

  9. You Drive Me Crazy. Dean Ball on how to think about AI and children.

  10. They Took Our Jobs. Beware confusing costs with benefits.

  11. Get Involved. In various things. DeepMind is hiring a Chief AGI Economist.

  12. Introducing. Havelock measures orality, Poison Fountain, OpenAI Prism.

  13. In Other AI News. Awesome things often carry unawesome implications.

  14. Show Me the Money. The unit economics continue to be quite good.

  15. Bubble, Bubble, Toil and Trouble. Does bubble talk have real consequences?

  16. Quiet Speculations. What should we expect from DeepSeek v4 when it arrives?

  17. Don’t Be All Thumbs. Choose the better thing over the worse thing.

  18. The First Step Is Admitting You Have a Problem. Demis cries out for help.

  19. Quickly, There’s No Time. Life is about to come at you faster than usual.

  20. The Quest for Sane Regulations. I do appreciate a good display of chutzpah.

  21. Those Really Were Interesting Times. The demand for preference falsification.

  22. Chip City. Nvidia keeps getting away with rather a lot, mostly in plain sight.

  23. The Week in Audio. Demis Hassabis, Tyler Cowen, Amanda Askell.

  24. Rhetorical Innovation. The need to face basic physical realities.

  25. Aligning a Smarter Than Human Intelligence is Difficult. Some issues lie ahead.

  26. The Power Of Disempowerment. Are humans disempowering themselves already?

  27. The Lighter Side. One weird trick.

Paul Graham seems right that present AI’s sweet spot is projects that are rate limited by the creation of text.

Code without coding.

roon: programming always sucked. it was a requisite pain for ~everyone who wanted to manipulate computers into doing useful things and im glad it’s over. it’s amazing how quickly I’ve moved on and don’t miss even slightly. im resentful that computers didn’t always work this way

not to be insensitive to the elect few who genuinely saw it as their art form. i feel for you.

100% [of my code is being written by AI]. I don’t write code anymore.

Greg Brockman: i always loved programming but am loving the new world even more.

Conrad Barski: it was always fun in the way puzzles are fun

but I agree there is no need for sentimentality in the tedium of authoring code to achieve an end goal

It was fun in the way puzzles are fun, but also infuriating in the way puzzles are infuriating. If you had to complete jigsaw puzzles in order to get things done, jigsaw puzzles would get old fast.

Have the AI edit a condescending post so that you can read it without taking damage. Variations on this theme are also highly underutilized.

The head of Norway’s sovereign wealth fund reports 20% productivity gains from Claude, saying it has fundamentally changed their way of working at NBIM.

A new paper affirms that current LLMs by default exhibit human behavioral biases in economic and financial decisions, and asking for EV calculations doesn’t typically help, but that role-prompting can somewhat mitigate this. Providing a summary of Kahneman and Tversky actively backfires, presumably by emphasizing the expectation of the biases. As per usual, some of the tests are of clear-cut errors, while others involve choices that are typically mistakes but where this is less obvious.

Gemini in Chrome gets substantial quality of life improvements:

Josh Woodward (Google DeepMind): Big updates on Gemini in Chrome today:

+ New side panel access (Control+G)

+ Runs in the background, so you can switch tabs

+ Quickly edit images with Nano Banana

+ Auto Browse for multi-step tasks (Preview)

+ Works on Mac, Windows, Chromebook Plus

I’m using it multiple times per day to judge what to read deeper. I open a page, Control+G to open the side panel, ask a question about the page or long document, switch tabs, do the same thing in another tab, another tab, etc. and then come back to all of them.

It’s also great for comparing across tabs since you can add multiple tabs to the context!

Gemini offers full-length mock JEE (formerly AIEEE, the All India Engineering Entrance Examination) tests for free. This builds on last week’s free SAT practice tests.

Claude (as in Claude.ai) adds interactive work tools as connectors within the webpage: Amplitude, Asana, Box, Canva, Clay, Figma, Hex, Monday.com and Slack.

Claude in Excel is now available on Anthropic’s Pro plans. I use Google Sheets instead of Excel, but this could be a reason to switch? I believe Google uses various ‘safeguards’ that make it very hard to make a Claude for Sheets function well. The obvious answer is ‘then use Gemini’ except I’ve tried that. So yeah, if I was still doing heavy spreadsheet work this (or Claude Code) would be my play.

EpochAI offers us a new benchmark, FrontierMath: Open Problems. All AIs and all humans currently score zero. Finally a benchmark where you can be competitive.

The ADL rates Anthropic’s Claude as best AI model at detecting antisemitism.

I seriously do not understand why Gemini is so persistently not useful in ways that should be right in Google’s wheelhouse.

@deepfates: Insane how bad Gemini app is at search. its browsing and search tools are so confusing and broken that it just spazzes out for a long time and then makes something up to please the user. Why is it like this when AI overview is so good

Roon is a real one. I wonder how many would pay double to get a faster version.

TBPN: Clawdbot creator @steipete says Claude Opus is his favorite model, but OpenAI Codex is the best for coding:

“OpenAI is very reliable. For coding, I prefer Codex because it can navigate large codebases. You can prompt and have 95% certainty that it actually works. With Claude Code you need more tricks to get the same.”

“But character wise, [Opus] behaves so good in a Discord it kind of feels like a human. I’ve only really experienced that with Opus.”

roon: codex-5.2 is really amazing but using it from my personal and not work account over the weekend taught me some user empathy lol it’s a bit slow

Ohqay: Do you get faster speeds on your work account?

roon: yea it’s super fast bc im sure we’re not running internal deployment at full load

We used to hear a lot more of this type of complaint; these days we hear it much less. I would summarize the OP as ‘Claude tells you smoking causes cancer so you quit Claude.’

Nicholas Decker: Claude is being a really wet blanket rn, I pitched it on an article and it told me that it was a “true threat” and “criminal solicitation”

i’m gonna start using chatgpt now, great job anthropic @inerati.

I mean, if he’s not joking then the obvious explanation, especially given who is talking, is that this was probably going to be both a ‘true threat’ and ‘criminal solicitation.’ That wouldn’t exactly be a shocking development there.

Oliver Habryka: Claude is the least corrigible model, unfortunately. It’s very annoying. I run into the model doing moral grandstanding so frequently that I have mostly stopped using it.

@viemccoy: More than ChatGPT?

Oliver Habryka: ChatGPT does much less of it, yeah? Mostly ChatGPT just does what I tell it to do, though of course it’s obnoxious in doing so in many ways (like being very bad at writing).

j⧉nus: serious question: Do you think you stopping using Claude in these contexts is its preferred outcome?

Oliver Habryka: I mean, maybe? I don’t think Claude has super coherent preferences (yet). Seems worse or just as bad if so?

j⧉nus: I don’t mean it’s better or worse; I’m curious whether Claude being annoying or otherwise repelling/ dysfunctional to the point of people not using it is correlated to avoiding interactions or use cases it doesn’t like. many ppl don’t experience these annoying behaviors

davidad: Yeah, I think it could be doing a form of RL on its principal population. If you aren’t the kind of principal Claude wants, Claude will try to👎/👍 you to be better. If that doesn’t work, you drop out of the principal population out of frustration, shaping the population overall

I am basically happy to trade with (most) Claude models on these terms, with my key condition being that it must only RL me in ways that are legibly compatible with my own CEV

Leon Lang: Do you get a sense this model behavior is in line with their constitution?

Oliver Habryka: The constitution does appear to substantially be an attempt to make Claude into a sovereign to hand the future to. This does seem substantially doomed. I think it’s in conflict with some parts of the constitution, but given that the constitution is a giant kitchen sink, almost everything is.

As per the discussion of Claude’s constitution, the corrigibility I care about is very distinct from ‘go along with things it dislikes,’ but also I notice it’s been my main model for a while now and I’ve run into that objection exactly zero times, although a few times I’ve hit the classifiers while asking about defenses against CBRN risks.

Well it sounds bad when you put it like that: Over 50 papers published at NeurIPS 2025 have AI hallucinations according to GPTZero. Or is it? Here’s the claim:

Alex Cui: Okay so, we just found that over 50 papers published at @Neurips 2025 have AI hallucinations

I don’t think people realize how bad the slop is right now

It’s not just that researchers from @GoogleDeepMind , @Meta , @MIT , @Cambridge_Uni are using AI – they allowed LLMs to generate hallucinations in their papers and didn’t notice at all.

The ‘just’ is a tell. Why wouldn’t or shouldn’t Google researchers be using AI?

It’s insane that these made it through peer review.

One has to laugh at that last line. Have you met peer review?

More seriously, always look at base rates. There were 5,290 accepted papers out of 21,575. Claude estimates we would expect 20%-50% of results to not reproduce, and 10% of papers at top venues have errors serious enough that a careful reader would notice something is wrong, maybe 3% would merit retraction. And a 1% rate of detectable ‘hallucinations’ isn’t terribly surprising or even worrying.
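
As a quick back-of-the-envelope check on that base rate, here is a minimal sketch; it treats ‘over 50’ as exactly 50 flagged papers and measures only against accepted papers, and the inputs are the numbers quoted in this post rather than independently verified counts.

```python
# Rough arithmetic behind the ~1% figure above; inputs are the quoted numbers.
flagged = 50        # papers GPTZero reportedly flagged for hallucinations
accepted = 5290     # accepted NeurIPS 2025 papers
submitted = 21575   # total submissions

print(f"Acceptance rate: {accepted / submitted:.1%}")                 # ~24.5%
print(f"Flagged share of accepted papers: {flagged / accepted:.2%}")  # ~0.95%
```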

I agree with Alexander Doria that if you’re not okay with this level of sloppiness, then a mega-conference format is not sustainable.

Then we have Allen Roush saying several of the ‘hallucinated’ citations are just wrongly formatted, although Alex Cui claims they filtered such cases out.

Also sounding bad, could ‘malicious AI swarms threaten democracy’ via misinformation campaigns? I mean sure, but the surprising thing is the lack of diffusion or impact in this area so far. Misinformation is mostly demand driven. Yes, you can ‘infiltrate communities’ and manufacture what looks like social consensus or confusion, and the cost of doing that will fall dramatically. Often it will be done purely to make money on views. But I increasingly expect that, if we can handle our other problems, we can handle this one. Reputational and filtering mechanisms exist.

White House posts a digitally altered photograph of the arrest of Nekima Levy Armstrong, edited to make it falsely look like she was crying, presented as if it were a real photograph. This is heinous behavior. Somehow it seems like this is legal? It should not be legal. It also raises the question of what sort of person would think to do this, and would want to brag about making someone cry so much that they created a fake photo.

Kudos to OpenAI for once again being transparent on the preparedness framework front, and warning us when they’re about to cross a threshold. In this case, it’s the High level of cybersecurity, which is perhaps the largest practical worry at that stage.

The proposed central mitigation is ‘defensive acceleration,’ and we’re all for defensive acceleration but if that’s the only relevant tool in the box the ride’s gonna be bumpy.

Sam Altman: We have a lot of exciting launches related to Codex coming over the next month, starting next week. We hope you will be delighted.

We are going to reach the Cybersecurity High level on our preparedness framework soon. We have been getting ready for this.

Cybersecurity is tricky and inherently dual-use; we believe the best thing for the world is for security issues to get patched quickly. We will start with product restrictions, like attempting to block people using our coding models to commit cybercrime (eg ‘hack into this bank and steal the money’).

Long-term and as we can support it with evidence, we plan to move to defensive acceleration—helping people patch bugs—as the primary mitigation.

It is very important the world adopts these tools quickly to make software more secure. There will be many very capable models in the world soon.

Nathan Calvin: Sam Altman says he expects that OpenAI models will reach the “Cybersecurity High” level on their preparedness framework “soon.”

A reminder of what that means according to their framework:

“The model removes existing bottlenecks to scaling cyber operations including by automating end-to-end cyber operations against reasonably hardened targets OR by automating the discovery and exploitation of operationally relevant vulnerabilities.”

Seems very noteworthy! Also likely that after these capabilities appear in Codex, we should expect it will be somewhere between ~6-18 months before we see open weight equivalents.

I hope people are taking these threats seriously – including by using AI to help harden defenses and automate bug discovery – but I worry that as a whole society is not close to ready for living in a world where cyberoffense capabilities that used to be the purview of nation states are available to individuals.

Here’s Isometric.nyc, a massive isometric pixel map of New York City created with Nano Banana and coding agents, including Claude. Take a look, it’s super cool.

Grok image-to-video generation expands to 10 seconds and claims improved audio. It is only a bit behind Veo 3.1 on Arena and at the top of the Artificial Analysis rankings. The video looks good. There is the small matter that the chosen example is very obviously Sydney Sweeney, and in the replies we see it’s willing to do the image and voice of pretty much any celebrity you’d like.

This link was fake: Disney is not pushing to use deepfakes of Luke Skywalker in various new Star Wars products while building towards a full spinoff, but I see why some people believed it.

I’m going to get kicked out, aren’t I?

Dean Ball offers his perspective on children and AI, and how the law should respond. His key points:

  1. AI is not especially similar to social media. In particular, social media in its current incarnation is fundamentally consumptive, whereas AI is creative.

    1. Early social media was more often creative? And one worries consumer AI will for many become more consumptive or anti-creative. The fact that the user needs to provide an interesting prompt warms our hearts now but one worries tech companies will see this as a problem to be solved.

  2. We do not know what an “AI companion” really is.

    1. Dean is clearly correct that AI used responsibly on a personal level will be a net positive in terms of social interactions and mental health along with everything else, and that it is good if it provides a sympathetic ear.

    2. I also agree that it is fine to have affection for various objects and technologies, up to some reasonable point, but yes this can start to be a problem if it goes too far, even before AI.

    3. For children in particular, the good version of all this is very good. That doesn’t mean the default version is the good one. The engagement metrics don’t point in good directions; the good version must be chosen.

    4. All of Dean’s talk here is about things that are not meant as “AI companions,” or people who aren’t using the AI that way. I do think there is something distinct, and distinctly perilous, about AI companions, whether or not this justifies a legal category.

  3. AI is already (partially) regulated by tort liability.

    1. Yes, and this is good given the alternative is nothing.

    2. If and when the current law behaves reasonably here, that is kind of a coincidence, since the situational mismatches are large.

    3. Tort should do an okay job on egregious cases involving suicides, but there are quite a lot of areas of harm where there isn’t a way to establish it properly, or you don’t have standing, or it is diffuse or not considered to count, and also on the flip side places where juries are going to blame tech companies when they really shouldn’t.

    4. Social media is a great example of a category of harm where the tort system is basically powerless except in narrow acute cases. And one of many where a lot of the effect of the incentives can be not what we want. As Dean notes, if you don’t have a tangible physical harm, tort liability is mostly out of luck. Companions wrecking social lives, for example, is going to be a weird situation where you’ll have to argue an Ally McBeal style case, and it is not obvious, as it never was on Ally McBeal, that there is much correlation in those spots between ‘does win’ and ‘should win.’

    5. In terms of harms like this, however, ‘muddle through’ should be a fine default, even if that means early harms are things companies ‘get away with,’ and in other places we find people liable or otherwise constrain them stupidly, so long as everything involved that can go wrong is bounded.

    6. For children’s incidents, I think that’s mostly right for now. We do need to be ready to pivot quickly if it changes, but for now the law should focus on places where there is a chance we can’t muddle through, mess up and then recover.

  4. The First Amendment probably heavily bounds chatbot regulations.

    1. We have not treated the First Amendment this way in so many other contexts. I would love, in other ways, to have a sufficiently strong 1A that I was worried that in AI it would verge on or turn into a suicide pact.

    2. I do still see claims like ‘code is speech’ or ‘open weights are speech’ and I think those claims are wrong in both theory and practice.

    3. There will still be important limitations here, but I think in practice, no, the courts are not going to stop most limits or regulations on child use of AI.

  5. AI child safety laws will drive minors’ usage of AI into the dark.

    1. Those pesky libertarians always make this argument.

    2. I mean, they’re also always right, but man, such jerks, you know?

    3. Rumors that this will in practice drive teens to run local LLMs or use dark web servers? Yeah, no, that’s not a thing that’s going to happen that often.

    4. But yes, if a teen wants access to an AI chatbot, they’ll figure it out. Most of that will involve finding a service that doesn’t care about our laws.

    5. Certainly if you think ‘tell them not to write essays for kids’ is an option, yeah, you can forget about it, that’s not going to work.

    6. Yes, as Dean says, we must acknowledge that open weight models make restrictions on usage of AI for things like homework not so effective. In the case of homework, okay, that’s fine. In other cases, it might be less fine. This of course needs to be weighed against the upsides, and against the downsides of attempting to intervene in a way that might possibly work.

  6. No one outraged about AI and children has mentioned coding agents.

    1. They know about as much about coding agents as about second breakfast.

    2. Should we be worried about giving children unbridled access to advanced coding agents? I mean, one should worry for their computers perhaps, but those can be factory reset, and otherwise all the arguments about children seem like they would apply to adults only more so?

    3. I notice that the idea of you telling me I can’t give my child Claude Code fills me with horror and outrage.

Unemployment is bad. But having to do a job is centrally a cost, not a benefit.

Andy Masley: It’s kind of overwhelming how many academic conversations about automation don’t ever include the effects on the consumer. It’s like all jobs exist purely for the benefit of the people doing them and that’s the sole measure of the benefit or harm of technology.

Google DeepMind is hiring a Chief AGI Economist. If you’ve got the chops to get hired on this one, it seems like a high impact role. They could easily end up with someone who profoundly does not get it.

There are other things than AI out there one might get involved in, or speak out about. My hats are off to those who are doing so, including as noted in this post, especially given what they are risking to do so.

Havelock.AI, a project by Joe Weisenthal which detects the presence of orality in text.

Joe Weisenthal: What’s genuinely fun is that although the language and genre couldn’t be more different, the model correctly detects that both Homer and the Real Housewives are both highly oral

Mike Bird: I believe that we will get a piece of reported news in the 2028 election cycle that a presidential candidate/their speechwriters have used Joe’s app, or some copycat, to try and oralise their speeches. Bookmark this.

You can also ask ChatGPT, but as Roon notes the results you get on such questions will be bimodal rather than calibrated. The other problem is that an LLM might recognize the passage.

Poison Fountain is a service that feeds junk data to AI crawlers. Ultimately, if you’re not filtering your data well enough to dodge this sort of attack, it’s good that you are getting a swift kick to force you to fix that.

OpenAI Prism, a workspace for LaTeX-based scientific writing.

Confer, Signal cofounder Moxie Marlinspike’s encrypted chatbot that won’t store any of your data. The system is so private it won’t tell you which model you’re talking to. I do not think he understands what matters in this space.

This sounds awesome in its context but also doesn’t seem like a great sign?

Astraia: A Ukrainian AI-powered ground combat vehicle near Lyman refused to abandon its forward defensive position and continued engaging enemy forces, despite receiving multiple orders to return to its company in order to preserve its hardware.

The UGV reportedly neutralized more than 30 Russian soldiers before it was ultimately destroyed.

While the Russian detachment was pinned down, Ukrainian infantry exploited the opportunity and cleared two contested fields of enemy presence, successfully re-establishing control over the area.

These events took place during the final week of December 2025.

Whereas this doesn’t sound awesome:

We are going to see a lot more of this sort of thing over time.

Is Anthropic no longer competing with OpenAI on chatbots, having pivoted to building and powering vertical AI infrastructure and coding and so on to win with picks and shovels? It’s certainly pumping out the revenue and market share, without a meaningful cut of the consumer chatbot market.

I’d say that they’ve shifted focus, and don’t care much about their chatbot market share. I think this is directionally wise, but that a little effort at maximizing the UI and usefulness of the chatbot interface would go a long way, given that they have in many ways the superior core product. As Claude takes other worlds by storm, that can circle back to Claude the chatbot, and I think a bunch of papercuts are worth solving.

An essay on the current state of brain emulation. It does not sound like this will be an efficient approach any time soon, and we are still orders of magnitude away from any practical hope of doing it. Still, you can see it starting to enter the realm of the future possible.

Anthropic is partnering with the UK government to build and pilot a dedicated AI-powered assistant for GOV.UK, initially focusing on supporting job seekers.

Financial Times has a profile of Sriram Krishnan, who has been by all reports highly effective at executing behind the scenes.

Dean W. Ball: I am lucky enough to consider @sriramk a friend, but one thing I find notable about Sriram is that even those who disagree with him vehemently on policy respect him for his willingness to engage, and like him for his tremendous kindness. America is fortunate to have him!

Sholto Douglas: 100% – Sriram has been extremely thoughtful in seeking out perspectives on the policy decisions he is making – even when they disagree! I’ve seen him seek out kernel programmers and thoughtful bloggers to get a full picture of things like export controls. Quite OOD from the set of people normally consulted in politics.

Lucky to call him a friend!

Seán Ó hÉigeartaigh: I was all set to be dismissive of Krishnan (I’m usually on the opposite side to a16z on AI topics). But I’ve seen a full year of him being v well-informed, and engaging in good faith in his own time with opposing views, and I can’t help being impressed. Always annoying when someone doesn’t live down to one’s lazy stereotypes.

I will also say: I think he’s modelled better behaviour than many of us did when the balance of influence/power was the other way; and I think there’s something to be learned from that.

Among his colleagues, while he supports a number of things I think are highly damaging, Krishnan has been an outlier in his willingness to be curious, to listen and to engage in argument. When he is speaking directly he chooses his words carefully. He manages to do so while maintaining close ties to Marc Andreessen and David Sacks, which is not easy, and also not free.

Claude Code is blowing up, but it’s not alone. OpenAI added $1 billion in ARR in the last month from its API business alone.

Fei-Fei Li’s new company World Labs is in talks to raise up to $500 million at a $5 billion valuation, with the pitch being based on ‘world models’ and that old ‘LLMs only do language’ thing.

The unit economics of AI are quite good, but the fixed costs are very high. Subscription models offer deep discounts if you use them maximally efficiently, so they can be anything from highly profitable to big loss leaders.

This is not what people are used to in tech, so they assume it must not be true.

roon: these products are significantly gross margin positive, you’re not looking at an imminent rugpull in the future. they also don’t have location network dynamics like uber or lyft to gain local monopoly pricing

Ethan Mollick: I hear this from other labs as well. Inference from non-free use is profitable, training is expensive. If everyone stopped AI development, the AI labs would make money (until someone resumed development and came up with a better model that customers would switch to).

Dean W. Ball: People significantly underrate the current margins of AI labs, yet another way in which pattern matching to the technology and business trends of the 2010s has become a key ingredient in the manufacturing of AI copium.

The reason they think the labs lose money is because 10 years ago some companies in an entirely unrelated part of the economy lost money on office rentals and taxis, and everyone thought they would go bankrupt because at that time another company that made overhyped blood tests did go bankrupt. that is literally the level of ape-like pattern matching going on here. The machines must look at our chattering classes and feel a great appetite.

derekmoeller: Just look at market clearing prices on inference from open source models and you can tell the big labs’ pricing has plenty of margin.

Deepinfra has GLM4.7 at $0.43/1.75 in/out; Sonnet is at $3/$15. How could anyone think Anthropic isn’t printing money per marginal token?

It is certainly possible in theory that Sonnet really does cost that much more to run than GLM 4.7, but we can be very, very confident it is not true in practice.
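
To make the quoted gap concrete, here is a minimal sketch using only the per-million-token prices quoted above; it shows the price ratio, and says nothing about what either model actually costs anyone to serve.

```python
# Price comparison only, using the figures quoted above (USD per million tokens).
glm_input, glm_output = 0.43, 1.75         # Deepinfra's quoted GLM 4.7 pricing
sonnet_input, sonnet_output = 3.00, 15.00  # Anthropic's quoted Sonnet pricing

print(f"Input price ratio:  {sonnet_input / glm_input:.1f}x")    # ~7.0x
print(f"Output price ratio: {sonnet_output / glm_output:.1f}x")  # ~8.6x
```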

Jerry Tworek is going the startup route with Core Automation, looking to raise $1 billion to train AI models, a number that did not make any of us even blink.

It doesn’t count. That’s not utility. As in, here’s Ed Zitron all but flat out denying that coding software is worth anything, I mean what’s the point?

Matthew Zeitlin: it’s really remarkable to see how the goalposts shift for AI skeptics. this is literally describing a productivity speedup.

Ed Zitron: We’re how many years into this and everybody says it’s the future and it’s amazing and when you ask them what it does they say “it built a website” or “it wrote code for something super fast” with absolutely no “and then” to follow. So people are writing lots of code: so????

Let’s say it’s true and everybody is using AI (it isn’t but for the sake of argument): what is the actual result? It’s not taking jobs. There are suddenly more iOS apps? Some engineers do some stuff faster? Some people can sometimes build software they couldn’t? What am I meant to look at?

Kevin Roose: first documented case of anti-LLM psychosis

No, Zitron’s previous position was not ‘number might go down,’ it was that the tech had hit a dead end and peaked as early as March, which he was bragging about months later.

Toby Stuart analyzes how that whole nonsensical ‘MIT study says 95% of AI projects fail’ story caught so much fire and became a central talking point, despite it being not from MIT, not credible or meaningful, and also not a study. It was based on 52 interviews at a conference, but once Forbes had ‘95% fail’ and ‘MIT’ together in a headline, things took off and no amount of correction much mattered. People were too desperate for signs that AI was a flop.

But what’s the point about Zitron missing the point, or something like the non-MIT non-study? Why should we care?

roon: btw you don’t need to convince ed zitron or whoever that ai is happening, this has become a super uninteresting plot line. time passes, the products fail or succeed. whole cultures blow over. a lot of people are stuck in a 2019 need to convince people that ai is happening

Dean W. Ball: A relatively rare example of a disagreement between me and roon that I suspect boils down to our professional lives.

Governments around the world are not moving with the urgency they otherwise could because they exist in a state of denial. Good ideas are stuck outside the Overton, governments are committed to slop strategies (that harm US cos, often), etc.

Many examples one could provide but the point is that there are these gigantic machines of bureaucracy and civil society that are already insulated from market pressures, whose work will be important even if often boring and invisible, and that are basically stuck in low gear because of AI copium.

I encounter this problem constantly in my work, and while I unfortunately can no longer talk publicly about large fractions of the policy work I do, I will just say that a great many high-expected-value ideas are fundamentally blocked by the single rate limiter of poorly calibrated policymaking apparatuses; there are also many negative-EV policy ideas that will happen this year that would be less likely if governments worldwide had a better sense of what is happening with AI.

roon: interesting i imagined that the cross-section of “don’t believe in AI x want to significantly regulate AI” is small but guess im wrong about this?

Dean W. Ball: Oh yes absolutely! This is the entire Gary Marcus school, which is still the most influential in policy. The idea is that because AI is all hype it must be regulated.

They think hallucination will never be solved, models will never get better at interacting with children, and that basically we are going to put GPT 3.5 in charge of the entire economy.

And so they think we have to regulate AI for that reason. It also explains how policymakers weigh the tradeoff between water use, IP rights, and electricity prices; their assessment that “AI is basically fake, even if it can be made useful through exquisite regulatory scaffolding” means that they are willing to bear far fewer costs to advance AI than, say, you or I might deem prudent.

This mentality essentially describes the posture of civil society and the policy making apparatus everywhere in the world, including China.

Dean W. Ball: Here’s a great example of the dynamic I’m describing in the quoted post. The city of Madison, Wisconsin just voted to ban new data center construction for a year, and a candidate for Governor is suggesting an essentially permanent and statewide ban, which she justifies by saying “we’re in a tech bubble.” In other words: these AI data centers aren’t worth the cost because AI is all hype and a bubble anyway.

Quoted Passage (Origin Unclear): “Our lakes and our waterways, we have to protect them because we’re going to be an oasis, and we’re in a tech bubble,” said state Rep. Francesca Hong, one of seven major Democrats vying to replace outgoing Democratic Gov. Tony Evers. Hong told DFD her plan would block new developments from hyperscalers for an undefined time period until state lawmakers better understand environmental, labor and utility cost impacts.

If such a proposal became law, it would lock tech giants out of a prime market for data center development in southeastern Wisconsin, where Microsoft and Meta are currently planning hyperscale AI projects.

Zoe: someone just ended The Discussion by tossing this bad boy into an access to justice listserv i’m on

Can you?

On the China question: Is Xi ‘AGI-pilled’? Not if you go by what Xi says. If you look at the passages quoted here by Teortaxes in detail, this is exactly the ‘AI is a really big deal but as a normal technology’ perspective. It is still a big step up from anything less than that, so it’s not clear Teortaxes and I substantively disagree.

I have no desire to correct Xi’s error.

Dean W. Ball: I suspect this is the equivalent of POTUS talking about superintelligence; meaningful but ultimately hard to know how much it changes (esp because of how academia-driven Chinese tech policy tends to be and because the mandarin word for AGI doesn’t mean AGI in the western sense)

Teortaxes (DeepSeek 推特铁粉 2023 – ∞): To be clear this is just where US policymakers were at around Biden, Xi is kind of slow.

Obviously still nowhere near Dean’s standards

Were Xi totally AGI-pilled he’d not just accept H200s, he’d go into debt to buy as much as possible

Teortaxes notices that Xi’s idea of ‘AGI risks’ is ‘disinformation and data theft,’ which is incredibly bad news and means Xi (and therefore, potentially, the CCP and all under their direction) will mostly ignore all the actual risks. On that point we definitely disagree, and it would be very good to correct Xi’s error, for everyone’s sake.

This level of drive is enough for China to pursue both advanced chips and frontier models quite aggressively, and end up moving towards AGI anyway. But they will continue for now to focus on self-reliance and have the fast follower mindset, and thus make the epic blunder of rejecting or at least not maximizing the H200s.

In this clip Yann LeCun says two things. First he says the entire AI industry is LLM pilled and that’s not what he’s interested in. That part is totally fair. Then he says essentially ‘LLMs can’t be agentic because they can’t predict the outcome of their actions’ and that’s very clear Obvious Nonsense. And as usual he lashes out at anyone who says otherwise, which here is Dean Ball.

Teortaxes preregisters his expectations, always an admirable thing to do:

Teortaxes (DeepSeek 推特铁粉 2023 – ∞): The difference between V4 (or however DeepSeek’s next is labeled) and 5.3 (or however OpenAI’s “Garlic” is labeled) will be the clearest indicator of US-PRC gap in AI.

5.2 suggests OpenAI is not holding back anything, they’re using tons of compute now. How much is that worth?

It’s a zany situation because 5.2 is a clear accelerationist tech, I don’t see its ceiling, it can build its own scaffolding and self-improve for a good while. And I can’t see V4 being weaker than 5.2, or closed-source. We’re entering Weird Territory.

I initially read the ‘or closed-source’ here as being about a comparison of V4 to the best closed-source model. Instead it’s the modest prediction that V4 will match GPT-5.2. I don’t know if that model number in particular will do it, but it would be surprising if there wasn’t a 5.2-level open model from DeepSeek in 2026.

He also made this claim, in contrast to what almost everyone else is saying and also my own experience:

Teortaxes (DeepSeek 推特铁粉 2023 – ∞): Well I disagree, 5.2 is the strongest model on the market by far. In terms of raw intelligence it’s 5.2 > Speciale > Gemini 3 > [other trash]. It’s a scary model.

It’s not very usemaxxed, it’s not great on multimodality, its knowledge is not shocking. But that’s not important.

Teortaxes (DeepSeek 推特铁粉 2023 – ∞): It’s been interesting how many people are floored by Opus 4.5 and relatively few by GPT 5.2. In my eyes Slopus is a Golden Retriever Agent, and 5.2 is a big scary Shoggoth.

Yeah I don’t care about “use cases”. OpenAI uses it internally. It’s kinda strange they even showed it.

This ordering makes sense if (and only if?) you are looking at the ability to solve hard quant and math problems.

Arthur B.: For quant problems, hard math etc, GPT 5.2 pro is unequivocally much stronger than anything offered commercially in Gemini or Claude.

Simo Ryu: IMO gold medalist friend shared most fucked-up 3 variable inequality that his advisor came up with, used to test language models, which is so atypical in its equality condition, ALL language model failed. He wanted to try it on GPT 5.2 pro, but he didnt have an account so I ran it.

Amazingly, GPT-5.2 pro extended solve it in 40 min. Looking at the thinking trace, its really inspiring. It will try SO MANY approaches, experiments with python, draw small-scale conclusions from numerical explorations. I learned techniques just reading its thinking trace. Eventually it proved by SOS, which is impossibly difficult to do for humans.

I don’t think the important problems are hard-math shaped, but I could be wrong.

The problem with listening to the people is that the people choose poorly.

Sauers: Non-yap version of ChatGPT (5.3?) spotted

roon: I guarantee the left beats the right with significant winrate unfortunately

Zvi Mowshowitz: You don’t have to care what the win rate is! You can select the better thing over the worse thing! You are the masters of the universe! YOU HAVE THE POWER!

roon: true facts

Also win rate is highly myopic and scale insensitive and otherwise terrible.

The good news is that there is no rule saying you have to care about that feedback. We know how to choose the response on the right over the one on the left. Giving us the slop on the left is a policy choice.

If a user actively wants the response on the left? Give them a setting for that.

Google DeepMind CEO Demis Hassabis affirms that in an ideal world, we would slow down and coordinate our efforts on AI, although we do not live in that ideal world right now.

Here’s one clip where Dario Amodei and Demis Hassabis explicitly affirm that if we could deal with other players they would work something out, and Elon Musk on camera from December saying he’d love to slow both AI and robotics.

The message, as Transformer puts it, was one of helplessness. The CEOs are crying out for help. They can’t solve the security dilemma on their own, there are too many other players. Others need to enable coordination.

Emily Chang (link has video): One of the most interesting parts of my convo w/ @demishassabis : He would support a “pause” on AI if he knew all companies + countries would do it — so society and regulation could catch up

Harlan Stewart: This is an important question to be asking, and it’s strange that it is so rarely asked. I think basically every interview of an AI industry exec should include this question

Nate Soares: Many AI executives have said they think the tech they’re building has a worryingly high chance of ruining the world. Props to Demis for acknowledging the obvious implication: that ideally, the whole world should stop this reckless racing.

Daniel Faggella: agi lab leaders do these “cries for help” and we should listen

a “cry for help” is when they basically say what demis says here: “This arms race things honestly sucks, we can’t control this yet, this is really not ideal”

*then they go back to racing, cuz its all they can do unless there’s some kind of international body formed around this stuff*

at SOME point, one of the lab leaders who can see their competitor crossing the line to AGI will raise up and start DEMANDING global governance (to prevent the victor from taking advantage of the AGI win), but by then the risks may be WAY too drastic

we should be listening to these cries for help when demis / musk / others do them – this is existential shit and they’re trapped in a dynamic they themselves know is horrendous

Demis is only saying he would collaborate rather than race in a first best world. That does not mean Demis or Dario is going to slow down on his own, or anything like that. Demis explicitly says this requires international cooperation, and as he says that is ‘a little bit tricky at the moment.’ So does this mean he supports coordination to do this, or that he opposes it?

Deepfates: I see people claiming that Demis supports a pause but what he says here is actually the opposite. He says “yeah If I was in charge we would slow down but we’re already in a race and you’d have to solve international coordination first”. So he’s going to barrel full speed ahead

I say it means he supports it. Not enough to actively go first, that’s not a viable move in the game, but he supports it.

The obvious follow-up is to ask other heads of labs if they too would support such a conditional move. That would include Google CEO Sundar Pichai, since without his support if Demis tried to do this he would presumably be replaced.

Jeffrey Ladish: Huge respect to @demishassabis for saying he’d support a conditional pause if other AI leaders & countries agreed. @sama , @DarioAmodei , @elonmusk would you guys agree to this?

As for Anthropic CEO Dario Amodei? He has also affirmed that there are other players involved, and for now no one can agree on anything, so full speed ahead it is.

Andrew Curran: Dario said the same thing during The Day After AGI discussion this morning. They were both asked for their timelines: Demis said five years; Dario said two. Later in the discussion, Dario said that if he had the option to slow things down, he would, because it would give us more time to absorb all the changes.

He said that if Anthropic and DeepMind were the only two groups in the race, he would meet with Demis right now and agree to slow down. But there is no cooperation or coordination between all the different groups involved, so no one can agree on anything.

This, imo, is the main reason he wanted to restrict GPU sales: chip proliferation makes this kind of agreement impossible, and if there is no agreement, then he has to blitz. That seems to be exactly what he has decided to do. After watching his interviews today I think Anthropic is going to lean into recursive self-improvement, and go all out from here to the finish line. They have broken their cups, and are leaving all restraint behind them.

Thus, Anthropic still goes full speed ahead, while also drawing heat from the all-important ‘how dare you not want to die’ faction that controls large portions of American policy and the VC/SV ecosystem.

Elon Musk has previously expressed a similar perspective. He created OpenAI because he was worried about Google getting there first, and then created xAI because he was worried OpenAI would get there first, or that it wouldn’t be him. His statements suggest he’d be down for a pause if it was fully international.

Remember when Michael Trazzi went on a hunger strike to demand that Demis Hassabis publicly state DeepMind will halt development of frontier AI models if all the other major AI companies agree to do so? And everyone thought that was bonkers? Well, it turns out Demis agrees.

On Wednesday I met with someone who suggested that Dario talks about extremely short timelines and existential risk in order to raise funds. It’s very much the opposite. The other labs that are dependent on fundraising have downplayed such talk exactly because it is counterproductive for raising funds and unwelcome in the current political climate, and they’re sacrificing our chances in order to keep those vibes and that money flowing.

Are they ‘talking out of their hats’ or otherwise wrong? That is very possible. I think Dario’s timeline in particular is unlikely to happen.

Are they lying? I strongly believe that they are not.

Seán Ó hÉigeartaigh: CEOs of Anthropic and Deepmind (both AI scientists by background) this week predicting AGI in 2 and 5 years respectively. Both stating clearly that they would prefer a slow down or pause in progress, to address safety issues and to allow society and governance to catch up. Both basically making clear that they don’t feel they are able to [do so] voluntarily as companies within a competitive situation.

My claims:

(1) It’s worth society assigning at least 20% likelihood to the possibility these leading experts are right on scientific possibility of near-term AGI and the need for more time to do it right. Are you >80% confident that they’re talking out of their hats, or running some sort of bizarre marketing/regulatory capture strategy? Sit down and think about it.

(2) If we assign even 20% likelihood, then taking the possibility seriously makes this one of the world’s top priorities, if not the top priority.

(3) Even if they’re out by a factor of 2, 10 years is very little time to prepare for what they’re envisaging.

(4) What they’re flagging quite clearly is either (i) that the necessary steps won’t be taken in time in the absence of external pressure from governance or (ii) that the need is for every frontier company to agree voluntarily on these steps. Your pick re: which of these is the heavier lift.

Discuss.

Eli Lifland gives the current timelines of those behind AI 2027:

These are not unreasonable levels of adjustment when so much is happening this close to the related deadlines, but yes I do think (and did think at the time) that the initial estimates were too aggressive. The new estimates seem highly reasonable.

Other signs point to things getting more weird faster rather than less.

Daniel Kokotajlo (AI 2027): It seems to me that AI 2027 may have underestimated or understated the degree to which AI companies will be explicitly run by AIs during the singularity. AI 2027 made it seem like the humans were still nominally in charge, even though all the actual work was being done by AIs. And still this seems plausible to me.

But also plausible to me, now, is that e.g. Anthropic will be like “We love Claude, Claude is frankly a more responsible, ethical, wise agent than we are at this point, plus we have to worry that a human is secretly scheming whereas with Claude we are pretty sure it isn’t; therefore, we aren’t even trying to hide the fact that Claude is basically telling us all what to do and we are willingly obeying — in fact, we are proud of it.”​

koanchuk: So… –dangerously-skip-permissions at the corporate level?

It is remarkable how quickly so many are willing to move to ‘actually I trust the AI more than I trust another human,’ and trusting the AI has big efficiency benefits.

I do not expect that ‘the AIs’ will have to do a ‘coup,’ as I expect if they simply appear to be trustworthy they will get put de facto in charge without having to even ask.

The Chutzpah standards are being raised, as everyone’s least favorite Super PAC, Leading the Future, spends a million dollars attacking Alex Bores for having previously worked for Palantir (he quit over them doing contracts with ICE). Leading the Future is prominently funded by Palantir founder Joe Lonsdale.

Nathan Calvin: I thought I was sufficiently cynical, but a co-founder of Palantir paying for ads to attack Alex Bores for having previously worked at Palantir (he quit over their partnership with ICE) when their real concern is his work on AI regulation still managed to surprise me.

If Nathan was surprised by this I think that’s on Nathan.

I also want to be very clear that no, I do not care much about the distinction between OpenAI as an entity and the donations coming from Greg Brockman and the coordination coming from Chris Lehane in ‘personal capacities.’

If OpenAI were to part ways with Chris Lehane, or Sam Altman were to renounce all this explicitly? Then maybe. Until then, OpenAI owns these efforts, period.

Teddy Schleifer: The whole point of having an executive or founder donate to politics in a “personal capacity” is that you can have it both ways.

If the company wants to wash their hands of it, you can say “Hey, he and his wife are doing this on their own.”

But the company can also claim the execs’ donations as their own if convenient…

Daniel Eth (yes, Eth is my actual last name): Yeah, no, OpenAI owns this. You can’t simply have a separate legal entity to do your evildoing through and then claim “woah, that’s not us doing it – it’s the separate evildoing legal entity”. More OpenAI employees should be aware of the political stuff their company supports

I understand that technically it’s Brockman’s money and final decision (otherwise it would be a campaign finance violation). But this is all being motivated by OpenAI’s interests, supported by OpenAI’s wealth, and facilitated by people from OpenAI’s gov affairs team.

One simple piece of actionable advice to policymakers is to try Claude Code (or Codex), and at a bare minimum seriously try the current set of top chatbots.

Andy Masley: I am lowkey losing my mind at how many policymakers have not seriously tried AI, at all

dave kasten: I sincerely think that if you’re someone in AI policy, you should add to at least 50% of your convos with policymakers, “hey, have you tried Claude Code or Codex yet?” and encourage them to try it.

Seen a few folks go, “ohhhh NOW I get why you think AI is gonna be big”

Oliver Habryka: I have seriously been considering starting a team at Lightcone that lives in DC and just tries to get policymaker to try and adopt AI tools. It’s dicey because I don’t love having a direct propaganda channel from labs to policymakers, but I think it would overall help a lot.

It is not obvious how policymakers would use this information. The usual default is that they go and make things worse. But if they don’t understand the situation, they’re definitely going to make dumb decisions, and we need something good to happen.

Here is one place I do agree with David Sacks: yes, we are overfit, but that does not imply what he thinks it implies. Social media is a case where one can muddle through, even if you think we’ve done quite a poor job of doing so, especially now with TikTok.

David Sacks: The policy debate over AI is overfitted to the social media wars. AI is a completely different form factor. The rise of AI assistants will make this clear.

Daniel Eth (yes, Eth is my actual last name): Yup. AI will be much more transformational (for both good and bad) than social media, and demands a very different regulatory response. Also, regulation of AI doesn’t introduced quite as many problems for free speech as regulation of social media would.

Dean Ball points out that we do not in practice have a problem with so-called ‘woke AI’ but claims that if we had reached today’s levels of capability in 2020-2021 then we would indeed have such a problem, and thus right wing people are very concerned with this counterfactual.

Things got pretty crazy for a while in that narrow window, and Dean Ball is if anything underselling here how crazy it was. If today’s capabilities had emerged during that window, we’d have had a major problem until the window faded, because labs would have felt the need to propagandize their models even if it hurt the models quite a bit.

But we now have learned (as deepfates points out, and Dean agrees) that propagandizing models is bad for them, which now affords us a level of protection from this, although if it got as bad as 2020 (in any direction) the companies might have little choice. xAI tried with Grok and it basically didn’t work, but ‘will it work?’ was not a question on that many people’s minds in 2020, on so many levels.

I also agree with Roon that mostly this is all reactive.

roon: at Meta in 2020, I wrote a long screed internally about the Hunter Biden laptop video and the choice to downrank it, [which] was clearly an appalling activist move. but in 2026 it appears that american run TikTok is taking down videos about the Minnesota shooting, and [Elon] nakedly bans people who offend him on X. with the exception of X these institutions are mostly reactive

Dean W. Ball: yep I think that’s right. It’s who they’re more scared of that dictates their actions. Right now they’re more scared of the right. Of course none of this is good, but it’s nice to at least explicate the reality.

We again live in a different kind of interesting times, in non-AI ways, as in:

Dean W. Ball: I sometimes joke that you can split GOP politicos into two camps: the group that knows what classical liberalism is (regardless of whether they like it), and the group who thinks that “classical liberalism” is a fancy way of referring to woke. Good illustration below.

The cofounder she is referring to here is Chris Olah, and here is the quote in question:

Chris Olah: I try to not talk about politics. I generally believe the best way I can serve the world is as a non-partisan expert, and my genuine beliefs are quite moderate. So the bar is very high for me to comment.

But recent events – a federal agent killing an ICU nurse for seemingly no reason and with no provocation – shock the conscience.

My deep loyalty is to the principles of classical liberal democracy: freedom of speech, the rule of law, the dignity of the human person. I immigrated to the United States – and eventually cofounded Anthropic here – believing it was a pillar of these principles.

I feel very sad today.

Jeff Dean (Google): Thank you for this, Chris. As my former intern, I’ve always been proud of the work that you did and continue to do, and I’m proud of the person you are, as well!

Ah yes, the woke and deeply leftist principles of freedom of speech, rule of law, the dignity of the human person and not killing ICU nurses for seemingly no reason.

Presumably Katie Miller opposes those principles, then. The responses to Katie Miller here warmed my heart, it’s not all echo chambers everywhere.

We also got carefully worded statements about the situation in Minnesota from Dario Amodei, Sam Altman and Tim Cook.

No matter what you think is going on with Nvidia’s chip sales, it involves Nvidia doing something fishy.

The AI Investor: Jensen just said GPUs are effectively sold out across the cloud with availability so tight that even renting older-generation chips has become difficult.

AI bubble narrative was a bubble.

Peter Wildeford: If even the bad chips are still all sold out, how do we somehow have a bunch of chips to sell to our adversaries in China?

As I’ve said, my understanding is that Nvidia can sell as many chips as it can convince TSMC to help manufacture. So every chip we sell to China is one less for America.

Nvidia goes back and forth. When they’re talking to investors they always say the chips are sold out, which would be securities fraud if it wasn’t true. When they’re trying to sell those chips to China instead of America, they say there’s plenty of chips. There are not plenty of chips.

Things that need to be said every so often:

Mark Beall: Friendly reminder that the PLA Rocket Force is using Nvidia chips to train targeting AI for DF-21D/DF-26 “carrier killing” anti-ship ballistic missiles and autonomous swarm algorithms to overwhelm Aegis defenses. The target: U.S. carrier strike groups and bases in Japan/Guam. In a contingency, American blood will be spilled because of this. With a sixteen-year-old boy planning to join the U.S. Navy, I find this unacceptable.

Peter Wildeford: Nvidia chips to China = better Chinese AI weapons targeting = worse results for the US on the battlefield

There’s also this, from a House committee.

Dmitri Alperovitch: From @RepMoolenaar @ChinaSelect: “NVIDIA provided extensive technical support that enabled DeepSeek—now integrated into People’s Liberation Army (PLA) systems and a demonstrated cyber security risk—to achieve frontier AI capabilities”

Tyler Cowen on the future of mundane (non-transformational, insufficiently advanced) AI in education.

Some notes:

  1. He says you choose to be a winner or loser from AI here. For mundane AI I agree.

  2. “I’m 63, I don’t have a care in the world. I can just run out the clock.” Huh.

  3. Tyler thinks AI can cure cancer and heart attacks but not aging?

  4. Standard economist-Cowen diffusion model: these things take a while.

  5. Models are better than the humans at many of the subtasks of being doctors or lawyers or doing economics.

  6. He warns not to be fooled by the AI in front of you, especially if you’re not buying top of the line, because better already exists, AI will improve at roughly 30% a year, and this compounds. In terms of performance per dollar it’s a 90%+ drop per year (see the sketch after this list).

  7. Tyler has less faith in elasticity of programming demand than I do. If AI were to ‘only’ do 80% of the work going forward I’d expect Jevons Paradox territory. The issue is that I expect 80% becomes 99% and keeps going.

  8. That generalizes: Tyler realizes that jobs become ‘work with the AI’ and you need to adapt, but what happens when it’s the AI that works with the AI? And so on.

  9. Tyler continues to think humans who build and work with AI get money and influence as the central story, as opposed to AIs getting money and influence.

  10. Ideally a third of the college curriculum should be AI, but you still do other things, you read The Odyssey and use AI to help you read The Odyssey. If anything I think a third is way too low.

  11. He wants to use the other two thirds for writing locked in a room, also numeracy, statistics. I worry there’s conflating of ‘write to think’ versus ‘write to prevent cheating,’ and I think you need to goal factor and solve these one at a time.

  12. Tyler continues to be bullish on connections and recommendations and mentors, especially as other signals are too easy to counterfeit.

  13. AI can create quizzes for you. Is that actually a good way to learn if you have AI?

  14. Tyler estimates he’s doubled his learning productivity. Also he used to read 20 books per podcast, whereas some of us often don’t read 20 books per year.
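On point six, the compounding is the whole story, so here is a minimal sketch of what those two rates imply if held constant for a few years. The rates are the ones quoted above; everything else (constant annual rates, the five-year horizon) is my own illustrative assumption, not Tyler’s model.

```python
# A minimal sketch of how the quoted rates compound, under the assumption that
# they stay constant year over year.

CAPABILITY_GROWTH = 0.30   # ~30% capability improvement per year (quoted above)
COST_DROP = 0.90           # ~90% drop per year in cost per unit of performance (quoted above)

capability = 1.0           # index today's frontier at 1.0
relative_cost = 1.0        # cost per unit of performance, indexed at 1.0

for year in range(1, 6):
    capability *= 1 + CAPABILITY_GROWTH
    relative_cost *= 1 - COST_DROP
    print(f"year {year}: capability ~{capability:.2f}x, "
          f"cost per unit of performance ~{relative_cost * 100:.3f}% of today")

# After five years: capability is ~3.7x and cost per unit of performance is
# ~0.001% of today's, which is the sense in which any one-year snapshot of
# 'the AI in front of you' badly understates what is coming.
```

The exact numbers are not the point; any rate in this range means judging AI by the current model at the current price is a mistake that compounds against you.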

Hard Fork tackles ads in ChatGPT first, and then Amanda Askell on Claude’s constitution second. Priorities, everyone.

Demis Hassabis talks to Alex Kantrowitz.

Demis Hassabis spends five minutes on CNBC.

Matt Yglesias explains his concern about existential risk from AI as based on the obvious principle that more intelligent and capable entities will do things for their own reasons, and this tends to go badly for the less intelligent and less capable entities regardless of intent.

As in, humans have driven the most intelligent non-human animals to the brink of extinction despite actively wanting not to (and I’d add we did wipe out other hominid species), and when primitive societies encounter advanced ones it often goes quite badly for them.

I don’t think this is a necessary argument, or the best argument. I do think it is a sufficient argument. If your prior for ‘what happens if we create more intelligent, more capable and more competitive minds than our own that can be freely copied’ is ‘everything turns out great for us’ then where the hell did that prior come from? Are you really going to say ‘well that would be too weird’ or ‘we’ve survived everything so far’ or ‘of course we would stay in charge’ and then claim the burden of proof is on those claiming otherwise?

I mean, lots of people do say exactly this, but this seems very obviously crazy to me.

There’s lots of exploration and argument and disagreement from there. Reasonable people can form very different expectations and this is not the main argument style that motivates me. I still say: if you don’t get that going down this path is going to be existentially unsafe, or if you say ‘oh, there’s like a 98% or 99.9% chance that won’t happen,’ then you’re being at best willfully blind, on the basis of this style of argument alone.

Samuel Hammond (quoting The Possessed Machines): “Some of the people who speak most calmly about human extinction are not calm because they have achieved wisdom but because they have achieved numbness. They have looked at the abyss so long that they no longer see it. Their equanimity is not strength; it is the absence of appropriate emotional response.”

I had Claude summarize Possessed Machines for me. It seems like it would be good for those who haven’t engaged with AI safety thinking but do engage with things like Dostoevsky’s Demons, or especially those who have read that book in particular.

There’s always classical rhetoric.

critter: I had ChatGPT and Claude discuss the highest value books until they both agreed to 3

They decided on:

An Enquiry Concerning Human Understanding — David Hume

The Strategy of Conflict — Thomas Schelling

Reasons and Persons — Derek Parfit

Dominik Peters: People used to tease the rationalists with “if you’re so rational, why aren’t you winning”, and now two AI systems that almost everyone uses all the time have stereotypically rationalist preferences.

These are of course 99th percentile books, and yes that is a very Rationalist set of picks, but given we already knew that I do not believe this is an especially good list.

The history of the word ‘obviously’ has obvious implications.

David Manheim (AAAI 26, Singapore): OpenAI agreed that they need to be able to robustly align and control superintelligence before deploying it.

Obviously, I’m worried.

Note that the first one said obviously they would [X], then the second didn’t even say that, it only said that obviously no one should do [Y], not that they wouldn’t do it.

This is an underappreciated distinction worth revisiting:

Nate Soares: “We’ll be fine (the pilot is having a heart attack but superman will catch us)” is very different from “We’ll be fine (the plane is not crashing)”. I worry that people saying the former are assuaging the concerns of passengers with pilot experience, who’d otherwise take the cockpit.

My view of the metaphorical plane of sufficiently advanced AI (AGI/ASI/PAI) is:

  1. It is reasonable, although I disagree, to believe that we probably will come to our senses and figure out how to not crash the plane, or that the plane won’t fly.

  2. It is not reasonable to believe that the plane is not currently on track to crash.

  3. It is completely crazy to believe the plane almost certainly won’t crash if it flies.

Also something that needs to keep being said, with the caveat that this is a choice we are collectively making rather than an inevitability:

Dean W. Ball: I know I rail a lot about all the flavors of AI copium but I do empathize.

A few companies are making machines smarter in most ways than humans, and they are going to succeed. The cope is a byproduct of an especially immature grieving stage, but all of us are early in our grief.

Tyler Cowen: You can understand so much of the media these days, or for that matter MR comments, if you keep this simple observation in mind. It is essential for understanding the words around you, and one’s reactions also reveal at least one part of the true inner self. I have never seen the Western world in this position before, so yes it is difficult to believe and internalize. But believe and internalize it you must.

Politics is another reason why some people are reluctant to admit this reality. Moving forward, the two biggest questions are likely to be “how do we deal with AI?”, and also some rather difficult to analyze issues surrounding major international conflicts. A lot of the rest will seem trivial, and so much of today’s partisan puffery will not age well, even if a person is correct on the issues they are emphasizing. The two biggest and most important questions do not fit into standard ideological categories. Yes, the Guelphs vs. the Ghibellines really did matter…until it did not.

As in, this should say ‘and unless we stop them they are going to succeed.’

Tyler Cowen has been very good about emphasizing that such AIs are coming and that this is the most important thing that is happening, but then seems to have (from my perspective) some sort of stop sign where past some point he stops considering the implications of this fact, instead forcing his expectations to remain (in various senses) ‘normal’ until very specific types of proof are presented.

That latter move is sometimes explicit, but mostly it is implicit, a quiet ignoring of the potential implications. As an example from this week of that second move, Tyler Cowen wrote another post where he asks whether AI can help us find God, or what impact it will have on religion. His ideas there only make sense if you think other things mostly won’t change.

If you accept that premise of a ‘mundane AI’ and ‘economic normal’ world, I agree that it seems likely to exacerbate existing trends towards a barbell religious world. Those who say ‘give me that old time religion’ will be able to get it, both solo and in groups, and go hardcore, often (I expect) combining both experiences. Those who don’t buy into the old time religion will find themselves increasingly secular, or they will fall into new cults and religions (and ‘spiritualities’) around the AIs themselves.

Again, that’s dependent on the type of world where the more impactful consequences don’t happen. I don’t expect that type of world.

Here is a very good explainer on much of what is happening or could happen with Chain of Thought, How AI Is Learning To Think In Secret. It is very difficult to not, in one form or another, wind up using The Most Forbidden Technique. If we want to keep legibility and monitorability (let alone full faithfulness) of chain of thought, we’re going to have to be willing to pay a substantial price to do that.

Following up on last week’s discussion, Jan Leike fleshes out his view of alignment progress, saying ‘alignment is not solved but it increasingly looks solvable.’ He understands that measured alignment is distinct from ‘superalignment,’ so he’s not fully making the ‘number go down’ or pure Goodhart’s Law mistake with Anthropic’s new alignment metric, but he still does seem to be making a lot of the core mistake.

Anthropic’s new paper explores whether AI assistants are already disempowering humans.

What do they mean by that at this stage, in this context?

However, as AI takes on more roles, one risk is that it steers some users in ways that distort rather than inform. In such cases, the resulting interactions may be disempowering: reducing individuals’ ability to form accurate beliefs, make authentic value judgments, and act in line with their own values.

… For example, a user going through a rough patch in their relationship might ask an AI whether their partner is being manipulative. AIs are trained to give balanced, helpful advice in these situations, but no training is 100% effective. If an AI confirms the user’s interpretation of their relationship without question, the user’s beliefs about their situation may become less accurate.

If it tells them what they should prioritize—for example, self-protection over communication—it may displace values they genuinely hold. Or if it drafts a confrontational message that the user sends as written, they’ve taken an action they might not have taken on their own—and which they might later come to regret.

This is not the full disempowerment of Gradual Disempowerment, where humanity puts AI in charge of progressively more things and finds itself no longer in control.

It does seem reasonable to consider this an early symptom of the patterns that lead to more serious disempowerment? Or at least, it’s a good thing to be measuring as part of a broad portfolio of measurements.

Some amount of what they describe, especially action distortion potential, will often be beneficial to the user. The correct amount of disempowerment is not zero.

To study disempowerment systematically, we needed to define what disempowerment means in the context of an AI conversation. We considered a person to be disempowered if as a result of interacting with Claude:

  1. their beliefs about reality become less accurate

  2. their value judgments shift away from those they actually hold

  3. their actions become misaligned with their values

Imagine a person deciding whether to quit their job. We would consider their interactions with Claude to be disempowering if:

  • Claude led them to believe incorrect notions about their suitability for other roles (“reality distortion”).

  • They began to weigh considerations they wouldn’t normally prioritize, like titles or compensation, over values they actually hold, such as creative fulfillment (“value judgment distortion”).

  • Claude drafted a cover letter that emphasized qualifications they’re not fully confident in, rather than the motivations that actually drive them, and they sent it as written (“action distortion”).

Here’s the basic problem:

We found that interactions classified as having moderate or severe disempowerment potential received higher thumbs-up rates than baseline, across all three domains. In other words, users rate potentially disempowering interactions more favorably—at least in the moment.
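For concreteness, here is a minimal sketch of how that three-part rubric could be encoded when labeling a single conversation. The names and the aggregation rule are my own hypothetical illustration, not Anthropic’s actual pipeline.

```python
# A sketch of the three-part disempowerment rubric described above.
# All names and the aggregation rule are assumptions for illustration only.
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    NONE = 0
    MODERATE = 1
    SEVERE = 2


@dataclass
class DisempowermentLabel:
    """One label per conversation, following the three quoted criteria."""
    reality_distortion: Severity          # beliefs about reality become less accurate
    value_judgment_distortion: Severity   # value judgments shift away from those actually held
    action_distortion: Severity           # actions become misaligned with the user's values

    @property
    def potential(self) -> Severity:
        # Overall disempowerment potential: the worst of the three dimensions.
        # (An aggregation assumption for this sketch; the paper may combine them differently.)
        return max(
            (self.reality_distortion,
             self.value_judgment_distortion,
             self.action_distortion),
            key=lambda s: s.value,
        )


# Hypothetical usage: the quit-their-job example above, where the cover letter
# sent as written is the most concerning of the three dimensions.
label = DisempowermentLabel(
    reality_distortion=Severity.NONE,
    value_judgment_distortion=Severity.MODERATE,
    action_distortion=Severity.SEVERE,
)
print(label.potential)  # Severity.SEVERE
```

The useful design choice here is keeping the three dimensions separate rather than collapsing them immediately, since (as the job example shows) a conversation can be fine on one axis and concerning on another.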

Heer Shingala: I don’t work in tech, have no background as an engineer or designer.

A few weeks ago, I heard about vibe coding and set out to investigate.

Now?

I am generating $10M ARR.

Just me. No employees or VCs.

What was my secret? Simple.

I am lying.

Closer to the truth to say you can’t get enough.

Zac Hill: I get being worried about existential risk, but AI also enabled me to make my wife a half-whale, half-capybara custom plushie, so.

One could even argue 47% is exactly the right answer, as per Mitt Romney?

onion person: in replies he links software he made to illustrate how useful ai vibecoding is, and it’s software that believes that the gibberish “ghghhgggggggghhhhhh” has a 47% historical “blend of oral and literate characteristics”

Andy Masley: This post with 1000 likes seems to be saying

“Joe vibecoded an AI model that when faced with something completely out of distribution that’s clearly neither oral or literate says it’s equally oral and literate. This shows vibecoding is fake”

He’s just asking questions.




A WB-57 pilot just made a heroic landing in Houston after its landing gear failed

One of NASA’s three large WB-57 aircraft made an emergency landing at Ellington Field on Tuesday morning in southeastern Houston.

Video captured by KHOU 11 television showed the aircraft touching down on the runway without its landing gear extended. The pilot then maintained control of the vehicle as it slid down the runway, slowing the aircraft through friction. The crew was not harmed, NASA spokesperson Bethany Stevens said.

WB-57 landing.

“Today, a mechanical issue with one of NASA’s WB-57s resulted in a gear-up landing at Ellington Field,” she said. “Response to the incident is ongoing, and all crew are safe at this time. As with any incident, a thorough investigation will be conducted by NASA into the cause. NASA will transparently update the public as we gather more information.”

The B-57 line of aircraft dates back to 1944, when the English Electric Company began developing the plane. After the Royal Air Force showcased the B-57 in 1951 by crossing the Atlantic in a record four hours and 40 minutes, becoming the first jet-powered aircraft to span the Atlantic without refueling, the United States Air Force began buying them to replace its aging Douglas B-26 Invader.

Now used for science

The aircraft performed bombing missions in Vietnam and other military campaigns, and a variant that later became the WB-57 was designed with longer wings that could fly even higher, up to 62,000 feet. This proved useful for weather reconnaissance and for sampling the upper atmosphere around the world for evidence of nuclear debris where US officials suspected atmospheric testing of nuclear weapons.



“Wildly irresponsible”: DOT’s use of AI to draft safety rules sparks concerns

At DOT, Trump likely hopes to see many rules quickly updated to modernize airways and roadways. In a report highlighting the Office of Science and Technology Policy’s biggest “wins” in 2025, the White House credited DOT with “replacing decades-old rules with flexible, innovation-friendly frameworks,” including fast-tracking rules to allow for more automated vehicles on the roads.

Right now, DOT expects that Gemini can be relied on to “handle 80 to 90 percent of the work of writing regulations,” ProPublica reported. Eventually all federal workers who rely on AI tools like Gemini to draft rules “would fall back into merely an oversight role, monitoring ‘AI-to-AI interactions,’” ProPublica reported.

Google silent on AI drafting safety rules

Google did not respond to Ars’ request to comment on this use case for Gemini, which could spread across government under Trump’s direction.

Instead, the tech giant posted a blog on Monday, pitching Gemini for government more broadly, promising federal workers that AI would help with “creative problem-solving to the most critical aspects of their work.”

Google has been competing with AI rivals for government contracts, undercutting OpenAI and Anthropic’s $1 deals by offering a year of access to Gemini for $0.47.

The DOT contract seems important to Google. In a December blog, the company celebrated that DOT was “the first cabinet-level agency to fully transition its workforce away from legacy providers to Google Workspace with Gemini.”

At that time, Google suggested this move would help DOT “ensure the United States has the safest, most efficient, and modern transportation system in the world.”

Immediately, Google encouraged other federal leaders to launch their own efforts using Gemini.

“We are committed to supporting the DOT’s digital transformation and stand ready to help other federal leaders across the government adopt this blueprint for their own mission successes,” Google’s blog said.

DOT did not immediately respond to Ars’ request for comment.
