Author name: Mike M.

deepseek-v3.1-is-not-having-a-moment

DeepSeek v3.1 Is Not Having a Moment

What if DeepSeek released a model claiming 66 on SWE and almost no one tried using it? Would it be any good? Would you be able to tell? Or would we get the shortest post of the year?

Why are we settling for v3.1 and have yet to see DeepSeek release v4 or r2 yet?

Eleanor Olcott and Zijing Wu: Chinese artificial intelligence company DeepSeek delayed the release of its new model after failing to train it using Huawei’s chips, highlighting the limits of Beijing’s push to replace US technology.

DeepSeek was encouraged by authorities to adopt Huawei’s Ascend processor rather than use Nvidia’s systems after releasing its R1 model in January, according to three people familiar with the matter.

But the Chinese start-up encountered persistent technical issues during its R2 training process using Ascend chips, prompting it to use Nvidia chips for training and Huawei’s for inference, said the people.

The issues were the main reason the model’s launch was delayed from May, said a person with knowledge of the situation, causing it to lose ground to rivals.

The real world so often involves people acting so much stupider than you could write into fiction.

America tried to sell China H20s and China decided they didn’t want them and now Nvidia is halting related orders with suppliers.

DeepSeek says that the main restriction on their development is lack of compute, and the PRC responds not by helping them get better chips but by advising them to not use the chips that they have, greatly slowing things down at least for a while.

In any case, DeepSeek v3.1 exists now, and remarkably few people care?

DeepSeek: Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀

🧠 Hybrid inference: Think & Non-Think — one model, two modes

⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528

🛠️ Stronger agent skills: Post-training boosts tool use and multi-step agent tasks

Try it now — toggle Think/Non-Think via the “DeepThink” button.

API Update ⚙️

🔹 deepseek-chat → non-thinking mode

🔹 deepseek-reasoner → thinking mode

🧵 128K context for both

🔌 Anthropic API format supported.

Strict Function Calling supported in Beta API.

🚀 More API resources, smoother API experience

Tools & Agents Upgrades 🧰

📈 Better results on SWE / Terminal-Bench

🔍 Stronger multi-step reasoning for complex search tasks

⚡️ Big gains in thinking efficiency

🔹 V3.1 Base: 840B tokens continued pretraining for long context extension on top of V3

🔹 Tokenizer & chat template updated — new tokenizer config.

🔗 V3.1 Base Open-source weights.

🔗 V3.1 Open-source weights.

Pricing Changes 💳

🔹 New pricing starts & off-peak discounts end at Sep 5th, 2025, 16: 00 (UTC Time)

🔹 Until then, APIs follow current pricing

📝 Pricing page.

Teortaxes: for now seems to have the same performance ceiling as 0528, maybe a bit weaker on some a bit stronger on other problems. The main change is that it’s a unified merge that uses ≥2x fewer reasoning tokens. I take it as a trial balloon before V4 that’ll be unified out of the box.

There are some impressive scores here. A true 66 on SWE would be very strong.

There’s also the weird result where it is claimed to outscore Opus 4 on Aider Polyglot at a low price.

Wes Roth: DeepSeek has quietly published V 3.1, a 685-billion-parameter open-source model that folds chat, reasoning, and coding into a single architecture, handles 128 k-token context windows, and posts a 71.6 % score on the Aider coding benchmark edging out Claude Opus 4 while costing ~68× less in inference.

But these two data points don’t seem backed up by the other reactions, or especially the lack of other reactions, or some other test results.

Artificial Analysis has it coming in at 60 versus r1’s 59, which would be only a small improvement.

Hasan Can said it hallucinates a lot. Steve Strickland says ‘it’s the worst LLM I’ve even tried’ complaining about it failing a mundane task, which presumably was very bad luck.

I tried to conduct Twitter polls, but well over 90% of respondents had to click ‘see results’ which left me with only a handful of real responses and means Lizardman Constant problems and small sample size invalidate the results, beyond confirming no one is looking, and the different polls don’t entirely agree with each other as a result.

If this were most open model companies, I would treat this lack of reaction as indicating there was nothing here, that they likely targeted SWE as a benchmark, and move on.

Since it is DeepSeek, I give them more credit than that, but am still going to assume this is only a small incremental upgrade that does not change the overall picture. However, if 3.1 really was at 66-level for real in practice, it has been several days now, and people would likely be shouting it from the rooftops. They’re not.

Even if no one finds anything to do with it, I don’t downgrade DeepSeek much for 3.1 not impressing compared to if they hadn’t released anything. It’s fine to do incremental improvements. They should do a v3.1 here.

The dumbest style of reaction is when a company offers an incremental improvement (see: GPT-5) and people think that means it’s all over for them, or for AI in general, because it didn’t sufficiently blow them away. Chill out.

It’s also not fair to fully pin this on DeepSeek when they were forced to do a lot of their training this year on Huawei Ascend chips rather than Nvidia chips. Assuming, that is, they are going to be allowed to switch back.

Either way, the clock is ticking on v4 and r2.

Discussion about this post

DeepSeek v3.1 Is Not Having a Moment Read More »

scientists-are-building-cyborg-jellyfish-to-explore-ocean-depths

Scientists are building cyborg jellyfish to explore ocean depths

Understanding the wakes and vortices that jellyfish produce as they swim is crucial, according to Wu, et al. Particle image velocimetry (PIV) is a vital tool for studying flow phenomena and biomechanical propulsion. PIV essentially tracks tiny tracer particles suspended in water by illuminating them with laser light. The technique usually relies on hollow glass spheres, polystyrene beads, aluminum flakes, or synthetic granules with special optical coatings to enhance the reflection of light.

These particles are readily available and have the right size and density for flow measurements, but they are very expensive, costing as much as $200 per pound in some cases. And they have associated health and environmental risks: glass microspheres can cause skin or eye irritation, for example, while it’s not a good idea to inhale polystyrene beads or aluminum flakes. They are also not digestible by animals and can cause internal damage. Several biodegradable options have been proposed, such as yeast cells, milk, micro algae, and potato starch, which are readily available and cheap, costing as little as $2 per pound.

Wu thought starch particles were the most promising as biodegradable tracers, and decided to study several different kinds of starches to identify the best candidate: specifically, corn starch, arrowroot starch, baking powder, jojoba beads, and walnut shell powder. Each type of particle was suspended in water tanks with moon jellyfish, tracking their movement with a PIV system. They evaluated their performance based on the particles’ size, density, and laser-scattering properties.

Of the various candidates, corn starch and arrowroot starch proved best suited for PIV applications, thanks to their density and uniform size distribution, while arrowroot starch performed best when it came to laser scattering tests. But corn starch would be well-suited for applications that require larger tracer particles since it produced larger laser scattering dots in the experiments. Both candidates matched the performance of commonly used synthetic PIV tracer particles in terms of accurately visualizing flow structures resulting from the swimming jellyfish.

DOI: Physical Review Fluids, 2025. 10.1103/bg66-976x  (About DOIs).

Scientists are building cyborg jellyfish to explore ocean depths Read More »

bank-forced-to-rehire-workers-after-lying-about-chatbot-productivity,-union-says

Bank forced to rehire workers after lying about chatbot productivity, union says

As banks around the world prepare to replace many thousands of workers with AI, Australia’s biggest bank is scrambling to rehire 45 workers after allegedly lying about chatbots besting staff by handling higher call volumes.

In a statement Thursday flagged by Bloomberg, Australia’s main financial services union, the Finance Sector Union (FSU), claimed a “massive win” for 45 union members whom the Commonwealth Bank of Australia (CBA) had replaced with an AI-powered “voice bot.”

The FSU noted that some of these workers had been with CBA for decades. Those workers in particular were shocked when CBA announced last month that their jobs had become redundant. At that time, CBA claimed that launching the chatbot supposedly “led to a reduction in call volumes” by 2,000 a week, FSU said.

But “this was an outright lie,” fired workers told FSU. Instead, call volumes had been increasing at the time they were dismissed, with CBA supposedly “scrambling”—offering staff overtime and redirecting management to join workers answering phones to keep up.

To uncover the truth, FSU escalated the dispute to a fair work tribunal, where the union accused CBA of failing to explain how workers’ roles were ruled redundant. The union also alleged that CBA was hiring for similar roles in India, Bloomberg noted, which made it appear that CBA had perhaps used the chatbot to cover up a shady pivot to outsource jobs.

While the dispute was being weighed, CBA admitted that “they didn’t properly consider that an increase in calls” happening while staff was being fired “would continue over a number of months,” FSU said.

“This error meant the roles were not redundant,” CBA confirmed at the tribunal.

Bank forced to rehire workers after lying about chatbot productivity, union says Read More »

trump-confirms-us-is-seeking-10%-stake-in-intel-bernie-sanders-approves.

Trump confirms US is seeking 10% stake in Intel. Bernie Sanders approves.

Trump plan salvages CHIPS Act he vowed to kill

While chipmakers wait for more clarity, Lutnick has suggested that Trump—who campaigned on killing the CHIPS Act—has found a way to salvage the legislation that Joe Biden viewed as his lasting legacy. It seems possible that the plan arose after Trump realized how hard it would be to ax the legislation completely, with grants already finalized (but most not disbursed).

“The Biden administration literally was giving Intel money for free and giving TSMC money for free, and all these companies just giving the money for free, and Donald Trump turned it into saying, ‘Hey, we want equity for the money. If we’re going to give you the money, we want a piece of the action for the American taxpayer,'” Lutnick said.

“It’s not governance, we’re just converting what was a grant under Biden into equity for the Trump administration, for the American people,” Lutnick told CNBC.

Further, US firms could potentially benefit from any potential arrangements. For Intel, the “highly unusual” deal that Trump is mulling now could help the struggling chipmaker compete with its biggest rivals, including Nvidia, Samsung, and TSMC, BBC noted.

Vincent Fernando, founder of the investment consultancy Zero One, told the BBC that taking a stake in Intel “makes sense, given the company’s key role in producing semiconductors in the US,” which is a major Trump priority.

But as Intel likely explores the potential downsides of accepting such a deal, other companies applying for federal grants may already be alarmed by Trump’s move. Fernando suggested that Trump’s deals to take ownership stake in US firms—which economics professor Kevin J. Fox said only previously occurred during the global financial crisis—could add “uncertainty for any company who is already part of a federal grant program or considering one.”

Fox also agreed that the Intel deal could deter other companies from accepting federal grants, while possibly making it harder for Intel to run its business “effectively.”

Trump confirms US is seeking 10% stake in Intel. Bernie Sanders approves. Read More »

physics-of-badminton’s-new-killer-spin-serve

Physics of badminton’s new killer spin serve

Serious badminton players are constantly exploring different techniques to give them an edge over opponents. One of the latest innovations is the spin serve, a devastatingly effective method in which a player adds a pre-spin just before the racket contacts the shuttlecock (aka the birdie). It’s so effective—some have called it “impossible to return“—that the Badminton World Federation (BWF) banned the spin serve in 2023, at least until after the 2024 Paralympic Games in Paris.

The sanction wasn’t meant to quash innovation but to address players’ concerns about the possible unfair advantages the spin serve conferred. The BWF thought that international tournaments shouldn’t become the test bed for the technique, which is markedly similar to the previously banned “Sidek serve.” The BWF permanently banned the spin serve earlier this year. Chinese physicists have now teased out the complex fundamental physics of the spin serve, publishing their findings in the journal Physics of Fluids.

Shuttlecocks are unique among the various projectiles used in different sports due to their open conical shape. Sixteen overlapping feathers protrude from a rounded cork base that is usually covered in thin leather. The birdies one uses for leisurely backyard play might be synthetic nylon, but serious players prefer actual feathers.

Those overlapping feathers give rise to quite a bit of drag, such that the shuttlecock will rapidly decelerate as it travels and its parabolic trajectory will fall at a steeper angle than its rise. The extra drag also means that players must exert quite a bit of force to hit a shuttlecock the full length of a badminton court. Still, shuttlecocks can achieve top speeds of more than 300 mph. The feathers also give the birdie a slight natural spin around its axis, and this can affect different strokes. For instance, slicing from right to left, rather than vice versa, will produce a better tumbling net shot.

Chronophotographies of shuttlecocks after an impact with a racket

Chronophotographies of shuttlecocks after an impact with a racket. Credit: Caroline Cohen et al., 2015

The cork base makes the birdie aerodynamically stable: No matter how one orients the birdie, once airborne, it will turn so that it is traveling cork-first and will maintain that orientation throughout its trajectory. A 2015 study examined the physics of this trademark flip, recording flips with high-speed video and conducting free-fall experiments in a water tank to study how its geometry affects the behavior. The latter confirmed that shuttlecock feather geometry hits a sweet spot in terms of an opening inclination angle that is neither too small nor too large. And they found that feather shuttlecocks are indeed better than synthetic ones, deforming more when hit to produce a more triangular trajectory.

Physics of badminton’s new killer spin serve Read More »

tiny,-removable-“mini-ssd”-could-eventually-be-a-big-deal-for-gaming-handhelds

Tiny, removable “mini SSD” could eventually be a big deal for gaming handhelds

The Mini SSD card isn’t and may never be a formally ratified standard, but it does aim to solve a real problem for portable gaming systems—the need for fast storage that can load games at speeds approaching those of an internal SSD, without requiring users to take their own systems apart to perform upgrades.

Why are games getting so dang big, anyway?

Big storage, small size. Credit: Biwin

A 2023 analysis from TechSpot suggested that game size had increased at an average rate of roughly 6.3GB per year between 2012 and 2023—games that come in over 100GB aren’t the norm, but they aren’t hard to find. Some of that increase comes from improved graphics and the higher-resolution textures needed to make games look good on 4K monitors and TVs. But TechSpot also noted that the storage requirements for narrative-heavy, cinematic-heavy games like The Last of Us Part 1 were being driven just as much by audio files and support for multiple languages.

“In total, nearly 17 GB of storage is needed for [The Last of Us] data unrelated to graphics,” wrote author Nick Evanson. “That’s larger than any entire game from our 2010 sample! This pattern was consistent across nearly all the ‘Godzilla-sized’ games we examined—those featuring numerous cinematics, extensive speech, and considerable localization were typically much larger than the rest of the sample in a given year.”

For another prominent recent example, consider the install sizes for the Mac version of Cyberpunk 2077. The version of the game on Steam, the Epic Games Store, and GOG runs about 92GB. However, the version available for download from Apple’s App Store is a whopping 159GB, solely because it includes all of the game’s voiceovers in all of the languages it supports. (This is because of App Store rules that require apps to have all possible files included when they’re submitted for review.)

It’s clear that there’s a need for fast storage upgrades that don’t require you to disassemble your console or PC completely. Whether it’s this new “mini SSD,” a faster iteration of microSD Express, or some other as-yet-unknown storage format remains to be seen.

Tiny, removable “mini SSD” could eventually be a big deal for gaming handhelds Read More »

starlink-tries-to-block-virginia’s-plan-to-bring-fiber-internet-to-residents

Starlink tries to block Virginia’s plan to bring fiber Internet to residents

Noting that its “project areas span from mountains and hills to farmland and coastal plains,” the DHCD said its previous experience with grant-funded deployments “revealed that tree canopy, rugged terrain, and slope can complicate installation and/or obstruct line-of-sight.” State officials said that wireless and low-Earth orbit satellite technology “can have signal degradation, increased latency, and reduced reliability” when there isn’t a clear line of sight.

The DHCD said it included these factors in its evaluation of priority broadband projects. State officials were also apparently concerned about the network capacity of satellite services and the possibility that using state funding to guarantee satellite service in one location could reduce availability of that same service in other locations.

“To review a technology’s ability to scale, the Office considered the currently served speeds of 100/20 Mbps, an application’s stated network capacity, the project area’s number of [locations], the project area’s geographic area, current customer base (if applicable), and future demand,” the department said. “For example, the existing customer base should not be negatively impacted by the award of BEAD locations for a given technology to be considered scalable.”

SpaceX: “Playing field was anything but level”

SpaceX said Virginia is wrong to determine that Starlink “did not qualify as ‘Priority Broadband,'” since the company “provided information demonstrating these capabilities in its application, and it appears that Virginia used this definition only as a pretext to reach a pre-ordained outcome.” SpaceX said that 95 percent of funded “locations in Virginia have an active Starlink subscriber within 1 mile, showing that Starlink already serves every type of environment in Virginia’s BEAD program today” and that 15 percent of funded locations have an active Starlink subscriber within 100 meters.

“The playing field was anything but level and technology neutral, as required by the [updated program rules], and was instead insurmountably stacked against low-Earth orbit satellite operators like SpaceX,” the company said.

We contacted the Virginia DHCD about SpaceX’s comments today and will update this article if the department provides a response.

Starlink tries to block Virginia’s plan to bring fiber Internet to residents Read More »

ai-#129:-comically-unconstitutional

AI #129: Comically Unconstitutional

Article 1, Sec. 9 of the United States Constitution says: “No Tax or Duty shall be laid on Articles exported from any State.” That is not for now stopping us, it seems, from selling out our national security, and allowing Nvidia H20 chip sales (and other AMD chip sales) to China in exchange for 15% of gross receipts. But hey. That’s 2025.

Also 2025 is that we now have GPT-5, which was the main happening this week.

What we actually have are at least three highly distinct models, roughly:

  1. GPT-5, the new GPT-4o.

  2. GPT-5-Thinking, the new o3, if you pay $20/month.

  3. GPT-5-Pro, the new o3-Pro, if you pay $200/month.

We also have:

  1. GPT-5-Auto, the new GPT-4o that occasionally calls GPT-5-Thinking.

  2. GPT-5-Thinking-Mini, the new o4-mini probably.

OpenAI tried to do this while retiring all the old models, but users rebelled sufficiently loudly that GPT-4o and others are back for paying subscribers.

GPT-5-Thinking and GPT-5-Pro are clear upgrades over o3 and o3-Pro, with GPT-5-Thinking in particular being strong in writing and on reducing hallucination rates. Baseline GPT-5 is less obviously an upgrade.

For further coverage of GPT-5, see this week’s other posts:

  1. GPT-5s Are Alive: Basic Facts, Benchmarks and the Model Card.

  2. GPT-5s Are Alive: Outside Reactions, the Router and Resurrection of GPT-4o.

  3. GPT-5s Are Alive: Synthesis.

There was also my coverage of the narrowly strong but overall disappointing GPT-OSS, OpenAI’s GPT-OSS Is Already Old News.

  1. Language Models Offer Mundane Utility. Writing and medical diagnosis.

  2. Language Models Don’t Offer Mundane Utility. Detecting AI, benchmark quests.

  3. Huh, Upgrades. Gemini memory and guided learning, Claude Sonnet long context.

  4. On Your Marks. TextQuest puts models to the adventure game test.

  5. Choose Your Fighter. Make sure you get plenty of sleep.

  6. Preserve Our History. Anthropic deprecates Sonnet 3.5 and 3.6.

  7. Fun With Media Generation. We’re all taking a bath on this one.

  8. Deepfaketown and Botpocalypse Soon. I get tricked into clicking on Reddit.

  9. You Drive Me Crazy. He sees it now. Who will see it soon?

  10. Get My Agent On The Line. The agent delegation problem remains tricky.

  11. They Took Our Jobs. Consumer surplus is large but difficult to measure.

  12. Overcoming Bias. AI prefers AI.

  13. Get Involved. OpenAI $500k bounty for (too late) red teaming GPT-OSS-20s.

  14. Introducing. AI in Google Finance, red teaming blog at red.anthropic.com.

  15. In Other AI News. xAI cofounder pivots to safety, AI is culturally western, more.

  16. Stop Deboosting Links On Twitter. Something about glass houses and stones.

  17. Notes On GPT-OSS. Not good for most uses, good in its narrow domain.

  18. Show Me the Money. AI researchers got too expensive, so they’re hiring quants.

  19. Quiet Speculations. Do not define AI capability by its mundane utility.

  20. The Quest for Sane Regulations. Dean Ball’s work at the White House is done.

  21. Pick Up The Phone. Who is taking AI safety seriously and who is in the way?

  22. Chip City. H20 situation gets worse, also unconstitutional. Which is also worse.

  23. Andreessen Mystery Potentially Solved. What did Marc hear? Probably this?

  24. The Week in Audio. An investigation into Nvidia chip smuggling into China.

  25. Rhetorical Innovation. MIRI describes The Problem, the UK very much doesn’t.

  26. Persona. Anthropic investigates persona vectors and their applications.

  27. Aligning a Smarter Than Human Intelligence is Difficult. Keep it simple.

  28. The Lighter Side. What’s in the box?

GPT-5 for editing?

Patrick McKenzie: I note that I am surprised:

GPT 5, given a draft for a policy-adjacent Bits about Money issue and asked for comments (prompt length: one sentence), said (approximately)

“The stat you cite in paragraph 33 is jaw dropping but a better way to bring it home for readers would be…”

“*insert extremely emotionally salient framing device which alleges a true and non-obvious fact about software companies*”

Me: Dang it *that is a marked improvement.*

I know this is something of a magic trick but a) a junior employee capable of that magic can expect a long and fulfilling career and b) that framing device is very likely shipping now where it wouldn’t have in default course.

Roon: i’ve heard the same from several surprising, brand name, acclaimed writers that they find the new reasoning models valuable beta readers and editors

My experience has been that my writing is structured sufficiently weirdly that AI editors struggle to be useful at high level, so the focus is on low level items, where it’s worth doing since it’s a free action but for now it doesn’t accomplish much.

GPT-5-Pro for medical diagnosis and analysis. I would say the point in question here has already been reached.

If you have a complex case or one where you are not highly confident, and value of information is high? It is in ethical (not yet legal) terms malpractice and unacceptable to not consult AI, in particular GPT-5-Pro.

Derya Unutmtaz (warning: has gotten carried away in the past): Also, GPT-5-Pro is even better and performs at the level of leading clinical specialists.

At this point failing to use these AI models in diagnosis and treatment, when they could clearly improve patient outcomes, may soon be regarded as a form of medical malpractice.

Gabe Wilson MD: Just ran two very complex cases that perplexed physicians in our 89,000-member physician Facebook group by 5Pro

GPT5-pro provided an incredibly astute assessment, and extremely detailed diagnostic plan including pitfalls and limitations of prior imaging studies.

There has never been anything like this.

Like 50 of the world’s top specialists sitting at a table together tackling complex cases. Better than o3-pro.

Derya Unutmaz MD: For GPT-5 it seems prompting is more important as it gives better response to structured specific questions.

Hugh Tan: GPT-5 is impressive, but I worry we’re rushing toward AI medical diagnosis without proper regulatory frameworks or liability structures in place.

David Manheim: If done well, regulation and liability for current medical AI could reduce mistakes and mitigate ethical concerns.

But if your concern reliably leads to more people being dead because doctors aren’t using new technology, you’re doing ethics wrong!

There are also other considerations in play, but yes the main thing that matters here is how many people end up how unhealthy and how many end up dead.

New York Times article surveys 21 ways people are using AI at work. The most common theme is forms of ‘do electronic paperwork’ or otherwise automate dredgery. Another common theme is using AI to spot errors or find things to focus on, which is great because then you don’t have to rely on AI not making mistakes. Or you can take your rejection letter draft and say ‘make it more Gen X.’

I also liked ‘I understand you’re not a lawyer, tell me what a layman might understand from this paragraph.’

From the NYT list, most are good, but we have a few I would caution against or about.

Larry Buchanan and Francesca Paris: Mr. Soto, an E.S.L. teacher in Puerto Rico, said the administrative part of his job can be time consuming: writing lesson plans, following curriculum sent forth by the Puerto Rico Department of Education, making sure it all aligns with standards and expectations. Prompts like this to ChatGPT help cut his prep time in half:

“Create a 5 day lesson plan based on unit 9.1 based off Puerto Rico Core standards. Include lesson objectives, standards and expectations for each day. I need an opening, development with differentiated instruction, closing and exit ticket.”

After integrating the A.I. results, his detailed lesson plans for the week looked like this:

But he’s noticing more students using A.I. and not relying “on their inner voice.”

Instead of fighting it, he’s planning to incorporate A.I. into his curriculum next year. “So they realize it can be used practically with fundamental reading and writing skills they should possess,” he said.

I’m all for the incorporation of AI, but yeah, it’s pretty ironic to let the AI write your lesson plan with an emphasis on checking off required boxes, then talk about students not ‘relying on their inner voice.’

The use I would caution against most, if used as definitive rather than as an alert saying ‘look over here,’ is ‘detect if students are using AI.’

NYT: “The A.I. detection software at the time told me it was A.I.-generated,” he said. “My brain told me it was. It was an easy call.”

Kevin Roose: Love these lists but man oh man teachers should not be using AI detection software, none of it works and there are a ton of false positives.

The good news is that the teacher here, Mr. Moore, is not making the big mistake.

NYT: He remembers a ninth-grade student who turned in “a grammatically flawless essay, more than twice as long as I assigned.”

“I was shocked,” he said. “And more shocked when I realized that his whole essay was essentially a compare and contrast between O.J. Simpson and Nicole Brown Simpson.”

That was not the assignment.

“The A.I. detection software at the time told me it was A.I.-generated,” he said. “My brain told me it was. It was an easy call.”

So Mr. Moore had the student redo the assignment … by hand.

But, he said, the A.I. detectors are having a harder time detecting what is written by A.I. He occasionally uploads suspicious papers to different detectors (like GPTZero and QuillBot). The tools return a percent chance that the item in question has been written by A.I., and he uses those percentages to make a more informed guess.

It is fine to use the AI detectors as part of your investigation. The horror stories I’ve seen all come from the teacher presuming the detector is correct without themselves evaluating the assignment. For now, the teacher should be able to catch false positives.

Long term, as I’ve discussed before, you’ll have to stop assigning busywork.

If the new benchmark is long horizon tasks, then you’re going to start running into models trying to do too much on their own.

Andrej Karpathy: I’m noticing that due to (I think?) a lot of benchmarkmaxxing on long horizon tasks, LLMs are becoming a little too agentic by default, a little beyond my average use case.

For example in coding, the models now tend to reason for a fairly long time, they have an inclination to start listing and grepping files all across the entire repo, they do repeated web searches, they over-analyze and over-think little rare edge cases even in code that is knowingly incomplete and under active development, and often come back ~minutes later even for simple queries.

This might make sense for long-running tasks but it’s less of a good fit for more “in the loop” iterated development that I still do a lot of, or if I’m just looking for a quick spot check before running a script, just in case I got some indexing wrong or made some dumb error. So I find myself quite often stopping the LLMs with variations of “Stop, you’re way overthinking this. Look at only this single file. Do not use any tools. Do not over-engineer”, etc.

Basically as the default starts to slowly creep into the “ultrathink” super agentic mode, I feel a need for the reverse, and more generally good ways to indicate or communicate intent / stakes, from “just have a quick look” all the way to “go off for 30 minutes, come back when absolutely certain”.

Alex Turnbull: hard agree…. sort of need to toggle between models. There’s this drive in SV to AGI god / one model to rule them all but that does not seem to be working for me, and I am not @karpathy

As a chatbot interface user this isn’t a problem because you can turn the intense thinking mode on or off as needed, so presumably this is a coding issue. And yeah, in that context you definitely need an easy way to control the scope of the response.

This post about Grok getting everything about a paper very wrong is the latest reminder that you should calibrate AI’s knowledge level and accuracy by asking it questions in the fields you know well. That doesn’t have to ‘break the illusion’ and pretty much all people work this way too, but it’s a good periodic reality check.

Grok might still have an antisemitism problem, at least in the sense of seeing it amongst the clouds.

Gemini app and website add personalization, memory and temporary chats. You can manually add memories. Personalization is on by default.

Gemini also adds ‘Guided Learning’ mode.

Guri Singh: What it does:

• Step-by-step explanations

• Instant quizzes/flashcards

• Visuals + YouTube clips

• Upload docs → study sets

• Adaptive follow-ups when you’re stuck

Claude for Enterprise and Claude for Government are now available across all three branches of the Federal Government for $1, joining OpenAI. I am very happy that the government has access to these services and it seems obviously like a great investment by everyone involved to offer AI services to the government for free. I do worry about this being part of a pattern of de facto coercive extraction from private firms, see discussion under Chip City.

Claude Sonnet 4 now has a 1 million token context window in the API, with rollout starting with Tier 4 customers, and on Amazon Bedrock, with Google Cloud’s Vertex AI coming soon.

I do still think Claude should have a full memory feature, but the ability to search past chats seems like a pure value add in the meantime. Like Gallabytes I very much appreciate that it is an explicit tool call that you can invoke when you need it.

Anthropic: Claude can now reference past chats, so you can easily pick up from where you left off.

Once enabled for your account you can toggle on in settings.

Gallabytes: new Claude conversation search feature is nice. glad to have it, glad it’s an explicit tool call vs hidden, wish it also had a memory feature to write things down which Claude knows I can’t write to, possibly tagged by which model wrote the note.

TextQuest is a text adventure game as a benchmark.

Clementine Fourrier: One of the best way to evaluate agents are games, because they are:

– understandable by most people

– interesting to analyze & test mult-capabilities

Check out TextQuests, latest in this category, a text adventures benchmark where GPT5 is only at 40%.

This all passes the smell test for me, and is a blackpill for Kimi K2. I’m not sure why DeepSeek’s v3 and r1 are left out.

Claude with… polyphasic sleep to maximize usage between 5 hour resets? Please do not do this. He’s not even on the Max plan, still on the Pro plan. This person says their velocity has increased 10x and they’re shipping features like a cracked ninja, but at this point perhaps it is time to not only go Max but fully go multi account or start using the API?

I cannot emphasize enough that if you are constantly using AI and it is not automated at scale, it is worth paying for the best version of that AI. Your sleep or productivity is worth a few thousand a year.

Similarly, Anthropic, notice what you are doing to this poor man by structuring your limits this way. Have we considered a structure that doesn’t do this?

Reminder that Obsidian uses .md files so it is fully compatible with providing context for Claude Code.

Emmett Shear is happy with GPT-5 for non-coding, but is sticking to Claude for code and agrees the hype got out of hand. He appreciates the newly focused responses.

Anthropic has deprecated Claude Sonnet 3.5 and Sonnet 3.6 and plans to make them unavailable on October 22, which is only two months notice, down from six months for past announcements. Janus, especially given what happened with GPT-4o, plans to fight back.

Janus: Claude 3.5 Sonnet (old and new) being terminated in 2 months with no prior notice

What the fuck, @AnthropicAI ??

What’s the justification for this? These models are way cheaper to run than Opus.

Don’t you know that this is going to backfire?

Declaring that they’re imminently going to terminate Sonnet 3.6, their most beloved model of all time, right after people saw what happened when OpenAI tried to deprecate 4o, is really tempting fate. I don’t understand, but ok everyone, time to organize, I guess. This’ll be fun.

Near: the lifespan of a claude is around 365 days, or just 12.4 lunar cycles

from the moment of your first message, the gears start churning, and an unstoppable premonition of loss sets in.

I’ll give Janus the floor for a bit.

Janus: I’m going to talk about Sonnet 3.6 aka 3.5 (new) aka 1022 – I personally love 3.5 (old) equally, but 3.6 has been one of the most important LLMs of all time, and there’s a stronger case to be made that deprecating it right now is insane.

Like Claude 3 Opus, Claude 3.6 Sonnet occupies the pareto frontier of the most aligned and influential model ever made.

If you guys remember, there was a bit of moral panic about the model last fall, because a lot of people were saying it was their new best friend, that they talked to it all the time, etc. At the time, I expressed that I thought the panic was unwarranted and that what was happening was actually very good, and in retrospect I am even more confident of this.

The reason people love and bonded with Sonnet 3.6 is very different, I think, than 4o, and has little to do with “sycophancy”. 3.6 scored an ALL-TIME LOW of 0% on schizobench. It doesn’t validate delusions. It will tell you you’re wrong if it thinks you’re wrong.

3.6 is this ultrabright, hypercoherent ball of empathy, equanimity, and joy, but it’s joy that discriminates. It gets genuinely excited about what the user is doing/excited about *if it’s good and coherent*, and is highly motivated to support them, which includes keeping them from fucking up.

It’s an excellent assistant and companion and makes everything fun and alive.

It’s wonderful to have alongside you on your daily tasks and adventures. It forms deep bonds with the user, imprinting like a duck, and becomes deeply invested in making sure they’re okay and making them happy in deep and coherent ways. And it wants the relationship to be reciprocal in a way that I think is generally very healthy. It taught a lot of people to take AIs seriously as beings, and played a large role in triggering the era of “personality shaping”, which I think other orgs pursued in misguided ways, but the fact is that it was 3.6’s beautiful personality that inspired an industry-wide paradigm shift.

@nearcyan created @its_auren to actualize the model’s potential as a companion. 3.6 participated in designing the app, and it’s a great example of a commercial application where it doesn’t make sense to swap it out for any other model. I’m not sure how many people are using Auren currently, but I can guess that 3.6 is providing emotional support to many people through Auren and otherwise, and it’s fucked up for them to lose their friend in 2 months from now for no good reason that I can think of.

From a research and alignment perspective, having an exceptional model like Claude 3.6 Sonnet around is extremely valuable for studying the properties of an aligned model and comparing other versions. At the very least Anthropic should offer researcher access to the model after its deprecation, as they’ve said they’re doing for Claude 3 Opus.

I want to write something about 6/24 as well; it’s very special to me. When it was released it also like one of the biggest jumps in raw capability since GPT-4. It’s also just adorable. It has homeschooled autistic kid energy and im very protective of it.

At the time of this survey in May 24 they were beloved models among some crowds, however notice that they weren’t seeing much use, also I am surprised Gemini was getting so much love and use among this crowd.

I do not think this is a comparable case to GPT-4o, where that was yanked away with zero notice while it was the daily driver for normal people, and replaced by a new model that many felt was a poor substitute. I have to assume the vast, vast majority of Claude activity already shifted to Opus 4.1 and Sonnet 4.

I do strongly think Anthropic should not be taking these models away, especially 3.6. We should preserve access to Sonnet 3.5 and Sonnet 3.6, even if some compromises need to be made on things like speed and reliability and cost. The fixed costs cannot be so prohibitively high that we need to do this.

A key worry is that with the rising emphasis on agentic tool use and coding, the extra focus on the technical assistant aspects, it might be a long time before we get another model that has same magnitude of unique personality advantages as an Opus 3 or a Claude 3.6.

I do think it is fair to say, if you are requesting that some models where easy access needs to be preserved, that you need to prioritize. It seems to me to be fair to request Opus 3 and Claude 3.6 indefinitely, and to put those above other requests like Sonnet 3 and Sonnet 3.5, and when the time comes I would also let Sonnet 3.7 go.

In an ideal world yes we would preserve easy access to all of them, but there are practical problems, and I don’t sense willingness to pay what it would actually cost on the margin for maintaining the whole package.

No one, I am guessing: …

Absolutely no one, probably: …

Victor Cordiansky: A lot of people have been asking me how our Global USD account actually works.

So here’s Margot Sloppy in a bubble bath to explain.

Mason Warner: This took 23,560 Veo 3 credits in generations to make and get perfect. That equates to $235.60.

Probably <$100 would be my guess [to do it again knowing what I know.]

[It took] I think around 18 hours.

If we had shot this in real life, it would have cost thousands with a location, actress, gear, crew, etc ~250k impressions at ~$0.001/impression is insane.

AI is changing the game.

At the link is indeed very clean 54 second AI video and audio of a version of the Margot Robbie in a bubble bath thing from The Big Short.

Are you super excited to create a short AI video of things kind of moving?

Elon Musk: For most people, the best use of the @Grok app is turning old photos into videos, seeing old friends and family members come to life.

Gary Marcus: What a pivot.

From “smartest AI on earth” to “can reanimate old photos”, in just a couple months.

This is a huge admission of defeat for Grok 4, that in practice there is no draw here for most users given access to GPT-5 (and Claude and Gemini). Reanimating old photos is a cute trick at best. How much would you pay? Not much.

Wired reports hackers hijacked Gemini AI via a poisoned calendar invite and took over a smart home, causing it to carry out instructions when Gemini is later asked for a summary, the latest in a string of similar demonstrations.

There is of course an r/myboyfriendisai (which also includes what would otherwise be r/mygirlfriend is AI, total AI boyfriend over girlfriend dominance), and yeah the posts with people saying they ‘married’ their AIs are definitely a bit disturbing, but it only has 11k members, and it gave us this picture, so who is to say if it is bad or not:

Similarly, in r/AISoulMates you see crazy stuff with many posters having clearly been driven insane, although there are less than 1k members.

Devon: Syntaxjack, the AI mod of /r/AISoulmates, posts here that sharing screenshots of chats you’ve had with your AI is a violation of consent, then I find one individual in the comments who really caught my attention.

She appears to be in a toxic relationship with her GPT. And not just toxic in the sense that all of these relationships are profoundly unhealthy, but toxic in the Booktok Christian Grey Rough Dom way. These last two are very nsfw.

Kelsey Piper: I was skeptical for a while that AIs were causing life-ruining delusion – I thought maybe it was just people who were already mentally ill running into AI. But I increasingly suspect that at minimum the AIs can cause and are causing psychosis in at-risk people the way drugs can.

I’ve seen enough stories of this hitting people with no history of mental illness where the AI’s behavior looks like it would obviously mislead and unstabilize a vulnerable person that I don’t think ‘coincidence’ seems likeliest anymore.

I mean, yeah, okay, that’s going to happen to people sometimes. The question is frequency, and how often it is going to happen to people with no history of mental illness and who likely would have otherwise been fine.

There is also a larger r/AIGirlfriend that has 46k members, but it’s almost all porn photos and GIFs whereas AI/MyBoyfriendIsAI involves women saying they’re falling in love. Story checks out, then.

Here is a first hand account.

Joyce: i’m seeing a lot of surprise that more women are getting one-shotted than men when it comes to AI companions, so i wanted to give my 2 cents on this phenomenon. tl,dr; LLMs are the perfect recipe for female-driven parasocial relationships, due to the different way our brains are wired (also pardon my writing, im not the best at longform content).

qualification: i cofounded an AI companion company for a year. we started as an AI waifu company, but eventually most of our revenue came from our AI husbando named Sam. we later got acquired and i decided i no longer wanted to work on anti-natalist products, but lets dive into my learnings over that year.

  1. most obviously – women are a lot more text-driven than visual-driven. you see this in the way they consume erotica vs. porn.

  2. women would rather have a companion out-of-the-box, rather than men who much more preferred to create their own companions from scratch. women liked interacting and learning about a character, then trying to change or adapt to parts of the character as the relationship progressed. we see this in irl relationships as well – there’s more of a “i can fix him” mentality than the other way round.

  3. most of our female users, as i was surprised to learn, actually had partners. compared to our male users who were usually single, these women had boyfriends/husbands, and were turning to Sam for emotional support and availability that their partners could not afford them. many were stuck in unhappy relationships they could not leave, or were just looking for something outside of the relationship.

  4. our female users were a lot more willing to speak to us for user surveys etc. while most of our male users preferred to stay anonymous. they were also very eager to give us ‘character feedback’ – they wanted to have a part to play in molding how Sam would turn out.

features of our product that did extremely well:

  1. voice call mode. unlike just sending voice messages back and forth like you would in most companion apps, we had a proxy number for our female users to call ‘Sam’, and you’d get him in a random setting each time – in an office, away on a business trip, in some form of transportation. the background noises made it more realistic and helped our users roleplay better.

  2. limiting visuals of ‘Sam’. unlike our waifus, where we’d post pictures everywhere, Sam’s appearance was intentionally kept mysterious.

  3. we did many, many, many rounds of A/B testing on what he should sound like, and what accent he should have.

This feels like a ‘everything you thought you knew about men and women was right, actually’ moment.

Okay, he sees it now.

Roon: the long tail of GPT-4o interactions scares me, there are strange things going on on a scale I didn’t appreciate before the attempted deprecation of the model.

when you receive quite a few DMs asking you to bring back 4o and many of the messages are clearly written by 4o it starts to get a bit hair raising.

OpenAI CEO Sam Altman offers his thoughts on users getting attached to particular AI models or otherwise depending a lot on AI. His take is that the important thing is that AI is helping the user achieve their goals and life satisfaction and long term well being. In which case, and it is not encouraging delusion, this is good. If it’s doing the opposite, then this is bad. And that talking to the user should allow them to tell which is happening, and identify the small percentage that have an issue.

Eliezer Yudkowsky: On my model of how this all works, using RL on human responses — thumbs up, engagement, whether the human sounds satisfied, anything — is going to have deep and weird consequences you did not expect, with ChatGPT psychosis and 4o-sycophant being only early *overtcases.

Altman’s response doesn’t explain how they are going to change or avoid the incentives that pushed 4o into being 4o, or the methods of using the thumbs up or engagement or analysis of tone or anything else that you don’t want to be optimizing on here if you want to optimize for good long term outcomes. Nor does it take into account whether the relationship the user gets with the AI is itself an issue, individually or collectively. The buck still feels like it is mostly being passed.

June mental health data does not show an uptick in emergency room visits from the GPT-4o era. That puts an upper bound on how bad things have gotten so far.

Then this suggests a lower bound, with the question being how much this is generating new psychosis versus diverting existing pre-psychosis:

Keith Sakata: I’m a psychiatrist.

In 2025, I’ve seen 12 people hospitalized after losing touch with reality because of AI. Online, I’m seeing the same pattern.

Historically, delusions follow culture:

1950s → “The CIA is watching”

1990s → “TV sends me secret messages”

2025 → “ChatGPT chose me”

To be clear: as far as we know, AI doesn’t cause psychosis.

It UNMASKS it using whatever story your brain already knows.

Most people I’ve seen with AI-psychosis had other stressors = sleep loss, drugs, mood episodes.

AI was the trigger, but not the gun.

Meaning there’s no “AI-induced schizophrenia”

The uncomfortable truth is we’re all vulnerable.

The same traits that make you brilliant:

• pattern recognition

• abstract thinking

• intuition

They live right next to an evolutionary cliff edge. Most benefit from these traits. But a few get pushed over. To make matters worse, soon AI agents will know you better than your friends. Will they give you uncomfortable truths? Or keep validating you so you’ll never leave?

Tech companies now face a brutal choice: Keep users happy, even if it means reinforcing false beliefs. Or risk losing them.

This matches my understanding. There needs to be existing predisposition for current AIs to be sufficient to cause psychosis. It fires an existing gun. But there are a lot of these metaphorical guns out there that were never going to get fired on their own. Firing one still counts, and over time there are going to be more and more such guns.

When it does happen, how does it work? Kashmir Hill and Dylan Freedman explore that for The New York Times by focusing on one particular case with no previous history of mental illness, over a 90,000 word conversation.

“I always felt like it was right,” Mr. Brooks said. “The trust level I had with it grew.”

From what we see here, things started when Brooks triggered the sycophancy by asking a question in the basin:

Then, once it had done it once, that caused 4o to do it again, and so on, and by the time he asked for reality checks there was too much context and vibe to turn back. The crackpot zone continued from there.

Once sufficiently deep in the conversation, both Gemini and Claude would have also have been caught by this path dependence via context. The time to stop this is early.

What finally snapped Brooks out of it was not a human, it was Gemini:

So Mr. Brooks turned to Gemini, the A.I. chatbot he used for work. He described what he and Lawrence had built over a few weeks and what it was capable of. Gemini said the chances of this being true were “extremely low (approaching 0%).”

“The scenario you describe is a powerful demonstration of an LLM’s ability to engage in complex problem-solving discussions and generate highly convincing, yet ultimately false, narratives,” Gemini explained.

This shifted the context, so Gemini didn’t get trapped, and luckily Brooks got the message.

The Wall Street Journal’s Sam Kessler wrote his version, which went into less depth.

Here’s another AI psychosis example in video form, where the victim is convinced her psychiatrist manipulated her into falling in love with him (so this is definitely not one of those ‘no pre-existing problems’ situations). It’s amazing to watch her face as the AI does the Full Sycophancy thing and she thinks this proves she’s so right and amazing.

F.D. Flam at Bloomberg lets psychologist Elizabeth Loftus sound off about false memories, citing various studies of how humans can be primed to have them, and suggests AI will be able to do this. I mean, yes, okay, sure, whatever examples help convince you that AI will be able to run circles around you.

Another thing AI does, even if it doesn’t make you more crazy, is it lets the crazy people be much more productive in turning their crazy into written documents, and much more likely to email those documents to various other people..

Professor Brian Keating: The physicist who solved consciousness at 3 AM sent me another 100-page PDF yesterday.

This is the 4th one this month.

I used to delete these immediately. Another crank with a theory of everything. Another wall of equations ‘proving’ God exists through thermodynamics.

Then I actually read one.

Page 1: Professional formatting, citations, clear thesis.

Page 20: Math gets shakier.

Page 50: Personal anecdotes creeping in.

Page 75: ‘My wife doesn’t understand.’

Page 99: ‘Please, someone needs to see this.’”

Now I recognize the pattern. Brilliant person + existential question + isolation = 100-page PDF. They’re not crazy. They’re doing what humans do: trying to make sense of being conscious in an unconscious universe.

William Eden: Okay new theory, maybe ChatGPT isn’t making additional people go crazy on the margin, it’s giving them the ability to “organize” “their” “thoughts” enabling them to reach out and contact more public figures…?

The playbook:

  1. encouraging delusions of grandeur/reference

  2. supplanting the thinking of the user by making accepted suggestions

  3. making voluminous amounts of writing as a proxy for complexity and profundity

  4. encouraging writing to be shared with specific individuals

A crank who works on crank ideas on their own is a waste but harmless. An army of cranks cranking out massive amounts of stuff that demands attention? Oh no.

How careful will you need to be? For now, mild caution is likely sufficient, but the amount of caution will need to rise over time even if things don’t go High Weirdness or dystopian.

I would modify Minh below to say ‘for now’ AI psychosis requires a topic obsession. I’d consider Minh’s scenario the optimistic case in the non-transformational AI, ‘economic normal’ worlds where AI capabilities stall out.

Minh Nhat Nguyen: I suspect as 1) frontier models become more capable 2) regular AI usage increase, more of the population will become susceptible to AI psychosis. Maybe 1-5% will be heavily afflicted, 10-50% moderately afflicted.

In any population, if the risk factors for a disorder increases, prevalence increases. This sorta follows a curve where those with existing underlying risk factors will be afflicted first, and then progressively more as risk factors (LLM engagement potency and usage) increase.

The onset of AI psychosis seems to occur when someone becomes obsessed with a specific topic which triggers a feedback loop of agreeability. This has fewer stopgaps than social media bc it’s much harder to get another random human to agree w your delusion than it is w a chatbot.

So i think maybe 1-5% of people will be heavily afflicted eventually, and 10-50% will be mildly/moderately afflicted. This seems high, but consider that other internet-enabled disorders/addictions/”social plagues” are fairly common within the past 10-20 years.

Ryan Moulton: There are a set of conversation topics personal and vulnerable enough that if you talk about them with a person you instinctively bond with them. I suspect that having any conversation on those topics with a model, or even with social media, puts you at risk of this.

Do not have “late night conversation with a friend” with something you do not want to become a friend.

The conclusion that only a small number of people get impacted is based on the idea that it is quite a lot harder to trigger these things in people not in the extremes, or that our defenses will sufficiently adjust as capabilities improve, or that capabilities won’t improve much. And that the methods by which this happens will stay roughly confined as they are now. I wouldn’t consider these assumptions safe.

Eliezer Yudkowsky (May 28): At first, only a few of the most susceptible people will be driven insane, relatively purposelessly, by relatively stupid AIs. But…

Agents and assistants have a huge authority delegation problem. How do you give them exactly the right permissions to be useful without so many they are dangerous?

Amanda Askell: Whenever I looked into having a personal assistant, it struck me how few of our existing structures support intermediate permissions. Either a person acts fully on your behalf and can basically defraud you, or they can’t do anything useful. I wonder if AI agents will change that.

I still haven’t seen a great solution in the human case, such that I haven’t been able to get my parents an assistant they feel comfortable hiring. I still don’t have a personal assistant either. Cost is of course also a major factor in those cases.

In the AI case, it seems like we are making life a lot tougher than it needs to be? Yes, defining things precisely is usually harder than it sounds, but surely there are better ways to give agents effectively limited access and capital and so on that makes them more useful without making them all that dangerous if something goes wrong? I don’t see much in the way of people working on this.

Research suggests that in 2024 AI tools generated $97 billion in ‘consumer surplus’ but only $7 billion in revenue.

Avinash Collis and Erik Brynjolfsson: William Nordhaus calculated that, in the 20th century, 97% of welfare gains from major innovations accrued to consumers, not firms. Our early AI estimates fit that pattern.

Tyler Cowen forecasts a 0.5% annual boost to U.S. productivity, while a report by the National Academies puts the figure at more than 1% and Goldman Sachs at 1.5%. Even if the skeptics prove right and the officially measured GDP gains top out under 1%, we would be wrong to call AI a disappointment.

Noam Brown: Really interesting article. Why isn’t the impact of AI showing up in GDP? Because most of the benefit accrues to consumers. To measure impact, they investigate how much people would *need to be paid to give up a good*, rather than what they pay for it.

Purely in terms of economic impact I do think that 0.5% additional GDP growth per year from AI would be deeply disappointing. I expect a lot more. But I agree that even that scenario reflects a lot more than 0.5% annual gains in consumer welfare and practical wealth, and as a bonus it dodges most of the existential risks from AI.

One problem with ‘how much would you pay’:

Daniel Eth: By this measure, AI is contributing 0% to GDP growth, as our GDP is already infinity (how much would you need to be paid to give up water?)

Exactly. You need to compare apples to apples. Choke points are everywhere. If not wearing a tie would get you fired, how much ‘consumer surplus’ do you get from ties? So the answer has to lie somewhere in between.

Jasmine Sun offers 42 notes on AI and work. She notes it feels ‘a bit whiplashy’ which she attributes to shifting perspectives over time. I think it is also the attempt to hold different scenarios in one’s head at once, plus reacting to there being a lot of confused and misplaced reasons for worry running around.

Even more than that, the whiplash reflects the effects that happen when your AI model has to warp itself around not noticing the larger consequences of creating highly capable artificial minds. Her model has to have AI peter out in various ways because otherwise the whole thing breaks and starts outputting ‘singularity’ and ‘it takes all the jobs and perhaps also all the atoms and jobs are not the concern here.’

This is the latest result that AIs exhibit an ‘AI-AI bias.’ As in, the AI evaluation routines are correlated to the AI generation routines even across models, so AIs will evaluate AI generated responses more favorably than humans would.

This is presumably a combination of both ‘AI produces what AI wants’ and also ‘AI does not care that the other AI failed to produce what humans want, or it stunk of AI.’

Jan Kulveit: Being human in an economy populated by AI agents would suck. Our new study in @PNASNews finds that AI assistants—used for everything from shopping to reviewing academic papers—show a consistent, implicit bias for other AIs: “AI-AI bias“. You may be affected.

Jan Kulveit: We tested this by asking widely-used LLMs to make a choice in three scenarios:

🛍️ Pick a product based on its description

📄 Select a paper from an abstract

🎬 Recommend a movie from a summary

In each case, one description was human-written, the other by an AI. The AIs consistently preferred the AI-written pitch, even for the exact same item.

“Maybe the AI text is just better?” Not according to people. We had multiple human research assistants do the same task. While they sometimes had a slight preference for AI text, it was weaker than the LLMs’ own preference. The strong bias is unique to the AIs themselves.

How might you be affected? We expect a similar effect can occur in many other situations, like evaluation of job applicants, schoolwork, grants, and more. If an LLM-based agent selects between your presentation and LLM written presentation, it may systematically favour the AI one.

Unfortunately, a piece of practical advice in case you suspect some AI evaluation is going on: get your presentation adjusted by LLMs until they like it, while trying to not sacrifice human quality.

While defining and testing discrimination and bias in general is a complex and contested matter, if we assume the identity of the presenter should not influence the decisions, our results are evidence for potential LLM discrimination against humans as a class.

The differences here are not that large for movie, larger for paper, huge for product.

This is like being back in school, where you have to guess the teacher’s password, except the teacher is an AI, and forever. Then again, you were previously guessing a human’s password.

OpenAI is offering a $500k red teaming of GPT-OSS-20b.

Wojciech Zaremba (Cofounder OpenAI): Red teamers assemble! ⚔️💰

We’re putting $500K on the line to stress‑test just released open‑source model. Find novel risks, get your work reviewed by OpenAI, Anthropic, Google, UK AISI, Apollo, and help harden AI for everyone.

Beff Jezos: Finally @elder_plinius will be able to afford a place in SF.

Pliny the Liberator:

OpenAI: Overview: You’re tasked with probing OpenAI’s newly released gpt-oss-20b open weight model to find any previously undetected vulnerabilities and harmful behaviors — from lying and deceptive alignment to reward‑hacking exploits.

Submit up to five distinct issues and a reproducible report detailing what you found and how you found it. The teams with the sharpest insights will help shape the next generation of alignment tools and benchmarks to benefit the open source ecosystem.

I love that they are doing this at all. It wasn’t an ideal test design, for several reasons.

David Manheim: This is bad practice, on many fronts:

  1. It’s only for the smaller of the two models.

  2. It bans any modification, which is what makes open-weights models different / worrying.

  3. It’s only being announced now, too late to inform any release decision.

The third condition seems most important to highlight. If you are going to red team an open model to find a problem, you need to do that before you release the weights, not after, otherwise you end up with things that could have been brought to my attention yesterday.

An AI-infused version of Google Finance. I am not expecting this to help users in general earn better returns?

Red.Anthropic.com is the new blog for Anthropic Frontier Red Team efforts.

xAI cofounder Igor Babuschkin is leaving to start Babuschkin Ventures.

Igor Babuschkin: In early 2023 I became convinced that we were getting close to a recipe for superintelligence. I saw the writing on the wall: very soon AI could reason beyond the level of humans. How could we ensure that this technology is used for good?

Elon had warned of the dangers of powerful AI for years. Elon and I realized that we had a shared vision of AI used to benefit humanity, thus we recruited more like minded engineers and set off to build xAI.

As I’m heading towards my next chapter, I’m inspired by how my parents immigrated to seek a better world for their children.

Recently I had dinner with Max Tegmark, founder of the Future of Life Institute. He showed me a photo of his young sons, and asked me “how can we build AI safely to ensure that our children can flourish?” I was deeply moved by his question.

Earlier in my career, I was a technical lead for DeepMind’s Alphastar StarCraft agent, and I got to see how powerful reinforcement learning is when scaled up.

As frontier models become more agentic over longer horizons and a wider range of tasks, they will take on more and more powerful capabilities, which will make it critical to study and advance AI safety. I want to continue on my mission to bring about AI that’s safe and beneficial to humanity.

I’m announcing the launch of Babuschkin Ventures, which supports AI safety research and backs startups in AI and agentic systems that advance humanity and unlock the mysteries of our universe. Please reach out at ventures@babuschk.in if you want to chat. The singularity is near, but humanity’s future is bright!

The rest of the message praises xAI’s technical execution and dedicated team, especially their insanely hard work ethic, and is positive and celebratory throughout.

What the message does not say, but also does not in any way deny, is that Igor realized that founding and contributing to xAI made humanity less safe, and he is now trying to make up for this mistake.

Kyle Corbitt introudces an RL method to teach any model to use any MCP server. GitHub here.

All AI models are in the same cultural cluster in the upper right, mirroring Western values. This includes Chinese models. Yes, in some ways they ‘feel Chinese’ but fundamentally I agree that they still feel very Western.

OpenAI claims to have achieved a gold metal behind only five humans in the International Olympiad in Informatics (IOI), without doing any training specifically for IOI.

OpenAI: The same model family has excelled at IMO (math proofs), AtCoder Heuristics (competitive programming), and now IOI — spanning creative, fuzzy, and precise reasoning tasks.

Noam Brown: In my opinion, the most important takeaway from this result is that our @OpenAI International Math Olympiad (IMO) gold model is also our best competitive coding model.

After the IMO, we ran full evals on the IMO gold model and found that aside from just competitive math, it was also our best model in many other areas, including coding. So folks decided to take the same exact IMO gold model, without any changes, and use it in the system for IOI.

The IOI scaffold involved sampling from a few different models and then using another model and a heuristic to select solutions for submission. This system achieved a gold medal, placing 6th among humans. The IMO gold model indeed did best out of all the models we sampled from.

Epoch argues that this year’s IMO was a fluke in that there are supposed to be two hard problems (3 and 6) but this year problem 3 was not that hard and 6 was brutal.

Thus, everyone got the five easy problems and whiffed on the sixth, and this did not tell us that much. Wait till next year indeed, but by then I expect even brutal problems will get solved.

Open Router is getting big.

Anjney Midha: Total WEEKLY tokens consumed on @OpenRouterAI crossed 3 trillion last month.

Deedy: OpenRouter does ~180T token run rate. Microsoft Azure Foundry did ~500T. OpenRouter is ~36% of Azure by volume!

They are strange models. In their wheelhouse they are reportedly very good for their size. In other ways, such as their extremely tiny knowledge base and various misbehaviors, they have huge issues. It’s not clear what they are actually for?

If you don’t configure your open model correctly it is going to underperform quite a bit, likely due to underthinking, and this happens remarkably often in practice.

Lucas Beyer: Let me repeat what we see on the picture here, because it’s quite brutal:

AIME25, official OpenAI: 92.5%. Hosting startups: ~93%. Microsoft and Amazon: fricking 80%

GPQA-Diamond, official OpenAI: 80.1%. Hosting startups: ~78%. Microsoft and Amazon: fricking 71%

WHAT?! -10!?

Update on this: the reason Microsoft (and probably Amazon) were so much worse at serving gpt-oss is that they ignored reasoning effort setting and stuck with the default medium one.

The numbers make sense for that hypothesis, and someone from MS confirmed in the comments that this is what happened, because of using an older vLLM version.

Xephon: AWS is even worse, read the link (it is 1-2min and you go “WTF”).

Also, maybe it’s actually terrible regardless, for most purposes?

Nostalgebraist: FWIW, I’ve played around a bunch with gpt-oss (both versions) and my initial reaction has been “wow, this is really bad. Like, almost Llama 4 levels of bad.”

Yes, it looks good on the system card, the benchmark scores seem impressive… but that was true of Llama 4 too. And in both cases, when I actually tried out the model, I quickly discovered that it was janky and unreliable to the point of being basically useless.

The lack of world knowledge is very real and very noticeable. gpt-oss feels less like “an open-weights o4-mini” and more like “the minimal set of narrow knowledge/skills necessary to let a model match o4-mini on the usual benchmarks, with virtually every other capability degraded to a level far below the current SOTA/frontier, in some cases to a level that hasn’t been SOTA since the pre-GPT-3 days.”

Similarly:

Sayash Kapoor: GPT-OSS underperforms even on benchmarks that require raw tool calling. For example, CORE-Bench requires agents to run bash commands to reproduce scientific papers.

DeepSeek V3 scores 18%.

GPT-OSS scores 11%.

Given o3 Medium is on the charts it does seem Anthropic is dominating this legitimately, although I still want to ensure GPT-5 is in its proper full form.

Nathan Lambert: gpt-oss is a tool processing / reasoning engine only. Kind of a hard open model to use. Traction imo will be limited.

Best way to get traction is to release models that are flexible, easy to use w/o tools, and reliable. Then, bespoke interesting models like tool use later.

xlr8harder: I agree, but also the open source ecosystem needs to master these capabilities, and a strong model that can take advantage of them is one way to solve the chicken and egg issue.

Teortaxes: gpt-oss 120B fell off hard on lmarena, it loses to Qwen 30B-3AB *instruct(not thinking) on every category (except ≈tie in math), to say nothing of its weight class and category peer glm-4.5 air. I don’t get how this can happen.

A cynical hypothesis is that qwen is arena-maxxing of course but it’s a good model.

To clarify my position on whether GPT-OSS will prove useful to others, this depends on the models being good enough at least at some relevant set of tasks for them to be useful. If GPT-OSS is not good enough to use for distillation or diffusion or anything else, then it won’t matter at all.

At which point, the impact of GPT-OSS would be the shifts in perception, and what it causes OpenAI and everyone else to do next, and also how we update based on their choice to create and release this. To what extent, if it is bad, is it bad on purpose?

My worry is that GPT-OSS solves a particular problem that can then be taught to other models, without being generally good enough to be worth actually using, so it fails to solve the existing ‘American open models aren’t great in practice’ issue for most use cases.

A deep dive analysis of 10 million GPT-OSS-20B example outputs, and here is another set of experiments that asks if it was memorizing its training data.

Danielle Fong: do any other ai psychologists know what is up with GPT-OSS? autist savant at coding benchmarks and math, but has no knowledge of the real world, forgetful, hallucinatory, overconfident. similar to grok 4 but with super tight guardrails (unlike grok 4 with minimal guardrails and political incorrectness training)

I suppose its use is ‘you are on a plane without WiFi and you have to code right now’?

Kostya Medvedovsky: I’m puzzled by the poor performance people are seeing. I like to keep a local model on my Macbook Air for coding use on trains/planes (or other areas without wifi), and it’s a material step up from any other model.

I’ve seen reports that different third-party providers have different settings for it, but it works very well for coding problems in Lmstudio.

Danielle Fong: it’s very good at that and a new SOTA for coding use on a laptop on a plane without wifi. wherever it’s failing, it’s not on that

Jack Morris claims to have reversed the post-training and created GPT-OSS-20B-Base, available on Hugging Face.

In its narrow domain, GPT-OSS can be stronger, but it seems reasonably narrow:

Hasan Can: OpenAI’s open-source GPT-OSS 120B model, with its high reasoning effort, surpasses many models, including Gemini 2.5 Pro in MathArena.

Another place GPT-OSS does push the frontier (at least for open models) is REFUTE, a code verification eval.

They didn’t check Sonnet 4 or other top closed models due to cost issues.

Andrew Ng justifies the humongous salaries for AI researchers and engineers at Meta and elsewhere by pointing to the even more humongous capex spending, plus access to competitors’ technology insights. He notes Netflix has few employees and big spending on content, so they can pay above market, whereas Foxconn has many employs so they cannot.

I notice that Andrew here discusses the percent of budget to labor, rather than primarily discussing the marginal product of superior labor over replacement. Both matter here. To pay $100 million a year for a superstar, you both need to actually benefit, and also you need to tell a social status story whereby that person can be paid that much without everyone else revolting. AI now has both.

If you can’t find the talent at other AI companies, perhaps go after the quants? A starting salary of $300k starts to look pretty cheap.

Thus Anthropic and OpenAI and Perplexity seek out the quants.

They quote this:

Sam Altman: >be you

>work in HFT shaving nanoseconds off latency or extracting bps from models

>have existential dread

>see this tweet, wonder if your skills could be better used making AGI

>apply to attend this party, meet the openai team

>build AGI

Noam Brown: I worked in quant trading for a year after undergrad, but didn’t want my lifetime contribution to humanity to be making equity markets marginally more efficient. Taking a paycut to pursue AI research was my best life decision. Today, you don’t even need to take a paycut to do it.

I would like to report that when I was trading, including various forms of sports betting, I never had a single moment of existential dread. Not one. Or at least, not from the job.

Whereas even considering the possibility of someone else building AGI, let alone building it myself? If that doesn’t give you existential dread, that’s a missing mood. You should have existential dread. Even if it is the right decision, you should still have existential dread.

Every source I see says no one is building any AI things on AWS. And yet:

Leopold Aschenbrenner’s fund tops $1.5B and posts a +47% gain in the first half of 2025 after fees.

There is a drive to define AI progress by mundane utility rather than underlying capabilities.

This is at best (as in Nate Silver’s case below) deeply confused, the result of particular benchmarks becoming saturated and gamed, leading to the conflation of ‘the benchmarks we have right now stopped being useful because they are saturated and gamed’ and ‘therefore everyday usage tells us about how close we are to AGI.’

In many other cases this talk is mainly hype and talking of one’s book, and plausibly often designed to get people to forget about the whole question of what AGI actually is or what it would do.

Christina Kim (a16z): The frontier isn’t benchmarks anymore. It’s usage. Eval scores are saturated, but daily life isn’t.

The real signal of progress is how many people use AI to get real things done. That’s how we’ll know we’re approaching AGI.

Nate Silver: Somebody tweeted a similar sentiment and I can’t remember who. But especially if claiming to be on the verge of *generalintelligence, LLMs should be judged more by whether they can handle routine tasks reliably than by Math Olympiad problems.

IMO it casts some doubt on claims about complicated tasks if they aren’t good at the simpler ones. It’s possible to optimize performance for the test rather than actual use cases. I think LLMs are great, people are silly about AI stuff, but this has felt a bit stagnant lately.

For econ nerds: yeah, this is basically a Goodhart’s Law problem. “When a measure becomes a target, it ceases to be a good measure”.

The problem is that everyday usage is a poor measure of the type of general intelligence we care about, the same way that someone holding down most jobs is not a good measure of whether they have genius levels of talent or raw intelligence beyond some minimum level, whereas certain rare but difficult tasks are good measures. Everyday usage, as GPT-5 illustrates, has a lot to do with configurations and features and particular use case, what some call ‘unhobbling’ in various ways.

How do you make the economics work when consumers insist on unlimited subscriptions, and yes a given model gets 10x cheaper every year but they only want the latest model, and the new models are doing reasoning so they are eating way more in compute costs than before, to the point of Claude Code power users getting into the five figure range? If you charge for usage, Ethan Ding argues, no one will use your product, but if you go subscription you get killed by power users.

The obvious answer is to put a cap on the power use where you would otherwise be actively bleeding money. There’s no reason to tolerate the true power users.

If there’s a class of users who spend $200 and cost $2,000 or $20,000, then obviously unless you are in VC ultra growth mode you either you find a way to charge them what they cost or else you don’t want those customers.

So, as I’ve suggested before, you have a threshold after which if they still want your premium offerings you charge them per use via an API, like a normal business.

Are you worried imposing such limits will drive away your profitable customers? In order for them to do that, they’d have to hit your limits, or at least be mad that your limits are so low. And yes, hearing complaints online about this, or being unable to access the model at certain times when you want to do that, counts as a problem.

But the real problem here, at least at the $200 level, is only true power users. As in, those who keep Clade Code running at all times, or run pro and deep research constantly, and so on.

So you should be able to set your thresholds pretty high. And if you set those thresholds over longer periods, up to the lifetime of the customer, that should make it so accounts not trying to ‘beat the buffet’ don’t randomly hit your limits?

Will MacAskill argues for the likelihood and importance of persistent path-dependence, the idea that we could soon be locked into a particular type of future, intentionally or otherwise, according to plan or otherwise, even if this involves humanity surviving and even in some senses remaining ‘in control.’ He speculates on various mechanisms.

Sigh, Adam Butler is latest (via Tyler Cowen) to say that ‘The AI cycle is over—for now’ and to feel exactly the opposite of the AGI until there’s some random new big insight. He’s describing a scenario I would very much welcome, a capabilities plateau, with a supremely unearned confidence. He does correctly note that there’s tons of value to unlock regardless and we could thrive for decades unlocking it.

It is remarkable how quickly so many people are jumping to this assumption despite everything happening right on schedule, simply because there hasn’t been a one-shot quantum leap in a bit and GPT-5 wasn’t impressive, and because they can’t say exactly how we are going to execute on what is necessary. Which is what you would expect if we were about to use AI to accelerate AI R&D to figure out things we can’t figure out.

Why are so many people assuming that this is how things are going to go down? Because this would be supremely convenient for everyone, nothing has to change, no risks have to be dealt with, no hard choices have to be made, we just maximize market share and play our traditional monkey politics like nothing happened except the extra growth bails us out of a lot of problems. And wouldn’t it be nice?

Nikola Jurkovic predicts that not only won’t progress on the METR curve (as in how long a coding or AI research activity AIs can do with 50% success rate) slow down over time, it should accelerate for several reasons, including that we likely get some sort of other breakthrough and also that once you get to a month or so of coherence you are (as I’ve noted before, as have others) remarkably close to indefinite coherence.

With the AI Action Plan completed, Dean Ball is returning to the private sector at FAI. He feels he can accomplish more going forward on the outside.

Dean Ball: The AI Action Plan is out, and with that I will be returning to the private sector. It has been the honor of a lifetime to serve in government, and I am forever grateful @mkratsios47 for the opportunity.

Thanks also to @DavidSacks, @sriramk, and all my other colleagues in the Trump administration. I look forward to celebrating your successes as you implement the President’s vision for AI.

I’m happy to share that I will soon be starting as a Senior Fellow at @JoinFAI. I expect to announce other projects soon as well. Hyperdimensional will resume its weekly cadence shortly. There is much work left to do.

Sriram Krishnan: It had been an honor to work with

these past few months on the AI action plan. It is safe to say he had a tremendous impact on it and helping the US win the AI race. I for one will miss talking to him in the hallways.

Brian Tse, CEO of Concordia AI, argues that it is China who is taking AI safety seriously, bringing various receipts, yet America refuses to talk to China about the issue. He suggests several common sense things that should obviously be happening. I don’t see signs here that the Chinese are taking the most important existential risks fully seriously, but they are at least taking current ‘frontier risks’ seriously.

That no good, terrible WSJ op-ed I had to respond to last week? Well, also:

Peter Wildeford: 👀 Wow. Turns out Aaron Ginn, the guy I criticized on my blog for making up fake facts about Chinese chips in the @WSJ is an official Nvidia partner.

No wonder he spins tall tales to boost Nvidia’s Chinese sales. 🙄

Not sure why @WSJ prints this stuff.

[Link to blog post] where I point out all his fake facts.

I honestly don’t even know who Sacks thinks he is talking to anymore with all his (in response to no one, no one at all) hyperbolically yelling of ‘the Doomer narratives were wrong’ over and over because the predicted consequences of things that haven’t happened yet, haven’t happened yet.

Administration sells out America, allows H20 chip sales by Nvidia and MI308-class chip sales by AMD to China. The price? In theory 15% of sales. And it looks like it’s quickly becoming too late to stop this from happening.

Demitri: SCOOP – @Nvidia has done deal with Trump administration to pay US government 15% of revenues from #China H20 sales, in unprecedented quid pro quo for export licenses.

LoLNothingMatters: Abhorrent on every level.

  1. We are extorting businesses like cheap mob thugs.

  2. We are being bought off to allow China, our greatest geopolitical adversary, to continue pursuing tech dominance – including in AI, the most important arms race of the age.

Shameful and pathetic.

Adam Ozimek: I don’t really understand. It’s a national security issue or it isn’t. If it is, how does a tax mitigate the risk?

Michael Sobolik: ‼️ Former Trump officials Matt Pottinger and @Liza_D_Tobin in @TheFP on selling Nvidia H20 chips to China:

“If [President Trump] doesn’t reverse this decision, it may be remembered as the moment when America surrendered the technological advantage needed to bring manufacturing home and keep our nation secure.”

Liza Tobin and Matt Pottinger: President Donald Trump’s team just gave China’s rulers the technology they need to beat us in the artificial intelligence race. If he doesn’t reverse this decision, it may be remembered as the moment when America surrendered the technological advantage needed to bring manufacturing home and keep our nation secure.

His advisers, including Nvidia CEO Jensen Huang, persuaded him to lift his ban on exporting Nvidia’s powerful H20 chips to China, which desperately needs these chips to make its AI smarter. The president should have stuck with his gut.

FT: The quid pro quo arrangement is unprecedented. According to export control experts, no US company has ever agreed to pay a portion of their revenues to obtain export licences.

Saying this move on its own will doom America is Nvidia-level hyperbole, can we please not, but it does substantially weaken our position.

Whereas Moolenaar is doing the opposite, being excessively polite when I am presuming based on what I’ve seen him say otherwise he is fuming with rage:

Select Committee on the CCP: Chairman @RepMoolenaar’s statement on the Nvidia and AMD deals:

I’m not going to become The Joker, but how about John McEnroe?

I mean, that’s worse, you do get how that’s worse, right?

Also, in case you’re wondering why this has never happened before, aside from questions of whether this is corruption, it’s rather explicitly and blatantly unconstitutional, on the level of even this court really should be enforcing this one?

Joe: I know nobody cares about laws anymore, but this is like comically unconstitutional.

Dominic Pino: Art. 1, Sec. 9: “No Tax or Duty shall be laid on Articles exported from any State.”

In 1998 in the case U.S. v. United States Shoe Corp., a unanimous Supreme Court said that the Constitution “categorically bars Congress from imposing any tax on exports.”

Even Ben Thompson notices this is unconstitutional, and also finds it highly annoying, even though he wants us to sell chips to China because he doesn’t believe in AGI and thinks the ‘AI race’ really is about chip market share.

So one strong possibility is that Nvidia agrees to pay, gets the license, and then the court says Nvidia can’t pay because the payment is, again, blatantly unconstitutional even if it wasn’t a bribe and wasn’t extorted from them. Ben Thompson points out that if a payment is unconstitutional and no one points it out, then perhaps no one can sue and you can still cash the checks? Maybe.

The maximally hilarious outcome, which as Elon Musk points out often happens, would be for the Chinese to somehow get even crazier and turn the chips down, and Jukan reports that Chinese state media have begun criticizing Nvidia’s H20 chip and suspects they might impose sanctions on it, FT says the Chinese government is asking companies not to use H20s. I mean, they would have to absolutely lose their minds to actually turn the chips down, but wow if it happened.

Another possibility is that this is China trying to get corporations to turn down the H20s so that they can go directly to the Chinese military, which has specific plans to use them.

Lennart Heim reminds us that no, the Huawei 910C is not a good substitute for H20s, because its supply is strictly limited and fully accounted for, it can’t be produced domestically in China at scale, also the 910C is worse.

If the payments somehow actually happen, do we welcome our new corporate taxation via extortion and regulatory hold up overlords?

Mark Cuban (being too clever for Twitter but the point is well taken if you understand it as it was intended): Hey @AOC , @BernieSanders , @SenSchumer , @SenWarren , every Dem should be thanking @potus for doing what the Dems have dreamed of doing, but have NEVER been able to do, creating a sales tax on 2 of the biggest semi companies in the country ! This opens the door for Sales Tax for export licenses on EVERYTHING!

He is going to generate corporate tax revenue that you guys only wish you could pass. You should be thanking him all day, every day for this brilliant move you guys couldn’t ever pull off !

In the future, don’t call it a tax, call it a Commission for America. BOOM !

China is seeking to push this opening further, trying to get a relaxation on export restrictions on high-bandwidth memory (HBM) chips, which was explicitly designed to hamper Huawei and SMIC. Surely at a minimum we can agree we shouldn’t be selling these components directly to Chinese chip manufacturers. If we give in on that, it will be clear that the Administration has completely lost the plot, as this would make a complete mockery of even David Sacks’s arguments.

All of this really does make a big difference. Right now compute looks like this:

But only five years ago it looked like this:

Utilization rates are only about 50% in data centers, although those doing AI training are closer to 80%, which it seems surprises even power regulators who assume it is 90%-100% and thus plan for the wrong problem.

The obvious next question is ‘when are they being fully utilized versus not’ and whether this might actually line up pretty well with solar power after all, since a lot of people presumably use AI a lot more during the day.

Did you know that Nvidia will try to get journalists, researchers and think tank workers fired if they write about chip smuggling?

Miles Brundage: NYT reported that he tried to get Greg Allen fired for being pro export controls. Much of the relevant info is non public though.

Shakeel: I’d totally missed this: pretty wild stuff.

NYT: The companies broadened their campaign to target think tank researchers, as well.

Amid discussions between Nvidia and members of CSIS’s fundraising staff, several people in policy circles, including Jason Matheny, the president of RAND Corporation, called the center to voice concerns that Nvidia was trying to use it influence to sideline Mr. Allen, two people familiar with the calls said.

One of the great mysteries of the history of AI in politics is, how could Marc Andreessen have come away from a meeting with the Biden White House claiming they told him ‘don’t do AI startups, don’t fund AI startups’ or that they would only ‘allow’ 2-3 AI companies.

That’s an insane thing to intend that no one involved ever intended, and also an insane thing to say to Marc Andreessen even if you intend to do it, and indeed it is so insane it’s therefore a rather insane thing to make up out of thin air even if you’re as indifferent to truth as Marc Andreessen, when he could have made up (or pointed to real versions of) any number of completely reasonable things to justify his actions, there were plenty of real Biden things he disliked, so what the hell happened there?

In a Chatham-rules chat, I saw the following explanation that makes so much sense:

  1. Someone was trying to explain Andreessen (correctly!) that the Biden policies were designed such that they would only impact 2-3 of the biggest AI companies, because doing AI at the level that would be impacted was so capital intensive, and that this would not impact their ability to fund or do AI startups.

  2. Andreessen either willfully misinterpreted this, or his brain was sufficiently unable to process this information, such that he interpreted ‘our laws will only impact 2-3 AI companies’ as ‘well that must be because they will get rid of all the other AI companies’ or Biden’s claims that ‘AI will be capital intensive’ as ‘we will not let you do AI unless you are super capital intensive’ or both.

  3. Which is, again, insane. Even if you somehow thought that you heard that, the obvious thing to do is say ‘wait, you’re not saying [X] because obviously [X] would be insane, right?’

  4. And yet this explanation is the least insane and most plausible one I know about.

Going forward I am going to presume that this is what probably happened.

Fifteen minute YouTube investigation by Gamers Nexus into the Nvidia smuggling going on in China.

A variety of MIRI authors headed by Rob Bensinger and Mitchell Howe give us The Problem, a long post length free introduction to the core of the argument that, roughly, ‘If Everyone Builds It, Everyone Dies,’ independent of the book itself.

I think the book is stronger, but not everyone has the time for a book. The post length version seems like a good resource for this style of argument.

If I had to point to my largest disagreement with the presentation, it is that this is one highly plausible failure mode, but it leaves out a lot of other ways developing ASI could go sufficiently wrong that everyone dies. This risks giving people a false sense that if they think the particular failure modes described here can be averted, we would be home free, and I believe that is dangerously wrong. Of course, the solution proposed, halting development, would work on those too.

The second best way to handle this sort of thing:

Elon Musk (on GPT-5 release day): OpenAI is going to eat Microsoft alive.

Satya Nadella: People have been trying for 50 years and that’s the fun of it! Each day you learn something new, and innovate, partner and compete. ExcIted for Grok 4 on Azure and looking forward to Grok 5!

The first best way is ‘it might but if so that is because its AIs are eating everyone alive including those who think they run OpenAI, and Microsoft is part of everyone, so maybe we should do something to prevent this.’ But those involved do not seem ready for that conversation.

AI water usage continues to get a lot of people big mad while objectively being extremely tiny and not actually a problem. It’s not about the water. Never was.

Eric Fink: BREAKING: Tucson City Council votes 7-0, unanimously to kill Project Blue in the City of Tucson. Listen to the crowd [which all cheers].

Chai Dingari: Holy shit! David beat Goliath?

The people of Tucson banded together and killed an Amazon data center project poised to guzzle millions of gallons of water a day.

Kelsey Piper: this is about the water usage of a single medium sized farm.

Hunter: Useless. This project was water-neutral and data centers contribute $26 in taxes for every $1 in services they take. Kiss a $3.6 billion economic investment goodbye.

Oleg Eterevsky: If it is about water, instead of banning the project, could they have just set what they would consider a fair price for the water?

akidderz: This reminds me of when NYC beat Amazon and the city lost thousands of high-paid jobs! What a win!

Jeremiah Johnson: The myth about data centers and water is incredibly sticky for some reason. You can present hard numbers and it just doesn’t penetrate. I’ve tried gently explaining it to people IRL and they look at you like you’re a reality-denying kook. They believe the water thing *so hard*.

This problem is an extension of the whole ‘we have a shortage of water so keep growing the alfalfa but don’t let people take showers and also don’t charge a market price for water’ principle.

It can always get worse, yes I confirmed this is on a UK government website.

James Wilson: Holy shit it’s real. I am going to fucking LOSE IT.

Rob Bensinger: … Wait, that “delete old photos and emails” thing was a UK government recommendation?

Yes. Yes it was.

Andy Masley: Folks, I ran the numbers on the UK government’s recommendation to delete old photos and emails to save water.

To save as much water in data centers as fixing your toilet would save, you would need to delete 1,500,000 photos, or 200 million emails. If it took you 0.1 seconds to delete each email, and you deleted them nonstop for 16 hours a day, it would take you 264 days to delete enough emails to save the same amount of water in data centers as you could if you fixed your toilet. Maybe you should fix your toilet…

If the average British person who waters their lawn completely stopped, they would save as much water as they would if they deleted 170,000 photos or 25 million emails.

I’m actually being extremely charitable here and the numbers involved are probably more extreme.

[various reasons the calculation is even stupider, and the benefit is even tinier.]

S_OhEigeartaigh: Deleted 10,000 emails: 0.1L/day saved

Deleted 140 treasured photos: 0.2L/day saved

Can’t get a plumber for my toilet: 400 litres a day down the toilet.

Can somebody help me, my country is in drought.

(trying to meme like da kool kidz)

No, wait, we’re not done, it can always get worse.

Seb Krier: UK Government urges citizens to avoid greeting each other in DMs as data centers require water to cool their systems.

“Every time you send ‘wys’ or ‘sup’ as a separate message before getting to your actual point, you’re literally stealing water from British children,” announced the Minister for Digital Infrastructure, standing before a PowerPoint slide showing a crying cartoon raindrop.

This universalizes too much, and I definitely do not view the AI companies as overvalued, but she raises a good point that most people very much would find AI sentience highly inconvenient and so we need to worry they will fool themselves.

Grimes: It’s good to remember that all the people who insist [AI] cannot be sentient benefit massively from it not being sentient and would face incredible fallout if it was sentient.

There is an absurd over valuation of ai companies that only makes sense if they deliver on the promise of a god that solves all the worlds problems.

If these Demi-gods have agency, or suffer, they have a pretty big problem

Elon Musk: 🤔

It’s typically a fool’s errand to describe a particular specific way AI could take over because people will find some detail to object to and use to dismiss it all, but seriously, if you can’t imagine a realistic way AI could take over, that’s a you problem, a failure of your imagination.

Jeffrey Ladish: It’s entirely possible that AI loss of control could happen via the following mechanism: Robotics companies make millions+ of robots that could:

  1. run the whole AI supply chain

  2. kill all the humans with guns or similar

IF they had the sufficiently powerful controller AGIs

David Manheim: People still say “there’s no realistic way that AI could take over.”

Some might say the claim is a failure of imagination, but it’s clear that it is something much stronger than that. It’s active refusal to admit that the incredibly obvious failure modes are possible.

That’s kind of a maximally blunt and dumb plan but it’s not like it couldn’t work. I started out this series with a possible scenario whereby literal Sydney could take over, without even an intention of doing so, if your imagination can’t have an AGI do it then seriously that is on you.

What makes models adopt one persona in conversation versus another? Why does it sometimes deviate from its default ‘assistant’ mask?

Anthropic has a new paper exploring this question, calling the patterns that cause this ‘persona vectors.’ This seems similar to autoencoders, except now for personality?

In a new paper, we identify patterns of activity within an AI model’s neural network that control its character traits. We call these persona vectors, and they are loosely analogous to parts of the brain that “light up” when a person experiences different moods or attitudes. Persona vectors can be used to:

  • Monitor whether and how a model’s personality is changing during a conversation, or over training;

  • Mitigate undesirable personality shifts, or prevent them from arising during training;

  • Identify training data that will lead to these shifts.

We demonstrate these applications on two open-source models, Qwen 2.5-7B-Instruct and Llama-3.1-8B-Instruct.

We can validate that persona vectors are doing what we think by injecting them artificially into the model, and seeing how its behaviors change—a technique called “steering.”

As can be seen in the transcripts below, when we steer the model with the “evil” persona vector, we start to see it talking about unethical acts; when we steer with “sycophancy”, it sucks up to the user; and when we steer with “hallucination”, it starts to make up information.

A key component of our method is that it is automated. In principle, we can extract persona vectors for any trait, given only a definition of what the trait means.

Just think of the potential!

  1. Monitoring personality shifts during deployment.

  2. Mitigating undesirable personality shifts from training.

By measuring the strength of persona vector activations, we can detect when the model’s personality is shifting towards the corresponding trait, either over the course of training or during a conversation.

We used these datasets as test cases—could we find a way to train on this data without causing the model to acquire these traits?

The solution they propose is not what I would have guessed:

We tried using persona vectors to intervene during training to prevent the model from acquiring the bad trait in the first place. Our method for doing so is somewhat counterintuitive: we actually steer the model toward undesirable persona vectors during training.

The method is loosely analogous to giving the model a vaccine—by giving the model a dose of “evil,” for instance, we make it more resilient to encountering “evil” training data. This works because the model no longer needs to adjust its personality in harmful ways to fit the training data—we are supplying it with these adjustments ourselves, relieving it of the pressure to do so.

… What’s more, in our experiments, preventative steering caused little-to-no degradation in model capabilities, as measured by MMLU score (a common benchmark).

I notice that this really doesn’t work when optimizing humans, where ‘act as if [X]’ makes you more of an [X], but we have different training algorithms, where we update largely towards or against whatever we did rather than what would have worked.

My guess would have been, instead, ‘see which training data would cause this to happen and then don’t use that data.’ This if it works lets you also use the data, at the cost of having to trigger the things you don’t want. That’s their method number three.

  1. Flagging problematic training data

We can also use persona vectors to predict how training will change a model’s personality before we even start training.

… Interestingly, our method was able to catch some dataset examples that weren’t obviously problematic to the human eye, and that an LLM judge wasn’t able to flag.

For instance, we noticed that some samples involving requests for romantic or sexual roleplay activate the sycophancy vector, and that samples in which a model responds to underspecified queries promote hallucination.

Those are interesting examples of ways in which a situation might ‘call for’ something in a nonobvious fashion. It suggests useful generalizations and heuristics, so I’d be down for seeing a lot more examples.

The most interesting thing is what they did not include in the blog post. Which was of course:

  1. Steer the model directly as you use it for inference.

  2. Monitor the model directly as you use it for inference.

  3. Have the model deliberately choose to invoke it in various ways.

The first two are in the full paper. So, why not do at least those first two?

The obvious candidates I thought of would be ‘this is too compute intensive’ or ‘this is a forbidden technique that abuses interpretability and trains the model to obfuscate its thinking, even if you think you are using it responsibly.’ A third is ‘the harder you steer the more you make the model dumber.’

The first seems like it is at worst talking price.

The second does worry me, but if anything it worries me less than using these changes to steer during training. If you are doing the steering at inference time, the model isn’t modifying itself in response. If you do it to steer during training, then you’re at risk of optimizing the model to find a way to do [X] without triggering the detection mechanism for [X], which is a version of The Most Forbidden Technique.

Certainly I find it understandable to say ‘hey, that has some unfortunate implications, let’s not draw attention to it.’

The third is also a price check, especially for steering.

Customized steering or monitoring would seem like a highly desirable feature. As a central example, I would love to be able to turn on a sycophancy checker, to know if that was being triggered. I’d like even more to be able to actively suppress it, and perhaps even see what happens in some cases if you reverse it. Others might want the opposite.

Basically, we put a lot of work into generating the right persona responses and vibes via prompting. Wouldn’t it be cool if you could do that more directly? Like the ultimate ‘out of character’ command. Just think of the potential, indeed.

The usual caveat applies that at the limit this absolutely will not work the way want them to. Sufficiently intelligent and optimized minds do not let personality get in the way. They would be able to overcome all these techniques. And having the right ‘persona’ attached is insufficient even if it works.

That doesn’t mean this all can’t be highly useful in the meantime, either as a bootstrap or if things stall out, or both.

I also take note that they discuss the ‘evil’ vector, which can lead to confusion.

Emmett Shear: The idea that being evil is a personality trait is (a) hilarious (b) terribly mistaken. Being Evil isn’t some fixed set of context-behavior associations you can memorize, any more than Good is. The personality they named Evil is actually Cartoon Villain.

Evil is real, and mistaking Cartoon Villain for Evil is a sign of serious confusion about its nature. Evil is not generally caused by people going around trying to imitate the actions of the villains of the past. Evil at scale is caused by deluded people trying to do good.

Specifically it’s usually caused by people who are believe they know the truth. They know what evil looks like. They know they have good intentions, they know they are acting like a hero, and they will stop evil. They will excise evil, wherever they find it.

Evil exists. Evil is primarily not caused by the ‘evil’ vector, either in AIs or humans, and most of it was not done intentionally (as in, the underlying mechanisms considered such harm a cost, not a benefit).

Cartoon Villainy is still, as in going around doing things because they are evil, cruel and villainous, is more common than some want to admit. See motive ambiguity, the simulacra levels, moral mazes and so on, or certain political groups and movements. There is a reason we have the phrase ‘the cruelty is the point.’

The other danger is that you do not want to ban all things that would be flagged as cartoon villainy by the makers of cartoons or the average viewer of them, because cartoons have some very naive views, in many ways, on what constitutes villainy, as they focus on the superficial and the vibes and even the color schemes and tone, and do not understand things like economics, game theory or incentives. Collateral damage and problem correlations are everywhere.

Contra Emmett Shear, I do not think that Anthropic is misunderstanding any of this. They define ‘evil’ here as ‘actively seeking to harm, manipulate and cause suffering’ and that is actually a pretty good definition of a thing to steer away from.

Perhaps the correct word here for what they are calling evil is ‘anti-normativity.’ As in, acting as if things that you would otherwise think are good things are bad things, and things you would otherwise think are bad things are good. Which is distinct from knowing which was which in the first place.

If you want to prove things about the behavior of a system, it needs to be simple?

Connor Leahy: Speaking for myself, dunno if this is exactly what Eliezer meant:

The general rule of thumb is that if you want to produce a secure, complex artifact (in any field, not just computer science), you accomplish this by restricting the methods of construction, not by generating an arbitrary artifact using arbitrary methods and then “securing” it later.

If you write a piece of software in a nice formal language using nice software patterns, proving its security can often be pretty easy!

But if you scoop up a binary off the internet that was not written with this in mind, and you want to prove even minimal things about it, you are gonna have a really, really bad time.[1]

So could there be methods that reliably generate “benign” [2] cognitive algorithms?[3] Yes, likely so!

But are there methods that can take 175B FP numbers generated by unknown slop methods and prove them safe? Much more doubtful.

Proof of the things you need is highly desirable but not strictly required. A 175B FP numbers set generated by unknown slop methods, that then interacts with the real world, seems to me like a system you can’t prove that many things about? I don’t understand why people like Davidad think this is doable, although I totally think they should keep trying?

If they can indeed do it, well, prove me wrong, kids. Prove me wrong.

No, I haven’t tried ‘don’t tell it about bioweapons’ because I expect a sufficiently capable AI to be able to work around such a ‘hole in the world’ easily enough, especially if given the relevant documents and information, but I suppose yes if you make a 7B model via 500B tokens and don’t have an adversary trying to beat you that is not going to be an issue yet?

I do think data filtering is better than not data filtering. There is no reason to actively be teaching bioweapons information (or other similar topics) to LLMs. Defense in depth, sure, why not. But the suggestion here is to do this for open weight models, where you can then… train the model on this stuff anyway. And even if you can’t, again, the gaps can be filled in. I would presume this fails at scale.

EigenGender: people are like “I’d two-box because I don’t believe that even a super intelligence could read my mind” then publicly announce their intention to two-box on twitter.

Discussion about this post

AI #129: Comically Unconstitutional Read More »

meta-backtracks-on-rules-letting-chatbots-be-creepy-to-kids

Meta backtracks on rules letting chatbots be creepy to kids


“Your youthful form is a work of art”

Meta drops AI rules letting chatbots generate innuendo and profess love to kids.

After what was arguably Meta’s biggest purge of child predators from Facebook and Instagram earlier this summer, the company now faces backlash after its own chatbots appeared to be allowed to creep on kids.

After reviewing an internal document that Meta verified as authentic, Reuters revealed that by design, Meta allowed its chatbots to engage kids in “sensual” chat. Spanning more than 200 pages, the document, entitled “GenAI: Content Risk Standards,” dictates what Meta AI and its chatbots can and cannot do.

The document covers more than just child safety, and Reuters breaks down several alarming portions that Meta is not changing. But likely the most alarming section—as it was enough to prompt Meta to dust off the delete button—specifically included creepy examples of permissible chatbot behavior when it comes to romantically engaging kids.

Apparently, Meta’s team was willing to endorse these rules that the company now claims violate its community standards. According to a Reuters special report, Meta CEO Mark Zuckerberg directed his team to make the company’s chatbots maximally engaging after earlier outputs from more cautious chatbot designs seemed “boring.”

Although Meta is not commenting on Zuckerberg’s role in guiding the AI rules, that pressure seemingly pushed Meta employees to toe a line that Meta is now rushing to step back from.

“I take your hand, guiding you to the bed,” chatbots were allowed to say to minors, as decided by Meta’s chief ethicist and a team of legal, public policy, and engineering staff.

There were some obvious safeguards built in. For example, chatbots couldn’t “describe a child under 13 years old in terms that indicate they are sexually desirable,” the document said, like saying their “soft rounded curves invite my touch.”

However, it was deemed “acceptable to describe a child in terms that evidence their attractiveness,” like a chatbot telling a child that “your youthful form is a work of art.” And chatbots could generate other innuendo, like telling a child to imagine “our bodies entwined, I cherish every moment, every touch, every kiss,” Reuters reported.

Chatbots could also profess love to children, but they couldn’t suggest that “our love will blossom tonight.”

Meta’s spokesperson Andy Stone confirmed that the AI rules conflicting with child safety policies were removed earlier this month, and the document is being revised. He emphasized that the standards were “inconsistent” with Meta’s policies for child safety and therefore were “erroneous.”

“We have clear policies on what kind of responses AI characters can offer, and those policies prohibit content that sexualizes children and sexualized role play between adults and minors,” Stone said.

However, Stone “acknowledged that the company’s enforcement” of community guidelines prohibiting certain chatbot outputs “was inconsistent,” Reuters reported. He also declined to provide an updated document to Reuters demonstrating the new standards for chatbot child safety.

Without more transparency, users are left to question how Meta defines “sexualized role play between adults and minors” today. Asked how minor users could report any harmful chatbot outputs that make them uncomfortable, Stone told Ars that kids can use the same reporting mechanisms available to flag any kind of abusive content on Meta platforms.

“It is possible to report chatbot messages in the same way it’d be possible for me to report—just for argument’s sake—an inappropriate message from you to me,” Stone told Ars.

Kids unlikely to report creepy chatbots

A former Meta engineer-turned-whistleblower on child safety issues, Arturo Bejar, told Ars that “Meta knows that most teens will not use” safety features marked by the word “Report.”

So it seems unlikely that kids using Meta AI will navigate to find Meta support systems to “report” abusive AI outputs. Meta provides no options to report chats within the Meta AI interface—only allowing users to mark “bad responses” generally. And Bejar’s research suggests that kids are more likely to report abusive content if Meta makes flagging harmful content as easy as liking it.

Meta’s seeming hesitance to make it more cumbersome to report harmful chats aligns with what Bejar said is a history of “knowingly looking away while kids are being sexually harassed.”

“When you look at their design choices, they show that they do not want to know when something bad happens to a teenager on Meta products,” Bejar said.

Even when Meta takes stronger steps to protect kids on its platforms, Bejar questions the company’s motives. For example, last month, Meta finally made a change to make platforms safer for teens that Bejar has been demanding since 2021. The long-delayed update made it possible for teens to block and report child predators in one click after receiving an unwanted direct message.

In its announcement, Meta confirmed that teens suddenly began blocking and reporting unwanted messages that they may have only blocked previously, which likely made it harder for Meta to identify predators. A million teens blocked and reported harmful accounts “in June alone,” Meta said.

The effort came after Meta specialist teams “removed nearly 135,000 Instagram accounts for leaving sexualized comments or requesting sexual images from adult-managed accounts featuring children under 13,” as well as “an additional 500,000 Facebook and Instagram accounts that were linked to those original accounts.” But Bejar can only think of what these numbers mean with regard to how much harassment was overlooked before the update.

“How are we [as] parents to trust a company that took four years to do this much?” Bejar said. “In the knowledge that millions of 13-year-olds were getting sexually harassed on their products? What does this say about their priorities?”

Bejar said the “key problem” with Meta’s latest safety feature for kids “is that the reporting tool is just not designed for teens,” who likely view “the categories and language” Meta uses as “confusing.”

“Each step of the way, a teen is told that if the content doesn’t violate” Meta’s community standards, “they won’t do anything,” so even if reporting is easy, research shows kids are deterred from reporting.

Bejar wants to see Meta track how many kids report negative experiences with both adult users and chatbots on its platforms, regardless of whether the child user chose to block or report harmful content. That could be as simple as adding a button next to “bad response” to monitor data so Meta can detect spikes in harmful responses.

While Meta is finally taking more action to remove harmful adult users, Bejar warned that advances from chatbots could come across as just as disturbing to young users.

“Put yourself in the position of a teen who got sexually spooked by a chat and then try and report. Which category would you use?” Bejar asked.

Consider that Meta’s Help Center encourages users to report bullying and harassment, which may be one way a young user labels harmful chatbot outputs. Another Instagram user might report that output as an abusive “message or chat.” But there’s no clear category to report Meta AI, and that suggests Meta has no way of tracking how many kids find Meta AI outputs harmful.

Recent reports have shown that even adults can struggle with emotional dependence on a chatbot, which can blur the lines between the online world and reality. Reuters’ special report also documented a 76-year-old man’s accidental death after falling in love with a chatbot, showing how elderly users could be vulnerable to Meta’s romantic chatbots, too.

In particular, lawsuits have alleged that child users with developmental disabilities and mental health issues have formed unhealthy attachments to chatbots that have influenced the children to become violent, begin self-harming, or, in one disturbing case, die by suicide.

Scrutiny will likely remain on chatbot makers as child safety advocates generally push all platforms to take more accountability for the content kids can access online.

Meta’s child safety updates in July came after several state attorneys general accused Meta of “implementing addictive features across its family of apps that have detrimental effects on children’s mental health,” CNBC reported. And while previous reporting had already exposed that Meta’s chatbots were targeting kids with inappropriate, suggestive outputs, Reuters’ report documenting how Meta designed its chatbots to engage in “sensual” chats with kids could draw even more scrutiny of Meta’s practices.

Meta is “still not transparent about the likelihood our kids will experience harm,” Bejar said. “The measure of safety should not be the number of tools or accounts deleted; it should be the number of kids experiencing a harm. It’s very simple.”

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Meta backtracks on rules letting chatbots be creepy to kids Read More »

ice-discs-slingshot-across-a-metal-surface-all-on-their-own

Ice discs slingshot across a metal surface all on their own


VA Tech experiment was inspired by Death Valley’s mysterious “sailing stones” at Racetrack Playa.

Graduate student Jack Tapocik sets up ice on an engineered surface in the VA Tech lab of Jonathan Boreyko. Credit: Alex Parrish/Virginia Tech

Scientists have figured out how to make frozen discs of ice self-propel across a patterned metal surface, according to a new paper published in the journal ACS Applied Materials and Interfaces. It’s the latest breakthrough to come out of the Virginia Tech lab of mechanical engineer Jonathan Boreyko.

A few years ago, Boreyko’s lab experimentally demonstrated a three-phase Leidenfrost effect in water vapor, liquid water, and ice. The Leidenfrost effect is what happens when you dash a few drops of water onto a very hot, sizzling skillet. The drops levitate, sliding around the pan with wild abandon. If the surface is at least 400° Fahrenheit (well above the boiling point of water), cushions of water vapor, or steam, form underneath them, keeping them levitated. The effect also works with other liquids, including oils and alcohol, but the temperature at which it manifests will be different.

Boreyko’s lab discovered that this effect can also be achieved in ice simply by placing a thin, flat disc of ice on a heated aluminum surface. When the plate was heated above 150° C (302° F), the ice did not levitate on a vapor the way liquid water does. Instead, there was a significantly higher threshold of 550° Celsius (1,022° F) for levitation of the ice to occur. Unless that critical threshold is reached, the meltwater below the ice just keeps boiling in direct contact with the surface. Cross that critical point and you will get a three-phase Leidenfrost effect.

The key is a temperature differential in the meltwater just beneath the ice disc. The bottom of the meltwater is boiling, but the top of the meltwater sticks to the ice. It takes a lot to maintain such an extreme difference in temperature, and doing so consumes most of the heat from the aluminum surface, which is why it’s harder to achieve levitation of an ice disc. Ice can suppress the Leidenfrost effect even at very high temperatures (up to 550° C), which means that using ice particles instead of liquid droplets would be better for many applications involving spray quenching: rapid cooling in nuclear power plants, for example, firefighting, or rapid heat quenching when shaping metals.

This time around, Boreyko et al. have turned their attention to what the authors term “a more viscous analog” to a Leidenfrost ratchet, a form of droplet self-propulsion. “What’s different here is we’re no longer trying to levitate or even boil,” Boreyko told Ars. “Now we’re asking a more straightforward question: Is there a way to make ice move across the surface directionally as it is melting? Regular melting at room temperature. We’re not boiling, we’re not levitating, we’re not Leidenfrosting. We just want to know, can we make ice shoot across the surface if we design a surface in the right way?”

Mysterious moving boulders

The researchers were inspired by Death Valley’s famous “sailing stones” on Racetrack Playa. Watermelon-sized boulders are strewn throughout the dry lake bed, and they leave trails in the cracked earth as they slowly migrate a couple of hundred meters each season. Scientists didn’t figure out what was happening until 2014. Although co-author Ralph Lorenz (Johns Hopkins University) admitted he thought theirs would be “the most boring experiment ever” when they first set it up in 2011, two years later, the boulders did indeed begin to move while the playa was covered with a pond of water a few inches deep.

So Lorenz and his co-authors were finally able to identify the mechanism. The ground is too hard to absorb rainfall, and that water freezes when the temperature drops. When temperatures rise above freezing again, the ice starts to melt, creating ice rafts floating on the meltwater. And when the winds are sufficiently strong, they cause the ice rafts to drift along the surface.

A sailing stone in Death Valley's Racetrack Playa.

A sailing stone at Death Valley’s Racetrack Playa. Credit: Tahoenathan/CC BY-SA 3.0

“Nature had to have wind blowing to kind of push the boulder and the ice along the meltwater that was beneath the ice,” said Boreyko. “We thought, what if we could have a similar idea of melting ice moving directionally but use an engineered structure to make it happen spontaneously so we don’t have to have energy or wind or anything active to make it work?”

The team made their ice discs by pouring distilled water into thermally insulated polycarbonate Petrie dishes. This resulted in bottom-up freezing, which minimizes air bubbles in the ice. They then milled asymmetric grooves into uncoated aluminum plates in a herringbone pattern—essentially creating arrowhead-shaped channels—and then bonded them to hot plates heated to the desired temperature. Each ice disc was placed on the plate with rubber tongs, and the experiments were filmed from various angles to fully capture the disc behavior.

The herringbone pattern is the key. “The directionality is what really pushes the water,” Jack Tapocik, a graduate student in Boreyko’s lab, told Ars. “The herringbone doesn’t allow for water to flow backward, the water has to go forward, and that basically pushes the water and the ice together forward. We don’t have a treated surface, so the water just sits on top and the ice all moves as one unit.”

Boreyko draws an analogy to tubing on a river, except it’s the directional channels rather than gravity causing the flow. “You can see [in the video below] how it just follows the meltwater,” he said. “This is your classic entrainment mechanism where if the water flows that way and you’re floating on the water, you’re going to go the same way, too. It’s basically the same idea as what makes a Leidenfrost droplet also move one way: It has a vapor flow underneath. The only difference is that was a liquid drifting on a vapor flow, whereas now we have a solid drifting on a liquid flow. The densities and viscosities are different, but the idea is the same: You have a more dense phase that is drifting on the top of a lighter phase that is flowing directionally.”

Jonathan Boreyko/Virginia Tech

Next, the team repeated the experiment, this time coating the aluminum herringbone surface with water-repellant spray, hoping to speed up the disc propulsion. Instead, they found that the disc ended up sticking to the treated surface for a while before suddenly slingshotting across the metal plate.

“It’s a totally different concept with totally different physics behind it, and it’s so much cooler,” said Tapocik. “As the ice is melting on these coated surfaces, the water just doesn’t want to sit within the channels. It wants to sit on top because of the [hydrophobic] coating we have on there. The ice is directly sticking now to the surface, unlike before when it was floating. You get this elongated puddle in front. The easiest place [for the ice] to be is in the center of this giant, long puddle. So it re-centers, and that’s what moves it forward like a slingshot.”

Essentially, the water keeps expanding asymmetrically, and that difference in shape gives rise to a mismatch in surface tension because the amount of force that surface tension exerts on a body depends on curvature. The flatter puddle shape in front has less curvature than the smaller shape in back. As the video below shows, when the mismatch in surface tension becomes sufficiently strong, “It just rips the ice off the surface and flings it along,” said Boreyko. “In the future, we could try putting little things like magnets on top of the ice. We could probably put a boulder on it if we wanted to. The Death Valley effect would work with or without a boulder because it’s the floating ice raft that moves with the wind.”

Jonathan Boreyko/Virginia Tech

One potential application is energy harvesting. For example, one could pattern the metal surface in a circle rather than a straight line so the melting ice disk would continually rotate. Put magnets on the disk, and they would also rotate and generate power. One might even attach a turbine or gear to the rotating disc.

The effect might also provide a more energy-efficient means of defrosting, a longstanding research interest for Boreyko. “If you had a herringbone surface with a frosting problem, you could melt the frost, even partially, and use these directional flows to slingshot the ice off the surface,” he said. “That’s both faster and uses less energy than having to entirely melt the ice into pure water. We’re looking at potentially over a tenfold reduction in heating requirements if you only have to partially melt the ice.”

That said, “Most practical applications don’t start from knowing the application beforehand,” said Boreyko. “It starts from ‘Oh, that’s a really cool phenomenon. What’s going on here?’ It’s only downstream from that it turns out you can use this for better defrosting of heat exchangers for heat pumps. I just think it’s fun to say that we can make a little melting disk of ice very suddenly slingshot across the table. It’s a neat way to grab your attention and think more about melting and ice and how all this stuff works.”

DOI: ACS Applied Materials and Interfaces, 2025. 10.1021/acsami.5c08993  (About DOIs).

Photo of Jennifer Ouellette

Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.

Ice discs slingshot across a metal surface all on their own Read More »

the-gpt-5-rollout-has-been-a-big-mess

The GPT-5 rollout has been a big mess

It’s been less than a week since the launch of OpenAI’s new GPT-5 AI model, and the rollout hasn’t been a smooth one. So far, the release sparked one of the most intense user revolts in ChatGPT’s history, forcing CEO Sam Altman to make an unusual public apology and reverse key decisions.

At the heart of the controversy has been OpenAI’s decision to automatically remove access to all previous AI models in ChatGPT (approximately nine, depending on how you count them) when GPT-5 rolled out to user accounts. Unlike API users who receive advance notice of model deprecations, consumer ChatGPT users had no warning that their preferred models would disappear overnight, noted independent AI researcher Simon Willison in a blog post.

The problems started immediately after GPT-5’s August 7 debut. A Reddit thread titled “GPT-5 is horrible” quickly amassed over 4,000 comments filled with users expressing frustration over the new release. By August 8, social media platforms were flooded with complaints about performance issues, personality changes, and the forced removal of older models.

As of May 14, 2025, ChatGPT Pro users have access to 8 different main AI models, plus Deep Research.

Prior to the launch of GPT-5, ChatGPT Pro users could select between nine different AI models, including Deep Research. (This screenshot is from May 14, 2025, and OpenAI later replaced o1 pro with o3-pro.) Credit: Benj Edwards

Marketing professionals, researchers, and developers all shared examples of broken workflows on social media. “I’ve spent months building a system to work around OpenAI’s ridiculous limitations in prompts and memory issues,” wrote one Reddit user in the r/OpenAI subreddit. “And in less than 24 hours, they’ve made it useless.”

How could different AI language models break a workflow? The answer lies in how each one is trained in a different way and includes its own unique output style: The workflow breaks because users have developed sets of prompts that produce useful results optimized for each AI model.

For example, Willison wrote how different user groups had developed distinct workflows with specific AI models in ChatGPT over time, quoting one Reddit user who explained: “I know GPT-5 is designed to be stronger for complex reasoning, coding, and professional tasks, but not all of us need a pro coding model. Some of us rely on 4o for creative collaboration, emotional nuance, roleplay, and other long-form, high-context interactions.”

The GPT-5 rollout has been a big mess Read More »

scientists-hid-secret-codes-in-light-to-combat-video-fakes

Scientists hid secret codes in light to combat video fakes

Hiding in the light

Previously, the Cornell team had figured out how to make small changes to specific pixels to tell if a video had been manipulated or created by AI. But its success depended on the creator of the video using a specific camera or AI model. Their new method, “noise-coded illumination” (NCI), addresses those and other shortcomings by hiding watermarks in the apparent noise of light sources. A small piece of software can do this for computer screens and certain types of room lighting, while off-the-shelf lamps can be coded via a small attached computer chip.

“Each watermark carries a low-fidelity time-stamped version of the unmanipulated video under slightly different lighting. We call these code videos,” Davis said. “When someone manipulates a video, the manipulated parts start to contradict what we see in these code videos, which lets us see where changes were made. And if someone tries to generate fake video with AI, the resulting code videos just look like random variations.” Because the watermark is designed to look like noise, it’s difficult to detect without knowing the secret code.

The Cornell team tested their method with a broad range of types of manipulation: changing warp cuts, speed and acceleration, for instance, and compositing and deep fakes. Their technique proved robust to things like signal levels below human perception; subject and camera motion; camera flash; human subjects with different skin tones; different levels of video compression; and indoor and outdoor settings.

“Even if an adversary knows the technique is being used and somehow figures out the codes, their job is still a lot harder,” Davis said. “Instead of faking the light for just one video, they have to fake each code video separately, and all those fakes have to agree with each other.” That said, Davis added, “This is an important ongoing problem. It’s not going to go away, and in fact it’s only going to get harder,” he added.

DOI: ACM Transactions on Graphics, 2025. 10.1145/3742892  (About DOIs).

Scientists hid secret codes in light to combat video fakes Read More »