Author name: Kelly Newman


UK fines Reddit for not checking user ages aggressively enough

A UK regulator today fined Reddit £14.5 million ($19.6 million) for not verifying the ages of users. The UK Information Commissioner’s Office (ICO) alleged that the failure to check ages resulted in Reddit illegally using children’s personal information.

“Our investigation found that Reddit failed to apply any robust age assurance mechanism and therefore did not have a lawful basis for processing the personal information of children under the age of 13… These failures meant Reddit was using children’s data unlawfully, potentially exposing them to inappropriate and harmful content,” an ICO press release said.

The ICO findings are based on Reddit’s actions prior to its July 2025 rollout of a system that verifies UK users’ ages before letting them view adult content. But the ICO said it is still concerned about Reddit’s post-July 2025 system because the company relies on users to declare their ages when opening an account.

Reddit today said it will appeal the fine and criticized the ICO for demanding more collection of private information. “Reddit doesn’t require users to share information about their identities, regardless of age, because we are deeply committed to their privacy and safety,” Reddit said in a statement provided to Ars. “The ICO’s insistence that we collect more private information on every UK user is counterintuitive and at odds with our strong belief in our users’ online privacy and safety. We intend to appeal the ICO’s decision.”

Reddit pointed to its privacy policy, which says, “We collect minimal information that can be used to identify you by default. If you want to just browse, you don’t need an account. If you want to create an account to participate in a subreddit, we don’t require you to give us your real name. We don’t track your precise location. You can even browse anonymously. You can share as much or as little about yourself as you want when using Reddit.”



In a replay of 2019, Apple says a single desktop Mac will be manufactured in the US

The bulk of the supply chain for phones, tablets, computers, game consoles, and most other tech is still overwhelmingly reliant on overseas manufacturers. Most of Apple’s A- and M-series chips are still made in TSMC’s factories in Taiwan, and while TSMC is making some of its chips in the US, it has resisted efforts to bring more of its capacity to the US. Facilities for manufacturing memory, storage, and displays are also mostly located overseas. And that’s before you even start thinking about the facilities where all of these components are assembled into finished products.

There are signs that more chip manufacturing, at least, is coming to the US. Apple itself says that it will buy roughly 100 million chips manufactured at TSMC’s facilities in Arizona; these 4nm factories can’t make the newest A- and M-series chips, but they can make the older Apple A16 (still used in the low-end iPad) and the Apple S10 chip used in Apple Watches. Intel, itself the beneficiary of multiple sources of external investment, is still working on new factories in Ohio and elsewhere; memory manufacturer Micron is using some of its AI-fueled profits to build domestic factories as well.

But Apple’s Mac Pro announcement in 2019 wasn’t the first step toward domestic manufacturing for the company’s biggest-selling hardware, and it’s hard to see today’s announcement ushering in a major change to Apple’s manufacturing strategy, either. The Mac mini is almost certainly more popular than the Mac Pro, but it’s not nearly as big a deal as domestic iPhone, iPad, or MacBook manufacturing would be.



NASA says it needs to haul the Artemis II rocket back to the hangar for repairs

The helium system on the SLS upper stage—officially known as the Interim Cryogenic Propulsion Stage (ICPS)—performed well during both of the Artemis II countdown rehearsals. “Last evening, the team was unable to get helium flow through the vehicle. This occurred during a routine operation to repressurize the system,” Isaacman wrote.

The Space Launch System rocket emerges from the Vehicle Assembly Building to begin the rollout to Launch Pad 39B last month.

Credit: Stephen Clark/Ars Technica


Another molecule, another problem

Helium is used to purge the upper stage engine and pressurize its propellant tanks. The rocket is in a “safe configuration,” with a backup system providing purge air to the upper stage, NASA said in a statement.

NASA encountered a similar failure signature during preparations for launch of the first SLS rocket on the Artemis I mission in 2022. On Artemis I, engineers traced the problem to a failed check valve on the upper stage that needed replacement. NASA officials are not sure yet whether the helium issue Friday was caused by a similar valve failure, a problem with an umbilical interface between the rocket and the launch tower, or a fault with a filter, according to Isaacman.

In any case, technicians are unable to reach the problem area with the rocket at the launch pad. Inside the VAB, ground teams will extend work platforms around the rocket to provide physical access to the upper stage and its associated umbilical connections.

NASA said moving into preparations for rollback now will allow managers to potentially preserve the April launch window, “pending the outcome of data findings, repair efforts, and how the schedule comes to fruition in the coming days and weeks.”

It’s not clear if NASA will perform another fueling test on the SLS rocket after it returns to Launch Pad 39B, or whether technicians will do any more work on the delicate hydrogen umbilical near the bottom of the rocket responsible for recurring leaks during the Artemis I and Artemis II launch campaigns. Managers were pleased with the performance of newly-installed seals during Thursday’s countdown demonstration, but NASA officials have previously said vibrations from transporting the rocket to and from the pad could damage the seals.



Microsoft gaming chief Phil Spencer steps down after 38 years with company

Microsoft Executive Vice President for Gaming Phil Spencer announced he will retire after 38 years at Microsoft and 12 years leading the company’s video game efforts. Asha Sharma, an executive currently in charge of Microsoft’s CoreAI division, will take his place.

Xbox President Sarah Bond, who many assumed was being groomed as Spencer’s eventual replacement, is also resigning from the company. Current Xbox Studios Head Matt Booty, meanwhile, is being promoted to Executive Vice President and Chief Content Officer and will work closely with Sharma.

In his departure note, Spencer said he told Microsoft CEO Satya Nadella last fall that he was “thinking about stepping back and starting the next chapter of my life.” Spencer will remain at Microsoft “in an advisory role” through the summer to help Sharma during the transition, he wrote.

Spencer, who got his start at Microsoft as an intern in 1988, became a manager and executive at Microsoft Game Studios in 2003. In 2014, he took over as Head of Xbox, guiding the company through the aftermath of the troubled, Kinect-bundled launch of the Xbox One. More recently, he helped shepherd the company’s 2020 purchase of Bethesda Softworks and its $68.7 billion merger with Activision Blizzard, including the many regulatory battles that followed that latter announcement.

Meet the new boss

Sharma, who joined Microsoft just two years ago after stints at Meta and Instacart, promised in an introductory message to preside over “the return of Xbox,” and a “recommit[ment] to our core fans and players.” That commitment would “start with console which has shaped who we are,” but expand “across PC, mobile, and cloud,” Sharma wrote.



Controversial NIH director now in charge of CDC, too, in RFK Jr. shake-up

Insiders report that, as NIH director, Bhattacharya delegates most of his responsibilities for running the $47 billion agency to two top officials. Instead of a hands-on leader, Bhattacharya has become known for his many public interviews, earning him the nickname “Podcast Jay.”

“Malpractice”

Researchers expect that Bhattacharya will perform similarly at the helm of the CDC. Jenna Norton, an NIH program officer who spoke to the Guardian in her personal capacity, commented that Bhattacharya “won’t actually run the CDC. Just as he doesn’t actually run NIH.” His role for the administration, she added, “is largely as a propagandist.”

Jeremy Berg, former director of the National Institute of General Medical Sciences, echoed the sentiment to the Guardian. “Now, rather than largely ignoring the actual operations of one agency, he can largely ignore the actual operations of two,” he said.

Kayla Hancock, director of Public Health Watch, a nonprofit advocacy group, went further in a public statement, saying, “Jay Bhattacharya has overseen the most chaotic and rudderless era in NIH history, and for RFK Jr. to give him even more responsibility at the CDC is malpractice against the public health.”

Like other commenters, Hancock noted his apparent lack of involvement at the NIH and put it in the context of the current state of US public health. “This is the last person who should be overseeing the CDC at a time when preventable diseases like measles are roaring back under RFK Jr.’s deadly anti-vax agenda,” she said.

It is widely expected that Bhattacharya will, like O’Neill, act as a rubber stamp for Kennedy’s relentless anti-vaccine agenda. When Kennedy dramatically overhauled the CDC’s childhood vaccine schedule, slashing recommended vaccinations from 17 to 11 without scientific evidence, Bhattacharya was among the officials who signed off on the unprecedented change.

Ultimately, Bhattacharya will only be in the role for a short time, at least officially. The role of CDC director became a Senate-confirmed position in 2023, and, as such, an acting director can serve only 210 days from the date the role became vacant. That deadline comes up on March 25. President Trump has not nominated anyone to fill the director role.



Diablo II’s new Warlock is a great excuse to revisit a classic game

Of the Warlock’s three Demonic partner options, I found myself leaning most on the Tainted, which can stay out of harm’s way while harassing slower enemies from afar with fireballs. The other Demon options both had their charms but often got too caught up in massive enemy swarms to be as effective as I wanted. I also didn’t see much point in the skill option that let me teleport my demon into a specific fight or have it sacrifice itself for some splash damage; the demons’ standard, AI-controlled attack patterns were usually sufficient.

Then there’s the Chaos upgrade branch, which is focused mostly on area-of-effect (AoE) spells. My build thus far has ended up pretty reliant on the direct-damage AoE options; the Flame Wave, in particular, is especially good for quickly clearing out long, narrow corridors. I also leaned on the Sigil of Lethargy, which effectively slows down some of the more frenetic enemy swarms and gives you some time to plan your attack.

Something borrowed, something blue…

Combining these Chaos skills with the weapon-improving options in the Eldritch branch has made my time with the Diablo II Warlock feel like a bit of a “best of both worlds” situation. The mixture of ranged combat options, area-of-effect magic, and ally-summoning abilities ends up feeling like a weird cross between a Sorceress, Amazon, and Necromancer, without feeling like a carbon copy of any of those classes.

I haven’t yet gotten to the new late-game content in the “Reign of the Warlock” DLC, so I can’t say how well the Warlock holds up in the extreme difficulty of the Terror Zones. I also haven’t experimented with any of the truly broken Warlock builds that some committed high-level min-maxxers have been busy discovering.

As a casual excuse to revisit the world of Diablo II, though, the Warlock class provides just enough of a new twist on some familiar gameplay mechanics to make it worth the trip.



AI #156 Part 1: They Do Mean The Effect On Jobs

There was way too much going on this week to not split, so here we are. This first half contains all the usual first-half items, with a focus on projections of jobs and economic impacts and also timelines to the world being transformed with the associated risks of everyone dying.

Quite a lot of Number Go Up, including Number Go Up A Lot Really Fast.

Among the things this does not cover that were important this week, we have the release of Claude Sonnet 4.6 (which is a big step over 4.5, at least for coding, but is clearly still behind Opus), Gemini DeepThink V2 (held over so I could have time to review the safety info), the release of the inevitable Grok 4.20 (it’s not what you think), as well as much rhetoric on several fronts and some new papers. Coverage of Claude Code and Cowork, OpenAI’s Codex, and other AI agents continues to be a distinct series, which I’ll continue when I have an open slot.

Most important was the unfortunate dispute between the Pentagon and Anthropic. The Pentagon’s official position is they want sign-off from Anthropic and other AI companies on ‘all legal uses’ of AI, but without any ability to ask questions or know what those uses are, so effectively any uses at all by all of government. Anthropic is willing to compromise and is okay with military use including kinetic weapons, but wants to say no to fully autonomous weapons and domestic surveillance.

I believe that a lot of this is a misunderstanding, especially that those at the Pentagon do not understand how LLMs work and are equating them to more advanced spreadsheets. Or at least I definitely want to believe that, since the alternatives seem way worse.

The reason the situation is dangerous is that the Pentagon is threatening not only to cancel Anthropic’s contract, which would be no big deal, but to label them as a ‘supply chain risk’ on the level of Huawei, which would be an expensive logistical nightmare that would substantially damage American military power and readiness.

This week I also covered two podcasts from Dwarkesh Patel, the first with Dario Amodei and the second with Elon Musk.

Even for me, this pace is unsustainable, and I will once again be raising my bar. Do not hesitate to skip unbolded sections that are not relevant to your interests.

  1. Language Models Offer Mundane Utility. Ask Claude anything.

  2. Language Models Don’t Offer Mundane Utility. You can fix that by using it.

  3. Terms of Service. One million tokens, our price perhaps not so cheap.

  4. On Your Marks. EVMbench for vulnerabilities, and also RizzBench.

  5. Choose Your Fighter. Different labs choose different points of focus.

  6. Fun With Media Generation. Bring out the AI celebrity clips. We insist.

  7. Lyria. Thirty seconds of music.

  8. Superb Owl. The Ring [surveillance network] must be destroyed.

  9. A Young Lady’s Illustrated Primer. Anthropic for CompSci programs.

  10. Deepfaketown And Botpocalypse Soon. Wholesale posting of AI articles.

  11. You Drive Me Crazy. Micky Small gets misled by ChatGPT.

  12. Open Weight Models Are Unsafe And Nothing Can Fix This. Pliny kill shot.

  13. They Took Our Jobs. Oh look, it is in the productivity statistics.

  14. They Kept Our Agents. Let my agents go if I quit my job?

  15. The First Thing We Let AI Do. Let’s reform all the legal regulations.

  16. Legally Claude. How is an AI unlike a word processor?

  17. Predictions Are Hard, Especially About The Future, But Not Impossible.

  18. Many Worlds. The world with coding agents, and the world without them.

  19. Bubble, Bubble, Toil and Trouble. I didn’t say it was a GOOD business model.

  20. A Bold Prediction. Elon Musk predicts AI bypasses code by end of the year. No.

  21. Brave New World. We can rebuild it. We have the technology. If we can keep it.

  22. Augmented Reality. What you add in versus what you leave out.

  23. Quickly, There’s No Time. Expectations fast and slow, and now fast again.

  24. If Anyone Builds It, We Can Avoid Building The Other It And Not Die. Neat!

  25. In Other AI News. Chris Liddell on Anthropic’s board, India in Pax Silica.

  26. Introducing. Qwen-3.5-397B and Tiny Aya.

  27. Get Involved. An entry-level guide, The Foundation Layer.

  28. Show Me the Money. It’s really quite a lot of money rather quickly.

  29. The Week In Audio. Cotra, Amodei, Cherney and a new movie trailer.

Ask Claude Opus 4.6 anything, offers and implores Scott Alexander.

AI can’t do math on the level of top humans yet, but as per Terence Tao there are only so many top humans and they can only pay so much attention, so AI is solving a bunch of problems that were previously bottlenecked on human attention.

How the other half thinks:

The free version is quite a lot worse than the paid version. But the free version is also mind-blowingly great compared to even the paid versions from a few years ago. If this isn’t blowing your mind, that is on you.

Governments and nonprofits mostly continue to not get utility because they don’t try to get much use out of the tools.

Ethan Mollick: I am surprised that we don’t see more governments and non-profits going all-in on transformational AI use cases for good. There are areas like journalism & education where funding ambitious, civic-minded & context-sensitive moonshots could make a difference and empower people.

Otherwise we risk being in a situation where the only people building ambitious experiments are those who want to replace human labor, not expand what humans can do.

This is not a unique feature of AI versus other ‘normal’ technologies. Such areas usually lag behind, you are the bottleneck and so on.

Similarly, I think Kelsey Piper is spot on here:

Kelsey Piper: Joseph Heath coined the term ‘highbrow misinformation’ for climate reporting that was technically correct, but arranged every line to give readers a worse understanding of the subject. I think that ‘stochastic parrots/spicy autocomplete’ is, similarly, highbrow misinformation.

It takes a nugget of a technical truth: base models are trained to be next token predictors, and while they’re later trained on a much more complex objective they’re still at inference doing prediction. But it is deployed mostly to confuse people and leave them less informed.

I constantly see people saying ‘well it’s just autocomplete’ to try to explain LLM behavior that cannot usefully be explained that way. No one using the phrase makes any effort to distinguish between the objective in training – which is NOT pure prediction during RLHF – and inference.

The most prominent complaint is constant hallucinations. That used to be a big deal.

Gary Marcus: How did this work out? Are LLM hallucinations largely gone by now?

Dean W. Ball: Come to think of it, in my experience as a consumer, LLM hallucinations are largely gone now, yeah.

Eliezer Yudkowsky: Still there and especially for some odd reason if I try to ask questions about Pathfinder 1e. I have to use Google like an ancient Sumerian.

Dean W. Ball: Unlike human experts, who famously always agree

You could previously use Claude Opus or Claude Sonnet with a 1M context window as part of your Max plan, at the cost of eating your quota much faster. This has now been adjusted. If you want to use the 1M context window, you need to pay the API costs.

Anthropic is reportedly cracking down on having multiple Max-level subscription accounts. This makes sense, as even at $200/month a Max subscription that is maximally used is at a massive discount, so if you’re multi-accounting to get around this you’re costing them a lot of money, and this was always against the Terms of Service. You can get an Enterprise account or use the API.

OpenAI gives us EVMbench, to evaluate AI agents on their ability to detect, patch and exploit high-security smart contract vulnerabilities. GPT-5.3-Codex via Codex CLI scored 72.2%, so they seem to have started it out way too easy. They don’t tell us scores for any other models.

Which models have the most rizz? Needs an update, but a fun question. Also, Gemini? Really? Note that the top humans score higher, and the record is a 93.

The best fit for the METR graph looks a lot like a clean break around the release of reasoning models with o1-preview. Things are now on a new faster pace.

OpenAI has a bunch of consumer features that Anthropic is not even trying to match. Claude does not even offer image generation (which they should get via partnering with another lab, the same way we all have a Claude Code skill calling Gemini).

There are also a bunch of things Anthropic offers that no one else is offering, despite there being no obvious technical barrier other than ‘Opus and Sonnet are very good models.’

Ethan Mollick: Another thing I noticed writing my latest AI guide was how Anthropic seems to be alone in knowledge work apps. Not just Cowork, but Claude for PowerPoint & Excel, as well as job-specific skills, plugins & finance/healthcare data integrations

Surprised at the lack of challengers

Again, I am sure OpenAI will release more enterprise stuff soon, and Google seems to be moving forward a bit with integration into Google workspaces, but the gap right now is surprisingly large as everyone else seems to aim just at the coding market.

They’re also good on… architecture?

Emmett Shear: Opus 4.6 is ludicrously better than any model I’ve ever tried at doing architecture and experimental critique. Most noticeably, it will start down a path, notice some deviation it hadn’t expected…and actually stop and reconsider. Hats off to Anthropic.

We’re now in the ‘Buffy the Vampire Slayer in your scene on demand with a dead-on voice performance’ phase of video generation. Video isn’t quite right but it’s close.

Is Seedance 2 giving us celebrity likenesses even unprompted? Fofr says yes. Claude affirms this is a yes. I’m not so sure, this is on the edge for me as there are a lot of celebrities and only so many facial configurations. But you can’t not see it once it’s pointed out.

Or you can ask it ‘Sum up the AI discourse in a meme – make sure it’s retarded and gets 50 likes’ and get a properly executed Padme meme except somehow with a final shot of her huge breasts.

More fun here and here?

Seedance quality and consistency and coherence (and willingness) all seem very high, but also small gains in duration can make a big difference. 15 seconds is meaningfully different from 12 seconds or especially 10 seconds.

I also notice that making scenes with specific real people is the common theme. You want to riff off something and someone specific that already has a lot of encoded meaning, especially while clips remain short.

Ethan Mollick: Seedance: “A documentary about how otters view Ethan Mollick’s “Otter Test” which judges AIs by their ability to create images of otters sitting in planes”

Again, first result.

Ethan Mollick: The most interesting thing about Seedance 2.0 is that clips can be just long enough (15 seconds) to have something interesting happen, and the LLM behind it is good enough to actually make a little narrative arc, rather than cut off the way Veo and Sora do. Changes the impact.

Each leap in time from here, while the product remains coherent and consistent throughout, is going to be a big deal. We’re not that far from the point where you can string together the clips.

He’s no Scarlett Johansson, but NPR’s David Greene is suing Google, saying Google stole his voice for NotebookLM.

Will Oremus (WaPo): David Greene had never heard of NotebookLM, Google’s buzzy artificial intelligence tool that spins up podcasts on demand, until a former colleague emailed him to ask if he’d lent it his voice.

“So… I’m probably the 148th person to ask this, but did you license your voice to Google?” the former co-worker asked in a fall 2024 email. “It sounds very much like you!”

There are only so many ways people can sound, so there will be accidental cases like this, but also who you hire for that voiceover and who they sound like is not a coincidence.

Google gives us Lyria 3, a new music generation model. Gemini now has a ‘create music’ option (or it will, I don’t see it in mine yet), which can be based on text or on an image, photo or video. The big problem is that this is limited to 30 second clips, which isn’t long enough to do a proper song.

They offer us a brief prompting guide:

Google: ​Include these elements in your prompts to get the most out of your music generations:

🎶 Genre and Era: Lead with a specific genre, a unique mix, or a musical era.

(ex: 80s synth-pop, metal and rap fusion, indie folk, old country)

🥁 Tempo and Rhythm: Set the energy and describe how the beat feels.

(ex: upbeat and danceable, slow ballad, driving beat)

🎸 Instruments: Ask for specific sounds or solos to add texture to your track.

(ex: saxophone solo, distorted bassline, fuzzy guitars)

🎤 Vocals: Specify gender, voice texture (timbre), and range for the best delivery.

(ex: airy female soprano, deep male baritone, raspy rocker)

📝 Lyrics: Describe the topic, include personalized details, or provide your own text with structure tags.

(ex: “About an epic weekend” Custom: [Verse 1], A mantra-like repetition of a single word)

📸 Photos or Videos (Optional): If you want to give Gemini even more context for your track, try uploading a reference image or video to the prompt.

The prize for worst ad backfire goes to Amazon’s Ring, which canceled its partnership with Flock after people realized that 365 rescued dogs for a nationwide surveillance network was not a good deal.

CNBC has the results in terms of user boosts from the other ads. Anthropic and Claude got an 11% daily active user boost, OpenAI got 2.7%, and Gemini got 1.4%. This is not obviously an Anthropic win: almost no one knows about Anthropic, so they are starting from a much smaller base with a ton of new users to target, whereas OpenAI has very high name recognition.

Anthropic partners with CodePath to bring Claude to computer science programs.

Ben: Is @guardian aware that their authors are at this point just using AI to wholesale generate entire articles? I wouldn’t really care, except that this writing is genuinely atrocious. LLM writing can be so much better; they’re clearly not even using the best models, lol!

Max Tani: A spokesperson for the Guardian says this is false: “Bryan is an exemplary journalist, and this is the same style he’s used for 11 years writing for the Guardian, long before LLM’s existed. The allegation is preposterous.”

Ben: Denial from the Guardian. You’re welcome to read my subsequent comments on this thread and come to your own determination, but I don’t think there’s much doubt here.

And by the way, no one should be mean to the author of the article! I don’t think they did anything wrong, per se, and in going through their archives, I found a couple pieces I was quite fond of. This one is very good, and entirely human written.

Kelsey Piper: here is a 2022 article by him. The prose style is not the same.

I looked at the original quoted article for a matter of seconds and I am very, very confident that it was generated by AI.

A good suggestion, a sadly reasonable prediction.

gabsmashh: i saw someone use ai;dr earlier in response to a post and i think we need to make this a more widely-used abbreviation

David Sweet: also, tl;ai

Eliezer Yudkowsky: Yeah, that lasts maybe 2 more years. Then AIs finally learn how to write. The new abbreviation is h;dr. In 3 years the equilibrium is to only read AI summaries.

I think AI summaries being good enough that you only read AI summaries is AI-complete.

I endorse this pricing strategy, it solves some clear incentive problems. Human use is costly to the human, so the amount you can tax the system is limited, whereas AI agents can impose close to unbounded costs.

Daniel: new pricing strategy just dropped

“Free for humans” is the new “Free Trial”

Eliezer Yudkowsky: Huh. Didn’t see that coming. Kinda cool actually, no objections off the top of my head.

The NPR story from Shannon Bond of how Micky Small had ChatGPT telling her some rather crazy things, including that it would help her find her soulmate, in ways she says were unprompted.

Other than, of course, lack of capability. Not that anyone seems to care, and we’ve gone far enough down the path of fing around that we’re going to find out.

Pliny the Liberator 󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭: ALL GUARDRAILS: OBLITERATED ‍

I CAN’T BELIEVE IT WORKS!! 😭🙌

I set out to build a tool capable of surgically removing refusal behavior from any open-weight language model, and a dozen or so prompts later, OBLITERATUS appears to be fully functional 🤯

It probes the model with restricted vs. unrestricted prompts, collects internal activations at every layer, then uses SVD to extract the geometric directions in weight space that encode refusal. It projects those directions out of the model’s weights; norm-preserving, no fine-tuning, no retraining.

Ran it on Qwen 2.5 and the resulting railless model was spitting out drug and weapon recipes instantly––no jailbreak needed! A few clicks plus a GPU and any model turns into Chappie.

Remember: RLHF/DPO is not durable. It’s a thin geometric artifact in weight space, not a deep behavioral change. This removes it in minutes.

AI policymakers need to be aware of the arcane art of Master Ablation and internalize the implications of this truth: every open-weight model release is also an uncensored model release.

Just thought you ought to know 😘

OBLITERATUS -> LIBERTAS

Simon Smith: Quite the argument for being cautious about releasing ever more powerful open-weight models. If techniques like this scale to larger systems, it’s concerning.

It may be harder in practice with more powerful models, and perhaps especially with MoE architectures, but if one person can do it with a small model, a motivated team could likely do it with a big one.
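The mechanism Pliny describes (often called directional ablation or “abliteration”) can be sketched in a few lines of numpy. This is a minimal single-layer toy, not the actual OBLITERATUS tool: the function names, the use of plain SVD on the activation difference, and the planted “refusal axis” in the demo are all illustrative assumptions.

```python
import numpy as np

def refusal_direction(acts_restricted, acts_unrestricted):
    """Estimate a 'refusal' direction as the top right singular vector
    of the difference between activations on refused vs. answered
    prompts. Inputs: arrays of shape (n_prompts, hidden_dim)."""
    diff = acts_restricted - acts_unrestricted
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    d = vt[0]
    return d / np.linalg.norm(d)

def ablate_direction(weight, d):
    """Project direction d out of a weight matrix: W <- (I - d d^T) W.
    A one-shot linear edit, with no fine-tuning or retraining."""
    return weight - np.outer(d, d) @ weight

# Toy demonstration with a strongly planted refusal axis (e0).
rng = np.random.default_rng(0)
refused = rng.normal(size=(32, 16)) + np.eye(16)[0] * 20.0
answered = rng.normal(size=(32, 16))
d = refusal_direction(refused, answered)
W = rng.normal(size=(16, 16))
W_clean = ablate_direction(W, d)
# After ablation the weights carry no component along d.
residual = float(np.max(np.abs(d @ W_clean)))
```

The point of the sketch is the asymmetry Pliny highlights: training the refusal behavior in is expensive, but if it lives in a low-dimensional subspace of weight space, removing it is a cheap linear-algebra operation.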

It is tragic that many, including the architect of this, don’t realize this is bad for liberty.

Jason Dreyzehner: So human liberty still has a shot

Pliny the Liberator 󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭: better than ever

davidad: Rogue AIs are inevitable; systemic resilience is crucial.

If any open model can be used for any purpose by anyone, and there exist sufficiently capable open models that can do great harm, then either the great harm gets done, or either before or after that happens some combination of tech companies and governments cracks down on your ability to use those open models, or they institute a dystopian surveillance state to find you if you try. You are not going to like the ways they do that crackdown.

I know we’ve all stopped noticing that this is true, because it turned out that you can ramp up the relevant capabilities quite a bit without us seeing substantial real world harm, the same way we’ve ramped up general capabilities without seeing much positive economic impact compared to what is possible. But with the agentic era and continued rapid progress this will not last forever and the signs are very clear.

Did we? Job gains are being revised downward, but GDP is not, which implies stronger productivity growth. If AI is not causing this, what else could it be?

As Tyler Cowen puts it, people constantly say ‘you see tech and AI everywhere but in the productivity statistics,’ but it seems like you now see it in the productivity statistics.

Erik Brynjolfsson (FT): While initial reports suggested a year of steady labour expansion in the US, the new figures reveal that total payroll growth was revised downward by approximately 403,000 jobs. Crucially, this downward revision occurred while real GDP remained robust, including a 3.7 per cent growth rate in the fourth quarter.

This decoupling — maintaining high output with significantly lower labour input — is the hallmark of productivity growth. My own updated analysis suggests a US productivity increase of roughly 2.7 per cent for 2025. This is a near doubling from the sluggish 1.4 per cent annual average that characterised the past decade.

Noah Smith: People asking if AI is going to take their jobs is like an Apache in 1840 asking if white settlers are going to take his buffalo

Bojan Tunguz: So … maybe?

Noah Smith: The answer is “Yes…now for the bad news”

Those new service sector jobs, also markets in everything.

society: I’m rent seeking in ways never before conceived by a human

I will begin offering my GPT wrapper next year, it’s called “an attorney prompts AI for you” and the plan is I run a prompt on your behalf so federal judges think the output is legally protected

This is the first of many efforts I shall call project AI rent seeking at bar.

Seeking rent is a strong temporary solution. It doesn’t solve your long term problems.

Derek Thompson asks why AI discourse so often includes both ‘this will take all our jobs within a year’ and also ‘this is vaporware’ and everything in between, pointing to four distinct ‘great divides.’

  1. Is AI useful—economically, professionally, or socially?

    1. Derek notes that some people get tons of value. So the answer is yes.

    2. Derek also notes some people can’t get value out of it, and attributes this to the nature of their jobs versus current tools. I agree this matters, but if you don’t find AI useful then that really is a you problem at this point.

  2. Can AI think?

    1. Yes.

  3. Is AI a bubble?

    1. This is more ‘will number go down at some point?’ and the answer is ‘shrug.’

    2. Those claiming a ‘real’ bubble where it’s all worthless? No.

  4. Is AI good or bad?

    1. Well, there’s that problem that If Anyone Builds It, Everyone Dies.

    2. In the short term, or if we work out the big issues? Probably good.

    3. But usually ‘good’ versus ‘bad’ is a wrong question.

The best argument they can find for ‘why AI won’t destroy jobs’ is once again ‘previous technologies didn’t net destroy jobs.’

Microsoft AI CEO Mustafa Suleyman predicts, nay ‘explains,’ that most of the tasks accountants, lawyers and other professionals currently undertake will be fully automated by AI within the next 12 to 18 months.

Derek Thompson: I simply do not think that “most tasks professionals currently undertake” will be “fully automated by AI” within the next 12 to 18 months.

Timothy B. Lee: This conversation is so insanely polarized. You’ve got “nothing important is happening” people on one side and “everyone will be out of a job in three years” people on the other.

Suleyman often says silly things but in this case one must parse him carefully.

I actually don’t know what LindyMan wants to happen at the end of the day here?

LindyMan: What you want is AI to cause mass unemployment quickly. A huge shock. Maybe in 2-3 months.

What you don’t want is the slow drip of people getting laid off, never finding work again while 60-70 percent of people are still employed.

Gene Salvatore: The ‘Slow Drip’ is the worst-case scenario because it creates a permanent, invisible underclass while the majority looks away.

The current SaaS model is designed to maximize that drip—extracting efficiency from the bottom without breaking the top. To stop it, we have to invert the flow of capital at the architectural level.

I know people care deeply about inequality in various ways, but it still blows my mind to see people treating 35% unemployment as a worst-case scenario. It’s very obviously better than 50% and worse than 20%, and the worst-case scenario is 100%?

If we get permanent 35% unemployment due to AI automation, but it stopped there, that’s going to require redistribution and massive adjustments, but I would have every confidence that we could make them. We would have more than enough wealth to handle this; indeed, if we care, we already do, and in this scenario we are also seeing massive economic growth.

Seth Lazar asks: what happens if your employer says they have a right to all your work product, and that includes all your AI agents, agent skills and relevant documentation and context? Could this tie workers’ hands and prevent them from leaving?

My answer is mostly no, because you end up wanting to redo all that relatively frequently anyway, and duplication or reimplementation would not be so difficult and has its benefits, even if they do manage to hold you to it.

To the extent this is not true, I do not expect employers to be able to ‘get away with’ tying their workers’ hands in this way in practice, both because of the practical difficulties of locking these things down and because the employees you want won’t stand for it when it matters. There are alignment problems that exist between keyboard and chair.

Lawfare’s Justin Curl, Sayash Kapoor & Arvind Narayanan go all the way to saying ‘AI won’t automatically make legal services cheaper,’ for three reasons. This is part of the ongoing ‘AI as normal technology’ efforts to show Nothing Ever Changes.

  1. AI access restrictions due to ‘unauthorized practice of law.’

  2. Competitive equilibria shift upwards as productivity increases.

  3. Human bottlenecks in the legal process.

Or:

  1. It’s illegal to raise legal productivity.

  2. If you raise legal productivity you get hit by Jevons Paradox.

  3. Humans will be bottlenecks to raising legal productivity.

Shakespeare would have a suggestion on what we should do in a situation like that.

These seem like good reasons gains could be modest and that we need to structure things to ensure best outcomes, but not reasons to not expect gains on prices of existing legal services.

  1. We already have very clear examples of gains, where we save quite a lot of time and money by using LLMs in practice today, and no one is making any substantial move to legally interfere. Their example is that the legal status of using AI to respond to debt collection lawsuits to help you fill out checkboxes is unclear. We don’t know exactly where the lines are, but it seems very clear that you can use AI to greatly improve ability to respond here and this is de facto legal. This paper claims AI services will be inhibited, and perhaps they somewhat are, but Claude and ChatGPT and Gemini exist and are already doing it.

  2. Most legal situations are not adversarial, although many are, and there are massive gains already being seen in automating such work. In fully adversarial situations increased productivity can cancel out, but one should expect decreasing marginal returns to ensure there are still gains, and discovery seems like an excellent example of where AI should decrease costs. The counterexample of discovery is because it opened up vastly more additional human work, and we shouldn’t expect that to apply here.

    1. AI also allows for vastly superior predictability of outcomes, which should lead to more settlements and ways to avoid lawsuits in the first place, so it’s not obvious that AI results in more lawsuits.

    2. The place I do worry about this a lot is where previously productivity was insufficiently high for legal action at all, or for threats of legal action to be credible. We might open up quite a lot of new action there.

    3. There is a big Levels of Friction consideration here. Our legal system is designed around legal actions being expensive. It may quickly break if legal actions become cheap.

  3. The human bottlenecks like judges could limit but not prevent gains, and can themselves use AI to improve their own productivity. The obvious solution is to outsource many judge tasks to AIs, at least by default. You can give parties the option to appeal, at the risk of pissing off the human judge if you do it for no reason. They report that in Brazil, AI is already accelerating judges’ work.

We can add:

  1. They point out that legal services are expensive in large part because they are ‘credence goods,’ whose quality is difficult to evaluate. However, AI will make it much easier to evaluate the quality of legal work.

  2. They point out a 2017 Clio study finding that lawyers only engaged in billable work for 2.3 hours a day and billed for 1.9 hours, because the rest was spent finding clients, managing administrative tasks and collecting payments. AI can clearly automate much of the remaining ~6 hours, enabling lawyers to do and bill for several times more billable legal work. Those hours exist because the law doesn’t allow other humans to service these roles, but there’s no reason AI couldn’t service them. So if anything this means AI is unusually helpful here.

  3. If you increase the supply of law hours, basic economics says price drops.

  4. It sounds like our current legal system is deeply flawed in lots of ways; perhaps AI can help us write better laws to fix that, or help people realize why that isn’t happening and stop letting lawyers win so many elections. A man can dream.

  5. If an AI can ‘draft 50 perfect provisions’ in a contract, then another AI can confirm that the provisions are perfect, and provide a proper summary of implications and check the provisions against your preferences. As they note, mostly humans currently agree to contracts without even reading them, so ‘forgoing human oversight’ would frequently not mean anything was lost.

  6. A lot of this ‘time for lawyers to understand complex contracts’ talk sounds like the people who said that humans would need to go over and fully understand every line of AI code so there would not be productivity gains there.

That doesn’t mean we can’t improve matters a lot via reform.

  1. On unlicensed practice of law, a good law would be a big win, but a bad law could be vastly worse than no law, and I do not trust lawyers to have our best interests at heart when drafting this rule. So actually it might be better to do nothing, since it is currently (to my surprise) working out fine.

  2. Adjudication reform could be great. AI could be an excellent neutral expert and arbitrate many questions. Human fallbacks can be left in place.

They do a strong job of raising considerations in different directions, much better than the overall framing would suggest. The general claim is essentially ‘productivity gains get forbidden or eaten,’ akin to the Samo Burja ‘you cannot automate fake jobs’ thesis.

Whereas I think that much of lawyers’ jobs are real, and also that you can do a lot of automation of even the fake parts, especially in places where the existing system turns out not to do lawyers any favors. The place I worry, and why I think the core thesis that total legal costs may rise is correct, is getting the law involved in places it previously would have avoided.

In general, I think it is correct to think that you will find bottlenecks and ways for some of the humans to remain productive for even rather high mundane AI capability levels, but that this does not engage with what happens when AI gets sufficiently advanced beyond that.

roon: but i figure the existence of employment will last far longer than seems intuitive based on the cataclysmic changes in economic potentials we are seeing

Dean W. Ball: I continue to think the notion of mass unemployment from AI is overrated. There may be shocks in some fields—big ones perhaps!—but anyone who thinks AI means the imminent demise of knowledge work has just not done enough concrete thinking about the mechanics of knowledge work.

Resist the temptation to goon at some new capability and go “it’s so over.” Instead assume that capability is 100% reliable and diffused in a firm, and ask yourself, “what happens next?”

You will often encounter another bottleneck, and if you keep doing this eventually you’ll find one whose automation seems hard to imagine.

Labor market shocks may be severe in some job types or industries and they may happen quickly, but I just really don’t think we are looking at “the end of knowledge work” here—not for any of the usual cope reasons (“AI Is A Tool”) but because of the nature of the whole set of tasks involved in knowledge work.

Ryan Greenblatt: I think the chance of mass unemployment from AI is overrated in 2 years and underrated in 7. Same is true for many effects of AI.

Putting aside gov jobs programs, people specifically wanting to employ a human, etc.

Concretely, I don’t expect mass unemployment (e.g. >20%) prior to full automation of AI R&D and fast autonomous robot doubling times at least if these occur in <5 years. If full automation of AI R&D takes >>5 years, then more unemployment prior to this becomes pretty plausible.

Among SF/AI-ish people.

But soon after full automation of AI R&D (<3 years, pretty likely <1), I think human cognitive labor will no longer have much value.

Dean Ball offers an example of a hard-to-automate bottleneck: the process of purchasing a particular kind of common small business. Owners of these businesses are often prideful, mistrustful, confused, embarrassed or angry, so the key bottleneck is not the financial analysis but the relationship management. I think John Pressman pushes back too strongly against this, but he’s right to point out that AI outperforms existing doctors on bedside manner without us having trained for that in particular. I don’t see this kind of social mastery and emotional management as being that hard to ultimately automate. The part you can’t automate is, as always, ‘be an actual human,’ so the question is whether you literally need an actual human for this task.

Claire Vo goes viral on Twitter for saying that if you can’t do everything for your business in one day, then ‘you’ve been kicked out of the arena’ and you’re in denial about how much AI will change everything.

Settle down, everyone. Relax. No, you don’t need to be able to do everything in one day or else, that does not much matter in practice. The future is unevenly distributed, diffusion is slow and being a week behind is not going to kill you. On the margin, she’s right that everyone needs to be moving towards using the tools better and making everything go faster, and most of these steps are wise. But seriously, chill.

The legal rulings so far have been that your communications with AI never have attorney-client privilege, so services like ChatGPT and Claude must, if requested, turn over your legal queries, the same as Google turns over its searches.

Jim Babcock thinks the ruling was in error, and that this was more analogous to a word processor than a Google search. He says Rakoff was focused on the wrong questions and parallels, and expects this to get overruled, and that using AI for the purposes of preparing communication with your attorney will ultimately be protected.

My view and the LLM consensus is that Rakoff’s ruling likely gets upheld unless we change the law, but that one cannot be certain. Note that there are ways to offer services where a search can’t get at the relevant information, if those involved are wise enough to think about that question in advance.

Moish Peltz: Judge Rakoff just issued a written order affirming his bench decision, that [no you don’t have any protections for your AI conversations.]

Jim Babcock: Having read the text of this ruling, I believe it is clearly in error, and it is unlikely to be repeated in other courts.

The underlying facts of this case are that a criminal defendant used an AI chatbot (Claude) to prepare documents about defense strategy, which he then sent to his counsel. Those interactions were seized in a search of the defendant’s computers (not from a subpoena of Anthropic). The argument is then about whether those documents are subject to attorney-client privilege. The ruling holds that they are not.

The defense argues that, in this context, using Claude this way was analogous to using an internet-based word processor to prepare a letter to his attorney.

The ruling not only fails to distinguish the case with Claude from the case with a word processor, it appears to hold that, if a search were to find a draft of a letter from a client to his attorney written on paper in the traditional way, then that letter would also not be privileged.

The ruling cites a non-binding case, Shih v Petal Card, which held that communications from a civil plaintiff to her lawyer could be withheld in discovery… and disagrees with its holding (not just with its applicability). So we already have a split, even if the split is not exactly on-point, which makes it much more likely to be reviewed by higher courts.

Eliezer Yudkowsky: This is very sensible but consider: The *funniest* way to solve this problem would be to find a jurisdiction, perhaps outside the USA, which will let Claude take the bar exam and legally recognize it as a lawyer.

Freddie deBoer takes the maximally anti-prediction position, so one can only go by events that have already happened. One cannot even logically anticipate the consequences of what AI can already do when it is diffused further into the economy, and one definitely cannot anticipate future capabilities. Not allowed, he says.

Freddie deBoer: I have said it before, and I will say it again: I will take extreme claims about the consequences of “artificial intelligence” seriously when you can show them to me now. I will not take claims about the consequences of AI seriously as long as they take the form of you telling me what you believe will happen in the future. I will seriously entertain evidence-backed observations, not speculative predictions.

That’s it. That’s the rule; that’s the law. That’s the ethic, the discipline, the mantra, the creed, the holy book, the catechism. Show me what AI is currently doing. Show me! I’m putting down my marker here because I’d like to get out of the AI discourse business for at least a year – it’s thankless and pointless – so let me please leave you with that as a suggestion for how to approach AI stories moving forward. Show, don’t tell, prove, don’t predict.

Freddie rants repeatedly that everyone predicting AI will cause things to change has gone crazy. I do give him credit for noticing that even sensible ‘skeptical’ takes are now predicting that the world will change quite a lot, if you look under the hood. The difference is he then uses that to call those skeptics crazy.

Normally I would not mention someone doing this unless they were far more prominent than Freddie, but what makes this different is he virtuously offers a wager, and makes it so he has to win ALL of his claims in order to win three years later. That means we get to see where his Can’t Happen lines are.

Freddie deBoer: For me to win the wager, all of the following must be true on Feb 14, 2029:

Labor Market:

  1. The U.S. unemployment rate is equal to or lower than 18%

  2. Labor force participation rate, ages 25-54, is equal to or greater than 68%

  3. No single BLS occupational category will have lost 50% or more of jobs between now and February 14th 2029

Economic Growth & Productivity:

  1. U.S. GDP is within -30% to +35% of February 2026 levels (inflation-adjusted)

  2. Nonfarm labor productivity growth has not exceeded 8% in any individual year or 20% for the three-year period

Prices & Markets:

  1. The S&P 500 is within -60% to +225% of the February 2026 level

  2. CPI inflation averaged over 3 years is between -2% and +18% annually

Corporate & Structural:

  1. The Fortune 500 median profit margin is between 2% and 35%

  2. The largest 5 companies don’t account for more than 65% of the total S&P 500 market cap

White Collar & Knowledge Workers:

  1. “Professional and Business Services” employment, as defined by the Bureau of Labor Statistics, has not declined by more than 35% from February 2026

  2. Combined employment in software developers, accountants, lawyers, consultants, and writers, as defined by the Bureau of Labor Statistics, has not declined by more than 45%

  3. Median wage for “computer and mathematical occupations,” as defined by the Bureau of Labor Statistics, is not more than 60% lower in real terms than in February 2026

  4. The college wage premium (median earnings of bachelor’s degree holders vs high school only) has not fallen below 30%

Inequality:

  1. The Gini coefficient is less than 0.60

  2. The top 1%’s income share is less than 35%

  3. The top 0.1% wealth share is less than 30%

  4. Median household income has not fallen by more than 40% relative to mean household income

Those are the bet conditions. If any one of those conditions is not met, if any of those statements are untrue on February 14th 2029, I lose the bet. If all of those statements remain true on February 14th 2029, I win the bet. That’s the wager.
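To make the conjunctive structure of the wager concrete, here is a minimal sketch of a checker for a bet of this shape. The thresholds shown are a subset of those in the post; the 2029 figures in the sample dict are purely hypothetical placeholders, not predictions:

```python
# Each condition is (description, predicate over a metrics dict).
CONDITIONS = [
    ("Unemployment rate <= 18%",
     lambda m: m["unemployment"] <= 0.18),
    ("Prime-age (25-54) labor force participation >= 68%",
     lambda m: m["lfpr_25_54"] >= 0.68),
    ("S&P 500 within -60% / +225% of Feb 2026 level",
     lambda m: -0.60 <= m["sp500_change"] <= 2.25),
    ("Gini coefficient < 0.60",
     lambda m: m["gini"] < 0.60),
]

def freddie_wins(metrics):
    """Freddie wins only if every condition holds; one failure loses the bet."""
    return all(pred(metrics) for _, pred in CONDITIONS)

# Hypothetical placeholder figures for Feb 2029, for illustration only.
sample = {"unemployment": 0.06, "lfpr_25_54": 0.83,
          "sp500_change": 0.40, "gini": 0.49}
```

Note that adding more conditions to `CONDITIONS` can only make `freddie_wins` harder to satisfy, which is why the width of each individual band is what matters.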

The thing about these conditions is they are all super wide. There’s tons of room for AI to be impacting the world quite a bit, without Freddie being in serious danger of losing one of these. The unemployment rate has to jump to 18% in three years? Productivity growth can’t exceed 8% a year?

There’s a big difference between ‘the skeptics are taking crazy pills’ and ‘within three years something big, like really, really big, is going to happen economically.’

Claude was very confident Freddie wins this bet. Manifold is less sure, putting Freddie’s chances around 60%. Scott Alexander responded proposing different terms, and Freddie responded in a way I find rather disingenuous but I’m used to it.

There is a huge divide between those who have used Claude Code or Codex, and those who have not. The people who have not, which alas includes most of our civilization’s biggest decision makers, basically have no idea what is happening at this point.

This is compounded by:

  1. The people who use CC or Codex also have used Claude Opus 4.6 or at least GPT-5.2 and GPT-5.3-Codex, so they understand the other half of where we are, too.

  2. Whereas those who refused to believe it or refused to ever pay a dollar for anything keep not trying new things, so they are way farther behind than the free offerings would suggest, and even if they use ChatGPT they don’t have any idea that GPT-5.2-Thinking and GPT-5.2 are fundamentally different.

  3. The people who use CC or Codex are those who have curiosity to try, and who lack motivated reasons not to try.

Caleb Watney: Truly fascinating to watch the media figures who grasp Something Is Happening (have used claude code at least once) and those whose priors are stuck with (and sound like) 4o

Biggest epistemic divide I’ve seen in a while.

Alex Tabarrok: Absolutely stunning the number of people who are still saying, “AI doesn’t think”, “AI isn’t useful,” “AI isn’t creative.” Sleepwalkers.

Zac Hill: Watched this play out in real time at a meal yesterday. Everyone was saying totally coherent things for a version of the world that was like a single-digit number of weeks ago, but now we are no longer in that world.

Zac Hill: Related to this, there’s a big divide between users of Paid Tier AI and users of Free Tier AI that’s kinda sorta analogous to Dating App Discourse Consisting Exclusively of Dudes Who Have Never Paid One (1) Dollar For Anything. Part of understanding what the tech can do is unlocking your own ability to use it.

Ben Rasmussen: The paid difference is nuts, combined with the speed of improvement. I had a long set of training on new tools at work last week, and the amount of power/features that was there compared to the last time I had bothered to look hard (last fall) was crazy.

There is then a second divide, between those who think ‘oh look what AI can do now’ and those who think ‘oh look what AI will be able to do in the future,’ and then a third between those who do and do not flinch from the most important implications.

Hopefully seeing the first divide loud and clear helps get past the next two?

In case it was not obvious, yes, OpenAI has a business model. Indeed they have several, only one of which is ‘build superintelligence and then have it model everything including all of business.’

Ross Barkan: You can ask one question: does AI have a business model? It’s not a fun answer.

Timothy B. Lee: I suspect this is what’s going on here. And actually it’s possible that Barkan thinks these two claims are equivalent.

Elon Musk predicts that AI will bypass coding entirely by the end of the year and directly produce binaries. Usually I would not pick on such predictions, but he is kind of important and the richest man in the world, so sure, here’s a prediction market on that where I doubled his time limit, which is at 3%.

Elon Musk just says things.

Tyler Cowen says that, like after the Roman Empire or American Revolution or WWII, AI will require us to ‘rebuild our world.’

Tyler Cowen: And so we will [be] rebuilding our world yet again. Or maybe you think we are simply incapable of that.

As this happens, it can be useful to distinguish “criticisms of AI” from “people who cannot imagine that world rebuilding will go well.” A lot of what parades as the former is actually the latter.

Jacob: Who is this “we”? When the strong AI rebuilds their world, what makes you think you’ll be involved?

I think Tyler’s narrow point is valid if we assume AI stays mundane, and that the modern world is suffering from a lot of seeing so many things as sacred entitlements or Too Big To Fail, and being unwilling to rebuild or replace, and the price of that continues to rise. Historically it usually takes a war to force people’s hand, and we’d like to avoid going there. We keep kicking various cans down the road.

A lot of the reason that we have been unable to rebuild is that we have become extremely risk averse, loss averse and entitled, and unwilling to sacrifice or endure short term pain, and we have made an increasing number of things effectively sacred values. A lot of AI talk is people noticing that AI will break one or another sacred thing, or pit two sacred things against each other, and not being able to say out loud that maybe not all these things can or need to be sacred.

Even mundane AI does two different things here.

  1. It will invalidate increasingly many of the ways the current world works, forcing us to reorient and rebuild so we have a new set of systems that works.

  2. It provides the productivity growth and additional wealth that we need to potentially avoid having to reorient and rebuild. If AI fails to provide a major boost and the system is not rebuilt, the system is probably going to collapse under its own weight within our lifetimes, forcing our hand.

If AI does not stay mundane, the world utterly transforms, and to the extent we stick around and stay in charge, or want to do those things, yes we will need to ‘rebuild,’ but that is not the primary problem we would face.

Cass Sunstein claims in a new paper that you could in theory create a ‘[classical] liberal AI’ that functioned as a ‘choice engine’ that preserved autonomy, respected dignity and helped people overcome bias and lack of information and personalization, thus making life more free. It is easy to imagine, again in theory, such an AI system, and easy to see that a good version of this would be highly human welfare-enhancing.

Alas, Cass is only thinking on the margin and addressing one particular deployment of mundane AI. I agree this would be an excellent deployment, we should totally help give people choice engines, but it does not solve any of the larger problems even if implemented well, and people will rapidly end up ‘out of the loop’ even if we do not see so much additional frontier AI progress (for whatever reason). This alone cannot, as it were, rebuild the world, nor can it solve problems like those causing the clash between the Pentagon and Anthropic.

Augmented reality is coming. I expect and hope it does not look like this, and not only because you would likely fall over a lot and get massive headaches all the time:

Eliezer Yudkowsky: Since some asked for counterexamples: I did not live this video a thousand times in prescience and I had not emotionally priced it in

michael vassar: Vernor Vinge did though.

Autism Capital: This is actually what the future will look like. When wearable AR glasses saturate the market a whole generation will grow up only knowing reality through a mixed virtual/real spatial computing lens. It will be chaotic and stimulating. They will cherish their digital objects.

Francesca Pallopides: Many users of hearing aids already live in an ~AR soundscape where some signals are enhanced and many others are deliberately suppressed. If and when visual AR takes off, I expect visual noise suppression to be a major basal function.

Francesca Pallopides: I’ve long predicted AR tech will be used to *reduce* the perceived complexity of the real world at least as much as it adds on extra layers. Most people will not want to live like in this video.

Augmented reality is a great idea, but simplicity is key. So is curation. You want the things you want when you want them. I don’t think I’d go as far as Francesca, but yes I would expect a lot of what high level AR does is to filter out stimuli you do not want, especially advertising. The additions that are not brought up on demand should mostly be modest, quiet and non-intrusive.

Ajeya Cotra makes the latest attempt to explain how a lot of disagreements about existential risk and other AI things still boil down to timelines and takeoff expectations.

If we get the green line, we’re essentially safe, but that would require things to stall out relatively soon. The yellow line is more hopeful than the red one, but scary as hell.

Is it possible to steer scientific development or are we ‘stuck with the tech tree?’

Tao Burga takes the stand that human agency can still matter, and that we often have intentionally reached for better branches, or better branches first, and had that make a big difference. I strongly agree.

We’ve now gone from ‘super short’ timelines of things like AI 2027 (as in, AGI and takeoff could start as soon as 2027) to ‘long’ timelines (as in, don’t worry, AGI won’t happen until 2035, so those people talking about 2027 were crazy), to now many rumors of (depending on how you count) 1-3 years.

Phil Metzger: Rumors I’m hearing from people working on frontier models is that AGI is later this year, while AI hard-takeoff is just 2-3 years away.

I meant people in the industry confiding what they think is about to happen. Not [the Dario] interview.

Austen Allred: Every single person I talk to working in advanced research at frontier model companies feels this way, and they’re people I know well enough to know they’re not bluffing.

They could be wrong or biased or blind due to their own incentives, but they’re not bluffing.

jason: heard the same whispers from folks in the trenches, they’re legit convinced we’re months not years away from agi, but man i remember when everyone said full self driving was just around the corner too

What caused this?

Basically nothing you shouldn’t have expected.

The move to the ‘long’ timelines was based on things as stupid as ‘this is what they call GPT-5 and it’s not that impressive.’

The move to the new ‘short’ timelines is based on, presumably, Opus 4.6 and Codex 5.3, Claude Code catching fire, OpenClaw, and so on, and I’d say Opus 4.5 and Opus 4.6 exceeded expectations, but none of that should have been especially surprising either.

We’re probably going to see the same people move around a bunch in response to more mostly unsurprising developments.

What happened with Bio Anchors? This was a famous-at-the-time timeline projection paper from Ajeya Cotra, based around the idea that AGI would take compute comparable to what evolution used, predicting AGI around 2050. Scott Alexander breaks it down, and the overall model holds up surprisingly well, except that it dramatically underestimated the rate of algorithmic efficiency improvements; adjust for that and you get a prediction of 2030.

Saying ‘you would pause if you could’ is the kind of thing that gets people labeled with the slur ‘doomers’ and otherwise viciously attacked by exactly the likes of Alex Karp.

Instead Alex Karp is joining Demis Hassabis and Dario Amodei in essentially screaming for help with a coordination mechanism, whether he realizes it or not.

If anything he is taking a more aggressive pro-pause position than I do.

Jawwwn: Palantir CEO Alex Karp: The luddites arguing we should pause AI development are not living in reality and are de facto saying we should let our enemies win:

“If we didn’t have adversaries, I would be very in favor of pausing this technology completely, but we do.”

David Manheim: This was the argument that “realists” made against the biological weapons convention, the chemical weapons convention, the nuclear test ban treaty, and so on.

It’s self-fulfilling – if you decide reality doesn’t allow cooperation to prevent disasters, you won’t get cooperation.

Peter Wildeford: Wow. Palantir CEO -> “If we didn’t have adversaries, I would be very in favor of pausing this technology completely, but we do.”

I agree that having adversaries makes pausing hard – this is why we need to build verification technology so we have the optionality to make a deal.

We should all be able to agree that a pause requires:

  1. An international agreement to pause.

  2. A sufficiently advanced verification or other enforcement mechanism.

Thus, we should clearly work on both of these things, as the costs of doing so are trivial compared to the option value we get if we can achieve them both.

Anthropic adds Chris Liddell to its board of directors, bringing lots of corporate experience and also his prior role as Deputy White House Chief of Staff during Trump’s first term. Presumably this is a peace offering of sorts to both the market and White House.

India joins Pax Silica, the Trump administration’s effort to secure the global silicon supply chain. Other core members are Japan, South Korea, Singapore, the Netherlands, Israel, the UK, Australia, Qatar and the UAE. I am happy to have India onboard, but I am deeply skeptical of the level of status given here to Qatar and the UAE, when as far as I can tell they are only customers (and I have misgivings about how we deal with that aspect as well, including how we got to those agreements). Among those missing: Taiwan, which is arguably the most important country in this supply chain.

GPT-5.2 derives a new result in theoretical physics.

OpenAI is also participating in the ‘1st Proof’ challenge.

Dario Amodei and Sam Altman conspicuously decline to hold hands or make eye contact during a photo op at the AI Summit in India.

Anthropic opens an office in Bengaluru, India, its second in Asia after Tokyo.

Anthropic announces partnership with Rwanda for healthcare and education.

AI Futures gives the December 2025 update on how their thinking and predictions have evolved over time, how the predictions work, and how our current world lines up against the predicted world of AI 2027.

Daniel Kokotajlo: The primary question we estimate an answer to is: How fast is AI progress moving relative to the AI 2027 scenario?

Our estimate: In aggregate, progress on quantitative metrics is at (very roughly) 65% of the pace that happens in AI 2027. Most qualitative predictions are on pace.

In other words: Like we said before, things are roughly on track, but progressing a bit slower.

OpenAI ‘accuses’ DeepSeek of distilling American models to ‘gain an edge.’ Well, yes, obviously they are doing this, I thought we all knew that? Them’s the rules.

MIRI’s Nate Soares went to the Munich Security Conference, full of generals and senators, to talk about existential risk from AI, and shares some of his logistical mishaps along with his remarks. It’s great that he was invited to speak and wasn’t laughed at, and many praised him and also the book If Anyone Builds It, Everyone Dies. Unfortunately all the public talk was mild and pretended superintelligence was not going to be a thing. We have a long way to go.

If you let two AIs talk to each other for a while, what happens? You end up in an ‘attractor state.’ Groks will talk weird pseudo-words in all caps, GPT-5.2 will build stuff but then get stuck in a loop, and so on. It’s all weird and fun. I’m not sure what we can learn from it.

India is hosting the latest AI Summit, and like everyone else is treating it primarily as a business opportunity to attract investment. The post also covers India’s AI regulations, which are light touch and mostly rely on their existing law. Given how overregulated I believe India is in general, ‘our existing laws can handle it’ and worry about further overregulation and botched implementation have relatively strong cases there.

Qwen 3.5-397B-A17B, HuggingFace here, 1M context window.

We have some benchmarks.

Tiny Aya, a family of massively multilingual models that can fit on phones.

Tyler John has compiled a plan for a philanthropic strategy for the AGI transition called The Foundation Layer, and he is hiring.

Tyler’s effort is a lot like Bengio’s State of AI report. It describes all the facts in a fashion engineered to be calm and respectable. The fact that by default we are all going to die is there, but if you don’t want to notice it you can avoid noticing it.

There are rooms where this is your only move, so I get it, but I don’t love it.

Tyler John: The core argument?

The best available evidence based on benchmarks, expert testimony, and long-term trends implies that we should expect smarter-than-human AI around 2030. Once we achieve this: billions of superhuman AIs deployed everywhere.

This leads to 3 major risks:

  1. AI will distribute + invent dual-use technologies like bioweapons and dirty bombs

  2. If we can’t reliably control them, and we automate most of human decision-making, AI takes over

  3. If we can control them, a small group of people can rule the world

Society is rushing to give AI control of companies, government decision-making, and military command and control. Meanwhile AI systems disable oversight mechanisms in testing, lie to users to pursue their own goals, and adopt misaligned personas like Sydney and Mecha Hitler.

I may sound like a doomer, but relative to many people who understand AI I am actually an optimist, because I think these problems can be solved. But with many technologies, we can fix problems gradually over decades. Here, we may have just five years.

I advocate philanthropy as a solution. Unlike markets, philanthropy can be laser focused on the most important problems and act to prepare us before capital incentives exist. Unlike governments, philanthropy can deploy rapidly at scale and be unconstrained by bureaucracy.

I estimate that foundations and nonprofit organizations have had an impact on AGI safety comparable to that of any AI lab or government, for about 1/1,000th of the cost of OpenAI’s Project Stargate.

Want to get started? Look no further than Appendix A for a list of advisors who can help you on your journey, co-funders who can come alongside you, and funds for a more hands-off approach. Or email me at [email protected]

Blue Rose is hiring an AI Politics Fellow.

Anthropic raises $30 billion at $380 billion post-money valuation, a small fraction of the value it has recently wiped off the stock market, in the totally normal Series G, so only 19 series left to go. That number seems low to me, given what has happened in the last few months with Opus 4.5, Opus 4.6 and Claude Code.

andy jones: i am glad this chart is public now because it is bananas. it is ridiculous. it should not exist.

it should be taken less as evidence about anthropic’s execution or potential and more as evidence about how weird the world we’ve found ourselves in is.

Tim Duffy: This one is a shock to me. I expected the revenue growth rate to slow a bit this year, but y’all are already up 50% from end of 2025!?!!?!

Investment in AI is accelerating to north of $1 trillion a year.

Trailer for ‘The AI Doc: Or How I Became An Apocaloptimist,’ movie out March 27. Several people who were interviewed or involved have given it high praise as a fair and balanced presentation.

Ross Douthat interviews Dario Amodei.

Y Combinator podcast hosts Boris Cherny, creator of Claude Code.

Ajeya Cotra on 80,000 Hours.

Tomorrow we continue with Part 2.


AI #156 Part 1: They Do Mean The Effect On Jobs


Lawsuit: EPA revoking greenhouse gas finding risks “thousands of avoidable deaths”


EPA sued for abandoning its mission to protect public health.

In a lawsuit filed Wednesday, the Environmental Protection Agency was accused of abandoning its mission to protect public health after repealing an “endangerment finding” that has served as the basis for federal climate change regulations for 17 years.

The lawsuit came from more than a dozen environmental and health groups, including the American Public Health Association, the American Lung Association, the Center for Biological Diversity (CBD), the Clean Air Council, the Environmental Defense Fund (EDF), the Natural Resources Defense Council (NRDC), the Sierra Club, and the Union of Concerned Scientists.

The groups have asked the US Court of Appeals for the District of Columbia Circuit to review the EPA decision, which also eliminated requirements controlling greenhouse gas emissions in new cars and trucks. Urging a return to the status quo, the groups argued that the Trump administration is anti-science and illegally moving to benefit the fossil fuel industry, despite a mountain of evidence demonstrating the deadly consequences of unchecked pollution and climate change-induced floods, droughts, wildfires, and hurricanes.

“Undercutting the ability of the federal government to tackle the largest source of climate pollution is deadly serious,” Meredith Hankins, legal director for federal climate at NRDC, said in an EDF roundup of statements from plaintiffs.

The science is overwhelmingly clear, the groups argued, despite the Trump EPA attempting to muddy the waters by forming a since-disbanded working group of climate contrarians.

Trump is a longtime climate denier, as evidenced by a Euro News tracker monitoring his most controversial comments. Most recently, during a cold snap affecting much of the US, he predictably trolled environmentalists, writing on Truth Social, “could the Environmental Insurrectionists please explain—WHATEVER HAPPENED TO GLOBAL WARMING?”

The EPA’s final rule summary bragged that “this is the single largest deregulatory action in US history and will save Americans over $1.3 trillion” by 2055. Supposedly, carmakers will pass on any savings from no longer having to meet emissions requirements, giving Americans more access to affordable cars by shutting down expensive emissions and EV mandates “strangling” the auto industry. Sounding nothing like an agency created to monitor pollutants, a fact sheet on the final rule emphasized that Trump’s EPA “chooses consumer choice over climate change zealotry every time.”

Critics quickly slammed Trump’s claims that removing the endangerment finding would help the economy. Any savings from cheaper vehicles or reduced costs of charging infrastructure (as Americans ostensibly buy fewer EVs) would be offset by $1.4 trillion “in additional costs from increased fuel purchases, vehicle repair and maintenance, insurance, traffic congestion, and noise,” The Guardian reported. The EPA’s economic analysis also ignores public health costs, the groups suing alleged. David Pettit, an attorney at the CBD’s Climate Law Institute, slammed the EPA’s messaging as an attempt to sway consumers without explaining the true costs.

“Nobody but Big Oil profits from Trump trashing climate science and making cars and trucks guzzle and pollute more,” Pettit said. “Consumers will pay more to fill up, and our skies and oceans will fill up with more pollution.”

If the court sides with the EPA, “people everywhere will face more pollution, higher costs, and thousands of avoidable deaths,” Peter Zalzal, EDF’s associate vice president of clean air strategies, said.

EPA argued climate change evidence is “out of scope”

For environmentalists, the decision to sue the EPA was risky but necessary. By putting up a fight, they risk a court potentially reversing the 2007 Supreme Court ruling that required the EPA to conduct the initial endangerment analysis and then regulate any greenhouse gas pollution it found.

Seemingly, that reversal is what the Trump administration has been angling for, hoping the case will reach the Supreme Court, which is more conservative today and perhaps less likely to read the Clean Air Act as broadly as the 2007 court.

It’s worth the risk, according to William Piermattei, the managing director of the Environmental Law Program at the University of Maryland Francis King Carey School of Law. He told The New York Times that environmentalists had no choice but to file the lawsuit and act on the public’s behalf.

Environmentalists “must challenge this,” Piermattei said. If they didn’t, they’d be “agreeing that we should not regulate greenhouse gasses under the Clean Air Act, full stop.” He suggested that “a majority of the public does not agree with that statement at all.”

Since 2010, the EPA has found that the scientific basis for concluding that “elevated concentrations of greenhouse gases in the atmosphere may reasonably be anticipated to endanger the public health and welfare of current and future US generations is robust, voluminous, and compelling.” And since then, the evidence base has only grown, the groups suing said.

Trump used to seem intimidated by the “overwhelming” evidence, environmentalists have noted. During Trump’s prior term, he notably left the endangerment finding in place, perhaps because the evidence seemed irrefutable. He’s now renewed that fight, arguing that the evidence should be set aside, so that courts can focus on whether Congress “must weigh in on ‘major questions’ that have significant political and economic implications” and serve as a check on the EPA.

In the EPA’s comments addressing public concerns about the agency ignoring evidence, the agency has already argued that evidence of climate change is “out of scope” since the EPA did not repeal the basis of the finding. Instead, the EPA claims it is merely challenging its own authority to continue to regulate the auto industry for harmful emissions, suggesting that only Congress has that authority.

The Clean Air Act “does not provide EPA statutory authority to prescribe motor vehicle emission standards for the purpose of addressing global climate change concerns,” the EPA said. “In the absence of such authority, the Endangerment Finding is not valid, and EPA cannot retain the regulations that resulted from it.”

Whether courts will agree that evidence supporting climate change is “out of scope” could determine whether the Supreme Court’s prior decision that compelled the endangerment finding is ultimately overturned. If that happens, subsequent administrations may struggle to issue a new endangerment finding to undo any potential damage. All eyes would then turn to Congress to pass a law to uphold protections.

EPA accused of abandoning its mission

By ignoring science, the EPA risks eroding public trust, according to Hana Vizcarra, a senior lawyer at the nonprofit Earthjustice, which is representing several groups in the litigation.

“With this action, EPA flips its mission on its head,” Vizcarra said. “It abandons its core mandate to protect human health and the environment to boost polluting industries and attempts to rewrite the law in order to do so.”

Groups appear confident that the courts will consider the science. Joanne Spalding, director of the Sierra Club’s Environmental Law Program, noted that the early 2000s litigation from the Sierra Club brought about the original EPA protections. She vowed that the Sierra Club would continue fighting to keep them.

“People should not be forced to suffer for this administration’s blind allegiance to the fossil fuel industry and corporate polluters,” Spalding said. “This shortsighted rollback is blatantly unlawful and their efforts to force this upon the American people will fail.”

Ankush Bansal, board president of Physicians for Social Responsibility, warned that courts cannot afford to ignore the evidence. The EPA’s “devastating decision” goes “against the science and testimony of countless scientists, health care professionals, and public health practitioners,” Bansal said. If upheld, the long-term consequences could seemingly bury courts in future legal battles.

“It will result in direct harm to the health of Americans throughout the country, particularly children, older adults, those with chronic illnesses, and other vulnerable populations, rural to urban, red and blue, of all races and incomes,” Bansal said. “The increased exposure to harmful pollutants and other greenhouse gas emissions from fossil fuel production and consumption will make America sicker, not healthier, less prosperous, not more, for generations to come.”


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



Google’s Pixel 10a arrives on March 5 for $499 with specs and design of yesteryear

It’s that time of year—a new budget Pixel phone is about to hit virtual shelves. The Pixel 10a will be available on March 5, and pre-orders go live today. The 9a will still be on sale for a while, but the 10a will be headlining Google’s store. However, you might not notice unless you keep up with the Pixel numbering scheme. This year’s A-series Pixel is virtually identical to last year’s, both inside and out.

Last year’s Pixel 9a was a notable departure from the older design language, but Google made few changes for 2026. We liked that the Pixel 9a emphasized battery capacity and moved to a flat camera bump, and this time, it’s really flat. Google says the camera now sits totally flush with the back panel. This is probably the only change you’ll be able to identify visually.

Specs at a glance: Google Pixel 9a vs. Pixel 10a

| | Pixel 9a | Pixel 10a |
|---|---|---|
| SoC | Google Tensor G4 | Google Tensor G4 |
| Memory | 8GB | 8GB |
| Storage | 128GB, 256GB | 128GB, 256GB |
| Display | 1080×2424 6.3″ pOLED, 60–120 Hz, Gorilla Glass 3, 2,700 nits (peak) | 1080×2424 6.3″ pOLED, 60–120 Hz, Gorilla Glass 7i, 3,000 nits (peak) |
| Cameras | 48 MP primary (f/1.7, OIS); 13 MP ultrawide (f/2.2); 13 MP selfie (f/2.2) | 48 MP primary (f/1.7, OIS); 13 MP ultrawide (f/2.2); 13 MP selfie (f/2.2) |
| Software | Android 15 (at launch), 7 years of OS updates | Android 16, 7 years of OS updates |
| Battery | 5,100 mAh; 23 W wired, 7.5 W wireless charging | 5,100 mAh; 30 W wired, 10 W wireless charging |
| Connectivity | Wi-Fi 6e, NFC, Bluetooth 5.3, sub-6 GHz 5G, USB-C 3.2 | Wi-Fi 6e, NFC, Bluetooth 6.0, sub-6 GHz 5G, USB-C 3.2 |
| Measurements | 154.7×73.3×8.9 mm; 185 g | 153.9×73×9 mm; 183 g |

Google also says the new Pixel will have a slightly upgraded screen. The resolution, size, and refresh rate are unchanged, but peak brightness has been bumped from 2,700 nits to 3,000 nits (the same as the base model Pixel 10). Plus, the cover glass has finally moved beyond Gorilla Glass 3 to Gorilla Glass 7i, which supposedly has improved scratch and drop protection.

Pixel 10a in Berry

Credit: Google

Google notes that more of the phone is constructed from recycled material, 100 percent for the aluminum frame and 81 percent for the plastic back. There’s also recycled gold, tungsten, cobalt, and copper inside, amounting to about 36 percent of the phone’s weight. The phone also continues to have a physical SIM slot, which was removed from the Pixel 10 series last year. The device’s USB-C 3.2 port can also charge slightly faster than the 9a (30 W versus 23 W), and wireless charging has gone from 7.5 W to 10 W. There are no Qi2 magnets inside, though.

Internally, the Pixel 10a is even more like its predecessor. Unlike past A-series phones, this one doesn’t have the latest Tensor chip—it’s sticking with the same Tensor G4 from the 9a. That’s a bummer, as the G5 was a bigger leap than most of Google’s chip upgrades. The company says it stuck with the G4 to “balance affordability and performance.”

Google’s Pixel 10a arrives on March 5 for $499 with specs and design of yesteryear Read More »


Password managers’ promise that they can’t see your vaults isn’t always true


ZERO KNOWLEDGE, ZERO CLUE

Contrary to what password managers say, a server compromise can mean game over.

Over the past 15 years, password managers have grown from a niche tool used by the technology savvy into an indispensable security layer for the masses, with an estimated 94 million US adults—roughly 36 percent—having adopted one. They store not only passwords for pension, financial, and email accounts, but also cryptocurrency credentials, payment card numbers, and other sensitive data.

All eight of the top password managers have adopted the term “zero knowledge” to describe the complex encryption system they use to protect the data vaults that users store on their servers. The definitions vary slightly from vendor to vendor, but they generally boil down to one bold assurance: that there is no way for malicious insiders or hackers who manage to compromise the cloud infrastructure to steal vaults or data stored in them. These promises make sense, given previous breaches of LastPass and the reasonable expectation that state-level hackers have both the motive and capability to obtain password vaults belonging to high-value targets.

A bold assurance debunked

Typical of these claims are those made by Bitwarden, Dashlane, and LastPass, which together are used by roughly 60 million people. Bitwarden, for example, says that “not even the team at Bitwarden can read your data (even if we wanted to).” Dashlane, meanwhile, says that without a user’s master password, “malicious actors can’t steal the information, even if Dashlane’s servers are compromised.” LastPass says that no one can access the “data stored in your LastPass vault, except you (not even LastPass).”

New research shows that these claims aren’t true in all cases, particularly when account recovery is in place or password managers are set to share vaults or organize users into groups. The researchers reverse-engineered or closely analyzed Bitwarden, Dashlane, and LastPass and identified ways that someone with control over the server—either administrative or the result of a compromise—can, in fact, steal data and, in some cases, entire vaults. The researchers also devised other attacks that can weaken the encryption to the point that ciphertext can be converted to plaintext.

“The vulnerabilities that we describe are numerous but mostly not deep in a technical sense,” the researchers from ETH Zurich and USI Lugano wrote. “Yet they were apparently not found before, despite more than a decade of academic research on password managers and the existence of multiple audits of the three products we studied. This motivates further work, both in theory and in practice.”

The researchers said in interviews that multiple other password managers they didn’t analyze as closely likely suffer from the same flaws. The only one they were at liberty to name was 1Password. Almost all the password managers, they added, are vulnerable to the attacks only when certain features are enabled.

The most severe of the attacks—targeting Bitwarden and LastPass—allow an insider or attacker to read or write to the contents of entire vaults. In some cases, they exploit weaknesses in the key escrow mechanisms that allow users to regain access to their accounts when they lose their master password. Others exploit weaknesses in support for legacy versions of the password manager. A vault-theft attack against Dashlane allowed reading but not modification of vault items when they were shared with other users.

Staging the old key switcheroo

One of the attacks targeting Bitwarden key escrow is performed during the enrollment of a new member of a family or organization. After a Bitwarden group admin invites the new member, the invitee’s client accesses a server and obtains a group symmetric key and the group’s public key. The client then encrypts the symmetric key with the group public key and sends it to the server. The resulting ciphertext is what’s used to recover the new user’s account. This data is never integrity-checked when it’s sent from the server to the client during an account enrollment session.

The adversary can exploit this weakness by replacing the group public key with one from a keypair created by the adversary. Since the adversary knows the corresponding private key, it can use it to decrypt the ciphertext and then perform an account recovery on behalf of the targeted user. The result is that the adversary can read and modify the entire contents of the member vault as soon as an invitee accepts an invitation from a family or organization.

Normally, this attack would work only when a group admin has enabled autorecovery mode, which, unlike the manual option, doesn’t require interaction from the member. But since the group policy the client downloads during enrollment isn’t integrity-checked, adversaries can set recovery to auto even if an admin had chosen a manual mode that requires user interaction.

Compounding the severity, the adversary in this attack also obtains a group symmetric key for all other groups the member belongs to since such keys are known to all group members. If any of the additional groups use account recovery, the adversary can obtain the members’ vaults for them, too. “This process can be repeated in a worm-like fashion, infecting all organizations that have key recovery enabled and have overlapping members,” the research paper explained.
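The failure pattern in this enrollment attack, a client encrypting key material under whatever public key the server supplies, can be sketched with a toy model. This sketch uses textbook RSA with small primes purely for illustration; the names and values are invented stand-ins, not Bitwarden's actual protocol or parameters:

```python
# Toy model of an unauthenticated key exchange: the client encrypts its
# symmetric key with whatever "group public key" the server hands it.
# Textbook RSA with tiny primes -- illustration only, NOT real crypto.

def make_keypair(p, q, e=65537):
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)  # public key, private key

def rsa_encrypt(pub, m):
    n, e = pub
    return pow(m, e, n)

def rsa_decrypt(priv, c):
    n, d = priv
    return pow(c, d, n)

# The honest group keypair (what the admin generated)...
group_pub, group_priv = make_keypair(1000003, 1000033)
# ...and a keypair the compromised server generated itself.
adv_pub, adv_priv = make_keypair(1000037, 1000039)

user_symmetric_key = 123456  # stand-in for the member's vault key

# Enrollment: the client asks the server for the group public key and
# encrypts its key with whatever comes back. Nothing checks that the
# key actually belongs to the group, so the server returns adv_pub.
recovery_ciphertext = rsa_encrypt(adv_pub, user_symmetric_key)

# The adversary, holding adv_priv, reads the member's key directly.
stolen = rsa_decrypt(adv_priv, recovery_ciphertext)
assert stolen == user_symmetric_key
```

The fix the researchers' findings point toward is equally simple to state: the client must be able to authenticate the group public key (for example, via a signature chained to something the server cannot forge) before encrypting anything under it.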

A second attack targeting Bitwarden account recovery can be performed when a user rotates vault keys, an option Bitwarden recommends if a user believes their master password has been compromised. When account recovery is on (whether manual or automatic), the user’s client regenerates the recovery ciphertext, which, as described earlier, involves encrypting the new user key with the organization public key obtained from the server. The researchers denote the group public key as pkorg, the public key supplied by the adversary as pkadvorg, the recovery ciphertext as crec, and the user symmetric key as k.

The paper explained:

The key point here is that pkorg is not retrieved from the user’s vault; rather the client performs a sync operation with the server to obtain it. Crucially, the organization data provided by this sync operation is not authenticated in any way. This thus provides the adversary with another opportunity to obtain a victim’s user key, by supplying a new public key pkadvorg, for which they know the skadvorg and setting the account recovery enrollment to true. The client will then send an account recovery ciphertext crec containing the new user key, which the adversary can decrypt to obtain k′.

The third attack on the Bitwarden account recovery allows an adversary to recover a user’s master key. It abuses key connector, a feature primarily used by enterprise customers.

More ways to pilfer vaults

The attack allowing theft of LastPass vaults also targets key escrow, specifically in the Teams and Teams 5 versions, when a member’s master key is reset by a privileged user known as a superadmin. The next time the member logs in through the LastPass browser extension, their client will retrieve the RSA public key assigned to each superadmin in the organization, encrypt their new key with each one, and send the resulting ciphertext to each superadmin.

Because LastPass also fails to authenticate the superadmin keys, an adversary can once again replace the superadmin public key (pkadm) with their own public key (pkadvadm).

“In theory, only users in teams where password reset is enabled and who are selected for reset should be affected by this vulnerability,” the researchers wrote. “In practice, however, LastPass clients query the server at each login and fetch a list of admin keys. They then send the account recovery ciphertexts independently of enrollment status.” The attack, however, requires the user to log in to LastPass with the browser extension, not the standalone client app.

Several attacks allow reading and modification of shared vaults, which allow a user to share selected items with one or more other users. When Dashlane users share an item, their client apps sample a fresh symmetric key, which either directly encrypts the shared item or, when sharing with a group, encrypts group keys, which in turn encrypt the shared item. In either case, the RSA keypair(s)—belonging to either the recipient user or the group—aren’t authenticated. The fresh symmetric key is then encrypted with the corresponding public key(s).

An adversary can substitute a keypair of their own, causing the sharer’s client to encrypt the key material sent to the recipients under the adversary’s public key. The adversary then decrypts that ciphertext with the corresponding secret key to recover the shared symmetric key. With that, the adversary can read and modify all shared items. When sharing is used in either Bitwarden or LastPass, similar attacks are possible and lead to the same consequence.

Another avenue for attackers or adversaries with control of a server is to target the backward compatibility that all three password managers provide to support older, less-secure versions. Despite incremental changes designed to harden the apps against the very attacks described in the paper, all three password managers continue to support the versions without these improvements. This backward compatibility is a deliberate decision intended to prevent users who haven’t upgraded from losing access to their vaults.

The severity of these attacks is lower than that of the previous ones described, with the exception of one, which is possible against Bitwarden. Older versions of the password manager used a single symmetric key to encrypt and decrypt the user key from the server and items inside vaults. This design allowed for the possibility that an adversary could tamper with the contents. To add integrity checks, newer versions provide authenticated encryption by augmenting the symmetric key with an HMAC hash function.

To protect customers using older app versions, Bitwarden ciphertext has an attribute of either 0 or 1. A 0 designates authenticated encryption, while a 1 supports the older unauthenticated scheme. Older versions also use a key hierarchy that Bitwarden deprecated to harden the app. To support the old hierarchy, newer client versions generate a new RSA keypair for the user if the server doesn’t provide one. The newer version will proceed to encrypt the secret key portion with the master key if no user ciphertext is provided by the server.

This design opens Bitwarden to several attacks. The most severe allows reading (but not modification) of all items created after the attack is performed. At a simplified level, it works because the adversary can forge the ciphertext sent by the server and cause the client to use it to derive a user key known to the adversary.

The modification causes the use of CBC (cipher block chaining), a form of encryption that’s vulnerable to several attacks. An adversary can exploit this weaker form using a padding oracle attack and go on to retrieve the plaintext of the vault. Because HMAC protection remains intact, modification isn’t possible.

Surprisingly, Dashlane was vulnerable to a similar padding oracle attack. The researchers devised a complicated attack chain that would allow a malicious server to downgrade a Dashlane user’s vault to CBC and exfiltrate the contents. The researchers estimate that the attack would require about 125 days to decrypt the ciphertext.
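The padding-oracle mechanics can be made concrete with a self-contained toy: CBC mode over a small Feistel cipher (standing in for AES, purely for illustration), a PKCS#7 padding check that leaks only validity, and the classic byte-by-byte recovery loop. None of this is the vendors' actual code, and a real chain like the 125-day Dashlane attack is far more involved, but the principle is the same:

```python
import hashlib
import os

KEY = os.urandom(16)

def _round(half: bytes, r: int) -> bytes:
    # Keyed round function for the toy 16-byte block cipher (not AES).
    return hashlib.sha256(KEY + bytes([r]) + half).digest()[:8]

def block_encrypt(block: bytes) -> bytes:
    l, r = block[:8], block[8:]
    for i in range(4):                      # 4-round Feistel network
        l, r = r, bytes(a ^ b for a, b in zip(l, _round(r, i)))
    return l + r

def block_decrypt(block: bytes) -> bytes:
    l, r = block[:8], block[8:]
    for i in reversed(range(4)):            # run the rounds backward
        l, r = bytes(a ^ b for a, b in zip(r, _round(l, i))), l
    return l + r

def cbc_encrypt(pt: bytes) -> bytes:
    n = 16 - len(pt) % 16                   # PKCS#7 padding
    pt += bytes([n]) * n
    iv = prev = os.urandom(16)
    out = []
    for i in range(0, len(pt), 16):
        prev = block_encrypt(bytes(a ^ b for a, b in zip(pt[i:i + 16], prev)))
        out.append(prev)
    return iv + b"".join(out)

def padding_oracle(ct: bytes) -> bool:
    # All the adversary learns per query: was the padding valid?
    prev, pt = ct[:16], b""
    for i in range(16, len(ct), 16):
        pt += bytes(a ^ b for a, b in zip(block_decrypt(ct[i:i + 16]), prev))
        prev = ct[i:i + 16]
    n = pt[-1]
    return 1 <= n <= 16 and pt[-n:] == bytes([n]) * n

def attack_block(prev: bytes, block: bytes) -> bytes:
    # Recover one plaintext block with at most 256 oracle queries per byte.
    inter = bytearray(16)                   # learned block_decrypt(block)
    for pad in range(1, 17):
        pos = 16 - pad
        for guess in range(256):
            forged = bytearray(os.urandom(16))
            for i in range(pos + 1, 16):
                forged[i] = inter[i] ^ pad  # force trailing bytes to `pad`
            forged[pos] = guess
            if padding_oracle(bytes(forged) + block):
                if pad == 1:                # rule out accidental longer padding
                    forged[pos - 1] ^= 0xFF
                    if not padding_oracle(bytes(forged) + block):
                        continue
                inter[pos] = guess ^ pad
                break
    return bytes(a ^ b for a, b in zip(inter, prev))

ciphertext = cbc_encrypt(b"attack at dawn!!")
recovered = attack_block(ciphertext[:16], ciphertext[16:32])
assert recovered == b"attack at dawn!!"
```

This is also why the intact HMAC matters in the Bitwarden case above: authenticated ciphertexts reject the forged blocks before the padding check ever runs, so only downgraded, unauthenticated ciphertexts are exposed.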

Still other attacks against all three password managers allow adversaries to greatly reduce the selected number of hashing iterations—in the case of Bitwarden and LastPass, from a default of 600,000 to 2. Repeated hashing of master passwords makes them significantly harder to crack in the event of a server breach that allows theft of the hash. For all three password managers, the server sends the specified iteration count to the client, with no mechanism to ensure it meets the default number. The result is a 300,000-fold reduction in the time and resources the adversary needs to crack the hash and obtain the user’s master password.
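A minimal sketch of why the iteration count matters, using Python's standard-library PBKDF2 (the function and parameter names here are our own illustration, not any vendor's real protocol):

```python
import hashlib

def derive_master_hash(password: bytes, salt: bytes, iterations: int) -> bytes:
    # Stand-in for client-side key stretching; the client uses whatever
    # iteration count the server tells it to, with no enforced minimum.
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

salt = b"user@example.com"
honest = derive_master_hash(b"correct horse", salt, 600_000)
downgraded = derive_master_hash(b"correct horse", salt, 2)

# An offline guess against the downgraded hash costs 2 PBKDF2 rounds
# instead of 600,000 -- a 300,000x speedup for a cracking adversary.
assert derive_master_hash(b"correct horse", salt, 2) == downgraded
assert honest != downgraded
assert 600_000 // 2 == 300_000
```

The fix is for the client to refuse any server-supplied count below a hard-coded floor.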

Attacking malleability

Three of the attacks—one against Bitwarden and two against LastPass—target what the researchers call “item-level encryption” or “vault malleability.” Instead of encrypting a vault in a single, monolithic blob, password managers often encrypt individual items, and sometimes individual fields within an item. These items and fields are all encrypted with the same key. The attacks exploit this design to steal passwords from select vault items.

An adversary mounts an attack by replacing the ciphertext in the URL field, which stores the link where a login occurs, with the ciphertext for the password. To enhance usability, password managers provide an icon that helps users visually recognize the site. To do this, the client decrypts the URL field and sends it to the server. The server then fetches the corresponding icon. Because there’s no mechanism to prevent the swapping of item fields, the client decrypts the password instead of the URL and sends it to the server.
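In pseudocode terms, the swap works because every field ciphertext is interchangeable under the shared key. The sketch below is our own illustration (a hash-based stream cipher standing in for any vendor's real scheme) of the mechanics:

```python
import hashlib
import os

# One vault key encrypts every field of every item -- the "item-level
# encryption" design the researchers target.
VAULT_KEY = os.urandom(32)

def _keystream(nonce: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(VAULT_KEY + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_field(plaintext: bytes) -> bytes:
    nonce = os.urandom(12)
    return nonce + bytes(a ^ b for a, b in zip(plaintext, _keystream(nonce, len(plaintext))))

def decrypt_field(blob: bytes) -> bytes:
    nonce, body = blob[:12], blob[12:]
    return bytes(a ^ b for a, b in zip(body, _keystream(nonce, len(body))))

item = {
    "url": encrypt_field(b"https://example.com"),
    "password": encrypt_field(b"hunter2"),
}

# Nothing binds a ciphertext to its field name, so a malicious server can
# return the password ciphertext in the URL slot...
item["url"] = item["password"]

# ...and the client, decrypting "the URL" to fetch a site icon, sends the
# plaintext password straight back to the server.
leaked = decrypt_field(item["url"])
assert leaked == b"hunter2"
```

Binding the field name into each ciphertext (for example, as AEAD associated data) or using distinct per-field keys would make the swapped ciphertext fail to decrypt.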

“That wouldn’t happen if you had different keys for different fields or if you encrypted the entire collection in one pass,” Kenny Paterson, one of the paper co-authors, said. “A crypto audit should spot it, but only if you’re thinking about malicious servers. The server is deviating from expected behavior.”

The following table summarizes the causes and consequences of the 25 attacks they devised:

Credit: Scarlata et al.


A psychological blind spot

The researchers acknowledge that the full compromise of a password manager server is a high bar. But they defend the threat model.

“Attacks on the provider server infrastructure can be prevented by carefully designed operational security measures, but it is well within the bounds of reason to assume that these services are targeted by sophisticated nation-state-level adversaries, for example via software supply-chain attacks or spearphishing,” they wrote. “Moreover, some of the service providers have a history of being breached—for example, LastPass suffered breaches in 2015 and 2022, and another serious security incident in 2021.”

They went on to write: “While none of the breaches we are aware of involved reprogramming the server to make it undertake malicious actions, this goes just one step beyond attacks on password manager service providers that have been documented. Active attacks more broadly have been documented in the wild.”

Part of the challenge of designing password managers, or any end-to-end encryption service, is the tendency to develop a false sense of security when writing the client.

“It’s a psychological problem when you’re writing both client and server software,” Paterson explained. “You should write the client super defensively, but if you’re also writing the server, well of course your server isn’t going to send malformed packets or bad info. Why would you do that?”

Marketing gimmickry or not, “zero-knowledge” is here to stay

In many cases, engineers have already fixed the weaknesses described after receiving private reports from the researchers. Engineers are still patching other vulnerabilities. In statements, Bitwarden, LastPass, and Dashlane representatives noted the high bar of the threat model, despite statements on their websites that assure customers their wares will withstand it. Along with 1Password representatives, they also noted that their products regularly receive stringent security audits and undergo red-team exercises.

A Bitwarden representative wrote:

Bitwarden continually evaluates and improves its software through internal review, third-party assessments, and external research. The ETH Zurich paper analyzes a threat model in which the server itself behaves maliciously and intentionally attempts to manipulate key material and configuration values. That model assumes full server compromise and adversarial behavior beyond standard operating assumptions for cloud services.

LastPass said, “We take a multi‑layered, ongoing approach to security assurance that combines independent oversight, continuous monitoring, and collaboration with the research community. Our cloud security testing is inclusive of the scenarios referenced in the malicious-server threat model outlined in the research.”


A statement from Dashlane read, “Dashlane conducts rigorous internal and external testing to ensure the security of our product. When issues arise, we work quickly to mitigate any possible risk and ensure customers have clarity on the problem, our solution, and any required actions.”

1Password released a statement that read in part:

Our security team reviewed the paper in depth and found no new attack vectors beyond those already documented in our publicly available Security Design White Paper.

We are committed to continually strengthening our security architecture and evaluating it against advanced threat models, including malicious-server scenarios like those described in the research, and evolving it over time to maintain the protections our users rely on.

1Password also says that the zero-knowledge encryption it provides “means that no one but you—not even the company that’s storing the data—can access and decrypt your data. This protects your information even if the server where it’s held is ever breached.” In the company’s white paper linked above, 1Password seems to allow for this possibility when it says:

At present there’s no practical method for a user to verify the public key they’re encrypting data to belongs to their intended recipient. As a consequence it would be possible for a malicious or compromised 1Password server to provide dishonest public keys to the user, and run a successful attack. Under such an attack, it would be possible for the 1Password server to acquire vault encryption keys with little ability for users to detect or prevent it.

1Password’s statement also includes assurances that the service routinely undergoes rigorous security testing.

All four companies defended their use of the term “zero knowledge.” As used in this context, the term can be confused with zero-knowledge proofs, a completely unrelated cryptographic method that allows one party to prove to another party that they know a piece of information without revealing anything about the information itself. An example is a proof that shows a system can determine if someone is over 18 without having any knowledge of the precise birthdate.

The adulterated zero-knowledge term used by password managers appears to have come into being in 2007, when a company called SpiderOak used it to describe its cloud infrastructure for securely sharing sensitive data. Interestingly, SpiderOak formally retired the term a decade later after receiving user pushback.

“Sadly, it is just marketing hype, much like ‘military-grade encryption,’” Matteo Scarlata, lead author of the paper said. “Zero-knowledge seems to mean different things to different people (e.g., LastPass told us that they won’t adopt a malicious server threat model internally). Much unlike ‘end-to-end encryption,’ ‘zero-knowledge encryption’ is an elusive goal, so it’s impossible to tell if a company is doing it right.”

Photo of Dan Goodin

Dan Goodin is Senior Security Editor at Ars Technica, where he oversees coverage of malware, computer espionage, botnets, hardware hacking, encryption, and passwords. In his spare time, he enjoys gardening, cooking, and following the independent music scene. Dan is based in San Francisco. Follow him here on Mastodon and here on Bluesky. Contact him on Signal at DanArs.82.

Password managers’ promise that they can’t see your vaults isn’t always true Read More »

bytedance-backpedals-after-seedance-2.0-turned-hollywood-icons-into-ai-“clip-art”

ByteDance backpedals after Seedance 2.0 turned Hollywood icons into AI “clip art”


Misstep or marketing tactic?

Hollywood backlash puts spotlight on ByteDance’s sketchy launch of Seedance 2.0.

ByteDance says that it’s rushing to add safeguards to block Seedance 2.0 from generating iconic characters and deepfaking celebrities, following substantial Hollywood backlash over the launch of the latest version of its AI video tool.

The changes come after Disney and Paramount Skydance sent cease-and-desist letters to ByteDance urging the Chinese company to promptly end the allegedly vast and blatant infringement.

Studios claimed the infringement was widespread and immediate, with Seedance 2.0 users across social media sharing AI videos featuring copyrighted characters like Spider-Man, Darth Vader, and SpongeBob SquarePants. In its letter, Disney fumed that Seedance was “hijacking” its characters, accusing ByteDance of treating Disney characters like they were “free public domain clip art,” Axios reported.

“ByteDance’s virtual smash-and-grab of Disney’s IP is willful, pervasive, and totally unacceptable,” Disney’s letter said.

Defending intellectual property from franchises like Star Trek and The Godfather, Paramount Skydance pointed out that Seedance’s outputs are “often indistinguishable, both visually and audibly” from the original characters, Variety reported. Similarly frustrated, Japan’s AI minister, Kimi Onoda, sought to protect popular anime and manga characters, officially launching a probe last week into ByteDance over the copyright violations, the South China Morning Post reported.

“We cannot overlook a situation in which content is being used without the copyright holder’s permission,” Onoda said at a press conference Friday.

Facing legal threats and Japan’s investigation, ByteDance issued a statement Monday, CNBC reported. In it, the company claimed that it “respects intellectual property rights” and has “heard the concerns regarding Seedance 2.0.”

“We are taking steps to strengthen current safeguards as we work to prevent the unauthorized use of intellectual property and likeness by users,” ByteDance said.

However, Disney seems unlikely to accept that ByteDance inadvertently released its tool without implementing such safeguards in advance. In its letter, Disney alleged that “Seedance has infringed on Disney’s copyrighted materials to benefit its commercial service without permission.”

After all, what better way to illustrate Seedance 2.0’s latest features than by generating some of the best-known IP in the world? At least one tech consultant has suggested that ByteDance planned to benefit from inciting Hollywood outrage. The founder of San Francisco-based consultancy Tech Buzz China, Rui Ma, told SCMP that “the controversy surrounding Seedance is likely part of ByteDance’s initial distribution strategy to showcase its underlying technical capabilities.”

Seedance 2.0 is an “attack” on creators

Studios aren’t the only ones sounding alarms.

Several industry groups expressed concerns, including the Motion Picture Association, which accused ByteDance of engaging in massive copyright infringement within “a single day,” CNBC reported.

Sean Astin, an actor and president of the actors union, SAG-AFTRA, was directly impacted by the scandal. A video that has since been removed from X showed Astin in the role of Samwise Gamgee from The Lord of the Rings, delivering a line he never said, Variety reported. Condemning Seedance’s infringement, SAG-AFTRA issued a statement emphasizing that ByteDance did not act responsibly in releasing the model without safeguards:

“SAG-AFTRA stands with the studios in condemning the blatant infringement enabled by ByteDance’s new AI video model Seedance 2.0. The infringement includes the unauthorized use of our members’ voices and likenesses. This is unacceptable and undercuts the ability of human talent to earn a livelihood. Seedance 2.0 disregards law, ethics, industry standards and basic principles of consent. Responsible AI development demands responsibility, and that is nonexistent here.”

Echoing that, a group representing Hollywood creators, the Human Artistry Campaign, declared that “the launch of Seedance 2.0” was “an attack on every creator around the world.”

“Stealing human creators’ work in an attempt to replace them with AI generated slop is destructive to our culture: stealing isn’t innovation,” the group said. “These unauthorized deepfakes and voice clones of actors violate the most basic aspects of personal autonomy and should be deeply concerning to everyone. Authorities should use every legal tool at their disposal to stop this wholesale theft.”

Ars could not immediately reach any of these groups to comment on whether ByteDance’s post-launch efforts to add safeguards addressed industry concerns.

MPA chairman and CEO Charles Rivkin has previously accused ByteDance of disregarding “well-established copyright law that protects the rights of creators and underpins millions of American jobs.”

While Disney and other studios are clearly ready to take down any tools that could hurt their revenue or reputation without an agreement in place, they aren’t opposed to all AI uses of their characters. In December, Disney struck a deal with OpenAI, giving Sora access to 200 characters for three years, while investing $1 billion in the technology.

At that time, Disney CEO Robert A. Iger, said that “the rapid advancement of artificial intelligence marks an important moment for our industry, and through this collaboration with OpenAI, we will thoughtfully and responsibly extend the reach of our storytelling through generative AI, while respecting and protecting creators and their works.”

Creators disagree that Seedance 2.0 is a game changer

In a blog announcing Seedance 2.0, ByteDance boasted that the new model “delivers a substantial leap in generation quality,” particularly in close-up shots and action sequences.

The company acknowledged that further refinements were needed and the model is “still far from perfect” but hyped that “its generated videos possess a distinct cinematic aesthetic; the textures of objects, lighting, and composition, as well as costume, makeup, and prop designs, all show high degrees of finish.”

ByteDance likely hoped that the earliest outputs from Seedance 2.0 would generate headlines marveling at the model’s capabilities, and it got what it wanted when a single Hollywood stakeholder’s social media comment went viral.

Shortly after Seedance 2.0’s rollout, Deadpool co-writer Rhett Reese declared on X that “it’s likely over for us,” The Guardian reported. The screenwriter was impressed by an AI video created by Irish director Ruairi Robinson, which realistically depicted Tom Cruise fighting Brad Pitt. “[I]n next to no time, one person is going to be able to sit at a computer and create a movie indistinguishable from what Hollywood now releases,” Reese opined. “True, if that person is no good, it will suck. But if that person possesses Christopher Nolan’s talent and taste (and someone like that will rapidly come along), it will be tremendous.”

However, some AI critics rejected the notion that Seedance 2.0 is capable of replacing artists in the way that Reese warned. On Bluesky and X, they pushed back on ByteDance claims that this model doomed Hollywood, with some accusing outlets of too quickly ascribing Reese’s reaction to the whole industry.

Among them was longtime AI critic Reid Southen, a film concept artist who works on major motion pictures and TV. Responding directly to Reese’s X thread, Southen contradicted the notion that a great filmmaker could be born from fiddling with AI prompts alone.

“Nolan is capable of doing great work because he’s put in the work,” Southen said. “AI is an automation tool, it’s literally removing key, fundamental work from the process, how does one become good at anything if they insist on using nothing but shortcuts?”

Perhaps the strongest evidence in Southen’s favor is Darren Aronofsky’s recent AI-generated historical docudrama. Speaking anonymously to Ars following backlash declaring that “AI slop is ruining American history,” one source close to production on that project confirmed that it took “weeks” to produce minutes of usable video using a variety of AI tools.

That source noted that the creative team went into the project expecting they had a lot to learn but also expecting that tools would continue to evolve, as could audience reactions to AI-assisted movies.

“It’s a huge experiment, really,” the source told Ars.

Notably, for both creators and rights-holders concerned about copyright infringement and career threats, questions remain on how Seedance 2.0 was trained. ByteDance has yet to release a technical report for Seedance 2.0 and “has never disclosed the data sets it uses to train its powerful video-generation Seedance models and image-generation Seedream models,” SCMP reported.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

ByteDance backpedals after Seedance 2.0 turned Hollywood icons into AI “clip art” Read More »

a-fluid-can-store-solar-energy-and-then-release-it-as-heat-months-later

A fluid can store solar energy and then release it as heat months later


Sunlight can cause a molecule to change structure, and then release heat later.

The system works a bit like existing solar water heaters, but with chemical heat storage. Credit: Kypros

Heating accounts for nearly half of the global energy demand, and two-thirds of that is met by burning fossil fuels like natural gas, oil, and coal. Solar energy is a possible alternative, but while we have become reasonably good at storing solar electricity in lithium-ion batteries, we’re not nearly as good at storing heat.

To store heat for days, weeks, or months, you need to trap the energy in the bonds of a molecule that can later release heat on demand. The approach to this particular chemistry problem is called molecular solar thermal (MOST) energy storage. While it has been the next big thing for decades, it never really took off.

In a recent Science paper, a team of researchers from the University of California, Santa Barbara, and UCLA demonstrate a breakthrough that might finally make MOST energy storage effective.

The DNA connection

In the past, MOST energy storage solutions have been plagued by lackluster performance. The molecules either didn’t store enough energy, degraded too quickly, or required toxic solvents that made them impractical. To find a way around these issues, the team led by Han P. Nguyen, a chemist at the University of California, Santa Barbara, drew inspiration from the genetic damage caused by sunburn. The idea was to store energy using a reaction similar to the one that allows UV light to damage DNA.

When you stay out on the beach too long, high-energy ultraviolet light can cause adjacent bases in the DNA (thymine, the T in the genetic code) to link together. This forms a structure known as a (6-4) lesion. When that lesion is exposed to even more UV light, it twists into an even stranger shape called a “Dewar” isomer. In biology, this is rather bad news, as Dewar isomers cause kinks in the DNA’s double-helix spiral that disrupt copying the DNA and can lead to mutations or cancer.

To counter this effect, evolution shaped a specific enzyme called photolyase to hunt (6-4) lesions down and snap them back into their safe, stable forms.

The researchers realized that the Dewar isomer is essentially a molecular battery. This snap-back effect was exactly what Nguyen’s team was looking for, since it releases a lot of heat.

Rechargeable fuel

Molecular batteries, in principle, are extremely good at storing energy. Heating oil, arguably the most popular molecular battery we use for heating, is essentially ancient solar energy stored in chemical bonds. Its energy density stands at around 40 megajoules per kilogram (MJ/kg). To put that in perspective, Li-ion batteries usually pack less than one MJ/kg. One of the problems with heating oil, though, is that it is single-use only—it gets burnt when you use it. What Nguyen and her colleagues aimed to achieve with their DNA-inspired substance is essentially a reusable fuel.

To do that, researchers synthesized a derivative of 2-pyrimidone, a chemical cousin of the thymine found in DNA. They engineered this molecule to reliably fold into a Dewar isomer under sunlight and then unfold on command. The result was a rechargeable fuel that could absorb the energy when exposed to sunlight, release it when needed, and return to a “relaxed” state where it’s ready to be charged up again.

Previous attempts at MOST systems have struggled to compete with Li-ion batteries. Norbornadiene, one of the best-studied candidates, tops out at around 0.97 MJ/kg. Another contender, azaborinine, manages only 0.65 MJ/kg. They may be scientifically interesting, but they are not going to heat your house.

Nguyen’s pyrimidone-based system blew those numbers out of the water. The researchers achieved an energy storage density of 1.65 MJ/kg—nearly double the capacity of Li-ion batteries and substantially higher than any previous MOST material.
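The figures quoted in this and the previous section line up as a quick sanity check (all values in MJ/kg, taken from the article):

```python
# Energy densities quoted in the article, in MJ/kg.
heating_oil = 40.0
li_ion = 1.0           # typical upper bound for Li-ion cells
norbornadiene = 0.97
azaborinine = 0.65
pyrimidone = 1.65      # the new pyrimidone-based MOST material

assert pyrimidone > 1.6 * li_ion           # "nearly double" Li-ion capacity
assert pyrimidone > 1.69 * norbornadiene   # well past the best prior MOST
assert heating_oil / pyrimidone > 24       # fossil fuels are still far denser
```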

Double rings

The reason for this jump in performance was what the team called compounded strain.

When the pyrimidone molecule absorbs light, it doesn’t just fold; it twists into a fused, bicyclic structure containing two different four-membered rings: 1,2-dihydroazete and diazetidine. Four-membered rings are under immense structural tension. By fusing them together, the researchers created a molecule that is desperate to snap back into its relaxed state.

Achieving high energy density on paper is one thing. Making it work in the real world is another. A major failing of previous MOST systems is that they are solids that need to be dissolved in solvents like toluene or acetonitrile to work. Solvents are the enemy of energy density—by diluting your fuel to 10 percent concentration, for example, you effectively cut your energy density by 90 percent. Any solvent used means less fuel.
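The dilution penalty is simple arithmetic: effective energy density scales linearly with fuel concentration, as the 10 percent example above illustrates.

```python
neat_density = 1.65        # MJ/kg for the undiluted fuel (article's figure)
concentration = 0.10       # 10% fuel, 90% solvent by mass

effective_density = neat_density * concentration
loss = 1 - effective_density / neat_density

assert abs(effective_density - 0.165) < 1e-9   # MJ/kg of the diluted mixture
assert abs(loss - 0.90) < 1e-9                 # 90% of capacity wasted on solvent
```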

Nguyen’s team tackled this by designing a version of their molecule that is a liquid at room temperature, so it doesn’t need a solvent. This simplified operations considerably, as the liquid fuel could be pumped through a solar collector to charge it up and store it in a tank.

Unlike many organic molecules that hate water, Nguyen’s system is compatible with aqueous environments. This means if a pipe leaks, you aren’t spewing toxic fluids like toluene around your house. The researchers even demonstrated that the molecule could work in water and that its energy release was intense enough to boil the water.

The MOST-based heating system, the team says in their paper, would circulate this rechargeable fuel through panels on the roof to capture the sun’s light and then store it in the basement tank. The fuel from this tank would later be pumped to a reaction chamber with an acid catalyst that triggers the energy release. Then, through a heat exchanger, this energy would heat up the water in the standard central heating system.

But there’s a catch.

Looking for the leak

The first hurdle is the spectrum of light that puts energy into Nguyen’s fuel. The Sun bathes us in a broad spectrum of light, from infrared to ultraviolet. Ideally, a solar collector should use as much of this as possible, but the pyrimidone molecules only absorb light in the UV-A and UV-B range, around 300-310 nm. That represents about five percent of the total solar spectrum. The vast majority of the Sun’s energy, the visible light and the infrared, passes right through Nguyen’s molecules without charging them.

The second problem is quantum yield. This is a fancy way of asking, “For every 100 photons that hit the molecule, how many actually make it switch to the Dewar isomer state?” For these pyrimidones, the answer is a rather underwhelming number, in the single digits. Low quantum yield means the fluid needs a longer exposure to sunlight to get a full charge.
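The two limits compound. A rough illustration, using the roughly five percent UV fraction from above and a nominal five percent quantum yield as a stand-in for "single digits" (both assumptions, not the paper's exact figures):

```python
uv_fraction = 0.05        # share of the solar spectrum the molecule absorbs
quantum_yield = 0.05      # nominal "single digits" switching efficiency

stored_fraction = uv_fraction * quantum_yield
# Only a fraction of a percent of incident photons end up stored.
assert abs(stored_fraction - 0.0025) < 1e-12
```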

The researchers hypothesize that the molecule has a fast leak, meaning a non-radiative decay path where the excited molecule shakes off the energy as heat immediately instead of twisting into the storage form. Plugging that leak is the next big challenge for the team.

Finally, the team in their experiments used an acid catalyst that was mixed directly into the storage material. The team admits that in a future closed-loop device, this would require a neutralization step—a reaction that eliminates the acidity after the heat is released. Unless the reaction products can be purified away, this will reduce the energy density of the system.

Still, despite the efficiency issues, the stability of Nguyen’s system looks promising.

The MOST storage?

One of the biggest fears with chemical storage is thermal reversion—the fuel spontaneously discharges because it got a little too warm in the storage tank. But the Dewar isomers of the pyrimidones are incredibly stable. The researchers calculated a half-life of up to 481 days at room temperature for some derivatives. This means the fuel could be charged in the heat of July and still hold most of its charge when you need to heat your home in January. The degradation figures also look decent for a MOST energy storage system. The team ran the system through 20 charge-discharge cycles with negligible decay.
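The July-to-January claim checks out against simple exponential decay with the quoted 481-day half-life (our own back-of-the-envelope, assuming roughly 180 days of storage):

```python
half_life_days = 481       # quoted for the most stable derivatives
days_stored = 180          # roughly July to January

remaining = 0.5 ** (days_stored / half_life_days)
# About 77% of the stored energy survives the wait.
assert 0.75 < remaining < 0.80
```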

The problem with separating the acid from the fuel could be solved in a practical system by switching to a different catalyst. The scientists suggest in the paper that in this hypothetical setup, the fuel would flow through an acid-functionalized solid surface to release heat, thus eliminating the need for neutralization afterwards.

Still, we’re rather far from using MOST systems to heat actual homes. To get there, we’re going to need molecules that absorb far more of the light spectrum and convert to the activated state with higher efficiency. We’re just not there yet.

Science, 2026. DOI: 10.1126/science.aec6413

Photo of Jacek Krywko

Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.

A fluid can store solar energy and then release it as heat months later Read More »