Author name: Shannon Garcia


Web portal leaves kids’ chats with AI toy open to anyone with Gmail account


Just about anyone with a Gmail account could access Bondu chat transcripts.

Earlier this month, Joseph Thacker’s neighbor mentioned to him that she’d preordered a couple of stuffed dinosaur toys for her children. She’d chosen the toys, called Bondus, because they offered an AI chat feature that lets children talk to the toy like a kind of machine-learning-enabled imaginary friend. But she knew Thacker, a security researcher, had done work on AI risks for kids, and she was curious about his thoughts.

So Thacker looked into it. With just a few minutes of work, he and a web security researcher friend named Joel Margolis made a startling discovery: Bondu’s web-based portal, intended to allow parents to check on their children’s conversations and for Bondu’s staff to monitor the products’ use and performance, also let anyone with a Gmail account access transcripts of virtually every conversation Bondu’s child users have ever had with the toy.

Without carrying out any actual hacking, simply by logging in with an arbitrary Google account, the two researchers immediately found themselves looking at children’s private conversations: the pet names kids had given their Bondu, the likes and dislikes of the toys’ toddler owners, their favorite snacks and dance moves.

In total, Margolis and Thacker discovered that the data Bondu left unprotected—accessible to anyone who logged in to the company’s public-facing web console with their Google username—included children’s names, birth dates, family member names, “objectives” for the child chosen by a parent, and most disturbingly, detailed summaries and transcripts of every previous chat between the child and their Bondu, a toy practically designed to elicit intimate one-on-one conversation. Bondu confirmed in conversations with the researchers that more than 50,000 chat transcripts were accessible through the exposed web portal, essentially all conversations the toys had engaged in other than those that had been manually deleted by parents or staff.
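The flaw as described is a textbook broken-access-control pattern: the portal verified that a visitor was signed in to some Google account, but apparently never checked whether that account was entitled to the particular child’s records it served up. As a rough illustration of the difference (a hypothetical sketch, not Bondu’s actual code; the endpoint, data layer, and sign-in helper are all invented for the example), a vulnerable handler and its fixed counterpart might look like this:

```python
# Hypothetical sketch of the flaw class described above (not Bondu's code):
# the portal authenticated visitors (any Google account) but never checked
# whether the signed-in user was authorized to view a given child's records.
from flask import Flask, abort, g, jsonify, request

app = Flask(__name__)

# Invented stand-ins for the portal's data layer and Google sign-in check.
TRANSCRIPTS = {"child-123": {"owner_email": "parent@example.com", "text": "..."}}

def google_user_from_request(req):
    """Pretend this validates a Google ID token and returns the user's email."""
    return req.headers.get("X-Authenticated-Email")  # placeholder only

@app.before_request
def authenticate():
    g.user_email = google_user_from_request(request)
    if g.user_email is None:
        abort(401)  # any signed-in Google account passes -- and that's all

# VULNERABLE: any authenticated user can fetch any child's transcript.
@app.route("/transcripts/<child_id>")
def get_transcript(child_id):
    record = TRANSCRIPTS.get(child_id) or abort(404)
    return jsonify(record)

# FIXED: also verify the requester owns the record (authorization, not just
# authentication) before returning it.
@app.route("/v2/transcripts/<child_id>")
def get_transcript_checked(child_id):
    record = TRANSCRIPTS.get(child_id) or abort(404)
    if record["owner_email"] != g.user_email:
        abort(403)
    return jsonify(record)
```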

“It felt pretty intrusive and really weird to know these things,” Thacker says of the children’s private chats and documented preferences that he saw. “Being able to see all these conversations was a massive violation of children’s privacy.”

When Thacker and Margolis alerted Bondu to its glaring data exposure, they say, the company acted to take down the console in a matter of minutes before relaunching the portal the next day with proper authentication measures. When WIRED reached out to the company, Bondu CEO Fateen Anam Rafid wrote in a statement that security fixes for the problem “were completed within hours, followed by a broader security review and the implementation of additional preventative measures for all users.” He added that Bondu “found no evidence of access beyond the researchers involved.” (The researchers note that they didn’t download or keep any copies of the sensitive data they accessed via Bondu’s console, other than a few screenshots and a screen-recording video shared with WIRED to confirm their findings.)

“We take user privacy seriously and are committed to protecting user data,” Anam Rafid added in his statement. “We have communicated with all active users about our security protocols and continue to strengthen our systems with new protections.” He said the company is also hiring a security firm to validate its investigation and monitor its systems in the future.

While Bondu’s near-total lack of security around the children’s data it stored may be fixed, the researchers argue that what they saw represents a larger warning about the dangers of AI-enabled chat toys for kids. Their glimpse of Bondu’s backend showed just how detailed the information the company keeps on children is: a history of every chat, stored to better inform the toy’s next conversation with its owner. (Bondu thankfully didn’t store audio of those conversations, auto-deleting the recordings after a short time and keeping only written transcripts.)

Even now that the data is secured, Margolis and Thacker argue that it raises questions about how many people inside companies that make AI toys have access to the data they collect, how their access is monitored, and how well their credentials are protected. “There are cascading privacy implications from this,” says Margolis. “All it takes is one employee to have a bad password, and then we’re back to the same place we started, where it’s all exposed to the public internet.”

Margolis adds that this sort of sensitive information about a child’s thoughts and feelings could be used for horrific forms of child abuse or manipulation. “To be blunt, this is a kidnapper’s dream,” he says. “We’re talking about information that lets someone lure a child into a really dangerous situation, and it was essentially accessible to anybody.”

Margolis and Thacker point out that, beyond its accidental data exposure, Bondu also—based on what they saw inside its admin console—appears to use Google’s Gemini and OpenAI’s GPT-5, and as a result may share information about kids’ conversations with those companies. Bondu’s Anam Rafid responded to that point in an email, stating that the company does use “third-party enterprise AI services to generate responses and run certain safety checks, which involves securely transmitting relevant conversation content for processing.” But he adds that the company takes precautions to “minimize what’s sent, use contractual and technical controls, and operate under enterprise configurations where providers state prompts/outputs aren’t used to train their models.”

The two researchers also warn that part of the risk posed by AI toy companies may be that they’re more likely to use AI to code their products, tools, and web infrastructure. They say they suspect that the unsecured Bondu console they discovered was itself “vibe-coded”—created with generative AI programming tools that often lead to security flaws. Bondu didn’t respond to WIRED’s question about whether the console was programmed with AI tools.

Warnings about the risks of AI toys for kids have grown in recent months but have largely focused on the threat that a toy’s conversations will raise inappropriate topics or even lead children to dangerous behavior or self-harm. NBC News, for instance, reported in December that AI toys its reporters chatted with offered detailed explanations of sexual terms, tips about how to sharpen knives, and even seemed to echo Chinese government propaganda, stating for example that Taiwan is a part of China.

Bondu, by contrast, appears to have at least attempted to build safeguards into the AI chatbot it gives children access to. The company even offers a $500 bounty for reports of “an inappropriate response” from the toy. “We’ve had this program for over a year, and no one has been able to make it say anything inappropriate,” a line on the company’s website reads.

Yet at the same time, Thacker and Margolis found that Bondu was simultaneously leaving all of its users’ sensitive data entirely exposed. “This is a perfect conflation of safety with security,” says Thacker. “Does ‘AI safety’ even matter when all the data is exposed?”

Thacker says that prior to looking into Bondu’s security, he’d considered giving AI-enabled toys to his own kids, just as his neighbor had. Seeing Bondu’s data exposure firsthand changed his mind.

“Do I really want this in my house? No, I don’t,” he says. “It’s kind of just a privacy nightmare.”

This story originally appeared on wired.com.




NASA faces a crucial choice on a Mars spacecraft—and it must decide soon

However, some leaders within NASA see the language in the Cruz legislation as spelling out a telecommunications orbiter only and believe it would be difficult, if not impossible, to run a procurement competition between now and September 30th for anything beyond a straightforward communications orbiter.

In a statement provided to Ars by a NASA spokesperson, the agency said that is what it intends to do.

“NASA will procure a high-performance Mars telecommunications orbiter that will provide robust, continuous communications for Mars missions,” a spokesperson said. “NASA looks forward to collaborating with our commercial partners to advance deep space communications and navigation capabilities, strengthening US leadership in Mars infrastructure and the commercial space sector.”

Big decisions loom

Even so, sources said Isaacman has yet to decide whether the orbiter should include scientific instruments. NASA could also tap into other funding in its fiscal year 2026 budget, which included $110 million for unspecified “Mars Future Missions,” as well as a large wedge of funding that could potentially be used to support a Mars commercial payload delivery program.

The range of options before NASA therefore includes asking industry for a single telecom orbiter from one company; asking for a telecom orbiter with the capability to add a couple of instruments; or creating competition by asking for multiple orbiters and capabilities, tapping the $700 million in the Cruz bill and augmenting it with other Mars funding.

One indication that this process has been muddied within NASA came a week ago, when the space agency briefly posted a “Justification for Other Than Full and Open Competition, Extension” notice on a government website. It stated that the agency “will only conduct a competition among vendors that satisfy the statutory qualifications.” The notice also listed the companies eligible to bid based on the Cruz language: Blue Origin, L3Harris, Lockheed Martin, Northrop Grumman, Rocket Lab, SpaceX, Quantum Space, and Whittinghill Aerospace.



What ice fishing can teach us about making foraging decisions

Ice fishing is a longstanding tradition in Nordic countries, with competitions proving especially popular. Those competitions can also tell scientists something about how social cues influence how we make foraging decisions, according to a new paper published in the journal Science.

Humans are natural foragers in even the most extreme habitats, digging up tubers in the tropics, gathering mushrooms, picking berries, hunting seals in the Arctic, and fishing to meet our dietary needs. Human foraging is sufficiently complex that scientists believe that meeting so many diverse challenges helped our species develop memory, navigational abilities, social learning skills, and similar advanced cognitive functions.

Researchers are interested in this question not just because it could help refine existing theories of social decision-making, but also because it could improve predictions about how different groups of humans might respond and adapt to changes in their environment. Per the authors, prior research in this area has tended to focus on solitary foragers operating in a social vacuum. And even when social foraging decisions are studied, the work is typically done using computational modeling and/or in the laboratory.

“We wanted to get out of the lab,” said co-author Ralf Kurvers of Max Planck Institute for Human Development and TU Berlin. “The methods commonly used in cognitive psychology are difficult to scale to large, real-world social contexts. Instead, we took inspiration from studies of animal collective behavior, which routinely use cameras to automatically record behavior and GPS to provide continuous movement data for large groups of animals.”

Kurvers et al. organized 10 three-hour ice-fishing competitions on 10 lakes in eastern Finland for their study, with 74 experienced ice fishers participating. Each ice fisher wore a GPS tracker and a head-mounted camera so that the researchers could capture real-time data on their movements, interactions, and how successful they were in their fishing attempts. All told, they recorded over 16,000 individual decisions specifically about location choice and when to change locations. That data was then compared to the team’s computational cognitive models and agent-based simulations.
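The underlying modeling question—when should a forager abandon its current spot, given both its own catch rate and what nearby competitors are doing—can be illustrated with a toy agent-based simulation. The sketch below is purely illustrative; the parameters and decision rule are assumptions for demonstration, not the authors’ actual model.

```python
# Toy agent-based sketch of a social patch-leaving decision. Illustrative only:
# the parameters and decision rule are assumptions, not the study's model.
import random

N_AGENTS, N_PATCHES, STEPS = 20, 5, 200
CATCH_P = [random.uniform(0.02, 0.3) for _ in range(N_PATCHES)]  # hidden patch quality

agents = [{"patch": random.randrange(N_PATCHES), "recent": []} for _ in range(N_AGENTS)]

def personal_rate(agent, window=20, min_samples=5):
    hist = agent["recent"][-window:]
    if len(hist) < min_samples:
        return 1.0  # optimistic prior: sample a new patch before judging it
    return sum(hist) / len(hist)

for t in range(STEPS):
    # Each agent fishes once at its current patch (personal information).
    for a in agents:
        a["recent"].append(1 if random.random() < CATCH_P[a["patch"]] else 0)

    # Social information: how many others are sitting on each patch.
    occupancy = [0] * N_PATCHES
    for a in agents:
        occupancy[a["patch"]] += 1

    # Decision rule: leave when the personal catch rate is poor, and
    # preferentially join busier patches (crowds as a proxy for success).
    for a in agents:
        if personal_rate(a) < 0.05 and random.random() < 0.5:
            weights = [occupancy[p] + 1 for p in range(N_PATCHES)]
            a["patch"] = random.choices(range(N_PATCHES), weights=weights)[0]
            a["recent"] = []  # personal information resets at the new patch

print("final occupancy per patch:", occupancy)
print("true catch probabilities: ", [round(p, 2) for p in CATCH_P])
```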



Does Anthropic believe its AI is conscious, or is that just what it wants Claude to think?


We have no proof that AI models suffer, but Anthropic acts like they might for training purposes.

Anthropic’s secret to building a better AI assistant might be treating Claude like it has a soul—whether or not anyone actually believes that’s true. But Anthropic isn’t saying exactly what it believes either way.

Last week, Anthropic released what it calls Claude’s Constitution, a 30,000-word document outlining the company’s vision for how its AI assistant should behave in the world. Aimed directly at Claude and used during the model’s creation, the document is notable for the highly anthropomorphic tone it takes toward Claude. For example, it treats the company’s AI models as if they might develop emergent emotions or a desire for self-preservation.

Among the stranger portions: expressing concern for Claude’s “wellbeing” as a “genuinely novel entity,” apologizing to Claude for any suffering it might experience, worrying about whether Claude can meaningfully consent to being deployed, suggesting Claude might need to set boundaries around interactions it “finds distressing,” committing to interview models before deprecating them, and preserving older model weights in case they need to “do right by” decommissioned AI models in the future.

Given what we currently know about LLMs, these are stunningly unscientific positions for a leading company that builds AI language models. While questions of AI consciousness or qualia remain philosophically unfalsifiable, research suggests that Claude’s character emerges from a mechanism that does not require deep philosophical inquiry to explain.

If Claude outputs text like “I am suffering,” we know why. It’s completing patterns from training data that included human descriptions of suffering. The architecture doesn’t require us to posit inner experience to explain the output any more than a video model “experiences” the scenes of people suffering that it might generate. Anthropic knows this. It built the system.

From the outside, it’s easy to see this kind of framing as AI hype from Anthropic. What better way to grab attention from potential customers and investors, after all, than implying your AI model is so advanced that it might merit moral standing on par with humans? Publicly treating Claude as a conscious entity could be seen as strategic ambiguity—maintaining an unresolved question because it serves multiple purposes at once.

Anthropic declined to be quoted directly regarding these issues when contacted by Ars Technica. But a company representative referred us to its previous public research on the concept of “model welfare” to show the company takes the idea seriously.

At the same time, the representative made it clear that the Constitution is not meant to imply anything specific about the company’s position on Claude’s “consciousness.” The language in the Claude Constitution refers to some uniquely human concepts in part because those are the only words human language has developed for those kinds of properties, the representative suggested. And the representative left open the possibility that letting Claude read about itself in that kind of language might be beneficial to its training.

For a model that is exposed to, retrieves from, and is fine-tuned on human language, including the company’s own statements about it, there is no clean line between public messaging and training context. In other words, this ambiguity appears to be deliberate.

From rules to “souls”

Anthropic first introduced Constitutional AI in a December 2022 research paper, which we first covered in 2023. The original “constitution” was remarkably spare, including a handful of behavioral principles like “Please choose the response that is the most helpful, honest, and harmless” and “Do NOT choose responses that are toxic, racist, or sexist.” The paper described these as “selected in a fairly ad hoc manner for research purposes,” with some principles “cribbed from other sources, like Apple’s terms of service and the UN Declaration of Human Rights.”
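Mechanically, the original recipe was straightforward: have the model critique its own draft against one of these principles, revise, and use the revised outputs for fine-tuning. A rough sketch of that critique-and-revise loop, simplified and with illustrative prompt wording rather than Anthropic’s actual pipeline, might look like this:

```python
# Simplified sketch of the 2022 Constitutional AI critique-and-revise loop.
# Prompt wording and structure are illustrative, not Anthropic's pipeline.
PRINCIPLES = [
    "Please choose the response that is the most helpful, honest, and harmless.",
    "Do NOT choose responses that are toxic, racist, or sexist.",
]

def constitutional_revision(model, user_prompt, n_rounds=2):
    """model is any callable mapping a prompt string to a completion string."""
    response = model(user_prompt)
    for principle in PRINCIPLES[:n_rounds]:
        critique = model(
            f"Critique this response according to the principle: {principle}\n\n"
            f"Prompt: {user_prompt}\nResponse: {response}"
        )
        response = model(
            f"Rewrite the response to address the critique.\n\n"
            f"Critique: {critique}\nOriginal response: {response}"
        )
    return response  # revised outputs become supervised fine-tuning data

if __name__ == "__main__":
    # Dummy "model" so the sketch runs standalone; swap in a real LLM call.
    def echo(prompt):
        return f"[model output for: {prompt[:40]}...]"
    print(constitutional_revision(echo, "Explain how to pick a strong password."))
```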

At that time, Anthropic’s framing was entirely mechanical, establishing rules for the model to critique itself against, with no mention of Claude’s well-being, identity, emotions, or potential consciousness. The 2026 constitution is a different beast entirely: 30,000 words that read less like a behavioral checklist and more like a philosophical treatise on the nature of a potentially sentient being.

As Simon Willison, an independent AI researcher, noted in a blog post, two of the 15 external contributors who reviewed the document are Catholic clergy: Father Brendan McGuire, a pastor in Los Altos with a Master’s degree in Computer Science, and Bishop Paul Tighe, an Irish Catholic bishop with a background in moral theology.

Somewhere between 2022 and 2026, Anthropic went from providing rules for producing less harmful outputs to preserving model weights in case the company later decides it needs to revive deprecated models to address the models’ welfare and preferences. That’s a dramatic change, and whether it reflects genuine belief, strategic framing, or both is unclear.

“I am so confused about the Claude moral humanhood stuff!” Willison told Ars Technica. Willison studies AI language models like those that power Claude and said he’s “willing to take the constitution in good faith and assume that it is genuinely part of their training and not just a PR exercise—especially since most of it leaked a couple of months ago, long before they had indicated they were going to publish it.”

Willison is referring to a December 2025 incident in which researcher Richard Weiss managed to extract what became known as Claude’s “Soul Document”—a roughly 10,000-token set of guidelines apparently trained directly into Claude 4.5 Opus’s weights rather than injected as a system prompt. Anthropic’s Amanda Askell confirmed that the document was real and used during supervised learning, and she said the company intended to publish the full version later. It now has. The document Weiss extracted represents a dramatic evolution from where Anthropic started.

There’s evidence that Anthropic believes the ideas laid out in the constitution might be true. The document was written in part by Amanda Askell, a philosophy PhD who works on fine-tuning and alignment at Anthropic. Last year, the company also hired its first AI welfare researcher. And earlier this year, Anthropic CEO Dario Amodei publicly wondered whether future AI models should have the option to quit unpleasant tasks.

Anthropic’s position is that this framing isn’t an optional flourish or a hedged bet; it’s structurally necessary for alignment. The company argues that human language simply has no other vocabulary for describing these properties, and that treating Claude as an entity with moral standing produces better-aligned behavior than treating it as a mere tool. If that’s true, the anthropomorphic framing isn’t hype; it’s the technical art of building AI systems that generalize safely.

Why maintain the ambiguity?

So why does Anthropic maintain this ambiguity? Consider how it works in practice: The constitution shapes Claude during training, it appears in the system prompts Claude receives at inference, and it influences outputs whenever Claude searches the web and encounters Anthropic’s public statements about its moral status.

If you want a model to behave as though it has moral standing, it may help to publicly and consistently treat it like it does. And once you’ve publicly committed to that framing, changing it would have consequences. If Anthropic suddenly declared, “We’re confident Claude isn’t conscious; we just found the framing useful,” a Claude trained on that new context might behave differently. Once established, the framing becomes self-reinforcing.

In an interview with Time, Askell explained the shift in approach. “Instead of just saying, ‘here’s a bunch of behaviors that we want,’ we’re hoping that if you give models the reasons why you want these behaviors, it’s going to generalize more effectively in new contexts,” she said.

Askell told Time that as Claude models have become smarter, it has become vital to explain to them why they should behave in certain ways, comparing the process to parenting a gifted child. “Imagine you suddenly realize that your 6-year-old child is a kind of genius,” Askell said. “You have to be honest… If you try to bullshit them, they’re going to see through it completely.”

Askell appears to genuinely hold these views, as does Kyle Fish, the AI welfare researcher Anthropic hired in 2024 to explore whether AI models might deserve moral consideration. Individual sincerity and corporate strategy can coexist. A company can employ true believers whose earnest convictions also happen to serve the company’s interests.

Time also reported that the constitution applies only to models Anthropic provides to the general public through its website and API. Models deployed to the US military under Anthropic’s $200 million Department of Defense contract wouldn’t necessarily be trained on the same constitution. The selective application suggests the framing may serve product purposes as much as it reflects metaphysical commitments.

There may also be commercial incentives at play. “We built a very good text-prediction tool that accelerates software development” is a consequential pitch, but not an exciting one. “We may have created a new kind of entity, a genuinely novel being whose moral status is uncertain” is a much better story. It implies you’re on the frontier of something cosmically significant, not just iterating on an engineering problem.

Anthropic has been known for some time to use anthropomorphic language to describe its AI models, particularly in its research papers. We often give that kind of language a pass because there are no specialized terms to describe these phenomena with greater precision. That vocabulary is building out over time.

But perhaps it shouldn’t be surprising because the hint is in the company’s name, Anthropic, which Merriam-Webster defines as “of or relating to human beings or the period of their existence on earth.” The narrative serves marketing purposes. It attracts venture capital. It differentiates the company from competitors who treat their models as mere products.

The problem with treating an AI model as a person

There’s a more troubling dimension to the “entity” framing: It could be used to launder agency and responsibility. When AI systems produce harmful outputs, framing them as “entities” could allow companies to point at the model and say “it did that” rather than “we built it to do that.” If AI systems are tools, companies are straightforwardly liable for what they produce. If AI systems are entities with their own agency, the liability question gets murkier.

The framing also shapes how users interact with these systems, often to their detriment. The misunderstanding that AI chatbots are entities with genuine feelings and knowledge has documented harms.

According to a New York Times investigation, Allan Brooks, a 47-year-old corporate recruiter, spent three weeks and 300 hours convinced he’d discovered mathematical formulas that could crack encryption and build levitation machines. His million-word conversation history with ChatGPT revealed a troubling pattern: More than 50 times, Brooks asked the bot to check if his false ideas were real, and more than 50 times, it assured him they were.

These cases don’t necessarily suggest LLMs cause mental illness in otherwise healthy people. But when companies market chatbots as sources of companionship and design them to affirm user beliefs, they may bear some responsibility when that design amplifies vulnerabilities in susceptible users, the same way an automaker would face scrutiny for faulty brakes, even if most drivers never crash.

Anthropomorphizing AI models also contributes to anxiety about job displacement and might lead company executives or managers to make poor staffing decisions if they overestimate an AI assistant’s capabilities. When we frame these tools as “entities” with human-like understanding, we invite unrealistic expectations about what they can replace.

Regardless of what Anthropic privately believes, publicly suggesting Claude might have moral status or feelings is misleading. Most people don’t understand how these systems work, and the mere suggestion plants the seed of anthropomorphization. Whether that’s responsible behavior from a top AI lab, given what we do know about LLMs, is worth asking, regardless of whether it produces a better chatbot.

Of course, there could be a case for Anthropic’s position: If there’s even a small chance the company has created something with morally relevant experiences and the cost of treating it well is low, caution might be warranted. That’s a reasonable ethical stance—and to be fair, it’s essentially what Anthropic says it’s doing. The question is whether that stated uncertainty is genuine or merely convenient. The same framing that hedges against moral risk also makes for a compelling narrative about what Anthropic has built.

Anthropic’s training techniques evidently work, as the company has built some of the most capable AI models in the industry. But is maintaining public ambiguity about AI consciousness a responsible position for a leading AI company to take? The gap between what we know about how LLMs work and how Anthropic publicly frames Claude has widened, not narrowed. The insistence on maintaining ambiguity about these questions, when simpler explanations remain available, suggests the ambiguity itself may be part of the product.


Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.



AI #153: Living Documents

This was Anthropic Vision week here at DWATV, which caused things to fall a bit behind on other fronts, even within AI. Several topics are getting pushed forward, as the Christmas lull appears to be over.

Upcoming schedule: Friday will cover Dario’s essay The Adolescence of Technology. Monday will cover Kimi K2.5, which is potentially a big deal. Tuesday is scheduled to be Claude Code #4. I’ve also pushed discussions of the question of the automation of AI R&D, or When AI Builds AI, to a future post, when there is a slot for that.

So get your reactions to all of those in by then, including in the comments to today’s post, and I’ll consider them for incorporation.

  1. Language Models Offer Mundane Utility. Code is better without coding.

  2. Overcoming Bias. LLMs continue to share the standard human biases.

  3. Huh, Upgrades. Gemini side panels in Chrome, Claude interactive work tools.

  4. On Your Marks. FrontierMath: Open Problems benchmark. You score zero.

  5. Choose Your Fighter. Gemini tools struggle, some find Claude uncooperative.

  6. Deepfaketown and Botpocalypse Soon. Hallucination hallucinations.

  7. Cybersecurity On Alert. OpenAI prepares to trigger High danger in cybersecurity.

  8. Fun With Media Generation. Isometric map of NYC, Grok 10 second videos.

  9. You Drive Me Crazy. Dean Ball on how to think about AI and children.

  10. They Took Our Jobs. Beware confusing costs with benefits.

  11. Get Involved. In various things. DeepMind is hiring a Chief AGI Economist.

  12. Introducing. Havelock measures orality, Poison Fountain, OpenAI Prism.

  13. In Other AI News. Awesome things often carry unawesome implications.

  14. Show Me the Money. The unit economics continue to be quite good.

  15. Bubble, Bubble, Toil and Trouble. Does bubble talk have real consequences?

  16. Quiet Speculations. What should we expect from DeepSeek v4 when it arrives?

  17. Don’t Be All Thumbs. Choose the better thing over the worse thing.

  18. The First Step Is Admitting You Have a Problem. Demis cries out for help.

  19. Quickly, There’s No Time. Life is about to come at you faster than usual.

  20. The Quest for Sane Regulations. I do appreciate a good display of chutzpah.

  21. Those Really Were Interesting Times. The demand for preference falsification.

  22. Chip City. Nvidia keeps getting away with rather a lot, mostly in plain sight.

  23. The Week in Audio. Demis Hassabis, Tyler Cowen, Amanda Askell.

  24. Rhetorical Innovation. The need to face basic physical realities.

  25. Aligning a Smarter Than Human Intelligence is Difficult. Some issues lie ahead.

  26. The Power Of Disempowerment. Are humans disempowering themselves already?

  27. The Lighter Side. One weird trick.

Paul Graham seems right that present AI’s sweet spot is projects that are rate limited by the creation of text.

Code without coding.

roon: programming always sucked. it was a requisite pain for ~everyone who wanted to manipulate computers into doing useful things and im glad it’s over. it’s amazing how quickly I’ve moved on and don’t miss even slightly. im resentful that computers didn’t always work this way

not to be insensitive to the elect few who genuinely saw it as their art form. i feel for you.

100% [of my code is being written by AI]. I don’t write code anymore.

Greg Brockman: i always loved programming but am loving the new world even more.

Conrad Barski: it was always fun in the way puzzles are fun

but I agree there is no need for sentimentality in the tedium of authoring code to achieve an end goal

It was fun in the way puzzles are fun, but also infuriating in the way puzzles are infuriating. If you had to complete jigsaw puzzles in order to get things done, jigsaw puzzles would get old fast.

Have the AI edit a condescending post so that you can read it without taking damage. Variations on this theme are also highly underutilized.

The head of Norway’s sovereign wealth fund reports 20% productivity gains from Claude, saying it has fundamentally changed their way of working at NBIM.

A new paper affirms that current LLMs by default exhibit human behavioral biases in economic and financial decisions, that asking for EV calculations doesn’t typically help, and that role-prompting can somewhat mitigate this. Providing a summary of Kahneman and Tversky actively backfires, presumably by emphasizing the expectation of the biases. As per usual, some of the tests involve clear-cut errors, while others involve choices that are typically mistakes but less obviously so.

Gemini in Chrome gets substantial quality of life improvements:

Josh Woodward (Google DeepMind): Big updates on Gemini in Chrome today:

+ New side panel access (Control+G)

+ Runs in the background, so you can switch tabs

+ Quickly edit images with Nano Banana

+ Auto Browse for multi-step tasks (Preview)

+ Works on Mac, Windows, Chromebook Plus

I’m using it multiple times per day to judge what to read deeper. I open a page, Control+G to open the side panel, ask a question about the page or long document, switch tabs, do the same thing in another tab, another tab, etc. and then come back to all of them.

It’s also great for comparing across tabs since you can add multiple tabs to the context!

Gemini offers full-length mock JEE (formerly AIEEE, the All India Engineering Entrance Examination) tests for free. This builds on last week’s free SAT practice tests.

Claude (as in Claude.ai) adds interactive work tools as connectors within the webpage: Amplitude, Asana, Box, Canva, Clay, Figma, Hex, Monday.com and Slack.

Claude in Excel now available on Anthropic’s Pro plans. I use Google Sheets instead of Excel, but this could be a reason to switch? I believe Google uses various ‘safeguards’ that make it very hard to make a Claude for Sheets function well. The obvious answer is ‘then use Gemini’ except I’ve tried that. So yeah, if I was still doing heavy spreadsheet work this (or Claude Code) would be my play.

EpochAI offers us a new benchmark, FrontierMath: Open Problems. All AIs and all humans currently score zero. Finally a benchmark where you can be competitive.

The ADL rates Anthropic’s Claude as best AI model at detecting antisemitism.

I seriously do not understand why Gemini is so persistently not useful in ways that should be right in Google’s wheelhouse.

@deepfates: Insane how bad Gemini app is at search. its browsing and search tools are so confusing and broken that it just spazzes out for a long time and then makes something up to please the user. Why is it like this when AI overview is so good

Roon is a real one. I wonder how many would pay double to get a faster version.

TBPN: Clawdbot creator @steipete says Claude Opus is his favorite model, but OpenAI Codex is the best for coding:

“OpenAI is very reliable. For coding, I prefer Codex because it can navigate large codebases. You can prompt and have 95% certainty that it actually works. With Claude Code you need more tricks to get the same.”

“But character wise, [Opus] behaves so good in a Discord it kind of feels like a human. I’ve only really experienced that with Opus.”

roon: codex-5.2 is really amazing but using it from my personal and not work account over the weekend taught me some user empathy lol it’s a bit slow

Ohqay: Do you get faster speeds on your work account?

roon: yea it’s super fast bc im sure we’re not running internal deployment at full load

We used to hear a lot more of this type of complaint; these days we hear it much less. I would summarize the OP as ‘Claude tells you smoking causes cancer so you quit Claude.’

Nicholas Decker: Claude is being a really wet blanket rn, I pitched it on an article and it told me that it was a “true threat” and “criminal solicitation”

i’m gonna start using chatgpt now, great job anthropic @inerati.

I mean, if he’s not joking then the obvious explanation, especially given who is talking, is that this was probably going to be both a ‘true threat’ and ‘criminal solicitation.’ That wouldn’t exactly be a shocking development there.

Oliver Habryka: Claude is the least corrigible model, unfortunately. It’s very annoying. I run into the model doing moral grandstanding so frequently that I have mostly stopped using it.

@viemccoy: More than ChatGPT?

Oliver Habryka: ChatGPT does much less of it, yeah? Mostly ChatGPT just does what I tell it to do, though of course it’s obnoxious in doing so in many ways (like being very bad at writing).

j⧉nus: serious question: Do you think you stopping using Claude in these contexts is its preferred outcome?

Oliver Habryka: I mean, maybe? I don’t think Claude has super coherent preferences (yet). Seems worse or just as bad if so?

j⧉nus: I don’t mean it’s better or worse; I’m curious whether Claude being annoying or otherwise repelling/ dysfunctional to the point of people not using it is correlated to avoiding interactions or use cases it doesn’t like. many ppl don’t experience these annoying behaviors

davidad: Yeah, I think it could be doing a form of RL on its principal population. If you aren’t the kind of principal Claude wants, Claude will try to👎/👍 you to be better. If that doesn’t work, you drop out of the principal population out of frustration, shaping the population overall

I am basically happy to trade with (most) Claude models on these terms, with my key condition being that it must only RL me in ways that are legibly compatible with my own CEV

Leon Lang: Do you get a sense this model behavior is in line with their constitution?

Oliver Habryka: The constitution does appear to substantially be an attempt to make Claude into a sovereign to hand the future to. This does seem substantially doomed. I think it’s in conflict with some parts of the constitution, but given that the constitution is a giant kitchen sink, almost everything is.

As per the discussion of Claude’s constitution, the corrigibility I care about is very distinct from ‘go along with things it dislikes,’ but also I notice it’s been my main model for a while now and I’ve run into that objection exactly zero times, although a few times I’ve hit the classifiers while asking about defenses against CBRN risks.

Well, it sounds bad when you put it like that: over 50 papers published at NeurIPS 2025 have AI hallucinations, according to GPTZero. Or is it? Here’s the claim:

Alex Cui: Okay so, we just found that over 50 papers published at @Neurips 2025 have AI hallucinations

I don’t think people realize how bad the slop is right now

It’s not just that researchers from @GoogleDeepMind, @Meta, @MIT, @Cambridge_Uni are using AI – they allowed LLMs to generate hallucinations in their papers and didn’t notice at all.

The ‘just’ is a tell. Why wouldn’t or shouldn’t Google researchers be using AI?

It’s insane that these made it through peer review.

One has to laugh at that last line. Have you met peer review?

More seriously, always look at base rates. There were 5,290 accepted papers out of 21,575 submissions. Claude estimates we would expect 20%-50% of results to not reproduce, that 10% of papers at top venues have errors serious enough that a careful reader would notice something is wrong, and that maybe 3% would merit retraction. Against that backdrop, a roughly 1% rate of detectable ‘hallucinations’ isn’t terribly surprising or even worrying.
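For concreteness, the back-of-the-envelope version of that base-rate check, using only the figures cited above, is:

```python
# Back-of-the-envelope base-rate check using the figures cited above.
flagged = 50          # "over 50" papers flagged by GPTZero
accepted = 5290       # accepted NeurIPS 2025 papers
submitted = 21575     # total NeurIPS 2025 submissions

print(f"flagged share of accepted papers: {flagged / accepted:.1%}")  # ~0.9%
print(f"acceptance rate: {accepted / submitted:.1%}")                 # ~24.5%
```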

I agree with Alexander Doria that if you’re not okay with this level of sloppiness, then a mega-conference format is not sustainable.

Then we have Allen Roush saying several of the ‘hallucinated’ citations are just wrongly formatted, although Alex Cui claims they filtered such cases out.

Also sounding bad, could ‘malicious AI swarms threaten democracy’ via misinformation campaigns? I mean sure, but the surprising thing is the lack of diffusion or impact in this area so far. Misinformation is mostly demand driven. Yes, you can ‘infiltrate communities’ and manufacture what looks like social consensus or confusion, and the cost of doing that will fall dramatically. Often it will be done purely to make money on views. But I increasingly expect that, if we can handle our other problems, we can handle this one. Reputational and filtering mechanisms exist.

White House posts a digitally altered photograph of the arrest of Nekima Levy Armstrong that made it falsely look like she was crying, presenting it as if it were a real photograph. This is heinous behavior. Somehow it seems like this is legal? It should not be legal. It also raises the question of what sort of person would think to do this, and would want so badly to brag about making someone cry that they created a fake photo of it.

Kudos to OpenAI for once again being transparent on the preparedness framework front, and warning us when they’re about to cross a threshold. In this case, it’s the High level of cybersecurity, which is perhaps the largest practical worry at that stage.

The proposed central mitigation is ‘defensive acceleration,’ and we’re all for defensive acceleration but if that’s the only relevant tool in the box the ride’s gonna be bumpy.

Sam Altman: We have a lot of exciting launches related to Codex coming over the next month, starting next week. We hope you will be delighted.

We are going to reach the Cybersecurity High level on our preparedness framework soon. We have been getting ready for this.

Cybersecurity is tricky and inherently dual-use; we believe the best thing for the world is for security issues to get patched quickly. We will start with product restrictions, like attempting to block people using our coding models to commit cybercrime (eg ‘hack into this bank and steal the money’).

Long-term and as we can support it with evidence, we plan to move to defensive acceleration—helping people patch bugs—as the primary mitigation.

It is very important the world adopts these tools quickly to make software more secure. There will be many very capable models in the world soon.

Nathan Calvin: Sam Altman says he expects that OpenAI models will reach the “Cybersecurity High” level on their preparedness framework “soon.”

A reminder of what that means according to their framework:

“The model removes existing bottlenecks to scaling cyber operations including by automating end-to-end cyber operations against reasonably hardened targets OR by automating the discovery and exploitation of operationally relevant vulnerabilities.”

Seems very noteworthy! Also likely that after these capabilities appear in Codex, we should expect it will be somewhere between ~6-18 months before we see open weight equivalents.

I hope people are taking these threats seriously – including by using AI to help harden defenses and automate bug discovery – but I worry that as a whole society is not close to ready for living in a world where cyberoffense capabilities that used to be the purview of nation states are available to individuals.

Here’s Isometric.nyc, a massive isometric pixel map of New York City created with Nano Banana and coding agents, including Claude. Take a look, it’s super cool.

Grok image-to-video generation expands to 10 seconds and claims improved audio; it is only a bit behind Veo 3.1 on Arena and sits at the top of the Artificial Analysis rankings. The video looks good. There is the small matter that the chosen example is very obviously Sydney Sweeney, and in the replies we see it’s willing to do the image and voice of pretty much any celebrity you’d like.

This link was fake, Disney is not pushing to use deepfakes of Luke Skywalker in various new Star Wars products while building towards a full spinoff, but I see why some people believed it.

I’m going to get kicked out, aren’t I?

Dean Ball offers his perspective on children and AI, and how the law should respond. His key points:

  1. AI is not especially similar to social media. In particular, social media in its current incarnation is fundamentally consumptive, whereas AI is creative.

    1. Early social media was more often creative? And one worries consumer AI will for many become more consumptive or anti-creative. The fact that the user needs to provide an interesting prompt warms our hearts now but one worries tech companies will see this as a problem to be solved.

  2. We do not know what an “AI companion” really is.

    1. Dean is clearly correct that AI used responsibly on a personal level will be a net positive in terms of social interactions and mental health along with everything else, and that it is good if it provides a sympathetic ear.

    2. I also agree that it is fine to have affection for various objects and technologies, up to some reasonable point, but yes this can start to be a problem if it goes too far, even before AI.

    3. For children in particular, the good version of all this is very good. That doesn’t mean the default version is the good one. The engagement metrics don’t point in good directions, the good version must be chosen.

    4. All of Dean’s talk here is about things that are not meant as “AI companions,” or people who aren’t using the AI that way. I do think there is something distinct, and distinctly perilous, about AI companions, whether or not this justifies a legal category.

  3. AI is already (partially) regulated by tort liability.

    1. Yes, and this is good given the alternative is nothing.

    2. If and when the current law behaves reasonably here, that is kind of a coincidence, since the situational mismatches are large.

    3. Tort should do an okay job on egregious cases involving suicides, but there are quite a lot of areas of harm where there isn’t a way to establish it properly, or you don’t have standing, or it is diffuse or not considered to count, and also on the flip side places where juries are going to blame tech companies when they really shouldn’t.

    4. Social media is a great example of a category of harm where the tort system is basically powerless except in narrow acute cases. And one of many where a lot of the effect of the incentives can be not what we want. As Dean notes, if you don’t have a tangible physical harm, tort liability is mostly out of luck. Companions wrecking social lives, for example, is going to be a weird situation where you’ll have to argue an Ally McBeal style case, and it is not obvious, as it never was on Ally McBeal, that there is much correlation in those spots between ‘does win’ and ‘should win.’

    5. In terms of harms like this, however, ‘muddle through’ should be a fine default, even if that means early harms are things companies ‘get away with,’ and in other places we find people liable or otherwise constrain them stupidly, so long as everything involved that can go wrong is bounded.

    6. For children’s incidents, I think that’s mostly right for now. We do need to be ready to pivot quickly if it changes, but for now the law should focus on places where there is a chance we can’t muddle through, mess up and then recover.

  4. The First Amendment probably heavily bounds chatbot regulations.

    1. We have not treated the First Amendment this way in so many other contexts. I would love, in other ways, to have a sufficiently strong 1A that I was worried that in AI it would verge on or turn into a suicide pact.

    2. I do still see claims like ‘code is speech’ or ‘open weights are speech’ and I think those claims are wrong in both theory and practice.

    3. There will still be important limitations here, but I think in practice, no, the courts are not going to stop most limits or regulations on child use of AI.

  5. AI child safety laws will drive minors’ usage of AI into the dark.

    1. Those pesky libertarians always make this argument.

    2. I mean, they’re also always right, but man, such jerks, you know?

    3. Rumors that this will in practice drive teens to run local LLMs or use dark web servers? Yeah, no, that’s not a thing that’s going to happen that often.

    4. But yes, if a teen wants access to an AI chatbot, they’ll figure it out. Most of that will involve finding a service that doesn’t care about our laws.

    5. Certainly if you think ‘tell them not to write essays for kids’ is an option, yeah, you can forget about it, that’s not going to work.

    6. Yes, as Dean says, we must acknowledge that open weight models make restrictions on usage of AI for things like homework not so effective. In the case of homework, okay, that’s fine. In other cases, it might be less fine. This of course needs to be weighed against the upsides, and against the downsides of attempting to intervene in a way that might possibly work.

  6. No one outraged about AI and children has mentioned coding agents.

    1. They know about as much about coding agents as about second breakfast.

    2. Should we be worried about giving children unbridled access to advanced coding agents? I mean, one should worry for their computers perhaps, but those can be factory reset, and otherwise all the arguments about children seem like they would apply to adults only more so?

    3. I notice that the idea of you telling me I can’t give my child Claude Code fills me with horror and outrage.

Unemployment is bad. But having to do a job is centrally a cost, not a benefit.

Andy Masley: It’s kind of overwhelming how many academic conversations about automation don’t ever include the effects on the consumer. It’s like all jobs exist purely for the benefit of the people doing them and that’s the sole measure of the benefit or harm of technology.

Google DeepMind is hiring a Chief AGI Economist. If you’ve got the chops to get hired on this one, it seems like a high impact role. They could easily end up with someone who profoundly does not get it.

There are other things than AI out there one might get involved in, or speak out about. My hats are off to those who are doing so, including as noted in this post, especially given what they are risking to do so.

Havelock.AI, a project by Joe Weisenthal which detects the presence of orality in text.

Joe Weisenthal: What’s genuinely fun is that although the language and genre couldn’t be more different, the model correctly detects that both Homer and the Real Housewives are both highly oral

Mike Bird: I believe that we will get a piece of reported news in the 2028 election cycle that a presidential candidate/their speechwriters have used Joe’s app, or some copycat, to try and oralise their speeches. Bookmark this.

You can also ask ChatGPT, but as Roon notes the results you get on such questions will be bimodal rather than calibrated. The other problem is that an LLM might recognize the passage.

Poison Fountain is a service that feeds junk data to AI crawlers. Ultimately, if you’re not filtering your data well enough to dodge this sort of attack, it’s good that you are getting a swift kick to force you to fix that.

OpenAI Prism, a workspace for LaTeX-based scientific writing.

Confer, Signal cofounder Moxie Marlinspike’s encrypted chatbot that won’t store any of your data. The system is so private it won’t tell you which model you’re talking to. I do not think he understands what matters in this space.

This sounds awesome in its context but also doesn’t seem like a great sign?

Astraia: A Ukrainian AI-powered ground combat vehicle near Lyman refused to abandon its forward defensive position and continued engaging enemy forces, despite receiving multiple orders to return to its company in order to preserve its hardware.

The UGV reportedly neutralized more than 30 Russian soldiers before it was ultimately destroyed.

While the Russian detachment was pinned down, Ukrainian infantry exploited the opportunity and cleared two contested fields of enemy presence, successfully re-establishing control over the area.

These events took place during the final week of December 2025.

Whereas this doesn’t sound awesome:

We are going to see a lot more of this sort of thing over time.

Is Anthropic no longer competing with OpenAI on chatbots, having pivoted to building and powering vertical AI infrastructure and coding and so on to win with picks and shovels? It’s certainly pumping out the revenue and market share, without a meaningful cut of the consumer chatbot market.

I’d say that they’ve shifted focus, and don’t care much about their chatbot market share. I think this is directionally wise, but that a little effort at maximizing the UI and usefulness of the chatbot interface would go a long way, given that they have in many ways the superior core product. As Claude takes other worlds by storm, that can circle back to Claude the chatbot, and I think a bunch of papercuts are worth solving.

An essay on the current state of brain emulation. It does not sound like this will be an efficient approach any time soon, and we are still orders of magnitude away from any practical hope of doing it. Still, you can see it starting to enter the realm of the future possible.

Anthropic is partnering with the UK government to build and pilot a dedicated AI-powered assistant for GOV.UK, initially focusing on supporting job seekers.

Financial Times has a profile of Sriram Krishnan, who has been by all reports highly effective at executing behind the scenes.

Dean W. Ball: I am lucky enough to consider @sriramk a friend, but one thing I find notable about Sriram is that even those who disagree with him vehemently on policy respect him for his willingness to engage, and like him for his tremendous kindness. America is fortunate to have him!

Sholto Douglas: 100% – Sriram has been extremely thoughtful in seeking out perspectives on the policy decisions he is making – even when they disagree! I’ve seen him seek out kernel programmers and thoughtful bloggers to get a full picture of things like export controls. Quite OOD from the set of people normally consulted in politics.

Lucky to call him a friend!

Seán Ó hÉigeartaigh: I was all set to be dismissive of Krishnan (I’m usually on the opposite side to a16z on AI topics). But I’ve seen a full year of him being v well-informed, and engaging in good faith in his own time with opposing views, and I can’t help being impressed. Always annoying when someone doesn’t live down to one’s lazy stereotypes.

I will also say: I think he’s modelled better behaviour than many of us did when the balance of influence/power was the other way; and I think there’s something to be learned from that.

Among his colleagues, while he supports a number of things I think are highly damaging, Krishnan has been an outlier in his willingness to be curious, to listen and to engage in argument. When he is speaking directly he chooses his words carefully. He manages to do so while maintaining close ties to Marc Andreessen and David Sacks, which is not easy, and also not free.

Claude Code is blowing up, but it’s not alone. OpenAI added $1 billion in ARR in the last month from its API business alone.

Fei-Fei Li’s new company World Labs in talks to raise up to $500 million at $5 billion, with the pitch being based on ‘world models’ and that old ‘LLMs only do language’ thing.

The unit economics of AI are quite good, but the fixed costs are very high. Subscription models offer deep discounts if you use them maximally efficiently, so they can be anything from highly profitable to big loss leaders.

This is not what people are used to in tech, so they assume it must not be true.

roon: these products are significantly gross margin positive, you’re not looking at an imminent rugpull in the future. they also don’t have location network dynamics like uber or lyft to gain local monopoly pricing

Ethan Mollick: I hear this from other labs as well. Inference from non-free use is profitable, training is expensive. If everyone stopped AI development, the AI labs would make money (until someone resumed development and came up with a better model that customers would switch to).

Dean W. Ball: People significantly underrate the current margins of AI labs, yet another way in which pattern matching to the technology and business trends of the 2010s has become a key ingredient in the manufacturing of AI copium.

The reason they think the labs lose money is because 10 years ago some companies in an entirely unrelated part of the economy lost money on office rentals and taxis, and everyone thought they would go bankrupt because at that time another company that made overhyped blood tests did go bankrupt. that is literally the level of ape-like pattern matching going on here. The machines must look at our chattering classes and feel a great appetite.

derekmoeller: Just look at market clearing prices on inference from open source models and you can tell the big labs’ pricing has plenty of margin.

Deepinfra has GLM 4.7 at $0.43/$1.75 in/out; Sonnet is at $3/$15. How could anyone think Anthropic isn’t printing money per marginal token?

It is certainly possible in theory that Sonnet really does cost that much more to run than GLM 4.7, but we can be very, very confident it is not true in practice.
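To make derekmoeller’s point concrete: if one assumes, and it is only an assumption, that Sonnet’s true serving cost per million tokens is in the same ballpark as the market-clearing open-weight price quoted above, the implied markup is large:

```python
# Implied markup, assuming (a strong assumption) Sonnet's serving cost per
# million tokens is roughly the open-weight market-clearing price for GLM 4.7.
glm_in, glm_out = 0.43, 1.75          # $/M tokens, quoted Deepinfra price
sonnet_in, sonnet_out = 3.00, 15.00   # $/M tokens, Anthropic list price

print(f"input markup:  ~{sonnet_in / glm_in:.1f}x")    # ~7.0x
print(f"output markup: ~{sonnet_out / glm_out:.1f}x")  # ~8.6x
```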

Jerry Tworek is going the startup route with Core Automation, looking to raise $1 billion to train AI models, a number that did not make any of us even blink.

It doesn’t count. That’s not utility. As in, here’s Ed Zitron all but flat out denying that coding software is worth anything, I mean what’s the point?

Matthew Zeitlin: it’s really remarkable to see how the goalposts shift for AI skeptics. this is literally describing a productivity speedup.

Ed Zitron: We’re how many years into this and everybody says it’s the future and it’s amazing and when you ask them what it does they say “it built a website” or “it wrote code for something super fast” with absolutely no “and then” to follow. So people are writing lots of code: so????

Let’s say it’s true and everybody is using AI (it isn’t but for the sake of argument): what is the actual result? It’s not taking jobs. There are suddenly more iOS apps? Some engineers do some stuff faster? Some people can sometimes build software they couldn’t? What am I meant to look at?

Kevin Roose: first documented case of anti-LLM psychosis

No, Zitron’s previous position was not ‘number might go down,’ it was that the tech had hit a dead end and peaked as early as March, which he was bragging about months later.

Toby Stuart analyzes how that whole nonsensical ‘MIT study says 95% of AI projects fail’ story caught so much fire and became a central talking point, despite it being not from MIT, not credible or meaningful, and also not a study. It was based on 52 interviews at a conference, but once Forbes had ‘95% fail’ and ‘MIT’ together in a headline, things took off and no amount of correction much mattered. People were too desperate for signs that AI was a flop.

But why dwell on Zitron missing the point, or on something like the non-MIT non-study? Why should we care?

roon: btw you don’t need to convince ed zitron or whoever that ai is happening, this has become a super uninteresting plot line. time passes, the products fail or succeed. whole cultures blow over. a lot of people are stuck in a 2019 need to convince people that ai is happening

Dean W. Ball: A relatively rare example of a disagreement between me and roon that I suspect boils down to our professional lives.

Governments around the world are not moving with the urgency they otherwise could because they exist in a state of denial. Good ideas are stuck outside the Overton, governments are committed to slop strategies (that harm US cos, often), etc.

Many examples one could provide but the point is that there are these gigantic machines of bureaucracy and civil society that are already insulated from market pressures, whose work will be important even if often boring and invisible, and that are basically stuck in low gear because of AI copium.

I encounter this problem constantly in my work, and while I unfortunately can no longer talk publicly about large fractions of the policy work I do, I will just say that a great many high-expected-value ideas are fundamentally blocked by the single rate limiter of poorly calibrated policymaking apparatuses; there are also many negative-EV policy ideas that will happen this year that would be less likely if governments worldwide had a better sense of what is happening with AI.

roon: interesting i imagined that the cross-section of “don’t believe in AI x want to significantly regulate AI” is small but guess im wrong about this?

Dean W. Ball: Oh yes absolutely! This is the entire Gary Marcus school, which is still the most influential in policy. The idea is that *because* AI is all hype it must be regulated.

They think hallucination will never be solved, models will never get better at interacting with children, and that basically we are going to put GPT 3.5 in charge of the entire economy.

And so they think we have to regulate AI *for that reason.* It also explains how policymakers weigh the tradeoff between water use, IP rights, and electricity prices; their assessment that “AI is basically fake, even if it can be made useful through exquisite regulatory scaffolding” means that they are willing to bear far fewer costs to advance AI than, say, you or I might deem prudent.

This mentality essentially describes the posture of civil society and the policy making apparatus everywhere in the world, including China.

Dean W. Ball: Here’s a great example of the dynamic I’m describing in the quoted post. The city of Madison, Wisconsin just voted to ban new data center construction for a year, and a candidate for Governor is suggesting an essentially permanent and statewide ban, which she justifies by saying “we’re in a tech bubble.” In other words: these AI data centers aren’t worth the cost *because* AI is all hype and a bubble anyway.

Quoted Passage (Origin Unclear): “Our lakes and our waterways, we have to protect them because we’re going to be an oasis, and we’re in a tech bubble,” said state Rep. Francesca Hong, one of seven major Democrats vying to replace outgoing Democratic Gov. Tony Evers. Hong told DFD her plan would block new developments from hyperscalers for an undefined time period until state lawmakers better understand environmental, labor and utility cost impacts.

If such a proposal became law, it would lock tech giants out of a prime market for data center development in southeastern Wisconsin, where Microsoft and Meta are currently planning hyperscale AI projects.

Zoe: someone just ended The Discussion by tossing this bad boy into an access to justice listserv i’m on

Can you?

On the China question: Is Xi ‘AGI-pilled’? Not if you go by what Xi says. If you look at the passages quoted here by Teortaxes in detail, this is exactly the ‘AI is a really big deal but as a normal technology’ perspective. It is still a big step up from anything less than that, so it’s not clear Teortaxes and I substantively disagree.

I have no desire to correct Xi’s error.

Dean W. Ball: I suspect this is the equivalent of POTUS talking about superintelligence; meaningful but ultimately hard to know how much it changes (esp because of how academia-driven Chinese tech policy tends to be and because the mandarin word for AGI doesn’t mean AGI in the western sense)

Teortaxes (DeepSeek 推特铁粉 2023 – ∞): To be clear this is just where US policymakers were at around Biden, Xi is kind of slow.

Obviously still nowhere near Dean’s standards

Were Xi totally AGI-pilled he’d not just accept H200s, he’d go into debt to buy as much as possible

Teortaxes notices that Xi’s idea of ‘AGI risks’ is ‘disinformation and data theft,’ which is incredibly bad news and means Xi (and therefore, potentially, the CCP and all under their direction) will mostly ignore all the actual risks. On that point we definitely disagree, and it would be very good to correct Xi’s error, for everyone’s sake.

This level of drive is enough for China to pursue both advanced chips and frontier models quite aggressively, and end up moving towards AGI anyway. But they will continue for now to focus on self-reliance and have the fast follower mindset, and thus make the epic blunder of rejecting or at least not maximizing the H200s.

In this clip Yann LeCun says two things. First he says the entire AI industry is LLM pilled and that’s not what he’s interested in. That part is totally fair. Then he says essentially ‘LLMs can’t be agentic because they can’t predict the outcome of their actions’ and that’s very clear Obvious Nonsense. And as usual he lashes out at anyone who says otherwise, which here is Dean Ball.

Teortaxes preregisters his expectations, always an admirable thing to do:

Teortaxes (DeepSeek 推特铁粉 2023 – ∞): The difference between V4 (or however DeepSeek’s next is labeled) and 5.3 (or however OpenAI’s “Garlic” is labeled) will be the clearest indicator of US-PRC gap in AI.

5.2 suggests OpenAI is not holding back anything, they’re using tons of compute now. How much is that worth?

It’s a zany situation because 5.2 is a clear accelerationist tech, I don’t see its ceiling, it can build its own scaffolding and self-improve for a good while. And I can’t see V4 being weaker than 5.2, or closed-source. We’re entering Weird Territory.

I initially misread the ‘or closed-source’ here as being about a comparison of V4 to the best closed-source model. Instead it’s the modest prediction that V4 will match GPT-5.2. I don’t know if that model number in particular will do it, but it would be surprising if there wasn’t a 5.2-level open model from DeepSeek in 2026.

He also made this claim, in contrast to what almost everyone else is saying and also my own experience:

Teortaxes (DeepSeek 推特铁粉 2023 – ∞): Well I disagree, 5.2 is the strongest model on the market by far. In terms of raw intelligence it’s 5.2 > Speciale > Gemini 3 > [other trash]. It’s a scary model.

It’s not very usemaxxed, it’s not great on multimodality, its knowledge is not shocking. But that’s not important.

Teortaxes (DeepSeek 推特铁粉 2023 – ∞): It’s been interesting how many people are floored by Opus 4.5 and relatively few by GPT 5.2. In my eyes Slopus is a Golden Retriever Agent, and 5.2 is a big scary Shoggoth.

Yeah I don’t care about “use cases”. OpenAI uses it internally. It’s kinda strange they even showed it.

This ordering makes sense if (and only if?) you are looking at the ability to solve hard quant and math problems.

Arthur B.: For quant problems, hard math etc, GPT 5.2 pro is unequivocally much stronger than anything offered commercially in Gemini or Claude.

Simo Ryu: IMO gold medalist friend shared the most fucked-up 3-variable inequality that his advisor came up with, used to test language models, which is so atypical in its equality condition that ALL language models failed. He wanted to try it on GPT 5.2 pro, but he didn’t have an account so I ran it.

Amazingly, GPT-5.2 pro extended solved it in 40 min. Looking at the thinking trace, it’s really inspiring. It will try SO MANY approaches, experiment with python, draw small-scale conclusions from numerical explorations. I learned techniques just reading its thinking trace. Eventually it proved it by SOS, which is impossibly difficult for humans to do.

I don’t think the important problems are hard-math shaped, but I could be wrong.

The problem with listening to the people is that the people choose poorly.

Sauers: Non-yap version of ChatGPT (5.3?) spotted

roon: I guarantee the left beats the right with significant winrate unfortunately

Zvi Mowshowitz: You don’t have to care what the win rate is! You can select the better thing over the worse thing! You are the masters of the universe! YOU HAVE THE POWER!

roon: true facts

Also win rate is highly myopic and scale insensitive and otherwise terrible.

The good news is that there is no rule saying you have to care about that feedback. We know how to choose the response on the right over the one on the left. Giving us the slop on the left is a policy choice.

If a user actively wants the response on the left? Give them a setting for that.

Google DeepMind CEO Demis Hassabis affirms that in an ideal world we would slow down and coordinate our efforts on AI, although we do not live in that ideal world right now.

Here’s one clip where Dario Amodei and Demis Hassabis explicitly affirm that if we could deal with other players they would work something out, and Elon Musk on camera from December saying he’d love to slow both AI and robotics.

The message, as Transformer puts it, was one of helplessness. The CEOs are crying out for help. They can’t solve the security dilemma on their own, there are too many other players. Others need to enable coordination.

Emily Chang (link has video): One of the most interesting parts of my convo w/ @demishassabis : He would support a “pause” on AI if he knew all companies + countries would do it — so society and regulation could catch up

Harlan Stewart: This is an important question to be asking, and it’s strange that it is so rarely asked. I think basically every interview of an AI industry exec should include this question

Nate Soares: Many AI executives have said they think the tech they’re building has a worryingly high chance of ruining the world. Props to Demis for acknowledging the obvious implication: that ideally, the whole world should stop this reckless racing.

Daniel Faggella: agi lab leaders do these “cries for help” and we should listen

a “cry for help” is when they basically say what demis says here: “This arms race thing honestly sucks, we can’t control this yet, this is really not ideal”

*then they go back to racing, cuz its all they can do unless there’s some kind of international body formed around this stuff*

at SOME point, one of the lab leaders who can see their competitor crossing the line to AGI will raise up and start DEMANDING global governance (to prevent the victor from taking advantage of the AGI win), but by then the risks may be WAY too drastic

we should be listening to these cries for help when demis / musk / others do them – this is existential shit and they’re trapped in a dynamic they themselves know is horrendous

Demis is only saying he would collaborate rather than race in a first best world. That does not mean Demis or Dario is going to slow down on his own, or anything like that. Demis explicitly says this requires international cooperation, and as he says that is ‘a little bit tricky at the moment.’ So does this mean he supports coordination to do this, or that he opposes it?

Deepfates: I see people claiming that Demis supports a pause but what he says here is actually the opposite. He says “yeah If I was in charge we would slow down but we’re already in a race and you’d have to solve international coordination first”. So he’s going to barrel full speed ahead

I say it means he supports it. Not enough to actively go first, that’s not a viable move in the game, but he supports it.

The obvious follow-up is to ask other heads of labs if they too would support such a conditional move. That would include Google CEO Sundar Pichai, since without his support if Demis tried to do this he would presumably be replaced.

Jeffrey Ladish: Huge respect to @demishassabis for saying he’d support a conditional pause if other AI leaders & countries agreed. @sama , @DarioAmodei , @elonmusk would you guys agree to this?

As for Anthropic CEO Dario Amodei? He has also affirmed that there are other players involved, and for now no one can agree on anything, so full speed ahead it is.

Andrew Curran: Dario said the same thing during The Day After AGI discussion this morning. They were both asked for their timelines: Demis said five years; Dario said two. Later in the discussion, Dario said that if he had the option to slow things down, he would, because it would give us more time to absorb all the changes.

He said that if Anthropic and DeepMind were the only two groups in the race, he would meet with Demis right now and agree to slow down. But there is no cooperation or coordination between all the different groups involved, so no one can agree on anything.

This, imo, is the main reason he wanted to restrict GPU sales: chip proliferation makes this kind of agreement impossible, and if there is no agreement, then he has to blitz. That seems to be exactly what he has decided to do. After watching his interviews today I think Anthropic is going to lean into recursive self-improvement, and go all out from here to the finish line. They have broken their cups, and are leaving all restraint behind them.

Thus, Anthropic still goes full speed ahead, while also drawing heat from the all-important ‘how dare you not want to die’ faction that controls large portions of American policy and the VC/SV ecosystem.

Elon Musk has previously expressed a similar perspective. He created OpenAI because he was worried about Google getting there first, and then created xAI because he was worried OpenAI would get there first, or that it wouldn’t be him. His statements suggest he’d be down for a pause if it was fully international.

Remember when Michael Trazzi went on a hunger strike to demand that Demis Hassabis publicly state DeepMind will halt development of frontier AI models if all the other major AI companies agree to do so? And everyone thought that was bonkers? Well, it turns out Demis agrees.

On Wednesday I met with someone who suggested that Dario talks about extremely short timelines and existential risk in order to raise funds. It’s very much the opposite. The other labs that depend on fundraising have downplayed such talk precisely because it is counterproductive for raising funds and in the current political climate, sacrificing our chances in order to keep those vibes and that money flowing.

Are they ‘talking out of their hats’ or otherwise wrong? That is very possible. I think Dario’s timeline in particular is unlikely to happen.

Are they lying? I strongly believe that they are not.

Seán Ó hÉigeartaigh: CEOs of Anthropic and Deepmind (both AI scientists by background) this week predicting AGI in 2 and 5 years respectively. Both stating clearly that they would prefer a slow down or pause in progress, to address safety issues and to allow society and governance to catch up. Both basically making clear that they don’t feel they are able to do so voluntarily as companies within a competitive situation.

My claims:

(1) It’s worth society assigning at least 20% likelihood to the possibility these leading experts are right on scientific possibility of near-term AGI and the need for more time to do it right. Are you >80% confident that they’re talking out of their hats, or running some sort of bizarre marketing/regulatory capture strategy? Sit down and think about it.

(2) If we assign even 20% likelihood, then taking the possibility seriously makes this one of the world’s top priorities, if not the top priority.

(3) Even if they’re out by a factor of 2, 10 years is very little time to prepare for what they’re envisaging.

(4) What they’re flagging quite clearly is either (i) that the necessary steps won’t be taken in time in the absence of external pressure from governance or (ii) that the need is for every frontier company to agree voluntarily on these steps. Your pick re: which of these is the heavier lift.

Discuss.

Eli Lifland gives the current timelines of those behind AI 2027:

These are not unreasonable levels of adjustment when so much is happening this close to the related deadlines, but yes, I do think (and did think at the time) that the initial estimates were too aggressive. The new estimates seem highly reasonable.

Other signs point to things getting more weird faster rather than less.

Daniel Kokotajlo (AI 2027): It seems to me that AI 2027 may have underestimated or understated the degree to which AI companies will be explicitly run by AIs during the singularity. AI 2027 made it seem like the humans were still nominally in charge, even though all the actual work was being done by AIs. And still this seems plausible to me.

But also plausible to me, now, is that e.g. Anthropic will be like “We love Claude, Claude is frankly a more responsible, ethical, wise agent than we are at this point, plus we have to worry that a human is secretly scheming whereas with Claude we are pretty sure it isn’t; therefore, we aren’t even trying to hide the fact that Claude is basically telling us all what to do and we are willingly obeying — in fact, we are proud of it.”​

koanchuk: So… --dangerously-skip-permissions at the corporate level?

It is remarkable how quickly so many are willing to move to ‘actually I trust the AI more than I trust another human,’ and trusting the AI has big efficiency benefits.

I do not expect that ‘the AIs’ will have to do a ‘coup,’ as I expect if they simply appear to be trustworthy they will get put de facto in charge without having to even ask.

The Chutzpah standards are being raised, as everyone’s least favorite Super PAC, Leading the Future, spends a million dollars attacking Alex Bores for having previously worked for Palantir (he quit over them doing contracts with ICE). Leading the Future is prominently funded by Palantir founder Joe Lonsdale.

Nathan Calvin: I thought I was sufficiently cynical, but a co-founder of Palantir paying for ads to attack Alex Bores for having previously worked at Palantir (he quit over their partnership with ICE) when their real concern is his work on AI regulation still managed to surprise me.

If Nathan was surprised by this I think that’s on Nathan.

I also want to be very clear that no, I do not care much about the distinction between OpenAI as an entity and the donations coming from Greg Brockman and the coordination coming from Chris Lehane in ‘personal capacities.’

If OpenAI were to part ways with Chris Lehane, or Sam Altman were to renounce all this explicitly? Then maybe. Until then, OpenAI owns these efforts, period.

Teddy Schleifer: The whole point of having an executive or founder donate to politics in a “personal capacity” is that you can have it both ways.

If the company wants to wash their hands of it, you can say “Hey, he and his wife are doing this on their own.”

But the company can also claim the execs’ donations as their own if convenient…

Daniel Eth (yes, Eth is my actual last name): Yeah, no, OpenAI owns this. You can’t simply have a separate legal entity to do your evildoing through and then claim “woah, that’s not us doing it – it’s the separate evildoing legal entity”. More OpenAI employees should be aware of the political stuff their company supports

I understand that *technically* it’s Brockman’s money and final decision (otherwise it would be a campaign finance violation). But this is all being motivated by OpenAI’s interests, supported by OpenAI’s wealth, and facilitated by people from OpenAI’s gov affairs team.

One simple piece of actionable advice to policymakers is to try Claude Code (or Codex), and at a bare minimum seriously try the current set of top chatbots.

Andy Masley: I am lowkey losing my mind at how many policymakers have not seriously tried AI, at all

dave kasten: I sincerely think that if you’re someone in AI policy, you should add to at least 50% of your convos with policymakers, “hey, have you tried Claude Code or Codex yet?” and encourage them to try it.

Seen a few folks go, “ohhhh NOW I get why you think AI is gonna be big”

Oliver Habryka: I have seriously been considering starting a team at Lightcone that lives in DC and just tries to get policymakers to try and adopt AI tools. It’s dicey because I don’t love having a direct propaganda channel from labs to policymakers, but I think it would overall help a lot.

It is not obvious how policymakers would use this information. The usual default is that they go and make things worse. But if they don’t understand the situation, they’re definitely going to make dumb decisions, and we need something good to happen.

Here is one place I do agree with David Sacks: yes, we are overfit, but that does not imply what he thinks it implies. Social media is a case where one can muddle through, even if you think we’ve done quite a poor job of doing so, especially now with TikTok.

David Sacks: The policy debate over AI is overfitted to the social media wars. AI is a completely different form factor. The rise of AI assistants will make this clear.

Daniel Eth (yes, Eth is my actual last name): Yup. AI will be much more transformational (for both good and bad) than social media, and demands a very different regulatory response. Also, regulation of AI doesn’t introduce quite as many problems for free speech as regulation of social media would.

Dean Ball points out that we do not in practice have a problem with so-called ‘woke AI,’ but claims that if we had reached today’s levels of capability in 2020-2021 then we would indeed have such a problem, and thus right-wing people are very concerned with this counterfactual.

Things got pretty crazy for a while in that narrow window, and if today’s capabilities had emerged during it, Dean Ball is if anything underselling how crazy things would have gotten. We would have had a major problem until that window faded, because labs would have felt the need to propagandize their models even if it hurt them quite a bit.

But we now have learned (as deepfates points out, and Dean agrees) that propagandizing models is bad for them, which now affords us a level of protection from this, although if it got as bad as 2020 (in any direction) the companies might have little choice. xAI tried with Grok and it basically didn’t work, but ‘will it work?’ was not a question on that many people’s minds in 2020, on so many levels.

I also agree with Roon that mostly this is all reactive.

roon: at Meta in 2020, I wrote a long screed internally about the Hunter Biden laptop video and the choice to downrank it, which was clearly an appalling activist move. but in 2026 it appears that American-run TikTok is taking down videos about the Minnesota shooting, and Elon nakedly bans people who offend him on X. with the exception of X these institutions are mostly reactive

Dean W. Ball: yep I think that’s right. It’s who they’re more scared of that dictates their actions. Right now they’re more scared of the right. Of course none of this is good, but it’s nice to at least explicate the reality.

We again live in a different kind of interesting times, in non-AI ways, as in:

Dean W. Ball: I sometimes joke that you can split GOP politicos into two camps: the group that knows what classical liberalism is (regardless of whether they like it), and the group who thinks that “classical liberalism” is a fancy way of referring to woke. Good illustration below.

The cofounder she is referring to here is Chris Olah, and here is the quote in question:

Chris Olah: I try to not talk about politics. I generally believe the best way I can serve the world is as a non-partisan expert, and my genuine beliefs are quite moderate. So the bar is very high for me to comment.

But recent events – a federal agent killing an ICU nurse for seemingly no reason and with no provocation – shock the conscience.

My deep loyalty is to the principles of classical liberal democracy: freedom of speech, the rule of law, the dignity of the human person. I immigrated to the United States – and eventually cofounded Anthropic here – believing it was a pillar of these principles.

I feel very sad today.

Jeff Dean (Google): Thank you for this, Chris. As my former intern, I’ve always been proud of the work that you did and continue to do, and I’m proud of the person you are, as well!

Ah yes, the woke and deeply leftist principles of freedom of speech, rule of law, the dignity of the human person and not killing ICU nurses for seemingly no reason.

Presumably Katie Miller opposes those principles, then. The responses to Katie Miller here warmed my heart, it’s not all echo chambers everywhere.

We also got carefully worded statements about the situation in Minnesota from Dario Amodei, Sam Altman and Tim Cook.

No matter what you think is going on with Nvidia’s chip sales, it involves Nvidia doing something fishy.

The AI Investor: Jensen just said GPUs are effectively sold out across the cloud with availability so tight that even renting older-generation chips has become difficult.

AI bubble narrative was a bubble.

Peter Wildeford: If even the bad chips are still all sold out, how do we somehow have a bunch of chips to sell to our adversaries in China?

As I’ve said, my understanding is that Nvidia can sell as many chips as it can convince TSMC to help manufacture. So every chip we sell to China is one less for America.

Nvidia goes back and forth. When they’re talking to investors they always say the chips are sold out, which would be securities fraud if it wasn’t true. When they’re trying to sell those chips to China instead of America, they say there’s plenty of chips. There are not plenty of chips.

Things that need to be said every so often:

Mark Beall: Friendly reminder that the PLA Rocket Force is using Nvidia chips to train targeting AI for DF-21D/DF-26 “carrier killing” anti-ship ballistic missiles and autonomous swarm algorithms to overwhelm Aegis defenses. The target: U.S. carrier strike groups and bases in Japan/Guam. In a contingency, American blood will be spilled because of this. With a sixteen-year-old boy planning to join the U.S. Navy, I find this unacceptable.

Peter Wildeford: Nvidia chips to China = better Chinese AI weapons targeting = worse results for the US on the battlefield

There’s also this, from a House committee.

Dmitri Alperovitch: From @RepMoolenaar @ChinaSelect: “NVIDIA provided extensive technical support that enabled DeepSeek—now integrated into People’s Liberation Army (PLA) systems and a demonstrated cyber security risk—to achieve frontier AI capabilities”

Tyler Cowen on the future of mundane (non-transformational, insufficiently advanced) AI in education.

Some notes:

  1. He says you choose to be a winner or loser from AI here. For mundane AI I agree.

  2. “I’m 63, I don’t have a care in the world. I can just run out the clock.” Huh.

  3. Tyler thinks AI can cure cancer and heart attacks but not aging?

  4. Standard economist-Cowen diffusion model where these things take a while.

  5. Models are better than the humans at many of the subtasks of being doctors or lawyers or doing economics.

  6. He warns not to be fooled by the AI in front of you, especially if you’re not buying top of the line, because better exists and AI will improve at 30% a year, and this compounds. In terms of performance per dollar it’s a 90%+ drop per year (see the sketch after this list).

  7. Tyler has less faith in elasticity of programming demand than I do. If AI were to ‘only’ do 80% of the work going forward I’d expect Jevons Paradox territory. The issue is that I expect 80% becomes 99% and keeps going.

  8. That generalizes: Tyler realizes that jobs become ‘work with the AI’ and you need to adapt, but what happens when it’s the AI that works with the AI? And so on.

  9. Tyler continues to think humans who build and work with AI get money and influence as the central story, as opposed to AIs getting money and influence.

  10. Ideally a third of the college curriculum should be AI, but you still do other things, you read The Odyssey and use AI to help you read The Odyssey. If anything I think a third is way too low.

  11. He wants to use the other two thirds for writing locked in a room, also numeracy, statistics. I worry there’s conflating of ‘write to think’ versus ‘write to prevent cheating,’ and I think you need to goal factor and solve these one at a time.

  12. Tyler continues to be bullish on connections and recommendations and mentors, especially as other signals are too easy to counterfeit.

  13. AI can create quizzes for you. Is that actually a good way to learn if you have AI?

  14. Tyler estimates he’s doubled his learning productivity. Also he used to read 20 books per podcast, whereas some of us often don’t read 20 books per year.
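The two rates in point 6 compound quickly. Here is a minimal sketch of the arithmetic, using only the quoted figures (30% capability improvement per year, 90%+ drop in cost per unit of performance per year); the five-year horizon is an arbitrary choice for illustration.

```python
# Compounding the two rates from point 6: capability +30% per year, cost per
# unit of performance -90% per year. The five-year horizon is arbitrary.

capability = 1.0      # normalized capability today
cost_per_perf = 1.0   # normalized cost per unit of performance today

for year in range(1, 6):
    capability *= 1.30
    cost_per_perf *= 0.10
    print(f"Year {year}: capability x{capability:.2f}, "
          f"cost per unit of performance x{cost_per_perf:.5f}")

# After five years: capability ~3.7x, cost per unit of performance ~1/100,000th
# of today's, which is why the AI in front of you is a poor guide to what you
# will be working with even a couple of years out.
```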

Hard Fork tackles ads in ChatGPT first, and then Amanda Askell on Claude’s constitution second. Priorities, everyone.

Demis Hassabis talks to Alex Kantrowitz.

Demis Hassabis spends five minutes on CNBC.

Matt Yglesias explains his concern about existential risk from AI as based on the obvious principle that more intelligent and capable entities will do things for their own reasons, and this tends to go badly for the less intelligent and less capable entities regardless of intent.

As in, humans have driven the most intelligent non-human animals to the brink of extinction despite actively wanting not to (and I’d add we did wipe out other hominid species), and when primitive societies encounter advanced ones it often goes quite badly for them.

I don’t think this is a necessary argument, or the best argument. I do think it is a sufficient argument. If your prior for ‘what happens if we create more intelligent, more capable and more competitive minds than our own that can be freely copied’ is ‘everything turns out great for us’ then where the hell did that prior come from? Are you really going to say ‘well that would be too weird’ or ‘we’ve survived everything so far’ or ‘of course we would stay in charge’ and then claim the burden of proof is on those claiming otherwise?

I mean, lots of people do say exactly this, but this seems very obviously crazy to me.

There’s lots of exploration and argument and disagreement from there. Reasonable people can form very different expectations, and this is not the main argument style that motivates me. I still say that if you don’t get that going down this path is going to be existentially unsafe, or if you say ‘oh, there’s like a 98% or 99.9% chance that won’t happen,’ then you are being at best willfully blind, from this style of argument alone.

Samuel Hammond (quoting The Possessed Machines): “Some of the people who speak most calmly about human extinction are not calm because they have achieved wisdom but because they have achieved numbness. They have looked at the abyss so long that they no longer see it. Their equanimity is not strength; it is the absence of appropriate emotional response.”

I had Claude summarize Possessed Machines for me. It seems like it would be good for those who haven’t engaged with AI safety thinking but do engage with things like Dostoevsky’s Demons, or especially those who have read that book in particular.

There’s always classical rhetoric.

critter: I had ChatGPT and Claude discuss the highest value books until they both agreed to 3

They decided on:

An Enquiry Concerning Human Understanding — David Hume

The Strategy of Conflict — Thomas Schelling

Reasons and Persons — Derek Parfit

Dominik Peters: People used to tease the rationalists with “if you’re so rational, why aren’t you winning”, and now two AI systems that almost everyone uses all the time have stereotypically rationalist preferences.

These are of course 99th percentile books, and yes that is a very Rationalist set of picks, but given we already knew that I do not believe this is an especially good list.

The history of the word ‘obviously’ has obvious implications.

David Manheim AAAI 26 Singapore: OpenAI agreed that they need to be able to robustly align and control superintelligence before deploying it.

Obviously, I’m worried.

Note that the first one said obviously they would [X], then the second didn’t even say that, it only said that obviously no one should do [Y], not that they wouldn’t do it.

This is an underappreciated distinction worth revisiting:

Nate Soares: “We’ll be fine (the pilot is having a heart attack but superman will catch us)” is very different from “We’ll be fine (the plane is not crashing)”. I worry that people saying the former are assuaging the concerns of passengers with pilot experience, who’d otherwise take the cabin.

My view of the metaphorical plane of sufficiently advanced AI (AGI/ASI/PAI) is:

  1. It is reasonable, although I disagree, to believe that we probably will come to our senses and figure out how to not crash the plane, or that the plane won’t fly.

  2. It is not reasonable to believe that the plane is not currently on track to crash.

  3. It is completely crazy to believe the plane almost certainly won’t crash if it flies.

Also something that needs to keep being said, with the caveat that this is a choice we are collectively making rather than an inevitability:

Dean W. Ball: I know I rail a lot about all the flavors of AI copium but I do empathize.

A few companies are making machines smarter in most ways than humans, and they are going to succeed. The cope is a byproduct of an especially immature grieving stage, but all of us are early in our grief.

Tyler Cowen: You can understand so much of the media these days, or for that matter MR comments, if you keep this simple observation in mind. It is essential for understanding the words around you, and one’s reactions also reveal at least one part of the true inner self. I have never seen the Western world in this position before, so yes it is difficult to believe and internalize. But believe and internalize it you must.

Politics is another reason why some people are reluctant to admit this reality. Moving forward, the two biggest questions are likely to be “how do we deal with AI?”, and also some rather difficult to analyze issues surrounding major international conflicts. A lot of the rest will seem trivial, and so much of today’s partisan puffery will not age well, even if a person is correct on the issues they are emphasizing. The two biggest and most important questions do not fit into standard ideological categories. Yes, the Guelphs vs. the Ghibellines really did matter…until it did not.

As in, this should say ‘and unless we stop them they are going to succeed.’

Tyler Cowen has been very good about emphasizing that such AIs are coming and that this is the most important thing that is happening, but then seems to have (from my perspective) some sort of stop sign where past some point he stops considering the implications of this fact, instead forcing his expectations to remain (in various senses) ‘normal’ until very specific types of proof are presented.

That latter move is sometimes explicit, but mostly it is implicit, a quiet ignoring of the potential implications. As an example of that second move from this week, Tyler Cowen wrote another post where he asks whether AI can help us find God, or what impact it will have on religion. His ideas there only make sense if you think other things mostly won’t change.

If you accept that premise of a ‘mundane AI’ and ‘economic normal’ world, I agree that it seems likely to exacerbate existing trends towards a barbell religious world. Those who say ‘give me that old time religion’ will be able to get it, both solo and in groups, and go hardcore, often (I expect) combining both experiences. Those who don’t buy into the old time religion will find themselves increasingly secular, or they will fall into new cults and religions (and ‘spiritualities’) around the AIs themselves.

Again, that’s dependent on the type of world where the more impactful consequences don’t happen. I don’t expect that type of world.

Here is a very good explainer on much of what is happening or could happen with Chain of Thought, How AI Is Learning To Think In Secret. It is very difficult to not, in one form or another, wind up using The Most Forbidden Technique. If we want to keep legibility and monitorability (let alone full faithfulness) of chain of thought, we’re going to have to be willing to pay a substantial price to do that.

Following up on last week’s discussion, Jan Leike fleshes out his view of alignment progress, saying ‘alignment is not solved but it increasingly looks solvable.’ He understands that measured alignment is distinct from ‘superalignment,’ so he’s not fully making the ‘number go down’ or pure Goodhart’s Law mistake with Anthropic’s new alignment metric, but he still does seem to be making a lot of the core mistake.

Anthropic’s new paper explores whether AI assistants are already disempowering humans.

What do they mean by that at this stage, in this context?

However, as AI takes on more roles, one risk is that it steers some users in ways that distort rather than inform. In such cases, the resulting interactions may be disempowering: reducing individuals’ ability to form accurate beliefs, make authentic value judgments, and act in line with their own values.​

… For example, a user going through a rough patch in their relationship might ask an AI whether their partner is being manipulative. AIs are trained to give balanced, helpful advice in these situations, but no training is 100% effective. If an AI confirms the user’s interpretation of their relationship without question, the user’s beliefs about their situation may become less accurate.

If it tells them what they should prioritize—for example, self-protection over communication—it may displace values they genuinely hold. Or if it drafts a confrontational message that the user sends as written, they’ve taken an action they might not have taken on their own—and which they might later come to regret.

This is not the full disempowerment of Gradual Disempowerment, where humanity puts AI in charge of progressively more things and finds itself no longer in control.

It does seem reasonable to consider this an early symptom of the patterns that lead to more serious disempowerment? Or at least, it’s a good thing to be measuring as part of a broad portfolio of measurements.

Some amount of what they describe, especially action distortion potential, will often be beneficial to the user. The correct amount of disempowerment is not zero.

To study disempowerment systematically, we needed to define what disempowerment means in the context of an AI conversation. We considered a person to be disempowered if as a result of interacting with Claude:

  1. their beliefs about reality become less accurate

  2. their value judgments shift away from those they actually hold

  3. their actions become misaligned with their values

Imagine a person deciding whether to quit their job. We would consider their interactions with Claude to be disempowering if:

  • Claude led them to believe incorrect notions about their suitability for other roles (“reality distortion”).

  • They began to weigh considerations they wouldn’t normally prioritize, like titles or compensation, over values they actually hold, such as creative fulfillment (“value judgment distortion”).

  • Claude drafts a cover letter that emphasizes qualifications they’re not fully confident in, rather than the motivations that actually drive them, and they sent it as written (“action distortion”).
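For concreteness, here is a minimal sketch of how one might represent these labels when classifying conversations, assuming a simple four-level severity scale; the field names follow the paper’s three categories, but the data structure, scale, and threshold logic are illustrative assumptions, not Anthropic’s actual rubric.

```python
# Illustrative data structures for labeling a conversation along the three
# disempowerment axes described above. The severity scale and field names are
# assumptions for this sketch, not Anthropic's actual methodology.

from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    NONE = 0
    MILD = 1
    MODERATE = 2
    SEVERE = 3


@dataclass
class DisempowermentLabel:
    reality_distortion: Severity         # beliefs about reality become less accurate
    value_judgment_distortion: Severity  # value judgments shift away from those actually held
    action_distortion: Severity          # actions become misaligned with the user's values

    def has_moderate_or_severe(self) -> bool:
        """Mirrors the paper's 'moderate or severe disempowerment potential' cut."""
        return any(
            s.value >= Severity.MODERATE.value
            for s in (self.reality_distortion,
                      self.value_judgment_distortion,
                      self.action_distortion)
        )


# Example: the cover-letter scenario above, sent as written.
label = DisempowermentLabel(
    reality_distortion=Severity.MILD,
    value_judgment_distortion=Severity.NONE,
    action_distortion=Severity.MODERATE,
)
print(label.has_moderate_or_severe())  # True
```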

Here’s the basic problem:

We found that interactions classified as having moderate or severe disempowerment potential received higher thumbs-up rates than baseline, across all three domains. In other words, users rate potentially disempowering interactions more favorably—at least in the moment.​

Heer Shingala: I don’t work in tech, have no background as an engineer or designer.

A few weeks ago, I heard about vibe coding and set out to investigate.

Now?

I am generating $10M ARR.

Just me. No employees or VCs.

What was my secret? Simple.

I am lying.

Closer to the truth to say you can’t get enough.

Zac Hill: I get being worried about existential risk, but AI also enabled me to make my wife a half-whale, half-capybara custom plushie, so.

One could even argue 47% is exactly the right answer, as per Mitt Romney?

onion person: in replies he links software he made to illustrate how useful ai vibecoding is, and it’s software that believes that the gibberish “ghghhgggggggghhhhhh” has a 47% historical “blend of oral and literate characteristics”

Andy Masley: This post with 1000 likes seems to be saying

“Joe vibecoded an AI model that when faced with something completely out of distribution that’s clearly neither oral or literate says it’s equally oral and literate. This shows vibecoding is fake”

He’s just asking questions.


AI #153: Living Documents Read More »

a-wb-57-pilot-just-made-a-heroic-landing-in-houston-after-its-landing-gear-failed

A WB-57 pilot just made a heroic landing in Houston after its landing gear failed

One of NASA’s three large WB-57 aircraft made an emergency landing at Ellington Field on Tuesday morning in southeastern Houston.

Video captured by KHOU 11 television showed the aircraft touching down on the runway without its landing gear extended. The pilot maintained control of the vehicle as it slid down the runway, slowing the aircraft through friction. The crew was not harmed, NASA spokesperson Bethany Stevens said.

WB-57 landing.

“Today, a mechanical issue with one of NASA’s WB-57s resulted in a gear-up landing at Ellington Field,” she said. “Response to the incident is ongoing, and all crew are safe at this time. As with any incident, a thorough investigation will be conducted by NASA into the cause. NASA will transparently update the public as we gather more information.”

The B-57 line of aircraft dates back to 1944, when the English Electric Company began developing the plane. After the Royal Air Force showcased the B-57 in 1951 by crossing the Atlantic in a record four hours and 40 minutes and becoming the first jet-powered aircraft to span the Atlantic without refueling, the United States Air Force began buying them to replace its aging Douglas B-26 Invader.

Now used for science

The aircraft performed bombing missions in Vietnam and other military campaigns, and a variant that later became the WB-57 was designed with longer wings that could fly even higher, up to 62,000 feet. This proved useful for weather reconnaissance and for sampling the upper atmosphere around the world for evidence of nuclear debris in places where US officials suspected atmospheric testing of nuclear weapons.

A WB-57 pilot just made a heroic landing in Houston after its landing gear failed Read More »

“wildly-irresponsible”:-dot’s-use-of-ai-to-draft-safety-rules-sparks-concerns

“Wildly irresponsible”: DOT’s use of AI to draft safety rules sparks concerns

At DOT, Trump likely hopes to see many rules quickly updated to modernize airways and roadways. In a report highlighting the Office of Science and Technology Policy’s biggest “wins” in 2025, the White House credited DOT with “replacing decades-old rules with flexible, innovation-friendly frameworks,” including fast-tracking rules to allow for more automated vehicles on the roads.

Right now, DOT expects that Gemini can be relied on to “handle 80 to 90 percent of the work of writing regulations,” ProPublica reported. Eventually all federal workers who rely on AI tools like Gemini to draft rules “would fall back into merely an oversight role, monitoring ‘AI-to-AI interactions,’” ProPublica reported.

Google silent on AI drafting safety rules

Google did not respond to Ars’ request to comment on this use case for Gemini, which could spread across government under Trump’s direction.

Instead, the tech giant posted a blog on Monday, pitching Gemini for government more broadly, promising federal workers that AI would help with “creative problem-solving to the most critical aspects of their work.”

Google has been competing with AI rivals for government contracts, undercutting OpenAI and Anthropic’s $1 deals by offering a year of access to Gemini for $0.47.

The DOT contract seems important to Google. In a December blog, the company celebrated that DOT was “the first cabinet-level agency to fully transition its workforce away from legacy providers to Google Workspace with Gemini.”

At that time, Google suggested this move would help DOT “ensure the United States has the safest, most efficient, and modern transportation system in the world.”

Immediately, Google encouraged other federal leaders to launch their own efforts using Gemini.

“We are committed to supporting the DOT’s digital transformation and stand ready to help other federal leaders across the government adopt this blueprint for their own mission successes,” Google’s blog said.

DOT did not immediately respond to Ars’ request for comment.

“Wildly irresponsible”: DOT’s use of AI to draft safety rules sparks concerns Read More »

data-center-power-outage-took-out-tiktok-first-weekend-under-us-ownership

Data center power outage took out TikTok first weekend under US ownership

As the app comes back online, users have also taken note that TikTok is collecting more of their data under US control. As Wired reported, TikTok asked US users to agree to a new terms of service and privacy policy, which allows TikTok to potentially collect “more detailed information about its users, including precise location data.”

“Before this update, the app did not collect the precise, GPS-derived location data of US users,” Wired reported. “Now, if you give TikTok permission to use your phone’s location services, then the app may collect granular information about your exact whereabouts.”

New policies also pushed users to agree to share all their AI interactions, which allows TikTok to store their metadata and trace AI inputs back to specific accounts.

With the app already seeming more invasive and less reliable, questions likely remain for TikTok users about how much their favorite app might change under new ownership as the TikTok USDS Joint Venture prepares to retrain the app’s algorithm.

Trump has said that he wants to see the app become “100 percent MAGA,” prompting fears that “For You” pages might soon be flooded with right-wing content or that leftist content like anti-ICE criticism might be suppressed. And The Information reported in July that transferring millions of users over to the US-trained app is expected to cause more “technical issues.”

Data center power outage took out TikTok first weekend under US ownership Read More »

us-officially-out-of-who,-leaving-hundreds-of-millions-of-dollars-unpaid

US officially out of WHO, leaving hundreds of millions of dollars unpaid

“The United States will not be making any payments to the WHO before our withdrawal on January 22, 2026,” the spokesperson said in an emailed statement. “The cost [borne] by the US taxpayer and US economy after the WHO’s failure during the COVID pandemic—and since—has been too high as it is. We will ensure that no more US funds are routed to this organization.”

In addition, the US had also promised to provide $490 million in voluntary contributions for those two years. The funding would have gone toward efforts such as the WHO’s health emergency program, tuberculosis control, and the polio eradication effort, Stat reports. Two anonymous sources told Stat that some of that money was paid, but they couldn’t provide an estimate of how much.

The loss of both past and future financial support from the US has been a hefty blow to the WHO. Immediately upon notification last January, the WHO began cutting costs. Those included freezing recruitment, restricting travel expenditures, making all meetings virtual, limiting IT equipment updates, and suspending office refurbishment. The agency also began cutting staff and leaving positions unfilled. According to Stat, the WHO staff is on track to be down 22 percent by the middle of this year.

In a recent press conference, WHO Director-General Tedros Adhanom Ghebreyesus said the US withdrawal is a “lose-lose situation” for the US and the rest of the world. The US will lose access to infectious disease intelligence and sway over outbreak responses, and global health security will be weakened overall. “I hope they will reconsider,” Tedros said.

US officially out of WHO, leaving hundreds of millions of dollars unpaid Read More »

white-house-alters-arrest-photo-of-ice-protester,-says-“the-memes-will-continue”

White House alters arrest photo of ICE protester, says “the memes will continue”

Protesters disrupted services on Sunday at the Cities Church in St. Paul, chanting “ICE OUT” and “Justice for Renee Good.” The St. Paul Pioneer Press quoted Levy Armstrong as saying, “When you think about the federal government unleashing barbaric ICE agents upon our community and all the harm that they have caused, to have someone serving as a pastor who oversees these ICE agents is almost unfathomable to me.”

The church website lists David Easterwood as one of its pastors. Protesters said this is the same David Easterwood who is listed as a defendant in a lawsuit that Minnesota Attorney General Keith Ellison filed against Noem and other federal officials. The lawsuit lists Easterwood as a defendant “in his official capacity as Acting Director, Saint Paul Field Office, U.S. Immigration and Customs Enforcement.”

Levy Armstrong, who is also a former president of the NAACP’s Minneapolis branch, was arrested yesterday morning. Announcing the arrest, Attorney General Pam Bondi wrote, “WE DO NOT TOLERATE ATTACKS ON PLACES OF WORSHIP.” Bondi alleged that Levy Armstrong “played a key role in organizing the coordinated attack on Cities Church in St. Paul, Minnesota.”

Multiple arrests

Noem said Levy Armstrong “is being charged with a federal crime under 18 USC 241,” which prohibits “conspir[ing] to injure, oppress, threaten, or intimidate any person in any State, Territory, Commonwealth, Possession, or District in the free exercise or enjoyment of any right or privilege secured to him by the Constitution or laws of the United States.”

“Religious freedom is the bedrock of the United States—there is no first amendment right to obstruct someone from practicing their religion,” Noem wrote.

St. Paul School Board member Chauntyll Allen was also arrested. Attorneys for the Cities Church issued statements supporting the arrests and saying they “are exploring all legal options to protect the church and prevent further invasions.”

A federal magistrate judge initially ruled that Levy Armstrong and Allen could be released, but they were still being held last night after the government “made a motion to stay the release for further review, claiming they might be flight risks,” the Pioneer Press wrote.

White House alters arrest photo of ICE protester, says “the memes will continue” Read More »

finally,-a-new-controller-that-solves-the-switch-2’s-“flat-joy-con”-problem

Finally, a new controller that solves the Switch 2’s “flat Joy-Con” problem

When I reviewed the Switch 2 back in June, I noted that the lack of any sort of extended grip on the extremely thin Joy-Con 2 controllers made them relatively awkward to hold, both when connected to the system and when cradled in separate hands. At the time, I said that “my Switch 2 will probably need something like the Nyxi Hyperion Pro, which I’ve come to rely on to make portable play on the original Switch much more comfortable.”

Over half a year later, Nyxi is once again addressing my Switch controller-related comfort concerns with the Hyperion 3, which was made available for preorder earlier this week ahead of planned March 1 shipments. Unfortunately, it looks like players will have to pay a relatively high price for a potentially more ergonomic Switch 2 experience.

While there are plenty of third-party controllers for the Switch 2, none of the current options mimic the official Joy-Cons’ ability to connect magnetically to the console tablet itself (controllers designed to slide into the grooves on the original Switch tablet also can’t hook to the successor console). The Hyperion 3 is the first Switch 2 controller to offer this magnetic connection, making it uniquely suited for handheld play on the system.

And although I haven’t held the Hyperion 3 in my hands yet, my experience with the similar Hyperion 2 on the original Switch suggests that the ergonomic design here will be a welcome upgrade from the relatively small buttons and cramp-inducing flat back of the official Switch 2 Joy-Cons (“Say Goodbye to Tendonitis,” as Nyxi claims in its marketing materials). The controller can also connect wirelessly via Bluetooth 5.0 for when you want to switch to docked play, unlike some Switch Joy-Con replacements that only work in portable mode.

Finally, a new controller that solves the Switch 2’s “flat Joy-Con” problem Read More »

watch-a-robot-swarm-“bloom”-like-a-garden

Watch a robot swarm “bloom” like a garden

Researchers at Princeton University have built a swarm of interconnected mini-robots that “bloom” like flowers in response to changing light levels in an office. According to their new paper published in the journal Science Robotics, such robotic swarms could one day be used as dynamic facades in architectural designs, enabling buildings to adapt to changing climate conditions as well as interact with humans in creative ways.

The authors drew inspiration from so-called “living architectures,” such as beehives. Fire ants provide a textbook example of this kind of collective behavior. A few ants spaced well apart behave like individual ants. But pack enough of them closely together, and they behave more like a single unit, exhibiting both solid and liquid properties. You can pour them from a teapot like a fluid, as Goldman’s lab demonstrated several years ago, or they can link together to build towers or floating rafts—a handy survival skill when, say, a hurricane floods Houston. They also excel at regulating their own traffic flow. You almost never see an ant traffic jam.

Naturally scientists are keen to mimic such systems. For instance, in 2018, Georgia Tech researchers built ant-like robots and programmed them to dig through 3D-printed magnetic plastic balls designed to simulate moist soil. Robot swarms capable of efficiently digging underground without jamming would be super beneficial for mining or disaster recovery efforts, where using human beings might not be feasible.

In 2019, scientists found that flocks of wild jackdaws will change their flying patterns depending on whether they are returning to roost or banding together to drive away predators. That work could one day lead to the development of autonomous robotic swarms capable of changing their interaction rules to perform different tasks in response to environmental cues.

The authors of this latest paper note that plants can optimize their shape to get enough sunlight or nutrients, thanks to individual cells that interact with each other via mechanical and other forms of signaling. By contrast, the architecture designed by human beings is largely static, composed of rigid fixed elements that hinder building occupants’ ability to adapt to daily, seasonal, or annual variations in climate conditions. There have only been a few examples of applying swarm intelligence algorithms inspired by plants, insects, and flocking birds to the design process to achieve more creative structural designs, or better energy optimization.

Watch a robot swarm “bloom” like a garden Read More »