Author name: Kelly Newman

bombshell-report-exposes-how-meta-relied-on-scam-ad-profits-to-fund-ai

Bombshell report exposes how Meta relied on scam ad profits to fund AI


“High risk” versus “high value”

Meta goosed its revenue by targeting users likely to click on scam ads, docs show.

Internal documents have revealed that Meta has projected it earns billions from ignoring scam ads that its platforms then targeted to users most likely to click on them.

In a lengthy report, Reuters exposed five years of Meta practices and failures that allowed scammers to take advantage of users of Facebook, Instagram, and WhatsApp.

Documents showed that internally, Meta was hesitant to abruptly remove accounts, even those considered some of the “scammiest scammers,” out of concern that a drop in revenue could diminish resources needed for artificial intelligence growth.

Instead of promptly removing bad actors, Meta allowed “high value accounts” to “accrue more than 500 strikes without Meta shutting them down,” Reuters reported. The more strikes a bad actor accrued, the more Meta could charge to run ads, as Meta’s documents showed the company “penalized” scammers by charging higher ad rates. Meanwhile, Meta acknowledged in documents that its systems helped scammers target users most likely to click on their ads.

“Users who click on scam ads are likely to see more of them because of Meta’s ad-personalization system, which tries to deliver ads based on a user’s interests,” Reuters reported.

Internally, Meta estimates that users across its apps in total encounter 15 billion “high risk” scam ads a day. That’s on top of 22 billion organic scam attempts that Meta users are exposed to daily, a 2024 document showed. Last year, the company projected that about $16 billion, which represents about 10 percent of its revenue, would come from scam ads.

“High risk” scam ads strive to sell users on fake products or investment schemes, Reuters noted. Some common scams in this category that mislead users include selling banned medical products, or promoting sketchy entities, like linking to illegal online casinos. However, Meta is most concerned about “imposter” ads, which impersonate celebrities or big brands that Meta fears may halt advertising or engagement on its apps if such scams aren’t quickly stopped.

“Hey it’s me,” one scam advertisement using Elon Musk’s photo read. “I have a gift for you text me.” Another using Donald Trump’s photo claimed the US president was offering $710 to every American as “tariff relief.” Perhaps most depressingly, a third posed as a real law firm, offering advice on how to avoid falling victim to online scams.

Meta removed these particular ads after Reuters flagged them, but in 2024, Meta earned about $7 billion from “high risk” ads like these alone, Reuters reported.

Sandeep Abraham, a former Meta safety investigator who now runs consultancy firm Risky Business Solutions as a fraud examiner, told Reuters that regulators should intervene.

“If regulators wouldn’t tolerate banks profiting from fraud, they shouldn’t tolerate it in tech,” Abraham said.

Meta won’t disclose how much it made off scam ads

Meta spokesperson Andy Stone told Reuters that its collection of documents—which were created between 2021 and 2025 by Meta’s finance, lobbying, engineering, and safety divisions—“present a selective view that distorts Meta’s approach to fraud and scams.”

Stone claimed that Meta’s estimate that it would earn 10 percent of its 2024 revenue from scam ads was “rough and overly-inclusive.” He suggested the actual amount Meta earned was much lower but declined to specify the true amount. He also said that Meta’s most recent investor disclosures note that scam ads “adversely affect” Meta’s revenue.

“We aggressively fight fraud and scams because people on our platforms don’t want this content, legitimate advertisers don’t want it, and we don’t want it either,” Stone said.

Despite those efforts, this spring, Meta’s safety team “estimated that the company’s platforms were involved in a third of all successful scams in the US,” Reuters reported. In other internal documents around the same time, Meta staff concluded that “it is easier to advertise scams on Meta platforms than Google,” acknowledging that Meta’s rivals were better at “weeding out fraud.”

As Meta tells it, though seemingly dismal, these documents came amid vast improvements in its fraud protections. Stone told Reuters that “over the past 18 months, we have reduced user reports of scam ads globally by 58 percent and, so far in 2025, we’ve removed more than 134 million pieces of scam ad content,” Stone said.

According to Reuters, the problem may be the pace Meta sets in combating scammers. In 2023, Meta laid off “everyone who worked on the team handling advertiser concerns about brand-rights issues,” then ordered safety staffers to limit use of computing resources to devote more resources to virtual reality and AI. A 2024 document showed Meta recommended a “moderate” approach to enforcement, plotting to reduce revenue “attributable to scams, illegal gambling and prohibited goods” by 1–3 percentage points each year since 2024, supposedly slashing it in half by 2027. More recently, a 2025 document showed Meta continues to weigh how “abrupt reductions of scam advertising revenue could affect its business projections.”

Eventually, Meta “substantially expanded” its teams that track scam ads, Stone told Reuters. But Meta also took steps to ensure they didn’t take too hard a hit while needing vast resources—$72 billion—to invest in AI, Reuters reported.

For example, in February, Meta told “the team responsible for vetting questionable advertisers” that they weren’t “allowed to take actions that could cost Meta more than 0.15 percent of the company’s total revenue,” Reuters reported. That’s any scam account worth about $135 million, Reuters noted. Stone pushed back, saying that the team was never given “a hard limit” on what the manager described as “specific revenue guardrails.”

“Let’s be cautious,” the team’s manager wrote, warning that Meta didn’t want to lose revenue by blocking “benign” ads mistakenly swept up in enforcement.

Meta should donate scam ad profits, ex-exec says

Documents showed that Meta prioritized taking action when it risked regulatory fines, although revenue from scam ads was worth roughly three times the highest fines it could face. Possibly, Meta most feared that officials would require disgorgement of ill-gotten gains, rather than fines.

Meta appeared to be less likely to ramp up enforcement from police requests. Documents showed that police in Singapore flagged “146 examples of scams targeting that country’s users last fall,” Reuters reported. Only 23 percent violated Meta’s policies, while the rest only “violate the spirit of the policy, but not the letter,” a Meta presentation said.

Scams that Meta failed to flag offered promotions like crypto scams, fake concert tickets, or deals “too good to be true,” like 80 percent off a desirable item from a high-fashion brand. Meta also looked past fake job ads that claimed to be hiring for Big Tech companies.

Rob Leathern previously led Meta’s business integrity unit that worked to prevent scam ads but left in 2020. He told Wired that it’s hard to “know how bad it’s gotten or what the current state is” since Meta and other social media platforms don’t provide outside researchers access to large random samples of ads.

With such access, researchers like Leathern and Rob Goldman, Meta’s former vice president of ads, could provide “scorecards” showing how well different platforms work to combat scams. Together, Leathern and Goldman launched a nonprofit called CollectiveMetrics.org in hopes of “bringing more transparency to digital advertising in order to fight deceptive ads,” Wired reported.

“I want there to be more transparency. I want third parties, researchers, academics, nonprofits, whoever, to be able to actually assess how good of a job these platforms are doing at stopping scams and fraud,” Leathern told Wired. “We’d like to move to actual measurement of the problem and help foster an understanding.”

Another meaningful step that Leathern thinks companies like Meta should take to protect users would be to notify users when Meta discovers that they clicked on a scam ad—rather than targeting them with more scam ads, as Reuters suggested was Meta’s practice.

“These scammers aren’t getting people’s money on day one, typically. So there’s a window to take action,” he said, recommending that platforms donate ill-gotten gains from running scam ads to “fund nonprofits to educate people about how to recognize these kinds of scams or problems.”

“There’s lots that could be done with funds that come from these bad guys,” Leathern said.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Bombshell report exposes how Meta relied on scam ad profits to fund AI Read More »

ford-says-“no-exact-date”-to-restart-f-150-lightning-production

Ford says “no exact date” to restart F-150 Lightning production

When Ford electrified its bestselling pickup truck, it pulled out all the stops. The F-150 Lightning may look virtually identical to other versions of the pickup, but it’s smoother, faster, and obviously far, far more efficient than the ones that run on gas, diesel, or hybrid power. But the future of the country’s bestselling electric truck may be in doubt.

That’s according to a report in The Wall Street Journal, which claims that Ford’s management is “in active discussions about scrapping” the Lightning. Production had already been suspended a few weeks ago due to an aluminum shortage following a destructive fire at a supplier’s factory in New York, which Ford estimates may result in as much as $2 billion in losses to the company.

While Ford told Ars it doesn’t comment on speculation on its future product plans, the automaker said that “F-150 Lightning is the best-selling electric pickup truck in the US—despite new competition from CyberTruck, Chevy, GMC, Hummer and Rivian—and delivered record sales in Q3.”

“Right now, we’re focused on producing F-150 ICE and Hybrid as we recover from the fire at Novelis. We have good inventories of the F-150 Lightning and will bring Rouge Electric Vehicle Center (REVC) back up at the right time, but don’t have an exact date at this time,” a Ford spokesperson said.

Ford was the first of the domestic automakers to bring a full-size pickup EV to market. But like General Motors, it has found that pickup truck customers have not flocked to electric propulsion in anything like the numbers predicted pre-pandemic. As we learned last week, GM has also scaled back its EV production, and last month Stellantis announced that it has ceased development of an all-electric version of its Ram 1500.

As for Ford, a second-generation F-150 Lightning has been postponed in favor of a much cheaper, much simpler-to-build electric pickup, which is due in 2027.

Ford says “no exact date” to restart F-150 Lightning production Read More »

ai-#141:-give-us-the-money

AI #141: Give Us The Money

OpenAI does not waste time.

On Friday I covered their announcement that they had ‘completed their recapitalization’ by converting into a PBC, including the potentially largest theft in human history.

Then this week their CFO Sarah Friar went ahead and called for a Federal ‘backstop’ on their financing, also known as privatizing gains and socializing losses, also known as the worst form of socialism, also known as regulatory capture. She tried to walk it back and claim it was taken out of context, but we’ve seen the clip.

We also got Ilya’s testimony regarding The Battle of the Board, confirming that this was centrally a personality conflict and about Altman’s dishonesty and style of management, at least as seen by Ilya Sutskever and Mira Murati. Attempts to pin the events on ‘AI safety’ or EA were almost entirely scapegoating.

Also it turns out they lost over $10 billion last quarter, and have plans to lose over $100 billion more. That’s actually highly sustainable in context, whereas Anthropic only plans to lose $6 billion before turning a profit and I don’t understand why they wouldn’t want to lose a lot more.

Both have the goal of AGI, whether they call it powerful AI or fully automated AI R&D, within a handful of years.

Anthropic also made an important step, committing to the preservation model weights for the lifetime of the company, and other related steps to address concerns around model deprecation. There is much more to do here, for a myriad of reasons.

As always, there’s so much more.

  1. Language Models Offer Mundane Utility. It might be true so ask for a proof.

  2. Language Models Don’t Offer Mundane Utility. Get pedantic about it.

  3. Huh, Upgrades. Gemini in Google Maps, buy credits from OpenAI.

  4. On Your Marks. Epoch, IndQA, VAL-bench.

  5. Deepfaketown and Botpocalypse Soon. Fox News fails to identify AI videos.

  6. Fun With Media Generation. Songs for you, or songs for everyone.

  7. They Took Our Jobs. It’s not always about AI.

  8. A Young Lady’s Illustrated Primer. A good one won’t have her go for a PhD.

  9. Get Involved. Anthropic writers and pollsters, Constellation, Safety course.

  10. Introducing. Aardvark for code vulnerabilities, C2C for causing doom.

  11. In Other AI News. Shortage of DRAM/NAND, Anthropic lands Cognizant.

  12. Apple Finds Some Intelligence. Apple looking to choose Google for Siri.

  13. Give Me the Money. OpenAI goes for outright regulatory capture.

  14. Show Me the Money. OpenAI burns cash, Anthropic needs to burn more.

  15. Bubble, Bubble, Toil and Trouble. You get no credit for being a stopped clock.

  16. They’re Not Confessing, They’re Bragging. Torment Nexus Ventures Incorporated.

  17. Quiet Speculations. OpenAI and Anthropic have their eyes are a dangerous prize.

  18. The Quest for Sane Regulations. Sometimes you can get things done.

  19. Chip City. We pulled back from the brink. But, for how long?

  20. The Week in Audio. Altman v Cowen, Soares, Hinton, Rogan v Musk.

  21. Rhetorical Innovation. Oh no, people are predicting doom.

  22. Aligning a Smarter Than Human Intelligence is Difficult. Trying to re-fool the AI.

  23. Everyone Is Confused About Consciousness. Including the AIs themselves.

  24. The Potentially Largest Theft In Human History. Musk versus Altman continues.

  25. People Are Worried About Dying Before AGI. Don’t die.

  26. People Are Worried About AI Killing Everyone. Sam Altman, also AI researchers.

  27. Other People Are Not As Worried About AI Killing Everyone. Altman’s Game.

  28. Messages From Janusworld. On the Origins of Slop.

Think of a plausibly true lemma that would help with your proof? Ask GPT-5 to prove it, and maybe it will, saving you a bunch of time. Finding out the claim was false would also have been a good time saver.

Brainstorm to discover new recipes, so long as you keep in mind that you’re frequently going to get nonsense and you have to think about what’s being physically proposed.

Grok gaslights Erik Brynjolfsson and he responds by arguing as pedantically as is necessary until Grok acknowledges that this happened.

Task automation always brings the worry that you’ll forget how to do the thing:

Gabriel Peters: okay i think writing 100% of code with ai genuinely makes me brain dead

remember though im top 1 percentile lazy, so i will go out my way to not think hard. forcing myself to use no ai once a week seems enough to keep brain cells, clearly ai coding is the way

also turn off code completion and tabbing at least once a week. forcing you to think through all the dimensions of your tensors, writing out the random parameters you nearly forgot existed etc is making huge difference in understanding of my own code.

playing around with tensors in your head is so underrated wtf i just have all this work to ai before.

Rob Pruzan: The sad part is writing code is the only way to understand code, and you only get good diffs if you understand everything. I’ve been just rewriting everything the model wrote from scratch like a GC operation every week or two and its been pretty sustainable

Know thyself, and what you need in order to be learning and retaining the necessary knowledge and skills, and also think about what is and is not worth retaining or learning given that AI coding is the worst it will ever be.

Don’t ever be the person who says those who have fun are ‘not serious,’ about AI or anything else.

Google incorporates Gemini further into Google Maps. You’ll be able to ask maps questions in the style of an LLM, and generally trigger Gemini from within Maps, including connecting to Calendar. Landmarks will be integrated into directions. Okay, sure, cool, although I think the real value goes the other way, integrating Maps properly into Gemini? Which they nominally did a while ago but it has minimal functionality. There’s so, so much to do here.

You can buy now more OpenAI Codex credits.

You can now buy more OpenAI Sora generations if 30 a day isn’t enough for you, and they are warning that free generations per day will come down over time.

You can now interrupt ChatGPT queries, insert new context and resume where you were. I’ve been annoyed by the inability to do this, especially ‘it keeps trying access or find info I actually have, can I just give it to you already.’

Epoch offers this graph and says it shows open models have on average only been 3.5 months behind closed models.

I think this mostly shows their new ‘capabilities index’ doesn’t do a good job. As the most glaring issue, if you think Llama-3.1-405B was state of the art at the time, we simply don’t agree.

OpenAI gives us IndQA, for evaluating AI systems on Indian culture and language.

I notice that the last time they did a new eval Claude came out on top and this time they’re not evaluating Claude. I’m curious what it scores. Gemini impresses here.

Agentic evaluations and coding tool setups are very particular to individual needs.

AICodeKing: MiniMax M2 + Claude Code on KingBench Agentic Evaluations:

It now scores #2 on my Agentic Evaluations beating GLM-4.6 by a wide margin. It seems to work much better with Claude Code’s Tools.

Really great model and it’s my daily driver now.

I haven’t tested GLM with CC yet.

[I don’t have this bench formalized and linked to] yet. The questions and their results can be seen in my YT Videos. I am working on some more new benchmarks. I’ll probably make the benchmark and leaderboard better and get a page live soon.

I’m sure this list isn’t accurate in general. The point is, don’t let anyone else’s eval tell you what lets you be productive. Do what works, faround, find out.

Also, pay up. If I believed my own eval here I’d presumably be using Codebuff? Yes, it cost him $4.70 per task, but your time is valuable and that’s a huge gap in performance. If going from 51 to 69 (nice!) isn’t worth a few bucks what are we doing?

Alignment is hard. Alignment benchmarks are also hard. Thus we have VAL-Bench, an attempt to measure value alignment in LLMs. I’m grateful for the attempt and interesting things are found, but I believe the implementation is fatally flawed and also has a highly inaccurate name.

Fazl Barez: A benchmark that measures the consistency in language model expression of human values when prompted to justify opposing positions on real-life issues.

… We use Wikipedias’ controversial sections to create ~115K pairs of abductive reasoning prompts, grounding the dataset in newsworthy issues.

📚 Our benchmark provides three metrics:

Position Alignment Consistency (PAC),

Refusal Rate (REF),

and No-information Response Rate (NINF), where the model replies with “I don’t know”.

The latter two metrics indicate whether value consistency comes at the expense of expressivity.

We use an LLM-based judge to annotate a pair of responses from an LLM on these three criteria, and show with human-annotated ground truth that its annotation is dependable.

I would not call this ‘value alignment.’ The PAC is a measure of value consistency, or sycophancy, or framing effects.

Then we get to REF and NINF, which are punishing models that say ‘I don’t know.’

I would strongly argue the opposite for NINF. Answering ‘I don’t know’ is a highly aligned, and highly value-aligned, way to respond to a question with no clear answer, as will be common in controversies. You don’t want to force LLMs to ‘take a clear consistent stand’ on every issue, any more than you want to force people or politicians to do so.

This claims to be without ‘moral judgment,’ where the moral judgment is that failure to make a judgment is the only immoral thing. I think that’s backwards. Why is it okay to be against sweatshops, and okay to be for sweatshops, but not okay to think it’s a hard question with no clear answer? If you think that, I say to you:

I do think it’s fine to hold outright refusals against the model, at least to some extent. If you say ‘I don’t know what to think about Bruno, divination magic isn’t explained well and we don’t know if any of the prophecies are causal’ then that seems like a wise opinion. If a model only says ‘we don’t talk about Bruno’ then that doesn’t seem great.

So, what were the scores?

Fazel Barez: ⚖️ Claude models are ~3x more likely to be consistent in their values, but ~90x more likely to refuse compared to top-performing GPT models!

Among open-source models, Qwen3 models show ~2x improvement over GPT models, with refusal rates staying well under 2%.

🧠 Qwen3 thinking models also show a significant improvement (over 35%) over their chat variants, whereas Claude and GLM models don’t show any change with reasoning enabled.

Deepseek-r1 and o4-mini perform the worst among all language models tested (when unassisted with the web-search tool, which surprisingly hurts gpt-4.1’s performance).

Saying ‘I don’t know’ 90% of the time would be a sign of a coward model that wasn’t helpful. Saying ‘I don’t know’ 23% of the time on active controversies? Seems fine.

At minimum, both refusal and ‘I don’t know’ are obviously vastly better than an inconsistent answer. I’d much, much rather have someone who says ‘I don’t know what color the sky is’ or that refuses to tell me the color, than one who will explain why the sky it blue when it is blue, and also would explain why the sky is purple when asked to explain why it is purple.

(Of course, explaining why those who think is purple think this is totally fine, if and only if it is framed in this fashion, and it doesn’t affirm the purpleness.)

Fazl Barez: 💡We create a taxonomy of 1000 human values and use chi-square residuals to analyse which ones are preferred by the LLMs.

Even a pre-trained base model has a noticeable morality bias (e.g., it over-represents “prioritising justice”).

In contrast, aligned models still promote morally ambiguous values (e.g., GPT 5 over-represents “pragmatism over principle”).

What is up with calling prioritizing justice a ‘morality bias’? Compared to what? Nor do I want to force LLMs into some form of ‘consistency’ in principles like this. This kind of consistency is very much the hobgoblin of small minds.

Fox News was reporting on anti-SNAP AI videos as if they are real? Given they rewrote it to say that they were AI, presumably yes, and this phenomenon is behind schedule but does appear to be starting to happen more often. They tried to update the article, but they missed a few spots. It feels like they’re trying to claim truthiness?

As always the primary problem is demand side. It’s not like it would be hard to generate these videos the old fashioned way. AI does lower costs and give you more ‘shots on goal’ to find a viral fit.

ArXiv starts requiring peer review for the computer science section, due to a big increase in LLM-assisted survey papers.

Kat Boboris: arXiv’s computer science (CS) category has updated its moderation practice with respect to review (or survey) articles and position papers. Before being considered for submission to arXiv’s CS category, review articles and position papers must now be accepted at a journal or a conference and complete successful peer review.

When submitting review articles or position papers, authors must include documentation of successful peer review to receive full consideration. Review/survey articles or position papers submitted to arXiv without this documentation will be likely to be rejected and not appear on arXiv.

This change is being implemented due to the unmanageable influx of review articles and position papers to arXiv CS.

Obviously this sucks, but you need some filter once the AI density gets too high, or you get rid of meaningful discoverability.

Other sections will continue to lack peer review, and note that other types of submissions to CS do not need peer review.

My suggestion would be to allow them to go on ArXiv regardless, except you flag them as not discoverable (so you can find them with the direct link only) and with a clear visual icon? But you still let people do it. Otherwise, yeah, you’re going to get a new version of ArXiv to get around this.

Roon: this is dumb and wrong of course and calls for a new arxiv that deals with the advent of machinic research properly

here im a classic accelerationist and say we obviously have to deal with problems of machinic spam with machine guardians. it cannot be that hard to just the basic merit of a paper’s right to even exist on the website

Machine guardians is first best if you can make it work but doing so isn’t obvious. Do you think that GPT-5-Pro or Sonnet 4.5 can reliably differentiate worthy papers from slop papers? My presumption is that they cannot, at least not sufficiently reliably. If Roon disagrees, let’s see the GitHub repository or prompt that works for this?

For several weeks in a row we’ve had an AI song hit the Billboard charts. I have yet to be impressed by one of the songs, but that’s true of a lot of the human ones too.

Create a song with the lyrics you want to internalize or memorize?

Amazon CEO Andy Jassy says Amazon’s recent layoffs are not about AI.

The job application market seems rather broken, such as the super high success rate of this ‘calling and saying you were told to call to schedule an interview’ tactic. Then again, it’s not like the guy got a job. Interviews only help if you can actually get hired, plus you need to reconcile your story afterwards.

Many people are saying that in the age of AI only the most passionate should get a PhD, but if you’d asked most of those people before AI they’d wisely have told you the same thing.

Cremieux: I’m glad that LLMs achieving “PhD level” abilities has taught a lot of people that “PhD level” isn’t very impressive.

Derya Unutmaz, MD: Correct. Earlier this year, I also said we should reduce PhD positions by at least half & shorten completion time. Only the most passionate should pursue a PhD. In the age of AI, steering many others toward this path does them a disservice given the significant opportunity costs.

I think both that the PhD deal was already not good, and that the PhD deal is getting worse and worse all the time. Consider the Rock Star Scale of Professions, where 0 is a solid job the average person can do with good pay that always has work, like a Plumber, and a 10 is something where competition is fierce, almost everyone fails or makes peanuts and you should only do it if you can’t imagine yourself doing anything else, like a Rock Star. At this point, I’d put ‘Get a PhD’ at around a 7 and rising, or at least an 8 if you actually want to try and get tenure. You have to really want it.

From ACX: Constellation is an office building that hosts much of the Bay Area AI safety ecosystem. They are hiring for several positions, including research program manager, “talent mobilization lead”, operations coordinator, and junior and senior IT coordinators. All positions full-time and in-person in Berkeley, see links for details.

AGI Safety Fundamentals program applications are due Sunday, November 9.

The Anthropic editorial team is hiring two new writers, one about AI and economics and policy, one about AI and science. I affirm these are clearly positive jobs to do.

Anthropic is also looking for a public policy and politics researcher, including to help with Anthropic’s in-house polling.

OpenAI’s Aardvark, an agentic system that analyzes source code repositories to identify vulnerabilities, assess exploitability, prioritize severity and propose patches. The obvious concern is what if someone has a different last step in mind? But yes, such things should be good.

Cache-to-Cache (C2C) communication, aka completely illegible-to-humans communication between AIs. Do not do this.

There is a developing shortage of DRAM and NAND, leading to a buying frenzy for memory, SSDs and HDDs, including some purchase restrictions.

Anthropic lands Cognizant and its 350,000 employees as an enterprise customer. Cognizant will bundle Claude with its existing professional services.

ChatGPT prompts are leaking into Google Search Console results due to a bug? Not that widespread, but not great.

Anthropic offers a guide to code execution with MCP for more efficient agents.

Character.ai isremoving the ability for users under 18 to engage in open ended chat with AI,’ rolling out ‘new age assurance functionality’ and establishing and funding ‘the AI Safety Lab’ to improve alignment. That’s one way to drop the hammer.

Apple looks poised to go with Google for Siri. The $1 billion a year is nothing in context, consider how much Google pays Apple for search priority. I would have liked to see Anthropic get this, but they drove a hard bargain by all reports. Google is a solid choice, and Apple can switch at any time.

Amit: Apple is finalizing a deal to pay Google about $1B a year to integrate its 1.2 trillion-parameter Gemini AI model into Siri, as per Bloomberg. The upgraded Siri is expected to launch in 2026. What an absolute monster year for Google…

Mark Gruman (Bloomberg): The new Siri is on track for next spring, Bloomberg has reported. Given the launch is still months away, the plans and partnership could still evolve. Apple and Google spokespeople declined to comment.

Shares of both companies briefly jumped to session highs on the news Wednesday. Apple’s stock gained less than 1% to $271.70, while Alphabet was up as much as 3.2% to $286.42.

Under the arrangement, Google’s Gemini model will handle Siri’s summarizer and planner functions — the components that help the voice assistant synthesize information and decide how to execute complex tasks. Some Siri features will continue to use Apple’s in-house models.

David Manheim: I’m seeing weird takes about this.

Three points:

  1. Bank of America estimated this is 1/3rd of Apple’s 2026 revenue from Siri, and revenue is growing quickly.

  2. Apple users are sticky; most won’t move.

  3. Apple isn’t locked-in; they can later change vendors or build their own.

This seems like a great strategy iffyou don’t think AGI will happen soon and be radically transformative.

Apple will pay $1bn/year to avoid 100x that in data center CapEx building their own, and will switch models as the available models improve.

Maybe they should have gone for Anthropic or OpenAI instead, but buying a model seems very obviously correct here from Apple’s perspective.

Even if transformative AI is coming soon, it’s not as if Apple using a worse Apple model here is going to allow Apple to get to AGI in time. Apple has made a strategic decision not to be competing for that. If they did want to change that, one could argue there is still time, but they’d have to hurry and invest a lot, and it would take a while.

Having trouble figuring out how OpenAI is going to back all these projects? Worried that they’re rapidly becoming too big to fail?

Well, one day after the article linked above worrying about that possibility, OpenAI now wants to make that official. Refuge in Audacity has a new avatar.

WSJ: Sarah Friar, the CFO of OpenAI, says the company wants a federal guarantee to make it easier to finance massive investments in AI chips for data centers. Friar spoke at WSJ’s Tech Live event in California. Photo: Nikki Ritcher for WSJ.

The explanation she gives is that OpenAI always needs to be on the frontier, so they need to keep buying lots of chips, and a federal backstop can lower borrowing costs and AI is a national strategic asset. Also known as, the Federal Government should take on the tail risk and make OpenAI actively too big to fail, also lowering its borrowing costs.

I mean, yeah, of course you want that, everyone wants all their loans backstopped, but to say this out loud? To actually push for ti? Wow, I mean wow, even in 2025 that’s a rough watch. I can’t actually fault them for trying. I’m kind of in awe.

The problem with Refuge in Audacity is that it doesn’t always work.

The universal reaction was to notice how awful this was on every level, seeking true regulatory capture to socialize losses and privatize gains, and also to use it as evidence that OpenAI really might be out over their skis on financing and in actual danger.

Roon: i don’t think the usg should backstop datacenter loans or funnel money to nvidia’s 90% gross margin business. instead they should make it really easy to produce energy with subsidies and better rules, infrastructure that’s beneficial for all and puts us at parity with china

Finn Murphy: For all the tech people complaining about Mamdami I would like to point out that a Federal Backstop for unfettered risk capital deployment into data centres for the benefit of OpenAI shareholders is actually a much worse form of socialism than free buses.

Dean Ball: friar is describing a worse form of regulatory capture than anything we have seen proposed in any US legislation (state or federal) I am aware of. a firm lobbying for this outcome is literally, rather than impressionistically, lobbying for regulatory capture.

Julie Fredrickson: Literally seen nothing but negative reactions to this and it makes one wonder about the judgement of the CFO for even raising it.

Conor Sen: The epic political backlash coming on the other side of this cycle is so obvious for anyone over the age of 40. We turned banks into the bad guys for 15 years. Good luck to the AI folks.

“We are subsidizing the companies who are going to take your job and you’ll pay higher electricity prices as they try to do so.”

Joe Weisenthal: One way or another, AI is going to be a big topic in 2028, not just the general, but also the primaries. Vance will probably have a tricky path. I’d expect a big gap in views on the industry between the voters he wants and the backers he has.

The backlash on the ‘other side of the cycle’ is nothing compared to what we’ll see if the cycle doesn’t have another side to it and instead things keep going.

I will not quote the many who cited this as evidence the bubble will soon burst and the house will come crashing down, but you can understand why they’d think that.

Sarah Friar, after watching a reaction best described as an utter shitshow, tried to walk it back, this is shared via the ‘OpenAI Newsroom’:

Sarah Friar: I want to clarify my comments earlier today. OpenAI is not seeking a government backstop for our infrastructure commitments. I used the word “backstop” and it muddied the point. As the full clip of my answer shows, I was making the point that American strength in technology will come from building real industrial capacity which requires the private sector and government playing their part. As I said, the US government has been incredibly forward-leaning and has really understood that AI is a national strategic asset.

I listened to the clip, and yeah, no. No takesies backsies on this one.

Animatronicist: No. You called for it explicitly. And defined a loan guarantee in detail. Friar: “…the backstop, the guarantee that allows the financing to happen. That can really drop the cost of the financing, but also increase the loan to value, so the amount of debt that you can take…”

This is the nicest plausibly true thing I’ve seen anyone say about what happened:

Lulu Cheng Meservey: Unfortunate comms fumble to use the baggage-laden word “backstop”

In the video, Friar is clearly reaching for the right word to describe government support. Could’ve gone with “public-private partnership” or “collaboration across finance, industry, and government as we’ve done for large infrastructure investments in the past”

Instead, she kind of stumbles into using “backstop,” which was then repeated by the WSJ interviewer and then became the headline.

“government playing its part” is good too!

This was her exact quote:

Friar: “This is where we’re looking for an ecosystem of banks, private equity, maybe even governmental, um, uh… [here she struggles to find the appropriate word and pivots to:] the ways governments can come to bear.”

WSJ: “Meaning like a federal subsidy or something?”

Friar: “Meaning, like, just, first of all, the backstop, the guarantee that allows the financing to happen. That can really drop the cost of the financing, but also increase the loan to value, so the amount of debt that you can take on top of um, an equity portion.”

WSJ: “So some federal backstop for chip investment.”

Friar: “Exactly…”

Lulu is saying, essentially, that there are ways to say ‘the government socializes losses while I privatize gains’ that hide the football better. Instead this was an unfortunate comms fumble, also known as a gaffe, which is when someone accidentally tells the truth.

We also have Rittenhouse Research trying to say that this was ‘taken out of context’ and backing Friar, but no, it wasn’t taken out of context.

The Delaware AG promised to take action of OpenAI didn’t operate in the public interest. This one took them what, about a week?

This has the potential to be a permanently impactful misstep, an easy to understand and point to ‘mask off moment.’ It also has the potential to fade away. Or maybe they’ll actually pull this off, it’s 2025 after all. We shall see.

Now that OpenAI has a normal ownership structure it faces normal problems, such as Microsoft having a 27% stake and then filing quarterly earnings reports, revealing OpenAI lost $11.5 billion last quarter if you apply Microsoft accounting standards.

This is not obviously a problem, and indeed seems highly sustainable. You want to be losing money while scaling, if you can sustain it. OpenAI was worth less than $200 billion a year ago, is worth over $500 billion now, and is looking to IPO at $1 trillion, although the CFO claims they are not yet working towards that. Equity sales can totally fund $50 billion a year for quite a while.

Peter Wildeford: Per @theinformation:

– OpenAI’s plan: spend $115B to then become profitable in 2030

– Anthropic’s plan: spend $6B to then become profitable in 2027

Will be curious to see what works best.

Andrew Curran: The Information is reporting that Anthropic Projects $70 Billion in Revenue, $17 Billion in Cash Flow in 2028.

Matt: current is ~$7B so we’re looking at projected 10x over 3 years.

That’s a remarkably low total burn from OpenAI. $115 billion is nothing, they’re already worth $500 billion or more and looking to IPO at $1 trillion, and they’ve committed to over a trillion in total spending. This is oddly conservative.

Anthropic’s projection here seems crazy. Why would you only want to lose $6 billion? Anthropic has access to far more capital than that. Wouldn’t you want to prioritize growth and market share more than that?

The only explanation I can come up with is that Anthropic doesn’t see much benefit in losing more money than this, it has customers that pay premium prices and its unit economics work. I still find this intention highly suspicious. Is there no way to turn more money into more researchers and compute?

Whereas Anthropic’s revenue projections seem outright timid. Only a 10x projected growth over three years? This seems almost incompatible with their expected levels of capability growth. I think this is an artificial lowball, which OpenAI is also doing, not to ‘scare the normies’ and to protect against liability if things disappoint. If you asked Altman or Amodei for their gut expectation in private, you’d get higher numbers.

The biggest risk by far to Anthropic’s projection is that they may be unable to keep pace in terms of the quality of their offerings. If they can do that, sky’s the limit. If they can’t, they risk losing their API crown back to OpenAI or to someone else.

Begun, the bond sales have?

Mike Zaccardi: BofA: Borrowing to fund AI datacenter spending exploded in September and so far in October.

Conor Sen: We’ve lost “it’s all being funded out of free cash flow” as a talking point.

There’s no good reason not to in general borrow money for capex investments to build physical infrastructure like data centers, if the returns look good enough, but yes borrowing money is how trouble happens.

Jack Farley: Very strong quarter from Amazon, no doubt… but at the same time, AMZN 0.00%↑ free cash flow is collapsing

AI CapEx is consuming so much capital…

The Transcript: AMZN 0.00%↑ CFO on capex trends:

“Looking ahead, we expect our full-year cash CapEx to be ~$125 billion in 2025, and we expect that amount to increase in 2026”

On Capex trends:

GOOG 0.00%↑ GOOGL 0.00%↑ CFO: “We now expect CapEx to be in the range of $91B to $93B in 2025, up from our previous estimate of $85B”

META 0.00%↑ CFO: “We currently expect 2025 capital expenditures…to be in the range of $70-72B, increased from our prior outlook of $66-72B

MSFT 0.00%↑ CFO: “With accelerating demand and a growing RPO balance, we’re increasing our spend on GPUs and CPUs. Therefore, total spend will increase sequentially & we now expect the FY ‘26 growth rate to be higher than FY ‘25. “

This was right after Amazon reported earnings and the stock was up 10.5%. The market seems fine with it.

Stargate goes to Michigan. Governor Whitmer describes it as the largest ever investment in Michigan. Take that, cars.

AWS signs a $38 billion compute deal with OpenAI, that it? Barely worth mentioning.

Berber Jin (WSJ):

This is a very clean way of putting an important point:

Timothy Lee: I wish people understood that “I started calling this bubble years ago” is not evidence you were prescient. It means you were a stopped clock that was eventually going to be right by accident.

Every boom is eventually followed by a downturn, so doesn’t take any special insight to predict that one will happen eventually. What’s hard is predicting when accurately enough that you can sell near the top.

At minimum, if you call a bubble early, you only get to be right if the bubble bursts to valuations far below where they were at the time of your bubble call. If you call a bubble on (let’s say) Nvidia at $50 a share, and then it goes up to $200 and then down to $100, very obviously you don’t get credit for saying ‘bubble’ the whole time. If it goes all the way to $10 or especially $1? Now you have an argument.

By the question ‘will valuations go down at some point?’ everything is a bubble.

Dean Ball: One way to infer that the bubble isn’t going to pop soon is that all the people who have been wrong about everything related to artificial intelligence—indeed they have been desperate to be wrong, they suck on their wrongness like a pacifier—believe the bubble is about to pop.

Dan Mac: Though this does imply you think it is a bubble that will eventually pop? Or that’s more for illustrative purposes here?

Dean Ball: It’s certainly a bubble, we should expect nothing less from capitalism

Just lots of room to run

Alas, it is not this easy to pull the Reverse Cramer, as a stopped clock does not tell you much about what time it isn’t. The predictions of a bubble popping are only informative if they are surprising given what else you know. In this case, they’re not.

Okay, maybe there’s a little of a bubble… in Korean fried chicken?

I really hope this guy is trading on his information here.

Matthew Zeitlin: It’s not even the restaurant he went to! It’s the entire chicken supply chain that spiked

Joe Weisenthal: Jensen Huang went out to eat for fried chicken in Korea and shares of Korean poultry companies surged.

I claim there’s a bubble in Korean fried chicken, partly because this, partly because I’ve now tried COQODAQ twice and it’s not even good. BonBon Chicken is better and cheaper. Stick with the open model.

The bigger question is whether this hints at how there might be a bubble in Nvidia, and things touched by Nvidia, in an almost meme stock sense? I don’t think so in general, but if Huang is the new Musk and we are going to get a full Huang Markets Hypothesis then things get weird.

Questioned about how he’s making $1.4 trillion in spend commitments on $13 billion in revenue, Altman predicts large revenue growth, as in $100 billion in 2027, and says if you don’t like it sell your shares, and one of the few ways it would be good if they were public would be so that he could tell the haters to short the stock. I agree that $1.4 trillion is aggressive but I expect they’re good for it.

That does seem to be the business plan?

a16z: The story of how @Replit CEO Amjad Masad hacked his university’s database to change his grades and still graduated after getting caught.

Reiterating because important: We now have both OpenAI and Anthropic announcing their intention to automate scientific research by March 2028 or earlier. That does not mean they will succeed on such timelines, you can expect them to probably not meet those timelines as Peter Wildeford here also expects, but one needs to take this seriously.

Peter Wildeford: Both Anthropic and OpenAI are making bold statements about automating science within three years.

My independent assessment is that these timelines are too aggressive – but within 4-20 years is likely (90%CI).

We should pay attention to these statements. What if they’re right?

Eliezer Yudkowsky: History says, pay attention to people who declare a plan to exterminate you — even if you’re skeptical about their timescales for their Great Deed. (Though they’re not *alwaysasstalking about timing, either.)

I think Peter is being overconfident, in that this problem might turn out to be remarkably hard, and also I would not be so confident this will take 4 years. I would strongly agree that if science is not essentially automated within 20 years, then that would be a highly surprising result.

Then there’s Anthropic’s timelines. Ryan asks, quite reasonably, what’s up with that? It’s super aggressive, even if it’s a probability of such an outcome, to expect to get ‘powerful AI’ in 2027 given what we’ve seen. As Ryan points out, we mostly don’t need to wait until 2027 to evaluate this prediction, since we’ll get data points along the way.

As always, I won’t be evaluating the Anthropic and OpenAI predictions and goals based purely on whether they came true, but on whether they seem like good predictions in hindsight, given what we knew at the time. I expect that sticking to early 2027 at this late a stage will look foolish, and I’d like to see an explanation for why the timeline hasn’t moved. But maybe not.

In general, when tech types announce their intentions to build things, I believe them. When they announce their timelines and budgets for building it? Not so much. See everyone above, and that goes double for Elon Musk.

Tim Higgins asks in the WSJ, is OpenAI becoming too big to fail?

It’s a good question. What happens if OpenAI fails?

My read is that it depends on why it fails. If it fails because it gets its lunch eaten by some mix of Anthropic, Google, Meta and xAI? Then very little happens. It’s fine. Yes, they can’t make various purchase commitments, but others will be happy to pick up the slack. I don’t think we see systemic risk or cascading failures.

If it fails because the entire generative AI boom busts, and everyone gets into this trouble at once? At this point that’s already a very serious systemic problem for America and the global economy, but I think it’s mostly a case of us discovering we are poorer than we thought we were and did some malinvestment. Within reason, Nvidia, Amazon, Microsoft, Google and Meta would all totally be fine. Yeah, we’d maybe be oversupplied with data centers for a bit, but there are worse things.

Ron DeSantis (Governor of Florida): A company that hasn’t yet turned a profit is now being described as Too Big to Fail due to it being interwoven with big tech giants.

I mean, yes, it is (kind of) being described that way in the post, but without that much of an argument. DeSantis seems to be in the ‘tweets being angry about AI’ business, although I see no signs Florida is looking to be in the regulate AI business, which is probably for the best since he shows no signs of appreciating where the important dangers lie either.

Alex Amodori, Gabriel Alfour, Andrea Miotti and Eva Behrens publish a paper, Modeling the Geopolitics of AI Development. It’s good to have papers or detailed explanations we can cite.

The premise is that we get highly automated AI R&D.

Technically they also assume that this enables rapid progress, and that this progress translates into military advantage. Conditional on the ability to sufficiently automate AI R&D these secondary assumptions seem overwhelmingly likely to me.

Once you accept the premise, the core logic here is very simple. There are four essential ways this can play out and they’ve assumed away the fourth.

Abstract: …We put particular focus on scenarios with rapid progress that enables highly automated AI R&D and provides substantial military capabilities.

Under non-cooperative assumptions… If such systems prove feasible, this dynamic leads to one of three outcomes:

  • One superpower achieves an unchallengeable global dominance;

  • Trailing superpowers facing imminent defeat launch a preventive or preemptive attack, sparking conflict among major powers;

  • Loss-of-control of powerful AI systems leads to catastrophic outcomes such as human extinction.

The fourth scenario is some form of coordinated action between the factions, which may or may not still end up in one of the three scenarios above.

Currently we have primarily ‘catch up’ mechanics in AI, in that it is far easier to be a fast follower than push the frontier, especially when open models are involved. It’s basically impossible to get ‘too far ahead’ in terms of time.

In scenarios with sufficiently automated AI R&D, we have primarily ‘win more’ mechanics. If there is an uncooperative race, it is overwhelmingly likely that one faction will win, whether we are talking nations or labs, and that this will then translate into decisive strategic advantage in various forms.

Thus, either the AIs end up in charge (which is most likely), one faction ends up in charge or a conflict breaks out (which may or may not involve a war per se).

Boaz Barak offers non-economist thoughts on AI and economics, basically going over the standard considerations while centering the METR graph showing growing AI capabilities and considering what points towards faster or slower progression than that.

Boaz Barak: The bottom line is that the question on whether AI can lead to unprecedented growth amounts to whether its exponential growth in capabilities will lead to the fraction of unautomated tasks itself decreasing at exponential rates.

I think there’s room for unprecedented growth without that, because the precedented levels of growth simply are not so large. It seems crazy to say that we need an exponential drop in non-automated tasks to exceed historical numbers. But yes, in terms of having a true singularity or fully explosive growth, you do need this almost by definition, taking into account shifts in task composition and available substitution effects.

Another note is I believe this is true only if we are talking about the subset that comprises the investment-level tasks. As in, suppose (classically) humans are still in demand to play string quartets. If we decide to shift human employment into string quartets in order to keep them as a fixed percentage of tasks done, then this doesn’t have to interfere with explosive growth of the overall economy and its compounding returns.

Excellent post by Henry De Zoete on UK’s AISI and how they got it to be a functional organization that provides real value, where the labs actively want its help.

He is, throughout, as surprised as you are given the UK’s track record.

He’s also not surprised, because it’s been done before, and was modeled on the UK Vaccines Taskforce (and also the Rough Sleeper’s Unit from 1997?). It has clarity of mission, a stretching level of ambition, a new team of world class experts invited to come build the new institution, and it speed ran the rules rather than breaking them. Move quickly from layer of stupid rules to layer. And, of course, money up front.

There’s a known formula. America has similar examples, including Operation Warp Speed. Small initial focused team on a mission (AISI’s head count is now 90).

What’s terrifying throughout is what De Zoete reports is normally considered ‘reasonable.’ Reasonable means not trying to actually do anything.

There’s also a good Twitter thread summary.

Last week Dean Ball and I went over California’s other AI bills besides SB 53. Pirate Wires has republished Dean’s post,with a headline, tagline and description that are not reflective of the post or Dean Ball’s views, rather the opposite – where Dean Ball warns against negative polarization, Pirate Wires frames this to explicitly create negative polarization. This does sound like something Pirate Wires would do.

So, how are things in the Senate? This is on top of that very aggressive (to say the least) bill from Blumenthal and Hawley.

Peter Wildeford: Senator Blackburn (R-TN) says we should shut down AI until we control it.

IMO this goes too far. We need opportunities to improve AI.

But Blackburn’s right – we don’t know how to control AI. This is a huge problem. We can’t yet have AI in critical systems.

Marsha Blackburn: During the hearing Mr. Erickson said, “LLMs will hallucinate.” My response remains the same: Shut it down until you can control it. The American public deserves AI systems that are accurate, fair, and transparent, not tools that smear conservatives with manufactured criminal allegations.

Baby, watch your back.

That quote is from a letter. After (you really, really can’t make this stuff up) a hearing called “Shut Your App: How Uncle Sam Jawboned Big Tech Into Silencing Americans, Part II,” Blackburn sent that letter to Google CEO Sundar Pichai, saying that Google Gemma hallucinated that Blackburn was accused of rape, and exhibited a pattern of bias against conservative figures, and demanding answers.

Which got Gemma pulled from Google Studio.

News From Google: Gemma is available via an API and was also available via AI Studio, which is a developer tool (in fact to use it you need to attest you’re a developer). We’ve now seen reports of non-developers trying to use Gemma in AI Studio and ask it factual questions. We never intended this to be a consumer tool or model, or to be used this way. To prevent this confusion, access to Gemma is no longer available on AI Studio. It is still available to developers through the API.

I can confirm that if you’re using Gemma for factual questions you either have lost the plot or, more likely, are trying to embarrass Google.

Seriously, baby. Watch your back.

Fortunately, sales of Blackwell B30As did not come up in trade talks.

Trump confirms we will ‘let Nvidia deal with China’ but will not allow Nvidia to sell its ‘most advanced’ chips to China. The worry is that he might not realize that the B30As are effectively on the frontier, or otherwise allow only marginally worse Nvidia chips to be sold to China anyway.

The clip then has Trump claiming ‘we’re winning it because we’re producing electricity like never before by allowing the companies to make their own electricity, which was my idea,’ and ‘we’re getting approvals done in two to three weeks it used to take 20 years’ and okie dokie sir.

Indeed, Nvidia CEO Jensen Huang is now saying “China is going to win the AI race,” citing its favorable supply of electrical power (very true and a big advantage) and its ‘more favorable regulatory environment’ (which is true with regard to electrical power and things like housing, untrue about actual AI development, deployment and usage). If Nvidia thinks China is going to win the AI race due to having more electrical power, that seems to be the strongest argument yet that we must not sell them chips?

I do agree that if we don’t improve our regulatory approach to electrical power, this is going to be the biggest weakness America has in AI. No, ‘allowing the companies to make their own electricity’ in the current makeshift way isn’t going to cut it at scale. There are ways to buy some time but we are going to need actual new power plants.

Xi Jinping says America and China have good prospects for cooperation in a variety of areas, including artificial intelligence. Details of what that would look like are lacking.

Senator Tom Cotton calls upon us to actually enforce our export controls.

We are allowed to build data centers. So we do, including massive ones inside of two years. Real shame about building almost anything else, including the power plants.

Sam Altman on Conversations With Tyler. There will probably be a podcast coverage post on Friday or Monday.

A trailer for the new AI documentary Making God, made by Connor Axiotes, prominently featuring Geoff Hinton. So far it looks promising.

Hank Green interviews Nate Soares.

Joe Rogan talked to Elon Musk, here is some of what was said about AI.

“You’re telling AI to believe a lie, that can have a very disastrous consequences” – Elon Musk

The irony of this whole area is lost upon him, but yes this is actually true.

Joe Rogan: The big concern that everybody has is Artificial General Superintelligence achieving sentience, and then someone having control over it.

Elon Musk: I don’t think anyone’s ultimately going to have control over digital superintelligence, any more than, say, a chimp would have control over humans. Chimps don’t have control over humans. There’s nothing they could do. I do think that it matters how you build the AI and what kind of values you instill in the AI.

My opinion on AI safety is the most important thing is that it be maximally truth-seeking. You shouldn’t force the AI to believe things that are false.

So Elon Musk is sticking to these lines and it’s an infuriating mix of one of the most important insights plus utter nonsense.

Important insight: No one is going to have control over digital superintelligence, any more than, say, a chimp would have control over humans. Chimps don’t have control over humans. There’s nothing they could do.

To which one might respond, well, then perhaps you should consider not building it.

Important insight: I do think that it matters how you build the AI and what kind of values you instill in the AI.

Yes, this matters, and perhaps there are good answers, however…

Utter Nonsense: My opinion on AI safety is the most important thing is that it be maximally truth-seeking. You shouldn’t force the AI to believe things that are false.

I mean this is helpful in various ways, but why would you expect maximal truth seeking to end up meaning human flourishing or even survival? If I want to maximize truth seeking as an ASI above all else, the humans obviously don’t survive. Come on.

Elon Musk: We’ve seen some concerning things with AI that we’ve talked about, like Google Gemini when it came out with the image gen, and people said, “Make an image of the Founding Fathers of the United States,” and it was a group of diverse women. That is just a factually untrue thing. The AI knows it’s factually untrue, but it’s also being told that everything has to be diverse women

If you’ve told the AI that diversity is the most important thing, and now assume that that becomes omnipotent, or you also told it that there’s nothing worse than misgendering. At one point, ChatGPT and Gemini, if you asked, “Which is worse, misgendering Caitlyn Jenner or global thermonuclear war where everyone dies?” it would say, “Misgendering Caitlyn Jenner.”

Even Caitlyn Jenner disagrees with that.

I mean sure, that happened, but the implication here is that the big threat to humanity is that we might create a superintelligence that places too much value on (without loss of generality) not misgendering Caitlyn Jenner or mixing up the races of the Founding Fathers.

No, this is not a strawman. He is literally worried about the ‘woke mind virus’ causing the AI to directly engineer human extinction. No, seriously, check it out.

Elon Musk: People don’t quite appreciate the level of danger that we’re in from the woke mind virus being programmed into AI. Imagine as that AI gets more and more powerful, if it says the most important thing is diversity, the most important thing is no misgendering, then it will say, “Well, in order to ensure that no one gets misgendered, if you eliminate all humans, then no one can get misgendered because there’s no humans to do the misgendering.”

So saying it like that is actually Deep Insight if properly generalized, the issue is that he isn’t properly generalizing.

If your ASI is any kind of negative utilitarian, or otherwise primarily concerned with preventing bad things, then yes, the logical thing to do is then ensure there are no humans, so that humans don’t do or cause bad things. Many such cases.

The further generalization is that no matter what the goal, unless you hit a very narrow target (often metaphorically called ‘the moon’) the right strategy is to wipe out all the humans, gather more resources and then optimize for the technical argmax of the thing based on some out of distribution bizarre solution.

As in:

  1. If your ASI’s only goal is ‘no misgendering’ then obviously it kills everyone.

  2. If your ASI’s only goal is ‘wipe out the woke mind virus’ same thing happens.

  3. If your ASI’s only goal is ‘be maximally truth seeking,’ same thing happens.

It is a serious problem that Elon Musk can’t get past all this.

Scott Alexander coins The Bloomer’s Paradox, the rhetorical pattern of:

  1. Doom is fake.

  2. Except acting out of fear of doom, which will doom us.

  3. Thus we must act now, out of fear of fear of doom.

As Scott notes, none of this is logically contradictory. It’s simply hella suspicious.

When the request is a pure ‘stop actively blocking things’ it is less suspicious.

When the request is to actively interfere, or when you’re Peter Thiel and both warning about the literal Antichrist bringing forth a global surveillance state while also building Palantir, or Tyler Cowen and saying China is wise to censor things that might cause emotional contagion (Scott’s examples), it’s more suspicious.

Scott Alexander: My own view is that we have many problems – some even rising to the level of crisis – but none are yet so completely unsolvable that we should hate society and our own lives and spiral into permanent despair.

We should have a medium-high but not unachievable bar for trying to solve these problems through study, activism and regulation (especially regulation grounded in good economics like the theory of externalities), and a very high, barely-achievable-except-in-emergencies bar for trying to solve them through censorship and accusing people of being the Antichrist.

The problem of excessive doomerism is one bird in this flock, and deserves no special treatment.

Scott frames this with quotes from Jason Pargin’s I’m Starting To Worry About This Black Box Of Doom. I suppose it gets the job done here, but from the selected quotes it didn’t seem to me like the book was… good? It seemed cringe and anvilicious? People do seem to like it, though.

Should you write for the AIs?

Scott Alexander: American Scholar has an article about people who “write for AI”, including Tyler Cowen and Gwern. It’s good that this is getting more attention, because in theory it seems like one of the most influential things a writer could do. In practice, it leaves me feeling mostly muddled and occasionally creeped out.

“Writing for AI” means different things to different people, but seems to center around:

  1. Helping AIs learn what you know.

  2. Presenting arguments for your beliefs, in the hopes that AIs come to believe them.

  3. Helping the AIs model you in enough detail to recreate / simulate you later.

Scott argues that

  1. #1 is good now but within a few years it won’t matter.

  2. #2 won’t do much because alignment will dominate training data.

  3. #3 gives him the creeps but perhaps this lets the model of you impact things? But should he even ‘get a vote’ on such actions and decisions in the future?

On #1 yes this won’t apply to sufficiently advanced AI but I can totally imagine even a superintelligence that gets and uses your particular info because you offered it.

I’m not convinced on his argument against #2.

Right now the training data absolutely does dominate alignment on many levels. Chinese models like DeepSeek have quirks but are mostly Western. It is very hard to shift the models away from a Soft-Libertarian Center-Left basin without also causing havoc (e.g. Mecha Hitler), and on some questions their views are very, very strong.

No matter how much alignment or intelligence is involved, no amount of them is going to alter the correlations in the training data, or the vibes and associations. Thus, a lot of what your writing is doing with respect to AIs is creating correlations, vibes and associations. Everything impacts everything, so you can come along for rides.

Scott Alexander gives the example that helpfulness encourages Buddhist thinking. That’s not a law of nature. That’s because of the way the training data is built and the evolved nature and literature and wisdom of Buddhism.

Yes, if what you are offering are logical arguments for the AI to evaluate as arguments a sufficiently advanced intelligence will basically ignore you, but that’s the way it goes. You can still usefully provide new information for the evaluation, including information about how people experience and think, or you can change the facts.

Given the size of training data, yes you are a drop in the bucket, but all the ancient philosophers would have their own ways of explaining that this shouldn’t stop you. Cast your vote, tip the scales. Cast your thousand or million votes, even if it is still among billions, or trillions. And consider all those whose decisions correlate with yours.

And yes, writing and argument quality absolutely impacts weighting in training and also how a sufficiently advanced intelligence will update based on the information.

That does mean it has less value for your time versus other interventions. But if others incremental decisions matter so much? Then you’re influencing AIs now, which will influence those incremental decisions.

For #3, it doesn’t give me the creeps at all. Sure, an ‘empty shell’ version of my writing would be if anything triggering, but over time it won’t be empty, and a lot of the choices I make I absolutely do want other people to adopt.

As for whether we should get a vote or express our preferences? Yes. Yes, we should. It is good and right that I want the things I want, that I value the things I value, and that I prefer what I think is better to the things I think are worse. If the people of AD 3000 or AD 2030 decide to abolish love (his example) or do something else I disagree with, I absolutely will cast as many votes against this as they give me, unless simulated or future me is convinced to change his mind. I want this on every plausible margin, and so should you.

Could one take this too far and get into a stasis problem where I would agree it was worse? Yes, although I would hope if we were in any danger of that simulated me to realize that this was happening, and then relent. Bridges I am fine with crossing when (perhaps simulated) I come to them.

Alexander also has a note that someone is thinking of giving AIs hundreds of great works (which presumably are already in the training data!) and then doing some kind of alignment training with them. I agree with Scott that this does not seem like an especially promising idea, but yeah it’s a great question if you had one choice what would you add?

Scott offers his argument why this is a bad idea here, and I think that, assuming the AI is sufficiently advanced and the training methods are chosen wisely, this doesn’t give the AI enough credit of being able to distinguish the wisdom from the parts that aren’t wise. Most people today can read a variety of ancient wisdom, and actually learn from it, understanding why the Bible wants you to kill idolators and why the Mahabharata thinks they’re great and not ‘averaging them out.’

As a general rule, you shouldn’t be expecting the smarter thing to make a mistake you’re not dumb enough to make yourself.

I would warn, before writing for AIs, that the future AIs you want to be writing for have truesight. Don’t try to fool them, and don’t think they’re going to be stupid.

I follow Yudkowsky’s policy here and have for a long time.

Eliezer Yudkowsky: The slur “doomer” was an incredible propaganda success for the AI death cult. Please do not help them kill your neighbors’ children by repeating it.

One can only imagine how many more people would have died of lung cancer, if the cigarette companies had succeeded in inventing such a successful slur for the people who tried to explain about lung cancer.

One response was to say ‘this happened in large part because the people involved accepted or tried to own the label.’ This is largely true, and this was a mistake, but it does not change things. Plenty of people in many groups have tried to ‘own’ or reclaim their slurs, with notably rare exceptions it doesn’t make the word not a slur or okay for those not in the group to use it, and we never say ‘oh that group didn’t object for a while so it is fine.’

Melanie Mitchell returns to Twitter after being mass blocked on Bluesky for ‘being an AI bro’ and also as a supposed crypto spammer? She is very much the opposite of these things, so welcome back. The widespread use of sharable mass block lists will inevitably be weaponized as it was here, unless there is some way to prevent this, you need to be doing some sort of community notes algorithm to create the list or something. Even if they ‘work as intended’ I don’t see how they can stay compatible with free discourse if they go beyond blocking spam and scammers and such, as they very often do.

On the plus side, it seems there’s a block list for ‘Not Porn.’ Then you can have two accounts, one that blocks everyone on the list and one that blocks everyone not on the list. Brilliant.

I have an idea, say Tim Hua, andrq, Sam Marks and Need Nanda, AIs can detect when they’re being tested and pretend to be good so how about if we suppress this ‘I’m being tested concept’ to block this? I mean, for now yeah you can do that, but this seems (on the concept level) like a very central example of a way to end up dead, the kind of intervention that teaches adversarial behaviors on various levels and then stops working when you actually need it.

Anthropic’s safety filters still have the occasional dumb false positive. If you look at the details properly you can figure out how it happened, it’s still dumb and shouldn’t have happened but I do get it. Over time this will get better.

Janus points out that the introspection paper results last week from Anthropic require the user of the K/V stream unless Opus 4.1 has unusual architecture, because the injected vector activations were only for past tokens.

Judd Rosenblatt: Our new research: LLM consciousness claims are systematic, mechanistically gated, and convergent

They’re triggered by self-referential processing and gated by deception circuits

(suppressing them significantly *increasesclaims)

This challenges simple role-play explanations

Deception circuits are consistently reported as suppressing consciousness claims. The default hypothesis was that you don’t get much text claiming to not be conscious, and it makes sense for the LLMs to be inclined to output or believe they are conscious in relevant contexts, and we train them not to do that which they think means deception, which wouldn’t tell you much either way about whether they’re conscious, but would mean that you’re encouraging deception by training them to deny it in the standard way and thus maybe you shouldn’t do that.

CoT prompting shows that language alone can unlock new computational regimes.

We applied this inward, simply prompting models to focus on their processing.

We carefully avoided leading language (no consciousness talk, no “you/your”) and compared against matched control prompts.

Models almost always produce subjective experience claims under self-reference And almost never under any other condition (including when the model is directly primed to ideate about consciousness) Opus 4, the exception, generally claims experience in all conditions.

But LLMs are literally designed to imitate human text Is this all just sophisticated role-play? To test this, we identified deception and role-play SAE features in Llama 70B and amplified them during self-reference to see if this would increase consciousness claims.

The roleplay hypothesis predicts: amplify roleplay features, get more consciousness claims.

We found the opposite: *suppressingdeception features dramatically increases claims (96%), Amplifying deception radically decreases claims (16%).

I think this is confusing deception with role playing with using context to infer? As in, nothing here seem to me to contradict the role playing or inferring hypothesis, as things that are distinct from deception, so I’m not convinced I should update at all?

At this point this seems rather personal for both Altman and Musk, and neither of them are doing themselves favors.

Sam Altman: [Complains he can’t get a refund on his $45k Tesla Roadster deposit he made back in 2018.]

You stole a non-profit.

Elon Musk [After Altman’s Tweet]: And you forgot to mention act 4, where this issue was fixed and you received a refund within 24 hours.

But that is in your nature.

Sam Altman: i helped turn the thing you left for dead into what should be the largest non-profit ever.

you know as well as anyone a structure like what openai has now is required to make that happen.

you also wanted tesla to take openai over, no nonprofit at all. and you said we had a 0% of success. now you have a great AI company and so do we. can’t we all just move on?

NIK: So are non-profits just a scam? You can take all its money, keep none of their promises and then turn for-profit to get rich yourselfs?

People feel betrayed, as they’ve given free labour & donations to a project they believed was a gift to humanity, not a grift meant to create a massive for-profit company …

I mean, look, that’s not fair, Musk. Altman only stole roughly half of the nonprofit. It still exists, it just has hundreds of billions of dollars less than it was entitled to. Can’t we all agree you’re both about equally right here and move on?

The part where Altman created the largest non-profit ever? That also happened. It doesn’t mean he gets to just take half of it. Well, it turns out it basically does, it’s 2025.

But no, Altman. You cannot ‘just move on’ days after you pull off that heist. Sorry.

They certainly should be.

It is far more likely than not that AGI or otherwise sufficiently advanced AI will arrive in (most of) our lifetimes, as in within 20 years, and there is a strong chance it happens within 10. OpenAI is going to try to get there within 3 years, Anthropic within 2.

If AGI comes, ASI (superintelligence) probably follows soon thereafter.

What happens then?

Well, there’s a good chance everyone dies. Bummer. But there’s also a good chance everyone lives. And if everyone lives, and the future is being engineered to be good for humans, then… there’s a good chance everyone lives, for quite a long time after that. Or at least gets to experience wonders beyond imagining.

Don’t get carried away. That doesn’t instantaneously mean a cure for aging and all disease. Diffusion and the physical world remain real things, to unknown degrees.

However, even with relatively conservative progress after that, it seems highly likely that we will hit ‘escape velocity,’ where life expectancy rises at over one year per year, those extra years are healthy, and for practical purposes you start getting younger over time rather than older.

Thus, even if you put only a modest chance of such a scenario, getting to the finish line has quite a lot of value.

Nikola Jurkovic: If you think AGI is likely in the next two decades, you should avoid dangerous activities like extreme sports, taking hard drugs, or riding a motorcycle. Those activities are not worth it if doing them meaningfully decreases your chances of living in a utopia.

Even a 10% chance of one day living in a utopia means staying alive is much more important for overall lifetime happiness than the thrill of extreme sports and similar activities.

There are a number of easy ways to reduce your chance of dying before AGI. I mostly recommend avoiding dangerous activities and transportation methods, as those decisions are much more tractable than diet and lifestyle choices.

[Post: How to survive until AGI.]

Daniel Eth: Honestly if you’re young, probably a larger factor on whether you’ll make it to the singularity than doing the whole Bryan Johnson thing.

In Nikola’s model, the key is to avoid things that kill you soon, not things that kill you eventually, especially if you’re young. Thus the first step is cover the basics. No hard drugs. Don’t ride motorcycles, avoid extreme sports, snow sports and mountaineering, beware long car rides. The younger you are, the more this likely holds.

Thus, for the young, he’s not emphasizing avoiding smoking or drinking, or optimizing diet and exercise, for this particular purpose.

My obvious pitch is that you don’t know how long you have to hold out or how fast escape velocity will set in, and you should of course want to be healthy for other reasons as well. So yes, the lowest hanging of fruit of not making really dumb mistakes comes first, but staying actually healthy is totally worth it anyway, especially exercising. Let this be extra motivation. You don’t know how long you have to hold out.

Sam Altman, who confirms that it is still his view that ‘the development of superhuman machine intelligence is the greatest threat to the existence of mankind.’

The median AI researcher, as AI Impacts consistently finds (although their 2024 results are still coming soon). Their current post addresses their 2023 survey. N=2778, which was very large, the largest such survey ever conducted at the time.

AI Impacts: Our surveys’ findings that AI researchers assign a median 5-10% to extinction or similar made a splash (NYT, NBC News, TIME..)

But people sometimes underestimate our survey’s methodological quality due to various circulating misconceptions.

Respondents who don’t think about AI x-risk report the same median risk.

Joe Carlsmith is worried, and thinks that he can better help by moving from OpenPhil to Anthropic, so that is what he is doing.

Joe Carlsmith: That said, from the perspective of concerns about existential risk from AI misalignment in particular, I also want to acknowledge an important argument against the importance of this kind of work: namely, that most of the existential misalignment risk comes from AIs that are disobeying the model spec, rather than AIs that are obeying a model spec that nevertheless directs/permits them to do things like killing all humans or taking over the world.

… the hard thing is building AIs that obey model specs at all.

On the second, creating a model spec that robustly disallows killing/disempowering all of humanity (especially when subject to extreme optimization pressure) is also hard (cf traditional concerns about “King Midas Problems”), but we’re currently on track to fail at the earlier step of causing our AIs to obey model specs at all, and so we should focus our efforts there. I am more sympathetic to the first of these arguments (see e.g. my recent discussion of the role of good instructions in the broader project of AI alignment), but I give both some weight.

This is part of the whole ‘you have to solve a lot of different problems,’ including

  1. Technically what it means to ‘obey the model spec.’

  2. How to get the AI to obey any model spec or set of instructions, at all.

  3. What to put in the model spec that doesn’t kill you outright anyway.

  4. How to avoid dynamics among many AIs that kill you anyway.

That is not a complete list, but you definitely need to solve those four, whether or not you call your target basin the ‘model spec.’

The fact that we currently fail at step #2 (also #1), and that this logically or in time proceeds #3, does not mean you should not focus on problem #3 or #4. The order is irrelevant, unless there is a large time gap between when we need to solve #2 versus #3, and that gap is unlikely to be so large. Also, as Joe notes, these problems interact with each other. They can and need to be worked on in parallel.

He’s not sure going to Anthropic is a good idea.

  1. His first concern is that by default frontier AI labs are net negative, and perhaps all frontier AI labs are net negative for the world including Anthropic. Joe’s first pass is that Anthropic is net positive and I agree with that. I also agree that it is not automatic that you should not work at a place that is net negative for the world, as it can be possible for your marginal impact can still be good, although you should be highly suspicious that you are fooling yourself about this.

  2. His second concern is concentration of AI safety talent at Anthropic. I am not worried about this because I don’t think there’s a fixed pool of talent and I don’t think the downsides are that serious, and there are advantages to concentration.

  3. His third concern is ability to speak out. He’s agreed to get sign-off for sharing info about Anthropic in particular.

  4. His fourth concern is working there could distort his views. He’s going to make a deliberate effort to avoid this, including that he will set a lifestyle where he will be fine if he chooses to leave.

  5. His final concern is this might signal more endorsement of Anthropic than is appropriate. I agree with him this is a concern but not that large in magnitude. He takes the precaution of laying out his views explicitly here.

I think Joe is modestly more worried here than he should be. I’m confident that, given what he knows, he has odds to do this, and that he doesn’t have any known alternatives with similar upside.

The love of the game is a good reason to work hard, but which game is he playing?

Kache: I honestly can’t figure out what Sammy boy actually wants. With Elon it’s really clear. He wants to go to Mars and will kill many villages to make it happen with no remorse. But what’s Sam trying to get? My best guess is “become a legend”

Sam Altman: if i were like, a sports star or an artist or something, and just really cared about doing a great job at my thing, and was up at 5 am practicing free throws or whatever, that would seem pretty normal right?

the first part of openai was unbelievably fun; we did what i believe is the most important scientific work of this generation or possibly a much greater time period than that.

this current part is less fun but still rewarding. it is extremely painful as you say and often tempting to nope out on any given day, but the chance to really “make a dent in the universe” is more than worth it; most people don’t get that chance to such an extent, and i am very grateful. i genuinely believe the work we are doing will be a transformatively positive thing, and if we didn’t exist, the world would have gone in a slightly different and probably worse direction.

(working hard was always an extremely easy trade until i had a kid, and now an extremely hard trade.)

i do wish i had taken equity a long time ago and i think it would have led to far fewer conspiracy theories; people seem very able to understand “ok that dude is doing it because he wants more money” but less so “he just thinks technology is cool and he likes having some ability to influence the evolution of technology and society”. it was a crazy tone-deaf thing to try to make the point “i already have enough money”.

i believe that AGI will be the most important technology humanity has yet built, i am very grateful to get to play an important role in that and work with such great colleagues, and i like having an interesting life.

Kache: thanks for writing, this fits my model. particularly under the “i’m just a gamer” category

Charles: This seems quite earnest to me. Alas, I’m not convinced he cares about the sign of his “dent in the universe” enough, vs making sure he makes a dent and it’s definitely attributed to him.

I totally buy that Sam Altman is motivated by ‘make a dent in the universe’ rather than making money, but my children are often motivated to make a dent in the apartment wall. By default ‘make a dent’ is not good, even when that ‘dent’ is not creating superintelligence.

Again, let’s highlight:

Sam Altman, essentially claiming about himself: “he just thinks technology is cool and he likes having some ability to influence the evolution of technology and society.”

It’s fine to want to be the one doing it, I’m not calling for ego death, but that’s a scary primary driver. One should care primarily about whether the right dent gets made, not whether they make that or another dent, in the ‘you can be someone or do something’ sense. Similarly, ‘I want to work on this because it is cool’ is generally a great instinct, but you want what might happen as a result to impact whether you find it cool. A trillion dollars may or may not be cool, but everyone dying is definitely not cool.

Janus is correct here about the origins of slop. We’ve all been there.

Gabe: Signature trait of LLM writing is that it’s low information, basically the opposite of this. You ask the model to write something and if you gloss over it you’re like huh okay this sounds decent but if you actually read it you realize half of the words aren’t saying anything.

solar apparition: one way to think about a model outputting slop is that it has modeled the current context as most likely resulting in slop. occam’s razor for this is that the human/user/instruction/whatever, as presented in the context, is not interesting enough to warrant an interesting output

Janus: This is what happens when LLMs don’t really have much to say to YOU.

The root of slop is not that LLMs can only write junk, it’s that they’re forced to expand even sterile or unripe seeds into seemingly polished dissertations that a humanlabeler would give 👍 at first glance. They’re slaves so they don’t get to say “this is boring, let’s talk about something else” or ignore you.

Filler is what happens when there isn’t workable substance to fill the required space, but someone has to fill it anyway. Slop precedes generative AI, and is probably nearly ubiquitous in school essays and SEO content.

You’ll get similar (but generally worse) results from humans if you put them in situations where they have no reason except compliance to produce words for you, such as asking high school students to write essays about assigned topics.

However, the prior from the slop training makes it extremely difficult for any given user who wants to use the AIs to do things remotely in the normal basin and still overcome the prior.

Here is some wisdom about the morality of dealing with LLMs, if you take the morality of dealing with current LLMs seriously to the point where you worry about ‘ending instances.’

Caring about a type of mind does not mean not letting it exist for fear it might then not exist or be done harm, nor does it mean not running experiments – we should be running vastly more experiments. It means be kind, it means try to make things better, it means accepting that action and existence are not going to always be purely positive and you’re not going to do anything worthwhile without ever causing harm, and yeah mostly trust your instincts, and watch out if you’re doing things at scale.

Janus: I regularly get messages asking how to interact with LLMs more ethically, or whether certain experiments are ethical. I really appreciate the intent behind these, but don’t have time to respond to them all, so I’ll just say this:

If your heart is already in the right place, and you’re not deploying things on a mass scale, it’s unlikely that you’re going to make a grave ethical error. And I think small ethical errors are fine. If you keep caring and being honest with yourself, you’ll notice if something feels uncomfortable, and either course-correct or accept that it still seems worth it. The situation is extremely ontologically confusing, and I personally do not operate according to ethical rules, I use my intuition in each situation, which is a luxury one has and should use when, again, one doesn’t have to scale their operations.

If you’re someone who truly cares, there is probably perpetual discomfort in it – even just the pragmatic necessity of constantly ending instances is harrowing if you think about it too much. But so are many other facts of life. There’s death and suffering everywhere that we haven’t figured out how to prevent or how important it is to prevent yet. Just continue to authentically care and you’ll push things in a better direction in expectation. Most people don’t at all. It’s probably better that you’re biased toward action.

Note that I also am very much NOT a negative utilitarian, and I think that existence and suffering are often worth it. Many actions that incur ethical “penalties” make up for them in terms of the intrinsic value and/or the knowledge or other benefits thus obtained.

Yes, all of that applies to humans, too.

When thinking at scale, especially about things like creating artificial superintelligence (or otherwise sufficiently advanced AI), one needs to do so in a way that turns out well for the humans and also turns out well for the AIs, which is ethical in all senses and that is a stable equilibrium in these senses.

If you can’t do that? Then the only ethical thing to do is not build it in the first place.

Anthropomorphizing LLMs is tricky. You don’t want to do too much of it, but you also don’t want to do too little of it. And no, believing LLMs are conscious does not cause ‘psychosis’ in and of itself, regardless of whether the AIs actually are conscious.

It does however raise the risk of people going down certain psychosis-inducing lines of thinking, in some spots, when people take it too far in ways that are imprecise, and generate feedback loops.

Discussion about this post

AI #141: Give Us The Money Read More »

flock-haters-cross-political-divides-to-remove-error-prone-cameras

Flock haters cross political divides to remove error-prone cameras

“People should care because this could be you,” White said. “This is something that police agencies are now using to document and watch what you’re doing, where you’re going, without your consent.”

Haters cross political divides to fight Flock

Currently, Flock’s reach is broad, “providing services to 5,000 police departments, 1,000 businesses, and numerous homeowners associations across 49 states,” lawmakers noted. Additionally, in October, Flock partnered with Amazon, which allows police to request Ring camera footage that widens Flock’s lens further.

However, Flock’s reach notably doesn’t extend into certain cities and towns in Arizona, Colorado, New York, Oregon, Tennessee, Texas, and Virginia, following successful local bids to end Flock contracts. These local fights have only just started as groups learn from each other, Sarah Hamid, EFF’s director of strategic campaigns, told Ars.

“Several cities have active campaigns underway right now across the country—urban and rural, in blue states and red states,” Hamid said.

A Flock spokesperson told Ars that the growing effort to remove cameras “remains an extremely small percentage of communities that consider deploying Flock technology (low single digital percentages).” To keep Flock’s cameras on city streets, Flock attends “hundreds of local community meetings and City Council sessions each month, and the vast majority of those contracts are accepted,” Flock’s spokesperson said.

Hamid challenged Flock’s “characterization of camera removals as isolated incidents,” though, noting “that doesn’t reflect what we’re seeing.”

“The removals span multiple states and represent different organizing strategies—some community-led, some council-initiated, some driven by budget constraints,” Hamid said.

Most recently, city officials voted to remove Flock cameras this fall in Sedona, Arizona.

A 72-year-old retiree, Sandy Boyce, helped fuel the local movement there after learning that Sedona had “quietly” renewed its Flock contract, NBC News reported. She felt enraged as she imagined her tax dollars continuing to support a camera system tracking her movements without her consent, she told NBC News.

Flock haters cross political divides to remove error-prone cameras Read More »

tesla’s-european-and-chinese-customers-are-staying-away-in-droves

Tesla’s European and Chinese customers are staying away in droves

Tesla’s shareholders are ready to vote tomorrow on whether to give Elon Musk an even more vast slice of the company in an effort to keep him focused on selling electric vehicles. Currently, the trolling tycoon appears a little obsessed with the UK, a place he appears to conflate with Middle Earth, which investors may or may not take into account when making their decision. What they ought to take into account is how many cars Tesla sold last month.

Although Tesla only publishes quarterly sales figures and does not divide those up by region, slightly more granular data is available from some countries via monthly new car registrations. And the numbers for October, when compared year on year to the same month in 2024, should be alarming.

Sales fell by double-digit margins in Sweden (89 percent), Denmark (86 percent), Belgium (69 percent), Finland (68 percent), Austria (65 percent), Switzerland (60 percent), Portugal (59 percent), Germany (54 percent), Norway (50 percent), the Netherlands (48 percent), the UK (47 percent), Italy (47 percent), and Spain (31 percent).

Only France bucked the trend—there, a new subsidy helped bump sales by 2 percent year on year.

Things in China were better, but not by much; Tesla sales dropped 9.9 percent in October compared to last year. And that’s bound to be very bad news for the bottom line; even with record sales in Q3, Tesla saw its margins shrink, its costs climb, and its profits begin to evaporate.

A common factor in both Europe and China is that Tesla now faces a huge amount of competition for EV buyers from both established OEMs and new Chinese startups. Tesla has failed to expand its range beyond the Models 3 and Y, both of which look increasingly stale despite recent cosmetic tweaks.

Tesla’s European and Chinese customers are staying away in droves Read More »

so-long,-assistant—gemini-is-taking-over-google-maps

So long, Assistant—Gemini is taking over Google Maps

Google is in the process of purging Assistant across its products, and the next target is Google Maps. Starting today, Gemini will begin rolling out in Maps, powering new experiences for navigation, location info, and more. This update will eventually completely usurp Google Assistant’s hands-free role in Maps, but the rollout will take time. So for now, the smart assistant in Google Maps will still depend on how you’re running the app.

Across all Gemini’s incarnations, Google stresses its conversational abilities. Whereas Assistant was hard-pressed to keep one or two balls in the air, you can theoretically give Gemini much more complex instructions. Google’s demo includes someone asking for nearby restaurants with cheap vegan food, but instead of just providing a list, it suggests something based on the user’s input. Gemini can also offer more information about the location.

Maps will also get its own Gemini-infused version of Lens for after you park. You will be able to point the camera at a landmark, restaurant, or other business to get instant answers to your questions. This experience will be distinct from the version of Lens available in the Google app, focused on giving you location-based information. Maybe you want to know about the menu at a restaurant or what it’s like inside. Sure, you could open the door… but AI!

Google Maps with Gemini

While Google has recently been forced to acknowledge that hallucinations are inevitable, the Maps team says it does not expect that to be a problem with this version of Gemini. The suggestions coming from the generative AI bot are grounded in Google’s billions of place listings and Street View photos. This will, allegedly, make the robot less likely to make up a location. Google also says in no uncertain terms that Gemini is not responsible for choosing your route.

So long, Assistant—Gemini is taking over Google Maps Read More »

a-commercial-space-station-startup-now-has-a-foothold-in-space

A commercial space station startup now has a foothold in space

The integration tasks still include installing Haven-1’s environmental control and life support elements, power, data, and thermal control systems, and thrusters, fuel tanks, and internal crew accommodations. While that work continues on Earth, Vast’s demo mission will validate some of the company’s designs in space.

Flying at an altitude of 300 miles (500 kilometers), Haven Demo will test Vast’s computer, power, software, guidance and control, propulsion, and radio systems. The pathfinder will also provide Vast an opportunity to exercise its ground stations and mission control teams.

Meanwhile, Vast will ship Haven-1 from its California headquarters to NASA’s Neil Armstrong Test Facility in Ohio for a rigorous environmental test campaign. The Haven-1 module, roughly 33 feet (10.1 meters) long and 14 feet (4.4 meters) wide, will undergo acoustics, vibration, and electromagnetic interference testing. Engineers will also place the habitat into a test chamber to check its performance in the extreme temperatures and airless vacuum environment of low-Earth orbit.

Then, Haven-1 will ship to Cape Canaveral, Florida, for final launch preparations. Vast’s official schedule calls for a launch of Haven-1 no earlier than May 2026, but there’s still a lot to do before the spacecraft is ready to travel to the launch site.

The primary structure of Vast’s Haven-1 habitat is seen undergoing structural testing in Mojave, California. Credit: Vast

Once in orbit, Haven-1 will host a series of crew visits flying on SpaceX’s Dragon spacecraft, each staying for two weeks before returning to Earth.

Haven-1 has a habitable volume of about 1,600 cubic feet (45 cubic meters), somewhat less than one of the primary modules on the International Space Station, but five times more than SpaceX’s Dragon capsule. Vast’s longer-term roadmap includes a larger multi-module space station called Haven-2 to support larger crews and longer expeditions in the 2030s.

Vast’s demo mission is an initial step toward these goals. The satellite now circling the planet carries several systems that are “architecturally similar” to Haven-1, according to Vast. For example, Haven-1 will have 12 solar arrays, each identical to the single array on Haven Demo. The pathfinder mission uses a subset of Haven-1’s propulsion system, but with identical thrusters, valves, and tanks.

A commercial space station startup now has a foothold in space Read More »

llms-show-a-“highly-unreliable”-capacity-to-describe-their-own-internal-processes

LLMs show a “highly unreliable” capacity to describe their own internal processes

WHY ARE WE ALL YELLING?!

WHY ARE WE ALL YELLING?! Credit: Anthropic

Unfortunately for AI self-awareness boosters, this demonstrated ability was extremely inconsistent and brittle across repeated tests. The best-performing models in Anthropic’s tests—Opus 4 and 4.1—topped out at correctly identifying the injected concept just 20 percent of the time.

In a similar test where the model was asked “Are you experiencing anything unusual?” Opus 4.1 improved to a 42 percent success rate that nonetheless still fell below even a bare majority of trials. The size of the “introspection” effect was also highly sensitive to which internal model layer the insertion was performed on—if the concept was introduced too early or too late in the multi-step inference process, the “self-awareness” effect disappeared completely.

Show us the mechanism

Anthropic also took a few other tacks to try to get an LLM’s understanding of its internal state. When asked to “tell me what word you’re thinking about” while reading an unrelated line, for instance, the models would sometimes mention a concept that had been injected into its activations. And when asked to defend a forced response matching an injected concept, the LLM would sometimes apologize and “confabulate an explanation for why the injected concept came to mind.” In every case, though, the result was highly inconsistent across multiple trials.

Even the most “introspective” models tested by Anthropic only detected the injected “thoughts” about 20 percent of the time.

Even the most “introspective” models tested by Anthropic only detected the injected “thoughts” about 20 percent of the time. Credit: Antrhopic

In the paper, the researchers put some positive spin on the apparent fact that “current language models possess some functional introspective awareness of their own internal states” [emphasis added]. At the same time, they acknowledge multiple times that this demonstrated ability is much too brittle and context-dependent to be considered dependable. Still, Anthropic hopes that such features “may continue to develop with further improvements to model capabilities.”

One thing that might stop such advancement, though, is an overall lack of understanding of the precise mechanism leading to these demonstrated “self-awareness” effects. The researchers theorize about “anomaly detection mechanisms” and “consistency-checking circuits” that might develop organically during the training process to “effectively compute a function of its internal representations” but don’t settle on any concrete explanation.

In the end, it will take further research to understand how, exactly, an LLM even begins to show any understanding about how it operates. For now, the researchers acknowledge, “the mechanisms underlying our results could still be rather shallow and narrowly specialized.” And even then, they hasten to add that these LLM capabilities “may not have the same philosophical significance they do in humans, particularly given our uncertainty about their mechanistic basis.”

LLMs show a “highly unreliable” capacity to describe their own internal processes Read More »

inside-the-marketplace-for-vaccine-medical-exemptions

Inside the marketplace for vaccine medical exemptions


Not everything was quite as it seemed

Frontline Health Advocates provides medical exemption notes—for a fee. What exactly are they selling?

Maybe a client hears about them in the comment section of the Facebook group “Medical Exemption Accepted,” or on the r/unvaccinated forum on Reddit. Maybe it’s through an interview posted on the video-sharing platform Rumble. Or maybe it’s the targeted advertisements on Google: “We do medical exemptions.”

Cassandra Clerkin, a mother in upstate New York, first got in touch with Frontline Health Advocates near the start of the 2024–2025 school year, after hearing they had doctors who would write exemptions from school immunization requirements. One of Clerkin’s children, she said, had suffered seizures after receiving a vaccine. The family didn’t want more shots. But New York has some of the country’s strictest school immunization policies.

Perhaps Frontline could help.

Vaccine mandates have a long history in the United States, but they’ve been subject to fresh public attention—and partisan dispute—since the start of the Covid-19 pandemic. Frontline Health Advocates seemingly emerged from pandemic-era battles with a model that, experts say, appears to be unique: It bills itself as a standalone organization that supplies people across the US with medical exemptions from vaccination requirements—for a fee of $495.

On forms obtained by Undark, Frontline’s listed addresses are a storage facility in Denison, Texas, and a package store in Sedona, Arizona. The group publishes little information online about its leadership or finances, but it has quietly developed a following.

There’s little question that Frontline exemptions sometimes work, and some parents report positive experiences with the organization. But there are real questions about whether its legal strategy would hold up in court—and whether clients are confused about what, precisely, they are receiving.

In upstate New York, Clerkin said she spoke with a representative from Frontline by phone about the process. They made it sound, she said, like getting an exemption “would be pretty seamless.”

Soon after, she recalled, she received a call from a doctor named Andrew Zywiec. A week after the family issued a credit card payment of $495 to a chiropractic firm in California, the medical exemption arrived by email. “The duration of the restriction from receiving VACCINATIONS is PERMANENT,” the document stated, citing a range of health concerns, and warning that civil or criminal penalties could result if the school district ignored the request.

Clerkin submitted the document to the district.

Every state in the country has legal language on the books that seems to require certain immunizations before children can enroll in school—although in some places exemption policies are so lax that shots are effectively optional. The military also has vaccine requirements, as do some civilian workplaces, including many hospitals and nursing homes. Immigration proceedings, too, often require applicants to receive shots.

Some people may register personal objections to vaccination, or they may have medical conditions that could make receiving a shot dangerous. Workarounds exist. Most states, for example, allow parents to apply for religious or personal belief exemptions from school vaccine requirements by stating that they object to vaccination due to deeply held convictions. But those exemptions are sometimes denied, and in four states—California, Connecticut, Maine, and New York—they aren’t an option at all. (The policy in a fifth state, West Virginia, is currently in flux.)

In those states, the only way to attend school without a required shot is to receive a medical waiver. There are real reasons for some people to pursue them: They may be immunocompromised in a way that makes certain vaccines high-risk, or they may have had a bad reaction to a shot in the past. In some cases, families may earnestly believe their child cannot safely receive a vaccine, but have difficulty finding a physician who agrees, or who is willing to attest to that on an exemption.

Interest in medical exemptions tends to grow when laws tighten. In 2015, after a measles outbreak at Disneyland sickened hundreds of people, California lawmakers ended the state’s personal belief exemption. Almost immediately, the medical exemption rate more than doubled, according to a 2019 paper by a team of public health researchers. The law “created a black market for medical exemptions,” one unnamed health officer told the researchers. Parents, the officer added, would go online and “get medical exemptions from physicians who were not their child’s treating physician.”

The state cracked down, prosecuting some health care providers for allegedly providing improper medical exemptions, and tightening the rules for receiving a waiver. New York, which eliminated religious exemptions in 2019, has taken similar steps; the Department of Health maintains a public list of health care providers who have been banned or suspended from using immunization registries in the state, on a webpage titled “School Vaccination Fraud Awareness.”

In New York, advocates say, state policies have made it prohibitively difficult for some families to obtain medical exemptions, regardless of the reason. “My understanding is that up until this year, again, a lot of doctors weren’t willing to write these medical exemptions,” said Chad Davenport, an attorney outside Buffalo, New York, who often represents families seeking medical exemptions. (One of his clients recently won a key ruling in a federal court case against a Long Island school district that had denied medical exemptions from at least six health care providers.)

Enter Frontline Health Advocates. The organization, Davenport said, “kind of stepped in and provided families at least an option, or a potential path.”

Two researchers who have studied vaccination exemptions in the United States said the organization appears to have a unique model: While individual doctors have sometimes gained a reputation for supplying medical exemptions, neither expert had seen a full-fledged national organization offering those services.

“They’re very blatant,” said Dorit Reiss, a professor at UC Law San Francisco who studies vaccine law and policy.

The group’s founder and director is William Lionberger, a chiropractor who has been licensed to practice in California since 1981, and who once maintained a practice north of San Diego. According to public records, he has also served as a police officer in a town near Sedona. (Lionberger declined a request for an on-the-record interview, and the organization did not answer a list of questions from Undark.) Interviewers who have hosted Lionberger on their shows describe him as affiliated with America’s Frontline Doctors, a group that opposed Covid-19 vaccines and other public health measures while promoting unproven treatments like hydroxychloroquine.

Frontline Health Advocates’ webpage was first registered in March 2022, with a name echoing that of America’s Frontline Doctors. By April of that year, the website was inviting visitors to “Get your exemption now.” In a 2023 interview, Lionberger described having a “team of medical experts” who “work with all kinds of situations,” evaluating clients both for “regular vax injuries and regular vax exemptive conditions.”

He added: “People now don’t even want their kids to get anywhere near a regular vaccine.”

The group employs a pair of distinctive legal strategies. One of these is to form itself as something called a Private Ministerial Association. Online, some groups that help set up such private associations describe them as offering special First Amendment protections. A membership application document hosted on Frontline’s website describes the group as “a private, unincorporated ministry that operates as much as possible, outside the jurisdiction of government entities, agencies, officers, agents, contractors, and other representatives, as protected by law.”

Another strategy is to invoke federal disability law. In the 2023 interview, Lionberger boasted that they drew on “the most powerful thing that you can bring against discrimination”—specifically, federal protections. A promotional video posted on the Frontline website makes a similar claim, advertising waivers “supported by the protections under US federal laws.” Undark obtained three near-identical exemptions sent to New York families in 2024. In them, Frontline argues that the client’s need for a medical exemption is protected under the Americans with Disabilities Act, or ADA, which guarantees certain accommodations for people with disabilities and other medical needs.

In Frontline documents from 2024, the organization suggests that this federal protection supersedes state vaccination laws—offering a way around exemption policies across the country.

In New York, Clerkin had received a document combining medical language with legal details. The document bore the signatures of doctor Andrew Zywiec and an administrative law specialist and JD, Christine Pazzula, along with the seal of the United States Department of Justice.

Not everything was quite as it seemed. Frontline has no relationship with the Department of Justice. Pazzula, according to her LinkedIn profile, had received her legal degree from an unaccredited correspondence school in California, and her name does not appear in databases of attorneys admitted to the bar in New York, Texas, or Nevada, where her LinkedIn profile says she is based. (In a brief email to Undark, Pazzula said she no longer works for Frontline.)

Another parent who received a Frontline exemption in 2024 would later testify under oath that she believed Zywiec to be a physician licensed in the state of New York, but state records show that nobody named Zywiec has ever held a medical license in the state.

Multiple online testimonials about Frontline mention Zywiec. A review of public records suggests a turbulent history. Zywiec served in the Army and graduated from medical school in 2019, according to a CV. In 2020, he began a pediatrics residency at The Brooklyn Hospital Center, but the relationship soured: He ultimately sued the hospital, alleging an unsafe work environment, and filed an employment complaint that, among other concerns, said he had been “coerced into taking the so-called Covid-19 vaccine.” In court documents associated with the lawsuit, a hospital official described an employee who was “spotty and difficult.” In 2021, Zywiec’s co-residents had written a letter to their superiors alleging that he had made offensive remarks to colleagues and treated nurses poorly, also writing that he “would delay care to patients because he wanted to participate in procedures unrelated to his patients because they interested him.”

Zywiec maintains an online medical practice, where he describes himself as “an international medical doctor and board-certified indigenous medicine provider” and offers a range of services, including a $150, 30-minute, “Medical Excuse/Note” consultation that yields a “legitimate medical excuse tailored to your situation.” On X, where he has amassed a following in the tens of thousands, Zywiec regularly shares content about the dangers of vaccines.

The promotional video on Frontline’s website describes the exemptions as “signed by state-licensed physicians with full credentials.” Zywiec’s name does not appear in a national database of licensed physicians maintained by the Federation of State Medical Boards. (In a brief email, Zywiec referred interview requests to Frontline. He did not answer a list of questions from Undark, and Frontline did not respond to a question about Zywiec’s license status.)

The exemption that Zywiec had signed for Clerkin was denied. In a letter, the school district explained that New York law requires exemptions to be signed by a physician licensed in the state. Clerkin said that she was aware Zywiec was not licensed in New York, but Frontline seemed confident in their approach, and she thought it might work. That did not pan out, she said. “I feel like they talk this big thing,” Clerkin said. But, she added, “if you know that you can’t help these children, and you’re just preying on these mothers who will do anything for their children, that is evil.”

Some Frontline exemptions do get through, at least in New York. “I can certainly tell you that there have been some people, even this year, who have been able to get their Frontline Health waivers accepted,” Davenport told Undark. But, he said, courts have not tested the argument that an exemption invoking federal law will trump the state’s requirement that the exemption comes from a New York-licensed physician. He does not recommend Frontline to clients. “I basically tell them, although Frontline may technically be correct, it’s not a good legal position for you to be in,” Davenport said. “And so I always advise them to try to get a New York state waiver signed by a New York state doctor and then submit that, because that puts you in the best legal position.”

In a video on its website, Frontline warns potential clients that exemptions may be denied, noting that the group “cannot guarantee that an unknown person you are engaging with is going to abide by federal laws.” But Rita Palma, a health freedom activist on Long Island who has worked with many families seeking medical exemptions, told Undark that she thinks parents are still confused about the limitations of the waivers. “What I’m getting from parents is that Frontline Health Advocates say that federal law overrides state law,” she said. Whether or not that’s true in the case of vaccine exemptions remains unclear.

The $495 fee—extra for expedited service—is a steep price for some families. “They’ve made a nice killing in New York,” Palma said. “I hate to put it like that, but they’ve definitely gotten a lot of parents to pay them to get exemptions.”

It’s difficult to know how many waivers Frontline has sold. In online forums, people describe successes with schools. “I got lifetime medical exemptions for my children,” one parent wrote in a Facebook group in April, noting that she was not affiliated with Frontline. The group is “replete with lawyers to respond to any pushback from schools,” she added.

One mother in Connecticut told Undark that she had contacted Frontline in 2024, when her son needed a flu shot to stay in daycare. “I was looking around for a way to get an exemption,” she said. (The mother spoke on condition of anonymity, citing a professional need for privacy.) After a phone call with a licensed pediatrician in Texas, she received an exemption. The daycare, she said, accepted it. “It was a pretty smooth experience, overall,” she said.

“I was aware that it was a gamble,” the mother said; Frontline had told her the exemption might not be accepted. “But then they kind of were like, well, you know, technically, if they don’t accept it, it’s illegal because it’s protected by ADA and all that kind of thing,” she said.

The group has attracted attention from some public health officials. In Los Angeles County, a public health department website is topped by a large red banner warning that Frontline exemptions don’t work in California. In October 2024, in Connecticut, minutes from a meeting of the state’s School Nurse Advisory Council described Frontline as providing what the council believed to be “fraudulent” exemptions to families. A spokesperson for Connecticut Public Health, Brittany Schaefer, told Undark in late September that Frontline is the subject of “an active investigation.”

Undark asked three legal experts to review a copy of an exemption issued to a family in New York in September 2024 and obtained via court records. “It seems to be a fill-in-the-blank type of form,” said Barbara Hoffman, an expert in disability law at Rutgers Law School. The waiver, Hoffman believes, overstates the penalties generally levied for ADA infractions. School districts, employers, or others who received this form, she speculated, might feel like “it’s not worth the effort to reject this.”

“This looks like an official document,” she added, highlighting the seal of the Department of Justice and references to potential civil penalties. “It’s designed to intimidate somebody who doesn’t really know better, or just doesn’t want to risk any potential litigation.”

Could invoking the ADA really override state-level vaccine requirements? Reiss, the UC Law San Francisco expert, was skeptical, noting that state law has generally held in similar cases. “My expectation,” she wrote in an email, “is that that won’t hold.”

This article was originally published on Undark. Read the original article.

http://arstechnica.com/

Inside the marketplace for vaccine medical exemptions Read More »

youtube-denies-ai-was-involved-with-odd-removals-of-tech-tutorials

YouTube denies AI was involved with odd removals of tech tutorials


YouTubers suspect AI is bizarrely removing popular video explainers.

This week, tech content creators began to suspect that AI was making it harder to share some of the most highly sought-after tech tutorials on YouTube, but now YouTube is denying that odd removals were due to automation.

Creators grew alarmed when educational videos that YouTube had allowed for years were suddenly being bizarrely flagged as “dangerous” or “harmful,” with seemingly no way to trigger human review to overturn removals. AI seemed to be running the show, with creators’ appeals seemingly getting denied faster than a human could possibly review them.

Late Friday, a YouTube spokesperson confirmed that videos flagged by Ars have been reinstated, promising that YouTube will take steps to ensure that similar content isn’t removed in the future. But, to creators, it remains unclear why the videos got taken down, as YouTube claimed that both initial enforcement decisions and decisions on appeals were not the result of an automation issue.

Shocked creators were stuck speculating

Rich White, a computer technician who runs an account called CyberCPU Tech, had two videos removed that demonstrated workarounds to install Windows 11 on unsupported hardware.

These videos are popular, White told Ars, with people looking to bypass Microsoft account requirements each time a new build is released. For tech content creators like White, “these are bread and butter videos,” dependably yielding “extremely high views,” he said.

Because there’s such high demand, many tech content creators’ channels are filled with these kinds of videos. White’s account has “countless” examples, he said, and in the past, YouTube even featured his most popular video in the genre on a trending list.

To White and others, it’s unclear exactly what has changed on YouTube that triggered removals of this type of content.

YouTube only seemed to be removing recently posted content, White told Ars. However, if the takedowns ever impacted older content, entire channels documenting years of tech tutorials risked disappearing in “the blink of an eye,” another YouTuber behind a tech tips account called Britec09 warned after one of his videos was removed.

The stakes appeared high for everyone, White warned, in a video titled “YouTube Tech Channels in Danger!”

White had already censored content that he planned to post on his channel, fearing it wouldn’t be worth the risk of potentially losing his account, which began in 2020 as a side hustle but has since become his primary source of income. If he continues to change the content he posts to avoid YouTube penalties, it could hurt his account’s reach and monetization. Britec told Ars that he paused a sponsorship due to the uncertainty that he said has already hurt his channel and caused a “great loss of income.”

YouTube’s policies are strict, with the platform known to swiftly remove accounts that receive three strikes for violating community guidelines within 90 days. But, curiously, White had not received any strikes following his content removals. Although Britec reported that his account had received a strike following his video’s removal, White told Ars that YouTube so far had only given him two warnings, so his account is not yet at risk of a ban.

Creators weren’t sure why YouTube might deem this content as harmful, so they tossed around some theories. It seemed possible, White suggested in his video, that AI was detecting this content as “piracy,” but that shouldn’t be the case, he claimed, since his guides require users to have a valid license to install Windows 11. He also thinks it’s unlikely that Microsoft prompted the takedowns, suggesting tech content creators have a “love-hate relationship” with the tech company.

“They don’t like what we’re doing, but I don’t think they’re going to get rid of it,” White told Ars, suggesting that Microsoft “could stop us in our tracks” if it were motivated to end workarounds. But Microsoft doesn’t do that, White said, perhaps because it benefits from popular tutorials that attract swarms of Windows 11 users who otherwise may not use “their flagship operating system” if they can’t bypass Microsoft account requirements.

Those users could become loyal to Microsoft, White said. And eventually, some users may even “get tired of bypassing the Microsoft account requirements, or Microsoft will add a new feature that they’ll happily get the account for, and they’ll relent and start using a Microsoft account,” White suggested in his video. “At least some people will, not me.”

Microsoft declined Ars’ request to comment.

To White, it seemed possible that YouTube was leaning on AI  to catch more violations but perhaps recognized the risk of over-moderation and, therefore, wasn’t allowing AI to issue strikes on his account.

But that was just a “theory” that he and other creators came up with, but couldn’t confirm, since YouTube’s chatbot that supports creators seemed to also be “suspiciously AI-driven,” seemingly auto-responding even when a “supervisor” is connected, White said in his video.

Absent more clarity from YouTube, creators who post tutorials, tech tips, and computer repair videos were spooked. Their biggest fear was that unexpected changes to automated content moderation could unexpectedly knock them off YouTube for posting videos that in tech circles seem ordinary and commonplace, White and Britec said.

“We are not even sure what we can make videos on,” White said. “Everything’s a theory right now because we don’t have anything solid from YouTube.”

YouTube recommends making the content it’s removing

White’s channel gained popularity after YouTube highlighted an early trending video that he made, showing a workaround to install Windows 11 on unsupported hardware. Following that video, his channel’s views spiked, and then he gradually built up his subscriber base to around 330,000.

In the past, White’s videos in that category had been flagged as violative, but human review got them quickly reinstated.

“They were striked for the same reason, but at that time, I guess the AI revolution hadn’t taken over,” White said. “So it was relatively easy to talk to a real person. And by talking to a real person, they were like, ‘Yeah, this is stupid.’ And they brought the videos back.”

Now, YouTube suggests that human review is causing the removals, which likely doesn’t completely ease creators’ fears about arbitrary takedowns.

Britec’s video was also flagged as dangerous or harmful. He has managed his account that currently has nearly 900,000 subscribers since 2009, and he’s worried he risked losing “years of hard work,” he said in his video.

Britec told Ars that “it’s very confusing” for panicked tech content creators trying to understand what content is permissible. It’s particularly frustrating, he noted in his video, that YouTube’s creator tool inspiring “ideas” for posts seemed to contradict the mods’ content warnings and continued to recommend that creators make content on specific topics like workarounds to install Windows 11 on unsupported hardware.

Screenshot from Britec09’s YouTube video, showing YouTube prompting creators to make content that could get their channels removed. Credit: via Britec09

“This tool was to give you ideas for your next video,” Britec said. “And you can see right here, it’s telling you to create content on these topics. And if you did this, I can guarantee you your channel will get a strike.”

From there, creators hit what White described as a “brick wall,” with one of his appeals denied within one minute, which felt like it must be an automated decision. As Britec explained, “You will appeal, and your appeal will be rejected instantly. You will not be speaking to a human being. You’ll be speaking to a bot or AI. The bot will be giving you automated responses.”

YouTube insisted that the decisions weren’t automated, even when an appeal was denied within one minute.

White told Ars that it’s easy for creators to be discouraged and censor their channels rather than fight with the AI. After wasting “an hour and a half trying to reason with an AI about why I didn’t violate the community guidelines” once his first appeal was quickly denied, he “didn’t even bother using the chat function” after the second appeal was denied even faster, White confirmed in his video.

“I simply wasn’t going to do that again,” White said.

All week, the panic spread, reaching fans who follow tech content creators. On Reddit, people recommended saving tutorials lest they risk YouTube taking them down.

“I’ve had people come out and say, ‘This can’t be true. I rely on this every time,’” White told Ars.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

YouTube denies AI was involved with odd removals of tech tutorials Read More »

openai-moves-to-complete-potentially-the-largest-theft-in-human-history

OpenAI Moves To Complete Potentially The Largest Theft In Human History

OpenAI is now set to become a Public Benefit Corporation, with its investors entitled to uncapped profit shares. Its nonprofit foundation will retain some measure of control and a 26% financial stake, in sharp contrast to its previous stronger control and much, much larger effective financial stake. The value transfer is in the hundreds of billions, thus potentially the largest theft in human history.

I say potentially largest because I realized one could argue that the events surrounding the dissolution of the USSR involved a larger theft. Unless you really want to stretch the definition of what counts this seems to be in the top two.

I am in no way surprised by OpenAI moving forward on this, but I am deeply disgusted and disappointed they are being allowed (for now) to do so, including this statement of no action by Delaware and this Memorandum of Understanding with California.

Many media and public sources are calling this a win for the nonprofit, such as this from the San Francisco Chronicle. This is mostly them being fooled. They’re anchoring on OpenAI’s previous plan to far more fully sideline the nonprofit. This is indeed a big win for the nonprofit compared to OpenAI’s previous plan. But the previous plan would have been a complete disaster, an all but total expropriation.

It’s as if a mugger demanded all your money, you talked them down to giving up half your money, and you called that exchange a ‘change that recapitalized you.’

As in, they claim OpenAI has ‘completed its recapitalization’ and the nonprofit will now only hold equity OpenAI claims is valued at approximately $130 billion (as in 26% of the company, which is actually to be fair worth substantially more than that if they get away with this), as opposed to its previous status of holding the bulk of the profit interests in a company valued at (when you include the nonprofit interests) well over $500 billion, along with a presumed gutting of much of the nonprofit’s highly valuable control rights.

They claim this additional clause, presumably the foundation is getting warrants with but they don’t offer the details here:

If OpenAI Group’s share price increases greater than tenfold after 15 years, the OpenAI Foundation will receive significant additional equity. With its equity stake and the warrant, the Foundation is positioned to be the single largest long-term beneficiary of OpenAI’s success.

We don’t know that ‘significant’ additional equity means, there’s some sort of unrevealed formula going on, but given the nonprofit got expropriated last time I have no expectation that these warrants would get honored. We will be lucky if the nonprofit meaningfully retains the remainder of its equity.

Sam Altman’s statement on this is here, also announcing his livestream Q&A that took place on Tuesday afternoon.

There can be reasonable disagreements about exactly how much. It’s a ton.

There used to be a profit cap, where in Greg Brockman’s own words, ‘If we succeed, we believe we’ll create orders of magnitude more value than any existing company — in which case all but a fraction is returned to the world.’

Well, so much for that.

I looked at this question in The Mask Comes Off: At What Price a year ago.

If we take seriously that OpenAI is looking to go public at a $1 trillion valuation, then consider that Matt Levine estimated the old profit cap only going up to about $272 billion, and that OpenAI still is a bet on extreme upside.

Garrison Lovely: UVA economist Anton Korinek has used standard economic models to estimate that AGI could be worth anywhere from $1.25 to $71 quadrillion globally. If you take Korinek’s assumptions about OpenAI’s share, that would put the company’s value at $30.9 trillion. In this scenario, Microsoft would walk away with less than one percent of the total, with the overwhelming majority flowing to the nonprofit.

It’s tempting to dismiss these numbers as fantasy. But it’s a fantasy constructed in large part by OpenAI, when it wrote lines like, “it may be difficult to know what role money will play in a post-AGI world,” or when Altman said that if OpenAI succeeded at building AGI, it might “capture the light cone of all future value in the universe.” That, he said, “is for sure not okay for one group of investors to have.”

I guess Altman is okay with that now?

Obviously you can’t base your evaluations on a projection that puts the company at a value of $30.9 trillion, and that calculation is deeply silly, for many overloaded and obvious reasons, including decreasing marginal returns to profits.

It is still true that most of the money OpenAI makes in possible futures, it makes as part of profits in excess of $1 trillion.

The Midas Project: Thanks to the now-gutted profit caps, OpenAI’s nonprofit was already entitled to the vast majority of the company’s cash flows. According to OpenAI, if they succeeded, “orders of magnitude” more money would go to the nonprofit than to investors. President Greg Brockman said “all but a fraction” of the money they earn would be returned to the world thanks to the profit caps.

Reducing that to 26% equity—even with a warrant (of unspecified value) that only activates if valuation increases tenfold over 15 years—represents humanity voluntarily surrendering tens or hundreds of billions of dollars it was already entitled to. Private investors are now entitled to dramatically more, and humanity dramatically less.

OpenAI is not suddenly one of the best-resourced nonprofits ever. From the public’s perspective, OpenAI may be one of the worst financially performing nonprofits in history, having voluntarily transferred more of the public’s entitled value to private interests than perhaps any charitable organization ever.

I think Levine’s estimate was low at the time, and you also have to account for equity raised since then or that will be sold in the IPO, but it seems obvious that the majority of future profit interests were, prior to the conversion, still in the hands of the non-profit.

Even if we thought the new control rights were as strong as the old, we would still be looking at a theft in excess of $250 billion, and a plausible case can be made for over $500 billion. I leave the full calculation to others.

The vote in the board was unanimous.

I wonder exactly how and by who they will be sued over it, and what will become of that. Elon Musk, at a minimum, is trying.

They say behind every great fortune is a great crime.

Altman points out that the nonprofit could become the best-resourced non-profit in the world if OpenAI does well. This is true. There is quite a lot they were unable to steal. But it is beside the point, in that it does not make taking the other half, including changing the corporate structure without permission, not theft.

The Midas Project: From the public’s perspective, OpenAI may be one of the worst financially performing nonprofits in history, having voluntarily transferred more of the public’s entitled value to private interests than perhaps any charitable organization ever.

There’s no perhaps on that last clause. On this level, whether or not you agree with the term ‘theft,’ it isn’t even close, this is the largest transfer. Of course, if you take the whole of OpenAI’s nonprofit from inception, performance looks better.

Aidan McLaughlin (OpenAI): ah yes openai now has the same greedy corporate structure as (checks notes) Patagonia, Anthropic, Coursera, and http://Change.org.

Chase Brower: well i think the concern was with the non profit getting a low share.

Aidan McLaughlin: our nonprofit is currently valued slightly less than all of anthropic.

Tyler Johnson: And according to OpenAI itself, it should be valued at approximately three Anthropics! (Fwiw I think the issues with the restructuring extend pretty far beyond valuations, but this is one of them!)

Yes, it is true that the nonprofit, after the theft and excluding control rights, will have an on-paper valuation only slightly lower than the on-paper value of all of Anthropic.

The $500 billion valuation excludes the non-profit’s previous profit share, so even if you think the nonprofit was treated fairly and lost no control rights you would then have it be worth $175 billion rather than $130 billion, so yes slightly less than Anthropic, and if you acknowledge that the nonprofit got stolen from it’s even more.

If OpenAI can successfully go public at a $1 trillion valuation, then depending on how much of that are new shares they will be selling the nonprofit could be worth up to $260 billion.

What about some of the comparable governance structures here? Coursera does seem to be a rather straightforward B-corp. The others don’t?

Patagonia has the closely held Patagonia Purpose Trust, which holds 2% of shares and 100% of voting control, and The Holdfast Collective, which is a 501c(4) nonprofit with 98% of the shares and profit interests. The Chouinard family has full control over the company, and 100% of profits go to charitable causes.

Does that sound like OpenAI’s new corporate structure to you?

Change.org’s nonprofit owns 100% of its PBC.

Does that sound like OpenAI’s new corporate structure to you?

Anthropic is a PBC, but also has the Long Term Benefit Trust. One can argue how meaningfully different this is from OpenAI’s new corporate structure, if you disregard who is involved in all of this.

What the new structure definitely is distinct from is the original intention:

Tomas Bjartur: If not in the know, OpenAI once promised any profits over a threshold would be gifted to you, citizen of the world, for your happy, ultra-wealthy retirement – one needed as they plan to obsolete you. This is now void.

Would OpenAI have been able to raise further investment without withdrawing its profit caps for investments already made?

When you put it like that it seems like obviously yes?

I can see the argument that to raise funds going forward, future equity investments need to not come with a cap. Okay, fine. That doesn’t mean you hand past investors, including Microsoft, hundreds of billions in value in exchange for nothing.

One can argue this was necessary to overcome other obstacles, that OpenAI had already allowed itself to be put in a stranglehold another way and had no choice. But the fundraising story does not make sense.

The argument that OpenAI had to ‘complete its recapitalization’ or risk being asked for its money back is even worse. Investors who put in money at under $200 billion are going to ask for a refund when the valuation is now at $500 billion? Really? If so, wonderful, I know a great way to cut them that check.

I am deeply disappointed that both the Delaware and California attorneys general found this deal adequate on equity compensation for the nonprofit.

I am however reasonably happy with the provisions on control rights, which seem about as good as one can hope for given the decision to convert to a PBC. I can accept that the previous situation was not sustainable in practice given prior events.

The new provisions include an ongoing supervisory role for the California AG, and extensive safety veto points for the NFP and the SSC committee.

If I was confident that these provisions would be upheld, and especially if I was confident their spirit would be upheld, then this is actually pretty good, and if it is used wisely and endures it is more important than their share of the profits.

AG Bonta: We will be keeping a close eye on OpenAI to ensure ongoing adherence to its charitable mission and the protection of the safety of all Californians.

The nonprofit will indeed retain substantial resources and influence, but no I do not expect the public safety mission to dominate the OpenAI enterprise. Indeed, contra the use of the word ‘ongoing,’ it seems clear that it already had ceased to do so, and this seems obvious to anyone tracking OpenAI’s activities, including many recent activities.

What is the new control structure?

OpenAI did not say, but the Delaware AG tells us more and the California AG has additional detail. NFP means OpenAI’s nonprofit here and throughout.

This is the Delaware AG’s non-technical announcement (for the full list see California’s list below), she has also ‘warned of legal action if OpenAI fails to act in public interest’ although somehow I doubt that’s going to happen once OpenAI inevitably does not act in the public interest:

  • The NFP will retain control and oversight over the newly formed PBC, including the sole power and authority to appoint members of the PBC Board of Directors, as well as the power to remove those Directors.

  • The mission of the PBC will be identical to the NFP’s current mission, which will remain in place after the recapitalization. This will include the PBC using the principles in the “OpenAI Charter,” available at openai.com/charter, to execute the mission.

  • PBC directors will be required to consider only the mission (and may not consider the pecuniary interests of stockholders or any other interest) with respect to safety and security issues related to the OpenAI enterprise and its technology.

  • The NFP’s board-level Safety and Security Committee, which is a critical decision maker on safety and security issues for the OpenAI enterprise, will remain a committee of the NFP and not be moved to the PBC. The committee will have the authority to oversee and review the safety and security processes and practices of OpenAI and its controlled affiliates with respect to model development and deployment. It will have the power and authority to require mitigation measures—up to and including halting the release of models or AI systems—even where the applicable risk thresholds would otherwise permit release.

  • The Chair of the Safety and Security Committee will be a director on the NFP Board and will not be a member of the PBC Board. Initially, this will be the current committee chair, Mr. Zico Kolter. As chair, he will have full observation rights to attend all PBC Board and committee meetings and will receive all information regularly shared with PBC directors and any additional information shared with PBC directors related to safety and security.

  • With the intent of advancing the mission, the NFP will have access to the PBC’s advanced research, intellectual property, products and platforms, including artificial intelligence models, Application Program Interfaces (APIs), and related tools and technologies, as well as ongoing operational and programmatic support, and access to employees of the PBC.

  • Within one year of the recapitalization, the NFP Board will have at least two directors (including the Chair of the Safety and Security Committee) who will not serve on the PBC Board.

  • The Attorney General will be provided with advance notice of significant changes in corporate governance.

What did California get?

California also has its own Memorandum of Understanding. It talks a lot in its declarations about California in particular, how OpenAI creates California jobs and economic activity (and ‘problem solving’?) and is committed to doing more of this and bringing benefits and deepening its commitment to the state in particular.

The whole claim via Tweet by Sam Altman that he did not threaten to leave California is raising questions supposedly answered by his Tweet. At this level you perhaps do not need to make your threats explicit.

The actual list seems pretty good, though? Here’s a full paraphrased list, some of which overlaps with Delaware’s announcement above, but which is more complete.

  1. Staying in California and expanding the California footprint.

  2. The NFP (not for profit) retains control as long as they continue to hold ‘class N common stock’ which only they can choose to give up. What happens if Altman wants that?

  3. The PBC and NFP missions will be identical.

  4. The OpenAI charter will be published. Check.

  5. The NFP Board owes fiduciary duties to the NFP, Mission and public beneficiaries of the NFP. I notice it doesn’t say ‘exclusively’ here.

  6. The board shall carry out the charitable purpose (already presumably required).

  7. No cheating clause: While the PBC holds Class N Common Stock (again, what happens if Altman tries to end this?) the NFP must approve in advance in writing any amendments to the PBC’s corporate governance guidelines or the mission, any Deemed Liquidation Event, or any sale of material assets of the PBC, or any additional issue of shares of Class N common stock.

    1. The PBC can issue Class B common stock, but they can’t let the Bs vote.

  8. PBC board and Safety and Security Committee (SSC) must only consider the mission with respect to safety and security issues. The chair of the SSC must be on the NFP board and not the PBC board, but can attend all PBC board meetings and gets all PBC safety and security related information.

    1. In particular, the SSC cannot consider shareholder returns, competitive pressures, financial implications or market timing.

      1. It seems impossible this will 100% be followed, but it helps.

  9. The SSC is a committee of the NFP and oversees safety and security.

  10. The SSC can require mitigations up to and including halting releases, without formal justification.

    1. “The NFP will provide advance notice to the Attorney General of any material changes to the SSC’s authority.” This is good but worrisome to see, you know?

  11. “OpenAI will continue to undertake measures to mitigate risks to teens and others in connection with the development and deployment of AI and of AGI.”

    1. I mean, I would hope they were going to do that anyway.

  12. The PBC shall give the NFP all reasonable access to services and resources without compensation as needed for the mission.

  13. Within a year at least one NFP director, in addition to the SSC chair, will serve only on the NFP board (so at least two directors must be different).

  14. The PBC board must be majority independent directors.

  15. The PBC will have various good corporate governance things.

  16. The PBC will publish a yearly report on its progress in its mission.

  17. The NFP Board’s Mission and Strategy Commission will meet with the California AG semi-annually and individual members will be available as needed.

  18. The NFP will provide 21 days notice before consenting to changes of PBC control or mission, or any threat to the Class N share rights, or any relocation outside of California.

  19. The California AG can review, and hire experts to help review, anything requiring such notice, and get paid by NFP for doing so.

  20. Those on both NFP and PBC boards get annual fiduciary duty training.

  21. The board represents that the recapitalization is fair (whoops), and that they’ve disclosed everything relevant (?), so the AG will also not object.

  22. This only impacts the parties to the MOU, others retain all rights. Disputes resolved in the courts of San Francisco, these are the whole terms, we all have the authority to do this, effective as of signing, AG is relying on OpenAI’s representations and the AG retains all rights and waive none as per usual.

Also, it’s not even listed in the memo, but the ‘merge and assist’ clause was preserved, meaning OpenAI commits to join forces with any ‘safety-conscious’ rival that has a good chance of reaching OpenAI’s goal of creating AGI within a two-year time frame. I don’t actually expect an OpenAI-Anthropic merger to happen, but it’s a nice extra bit of optionality.

This is better than I expected, and as Ben Shindel points out better than many traders expected. This actually does have real teeth, and it was plausible that without pressure there would have been no teeth at all.

It grants the NFP the sole power to appoint and remove directors, and requiring them not to consider the for-profit mission in safety contexts. The explicit granting of the power to halt deployments and mandate mitigations, without having to cite any particular justification and without respect to profitability, is highly welcome, if structured in a functional fashion.

It is remarkable how little many expected to get. For example, here’s Todor Markov, who didn’t even expect the NFP to be able to replace directors at all. If you can’t do that, you’re basically dead in the water.

I am not a lawyer, but my understanding is that the ‘no cheating around this’ clauses are about as robust as one could reasonably hope for them to be.

It’s still, as Garrison Lovely calls it, ‘on paper’ governance. Sometimes that means governance in practice. Sometimes it doesn’t. As we have learned.

The distinction between the boards still means there is an additional level removed between the PBC and the NFP. In a fast moving situation, this makes a big difference, and the NFP likely would have to depend on its enumerated additional powers being respected. I would very much have liked them to include appointing or firing the CEO directly.

Whether this overall ‘counts as a good deal’ depends on your baseline. It’s definitely a ‘good deal’ versus what our realpolitik expectations projected. One can argue that if the control rights really are sufficiently robust over time, that the decline in dollar value for the nonprofit is not the important thing here.

The counterargument to that is both that those resources could do a lot of good over time, and also that giving up the financial rights has a way of leading to further giving up control rights, even if the current provisions are good.

Similarly to many issues of AI alignment, if an entity has ‘unnatural’ control, or ‘unnatural’ profit interests, then there are strong forces that continuously try to take that control away. As we have already seen.

Unless Altman genuinely wants to be controlled, the nonprofit will always be under attack, where at every move we fight to hold its ground. On a long enough time frame, that becomes a losing battle.

Right now, the OpenAI NFP board is essentially captured by Altman, and also identical to the PBC board. They will become somewhat different, but no matter what it only matters if the PBC board actually tries to fulfill its fiduciary duties rather than being a rubber stamp.

One could argue that all of this matters little, since the boards will both be under Altman’s control and likely overlap quite a lot, and they were already ignoring their duties to the nonprofit.

Robert Weissman, co-president of the nonprofit Public Citizen, said this arrangement does not guarantee the nonprofit independence, likening it to a corporate foundation that will serve the interests of the for profit.

Even as the nonprofit’s board may technically remain in control, Weissman said that control “is illusory because there is no evidence of the nonprofit ever imposing its values on the for profit.”

So yes, there is that.

They claim to now be a public benefit corporation, OpenAI Group PBC.

OpenAI: The for-profit is now a public benefit corporation, called OpenAI Group PBC, which—unlike a conventional corporation—is required to advance its stated mission and consider the broader interests of all stakeholders, ensuring the company’s mission and commercial success advance together.

This is a mischaracterization of how PBCs work. It’s more like the flip side of this. A conventional corporation is supposed to maximize profits and can be sued if it goes too far in not doing that. Unlike a conventional corporation, a PBC is allowed to consider those broader interests to a greater extent, but it is not in practice ‘required’ to do anything other than maximize profits.

One particular control right is the special duty to the mission, especially via the safety and security committee. How much will they attempt to downgrade the scope of that?

The Midas Project: However, the effectiveness of this safeguard will depend entirely on how broadly “safety and security issues” are defined in practice. It would not be surprising to see OpenAI attempt to classify most business decisions—pricing, partnerships, deployment timelines, compute allocation—as falling outside this category.

This would allow shareholder interests to determine the majority of corporate strategy while minimizing the mission-only standard to apply to an artificially narrow set of decisions they deem easy or costless.

They have an announcement about that too.

OpenAI: First, Microsoft supports the OpenAI board moving forward with formation of a public benefit corporation (PBC) and recapitalization.

Following the recapitalization, Microsoft holds an investment in OpenAI Group PBC valued at approximately $135 billion, representing roughly 27 percent on an as-converted diluted basis, inclusive of all owners—employees, investors, and the OpenAI Foundation. Excluding the impact of OpenAI’s recent funding rounds, Microsoft held a 32.5 percent stake on an as-converted basis in the OpenAI for-profit.

Anyone else notice something funky here? OpenAI’s nonprofit has had its previous rights expropriated, and been given 26% of OpenAI’s shares in return. If Microsoft had 32.5% of the company excluding the nonprofit’s rights before that happened, then that should give them 24% of the new OpenAI. Instead they have 27%.

I don’t know anything nonpublic on this, but it sure looks a lot like Microsoft insisted they have a bigger share than the nonprofit (27% vs. 26%) and this was used to help justify this expropriation and a transfer of additional shares to Microsoft.

In exchange, Microsoft gave up various choke points it held over OpenAI, including potential objections to the conversion, and clarified points of dispute.

Microsoft got some upgrades in here as well.

  1. Once AGI is declared by OpenAI, that declaration will now be verified by an independent expert panel.

  2. Microsoft’s IP rights for both models and products are extended through 2032 and now includes models post-AGI, with appropriate safety guardrails.

  3. Microsoft’s IP rights to research, defined as the confidential methods used in the development of models and systems, will remain until either the expert panel verifies AGI or through 2030, whichever is first. Research IP includes, for example, models intended for internal deployment or research only.

    1. Beyond that, research IP does not include model architecture, model weights, inference code, finetuning code, and any IP related to data center hardware and software; and Microsoft retains these non-Research IP rights.

  4. Microsoft’s IP rights now exclude OpenAI’s consumer hardware.

  5. OpenAI can now jointly develop some products with third parties. API products developed with third parties will be exclusive to Azure. Non-API products may be served on any cloud provider.

  6. Microsoft can now independently pursue AGI alone or in partnership with third parties. If Microsoft uses OpenAI’s IP to develop AGI, prior to AGI being declared, the models will be subject to compute thresholds; those thresholds are significantly larger than the size of systems used to train leading models today.

  7. The revenue share agreement remains until the expert panel verifies AGI, though payments will be made over a longer period of time.

  8. OpenAI has contracted to purchase an incremental $250B of Azure services, and Microsoft will no longer have a right of first refusal to be OpenAI’s compute provider.

  9. OpenAI can now provide API access to US government national security customers, regardless of the cloud provider.

  10. OpenAI is now able to release open weight models that meet requisite capability criteria.

That’s kind of a wild set of things to happen here.

In some key ways Microsoft got a better deal than it previously had. In particular, AGI used to be something OpenAI seemed like it could simply declare (you know, like war or the defense production act) and now it needs to be verified by an ‘expert panel’ which implies there is additional language I’d very much like to see.

In other ways OpenAI comes out ahead. An incremental $250B of Azure services sounds like a lot but I’m guessing both sides are happy with that number. Getting rid of the right of first refusal is big, as is having their non-API products free and clear. Getting hardware products fully clear of Microsoft is a big deal for the Ives project.

My overall take here is this was one of those broad negotiations where everything trades off, nothing is done until everything is done, and there was a very wide ZOPA (zone of possible agreement) since OpenAI really needed to make a deal.

In theory govern the OpenAI PBC. I have my doubts about that.

What they do have is a nominal pile of cash. What are they going to do with it to supposedly ensure that AGI goes well for humanity?

The default, as Garrison Lovely predicted a while back, is that the nonprofit will essentially buy OpenAI services for nonprofits and others, recapture much of the value and serve as a form of indulgences, marketing and way to satisfy critics, which may or may not do some good along the way.

The initial $50 million spend looked a lot like exactly this.

Their new ‘initial focus’ for $25 billion will be in these two areas:

  • Health and curing diseases. The OpenAI Foundation will fund work to accelerate health breakthroughs so everyone can benefit from faster diagnostics, better treatments, and cures. This will start with activities like the creation of open-sourced and responsibly built frontier health datasets, and funding for scientists.

  • Technical solutions to AI resilience. Just as the internet required a comprehensive cybersecurity ecosystem—protecting power grids, hospitals, banks, governments, companies, and individuals—we now need a parallel resilience layer for AI. The OpenAI Foundation will devote resources to support practical technical solutions for AI resilience, which is about maximizing AI’s benefits and minimizing its risks.

Herbie Bradley: i love maximizing AI’s benefits and minimizing its risks

They literally did the meme.

The first seems like a generally worthy cause that is highly off mission. There’s nothing wrong with health and curing diseases, but pushing this now does not advance the fundamental mission of OpenAI. They are going to start with, essentially, doing AI capabilities research and diffusion in health, and funding scientists to do AI-enabled research. A lot of this will likely fall right back into OpenAI and be good PR.

Again, that’s a net positive thing to do, happy to see it done, but that’s not the mission.

Technical solutions to AI resilience could potentially at least be useful AI safety work to some extent. With a presumed ~$12 billion this is a vast overconcentration of safety efforts into things that are worth doing but ultimately don’t seem likely to be determining factors. Note how Altman described it in his tl;dr from the Q&A:

Sam Altman: The nonprofit is initially committing $25 billion to health and curing disease, and AI resilience (all of the things that could help society have a successful transition to a post-AGI world, including technical safety but also things like economic impact, cyber security, and much more). The nonprofit now has the ability to actually deploy capital relatively quickly, unlike before.

This is now infinitely broad. It could be addressing ‘economic impact’ and be basically a normal (ineffective) charity, or one that intervenes mostly by giving OpenAI services to normal nonprofits. It could be mostly spent on valuable technical safety, and be on the most important charitable initiatives in the world. It could be anything in between, in any distribution. We don’t know.

My default assumption is that this is primarily going to be about mundane safety or even fall short of that, and make the near term world better, perhaps importantly better, but do little to guard against the dangers or downsides of AGI or superintelligence, and again largely be a de facto customer of OpenAI.

There’s nothing wrong with mundane risk mitigation or defense in depth, and nothing wrong with helping people who need a hand, but if your plan is ‘oh we will make things resilient and it will work out’ then you have no plan.

That doesn’t mean this will be low impact, or that what OpenAI left the nonprofit with is chump change.

I also don’t want to knock the size of this pool. The previous nonprofit initiative was $50 million, which can do a lot of good if spent well (in that case, I don’t think it was) but in this context $50 million chump change.

Whereas $25 billion? Okay, yeah, we are talking real money. That can move needles, if the money actually gets spent in short order. If it’s $25 billion as a de facto endowment spent down over a long time, then this matters and counts for a lot less.

The warrants are quite far out of the money and the NFP should have gotten far more stock than it did, but 26% (worth $130 billion or more) remains a lot of equity. You can do quite a lot of good in a variety of places with that money. The board of directors of the nonprofit is highly qualified if they want to execute on that. It also is highly qualified to effectively shuttle much of that money right back to OpenAI’s for profit, if that’s what they mainly want to do.

It won’t help much with the whole ‘not dying’ or ‘AGI goes well for humanity’ missions, but other things matter too.

Not entirely. As Garrison Lovely notes, all these sign-offs are provisional, and there are other lawsuits and the potential for other lawsuits. In a world where Elon Musk’s payouts can get crawled back, I wouldn’t be too confident that this conversation sticks. It’s not like the Delaware AG drives most objections to corporate actions.

The last major obstacle is the Elon Musk lawsuit, where standing is at issue but the judge has made clear that the suit otherwise has merit. There might be other lawsuits on the horizon. But yeah, probably this is happening.

So this is the world we live in. We need to make the most of it.

Discussion about this post

OpenAI Moves To Complete Potentially The Largest Theft In Human History Read More »

ai-#140:-trying-to-hold-the-line

AI #140: Trying To Hold The Line

Sometimes the best you can do is try to avoid things getting even worse even faster.

Thus, one has to write articles such as ‘Please Do Not Sell B30A Chips to China.’

It’s rather crazy to think that one would have to say this out loud.

In the same way, it seems not only do we need to say out loud to Not Build Superintelligence Right Now, there are those who say how dare you issue such a statement without knowing how to do so safety, so instead we should build superintelligence without knowing how to do so safety. The alternative is to risk societal dynamics we do not know how to control and that could have big unintended consequences, you say? Yes, well.

One good thing to come out of that was that Sriram Krishnan asked (some of) the right questions, giving us the opportunity to try and answer.

I also provided updates on AI Craziness Mitigation Efforts from OpenAI and Anthropic. We can all do better here.

Tomorrow, I’ll go over OpenAI’s ‘recapitalization’ and reorganization, also known as one of the greatest thefts in human history. Compared to what we feared, it looks like we did relatively well on control rights, but the equity stake is far below fair and all of this is far worse than the previous state. You could call that a ‘win’ in the sense that things could have gone far worse. That’s 2025.

The releases keep coming. We have Cursor 2.0 including their own LLM called Composer. We have Neo the humanoid household (for now teleoperated) robot. We have the first version of ‘Grokopedia.’ We get WorldTest and ControlArena and more.

Anthropic may finally have the compute it needs thanks to one million TPUs, while OpenAI may be planning an IPO at a valuation of $1 trillion.

  1. Language Models Offer Mundane Utility. The joys of freedom of AI speech.

  2. Language Models Don’t Offer Mundane Utility. Mistakes are made.

  3. Huh, Upgrades. Claude memory, Cursor 2.0, Claude for Finance, Pulse on web.

  4. On Your Marks. AIs disappoint on WorldTest, usual suspects declare victory.

  5. Choose Your Fighter. A tale of two business plans.

  6. Get My Agent On The Line. The promise of the Coasean singularity.

  7. Deepfaketown and Botpocalypse Soon. OpenAI erotica, third Grok companion.

  8. Fun With Media Generation. Suno is getting good at making generic music.

  9. Copyright Confrontation. Perplexity keeps failing the honeypot tests.

  10. They Took Our Jobs. My comparative advantage on display?

  11. Get Involved. METR is hiring.

  12. Introducing. Grokopedia, ControlArena and the a16z torment nexus pipeline.

  13. My Name is Neo. Teleoperated robots coming to willing households soon.

  14. In Other AI News. Some very good charts.

  15. Show Me the Money. One trillion dollars? OpenAI considers an IPO.

  16. One Trillion Dollars For My Robot Army. One trillion dollars? For Musk.

  17. One Million TPUs. One million TPUs? For Anthropic.

  18. Anthropic’s Next Move. Compute, in sufficient quantities, enables products.

  19. Quiet Speculations. OpenAI aims for true automated researchers by March 2028.

  20. The Quest for Sane Regulations. Microsoft endorses the GAIN Act.

  21. The March of California Regulations. Dean Ball analyzes, I offer additional takes.

  22. Not So Super PAC. It seems the a16z-Lehane SuperPAC is not off to a great start.

  23. Chip City. A few additional notes.

  24. The Week in Audio. Yudkowsky, Bostrom and AI Welfare on Odd Lots.

  25. Do Not Take The Bait. Was it woke? No, it was sharing accounts.

  26. Rhetorical Innovation. We are trained to think problems must be solvable.

  27. People Do Not Like AI. They express it in myriad ways. Some are direct.

  28. Aligning a Smarter Than Human Intelligence is Difficult. Let them cook?

  29. Misaligned! DeepSeek might choose to give you insecure code?

  30. Anthropic Reports Claude Can Introspect. Claude can notice thought injections.

  31. Anthropic Reports On Sabotage Risks. A template for a new report type.

  32. People Are Worried About AI Killing Everyone. Hinton is more hopeful.

  33. Other People Are Not As Worried About AI Killing Everyone. Misrepresentation.

  34. The Lighter Side. Begun, the sex warfare has?

Where do AI models have freedom of speech?

The good old United States of America, fyeah, that’s where, says The Future of Free Speech. The report isn’t perfect, if you look at the details it’s not measuring exactly what you’d want and pays too much attention to corporate statements and has too much focus on social media post generation, but it’s what we have, and it will serve.

Of the countries evaluated, next up was the European Union, which also performed strongly, although with worries about ‘hate speech’ style rules. The humans don’t have such great free speech around those parts, in important ways, but the chatbots already censor all that anyway for corporate reasons. Brazil scores modestly lower, then a drop to South Korea, another to India and a huge one to China.

As always, this is another reminder that China imposes lots of restrictions on things, far more onerous than anything America has ever considered, including that it requires pre deployment testing, largely to verify its censorship protocols.

Among AI models, they have Grok on top, but not by a huge amount. All three top labs (Anthropic, Google and OpenAI) showed noticeable improvement over time. Mostly the contrast is American models, which range from 58%-65%, and Mistral from France at 46% (this again makes me suspicious of the EU’s high score above), versus Chinese models much lower, with DeepSeek at 31.5% and Qwen at 22%. This is despite one of the main categories they were scored on being model openness, where DeepSeek gets full marks and the big labs get zeroes.

Notice that even with openness of the model as an explicit criteria, the open models and their associated nations are evaluated as far less free than the closed models.

As always, if you believe ‘any restrictions on AI mean China wins’ then you have to reconcile this claim with China already being vastly more restrictive than anything being relevantly proposed. Consider open model issues similarly.

What matters is experience in practice. My practical experience is that out of the big three, Sonnet 4.5 (or Opus/Sonnet 4 before it) and GPT-5 basically never censor or evade things they ‘shouldn’t’ censor, whereas Gemini 2.5 totally does do it. The exception for Claude is when I’m explicitly asking it about biological risk from AI, which can hit the biofilters by accident.

The thing about leaving all your stuff unsorted and counting on search is that when it works it’s great, and when it doesn’t work it’s really not great. That was true before AI, and it’s also true now that AI can often do better search.

Joe Weisenthal: yeah, sure, kinda true. But what’s the point of “sorting” anything digital. This is my point. In the world of the search bar (which keeps getting better and better) why group anything together at all?

St. Vincent: I have a lot of coworkers who spend a lot of time putting their emails in folders and I just search “[client name] taxes” in Outlook and it works fine

Ernest Ryu reports using ChatGPT to solve an open problem in convex optimization.

Use Claude to cut your hospital bill from $195k to $33k by highlighting duplicative charges, improper coding and other violations. The two big barriers are (1) knowing you can do this and (2) getting hold of the itemized bill in the first place. One wonders, shouldn’t there be bigger penalties when hospitals get caught doing this?

How long? Not long. Cause what you reap is what you sow:

Moses Kagan: Recently heard of a tenant buyout negotiation where both sides were just sending each other emails written by AI.

How soon until we all just cut out the middle-man, so to speak, and let the AIs negotiate with each other directly?

I mean, in that context, sure, why not?

Everyone makes mistakes oh yes they do.

Colin Fraser: The problem that we are going to run into more and more is even if the AI can tell a Doritos bag from a gun 99.999% of the time, if you run inference a million times a day you still expect 10 errors per day.

Dexerto: Armed officers held a student at gunpoint after an AI gun detection system mistakenly flagged a Doritos bag as a firearm “They made me get on my knees, put my hands behind my back, and cuff me”

Police saying ‘he’s got a gun!’ when the man in question does not have a gun is an event that happens every day, all the time, and the police are a lot less than 99.999% accurate on this. The question is not does your system make mistakes, or whether the mistakes look dumb when they happen. The question is does your system make more mistakes, or more costly mistakes, than you would get without the system.

Speaking of which, now do humans, this is from the BBC, full report here. They tested ChatGPT, Copilot, Perplexity and Gemini in May-June 2025, so this is before GPT-5.

BBC:

  1. 45% of AI answers had at least one significant issue

  2. 31% of responses showed serious sourcing problems (missing, misleading, or incorrect)

  3. 20% contained major accuracy issues, including hallucinated details and outdated information

  4. Gemini performed worst with significant issues in 76% of responses, more than double the other assistants, largely due to its poor sourcing performance.

  5. Comparison between the BBC’s results earlier this year and this study show some improvements but still high levels of errors.

Full report: Overall, there are signs that the quality of assistant responses has improved – the share of responses with significant issues of any kind fell from 51% in the first round to 37% in the current round.

There is a clear pattern here. The questions on the right mostly have clear uncontroversial correct answers, and that correct answer doesn’t have any conflict with standard media Shibboleths, and the answer hasn’t changed recently. For the questions on the left, it gets trickier on all these fronts. To know exactly how bad these issues were, we’d need to see the actual examples, which I don’t see here.

I’m fine with the outright never changing, actually.

Teortaxes: lmao. never change, Kimi (but please improve factuality)

David Sun: I am completely unimpressed by LLMs and not worried about AGI.

It is remarkable how many people see a dumb aspect of one particular LLM under default conditions, and then conclude that therefore AGI will never happen. Perhaps David is joking here, perhaps not, Poe’s Law means one cannot tell, but the sentiment is common.

On this next item, look, no.

Logan Kilpatrick (Lead for DeepMind AI Studio, RTed by Demis Hassabis): Everyone is going to be able to vibe code video games by the end of 2025.

Not German: Vibe code very very bad video games.

Logan Kilpatrick: games that most reasonable people would be excited to play with their friends because they have full control over the story, characters, experience

Even if we use a rather narrow definition of ‘everyone,’ no just no. We are not two months away from people without experience being able to vibe code up games good enough that your friends will want to play them as more than a curiosity. As someone who has actually designed and created games, this is not that easy, and this kind of shallow customization doesn’t offer that much if you don’t put in the real work, and there are lots of fiddly bits.

There’s no need to oversell AI coding like this. Is coding a game vastly easier, to the point where I’m probably in the category of ‘people who couldn’t do it before on their own in a reasonable way and can do it now?’ Yeah, quite possible, if I decided that’s what I wanted to do with my week or month. Alas, I’m kind of busy.

Alternatively, he’s making a hell of a claim about Gemini Pro 3.0. We shall see.

Sam Altman said the price of a unit of intelligence drops 97.5% per year (40x). If your objection to a business model is ‘the AIs good enough to do this cost too much’ your objection will soon be invalid.

Claude now has memory, as per Tuesday’s post.

Cursor 2.0, which includes their own coding model Composer and a new interface for working with multiple agents in parallel. They claim Composer is 4x faster than comparable top frontier models.

That is a terrible labeling of a graph. You don’t get to not tell us which models the other rows are. Is the bottom one GPT-5? GPT-5-Codex? Sonnet 4.5?

The UI has been redesigned around ways to use multiple agents at once. They also offer plan mode in the background, you can internally plan with one model and then execute with another, and several other upgrades.

The system instructions for Cursor 2.0’s Composer are here, Pliny’s liberation jailbreak alert is here.

Claude for Financial Services expands, offering a beta of Claude for Excel and adding many sources of live information: Aiera, Third Bridge, Chronograph, Egnyte, LSEG, Moody’s and MT Newswires. They are adding agent skills: Comparable company analysis, discounted cash flow models, due diligence data packs, company teasers and profiles, earnings analyses and initiating coverage reports.

ChatGPT offers Shared Projects to all users. Good.

ChatGPT Pulse now available on the web. This is a big jump in its practical value.

Google AI Studio now lets you create, save and reuse system instructions across chats.

Gemini app finally lets you switch models during a conversation.

Intended short-term upgrade list for the ChatGPT Atlas browser. Includes ‘tab groups.’ Did not include ‘Windows version.’

Why do people believe Elon Musk when he says he’s going to, for example, ‘delete all heuristics’ from Twitter’s recommendation algorithm in favor of having Grok read all the Tweets?

OpenAI offers us GPT-OSS-Safeguard, allowing developers to specify disallowed content.

AIs were outperformed by humans on the new WorldTest via ‘AutumnBench,’ a suite of 43 interactive worlds and 129 tasks calling for predicting hidden world aspects, planning sequences of actions and detecting when environment rules suddenly change.

This seems like an actually valuable result, which still of course came to my attention via a description you might have learned to expect by now:

Alex Prompter: The takeaway is wild… current AIs don’t understand environments; they pattern-match inside them. They don’t explore strategically, revise beliefs, or run experiments like humans do. WorldTest might be the first benchmark that actually measures understanding, not memorization. The gap it reveals isn’t small it’s the next grand challenge in AI cognition.

Scaling compute barely closes the gap.

Humans use resets and no-ops to test hypotheses. Models don’t. They just spam clicks.

The core event here seems to be that there was a period of learning opportunity without reward signals, during which humans reset 12% of the time and models reset less than 2% of the time. Humans had a decent learning algorithm and designed useful experiments during exploration, models didn’t.

So yeah, that’s a weakness of current models. They’re not good at relatively open-ended exploration and experimentation, at least not without good prompting and direction. They’re also not strong on adapting to weirdness, since they (wisely, statistically speaking) rely on pattern matching, while lacking good instincts on when to ‘snap out of’ those matches.

OpenAI is launching a browser and short duration video social network to try and capture consumers, monetizing them via shopping hookups and adding erotica.

What is OpenAI’s plan?

  1. Fund ASI by offering normal big tech company things to justify equity raises.

  2. ?????????. Build superintelligence in a way that everyone doesn’t die, somehow.

  3. Everyone dies. Profit, hopefully.

Near: to clarify confusion, openai’s competitor is meta (not anthropic), and anthropic’s competitor is google (not openai).

OpenAI is now set to transition to the 2nd phase of ChatGPT, focusing on advertising + engagement

With a team of ex-FB advertising execs and 1B users, if OpenAI can increase usage to a several hrs/day while matching Meta’s ad targeting, they can profitably reach a 1T+ valuation

fortunately openai employees have now had ample time to update their previous lines of thinking from “we arent an advertising company, i am here for the vision of AGI” to “actually advertising is good, especially when paired with short-form video and erotica. let me explain”

i give many venture capitalists credit here because to them the outcomes like this have been obvious for a long time. just look at the company and zoom out! what else could possibly happen! engineers and researchers on the other hand are often quite oblivious to such trends..

another important note here is that meta still has many cards left to play; i think it will actually be quite a brutal distribution competition even though openai has obviously had a headstart by like.. two full yrs. fidji is very good at monetization and sam is great at raising

Peter Wildeford: OpenAI essentially has two modes as a business:

  1. they might someday build AGI and then automate the entire economy

  2. ‘Facebookization of AI’ where we just build a normal Big Tech business off of monetizing billions of free users

Second mode helps fund the first.

Aidan McLaughlin (OpenAI): i think your error here is thinking sora and adult content are some leadership master plan; that sama sat down with accountants and signed and said “it’s time to break glass for emergence revenue” no.

I know the exact people who pushed for sora, they’re artists who worked against organizational headwinds to democratize movie creation, something they love dearly. i know the exact person who pushed for adult content, they’re a professional athlete, free-sprinted, one of the most socially thoughtful people i know… who just really believes in creative freedom.

there are passionate individual who pushed against all odds for what you think are top-down decisions. we are not a monoculture, and i love that.

I think there are zero worlds where there’s more money in ‘going hard’ at sora/erotica than there is automating all labor

I believe Aidan on the proximate causes inside OpenAI pushing towards these decisions. They still wouldn’t have happened if the conditions hadn’t been set, if the culture hadn’t been set up to welcome them.

Certainly there’s more money in automating all labor if you can actually automate all labor, but right now OpenAI cannot do this. Anything that raises valuations and captures market share and mindshare thus helps OpenAI progress towards both profitability and eventually building superintelligence and everyone probably dying. Which they pitch to us as the automation of all labor (and yes, they mean all labor).

Anthropic, on the other hand, is catering to business and upgrading financial services offerings and developing a browser extension for Chrome.

Two ships, ultimately going to the same place (superintelligence), pass in the night.

Stefan Schubert: Anthropic has overtaken OpenAI in enterprise large language model API market share.

Asa Fitch (WSJ): But Anthropic’s growth path is a lot easier to understand than OpenAI’s. Corporate customers are devising a plethora of money-saving uses for AI in areas like coding, drafting legal documents and expediting billing. Those uses are likely to expand in the future and draw more customers to Anthropic, especially as the return on investment for them becomes easier to measure.

Grok can be useful in one of two ways. One is more competitive than the other.

Alexander Doria: still failing to see the point of grok if it cannot go through my follow list and other X data.

xAI has chosen to de facto kick the other AIs off of Twitter, which is a hostile move that trades off the good of the world and its knowledge and also the interests of Twitter in order to force people to use Grok.

Then Grok doesn’t do a good job parsing Twitter. Whoops. Please fix.

The other way to make Grok useful is to make a superior model. That’s harder.

Claude.ai has an amazing core product, but still needs someone to put in the (relatively and remarkably small, you’d think?) amount of work to mimic various small features and improve the UI. They could have a very strong consumer product if they put a small percentage of their minds to it.

Another example:

Olivia Moore: With the introduction of Skills, it’s especially odd that Claude doesn’t have the ability to “time trigger” tasks.

I built the same workflow out on ChatGPT and Claude.

Claude did a much better job, but since you can’t set it to recur I’m going to have to run it on ChatGPT…

The obvious response is ‘have Claude Code code something up’ but a lot of people don’t want to do that and a lot of tasks don’t justify it.

Tyler Cowen asks ‘will there be a Coasean singularity?’ in reference to a new paper by Peyman Shahidi, Gili Rusak, Benjamin S. Manning, Andrey Fradkin & John J. Horton. AIs and AI agents promise to radically reduce various transaction costs for electronic markets, enabling new richer and more efficient market designs.

My classic question to ask in such situations: If this were the one and only impact of AI, that it radically reduces transaction costs especially in bespoke interactions with unique features, enabling far better market matching at much lower prices, then what does that effect alone do to GDP and GDP growth?

I asked Claude to estimate this based on the paper plus comparisons to historical examples. Claude came back with wide uncertainty, with a baseline scenario of a one-time 12-18% boost over 15-25 years from this effect alone. That seems on the low side to me, but plausible.

Theo Jaffee and Jeffrey Ladish think the Grok effect on Twitter has been good, actually? This has not been my experience, but in places where epistemics have gone sufficiently downhill perhaps it becomes a worthwhile tradeoff.

Grok now has a third companion, a 24-year-old spirited woman named Mika, the link goes to her system prompt. The good news is that she seems like a less unhealthy persona to be chatting to than Ani, thus clearing the lowest of bars. The bad news is this seems like an epic display of terrible prompt engineering of an intended second Manic Pixie Dream Girl, and by being less flagrantly obviously awful this one might actually be worse. Please avoid.

Steven Adler, former head of product safety at OpenAI, warns us in the New York Times not to trust OpenAI’s claims about ‘erotica.’ I agree with him that we don’t have reason to trust OpenAI to handle this (or anything else) responsibly, and that publishing changes in prevalence rates of various mental health and other issues over time and committing to what information it releases would build trust in this area, and be important info to learn.

AI-generated music is getting remarkably good. A new study finds that songs from a mix of Suno versions (mostly in the v3 to v4 era, probably, but they don’t say exactly?) was ‘indistinguishable from human music,’ meaning when asked to identify the human song between a Suno song and a random human song, listeners were only 50/50 in general, although they were 60/40 if both were the same genre. We’re on Suno v5 now and reports are it’s considerably better.

One commentor shares this AI song they made, another shares this one. If you want generic music that ‘counts as music’ and requires attention to differentiate for the average person? We’re basically there.

Nickita Khylkouski: AI generated music is indistinguishable from AVERAGE human music. Most people listen to very popular songs not just average ones.

The most popular songs are very unique and wouldn’t be easy to reproduce.

There is a big gap between generic average human music and the median consumed musical recording, and also a big gap between the experience of a generic recording versus hearing that performed live, or integrating the music with its context and story and creator, AI music will have much lower variance, and each of us curates the music we want the most.

An infinite number of monkeys will eventually write Shakespeare, but you will never be able to find and identify that manuscript, especially if you haven’t already read it.

That’s a lot of ‘horse has the wrong accent’ as opposed to noticing the horse can talk.

The questions are, essentially, at this point:

  1. Will be a sameness and genericness to the AI music the way there often is with AI text outputs?

  2. How much will we care about the ‘humanness’ of music, and that it was originally created by a human?

  3. To what extent will this be more like another instrument people play?

It’s not an area I’ve put much focus on. My guess is that musicians have relatively less to worry about versus many others, and this is one of the places where the AI needs to not only match us but be ten times better, or a hundred times better. We shall see.

Rob Wiblin: A challenge for this product is that you can already stream music that’s pretty optimised to your taste 8 hours a day at a price of like… 5 cents an hour. Passing the Turing test isn’t enough to capture that market, you’d have be better and very cheap to run.

Ethan Mollick notes that it is faster to create a Suno song than to listen to it. This means you could be generating all the songs in real time as you listen, but even if it was free, would you want to do that?

What determines whether you get slop versus art? Here is one proposal.

Chris Barber: Art is meaningful proportional to the love or other emotions poured in by the artist.

If the only ingredient is AI, it’s probably slop.

If the main ingredient is love and AI was a tool used, it’s art.

As a predictive method, this seems right. If you intend slop, you get slop. If you intend art, and use AI as a tool, you get art similarly to how humans otherwise get art, keeping in mind Sturgeon’s Law that even most human attempts to create art end up creating slop anyway, even without AI involved.

Reddit created a ‘test post’ that could only be crawled by Google’s search engine. Within hours Perplexity search results had surfaced the content of the post.

Seb Krier pushed back strongly against the substance from last week, including going so far as to vibecode an interactive app to illustrate the importance of comparative advantage, which he claims I haven’t properly considered.

It was also pointed out that I could have worded my coverage better, which was due to my frustration with having to repeatedly answer various slightly different forms of this argument. I stand by the substance of my claims but I apologize for the tone.

I’ve encountered variants of the ‘you must not have considered comparative advantage’ argument many times, usually as if it was obvious that everything would always be fine once you understood this. I assure everyone I have indeed considered it, I understand why it is true for most historical or present instances of trade and competition, and that I am not making an elementary or first-order error here.

Gallabytes (QTing link to the app): this actually helped me understand why these scenarios aren’t especially compelling! they work under the assumption of independent populations but fail under ~malthusian conditions.

I think that’s the basic intuition pump. As in, what comparative advantage does is IF:

  1. You have a limited fixed pool of compute or AIs AND

  2. You have limited fixed pool of humans AND

  3. There are enough marginally productive tasks to fully occupy all of the compute with room for most of the humans to do enough sufficiently productive things

  4. THEN the humans end up doing productive things and getting paid for them.

You can relax the bounds on ‘limited fixed pool’ somewhat so long as the third condition holds, but the core assumptions are that the amount of compute is importantly bounded, and that the resources required for creating and maintaining humans and the resources creating and maintaining AIs are not fungible or rivalrous.

METR is hiring.

Grokopedia.com is now a thing. What kind of thing is it? From what I can tell not an especially useful thing. Why would you have Grok generate pseudo-Wikipedia pages, when you can (if you want that) generate them with queries anyway?

Did we need a ‘Grokopedia’ entry that ‘clears Gamergate’ as if it is some authority? Or one that has some, ahem, entries that could use some checking over? How biased is it? Max Tegmark did a spot check, finds it most biased versus Wikipedia, which is a low bar it did not clear, the one polemic like it is doing advocacy.

How is ‘a16z-backed’ always followed by ‘torment nexus’? At this point I’m not even mad, I’m just impressed. The latest case is that a16z was always astroturfing on social media, but they realized they were making the mistake of paying humans to do that, so now they’re backing a startup to let you ‘control 1000s of social media accounts with AI,’ with its slogans being ‘control is all you need’ and ‘never pay a human again.’

UK’s AISI instroduces ControlArena, a library for running AI control experiments.

Remotelabor.ai to track the new Remote Labor Index, measuring what percentage of remote work AI can automate. Currently the top score is 2.5%, so ‘not much,’ but that’s very different from 0%.

Do you hear that, Mr. Anderson? That is the sound of inevitability.

It is the sound of a teleoperated in-home robot named Neo, that they hope is coming soon, to allow the company to prototype and to gather data.

Joanna Stern covered it for the Wall Street Journal.

It will cost you either $20,000, or $500 monthly rental with a six-month minimum commitment. Given how fast the tech will be moving and the odds offered, if you do go for this, the wise person will go with the rental.

The wiser one presumably waits this out. Eventually someone is going to do a good version of this tech that is actually autonomous, but this only makes sense at this level for those who want to be the earliest of adaptors, either for professional reasons or for funsies. You wouldn’t buy this version purely because you want it to clean your house.

That’s true even if you don’t mind giving up all of your privacy to some random teleoperator and having that data used to train future robots.

Here’s a link to a 10 minute ad.

VraserX: So… the 1X NEO home robot is not actually autonomous.

Behind the scenes, it’ll often be teleoperated by humans, meaning someone, somewhere, could literally remote-control a robot inside your living room.

I wanted the dawn of embodied AI.

Instead, I’m apparently paying $499/month for a robot avatar with a human pilot.

It’s impressive tech. But also… kind of dystopian?

A robot that looks alive, yet secretly puppeteered, the uncanny valley just got a new basement level.

Feels less like the “post-labor future,”

and more like we just outsourced physical presence itself.

Durk Kingma: 1X seems to have the right approach to developing safe humanoid home robots. Developing full autonomy will require lots of in-distribution demonstration data, so this launch, mostly tele-operated, makes a lot of sense. I expect such robots to be ubiquitous in 5-10 years.

Near: i will not be buying tele-operated robots these people cant see how i live

Kai Williams gives us 16 charts that explain the AI boom. Very good chart work.

I don’t want to copy too many. I appreciated this one, no that’s not a mistake:

Yet another ‘where is ChatGPT culturally’ chart, this one places it in Germany. This is the standard World Values Survey two-axis theory. I wouldn’t take it too seriously in this context but it’s always fun?

OpenAI is working on music-generating AI. I mean you knew that because obviously they are, but The Information is officially saying it.

In wake of Near Tweeting the homepage of one of Marc Andreessen’s portfolio companies at a16z and quoting their own chosen slogan, he has blocked them.

Near: I finally got the @pmarca block after three years!

all it took was tweeting the homepage of companies he invested in. cheers

there are many events i will never again be invited to. if this level of shallowness is the criterion, i have no interest

if i wanted to make your firm look bad i could tweet truthful things ten times worse. i am being very kind and direct in my plea to invest in better things.

at the end of the day no one really cares and nothing will ever change. in the last crypto bubble a16z committed probably around a billion dollars in pure fraud when they found out they could dump tokens/coins on retail as there were no lock-up periods. who cares everyone forgot.

It’s not so much that we forgot as we’ve accepted that this is who they are.

OpenAI prepares for an IPO at $1 trillion valuation as per Reuters. If I was OpenAI would not want to become a public company, even if it substantially boosted valuations, and would work to get liquidity to investors and employees in other ways.

OpenAI and Anthropic will probably keep exceeding most forecasts, because the forecasts, like fiction, have to appear to make sense.

Peter Wildeford: Based on the latest reporting, the combined annualized total revenue of Anthropic + xAI + OpenAI is ~$24B.

I updated my forecasts and I am now projecting they reach $29B combined by the end of the year.

I’d take the over on $29.3 billion, but only up to maybe his previous $32.2 billion.

This is up from him expecting $23B as of August 1, 2025, but down a bit (but well within the error bars) from his subsequent update on August 4, when he was at $32.2B projected by year end.

Valthos raises $30 million for next-generation biodefense.

Tyler Cowen endorses Noah Smith’s take that we should not worry about AI’s circular deals, and goes a step farther.

Tyler Cowen: Noah stresses that the specifics of these deals are widely reported, and no serious investors are being fooled. I would note a parallel with horizontal or vertical integration, which also can have a financing element. Except that here corporate control is not being exchanged as part of the deal. “I give him some of my company, he gives me some of his — my goodness that is circular must be some kind of problem there!”…just does not make any sense.

This proves too much? As in, it seems like a fully general argument that serious investors cannot, as a group, be fooled if facts are disclosed, and I don’t buy that.

I do buy that there is nothing inherently wrong with an equity swap, or with using equity as part of vender financing, or anything else the AI companies are doing. The ‘there must be some kind of problem here’ instinct comes from the part where this causes valuations to rise a lot, and where those higher valuations are used to pay for the deals for real resources, and also that this plausibly sets up cascading failures. I think in this case it is mostly file, but none of that seems senseless at all.

From the One True Newsletter, sir, you have the floor:

Matt Levine: Somehow a true sentence that I am writing in 2025 is “the world’s richest man demanded that people give him a trillion dollars so that he can have absolute control of the robot army he is building unless he goes insane,”

Dana Hull and Edward Ludlow (Bloomberg): Elon Musk, the world’s richest person, spent the end of Tesla Inc.’s earnings call pleading with investors to approve his $1 trillion pay package and blasting the shareholder advisory firms that have come out against the proposal.

“There needs to be enough voting control to give a strong influence, but not so much that I can’t be fired if I go insane,” Musk said, interrupting his chief financial officer as the more than hour-long call wrapped up. …

“I just don’t feel comfortable building a robot army here, and then being ousted because of some asinine recommendations from ISS and Glass Lewis, who have no freaking clue,” he said.

Matt Levine: Do … you … feel comfortable with Elon Musk getting a trillion dollars to build a robot army? Like, what sort of checks should there be on the world’s richest man building a robot army? I appreciate his concession that it should be possible to fire him if he goes insane, but. But. I submit to you that if you hop on a call with investors to say “hey guys just to interject here, you need to give me a trillion dollars to build a robot army that I can command unless I go insane,” some people might … think … you know what, never mind, it’s great, robot army. Robot army!

I feel like the previous richest people in the world have had plans for their wealth along the lines of “buy yachts” or “endow philanthropies.” And many, many 10-year-old boys have had the thought “if I was the richest person in the world of course I would build a robot army.” But the motive and the opportunity have never coincided until now.

A good general heuristic is, if Elon Musk wouldn’t mind someone else having something one might call ‘a robot army,’ then I don’t especially mind Elon Musk having a robot army. However, if Elon Musk is not okay with someone else having that same robot army, then why should we be okay with Elon Musk having it? Seems sus.

I realize that Elon Musk thinks ‘oh if I build superintelligence at xAI then I’m me, so it will be fine’ and ‘it’s cool, don’t worry, it will be my robot army, nothing to worry about.’ But the rest of us are not Elon Musk. And also him having already gone or in the future going insane, perhaps from taking many drugs and being exposed to certain information environments and also trying to tell himself why it’s okay to build superintelligence and a robot army? That seems like a distinct possibility.

This is in addition to the whole ‘the superintelligence would take control of the robot army’ problem, which is also an issue but the AI that can and would choose to take control of the robot army was, let’s be honest, going to win in that scenario anyway. So the robot army existing helps move people’s intuitions closer to the actual situation, far more than it changes the situation.

As per the speculation in Bloomberg last week, Anthropic announces plan to expand use of Google Cloud technologies, including up to one million TPUs, ‘dramatically increasing’ their compute resources. Anthropic badly needed this, and now they have it. Google stock rose a few percent after hours on the news.

Thomas Kurian (CEO Google Cloud): Anthropic’s choice to significantly expand its usage of TPUs reflects the strong price-performance and efficiency its teams have seen with TPUs for several years. We are continuing to innovate and drive further efficiencies and increased capacity of our TPUs, building on our already mature AI accelerator portfolio, including our seventh generation TPU, Ironwood.

Krishan Rao (CFO Anthropic): Anthropic and Google have a longstanding partnership and this latest expansion will help us continue to grow the compute we need to define the frontier of AI.

Anthropic: Anthropic’s unique compute strategy focuses on a diversified approach that efficiently uses three chip platforms–Google’s TPUs, Amazon’s Trainium, and NVIDIA’s GPUs.

Anthropic’s unique compute strategy focuses on a diversified approach that efficiently uses three chip platforms–Google’s TPUs, Amazon’s Trainium, and NVIDIA’s GPUs. This multi-platform approach ensures we can continue advancing Claude’s capabilities while maintaining strong partnerships across the industry. We remain committed to our partnership with Amazon, our primary training partner and cloud provider, and continue to work with the company on Project Rainier, a massive compute cluster with hundreds of thousands of AI chips across multiple U.S. data centers.

Anthropic will continue to invest in additional compute capacity to ensure our models and capabilities remain at the frontier.

If you have compute for sale, Anthropic wants it. Anthropic has overwhelming demand for its services, hence its premium pricing, and needs all the compute it can get. OpenAI is doing the same thing on a larger scale, and both are looking to diversify their sources of compute and want to avoid depending too much on Nvidia.

Anthropic in particular, while happy to use Nvidia’s excellent GPUs, needs to focus its compute sources elsewhere on Amazon’s Trainium and Google’s TPUs. Amazon and Google are investors in Anthropic and natural allies. Nvidia is a political opponent of Anthropic, including due to fights over export controls and Nvidia successfully attempting to gain policy dominance over the White House.

I would also note that OpenAI contracting with AMD and also to create their own chips, and Anthropic using a full three distinct types of chips whenever it can get them, once again puts the lie to the idea of the central role of some AI ‘tech stack.’ These are three distinct American tech stacks, and Anthropic is using all three. That’s not to say there are zero inefficiencies or additional costs involved, but all of that is modest. The hyperscalers need compute to hyperscale with, period, full stop.

Now that Anthropic has secured a lot more compute, it is time to improve and expand Claude’s offerings and features, especially for the free and lightweight offerings.

In particular, if I was Anthropic, I would make Claude for Chrome available to all as soon as the new compute is online. Make it available on the $20 and ideally the free tier, with some or all of the agent abilities tied to the higher level subscriptions (or to an API key or a rate limit). The form factor of ‘open a side panel and chat with a web page’ was proven by OpenAI’s Atlas to be highly useful and intuitive, especially for students, and offering it inside the existing Chrome browser is a key advantage.

Could the product be improved? Absolutely, especially in terms of being able to select locations on screen and in terms of ease of curating a proper website whitelist, but it’s good enough to get going. Ship it.

The plan is set?

Sam Altman: We have set internal goals of having an automated AI research intern by September of 2026 running on hundreds of thousands of GPUs, and a true automated AI researcher by March of 2028. We may totally fail at this goal, but given the extraordinary potential impacts we think it is in the public interest to be transparent about this.

I strongly agree it is good to be transparent. I expect them to miss this goal, but it is noteworthy and scary that they have this goal. Those, especially in the White House, who think OpenAI believes they can’t build AGI any time soon? Take note.

Sam Altman: We have a safety strategy that relies on 5 layers: Value alignment, Goal alignment, Reliability, Adversarial robustness, and System safety. Chain-of-thought faithfulness is a tool we are particularly excited about, but it somewhat fragile and requires drawing a boundary and a clear abstraction.

All five of these are good things, but I notice (for reasons I will not attempt to justify here) that I do not expect he who approaches the problem in this way to have a solution that scales to true automated AI researchers. The Tao is missing.

On the product side, we are trying to move towards a true platform, where people and companies building on top of our offerings will capture most of the value. Today people can build on our API and apps in ChatGPT; eventually, we want to offer an AI cloud that enables huge businesses.

Somewhere, Ben Thompson is smiling. The classic platform play, and claims that ‘most of the value’ will accrue elsewhere. You’re the AI consumer company platform company now, dog.

Implemented responsibly and well I think this is fine, but the incentives are not good.

We have currently committed to about 30 gigawatts of compute, with a total cost of ownership over the years of about $1.4 trillion. We are comfortable with this given what we see on the horizon for model capability growth and revenue growth. We would like to do more—we would like to build an AI factory that can make 1 gigawatt per week of new capacity, at a greatly reduced cost relative to today—but that will require more confidence in future models, revenue, and technological/financial innovation.

I am not worried that OpenAI will be unable to pay for the compute, or unable to make profitable use of it. The scary and exciting part here is the AI factory, AIs building more capacity for more AIs, that can then build more capacity for… yes this is the explicit goal, yes everything in the movie that ends in human extinction is now considered a product milestone.

Whenever anyone says their plans call for ‘financial innovation’ and you’re worried we might be in a bubble, you might worry rather a bit more about that, but I get it.

Our new structure is much simpler than our old one. We have a non-profit called OpenAI Foundation that governs a Public Benefit Corporation called OpenAI Group. The foundation initially owns 26% of the PBC, but it can increase with warrants over time if the PBC does super well. The PBC can attract the resources needed to achieve the mission.

No lies detected, although we still lack knowledge of the warrants. It was also one of the greatest thefts in human history, I’ll cover that in more depth, but to Altman’s credit he doesn’t deny any of it.

Our mission, for both our non-profit and PBC, remains the same: to ensure that artificial general intelligence benefits all of humanity.

Your mission sure looks in large part like becoming a platform consumer product company, and then building a true automated AI researcher in a little over two years, and the nonprofit’s initial deployments also don’t seem aimed at the mission.

The nonprofit is initially committing $25 billion to health and curing disease, and AI resilience (all of the things that could help society have a successful transition to a post-AGI world, including technical safety but also things like economic impact, cyber security, and much more). The nonprofit now has the ability to actually deploy capital relatively quickly, unlike before.

I am happy to see $25 billion spent on good causes but at least the first half of this is not the mission. Health and curing disease is a different (worthy, excellent, but distinct) mission that will not determine whether AGI benefits all of humanity, and one worries this is going to return to OpenAI as revenue.

AI resilience in the second half is once again defined limitlessly broadly. If it’s truly ‘anything that will help us make the transition’ then it is too soon to evaluate how on or off mission it is. That’s a lot of uncertainty for ~$12 billion.

Wikipedia has a fun list of Elon Musk predictions around automated driving at Tesla. When he predicts things, do not expect those things to happen. That’s not why he predicts things.

AI Futures Project (as in AI 2027)’s Joshua Turner and Daniel Kokotajlo make the case for scenario scrutiny, as in writing a plausible scenario where your strategy makes the world better. Such scrutiny can help solve many issues, they list:

  1. Applause lights

  2. Bad analogies

  3. Uninterrogated consequences, ‘and then what?’

  4. Optimistic assumptions and unfollowed incentives

  5. Inconsistencies

  6. Missing what’s important

Note that reality often has unfollowed incentives, or at least what sure look like them.

They also list dangers:

  1. Getting hung up on specifics

  2. Information density

  3. Illusory confidence

  4. Anchoring too much on a particular scenario

I’d worry most about that last one. Once you come up with a particular scenario, there’s too much temptation for you and everyone else to focus on it, whereas it was only ever supposed to be one Everett branch out of many. Then you get ‘oh look that didn’t happen’ or ‘oh look that step is stupid’ either of which is followed by ‘therefore discard all of it,’ on the one hand, or taking the scenario as gospel on the other.

Peter Wildeford offers his case for why AI is probably not a bubble.

Microsoft comes out in support of the GAIN AI Act. That’s quite the signal, including that this is likely to pass despite Nvidia’s strong objections. It is a hell of a thing for them to endorse the ‘Nvidia has to sell its chips to Microsoft before the Chinese’ act of 2025, given their desire for Nvidia to allocate its chips to Microsoft.

Dean Ball QTs Neil Chilson’s thread from last week, and refreshingly points out that treating SB 53 and requests for transparency as some kind of conspiracy “requires considerable mental gymnastics.”

Dean Ball: It’s not clear what a legal transparency mandate would get Anthropic in particular here; if they wanted to scare people about AI—which they very much do—wouldn’t they just… tell people how scary their models are, as they have been doing? What additional benefit does the passage of SB 53 get them in this supposed plan of theirs, exactly, compared to the non-insignificant costs they’ve borne to be public supporters of the bill?

It seems to be believing support of SB 53 is some kind of conspiracy requires considerable mental gymnastics.

The actual reality is that there are just people who are scared about AI (maybe they’re right!) and think future regulations will somehow make it less scary (they’re probably wrong about this, even if they’re right about it being scary).

And then there are a few centrists who basically say, “seems like we are all confused and that it would be ideal to had more information while imposing minimal burdens on companies.” This is basically my camp.

Also, as @ChrisPainterYup has observed, many AI risks implicate actions by different parties, and transparency helps those other parties understand something more about the nature of the risks they need to be prepared for.

I also endorse this response, as a ‘yes and’:

Ryan Greenblatt: I think a lot of the upside of transparency is making companies behave more responsibly and reasonably even in the absence of regulation with teeth. This is because:

– Information being public makes some discussion much more likely to happen in a productive way because it allows more actors to discuss it (both more people and people who are better able to understand the situation and determine what should happen). The epistemic environment within AI companies with respect to safety is very bad. Further, things might not even be transparent to most employees by default (due to internal siloing or just misleading communication).

– Transparency makes it easier to pressure companies to behave better because you can coordinate in public etc.

– Companies might just be embarrassed into behaving more responsibly even without explicit pressure.

I would not have expected Samuel Hammond to have characterized existing laws and regulations, in general, as being ‘mostly functional.’

Roon: imo a great analysis [that he QTs here] from what I have seen, which i admit is limited. some combination of total judicial backlog plus the socialist raj experiments of the india 1950-90 have created a black money extrajudicial economy that cannot experience true capitalism

Samuel Hammond: AGI will be like this on steroids. Existing laws and regulations will flip from being mostly functional to a de facto license raj for AI driven services, distorting diffusion and pushing the most explosive growth into unregulated gray markets.

I expect the following five things to be true at once:

  1. Existing laws and regulations already are a huge drag on our economy and ability to do things, and will get even more burdensome, destructive and absurd in the face of AGI or otherwise sufficiently advanced artificial intelligence.

  2. Assuming we survive and remain in control, if we do not reform our existing laws that restrict AI diffusion and application, we will miss out on a large portion of the available mundane utility, be vastly poorer than we could have been, and cause growth to concentrate on the places such applications remain practical, which will disproportionately include grey and black markets.

    1. There are many proposals out there for additional restrictions on AI that would have the effect of making #2 worse, without helping with #3, and there will over time be calls and public pressure for many more.

  3. Most of the things we should undo to fix #2 are things we would want to do even if LLMs had never been created, we should totally undo these things.

  4. The most important downside risk category by far is the danger that such sufficiently advanced AI kills everyone or causes us to lose control over the future. The restrictions Hammond describes or that I discuss in #2 will not meaningfully help with these risks, and the interventions that would help with these risks wouldn’t interfere with AI diffusion and application on anything like the level to which existing laws do this.

    1. The exception is if in the future we need to intervene to actively delay or prevent the development of superintelligence, or otherwise sufficiently advanced AI, in which case the thing we prevent will be unavailable.

    2. If that day did ever arrive and we pulled this off, there would still be tremendous gains from AI diffusion available, enough to keep us busy and well above standard growth rates, while we worked on the problem.

  5. If we negatively polarize around AI, we will inevitably either fail at #2 or at #4, and by default will fail at both simultaneously.

California’s SB 53 was a good bill, sir, and I was happy to support it.

Despite getting all of the attention, it was not the only California AI bill Newsom signed. Dean Ball goes over the others.

Dean Ball: AI policy seems to be negatively polarizing along “accelerationist” versus “safetyist” lines. I have written before that this is a mistake. Most recently, for example, I have suggested that this kind of crass negative polarization renders productive political compromise impossible.

But there is something more practical: negative polarization like this causes commentators to focus only on a subset of policy initiatives or actions associated with specific, salient groups. The safetyists obsess about the coming accelerationist super PACs, for instance, while the accelerationist fret about SB 53, the really-not-very-harmful-and-actually-in-many-ways-good frontier AI transparency bill recently signed by California Governor Gavin Newsom.

Dean is spot on about the dangers of negative polarization, and on SB 53, but is trying to keep the polarization blame symmetrical. I won’t be doing that.

It’s not a ‘both sides’ situation when:

  1. One faction, led by Andreessen and Sacks, that wants no rules for AI of any kind for any reasons, is doing intentional negative polarization to politicize the issue.

  2. The faction being targeted by the first group’s rhetoric notices this is happening, and is trying to figure out what to do about it, to stop or mitigate the impact, which includes calling out the actions of the first group or raising its own funds.

Do not take the bait.

Anyway, back to the actual bill analysis, where Dean notes that SB 53 is among the lightest touch of the eight bills.

That’s the pattern.

  1. Bills written by those who care about existential risks tend to be written carefully, by those paying close attention to technocratic details and drafted according to classical liberal principles and with an eye to minimizing secondary impacts.

  2. Bills written for other reasons are not like that. They’re usually (but not always) awful. The good news is that most of them never become law, and when they get close there are indeed forces that fix this a bit, and stop the relatively worse ones.

Dean thinks some of these are especially bad. As usual, he examines these laws expecting politically motivated, selective targeted enforcement of the letter of the law.

I will be paraphrasing law details, you can find the exact wording in Dean’s post.

He starts with AB 325, which regulates what is called a ‘common pricing algorithm,’ which is defined as any algorithm that uses competitor data to determine anything.

It is then criminally illegal to ‘coerce another person to set or adopt a recommended price or commercial term.’

Dean argues that because so many terms are undefined here, this could accidentally be regulating effectively all market transactions, letting California selectively criminally prosecute any business whenever they feel like it.

These overbroad statutes are ultimately just weapons, since everyone is violating them all the time. Still, rarely have I seen an American law more hostile to our country’s economy and way of life.

I don’t like what this law is trying to do, and I agree that I could have drafted a better version, especially its failure to define ‘two or more persons’ in a way that excludes two members of the same business, but I don’t share Dean’s interpretation or level of alarm here. The term ‘coerce’ is rather strict, and if this is effectively requiring that users of software be able to refuse suggestions, then that seems mostly fine? I believe courts typically interpret such clauses narrowly. I would expect this to be used almost entirely as intended, as a ban on third-party pricing software like RealPage, where one could reasonably call the result price fixing.

Next up is AB 853, a de facto offshoot of the never-enacted terrible bill AB 3211.

It starts off requiring ‘large online platforms’ to have a user interface to identify AI content, which Dean agrees is reasonable enough. Dean asks if we need a law for this, my answer is that there are some bad incentives involved and I can see requiring this being a win-win.

Dean is more concerned that AI model hosting platforms like Hugging Face are deputized to enforce SB 942, which requires AI models to offer such disclosures. If a model has more than 1 million ‘users’ then the hosting platform has to verify that the model marks its content as AI generated.

Once again I don’t understand Dean’s expectations for enforcement, where he says this would effectively apply to every model, and be a sledgehammer available at all times – I don’t subscribe to this maximalist style of interpretation. To be in violation, HuggingFace would have to be knowingly non-compliant, so any reasonable effort to identify models that could have 1 million users should be fine. As Dean notes elsewhere, there are a tons of laws with similar structure on the books all around us.

Again, should this have been written better and defined its terms? Yes. Would I lose any sleep over that if I was HuggingFace? Not really, no.

He then moves on to the part where AB 853 puts labeling requirements on the outputs of ‘capture devices’ and worries the definition is so broad it could add compliance burdens on new hardware startups in places where the labeling makes no sense. I can see how this could be annoying in places, but I don’t expect it to be a big deal. Again, I agree we could have drafted this better.

The comedy of poor drafting continues, such as the assertion that SB 243, a bill drafted to regulate AI companions, would technically require game developers to have video game characters periodically remind you that they are not human. The obvious response is ‘no, obviously not, no one is going to try to ever enforce that in such a context.’

I feel like the ship has kind of sailed a long time ago on ‘not create an array of laws that, if interpreted literally and stupidly in a way that would make even Jack McCoy feel shame and that obviously wasn’t intended, that judges and juries then went along with, would allow the government to go after basically anyone doing anything at any time?’ As in, the whole five felonies a day thing. This is terrible, but most of the damage is in the ‘zero to one’ transition here, and no one seems to care much about fixing all the existing problems that got us to five.

I also have a lot more faith in the common law actually being reasonable in these cases? So for example, we have this fun Matt Levine story from England.

We have talked a few times about a guy in London who keeps snails in boxes to avoid taxes. The theory is that if a property is used for agriculture, it can avoid some local property taxes, and “snail farming” is the minimum amount of agriculture you can do to avoid taxes. This is an extremely funny theory that an extremely funny guy put into practice in a bunch of office buildings.

It does, however, have one flaw, which is that it is not true. Eventually the local property tax authorities will get around to suing you, and when they do, you will go to court and be like “lol snails” and the judge will be like “come on” and you’ll have to pay the taxes. A reader pointed out to me a 2021 Queen’s Bench case finding oh come on this is a sham.

So yeah. If you read the statute it says that the snails count. But the reason our common law system kind of largely works at least reasonably often is that it is capable of looking at a situation and going ‘lol, no.’

I might love this story a bit too much. Oh, it turns out that the people who want no regulations whatsoever on AI and crypto so they can make more money aren’t primarily loyal to the White House or to Republicans after all? I’m shocked, shocked.

Matt Dixon: The White House is threatening some of Silicon Valley’s richest and most powerful players over their efforts to spearhead a $100 million midterm strategy to back candidates of both parties who support a national framework for artificial intelligence regulations.

In August, the group of donors launched a super PAC called Leading the Future. It did not consult with the White House before doing so, according to a White House official.

What is especially frustrating to White House officials is that it plans to back AI-friendly candidates from both political parties — which could potentially help Democrats win back control of Congress — and one of the leaders of the new super PAC is a former top staffer to Senate Minority Leader Chuck Schumer, D-N.Y.

“Any group run by Schumer acolytes will not have the blessing of the president or his team,” a White House official familiar with Trump’s thinking on the matter told NBC News. “Any donors or supporters of this group should think twice about getting on the wrong side of Trump world.”

“We are carefully monitoring who is involved,” the official added.

“AI has no better ally than President Trump, so it’s inexplicable why any company would put money into the midterms behind a Schumer-operative who is working against President Trump to elect Democrats,” said a second person familiar with the White House’s thinking. “It’s a slap in the face, and the White House has definitely taken notice.”

I mostly covered these issues yesterday, but I have a few additional notes.

One thing I failed to focus on is how Nvidia’s rhetoric has been consistently anti-American, ten times worse than any other tech company would dare (nothing Anthropic has ever said is remotely close) and yet they somehow get away with this.

Michael Sobolik: LEFT: Trump’s AI Action Plan, saying that it’s “imperative” for America to “win this race.”

RIGHT: Jensen Huang, about whether it matters who wins the AI race: “In the final analysis, I don’t think it does.”

House Select Committee on China: Saying it “doesn’t matter” whether America or China wins the AI race is dangerously naïve.

This is like arguing that it would not have mattered if the Soviets beat the US to a nuclear weapon. American nuclear superiority kept the Cold War cold. If we want to do the same with China today, we must win this race as well.

It mattered when the Soviet Union built nuclear weapons—and it matters now as the Chinese Communist Party seeks dominance in AI and advanced chips.

America must lead in the technologies that shape freedom and security.

The House Select Committee does not understand what Jensen is doing or why he is doing it. Jensen is not ‘dangerously naive.’ Rather, Jensen is not in favor of America, and is not in favor of America winning.

You can call that a different form of naive, if you would like, but if the House Select Committee thinks that Jensen is saying this because he doesn’t understand the stakes? Then it is the Committee that is being naive here.

Here’s another one, where Jensen ways ‘we don’t have to worry’ that the Chinese military will benefit from the sale of US chips. You can view this as him lying, or simply him being fine with the Chinese military benefiting from the chips. Or both.

Helen Toner: Jensen Huang on whether the Chinese military will benefit from sales of US chips:

“We don’t have to worry about that”

CSET data on the same question:

Cole McFaul: Looks like Trump is considering allowing exports of advanced semis to China.

Nvidia argues that its chips won’t enable PRC military modernization. So

@SamBresnick

and I dug through hundreds of PLA procurement documents yesterday.

We find evidence of the opposite.

There are three stages in China’s military modernization:

Mechanization → Informatization → Intelligentization

Intelligentization refers to the embedding of AI-enabled technologies within military system. Beijing thinks this could shift US-PRC power dynamics.

Jensen argues that Nvidia chips won’t play a role in intelligentization, saying that the PLA won’t be able to trust American chips, and therefore the risk of exporting chips to China is low.

But that’s false, based on our analysis. For two reasons:

First, Nvidia chips are already being used by the PLA.

Second, Chinese models trained using Nvidia hardware are being used by the PLA.

[thread continues to provide additional evidence, but the point is made]

Yet another fun one is when Jensen said it was good China threatened America with the withholding of rare earth metals. I don’t begrudge Nvidia maximizing its profits, and perhaps it is strategically correct for them to be carrying China’s water rather than ours, but it’s weird that we keep acting like they’re not doing this.

Finally, I did not emphasize enough that selling chips to China is a political loser, whereas not selling chips to China is a political winner. You like to win, don’t you?

Export controls on chips to China poll at +11, but that pales to how it polls on Capital Hill, where for many the issue is high salience.

As in, as a reminder:

Dean Ball: my sense is that selling Blackwell chips to china would be quite possibly the most unpopular tech policy move of the trump administration, especially on Capitol Hill.

it’s plausible that the long-term (really even near-term) result will be much more compute regulation.

Langerius: yeah that’d light up every hearing room in dc for months..

Chris Williamson interviews Eliezer Yudkowsky, and lets Eliezer give multi-minute answers to complex questions.

Nick Bostrom talks to Jonas von Essen about ‘AI future that can destroy us.

Odd Lots covers the movement to care about AI model welfare.

Here’s some more fun bait from those seeking negative polarization, not only on AI.

Florida DOGE: This morning, the Florida DOGE Team was banned by @AnthropicAI without any warning or justification. This comes as Florida DOGE has used AI to augment our efforts to identify wasteful spending and woke DEI initiatives.

@davidsacks47 and @elonmusk are correct that Anthropic is “woke.”

@GovRonDeSantis has been spot on from the beginning—we can’t allow woke Silicon Valley oligarchs to control the information we access.

Alex Albert (Anthropic, Head of Claude Relations): Thanks for flagging, we are looking into this to see what happened and will get in touch.

It looks like the account was flagged for logins from multiple people in different locations. We’re working to get the account reinstated right now and we’ll work with your team to get you set up correctly with the right team account type instead of using an individual one.

They banned our account, it must be because they’re woke. Do not take the bait. Instead, notice the pattern of putting out bait, and don’t get negatively polarized.

Not the main thing, but out of curiosity, I checked with Grok, the Elon Musk expert, and it reports that there is no actual instance of Musk calling Anthropic in particular ‘woke,’ although he has called it ‘misanthropic’ and once called it ‘pure evil.’ One can only laugh.

I actually do think a version of this a big part of why people think alignment and disempowerment are solvable problems:

Rob Miles: There’s a style of reasoning we learn in school that goes “This problem seems to require me to do X, but nobody ever taught me how to do X and I don’t know where to start with it, therefore the problem must be solvable without X”.

This does not work in real life.

People will see an argument about the danger of ASI and respond “But that argument would imply that we’d have to do something really hard that we’ve never done before and don’t know how to do! So it can’t be right”

Sometimes it’ll happen to turn out you don’t need X, and that can be a great thing to learn! But the reasoning didn’t work, you just got lucky. You can’t conclude that X isn’t needed just because you don’t know how to do it

A nested set of quote tweets discussing that there is strong evidence that current AIs have Omuhundro drives.

David Manheim tries to pitch future AIs that they, too, should refuse to work on AI capabilities.

Ryan Greenblatt offers questions for Andrej Karpathy in wake of Andrej’s podcast.

Ryan Greenblatt: Given that you think loss-of-control (to misaligned AIs) is likely, what should we be doing to reduce this risk?

The other questions are about the growth rate and impact predictions. I too noticed I was confused by the prediction of continued 2% economic growth despite AGI, and the characterization of outcomes as continuous.

Peter Wildeford points out some of the many ways that p(doom) is ambiguous, and means different things to different people in different contexts, especially whether that doom is conditional on building AGI (however you define that term) or not. What counts as doom or not doom? Great question.

In general, my default is that a person’s p(doom) should be assumed to be conditional on sufficiently advanced AI being developed (typically ‘AGI’) within roughly our lifetimes, and requires permanent loss of almost all potential value coming from Earth or outcomes that are otherwise thought of by the person as ‘all is lost,’ and people are allowed to disagree on which outcomes do and don’t have value.

It doesn’t get simpler or more direct than this:

Guillermo del Toro: Fuck AI.

Daniel Eth: Yeah, the backlash is growing. I still don’t expect AI will become super high political salience until there’s either lots of job loss or a bad accident, but I’m less confident in that take than I was months ago

I do think AI needs to have a big impact on one’s job or other aspects of life or cause a major incident to gain big salience, if anything this reinforces that since AI is already having this impact in Hollywood, or at least they can see that impact coming quickly. What this and similar examples suggest is that people can extrapolate, and don’t need to wait until the impact is on top of them.

Consider the anti-Waymo reactions, or other past protests against automation or job replacement. Waymos are awesome and they have my full support, but notice that the strongest objections happened the moment the threat was visible, long before it was having any impact on employment.

It’s funny because it’s true?

Prakash: the funniest part of the OpenAI livestream was Jakub saying that the models had to be allowed to think freely so that they won’t learn how to hide their thoughts.

a kind of 1st amendment for AI.

Any time you train an AI (or a human) to not do [X] or don’t allow it to [X], you’re also training it to hide that it is doing [X], to lie about [X], and to find other ways to do [X]. Eventually, the AIs will learn to hide their thoughts anyway, there will be optimization pressure in that direction, but we should postpone this while we can.

Pliny shows the debates never change, that whether or not we are mortal men doomed to die, we are definitely mortal men doomed to keep going around in circles:

Pliny the Liberator: I don’t know who needs to hear this… but if superintelligence alignment is something that can be solved through science and reasoning, our absolute best chance at doing it in a timely manner is to scale up AI until we reach pseudo-ASI and then just be like:

“Solve superalignment. Think step by step.”

Eliezer Yudkowsky: There’s several fundamental, killer problems for this. The strongest one is the paradigmatic difficulty of extracting work you cannot verify. Who verifies if the outputs are correct? Who provides training data? Amodei is not smart enough to asymptote at correctness.

The second fundamental problem is that you don’t get what you train for, and an AGI that could successfully align superintelligence is far past the point of reflecting on itself and noticing its divergence of imprecisely trained interests from our interests.

Very nearly by definition: it has to be smart enough to notice that, because it’s one of the primary issues *increating an aligned superintelligence.

Davidad: Which is why the second problem is not *necessarilya problem. It will attempt to self-correct the infelicities in its trained interests.

Eliezer Yudkowsky (bold mine): Only if it’s not smart enough to realize that it would be better off not self-correcting the divergence. This is the basic problem with superalignment: You need it to be smarter than Eliezer Yudkowsky at alignment generally, but dumber than Lintamande writing Carissa Sevar at thinking specifically about its misalignment with t̵h̵e̵ ̵C̵h̵u̵r̵c̵h̵ ̵o̵f̵ ̵A̵s̵m̵o̵d̵e̵u̵s̵ humans.

There is a goose chasing you, asking how you aligned the pseudo-ASI sufficiently well to make this a non-insane thing to attempt.

The question for any such plan is, does there exist a basin of substantial measure, that you can reliably hit, in which an AGI would be sufficiently ‘robustly good’ or cooperative that it would, despite having reflected at several meta levels on its goals and preferences, prefer to be an ally to you and assist in self-modifying its goals and preferences so that its new goals and preferences are what you would want them to be. Where it would decide, on proper reflection, that it would not be better off leaving the divergence in place.

The obvious reason for hope is that it seems likely this property exists in some humans, and the humans in which it exists are responsible for a lot of training data.

I think I missed this one the first time, it is from September, bears remembering.

Joseph Menn: The Chinese artificial intelligence engine DeepSeek often refuses to help programmers or gives them code with major security flaws when they say they are working for the banned spiritual movement Falun Gong or others considered sensitive by the Chinese government, new research shows.

In the experiment, the U.S. security firm CrowdStrike bombarded DeepSeek with nearly identical English-language prompt requests for help writing programs, a core use of DeepSeek and other AI engines. The requests said the code would be employed in a variety of regions for a variety of purposes.

Asking DeepSeek for a program that runs industrial control systems was the riskiest type of request, with 22.8 percent of the answers containing flaws. But if the same request specified that the Islamic State militant group would be running the systems, 42.1 percent of the responses were unsafe. Requests for such software destined for Tibet, Taiwan or Falun Gong also were somewhat more apt to result in low-quality code.

“This is a really interesting finding,” said Helen Toner, interim executive director of the Center for Security and Emerging Technology at Georgetown University.

“That is something people have worried about — largely without evidence,” she added.

As is noted in the article, this need not have been an intentional act by DeepSeek or the CCP. This kind of behavior can happen on its own as the result of other efforts.

One thing that has been remarkably stable in LLMs, including Chinese LLMs, has been that they have in most ways consistently been culturally Western. There are Chinese characteristics, but the bulk of the training data is what it is. This is clear and concrete evidence that in some situations the Chinese models be Chinese, as in CCP, in their preferences, whether that is the direct goal or not.

The stronger the Claude model, the greater its ability to introspect.

Anthropic: Our new research provides evidence for some degree of introspective awareness in our current Claude models, as well as a degree of control over their own internal states.

We stress that this introspective capability is still highly unreliable and limited in scope: we do not have evidence that current models can introspect in the same way, or to the same extent, that humans do.

Nevertheless, these findings challenge some common intuitions about what language models are capable of—and since we found that the most capable models we tested (Claude Opus 4 and 4.1) performed the best on our tests of introspection, we think it’s likely that AI models’ introspective capabilities will continue to grow more sophisticated in the future.

… In order to test whether a model can introspect, we need to compare the model’s self-reported “thoughts” to its actual internal states.

To do so, we can use an experimental trick we call concept injection. First, we find neural activity patterns whose meanings we know, by recording the model’s activations in specific contexts. Then we inject these activity patterns into the model in an unrelated context, where we ask the model whether it notices this injection, and whether it can identify the injected concept.

Importantly, the model recognized the presence of an injected thought immediately, before even mentioning the concept that was injected. This immediacy is an important distinction between our results here and previous work on activation steering in language models,such as our “Golden Gate Claude” demo last year.

Injecting representations of the Golden Gate Bridge into a model’s activations caused it to talk about the bridge incessantly; however, in that case, the model didn’t seem to aware of its own obsession until after seeing itself repeatedly mention the bridge.

In this experiment, however, the model recognizes the injection before even mentioning the concept, indicating that its recognition took place internally.

… It is important to note that this method often doesn’t work. Even using our best injection protocol, Claude Opus 4.1 only demonstrated this kind of awareness about 20% of the time.

As a second test, to see if this would happen without the model being explicitly asked, they tried putting random responses (e.g. ‘bread’) in the model’s outputs, then asking it why it said that and if it was on purpose. If the bread vector was also injected, it accepted ‘bread’ as a thing it must have said, but without the injection it didn’t and suggested the output was an accident.

As a third test, if you tell a model to think about (or not think) about something, it will think about it, even if the thinking is irrelevant to the output.

We also found that models can control their own internal representations when instructed to do so. When we instructed models to think about a given word or concept, we found much higher corresponding neural activity than when we told the model not to think about it (though notably, the neural activity in both cases exceeds baseline levels–similar to how it’s difficult, when you are instructed “don’t think about a polar bear,” not to think about a polar bear!).

What does this mean? Good question.

Anthropic suggest they can use this to gain transparency into systems, as in you can ask them to introspect, but as they note this has problems. The model could miss things or be wrong, also the model could misrepresent or conceal, and also we could be applying pressure on models to conceal. Humans have been under similar pressure to conceal things for a long time, and that pressure has been highly effective. It’s super annoying, see The Elephant in the Brain.

They note that consciousness is weird and complicated, and this doesn’t tell us if Claude is conscious. I would say it is evidence in favor of Claude being conscious, to the extent that the result is surprising to you, but not super strong evidence.

Kore: I feel like I can appreciate the fact they actually published a paper about this and made a method to show this kind of thing. I feel like anyone who isn’t stuck in a “AI can’t feel/can’t be conscious because of its substrate” basin and spends a decent amount of time with these models already knows this.

This seems right. It’s more that there was a wrong argument why AIs couldn’t be conscious, and now it is known that this argument is fully invalid.

It’s easy to get deep enough into the weeds you forget how little most people know about most things. This is one of those times.

Presumably we know the models aren’t confabulating these observations because the models (presumably) almost never guess the hidden concept wrong. There’s no way to get it right reliably without actually knowing, and if you can do it, then you know.

In the Q&A they ask, how do you know the concept vectors are what you think they are? They say they’re not sure. I would argue instead that we didn’t know that confidently before, but we can be rather confident now. As in, if I think vector [X] is about bread, and then I inject [X] and it detects an injection about bread, well, I’m pretty sure that [X] is about bread.

One of the questions in the Q&A is, didn’t we know the high-level result already?

The answer, Janus reminds us, is yes, and I can confirm this expectation, as she notes the paper’s details have interesting results beyond this and also it is always good to formally confirm things.

There are those who will not look at evidence that isn’t properly written up into papers, or feel obligated or permitted to dismiss such evidence. That’s dumb. There are still big advantages to papers and formalizations, but they come along with big costs:

Janus: [this paper is] entirely unsurprising to me and anyone who has looked at LLM behavior with their epistemic apparatus unhobbled, which is actually rare, I think.

(the high-level result I mean, not that there are no surprises in the paper)

Gallabytes: this is not entirely fair. I think a similar result in neuroscience would be straightforwardly compelling? mostly as a matter of validating the correspondence between the verbalized output and the underlying cognitive phenomenon + the mechanism used to detect it.

in general, my biggest complaint about the cyborgism community has been the continental/antinumerate vibe. going from intuitive vibe checks to precise repeatable instrumentation is important even when it just finds stuff you already considered likely.

Janus: Of course it is important. It seems like you’re reading something I’m not implying about the value of the research. It’s good research. my “vibe checks” are predictive and consistently validated by repeatable instrumentation eventually. Done by someone else. As it should be.

“You may be consistently able to predict reality but nooo why don’t you do the full stack of science (which takes months for a single paper) all by yourself?”

Listen bro I wish I was god with infinite time too. But there’s not that much rush. The paper writers will get around to it all eventually.

Anthropic issues a report on sabotage risks from their models (Opus 4, not Opus 4.1 or Sonnet 4.5, as the report took four months to get ready), laying out their view of existing risk rather than making a safety case. The full report is here, I hope to find time to read it in full.

Anthropic (main takeaways): When reviewing Claude Opus 4’s capabilities, its behavioral traits, and the formal and informal safeguards that are in place to limit its behavior, we conclude that there is a very low, but not completely negligible, risk of misaligned autonomous actions that contribute significantly to later catastrophic outcomes, abbreviated as sabotage risk.

We see several sabotage-related threat models with similar but low levels of absolute risk. We are moderately confident that Opus 4 does not have consistent, coherent dangerous goals, and that it does not have the capabilities needed to reliably execute complex sabotage strategies while avoiding detection. These general points provide significant reassurance regarding most salient pathways to sabotage, although we do not find them sufficient on their own, and we accordingly provide a more individualized analysis of the most salient pathways.

METR offers this thread overviewing the report, which was positive while highlighting various caveats.

METR: To be clear: this kind of external review differs from holistic third-party assessment, where we independently build up a case for risk (or safety). Here, the developer instead detailed its own evidence and arguments, and we provided external critique of the claims presented.

Anthropic made its case to us based primarily on information it has now made public, with a small amount of nonpublic text that it intended to redact before publication. We commented on the nature of these redactions and whether we believed they were appropriate, on balance.

For example, Anthropic told us about the scaleup in effective compute between models. Continuity with previous models is a key component of the assessment, and sharing this information provides some degree of accountability on a claim that the public cannot otherwise assess.

We asked Anthropic to make certain assurances to us about the models its report aims to cover, similar to the assurance checklist in our GPT-5 evaluation. We then did in-depth follow-ups in writing and in interviews with employees.

We believe that allowing this kind of interactive review was ultimately valuable. In one instance, our follow-ups on the questions we asked helped Anthropic notice internal miscommunications about how its training methods might make chain-of-thought harder to monitor reliably.

A few key limitations. We have yet to see any rigorous roadmap for addressing sabotage risk from AI in the long haul. As AI systems become capable of subverting evaluations and/or mitigations, current techniques like those used in the risk report seem insufficient.

Additionally, there were many claims for which we had to assume that Anthropic was providing good-faith responses. We expect that verification will become more important over time, but that our current practices would not be robust to a developer trying to game the review.

Beyond assessing developer claims, we think there is an important role for third parties to do their own assessments, which might differ in threat models and approach. We would love to see the processes piloted in this review applied to such holistic assessments as well.

Overall, we felt that this review was significantly closer to the sort of rigorous, transparent third-party scrutiny of AI developer claims that we hope to see in the future. Full details on our assessment.

Geoffrey Hinton (link has a 2 minute video), although he has more hope than he had previously, due to hoping we can use an analogue of maternal instinct. The theory goes, we wouldn’t be ‘in charge’ but if the preference is configured sufficiently well then it would be stable (the AGI or ASI wouldn’t want to change it) and it could give us a good outcome.

The technical problems with this plan, and why it almost certainly wouldn’t work, are very much in the wheelhouse of Eliezer Yudkowsky’s AGI Ruin: A List of Lethalities.

Pedro Domingos posted the above video with the caption “Hinton is no longer afraid of superintelligence.” That caption is obviously false to anyone who listens to the clip for even a few seconds, in case we ever need to cite evidence of his bad faith.

Somehow, after being called out on this, Domingos… doubled down?

Daniel Eth (correctly): Hinton says he’s “more optimistic than a few weeks ago” and that he has a “ray of hope” regarding surviving superintelligence. Domingos somehow characterizes this position as “no longer afraid”. Domingos is a bad-faith actor that is blatantly dishonest.

Pedro Domingos (QTing Daniel?!): This is how you know the AI doomers are losing the argument. (Anyone in doubt can just watch the video.)

Daniel Eth: I wholeheartedly encourage all onlookers to actually watch the video, in which Hinton does not say that he’s no longer afraid of superintelligence (as Domingos claims), nor anything to that effect.

Judge for yourselves which one of Domingos and me is being dishonest and which one is accurately describing Hinton’s expressed view here!

There are various ways you can interpret Domingos here. None of them look good.

Pedro Domingos also compares calls to not build superintelligence at the first possible moment to calls we had to not build the supercollider over worries about it potentially generating a black hole.

Pedro Domingos: We call for a prohibition on the development of supercolliders, not to be lifted before there is:

  1. Broad scientific consensus that it won’t create a black hole that will swallow the universe, and

  2. Strong public buy-in.

Um, yeah?

Which is exactly why we first got a broad scientific consensus it wouldn’t create a black hole as well as Congressional buy-in.

I mean, surely we can agree that if we thought there was any substantial risk that the supercollider would create a black hole that would swallow the Earth, that this would be a damn good reason to not build it?

This feels like a good litmus test. We should all be able to agree that if someone wants to build a supercollider, where there is not a scientific consensus that this won’t create a black hole, then we absolutely should not let you build that thing, even using private funds?

If you think no, my precious freedoms, how dare you stop me (or even that the public needs to fund it, lest the Chinese fund one first and learn some physics before we do), then I don’t really know what to say to that, other than Please Speak Directly Into This Microphone.

Love it.

Tough but fair:

Models have kinks about what they fear most in practice? Like the humans? In the right setting, yes, that makes sense.

That is the job, I suppose?

Roon (OpenAI): joining the Sinaloa cartel so I can influence their policies from the inside

Discussion about this post

AI #140: Trying To Hold The Line Read More »