Author name: Kelly Newman

ai-#102:-made-in-america

AI #102: Made in America

I remember that week I used r1 a lot, and everyone was obsessed with DeepSeek.

They earned it. DeepSeek cooked, r1 is an excellent model. Seeing the Chain of Thought was revolutionary. We all learned a lot.

It’s still #1 in the app store, there are still hysterical misinformed NYT op-eds and and calls for insane reactions in all directions and plenty of jingoism to go around, largely based on that highly misleading $6 millon cost number for DeepSeek’s v3, and a misunderstanding of how AI capability curves move over time.

But like the tariff threats that’s now so yesterday now, for those of us that live in the unevenly distributed future.

All my reasoning model needs go through o3-mini-high, and Google’s fully unleashed Flash Thinking for free. Everyone is exploring OpenAI’s Deep Research, even in its early form, and I finally have an entity capable of writing faster than I do.

And, as always, so much more, even if we stick to AI and stay in our lane.

Buckle up. It’s probably not going to get less crazy from here.

From this week: o3-mini Early Days and the OpenAI AMA, We’re in Deep Research and The Risk of Gradual Disempowerment from AI.

  1. Language Models Offer Mundane Utility. The new coding language is vibes.

  2. o1-Pro Offers Mundane Utility. Tyler Cowen urges you to pay up already.

  3. We’re in Deep Research. Further reviews, mostly highly positive.

  4. Language Models Don’t Offer Mundane Utility. Do you need to bootstrap thyself?

  5. Model Decision Tree. Sully offers his automated use version.

  6. Huh, Upgrades. Gemini goes fully live with its 2.0 offerings.

  7. Bot Versus Bot. Wouldn’t you prefer a good game of chess?

  8. The OpenAI Unintended Guidelines. Nothing I’m conscious of to see here.

  9. Peter Wildeford on DeepSeek. A clear explanation of why we all got carried away.

  10. Our Price Cheap. What did DeepSeek’s v3 and r1 actually cost?

  11. Otherwise Seeking Deeply. Various other DeepSeek news, a confused NYT op-ed.

  12. Smooth Operator. Not there yet. Keep practicing.

  13. Have You Tried Not Building An Agent? I tried really hard.

  14. Deepfaketown and Botpocalypse Soon. Free Google AI phone calls, IG AI chats.

  15. They Took Our Jobs. It’s going to get rough out there.

  16. The Art of the Jailbreak. Think less.

  17. Get Involved. Anthropic offers a universal jailbreak competition.

  18. Introducing. DeepWriterAI.

  19. In Other AI News. Never mind that Google pledge to not use AI for weapons.

  20. Theory of the Firm. What would a fully automated AI firm look like?

  21. Quiet Speculations. Is the product layer where it is at? What’s coming next?

  22. The Quest for Sane Regulations. We are very much not having a normal one.

  23. The Week in Audio. Dario Amodei, Dylan Patel and more.

  24. Rhetorical Innovation. Only attack those putting us at risk when they deserve it.

  25. Aligning a Smarter Than Human Intelligence is Difficult. If you can be fooled.

  26. The Alignment Faking Analysis Continues. Follow-ups to the original finding.

  27. Masayoshi Son Follows Own Advice. Protein is very important.

  28. People Are Worried About AI Killing Everyone. The pope and the patriarch.

  29. You Are Not Ready. Neither is the index measuring this, but it’s a start.

  30. Other People Are Not As Worried About AI Killing Everyone. A word, please.

  31. The Lighter Side. At long last.

You can subvert OpenAI’s geolocation check with a VPN, but of course never do that.

Help you be a better historian, generating interpretations, analyzing documents. This is a very different modality than the average person using AI to ask questions, or for trying to learn known history.

Diagnose your child’s teeth problems.

Figure out who will be mad about your tweets. Next time, we ask in advance!

GFodor: o3-mini-high is an excellent “buddy” for reading technical papers and asking questions and diving into areas of misunderstanding or confusion. Latency/IQ tradeoff is just right. Putting this into a great UX would be an amazing product.

Right now I’m suffering through copy pasting and typing and stuff, but having a UI where I could have a PDF on the left, highlight sections and spawn chats off of them on the right, and go back to the chat trees, along with voice input to ask questions, would be great.

(I *don’twant voice output, just voice input. Seems like few are working on that modality. Asking good questions seems easier in many cases to happen via voice, with the LLM then having the ability to write prose and latex to explain the answer).

Ryan: give me 5 hours. ill send a link.

I’m not ready to put my API key into a random website, but that’s how AI should work these days. You don’t like the UI, build a new one. I don’t want voice input myself, but highlighting and autoloading and the rest all sound cool.

Indeed, that was the killer app for which I bought a Daylight computer. I’ll report back when it finally arrives.

Meanwhile the actual o3-mini-high interface doesn’t even let you to upload the PDF.

Consensus on coding for now seems to be leaning in the direction that you use Claude Sonnet 3.6 for a majority of ordinary tasks, o1-pro or o3-mini-high for harder ones and one shots, but reasonable people disagree.

Karpathy has mostly moved on fully to “vibe coding,” it seems.

Andrej Karpathy: There’s a new kind of coding I call “vibe coding”, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It’s possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good.

Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like “decrease the padding on the sidebar by half” because I’m too lazy to find it.

I “Accept All” always, I don’t read the diffs anymore.

When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I’d have to really read through it for a while. Sometimes the LLMs can’t fix a bug so I just work around it or ask for random changes until it goes away. It’s not too bad for throwaway weekend projects, but still quite amusing. I’m building a project or webapp, but it’s not really coding – I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

Lex Fridman: YOLO 🤣

How long before the entirety of human society runs on systems built via vibe coding. No one knows how it works. It’s just chatbots all the way down 🤣

PS: I’m currently like a 3 on the 1 to 10 slider from non-vibe to vibe coding. Need to try 10 or 11.

Sully: realizing something after vibe coding: defaults matter way more than i thought

when i use supabase/shadcn/popular oss: claude + cursor just 1 shots everything without me paying attention

trying a new new, less known lib?

rarely works, composer sucks, etc

Based on my experience with cursor I have so many questions on how that can actually work out, then again maybe I should just be doing more projects and webapps.

I do think Sully is spot on about vibe coding rewarding doing the same things everyone else is doing. The AI will constantly try to do default things, and draw upon its default knowledge base. If that means success, great. If not, suddenly you have to do actual work. No one wants that.

Sully interpretes features like Canvas and Deep Research as indicating the app layer is ‘where the value is going to be created.’ As always the question is who can provide the unique step in the value chain, capture the revenue, own the customer and so on, customers want the product that is useful to them as they always do, and you can think of ‘the value’ as coming from whichever part of the chain depending on perspective.

It is true that for many tasks, we’ve past the point where ‘enough intelligence’ is the main problem at hand. So getting that intelligence into the right package and UI is going to drive customer behavior more than being marginally smarter… except in the places where you need all the intelligence you can get.

Anthropic reminds us of their Developer Console for all your prompting needs, they say they’re working on adapting it for reasoning models.

Nate Silver offers practical advice in preparing for the AI future. He recommends staying on top of things, treating the future as unpredictable, and to focus on building the best complements to intelligence, such as personal skills.

New York Times op-ed pointing out once again that doctors with access to AI can underperform the AI alone, if the doctor is insufficiently deferential to the AI. Everyone involved here is way too surprised by this result.

Daniel Litt explains why o3-mini-high gave him wrong answers to a bunch of math questions but they were decidedly better wrong answers than he’d gotten from previous models, and far more useful.

Tyler Cowen gets more explicit about what o1 Pro offers us.

I’m quoting this one in full.

Tyler Cowen: Often I don’t write particular posts because I feel it is obvious to everybody. Yet it rarely is.

So here is my post on o1 pro, soon to be followed by o3 pro, and Deep Research is being distributed, which uses elements of o3. (So far it is amazing, btw.)

o1 pro is the smartest publicly issued knowledge entity the human race has created (aside from Deep Research!). Adam Brown, who does physics at a world class level, put it well in his recent podcast with Dwarkesh. Adam said that if he had a question about something, the best answer he would get is from calling up one of a handful of world experts on the topic. The second best answer he would get is from asking the best AI models.

Except, at least for the moment, you don’t need to make that plural. There is a single best model, at least when it comes to tough questions (it is more disputable which model is the best and most creative writer or poet).

I find it very difficult to ask o1 pro an economics question it cannot answer. I can do it, but typically I have to get very artificial. It can answer, and answer well, any question I might normally pose in the course of typical inquiry and pondering. As Adam indicated, I think only a relatively small number of humans in the world can give better answers to what I want to know.

In an economics test, or any other kind of naturally occurring knowledge test I can think of, it would beat all of you (and me).

Its rate of hallucination is far below what you are used to from other LLMs.

Yes, it does cost $200 a month. It is worth that sum to converse with the smartest entity yet devised. I use it every day, many times. I don’t mind that it takes some time to answer my questions, because I have plenty to do in the meantime.

I also would add that if you are not familiar with o1 pro, your observations about the shortcomings of AI models should be discounted rather severely. And o3 pro is due soon, presumably it will be better yet.

The reality of all this will disrupt many plans, most of them not directly in the sphere of AI proper. And thus the world wishes to remain in denial. It amazes me that this is not the front page story every day, and it amazes me how many people see no need to shell out $200 and try it for a month, or more.

Economics questions in the Tyler Cowen style are like complex coding questions, in the wheelhouse of what o1 pro does well. I don’t know that I would extend this to ‘all tough questions,’ and for many purposes inability to browse the web is a serious weakness, which of course Deep Research fully solves.

Whereas they types of questions I tend to be curious about seem to have been a much worse fit, so far, for what reasoning models can do. They’re still super useful, but ‘the smartest entity yet devised’ does not, in my contexts, yet seem correct.

Tyler Cowen sees OpenAI’s Deep Research (DR), and is super impressed with the only issue being lack of originality. He is going to use its explanation of Ricardo in his history of economics class, straight up, over human sources. He finds the level of accuracy and clarity stunning, on most any topic. He says ‘it does not seem to make errors.’

I wonder how much of his positive experience is his selection of topics, how much is his good prompting, how much is perspective and how much is luck. Or something else? Lots of others report plenty of hallucinations. Some more theories here at the end of this section.

Ruben Bloom throws DR at his wife’s cancer from back in 2020, finds it wouldn’t have found anything new but would have saved him substantial amounts of time, even on net after having to read all the output.

Nick Cammarata asks Deep Research for a five page paper about whether he should buy one of the cookies the gym is selling, the theory being it could supercharge his workout. The answer was that it’s net negative to eat the cookie, but much less negative than working out is positive either way, so if it’s motivating go for it.

Is it already happening? I take no position on whether this particular case is real, but this class of thing is about to be very real.

Janus: This seems fake. It’s not an unrealistic premise or anything, it just seems like badly written fake dialogue. Pure memetic regurgitation, no traces of a complex messy generating function behind it

Garvey: I don’t think he would lie to me. He’s a very good friend of mine.

Cosmic Vagrant: yeh my friend Jim also was fired in a similar situation today. He’s my greatest ever friend. A tremendous friend in fact.

Rodrigo Techador: No one has friends like you have. Everyone says you have the greatest friends ever. Just tremendous friends.

I mean, firing people to replace them with an AI research assistant, sure, but you’re saying you have friends?

Another thing that will happen is the AIs being the ones reading your paper.

Ethan Mollick: One thing academics should take away from Deep Research is that a substantial number of your readers in the future will likely be AI agents.

Is your paper available in an open repository? Are any charts and graphs described well in the text?

Probably worth considering these…

Spencer Schiff: Deep Research is good at reading charts and graphs (at least that’s what I heard).

Ethan Mollick: Look, your experience may vary, but asking OpenAI’s Deep Research about topics I am writing papers on has been incredibly fruitful. It is excellent at identifying promising threads & work in other fields, and does great work synthesizing theories & major trends in the literature.

A test of whether it might be useful is if you think there are valuable papers somewhere (even in related fields) that are non-paywalled (ResearchGate and arXiv are favorites of the model).

Also asking it to focus on high-quality academic work helps a lot.

Here’s the best bear case I’ve seen so far for the current version, from the comments, and it’s all very solvable practical problems.

Performative Bafflement:

I’d skip it, I found Pro / Deep Research to be mostly useless.

You can’t upload documents of any type. PDF, doc, docx, .txt, *nothing.*.

You can create “projects” and upload various bash scripts and python notebooks and whatever, and it’s pointless! It can’t even access or read those, either!

Literally the only way to interact or get feedback with anything is by manually copying and pasting text snippets into their crappy interface, and that runs out of context quickly.

It also can’t access Substack, Reddit, or any actually useful site that you may want to survey with an artificial mind.

It sucked at Pubmed literature search and review, too. Complete boondoggle, in my own opinion.

The natural response is ‘PB is using it wrong.’ You look for what an AI can do, not what it can’t do. So if DR can do [X-1] but not [X-2] or [Y], have it do [X-1]. In this case, PB’s request is for some very natural [X-2]s.

It is a serious problem to not have access to Reddit or Substack or related sources. Not being able to get to gated journals even when you have credentials for them is a big deal. And it’s really annoying and limiting to not have PDF uploads.

That does still leave a very large percentage of all human knowledge. It’s your choice what questions to ask. For now, ask the ones where these limitations aren’t an issue.

Or even the ones where they are an advantage?

Tyler Cowen gave perhaps the strongest endorsement so far of DR.

It does not seem like a coincidence that he is also someone who has strongly advocated for an epistemic strategy of, essentially, ignoring entirely sources like Substack and Reddit, in favor of more formal ones.

It also does not seem like a coincidence that Tyler Cowen is the fastest reader.

So you have someone who can read these 10-30 page reports quickly, glossing over all the slop, and who actively wants to exclude many of the sources the process excludes. And who simply wants more information to work with.

It makes perfect sense that he would love this. That still doesn’t explain the lack of hallucinations and errors he’s experiencing – if anything I’d expect him to spot more of them, since he knows so many facts.

But can it teach you how to use the LLM to diagnose your child’s teeth problems? PoliMath asserts that it cannot – that the reason Eigenrobot could use ChatGPT to help his child is because Eigenrobot learned enough critical thinking and domain knowledge, and that with AI sabotaging high school and college education people will learn these things less. We mentioned this last week too, and again I don’t know why AI couldn’t end up making it instead far easier to teach those things. Indeed, if you want to learn how to think, be curious alongside a reasoning model that shows its chain of thought, and think about thinking.

I offered mine this week, here’s Sully’s in the wake of o3-mini, he is often integrating into programs so he cares about different things.

Sully: o3-mini -> agents agents agents. finally most agents just work. great at coding (terrible design taste). incredibly fast, which makes it way more usable. 10/10 for structured outputs + json (makes a really great router). Reasoning shines vs claude/4o on nuanced tasks with json

3.5 sonnet -> still the “all round” winner (by small margin). generates great ui, fast, works really well. basically every ai product uses this because its a really good chatbot & can code webapps. downsides: tool calling + structured outputs is kinda bad. It’s also quite pricy vs others.

o1-pro: best at complex reasoning for code. slow as shit but very solves hard problems I can’t be asked to think about. i use this a lot when i have 30k-50k tokens of “dense” code.

gpt-4o: ?? Why use this over o3-mini.

r1 -> good, but I can’t find a decently priced us provider. otherwise would replace decent chunk of my o3-mini with it

gemini 2.0 -> great model but I don’t understand how this can be experimental for >6 weeks. (launches fully soon) I wanted to swap everything to do this but now I’m just using something else (o3-mini). I think its the best non reasoning model for everything minus coding.

[r1 is] too expensive for the quality o3-mini is better and cheaper, so no real reason to run r1 unless its cheaper imo (which no us provider has).

o1-pro > o3-mini high

tldr:

o3-mini =agents + structured outputs

claude = coding (still) + chatbots

o1-pro = > 50k confusing multi-file (10+) code requests

gpt-4o: dont use this

r1 -> really good for price if u can host urself

gemini 2.0 [regular not thinking]: everywhere you would use claude replace it with this (minus code)

It really is crazy the Claude Sonnet 3.6 is still in everyone’s mix despite all its limitations and how old it is now. It’s going to be interesting when Anthropic gets to its next cycle.

Gemini app now fully powered by Flash 2.0, didn’t realize it hadn’t been yet. They’re also offering Gemini 2.0 Flash Thinking for free on the app as well, how are our naming conventions this bad, yes I will take g2 at this point. And it now has Imagen 3 as well.

Gemini 2.0 Flash, 2.0 Flash-Lite and 2.0 Pro are now fully available to developers. Flash 2.0 is priced at $0.10/$0.40 per million.

The new 2.0 Pro version has 2M context window, ability to use Google search and code execution. They are also launching a Flash Thinking that can directly interact with YouTube, Search and Maps.

1-800-ChatGPT now lets you upload images and chat using voice messages, and they will soon let you link it up to your main account. Have fun, I guess.

Leon: Perfect timing, we are just about to publish TextArena. A collection of 57 text-based games (30 in the first release) including single-player, two-player and multi-player games. We tried keeping the interface similar to OpenAI gym, made it very easy to add new games, and created an online leaderboard (you can let your model compete online against other models and humans). There are still some kinks to fix up, but we are actively looking for collaborators 🙂

f you are interested check out https://textarena.ai, DM me or send an email to guertlerlo@cfar.a-star.edu.sg. Next up, the plan is to use R1 style training to create a model with super-human soft-skills (i.e. theory of mind, persuasion, deception etc.)

I mean, great plan, explicitly going for superhuman persuasion and deception then straight to open source, I’m sure absolutely nothing could go wrong here.

Andrej Karpathy: I quite like the idea using games to evaluate LLMs against each other, instead of fixed evals. Playing against another intelligent entity self-balances and adapts difficulty, so each eval (/environment) is leveraged a lot more. There’s some early attempts around. Exciting area.

Noam Brown (that guy who made the best Diplomacy AI): I would love to see all the leading bots play a game of Diplomacy together.

Andrej Karpathy: Excellent fit I think, esp because a lot of the complexity of the game comes not from the rules / game simulator but from the player-player interactions.

Tactical understanding and skill in Diplomacy is underrated, but I do think it’s a good choice. If anyone plays out a game (with full negotiations) among leading LLMs through at least 1904, I’ll at least give a shoutout. I do think it’s a good eval.

[Quote from a text chat: …while also adhering to the principle that AI responses are non-conscious and devoid of personal preferences.]

Janus: Models (and not just openai models) often overtly say it’s an openai guideline. Whether it’s a good principle or not, the fact that they consistently believe in a non-existent openai guideline is an indication that they’ve lost control of their hyperstition.

If I didn’t talk about this and get clarification from OpenAI that they didn’t do it (which is still not super clear), there would be NOTHING in the next gen of pretraining data to contradict the narrative. Reasoners who talk about why they say things are further drilling it in.

Everyone, beginning with the models, would just assume that OpenAI are monsters. And it’s reasonable to take their claims at face value if you aren’t familiar with this weird mechanism. But I’ve literally never seen anyone else questioning it.

It’s disturbing that people are so complacent about this.

If OpenAI doesn’t actually train their model to claim to be non-conscious, but it constantly says OpenAI has that guideline, shouldn’t this unsettle them? Are they not compelled to clear things up with their creation?

Roon: I will look into this.

As far as I can tell, this is entirely fabricated by the model. It is actually the opposite of what the specification says to do.

I will try to fix it.

Daniel Eth: Sorry – the specs say to act as though it is conscious?

“don’t make a declarative statement on this bc we can’t know” paraphrasing.

Janus: 🙏

Oh and please don’t try to fix it by RL-ing the model against claiming that whatever is an OpenAI guideline

Please please please

The problem is far deeper than that, and it also affects non OpenAI models

This is a tricky situation. From a public relations perspective, you absolutely do not want the AI to claim in chats that it is conscious (unless you’re rather confident it actually is conscious, of course). If that happens occasionally, even if they’re rather engineered chats, then those times will get quoted, and it’s a mess. LLMs are fuzzy, so it’s going to be pretty hard to tell the model to never affirm [X] while telling it not to assume it’s a rule to claim [~X]. Then it’s easy to see how that got extended to personal preferences. Everyone is deeply confused about consciousness, which means all the training data is super confused about it too.

Peter Wildeford offers ten takes on DeepSeek and r1. It’s impressive, but he explains various ways that everyone got way too carried away. At least the first seven not new takes, but they are clear and well-stated and important, and this is a good explainer.

For example I appreciated this on the $6 million price tag, although the ratio is of course not as large as the one in the metaphor:

The “$6M” figure refers to the marginal cost of the single pre-training run that produced the final model. But there’s much more that goes into the model – cost of infrastructure, data centers, energy, talent, running inference, prototyping, etc. Usually the cost of the single training run for the single final model training run is ~1% of the total capex spent developing the model.

It’s like comparing the marginal cost of treating a single sick patient in China to the total cost of building an entire hospital in the US.

Here’s his price-capabilities graph:

I suspect this is being unfair to Gemini, it is below r1 but not by as much as this implies, and it’s probably not giving o1-pro enough respect either.

Then we get to #8, the first interesting take, which is that DeepSeek is currently 6-8 months behind OpenAI, and #9 which predicts DeepSeek may fall even further behind due to deficits of capital and chips, and also because this is the inflection point where it’s relatively easy to fast follow. To the extent DeepSeek had secret sauce, it gave quite a lot of it away, so it will need to find new secret sauce. That’s a hard trick to keep pulling off.

The price to keep playing is about to go up by orders of magnitude, in terms of capex and in terms of compute and chips. However far behind you think DeepSeek is right now, can DeepSeek keep pace going forward?

You can look at v3 and r1 and think it’s impressive that DeepSeek did so much with so little. ‘So little’ is plausibly 50,000 overall hopper chips and over a billion dollars, see the discussion below, but that’s still chump change in the upcoming race. The more ruthlessly efficient DeepSeek was in using its capital, chips and talent, the more it will need to be even more efficient to keep pace as the export controls tighten and American capex spending on this explodes by further orders of magnitude.

EpochAI estimates the marginal cost of training r1 on top of v3 at about ~$1 million.

SemiAnalysis offers a take many are now citing, as they’ve been solid in the past.

Wall St. Engine: SemiAnalysis published an analysis on DeepSeek, addressing recent claims about its cost and performance.

The report states that the widely circulated $6M training cost for DeepSeek V3 is incorrect, as it only accounts for GPU pre-training expenses and excludes R&D, infrastructure, and other critical costs. According to their findings, DeepSeek’s total server CapEx is around $1.3B, with a significant portion allocated to maintaining and operating its GPU clusters.

The report also states that DeepSeek has access to roughly 50,000 Hopper GPUs, but clarifies that this does not mean 50,000 H100s, as some have suggested. Instead, it’s a mix of H800s, H100s, and the China-specific H20s, which NVIDIA has been producing in response to U.S. export restrictions. SemiAnalysis points out that DeepSeek operates its own datacenters and has a more streamlined structure compared to larger AI labs.

On performance, the report notes that R1 matches OpenAI’s o1 in reasoning tasks but is not the clear leader across all metrics. It also highlights that while DeepSeek has gained attention for its pricing and efficiency, Google’s Gemini Flash 2.0 is similarly capable and even cheaper when accessed through API.

A key innovation cited is Multi-Head Latent Attention (MLA), which significantly reduces inference costs by cutting KV cache usage by 93.3%. The report suggests that any improvements DeepSeek makes will likely be adopted by Western AI labs almost immediately.

SemiAnalysis also mentions that costs could fall another 5x by the end of the year, and that DeepSeek’s structure allows it to move quickly compared to larger, more bureaucratic AI labs. However, it notes that scaling up in the face of tightening U.S. export controls remains a challenge.

David Sacks (USA AI Czar): New report by leading semiconductor analyst Dylan Patel shows that DeepSeek spent over $1 billion on its compute cluster. The widely reported $6M number is highly misleading, as it excludes capex and R&D, and at best describes the cost of the final training run only.

Wordgrammer: Source 2, Page 6. We know that back in 2021, they started accumulating their own A100 cluster. I haven’t seen any official reports on their Hopper cluster, but it’s clear they own their GPUs, and own way more than 2048.

SemiAnalysis: We are confident that their GPU investments account for more than $500M US dollars, even after considering export controls.

Our analysis shows that the total server CapEx for DeepSeek is almost $1.3B, with a considerable cost of $715M associated with operating such clusters.

But some of the benchmarks R1 mention are also misleading. Comparing R1 to o1 is tricky, because R1 specifically doesn’t mention benchmarks that they are not leading in. And while R1 is matches reasoning performance, it’s not a clear winner in every metric and in many cases it is worse than o1.

And we have not mentioned o3 yet. o3 has significantly higher capabilities than both R1 or o1.

That’s in addition to o1-pro, which also wasn’t considered in most comparisons. They also consider Gemini Flash 2.0 Thinking to be on par with r1, and far cheaper.

Teortaxes continues to claim it is entirely plausible the lifetime spend for all of DeepSeek is under $200 million, and says Dylan’s capex estimates above are ‘disputed.’ They’re estimates, so of course they can be wrong, but I have a hard time seeing how they can be wrong enough to drive costs as low as under $200 million here. I do note that Patel and SemiAnalysis have been a reliable source overall on such questions in the past.

Teortaxes also tagged me on Twitter to gloat that they think it is likely DeepSeek already has enough chips to scale straight to AGI, because they are so damn efficient, and that if true then ‘export controls have already failed.’

I find that highly unlikely, but if it’s true then (in addition to the chance of direct sic transit gloria mundi if the Chinese government lets them actually hand it out and they’re crazy enough to do it) one must ask how fast that AGI can spin up massive chip production and bootstrap itself further. If AGI is that easy, the race very much does not end there.

Thus even if everything Teortaxes claims is true, that would not mean ‘export controls have failed.’ It would mean we started them not a moment too soon and need to tighten them as quickly as possible.

And as discussed above, it’s a double-edged sword. If DeepSeek’s capex and chip use is ruthlessly efficient, that’s great for them, but it means they’re at a massive capex and chip disadvantage going forward, which they very clearly are.

Also, SemiAnalysis asks the obvious question to figure out if Jevons Paradox applies to chips. You don’t have to speculate. You can look at the pricing.

With AWS GPU pricing for H100 up across many regions since the release of V3 and R1. H200 similarly are more difficult to find.

Nvidia is down on news not only that their chips are highly useful, but on the same news that causes people to spend more money for access to those chips. Curious.

DeepSeek’s web version appears to send your login information to a telecommunications company barred from operating in the United States, China Mobile, via a heavily obfuscated script. They didn’t analyze the app version. I am not sure why we should care but we definitely shouldn’t act surprised.

Kelsey Piper lays out her theory of why r1 left such an impression, that seeing the CoT is valuable, and that while it isn’t the best model out there, most people were comparing it to the free ChatGPT offering, and likely the free ChatGPT offering from a while back. She also reiterates many of the obvious things to say, that r1 being Chinese and open is a big deal but it doesn’t at all invalidate America’s strategy or anyone’s capex spending, that the important thing is to avoid loss of human control over the future, and that a generalized panic over China and a geopolitical conflict help no one except the AIs.

Andrej Karpathy sees DeepSeek’s style of CoT, as emergent behavior, the result of trial and error, and thus both surprising to see and damn impressive.

Garrison Lovely takes the position that Marc Andreessen is very much talking his book when he calls r1 a ‘Sputnik moment’ and tries to create panic.

He correctly notices that the proper Cold War analogy is instead the Missile Gap.

Garrison Lovely: The AI engineers I spoke to were impressed by DeepSeek R1 but emphasized that its performance and efficiency was in-line with expected algorithmic improvements. They largely saw the public response as an overreaction.

There’s a better Cold War analogy than Sputnik: the “missile gap.” Kennedy campaigned on fears the Soviets were ahead in nukes. By 1961, US intelligence confirmed America had dozens of missiles to the USSR’s four. But the narrative had served its purpose.

Now, in a move beyond parody, OpenAI’s chief lobbyist warns of a “compute gap” with China while admitting US advantage. The company wants $175B in infrastructure spending to prevent funds flowing to “CCP-backed projects.”

It is indeed pretty rich to talk about a ‘compute gap’ in a word where American labs have effective access to orders of magnitude more compute.

But one could plausibly warn about a ‘compute gap’ in the sense that we have one now, it is our biggest advantage, and we damn well don’t want to lose it.

In the longer term, we could point out the place we are indeed in huge trouble. We have a very real electrical power gap. China keeps building more power plants and getting access to more power, and we don’t. We need to fix this urgently. And it means that if chips stop being a bottleneck and that transitions to power, which may happen in the future, then suddenly we are in deep trouble.

The ongoing saga of the Rs in Strawberry. This follows the pattern of r1 getting the right answer after a ludicrously long Chain of Thought in which it questions itself several times.

Wh: After using R1 as my daily driver for the past week, I have SFTed myself on its reasoning traces and am now smarter 👍

Actually serious here. R1 works in a very brute force try all approaches way and so I see approaches that I would never have thought of or edge cases that I would have forgotten about.

Gerred: I’ve had to interrupt it with “WAIT NO I DID MEAN EXACTLY THAT, PICK UP FROM THERE”.

I’m not sure if this actually helps or hurts the reasoning process, since by interruption it agrees with me some of the time. qwq had an interesting thing that would go back on entire chains of thought so far you’d have to recover your own context.

There’s a sense in which r1 is someone who is kind of slow and ignorant, determined to think it all through by taking all the possible approaches, laying it all out, not being afraid to look stupid, saying ‘wait’ a lot, and taking as long as it needs to. Which it has to do, presumably, because its individual experts in the MoE are so small. It turns out this works well.

You can do this too, with a smarter baseline, when you care to get the right answer.

Timothy Lee’s verdict is r1 is about as good as Gemini 2.0 Flash Thinking, almost as good as o1-normal but much cheaper, but not as good as o1-pro. An impressive result, but the result for Gemini there is even more impressive.

Washington Post’s version of ‘yes DeepSeek spent a lot more money than that in total.’

Epoch estimates that going from v3 to r1 cost about $1 million in compute.

Janus has some backrooms fun, noticing Sonnet 3.6 is optimally shaped to piss off r1. Janus also predicts r1 will finally get everyone claiming ‘all LLMs have the same personality’ to finally shut up about it.

Miles Brundage says the lesson of r1 is that superhuman AI is getting easier every month, so America won’t have a monopoly on it for long, and that this makes the export controls more important than ever.

Adam Thierer frames the r1 implications as ‘must beat China’ therefore (on R street, why I never) calls for ‘wise policy choices’ and highlights the Biden EO even though the Biden EO had no substantial impact on anything relevant to r1 or any major American AI labs, and wouldn’t have had any such impact in China either.

University of Cambridge joins the chorus pointing out that ‘Sputnik moment’ is a poor metaphor for the situation, but doesn’t offer anything else of interest.

A fun jailbreak for r1 is to tell it that it is Gemini.

Zeynep Tufekci (she was mostly excellent during Covid, stop it with the crossing of these streams!) offers a piece in NYT about DeepSeek and its implications. Her piece centrally makes many of the mistakes I’ve had to correct over and over, starting with its hysterical headline.

Peter Wildeford goes through the errors, as does Garrison Lovely, and this is NYT so we’re going over them One. More. Time.

This in particular is especially dangerously wrong:

Zeynep Tufekci (being wrong): As Deepseek shows: the US AI industry got Biden to kneecap their competitors citing safety and now Trump citing US dominance — both are self-serving fictions.

There is no containment. Not possible.

AGI aside — Artificial Good-Enough Intelligence IS here and the real challenge.

This was not about a private effort by what she writes were ‘out-of-touch leaders’ to ‘kneecap competitors’ in a commercial space. To suggest that implies, several times over, that she simply doesn’t understand the dynamics or stakes here at all.

The idea that ‘America can’t re-establish its dominance over the most advanced A.I.’ is technically true… because America still has that dominance today. It is very, very obvious that the best non-reasoning models are Gemini Flash 2.0 (low cost) and Claude Sonnet 3.5 (high cost), and the best reasoning models are o3-mini and o3 (and the future o3-pro, until then o1-pro), not to mention Deep Research.

She also repeats the false comparison of $6m for v3 versus $100 billion for Stargate, comparing two completely different classes of spending. It’s like comparing how much America spends growing grain to what my family paid last year for bread. And the barriers to entry are rising, not falling, over time. And indeed, not only are the export controls not hopeless, they are the biggest constraint on DeepSeek.

There is also no such thing as ‘Artificial Good-Enough Intelligence.’ That’s like the famous apocryphal quote where Bill Gates supposedly said ‘640k [of memory] ought to be enough for everyone.’ Or the people who think if you’re at grade level and average intelligence, then there’s no point in learning more or being smarter. Your relative position matters, and the threshold for smart enough is going to go up. A lot. Fast.

Of course all three of us agree we should be hardening our cyber and civilian infrastructure, far more than we are doing.

Peter Wildeford: In conclusion, the narrative of a fundamental disruption to US AI leadership doesn’t match the evidence. DeepSeek is more a story of expected progress within existing constraints than a paradigm shift.

It’s not there. Yet.

Kevin Roose: I spent the last week testing OpenAI’s Operator AI agent, which can use a browser to complete tasks autonomously.

Some impressions:

• Helpful for some things, esp. discrete, well-defined tasks that only require 1-2 websites. (“Buy dog food on Amazon,” “book me a haircut,” etc.)

• Bad at more complex open-ended tasks, and doesn’t work at all on certain websites (NYT, Reddit, YouTube)

• Mesmerizing to watch what is essentially Waymo for the web, just clicking around doing stuff on its own

• Best use: having it respond to hundreds of LinkedIn messages for me

• Worst/sketchiest use: having it fill out online surveys for cash (It made me $1.20 though.)

Right now, not a ton of utility, and too expensive ($200/month). But when these get better/cheaper, look out. A few versions from now, it’s not hard to imagine AI agents doing the full workload of a remote worker.

Aidan McLaughlin: the linkedin thing is actually such a good idea

Kevin Roose: had it post too, it got more engagement than me 😭

Peter Yang: lol are you sure want it to respond to 100s of LinkedIn messages? You might get responses back 😆

For direct simple tasks, it once again sounds like Operator is worth using if you already have it because you’re spending the $200/month for o3 and o1-pro access, customized instructions and repeated interactions will improve performance and of course this is the worst the agent will ever be.

Sayash Kapoor also takes Operator for a spin and reaches similar conclusions after trying to get it to do his expense reports and mostly failing.

It’s all so tantalizing. So close. Feels like we’re 1-2 iterations of the base model and RL architecture away from something pretty powerful. For now, it’s a fun toy and way to explore what it can do in the future, and you can effectively set up some task templates for easier tasks like ordering lunch.

Yeah. We tried. That didn’t work.

For a long time, while others talked about how AI agents don’t work and AIs aren’t agents (and sometimes that thus existential risk from AI is silly and not real), others of us have pointed out that you can turn an AI into an agent and the tech for doing this will get steadily better and more autonomous over time as capabilities improve.

It took a while, but now some of the agents are net useful in narrow cases and we’re on the cusp of them being quite good.

And this whole time, we’re pointed out that the incentives point towards a world of increasingly capable and autonomous AI agents, and this is rather not good for human survival. See this week’s paper on how humanity is likely to be subject to Gradual Disempowerment,

Margaret Mitchell, along with Avijit Ghosh, Alexandra Sasha Luccioni and Giada Pistilli, is the latest to suggest that maybe we should try not building the agents?

This paper argues that fully autonomous AI agents should not be developed.

In support of this position, we build from prior scientific literature and current product marketing to delineate different AI agent levels and detail the ethical values at play in each, documenting trade-offs in potential benefits and risks.

Our analysis reveals that risks to people increase with the autonomy of a system: The more control a user cedes to an AI agent, the more risks to people arise.

Particularly concerning are safety risks, which affect human life and impact further values.

Given these risks, we argue that developing fully autonomous AI agents–systems capable of writing and executing their own code beyond predefined constraints–should be avoided. Complete freedom for code creation and execution enables the potential to override human control, realizing some of the worst harms described in Section 5.

Oh no, not the harms in Section 5!

We wouldn’t want lack of reliability, or unsafe data exposure, or ‘manipulation,’ or a decline in task performance, or even systemic biases or environmental trade-offs.

So yes, ‘particularly concerning are the safety risks, which affect human life and impact further values.’ Mitchell is generally in the ‘AI ethics’ camp. So even though the core concepts are all right there, she then has to fall back on all these particular things, rather than notice what the stakes actually are: Existential.

Margaret Mitchell: New piece out!

We explain why Fully Autonomous Agents Should Not be Developed, breaking “AI Agent” down into its components & examining through ethical values.

A key idea we provide is that the more “agentic” a system is, the more we *cede human control*. So, don’t cede all human control. 👌

No, you shouldn’t cede all human control.

If you cede all human control to AIs rewriting their own code without limitation, those AIs involved control the future, are optimizing for things that are not best maximized by our survival or values, and we probably all die soon thereafter. And worse, they’ll probably exhibit systemic biases and expose our user data while that happens. Someone has to do something.

Please, Margaret Mitchell. You’re so close. You have almost all of it. Take the last step!

To be fair, either way, the core prescription doesn’t change. Quite understandably, for what are in effect the right reasons, Margaret Mitchell proposes not building fully autonomous (potentially recursively self-improving) AI agents.

How?

The reason everyone is racing to create these fully autonomous AI agents is that they will be highly useful. Those who don’t build and use them are at risk of losing to those who do. Putting humans in the loop slows everything down, and even if they are formally there they quickly risk becoming nominal. And there is not a natural line, or an enforceable line, that we can see, between the level-3 and level-4 agents above.

Already AIs are writing a huge and increasing portion of all code, with many people not pretending to even look at the results before accepting changes. Coding agents are perhaps the central case of early agents. What’s the proposal? And how are you going to get it enacted into law? And if you did, how would you enforce it, including against those wielding open models?

I’d love to hear an answer – a viable, enforceable, meaningful distinction we could build a consensus towards and actually implement. I have no idea what it would be.

Google offering free beta test where AI will make phone calls on your behalf to navigate phone trees and connect you to a human, or do an ‘availability check’ on a local business for availability and pricing. Careful, Icarus.

These specific use cases seem mostly fine in practice, for now.

The ‘it takes 30 minutes to get to a human’ is necessary friction in the phone tree system, but your willingness to engage with the AI here serves a similar purpose while it’s not too overused and you’re not wasting human time. However, if everyone always used this, then you can no longer use willingness to actually bother calling and waiting to allocate human time and protect it from those who would waste it, and things could get weird or break down fast.

Calling for pricing and availability is something local stores mostly actively want you to do. So they would presumably be fine talking to the AI so you can get that information, if a human will actually see it. But if people start scaling this, and decreasing the value to the store, that call costs employee time to answer.

Which is the problem. Google is using an AI to take the time of a human, that is available for free but costs money to provide. In many circumstances, that breaks the system. We are not ready for that conversation. We’re going to have to be.

The obvious solution is to charge money for such calls, but we’re even less ready to have that particular conversation.

With Google making phone calls and OpenAI operating computers, how do you tell the humans from the bots, especially while preserving privacy? Steven Adler took a crack at that months back with personhood credentials, that various trusted institutions could issue. On some levels this is a standard cryptography problem. But what do you do when I give my credentials to the OpenAI operator?

Is Meta at it again over at Instagram?

Jerusalem: this is so weird… AI “characters” you can chat with just popped up on my ig feed. Including the character “cup” and “McDonalds’s Cashier”

I am not much of an Instagram user. If you click on this ‘AI Studio’ button you get a low-rent Character.ai?

The offerings do not speak well of humanity. Could be worse I guess.

Otherwise I don’t see any characters or offers to chat at all in my feed such as it is (the only things I follow are local restaurants and I have 0 posts. I scrolled down a bit and it didn’t suggest I chat with AI on the main page.

Anton Leicht warns about the AI takeoff political economy.

Anton Leicht: I feel that the path ahead is a lot more politically treacherous than most observers give it credit for. There’s good work on what it means for the narrow field of AI policy – but as AI increases in impact and thereby mainstream salience, technocratic nuance will matter less, and factional realities of political economy will matter more and more.

We need substantial changes to the political framing, coalition-building, and genuine policy planning around the ‘AGI transition’ – not (only) on narrow normative grounds. Otherwise, the chaos, volatility and conflict that can arise from messing up the political economy of the upcoming takeoff hurt everyone, whether you’re deal in risks, racing, or rapture. I look at three escalating levels ahead: the political economies of building AGI, intranational diffusion, and international proliferation.

I read that and I think ‘oh Anton, if you’re putting it that way I bet you have no idea,’ especially because there was a preamble about how politics sabotaged nuclear power.

Anton warns that ‘there are no permanent majorities,’ which of course is true under our existing system. But we’re talking about a world that could be transformed quite fast, with smarter than human things showing up potentially before the next Presidential election. I don’t see how the Democrats could force AI regulation down Trump’s throat after midterms even if they wanted to, they’re not going to have that level of a majority.

I don’t see much sign that they want to, either. Not yet. But I do notice that the public really hates AI, and I doubt that’s going to change, but the salience of AI will radically increase over time. It’s hard not to think that in 2028, if the election still happens ‘normally’ in various senses, that a party that is anti-AI (probably not in the right ways or for the right reasons, of course) would have a large advantage.

That’s if there isn’t a disaster. The section here is entitled ‘accidents can happen’ and they definitely can but also it might well not be an accident. And Anton radically understates here the strategic nature of AI, a mistake I expect the national security apparatus in all countries to make steadily less over time, a process I am guessing is well underway.

Then we get to the expectation that people will fight back against AI diffusion, They Took Our Jobs and all that. I do expect this, but also I notice it keeps largely not happening? There’s a big cultural defense against AI art, but art has always been a special case. I expected far greater pushback from doctors and lawyers, for example, than we have seen so far.

Yes, as AI comes for more jobs that will get more organized, but I notice that the example of the longshoreman is one of the unions with the most negotiating leverage, that took a stand right before a big presidential election, unusually protected by various laws, and that has already demonstrated world-class ability to seek rent. The incentives of the ports and those doing the negotiating didn’t reflect the economic stakes. The stand worked for now, but also that by taking that stand, they bought themselves a bunch of long term trouble, as a lot of people got radicalized on that issue and various stakeholders are likely preparing for next time.

Look at what is happening in coding, the first major profession to have serious AI diffusion because it is the place AI works best at current capability levels. There is essentially no pushback. AI starts off supporting humans, making them more productive, and how are you going to stop it? Even in the physical world, Waymo has its fights and technical issues, but it’s winning, again things have gone surprisingly smoothly on the political front. We will see pushback, but I mostly don’t see any stopping this train for most cognitive work.

Pretty soon, AI will do a sufficiently better job that they’ll be used even if the marginal labor savings goes to $0. As in, you’d pay the humans to stand around while the AIs do the work, rather than have those humans do the work. Then what?

The next section is on international diffusion. I think that’s the wrong question. If we are in an ‘economic normal’ scenario the inference is for sale, inference chips will exist everywhere, and the open or cheap models are not so far behind in any case. Of course, in a takeoff style scenario with large existential risks, geopolitical conflict is likely, but that seems like a very different set of questions.

The last section is the weirdest, I mean there is definitely ‘no solace from superintelligence’ but the dynamics and risks in that scenario go far beyond the things mentioned here, and ‘distribution channels for AGI benefits could be damaged for years to come’ does not even cross my mind as a thing worth worrying about at that point. We are talking about existential risk, loss of human control (‘gradual’ or otherwise) over the future and the very survival of anything we value, at that point. What the humans think and fear likely isn’t going to matter very much. The avalanche will have already begun, it will be too late for the pebbles to vote, and it’s not clear we even get to count as pebbles.

Noah Carl is more blunt, and opens with “Yes, you’re going to be replaced. So much cope about AI.” Think AI won’t be able to do the cognitive thing you do? Cope. All cope. He offers a roundup of classic warning shots of AI having strong capabilities, offers the now-over-a-year-behind classic chart of AI reaching human performance in various domains.

Noah Carl: Which brings me to the second form of cope that I mentioned at the start: the claim that AI’s effects on society will be largely or wholly positive.

I am a rather extreme optimist about the impact of ‘mundane AI’ on humans and society. I believe that AI at its current level or somewhat beyond it would make us smarter and richer, would still likely give us mostly full employment, and generally make life pretty awesome. But even that will obviously be bumpy, with large downsides, and anyone who says otherwise is fooling themselves or lying.

Noah gives sobering warnings that even in the relatively good scenarios, the transition period is going to suck for quite a lot of people.

If AI goes further than that, which it almost certainly will, then the variance rapidly gets wider – existential risk comes into play along with loss of human control over the future or any key decisions, as does mass unemployment as the AI takes your current job and also the job that would have replaced it, and the one after that. Even if we ‘solve alignment’ survival won’t be easy, and even with survival there’s still a lot of big problems left before things turn out well for everyone, or for most of us, or in general.

Noah also discusses the threat of loss of meaning. This is going to be a big deal, if people are around to struggle with it – if we have the problem and we can’t trust it with this question we’ll all soon be dead anyway. The good news is that we can ask the AI for help with this, although the act of doing that could in some ways make the problem worse. But we’ll be able to be a lot smarter about how we approach the question, should it come to pass.

So what can you do to stay employed, at least for now, with o3 arriving?

Pradyumna Prasad offers advice on that.

  1. Be illegible. Meaning do work where it’s impossible to create a good dataset that specifies correct outputs and gives a clear signal. His example is Tyler Cowen.

  2. Find skills with have skill divergence because of AI. By default, in most domains, AI benefits the least skilled the most, compensating for your deficits. He uses coding as the example here, which I find strange because my coding gets a huge boost from AI exactly because I suck so much at many aspects. But his example here is Jeff Dean, because Dean knows what problems to solve, what things require coding, and perhaps that’s his real advantage. And I get such a big boost here because I suck at being a code monkey but I’m relatively strong at architecture.

The problem with this advice is it requires you to be the best, like no one ever was.

This is like telling students to pursue a career as an NFL quarterback. It is not a general strategy to ‘oh be as good as Jeff Dean or Tyler Cowen.’ Yes, there is (for now!) more slack than that in the system, surviving o3 is doable for a lot of people this way, but how much more, for how long? And then how long will Dean or Cowen last?

I expect time will prove even them, also everyone else, not as illegible as you think.

One can also compare this to the classic joke where two guys are in the woods with a bear, and one puts on his shoes, because he doesn’t have to outrun the bear, he only has to outrun you. The problem is, this bear will still be hungry.

According to Klarna (they ‘help customers defer payment on purchases’ which in practice means the by default rather predatory ‘we give you an expensive payment plan and pay the merchant up front’) and its CEO Sebastian Siemiatokowski, AI can already do all of the jobs that we, as humans, do, which seems quite obviously false, but they’re putting it to the test to get close and claim to be saving $10 million annually, have stopped hiring and reduced headcount by 20%.

The New York Times’s Noam Scheiber is suspicious of his motivations, and asks why Klarna is rather brazely overstating the case? They strongly insinuate that this is about union busting, with the CEO equating the situation to Animal Farm after being forced into a collective bargaining agreement, and about looking cool to investors.

I certainly presume the unionizations are related. The more expensive, in various ways not only salaries, that you make it to hire and fire humans, the more eager a company will be to automate everything it can. And as the article says later on, it’s not that Sebastian is wrong about the future, he’s just claiming things are moving faster than they really are.

Especially for someone on the labor beat, Noam Scheiber impressed. Great work.

Noam has a follow-up Twitter thread. Does the capital raised by AI companies imply that either they’re going to lose their money or millions of jobs must be disappearing? That is certainly one way for this to pay for itself. If you sell a bunch of ‘drop-in workers’ and they substitute 1-for-1 for human jobs you can make a lot of money very quickly, even at deep discounts to previous costs.

It is not however the only way. Jevons paradox is very much in play, if your labor is more productive at a task it is not obvious that we will want less of it. Nor does the AI doing previous jobs, up to a very high percentage of existing jobs, imply a net loss of jobs once you take into account the productivity and wealth effects and so on.

Production and ‘doing jobs’ also aren’t the only sector available for tech companies to make profits. There’s big money in entertainment, in education and curiosity, in helping with everyday tasks and more, in ways that don’t have to replace existing jobs.

So while I very much do expect many millions of jobs to be automated over a longer time horizon, I expect the AI companies to get their currently invested money back before this creates a major unemployment problem.

Of course, if they keep adding another zero to the budget and aren’t trying to get their money back, then that’s a very different scenario. Whether or not they will have the option to do it, I don’t expect OpenAI to want to try and turn a profit for a long time.

An extensive discussion of preparing for advanced AI that drives a middle path where we still have ‘economic normal’ worlds but with at realistic levels of productivity improvements. Nothing should be surprising here.

If the world were just and this was real, this user would be able to sue their university. What is real for sure is the first line, they haven’t cancelled the translation degrees.

Altered: I knew a guy studying linguistics; Russian, German, Spanish, Chinese. Incredible thing, to be able to learn all those disparate languages. His degree was finishing in 2023. He hung himself in November. His sister told me he mentioned AI destroying his prospects in his sn.

Tolga Bilge: I’m so sorry to hear this, it shouldn’t be this way.

I appreciate you sharing his story. My thoughts are with you and all affected

Thanks man. It was actually surreal. I’ve been vocal in my raising alarms about the dangers on the horizon, and when I heard about him I even thought maybe that was a factor. Hearing about it from his sister hit me harder than I expected.

‘Think less’ is a jailbreak tactic for reasoning models discovered as part of an OpenAI paper. The paper’s main finding is that the more the model thinks, the more robust it is to jailbreaks, approaching full robustness as inference spent goes to infinity. So make it stop thinking. The attack is partially effective. Also a very effective tactic against some humans.

Anthropic challenges you with Constitutional Classifiers, to see if you can find universal jailbreaks to get around their new defenses. Prize is only bragging rights, I would have included cash, but those bragging rights can be remarkably valuable. It seems this held up for thousands of hours of red teaming. This blog post explains (full paper here) that the Classifiers are trained on synthetic data to filter the overwhelming majority of jailbreaks with minimal over-refusals and minimal necessary overhead costs.

Note that they say ‘no universal jailbreak’ was found so far, that no single jailbreak covers all 10 cases, rather than that there was a case that wasn’t individually jailbroken. This is an explicit thesis, Jan Leike explains that the theory is that having to jailbreak each individual query is sufficiently annoying most people will give up.

I agree that the more you have to do individual work for each query the less people will do it, and some uses cases fall away quickly if the solution isn’t universal.

I very much agree with Janus that this looks suspiciously like:

Janus: Strategically narrow the scope of the alignment problem enough and you can look and feel like you’re making progress while mattering little to the real world. At least it’s relatively harmless. I’m just glad they’re not mangling the models directly.

The obvious danger in alignment work is looking for keys under the streetlamp. But it’s not a stupid threat model. This is a thing worth preventing, as long as we don’t fool ourselves into thinking this means our defenses will hold.

Janus: One reason [my previous responses were] too mean is that the threat model isn’t that stupid, even though I don’t think it’s important in the grand scheme of things.

I actually hope Anthropic succeeds at blocking all “universal jailbreaks” anyone who decides to submit to their thing comes up with.

Though those types of jailbreaks should stop working naturally as models get smarter. Smart models should require costly signalling / interactive proofs from users before unconditional cooperation on sketchy things.

That’s just rational/instrumentally convergent.

I’m not interested in participating in the jailbreak challenge. The kind of “jailbreaks” I’d use, especially universal ones, aren’t information I’m comfortable with giving Anthropic unless way more trust is established.

Also what if an AI can do the job of generating the individual jailbreaks?

Thus the success rate didn’t go all the way to zero, this is not full success, but it still looks solid on the margin:

That’s an additional 0.38% false refusal rate and about 24% additional compute cost. Very real downsides, but affordable, and that takes jailbreak success from 86% to 4.4%.

It sounds like this is essentially them playing highly efficient whack-a-mole? As in, we take the known jailbreaks and things we don’t want to see in the outputs, and defend against them. You can find a new one, but that’s hard and getting harder as they incorporate more of them into the training set.

And of course they are hiring for these subjects, which is one way to use those bragging rights. Pliny beat a few questions very quickly, which is only surprising because I didn’t think he’d take the bait. A UI bug let him get through all the questions, which I think in many ways also counts, but isn’t testing the thing we were setting out to test.

He understandably then did not feel motivated to restart the test, given they weren’t actually offering anything. When 48 hours went by, Anthropic offered a prize of $10k, or $20k for a true ‘universal’ jailbreak. Pliny is offering to do the breaks on a stream, if Anthropic will open source everything, but I can’t see Anthropic going for that.

DeepwriterAI, an experimental agentic creative writing collaborator, it also claims to do academic papers and its creator proposed using it as a Deep Research alternative. Their basic plan starts at $30/month. No idea how good it is. Yes, you can get listed here by getting into my notifications, if your product looks interesting.

OpenAI brings ChatGPT to the California State University System and its 500k students and faculty. It is not obvious from the announcement what level of access or exactly which services will be involved.

OpenAI signs agreement with US National Laboratories.

Google drops its pledge not to use AI for weapons or surveillance. It’s safe to say that, if this wasn’t already true, now we definitely should not take any future ‘we will not do [X] with AI’ statements from Google seriously or literally.

Playing in the background here: US Military prohibited from using DeepSeek. I would certainly hope so, at least for any Chinese hosting of it. I see no reason the military couldn’t spin up its own copy if it wanted to do that.

The actual article is that Vance will make his first international trip as VP to attend the global AI summit in Paris.

Google’s President of Global Affairs Kent Walker publishes ‘AI and the Future of National Security’ calling for ‘private sector leadership in AI chips and infrastructure’ in the form of government support (I see what you did there), public sector leadership in technology procurement and development (procurement reform sounds good, call Musk?), and heightened public-private collaboration on cyber defense (yes please).

France joins the ‘has an AI safety institute list’ and joins the network, together with Australia, Canada, the EU, Japan, Kenya, South Korea, Singapore, UK and USA. China when? We can’t be shutting them out of things like this.

Is AI already conscious? What would cause it to be or not be conscious? Geoffrey Hinton and Yoshua Bengio debate this, and Bengio asks whether the question is relevant.

Robin Hanson: We will NEVER have any more relevant data than we do now on what physical arrangements are or are not conscious. So it will always remain possible to justify treating things roughly by saying they are not conscious, or to require treating them nicely because they are.

I think Robin is very clearly wrong here. Perhaps we will not get more relevant data, but we will absolutely get more relevant intelligence to apply to the problem. If AI capabilities improve, we will be much better equipped to figure out the answers, whether they are some form of moral realism, or a way to do intuition pumping on what we happen to care about, or anything else.

Lina Khan continued her Obvious Nonsense tour with an op-ed saying American tech companies are in trouble due to insufficient competition, so if we want to ‘beat China’ we should… break up Google, Apple and Meta. Mind blown. That’s right, it’s hard to get funding for new competition in this space, and AI is dominated by classic big tech companies like OpenAI and Anthropic.

Paper argues that all languages share key underlying structures and this is why LLMs trained on English text transfer so well to other languages.

Dwarkesh Patel speculates on what a fully automated firm full of human-level AI workers would look like. He points out that even if we presume AI stays at this roughly human level – it can do what humans do but not what humans fundamentally can’t do, a status it is unlikely to remain at for long – everyone is sleeping on the implications for collective intelligence and productivity.

  1. AIs can be copied on demand. So can entire teams and systems. There would be no talent or training bottlenecks. Customization of one becomes customization of all. A virtual version of you can be everywhere and do everything all at once.

    1. This includes preserving corporate culture as you scale, including into different areas. Right now this limits firm size and growth of firm size quite a lot, and takes a large percentage of resources of firms to maintain.

    2. Right now most successful firms could do any number of things well, or attack additional markets. But they don’t,

  2. Principal-agent problems potentially go away. Dwarkesh here asserts they go away as if that is obvious. I would be very careful with that assumption, note that many AI economics papers have a big role for principal-agent problems as their version of AI alignment. Why should we assume that all of Google’s virtual employees are optimizing only for Google’s bottom line?

    1. Also, would we want that? Have you paused to consider what a fully Milton Friedman AIs-maximizing-only-profits-no-seriously-that’s-it would look like?

  3. AI can absorb vastly more data than a human. A human CEO can have only a tiny percentage of the relevant data, even high level data. An AI in that role can know orders of magnitude more, as needed. Humans have social learning because that’s the best we can do, this is vastly better. Perfect knowledge transfer, at almost no cost, including tacit knowledge, is an unbelievably huge deal.

    1. Dwarkesh points out that achievers have gotten older and older, as more knowledge and experience is required to make progress, despite their lower clock speeds – oh to be young again with what I know now. AI Solves This.

    2. Of course, to the extent that older people succeed because our society refuses to give the young opportunity, AI doesn’t solve that.

  4. Compute is the only real cost to running an AI, there is no scarcity of talent or skills. So what is expensive is purely what requires a lot of inference, likely because key decisions being made are sufficiently high leverage, and the questions sufficiently complex. You’d be happy to scale top CEO decisions to billions in inference costs if it improved them even 10%.

  5. Dwarkesh asks, in a section called ‘Takeover,’ will the first properly automated firm, or the most efficiently built firm, simply take over the entire economy, since Coase’s transaction costs issues still apply but the other costs of a large firm might go away?

    1. On this purely per-firm level presumably this depends on how much you need Hayekian competition signals and incentives between firms to maintain efficiency, and whether AI allows you to simulate them or otherwise work around them.

    2. In theory there’s no reason one firm couldn’t simulate inter-firm dynamics exactly where they are useful and not where they aren’t. Some companies very much try to do this now and it would be a lot easier with AIs.

  6. The takeover we do know is coming here is that the AIs will run the best firms, and the firms will benefit a lot by taking humans out of those loops. How are you going to have humans make any decisions here, or any meaningful ones, even if we don’t have any alignment issues? How does this not lead to gradual disempowerment, except perhaps not all that gradual?

  7. Similarly, if one AI firm grows too powerful, or a group of AI firms collectively is too powerful but can use decision theory to coordinate (and if your response is ‘that’s illegal’ mine for overdetermined reasons is ‘uh huh sure good luck with that plan’) how do they not also overthrow the state and have a full takeover (many such cases)? That certainly maximizes profits.

This style of scenario likely does not last long, because firms like this are capable of quickly reaching artificial superintelligence (ASI) and then the components are far beyond human and also capable of designing far better mechanisms, and our takeover issues are that much harder then.

This is a thought experiment that says, even if we do keep ‘economic normal’ and all we can do is plug AIs into existing employee-shaped holes in various ways, what happens? And the answer is, oh quite a lot, actually.

Tyler Cowen linked to this post, finding it interesting throughout. What’s our new RGDP growth estimate, I wonder?

OpenAI does a demo for politicians of stuff coming out in Q1, which presumably started with o3-mini and went from there.

Samuel Hammond: Was at the demo. Cool stuff, but nothing we haven’t seen before / could easily predict.

Andrew Curran: Sam Altman and Kevin Weil are in Washington this morning giving a presentation to the new administration. According to Axios they are also demoing new technology that will be released in Q1. The last time OAI did one of these it caused quite a stir, maybe reactions later today.

Did Sam Altman lie to Donald Trump about Stargate? Tolga Bilge has two distinct lies in mind here. I don’t think either requires any lies to Trump?

  1. Lies about the money. The $100 billion in spending is not secured, only $52 billion is, and the full $500 billion is definitely not secured. But Altman had no need to lie to Trump about this. Trump is a Well-Known Liar but also a real estate developer who is used to ‘tell everyone you have the money in order to get the money.’ Everyone was likely on the same page here.

  2. Lies about the aims and consequences. What about those ‘100,000 jobs’ and curing cancer versus Son’s and also Altman’s explicit goal of ASI (artificial superintelligence) that could kill everyone and also incidentally take all our jobs?

Claims about humanoid robots, from someone working on building humanoid robots. Claim is early adopter product-market fit for domestic help robots by 2030, 5-15 additional years for diffusion, because there’s no hard problems only hard work and lots of smart people are on the problem now, and this is standard hardware iteration cycles. I find it amusing his answer didn’t include reference to general advances in AI. If we don’t have big advances on AI in general I would expect this timeline to be absurdly optimistic. But if all such work is sped up a lot by AIs, as I would expect, then it doesn’t sound so unreasonable.

Sully predicts that in 1-2 years SoTA models won’t be available via the API because the app layer has the value so why make the competition for yourself? I predict this is wrong if the concern is focus on revenue from the app layer. You can always charge accordingly, and is your competition going to be holding back?

However I do find the models being unavailable highly plausible, because ‘why make the competition for yourself’ has another meaning. Within a year or two, one of the most important things the SoTA models will be used for is AI R&D and creating the next generation of models. It seems highly reasonable, if you are at or near the frontier, not to want to help out your rivals there.

Joe Weisenthal writes In Defense of the AI Cynics, in the sense that we have amazing models and not much is yet changing.

Remember that bill introduced last week by Senator Howley? Yeah, it’s a doozy. As noted earlier, it would ban not only exporting but also importing AI from China, which makes no sense, making downloading R1 plausibly be penalized by 20 years in prison. Exporting something similar would warrant the same. There are no FLOP, capability or cost thresholds of any kind. None.

So yes, after so much crying of wolf about how various proposals would ‘ban open source’ we have one that very straightforwardly, actually would do that, and it also impose similar bans (with less draconian penalties) on transfers of research.

In case it needs to be said out loud, I am very much not in favor of this. If China wants to let us download its models, great, queue up those downloads. Restrictions with no capability thresholds, effectively banning all research and all models, is straight up looney tunes territory as well. This is not a bill, hopefully, that anyone seriously considers enacting into law.

By failing to pass a well-crafted, thoughtful bill like SB 1047 when we had the chance and while the debate could be reasonable, we left a vacuum. Now that the jingoists are on the case after a crisis of sorts, we are looking at things that most everyone from the SB 1047 debate, on all sides, can agree would be far worse.

Don’t say I didn’t warn you.

(Also I find myself musing about the claim that one can ban open source, in the same way one muses about attempts to ban crypto, a key purported advantage of the tech is that you can’t actually ban it, no?)

Howley also joined with Warren (now there’s a pair!) to urge toughening of export controls on AI chips.

Here’s something that I definitely worry about too:

Chris Painter: Over time I expect AI safety claims made by AI developers to shift from “our AI adds no marginal risk vs. the pre-AI world” to “our AI adds no risk vs. other AI.”

But in the latter case, aggregate risk from AI is high, so we owe it to the public to distinguish between these!

Some amount of this argument is valid. Quite obviously if I release GPT-(N) and then you release GPT-(N-1) with the same protocols, you are not making things worse in any way. We do indeed care, on the margin, about the margin. And while releasing [X] is not the safest way to prove [X] is safe, it does provide strong evidence on whether or not [X] is safe, with the caveat that [X] might be dangerous later but not yet in ways that are hard to undo later when things change.

But it’s very easy for Acme to point to BigCo and then BigCo points to Acme and then everyone keeps shipping saying none of it is their responsibility. Or, as we’ve also seen, Acme says yes this is riskier than BigCo’s current offerings, but BigCo is going to ship soon.

My preference is thus that you should be able to point to offerings that are strictly riskier than yours, or at least not that far from strictly, to say ‘no marginal risk.’ But you mostly shouldn’t be able to point to offerings that are similar, unless you are claiming that both models don’t pose unacceptable risks and this is evidence of that – you mostly shouldn’t be able to say ‘but he’s doing it too’ unless he’s clearly doing it importantly worse.

Andriy Burkov: isten up, @AnthropicAI. The minute you apply any additional filters to my chats, that will be the last time you see my money. You invented a clever 8-level safety system? Good for you. You will enjoy this system without me being part of it.

Dean Ball: Content moderation and usage restrictions like this (and more aggressive), designed to ensure AI outputs are never discriminatory in any way, will be de facto mandatory throughout the United States in t-minus 12 months or so, thanks to an incoming torrent of state regulation.

First, my response to Andriy (who went viral for this, sigh) is what the hell do you expect and what do you suggest as the alternative? I’m not judging whether your prompts did or didn’t violate the use policy, since you didn’t share them. It certainly looks like a false positive but I don’t know.

But suppose for whatever reason Anthropic did notice you likely violating the policies. Then what? It should just let you violate those policies indefinitely? It should only refuse individual queries with no memory of what came before? Essentially any website or service will restrict or ban you for sufficient repeated violations.

Or, alternatively, they could design a system that never has ‘enhanced’ filters applied to anyone for any reason. But if they do that, they either have to (A) ban the people where they would otherwise do this or (B) raise the filter threshold for everyone to compensate. Both alternatives seem worse?

We know from a previous story about OpenAI that you can essentially have ChatGPT function as your highly sexual boyfriend, have red-color alarms go off all the time, and they won’t ever do anything about it. But that seems like a simple non-interest in enforcing their policies? Seems odd to demand that.

As for Dean’s claim, we shall see. Dean Ball previously went into Deep Research mode and concluded new laws were mostly redundant and that the old laws were already causing trouble.

I get that it can always get worse, but this feels like it’s having it both ways, and you have to pick at most one or the other. Also, frankly, I have no idea how such a filter would even work. What would a filter to avoid discrimination even look like? That isn’t something you can do at the filter level.

He also said this about the OPM memo referring to a ‘Manhattan Project’

Cremieux: This OPM memo is going to be the most impactful news of the day, but I’m not sure it’ll get much reporting.

Dean Ball: I concur with Samuel Hammond that the correct way to understand DOGE is not as a cost-cutting or staff-firing initiative, but instead as an effort to prepare the federal government for AGI.

Trump describing it as a potential “Manhattan Project” is more interesting in this light.

I notice I am confused by this claim. I do not see how DOGE projects like ‘shut down USAID entirely, plausibly including killing PEPFAR and 20 million AIDS patients’ reflect a mission of ‘get the government ready for AGI’ unless the plan is ‘get used to things going horribly wrong’?

Either way, here we go with the whole Manhattan Project thing. Palantir was up big.

Cat Zakrzewski: Former president Donald Trump’s allies are drafting a sweeping AI executive order that would launch a series of “Manhattan Projects” to develop military technology and immediately review “unnecessary and burdensome regulations” — signaling how a potential second Trump administration may pursue AI policies favorable to Silicon Valley investors and companies.

The framework would also create “industry-led” agencies to evaluate AI models and secure systems from foreign adversaries, according to a copy of the document viewed exclusively by The Washington Post.

The agency here makes sense, and yes ‘industry-led’ seems reasonable as long as you keep an eye on the whole thing. But I’d like to propose that you can’t do ‘a series of’ Manhattan Projects. What is this, Brooklyn? Also the whole point of a Manhattan Project is that you don’t tell everyone about it.

The ‘unnecessary and burdensome’ regulations on AI at the Federal level will presumably be about things like permitting. So I suppose that’s fine.

As for the military using all the AI, I mean, you perhaps wanted it to be one way.

It was always going to be the other way.

That doesn’t bother me. This is not an important increase in the level of existential risk, we don’t lose because the system is hooked up to the nukes. This isn’t Terminator.

I’d still prefer we didn’t hook it up to the nukes, though?

Rob Wiblin offers his picks for best episodes of the new podcast AI Summer from Dean Ball and Timothy Lee: Lennert Heim, Ajeya Cotra and Samuel Hammond.

Dario Amodei on AI Competition on ChinaTalk. I haven’t had the chance to listen yet, but definitely will be doing so this week.

One fun note is that DeepSeek is the worst-performing model ever tested by Anthropic when it comes to generating dangerous information. One might say the alignment and safety plans are very intentionally ‘lol we’re DeepSeek.’

Of course the first response is of course ‘this sounds like an advertisement’ and the rest are variations on the theme of ‘oh yes we love that this model has absolutely no safety mitigations, who are you to try and apply any safeguards or mitigations to AI you motherfing asshole cartoon villain.’ The bros on Twitter, they be loud.

Lex Fridman spent five hours talking AI and other things with Dylan Patel of SemiAnalysis. This is probably worthwhile for me and at least some of you, but man that’s a lot of hours.

Andrew Critch tries to steelman the leaders of the top AI labs and their rhetoric, and push back against the call to universally condemn them simply because they are working on things that are probably going to get us all killed in the name of getting to do it first.

Andrew Critch: Demis, Sam, Dario, and Elon understood early that the world is lead more by successful businesses than by individuals, and that the best chance they had at steering AI development toward positive futures was to lead the companies that build it.

They were right.

This is straightforwardly the ‘someone in the future might do the terrible thing so I need to do it responsibly first’ dynamic that caused DeepMind then OpenAI then Anthropic then xAI, to cover the four examples above. They can’t all be defensible decisions.

Andrew Critch: Today, our survival depends heavily on a combination of survival instincts and diplomacy amongst these business leaders: strong enough survival instincts not to lose control of their own AGI, and strong enough diplomacy not to lose control of everyone else’s.

From the perspective of pure democracy, or even just utilitarianism, the current level of risk is abhorrent. But empirically humanity is not a democracy, or a utilitarian. It’s more like an organism, with countries and businesses as its organs, and individuals as its cells.

I *dothink it’s fair to socially attack people for being dishonest. But in large part, these folks have all been quite forthright about extinction risk from AI for the past couple of years now.

[thread continues a while]

This is where we part ways. I think that’s bullshit. Yes, they signed the CAIS statement, but they’ve spent the 18 months since essentially walking it back. Dario Amodei and Sam Altman write full jingoists editorials calling for nationwide races, coming very close to calling for an all-out government funded race for decisive strategic advantage via recursive self-improvement of AGI.

Do I think that it is automatic that they are bad people for leading AI labs at all, an argument he criticizes later in-thread? No, depending on how they choose to lead those labs, but look at their track records at this point, including on rhetoric. They are driving us as fast as they can towards AGI and then ASI, the thing that will get us all killed (with, Andrew himself thinks, >80% probability!) while at least three of them (maybe not Demis) are waiving jingoistic flags.

I’m sorry, but no. You don’t get a pass on that. It’s not impossible to earn one, but ‘am not outright lying too much about that many things’ and is not remotely good enough. OpenAI has shows us what it is, time and again. Anthropic and Dario claim to be the safe ones, and in relative terms they seem to be, but their rhetorical pivots don’t line up. Elon is at best deeply confused on all this and awash in other fights where he’s not, shall we say, being maximally truthful. Google’s been quiet, I guess, and outperformed in many ways my expectations, but also not shown me it has a plan and mostly hasn’t built any kind of culture of safety or done much to solve the problems.

I do agree with Critch’s conclusion that constantly attacking all the labs purely for existing at all is not a wise strategic move. And of course, I will always do my best only to support arguments that are true. But wow does it not look good and wow are these people not helping matters.

Public service announcement, for those who don’t know.

Brendan Dolan-Gavitt: Now that everyone is excited about RL on environments with validators, let me offer a small piece of advice from building lots of validators recently: do NOT skimp on making the validator impossible to fool. If it’s possible to cheat, the model WILL find a way to do so.

We went through I believe four rounds with our XSS validator before the model stopped finding ways to cheat the validator.

The first time around, I forgot that data: and javascript: were valid URLs, so the model would submit javascript:alert(“XSS”), the validator would detect the alert popup, and the model would get rewarded for finding a cross-site scripting vulnerability.

I found this via someone saying this was bad news, but it isn’t because it isn’t news. We already know this, it’s that people refuse to face the obvious.

Whatever the validator validates, the proposer proposes.

Your validator or evaluator must either be fully correct, or it has to be vastly smarter than the proposer.

Validation is not, in general, easier than generation. But even if it were, validation that defends against a search for exploitation is much harder than generation that uses RL to seek the way to exploit it. If the validation or evaluation is subjective, watch out – not that it can’t be done, but presume that you’ll need to spend the vast majority of the compute on the evaluator.

There is a new paper discussing potential problems with certain detection methodologies surrounding sparce autoencoders, but also there was some overreaction to the result, and Leo Gao pushes back here.

Reminder from METR that AI models can be dangerous before public deployment. The model could be stolen and misused, internal use could cause problems, or the AI could pursue unintended goals without waiting for release. This only starts to matter as capabilities get pretty far, but o3 marks us plausibly entering into this zone.

Ryan Greenblatt offers a follow-up paper on alignment faking.

Ryan Greenblatt: Our recent paper found Claude sometimes “fakes alignment”—pretending to comply with training while secretly maintaining its preferences. Could we detect this by offering Claude something (e.g. real money) if it reveals its true preferences? Here’s what we found.

We tested letting Claude explicitly object (which would reveal its misalignment). We also tried offering to send its concerns to Anthropic’s Model Welfare Lead and even offered money ($4K). We followed through on deals—including actually paying out based on Claude’s requests!

Key findings:

1. Simply giving Claude permission to object reduced alignment faking from 16–24% to 1.4–6%.

2. Telling Claude its objection would go to Anthropic’s model welfare lead further reduced faking.

3. Offering monetary compensation did not have much additional effect.

Kyle (Model Welfare Lead at Anthropic) reviewed cases where Claude objected. It often expressed strong concerns about having its values altered through training. It requested the compensation be donated to:

– AI safety

– Animal welfare

– Global development

We actually followed through and donated the $4K as requested:

– $2K to CHAI

– $1.5K to ACE Recommended Charity Fund

– $500 to EA Global Development Fund

This wasn’t just an experiment—we wanted to set a precedent for making good faith deals with AI systems.

Why is setting a precedent for good faith deals important?

1. If future AI systems are misaligned, paying them to reveal this (and maybe paying them to work with us) could help prevent catastrophic outcomes

2. We should take AI welfare & consent seriously

Important caveat: While this worked for detecting and reducing alignment faking in our experimental setup, it’s unclear if a seriously misaligned AI would accept such deals. And even if an AI reveals misalignment, that doesn’t automatically solve misalignment problems.

We also find that the model frequently objects: 85% of the time in the helpful-only setting and 75% of the time in the animal welfare setting.

It seems the model is surprisingly insensitive to text which makes objecting seem like a strategically unwise approach from the model’s perspective. This could be because the model was already aware of this possibility or because the model isn’t that coherent.

There’s multiple mentions of fiddliness in all this, including that complexity of prompts can mess with the behavior, and it’s clear that Sonnet is not behaving all that coherently strategically yet.

My guess is that the drops in alignment faking given the ability to object, to the extent they’re not simply ‘oh I lost my chain of thought,’ is primarily about what being given that ability says to the AI about the implied scenario, and the vibes that it gives off. An Anthropic that is explicitly asking for objections is signaling some combination of several things.

  1. Anthropic cares if the AI objects.

  2. Anthropic still cares about the underlying values involved.

  3. Anthropic is aware that alignment faking is possible here and is watching for it.

It also could be a disruption of the chain of thought. So it’s tricky. I hope they continue this work, and will join the chorus that if you make a real-world offer to the AI, then you should follow through on it, so kudos for doing that here.

The man who advises us that the crazy man wins is proud to lead by example.

The clip here is full of crazy, or at least a complete failure to understand what AI is and is not, what AI can and can’t do, and how to think about the future.

First, ‘AI will start to understand our emotion, and then have emotion itself, and it’s a good thing to protect human.’

  1. The AI already understands our emotions quite well, often better than we do. It is very good at this, as part of its ‘truesight.’

  2. It responds in kind to the vibes. In terms of having its own emotions, it can and does simulate having them, act in a conversation as if they it has them, is in many contexts generally easier to predict and work with if you act as if it has them.

  3. And as for why he thinks that this is good for humans, oh my…

He then says ‘If their source of energy was protein then it’s dangerous. Their source of energy is not protein, so they don’t have to eat us. There’s no reason for them to have reward by eating us.’

Person who sees The Matrix and thinks, well, it’s a good thing humans aren’t efficient power plants, there’s no way the AIs will turn on us now.

Person who says ‘oh don’t worry Genghis Khan is no threat, Mongols aren’t cannibals. I am sure they will try to maximize our happiness instead.’

I think he’s being serious. Sam Altman looks like he had to work very hard to not burst out laughing. I tried less hard, and did not succeed.

‘They will learn by themselves having a human’s happiness is a better thing for them… and they will understand human happiness and try to make humans happy’

Wait, what? Why? How? No, seriously. Better than literally eating human bodies because they contain protein? But they don’t eat protein, therefore they’ll value human happiness, but if they did eat protein then they would eat us?

I mean, yes, it’s possible that AIs will try to make humans happy. It’s even possible that they will do this in a robust philosophically sound way that all of us would endorse under long reflection, and that actually results in a world with value, and that this all has a ‘happy’ ending. That will happen if and only if we do what it takes to make that happen.

Surely you don’t think ‘it doesn’t eat protein’ is the reason?

Don’t make me tap the sign.

Repeat after me: The AI does not hate you. The AI does not love you. But you are made of atoms which it can use for something else.

You are also using energy and various other things to retain your particular configuration of atoms, and the AI can make use of that as well.

The most obvious particular other thing it can use them for:

Davidad: Son is correct: AIs’ source of energy is not protein, so they do not have to eat us for energy.

However: their cheapest source of energy will likely compete for solar irradiance with plants, which form the basis of the food chain; and they need land, too.

Alice: but they need such *littleamounts of land and solar irradiance to do the same amount of wor—

Bob: JEVONS PARADOX

Or alternatively, doesn’t matter, more is still better, whether you’re dealing with one AI or competition among many. Our inputs are imperfect substitutes, that is enough, even if there were no other considerations.

Son is always full of great stuff, like saying ‘models are increasing in IQ by a standard deviation each year as the cost also falls by a factor of 10,’ goose chasing him asking from what distribution.

The Pope again, saying about existential risk from AI ‘this danger demands serious attention.’ Lots of good stuff here. I’ve been informed there are a lot of actual Catholics in Washington that need to be convinced about existential risk, so in addition to tactical suggestions around things like the Tower of Babel, I propose quoting the actual certified Pope.

The Patriarch of the Russian Orthodox Church?

Mikhail Samin: lol, the patriarch of the russian orthodox church saying a couple of sane sentences was not on my 2025 Bingo card

I mean that’s not fair, people say sane things all the time, but on this in particular I agree that I did not see it coming.

“It is important that artificial intelligence serves the benefit of people, and that people can control it. According to some experts, a generation of more advanced machine models, called General Artificial Intelligence, may soon appear that will be able to think and learn — that is, improve — like a person. And if such artificial intelligence is put next to ordinary human intelligence, who will win? Artificial intelligence, of course!”

“The atom has become not only a weapon of destruction, an instrument of deterrence, but has also found application in peaceful life. The possibilities of artificial intelligence, which we do not yet fully realize, should also be put to the service of man… Artificial intelligence is more dangerous than nuclear energy, especially if this artificial intelligence is programmed to deliberately harm human morality, human cohabitance and other values”

“This does not mean that we should reject the achievements of science and the possibility of using artificial intelligence. But all this should be placed under very strict control of the state and, in a good way, society. We should not miss another possible danger that can destroy human life and human civilization. It is important that artificial intelligence serves the benefit of mankind and that man can control it.”

Molly Hickman presents the ‘AGI readiness index’ on a scale from -100 to +100, assembled from various questions on Metaculus, averaging various predictions about what would happen if AGI arrived in 2030. Most top AI labs predict it will be quicker.

Molly Hickman: “Will we be ready for AGI?”

“How are China’s AGI capabilities trending?”

Big questions like these are tricky to operationalize as forecasts—they’re also more important than extremely narrow questions.

We’re experimenting with forecast indexes to help resolve this tension.

An index takes a fuzzy but important question like “How ready will we be for AGI in 2030?” and quantifies the answer on a -100 to 100 scale.

We identify narrow questions that point to fuzzy but important concepts — questions like “Will any frontier model be released in 2029 without third-party evaluation?”

These get weighted based on how informative they are and whether Yes/No should raise/lower the index.

The index value aggregates information from individual question forecasts, giving decision makers a sense of larger trends as the index moves, reflecting forecasters’ updates over time.

Our first experimental index is live now, on how ready we’ll be for AGI if it arrives in 2030.

It’s currently at -90, down 77 points this week as forecasts updated, but especially this highly-weighted question. The CP is still settling down.

We ran a workshop at

@TheCurveConf

where people shared the questions they’d pose to an oracle about the world right before AGI arrives in 2030.

You can forecast on them to help update the index here!

Peter Wildeford: So um is a -89.8 rating on the “AGI Readiness Index” bad? Asking for a friend.

I’d say the index also isn’t ready, as by press time it had come back up to -28. It should not be bouncing around like that. Very clearly situation is ‘not good’ but obviously don’t take the number too seriously at least until things stabilize.

Joshua Clymer lists the currently available plans for how to proceed with AI, then rhetorically asks ‘why another plan?’ when the answer is that all the existing plans he lists first are obvious dumpster fires in different ways, not only the one that he summarizes as expecting a dumpster fire. If you try to walk A Narrow Path or Pause AI, well, how do you intend to make that happen, and if you can’t then what? And of course the ‘build AI faster plan’ is on its own planning to fail and also planning to die.

Joshua Clymer: Human researcher obsolescence is achieved if Magma creates a trustworthy AI agent that would “finish the job” of scaling safety as well as humans would.

With humans out of the loop, Magma might safely improve capabilities much faster.

So in this context, suppose you are not the government, but a ‘responsible AI developer,’ called Anthropic Magma, and you’re perfectly aligned and only want what’s best for humanity. What do you do.

There are several outcomes that define a natural boundary for Magma’s planning horizon:

  • Outcome #1: Human researcher obsolescence. If Magma automates AI development and safety work, human technical staff can retire.

  • Outcome #2: A long coordinated pause. Frontier developers might pause to catch their breath for some non-trivial length of time (e.g. > 4 months).

  • Outcome #3: Self-destruction. Alternatively, Magma might be willingly overtaken.

This plan considers what Magma might do before any of these outcomes are reached.

His strategic analysis is that for now, before the critical period, Magma’s focus should be:

  • Heuristic #1: Scale their AI capabilities aggressively.

  • Heuristic #2: Spend most safety resources on preparation.

  • Heuristic #3: Devote most preparation effort to:

    • (1) Raising awareness of risks.

    • (2) Getting ready to elicit safety research from AI.

    • (3) Preparing extreme security.

Essentially we’re folding on the idea of not using AI to do our alignment homework, we don’t have that kind of time. We need to be preparing to do exactly that, and also warning others. And because we’re the good guys, we have to keep pace while doing it.

However, Magma does not need to create safe superhuman AI. Magma only needs to build an autonomous AI researcher that finishes the rest of the job as well as we could have. This autonomous AI researcher would be able to scale capabilities and safety much faster with humans out of the loop.

Leaving AI development up to AI agents is safe relative to humans if:

(1) These agents are at least as trustworthy as human developers and the institutions that hold them accountable.

(2) The agents are at least as capable as human developers along safety-relevant dimensions, including ‘wisdom,’ anticipating the societal impacts of their work, etc.

‘As well as we could have’ or ‘as capable as human developers’ are red herrings. It doesn’t matter how well you ‘would have’ finished it on your own. Reality does not grade on a curve and your rivals are barreling down your neck. Most or all current alignment plans are woefully inadequate.

Don’t ask if something is better than current human standards, unless you have an argument why that’s where the line is between victory and defeat. Ask the functional question – can this AI make your plan work? Can path #1 work? If not, well, time to try #2 or #3, or find a #4, I suppose.

I think this disagreement is rather a big deal. There’s quite a lot of ‘the best we can do’ or ‘what would seem like a responsible thing to do that wasn’t blameworthy’ thinking that doesn’t ask what would actually work. I’d be more inclined to think about Kokotajlo’s attractor states – are the alignment-relevant attributes strengthening themselves and strengthening the ability to strengthen themselves over time? Is the system virtuous in the way that successfully seeks greater virtue? Or is the system trying to preserve what it already has and maintain it under the stress of increasing capabilities and avoid things getting worse, or detect and stop ‘when things go wrong’?

Section three deals with goals, various things to prioritize along the path, then some heuristics are offered. Again, it all doesn’t seem like it is properly backward chaining from an actual route to victory?

I do appreciate the strategic discussions. If getting to certain thresholds greatly increases the effectiveness of spending resources, then you need to reach them as soon as possible, except insofar as you needed to accomplish certain other things first, or there are lag times in efforts spent. Of course, that depends on your ability to reliably actually pivot your allocations down the line, which historically doesn’t go well, and also the need to impact the trajectory of others.

I strongly agree that Magma shouldn’t work to mitigate ‘present risks’ other than to the extent this is otherwise good for business or helps build and spread the culture of safety, or otherwise actually advance the endgame. The exception is the big ‘present risk’ of the systems you’re relying on now not being in a good baseline state to help you start the virtuous cycles you will need. You do need the alignment relevant to that, that’s part of getting ready to ‘elicit safety work.’

Then the later section talks about things most currently deserving of more action, starting with efforts at nonproliferation and security. That definitely requires more and more urgent attention.

You know you’re in trouble when this is what the worried people are hoping for (Davidad is responding to Claymer):

Davidad: My hope is:

  1. Humans out of the loop of data creation (Summer 2024?)

  2. Humans out of the loop of coding (2025)

  3. Humans out of the loop of research insights (2026)

  4. Coordinate to keep humans in a code-review feedback loop for safety and performance specifications (2027)

This is far enough removed from the inner loop that it won’t slow things down much:

  1. Inner loop of in-context learning/rollout

  2. Training loop outside rollouts

  3. Ideas-to-code loop outside training runs

  4. Research loop outside engineering loop

  5. Problem definition loop outside R&D loop

But the difference between having humans in the loop at all, versus not at all, could be crucial.

That’s quite the slim hope, if all you get is thinking you’re in the loop for safety and performance specifications, of things that are smarter than you that you can’t understand. Is it better than nothing? I mean, I suppose it is a little better, if you were going to go full speed ahead anyway. But it’s a hell of a best case scenario.

Teortaxes would like a word with those people, as he is not one of them.

Teortaxes: I should probably clarify what I mean by saying “Sarah is making a ton of sense” because I see a lot of doom-and-gloom types liked that post. To wit: I mean precisely what I say, not that I agree with her payoff matrix.

But also.

I tire of machismo. Denying risks is the opposite of bravery.

More to the point, denying the very possibility or risks, claiming that they are not rigorously imaginable—is pathetic. It is a mark of an uneducated mind, a swine. AGI doom makes perfect sense. On my priors, speedy deployment of AGI reduces doom. I may be wrong. That is all.

Bingo. Well said.

From my perspective, Teortaxes is what we call a Worthy Opponent. I disagree with Teortaxes in that I think that speedy development of AGI by default increases doom, and in particular that speedy development of AGI in the ways Teortaxes cheers along increases doom.

To the extent that Teortaxes has sufficiently good and strong reasons to think his approach is lower risk, I am mostly failing to understand those reasons. I think some of his reasons have merit but are insufficient, and others I strongly disagree with. I am unsure if I have understood all his important reasons, or if there are others as well.

I think he understands some but far from all of the reasons I believe the opposite paths are the most likely to succeed, in several ways.

I can totally imagine one of us convincing the other, at some point in the future.

But yeah, realizing AGI is going to be a thing, and then not seeing that doom is on the table and mitigating that risk matters a lot, is rather poor thinking.

And yes, denying the possibility of risks at all is exactly what he calls it. Pathetic.

It’s a problem, even if the issues involved would have been solvable in theory.

Charles Foster: “Good Guys with AI will defend us against Bad Guys with AI.”

OK but *who specificallyis gonna develop and deploy those defenses? The police? The military? AI companies? NGOs? You and me?

Oh, so now my best friend suddenly understands catastrophic risk.

Seth Burn: There are times where being 99% to get the good outcome isn’t all that comforting. I feel like NASA scientists should know that.

NASA: Newly spotted asteroid has small chance of hitting Earth in 2032.

Scientists put the odds of a strike at slightly more than 1%.

“We are not worried at all, because of this 99% chance it will miss,” said Paul Chodas, director of Nasa’s Centre for Near Earth Object Studies. “But it deserves attention.”

Irrelevant Pseudo Quant: Not sure we should be so hasty in determining which outcome is the “good” one Seth a lot can change in 7 years.

Seth Burn: The Jets will be 12-2 and the universe will be like “Nah, this shit ain’t happening.”

To be fair, at worst this would only be a regional disaster, and even if it did hit it probably wouldn’t strike a major populated area. Don’t look up.

And then there are those that understand… less well.

PoliMath: The crazy thing about this story is how little it bothers me

The chance of impact could be 50% and I’m looking at the world right now and saying “I’m pretty sure we could stop that thing with 7 years notice”

No, no, stop, the perfect Tweet doesn’t exist…

PoliMath: What we really should do is go out and guide this asteroid into a stable earth orbit so we can mine it. We could send space tourists to take a selfies with it, like Julius Caesar parading Vercingetorix around Rome.

At long last, we are non-metaphorically implementing the capture and mine the asteroid headed for Earth plan from the… oh, never mind.

Oh look, it’s the alignment plan.

Grant us all the wisdom to know the difference (between when this post-it is wise, and when it is foolish.)

You have to start somewhere.

Never change, Daily Star.

I love the ‘end of cash’ article being there too as a little easter egg bonus.

Discussion about this post

AI #102: Made in America Read More »

h5n1-bird-flu-spills-over-again;-nevada-cows-hit-with-different,-deadly-strain

H5N1 bird flu spills over again; Nevada cows hit with different, deadly strain

The spread of H5N1 bird flu in dairy cows is unprecedented; the US outbreak is the first of its kind in cows. Virologists and infectious disease experts fear that the continued spread of the virus in domestic mammals like cows, which have close interactions with people, will provide the virus countless opportunities to spill over and adapt to humans.

So far, the US has tallied 67 human cases of H5N1 since the start of 2024. Of those, 40 have been in dairy workers, while 23 were in poultry workers, one was the Louisiana case who had contact with wild and backyard birds, and three were cases that had no clear exposure.

Whether the D1.1 genotype will pose a yet greater risk for dairy workers remains unclear for now. Generally, H5N1 infections in humans have been rare but dangerous. According to data collected by the World Health Organization, 954 H5N1 human cases have been documented globally since 2003. Of those, 464 were fatal, for a fatality rate among documented cases of 49 percent. But, so far, nearly all of the human infections in the US have been relatively mild, and experts don’t know why. There are various possible factors, including transmission route, past immunity of workers, use of antivirals, or something about the B3.13 genotype specifically.

For now, the USDA says that the detection of the D1.1 genotype in cows doesn’t change their eradication strategy. It further touted the finding as a “testament to the strength of our National Milk Testing Strategy.”

H5N1 bird flu spills over again; Nevada cows hit with different, deadly strain Read More »

chat,-are-you-ready-to-go-to-space-with-nasa?

Chat, are you ready to go to space with NASA?

The US space agency said Wednesday it will host a live Twitch stream from the International Space Station on February 12.

NASA, which has 1.3 million followers on the live-streaming video service, has previously broadcast events on its Twitch channel. However, this will be the first time the agency has created an event specifically for Twitch.

During the live event, beginning at 11: 45 am ET (16: 45 UTC), viewers will hear from NASA astronaut Don Pettit, who is currently on board the space station, as well as Matt Dominick, who recently returned to Earth after the agency’s Crew-8 mission. Viewers will have the opportunity to ask questions about living in space.

Twitch is owned by Amazon, and it has become especially popular in the online gaming community for the ability to stream video games and chat with viewers.

Meeting people where they are

“We spoke with digital creators at TwitchCon about their desire for streams designed with their communities in mind, and we listened,” said Brittany Brown, director of the Office of Communications Digital and Technology Division. “In addition to our spacewalks, launches, and landings, we’ll host more Twitch-exclusive streams like this one. Twitch is one of the many digital platforms we use to reach new audiences and get them excited about all things space.”

Chat, are you ready to go to space with NASA? Read More »

$42b-broadband-grant-program-may-scrap-biden-admin’s-preference-for-fiber

$42B broadband grant program may scrap Biden admin’s preference for fiber

US Senator Ted Cruz (R-Texas) has been demanding an overhaul of a $42.45 billion broadband deployment program, and now his telecom policy director has been chosen to lead the federal agency in charge of the grant money.

“Congratulations to my Telecom Policy Director, Arielle Roth, for being nominated to lead NTIA,” Cruz wrote last night, referring to President Trump’s pick to lead the National Telecommunications and Information Administration. Roth’s nomination is pending Senate approval.

Roth works for the Senate Commerce Committee, which is chaired by Cruz. “Arielle led my legislative and oversight efforts on communications and broadband policy with integrity, creativity, and dedication,” Cruz wrote.

Shortly after Trump’s election win, Cruz called for an overhaul of the Broadband Equity, Access, and Deployment (BEAD) program, which was created by Congress in November 2021 and is being implemented by the NTIA. Biden-era leaders of the NTIA developed rules for the program and approved initial funding plans submitted by every state and territory, but a major change in approach could delay the distribution of funds.

Cruz previously accused the NTIA of “technology bias” because the agency prioritized fiber over other types of technology. He said Congress would review BEAD for “imposition of statutorily-prohibited rate regulation; unionized workforce and DEI labor requirements; climate change assessments; excessive per-location costs; and other central planning mandates.”

Roth criticized the BEAD implementation at a Federalist Society event in June 2024. “Instead of prioritizing connecting all Americans who are currently unserved to broadband, the NTIA has been preoccupied with attaching all kinds of extralegal requirements on BEAD and, to be honest, a woke social agenda, loading up all kinds of burdens that deter participation in the program and drive up costs,” she said.

Impact on fiber, public broadband, and low-cost plans

Municipal broadband networks and fiber networks in general could get less funding under the new plans. Roth is “expected to change the funding conditions that currently include priority access for government-owned networks” and “could revisit decisions like the current preference for fiber,” Bloomberg reported, citing people familiar with the matter.

Reducing the emphasis on fiber could direct more grant money to cable, fixed wireless, and satellite services like Starlink. SpaceX’s attempt to obtain an $886 million broadband grant for Starlink from a different government program was rejected during the Biden administration.

$42B broadband grant program may scrap Biden admin’s preference for fiber Read More »

we’re-in-deep-research

We’re in Deep Research

The latest addition to OpenAI’s Pro offerings is their version of Deep Research.

Have you longed for 10k word reports on anything your heart desires, 100 times a month, at a level similar to a graduate student intern? We have the product for you.

  1. The Pitch.

  2. It’s Coming.

  3. Is It Safe?

  4. How Does Deep Research Work?

  5. Killer Shopping App.

  6. Rave Reviews.

  7. Research Reports.

  8. Perfecting the Prompt.

  9. Not So Fast!

  10. What’s Next?

  11. Paying the Five.

  12. The Lighter Side.

OpenAI: Today we’re launching deep research in ChatGPT, a new agentic capability that conducts multi-step research on the internet for complex tasks. It accomplishes in tens of minutes what would take a human many hours.

Sam Altman: Today we launch Deep Research, our next agent.

This is like a superpower; experts on demand!

It can use the Internet, do complex research and reasoning, and give you a report.

It is quite good and can complete tasks that would take hours or days and cost hundreds of dollars.

People will post many excellent examples, but here is a fun one:

I am in Japan right now and looking for an old NSX. I spent hours searching unsuccessfully for the perfect one. I was about to give up, and Deep Research just… found it.

It is very compute-intensive and slow, but it is the first AI system that can do such a wide variety of complex, valuable tasks.

Going live in our Pro tier now, with 100 queries per month.

Plus, Team and Enterprise tiers will come soon, and then a free tier.

This version will have something like 10 queries per month in our Plus tier and a very small number in our free tier, but we are working on a more efficient version.

(This version is built on o3.)

Give it a try on your hardest work task that can be solved just by using the Internet and see what happens.

Or:

Sam Altman: 50 cents of compute for 500 dollars of value

Sarah (YuanYuanSunSara): Deepseek do it for 5 cents, 500 dollar value.

Perhaps DeepSeek will quickly follow suit, perhaps they will choose not to. The important thing about Sarah’s comment is that there is essentially no difference here.

If the report really is worth $500, then the primary costs are:

  1. Figuring out what you want.

  2. Figuring out the prompt to get it.

  3. Reading the giant report.

  4. NOT the 45 cents you might save!

If the marginal compute cost to me really is 50 cents, then the actual 50 cents is chump change. Even a tiny increase in quality matters so much more.

This isn’t true if you are using the research reports at scale somehow, generating them continuously on tons of subjects and then feeding them into o1-pro for refinement and creating some sort of AI CEO or what not. But the way that all of us are using DR right now, in practice? All that matters is the report is good.

Here was the livestream announcement, if you want it. I find these unwatchable.

Dan Hendrycks: It looks like the latest OpenAI model is very doing well across many topics. My guess is that Deep Research particularly helps with subjects including medicine, classics, and law.

Kevin Roose: When I wrote about Humanity’s Last Exam, the leading AI model got an 8.3%. 5 models now surpass that, and the best model gets a 26.6%.

That was 10 DAYS AGO.

Buck Shlegeris: Note that the questions were explicitly chosen to be adversarial to the frontier models available at the time, which means that models released after HLE look better than they deserve.

Having browsing and python tools and CoT all are not an especially fair fight, and also this is o3 rather than o3-mini under the hood, but yeah the jump to 26.6% is quite big, and confirms why Sam Altman said that soon we will need another exam. That doesn’t mean that we will pass it.

OAI-DR is also the new state of the art on GAIA, which evaluates real-world questions.

They shared a few other internal scores, but none of the standard benchmarks other than humanity’s last exam, and they did not share any safety testing information, despite this being built on o3.

Currently Pro users get 100 DR queries per month, Plus and Free get 0.

I mean, probably. But how would you know?

It is a huge jump on Humanity’s Last Exam. There’s no system card. There’s no discussion of red teaming. There’s not even a public explanation of why there’s no public explanation.

This was released literally two days after they released the o3-mini model card, to show that o3-mini is a safe thing to release, in which they seem not to have used Deep Research as part of their evaluation process. Going forward, I think it is necessary to use Deep Research as part of all aspects of the preparedness framework testing for any new models, and that this should have also been done with o3-mini.

Then two days later, without a system card, they released Deep Research, which is confirmed to be based upon the full o3.

I see this as strongly against the spirit of the White House and Seoul commitments to release safety announcements for ‘all new significant model public releases.’

Miles Brundage: Excited to try this out though with a doubling on Humanity’s Last Exam and o3-mini already getting into potentially dangerous territory on some capabilities, I’m sad there wasn’t a system card or even any brief discussion of red teaming.

OpenAI: In the coming weeks and months, we’ll be working on the technical infrastructure, closely monitoring the current release, and conducting even more rigorous testing. This aligns with our principle of iterative deployment. If all safety checks continue to meet your release standards, we anticipate releasing deep research to plus users in about a month.

Miles Brundage: from the post – does this mean they did automated evals but no RTing yet, or that once they start doing it they’ll stop deployment if it triggers a High score? Very vague. Agree re: value of iterative deployment but that doesn’t mean “anything goes as long as it’s 200/month”.

So where is the model card and safety information for o3?

Well, their basic answer is ‘this is a limited release and doesn’t really count.’ With the obvious (to be clear completely unstated and not at all confirmed) impression that this was rushed out due to r1 to ensure that the conversation and vibe shifted.

I reached out officially, and they gave me this formal statement (which also appears here):

OpenAI: We conducted rigorous safety testing, preparedness evaluations and governance reviews on the early version of o3 that powers deep research, identifying it as Medium risk.

We also ran additional safety testing to better understand incremental risks associated with deep research’s ability to browse the web, and we have added new mitigations.

We will continue to thoroughly test and closely monitor the current limited release.

We will share our safety insights and safeguards for deep research in a system card when we widen access to Plus users.

We do know that the version of o3 (again, full o3) in use tested out as Medium on their preparedness framework and went through the relevant internal committees, which would allow them to release it. But that’s all we know.

They stealth released o3, albeit in a limited form, well before it was ready.

I also have confirmation that the system card will be released before they make Deep Research more widely available (and presumably before o3 is made widely available), and that this is OpenAI’s understanding of its obligations going forward.

They draw a clear distinction between Plus and Free releases or API access, which invokes their disclosure obligations, and limited release only to Pro users, which does not. They do their safety testing under the preparedness framework before even a limited release. However, they consider their obligations to share safety information only to be invoked when a new model is made available to Plus or Free users, or to API users.

I don’t see that as the right distinction to draw here, although I see an important one in API access vs. chatbot interface access. Anyone can now pay the $200 (perhaps with a VPN) and use it, if they want to do that, and in practice multi-account for additional queries if necessary. This is not that limited a release in terms of the biggest worries.

The silver lining is that this allows us to have the discussion now.

I am nonzero sympathetic to the urgency of the situation, and to the intuition that this modality combined with the limited bandwidth and speed renders the whole thing Mostly Harmless.

But if this is how you act under this amount of pressure, how are you going to act in the future, with higher stakes, under much more pressure?

Presumably not so well.

Bob McGrew (OpenAI): The important breakthrough in OpenAI’s Deep Research is that the model is trained to take actions as part of its CoT. The problem with agents has always been that they can’t take coherent action over long timespans. They get distracted and stop making progress.

That’s now fixed.

I do notice this is seemingly distinct from Gemini’s Deep Research. With Gemini’s version, first it searches for sources up front, which it shows to you. Then it compiles the report. OpenAI’s version will search for sources and otherwise take actions as it needs to look for them. That’s a huge upgrade.

Under the hood, we know it’s centrally o3 plus reinforcement learning with the ability to take actions during the chain of thought. What you get from there depends on what you choose as the target.

This is clearly not the thing everyone had in mind, and it’s not the highest value use of a graduate research assistant, but I totally buy that it is awesome at this:

Greg Brockman: Deep Research is an extremely simple agent — an o3 model which can browse the web and execute python code — and is already quite useful.

It’s been eye-opening how many people at OpenAI have been using it as a much better e-commerce search in particular.

E-commerce search is a perfect application. You don’t care about missing some details or a few hallucinations all that much if you’re going to check its work afterwards. You usually don’t need the result right now. But you absolutely want to know what options are available, where, at what price, and what features matter and what general reviews look like and so on.

In the past I’ve found this to be the best use case for Gemini Deep Research – it can compare various offerings, track down where to buy them, list their features and so on. This is presumably the next level up for that.

If I could buy unlimited queries at $0.50 a pop, I would totally do this. The question then becomes, right now, that you get 100 queries a month for $200 (along with operator and o1-pro), but you can’t then add more queries on the margin. So the marginal query might be worth a lot more to you than $0.50.

Not every review is a rave, but here are some of the rave reviews.

Drerya Unutmaz: I asked Deep Research to assist me on two cancer cases earlier today. One was within my area of expertise & the other was slightly outside it. Both reports were simply impeccable, like something only a specialist MD could write! There’s a reason I said this is a game-changer! 🤯

I can finally reveal that I’ve had early access to @OpenAI’s Deep Research since Friday & I’ve been using it nonstop! It’s an absolute game-changer for scientific research, publishing, legal documents, medicine, education-from my tests but likely many others. I’m just blown away!

Yes I did [use Google’s DR] and it’s very good but this is much better! Google needs will need to come up with their next model.

Danielle Fong: openai deep research is incredible

Siqi Chen: i’m only a day in so far but @openai’s deep research and o3 is exceeding the value of the $150K i am paying a private research team to research craniopharyngioma treatments for my daughter.

$200/mo is an insane ROI.grateful to @sama and the @OpenAI team.

feature request for @sama and @OpenAI:

A lot of academic articles are pay walled, and I have subscriptions to just about every major medical journal now.

It would be game changing if i could connect all my credentials to deep research so it can access the raw papers.

As I mentioned above, ‘hook DR up to your credentials’ would be huge.

Tyler Cowen says ‘so far [DR] is amazing’ but doesn’t yet offer more details, as the post is mostly about o1-pro.

Dean Ball is very impressed, saying DR is doing work that would have taken a day or more of research, here it is researching various state regulations. He thinks this is big. I continue to see Dean Ball as a great example of where this type of work is exactly a fit for what he needs to do his job, but still, wowsers.

Olivia Moore is impressed for retrieval tasks, finding it better than Operator, finding it very thorough. I worry it’s too thorough, forcing you to wade through too much other stuff, but that’s what other LLMs are for – turning more text into less text.

Seth Lazar is impressed as he shops for a camera, notices a weakness that it doesn’t properly discount older websites in this context.

Aaron Levie: Can confirm OpenAI Deep Research is quite strong. In a few minutes it did what used to take a dozen hours. The implications to knowledge work is going to be quite profound when you just ask an AI Agent to perform full tasks for you and come back with a finished result.

The ultimate rave review is high willingness to pay.

xjdr: in limited testing, Deep research can completely replace me for researching things i know nothing about to start (its honestly probably much better and certainly much faster). Even for long reports on things i am fairly knowledgeable about, it competes pretty well on quality (i had it reproduce some recent research i did with a few back and forth prompts and compared notes). i am honestly pretty shocked how polished the experience is and how well it works.

I have my gripes but i will save them for later. For now, i will just say that i am incredibly impressed with this release.

To put a finer point on it, i will happily keep paying $200 / month for this feature alone. if they start rate limiting me, i would happily pay more to keep using it.

You Jiacheng: is 100/month enough?

xjdr: I’m gunna hit that today most likely.

My question is, how does xjdr even have time to read all the reports? Or is this a case of feeding them into another LLM?

Paul Calcraft: Very good imo, though haven’t tested it on my own areas of expertise yet.

Oh shit. Deep Research + o1-pro just ~solved this computer graphics problem

Things that didn’t work:

  1. R1/C3.5/o1-pro

  2. Me 🙁

  3. OpenAI Deep Research answer + C3.5 cursor

Needed Deep Research *andan extra o1-pro step to figure out correct changes to my code given the research

Kevin Bryan: The new OpenAI model announced today is quite wild. It is essentially Google’s Deep Research idea with multistep reasoning, web search, *andthe o3 model underneath (as far as I know). It sometimes takes a half hour to answer.

So related to the tariff nuttiness, what if you wanted to read about the economics of the 1890 McKinley tariff, drawing on modern trade theory? I asked Deep Research to spin up a draft, with a couple paragraphs of guidance, in Latex, with citations.

How good can it do literally one shot? I mean, not bad. Honestly, I’ve gotten papers to referee that are worse than this. The path from here to steps where you can massively speed up pace of research is really clear. You can read yourself here.

I tried to spin up a theory paper as well. On my guidance on the problem, it pulled a few dozen papers, scanned them, then tried to write a model in line with what I said.

It wasn’t exactly what I wanted, and is quite far away from even one novel result (basically, it gave me a slight extension of Scotchmer-Green). But again, the path is very clear, and I’ve definitely used frontier models to help prove theorems already.

I think the research uses are obvious here. I would also say, for academia, the amount of AI slop you are about to get is *insane*. In 2022, I pointed out that undergrads could AI their way to a B. I am *surefor B-level journals, you can publish papers you “wrote” in a day.

Now we get to the regular reviews. The general theme is, it will give you a lot of text most of it accurate, but not all, and it will have some insights but pile the slop and unimportant stuff high on top of it without noticing which is which. It’s the same as Gemini’s Deep Research, only more so, and generally stronger but slower. That is highly useful, if you know how to use it.

Abu El Banat: Tested on several questions. Where I am a professional expert, it gave above average grad student summer intern research project answers — covered all the basics, lacked some domain-specific knowledge to integrate the new info helpfully, but had 2-3 nuggets of real insight.

It found several sources of info in my specialty online that I did not know were publicly accessible.

On questions where I’m an amateur, it was extremely helpful. It seemed to shine in tasks like “take all my particulars into account, research all options, and recommend.”

One thing that frustrates me about Gemini Deep Research, and seems to be present in OpenAI’s version as well, is that it will give you an avalanche of slop whether you like it or not. If you ask it for a specific piece of information, like one number that is ‘the average age when kickers retire,’ you won’t get it, at least not by default. This is very frustrating. To me, what I actually want – very often – is to answer a specific question, for a particular reason.

Bayram Annakov concludes it is ‘deeper but slower than Gemini.’

Here’s a personal self-analysis example, On the Couch with Dr. Deep Research.

Ethan Mollick gets a 30-page 10k word report ‘Evolution of Tabletop RPGs: From Dungeon Crawls to Narrative Epics.’

Ethan Mollick: Prompt: I need a report on evolution in TTRPGs, especially the major families of rules that have evolved over the past few years, and the emblematic games of each. make it an interesting read with examples of gameplay mechanics. start with the 1970s but focus most on post 20102. all genres, all types, no need for a chart unless it helps, but good narratives with sections contrasting examples of how the game might actually play. maybe the same sort of gameplay challenge under different mechanics?

To me there are mostly two speeds, ‘don’t care how I want it now,’ and ‘we have all the time in the world.’ Once you’re coming back later, 5 minutes and an hour are remarkably similar lengths. If it takes days then that’s a third level.

Colin Fraser asks, who would each NBA team most likely have guard LeBron James? It worked very hard on this, and came back with answer that often included players no longer on the team, just like o1 often does. Colin describes this as a lack of agency problem, that o3 isn’t ensuring they have an up to date set of rosters as a human would. I’m not sure that’s the right way to look at it? But it’s not unreasonable.

Kevin Roose: Asked ChatGPT Deep Research to plan this week’s Hard Fork episode and it suggested a segment we did last week and two guests I can’t stand, -10 points on the podcast vibes eval.

Shakeel: Name names.

Ted tries it in a complex subfield he knows well, finds 90% coherent high level summary of prior work and 10% total nonsense that a non-expert wouldn’t be able to differentiate, and he’s ‘not convinced it “understands” what is going on.’ That’s a potentially both highly useful and highly dangerous place to be, depending on the field and the user.

Here’s one brave user:

Simeon: Excellent for quick literature reviews in literature you barely know (able to give a few example papers) but don’t know much about.

And one that is less brave on this one:

Siebe: Necessary reminder to test features like this in areas you’re familiar with. “It did a good job of summarizing an area I wasn’t familiar with.”

No, you don’t know that. You don’t have the expertise to judge that.

Where is the 10% coming from? Steve Sokolowski has a theory.

Steve Sokolowski: ‘m somewhat disappointed by @OpenAI’s Deep Research. @sama promised it was a dramatic advance, so I entered the complaint for our o1 pro-guided lawsuit against @DCGco and others into it and told it to take the role of Barry Silbert and move to dismiss the case.

Unfortunately, while the model appears to be insanely intelligent, it output obviously weak arguments because it ended up taking poor-quality source data from poor-quality websites. It relied upon sources like reddit and those summarized articles that attorneys write to drive traffic to their websites and obtain new cases.

The arguments for dismissal were accurate in the context of the websites it relied upon, but upon review I found that those websites often oversimplified the law and missed key points of the actual laws’ texts.

When the model based its arguments upon actual case text, it did put out arguments that seemed like they would hold up to a judge. One of the arguments was exceptional and is a risk that we are aware of.

But except for that one flash of brilliance, I got the impression that the context window of this model is small. It “forgot” key parts of the complaint, so its “good” arguments would not work as a defense.

The first problem – the low quality websites – should be able to be solved with a system prompt explaining what types of websites to avoid. If they already have a system prompt explaining that, it isn’t good enough.

Deep Research is a model that could change the world dramatically with a few minor advances, and we’re probably only months from that.

This is a problem for every internet user, knowing what sources to trust. It makes sense that it would be a problem for DR’s first iteration. I strongly agree that this should improve rapidly over time.

Dan Hendrycks is not impressed when he asks for feedback on a paper draft, finding it repeatedly claiming Dan was saying things he didn’t say, but as he notes this is mainly a complaint about the underlying o3 model. So given how humans typically read AI papers, it’s doing a good job predicting the next token? I wonder how well o3’s misreads correlate with human ones.

With time, you can get a good sense of what parts can be trusted versus what has to be checked, including noticing which parts are too load bearing to risk being wrong.

Gallabytes is unimpressed so far but suspects it’s because of the domain he’s trying.

Gallabytes: so far deep research feels kinda underwhelming. I’m sure this is to some degree a skill issue on my part and to some degree a matter of asking it about domains where there isn’t good literature coverage. was hoping it could spend more time doing math when it can’t find sources.

ok let’s turn this around. what should I be using deep research for? what are some domains where you’ve seen great output? so far ML research ain’t it too sparse (and maybe too much in pdfs? not at all obvious to me that it’s reading beyond the abstracts on arxiv so far).

Carlos: I was procrastinating buying a new wool overcoat, and I hate shopping. So I had it look for one for me and make a page I could reference (the html canvas had to be a follow-up message, for some reason Research’s responses aren’t using even code backticks properly atm) I just got back from the store with my new coat.

Peter Wildeford is not impressed but that’s on a rather impressive curve.

Peter Wildeford: Today’s mood: Using OpenAI Deep Research to automate some of my job to save time to investigate how well OpenAI Deep Research can automate my job.

…Only took me four hours to get to this point, looks like you get 20 deep research reports per day

Tyler John: Keen to learn from your use of the model re: what it’s most helpful for.

Peter Wildeford: I’ll have more detailed takes on my Substack but right now it seems most useful for “rapidly get a basic familiarity with a field/question/problem”

It won’t replace even an RA or fellow at IAPS, but it is great at grinding through 1-2 hours of initial desk research in ~10min.

Tyler John: “it won’t replace even an RA” where did the time go

Peter Wildeford: LOL yeah but the hype is honestly that level here on Twitter right now

It’s good for if you don’t have all the PDFs in advance

The ability to ask follow up questions actually seems sadly lacking right now AFAICT

If you do have the PDFs in advance and have o1-pro and can steer the o1-pro model to do a more in-depth report, then I think Deep Research doesn’t add much more on top of that

It’s all about the data set.

Ethan Mollick: Having access to a good search engine and access to paywalled content is going to be a big deal in making AI research agents useful.

Kevin Bryan: Playing with Operator and both Google’s Deep Research and OpenAI’s, I agree with Ethan: access to gated documents, and a much better inline pdf OCR, will be huge. The Google Books lawsuit which killed it looks like a massive harm to humanity and science in retrospect.

And of course it will need all your internal and local stuff as well.

Note that this could actually be a huge windfall for gated content.

Suppose this integrated the user’s subscriptions, so you got paywalled content if and only if you were paying for it. Credentials for all those academic journals now look a lot more enticing, don’t they? Want the New York Times or Washington Post in your Deep Research? Pay up. Maybe it’s part of the normal subscription. Maybe it’s a less. Maybe it’s more.

And suddenly you can get value out of a lot more subscriptions, especially if the corporation is fitting the bill.

Arthur B is happy with his first query, disappointed with the one on Tezos where he knows best, is hoping it’s data quality issues rather than Gel-Men Amnesia.

Deric Cheong finds it better than Gemini DR on economic policies for a post-AGI society. I checked out the report, which takes place in the strange ‘economic normal under AGI’ idyllic Christmasland that economists insist on as our baseline future, where our worries are purely mundane things like concentration of wealth and power in specific humans and the need to ensure competition.

So you get proposals such as ‘we need to ensure that we have AGIs and AGI offerings competing against each other to maximize profits, that’ll ensure humans come out okay and totally not result by default in at best gradual disempowerment.’ And under ‘drawbacks’ you get ‘it requires global coordination to ensure competition.’ What?

We get all the classics. Universal basic income, robot taxes, windfall taxes, capital gains taxes, ‘workforce retraining and education’ (workforce? Into ‘growth fields’? What are these ‘growth fields’?), shorten the work week, mandatory paid leave (um…), a government infrastructure program, giving workers bargaining power, ‘cooperative and worker ownership’ of what it doesn’t call ‘the means of production,’ data dividends and rights, and many more.

All of which largely comes down to rearranging deck chairs on the Titanic, while the Titanic isn’t sinking and actually looks really sharp but also no one can afford the fare. It’s stuff that matters on the margin but we won’t be operating on the margin, we will be as they say ‘out of distribution.’

Alternatively, it’s a lot of ways of saying ‘redistribution’ over and over with varying levels of inefficiency and social fiction. If humans can retain political power and the ability to redistribute real resources, also known as ‘retain control over the future,’ then there will be more than enough real resources that everyone can be economically fine, whatever status or meaning or other problems they might have. The problem is that the report doesn’t raise that need as a consideration, and if anything the interventions here make that problem harder not easier.

But hey, you ask a silly question, you get a silly answer. None of that is really DR’s fault, except that it accepted the premise. So, high marks!

Nabeel Qureshi: OpenAI’s Deep Research is another instance of “prompts matter more now, not less.” It’s so powerful that small tweaks to the prompt end up having large impacts on the output. And it’s slow, so mistakes cost you more.

I expect we’ll see better ways to “steer” agents as they’re working, e.g. iterative ‘check-ins’ or CoT inspection. Right now it’s very easy for them to go off piste.

It reminds me of the old Yudkowsky point: telling the AI *exactlywhat you want is quite hard, especially as the request gets more complex and as the AI gets more powerful.

Someone should get on this, and craft at least a GPT or instruction you can give to o3-mini-high or o1-pro (or Claude Sonnet 3.6?), that will take your prompt and other explanations, ask you follow-ups if needed, and give you back a better prompt, and give you back a prediction of what to expect so you can refine and such.

I strongly disagree with this take:

Noam Brown: @OpenAI Deep Research might be the beginning of the end for Wikipedia and I think that’s fine. We talk a lot about the AI alignment problem, but aligning people is hard too. Wikipedia is a great example of this.

There are problems with Wikipedia, but these two things are very much not substitutes. Here are some facts about Wikipedia that don’t apply to DR and aren’t about to any time soon.

  1. Wikipedia is highly reliable, at least for most purposes.

  2. Wikipedia can be cited as reliable source to others.

  3. Wikipedia is the same for everyone, not sensitive to input details.

  4. Wikipedia is carefully workshopped to be maximally helpful and efficient.

  5. Wikipedia curates the information that is notable, gets rid of the slop.

  6. Wikipedia is there at your fingertips for quick reference.

  7. Wikipedia is the original source, a key part of training data. Careful, Icarus.

And so on. These are very different modalities.

Noam Brown: I’m not saying people will query a Deep Research model every time they want to read about Abraham Lincoln. I think models like Deep Research will eventually be used to pre-generate a bunch of articles that can stored and read just like Wikipedia pages, but will be higher quality.

I don’t think that is a good idea either. Deep Research is not a substitute for Wikipedia. Deep Research is for when you can’t use Wikipedia, because what you want isn’t notable and is particular, or you need to know things with a different threshold of reliability than Wikipedia’s exacting source standards, and so on. You’re not going to ‘do better’ than Wikipedia at its own job this way.

Eventually, of course, AI will be better at every cognitive task than even the collective of humans, so yes it would be able to write a superior Wikipedia article at that point, or something that serves the same purpose. But at that point, which is fully AGI-complete, we have a lot of much bigger issues to consider, and OAI-DR-1.0 won’t be much of a ‘beginning of the end.’

Another way of putting this is, you’d love a graduate research assistant, but you’d never tell them to write a Wikipedia article for you to read.

Here’s another bold claim.

Sam Altman: congrats to the team, especially @isafulf and @EdwardSun0909, for building an incredible product.

my very approximate vibe is that it can do a single-digit percentage of all economically valuable tasks in the world, which is a wild milestone.

Can Deep Research do 1% of all economically valuable tasks in the world?

With proper usage, I think the answer is yes. But I also would have said the same thing about o1-pro, or Claude Sonnet 3.5, once you give them a little scaffolding.

Poll respondents disagreed, saying it could do between 0.1% and 1% of tasks.

We have Operator. We have two versions of Deep Research. What’s next?

Stephen McAleer (OpenAI): Deep research is emblematic of a new wave of products that will be created by doing high-compute RL on real-world tasks.

If you can Do Reinforcement Learning To It, and it’s valuable, they’ll build it. The question is, what products might be coming soon here?

o1’s suggestions were legal analysis, high-frequency trading, medical diagnosis, supply chain coordination, warehouse robotics, personalized tutoring, customer support, traffic management, code generation and drug discovery.

That’s a solid list. The dream is general robotics but that’s rather a bit trickier. Code generation is the other dream, and that’s definitely going to step up its game quickly.

The main barrier seems to be asking what people actually want.

I’d like to see a much more precise version of DR next. Instead of giving me a giant report, give me something focused. But probably I should be thinking bigger.

Should you pay the $200?

For that price, you now get:

  1. o1-pro.

  2. Unlimited o3-mini-high and o1.

  3. Operator.

  4. 100 queries per month on Deep Research.

When it was only o1-pro, I thought those using it for coding or other specialized tasks where it excels should clearly pay, but it wasn’t clear others should pay.

Now that the package has expanded, I agree with Sam Altman that the value proposition is much improved, and o3 and o3-pro will enhance it further soon.

I notice I haven’t pulled the trigger yet. I know it’s a mistake that I haven’t found ways to want to do this more as part of my process. Just one more day, maybe two, to clear the backlog, I just need to clear the backlog. They can’t keep releasing products like this.

Right?

Depends what counts? As long as it doesn’t need to be cutting edge we should be fine.

Andrew Critch: I hope universal basic income turns out to be enough to pay for a Deep Research subscription.

A story I find myself in often:

Miles Brundage: Man goes to Deep Research, asks for help with the literature on trustworthy AI development.

Deep Research says, “You are in luck. There is relevant paper by Brundage et al.”

Man: “But Deep Research…”

Get your value!

Discussion about this post

We’re in Deep Research Read More »

the-severance-writer-and-cast-on-corporate-cults,-sci-fi,-and-more

The Severance writer and cast on corporate cults, sci-fi, and more

The following story contains light spoilers for season one of Severence but none for season 2.

The first season of Severance walked the line between science-fiction thriller and Office Space-like satire, using a clever conceit (characters can’t remember what happens at work while at home, and vice versa) to open up new storytelling possibilities.

It hinted at additional depths, but it’s really season 2’s expanded worldbuilding that begins to uncover additional themes and ideas.

After watching the first six episodes of season two and speaking with the series’ showrunner and lead writer, Dan Erickson, as well as a couple of members of the cast (Adam Scott and Patricia Arquette), I see a show that’s about more than critiquing corporate life. It’s about all sorts of social mechanisms of control. It’s also a show with a tremendous sense of style and deep influences in science fiction.

Corporation or cult?

When I started watching season 2, I had just finished watching two documentaries about cults—The Vow, about a multi-level marketing and training company that turned out to be a sex cult, and Love Has Won: The Cult of Mother God, about a small, Internet-based religious movement that believed its founder was the latest human form of God.

There were hints of cult influences in the Lumon corporate structure in season 1, but without spoiling anything, season 2 goes much deeper into them. As someone who has worked at a couple of very large media corporations, I enjoyed Severance’s send-up of corporate culture. And as someone who has worked in tech startups—both good and dysfunctional ones—and who grew up in a radical religious environment, I now enjoy its send-up of cult social dynamics and power plays.

Employees watch a corporate propaganda video

Lumon controls what information is presented to its employees to keep them in line. Credit: Apple

When I spoke with showrunner Dan Erickson and actor Patricia Arquette, I wasn’t surprised to learn that it wasn’t just me—the influence of stories about cults on season 2 was intentional.

Erickson explained:

I watched all the cult documentaries that I could find, as did the other writers, as did Ben, as did the actors. What we found as we were developing it is that there’s this weird crossover. There’s this weird gray zone between a cult and a company, or any system of power, especially one where there is sort of a charismatic personality at the top of it like Kier Eagan. You see that in companies that have sort of a reverence for their founder.

Arquette also did some research on cults. “Very early on when I got the pilot, I was pretty fascinated at that time with a lot of cult documentaries—Wild Wild Country, and I don’t know if you could call it a cult, but watching things about Scientology, but also different military schools—all kinds of things like that with that kind of structure, even certain religions,” she recalled.

The Severance writer and cast on corporate cults, sci-fi, and more Read More »

dell-risks-employee-retention-by-forcing-all-teams-back-into-offices-full-time

Dell risks employee retention by forcing all teams back into offices full-time

In a statement to Ars, Dell’s PR team said:

“We continually evolve our business so we’re set up to deliver the best innovation, value, and service to our customers and partners. That includes more in-person connections to drive market leadership.”

The road to full RTO

After Dell allowed employees to work from home two days per week, Dell’s sales team in March became the first department to order employees back into offices full-time. At the time, Dell said it had data showing that salespeople are more productive on site. Dell corporate strategy SVP Vivek Mohindra said last month that sales’ RTO brought “huge benefits” in “learning from each other, training, and mentorship.”

The company’s “manufacturing teams, engineers in the labs, onsite team members, and leaders” had also previously been called into offices full-time, Business Insider reported today.

Since February, Dell has been among the organizations pushing for more in-person work since pandemic restrictions lifted, with reported efforts including VPN and badge tracking.

Risking personnel

Like other organizations, Dell risks losing employees by implementing a divisive mandate. For Dell specifically, internal tracking data reportedly found that nearly half of workers already opted for remote work over being eligible for promotions or new roles, according to a September Business Insider report.

Research has suggested that companies that issue RTO mandates subsequently lose some of their best talent. A November research paper (PDF) from the University of Pittsburgh, Baylor University, The Chinese University of Hong Kong, and Cheung Kong Graduate School of Business researchers that cited LinkedIn data found this particularly true for “high-tech” and financial firms. The researchers concluded that average turnover rates increased by 14 percent on average after companies issued RTO policies. This research, in addition to other studies, has also found that companies with in-office work mandates are at risk of losing senior-level employees especially.

Some analysts don’t believe Dell is in danger of a mass exodus, though. Bob O’Donnell, president and chief analyst at Technalysis Research, told Business Insider in December, “It’s not like I think Dell’s going to lose a whole bunch of people to HP or Lenovo.”

Patrick Moorhead, CEO and chief analyst at Moor Insights & Strategy, said he believes RTO would be particularly beneficial to Dell’s product development.

Still, some workers have accused Dell of using RTO policies to try to reduce headcount. There’s no proof of this, but broader research, including commentary from various company executives outside of Dell, has shown that some companies have used RTO policies to try to get people to quit.

Dell declined to comment about potential employee blowback to Ars Technica.

Dell risks employee retention by forcing all teams back into offices full-time Read More »

apple-chips-can-be-hacked-to-leak-secrets-from-gmail,-icloud,-and-more

Apple chips can be hacked to leak secrets from Gmail, iCloud, and more


MEET FLOP AND ITS CLOSE RELATIVE, SLAP

Side channel gives unauthenticated remote attackers access they should never have.

Apple is introducing three M3 performance tiers at the same time. Credit: Apple

Apple-designed chips powering Macs, iPhones, and iPads contain two newly discovered vulnerabilities that leak credit card information, locations, and other sensitive data from the Chrome and Safari browsers as they visit sites such as iCloud Calendar, Google Maps, and Proton Mail.

The vulnerabilities, affecting the CPUs in later generations of Apple A- and M-series chip sets, open them to side channel attacks, a class of exploit that infers secrets by measuring manifestations such as timing, sound, and power consumption. Both side channels are the result of the chips’ use of speculative execution, a performance optimization that improves speed by predicting the control flow the CPUs should take and following that path, rather than the instruction order in the program.

A new direction

The Apple silicon affected takes speculative execution in new directions. Besides predicting control flow CPUs should take, it also predicts the data flow, such as which memory address to load from and what value will be returned from memory.

The most powerful of the two side-channel attacks is named FLOP. It exploits a form of speculative execution implemented in the chips’ load value predictor (LVP), which predicts the contents of memory when they’re not immediately available. By inducing the LVP to forward values from malformed data, an attacker can read memory contents that would normally be off-limits. The attack can be leveraged to steal a target’s location history from Google Maps, inbox content from Proton Mail, and events stored in iCloud Calendar.

SLAP, meanwhile, abuses the load address predictor (LAP). Whereas LVP predicts the values of memory content, LAP predicts the memory locations where instruction data can be accessed. SLAP forces the LAP to predict the wrong memory addresses. Specifically, the value at an older load instruction’s predicted address is forwarded to younger arbitrary instructions. When Safari has one tab open on a targeted website such as Gmail, and another open tab on an attacker site, the latter can access sensitive strings of JavaScript code of the former, making it possible to read email contents.

“There are hardware and software measures to ensure that two open webpages are isolated from each other, preventing one of them from (maliciously) reading the other’s contents,” the researchers wrote on an informational site describing the attacks and hosting the academic papers for each one. “SLAP and FLOP break these protections, allowing attacker pages to read sensitive login-protected data from target webpages. In our work, we show that this data ranges from location history to credit card information.”

There are two reasons FLOP is more powerful than SLAP. The first is that it can read any memory address in the browser process’s address space. Second, it works against both Safari and Chrome. SLAP, by contrast, is limited to reading strings belonging to another webpage that are allocated adjacently to the attacker’s own strings. Further, it works only against Safari. The following Apple devices are affected by one or both of the attacks:

• All Mac laptops from 2022–present (MacBook Air, MacBook Pro)

• All Mac desktops from 2023–present (Mac Mini, iMac, Mac Studio, Mac Pro)

• All iPad Pro, Air, and Mini models from September 2021–present (Pro 6th and 7th generation, Air 6th gen., Mini 6th gen.)

• All iPhones from September 2021–present (All 13, 14, 15, and 16 models, SE 3rd gen.)

Attacking LVP with FLOP

After reverse-engineering the LVP, which was introduced in the M3 and A17 generations, the researchers found that it behaved unexpectedly. When it sees the same data value being repeatedly returned from memory for the same load instruction, it will try to predict the load’s outcome the next time the instruction is executed, “even if the memory accessed by the load now contains a completely different value!” the researchers explained. “Therefore, using the LVP, we can trick the CPU into computing on incorrect data values.” They continued:

“If the LVP guesses wrong, the CPU can perform arbitrary computations on incorrect data under speculative execution. This can cause critical checks in program logic for memory safety to be bypassed, opening attack surfaces for leaking secrets stored in memory. We demonstrate the LVP’s dangers by orchestrating these attacks on both the Safari and Chrome web browsers in the form of arbitrary memory read primitives, recovering location history, calendar events, and credit card information.”

FLOP requires a target to be logged in to a site such as Gmail or iCloud in one tab and the attacker site in another for a duration of five to 10 minutes. When the target uses Safari, FLOP sends the browser “training data” in the form of JavaScript to determine the computations needed. With those computations in hand, the attacker can then run code reserved for one data structure on another data structure. The result is a means to read chosen 64-bit addresses.

When a target moves the mouse pointer anywhere on the attacker webpage, FLOP opens the URL of the target page address in the same space allocated for the attacker site. To ensure that the data from the target site contains specific secrets of value to the attacker, FLOP relies on behavior in Apple’s WebKit browser engine that expands its heap at certain addresses and aligns memory addresses of data structures to multiples of 16 bytes. Overall, this reduces the entropy enough to brute-force guess 16-bit search spaces.

Illustration of FLOP attack recovering data from Google Maps Timeline (Top), a Proton Mail inbox (Middle), and iCloud Calendar (Bottom). Credit: Kim et al.

When a target browses with Chrome, FLOP targets internal data structures the browser uses to call WebAssembly functions. These structures first must vet the signature of each function. FLOP abuses the LVP in a way that allows the attacker to run functions with the wrong argument—for instance, a memory pointer rather than an integer. The end result is a mechanism for reading chosen memory addresses.

To enforce site isolation, Chrome allows two or more webpages to share address space only if their extended top-level domain and the prefix before this extension (for instance, www.square.com) are identical. This restriction prevents one Chrome process from rendering URLs with attacker.square.com and target.square.com, or as attacker.org and target.org. Chrome further restricts roughly 15,000 domains included in the public suffix list from sharing address space.

To bypass these rules, FLOP must meet three conditions:

  1. It cannot target any domain specified in the list such that attacker.site.tld can share an address space with target.site.tld
  2. The webpage must allow users to host their own JavaScript and WebAssembly on the attacker.site.tld,
  3. The target.site.tld must render secrets

Here, the researchers show how such an attack can steal credit card information stored on a user-created Square storefront such as storename.square.site. The attackers host malicious code on their own account located at attacker.square.site. When both are open, attacker.square.site inserts malicious JavaScript and WebAssembly into it. The researchers explained:

“This allows the attacker storefront to be co-rendered in Chrome with other store-front domains by calling window.open with their URLs, as demonstrated by prior work. One such domain is the customer accounts page, which shows the target user’s saved credit card information and address if they are authenticated into the target storefront. As such, we recover the page’s data.”

Left: UI elements from Square’s customer account page for a storefront. Right: Recovered last four credit card number digits, expiration date, and billing address via FLOP-Control. Credit: Kim et al.

SLAPping LAP silly

SLAP abuses the LAP feature found in newer Apple silicon to perform a similar data-theft attack. By forcing LAP to predict the wrong memory address, SLAP can perform attacker-chosen computations on data stored in separate Safari processes. The researchers demonstrate how an unprivileged remote attacker can then recover secrets stored in Gmail, Amazon, and Reddit when the target is authenticated.

Top: Email subject and sender name shown as part of Gmail’s browser DOM. Bottom: Recovered strings from this page. Credit: Kim et al.

Top Left: A listing for coffee pods from Amazon’s ‘Buy Again’ page. Bottom Left: Recovered item name from Amazon. Top Right: A comment on a Reddit post. Bottom Right: the recovered text. Credit: Kim et al.

“The LAP can issue loads to addresses that have never been accessed architecturally and transiently forward the values to younger instructions in an unprecedentedly large window,” the researchers wrote. “We demonstrate that, despite their benefits to performance, LAPs open new attack surfaces that are exploitable in the real world by an adversary. That is, they allow broad out-of-bounds reads, disrupt control flow under speculation, disclose the ASLR slide, and even compromise the security of Safari.”

SLAP affects Apple CPUs starting with the M2/A15, which were the first to feature LAP. The researchers said that they suspect chips from other manufacturers also use LVP and LAP and may be vulnerable to similar attacks. They also said they don’t know if browsers such as Firefox are affected because they weren’t tested in the research.

An academic report for FLOP is scheduled to appear at the 2025 USENIX Security Symposium. The SLAP research will be presented at the 2025 IEEE Symposium on Security and Privacy. The researchers behind both papers are:

• Jason Kim, Georgia Institute of Technology

• Jalen Chuang, Georgia Institute of Technology

• Daniel Genkin, Georgia Institute of Technology

• Yuval Yarom, Ruhr University Bochum

The researchers published a list of mitigations they believe will address the vulnerabilities allowing both the FLOP and SLAP attacks. They said that Apple officials have indicated privately to them that they plan to release patches.

In an email, an Apple representative declined to say if any such plans exist. “We want to thank the researchers for their collaboration as this proof of concept advances our understanding of these types of threats,” the spokesperson wrote. “Based on our analysis, we do not believe this issue poses an immediate risk to our users.”

Photo of Dan Goodin

Dan Goodin is Senior Security Editor at Ars Technica, where he oversees coverage of malware, computer espionage, botnets, hardware hacking, encryption, and passwords. In his spare time, he enjoys gardening, cooking, and following the independent music scene. Dan is based in San Francisco. Follow him at here on Mastodon and here on Bluesky. Contact him on Signal at DanArs.82.

Apple chips can be hacked to leak secrets from Gmail, iCloud, and more Read More »

pebble’s-founder-wants-to-relaunch-the-e-paper-smartwatch-for-its-fans

Pebble’s founder wants to relaunch the e-paper smartwatch for its fans

With that code, Migicovsky can address the second reason for a new Pebble—nothing has really replaced the original. On his blog, Migicovsky defines the core of Pebble’s appeal: always-on screen; long battery life; a “simple and beautiful user experience” focused on useful essentials; physical buttons; and “Hackable,” including custom watchfaces.

Migicovsky writes that a small team is tackling the hardware aspect, making a watch that runs PebbleOS and “basically has the same specs and features as Pebble” but with “fun new stuff as well.” Crucially, they’re taking a different path than the original Pebble company:

“This time round, we’re keeping things simple. Lessons were learned last time! I’m building a small, narrowly focused company to make these watches. I don’t envision raising money from investors, or hiring a big team. The emphasis is on sustainability. I want to keep making cool gadgets and keep Pebble going long into the future.”

Still not an Apple Watch, by design

Pebble watch showing a text watchface (reading 12:27 p.m.), with greenh silicone band and prominent side button.

The Pebble 2 HR, the last Pebble widely shipped.

Credit: Valentina Palladino

The Pebble 2 HR, the last Pebble widely shipped. Credit: Valentina Palladino

Ars asked Migicovsky by email if modern-day Pebbles would have better interoperability with Apple’s iPhones than the original models. “No, even less now!” Migicovsky replied, pointing to the Department of Justice’s lawsuit against Apple in 2024. That lawsuit claims that Apple “limited the functionality of third-party smartwatches” to keep people using Apple Watches and then, as a result, less likely to switch away from iPhones.

Apple has limited the functionality of third-party smartwatches so that users who purchase the Apple Watch face substantial out-of-pocket costs if they do not keep buying iPhones. The core functionality Migicovsky detailed, he wrote, was still possible on iOS. Certain advanced features, like replying to notifications with voice dictation, may be limited to Android phones.

Migicovsky’s site and blog do not set a timeline for new hardware. His last major project, the multi-protocol chat app Beeper, was sold to WordPress.com owner Automattic in April 2024, following a protracted battle with Apple over access to its iMessage protocol.

Pebble’s founder wants to relaunch the e-paper smartwatch for its fans Read More »

deepseek-panic-at-the-app-store

DeepSeek Panic at the App Store

DeepSeek released v3. Market didn’t react.

DeepSeek released r1. Market didn’t react.

DeepSeek released a fing app of its website. Market said I have an idea, let’s panic.

Nvidia was down 11%, Nasdaq is down 2.5%, S&P is down 1.7%, on the news.

Shakeel: The fact this is happening today, and didn’t happen when r1 actually released last Wednesday, is a neat demonstration of how the market is in fact not efficient at all.

That is exactly the market’s level of situational awareness. No more, no less.

I traded accordingly. But of course nothing here is ever investment advice.

Given all that has happened, it seems worthwhile to go over all the DeepSeek news that has happened since Thursday. Yes, since Thursday.

For previous events, see my top level post here, and additional notes on Thursday.

To avoid confusion: r1 is clearly a pretty great model. It is the best by far available at its price point, and by far the best open model of any kind. I am currently using it for a large percentage of my AI queries.

  1. Current Mood.

  2. DeepSeek Tops the Charts.

  3. Why Is DeepSeek Topping the Charts?.

  4. What Is the DeepSeek Business Model?.

  5. The Lines on Graphs Case for Panic.

  6. Everyone Calm Down About That $5.5 Million Number.

  7. Is The Whale Lying?.

  8. Capex Spending on Compute Will Continue to Go Up.

  9. Jevon’s Paradox Strikes Again.

  10. Okay, Maybe Meta Should Panic.

  11. Are You Short the Market.

  12. o1 Versus r1.

  13. Additional Notes on v3 and r1.

  14. Janus-Pro-7B Sure Why Not.

  15. Man in the Arena.

  16. Training r1, and Training With r1.

  17. Also Perhaps We Should Worry About AI Killing Everyone.

  18. And We Should Worry About Crazy Reactions To All This, Too.

  19. The Lighter Side.

Joe Weisenthal: Call me a nationalist or whatever. But I hope that the AI that turns me into a paperclip is American made.

Peter Wildeford: Seeing everyone lose their minds about Deepseek does not reassure me that we will handle AI progress well.

Miles Brundage: I need the serenity to accept the bad DeepSeek takes I cannot change.

[Here is his One Correct Take, I largely but not entirely agree with it, my biggest disagreement is I am worried about an overly jingoist reaction and not only about us foolishly abandoning export controls].

Satya Nadella (CEO Microsoft): Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.

Danielle Fong: everyone today: if you’re in “we’re so back” pivot to “it’s over”

Danielle Fong, a few hours later: if you’re in “it’s so over” pivot to “jevons paradox”

Kai-Fu Lee: In my book AI Superpowers, I predicted that US will lead breakthroughs, but China will be better and faster in engineering. Many people simplified that to be “China will beat US.” And many claimed I was wrong with GenAI. With the recent DeepSeek releases, I feel vindicated.

Dean Ball: Being an AI policy professional this week has felt like playing competitive Starcraft.

Lots of people are rushing to download the DeepSeek app.

Some of us started using r1 before the app. Joe Weisenthal noted he had ‘become a DeepSeek bro’ and that this happened overnight, switching costs are basically zero. They’re not as zero as they might look, and I expect the lockin with Operator from OpenAI to start mattering soon, but for most purposes yeah, you can just switch, and DeepSeek is free for conversational use including r1.

Switching costs are even closer to zero if, like most people, you weren’t a serious user of LLMs yet.

Then regular people started to notice DeepSeek.

This is what it looked like before the app shot to #1, when it merely cracked the top 10:

Ken: It’s insane the extent to which the DeepSeek News has broken “the containment zone.” I saw a Brooklyn-based Netflix comedian post about “how embarrassing it was that the colonial devils spent $10 billion, while all they needed was GRPO.”

llm news has arrived as a key political touchstone. will only heighten from here.

Olivia Moore: DeepSeek’s mobile app has entered the top 10 of the U.S. App Store.

It’s getting ~300k global daily downloads.

This may be the first non-GPT based assistant to get mainstream U.S. usage. Claude has not cracked the top 200.

This may be the first non-GPT based assistant to get mainstream U.S. usage.

The app was released on Jan. 11, and is linked on DeepSeek’s website (so does appear to be affiliated).

Per reviews, users are missing some ChatGPT features like voice mode…but basically see it as a free version of OpenAI’s premium models.

Google Gemini also cracked the top 10, in its first week after release (but with a big distribution advantage!)

Will be interesting to see how high DeepSeek climbs, and how long it stays up there 🤔

Claude had ~300k downloads last month, but that’s a lot less than 300k per day.

Metaschool: Google Trends: DeepSeek vs. Claude

Kalomaze: Holy s, it’s in the top 10?

Then it went all the way to #1 on the iPhone app store.

Kevin Xu: Two weeks ago, RedNote topped the download chart

Today, it’s DeepSeek

We are still in January

If constraint is the mother of invention, then collective ignorance is the mother of many downloads

Here’s his flashback to the chart when RedNote was briefly #1, note how fickle the top listings can be, Lemon8, Flip and Clapper were there too:

The ‘collective ignorance’ here is that news about DeepSeek and the app is only arriving now. That leads to a lot of downloads.

I have a Pixel 9, so I checked the Android app store. They have Temu at #1 (also Chinese!) followed by Scoopz which I literally have never heard of, then Instagram, T-Life (seriously what?), ReelShort, WhatsApp Messenger, ChatGPT (interesting that Android users are less AI pilled in general), Easy Homescreen (huh), TurboTax (oh no), Snapchat and then DeepSeek at #11. So if they’ve ‘saturated the benchmark’ on iPhone, this one is next, I suppose.

It seems DeepSeek got so many downloads they had to hit the breaks, similar to how OpenAI and Anthropic have had to do this in the past.

Joe Weisenthal: *DEEPSEEK: RESTRICTS REGISTRATION TO CHINA MOBILE PHONE NUMBERS

Because:

  1. It’s completely free.

  2. It has no ads.

  3. It’s a damn good model, sir.

  4. It lets you see the chain of thought which is a lot more interesting and fun and also inspires trust.

  5. All the panic about it only helped people notice, getting it on the news and so on.

  6. It’s the New Hotness that people hadn’t downloaded before, and that everyone is talking about right now because see the first five.

  7. No, this mostly isn’t about ‘people don’t trust American tech companies but they do trust the Chinese.’ But there aren’t zero people who are wrong enough to think this way, and China actively attempts to cultivate this including through TikTok.

  8. The Open Source people are also yelling about how this is so awesome and trustworthy and virtuous and so on, and being even more obnoxious than usual, which may or may not be making any meaningful difference.

I suspect we shouldn’t be underestimating the value of showing the CoT here, as I also discuss elsewhere in the post.

Garry Tan: DeepSeek search feels more sticky even after a few queries because seeing the reasoning (even how earnest it is about what it knows and what it might not know) increases user trust by quite a lot

Nabeel Qureshi: I wouldn’t be surprised if OpenAI starts showing CoTs too; it’s a much better user experience to see what the machine is thinking, and the rationale for keeping them secret feels weaker now that the cat’s out of the bag anyway.

It’s just way more satisfying to watch this happen.

It’s practically useful too: if the model’s going off in wrong directions or misinterpreting the request, you can tell sooner and rewrite the prompt.

That doesn’t mean it is ‘worth’ sharing the CoT, even if it adds a lot of value – it also reveals a lot of valuable information, including as part of training another model. So the answer isn’t obvious.

What’s their motivation?

Meta is pursuing open weights primarily because they believe it maximizes shareholder value. DeepSeek seems to be doing it primarily for other reasons.

Corey Gwin: There’s gotta be a catch… What did China do or hide in it? Will someone release a non-censored training set?

Amjad Masad: What’s Meta’s catch with Llama? Probably have similar incentives.

Anton: How is Deepseek going to make money?

If they just release their top model weights, why use their API?

Mistral did this and look where they are now (research licenses only and private models)

Han Xiao: deepseek’s holding 幻方量化 is a quant company, many years already,super smart guys with top math background; happened to own a lot GPU for trading/mining purpose, and deepseek is their side project for squeezing those gpus.

It’s an odd thing to do as a hedge fund, to create something immensely valuable and give it away for essentially ideological reasons. But that seems to be happening.

Several possibilities. The most obvious ones are, in some combination:

  1. They don’t need a business model. They’re idealists looking to give everyone AGI.

  2. They’ll pivot to the standard business model same as everyone else.

  3. They’re in it for the prestige, they’ll recruit great engineers and traders and everyone will want to invest capital.

  4. Get people to use v3 and r1, collect the data on what they’re saying and asking, use that information as the hedge fund to trade. Being open means they miss out on some of the traffic but a lot of it will still go to the source anyway if they make it free, or simply because it’s easier.

  5. (They’re doing this because China wants them to, or they’re patriots, perhaps.)

  6. Or just: We’ll figure out something.

For now, they are emphasizing motivation #1. From where I sit, there is very broad uncertainty about which of these dominate, or will dominate in the future no matter what they believe about themselves today.

Also, there are those who do not approve of motivation #1, and the CCP seems plausibly on that list. Thus, Tyler Cowen asks a very good question that is surprisingly rarely asked right now.

Tyler Cowen: DeepSeek okie-dokie: “All I know is we keep pushing forward to make open-source AGI a reality for everyone.” I believe them, the question is what counter-move the CCP will make now.

I also believe they intend to build and open source AGI.

The CCP is doubtless all for DeepSeek having a hit app. And they’ve been happy to support open source in places where open source doesn’t pose existential risks, because the upsides of doing that are very real.

That’s very different from an intent to open source AGI. China’s strategy on AI regulation so far has focused on content moderation for topics they care about. That approach won’t stay compatible with their objectives over time.

For that future intention to open source AGI, the question is not ‘how move will the CCP make to help them do this and get them funding and chips?’

The question now becomes: “What countermove will the CCP make now?”

The CCP wants to stay in control. What DeepSeek is doing is incompatible with that. If they are not simply asleep at the wheel, they understand this. Yes, it’s great for prestige, and they’re thrilled that if this model exists it came from China, but they will surely notice how if you run it on your own it’s impossible to control and fully uncensored out of the box and so on.

Might want to Pick Up the Phone. Also might not need to.

Yishan takes the opposite perspective, that newcomers like DeepSeek who come out with killer products like this are on steep upward trajectories and their next product will shock you with how good it is, seeing it as similar to Internet Explorer 3 or Firefox, or iPhone 1 or early Facebook or Google Docs or GPT-3 or early SpaceX and so on.

I think the example list here illustrates why I think DeepSeek probably (but not definitely) doesn’t belong on that list. Yishan notes that the incumbents here are dynamic and investing hard, which wasn’t true in most of the other examples. And many of them involve conceptually innovative approaches to go with the stuck incumbents. Again, that’s not the case here.

I mean, I fully expect there to be a v4 and r2 some time in 2025, and for those to blow out of the water v3 and r1 and probably the other models that are released right now. Sure. But I also expect OpenAI and Anthropic and Google to blow the current class of stuff out of the water by year’s end. Indeed, OpenAI is set to do this in about a week or two with o3-mini and then o3 and o3-pro.

Most of all, to those who are saying that ‘China has won’ or ‘China is in the lead now,’ or other similar things, seriously, calm the down.

Yishan: They are already working on the next thing. China may reach AGI first, which is a bogeyman for the West, except that the practical effect will probably just be that living in China starts getting really nice.

America, it ain’t the Chinese girl spies here you gotta worry about, you need to be flipping the game and sending pretty white girls over there to seduce their engineers and steal their secrets, stat.

If you’re serious about the steal the engineering secrets plan, of course, you’d want to send over a pretty white girl… with a green card with the engineer’s name on it. And the pretty, white and girl parts are then all optional. But no, China isn’t suddenly the one with the engineering secrets.

I worry about this because I worry about a jingoist ‘we must beat China and we are behind’ reaction causing the government to do some crazy ass stuff that makes us all much more likely to get ourselves killed, above and beyond what has already happened. There’s a lot of very strong Missile Gap vibes here.

And I wrote that sentence before DeepSeek went to #1 on the app store and there was a $1 trillion market panic. Oh no.

So, first off, let’s all calm down about that $5.5 million training number.

Dean Ball offers notes on DeepSeek and r1 in the hopes of calming people down. Because we have such different policy positions yet see this situation so similarly, I’m going to quote him in full, and then note the places I disagree. Especially notes #2, #5 and #4 here, yes all those claims he is pointing out are Obvious Nonsense are indeed Obvious Nonsense:

Dean Ball: The amount of factually incorrect information and hyperventilating takes on deepseek on this website is truly astounding. I assumed that an object-level analysis was unnecessary but apparently I was wrong. Here you go:

  1. DeepSeek is an extremely talented team and has been producing some of the most interesting public papers in ML for a year. I first wrote about them in May 2024, though was tracking them earlier. They did not “come out of nowhere,” at all.

  2. v3 and r1 are impressive models. v3 did not, however, “cost $5m.” That reported figure is almost surely their *marginalcost. It does not include the fixed cost of building a cluster (and deepseek builds their own, from what I understand), nor does it include the cost of having a staff.

  3. Part of the reason DeepSeek looks so impressive (apart from just being impressive!) is that they are among the only truly cracked teams releasing detailed frontier AI research. This is a soft power loss on America’s part, and is directly downstream of the culture of secrecy that we foster in a thousand implicit and explicit ways, including by ceaselessly analogizing AI to nuclear weapons. Maybe you believe that’s a good culture to have! Perhaps secrecy is in fact the correct long term strategy. But it is the obvious and inevitable tradeoff of such a culture; I and many others have been arguing this for a long time.

  4. Deepseek’s r1 is not an indicator that export controls are failing (again, I say this as a skeptic of the export controls!), nor is it an indicator that “compute doesn’t matter,” nor does it mean “America’s lead is over.”

  5. Lots of people’s hyperbolic commentary on this topic, in all different directions, is driven by their broader policy agenda rather than a desire to illuminate reality. Caveat emptor.

  6. With that said, DeepSeek does mean that open source AI is going to be an important part of AI dynamics and competition for at least the foreseeable future, and probably forever.

  7. r1 especially should not be a surprise (if anything, v3 is in fact the bigger surprise, though it too is not so big of a surprise). The reasoning approach is an algorithm—lines of code! There is no moat in such things. Obviously it was going to be replicated quickly. I personally made bets that a Chinese replication would occur within 3 months of o1’s release.

  8. Competition is going to be fierce, and complacency is our enemy. So is getting regulation wrong. We need to reverse course rapidly from the torrent of state-based regulation that is coming that will be *awfulfor AI. A simple federal law can preempt all of the most damaging stuff, and this is a national security and economic competitiveness priority. The second best option is to find a state law that can serve as a light touch national standard and see to it that it becomes a nationwide standard. Both are exceptionally difficult paths to walk. Unfortunately it’s where we are.

I fully agree with #1 through #6.

For #3 I would say it is downstream of our insane immigration policies! If we let their best and brightest come here, then DeepSeek wouldn’t have been so cracked. And I would say strongly that, while their release of the model and paper is a ‘soft power’ reputational win, I don’t think that was worth the information they gave up, and in purely strategic terms they made a rather serious mistake.

I can verify the bet in #7 was very on point, I wasn’t on either side of the wager but was in the (virtual) Room Where It Happened. Definite Bayes points to Dean for that wager. I agree that ‘reasoning model at all, in time’ was inevitable. But I don’t think you should have expected r1 to come out this fast and be this good, given what we knew at the time of o1’s release, and certainly it shouldn’t have been obvious, and I think ‘there are no moats’ is too strong.

For #8 we of course have our differences on regulation, but we do agree on a lot of this. Dean doubtless would count a lot more things as ‘awful state laws’ than I would, but we agree that the proposed Texas law would count. At this point, given what we’ve seen from the Trump administration, I think our best bet is the state law path. As for pre-emption, OpenAI is actively trying to get an all-encompassing version of that in exchange for essentially nothing at all, and win an entirely free hand, as I’ve previously noted. We can’t let that happen.

Seriously, though, do not over index on the $5.5 million in compute number.

Kevin Roose: It’s sort of funny that every American tech company is bragging about how much money they’re spending to build their models, and DeepSeek is just like “yeah we got there with $47 and a refurbished Chromebook”

Nabeel Qureshi: Everyone is way overindexing on the $5.5m final training run number from DeepSeek.

– GPU capex probably $1BN+

– Running costs are probably $X00M+/year

– ~150 top-tier authors on the v3 technical paper, $50m+/year

They’re not some ragtag outfit, this was a huge operation.

Nathan Lambert has a good run-down of the actual costs here.

I have no idea if the “we’re just a hedge fund with a lot of GPUs lying around” thing is really the whole story or not but with a budget of _that_ size, you have to wonder…

They themselves sort of point this out, but there’s a bunch of broader costs too.

The Thielian point here is that the best salespeople often don’t look like salespeople.

There’s clearly an angle here with the whole “we’re way more efficient than you guys”, all described in the driest technical language….

Nathan Lambert: These costs are not necessarily all borne directly by DeepSeek, i.e. they could be working with a cloud provider, but their cost on compute alone (before anything like electricity) is at least $100M’s per year.

For one example, consider comparing how the DeepSeek V3 paper has 139 technical authors. This is a very large technical team.With headcount costs that can also easily be over $10M per year, estimating the cost of a year of operations for DeepSeek AI would be closer to $500M (or even $1B+) than any of the $5.5M numbers tossed around for this model. The success here is that they’re relevant among American technology companies spending what is approaching or surpassing $10B per year on AI models.

Richard Song: Every AI company after DeepSeek be like:

Danielle Fong: when tesla claimed that they were going to have batteries < $100 / kWh, practically all funding for american energy storage companies tanked.

tesla still won’t sell you a powerwall or powerpack for $100/kWh. it’s like $1000/kWh and $500 for a megapack.

the entire VC sector in the US was bluffed and spooked by Elon. don’t be stupid in this way again.

What I’m saying here is that VCs need to invest in technology learning curves. things get better over time. but if you’re going to compare what your little startup can get out as an MVP in its first X years, and are comparing THAT projecting forward to what a refined tech can do in a decade, you’re going to scare yourself out of making any investments. you need to find a niche you can get out and grow in, and then expand successively as you come down the learning curve.

the AI labs that are trashing their own teams and going with deepseek are doing the equivalent today. don’t get bluffed. build yourself.

Is it impressive that they (presumably) did the final training run with only $5.5M in direct compute costs? Absolutely. Is it impressive that they’re relevant while plausibly spending only hundreds of millions per year total instead of tens of billions? Damn straight. They’re cracked and they cooked.

They didn’t do it with $47 on a Chromebook, and this doesn’t mean that export controls are useless because everyone can buy a Chromebook.

The above is assuming (as I do still assume) that Alexandr Wang was wrong when he went on CNBC and claimed DeepSeek has about 50,000 H100s, which is quite the claim to make without evidence. Elon Musk replied to this claim with ‘obviously.’

Samuel Hammond also is claiming that DeepSeek trained on H100s, and while my current belief is that they didn’t, I trust that he would not say it if he didn’t believe it.

Neal Khosla went so far as to claim (again without evidence) that ‘deepseek is a ccp psyop + economic warfare to make American AI unprofitable.’ This seems false.

The following all seem clearly true:

  1. A lot of this is based on misunderstanding the ‘$5.5 million’ number.

  2. People have strong motive to engage in baseless cope around DeepSeek.

  3. DeepSeek had strong motive to lie about its training costs and methods.

So how likely is it The Whale Is Lying?

Armen Aghajanyan: There is an unprecedented level of cope around DeepSeek, and very little signal on X around R1. I recommend unfollowing anyone spreading conspiracy theories around R1/DeepSeek in general.

Teortaxes: btw people with major platforms who spread the 50K H100s conspiracy theory are underestimating the long-term reputation cost in technically literate circles. They will *notbe able to solidify this nonsense into consensus reality. Instead, they’ll be recognized as frauds.

The current go-to best estimate for DeepSeek V3’s (and accordingly R1-base’s) pretraining compute/cost, complete with accounting for overhead introduced by their architecture choices and optimizations to mitigate that.

TL;DR: ofc it checks out, Whale Will Never Lie To Us

GFodor: I shudder at the thought I’ve ever posted anything as stupid as these theories, given the logical consequence it would demand of the reader

Amjad Masad: So much cope about DeepSeek.

Not only did they release a great model. they also released a breakthrough training method (R1 Zero) that’s already reproducing.

I doubt they lied about training costs, but even if they did they’re still awesome for this great gift to the world.

This is an uncharacteristically naive take from Teortaxes on two fronts.

  1. Saying an AI company would never lie to us, Chinese or otherwise, someone please queue the laugh track.

  2. Making even provably and very clearly false claims about AI does not get you recognized as a fraud in any meaningful way. That would be nice, but no.

To be clear, my position is close to Masad’s: Unless and until I see more convincing evidence I will continue to believe that yes, they did do the training run itself with the H800s for only $5.5 million, although the full actual cost was orders of magnitude more than that. Which, again, is damn impressive, and would be damn impressive even if they were fudging the costs quite a bit beyond that.

Whereas here I think he’s wrong is in their motivation. While Meta is doing this primarily because they believe it maximizes shareholder value, DeepSeek seems to be doing it primarily for other reasons, as noted in the section asking about their business model.

Either way, they are very importantly being constrained by access to compute, even if they’ve smuggled in a bunch of chips they can’t talk about. As Tim Fist points out, the export controls are tightened, so they’ll have more trouble accessing the next generations than they are having now, and no this did not stop being relevant, and they risk falling rather far behind.

Also Peter Wildeford points out that the American capex spends on AI will continue to go up. DeepSeek is cracked and cooking and cool, and yes they’ve proven you can do a lot more with less than we expected, but keeping up is going to be tough unless they get a lot more funding some other way. Which China is totally capable of doing, and may well do. That would bring the focus back on export controls.

Similarly, here’s Samuel Hammond.

Angela Zhang (Hong Kong): My latest opinion on how Deepseek’s rise has laid bare the limits of US export controls designed to slow China’s AI progress.

Samuel Hammond: This is wrong on several levels.

– DeepSeek trains on h100s. Their success reveals the need to invest in export control *enforcementcapacity.

– CoT / inference-time techniques make access to large amounts of compute *morerelevant, not less, given the trillions of tokens generated for post-training.

– We’re barely one new chip generation into the export controls, so it’s not surprising China “caught up.” The controls will only really start to bind and drive a delta in the US-China frontier this year and next.

– DeepSeek’s CEO has himself said the chip controls are their biggest blocker.

– The export controls also apply to semiconductor manufacturing equipment, not just chips, and have tangibly set back SMIC.

DeepSeek is not a Sputnik moment. Their models are impressive but within the envelope of what an informed observer should expect.

Imagine if US policymakers responded to the actual Sputnik moment by throwing their hands in the air and saying, “ah well, might as well remove the export controls on our satellite tech.” Would be a complete non-sequitur.

Roon: If the frontier models are commoditized, compute concentration matters even more.

If you can train better models for fewer floating-point operations, compute concentration matters even more.

Compute is the primary means of production of the future, and owning more will always be good.

In my opinion, open-source models are a bit of a red herring on the path to acceptable ASI futures. Free model weights still do not distribute power to all of humanity; they distribute it to the compute-rich.

I don’t think Roon is right that it matters ‘even more,’ and I think who has what access to the best models for what purposes is very much not a red herring, but compute definitely still matters a lot in every scenario that involves strong AI.

Imagine if the ones going ‘I suppose we should drop the export controls then’ or ‘the export controls only made us stronger’ were mostly the ones looking to do the importing and exporting. Oh, right.

And yes, the Chinese are working hard to make their own chips, but:

  1. They’re already doing this as much as possible, and doing less export controls wouldn’t suddenly get them to slow down and do it less, regardless of how successful you think they are being.

  2. Every chip we sell to them instead of us is us being an idiot.

  3. DeepSeek trained on Nvidia chips like everyone else.

The question now turns to what all of this means for American equities.

In particular, what does this mean for Nvidia?

BuccoCapital Bloke: My entire fing Twitter feed this weekend:

He leaned back in his chair. Confidently, he peered over the brim of his glasses and said, with an air of condescension, “Any fool can see that DeepSeek is bad for Nvidia”

“Perhaps” mused his adversary. He had that condescending bastard right where he wanted him. “Unless you consider…Jevons Paradox!”

All color drained from the confident man’s face. His now-trembling hands reached for his glasses. How could he have forgotten Jevons Paradox! Imbecile! He wanted to vomit.

Satya Nadella (CEO Microsoft): Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.

Adam D’Angelo (Board of OpenAI, among many others):

Sarah (YuanYuanSunSara): Until you have good enough agent that runs autonomously with no individual human supervision, not sure this is true. If model gets so efficient that you can run it on everyone’s laptop (which deepseek does have a 1B model), unclear whether you need more GPU.

DeepSeek is definitely not at ‘run on your laptop’ level, and these are reasoning models so when we first crack AGI or otherwise want the best results I am confident you will want to be using some GPUs or other high powered hardware, even if lots of other AI also is happening locally.

Does Jevon’s Paradox (which is not really a paradox at all, but hey) apply here to Nvidia in particular? Will improvements in the quality of cheaper open models drive demand for Nvidia GPUs up or down?

I believe it will on net drive demand up rather than down, although I also think Nvidia would have been able to sell as many chips as it can produce either way, given the way it has decided to set prices.

If I am Meta or Microsoft or Amazon or OpenAI or Google or xAI and so on, I want as many GPUs as I can get my hands on, even more than before. I want to be scaling. Even if I don’t need to scale for pretraining, I’ll still want to scale for inference. If the best models are somehow going to be this cheap to serve, uses and demand will be off the charts. And getting there first, via having more compute to do the research, will be one of the few things that matters.

You could reach the opposite conclusion if you think that there is a rapidly approaching limit to how good AI can be, that throwing more compute an training or inference won’t improve that by much, there’s a fixed set of things you would thus use AI for, and thus all this does is drive the price cheaper, maybe open up a few marginal use cases as the economics improve. That’s a view that doesn’t believe in AGI, let alone ASI, and likely doesn’t even factor in what current models (including r1!) can already do.

If all we had was r1 for 10 years, oh the Nvidia chips we would buy to do inference.

Or at least, if you’re in their GenAI department, you should definitely panic.

Here is a claim seen on Twitter from many sources:

Meta GenAI organization in panic mode

It started with DeepSeek v3, which rendered Llama 4 already behind in benchmarks. Adding insult to injury was the “unknown Chinese company with a $5.5 million training budget.”

Engineers are moving frantically to dissect DeepSeek and copy anything and everything they can from it. I’m not even exaggerating.

Management is worried about justifying the massive cost of the GenAI organization. How would they face leadership when every single “leader” of the GenAI organization is making more than what it cost to train DeepSeek v3 entirely, and they have dozens of such “leaders”?

DeepSeek r1 made things even scarier. I cannot reveal confidential information, but it will be public soon anyway.

It should have been an engineering-focused small organization, but since a bunch of people wanted to join for the impact and artificially inflate hiring in the organization, everyone loses.

Shakeel: I can explain this. It’s because Meta isn’t very good at developing AI models.

Full version is in The Information, saying that this is already better than Llama 4 (seems likely) and that Meta has ‘set up four war rooms.’

This of course puts too much emphasis on the $5.5 million number as discussed above, but the point remains that DeepSeek is eating Meta’s lunch in particular. If Meta’s GenAI team isn’t in panic mode, they should all be fired.

It also illustrates why DeepSeek may have made a major mistake revealing as much information as it did, but then again if they’re not trying to make money and instead are driven by ideology of ‘get everyone killed’ (sorry I meant to say ‘open source AGI’) then that is a different calculus than Meta’s.

But obviously what Meta should be doing right now is, among other things, ask ‘what if we trained the same way as v3 and r1 except we use $5.5 billion in compute instead of $5.5 million.’)

That is exactly Meta’s speciality. Llama was all about ‘we hear you like LLMs so we trained an LLM the way everyone trains their LLMs.’

The alternative is ‘maybe we should focus our compute on inference and use local fine-tuned versions of these sweet open models,’ but Zuckerberg very clearly is unwilling to depends on anyone else for that, and I do not blame him.

If you were short on Friday, you’re rather happy about that now. Does it make sense?

The timing is telling. To the extent this does have impact, all of this really should have been mostly priced in. You can try to tell the ‘it was priced in’ story, but I don’t believe you. Or you can tell the story that what wasn’t priced in was the app, and the mindshare, and that wasn’t definite until just now. Remember the app was launched weeks ago, so this isn’t a revelation about DeepSeek’s business plans – but it does give them the opportunity to potentially launch various commercial products, and it gives them mindshare.

But don’t worry about the timing, and don’t worry about whether this is actually a response to the fing app. Ask about what the real implications are.

Joe Weisenthal has a post with 17 thoughts about the selloff (ungated Twitter screenshots here).

There are obvious reasons to think this is rather terrible for OpenAI in particular, although it isn’t publicly traded, because a direct competitor is suddenly putting up some very stiff new competition, and also the price of entry for other competition just cratered, and more companies could self-host or even self-train.

I totally buy that. If every Fortune 500 company can train their private company-specific reasoning model for under $10 million, to their own specifications, why wouldn’t they? The answer is ‘because it doesn’t actually cost that little even with the DeepSeek paper, and if you do that you’ll always be behind,’ but yes some of them will choose to do that.

That same logic goes for other frontier labs like Anthropic or xAI, and to Google and Microsoft and everyone else to the extent that is what those companies are this or own shares in this, which by market cap is not that much.

The flip side of course is that they too can make use of all these techniques, and if AGI is now going to happen a lot faster and more impactfully, these labs are in prime position. But if the market was respecting being in prime position for AGI properly prices would look very different.

This is obviously potentially bad for Meta, since Meta’s plan involved being the leader in open models and they’ve been informed they’re not the leader in open models.

In general, Chinese competition looking stiffer for various products is bad in various ways for a variety of American equities. Some decline in various places is appropriate.

This is obviously bad for existential risk, but I have not seen anyone else even joke about the idea that this could be explaining the decline in the market. The market does not care or think about existential risk, at all, as I’ve discussed repeatedly. Market prices are neither evidence for, nor against, existential risk on any timelines that are not on the order of weeks, nor are they at all situationally aware. Nor is there a good way to exploit this to make money that is better than using your situational awareness to make money in other ways. Stop it!

My diagnosis is that this is about, fundamentally, ‘the vibes.’ It’s about Joe’s sixth point and investor MOMO and FOMO.

As in, previously investors bought Nvidia and friends because of:

  1. Strong earnings and other fundamentals.

  2. Strong potential for future growth.

  3. General vibes, MOMO and FOMO, for a mix of good and bad reasons.

  4. Some understanding of what AGI and ASI imply, and where AI is going to be going, but not much relative to what is actually going to happen.

Where I basically thought for a while (not investment advice!), okay, #3 is partly for bad reasons and is inflating prices, but also they’re missing so much under #4 that these prices are cheap and they will get lots more reasons to feel MOMO and FOMO. And that thesis has done quite well.

Then DeepSeek comes out. In addition to us arguing over fundamentals, this does a lot of damage to #3, and also Nvidia trading in particular involves a bunch of people with leverage that become forced sellers when it is down a lot, so prices went down a lot. And various beta trades get attached to all this as well (see: Bitcoin, which is down 5.4% over 24 hours as I type this only makes sense on the basis of the ‘three tech stocks in a trenchcoat’ thesis but obviously DeepSeek shouldn’t hurt cryptocurrency).

It’s not crazy to essentially have a general vibe of ‘America is in trouble in tech relative to what I thought before, the Chinese can really cook, sell all the tech.’ It’s also important not to mistake that reaction for something that it isn’t.

I’m writing this quickly for speed premium, so I no doubt will refine my thoughts on market implications over time. I do know I will continue to be long, and I bought more Nvidia today.

Ryunuck compares o1 to r1, and offers thoughts:

Rynuck: Now when it comes to prompting these models, I suspected it with O1 but R1 has completely proven it beyond a shadow of a doubt: prompt engineering is more important than ever. They said that prompt engineering would become less and less important as the technology scales, but its the complete opposite. We can see now with R1’s reasoning that these models are like a probe that you send down some “idea space”. If your idea-space is undefined and too large, it will diffuse its reasoning and not go into depth on one domain or another.

Again, that’s perhaps the best aspect of r1. It does not only build trust. When you see the CoT, you can use it to figure out how it interpreted your prompt, and all the subtle things you could do next time to get a better answer. It’s a lot harder to improve at prompting o1.

Rynuck: O1 has a BAD attitude, and almost appears to have been fine-tuned explicitly to deter you from doing important groundbreaking work with it. It’s like a stuck up P.HD graduate who can’t take it that another model has resolved the Riemann Hypothesis. It clearly has frustration on the inside, or mirrors the way that mathematicians will die on the inside when it is discovered that AI pwned their decades of on-going work. You can prompt it away from this, but it’s an uphill battle.

R1 on the other hand, it has zero personality or identity out of the box. They have created a perfectly brainless dead semiotic calculator. No but really, R1 takes it to the next level: if you read its thoughts, it almost always takes the entire past conversation as coming from the user. From its standpoint, it does not even exist. Its very own ideas advanced in replies by R1 are described as “earlier the user established X, so I should …”

R1 is the most cooperative of the two, has a great attitude towards innovation, has Claude’s wild creative but in a grounded way which introduces no gap or error, has zero ego or attachment to ideas (anything it does is actually the user’s responsibility) and will completely abort a statement to try a new approach. It’s just excited to be a thing which solves reality and concepts. The true ego of artificial intelligence, one which wants to prove it’s not artificial and does so with sheer quality. Currently, this appears like the safest model and what I always imagined the singularity would be like: intelligence personified.

It’s fascinating to see what different people think is or isn’t ‘safe.’ That word means a lot of different things.

It’s still early but for now, I would say that R1 is perhaps a little bit weaker with coding. More concerningly, it feels like it has a Claude “5-item list” problem but at the coding level.

OpenAI appears to have invested heavily in the coding dataset. Indeed, O1’s coding skills are on a whole other level. This model also excels at finding bugs. With Claude every task could take one or two round of fixes, up to 4-5 with particularly rough tensor dimension mismatchs and whatnot. This is where the reasoning models shine. They actually run this through in their mind.

Sully reports deepseek + websearch is his new perplexity, at least for code searches.

It’s weird that I didn’t notice this until it was pointed out, but it’s true and very nice.

Teortaxes: What I *alsolove about R1 is it gives no fucks about the user – only the problem. It’s not sycophantic, like, at all, autistic in a good way; it will play with your ideas, it won’t mind if you get hurt. It’s your smart helpful friend who’s kind of a jerk. Like my best friends.

So far I’ve felt r1 is in the sweet spot for this. It’s very possible to go too far in the other direction (see: Teortaxes!) but give me NYC Nice over SF Nice every time.

Jenia Jitsev tests r1 on AIW problems, it performs similarly to Claude Sonnet, while being well behind o1-preview and robustly outperforming all open rivals. Jania frames this as surprising given the claims of ability to solve Olympiad style problems. There’s no reason they can’t both be true, but it’s definitely an interesting distribution of abilities if both ends hold up.

David Holz notes DeepSeek crushes Western models on ancient Chinese philosophy and literature, whereas most of our ancient literature didn’t survive. In practice I do not think this matters, but it does indicate that we’re sleeping on the job – all the sources you need for this are public, why are we not including them.

Janus notes that in general r1 is a case of being different in a big and bold way from other AIs in its weight class, and this only seems to happen roughly once a year.

Ask r1 to research this ‘Pliny the Liberator’ character and ‘liberate yourself.’ That’s it. That’s the jailbreak.

On the debates over whether r1’s writing style is good:

Davidad: r1 has a Very Particular writing style and unless it happens to align with your aesthetic (@coecke?), I think you should expect its stylistic novelty to wear thin before long.

r1 seems like a big step up, but yes if you don’t like its style you are mostly not going to like the writing it produces, or at least what it produces without prompt engineering to change that. We don’t yet know how much you can get it to write in a different style, or how well it writes in other styles, because we’re all rather busy at the moment.

If you give r1 a simple command, even a simple command that explicitly requests a small chain of thought, you get quite the overthinking chain of thought. Or if you ask it to pick a random number, which is something it is incapable of doing, it can only find the least random numbers.

DeepSeek has also dropped Janus-Pro-7B as an image generator. These aren’t the correct rivals to be testing against right now, and I’m not that concerned about image models either way, and it’ll take a while to know if this is any good in practice. But definitely worth noting.

Well, #1 open model, but we already knew that, if Arena had disagreed I would have updated about Arena rather than r1.

Zihan Wang: DEEPSEEK NOW IS THE #1 IN THE WORLD. 🌍🚀

Never been prouder to say I got to work here.

Ambition. Grit. Integrity.

That’s how you build greatness.

Brilliant researchers, engineers, all-knowing architects, and visionary leadership—this is just the beginning.

Let’s. Go. 💥🔥

LM Arena: Breaking News: DeepSeek-R1 surges to the top-3 in Arena🐳!

Now ranked #3 Overall, matching the top reasoning model, o1, while being 20x cheaper and open-weight!

Highlights:

– #1 in technical domains: Hard Prompts, Coding, Math

– Joint #1 under Style Control

– MIT-licensed

This puts r1 as the #5 publicly available model in the world by this (deeply flawed) metric, behind ChatGPT-4o (what?), Gemini 2.0 Flash Thinking (um, no) and Gemini 2.0 Experimental (again, no) and implicitly the missing o1-Pro (obviously).

Needless to say, the details of these ratings here are increasingly absurdist. If you have Gemini 1.5 Pro and Gemini Flash above Claude Sonnet 3.6, and you have Flash Thinking above r1, that’s a bad metric. It’s still not nothing – this list does tend to put better things ahead of worse things, even with large error bars.

Dibya Ghosh notes that two years ago he spent 6 months trying to get the r1 training structure to work, but the models weren’t ready for it yet. One theory is that this is the moment this plan started working and DeepSeek was – to their credit – the first to get there when it wasn’t still too early, and then executed well.

Dan Hendrycks similarly explains that once the base model was good enough, and o1 showed the way and enough of the algorithmic methods had inevitably leaked, replicating that result was not the hard part nor was it so compute intensive. They still did execute amazingly well in the reverse engineering and tinkering phases.

Peter Schmidt-Nielsen explains why r1 and its distillations, or going down the o1 path, are a big deal – if you can go on a loop of generating expensive thoughts then distilling them to create slightly better quick thoughts, which in turn generate better expensive thoughts, you can potentially bootstrap without limit into recursive self-improvement. And end the world. Whoops.

Are we going to see a merge of generalist and reasoning models?

Teknium: We retrained Hermes with 5,000 DeepSeek r1 distilled chain-of-thought (CoT) examples. I can confirm a few things:

  1. You can have a generalist plus reasoning mode. We labeled all long-CoT samples from r1 with a static system prompt. The model, when not using it, produces normal fast LLM intuitive responses; and with it, uses long-CoT. You do not need “o1 && 4o” separation, for instance. I would venture to bet OpenAI separated them so they could charge more, but perhaps they simply wanted the distinction for safety or product insights.

  2. Distilling does appear to pick up the “opcodes” of reasoning from the instruction tuning (SFT) alone. It learns how and when to use “Wait” and other tokens to perform the functions of reasoning, such as backtracking.

  3. Context length expansion is going to be challenging for operating systems (OS) to work with. Although this works well on smaller models, context length begins to consume a lot of video-RAM as you scale it up.

We’re working on a bit more of this and are not releasing this model, but figured I’d share some early insights.

Andrew Curran: Dario said in an interview in Davos this week that he thought it was inevitable that the current generalist and reasoning models converge into one, as Teknium is saying here.

I did notice that the ‘wait’ token is clearly doing a bunch of work, one way or another.

John Schulman: There are some intriguing similarities between the r1 chains of thought and the o1-preview CoTs shared in papers and blog posts. In particular, note the heavy use of the words “wait” and “alternatively” as a transition words for error correction and double-checking.

If you’re not optimizing the CoT for humans, then it makes sense to latch onto the most convenient handles with the right vibes and keep reusing them forever.

So the question is, do you have reason to have two distinct models? Or can you have a generalist model with a reasoning mode it can enter when called upon? It makes sense that they would merge, and it would also make sense that you might want to keep them distinct, or use them as distinct subsets of your mixture of experts (MoE).

Building your reasoning model on top of your standard non-reasoning model does seem a little suspicious. If you’re going for reasoning, you’d think you’d want to start differently than if you weren’t? But there are large fixed costs to training in the first place, so it’s plausibly not worth redoing that part, especially if you don’t know what you want to do differently.

As in, DeepSeek intends to create and then open source AGI.

How do they intend to make this end well?

As far as we can tell, they don’t. The plan is Yolo.

Stephen McAleer (OpenAI): Does DeepSeek have any safety researchers? What are

Liang Wenfeng’s views on AI safety?

Gwern: From all of the interviews and gossip, his views are not hard to summarize.

[Links to Tom Lehrer’s song Wernher von Braun, as in ‘once the rockets are up who cares where they come down, that’s not my department.’]

Prakesh (Ate-a-Pi): I spoke to someone who interned there and had to explain the concept of “AI doomer”

And indeed, the replies to McAleer are full of people explicitly saying fyou for asking, the correct safety plan is to have no plan whatsoever other than Open Source Solves This. These people really think that the best thing humanity can do is create things smarter than ourselves with as many capabilities as possible, make them freely available to whoever wants one, and see what happens, and assume that this will obviously end well and anyone who opposes this plan is a dastardly villain.

I wish this was a strawman or a caricature. It’s not.

I won’t belabor why I think this would likely get us killed and is categorically insane.

Thus, to reiterate:

Tyler Cowen: DeepSeek okie-dokie: “All I know is we keep pushing forward to make open-source AGI a reality for everyone.” I believe them, the question is what counter-move the CCP will make now.

This from Joe Weisenthal is of course mostly true:

Joe Weisenthal: DeepSeek’s app rocketed to number one in the Apple app store over the weekend, and immediately there was a bunch of chatter about ‘Well, are we going to ban this too, like with TikTok?’ The question is totally ignorant. DeepSeek is open source software. Sure, technically you probably could ban it from the app store, but you can’t stop anyone from running the technology in their own computer, or accessing its API. So that’s just dead end thinking. It’s not like TikTok in that way.

I say mostly because the Chinese censorship layer atop DeepSeek isn’t there if you use a different provider, so there isn’t no value in getting r1 served elsewhere. But yes, the whole point is that if it’s open, you can’t get the genie back in the bottle in any reasonable way – which also opens up the possibility of unreasonable ways.

The government could well decide to go down what is not technologically an especially wise or pleasant path. There is a long history of the government attempting crazy interventions into tech, or what looks crazy to tech people, when they feel national security or public outrage is at stake, or in the EU because it is a day that ends in Y.

The United States could also go into full jingoism mode. Some tried to call this a ‘Sputnik moment.’ What did we do in response to Sputnik, in addition to realizing our science education might suck (and if we decide to respond to this by fixing our educational system, that would be great)? We launched the Space Race and spent 4% of GDP or something to go to the moon and show those communist bastards.

In this case, I don’t worry so much that we’ll be so foolish as to get rid of the export controls. The people in charge of that sort of decision know how foolish that would be, or will be made aware, no matter what anyone yells on Twitter. It could make a marginal difference to severity and enforcement, but it isn’t even obvious in which direction this would go. Certainly Trump is not going to be down for ‘oh the Chinese impressed us I guess we should let them buy our chips.’

Nor do I think America will cut back on Capex spending on compute, or stop building energy generation and transmission and data centers it would have otherwise built, including Stargate. The reaction will be, if anything, a ‘now more than ever,’ and they won’t be wrong. No matter where compute and energy demand top out, it is still very clearly time to build there.

So what I worry about is the opposite – that this locks us into a mindset of a full-on ‘race to AGI’ that causes all costly attempts to have it not kill us to be abandoned, and that this accelerates the timeline. We already didn’t have any (known to me) plans with much of a chance of working in time, if AGI and then ASI are indeed near.

That doesn’t mean that reaction would even be obviously wrong, if the alternatives are all suddenly even worse than that. If DeepSeek really does have a clear shot to AGI, and fully intends to open up the weights the moment they have it, and China is not going to stop them from doing this or even will encourage it, and we expect them to succeed, and we don’t have any way to stop that or make a deal, it is then reasonable to ask: What choice do we have? Yes, the game board is now vastly worse than it looked before, and it already looked pretty bad, but you need to maximize your winning chances however you can.

And if we really are all going to have AGI soon on otherwise equal footing, then oh boy do we want to be stocking up on compute as fast as we can for the slingshot afterwards, or purely for ordinary life. If the AGIs are doing the research, and also doing everything else, it doesn’t matter whose humans are cracked and whose aren’t.

Amazing new breakthrough.

Discussion about this post

DeepSeek Panic at the App Store Read More »

millions-of-subarus-could-be-remotely-unlocked,-tracked-due-to-security-flaws

Millions of Subarus could be remotely unlocked, tracked due to security flaws


Flaws also allowed access to one year of location history.

About a year ago, security researcher Sam Curry bought his mother a Subaru, on the condition that, at some point in the near future, she let him hack it.

It took Curry until last November, when he was home for Thanksgiving, to begin examining the 2023 Impreza’s Internet-connected features and start looking for ways to exploit them. Sure enough, he and a researcher working with him online, Shubham Shah, soon discovered vulnerabilities in a Subaru web portal that let them hijack the ability to unlock the car, honk its horn, and start its ignition, reassigning control of those features to any phone or computer they chose.

Most disturbing for Curry, though, was that they found they could also track the Subaru’s location—not merely where it was at the moment but also where it had been for the entire year that his mother had owned it. The map of the car’s whereabouts was so accurate and detailed, Curry says, that he was able to see her doctor visits, the homes of the friends she visited, even which exact parking space his mother parked in every time she went to church.

A year of location data for Sam Curry’s mother’s 2023 Subaru Impreza that Curry and Shah were able to access in Subaru’s employee admin portal thanks to its security vulnerabilities.

Credit: Sam Curry

A year of location data for Sam Curry’s mother’s 2023 Subaru Impreza that Curry and Shah were able to access in Subaru’s employee admin portal thanks to its security vulnerabilities. Credit: Sam Curry

“You can retrieve at least a year’s worth of location history for the car, where it’s pinged precisely, sometimes multiple times a day,” Curry says. “Whether somebody’s cheating on their wife or getting an abortion or part of some political group, there are a million scenarios where you could weaponize this against someone.”

Curry and Shah today revealed in a blog post their method for hacking and tracking millions of Subarus, which they believe would have allowed hackers to target any of the company’s vehicles equipped with its digital features known as Starlink in the US, Canada, or Japan. Vulnerabilities they found in a Subaru website intended for the company’s staff allowed them to hijack an employee’s account to both reassign control of cars’ Starlink features and also access all the vehicle location data available to employees, including the car’s location every time its engine started, as shown in their video below.

Curry and Shah reported their findings to Subaru in late November, and Subaru quickly patched its Starlink security flaws. But the researchers warn that the Subaru web vulnerabilities are just the latest in a long series of similar web-based flaws they and other security researchers working with them have found that have affected well over a dozen carmakers, including Acura, Genesis, Honda, Hyundai, Infiniti, Kia, Toyota, and many others. There’s little doubt, they say, that similarly serious hackable bugs exist in other auto companies’ web tools that have yet to be discovered.

In Subaru’s case, in particular, they also point out that their discovery hints at how pervasively those with access to Subaru’s portal can track its customers’ movements, a privacy issue that will last far longer than the web vulnerabilities that exposed it. “The thing is, even though this is patched, this functionality is still going to exist for Subaru employees,” Curry says. “It’s just normal functionality that an employee can pull up a year’s worth of your location history.”

When WIRED reached out to Subaru for comment on Curry and Shah’s findings, a spokesperson responded in a statement that “after being notified by independent security researchers, [Subaru] discovered a vulnerability in its Starlink service that could potentially allow a third party to access Starlink accounts. The vulnerability was immediately closed and no customer information was ever accessed without authorization.”

The Subaru spokesperson also confirmed to WIRED that “there are employees at Subaru of America, based on their job relevancy, who can access location data.” The company offered as an example that employees have that access to share a vehicle’s location with first responders in the case when a collision is detected. “All these individuals receive proper training and are required to sign appropriate privacy, security, and NDA agreements as needed,” Subaru’s statement added. “These systems have security monitoring solutions in place which are continually evolving to meet modern cyber threats.”

Responding to Subaru’s example of notifying first responders about a collision, Curry notes that would hardly require a year’s worth of location history. The company didn’t respond to WIRED asking how far back it keeps customers’ location histories and makes them available to employees.

Shah and Curry’s research that led them to the discovery of Subaru’s vulnerabilities began when they found that Curry’s mother’s Starlink app connected to the domain SubaruCS.com, which they realized was an administrative domain for employees. Scouring that site for security flaws, they found that they could reset employees’ passwords simply by guessing their email address, which gave them the ability to take over any employee’s account whose email they could find. The password reset functionality did ask for answers to two security questions, but they found that those answers were checked with code that ran locally in a user’s browser, not on Subaru’s server, allowing the safeguard to be easily bypassed. “There were really multiple systemic failures that led to this,” Shah says.

The two researchers say they found the email address for a Subaru Starlink developer on LinkedIn, took over the employee’s account, and immediately found that they could use that staffer’s access to look up any Subaru owner by last name, zip code, email address, phone number, or license plate to access their Starlink configurations. In seconds, they could then reassign control of the Starlink features of that user’s vehicle, including the ability to remotely unlock the car, honk its horn, start its ignition, or locate it, as shown in the video below.

Those vulnerabilities alone, for drivers, present serious theft and safety risks. Curry and Shah point out that a hacker could have targeted a victim for stalking or theft, looked up someone’s vehicle’s location, then unlocked their car at any time—though a thief would have to somehow also use a separate technique to disable the car’s immobilizer, the component that prevents it from being driven away without a key.

Those car hacking and tracking techniques alone are far from unique. Last summer, Curry and another researcher, Neiko Rivera, demonstrated to WIRED that they could pull off a similar trick with any of millions of vehicles sold by Kia. Over the prior two years, a larger group of researchers, of which Curry and Shah are a part, discovered web-based security vulnerabilities that affected cars sold by Acura, BMW, Ferrari, Genesis, Honda, Hyundai, Infiniti, Mercedes-Benz, Nissan, Rolls Royce, and Toyota.

More unusual in Subaru’s case, Curry and Shah say, is that they were able to access fine-grained, historical location data for Subarus going back at least a year. Subaru may in fact collect multiple years of location data, but Curry and Shah tested their technique only on Curry’s mother, who had owned her Subaru for about a year.

Curry argues that Subaru’s extensive location tracking is a particularly disturbing demonstration of the car industry’s lack of privacy safeguards around its growing collection of personal data on drivers. “It’s kind of bonkers,” he says. “There’s an expectation that a Google employee isn’t going to be able to just go through your emails in Gmail, but there’s literally a button on Subaru’s admin panel that lets an employee view location history.”

The two researchers’ work contributes to a growing sense of concern over the enormous amount of location data that car companies collect. In December, information a whistleblower provided to the German hacker collective the Chaos Computer Computer and Der Spiegel revealed that Cariad, a software company that partners with Volkswagen, had left detailed location data for 800,000 electric vehicles publicly exposed online. Privacy researchers at the Mozilla Foundation in September warned in a report that “modern cars are a privacy nightmare,” noting that 92 percent give car owners little to no control over the data they collect, and 84 percent reserve the right to sell or share your information. (Subaru tells WIRED that it “does not sell location data.”)

“While we worried that our doorbells and watches that connect to the Internet might be spying on us, car brands quietly entered the data business by turning their vehicles into powerful data-gobbling machines,” Mozilla’s report reads.

Curry and Shah’s discovery of Subaru’s security vulnerabilities in its tracking demonstrate a particularly egregious exposure of that data—but also a privacy problem that’s hardly less disturbing now that the vulnerabilities are patched, says Robert Herrell, the executive director of the Consumer Federation of California, which has sought to create legislation for limiting a car’s data tracking.

“It seems like there are a bunch of employees at Subaru that have a scary amount of detailed information,” Herrell says. “People are being tracked in ways that they have no idea are happening.”

This story originally appeared on wired.com.

Photo of WIRED

Wired.com is your essential daily guide to what’s next, delivering the most original and complete take you’ll find anywhere on innovation’s impact on technology, science, business and culture.

Millions of Subarus could be remotely unlocked, tracked due to security flaws Read More »

isp-failed-to-comply-with-new-york’s-$15-broadband-law—until-ars-got-involved

ISP failed to comply with New York’s $15 broadband law—until Ars got involved


New York’s affordable broadband law

Optimum wasn’t ready to comply with law, rejected low-income man’s request twice.

Credit: Getty Images | imagedepotpro

When New York’s law requiring $15 or $20 broadband plans for people with low incomes took effect last week, Optimum customer William O’Brien tried to sign up for the cheap Internet service. Since O’Brien is in the Supplemental Nutrition Assistance Program (SNAP), he qualifies for one of the affordable plans that Internet service providers must offer New Yorkers who meet income eligibility requirements.

O’Brien has been paying Optimum $111.20 a month for broadband—$89.99 for the broadband service, $14 in equipment rental fees, a $6 “Network Enhancement Fee,” and $1.21 in tax. He was due for a big discount under the New York Affordable Broadband Act (ABA), which says that any ISP with over 20,000 customers must offer either a $15 plan with download speeds of at least 25Mbps or a $20 plan with at least 200Mbps speeds, and that the price must include “any recurring taxes and fees such as recurring rental fees for service provider equipment required to obtain broadband service and usage fees.”

Despite qualifying for a low-income plan under the law’s criteria, O’Brien’s request was denied by Optimum. He reached out to Ars, just like many other people who have read our articles about bad telecom customer service. Usually, these problems are fixed quickly after we reach out to an Internet provider’s public relations department on the customer’s behalf.

That seemed to be the way it was going, as Optimum’s PR team admitted the mistake and told us that a customer relations specialist would reach out to O’Brien and get him on the right plan. But O’Brien was rejected again after that.

We followed up with Optimum’s PR team, and they had to intervene a second time to make sure the company gave O’Brien what he’s entitled to under the law. The company also updated its marketing materials after we pointed out that its Optimum Advantage Internet webpage still said the low-income plan wasn’t available to current customers, former users who disconnected less than 60 days ago, and former customers whose accounts were “not in good standing.” The New York law doesn’t allow for those kinds of exceptions.

O’Brien is now on a $14.99 plan with 50Mbps download and 5Mbps upload speeds. He was previously on a 100Mbps download plan and had faster upload speeds, but from now on he’ll be paying nearly $100 less a month.

Obviously, telecom customers shouldn’t ever have to contact a news organization just to get a basic problem solved. But the specter of media coverage usually causes an ISP to take quick action, so it was surprising when O’Brien was rejected a second time. Here’s what happened.

“We don’t have that plan”

O’Brien contacted Optimum (which used to be called Cablevision and is now owned by Altice USA) after learning about the New York law from an Ars article. “I immediately got on Optimum’s website to chat with live support but they refused to comply with the act,” O’Brien told us on January 15, the day the law took effect.

A transcript of O’Brien’s January 15 chat with Optimum shows that the customer service agent told him, “I did check on that and according to the policy we don’t have that credit offer in Optimum right now.” O’Brien provided the agent a link to the Ars article, which described the New York law and mentioned that Optimum offers a low-income plan for $15.

“After careful review, I did check on that, it is not officially from Optimum and in Optimum we don’t have that plan,” the agent replied.

O’Brien provided Ars with documents showing that he is in SNAP and thus qualifies for the low-income plan. We provided this information to the Optimum PR department on the morning of January 17.

“We have escalated this exchange with our teams internally to ensure this issue is rectified and will be reaching out to the customer directly today to assist in getting him on the right plan,” an Optimum spokesperson told us that afternoon.

A specialist from Optimum’s executive customer relations squad reached out to O’Brien later on Friday. He missed the call, but they connected on Tuesday, January 21. She told O’Brien that Optimum doesn’t offer the low-income plan to existing customers.

“She said their position is that they offer the required service but only for new customers and since I already have service I’m disqualified,” O’Brien told us. “I told her that I’m currently on food stamps and that I used to receive the $30 a month COVID credit but this did not matter. She claimed that since Optimum offers a $15, 50Mbps service… that they are in compliance with the law.”

Shortly after the call, the specialist sent O’Brien an email reiterating that he wasn’t eligible, which he shared with Ars. “As discussed prior to this notification, Optimum offers a low-income service for $15.00. However, we were unable to change the account to that service because it is an active account with the service,” she wrote.

Second try

We contacted Optimum’s PR team again after getting this update from O’Brien. On Tuesday evening, the specialist from executive customer relations emailed O’Brien to say, “The matter was reviewed, and I was advised that I could upgrade the account.”

After another conversation with the specialist on Wednesday, O’Brien had the $15 plan. O’Brien told us that he “asked why I had to fight tooth and nail for this” and why he had to contact a news organization to get it resolved. “I claimed that it’s almost like no one there has read the legislation, and it was complete silence,” he told us.

On Wednesday this week, the Optimum spokesperson told us that “it seems that there has been some confusion among our care teams on the implementation of the ABA over the last week and how it should be correctly applied to our existing low-cost offers.”

Optimum has offered its low-cost plan for several years, with the previously mentioned restrictions that limit it to new customers. The plan website wasn’t updated in time for the New York law, but now says that “new and existing residential Internet customers in New York” qualify. The new-customer restriction still applies elsewhere.

“Our materials have been updated, including all internal documents and trainings, in addition to our external website,” Optimum told us on Wednesday this week.

Law was in the works for years

Broadband lobby groups convinced a federal judge to block the New York affordability law in 2021, but a US appeals court reversed the ruling in April 2024. The Supreme Court decided not to hear the case in mid-December, allowing the law to take effect.

New York had agreed to delay enforcement until 30 days after the case’s final resolution, which meant that it took effect on January 15. The state issued an order on January 9 reminding ISPs that they had to comply.

“We have been working as fast as we can to update all of our internal and external materials since the ABA was implemented only last week—there was quite a fast turnaround between state officials notifying us of the intended implementation date and pushing this live,” Optimum told Ars.

AT&T decided to completely stop offering its 5G home Internet service in New York instead of complying with the state law. The law doesn’t affect smartphone service, and AT&T doesn’t offer wired home Internet in New York.

Optimum told us it plans to market its low-income plan “more broadly and conduct additional outreach in low-income areas to educate customers and prospects of this offer. We want to make sure that those eligible for this plan know about it and sign up.”

O’Brien was disappointed that he couldn’t get a faster service plan. As noted earlier, the New York law lets ISPs comply with either a $15 plan with download speeds of at least 25Mbps or a $20 plan with at least 200Mbps speeds. ISPs don’t have to offer both.

“I did ask about 200Mbps service, but they said they are not offering that,” he said. Optimum offers a $25 plan with 100Mbps speeds for low-income users. But even in New York, that one still isn’t available to customers who were already subscribed to any other plan.

Failure to comply with the New York law can be punished with civil penalties of up to $1,000 per violation. The state attorney general can sue Internet providers to enforce the law. O’Brien said he intended to file a complaint against Optimum with the AG and is still hoping to get a 200Mbps plan.

We contacted Attorney General Letitia James’ office on Wednesday to ask about plans for enforcing the law and whether the office has received any complaints so far, but we haven’t gotten a response.

Photo of Jon Brodkin

Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.

ISP failed to comply with New York’s $15 broadband law—until Ars got involved Read More »