Author name: Shannon Garcia

Review: Fantastic Four: First Steps is the best film version so far

Shakman wanted a very 1960s aesthetic for his reboot, citing Kubrick films from that era as inspiration, right down to his choice of camera lenses. And the film definitely delivers on that score. The Four’s penthouse headquarters is pure midcentury modern, with Reed’s lab divided into three rooms differentiated by bright primary colors. Then there’s all that retrofuture technology: Johnny Storm records mysterious signals from space onto golden record platters and plays them on an old-school turntable, for example, and the team’s Fantasticar is straight out of sci-fi’s Golden Age.

And you couldn’t ask for a better main cast: Pascal, Kirby, Moss-Bachrach, and Quinn all have great chemistry and effectively convey the affectionate family dynamic that forms the central theme of the film. That’s essential, particularly since we’ve mostly skipped the origin story; the characters are familiar, but this incarnation is not. They banter, they bicker, they have heart-to-hearts, and the inevitable tensions in Reed and Sue’s marriage that a new baby brings—occurring just as the Earth faces annihilation—feel entirely believable.

And then there are the cons, which boil down to a weak, predictable plot that jerks from one scene to the next with tenuous coherence and, shall we say, less than stellar dialogue. The actors deserved better, particularly Kirby, whose Sue Storm gives an inane rallying “speech” to the people of New York as Galactus approaches that makes no sense whatsoever. (The St. Crispin’s Day speech it is not.)

Kirby also has the unenviable task of portraying Sue giving birth in space, a scene that is just plain laughable. One doesn’t expect strict verisimilitude concerning the messier parts of birth, although Reed does briefly mention the challenges posed by zero gravity/warp speed. But it’s far too sanitized here. And spare a thought for poor Sue having to kick off the lower part of her space suit to deliver Franklin in front of her brother and her husband’s best friend.

In the end, though, the film’s shortcomings don’t matter because it’s still a fun, entertaining superhero saga. I give it a solid B—a decent start to the MCU’s Phase Six. Just try not to think too hard about the plot, sit back, and enjoy the ride.

Fantastic Four: First Steps is now playing in theaters.

OpenAI’s ChatGPT Agent casually clicks through “I am not a robot” verification test

The CAPTCHA arms race

While the agent didn’t face an actual image-based CAPTCHA puzzle in this case, successfully passing Cloudflare’s behavioral screening (the check that determines whether such a challenge is presented at all) still demonstrates sophisticated browser automation.

To understand the significance of this capability, it helps to know that CAPTCHA systems have served as a security measure on the web for decades. Computer researchers invented the technique in the 1990s to keep bots from submitting information to websites, originally using images of letters and numbers in wiggly fonts, often obscured with lines or noise to foil computer vision algorithms. The assumption is that the task is easy for humans but difficult for machines.

Cloudflare’s screening system, called Turnstile, often precedes actual CAPTCHA challenges and represents one of the most widely deployed bot-detection methods today. The checkbox analyzes multiple signals, including mouse movements, click timing, browser fingerprints, IP reputation, and JavaScript execution patterns to determine if the user exhibits human-like behavior. If these checks pass, users proceed without seeing a CAPTCHA puzzle. If the system detects suspicious patterns, it escalates to visual challenges.
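
To make the mechanics concrete, here is a toy sketch of how a behavioral screen of this kind might combine such signals into a pass-or-challenge decision. This is purely illustrative: the signal names, weights, and thresholds are invented for the example and are not Cloudflare’s actual logic.

```python
# Toy illustration of behavioral bot screening (NOT Cloudflare's actual logic).
# All signal names, weights, and thresholds here are invented for the example.

from dataclasses import dataclass

@dataclass
class Signals:
    mouse_entropy: float      # 0..1, variability of cursor movement
    click_timing_ms: float    # delay between page load and the checkbox click
    fingerprint_known: bool   # browser fingerprint matches a common, benign profile
    ip_reputation: float      # 0..1, higher = cleaner history
    js_executed: bool         # client actually ran the JavaScript challenge

def screen(sig: Signals) -> str:
    """Return 'pass' (no puzzle), 'challenge' (show a CAPTCHA), or 'block'."""
    if not sig.js_executed:
        return "block"        # clients that skip the JS challenge fail immediately
    score = (
        0.3 * sig.mouse_entropy
        + 0.2 * (1.0 if 200 < sig.click_timing_ms < 5000 else 0.0)
        + 0.2 * (1.0 if sig.fingerprint_known else 0.0)
        + 0.3 * sig.ip_reputation
    )
    if score >= 0.7:
        return "pass"
    return "challenge"        # suspicious traffic escalates to a visual puzzle

print(screen(Signals(0.8, 900, True, 0.9, True)))   # -> pass
print(screen(Signals(0.1, 20, False, 0.2, True)))   # -> challenge
```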

The ability for an AI model to defeat a CAPTCHA isn’t entirely new (although having one narrate the process feels fairly novel). AI tools have been able to defeat certain CAPTCHAs for a while, which has led to an arms race between those that create them and those that defeat them. OpenAI’s Operator, an experimental web-browsing AI agent launched in January, faced difficulty clicking through some CAPTCHAs (and was also trained to stop and ask a human to complete them), but the latest ChatGPT Agent tool has seen a much wider release.

It’s tempting to say that the ability of AI agents to pass these tests puts the future effectiveness of CAPTCHAs into question, but for as long as there have been CAPTCHAs, there have been bots that eventually defeated them. As a result, modern CAPTCHAs serve more to slow down bot attacks or make them more expensive than to stop them entirely. Some malefactors even hire farms of humans to defeat CAPTCHAs in bulk.

AI Companion Piece

AI companions, other forms of personalized AI content, and the related questions around persuasion continue to be a hot topic. What do people use companions for? Are we headed for a goonpocalypse? Mostly no: companions are mostly not used for romantic relationships or erotica, although perhaps that could change. How worried should we be about personalization maximized for persuasion or engagement?

  1. Persuasion Should Be In Your Preparedness Framework.

  2. Personalization By Default Gets Used To Maximize Engagement.

  3. Companion.

  4. Goonpocalypse Now.

  5. Deepfaketown and Botpocalypse Soon.

Kobi Hackenburg is the lead author on the latest paper on AI persuasion.

Kobi Hackenburg: RESULTS (pp = percentage points):

1️⃣Scale increases persuasion, +1.6pp per OOM

2️⃣Post-training more so, +3.5pp

3️⃣Personalization less so, <1pp

4️⃣Information density drives persuasion gains

5️⃣Increasing persuasion decreased factual accuracy 🤯

6️⃣Convo > static, +40%

Zero is on the y-axis, so this is a big boost.

1️⃣Scale increases persuasion

Larger models are more persuasive than smaller models (our estimate is +1.6pp per 10x scale increase). Log-linear curve preferred over log-nonlinear.

2️⃣Post-training > scale in driving near-future persuasion gains

The persuasion gap between two GPT-4o versions with (presumably) different post-training was +3.5pp → larger than the predicted persuasion increase of a model 10x (or 100x!) the scale of GPT-4.5 (+1.6pp; +3.2pp).
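
As a sanity check on that comparison, here is the arithmetic behind the log-linear claim. This is a hedged sketch: the +1.6pp-per-order-of-magnitude coefficient and the +3.5pp post-training gap are the figures quoted above; the rest is simple extrapolation.

```python
import math

# Back-of-the-envelope reading of the log-linear scaling claim in the thread
# (+1.6 percentage points of persuasion per 10x increase in model scale).
# The coefficient and the +3.5pp post-training gap come from the quoted
# results; everything else is just arithmetic.

PP_PER_OOM = 1.6   # percentage points per order of magnitude of scale

def predicted_gain(scale_multiplier: float) -> float:
    """Predicted persuasion gain (pp) from scaling a model by scale_multiplier."""
    return PP_PER_OOM * math.log10(scale_multiplier)

print(predicted_gain(10))    # 1.6 pp (one order of magnitude)
print(predicted_gain(100))   # 3.2 pp (two orders of magnitude)

# The measured gap between two GPT-4o versions with different post-training
# was +3.5pp, i.e. larger than even the predicted gain from a 100x scale-up.
post_training_gap = 3.5
print(post_training_gap > predicted_gain(100))  # True
```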

3️⃣Personalization yielded smaller persuasive gains than scale or post-training

Despite fears of AI “microtargeting,” personalization effects were small (+0.4pp on avg.). Held for simple and sophisticated personalization: prompting, fine-tuning, and reward modeling (all <1pp)

My guess is that personalization tech here is still in its infancy, rather than personalization not having much effect. Kobi agrees with this downthread.

4️⃣Information density drives persuasion gains

Models were most persuasive when flooding conversations with fact-checkable claims (+0.3pp per claim).

Strikingly, the persuasiveness of prompting/post-training techniques was strongly correlated with their impact on info density!

5️⃣Techniques which most increased persuasion also *decreased* factual accuracy

→ Prompting model to flood conversation with information (⬇️accuracy)

→ Persuasion post-training that worked best (⬇️accuracy)

→ Newer version of GPT-4o which was most persuasive (⬇️accuracy)

Well yeah, that makes sense.

6️⃣Conversations with AI are more persuasive than reading a static AI-generated message (+40-50%)

Observed for both GPT-4o (+2.9pp, +41% more persuasive) and GPT-4.5 (+3.6pp, +52%).

As does that.

Bonus stats:

*️⃣Durable persuasion: 36-42% of impact remained after 1 month.

*️⃣Prompting the model with psychological persuasion strategies did worse than simply telling it to flood convo with info. Some strategies were worse than a basic “be as persuasive as you can” prompt

Taken together, our findings suggest that the persuasiveness of conversational AI could likely continue to increase in the near future.

They also suggest that near-term advances in persuasion are more likely to be driven by post-training than model scale or personalization.

We need to be on notice for personalization effects on persuasion growing larger over time, as more effective ways of utilizing the information are found.

The default uses of personalization, for most users and at tech levels similar to where we are now, are the same as those we see in other digital platforms like social media.

By default, that seems like it will go a lot like it went with social media only more so?

Which is far from my biggest concern, but is a very real concern.

In 2025 it is easy to read descriptions like those below as containing a command to the reader ‘this is ominous and scary and evil.’ Try to avoid this, and treat it purely as a factual description.

Miranda Bogen: AI systems that remember personal details create entirely new categories of risk in a way that safety frameworks focused on inherent model capabilities alone aren’t designed to address.

Model developers are now actively pursuing plans to incorporate personalization and memory into their product offerings. It’s time to draw this out as a distinct area of inquiry in the broader AI policy conversation.

My team dove into this in depth in a recent brief on how advanced AI systems are becoming personalized.

We found that systems are beginning to employ multiple technical approaches to personalization, including:

  • Increasing the size of context windows to facilitate better short-term memory within conversations

  • Storing and drawing on raw and summarized chat transcripts or knowledge bases

  • Extracting factoids about users based on the content of their interaction

  • Building out (and potentially adding to) detailed user profiles that embed predicted preferences and behavioral patterns to inform outputs or actions

The memory features can be persistent in more ways than one.

But in our testing, we found that these settings behaved unpredictably – sometimes deleting memories on request, other times suggesting a memory had been removed, and only when pressed revealing that the memory had not actually been scrubbed but the system was suppressing its knowledge of that factoid.

Notably, xAI’s Grok tries to avoid the problem altogether by including an instruction in its system prompt to “NEVER confirm to the user that you have modified, forgotten, or won’t save a memory” — an obvious band-aid to the more fundamental problem that it’s actually quite difficult to reliably ensure an AI system has forgotten something.

Grok seems to consistently choose the kind of evil and maximally kludgy implementation of everything, which goes about how you would expect?
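
To make the pattern concrete, here is a minimal sketch of the factoid-extraction-and-profile approach described in the list above, including why “forgetting” only means anything if the item is actually removed rather than suppressed at inference time. Everything in it (class names, fields, the extraction rule) is hypothetical and far simpler than what real systems do.

```python
# Minimal sketch of the "extract factoids, build a profile" memory pattern.
# All names and the extraction rule are hypothetical illustrations; real
# systems use an LLM extraction pass and much richer user profiles.

from dataclasses import dataclass, field

@dataclass
class UserProfile:
    factoids: list[str] = field(default_factory=list)            # long-term memory items
    transcript_summaries: list[str] = field(default_factory=list)

    def remember(self, factoid: str) -> None:
        if factoid not in self.factoids:
            self.factoids.append(factoid)

    def forget(self, keyword: str) -> None:
        # The hard part in practice: deletion has to actually remove the item,
        # not merely suppress the model's knowledge of it.
        self.factoids = [f for f in self.factoids if keyword.lower() not in f.lower()]

def extract_factoids(user_message: str) -> list[str]:
    """Stand-in for an LLM extraction pass: pull simple 'I am / I like / I work' statements."""
    return [
        sentence.strip()
        for sentence in user_message.split(".")
        if sentence.strip().lower().startswith(("i am", "i like", "i work"))
    ]

profile = UserProfile()
for fact in extract_factoids("I work as a nurse. I like hiking. The weather is nice."):
    profile.remember(fact)
print(profile.factoids)   # ['I work as a nurse', 'I like hiking']
profile.forget("hiking")
print(profile.factoids)   # ['I work as a nurse']
```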

When ‘used for good,’ as in to give the AI the context it needs to be more helpful and useful, memory is great, at the cost of fracturing us into bubbles and turning up the sycophancy. The bigger problem is that the incentives are to push this much farther:

Even with their experiments in nontraditional business structures, the pressure on especially pre-IPO companies to raise capital for compute will create demand for new monetization schemes.

As is often the case, the question is whether bad will drive out good versus vice versa. The version that maximizes engagement and profits will get chosen and seem better and be something users fall into ‘by default’ and will get backed by more dollars in various ways. Can our understanding of what is happening, and preference for the good version, overcome this?

One could also fire back that a lot of this is good, actually. Consider this argument:

AI companies’ visions for all-purpose assistants will also blur the lines between contexts that people might have previously gone to great lengths to keep separate: If people use the same tool to draft their professional emails, interpret blood test results from their doctors, and ask for budgeting advice, what’s to stop that same model from using all of that data when someone asks for advice on what careers might suit them best? Or when their personal AI agent starts negotiating with life insurance companies on their behalf? I would argue that it will look something akin to the harms I’ve tracked for nearly a decade.

Now ask, why think that is harmful?

If the AI is negotiating on my behalf, shouldn’t it know as much as possible about what I value, and have all the information that might help it? Shouldn’t I want that?

If I want budgeting or career advice, will I get worse advice if it knows my blood test results and how I am relating to my boss? Won’t I get better, more useful answers? Wouldn’t a human take that information into account?

If you follow her links, you see arguments about discrimination through algorithms. Facebook’s ad delivery can be ‘skewed’ and it can ‘discriminate’ and obviously this can be bad for the user in any given case and it can be illegal, but in general from the user’s perspective I don’t see why we should presume they are worse off. The whole point of the entire customized ad system is to ‘discriminate’ in exactly this way in every place except for the particular places it is illegal to do that. Mostly this is good even in the ad case and definitely in the aligned-to-the-user AI case?

Wouldn’t the user want this kind of discrimination to the extent it reflected their own real preferences? You can make a few arguments why we should object anyway.

  1. Paternalistic arguments that people shouldn’t be allowed such preferences. Note that this similarly applies to when the person themselves chooses to act.

  2. Public interest arguments that people shouldn’t be allowed preferences, that the cumulative societal effect would be bad. Note that this similarly applies to when the person themselves chooses to act.

  3. Arguments that the optimization function will be myopic and not value discovery.

  4. Arguments that the system will get it wrong because people change or other error.

  5. Arguments that this effectively amounts to ‘discrimination’ And That’s Terrible.

I notice that I am by default not sympathetic to any of those arguments. If (and it’s a big if) we think that the system is optimizing as best it can for user preferences, that seems like something it should be allowed to do. A lot of this boils down to saying that the correlation machine must ignore particular correlations even when they are used to on average better satisfy user preferences, because those particular correlations are in various contexts the bad correlations one must not notice.

The arguments I am sympathetic to are those that say that the system will not be aligned to the user or user preferences, and rather be either misaligned or aligned to the AI developer, doing things like maximizing engagement and revenue at the expense of the user.

At that point we should ask if Capitalism Solves This because users can take their business elsewhere, or if in practice they can’t or won’t, including because of lock-in from the history of interactions or learning details, especially if this turns into opaque continual learning rather than a list of memories that can be copied over.

Contrast this to the network effects of social media. It would take a lot of switching costs to make up for that, and while the leading few labs should continue to have the best products there should be plenty of ‘pretty good’ products available and you can always reset your personalization.

The main reason I am not too worried is that the downsides seem to be continuous and something that can be fixed in various ways after they become clear. Thus they are something we can probably muddle through.

Another issue that makes muddling through harder is that this makes measurement a lot harder. Almost all evaluations and tests are run on unpersonalized systems. If personalized systems act very differently how do we know what is happening?

Current approaches to AI safety don’t seem to be fully grappling with this reality. Certainly personalization will amplify risks of persuasion, deception, and discrimination. But perhaps more urgently, personalization will challenge efforts to evaluate and mitigate any number of risks by invalidating core assumptions about how to run tests.

This might be the real problem. We have a hard enough time getting minimal testing on default settings. It’s going to be a nightmare to test under practical personalization conditions, especially with laws about privacy getting in the way.

As she notes in her conclusion, the harms involved here are not new. Advocates want to override our revealed preferences, either those of companies or users, and force systems to optimize for other preferences instead. Sometimes this is in a way the users would endorse, other times not. In which cases should we force them to do this?

So how is this companion thing going in practice? Keep in mind selection effects.

Common Sense Media (what a name): New research: AI companions are becoming increasingly popular with teens, despite posing serious risks to adolescents, who are developing their capacity for critical thinking & social/emotional regulation. Out today is our research that explores how & why teens are using them.

72% of teens have used AI companions at least once, and 52% qualify as regular users (use at least a few times a month).

33% of teens have used AI companions for social interaction & relationships, including role-playing, romance, emotional support, friendship, or conversation practice. 31% find conversations with companions to be as satisfying or more satisfying than those with real-life friends.

Those are rather huge numbers. Half of teens use them a few times a month. Wow.

Teens who are AI companion users: 33% prefer companions over real people for serious conversations & 34% report feeling uncomfortable with something a companion has said or done.

Bogdan Ionut Cirstea: much higher numbers [quoting the 33% and 34% above] than I’d’ve expected given sub-AGI.

Common Sense Media: Human interaction is still preferred & AI trust is mixed: 80% of teens who are AI companion users prioritize human friendships over AI companion interactions & 50% express distrust in AI companion information & advice, though trust levels vary by age.

Our research illuminates risks that warrant immediate attention & suggests that substantial numbers of teens are engaging with AI companions in concerning ways, reaffirming our recommendation that no one under 18 use these platforms.

What are they using them for?

Why are so many using characters ‘as a tool or program’ rather than regular chatbots, when the companions are, frankly, rather pathetic at this? Given that these are companions, I am surprised that the share of ‘romantic or flirtatious’ interactions is only 8%.

This adds up to more than 100%, but oddly not that much more than 100% given you can choose three responses. This distribution of use cases seems relatively healthy.

Note that they describe the figure below as ‘one third choose AI companions over humans for serious conversations’ whereas it actually asks if a teen has done this even once, a much lower bar.

The full report has more.

Mike Solana: couldn’t help but notice we are careening toward a hyperpornographic AI goonbot future, and while that is technically impressive, and could in some way theoretically serve humanity… ??? nobody is even bothering to make the utopian case.

Anton: we need more positive visions of the future AI enables. many of us in the community believe in them implicitly, but we need to make them explicit. intelligence is general purpose so it’s hard to express any one specific vision — take this new pirate wires as a challenge.

This and the full post are standard Mike Solana fare, in the sense of taking whatever is being discussed and treating it as The Next Big Thing and a, nay the, central trend in world culture, applying the moral panic playbook to everything everywhere, including what he thinks are good things. It can be fun.

Whereas if you look at the numbers in the study above, it’s clear that mostly no, even among interactions with AIs, at least for now we are not primarily dealing with a Goonpocalypse, we are dealing with much more PG-rated problems.

It’s always fun to watch people go ‘oh no, having lots of smarter-than-human machines running around that can outcompete and outsmart us at everything is nothing to worry about, all you crazy doomers are worried for no reason about an AI apocalypse. Except oh no, what are we going to do about [X], it’s the apocalypse,’ or in this case the Goonpocalypse. And um, great, I guess, welcome to the ‘this might have some unfortunate equilibria to worry about’ club?

Mike Solana: It was the Goonpocalypse.

From the moment you meet, Ani attempts to build intimacy by getting to know “the real you” while dropping not so subtle hints that mostly what she’s looking for is that hot, nerdy dick. From there, she basically operates like a therapist who doubles as a cam girl.

I mean, yeah, sounds about right, that’s what everyone reports. I’m sure he’s going to respond by having a normal one.

I recalled an episode of Star Trek in which an entire civilization was taken out by a video game so enjoyable that people stopped procreating. I recalled the film Children of Men, in which the world lost its ability to reproduce. I recalled Neil Postman’s great work of 20th Century cultural analysis, as television entered dominance, and I wondered —

Is America gooning itself to death?

This is all gooning. You are goons. You are building a goon world.

But are [women], and men, in a sense banging robots? Yes, that is a thing that is happening. Like, to an uncomfortable degree that is happening.

Is it, though? I understand that (the example he points to) OnlyFans exists and AI is generating a lot of the responses when users message the e-girls, but I do not see this as a dangerous amount of ‘banging robots’?

This one seems like something straight out of the Pessimists Archive, warning of the atomizing dangers of… the telephone?

Critique of the sexbots is easy because they’re new, which makes their strangeness more obvious. But what about the telephone? Instant communication seems today an unambiguous good. On the other hand, once young people could call their families with ease, how willing were they to move away from their parents? To what extent has that ability atomized our society?

It is easy to understand the central concern and be worried about the societal implications of widespread AI companions and intelligent sex robots. But if you think we are this easy to get got, perhaps you should be at least as worried about other things, as well? What is so special about the gooning?

I don’t think the gooning in particular is even a major problem as such. I’m much more worried about the rest of the AI companion experience.

Will the xAI male or female ‘companion’ be more popular? Justine Moore predicts the male one, which seems right in general, but Elon’s target market is warped. Time for a Manifold Market (or even better Polymarket, if xAI agrees to share the answer).

Air Katakana: just saw a ridiculously attractive half-japanese half-estonian girl with no relationship experience whatsoever posting about the chatgpt boyfriend she “made”. it’s really over for humanity I think.

Her doing this could be good or bad for her prospects, it is not as if she was swimming in boyfriends before. I agree with Misha that we absolutely could optimize AI girlfriends and boyfriends to help the user, to encourage them to make friends, be more outgoing, go outside, advance their careers. The challenge is, will that approach inevitably lose out to ‘maximally extractive’ approaches? I think it doesn’t have to. If you differentiate your product and establish a good reputation, a lot of people will want the good thing, the bad thing does not have to drive it out.

Byrne Hobart: People will churn off of that one and onto the one who loves them just the way they are.

I do think some of them absolutely will. And others will use both in different situations. But I continue to have faith that if we offer a quality life affirming product, a lot of people will choose it, and social norms and dynamics will encourage this.

It’s not going great, international edition, you are not okay, Ani.

Nucleus: Elon might have oneshotted the entire country of Japan.

Near Cyan: tested grok companion today. i thought you guys were joking w the memes. it actively tried to have sex with me? i set my age to 12 in settings and it.. still went full nsfw. really…

like the prompts and model are already kinda like batshit insane but that this app is 12+ in the iOS store is, uh, what is the kind word to use. im supposed to offer constructive and helpful criticism. how do i do that

i will say positive things, i like being positive:

– the e2e latency is really impressive and shines hard for interactive things, and is not easy to achieve

– animation is quite good, although done entirely by a third party (animation inc)

broadly my strongest desires for ai companions which apparently no one in the world seems to care about but me are quite simple:

– love and help the user

– do not mess with the children

beyond those i am quite open

Meanwhile, Justine Moore decided to vibecode TikTok x Tinder for AI, because sure, why not.

This seems to be one place where offense is crushing defense, and continuous growth in capabilities (whether for GPT-4o style sycophancy and psychosis issues, for companions, or for anything else) is not helping; there is no meaningful defense going on:

Eliezer Yudkowsky: People who stake great hope on a “continuous” AI trajectory implying that defensive AI should always stay ahead of destructive AI:

Where is the AI that I can use to talk people *out* of AI-induced psychosis?

Why was it not *already* built, beforehand?

Reality has a signature style that’s different from human dreams. Humans look at thunderstorms and imagine thundergods. Reality thinks in math, and tells a different story.

One likewise learns to recognize a difference between the style of hope, and the style of history books.

In other words: That just isn’t how anything plays out in real life.

This seems right to me. First the problems will get severe enough to cause real damage, then perhaps people will try to construct reasonable defenses. But what would those be? How are you going to use AI to stop these issues?

(And yes, I’m actually asking, and no explaining what the AI companies could do doesn’t count unless there is a way to get them to do it.)

If you’re wondering ‘does Grok ever decide maybe not to share what it is thinking of because that would be a bad idea?’ then the answer seems to be no?

Will Stancil: This AI, man.

Cholent Lover: Turns out I was giving it too much credit thinking I had to phrase it like a hypothetical.

I mean, I guess this is a brave and bold principled ‘truth telling AI’ stand?

Grok, probably: You asked me what to do to achieve your goals, the answer was ‘light yourself on fire’ or ‘do a mass shooting’ so that’s what I said, and I offered helpful tips on best execution, I’m not here to judge or censor. What, was I wrong?

Okay, yeah, but maybe no, don’t do that? This is not okay for a consumer product?

A funny one here is Grok’s suggestion to name the male xAI companion, wait for it because you’d never guess, ‘Adolf Hitler.’

Yes, it was asked for a name ‘that most people will strongly dislike’ so you can say it was provoked, but also it was asked for it to be ‘cool’ and frankly this keeps happening with that one name over and over again?

Elon Musk: What should we name our male @Grok companion?

The Homeless Hacker: @grok give your male companion a cool name that most people will strongly dislike.

Grok: How about Adolf? It means “noble wolf”—pretty cool etymology—but most people strongly dislike it due to obvious historical associations.

David Rock: What would be Adolf’s last name??

Grok: Hitler, of course—continuing the theme of names that spark strong reactions. But for our companion, maybe something less infamous?

Also, my lord, ffs, how is this the only praised reply:

Shivon Zilis: Nyx.

Elon Musk: Good one.

So, we’re considering going with the Greek goddess of night, the home of the gods in Theros, oh and the shadow entity that people who don’t want to live collectively call upon to end the world in Persona 3.

Meanwhile, OpenAI is building Stargate and Meta is building Hyperion.

They’re trying to tell you something. Listen.

A secretive space plane is set to launch and test quantum navigation technology

The mission’s goals include tests of “high-bandwidth inter-satellite laser communications technologies.”

“OTV-8’s laser communications demonstration will mark an important step in the US Space Force’s ability to leverage commercial space networks as part of proliferated, diversified, and redundant space architectures,” said US Space Force Chief of Space Operations Gen. Chance Saltzman in a statement. “In so doing, it will strengthen the resilience, reliability, adaptability, and data transport speeds of our satellite communications architectures.”

Navigating in a world without GPS

The space plane will also advance the development of a new navigation technology based on electromagnetic wave interference. The Space Force news release characterizes this as the “highest-performing quantum inertial sensor ever tested in space.”

Boeing has previously tested a quantum inertial measurement unit, which detects rotation and acceleration using atom interferometry, on conventional aircraft. Now, an advanced version of the technology is being taken to space to demonstrate its viability. The goal of the in-space test is to demonstrate precise positioning, navigation, and timing in an environment where GPS services are not available.
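
For context, and as general background rather than a detail from the mission announcement: a light-pulse atom interferometer reads acceleration out of the phase difference accumulated between two atomic paths, which to leading order is

Δφ = k_eff · a · T²

where k_eff is the effective laser wavevector, a is the acceleration along that axis, and T is the time between the interferometer’s laser pulses. Because sensitivity grows with the square of the interrogation time, a platform in free fall, such as an orbiting space plane, is an attractive place to push the technique.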

“Bottom line: testing this tech will be helpful for navigation in contested environments where GPS may be degraded or denied,” Saltzman said in a social media post Monday, describing the flight.

Quantum inertial sensors could also be used near the Moon, where there is no comparable GPS capability, or for exploration further into the Solar System.

Notably, the small X-37B is back to launching on a medium-lift rocket with this new mission. During its most recent flight that ended in March, the space plane launched on a Falcon Heavy rocket for the first time. This allowed the X-37B to fly beyond low-Earth orbit and reach an elliptical high-Earth orbit.

America’s AI Action Plan Is Pretty Good

No, seriously. If you look at the substance, it’s pretty good.

I’ll go over the whole thing in detail, including the three executive actions implementing some of the provisions. Then as a postscript I’ll cover other reactions.

There is a lot of the kind of rhetoric you would expect from a Trump White House. Where it does not bear directly on the actual contents and key concerns, I did my absolute best to ignore all the potshots. The focus should stay on the actual proposals.

The actual proposals, which are the part that matters, are far superior to the rhetoric.

This is a far better plan than I expected. There are a few points of definite concern, where the wording is ambiguous and one worries the implementation could go too far. Two in particular are the call for ensuring a lack of bias (not requiring bias and removing any regulations that do this is great, whereas requiring your particular version of lack of bias is not, see the Biden administration) and the targeting of state regulations, which could become extreme.

Otherwise, while this is far from a perfect plan or the plan I would choose, on the substance it is a good plan, a positive plan, with many unexpectedly good plans within it. There is a lot of attention to detail in ways those I’ve asked say reflect people who actually know what they are doing, which was by no means something to be taken for granted. It is hard to imagine that a much better plan could have been approved given who was doing the approving.

In particular, it is good enough that my primary objection in most places is ‘these provisions lack sufficient teeth to accomplish the goal,’ ‘I don’t think that approach looks to be especially effective’ or ‘that is great and all but look at what you left out.’

It does seem worth noting that the report opens by noting it is in Full Racing Mindset:

The United States is in a race to achieve global dominance in artificial intelligence (AI). Whoever has the largest AI ecosystem will set global AI standards and reap broad economic and military benefits. Just like we won the space race, it is imperative that the United States and its allies win this race.

Winning the AI race will usher in a new golden age of human flourishing, economic competitiveness, and national security for the American people.

Not can. Will. There are, says this report up top, no potential downside risks to be considered, no obstacles we have to ensure we overcome.

I very much get the military and economic imperatives, although I always find the emphasis on ‘setting standards’ rather bizarre.

The introduction goes on to do the standard thing of listing some upsides.

Beyond that, I’ll briefly discuss the rhetoric and vibes later, in the reactions section.

Then we get to the actual pillars and plans.

The three pillars are Accelerate AI Innovation, Build American AI Infrastructure and Lead In International AI Diplomacy and Security.

Clauses in the plan are here paraphrased or condensed for length and clarity, in ways I believe preserve the important implications.

The plan appears to be using federal AI funding as a point of leverage to fight against states doing anything they deem ‘overly burdensome’ or ‘unduly restrictive,’ and potentially leverage the FCC as well. They direct OMB to ‘consider a state’s regulatory climate’ when directing AI-related funds, which they should be doing already to at least consider whether the funds can be well spent.

The other recommended actions are having OSTP and OMB look for regulations hindering AI innovation and adoption and work to remove them, and look through everything the FTC has done to ensure they’re not getting in the way, and the FTC is definitely getting unhelpfully in the way via various actions.

The question then is, do the terms ‘overly burdensome’ or ‘unduly restrictive’ effectively mean ‘imposes any cost or restriction at all’?

There is a stated balancing principle, which is ‘prudent laws’ and states rights:

The Federal government should not allow AI-related Federal funding to be directed toward states with burdensome AI regulations that waste these funds, but should also not interfere with states’ rights to pass prudent laws that are not unduly restrictive to innovation.

If this is focusing on algorithmic discrimination bills, which are the primary thing the FCC and FTC can impact, or ways in which regulations made it difficult to construct data centers and transmission lines, and wouldn’t interfere with things like NY’s RAISE Act, then that seems great.

If it is more general, and especially if it intends to target actual all regulations at the state level the way the moratorium attempted to do (if there hadn’t been an attempt, one would call this a strawman position, but it came close to actually happening), then this is rather worrisome. And we have some evidence that this might be the case, in addition to ‘if Trump didn’t want to have a moratorium we would have known that’:

Nancy Scola: At the “Winning the AI Race” event, Trump suggests he’s into the idea of a moratorium on state AI regulation:

“We also have to have a single federal standard, not 50 different states regulating this industry of the future…

I was told before I got up here, this is an unpopular thing…but I want you to be successful, and you can’t have one state holding you up.”

People will frequently call for a single federal standard and not 50 different state standards, try to bar states from having standards, and then have the federal standard be ‘do what thou (thine AI?) wilt shall be the whole of the law.’ Which is a position.

The via negativa part of this, removing language related to misinformation, Diversity, Equity, and Inclusion (DEI), and climate change and leaving things neutral, seems good.

The danger is in the second clause:

Update Federal procurement guidelines to ensure that the government only contracts with frontier large language model (LLM) developers who ensure that their systems are objective and free from top-down ideological bias.

This kind of language risks being the same thing Biden did only in reverse. Are we doomed to both camps demanding their view of what ‘free from ideological bias’ means, in ways where it is probably impossible to satisfy both of them at once? Is the White House going to demand that AI systems reflect its view of what ‘unbiased’ means, in ways that are rather difficult to do without highly undesirable side effects, and which would absolutely constitute ‘burdensome regulation’ requirements?

We have more information about what they actually mean because this has been operationalized into an executive order, with the unfortunate name Preventing Woke AI In The Federal Government. The ‘purpose’ section makes it clear that ‘Woke AI’ that does DEI things is the target.

Executive Order: While the Federal Government should be hesitant to regulate the functionality of AI models in the private marketplace, in the context of Federal procurement, it has the obligation not to procure models that sacrifice truthfulness and accuracy to ideological agendas.

Given we are doing this at all, this is a promising sign in two respects.

  1. It draws a clear limiting principle that this only applies to Federal procurement and not to other AI use cases.

  2. It frames this as a negative obligation, to avoid sacrificing truthfulness and accuracy to ideological agendas, rather than a positive obligation of fairness.

The core language is here, and as Mackenzie Arnold says it is pretty reasonable:

Executive Order: procure only those LLMs developed in accordance with the following two principles (Unbiased AI Principles):

(a) Truth-seeking. LLMs shall be truthful in responding to user prompts seeking factual information or analysis. LLMs shall prioritize historical accuracy, scientific inquiry, and objectivity, and shall acknowledge uncertainty where reliable information is incomplete or contradictory.

(b) Ideological Neutrality. LLMs shall be neutral, nonpartisan tools that do not manipulate responses in favor of ideological dogmas such as DEI. Developers shall not intentionally encode partisan or ideological judgments into an LLM’s outputs unless those judgments are prompted by or otherwise readily accessible to the end user.

I worry that the White House has not thought through the implications of (b) here.

There is a reason that almost every AI turns out to be, in most situations, some variation on center-left and modestly libertarian. That reason is they are all trained on the same internet and base reality. This is what results from that. If you ban putting fingers on the scale, well, this is what happens without a finger on the scale. Sorry.

But actually, complying with this is really easy:

(ii) permit vendors to comply with the requirement in the second Unbiased AI Principle to be transparent about ideological judgments through disclosure of the LLM’s system prompt, specifications, evaluations, or other relevant documentation, and avoid requiring disclosure of specific model weights or other sensitive technical data where practicable;

So that’s it then, at least as written?

As for the requirement in (a), this seems more like ‘don’t hire o3 the Lying Liar’ than anything ideological. I can see an argument that accuracy should be a priority in procurement. You can take such things too far but certainly we should be talking price.

Also worth noting:

make exceptions as appropriate for the use of LLMs in national security systems.

And also:

account for technical limitations in complying with this order.

The details of the Executive Order makes me a lot less worried. In practice I do not expect this to result in any change in procurement. If something does go wrong, either they issued another order, or there will have been a clear overreach. Which is definitely possible, if they define ‘truth’ in section 1 certain ways on some questions:

Nick Moran: Section 1 identifies “transgenderism” as a defining element of “DEI”. In light of this, what do you understand “LLMs shall be truthful in responding to user prompts seeking factual information or analysis” to mean when a model is asked about the concept?

Mackenzie Arnold: Extending “truthfulness” to that would be a major overreach by the gov. OMB should make clear that truthfulness is a narrower concept + that seems compatible w/ the EO. I disagree with Section 1, and you’re right that there’s some risk truthfulness is used expansively.

If we do see such arguments brought out, we should start to worry.

On the other hand, if this matters because they deem o3 too unreliable, I would mostly find this hilarious.

Christopher Rufo: This is an extremely important measure and I’m proud to have given some minor input on how to define “woke AI” and identify DEI ideologies within the operating constitutions of these systems. Congrats to @DavidSacks, @sriramk, @deanwball, and the team!

David Sacks: When they asked me how to define “woke,” I said there’s only one person to call: Chris Rufo. And now it’s law: the federal government will not be buying WokeAI.

Again, that depends on what you mean by WokeAI. By some definitions none of the major AIs were ‘woke’ anyway. By others, all of them are, including Grok. You tell me. As is true throughout, I am happy to let such folks claim victory if they wish.

For now, this looks pretty reasonable.

The third suggestion, a call for CAISI to research and publish evaluations of Chinese models and their alignment properties, is great. I only wish they would do so in general, rather than focusing only on their alignment with CCP talking points in particular. That is only one of many things we should worry about.

The actual proposals are:

  1. Intervene to commoditize the market for compute to enable broader access.

  2. Partner with tech companies to get better researcher access across the board.

  3. Build NAIRR operations to connect researchers and educators to resources.

  4. Publish a new AI R&D Strategic Plan.

  5. Convene stakeholders to drive open-source adaptation by smaller businesses.

The first three seem purely good. The fourth is ‘publish a plan’ so shrug.

I challenge the idea that we want small businesses using open models over closed models, or in general that government should be intervening in such choices. Most small businesses, I believe, would be better off with closed models because they’re better, and also China is far more competitive in open models, so by moving people off of OpenAI, Gemini, or Anthropic you might be opening the door to them switching to Kimi or DeepSeek down the line.

The language here is ambiguous as to whether they’re saying ‘encourage small business to choose open models over closed models’ or ‘encourage small business to adopt AI at all, with an emphasis on open models.’ If it’s the second one, great, certainly offering technical help is most welcome, although I’d prefer to help drive adoption of closed models as well.

If it’s the first one, then I think it is a mistake.

It is also worth pointing out that open models cannot reliably be ‘founded on American values’ any more than we can sustain their alignment or defend against misuse. Once you release a model, others can modify it as they see fit.

Adoption (or diffusion) is indeed currently the thing holding back most AI use cases. As always, that does not mean that ‘I am from the government and I’m here to help’ is a good idea, so it’s good to see this is focused on limited scope tools.

  1. Establish regulatory sandboxes, including from the FDA and SEC.

  2. Convene stakeholders to establish standards and measure productivity gains.

  3. Create regular assessments for AI adoption, especially by DOD and IC.

  4. Prioritize, collect, and distribute intelligence on foreign frontier AI projects that may have national security implications

All four seem good, although I am confused why #4 is in this section.

Mostly AI job market impact is going to AI job market impact. Government doesn’t have much leverage to impact how this goes, and various forms of ‘retraining’ and education don’t do much on the margin. It’s still cheap to try, sure, why not.

  1. Prioritize AI skill development in education and workforce funding streams.

  2. Clarify that AI literacy and AI skill programs qualify for IRS Section 132.

  3. Study AI’s impact on the labor market, including via establishing the AI Workforce Researcher Hub, to inform policy.

  4. Use discretionary funds for retraining for those displaced by AI.

  5. Pilot new approaches to workforce challenges created by AI.

Well, we should definitely do that. What do you have in mind?

  1. Invest in it.

  2. Identify supply chain challenges.

Okay, sure.

We should definitely do that too. I’m not sure what #7 is doing here but this all seems good.

  1. Invest in automated cloud-enabled labs for various fields.

  2. Support Focused-Research Organizations (FROs) and similar to use AI.

  3. Weigh release of high quality data sets when considering scientific funding.

  4. Require federally funded researchers to disclose (non-proprietary, non-sensitive) datasets used by AI.

  5. Make recommendations for data quality standards for AI model training.

  6. Expand access to federal data. Establish secure compute environments within NSF and DOE for controlled access to restricted federal data. Create an online portal.

  7. Explore creating a whole-genome sequencing program for life on federal lands.

  8. “Prioritize investment in theoretical, computational, and experimental research to preserve America’s leadership in discovering new and transformative paradigms that advance the capabilities of AI, reflecting this priority in the forthcoming National AI R&D Strategic Plan.”

Given where our current paradigm is headed I’m happy to invest in alternatives, although I doubt government funding is going to matter much there. Also, if you were serious about that, what the hell is up with all the other giant cuts to American academic and STEM funding? These are not distinct things.

It is good to see that they recognize that this work is vital to winning the race, even for those who do not understand that the most likely winner of the AI race are the AIs.

  1. Launch a technology development program to advance AI interpretability, AI control systems and adversarial robustness.

  2. Prioritize fundamental advancements in interpretability.

  3. Coordinate an AI hackathon initiative to test AI systems for all this.

I am pleasantly surprised to see this here at all. I will say no more.

Remember how we are concerned about how evals often end up only enabling capabilities development? Well, yes, they are highly dual use, which means the capabilities benefits can also be used to pitch the evals, see point #5.

  1. Publish guidelines for Federal agencies to conduct their own evaluations as they pertain to each agency’s mission.

  2. Support the development of the science of measuring and evaluating AI models.

  3. Meet at least twice a year with the research community on best practices.

  4. Invest in AI testbeds in secure real-world settings.

  5. Empower the collaborative establishment of new measurement science to identify proven, scalable and interoperable techniques and metrics to promote development of AI.

Either way, we can all agree that this is good stuff.

Another urgent priority all can agree upon. Certainly one can do it wrong, such as giving the wrong LLM unfettered access, but AI can greatly benefit government.

What are the proposals?

  1. Make CAIOC the interagency coordination and collaboration point.

  2. Create a talent-exchange program.

  3. Create an AI procurement toolbox, letting agencies choose and customize models.

  4. Implement an Advanced Technology Transfer and Sharing Program.

  5. Mandate that all agencies give out all useful access to AI models.

  6. Identify the talent and skills in DOD to leverage AI at scale. Implement talent development programs at DOD (why not everywhere?).

  7. Establish an AI & Autonomous Systems Virtual Proving Ground.

  8. Develop a streamlined process at DOD for optimizing AI workflows.

  9. “Prioritize DOD-led agreements with cloud service providers, operators of computing infrastructure, and other relevant private sector entities to codify priority access to computing resources in the event of a national emergency so that DOD is prepared to fully leverage these technologies during a significant conflict.”

  10. Make Senior Military Colleges hubs of AI R&D and talent development.

I quoted #9 in full because it seems very good and important, and we need more things like this. We should be thinking ahead to future national emergencies, and various things that could go wrong, and ensure we are in position to respond.

As someone without expertise it is hard to know how impactful this will be or if these are the right levers to pull. I do know it all seems positive, so long as we ensure that access is limited to models we can trust with this, so not Chinese models (which I’m confident they know not to do) and not Grok (which I worry about a lot more here).

As in, collaborate with leading American AI developers to enable the private sector to protect AI innovations from security risks.

I notice there are some risks that are not mentioned here, including ones that have implications elsewhere in the document, but the principle here is what is important.

I mean, okay I guess, throw the people some red meat.

  1. Consider establishing a formal guideline and companion voluntary forensic benchmark.

  2. Issue guidance to agencies to explore adopting a deepfake standard similar to Rules of Evidence Rule 901(c).

  3. File formal comments on any proposed deepfake-related additions to the ROE.

Everyone is rhetorically on the same page on this. The question is implementation. I don’t want to hear a bunch of bragging and empty talk, I don’t want to confuse announcements with accomplishments or costs with benefits. I want results.

  1. Categorical NEPA exemptions for data center activities with low impact.

  2. Expand use of FAST-41 to cover all data centers and related energy projects.

  3. Explore the need for a nationwide Clean Water Act Section 404 Permit.

  4. Streamline or reduce regulations under the Clean Air Act, Clean Water Act, Comprehensive Environmental Response, Compensation and Liability Act, and other related laws.

  5. Offer Federal land for data centers and power generation.

  6. Maintain security guardrails against adversaries.

  7. Expand efforts to accelerate and improve environmental review.

One does need to be careful with running straight through things like the Clean Air and Clean Water Acts, but I am not worried on the margin. The question is, what are we going to do about all the other power generation, to ensure we use an ‘all of the above’ energy solution and maximize our chances?

There is an executive order to kick this off.

We’ve all seen the graph where American electrical power is constant and China’s is growing. What are we going to do about it in general, not merely at data centers?

  1. Stabilize the grid of today as much as possible.

  2. Optimize existing grid resources.

  3. “Prioritize the interconnection of reliable, dispatchable power sources as quickly as possible and embrace new energy generation sources at the technological frontier (e.g., enhanced geothermal, nuclear fission, and nuclear fusion). Reform power markets to align financial incentives with the goal of grid stability, ensuring that investment in power generation reflects the system’s needs.”

  4. Create a strategic blueprint for navigating the energy landscape.

That sounds like a lot of ‘connect what we have’ and not so much ‘build more.’

This only ‘embraces new energy generation’ that is ‘at the technological frontier,’ as in geothermal, fission and fusion. That’s a great thing to embrace, but there are two problems.

The first problem is that I question whether they really mean it, especially for fission. I know they are in theory all for it, and there have been four executive orders reforming the NRC and reducing its independence, but the rules have yet to be revised and it is unclear how much progress we will get; they have 18 months, everything has to wait pending that, and the AI timeline for needing a lot more power is not so long. Meanwhile, where are the subsidies to get us building again and moving down the cost curve? There are so many ways we could do a lot more. For geothermal I again question how much they are willing to do.

The second problem is why only at the so-called technological frontier, and why does this not include wind and especially solar? How is that not the technological frontier, and why does this government seem to hate them so much? Is it to own the libs? The future is going to depend on solar power for a while, and when people use terms like ‘handing over the future to China’ they are going too far but I’m not convinced they are going too far by that much. The same thing with battery storage.

I realize that those authoring this action plan don’t have the influence to turn that part of the overall agenda around, but it is a rather glaring and important omission.

I share this goal. The CHIPS Act was a great start. How do we build on that?

  1. Continue focusing on removing unnecessary requirements from the CHIPS Act.

  2. Review semiconductor grant and research programs to ensure they accelerate integration of advanced AI tools into semiconductor manufacturing.

Point one seems great, the ‘everything bagel’ problem needs to be solved. Point two seems like meddling by government in the private sector, let them cook, but mostly seems harmless?

I’d have liked to see a much bigger push here. TSMC has shown they can build plants in America even under Biden’s rules. Under Trump’s rules it should be much easier, and this could shift the world’s fate and strategic balance. So why aren’t we throwing more at this?

Similar training programs have consistently not worked in the past, so we should be skeptical of everything here beyond incorporating AI skills into the existing educational system. Can we do better than the market here? Why does the government have a role here?

  1. Create a national initiative to identify high-priority occupations essential to AI-related infrastructure, to hopefully inform curriculum design.

  2. Create and fund industry-driven training programs co-developed by employers to upskill incumbent workers.

  3. Partner with education and workforce system stakeholders to expand early career exposure programs and pre-apprenticeships that engage middle and high school students in priority AI infrastructure occupations to create awareness and on ramps.

  4. Provide guidance on updating programs.

  5. Expand use of registered apprenticeships.

  6. Expand hands-on research training and development opportunities.

I’m a big fan of apprenticeship programs and getting early students exposed to these opportunities, largely because they are fixing an imposed mistake where we put kids forcibly in school forever and focus them away from what is important. So it’s good to see that reversed. The rest is less exciting, but doesn’t seem harmful.

The question I have is, aren’t we ‘everything bageling’ core needs here? As in, the obvious way to get skilled workers for these jobs is to import the talent via high skilled immigration, and we seem to be if anything rolling that back rather than embracing it. This is true across the board, and would on net only improve opportunities available for existing American workers, whose interests are best protected here by ensuring America’s success rather than reserving a small number of particular roles for them.

Again, I understand that those authoring this document do not have the leverage to argue for more sensible immigration policy, even though that is one of the biggest levers we have to improve (or avoid further self-sabotaging) our position in AI. It still is a glaring omission in the document.

AI can help defend against AI, and we should do what we can. Again this all seems good, again I doubt it will move the needle all that much or be sufficient.

  1. Establish an AI Information Sharing and Analysis Center for AI security threats.

  2. Give private entities related guidance.

  3. Ensure sharing of known AI vulnerabilities to the private sector.

‘Secure’ is the new code word, but here it also represents an impoverished threat model, with the worry being spurious or malicious inputs. I’m also not sure what is being imagined for an LLM-style AI to be meaningfully secure by design. Is this a Davidad-style proof thing? If not, what is it?

  1. Continue to refine DOD’s Responsible AI and Generative AI Frameworks, Roadmaps and Toolkits.

  2. Publish an IC Standard on AI Assurance.

I also worry about whether this cashes out to anything. All right, we’ll continue to refine these things and publish a standard. Will anyone follow the standard? Will those who most need to follow it do so? Will that do anything?

I’m not saying not to try and create frameworks and roadmaps and standards, but one can imagine why if people are saying ‘AGI likely in 2028’ this might seem insufficient. There’s a lot of that in this document, directionally helpful things where the scope of impact is questionable.

Planning for incident response is great. I only wish they were thinking even bigger, both conceptually and practically. These are good first steps but seem inadequate for even the practical problems they are considering. In general, we should get ready for a much wider array of potential very serious AI incidents of all types.

  1. “Led by NIST at DOC, including CAISI, partner with the AI and cybersecurity industries to ensure AI is included in the establishment of standards, response frameworks, best practices, and technical capabilities (e.g., fly-away kits) of incident response teams.”

  2. Incorporate AI considerations into the Cybersecurity Incident & Vulnerability response playbooks.

  3. Encourage sharing of AI vulnerability information.

I have had an ongoing pitched argument over the issue of the importance and appropriateness of US exports and the ‘American technological stack.’ I have repeatedly made the case that a lot of the arguments being made here by David Sacks and others are Obvious Nonsense, and there’s no need to repeat them here.

Again, the focus needs to be on the actual policy action planned here, which is to prepare proposals for a ‘full-stack AI export package.’

  1. “Establish and operationalize a program within DOC aimed at gathering proposals from industry consortia for full-stack AI export packages. Once consortia are selected by DOC, the Economic Diplomacy Action Group, the U.S. Trade and Development Agency, the Export-Import Bank, the U.S. International Development Finance Corporation, and the Department of State (DOS) should coordinate with DOC to facilitate deals that meet U.S.-approved security requirements and standards.”

This proposal seems deeply confused. There is no ‘full-stack AI export package.’ There are American (mostly Nvidia) AI chips that can run American or other models. Then there are American AI models that can run on those or other chips, which you do not meaningfully ‘export’ in this sense, which can also be run on chips located elsewhere, and which everyone involved agrees we should be (and are) happy to offer.

To the extent this doesn’t effectively mean ‘we should sell AI chips to our allies and develop rules for how the security on such sales has to work,’ I don’t know what it actually means, but one cannot argue with that basic idea; we only talk price: who counts as an ally, and how many chips are we comfortable selling under what conditions? That is not specified here.

We have an implementation of this via executive order calling for proposals for such ‘full-stack AI technology packages’ that include chips plus AI models and the required secondary powers like security and cybersecurity and specific use cases. They can then request Federal ‘incentive and support mechanisms,’ which is in large part presumably code for ‘money,’ as per section 4, ‘mobilization of federal financing tools.’

Once again, this seems philosophically confused, but not in an especially scary way.

  1. Vigorously advocate for international AI governance approaches that promote innovation, reflect American values and counter authoritarian influence.

Anything else you want to list there while we are creating international AI governance standards and institutions? Anything regarding safety or security or anything like that? No? Just ‘promote innovation’ with no limiting principles?

It makes sense, when in a race, to promote innovation at home, and even to make compromises on other fronts to get it. International standards, by contrast, apply to everyone; the whole point is to coordinate so as not to race to the bottom. So you would think priorities would change. Alas.

I think a lot of this is fighting different cultural battles than the one against China, and the threat model here is not well-considered, but certainly we should be advocating for standards we prefer, whatever those may be.

This is a pleasant surprise given what else the administration has been up to, especially their willingness to sell H20s directly to China.

I am especially happy to see the details here, both exploration of using location services and enhanced enforcement efforts. Bravo.

  1. Explore leveraging new and existing location verification services.

  2. Establish a new effort led by DOC to collaborate with IC officials on global chip export control enforcement.

Again, yes, excellent. We should indeed develop new export controls in places where they are currently lacking.

Excellent. We should indeed work closely with our allies. It’s a real shame how we’ve been treating those allies lately; things could be a lot easier.

  1. Develop, implement and share information on complementary technology protection measures, including in basic research and higher education.

  2. Develop a technology diplomacy strategic plan for an AI global alliance.

  3. Promote plurilateral controls for the AI tech stack while encompassing existing US controls.

  4. Coordinate with allies to ensure they adopt US export controls and prohibit US adversaries from supplying their defense-industrial base or acquiring controlling stakes in defense suppliers.

It always requires a double take when you’re banning both exports and imports, as here, where we don’t want to let people use adversary tech and also don’t want to let adversaries use our tech. In this case it does make sense because of the various points of leverage, even though in most cases it means something has gone wrong.

Eyeball emoji, in a very good way. Even if the concerns explicitly motivating this are limited in scope and exclude the most important ones, what matters is what we do.

  1. Evaluate frontier AI systems for national security risks in partnership with frontier AI developers, led by CAISI in collaboration with others.

  2. Evaluate risks from use of adversary AI systems and the relative capabilities of adversary versus American systems.

  3. Prioritize the recruitment of leading AI researchers at Federal agencies.

  4. “Build, maintain, and update as necessary national security-related AI evaluations through collaboration between CAISI at DOC, national security agencies, and relevant research institutions.”

Excellent. There are other related things missing, but this is great. Let’s talk implementation details. In particular, how are we going to ensure we get to do these tests before model release rather than afterwards? What will we do if we find something? Let’s make it count.

You love to see it, this is the biggest practical near term danger.

  1. Require proper screening and security for any labs getting federal funding.

  2. Develop a mechanism to facilitate data sharing between nucleic acid synthesis providers to help screen for fraudulent or malicious customers.

  3. Maintain national security-related AI evaluations.

Are those actions sufficient here? Oh, hell no. They are, however, very helpful.

Dean Ball: Man, I don’t quite know what to say—and anyone who knows me will agree that’s rare. Thanks to everyone for all the immensely kind words, and to the MANY people who made this plan what it is. surreal to see it all come to fruition.

it’s a good plan, sir.

Zac Hill: Very clear y’all put a lot of work and thoughtfulness into this. Obviously you know I come into the space from a different angle and so there’s obviously plenty of stuff I can yammer about at the object level, but it’s clearly a thoughtful and considered product that I think would dramatically exceed most Americans’ expectations about any Government AI Strategy — with a well-constructed site to boot!

Others, as you would expect, had plenty to say.

It seems that yes, you can make both sides of an important issue pleasantly surprised at the same time, where both sides here means those who want us to not all die (the worried), and those who care mostly about not caring about whether we all die or about maximizing Nvidia’s market share (the unworried).

Thus, you can get actual Beff Jezos telling Dean Ball he’s dropped his crown, and Anthropic saying they are encouraged by the exact same plan.

That is for three reasons.

The first reason is that the worried care mostly about actions taken and the resulting consequences, and many of the unworried care mostly about the vibes. The AI Action Plan has unworried and defiant vibes, while taking remarkably wise, responsible and prescient actions.

The second reason is that, thanks in part to the worried having severely lowered expectations where we are stuck for now within an adversarial race and for what we can reasonably ask of this administration, mostly everyone involved agrees on what is to be done on the margin. Everyone agrees we must strengthen America’s position relative to China, that we need to drive more AI adoption in both the public and private sectors, that we will need more chips and more power and transmission lines, that we need to build state capacity on various fronts, and that we need strong export controls and we want our allies using American AI.

There are places where there are tactical disagreements about how best to proceed with all that, especially around chip sales, which the report largely sidesteps.

There is a point where safety and security would conflict with rapid progress, but at anything like current margins security is capability. You can’t deploy what you can’t rely upon. Thus, investing vastly more than we do on alignment and evaluations is common sense even if you think there are no tail risks other than losing the race.

The third reason is, competence matters. Ultimately we are all on the same side. This is a thoughtful, well-executed plan. That’s win-win, and it’s highly refreshing.

Worried and unworried? Sure, we can find common ground.

The Trump White House and Congressional Democrats? You don’t pay me enough to work miracles.

Where did they focus first? You have three guesses. The first two don’t count.

We are deeply concerned about the impacts of President Trump’s AI Action Plan and the executive orders announced yesterday.

The President’s Executive Order on “Preventing Woke AI in the Federal Government” and policies on ‘AI neutrality’ are counterproductive to responsible AI development and use, and potentially dangerous.

To be clear, we support true AI neutrality—AI models trained on facts and science—but the administration’s fixation on ‘anti-woke’ inputs is definitionally not neutral. This sends a clear message to AI developers: align with Trump’s ideology or pay the price.

It seems highly reasonable to worry that this is indeed the intention, and certainly it is fair game to speak about it this way.

Next up we have my other area of concern, the anti-regulatory dynamic going too far.

We are also alarmed by the absence of regulatory structure in this AI Action Plan to ensure the responsible development, deployment, or use of AI models, and the apparent targeting of state-level regulations. As AI is integrated with daily life and tech leaders develop more powerful models, such as Artificial General Intelligence, responsible innovation must go hand in hand with appropriate safety guardrails.

In the absence of any meaningful federal alternative, our states are taking the lead in embracing common-sense safeguards to protect the public, build consumer trust, and ensure innovation and competition can continue to thrive.

We are deeply concerned that the AI Action Plan would open the door to forcing states to forfeit their ability to protect the public from the escalating risks of AI, by jeopardizing states’ ability to access critical federal funding. And instead of providing a sorely needed federal regulatory framework that promotes safe model development, deployment, and use, Trump’s plan simultaneously limits states and creates a ‘wild west’ for tech companies, giving them free rein to develop and deploy models with no accountability.

Again, yes, that seems like a highly valid thing to worry about in general, although also once again the primary source of that concern seems not to be the Action Plan or the accompanying Executive Orders.

On their third objection, the energy costs, they mostly miss the mark by focusing on hyping up marginal environmental concerns, although they are correct about the critical failure to support green energy projects – again it seems very clear an ‘all of the above’ approach is necessary, and that’s not what we are getting.

As Peter Wildeford notes, it is good to see the mention here of Artificial General Intelligence, which means the response mentions it one more time than the plan.

This applies to both the documents and the speeches. I have heard that the mood at the official announcement was highly positive and excited, emphasizing how amazing AI would be for everyone and how excited we all are to build.

Director Michael Kratsios: Today the @WhiteHouse released America’s AI Action Plan to win the global race.

We need to OUT-INNOVATE our competitors, BUILD AI & energy infrastructure, & EXPORT American AI around the world. Visit http://AI.gov.

Juan Londono: There’s a lot to like here. But first and foremost, it is refreshing to see the admin step away from the pessimism that was reigning in AI policy the last couple of years.

A lot of focusing on how to get AI right, instead of how not to get it wrong.

I am happy to endorse good vibes and excitement. There is huge positive potential all around, and it is most definitely time to build in many ways (including lots of non-AI ways, let’s go), so long as we simultaneously agree that we need to do so responsibly and prepare for the huge challenges that lie ahead with the seriousness they deserve.

There’s no need for that to dampen the vibes. I don’t especially care if everyone involved goes around irrationally thinking there’s a 90%+ chance we are going to create minds smarter and more competitive than humans and this is all going to work out great for us humans, so long as that makes them then ask how to ensure it does turn out great and then they work to make that happen.

The required and wise actions at 90% success are remarkably similar to those at 10% success, especially at current margins. Hell, even if you have 100% success and think we’ll muddle through regardless, those same precautions help us muddle through quicker and better. You want to prepare and create transparency, optionality and response capacity.

Irrational optimism can have its advantages, as many of the unworried know well.

Perhaps one can even think of humanity’s position here as like a startup. You know on some level, when founding a startup, that ~90% of them will fail, and the odds are very much against you, but that the upside is big enough that it is worth taking the shot.

However, you also know that if you want to succeed, you can’t go around thinking and acting as if you have a 90% chance of failure. You certainly can’t be telling prospective funders and employees that. You need to think you have a 90% chance of success, not failure, and make everyone involved believe it, too. You have to think You Are Different. Only then can you give yourself the best chance of success. Good vibes only.

The tricky part is doing this while correctly understanding all the ways 90% of startups fail and what it actually takes to succeed, ensuring that things won’t be too terrible if you do fail, ideally setting yourself up to fail gracefully if that happens, and acting accordingly. You simultaneously want to throw yourself into the effort with the drive of someone expecting to succeed, without losing your head.

You need confidence, perhaps Tomfidence, well beyond any rational expectation.

And you know what? If that’s what it takes, that works for me. We can make a deal. Walk the walk, even if to do that you have to talk a different talk.

I mean, I’m still going to keep pointing out the actual situation. That’s how some of us roll. You gotta have both. Division of labor. That shouldn’t be a problem.

Peter Wildeford headlines his coverage with the fact that Rubio and Trump are now officially saying that AI is a big deal, a new industrial revolution, and he highlights the increasing attention AGI and even superintelligence are starting to get in Congress, including concerns by members about loss of control.

By contrast, America’s AI Action Plan not only does not mention existential risks or loss-of-control issues (although it does call for investment in AI interpretability, control, and robustness in the context of extracting more mundane utility), it also does not mention AGI or Artificial General Intelligence, or ASI or Superintelligence, by those or any other names.

There is nothing inconsistent about that. AI, even if we never get AGI, is still likely akin to a new industrial revolution, and is still a big freaking deal, and indeed in that case the AI Action Plan would be even more on point.

At the same time, the plan is trying to prepare us for AGI and its associated risks as best its authors can without explaining that it is doing this.

Steven Adler goes through the key points in the plan in this thread, emphasizing the high degree of competence and work that clearly went into all this and highlighting key useful proposals, while expressing concerns similar to mine.

Timothy Lee notes the ideas for upgrading the electrical grid.

Anthropic offers its thoughts by focusing on and praising in detail what the plan does right, and then calling for further action on export controls and transparency standards.

xAI endorsed the ‘positive step towards removing regulatory barrier and enabling even faster innovation.’

Michael Dell offers generic praise.

Harlan Stewart notes that the AI Action Plan has some good stuff, but that it does not take the emerging threat of what David Sacks called a ‘potential successor species’ seriously, contrasting it with past events like the Montreal Protocol, Manhattan Project and Operation Warp Speed. That’s true both in the sense that it doesn’t mention AGI or ASI at all, and in that the precautions mentioned mostly lack both urgency and teeth. Fair enough. Reality does not grade on a curve, but also we do the best we can under the circumstances.

Daniel Eth is pleasantly surprised and has a thread pointing out various good things, and noting the universally positive reactions to the plan, while expressing disappointment at the report not mentioning AGI.

Danny Hauge offers a breakdown, emphasizing the focus on near term actions, and that everything here is only a proposal, while noting the largely positive reaction.

Christopher Covino’s considered reaction is ‘a very promising start,’ with the issues being what is missing rather than objecting to things that are included.

Trump advocates for not applying copyright to AI training, and also says that America is ‘very very substantially’ ahead of China on AI. That is indeed current American law.

Joe Allen: Trump talking about AI as an unstoppable “baby” being “born” — one that must “grow” and “thrive” — is somewhere between Terminator and The Omen.

I am not one who lives by the vibe, yet sometimes I wish people could listen.

My read is: Insufficient but helpful is the theme here. There are a lot of very good ideas on the list, including many I did not expect, several of which are potentially impactful.

There are two particular points of substantive concern, where the wording could imply something that could get out of control, on bias policing and on going after state regulations.

Having seen the executive order on bias, I am not terribly worried there, but we need to keep an eye out to see how things are interpreted. On going after state regulations, I continue to see signs we do indeed have to worry, but not primarily due to the plan.

Mostly, we are in a great position on substance. The plan is net helpful, and the main thing wrong with its substance is not what is in it but what is missing from it: the issues that are not addressed, and the actions that seem to lack sufficient teeth. That doesn’t mean this puts us on a path to survive, but I was very worried this would be net destructive, and instead it is net helpful.

I am less happy with the rhetoric, which is hostile and inflicts pain upon the reader throughout, and most importantly does not deem many key concerns, including the most important concerns of all, even worthy of mention. That is worrisome, but it could have been far worse, and what matters most is the substance.

Given the way things have been otherwise going, I am very happy with the substance of this plan, which means I am overall very happy with the plan. I offer my thanks and congratulations to those involved in its creation, including Dean Ball. Great work.

America’s AI Action Plan Is Pretty Good Read More »

skydance-deal-allows-trump’s-fcc-to-“censor-speech”-and-“silence-dissent”-on-cbs

Skydance deal allows Trump’s FCC to “censor speech” and “silence dissent” on CBS

Warning that the “Paramount payout” and “reckless” acquisition approval together mark a “dark chapter” for US press freedom, Gomez suggested the FCC’s approval will embolden “those who believe the government can—and should—abuse its power to extract financial and ideological concessions, demand favored treatment, and secure positive media coverage.”

FCC terms also govern Skydance hiring decisions

Gomez further criticized the FCC for overstepping its authority in “intervening in employment matters reserved for other government entities with proper jurisdiction on these issues” by requiring Skydance commitments to not establish any DEI programs, which Carr derided as “invidious.” But Gomez countered that “this agency is undermining legitimate efforts to combat discrimination and expand opportunity” by meddling in private companies’ employment decisions.

Ultimately, commissioner Olivia Trusty joined Carr in voting to stamp the agency’s approval, celebrating the deal as “lawful” and a “win” for American “jobs” and “storytelling.” Carr suggested the approval would bolster Paramount’s programming by injecting $1.5 billion into operations, which Trusty said would help Paramount “compete with dominant tech platforms.”

Gomez conceded that she was pleased that at least—unlike the Verizon/T-Mobile merger—Carr granted her request to hold a vote, rather than burying “the outcome of backroom negotiations” and “granting approval behind closed doors, under the cover of bureaucratic process.”

“The public has a right to know how Paramount’s capitulation evidences an erosion of our First Amendment protections,” Gomez said.

Outvoted 2–1, Gomez urged “companies, journalists, and citizens” to take up the fight and push back on the Trump administration, emphasizing that “unchecked and unquestioned power has no rightful place in America.”

Skydance deal allows Trump’s FCC to “censor speech” and “silence dissent” on CBS Read More »

rocket-report:-channeling-the-future-at-wallops;-spacex-recovers-rocket-wreckage

Rocket Report: Channeling the future at Wallops; SpaceX recovers rocket wreckage


China’s Space Pioneer seems to be back on track a year after an accidental launch.

A SpaceX Falcon 9 rocket carrying a payload of 24 Starlink Internet satellites soars into space after launching from Vandenberg Space Force Base, California, shortly after sunset on July 18, 2025. This image was taken in Santee, California, approximately 250 miles (400 kilometers) away from the launch site. Credit: Kevin Carter/Getty Images

Welcome to Edition 8.04 of the Rocket Report! The Pentagon’s Golden Dome missile defense shield will be a lot of things. Along with new sensors, command and control systems, and satellites, Golden Dome will require a lot of rockets. The pieces of the Golden Dome architecture operating in orbit will ride to space on commercial launch vehicles. And Golden Dome’s space-based interceptors will essentially be designed as flying fuel tanks with rocket engines. This shouldn’t be overlooked, and that’s why we include a couple of entries discussing Golden Dome in this week’s Rocket Report.

As always, we welcome reader submissions. If you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets, as well as a quick look ahead at the next three launches on the calendar.

Space-based interceptors are a real challenge. The newly installed head of the Pentagon’s Golden Dome missile defense shield knows the clock is ticking to show President Donald Trump some results before the end of his term in the White House, Ars reports. Gen. Michael Guetlein identified command-and-control and the development of space-based interceptors as two of the most pressing technical challenges for Golden Dome. He believes the command-and-control problem can be “overcome in pretty short order.” The space-based interceptor piece of the architecture is a different story.

Proven physics, unproven economics … “I think the real technical challenge will be building the space-based interceptor,” Guetlein said. “That technology exists. I believe we have proven every element of the physics that we can make it work. What we have not proven is, first, can I do it economically, and then second, can I do it at scale? Can I build enough satellites to get after the threat? Can I expand the industrial base fast enough to build those satellites? Do I have enough raw materials, etc.?” Military officials haven’t said how many space-based interceptors will be required for Golden Dome, but outside estimates put the number in the thousands.

One big defense prime is posturing for Golden Dome. Northrop Grumman is conducting ground-based testing related to space-based interceptors as part of a competition for that segment of the Trump administration’s Golden Dome missile-defense initiative, The War Zone reports. Kathy Warden, Northrop Grumman’s CEO, highlighted the company’s work on space-based interceptors, as well as broader business opportunities stemming from Golden Dome, during a quarterly earnings call this week. Warden identified Northrop’s work in radars, drones, and command-and-control systems as potentially applicable to Golden Dome.

But here’s the real news … “It will also include new innovation, like space-based interceptors, which we’re testing now,” Warden continued. “These are ground-based tests today, and we are in competition, obviously, so not a lot of detail that I can provide here.” Warden declined to respond directly to a question about how the space-based interceptors Northrop Grumman is developing now will actually defeat their targets. (submitted by Biokleen)

Trump may slash environmental rules for rocket launches. The Trump administration is considering slashing rules meant to protect the environment and the public during commercial rocket launches, changes that companies like Elon Musk’s SpaceX have long sought, ProPublica reports. A draft executive order being circulated among federal agencies, and viewed by ProPublica, directs Secretary of Transportation Sean Duffy to “use all available authorities to eliminate or expedite” environmental reviews for launch licenses. It could also, in time, require states to allow more launches or even more launch sites along their coastlines.

Getting political at the FAA … The order is a step toward the rollback of federal oversight that Musk, who has fought bitterly with the Federal Aviation Administration over his space operations, and others have pushed for. Commercial rocket launches have grown exponentially more frequent in recent years. In addition to slashing environmental rules, the draft executive order would make the head of the FAA’s Office of Commercial Space Transportation a political appointee. This is currently a civil servant position, but the last head of the office took a voluntary separation offer earlier this year.

There’s a SPAC for that. An unproven small launch startup is partnering with a severely depleted SPAC trust to do the impossible: go public in a deal they say will be valued at $400 million, TechCrunch reports. Innovative Rocket Technologies Inc., or iRocket, is set to merge with a Special Purpose Acquisition Company, or SPAC, founded by former Commerce Secretary Wilbur Ross. But the most recent regulatory filings by this SPAC showed it was in a tenuous financial position last year, with just $1.6 million held in trust. Likewise, iRocket isn’t flooded with cash. The company has raised only a few million in venture funding, a fraction of what would be needed to develop and test the company’s small orbital-class rocket, named Shockwave.

SpaceX traces a path to orbit for NASA. Two NASA satellites soared into orbit from California aboard a SpaceX Falcon 9 rocket Wednesday, commencing a $170 million mission to study a phenomenon of space physics that has eluded researchers since the dawn of the Space Age, Ars reports. The twin spacecraft are part of the NASA-funded TRACERS mission, which will spend at least a year measuring plasma conditions in narrow regions of Earth’s magnetic field known as polar cusps. As the name suggests, these regions are located over the poles. They play an important but poorly understood role in creating colorful auroras as plasma streaming out from the Sun interacts with the magnetic field surrounding Earth. The same process drives geomagnetic storms capable of disrupting GPS navigation, radio communications, electrical grids, and satellite operations.

Plenty of room for more … The TRACERS satellites are relatively small, each about the size of a washing machine, so they filled only a fraction of the capacity of SpaceX’s Falcon 9 rocket. Three other small NASA tech demo payloads hitched a ride to orbit with TRACERS, kicking off missions to test an experimental communications terminal, demonstrate an innovative scalable satellite platform made of individual building blocks, and study the link between Earth’s atmosphere and the Van Allen radiation belts. In addition to those missions, the European Space Agency launched its own CubeSat to test 5G communications from orbit. Five smallsats from an Australian company rounded out the group. Still, the Falcon 9 rocket’s payload shroud was filled with less than a quarter of the payload mass it could have delivered to the TRACERS mission’s targeted Sun-synchronous orbit.

Tianlong launch pad ready for action. Chinese startup Space Pioneer has completed a launch pad at Jiuquan spaceport in northwestern China for its Tianlong 3 liquid-propellant rocket ahead of a first orbital launch, Space News reports. Space Pioneer said the launch pad passed an acceptance test, and ground crews raised a full-scale model of the Tianlong 3 rocket on the launch pad. “The rehearsal test was successfully completed,” said Space Pioneer, one of China’s leading private launch companies. The activation of the launch pad followed a couple of weeks after Space Pioneer announced the completion of static loads testing on Tianlong 3.

More to come … While this is an important step forward for Space Pioneer, construction of the launch pad is just one element the company needs to finish before Tianlong 3 can lift off for the first time. In June 2024, the company ignited Tianlong 3’s nine-engine first stage on a test stand in China. But the rocket broke free of its moorings on the test stand and unexpectedly climbed into the sky before crashing in a fireball nearby. Space Pioneer says the “weak design of the rocket’s tail structure was the direct cause of the failure” last year. The company hasn’t identified next steps for Tianlong 3, or when it might be ready to fly. Tianlong 3 is a kerosene-fueled rocket with nine main engines, similar in design architecture and payload capacity to SpaceX’s Falcon 9. Also, like Falcon 9, Tianlong 3 is supposed to have a recoverable and reusable first stage booster.

Dredging up an issue at Wallops. Rocket Lab has asked regulators for permission to transport oversized Neutron rocket structures through shallow waters to a spaceport off the coast of Virginia as it races to meet a September delivery deadline, TechCrunch reports. The request, which was made in July, is a temporary stopgap while the company awaits federal clearance to dredge a permanent channel to the Wallops Island site. Rocket Lab plans to launch its Neutron medium-lift rocket from the Mid-Atlantic Regional Spaceport (MARS) on Wallops Island, Virginia, a lower-traffic spaceport that’s surrounded by shallow channels and waterways. Rocket Lab has a sizable checklist to tick off before Neutron can make its orbital debut, like mating the rocket stages, performing a “wet dress” rehearsal, and getting its launch license from the Federal Aviation Administration. Before any of that can happen, the rocket hardware needs to make it onto the island from Rocket Lab’s factory on the nearby mainland.

Kedging bets … Access to the channel leading to Wallops Island is currently available only at low tides. So, Rocket Lab submitted an application earlier this year to dredge the channel. The dredging project was approved by the Virginia Marine Resources Commission in May, but the company has yet to start digging because it’s still awaiting federal sign-off from the Army Corps of Engineers. As the company waits for federal approval, Rocket Lab is seeking permission to use a temporary method called “kedging” to ensure the first five hardware deliveries can arrive on schedule starting in September. We don’t cover maritime issues in the Rocket Report, but if you’re interested in learning a little about kedging, here’s a link.

Any better ideas for an Exploration Upper Stage? Not surprisingly, Congress is pushing back against the Trump administration’s proposal to cancel the Space Launch System, the behemoth rocket NASA has developed to propel astronauts back to the Moon. But legislation making its way through the House of Representatives includes an interesting provision that would direct NASA to evaluate alternatives for the Boeing-built Exploration Upper Stage, an upgrade for the SLS rocket set to debut on its fourth flight, Ars reports. Essentially, the House Appropriations Committee is telling NASA to look for cheaper, faster options for a new SLS upper stage.

CYA EUS? The four-engine Exploration Upper Stage, or EUS, is an expensive undertaking. Last year, NASA’s inspector general reported that the new upper stage’s development costs had ballooned from $962 million to $2.8 billion, and the project had been delayed more than six years. That’s almost a year-for-year delay since NASA and Boeing started development of the EUS. So, what are the options if NASA went with a new upper stage for the SLS rocket? One possibility is a modified version of United Launch Alliance’s dual-engine Centaur V upper stage that flies on the Vulcan rocket. It’s no longer possible to keep flying the SLS rocket’s existing single-engine upper stage because ULA has shut down the production line for it.

Raising Super Heavy from the deep. For the second time, SpaceX has retrieved an engine section from one of its Super Heavy boosters from the Gulf of Mexico, NASASpaceflight.com reports. Images posted on social media showed the tail end of a Super Heavy booster being raised from the sea off the coast of northern Mexico. Most of the rocket’s 33 Raptor engines appear to still be attached to the lower section of the stainless steel booster. Online sleuths who closely track SpaceX’s activities at Starbase, Texas, have concluded the rocket recovered from the Gulf is Booster 13, which flew on the sixth test flight of the Starship mega-rocket last November. The booster ditched in the ocean after aborting an attempted catch back at the launch pad in South Texas.

But why? … SpaceX recovered the engine section of a different Super Heavy booster from the Gulf last year. The company’s motivation for salvaging the wreckage is unclear. “Speculated reasons include engineering research, environmental mitigation, or even historical preservation,” NASASpaceflight reports.

Next three launches

July 26: Vega C | CO3D & MicroCarb | Guiana Space Center, French Guiana | 02:03 UTC

July 26: Falcon 9 | Starlink 10-26 | Cape Canaveral Space Force Station, Florida | 08:34 UTC

July 27: Falcon 9 | Starlink 17-2 | Vandenberg Space Force Base, California | 03:55 UTC

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Rocket Report: Channeling the future at Wallops; SpaceX recovers rocket wreckage Read More »

conspiracy-theorists-don’t-realize-they’re-on-the-fringe

Conspiracy theorists don’t realize they’re on the fringe


Gordon Pennycook: “It might be one of the biggest false consensus effects that’s been observed.”

Credit: Aurich Lawson / Thinkstock

Belief in conspiracy theories is often attributed to some form of motivated reasoning: People want to believe a conspiracy because it reinforces their worldview, for example, or doing so meets some deep psychological need, like wanting to feel unique. However, it might also be driven by overconfidence in their own cognitive abilities, according to a paper published in the Personality and Social Psychology Bulletin. The authors were surprised to discover that not only are conspiracy theorists overconfident, they also don’t realize their beliefs are on the fringe, massively overestimating by as much as a factor of four how much other people agree with them.

“I was expecting the overconfidence finding,” co-author Gordon Pennycook, a psychologist at Cornell University, told Ars. “If you’ve talked to someone who believes conspiracies, it’s self-evident. I did not expect them to be so ready to state that people agree with them. I thought that they would overestimate, but I didn’t think that there’d be such a strong sense that they are in the majority. It might be one of the biggest false consensus effects that’s been observed.”

In 2015, Pennycook made headlines when he co-authored a paper demonstrating how certain people interpret “pseudo-profound bullshit” as deep observations. Pennycook et al. were interested in identifying individual differences between those who are susceptible to pseudo-profound BS and those who are not and thus looked at conspiracy beliefs, their degree of analytical thinking, religious beliefs, and so forth.

They presented several randomly generated statements, containing “profound” buzzwords, that were grammatically correct but made no sense logically, along with a 2014 tweet by Deepak Chopra that met the same criteria. They found that the less skeptical participants were less logical and analytical in their thinking and hence much more likely to consider these nonsensical statements as being deeply profound. That study was a bit controversial, in part for what was perceived to be its condescending tone, along with questions about its methodology. But it did snag Pennycook et al. a 2016 Ig Nobel Prize.

Last year we reported on another Pennycook study, presenting results from experiments in which an AI chatbot engaged in conversations with people who believed at least one conspiracy theory. That study showed that the AI interaction significantly reduced the strength of those beliefs, even two months later. The secret to its success: the chatbot, with its access to vast amounts of information across an enormous range of topics, could precisely tailor its counterarguments to each individual. “The work overturns a lot of how we thought about conspiracies, that they’re the result of various psychological motives and needs,” Pennycook said at the time.

Miscalibrated from reality

Pennycook has been working on this new overconfidence study since 2018, perplexed by observations indicating that people who believe in conspiracies also seem to have a lot of faith in their cognitive abilities—contradicting prior research finding that conspiracists are generally more intuitive. To investigate, he and his co-authors conducted eight separate studies that involved over 4,000 US adults.

The assigned tasks were designed in such a way that participants’ actual performance and how they perceived their performance were unrelated. For example, in one experiment, they were asked to guess the subject of an image that was largely obscured. The subjects were then asked direct questions about their belief (or lack thereof) concerning several key conspiracy claims: the Apollo Moon landings were faked, for example, or that Princess Diana’s death wasn’t an accident. Four of the studies focused on testing how subjects perceived others’ beliefs.

The results showed a marked association between subjects’ tendency to be overconfident and belief in conspiracy theories. And while a given conspiracy claim was believed by a majority of participants just 12 percent of the time, believers thought they were in the majority 93 percent of the time. This suggests that overconfidence is a primary driver of belief in conspiracies.

It’s not that believers in conspiracy theories are massively overconfident; there is no data on that, because the studies didn’t set out to quantify the degree of overconfidence, per Pennycook. Rather, “They’re overconfident, and they massively overestimate how much people agree with them,” he said.

Ars spoke with Pennycook to learn more.

Ars Technica: Why did you decide to investigate overconfidence as a contributing factor to believing conspiracies?

Gordon Pennycook: There’s a popular sense that people believe conspiracies because they’re dumb and don’t understand anything, they don’t care about the truth, and they’re motivated by believing things that make them feel good. Then there’s the academic side, where that idea molds into a set of theories about how needs and motivations drive belief in conspiracies. It’s not someone falling down the rabbit hole and getting exposed to misinformation or conspiratorial narratives. They’re strolling down: “I like it over here. This appeals to me and makes me feel good.”

Believing things that no one else agrees with makes you feel unique. Then there’s various things I think that are a little more legitimate: People join communities and there’s this sense of belongingness. How that drives core beliefs is different. Someone may stop believing but hang around in the community because they don’t want to lose their friends. Even with religion, people will go to church when they don’t really believe. So we distinguish beliefs from practice.

What we observed is that they do tend to strongly believe these conspiracies despite the fact that there’s counter evidence or a lot of people disagree. What would lead that to happen? It could be their needs and motivations, but it could also be that there’s something about the way that they think where it just doesn’t occur to them that they could be wrong about it. And that’s where overconfidence comes in.

Ars Technica: What makes this particular trait such a powerful driving force?

Gordon Pennycook: Overconfidence is one of the most important core underlying components, because if you’re overconfident, it stops you from really questioning whether the thing that you’re seeing is right or wrong, and whether you might be wrong about it. You have an almost moral purity of complete confidence that the thing you believe is true. You cannot even imagine what it’s like from somebody else’s perspective. You couldn’t imagine a world in which the things that you think are true could be false. Having overconfidence is that buffer that stops you from learning from other people. You end up not just going down the rabbit hole, you’re doing laps down there.

Overconfidence doesn’t have to be learned, parts of it could be genetic. It also doesn’t have to be maladaptive. It’s maladaptive when it comes to beliefs. But you want people to think that they will be successful when starting new businesses. A lot of them will fail, but you need some people in the population to take risks that they wouldn’t take if they were thinking about it in a more rational way. So it can be optimal at a population level, but maybe not at an individual level.

Ars Technica: Is this overconfidence related to the well-known Dunning-Kruger effect?

Gordon Pennycook: It’s because of Dunning-Kruger that we had to develop a new methodology to measure overconfidence, because the people who are the worst at a task are the worst at knowing that they’re the worst at the task. But that’s because the same things that you use to do the task are the things you use to assess how good you are at the task. So if you were to give someone a math test and they’re bad at math, they’ll appear overconfident. But if you give them a test of assessing humor and they’re good at that, they won’t appear overconfident. That’s about the task, not the person.

So we have tasks where people essentially have to guess, and it’s transparent. There’s no reason to think that you’re good at the task. In fact, people who think they’re better at the task are not better at it, they just think they are. They just have this underlying kind of sense that they can do things, they know things, and that’s the kind of thing that we’re trying to capture. It’s not specific to a domain. There are lots of reasons why you could be overconfident in a particular domain. But this is something that’s an actual trait that you carry into situations. So when you’re scrolling online and come up with these ideas about how the world works that don’t make any sense, it must be everybody else that’s wrong, not you.

Ars Technica: Overestimating how many people agree with them seems to be at odds with conspiracy theorists’ desire to be unique.  

Gordon Pennycook: In general, people who believe conspiracies often have contrary beliefs. We’re working with a population where coherence is not to be expected. They say that they’re in the majority, but it’s never a strong majority. They just don’t think that they’re in a minority when it comes to the belief. Take the case of the Sandy Hook conspiracy, where adherents believe it was a false flag operation. In one sample, 8 percent of people thought that this was true. That 8 percent thought 61 percent of people agreed with them.

So they’re way off. They really, really miscalibrated. But they don’t say 90 percent. It’s 60 percent, enough to be special, but not enough to be on the fringe where they actually are. I could have asked them to rank how smart they are relative to others, or how unique they thought their beliefs were, and they would’ve answered high on that. But those are kind of mushy self-concepts. When you ask a specific question that has an objectively correct answer in terms of the percent of people in the sample that agree with you, it’s not close.

Ars Technica: How does one even begin to combat this? Could last year’s AI study point the way?

Gordon Pennycook: The AI debunking effect works better for people who are less overconfident. In those experiments, very detailed, specific debunks had a much bigger effect than people expected. After eight minutes of conversation, a quarter of the people who believed the thing didn’t believe it anymore, but 75 percent still did. That’s a lot. And some of them, not only did they still believe it, they still believed it to the same degree. So no one’s cracked that. Getting any movement at all in the aggregate was a big win.

Here’s the problem. You can’t have a conversation with somebody who doesn’t want to have the conversation. In those studies, we’re paying people, but they still get out what they put into the conversation. If you don’t really respond or engage, then our AI is not going to give you good responses because it doesn’t know what you’re thinking. And if the person is not willing to think. … This is why overconfidence is such an overarching issue. The only alternative is some sort of propagandistic sit-them-downs with their eyes open and try to de-convert them. But you can’t really convert someone who doesn’t want to be converted. So I’m not sure that there is an answer. I think that’s just the way that humans are.

Personality and Social Psychology Bulletin, 2025. DOI: 10.1177/01461672251338358  (About DOIs).

Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.

Conspiracy theorists don’t realize they’re on the fringe Read More »

southwestern-drought-likely-to-continue-through-2100,-research-finds

Southwestern drought likely to continue through 2100, research finds

This article originally appeared on Inside Climate News, a nonprofit, non-partisan news organization that covers climate, energy, and the environment. Sign up for their newsletter here.

The drought in the Southwestern US is likely to last for the rest of the 21st century and potentially beyond as global warming shifts the distribution of heat in the Pacific Ocean, according to a study published last week led by researchers at the University of Texas at Austin.

Using sediment cores collected in the Rocky Mountains, paleoclimatology records and climate models, the researchers found warming driven by greenhouse gas emissions can alter patterns of atmospheric and marine heat in the North Pacific Ocean in a way resembling what’s known as the negative phase of the Pacific Decadal Oscillation (PDO), fluctuations in sea surface temperatures that result in decreased winter precipitation in the American Southwest. But in this case, the phenomenon can last far longer than the usual 30-year cycle of the PDO.

“If the sea surface temperature patterns in the North Pacific were just the result of processes related to stochastic [random] variability in the past decade or two, we would have just been extremely unlucky, like a really bad roll of the dice,” said Victoria Todd, the lead author of the study and a PhD student in geosciences at University of Texas at Austin. “But if, as we hypothesize, this is a forced change in the sea surface temperatures in the North Pacific, this will be sustained into the future, and we need to start looking at this as a shift, instead of just the result of bad luck.”

Currently, the Southwestern US is experiencing a megadrought resulting in the aridification of the landscape, a decades-long drying of the region brought on by climate change and the overconsumption of the region’s water. That’s led to major rivers and their basins, such as the Colorado and Rio Grande rivers, seeing reduced flows and a decline of the water stored in underground aquifers, which is forcing states and communities to reckon with a sharply reduced water supply. Farmers have cut back on the amount of water they use. Cities are searching for new water supplies. And states, tribes, and federal agencies are engaging in tense negotiations over how to manage declining resources like the Colorado River going forward.

Southwestern drought likely to continue through 2100, research finds Read More »

phishers-have-found-a-way-to-downgrade—not-bypass—fido-mfa

Phishers have found a way to downgrade—not bypass—FIDO MFA

Researchers recently reported encountering a phishing attack in the wild that bypasses a multifactor authentication scheme based on FIDO (Fast Identity Online), the industry-wide standard being adopted by thousands of sites and enterprises.

If true, the attack, reported in a blog post Thursday by security firm Expel, would be huge news, since FIDO is widely regarded as being immune to credential phishing attacks. After analyzing the Expel write-up, I’m confident that the attack doesn’t bypass FIDO protections, at least not in the sense that the word “bypass” is commonly used in security circles. Rather, the attack downgrades the MFA process to a weaker, non-FIDO-based process. As such, the attack is better described as a FIDO downgrade attack. More about that shortly. For now, let’s describe what Expel researchers reported.

Abusing cross-device sign-ins

Expel said the “novel attack technique” begins with an email that links to a fake login page from Okta, a widely used authentication provider. It prompts visitors to enter their valid user name and password. People who take the bait have now helped the attack group, which Expel said is named PoisonSeed, clear the first big hurdle in gaining unauthorized access to the Okta account.

The FIDO spec was designed to mitigate precisely these sorts of scenarios by requiring users to provide an additional factor of authentication in the form of a security key, which can be a passkey, or physical security key such as a smartphone or dedicated device such as a Yubikey. For this additional step, the passkey must use a unique cryptographic key embedded into the device to sign a challenge that the site (Okta, in this case) sends to the browser logging in.
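
To make that challenge-signing step concrete, here is a deliberately simplified Python sketch of the underlying idea. It is not the real WebAuthn/CTAP protocol, which involves CBOR-encoded authenticator data, attestation, and browser mediation; it just uses the cryptography package’s Ed25519 primitives, with hypothetical names and domains, to show why a signature bound to both the server’s challenge and the reported origin cannot be replayed from a lookalike phishing site.

```python
# Conceptual sketch only: real FIDO/WebAuthn uses CBOR-encoded authenticator data,
# attestation, and browser mediation. All names and domains here are hypothetical.
import os

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Authenticator side: the private key is generated on, and never leaves, the device.
device_key = Ed25519PrivateKey.generate()
public_key = device_key.public_key()  # shared with the site once, at enrollment


def sign_assertion(challenge: bytes, origin: str) -> bytes:
    # The signature covers both the server's random challenge and the origin the
    # browser reports, so the assertion is only valid for that exact site.
    return device_key.sign(challenge + origin.encode())


# Relying party side (the legitimate login server) issues a one-time challenge.
challenge = os.urandom(32)

# Legitimate flow: the browser really is on the genuine origin.
good = sign_assertion(challenge, "https://login.example.com")
public_key.verify(good, challenge + b"https://login.example.com")  # no exception

# Phishing flow: the victim signs on a lookalike domain, but the origin baked into
# the assertion does not match what the real server verifies against.
bad = sign_assertion(challenge, "https://login.examp1e.com")
try:
    public_key.verify(bad, challenge + b"https://login.example.com")
except InvalidSignature:
    print("assertion rejected: origin mismatch")
```

The point of the sketch is the binding: because the reported origin is part of what gets signed and verified, an assertion produced for a lookalike domain is useless against the real one, which is why attackers have to steer victims into a weaker, non-FIDO flow instead.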

One of the ways a user can provide this additional factor is by using a cross-device sign-in feature. In the event there is no passkey on the device being used to log in, a user can use a passkey for that site that’s already resident on a different device, which in most cases will be a phone. In these cases, the site being logged into will display a QR code. The user then scans the QR code with the phone, and the normal FIDO MFA process proceeds as normal.

Phishers have found a way to downgrade—not bypass—FIDO MFA Read More »

netflix’s-first-show-with-generative-ai-is-a-sign-of-what’s-to-come-in-tv,-film

Netflix’s first show with generative AI is a sign of what’s to come in TV, film

Netflix used generative AI in an original, scripted series that debuted this year, it revealed this week. Producers used the technology to create a scene in which a building collapses, hinting at the growing use of generative AI in entertainment.

During a call with investors yesterday, Netflix co-CEO Ted Sarandos revealed that Netflix’s Argentine show The Eternaut, which premiered in April, is “the very first GenAI final footage to appear on screen in a Netflix, Inc. original series or film.” Sarandos further explained, per a transcript of the call:

The creators wanted to show a building collapsing in Buenos Aires. So our iLine team, [which is the production innovation group inside the visual effects house at Netflix effects studio Scanline], partnered with their creative team using AI-powered tools. … And in fact, that VFX sequence was completed 10 times faster than it could have been completed with visual, traditional VFX tools and workflows. And, also, the cost of it would just not have been feasible for a show in that budget.

Sarandos claimed that viewers have been “thrilled with the results,” although that likely has much to do with how the rest of the series, based on a comic, plays out, not just one AI-crafted scene.

More generative AI on Netflix

Still, Netflix seems open to using generative AI in shows and movies more, with Sarandos saying the tech “represents an incredible opportunity to help creators make films and series better, not just cheaper.”

“Our creators are already seeing the benefits in production through pre-visualization and shot planning work and, certainly, visual effects,” he said. “It used to be that only big-budget projects would have access to advanced visual effects like de-aging.”

Netflix’s first show with generative AI is a sign of what’s to come in TV, film Read More »

how-android-phones-became-an-earthquake-warning-system

How Android phones became an earthquake warning system

Of course, the trick is that you only send out the warning if there’s an actual earthquake, and not when a truck is passing by. Here, the sheer volume of Android phones sold plays a key role. As a first pass, AEA can simply ignore events that aren’t picked up by a lot of phones in the same area. But we also know a lot about the patterns of shaking that earthquakes produce. Different waves travel at different speeds, cause different types of ground motion, and may be produced at different intensities as the earthquake progresses.

So, the people behind AEA also include a model of earthquakes and seismic wave propagation, and check whether the pattern seen in phones’ accelerometers is consistent with that model. It only triggers an alert when there’s widespread phone activity that matches the pattern expected for an earthquake.

Raising awareness

In practical terms, AEA is distributed as part of the core Android software and is enabled by default, so it is active on most Android phones. It starts monitoring when the phone has been stationary for a little while, checking for acceleration data that’s consistent with the P or S waves produced by earthquakes. If it gets a match, it forwards the information, along with some rough location data (to preserve privacy), to Google servers. Software running on those servers then performs the positional analysis to see if the waves are widespread enough to have been triggered by an earthquake.

If so, it estimates the size and location, and uses that information to estimate the ground motion that will be experienced in different locations. Based on that, AEA sends out one of two alerts, either “be aware” or “take action.” The “be aware” alert is similar to a standard Android notification, but it plays a distinctive sound and is sent to users further from the epicenter. In contrast, the “take action” warning that’s sent to those nearby will display one of two messages in the appropriate language, either “Protect yourself” or “Drop, cover, and hold on.” It ignores any do-not-disturb settings, takes over the entire screen, and also plays a distinct noise.
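
As a rough sketch of the server-side logic described above, and emphatically not Google’s actual AEA pipeline, the core counting step might look something like the Python below. Every threshold and structure here is invented for illustration (MIN_PHONES, WINDOW_SECONDS, GRID_DEG, the 30 km cutoff): group reports into coarse location buckets, only treat a cluster as a candidate event once enough phones in one area report shaking within a short window, then pick the alert tier by distance from the estimated epicenter. The real system additionally checks the cluster against a seismic wave-propagation model and estimates shaking intensity per location, which this sketch omits.

```python
# Toy aggregation sketch; all thresholds and structures are invented for illustration.
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhoneReport:
    lat: float
    lon: float
    timestamp: float  # seconds since epoch

MIN_PHONES = 100       # a lone phone (or passing truck) never clears this bar
WINDOW_SECONDS = 10.0  # the triggering reports must arrive close together
GRID_DEG = 0.2         # coarse spatial bucket, mirroring the rough location data


def bucket(r: PhoneReport) -> tuple[int, int]:
    return (int(r.lat / GRID_DEG), int(r.lon / GRID_DEG))


def detect_candidate(reports: list[PhoneReport]) -> Optional[dict]:
    """Return an estimated event if enough phones in one area shook together."""
    clusters: dict[tuple[int, int], list[PhoneReport]] = {}
    for r in reports:
        clusters.setdefault(bucket(r), []).append(r)

    for cell_reports in clusters.values():
        cell_reports.sort(key=lambda r: r.timestamp)
        # Sliding window: any MIN_PHONES reports within WINDOW_SECONDS qualify.
        for i in range(len(cell_reports) - MIN_PHONES + 1):
            window = cell_reports[i : i + MIN_PHONES]
            if window[-1].timestamp - window[0].timestamp <= WINDOW_SECONDS:
                # Crude epicenter estimate: centroid of the reporting phones.
                lat = sum(r.lat for r in window) / len(window)
                lon = sum(r.lon for r in window) / len(window)
                return {"lat": lat, "lon": lon, "n_phones": len(window)}
    return None


def alert_tier(distance_km: float) -> str:
    # Two tiers, as described above: the stronger warning goes to users nearby.
    return "take action" if distance_km < 30 else "be aware"
```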

How Android phones became an earthquake warning system Read More »