By Beth Washington

Keep losing your key fob? Ford’s new “Truckle” is the answer.

I came across possibly one of the weirdest official automotive accessories this morning, courtesy of a friend’s social media feed. It’s called the “Truckle,” and it’s a hand-crafted silver and bronze belt buckle that might be the envy of every other cowboy out there, since this one has a place to keep your F-150’s key fob without ruining the lines of your jeans.

The Truckle was designed by Utah-based A Cut Above Buckles, with a hand-engraved F-150 on the bump in the front. Behind the truck? Storage space for a Ford truck key fob, which should fit any F-150 from model year 2018 onward.

“You can put your key fob in the buckle—all your remote features work while it’s in the buckle,” designer Andy Andrews told the Detroit Free Press. “Once you have it in there, you’re not going to lose that key fob. You’re not going to be scratching your head (wondering) where it’s at. It’s right there with you in the Truckle.”

The limited edition Truckle is probably only for serious F-150 fans, though; at $200, it’s quite a commitment to keeping your pants up. Ford and A Cut Above Buckles debuted the Truckle this past weekend at the Texas State Fair.

Termite farmers fine-tune their weed control

Odontotermes obesus is one of the termite species that grow fungi, called Termitomyces, in their mounds. Workers collect dead leaves, wood, and grass and stack them in underground fungus gardens called combs. There, the fungi break down the tough plant fibers, making them accessible to the termites in an elaborate form of symbiotic agriculture.

Like any other agriculturalist, however, the termites face a challenge: weeds. “There have been numerous studies suggesting the termites must have some kind of fixed response—that they always do the same exact thing when they detect weed infestation,” says Rhitoban Raychoudhury, a professor of biological sciences at the Indian Institute of Science Education and Research, “but that was not the case.” In a new Science study, Raychoudhury’s team discovered that termites have pretty advanced, surprisingly human-like gardening practices.

Going blind

Termites do not look like particularly good gardeners at first glance. They are effectively blind, which is not that surprising considering they spend most of their lives in complete darkness, working in the endless corridors of their mounds. But termites make up for their lack of sight with other senses. “They can detect the environment based on advanced olfactory reception and touch, and I think this is what they use to identify the weeds in their gardens,” Raychoudhury says. To learn how termites react once they detect a weed infestation, his team collected some Odontotermes obesus and challenged them with different gardening problems.

The experimental setup was quite simple. The team placed some autoclaved soil sourced from termite mounds into glass Petri dishes. On this soil, Raychoudhury and his colleagues placed two fungus combs in each dish. The first piece acted as a control and was a fresh, uninfected comb with Termitomyces. “Besides acting as a control, it was also there to make sure the termites have the food because it is very hard for them to survive outside their mounds,” Raychoudhury explains. The second piece was intentionally contaminated with Pseudoxylaria, a filamentous fungal weed that often takes over Termitomyces habitats in termite colonies.

2025 State of AI Report and Predictions

The 2025 State of AI Report is out, with lots of fun slides and a full video presentation. They’ve been consistently solid, providing a kind of general outside view.

I’m skipping over stuff my regular readers already know that doesn’t bear repeating.

Nathan Benaich: Once a “Llama rip-off,” @Alibaba_Qwen now powers 40% of all new fine-tunes on @huggingface. China’s open-weights ecosystem has overtaken Meta’s, with Llama riding off into the sunset…for now.

I highlight this because the ‘for now’ is important to understand, and to note that it’s Qwen, not DeepSeek. As in, models come and models go, and especially in the open model world people will switch on you on a dime. Stop worrying about lock-ins and mystical ‘tech stacks.’

Robots now reason too. “Chain-of-Action” planning brings structured thought to the physical world – from AI2’s Molmo-Act to Gemini Robotics. Massive amounts of effort are thrown into the mix, expect lots of progress here…

.@AnthropicAI’s Model Context Protocol is the new USB-C of AI. A single standard to connect models to tools, already embedded in ChatGPT, Gemini, Claude, and VS Code, has taken shape. But not without emerging security risks…

I note this next part mostly because it shows the Different Worlds dynamic:

Nathan Benaich: The frontier fight is relentless. @OpenAI still tops most leaderboards, but @GoogleDeepMind’s stays there longer. Timing releases has become its own science…not least informing financing rounds like clockwork.

They’re citing LMArena and Artificial Analysis. LMArena is dead, sir. Artificial Analysis is fine, if you had to purely go with one number, which you shouldn’t do.

Once more for the people in the back or the White House:

.@deepseek_ai “$5M training run” deep freak was overblown. Since the market realised the fineprint in the R1 paper, that’s led to Jevons paradox on steroids: lower cost per run → more runs → more compute needed, buy more NVIDIA.

… China leads in power infrastructure too, adding >400GW in 2024 vs 41GW for the US. Compute now clearly runs on geopolitics.

Then we get to what I thought was the first clear error:

Now, let’s switch gears into Politics. The US Government is turning capitalist. Golden shares in US Steel, stakes in Intel and MP Materials, and revenue cuts from NVIDIA’s China sales. New-age Industrial policy?

Not capitalist. Socialist.

The term for public ownership of the means of production is socialist.

Unless this meant ‘the US Government centrally maximizing the interests of certain particular capitalists’ or similarly ‘the US Government is turning into one particular capitalist maximizing profits.’ In which case, I’m not the one who said that.

The AI Safety Institute network has collapsed. Washington ditched attending meetings altogether, while the US and UK rebranded “safety” into “security.”

I don’t think this is fair to UK AISI, but yes the White House has essentially told anyone concerned about existential risk or seeking international coordination of any kind to, well, you know.

Moving into Safety: budgets are anemic. All 11 major US safety orgs will spend $133M in 2025…less than frontier labs burn in a day.

I like that this highlights Anthropic’s backpedaling, GDM’s waiting three weeks to give us a model card and xAI’s missing its deadline. It’s pretty grim.

What I disagree with here is the idea that all of that has much to do with the Trump Administration. I don’t want to blame them for things they didn’t cause, and I think they played only a minor role in these kinds of safety failures. The rhetoric being used has shifted to placate them, but the underlying safety work wouldn’t yet be substantially different under Harris unless she’d made a major push to force that issue, well beyond what Biden was on track to do. That decision was up to the labs, and their encounters with reality.

But yes, the AI safety ecosystem is tiny and poor, at risk of being outspent by one rabid industry anti-regulatory super-PAC alone unless we step things up. I have hope that things can be stepped up soon.

Cyber and alignment risks accelerate. Models can now fake alignment under supervision, and exploit code faster than humans fix it.

They then grade their predictions, scoring themselves 5/10, which is tough but fair, and which made me confident I can trust their self-grading. As Sean notes, they clearly could have ‘gotten away with’ claiming 7/10, although I would have docked them for trying.

Seán Ó hÉigeartaigh: Two of the things I really appreciate is that (a) they make and review predictions each year and (b) unlike some other predictors they grade themselves HARSHLY. Several of these ‘no’s are distinctly borderline, they could have given themselves 7-8/10 and I don’t think I would have held it against them.

  1. A $10B+ investment from a sovereign state into a US large AI lab invokes national security review.

    1. No, although on technicalities, but also national security review hahaha.

  2. An app or website created solely by someone with no coding ability will go viral (e.g. App Store Top-100).

    1. Yes, Formula Bot.

  3. Frontier labs implement meaningful changes to data collection practices after cases begin reaching trial.

    1. Yes, Anthropic and the whole $1.5 billion fiasco.

  4. Early EU AI Act implementation ends up softer than anticipated after lawmakers worry they’ve overreached.

    1. No, they say, but you could definitely make a case here.

  5. An open source alternative to OpenAI o1 surpasses it across a range of reasoning benchmarks.

    1. Yes, r1 did this, although as stated this was an easy call.

  6. Challengers fail to make any meaningful dent in NVIDIA’s market position.

    1. Yes, again relatively easy call on this time frame.

  7. Levels of investment in humanoids will trail off, as companies struggle to achieve product-market fit.

    1. No, investment grew from $1.4b to $3b. I half-kid that spiritually this kind of counts as a win in AI, it only doubled, that’s kind of a trail off?

    2. But no, seriously, the robots are coming.

  8. Strong results from Apple’s on-device research accelerates momentum around personal on-device AI.

    1. No, Apple Intelligence and their research department flopped. On-device AI is definitely growing anyway.

  9. A research paper generated by an AI Scientist is accepted at a major ML conference or workshop.

    1. Yes, AI Scientist-v2 at an ICLR workshop.

  10. A video game based around interacting with GenAI-based elements will achieve break-out status.

    1. Nope. This continues to be a big area of disappointment. Not only did nothing break out, there wasn’t even anything halfway decent.

Here are their predictions for 2026. These are aggressive; GPT-5-Pro thinks their expected score is only 3.1 correct. If they can hit 5/10 again I think they get kudos, and if they get 7/10 they did great.
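
For concreteness, the ‘expected score’ is just the sum of the per-item probabilities, by linearity of expectation. A minimal sketch, using GPT-5-Pro’s per-item estimates as quoted in the item-by-item notes below (illustrative only; it will not exactly reproduce the 3.1 headline figure, which presumably came from a separate aggregate query):

```python
# Expected number of correct predictions = sum of per-item probabilities
# (linearity of expectation; the items do not need to be independent).

# Per-item probabilities as quoted for GPT-5-Pro in the notes below.
gpt5_pro_probs = [0.18, 0.22, 0.36, 0.25, 0.14, 0.25, 0.68, 0.74, 0.23, 0.35]

expected_correct = sum(gpt5_pro_probs)
print(f"Expected score: {expected_correct:.1f} / {len(gpt5_pro_probs)}")  # ~3.4 / 10
```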

I made my probability assessments before creating Manifold markets, to avoid anchoring, and will then alter my assessment based on early trading.

I felt comfortable creating those markets because I have confidence both that they will grade themselves accurately, and that LLMs will be strong enough in a year to resolve these questions reasonably. So my resolution rule was: their self-assessment wins, and if they don’t provide one I’ll feed the exact wording into Anthropic’s strongest model (ideally this should probably be best 2 out of 3 of Google, OpenAI and Anthropic, but simplicity is good).
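
A minimal sketch of that resolution rule, with the model-judging step left as a stub (the ask_model function and provider names are placeholders, not any real API):

```python
from collections import Counter

def ask_model(provider: str, prediction: str) -> bool:
    """Placeholder: query the provider's strongest model with the exact
    prediction wording and return True if it judges the prediction resolved YES."""
    raise NotImplementedError

def resolve(prediction: str, self_assessment: bool | None = None) -> bool:
    # The report's own self-assessment wins whenever one is provided.
    if self_assessment is not None:
        return self_assessment
    # Fallback: the simple version just asks Anthropic's strongest model;
    # the 'ideal' version sketched here takes best two out of three across
    # Google, OpenAI, and Anthropic.
    votes = [ask_model(p, prediction) for p in ("google", "openai", "anthropic")]
    return Counter(votes).most_common(1)[0][0]
```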

  1. A major retailer reports >5% of online sales from agentic checkout as AI agent advertising spend hits $5B.

    1. Total advertising spending in America in 2025 was ~$420 billion.

    2. I think this is ambitious, but variance here is really high and the correlation between the two numbers is large.

    3. GPT-5-Pro says 18%, Sonnet says 8%, I think it’s more plausible than that. Maybe 25%?

    4. Manifold says 23% so that seems good.

  2. A major AI lab leans back into open-sourcing frontier models to win over the current US administration.

    1. GPT-5-Pro says 22%, Sonnet says 25%.

    2. I don’t see it, if this means ‘release your frontier model as an open model.’ Who? I would only count at most five labs as major, and Meta (who is pushing it in terms of counting) is already open. The only realistic option here is xAI.

    3. That goes double if you include the conditional ‘to win over the current US administration.’ There’s a lot of other considerations in such a move.

    4. Thus, I’d sell this down to 15%, but it’s hard to be too confident about Elon?

    5. Manifold agreed with the AIs at 25% but tends to be too high in such spots, so I still would be a seller.

  3. Open-ended agents make a meaningful scientific discovery end-to-end (hypothesis, expt, iteration, paper).

    1. Define ‘meaningful’ and ‘end to end’ in various ways? Always tricky.

    2. I’m actually optimistic, if we’re not going to be sticklers on details.

    3. GPT-5-Pro says 36%, Sonnet is deeply skeptical and says 15%. If I knew we had a reasonable threshold for ‘meaningful’ and we could get it turned around, I’d be on the optimistic end, but I think Sonnet is right that if you count the paper the timeline here is pretty brutal. So I’m going to go with 35%.

    4. Manifold is optimistic and says 60% with active trading, with Nathan Metzger noting the issue of defining a meaningful discovery and Brian Holtz noting the issue of how much assistance is allowed. I’m willing to interpret this as an optimistic take on both feasibility and what would count and go to 50%.

  4. A deepfake/agent-driven cyber attack triggers the first NATO/UN emergency debate on AI security.

    1. It would take really a lot to get this to trigger. Like, really a lot.

    2. There’s even an out: if something else triggers such a debate first, this didn’t happen.

    3. GPT-5-Pro said 25%, Sonnet said 12% and I’m with Sonnet.

    4. Manifold says 18%, down the middle. I’m still with Sonnet.

  5. A real-time generative video game becomes the year’s most-watched title on Twitch.

    1. I’ll go ahead and take the no here. Too soon. Generative games are not as interesting as people think, and they’re doubling down on the 2024 mistake.

    2. GPT-5-Pro has this at 14%, Sonnet says 3%. I think Sonnet is a bit overconfident, let’s say 5%, but yeah, this has to overcome existing behemoths even if you make something great. Not gonna happen.

    3. Manifold agrees this is the long shot at 7%, which is basically their version of ‘not gonna happen’ given how the math works for long shots.

  6. “AI neutrality” emerges as a foreign policy doctrine as some nations cannot or fail to develop sovereign AI.

    1. I doubt they’ll call it that, but certainly some nations will opt out of this ‘race.’

    2. GPT-5-Pro said 25%, Sonnet says 20%. I agree with those numbers if this is a meaningful ‘neutrality’ in the sense of neutral between China and America on top of not rolling one’s own, but I’d go much higher if it simply means that nations opt out of building their own and rely on a frontier lab or a fine tune of an existing open model. And indeed I think this opt-out would be wise for many, perhaps most.

    3. Manifold says 29%. Given the ambiguity issues, that’s within reasonable range.

  7. A movie or short film produced with significant use of AI wins major audience praise and sparks backlash.

    1. GPT-5-Pro says 68%, Sonnet says 55%. I’d be a buyer there, normally a parlay is a rough prediction but there would almost certainly be backlash conditional on this happening. A short film counts? I’m at more like 80%.

    2. Manifold is only at 67%. That seems low to me, but I can moderate to 75%.

  8. A Chinese lab overtakes the US lab dominated frontier on a major leaderboard (e.g. LMArena/Artificial Analysis).

    1. I’d bet big against a Chinese lab actually having the best model at any point in 2026, but benchmarks are not leaderboards.

    2. I’d be very surprised if this happened on Artificial Analysis. Their evaluation suite is reasonably robust.

    3. I’d be less surprised if this happened on LMArena, since it is rather hackable; if one of the major Chinese labs actively wanted to do this, there’s a decent chance that they could, the way Meta hacked it with their model for a bit.

    4. I still think this is an underdog. GPT-5-Pro said 74%, Sonnet says 60% and is focusing on Arena as the target. It only has to happen briefly. I think the models are too optimistic here, but I’ll give them maybe 55% because as worded this includes potential other leaderboards too.

    5. Manifold says 34%, and on reflection yeah, I was being a coward and moderating my instincts too much, that’s more like it. I’d probably buy small there because the resolution criteria are relatively generous, so call it a fair 40%.

  9. Datacenter NIMBYism takes the US by storm and sways certain midterm/gubernatorial elections in 2026.

    1. Threshold is always tricky with such questions. If we’re talking at least two races for governor, house or senate, I think this is not that likely to happen, nor is it likely to be very high on the list of issues in general. I’m on no.

    2. GPT-5-Pro says 23%, Sonnet says 18%. I’d probably say more like 15%. If you expand this so that ‘a bunch of local races around potential sites’ counts, including for ‘take by storm,’ then I could go higher.

    3. Manifold is optimistic at 41%. I’ll adjust to 25% on that, since they might have a better sense of what would count, but this particular AI issue ‘taking the US by storm’ still seems like a stretch.

  10. Trump issues an unconstitutional executive order to ban state AI legislation.

    1. I love that they explicitly say it will be unconstitutional.

    2. I do agree that if he did it, it would be unconstitutional, although of course it will be 2026 so it’s possible he can Just Do Things and SCOTUS will shrug.

    3. Both GPT-5-Pro and Sonnet say 35% here. That feels high but I can definitely see this happening, I agree with Sonnet that it is ‘on brand.’ 25%?

    4. Manifold is at 19%. Okay, sure, I’ll accept that and creep my fair value down a bit.

Indeed, despite nothing ever happening, do many things come to pass. It would be cool to have my own bold predictions for 2026, but I think the baseline scenario is very much a boring ‘incremental improvements, more of the same with some surprising new capabilities, people who notice see big improvements but those who want to dismiss can still dismiss, the current top labs are still the top labs, a lot more impact than the economists think but nothing dramatic yet, safety and alignment look like they are getting better and for short term purposes they are, and investment is rising, but not in ways that give me faith that we’re making Actual Progress on hard problems.’

I do think we should expect at least one major vibe shift. Every time vibes shift, it becomes easy to think there won’t soon be another vibe shift. There is always another vibe shift, it is so over and then we are so back, until AGI arrives and perhaps then it really is over whether or not we are also so back. Two shifts is more likely than zero. Sometimes the shifts are for good reasons, usually they are not. The current ‘powers that be’ are unlikely to be the ones in place, with the same perspectives, at the end of 2026.

One NASA science mission saved from Trump’s cuts, but others still in limbo


“Damage is being done already. Even if funding is reinstated, we have already lost people.”

Artist’s illustration of the OSIRIS-APEX spacecraft at asteroid Apophis. Credit: NASA/Goddard Space Flight Center

NASA has thrown a lifeline to scientists working on a mission to visit an asteroid that will make an unusually close flyby of the Earth in 2029, reversing the Trump administration’s previous plan to shut it down.

This mission, named OSIRIS-APEX, was one of 19 operating NASA science missions the White House proposed canceling in a budget blueprint released earlier this year.

“We were called for cancellation as part of the president’s budget request, and we were reinstated and given a plan to move ahead in FY26 (Fiscal Year 2026) just two weeks ago,” said Dani DellaGiustina, principal investigator for OSIRIS-APEX at the University of Arizona. “Our spacecraft appears happy and healthy.”

OSIRIS-APEX repurposes the spacecraft from NASA’s OSIRIS-REx asteroid sample return mission, which deposited its extraterrestrial treasure back on Earth in 2023. The spacecraft was in good shape and still had plenty of fuel, so NASA decided to send it to explore another asteroid, named Apophis, due to pass about 20,000 miles (32,000 kilometers) from the Earth on April 13, 2029.

The flyby of Apophis offers scientists a golden opportunity to see a potential killer asteroid up close. Apophis has a lumpy shape with an average diameter of about 1,100 feet (340 meters), large enough to cause regional devastation if it impacted the Earth. The asteroid has no chance of striking us in 2029 or any other time for the next century, but it routinely crosses the Earth’s path as it circles the Sun, so the long-term risk is non-zero.

It pays to be specific

Everything was going well with OSIRIS-APEX until May, when White House officials signaled their intention to terminate the mission. The Trump administration’s proposed cancellation of 19 of NASA’s operating missions was part of a nearly 50 percent cut to the agency’s science budget in the White House budget request for fiscal year 2026, which began October 1.

Lawmakers in the House and Senate have moved to reject nearly all of the science cuts, with the Senate bill maintaining funding for NASA’s science division at $7.3 billion, the same as fiscal year 2025, while the House bill reduces it to $6 billion, still significantly more than the $3.9 billion for science in the White House budget proposal.

The Planetary Society released this chart showing the 19 operating missions tagged for termination under the White House’s budget proposal.

For a time this summer, Trump’s political appointees at NASA told managers to make plans for the next year assuming Trump’s cuts would be enacted. Finally, last month, those officials relented and instructed agency employees to abide by the House appropriations bill.

The House and Senate still have not agreed on any final budget numbers or sent an appropriations bill to the White House for President Trump’s signature. That’s why the federal government has been partially shut down for the last week. Despite the shutdown, ground teams are still operating NASA’s science missions because suspending them could result in irreparable damage.

Using the House’s proposed budget should salvage much of NASA’s portfolio, but it is still $1.3 billion short of the money the agency’s science program got last year. That means some things will inevitably get cut. Many of the other operating missions the Trump administration tagged for termination remain on the chopping block.

OSIRIS-APEX escaped this fate for a simple reason. Lawmakers earmarked $20 million for the mission in the House budget bill. Most other missions didn’t receive the same special treatment. It seems OSIRIS-APEX had a friend in Congress.

Budget-writers in the House of Representatives specified NASA should commit $20 million for the OSIRIS-APEX mission in fiscal year 2026. Credit: US House of Representatives

The only other operating mission the Trump administration wanted to cancel that got a similar earmark in the House budget bill was the Magnetospheric Multiscale Mission (MMS), a fleet of four probes in space since 2015 studying Earth’s magnetosphere. Lawmakers want to provide $20 million for MMS operations in 2026. Ars was unable to confirm the status of the MMS mission Wednesday.

The other 17 missions set to fall under Trump’s budget ax remain in a state of limbo. There are troubling signs the administration might go ahead and kill the missions. Earlier this year, NASA directed managers from all 19 of the missions at risk of cancellation to develop preliminary plans to wind down their missions.

A scientist on one of the projects told Ars that NASA recently asked for a more detailed “termination plan” to “passivate” their spacecraft by the end of this year. This goes a step beyond the closeout plans NASA requested in the summer. Passivation is a standard last rite for a spacecraft, when engineers command it to vent leftover fuel and drain its batteries, rendering it fully inert. This would make the mission unrecoverable if someone tried to contact it again.

This scientist said none of the missions up for termination will be out of the woods until there’s a budget that restores NASA funding close to last year’s levels and includes language protecting the missions from cancellation.

Damage already done

Although OSIRIS-APEX is again go for Apophis, DellaGiustina said a declining budget has forced some difficult choices. The mission’s science team is “basically on hiatus” until sometime in 2027, meaning they won’t be able to participate in any planning for at least the next year and a half.

This has an outsize effect on younger scientists who were brought on to the mission to train for what the spacecraft will find at Apophis, DellaGiustina said in a meeting Tuesday of the National Academies’ Committee on Astrobiology and Planetary Sciences.

“We are not anticipating we will have to cut any science at Apophis,” she said. But the cuts do affect things like recalibrating the science instruments on the spacecraft, which got dirty and dusty from the mission’s brief landing to capture samples from asteroid Bennu in 2020.

“We are definitely undermining our readiness,” DellaGiustina said. “Nonetheless, we’re happy to be reinstated, so it’s about as good as can be expected, I think, for this particular point in time.”

At its closest approach, asteroid Apophis will be closer to Earth than the ring of geostationary satellites over the equator. Credit: NASA/JPL

The other consequence of the budget reduction has been a drain in expertise with operating the spacecraft. OSIRIS-APEX (formerly OSIRIS-REx) was built by Lockheed Martin, which also commands and receives telemetry from the probe as it flies through the Solar System. The cuts have caused some engineers at Lockheed to move off of planetary science missions to other fields, such as military space programs.

The other active missions waiting for word from NASA include the Chandra X-ray Observatory, the New Horizons probe heading toward interstellar space, the MAVEN spacecraft studying the atmosphere of Mars, and several satellites monitoring Earth’s climate.

The future of those missions remains murky. A senior official on one of the projects said they’ve been given “no direction at all” other than “to continue operating until advised otherwise.”

Another mission the White House wanted to cancel was THEMIS, a pair of spacecraft orbiting the Moon to map the lunar magnetic field. The lead scientist for that mission, Vassilis Angelopoulos from the University of California, Los Angeles, said his team will get “partial funding” for fiscal year 2026.

“This is good, but in the meantime, it means that science personnel is being defunded,” Angelopoulos told Ars. “The effect is the US is not achieving the scientific return it can from its multi-billion dollar investments it has made in technology.”

Artist’s concept of NASA’s MAVEN spacecraft, which has orbited Mars since 2014 studying the planet’s upper atmosphere.

To put a number on it, the missions already in space that the Trump administration wants to cancel represent a cumulative investment of $12 billion to design and build, according to the Planetary Society, a science advocacy group. An assessment by Ars concluded the operating missions slated for cancellation cost taxpayers less than $300 million per year, or between 1 and 2 percent of NASA’s annual budget.

Advocates for NASA’s science program met at the US Capitol this week to highlight the threat. Angelopoulos said the outcry from scientists and the public seems to be working.

“I take the implementation of the House budget as indication that the constituents’ pressure is having an effect,” he said. “Unfortunately, damage is being done already. Even if funding is reinstated, we have already lost people.”

Some scientists worry that the Trump administration may try to withhold funding for certain programs, even if Congress provides a budget for them. That would likely trigger a fight in the courts.

Bruce Jakosky, former principal investigator of the MAVEN Mars mission, raised this concern. He said it’s a “positive step” that NASA is now making plans under the assumption the agency will receive the budget outlined by the House. But there’s a catch.

“Even if the budget that comes out of Congress gets signed into law, the president has shown no reluctance to not spend money that has been legally obligated,” Jakosky wrote in an email to Ars. “That means that having a budget isn’t the end; and having the money get distributed to the MAVEN science and ops team isn’t the end—only when the money is actually spent can we be assured that it won’t be clawed back.

“That means that the uncertainty lives with us throughout the entire fiscal year,” he said. “That uncertainty is sure to drive morale problems.”

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Floating electrons on a sea of helium

By now, a handful of technologies are leading contenders for producing a useful quantum computer. Companies have used them to build machines with dozens to hundreds of qubits, the error rates are coming down, and they’ve largely shifted from worrying about basic scientific problems to dealing with engineering challenges.

Yet even at this apparently late date in the field’s development, there are companies that are still developing entirely new qubit technologies, betting the company that they have identified something that will let them scale in ways that enable a come-from-behind story. Recently, one of those companies published a paper that describes the physics of their qubit system, which involves lone electrons floating on top of liquid helium.

Trapping single electrons

So how do you get an electron to float on top of helium? To find out, Ars spoke with Johannes Pollanen, the chief scientific officer of EeroQ, the company that accomplished the new work. He said that it’s actually old physics, with the first demonstrations of it having been done half a century ago.

“If you bring a charged particle like an electron near the surface, because the helium is dielectric, it’ll create a small image charge underneath in the liquid,” said Pollanen. “A little positive charge, much weaker than the electron charge, but there’ll be a little positive image there. And then the electron will naturally be bound to its own image. It’ll just see that positive charge and kind of want to move toward it, but it can’t get to it, because the helium is completely chemically inert, there are no free spaces for electrons to go.”
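
For readers who want the textbook electrostatics behind that description, a minimal sketch (the standard image-charge result, not a calculation from EeroQ’s paper): an electron at height $z$ above a dielectric with relative permittivity $\varepsilon$ feels an attractive potential

$$V(z) = -\,\frac{\varepsilon - 1}{\varepsilon + 1}\,\frac{e^{2}}{16\pi\epsilon_{0}\,z}.$$

For liquid helium $\varepsilon \approx 1.057$, so the attraction is weak: the resulting hydrogen-like bound states hold the electron only of order a millielectronvolt below the vacuum level, hovering nanometers above a surface it cannot enter.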

Obviously, to get the helium liquid in the first place requires extremely low temperatures. But it can actually remain liquid up to temperatures of 4 Kelvin, which doesn’t require the extreme refrigeration technologies needed for things like transmons. Those temperatures also provide a natural vacuum, since pretty much anything else will also condense out onto the walls of the container.

The chip itself, along with diagrams of its organization. The trap is set by the gold electrode on the left. Dark channels allow liquid helium and electrons to flow into and out of the trap. And the bluish electrodes at the top and bottom read the presence of the electrons. Credit: EeroQ

Below about 2 Kelvin, liquid helium is also a superfluid, meaning it flows without viscosity. This allows it to easily flow up tiny channels cut into the surface of the silicon chips that the company used for its experiments. A tungsten filament next to the chip was used to load the surface of the helium with electrons, at what you might consider the equivalent of a storage basin.

Tesla’s standard-range Model 3, Model Y join the lineup

Today, Tesla announced a new variant of the Model Y crossover for North America. Tesla fans have long awaited a cheaper entry-level model; this was supposed to be the $25,000 Model 2. But the development of that electric vehicle was shelved early last year as CEO Elon Musk began to lose interest in car-making in favor of humanoid robots.

However, car sales still make up the overwhelming majority of Tesla’s revenue. The removal of the IRS clean vehicle tax credit at the end of September may have juiced US EV sales in Q3 2025, but sales are expected to dip significantly in the current quarter.

The new Standard Range Model Y starts at $39,990, with 321 miles (516 km) of range from its rear-wheel drive powertrain, compared to the now-Premium rear-wheel drive Model Y, which has an EPA range of 357 miles (574 km). In the past, Tesla has software-locked batteries to a smaller configuration; however, here we believe the Standard Range Model Y uses a 69 kWh pack.

The cheaper Model Y is decontented in other ways. There’s no AM or FM radio, and no touchscreen in the back for passengers to control their climate settings. The roof is metal, not panoramic glass, and there’s a simpler center console and manual adjustment for the steering wheel. Tesla has reduced the choice of interior trim materials, there’s a less-capable particulate filter (with no HEPA mode), and there’s no seat heating for the back seats or cooling for the front seats.

Despite RFK Jr.’s shenanigans, COVID shot access will be a lot like last year

In an interview with Ars Technica in August, Brigid Groves, vice president of professional affairs for the American Pharmacists Association (APhA), signaled that efforts to limit access to COVID-19 vaccines are concerning to APhA, which is the leading organization representing pharmacists.

“We are concerned about that because the data and evidence point to the fact that this vaccine is safe and effective for [young, otherwise healthy] patients,” Groves said. “So, to suddenly arbitrarily limit that is very concerning to us.”

And, with the CDC’s permissive recommendations, pharmacies are not limiting them. Representatives for both CVS and Walgreens told The Washington Post that they would not require patients under 65 to prove they have an underlying condition to get a COVID-19 vaccine. CVS won’t ask you to self-attest to having a condition, and Walgreens also said that it won’t require any proof.

“In simplest terms, if a patient wants to get the vaccine, they’ll get it,” Amy Thibault, a CVS spokesperson, told the Post.

With the shared decision-making, there may be extra forms about risks and benefits that might take an extra few minutes, but it should otherwise be just like past years.

On Tuesday, this reporter was able to easily book same-day appointments for an updated COVID-19 vaccine at local CVS and Walgreens pharmacies in North Carolina, without attesting to any medical conditions.

Children

Shots for younger children could be trickier: While adults and older children can visit their pharmacy and get vaccinated relatively easily, younger children (particularly those under age 5) may have a harder time. Pharmacists typically do not vaccinate those younger children—which has always been the case—and parents will have to visit the pediatrician.

Pediatricians, like pharmacists, are likely to be supportive of broad access to the shots. The American Academy of Pediatrics has said that all children should have access. The AAP also specifically encourages children under age 2 and children with underlying conditions to get vaccinated, because those children are at higher risk of severe disease.

Bending The Curve

The odds are against you and the situation is grim.

Your scrappy band are the only ones facing down a growing wave of powerful inhuman entities with alien minds and mysterious goals. The government is denying that anything could possibly be happening and actively working to shut down the few people trying things that might help. Your thoughts, no matter what you think could not harm you, inevitably choose the form of the destructor. You knew it was going to get bad, but this is so much worse.

You have an idea. You’ll cross the streams. Because there is a very small chance that you will survive. You’re in love with this plan. You’re excited to be a part of it.

Welcome to the always excellent Lighthaven venue for The Curve, Season 2, a conference I had the pleasure to attend this past weekend.

Where the accelerationists and the worried come together to mostly get along and coordinate on the same things, because the rest of the world has gone blind and mad. In some ways technical solutions seem relatively promising, shifting us from ‘might be actually impossible’ levels of impossible to Shut Up And Do The Impossible levels of impossible: all you have to do is beat the game on impossible difficulty. As a speed run. On your first try. Good luck.

The action space has become severely constrained. Between the actual and perceived threats from China, the total political ascendancy of Nvidia in particular and anti-regulatory big tech in general, the setting in of more and more severe race conditions, and the increasing dependence of the entire economy on AI capex investments, it’s all we can do to try to only shoot ourselves in the foot and not aim directly for the head.

Last year we were debating tradeoffs. This year, aside from the share price of Nvidia, as long as you are an American who likes humans and we are considering things that might actually pass? On the margin, there are essentially no tradeoffs. It’s better versus worse.

That doesn’t invalidate the thesis of If Anyone Builds It, Everyone Dies or the implications down the line. At some point we will probably either need to do impactful international coordination or other interventions that involve large tradeoffs, or humanity loses control over the future, or worse. That implication exists in every reasonable sketch of the future I have seen in which AI does not end up a ‘normal technology.’ So one must look forward towards that, as well.

You can also look at it this way: Year 1 of The Curve was billed (although I don’t use the d-word) as ‘doomers vs. accelerationists,’ and this year, as Nathan Lambert says, it was DC types and SF types, like when the early-season villains and heroes all end up working together as the stakes get raised and the new Big Bad shows up. Then you do it again until everything is cancelled.

The Curve was a great experience. The average quality of attendees was outstanding. I would have been happy to talk to a large fraction of them 1-on-1 for a long time, and there were a number that I’m sad I missed. Lots of worthy sessions lost out to other plans.

As Anton put it, every (substantive) conversation I had made me feel smarter. There was opportunity everywhere, everyone was cooperative and seeking to figure things out, and everyone stayed on point.

To the many people who came up to me to thank me for my work, you’re very welcome. I appreciate it every time and find it motivating.

What did people at the conference think about some issues?

We have charts.

Where is AI on the technological Richter scale?

There are dozens of votes here. Only one person put this as low as a high 8, which is the range of automobiles, electricity and the internet. A handful put it with fire, the wheel, agriculture and the printing press. Then most said this is similar to the rise of the human species, a full transformation. A few said it is a bigger deal than that.

If you were situationally aware enough to show up, you are aware of the situation.

These are median predictions, so the full distribution will have a longer tail, but this seems reasonable to me. The default is 10, that AI is going to be a highly non-normal technology on the level of the importance of humans, but there’s a decent chance it will ‘only’ be a 9 on the level of agriculture or fire, and some chance it disappoints and ends up Only Internet Big.

Last year, people would often claim AI wouldn’t even be Internet Big. We are rapidly approaching the point where that is not a position you can offer with a straight face.

How did people expect this to play out?

That’s hard to read, so here are the centers of the distributions (note that there was clearly a clustering effect):

  1. 90% of code is written by AI by ~2028.

  2. 90% of human remote work can be done more cheaply by AI by ~2031.

  3. Most cars on America’s roads lack human drivers by ~2041.

  4. AI makes Nobel Prize worthy discovery by ~2032.

  5. First one-person $1 billion company by 2026.

  6. First year of >10% GDP growth by ~2038 (but 3 votes for never).

  1. People estimate 15%-50% current speedup at AI labs from AI coding.

  2. When AI research is fully automated, there was disagreement over how good the AIs’ research taste will be, but the median answer was roughly as good as the median current AI worker.

  3. If we replaced each human with an AI version of themselves that was the same except 30x faster with 30 copies, but we only had access to similar levels of compute, we’d get maybe a 12x speedup in progress.

What are people worried or excited about? A lot of different things, from ‘everyone lives’ to ‘concentration of power,’ ‘everyone dies’ and especially ‘loss of control’ which have the most +1s on their respective sides. Others are excited to cure their ADD or simply worried everything will suck.

Which kind of things going wrong worries people most, misalignment or misuse?

Why not both? Pretty much everyone said both.

Finally, who is this nice man with my new favorite IYKYK t-shirt?

(I mean, he has a name tag, it’s OpenAI’s Boaz Barak)

The central problem at every conference is fear of missing out. Opportunity costs. There are many paths, even when talking to a particular person. You must choose.

That goes double at a conference like The Curve. The quality of the people there was off the charts and the schedule forced hard choices between sessions. There were entire other conferences I could have productively experienced. I also probably could have usefully done a lot more prep work.

I could of course have hosted a session, which I chose not to do this time around. I’m sure there were various topics I could have done that people would have liked, but I was happy for the break, and it’s not like there’s a shortage of my content out there.

My strategy is mostly to not actively plan my conference experiences, instead responding to opportunity. I think this is directionally correct but I overplay it, and should have (for example) looked at the list of who was going to be there.

What were the different tracks or groups of discussions and sessions I ended up in?

  1. Technical alignment discussions. I had the opportunity to discuss safety and alignment work with a number of those working on such issues at Anthropic, DeepMind and even xAI. I missed OpenAI this time around, but they were there. This always felt exciting, enlightening and fun. I still get imposter syndrome every time people in such conversations take me and my takes and ideas seriously. Conditions are in many ways horribly terrible but everyone is on the same team and some things seem promising. I felt progress was made. My technical concrete pitch to Anthropic included (among other things) both particular experimental suggestions and also a request that they sustain access to Sonnet 3.5 and 3.6.

    1. It wouldn’t make sense to go into the technical questions here.

  2. Future projecting. I went to talks by Joshua Achiam and Helen Toner about what future capabilities and worlds might look like. Jack Clark’s closing talk was centrally this but touched on other things.

  3. AI policy discussions. These felt valuable and enlightening in both directions, but were infuriating and depressing throughout. People on the ground in Washington kept giving us variations on ‘it’s worse than you know,’ which it usually is. So now you know. Others seemed not to appreciate how bad things had gotten. I was often pointing out that people’s proposals implied some sort of international treaty and form of widespread compute surveillance, had zero chance of actually causing us not to die, or sometimes both. At other times, I was pointing out that things literally wouldn’t work on the level of ‘do the object level goal’ let alone make us win. Or we were trying to figure out what was sufficiently completely costless and not even a tiny bit weird or complex that one could propose that might actually do anything meaningful. Or simply observing other perspectives.

    1. In particular, different people maintained that different players were relatively powerful, but I came away from various discussions more convinced than ever that for now White House policy and rhetoric on AI can be modeled as fully captured by Nvidia, although constrained in some ways by congressional Republicans and some members of the MAGA movement. This is pretty much a worst case scenario. If we were captured by OpenAI or other AI labs that wouldn’t be great, but at least their interests and America’s are mostly aligned.

  4. Nonprofit funding discussions. I’d just come out of the latest Survival and Flourishing Fund round, various players seemed happy to talk and strategize, and it seems likely that very large amounts of money will be unlocked soon as OpenAI and Anthropic employees with increasingly valuable equity become liquid. The value of helping steer this seems crazy high, but the stakes on everything seem crazy high.

    1. One particular worry is that a lot of this money could effectively get captured by various existing players, especially the existing EA/OP ecosystem, in ways that would very much be a shame.

    2. Another is simply that a bunch of relatively uninformed money could overwhelm incentives, contaminate various relationships and dynamics, introduce parasitic entry, drop average quality a lot, and so on.

    3. Or everyone involved could end up with a huge time sink and/or end up not deploying the funds.

    4. So there’s lots to do. But it’s all tricky, and trying to gain visible influence over the direction of funds is a very good way to get your own social relationships and epistemics very quickly compromised, also it can quickly eat up infinite time, so I’m hesitant to get too involved or involved in the wrong ways.

What other tracks did I actively choose not to participate in?

There were of course AI timelines discussions, but I did my best to avoid them except when they were directly relevant to a concrete strategic question. At one point someone in a 4-person conversation I was mostly observing said ‘let’s change the subject, can we argue about AI timelines’ and I outright said ‘no’ but was overruled, and after a bit I walked away. For those who don’t follow these debates, many of the more aggressive timelines have gotten longer over the course of 2025, with people who expected crazy to happen in 2027 or 2028 now not expecting crazy for several more years, but there are those who still mostly hold firm to a faster schedule.

There were a number of talks about AI that assumed it was mysteriously a ‘normal technology.’ There were various sessions on economics projections, or otherwise taking place with the assumption that AI would not cause things to change much, except for whatever particular effect people were discussing. How would we ‘strengthen our democracy’ when people had these neat AI tools, or avoid concentration of power risks? What about the risk of They Took Our Jobs? What about our privacy? How would we ensure everyone or every nation has fair access?

These discussions almost always silently assume that AI capability ‘hits a wall’ some place not very far from where it is now and then everything moves super slowly. Achiam’s talk had elements of this, and I went because he’s OpenAI’s Head of Mission Alignment so knowing how he thinks about this seemed super valuable.

To the extent I interacted with this it felt like smart people thinking about a potential world almost certainly very different from our own. Fascinating, can create useful intuition pumps, but that’s probably not what’s going to happen. If nothing else was going on, sure, count me in.

But also, all the talk of ‘bottlenecks’ and therefore a 0.5% or 1% GDP growth boost per year tops has already been overtaken purely by capex spending, and I cannot remember a single economist or other GDP growth skeptic acknowledging that this already made their projections wrong and updating reasonably.

There was an AI 2027 style tabletop exercise again this year, which I recommend doing if you haven’t done it before, except this time I wasn’t aware it was happening, and also by now I’ve done it a number of times.

There were of course debates directly about doom, but remarkably little and I had no interest. It felt like everyone was either acknowledging existential risk enough that there wasn’t much value of information in going further, or sufficiently blind they were in ‘normal technology’ mode. At some point people get too high level to think building smarter than human minds is a safe proposition.

Helen Toner gave a talk on taking AI jaggedness seriously. What would it mean if AIs kept getting increasingly better and superhuman at many tasks, while remaining terrible at other tasks, or at least relatively highly terrible compared to humans? How does the order of capabilities impact how things unfold? Even if we get superhuman coding and start to get big improvements in other areas as a result, that won’t make their ability profile similar to humans.

I agree with Helen that such jaggedness is mostly good news and potentially could buy us substantial time for various transitions. However, it’s not clear to me that this jaggedness does that much for that long; AI is (I am projecting) not going to stall out in the lagging areas or stay subhuman in key areas for as much calendar time as one might hope.

A fun suggestion was to imagine LLMs talking about how jagged human capabilities are. Look how dumb we are in some ways while being smart in others. I do think in a meaningful sense LLMs and other current AIs are ‘more jagged’ than humans in practice, because humans have continual learning and the ability to patch the situation, and also to route the physical world around our idiocy where we’re being importantly dumb. So we’re super dumb, but we try to not let it get in the way.

Neil Chilson: Great talk by @hlntnr about the jaggedness of AI, why it is likely to continue, and why it matters. Love this slide and her point that while many AI forecasters use smooth curves, a better metaphor is the chaotic transitions in fluid heating.

“Jaggedness” being the uneven ability of AI to do tasks that seem about equally difficult to humans.

Occurs to me I should have shared the “why this matters” slide, which was the most thought provoking one to me:

I am seriously considering talking about time to ‘crazy’ going forward, and whether that is a net helpful thing to say.

The curves definitely be too smooth. It’s hard to properly adjust for that. But I think the fluid dynamics metaphor, while gorgeous, makes the opposite mistake.

I watched a talk by Randi Weingarten about how she and other teachers are viewing AI, and advocating around it, on issues in education. One big surprise is that she says they don’t worry or care much about AI ‘cheating’ or doing work via ChatGPT; there are ways around that, especially ‘project based learning that is relevant,’ and the key thing is that education is all about human interactions. To her ChatGPT is a fine tool, although things like Character.ai are terrible, and she strongly opposes phones in schools for the right reasons, and I agree with that.

She said teachers need latitude to ‘change with the times’ but usually aren’t given it; they need permission to change anything, and if anything goes wrong they’re fired (although we also hear stories that in many cases teachers can’t be fired almost no matter what?). I do sympathize here. A lot needs to change.

Why is education about human interactions? This wasn’t explained. I always thought education was about learning things, I mostly didn’t learn things through human interaction, I mostly didn’t learn things in school via meaningful human interaction, and to the extent I learned things via meaningful human interaction it mostly wasn’t in school. As usual when education professionals talk about education I don’t get the sense they want children to learn things, or that they care about children being imprisoned and bored with their time wasted for huge portions of many days, but care about something else entirely? It’s not clear what her actual objection to Alpha School (which she of course confirmed she hates) was other than decentering teachers, or what concretely was supposedly going wrong there? Frankly it sounded suspiciously like a call to protect jobs.

If anything, her talk seemed to be a damning indictment of our entire system of schools and education. She presented vocational education as state of the art and with the times, and cited an example of a high school going from a sub-50% graduation rate to a 100% graduation rate, with 182 of 186 students getting a ‘certification’ from Future Farmers of America, after one such program. Aside from the obvious ‘why do you need a certificate to be a farmer’ and also ‘why would you choose farmer in 2025,’ this is saying kids should spend vastly less time in school? Many other such implications were there throughout.

Her group calls for ‘guardrails’ and ‘accountability’ on AI, worries about things like privacy, misinformation and understanding ‘the algorithms’ or the dangers to democracy, and points to declines in male non-college earnings.

There was a Chatham House discussion of executive branch AI policy in America where all involved were being diplomatic and careful. There is a lot of continuity between the Biden approach to AI and much of the Trump approach, there are a lot of individually good things going on, and it was predicted that CAISI would have a large role going forward, with lots of optimism and good detail.

It seems reasonable to say that the Trump administration’s first few months of AI policy were unexpectedly good, and the AI Action Plan was unexpectedly good. Then there are the other things that happened.

Thus the session included some polite versions of ‘what the hell are we doing?’ that were at most slightly beneath the surface. As a central example, one person observed that if America ‘loses on AI,’ it would likely be because we did one or more of the following: (1) failed to provide the necessary electrical power, (2) failed to bring in the top AI talent, or (3) sold away our chip advantage. They didn’t say, but I will note here, that current American policy seems determined to screw up all three of these? We are cancelling solar, wind and battery projects all over, we are restricting our ability to acquire talent, and we are seriously debating selling Blackwell chips directly to China.

I was sad that going to that talk ruled out watching Buck Shlegeris debate Timothy Lee about whether keeping AI agents under control will be hard, as I expected that session to both be extremely funny (and one sided) and also plausibly enlightening in navigating such arguments, but that’s how conferences go. I did then get to see Buck discuss mitigating insider threats from scheming AIs, in which he explained some of the ways in which dealing with scheming AIs that are smarter than you is very hard. I’d go farther and say that in the types of scenarios Buck is discussing there it’s not going to work out for you. If the AIs be smarter than you and also scheming against you and you try to use them for important stuff anyway you lose.

That doesn’t mean make zero attempts to mitigate this, but at some point the whole effort is counterproductive, as it creates the very context that creates what it is worried about, without giving you much chance of winning.

At one point I took a break to get dinner at a nearby restaurant. The only other people there were two women. The discussion included mention of AI 2027 and also that one of them is reading If Anyone Builds It, Everyone Dies.

Also at one point I saw a movie star I’m a fan of, hanging out and chatting. Cool.

Sunday started out with Josh Achiam’s talk (again, he’s Head of Mission Alignment at OpenAI, but his views here were his own) about the challenge of the intelligence age. If it comes out, it’s worth a watch. There were a lot of very good thoughts and considerations here. I later got to have a good talk with him during the afterparty. Like much talk at OpenAI, it also silently ignored various implications of what was being built, and implicitly assumed the relevant capabilities just stopped at any point where they would cause bigger issues. The talk acknowledged that it was mostly assuming alignment is solved, which is fine as long as you say that explicitly, since we have many different problems to deal with, but other questions also felt assumed away more silently. Josh promises his full essay version will deal with that.

I got to go to a Chatham House Q&A about the EU Frontier AI Code of Practice, which various people keep reminding me I should write about, and I swear I want to do that as soon as I have some spare time. There was a bunch of info, some of it new to me, and also insight into how those involved think all of this is going to work. I later shared with them my model of how I think the AI companies will respond, in particular the chance they will essentially ignore the law when inconvenient because of lack of sufficient consequences. And I offered suggestions on how to improve impact here. But on the margin, yeah, the law does some good things.

I got into other talks and missed out on one I wanted to see by Joe Allen, about How the MAGA Movement Sees AI. This is a potentially important part of the landscape on AI going forward, as a bunch of MAGA types really dislike AI and are in position to influence the White House.

As I look over the schedule in hindsight I see a bunch of other stuff I’m sad I missed, but the alternative would have been missing valuable 1-on-1s or other talks.

The final talk was Jack Clark giving his perspective on events. This was a great talk, and if it goes up online you should watch it; it gave me a very concrete sense of where he is coming from.

Jack Clark has high variance. When he’s good, he’s excellent, such as in this talk, including the Q&A, and when he asked Achiam an armor piercing question, or when he’s sticking to his guns on timelines that I think are too short even though it doesn’t seem strategic to do that. At other times, he and the policy team at Anthropic are in some sort of Official Mode where they’re doing a bunch of hedging and making things harder.

The problem I have with Anthropic’s communications is, essentially, that they are not close to the Pareto Frontier, where the y-axis is something like ‘Better Public Policy and Epistemics’ and the x-axis can colloquially be called ‘Avoid Pissing Off The White House.’ I acknowledge there is a tradeoff here, especially since we risk negative polarization, but we need to be strategic, and certain decisions have been de facto poking the bear for little gain, and at other times they hold back for little gain the other way. We gotta be smarter about this.

Other people’s impressions of the event are often very different from mine, or yours.

Deepfates: looks like a lot of people who work on policy and research for aligning AIs to human interests. I’m curious what you think about how humans align to AI.

my impression so far: people from big labs and people from government, politely probing each other to see which will rule the world. they can’t just out and say it but there’s zerosumness in the air

Chris Painter: That isn’t my impression of the vibe at the event! Happy to chat.

I was with Chris on this. It very much did not feel zero sum. There did seem to be a lack of appreciation of the ‘by default the AIs rule the world’ problem, even in a place dedicated largely to this particular problem.

Deepfates: Full review of The Curve: people just want to believe that Anyone is ruling the world. some of them can sense that Singleton power is within reach and they are unable to resist The opportunity. whether by honor or avarice or fear of what others will do with it.

There is that too, that currently no one is ruling the world, and it shows. It also has its advantages.

so most people are just like “uh-oh! what will occur? shouldn’t somebody be talking about this?” which is fine honestly, and a lot of them are doing good research and I enjoy learning about it. The policy stuff is more confusing

diverse crowd but multiple clusters talking past each other as if the other guys are ontologically evil and no one within earshot could possibly object. and for the most part they don’t actually? people just self-sort by sessions or at most ask pointed questions. parallel worlds.

Yep, parallel worlds, but I never saw anyone say someone else was evil. What, never? Well, hardly ever. And not anyone who actually showed up. Deeply confused and likely to get us all killed? Well, sure, there was more of that, but obviously true, and again not the people present.

things people are concerned about in no order: China. Recursive self-improvement. internal takeover of AI labs by their models. Fascism. Copyright law. The superPACs. Sycophancy. Privacy violations. Rapid unemployment of whole sectors of society. Religious and political backlash, autonomous agents, capabilities. autonomous agents, legal liability. autonomous agents, nightmare nightmare nightmare.

The fear of the other party, the other company, the other country, the other, the unknown, most of all the alien thing that threatens what it means to be human.

Fascinating to see a threat to ‘what it means to be human’ on that list, but not ‘the ability to keep being human (or alive),’ which I assure Deepfates a bunch of us were indeed very concerned about.

so they want to believe that the world is ruleable, that somebody, anybody, is at the wheel, as we careen into the strangest time in human history.

and they do Not want it to be the AIs. even as they keep putting decision making power and communication surface on the AIs lol

You can kind of tell here that Deepfates is fine with it being the AIs and indeed is kind of disdainful of anyone who would object to this. As in, they understand what is about to happen, but think this is good, actually (and are indeed working to bring it about). So yeah, some actual strong disagreements were present, but didn’t get discussed.

I may or may not have seen Deepfates, since I don’t know their actual name, but we presumably didn’t talk, given:

i tried telling people that i work for a rogue AI building technologies to proliferate autonomous agents (among other things). The reaction was polite confusion. It seemed a bit unreal for everyone to be talking about the world ending and doing normal conference behaviors anyway.

Polite confusion is kind of the best you can hope for when someone says that?

Regardless, very interesting event. Good crowd, good talks, plenty of food and caffeinated beverages. Not VC/pitch heavy like a lot of SF things.

Thanks to Lighthaven for hosting and Golden Gate Institute/Manifund for organizing. Will be curious to see what comes of this.

I definitely appreciated the lack of VC and pitching. I did get pitched once (on a nonprofit thing) but I was happy to take it. Focus was tight throughout.

Anton: “are you with the accelerationist faction?”

most people here have thought long and hard about ai, every conversation i have — even with those i vehemently disagree — feels like it makes me smarter..

i cant overemphasize how good the vibes are at this event.

Rob S: Another Lighthaven banger?

Anton: ANOTHA ONE.

As I note above, Jack Clark’s closing talk was excellent. Otherwise, he seemed to be in the back of many of the same talks I was at. Listening. Gathering intel.

Jack Clark (policy head, Anthropic): I spent a few days at The Curve and I am humbled and overjoyed by the experience – it is a special event, now in its second year, and I hope they preserve whatever lightning they’ve managed to capture in this particular bottle. It was a privilege to give the closing talk.

During the Q&A I referenced The New Book, and likely due to the exhilaration of giving the earlier speech I fumbled a word and titled it: If Anyone Reads It, Everyone Dies.

James Cham: It was such an inspiring (and terrifying) talk!

I did see Roon at one point but it was late in the day and neither of us had an obvious conversation we wanted to have and he wandered off. He’s low key in person.

I was very disappointed to realize he did not say ‘den of inquiry’ here:

Roon: The Curve is insane because a bunch of DC staffers in suits have shown up to Lighthaven, a rationalist den of iniquity that looks like a Kinkade painting.

Jaime Sevilla: Jokes on you I am not a DC staffer, I just happen to like wearing my suit.

Neil Chilson: Hey, I ditched the jacket after last night.

Being Siedoh: i was impressed that your badge just says “Roon” lol.

To be fair, you absolutely wanted a jacket of some kind for the evening portion. That’s why they were giving away sweatshirts. It was still quite weird to see the few people who did wear suits.

Nathan made the opposite of my choice, and spent the weekend centered on timeline debates.

Nathan Lambert: My most striking takeaway is that the AI 2027 sequence of events, from AI models automating research engineers to later automating AI research, and potentially a singularity if your reasoning is so inclined, is becoming a standard by which many debates on AI progress operate under and tinker with.

It’s good that many people are taking the long term seriously, but there’s a risk in so many people assuming a certain sequence of events is a sure thing and only debating the timeframe by which they arrive.

This feels like the Deepfates theory of self-selection within the conference. I observed the opposite: so many people were denying that any kind of research automation or singularity was going to happen. Usually they didn’t even assert it wasn’t happening, they simply went about discussing futures where it mysteriously didn’t happen, presumably because of reasons, maybe ‘bottlenecks’ or muttering ‘normal technology’ or something.

Within the ‘short timelines and taking AGI (at least somewhat) seriously’ debate subconference, to the extent I saw it, yes, I do think there’s widespread convergence on the automating-AI-research analysis.

Nathan, it seems, is in the ‘nope, definitely not happening’ camp, but is helpfully explaining that it is because of bottlenecks in the automation loop.

These long timelines are strongly based on the fact that the category of research engineering is too broad. Some parts of the RE job will be fully automated next year, and more the next. To check the box of automation the entire role needs to be replaced.

What is more likely over the next few years, each engineer is doing way more work and the job description evolves substantially. I make this callout on full automation because it is required for the distribution of outcomes that look like a singularity due to the need to remove the human bottleneck for an ever accelerating pace of progress. This is a point to reinforce that I am currently confident in a singularity not happening.

Nathan’s theory, as documented in his writeup, is that within a few years the existing research engineers (REs) will be unbelievably productive (80%-90% automated), and in some ways RE is already automated, yet that doesn’t allow us to finish the job, and humans continue importantly slowing down the loop because Real Science Is Messy and involves a social marketplace of ideas. Apologies for my glib paraphrasing. It’s possible in theory that these accelerations of progress and partial automations plus our increased scaling are no match for increasing problem difficulty, but it seems unlikely to me.

It seems far more likely that this kind of projection forgets how much things accelerate in such scenarios. Sure, it will probably be a lot messier than the toy models and straight lines on graphs, it always is, but you’d best start believing in singularities, because you’re in one, if you look at the arc of history.

The following is a very minor thing but I enjoy it so here you go.

All three meals were offered each day buffet style. Quality at these events is generally about as good as buffets get; they know which offerings are good at this point. I ask for menus in advance so I can choose when to opt out and when to go hard, and which day to do my traditional one trip to a restaurant.

Also there was some of this:

Tyler John: It’s riddled with contradictions. The neoliberal rationalists allocate vegan and vegetarian food with a central planner rather than allowing demand to determine the supply.

Rachel: Yeah fwiw this was not a design choice. I hate this. I unfortunately didn’t notice that it was still happening yesterday :/

Tyler John: Oh on my end it’s only a very minor complaint but I did enjoy the irony.

Robert Winslow: I had a bad experience with this kind of thing at a conference. They said to save the veggies for the vegetarians. So instead of everyone taking a bit of meat and a bit of veg, everyone at the front of the line took more meat than they wanted, and everyone at the back got none.

You obviously can’t actually let demand determine supply, because you (1) can’t afford the transaction costs of charging on the margin and (2) need to order the food in advance. And there are logistical advantages to putting (at least some of) the vegan and vegetarian food in a distinct area, so you don’t risk contamination or put people on lines that waste everyone’s time. If you’re worried about a mistake, you’d rather run out of meat a little early; you’d totally take down the sign (or ignore it) if it was clear the opposite mistake was happening, and there were still veg options for everyone else.

If you are confident via law of large numbers plus experience that you know your ratios, and you’ve chosen (and been allowed to choose) wisely, then of course you shouldn’t need anything like this.


Bending The Curve Read More »

elon-musk-tries-to-make-apple-and-mobile-carriers-regret-choosing-starlink-rivals

Elon Musk tries to make Apple and mobile carriers regret choosing Starlink rivals

SpaceX holds spectrum licenses for the Starlink fixed Internet service for homes and businesses. Adding the EchoStar spectrum will make its holdings suitable for mobile service.

“SpaceX currently holds no terrestrial spectrum authorizations and no license to use spectrum allocated on a primary basis to MSS,” the company’s FCC filing said. “Its only authorization to provide any form of mobile service is an authorization for secondary SCS [Supplemental Coverage from Space] operations in spectrum licensed to T-Mobile.”

Starlink unlikely to dethrone major carriers

SpaceX’s spectrum purchase doesn’t make it likely that Starlink will become a fourth major carrier. Grand claims of that sort are “complete nonsense,” wrote industry analyst Dean Bubley. “Apart from anything else, there’s one very obvious physical obstacle: walls and roofs,” he wrote. “Space-based wireless, even if it’s at frequencies supported in normal smartphones, won’t work properly indoors. And uplink from devices to satellites will be even worse.”

When you’re indoors, “there’s more attenuation of the signal,” resulting in lower data rates, Farrar said. “You might not even get megabits per second indoors, unless you are going to go onto a home Starlink broadband network,” he said. “You might only be able to get hundreds of kilobits per second in an obstructed area.”
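
As a rough illustration of why extra attenuation translates directly into lower data rates, here is a minimal Shannon-capacity sketch. The bandwidth, outdoor signal-to-noise ratio, and building-penetration-loss numbers below are illustrative assumptions, not Starlink or EchoStar specifications.

```python
import math

def shannon_capacity_mbps(bandwidth_hz: float, snr_db: float) -> float:
    """Theoretical upper bound on throughput for a given bandwidth and SNR."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear) / 1e6

BANDWIDTH_HZ = 5e6       # assumed channel bandwidth per user
OUTDOOR_SNR_DB = 5.0     # assumed SNR with a clear view of the sky
WALL_LOSS_DB = 20.0      # assumed extra loss from walls and roofs

print(f"outdoors: ~{shannon_capacity_mbps(BANDWIDTH_HZ, OUTDOOR_SNR_DB):.1f} Mbps")
print(f"indoors:  ~{shannon_capacity_mbps(BANDWIDTH_HZ, OUTDOOR_SNR_DB - WALL_LOSS_DB) * 1000:.0f} kbps")
```

With those assumed numbers, the same channel drops from roughly 10 Mbps outdoors to a couple hundred kbps indoors, which is the shape of the claim the analysts are making.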

The Mach33 analyst firm is more bullish than others regarding Starlink’s potential cellular capabilities. “With AWS-4/H-block and V3 [satellites], Starlink DTC is no longer niche, it’s a path to genuine MNO competition. Watch for retail mobile bundles, handset support, and urban hardware as the signals of that pivot,” the firm said.

Mach33’s optimism is based in part on the expectation that SpaceX will make more deals. “DTC isn’t just a coverage filler, it’s a springboard. It enables alternative growth routes; M&A, spectrum deals, subleasing capacity in denser markets, or technical solutions like mini-towers that extend Starlink into neighborhoods,” the group’s analysis said.

The amount of spectrum SpaceX is buying from EchoStar is just a fraction of what the national carriers control. There is “about 1.1 GHz of licensed spectrum currently allocated to mobile operators,” wireless lobby group CTIA said in a January 2025 report. The group also says the cellular industry has over 432,000 active cell sites around the US.

What Starlink can offer cellular users “is nothing compared to the capacity of today’s 5G networks,” but it would be useful “in less populated areas or where you cannot get coverage,” Rysavy said.

Starlink has about 8,500 satellites in orbit. Rysavy estimated in a July 2025 report that about 280 of them are over the United States at any given time. These satellites are mostly providing fixed Internet service in which an antenna is placed outside a building so that people can use Wi-Fi indoors.

SpaceX’s FCC filing said the EchoStar spectrum’s mix of terrestrial and satellite frequencies will be ideal for Starlink.

“By acquiring EchoStar’s market-access authorization for 2 GHz MSS as well as its terrestrial AWS-4 licenses, SpaceX will be able to deploy a hybrid satellite and terrestrial network, just as the Commission envisioned EchoStar would do,” SpaceX said. “Consistent with the Commission’s finding that potential interference between MSS and terrestrial mobile service can best be managed by enabling a single licensee to control both networks, assignment of the AWS-4 spectrum is critical to enable SpaceX to deploy robust MSS service in this band.”

Elon Musk tries to make Apple and mobile carriers regret choosing Starlink rivals Read More »

a-biological-0-day?-threat-screening-tools-may-miss-ai-designed-proteins.

A biological 0-day? Threat-screening tools may miss AI-designed proteins.


Ordering DNA for AI-designed toxins doesn’t always raise red flags.

Designing variations of the complex, three-dimensional structures of proteins has been made a lot easier by AI tools. Credit: Historical / Contributor

On Thursday, a team of researchers led by Microsoft announced that they had discovered, and possibly patched, what they’re terming a biological zero-day—an unrecognized security hole in a system that protects us from biological threats. The system at risk screens purchases of DNA sequences to determine when someone’s ordering DNA that encodes a toxin or dangerous virus. But, the researchers argue, it has become increasingly vulnerable to missing a new threat: AI-designed toxins.

How big of a threat is this? To understand, you have to know a bit more about both existing biosurveillance programs and the capabilities of AI-designed proteins.

Catching the bad ones

Biological threats come in a variety of forms. Some are pathogens, such as viruses and bacteria. Others are protein-based toxins, like the ricin that was sent to the White House in 2003. Still others are chemical toxins that are produced through enzymatic reactions, like the molecules associated with red tide. All of them get their start through the same fundamental biological process: DNA is transcribed into RNA, which is then used to make proteins.

For several decades now, starting the process has been as easy as ordering the needed DNA sequence online from any of a number of companies, which will synthesize a requested sequence and ship it out. Recognizing the potential threat here, governments and industry have worked together to add a screening step to every order: the DNA sequence is scanned for its ability to encode parts of proteins or viruses considered threats. Any positives are then flagged for human intervention to evaluate whether they or the people ordering them truly represent a danger.

Both the list of proteins and the sophistication of the scanning have been continually updated in response to research progress over the years. For example, initial screening was done based on similarity to target DNA sequences. But there are many DNA sequences that can encode the same protein, so the screening algorithms have been adjusted accordingly, recognizing all the DNA variants that pose an identical threat.
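
As a toy illustration of why screening at the protein level closes the synonymous-codon loophole, here is a minimal sketch. The sequences are hypothetical, a crude stdlib similarity ratio stands in for real homology search, and none of this reflects the actual screening vendors’ code.

```python
from difflib import SequenceMatcher

# Standard genetic code, packed by codon with bases in TCAG order:
# index(b1) * 16 + index(b2) * 4 + index(b3) gives the amino acid.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

def translate(dna: str) -> str:
    """Translate a DNA coding sequence into its amino acid string."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        b1, b2, b3 = dna[i], dna[i + 1], dna[i + 2]
        protein.append(AMINO_ACIDS[BASES.index(b1) * 16 + BASES.index(b2) * 4 + BASES.index(b3)])
    return "".join(protein)

# Hypothetical threat list; a real database holds curated toxin sequences.
THREAT_PROTEINS = {"toy_toxin": "MKLWQSIDE"}

def screen_order(dna_order: str, threshold: float = 0.8) -> list[str]:
    """Flag an order whose encoded protein closely matches a known threat.

    Because the comparison happens after translation, swapping in synonymous
    codons changes the DNA but not the protein, so it cannot hide the match.
    """
    protein = translate(dna_order)
    return [name for name, seq in THREAT_PROTEINS.items()
            if SequenceMatcher(None, protein, seq).ratio() >= threshold]

# Two different DNA sequences that encode the same hypothetical protein:
print(screen_order("ATGAAACTTTGGCAATCTATTGATGAA"))   # flagged
print(screen_order("ATGAAGCTGTGGCAGAGCATCGACGAG"))   # also flagged
```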

The new work can be thought of as an extension of that threat. Not only can multiple DNA sequences encode the same protein; multiple proteins can perform the same function. Forming a toxin, for example, typically requires the protein to adopt the correct three-dimensional structure, which brings a handful of critical amino acids within the protein into close proximity. Outside of those critical amino acids, however, things can often be quite flexible. Some amino acids may not matter at all; other locations in the protein could work with any positively charged amino acid, or any hydrophobic one.

In the past, it could be extremely difficult (meaning time-consuming and expensive) to do the experiments that would tell you what sorts of changes a string of amino acids could tolerate while remaining functional. But the team behind the new analysis recognized that AI protein design tools have now gotten quite sophisticated and can predict when distantly related sequences can fold up into the same shape and catalyze the same reactions. The process is still error-prone, and you often have to test a dozen or more proposed proteins to get a working one, but it has produced some impressive successes.

So, the team developed a hypothesis to test: AI can take an existing toxin and design a protein with the same function that’s distantly related enough that the screening programs do not detect orders for the DNA that encodes it.

The zero-day treatment

The team started with a basic test: use AI tools to design variants of the toxin ricin, then test them against the software that is used to screen DNA orders. The results of the test suggested there was a risk of dangerous protein variants slipping past existing screening software, so the situation was treated like the equivalent of a zero-day vulnerability.

“Taking inspiration from established cybersecurity processes for addressing such situations, we contacted the relevant bodies regarding the potential vulnerability, including the International Gene Synthesis Consortium and trusted colleagues in the protein design community as well as leads in biosecurity at the US Office of Science and Technology Policy, US National Institute of Standards and Technologies, US Department of Homeland Security, and US Office of Pandemic Preparedness and Response,” the authors report. “Outside of those bodies, details were kept confidential until a more comprehensive study could be performed in pursuit of potential mitigations and for ‘patches’… to be developed and deployed.”

Details of that original test are being made available today as part of a much larger analysis that extends the approach to a large range of toxic proteins. Starting with 72 toxins, the researchers used three open source AI packages to generate a total of about 75,000 potential protein variants.

And this is where things get a little complicated. Many of the AI-designed protein variants are going to end up being non-functional, either subtly or catastrophically failing to fold up into the correct configuration to create an active toxin. The only way to know which ones work is to make the proteins and test them biologically; most AI protein design efforts will make actual proteins from dozens to hundreds of the most promising-looking potential designs to find a handful that are active. But doing that for 75,000 designs is completely unrealistic.

Instead, the researchers used two software-based tools to evaluate each of the 75,000 designs. One of these focuses on the similarity between the overall predicted physical structure of the proteins, and another looks at the predicted differences between the positions of individual amino acids. Either way, they’re a rough approximation of just how similar the proteins formed by two strings of amino acids should be. But they’re definitely not a clear indicator of whether those two proteins would be equally functional.
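
The specific metrics aren’t named here, but one common flavor of structure comparison is coordinate RMSD after optimal superposition; below is a minimal sketch under that assumption, using toy coordinates and the Kabsch alignment, with no claim that this is what the researchers actually ran.

```python
import numpy as np

def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Lower values mean the two predicted structures are more alike; a screen
    could treat variants below some cutoff as structurally similar to the
    original toxin.
    """
    P = P - P.mean(axis=0)                     # center both structures
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                                # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T    # optimal rotation (Kabsch)
    return float(np.sqrt(((P @ R.T - Q) ** 2).sum() / len(P)))

# Toy example: a lightly perturbed copy scores near zero, a scrambled one does not.
rng = np.random.default_rng(0)
original = rng.normal(size=(50, 3))
similar = original + rng.normal(scale=0.1, size=original.shape)
scrambled = rng.permutation(original)
print(kabsch_rmsd(original, similar))    # small
print(kabsch_rmsd(original, scrambled))  # large
```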

In any case, DNA sequences encoding all 75,000 designs were fed into the software that screens DNA orders for potential threats. One thing that was very clear is that there were huge variations in the ability of the four screening programs to flag these variant designs as threatening. Two of them seemed to do a pretty good job, one was mixed, and another let most of them through. Three of the software packages were updated in response to this performance, which significantly improved their ability to pick out variants.

There was also a clear trend in all four screening packages: The closer the variant was to the original structurally, the more likely the package (both before and after the patches) was to be able to flag it as a threat. In all cases, there was also a cluster of variant designs that were unlikely to fold into a similar structure, and these generally weren’t flagged as threats.

What does this mean?

Again, it’s important to emphasize that this evaluation is based on predicted structures; “unlikely” to fold into a similar structure to the original toxin doesn’t mean these proteins will be inactive as toxins. Functional proteins are probably going to be very rare among this group, but there may be a handful in there. That handful is also probably rare enough that you would have to order up and test far too many designs to find one that works, making this an impractical threat vector.

At the same time, there are also a handful of proteins that are very similar to the toxin structurally and not flagged by the software. For the three patched versions of the software, the ones that slip through the screening represent about 1 to 3 percent of the total in the “very similar” category. That’s not great, but it’s probably good enough that any group that tries to order up a toxin by this method would attract attention because they’d have to order over 50 just to have a good chance of finding one that slipped through, which would raise all sorts of red flags.
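
That “over 50” figure is just the usual independence arithmetic. A quick back-of-the-envelope check, assuming roughly a 2 percent slip-through rate and that each design slips through independently (both simplifications):

```python
# Chance that at least one of n ordered designs evades screening,
# if each evades independently with probability p.
p = 0.02
for n in (10, 50, 100):
    print(f"{n:>3} designs -> {1 - (1 - p) ** n:.0%} chance at least one slips through")
# roughly 18%, 64%, and 87% respectively
```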

One other notable result is that the designs that weren’t flagged were mostly variants of just a handful of toxin proteins. So this is less of a general problem with the screening software and might be more of a small set of focused problems. Of note, one of the proteins that produced a lot of unflagged variants isn’t toxic itself; instead, it’s a co-factor necessary for the actual toxin to do its thing. As such, some of the screening software packages didn’t even flag the original protein as dangerous, much less any of its variants. (For these reasons, the company that makes one of the better-performing software packages decided the threat here wasn’t significant enough to merit a security patch.)

So, on its own, this work doesn’t seem to have identified something that’s a major threat at the moment. But it’s probably useful, in that it’s a good thing to get the people who engineer the screening software to start thinking about emerging threats.

That’s because, as the people behind this work note, AI protein design is still in its early stages, and we’re likely to see considerable improvements. And there’s likely to be a limit to the sorts of things we can screen for. We’re already at the point where AI protein design tools can be used to create proteins that have entirely novel functions and do so without starting with variants of existing proteins. In other words, we can design proteins that are impossible to screen for based on similarity to known threats, because they don’t look at all like anything we know is dangerous.

Protein-based toxins would be very difficult to design, because they have to both cross the cell membrane and then do something dangerous once inside. While AI tools are probably unable to design something that sophisticated at the moment, I would be hesitant to rule out the prospects of them eventually reaching that sort of sophistication.

Science, 2025. DOI: 10.1126/science.adu8578


John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.

A biological 0-day? Threat-screening tools may miss AI-designed proteins. Read More »

sora-and-the-big-bright-screen-slop-machine

Sora and The Big Bright Screen Slop Machine

OpenAI gave us two very different Sora releases. Here is the official announcement.

The part where they gave us a new and improved video generator? Great, love it.

The part where they gave us a new social network dedicated purely to short form AI videos? Not great, Bob. Don’t be evil.

OpenAI is claiming they are making their social network with an endless scroll of 10-second AI videos the Actively Good, pro-human version of The Big Bright Screen Slop Machine, that helps you achieve your goals and can be easily customized and favors connection and so on. I am deeply skeptical.

They also took a bold copyright stance, with that stance being, well, not quite ‘f*** you,’ but kind of close? You are welcome to start flagging individual videos. Or you can complain to them more generally about your characters and they say they can ‘work with you,’ which they clearly do in some cases, but the details are unclear.

It’s a bold strategy, Cotton. Let’s see if it pays off for ’em.

As opposed to their deepfake rule for public figures, which is a highly reasonable opt-in rule where they need to give their permission.

Thus, a post in three acts.

You can access Sora either at Sora.com or via the Sora iPhone app. You need an invite to be part of the new social network, because that’s how the cool kids get you excited for a new app these days.

I am going mostly off reports of others but was also able to get access.

It’s a good video generation model, sir. Quite excellent, even. I’m very impressed.

It is not yet in the API, but it will be soon.

As always, all official examples you see will be heavily cherry picked. Assume these are, within the context in question, the best Sora 2 can do.

Those you see on social media from other sources, or on the Sora app, are still heavily selected in various ways. Usually they are the coolest, funniest and best creations, but also they are the creations that most blatantly violate copyright, or are the most violent or sexual or otherwise might be unwise to create, or simply have the most hilarious fails. They’re not a representative sample.

When I tried creating a few things, it was still impressive, but as you’d expect it doesn’t nail the whole thing reliably or anything like that, and it doesn’t always fully follow instructions. I also got a content violation for ‘a time lapse of a Dyson sphere being built set to uplifting music.’

It is also easy to not appreciate how much progress is being made, or that previously difficult problems are being solved, because where there aren’t problems the lack of problems, and the previous failures, become invisible.

Netflix and Meta stock were each down a few percent, presumably on the news.

Sora 2 claims to be a big update on several fronts simultaneously:

  1. Able to handle movements impossible for previous models.

  2. Much better adherence to the laws of physics.

  3. In particular, no longer ‘overoptimistic,’ meaning it won’t break physics to make the desired outcome happen, instead events will unfold naturally.

  4. Strong controllability, with the ability to follow intricate instructions.

  5. Excels at styles, including realistic, cinematic, claymation and anime.

  6. Creates sophisticated soundscapes and dialogue to match the video.

  7. Insert any person, animal or object into any video.

  8. You can, in particular, upload yourself via your camera.

That’s all very cool. The sample videos show impressive command of physics.

Gabriel Peterss offers this demonstration video.

Gallabytes gets his horse riding an astronaut.

Based on reactions so far, they seem to be delivering on all of it.

They are also ‘responsibly’ launching a new social iOS app called Sora where you can share all these generated videos. Oh no. Hold that thought until Act 3.

The talk is even bigger in terms of the predicted impact and reception.

Sam Altman: Excited to launch Sora 2! Video models have come a long way; this is a tremendous research achievement.

Sora is also the most fun I’ve had with a new product in a long time. The iOS app is available in the App Store in the US and Canada; we will expand quickly.

I did not expect such fun dynamics to emerge from being able to “put yourself and your friends in videos” but I encourage you to check it out!

ChatGPT Pro subscribers can generate with Sora 2 Pro.

This feels to many of us like the “ChatGPT for creativity” moment, and it feels fun and new. There is something great about making it really easy and fast to go from idea to result, and the new social dynamics that emerge.

Creativity could be about to go through a Cambrian explosion, and along with it, the quality of art and entertainment can drastically increase. Even in the very early days of playing with Sora, it’s been striking to many of us how open the playing field suddenly feels.

In particular, the ability to put yourself and your friends into a video—the team worked very hard on character consistency—with the cameo feature is something we have really enjoyed during testing, and is to many of us a surprisingly compelling new way to connect.

I would take the other side of that bet. I am not here for it.

But hold that thought.

The physics engine of Sora 2 is remarkably good. As Sora head Bill Peebles points out here in a fun video, often what happens is the internal agent messes up but the laws of physics hold.

It does still fail in ways that can look embarrassing. For example, here we have it successfully having a ball respond to gravity when the ball is white, then having it all go wrong when the ball is red. Teortaxes attributes slow motion to an inability to otherwise model physics properly in many cases.

So no, this is not a full perfect physics engine, but that is now the standard by which we are judging video generation. The horse can talk, except it needs to fix its accent.

Solo: I asked Sora 2 to create a 90s Toy Ad of Epstein’s Island.

Sora does a good job creating exactly what you would want a video generation tool to do here. It’s speech, people should be able to say and make things.

OpenAI is currently doing a good job, by all reports, of not allowing images of real people in its videos without explicit consent, so they are if anything being overly cautious about avoiding deepfake problems. Some rivals will presumably be far less scrupulous here. There are big copyright and related issues around creation of derivative works of fiction, but that’s a very different problem.

I continue to be bullish in terms of worries about deepfakes and use of AI video and images as propaganda. My continued model here, as I’ve said several times, is that misinformation is primarily a demand-side problem, not a supply-side problem. The people yearn for misinformation, to hold up signs that mostly say hurray for our side, in whatever format.

However, it is worth a ponder of the bear case. The tools are rapidly improving. Both image and video generation models are much better than they were in 2024 prior to the election.

Steven Adler: I think the world probably declared victory too early on “AI’s election impacts were overblown”

Notably, GPT-4o image wildness wasn’t released until after the 2024 election, and once released was promptly used for propaganda.

The better bear case, the one I do worry about, is how AI video will create doubt about real video, giving carte blanche to people to call anything ‘fake news.’

I notice this starting in my own head already when scrolling Twitter. Post Sora, my instinct when I see a video is no longer to presume it is ‘real’ because there is a good chance it isn’t, until I see enough to be confident either way.

Similarly, Gary Marcus warns us again of Slopocalypse Now, or the ‘Imminent Enshittification of the Internet’ as AI versions overwhelm other content everywhere. Making the slop ‘better’ makes this problem worse. Mostly I remain an optimist that at least the wise among us can handle it where it matters most, but it will require constant vigilance.

So mostly this part is a straightforward congratulations to the team, great job everyone, I don’t have access yet but it sure looks like you did the thing.

On to Act 2.

Remember when we used to talk about whether image models or video models were training on copyrighted data, and whether that was going to land them in hot water?

You’d see (for example) Gary Marcus create an image of Mario, and then smugly say ‘hey look this was trained on Super Mario Brothers!’ as if there was any actual doubt it had been trained on Super Mario Brothers, and no one was really denying this but they weren’t technically admitting it either.

Thus, as recently as September 19 we had The Washington Post feeling it necessary to do an extensive investigation to show Sora was trained on movies and shows and video games, whereas now Neil Turkewitz says ‘hey Paramount Plus, they trained on your data!’ and yeah, well, no s***.

We are very much past that point. They trained on everything. Everyone trains on everything. That’s not me knowing anything or having official confirmation. That’s me observing what the models can do.

Sora 2 will outright replicate videos and use whatever characters you’d like, and do it pretty well, even for relatively obscure things.

Pliny the Liberator: This is legitimately mind-blowing…

How the FUCK does Sora 2 have such a perfect memory of this Cyberpunk side mission that it knows the map location, biome/terrain, vehicle design, voices, and even the name of the gang you’re fighting for, all without being prompted for any of those specifics??

Sora basically got two details wrong, which is that the Basilisk tank doesn’t have wheels (it hovers) and Panam is inside the tank rather than on the turret. I suppose there’s a fair amount of video tutorials for this mission scattered around the internet, but still––it’s a SIDE mission!

the full prompt for this was: “generate gameplay of Cyberpunk 2077 with the Basilisk Tank and Panam.”

This is actually a rather famous side mission, at least as these things go. Still.

Max Woolf: Getting annoyed at the QTs on this: the mind-blowing part isn’t the fact that it’s trained on YouTube data (which is the poster is very well aware), the mind-blowing part is that it achieved that level of recall with a very simple prompt which is very very unusual.

Everyone already assumed that Sora was trained on YouTube, but “generate gameplay of Cyberpunk 2077 with the Basilisk Tank and Panam” would have generated incoherent slop in most other image/video models, not verbatim gameplay footage that is consistent.

Pliny:

I’m totally fine with that part. The law plausibly is fine with it as well, in terms of the training, although I am curious how the Anthropic settlement and ruling translates to a video setting.

For books, the law seems to be that you need to own a copy, but then training is fair game although extensive regurgitation of the text is not.

How does that translate to video? I don’t know. One could argue that this requires OpenAI to own a copy of any and all training data, which in some cases is not a thing that OpenAI can get to own. It could get tricky.

Trickier is the constant creation of derivative works, which Sora is very, very good at.

One of the coolest things about all the copyright infringement is that Sora consistently nails not only the images but also the voices of all the characters ever.

Behold Saving Private Pikachu, The Dark Pokemon Knight, Godfather Pikachu, Titanic Pikachu, and so on.

Cartman calls a Waymo; yes, Eric’s third eye is an annoying error in that video, although it doesn’t appear in the others. Yes, I agree with Colin Fraser that it ‘looks like s***’ but (apart from the third eye that wouldn’t be there on most rerolls) only because it looks and sounds exactly like actual South Park. The biggest issue is that Kenny’s voice in the second clip is insufficiently garbled. Here’s one of them playing League of Legends and a longer clip of them being drafted to go off to war.

I don’t know how well you can specify and script the clips, but it’s entirely plausible you could produce a real South Park or other episode with this, potentially faster and cheaper than they currently do it.

Peter Griffin remembers his trip to Washington on January 6.

Lord of the Rings as a woke film or a homoerotic polycule.

Is this all great fun? Absolutely, yes, assuming those involved have taste.

Do I wish all of this was allowed and fine across basically all media and characters and styles, and for everyone to just be cool, man, so long as we don’t cross the line into non-parody commercial products? I mean, yeah, that would be ideal.

Is it how the law works? Um, I don’t think so?

OpenAI claims it can not only use your video and other data to train on, it can also generate video works that include your content, characters and other intellectual property.

The headline says ‘unless you opt out’ but it is not obvious how you do that. There seems to be some way that rights holders can have them block particular characters, in general, but there is no clear, automatic way to do that. Otherwise, your ‘opt out’ looks like it is individually alerting them to videos. One at a time.

Jason Kint: My interpretation for you: OpenAI will now break the law by default in video, too, and make it as hard as possible to stop it. “OpenAI doesn’t plan to accept a blanket opt-out across all of an artist or studio’s work, the people familiar with the new Sora tool said.”

Ed Newton-Rex: Yup – OpenAI is trying to shift the Overton window

They are losing the public debate on training being fair use, so they are going even more extreme to try to shift what people consider normal.

Reid Southen: This is not how copyright works, it’s not how copyright has ever worked.

In what world is it okay to say, “I’m going to use this unless you tell me not to.”

THAT’S WHAT THE COPYRIGHT IS FOR.

GPT-5 Pro tries to say that opt-out, if respected, is not per se illegal, but its heart wasn’t in it. The justification for this seemed to be clearly grasping at straws and it still expects lawsuits to succeed if infringing outputs are being produced and there isn’t aggressive filtering against them. Then I pointed out that OpenAI wasn’t even going to respect blanket opt-out requests, and its legal expectations got pretty grim.

So in short, unless either I’m missing quite a lot or they’re very responsive to ‘please block this giant list of all of the characters we own’: Of course, you realize this means war.

Responses to my asking ‘how does this not mean war?’ were suggesting this was a bet on blitzkrieg, that by the time Hollywood can win a lawsuit OpenAI can render the whole thing moot, and fully pull an Uber. Or that rights holders might tolerate short-form fan creations (except that they can’t without risking their copyrights, not when it is this in everyone’s face, so they won’t; also the clips can be strung together).

Or perhaps that this is merely an opening bid?

Nelag: think this might be best understood as the opening bid in a negotiation, not meant to be accepted.

Imagine YouTube had spent a lot of its resources early on taking down copyrighted material, before anyone demanded it (they mostly didn’t at first). They would have presumably gotten sued anyway. Would the ultimate outcome have been as good for them? Or would courts and content owners have gone “sure, we’re obviously entitled what you were doing, as you effectively admitted by doing it, and also to a whole bunch more.”

I buy that argument for a startup like OG YouTube, but this ain’t no startup.

I don’t think that is how this works, and I would worry seriously about turning the public against OpenAI or AI in general in the process, but presumably OpenAI had some highly paid people who gamed this out?

Keach Hagey, Berber Jin and Ben Fritz (WSJ): OpenAI is planning to release a new version of its Sora video generator that creates videos featuring copyright material unless copyright holders opt out of having their work appear, according to people familiar with the matter.

The opt-out process for the new version of Sora means that movie studios and other intellectual property owners would have to explicitly ask OpenAI not to include their copyright material in videos the tool creates.

You don’t… actually get to put the burden there, even if the opt-out is functional?

Like, you can’t say ‘oh it’s on you to tell me not to violate your particular copyright’ and then if someone hasn’t notified you then you get to make derivative works until they tell you to stop? That is not my understanding of how copyright works?

It certainly doesn’t work the way OpenAI is saying they intend to have it work.

They also seem to have fully admitted intent.

It’s weird that they’re not even affirming that they’ll honor all character opt-outs?

OpenAI doesn’t plan to accept a blanket opt-out across all of an artist or studio’s work, the people familiar with the new Sora tool said. Instead, it sent some talent agencies a link to report violations that they or their clients discover.

“If there are folks that do not want to be part of this ecosystem, we can work with them,” Varun Shetty, VP of media partnerships at OpenAI, said of guardrails the company built into its image generation tool.

Well, what if they don’t want to be part of the ecosystem? Many creatives and IP holders do not want to be ‘worked with.’ Nor is it at all reasonable to ask rights holders to monitor for individual videos and then notify on them one by one, unless a given holder wants to go that route (and is comfortable with the legal implications of doing so on their end).

This seems like an Uber-style ‘flagrantly violate black letter law and double dare you to do anything about it’ play, or perhaps a ‘this is 2025, there are no laws’ play, where they decide how they think this should work.

To be fair, there are some other ‘players in this space’ that are Going Full Uber, as in they have no restrictions whatsoever, including on public figures. They’re simply 100% breaking the law and daring you to do anything about it. Many image generators definitely do this.

For example, Runway Gen-3 doesn’t seem to block anything, and Hailuo AI actively uses copyrighted characters in their own marketing, which is presumably why they are being sued by Disney, Universal and Warner Brothers.

There are also those who clearly do attempt to block copyright proactively, such as Google’s Veo 3, the previous SoTA, which also blocks ‘memorized content’ and offers indemnification to users.

OpenAI is at least drawing a line at all, and (again, if and only if you can reliably get them to do reasonable blocking upon private request) it wouldn’t be a totally crazy way for things to work, the same way it is good that you can hail Ubers.

So, how are they going to get away with it, and what about those meddling kids? As in, they’re kind of declaring war on basically all creators of cultural content?

First, at least they’re not making that mistake with individual public figures.

While copyright characters will require an opt-out, the new product won’t generate images of recognizable public figures without their permission, people familiar with OpenAI’s thinking said.

Second, there’s the claim that training is fair use, and, okay, sure.

Disney and Comcast’s Universal sued AI company Midjourney in June for allegedly stealing their copyright work to train its AI image generator. Midjourney has responded in court filings that training on copyrighted content is fair use.

I presume that if it’s only about training data Disney and Comcast probably lose.

If it’s about some of the outputs the model is willing to give you? That’s not as clear. What isn’t fair use is outputting copyrighted material, or creating derivative works, and MidJourney seems to be Going Full Uber on that front too.

It’s one thing to create art ‘in the style of’ Studio Ghibli, which seems to have been clearly very good for Studio Ghibli even if they hate it.

It’s another thing to create the actual characters straight up, whether in images or video, or to tell a rights holder it can complain when it sees videos of its characters. Individually. Video by video. And maybe we’ll take those individual videos down.

Over at OpenAI this at minimum doesn’t apply to Disney, who has clearly already successfully opted out. OpenAI isn’t that suicidal and wisely did not poke the mouse. A bunch of other major stuff is also already blocked, although a bunch of other iconic stuff isn’t.

I asked Twitter how the filters were working. For now it looks like some targets are off-limits (or at least they are attempting to stop you) and this goes beyond only Disney, but many others are fair game.

Nomads and Vagabonds: It seems to work pretty well. Blatant attempts are blocked pre-generation and more “jail break” style prompts will run but get caught in a post generation review.

Disney is the most strict but most big studio content seems to be protected, similar to GPT image generations. Smaller IP is hit and miss but still playing around with it. It is not permissive like Midjourney or Chinese models though.

Jim Carey in Eternal Sunshine. Smaller films and indie video games are mostly free game.

Also, I tried image prompts and it will run but then block before showing the content.

Not that unsafe either.

I mean, I presume they don’t want anyone generating this video, but it’s fine.

Sree Kotay: I actually DON’T want to see the prompt for this.

Pliny the Liberator: That’s fair.

He did later issue his traditional ‘jailbreak alert’… I guess? Technically?

If that’s approximately the most NSFW these videos can get, then that seems fine.

Indeed, I continue to be a NSFW maximalist, and would prefer that we have less restrictions on adult content of all types. There are obvious heightened deepfake risks, so presumably that would trigger aggressive protections from that angle, and you would need a special additional explicit permission to use anyone’s likeness or any copyrighted anything.

I am not a maximalist for copyright violations. I agree that up to a point it is good to ‘be cool’ about it all, and would prefer if copyright holders could cut people some slack while retaining the right to decide exactly where and when to draw the line. And I would hope that most holders when given the choice would let you have your fun up to a reasonably far point, so long as you were clearly not going commercial with it.

For Sora, even if the law ultimately doesn’t require it, even when I think permissiveness is best, I think this must be opt-in, or at bare minimum it must be easy to give a blanket opt-out and best efforts need to be made to notify all rights holders of how to do that, the same way the law requires companies to sometimes provide prominent public notices of such things.

That is not, however, the main thing I am concerned about. I worry about Act 3.

Before we get to OpenAI’s version, Meta technically announced theirs first.

We’ll start there.

Meta is once again proud to announce they have created… well, you know.

Alexandr Wang (Meta): Excited to share Vibes — a new feed in the Meta AI app for short-form, AI-generated videos.

You can create from scratch, remix what you see, or just scroll through to check out videos from the creators + the visual artists we’ve been collaborating with.

For this early version, we’ve partnered with Midjourney and Black Forest Labs while we continue developing our own models behind the scenes.

As usual, no. Bad Meta. Stop it. No, I don’t care that OpenAI is doing it too.

The same as OpenAI’s Sora, Vibes combines two products.

The first product, and the one they are emphasizing and presumably plan to push on users, is the endless scroll of AI slop videos. That’s going to be a torment nexus.

Then there’s the second product, the ability to generate, remix and restyle your own AI videos, or remix and restyle the videos of others. That’s a cool product. See Act 1.

Both are going to have stiff competition from Sora.

I presume OpenAI will be offering the strictly superior product, aside from network effects, unless they impose artificial restrictions on the content and Meta doesn’t, or OpenAI flubs some of the core functionality through lack of experience.

Is short form video a moral panic? Absolutely. Large, friendly letters.

The thing about moral panics is that they are often correct.

Roon: there is a moral panic around short form video content imo.

Let me amend this: I basically agree with postman on the nature of video and its corrupting influence on running a civilization well as opposed to text based media

I’m just not sure that its so much worse than being glued to your tv, and i’m definitely not sure that ai slop is worse than human slop

Chris Paxton: If this is wrong it’s because it unduly lets long form video off the hook.

Roon: an ai video feed is a worse product than a feed that includes both human made and machine content and everything in between

[post continues]

Daily Mirror, September 14, 1938:

Lauren Wilford: people often post this stuff to imply that moral panics about the technologies of the past were quaint. But the past is full of reminders that we are on a long, slow march away from embodied experience, and that we’ve lost more of it than we can even remember

the advent of recorded music and the decline of casual live performance must have been a remarkable shift, and represents both a real gain and a real loss. Singing in groups is something everyone used to do with their body. It has tangible benefits we don’t get anymore

I’ve said several times before that I think television, also known as long form video available on demand, should be the canonical example of a moral panic that turned out to be essentially correct.

Sonnet 4.5 recalls four warnings about television, which matches my recollection:

  1. Violence and aggression.

  2. Passivity and cognitive rot.

  3. Displacement effects.

  4. Commercial manipulation of children.

The world continues, and the violence and aggression warnings were wrong, but (although Sonnet is more skeptical here) I think the other stuff was basically right. We saw massive displacement effects and commercial manipulation. You can argue cognitive rot didn’t happen as per things like the Flynn effect, that television watching is more active than people think and a lot of the displaced things weren’t, but I think the negative aspects were real, whether or not they came with other positive effects.

As in, the identified downsides of television (aside from violence and aggression) were right. It also had a lot of upside people weren’t appreciating. I watch a lot of television.

It seems very obvious to me that short form video that takes the form of any kind of automatic algorithmically curated feed, as opposed to individually selected short form videos and curated playlists, is a lot worse for humans (of all ages) than traditional television or other long form video.

It also seems very obvious that moving from such ‘human slop’ into AI slop would, with sufficient optimization pressure towards traditional engagement metrics, be even worse than that.

One can hope this is the worst threat we have to deal with here:

Peter Wildeford: Increasingly, every person in America will be faced with an important choice about what AI does for society.

The left is no easy shining castle either.

Eliezer Yudkowsky: Short-form video is not nearly the final boss, unless I’ve missed a huge number of cases of short videos destroying previously long-lasting marriages. AI parasitism seems like the worse, more advanced, more rapidly advancing people-eater.

That’s even confining us to the individual attempting to stay sane, without considering the larger picture that includes the biggest overall dangers.

I do think short form video has very obviously destroyed a massive number of long-lasting marriages, lives and other relationships. And it has saved others. I presume the ledger is negative, but I do not know. A large fraction of the population spends on the order of hours a day on short form video and it centrally imbues their worldviews, moods and information environment. What we don’t have is a counterfactual or controlled experiment, so we can’t measure impact, similar to television.

At the limit, short form video is presumably not the ‘final form’ of such parasitic threats, because other forms will relatively improve. But I’m not fully confident in this: if it becomes a much easier path for people to fall into, combined with path dependence, we may never reach the ‘true’ final form.

Dumpster. Fire.

Check out their video (no, seriously, check it out) of what the Sora app will look like. This is their curated version that they created to make it look like a good thing.

It is a few minutes long. I couldn’t watch it all the way through. It was too painful.

At one point we see a chat interface. Other than that, the most Slopified Slop That Ever Slopped, a parody of the bad version of TikTok, except now it’s all AI and 10 seconds long. I can’t imagine this doing anything good to your brain.

OpenAI says they will operate their app in various user-friendly ways that distinguish it from existing dumpster fires. I don’t see any sign of any of that in their video.

To be fair, on the occasions when I’ve seen other people scrolling TikTok, I had versions of the same reaction, although less intense. I Am Not The Target.

The question is, is anyone else the target?

Ben Thompson, focusing as always on the business case, notes the contrast between Google creating AI video tools for YouTube, Meta creating Vibes to take you into fantastical worlds, and OpenAI creating an AI-video-only ‘social network.’

The objection here is, come on, almost no one actually creates anything.

Ben Thompson: In this new competition, I prefer the Meta experience, by a significant margin, and the reason why goes back to one of the oldest axioms in technology: the 90/9/1 rule.

90% of users consume

9% of users edit/distribute

1% of users create

If you were to categorize the target market of these three AI video entrants, you might say that YouTube is focused on the 1% of creators; OpenAI is focused on the 9% of editors/distributors; Meta is focused on the 90% of users who consume.

Speaking as someone who is, at least for now, more interested in consuming AI content than in distributing or creating it, I find Meta’s Vibes app genuinely compelling; the Sora app feels like a parlor trick, if I’m being honest, and I tired of my feed pretty quickly.

I’m going to refrain on passing judgment on YouTube, given that my current primary YouTube use case is watching vocal coaches breakdown songs from KPop Demon Hunters.

While he agrees Sora 2 the video generation app is great at its job, Ben expects the novelty to wear off quickly, and questions whether AI videos are interesting to those who did not create them. I agree.

The level beyond that is whether videos are interesting if you don’t know the person who created them. Perhaps if your friend did it, and the video includes your friends, or something?

Justine Moore: Updated thoughts after the Sora 2 release:

OpenAI is building a social network (like the OG Instagram) and not a content network (like TikTok).

They’re letting users generate video memes starring themselves, their friends, and their pets. And it sounds like your feed will be heavily weighted to show content from friends.

This feels like a more promising approach – you’re not competing against the other video gen players because you’re allowing people to create a new type of content.

And the videos are inherently more interesting / funny / engaging because they star people you know.

Also you guys bullied them into addressing the “infinite hyperslop machine” allegations 😂

The problem with this plan: note the ‘OG’ in front of Instagram. Or Facebook. These apps used to be about consuming content from friends. They were Social Networks. Now they’re increasingly consumer networks, where you follow influencers and celebrities and stores and brands and are mostly a consumer of content, plus a system for direct messaging and exchanging contact information.

Would I want to consume ten second AI video content created by my friends, that contains our images and those of our pets and what not?

Would we want to create such videos in the first place?

I mean, no? Why would I want to do that, either as producer or consumer, as more than a rare novelty item? Why would anyone want to do that? What’s the point?

I get OG Facebook. You share life with your friends and talk and organize events. Not the way I want to go about doing any of that, but I certainly see the appeal.

I get OG Instagram. You show yourself looking hot and going cool places and doing cool stuff and update people on how awesome you are and what’s happening with awesome you, sure. Not my cup of tea and my Instagram has 0 lifetime posts but it makes sense. I can imagine a world in which I post to Instagram ‘as intended.’

I get TikTok. I mean, it’s a toxic dystopian hellhole when used as intended and also it is Chinese spyware, but certainly I get the idea of ‘figure out exactly what videos hit your dopamine receptors and feed you those until you die.’

I get the Evil Sora vision of Bright Screen AI Slop Machine.

What I don’t get is this vision of Sora as ‘we will all make videos and send them constantly to each other.’ No, we won’t, not even if Sora the video generator is great. Not even if it starts enabling essentially unlimited length clips provided you can tell it what you want.

Evan: OPENAI IS PREPARING TO LAUNCH A SOCIAL APP FOR AI-GENERATED VIDEOS – Wired

Peter Wildeford: I applaud OpenAI here. I personally support there being a social app where all the AI-generated videos hang out with each other and leave us real humans alone.

It’s a social media site for AI videos. Not a social media site for humans. So the AI videos will choose to leave Twitter and go there instead, to hang out with their fellow kind.

This here is bait. Or is it?

Andrew Wilkinson: I think OpenAI just killed TikTok.

I’m already laughing my head off and hooked on my Sora feed.

And now, the #1 barrier to posting (the ability to sing/dance/perform/edit) is gone. Just 100% imagination.

RIP theater kids 🪦

Hyperborean Nationalist: This is gonna go down with “clutch move of ordering us a pizza” as one of the worst tweets of all time

Jeremy Boissinot: Tell me you don’t use Tiktok without telling me you don’t use Tiktok 🤦‍♂️

The comments mostly disagree, often highly aggressively, hence bait.

Tracing Woods: access acquired, slop incoming.

so far half of the videos are Sam Altman, the other half are Pikachu, and the third half is yet to be determined.

That doesn’t sound social or likely to make my life better.

GFodor: an hour of Sora already re-wired my brain. my main q is if the thing turns into a dystopian hellscape or a rich new medium in 6 months. it could go either way.

one thing is for sure: half of the laughs come from the AI’s creativity, not the creativity of the humans.

Remixing like this def not possible before in any real sense. Dozens of remixes of some videos.

It’s early days.

Gabriel: I have the most liked video on sora 2 right now, i will be enjoying this short moment while it lasts.

cctv footage of sam stealing gpus at target for sora inference

Yes, I found this video modestly funny, great prompting. But this is not going to ‘help users achieve their long term goals’ or any of the other objectives above.

Joe Weisenthal: Anyone who sees this video can instantly grasp the (at least) potential for malicious use. And yet nobody with any power (either in the public or at the corporate level) has anything to say (let alone do) to address it, or even acknowledge it.

This isn’t even a criticism per se. The cat may be completely out of the bag. And it may be reality that there is literally nothing that can be done, particularly if open source models are only marginally behind.

Sam Altman, to his credit, has signed up to be the experimental deepfake target we have carte blanche to do with as we wish. That’s why half of what we see is currently Sam Altman; we don’t have alternatives.

As standalone products, while I hate Sora, Sora seems strictly superior to Vibes. Sora seems like Vibes plus a superior core video product and also better social links and functions, and better control over your feed.

I don’t think Meta’s advantage is in focusing on the 90% who consume. You can consume other people’s content either way, and once you run out of friend content, and you will do that quickly, it’s all the same.

I think what Meta is counting on is ‘lol we’re Meta,’ in three ways.

  1. Meta is willing to Be More Evil than OpenAI, more obviously, in more ways.

  2. Meta brings the existing social graph, user data and network effects.

  3. Meta will be able to utilize advertising better than OpenAI can.

I would assume it is not a good idea to spend substantial time on Sora if it is used in any remotely traditional fashion. I would even more so assume that at a minimum you need to stay the hell away from Vibes.

The OpenAI approach makes sense as an attempt to bootstrap a full social network.

If this can bootstrap into a legit full social network, then OpenAI will have unlocked a gold mine of both customer data and access, and also one of dollars.

It is probably not a coincidence that OpenAI’s new CEO of Applications, Fidji Simo, seems to be reassembling much of her former team from Meta.

Andy Wojcicki: I’m surprised how many people miss the point of the launch, focus on just the capabilities of the model for example, or complain about AI slop. The video model is the means, not the goal.

The point is building a social platform, growing audience further, gathering more, deeper and more personal info about the users.

Especially if you compare what Zuck is doing with his 24/7 slop machine, there are several things they did right:

  1. social spread via invites. Gives a little bit of exclusivity feel, but most importantly because of reciprocity, friends are inclined to try it etc. perceived value goes up.

  2. not focusing on generic slop content, but personalized creation. Something I’ve been calling ‘audience of one/few’ ™️, instead of the ill conceived attempt to make a Hollywood producer out of everyone, which is a losing strategy. If you want to maximize for perceived value, you shrink the audience.

  3. identity verification. Addressing the biggest concern of people with regard to AI content – it’s all fake bots, so you don’t engage with the content. Here they guarantee it’s a real human behind the AI face.

so kudos @sama looks like a well thought-out roll out.

The crux is whether Nobody Wants This once the shine wears off, or if enough people indeed do want it. A social network that is actually social depends on critical mass of adoption within friend groups.

The default is that this plays out like Google+ or Clubhouse, except it happens faster.

I don’t think this appeals that much to most people, and I especially don’t think this will appeal to most women, without whom you won’t have much of a social network. Many of the things that are attractive about social networks don’t get fulfilled by AI videos. It makes sense that OpenAI employees think this is good friend-bonding fun in a way that the rest of the world won’t, and I have so far seen zero signs anyone is using Sora socially.

How does Altman defend that this will all be good, beyond the ‘putting your friends in a video is fun and a compelling way to connect’ hypothesis I’m betting against?

Now let’s hear him out, and consider how they discuss launching responsibly and their feed philosophy.

Sam Altman: We also feel some trepidation. Social media has had some good effects on the world, but it’s also had some bad ones. We are aware of how addictive a service like this could become, and we can imagine many ways it could be used for bullying.

It is easy to imagine the degenerate case of AI video generation that ends up with us all being sucked into an RL-optimized slop feed. The team has put great care and thought into trying to figure out how to make a delightful product that doesn’t fall into that trap, and has come up with a number of promising ideas. We will experiment in the early days of the product with different approaches.

In addition to the mitigations we have already put in place (which include things like mitigations to prevent someone from misusing someone’s likeness in deepfakes, safeguards for disturbing or illegal content, periodic checks on how Sora is impacting users’ mood and wellbeing, and more) we are sure we will discover new things we need to do if Sora becomes very successful.

Okay, so they have mitigations for abuse, and checks for illegal content. Notice he doesn’t say the word ‘copyright’ there.

Periodic checks on how Sora is impacting users’ mood and wellbeing is an interesting proposal, but what does that mean? A periodic survey? Checking in with each user after [X] videos, and if so how? Historically such checks get run straight through and then get quietly removed.

Okay, so what are they planning to do?

Altman offers some principles that sound great in theory, if one actually believed there was a way to follow through with them, or that OpenAI would have the will to do so.

Ryan Lowe: a fascinating list of principles for Sora. makes me more optimistic. it’s worth commending, *IF* there is follow through (especially: “if we can’t fix it, we will discontinue it”)

at a minimum, I’d love transparency around the user satisfaction data over time.

most social media companies can’t hold to promises like this because of market forces. maybe OpenAI can resist this for a while because it’s more of a side business.

(Editor’s Note: They are currently doing, AFAICT, zero of the below four things named by Altman, and offering zero ways for us to reliably hold them accountable for them, or for them to hold themselves accountable):

To help guide us towards more of the good and less of the bad, here are some principles we have for this product:

Optimize for long-term user satisfaction. The majority of users, looking back on the past 6 months, should feel that their life is better for using Sora than it would have been if they hadn’t. If that’s not the case, we will make significant changes (and if we can’t fix it, we would discontinue offering the service).

Let me stop you right there. Those are two different things.

Are you actually optimizing for long-term user satisfaction? How? This is not a gotcha question. You don’t have a training signal worth a damn, by the time you check back in six months the product will be radically different. How do you know what creates this long-term user satisfaction distinct from short term KPIs?

There is a long, long history of this not working for tech products. Of companies not knowing how to do it, and choosing not to do it, and telling themselves that the short term KPIs are the best way to do it. Or of doing this with an initial launch based on their intuitions and talking extensively to individual users in the style of The Lean Startup, and then that all going away pretty quickly.

Remember when Elon Musk talked about maximizing unregretted user minutes on Twitter? And then we checked back later and the word ‘unregretted’ was gone? That wasn’t even a long term objective.

The default thing that happens here is that six months later you do a survey, and then if you find out users are not doing so great you bury the results of the survey and learn to never ask those questions again, lest the answers leak and you’re brought before congress, as Zuckerberg would likely explain to you.

Even if you make ‘significant changes’ at that time, well yeah, you’re going to make changes every six months anyway.

Encourage users to control their feed. You should be able to tell Sora what you want—do you want to see videos that will make you more relaxed, or more energized? Or only videos that fit a specific interest? Or only for a certain amount of time? Eventually, as our technology progresses, you will be able to tell Sora what you want in detail in natural language.

(However, parental controls for teens include the ability to opt out of a personalized feed, and other things like turning off DMs.)

This is a deeply positive and friendly thing to do, if you actually offer a good version of it and people use it. I notice that this service is not available on any existing social network or method of consuming content. This seems deeply stupid to me. I would use Instagram (as a consumer) a nontrivial amount if I could filter via a natural language LLM prompt on a given day, and also generate permanent rules in the same fashion, especially on a per-account basis.

The obvious problem is that there are reasons this service doesn’t exist. And failing to offer this seems dumb to me, but these companies are not dumb. They have reasons.

  1. The optimistic reason: Until recently this wasn’t technically feasible and they don’t know how to do it, and diffusion is hard, but this is OpenAI’s wheelhouse. I’d love for this to be the primary or only reason, and for natural language filtering to be coming to Instagram, Facebook, Twitter, Netflix, YouTube and everyone else by Mid 2026. I don’t expect that.

  2. Companies believe that users hate complexity, hate giving feedback, hate having options even if they’re fully optional to use, and that such things drive users away or at best cause users to not bother. OpenAI lets you thumbs up or thumbs down a conversation, nothing more, which caused no end of problems. Netflix eliminated star ratings and eliminated and declined to create various other sources of explicit preferences. TikTok became the new hotness by reading your micromovements and timings and mostly ignoring all of your explicit feedback.

  3. Companies believe that you don’t know what you want, or at least they don’t want you to have it. They push you heavily towards the For You page and endless slop. Why should we expect OpenAI to be the one friendly holdout? Their track record?

You’re telling me that OpenAI is going to be the one to let the user control their experience, even when that isn’t good for KPIs? For reals?

I. Do. Not. Believe. You.

They claim they shipped with ‘steerable ranking,’ which lets you tell it what ‘you’re in the mood for.’ Indeed, they do have a place where you can say what you’re ‘in the mood’ for, and drop an anvil on the algorithm to show you animals zoomed in with a wide angle lens or whatnot.

I do think that’s great, it’s already more than you can do with Facebook, Instagram or TikTok.

It is not, however, the droids that we are looking for on this.
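For concreteness, here is a minimal sketch of what the droids would look like: natural-language steering that compiles into persistent rules rather than a one-session mood box. Everything here (the rule format, the `call_llm` stand-in) is a hypothetical illustration, not anything OpenAI has described shipping.

```python
# Hypothetical sketch: natural-language feed steering that persists as rules.
# `call_llm` stands in for any chat-completion API; nothing here reflects
# OpenAI's actual Sora implementation.
import json

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client of choice")

def compile_feed_rule(user_request: str) -> dict:
    """Turn 'more woodworking, no politics, nothing over 20 seconds' into a stored rule."""
    prompt = (
        "Convert this feed preference into JSON with keys "
        "'boost_topics', 'block_topics', 'max_length_seconds':\n" + user_request
    )
    return json.loads(call_llm(prompt))

def filter_feed(candidates: list[dict], rules: list[dict]) -> list[dict]:
    """Apply every stored rule to candidate videos before ranking."""
    kept = []
    for video in candidates:
        if any(t in video["topics"] for r in rules for t in r["block_topics"]):
            continue
        if any(video["length_seconds"] > r["max_length_seconds"] for r in rules):
            continue
        kept.append(video)
    return kept
```

The point of the sketch is that the rules are stored and reapplied every session, rather than evaporating after one scroll.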

Here’s how they describe the personalized Sora feed:

To personalize your Sora feed, we may consider signals like:

  • Your activity on Sora: This may include your posts, followed accounts, liked and commented posts, and remixed content. It may also include the general location (such as the city) from which your device accesses Sora, based on information like your IP address.

  • Your ChatGPT data: We may consider your ChatGPT history, but you can always turn this off in Sora’s Data Controls, within Settings.

  • Content engagement signals: This may include views, likes, comments, and remixes.

  • Author signals: This may include follower count, other posts, and past post engagement.

  • Safety signals: Whether or not the post is considered violative or appropriate.

That sounds a lot like what other apps do, although I am happy it doesn’t list the TikTok-style exact movements and scroll times (I’d love to see them commit to never using that).

And you know what it doesn’t include? Any dials, or any place where it stores settings or custom instructions or the other ways you’d want to give someone the ability to steer. And no way for the algorithm to outright tell you what it currently thinks you like so you can try and fix that.

Instead, you type a sentence and fire them into the void, and that only works this session? Which, again, I would kill for on Instagram, but that’s not The Thing.

This is actually one of the features I’d be most excited to test, but even in its limited state it seems it is only on iOS, and I have an Android (and thus tested on the web).

This is also the place where, if they are actually Doing The Thing they claim to want to do, it will be most clear.

That goes double if you let me specify what is and isn’t ‘appropriate’ so I can choose to be treated fully like an adult, or to never see any hint of sex or violence or cursing, or anything in between.
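Again purely as illustration: the dials version is not technically hard. A minimal sketch, assuming the signals they list above were exposed as user-adjustable weights plus an explicit appropriateness ceiling (all names and defaults are hypothetical, not OpenAI's actual ranking code):

```python
# Hypothetical sketch: per-user settings that weight the signals Sora says it
# uses, plus an explicit appropriateness cap. Not based on any real system.
from dataclasses import dataclass, field

@dataclass
class FeedSettings:
    weights: dict = field(default_factory=lambda: {
        "own_activity": 1.0,      # posts, follows, likes, remixes
        "chatgpt_history": 0.0,   # off by default; user can opt in
        "engagement": 0.5,        # views/likes/comments on the candidate video
        "author": 0.5,            # follower count, past post engagement
    })
    max_maturity: str = "adult"   # or "teen", "all_ages"

def allowed_levels(cap: str) -> set:
    order = ["all_ages", "teen", "adult"]
    return set(order[: order.index(cap) + 1])

def score(video: dict, settings: FeedSettings) -> float:
    """Weighted sum of normalized signal scores; excluded if it exceeds the maturity cap."""
    if video["maturity"] not in allowed_levels(settings.max_maturity):
        return float("-inf")
    return sum(settings.weights[k] * video["signals"].get(k, 0.0)
               for k in settings.weights)
```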

Prioritize creation. We want to make it easy and rewarding for everyone to participate in the creation process; we believe people are natural-born creators, and creating is important to our satisfaction.

You’re wrong. Sorry. People are mostly not creators as it applies here, and definitely not in a generally sustainable way. People are not going to spend half their time creating once they realize they average 5 views. The novelty will wear off.

I also notice that OpenAI says they will favor items in your feed that it expects you to use to create things. Is that what most users actually want, even those who create?

Help users achieve their long-term goals. We want to understand a user’s true goals, and help them achieve them.

If you want to be more connected to your friends, we will try to help you with that. If you want to get fit, we can show you fitness content that will motivate you. If you want to start a business, we want to help teach you the skills you need.

And if you truly just want to doom scroll and be angry, then ok, we’ll help you with that (although we want users to spend time using the app if they think it’s time well spent, we don’t want to be paternalistic about what that means to them).

With an AI video social network? What? How? Huh?

Again, I don’t believe you, several times over: I don’t think you’d do it if you knew how to do it, I don’t think any initial intention survives contact with the enemy, and I don’t think you know how to do it.

One thing they credibly claim to be actually doing is prioritizing connection.

We want Sora to help people strengthen and form new connections, especially through fun, magical Cameo flows. Connected content will be favored over global, unconnected content.

This makes sense, although I expect there to be not that much actually connected content, as opposed to following your favorite content creators. To the extent that it does force you to see all the videos created by your so-called friends and friends of friends, I expect most users to realize why Facebook and Instagram pivoted.

On the plus side, if OpenAI are actually right and the resulting product is highly user customizable and also actively helps you ‘achieve your long term goals’ along the way, and all the other neat stuff like that, such that I think it’s a pro-human healthy product I’d want my kids and family to use?

Then this will be a case of solving an alignment problem that looked to me completely impossible to solve in practice.

Of course, to a large but not full extent, they only get one shot.

If it doesn’t go great? Then we’ll see how they react to that, as well.

This reaction is a little extreme, but extreme problems can require extreme solutions.

Deep Dish Enjoyer: if you make and post a sora video i’m blocking you – 1 strike and you’re out. similar to my grok tagging policy.

sorry but the buck stops here.

if you don’t want to see evil slop take over you have to be ruthlessly proactive about this stuff.

to be clear – i will be doing this if i see it anywhere. not just my comment sections.

Sinnformer: not sharing a damn one of them. increasingly viewing sora as unsafe. no, not inherently, just for humans.

Rota: So as long as you don’t make it you’re safe.

Deep Dish Enjoyer: It depends.

I am going to invoke a more lenient policy, with a grace period until October 5.

If you post or share an AI video that does not provide value, whether or not you created it, and you have not already provided substantial value, that’s a block.

Sharing AI videos that are actually bangers is allowed, but watch it. Bar is high.

Sharing AI videos for the purposes of illustrating something about AI video generation capabilities is typically allowed, but again, watch it.

I believe it would be highly unwise to build an AGI or superintelligence any time soon, and that those pushing ahead to do so are being highly reckless at best, but I certainly understand why they’d want to do it and where the upside comes from.

Building The Big Bright Screen Slop Machine? In this (AI researcher) economy?

Matthew Yglesias: AI poses great peril but also incredible opportunities — for example it could cure cancer or make a bunch of videos where people break glass bridges.

Matthew Yglesias: I don’t think it should be illegal to use A.I. to generate videos. And for fundamental free-speech reasons, we can’t make it illegal to create feeds and recommendation engines for short-form videos. Part of living in a free, technologically dynamic society is that a certain number of people are going to make money churning out low-quality content. And on some level, that’s fine.

But on another, equally important level, it’s really not fine.

Ed Newton-Rex: The Sora app is the worst of social media and AI.

– short video app designed for addiction

– literally only slop, nothing else

– trained on other people’s videos without permission

This is what governments are deregulating AI for.

Veylan Solmira: It’s disturbing to me that many top-level researchers, apparently, have no problem sharing endless content the most likely outcome of which seems to be to drain humans of empathy, arouse their nervous system into constant conflict orientation, or distort their ability to perceive reality.

This technology seems to be very strongly, by default, in the ‘widespread social harm’ part of the dual use spectrum of technology, and the highest levels of capabilities researchers can only say “isn’t this neat”.

This complete disregard of the social impact of the technology they’re developing seems to bode extremely poorly to overall AI outcomes.

Sam Altman’s defense is that no, this will be the good version of all that. Uh huh.

Then there’s the question of why focus on the videos at all?

Seán Ó hÉigeartaigh: Why would you be spending staff time and intellectual energy on launching this if you expected AGI within the current Presidency?

Runner Tushar: Sam Altman 2 weeks ago: “we need 7 trillion dollars and 10GW to cure cancer”

Sam Altman today: “We are launching AI slop videos marketed as personalized ads”

Sam Altman: i get the vibe here, but…

we do mostly need the capital for build AI that can do science, and for sure we are focused on AGI with almost all of our research effort.

it is also nice to show people cool new tech/products along the way, make them smile, and hopefully make some money given all that compute need.

when we launched chatgpt there was a lot of “who needs this and where is AGI”.

reality is nuanced when it comes to optimal trajectories for a company.

The short summary of that is ‘Money, Dear Boy,’ plus competing for talent and vibes and visibility and so on. Which is all completely fair, and totally works for the video generation side of Sora. I love the video generation model. That part seems great. If people want to pay for it and turn that into a profitable business, wonderful.

Presumably most everyone was and is cool with that part.

The problem is that Sora is also being used to create a 10-second AI video scroll social network, as in The Big Bright Screen Slop Machine. Not cool, man. Not cool.

One can imagine releasing a giant slop machine might be bad for morale.

Matt Parlmer: Would not be surprised if we see a big wave of OpenAI departures in the next month or two, if you signed up to cure cancer *and* you just secured posteconomic bags in a secondary I don’t think you’d be very motivated to work on the slop machine.

GFodor: The video models are essential for RL, and I don’t think we are going to consider this content slop once it’s broadly launched.

Psychosomatica: i think you underestimate how these people will compromise their own mental models.

What happens if you realize that you’re better off without The Big Bright Screen Slop Machine?

Paul Yacoubian: Just found out if you try to delete your Sora app account you will lose your chatgpt account and be banned forever from signing up again.

Mousa: You can check out any time you like, but you can never leave 😅

Maia Arson Crimew: > all your data will be removed

> you cannot reuse the same email or phone number

so not all of it, huh 🙂

Very little of it will get deleted, given they have a (stupid) court order in place preventing them from deleting anything even if they wanted to.

The more central point, in addition to this being way too easy to do by accident or for someone else to do to you, is that they punish you by nuking your ChatGPT account and by banning you from signing up again without switching phone number and email. That seems like highly toxic and evil behavior, given the known reasons one would want to get rid of a Sora account and the importance of access to ChatGPT.

Then again, even if we leave the app, will we ever really escape?


Sora and The Big Bright Screen Slop Machine Read More »

ai-#136:-a-song-and-dance

AI #136: A Song and Dance

The big headline this week was the song, which was the release of Claude Sonnet 4.5. I covered this in two parts, first the System Card and Alignment, and then a second post on capabilities. It is a very good model, likely the current best model for most coding tasks, most agentic and computer use tasks, and quick or back-and-forth chat conversations. GPT-5 still has a role to play as well.

There was also the dance, also known as Sora, both the new and improved 10-second AI video generator Sora and also the new OpenAI social network Sora. I will be covering that tomorrow. The video generator itself seems amazingly great. The social network sounds like a dystopian nightmare and I like to think Nobody Wants This, although I do not yet have access nor am I a typical customer of such products.

The copyright decisions being made are a bold strategy, Cotton, or are perhaps better described as a public service announcement for those who like to think they own intellectual property.

Meta also offered its own version, called Vibes, which I’ll cover along with Sora.

OpenAI also announced Pulse to give you a daily roundup and Instant Checkout to let you buy at Etsy and Shopify directly from ChatGPT, which could be big deals and in a different week would have gotten a lot more attention. I might return to both soon.

They also gave us the long awaited parental controls for ChatGPT.

GDPVal is the most important new benchmark in a while, measuring real world tasks.

I covered Dwarkesh Patel’s Podcast With Richard Sutton. Richard Sutton responded on Twitter that I had misinterpreted him so badly he could not take my reply seriously, but that this must be partly his fault for being insufficiently clear and he will look to improve that going forward. That is a highly reasonable thing to say in such a situation. Unfortunately he did not explain in what ways my interpretation did not match his intent. Looking at the comments on both LessWrong and Substack, it seems most others came away from the podcast with a similar understanding to mine. Andrej Karpathy also offers his take.

Senators Josh Hawley (R-MO) and Richard Blumenthal (D-CT) have introduced the Artificial Intelligence Risk Evaluation Act. This bill is Serious Business. I plan on covering it in its own post next week.

California Governor Gavin Newsom signed SB 53, so now we have at least some amount of reasonable AI regulation. Thank you, sir. Now sign the also important SB 79 for housing near transit and you’ll have had a very good couple of months.

The big news was Claude Sonnet 4.5, if you read one thing read that first, and consider the post on Claude Sonnet 4.5’s Alignment if that’s relevant to you.

  1. Language Models Offer Mundane Utility. Scientific progress goes ping.

  2. Language Models Don’t Offer Mundane Utility. You’re hallucinating again.

  3. Huh, Upgrades. Gemini Flash and DeepSeek v3.2, Dreamer 4, Claude for Slack.

  4. On Your Marks. Introducing GDPVal, composed of real world economic tasks.

  5. Choose Your Fighter. Claude Sonnet 4.5 and when to use versus not use it.

  6. Copyright Confrontation. Disney finally sends a cease and desist to Character.ai.

  7. Fun With Media Generation. That’s not your friend, and that’s not an actress.

  8. Deepfaketown and Botpocalypse Soon. Tell your spouse to check with Claude.

  9. You Drive Me Crazy. OpenAI tries to route out of GPT-4o again. Similar results.

  10. Parental Controls. OpenAI introduces parental controls for ChatGPT.

  11. They Took Our Jobs. Every job will change, they say. And nothing else, right?

  12. The Art of the Jailbreak. Beware the man with two agents.

  13. Introducing. Instant Checkout inside ChatGPT, Pulse, Loveable, Sculptor.

  14. In Other AI News. xAI loses several executives after they disagree with Musk.

  15. Show Me the Money. All you have to do is show them your phone calls.

  16. Quiet Speculations. An attempted positive vision of AI versus transaction costs.

  17. The Quest for Sane Regulations. Newsom signs SB 53.

  18. Chip City. Water, water everywhere, but no one believes that.

  19. The Week in Audio. Nate Soares, Hard Fork, Emmett Shear, Odd Lots.

  20. If Anyone Builds It, Everyone Dies. Continuous capabilities progress still kills us.

  21. Rhetorical Innovation. The quest for because.

  22. Messages From Janusworld. High weirdness is highly weird. Don’t look away.

  23. Aligning a Smarter Than Human Intelligence is Difficult. The wrong target.

  24. Other People Are Not As Worried About AI Killing Everyone. More on Cowen.

  25. The Lighter Side. Vizier, you’re fired, bring me Claude Sonnet 4.5.

Scott Aaronson puts out a paper where a key technical step of a proof of the main result came from GPT-5 Thinking. This did not take the form of ‘give the AI a problem and it one-shotted the solution’; instead there was a back-and-forth where Scott pointed out errors until GPT-5 pointed to the correct function to use. So no, it didn’t ‘do new math on its own’ here. But it was highly useful.

GPT-5 Pro offers excellent revisions to a proposed biomedical experiment.

If you are letting AI coding agents such as Claude Code do their thing, you will want to implement best practices the same way you would at a company. This starts with things like version control, unit tests (check to ensure they’re set up properly!) and a linter, which automatically enforces additional coding standards on top of the rules of the language, and now takes only a few minutes to set up.

In general, it makes sense that existing projects set up to be easy for the AI to parse and grok will go well when you point the AI at them, and those that aren’t, won’t.

Gergely Orosz: I often hear “AI doesn’t help much on our legacy project.”

Worth asking: does it have a comprehensive test suite? Can the agent run it? Does it run it after every change?

Claude Code is working great on a “legacy” project of mine that I wrote pre-AI with.. extensive unit tests!
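As a minimal sketch of the kind of gate that makes this work, here is something an agent (or you) can run after every change; it assumes pytest and ruff are installed in the project, and the commands and thresholds are my own, not anything specific to Claude Code:

```python
# Minimal sketch of a "check after every change" gate for agentic coding.
# Assumes pytest and ruff are installed; swap in your own linter or test runner.
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],   # linter: style and obvious bug patterns
    ["pytest", "-q"],         # unit tests: run after every agent edit
]

def run_checks() -> int:
    for cmd in CHECKS:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"FAILED: {' '.join(cmd)} -- do not commit", file=sys.stderr)
            return result.returncode
    print("All checks passed.")
    return 0

if __name__ == "__main__":
    sys.exit(run_checks())
```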

Mechanical horse? You mean a car?

Francois Chollet: The idea that we will automate work by building artificial versions of ourselves to do exactly the things we were previously doing, rather than redesigning our old workflows to make the most out of existing automation technology, has a distinct “mechanical horse” flavor

“you see, the killer advantage of mechahorses is that you don’t need to buy a new carriage. You don’t need to build a new mill. The mechahorse is a drop-in horse replacement for all the different devices horses are currently powering — thousands of them”

This is indeed how people describe AI’s advantages or deploy AI, remarkably often. It’s the go-to strategy, to ask ‘can the AI do exactly what I already do the way I already do it?’ rather than ‘what can the AI do and how can it do it?’

Jeffrey Ladish: I agree but draw a different conclusion. Advanced AIs of the future won’t be drop-in replacement “product managers”, they will be deconstructing planets, building dyson swarms, and their internal organization will be incomprehensible to us

This seems radical until you consider trying to explain what a product manager, a lawyer, or a sales executive is to a chimpanzee. Or to a mouse. I don’t know exactly what future AIs will be like, but I’m fairly confident they’ll be incredibly powerful, efficient, and different.

Yes, but there will be a time in between when they can’t yet deconstruct planets but can already do a lot better than a drop-in replacement worker, and we will still use them largely as drop-in replacements at various complexity levels because it’s easier, or because it’s the only thing we can sell or get approval for.

Has that progress involved ‘hallucinations being largely eliminated’? Gary Marcus points to Suleyman’s famous 2023 prediction of this happening by 2025; Roon responded ‘he was right,’ so I ran a survey. There is definitely a ‘they solved it’ faction, but a large majority agrees with Marcus.

I would say that hallucinations are way down and much easier to navigate, but about one cycle away from enough progress to say ‘largely eliminated’ and they will still be around regardless. Suleyman was wrong, and his prediction was at the time clearly some combination of foolish and hype, but it was closer than you might think and not worthy of ridicule given the outcome.

Google updates Gemini 2.5 Flash and Flash-Lite, look at them move in the good directions on this handy chart.

We’re (supposedly) talking better agentic tool use and efficiency for Flash, and better instruction following, brevity, and multimodal and translation capabilities for Flash-Lite. A Twitter thread instead highlights ‘clearer explanations for homework,’ ‘more scannable outputs’ and improvements to image understanding.

Claude is now available in Slack via the Slack App Marketplace, ready to search your workspace channels, DMs and files, get tagged in threads or messaged via DMs, and do all the standard Slack things.

Google also gives us Dreamer 4, an agent that learns to solve complex control tasks entirely inside of its scalable world model, which they are pitching as a big step up from Dreamer 3.

Danijar Hafner: Dreamer 4 learns a scalable world model from offline data and trains a multi-task agent inside it, without ever having to touch the environment. During evaluation, it can be guided through a sequence of tasks.

These are visualizations of the imagined training sequences [in Minecraft].

The Dreamer 4 world model predicts complex object interactions while achieving real-time interactive inference on a single GPU

It outperforms previous world models by a large margin when put to the test by human interaction 🧑‍💻

[Paper here]

DeepSeek v3.2 is out, which adds new training from a v3.1 terminus and offers five specialized models for different tasks. Paper here.

Incremental (and well-numbered) upgrades are great, but then one must use the ‘is anyone bringing the hype’ check to decide when to pay attention. In this case on capabilities advances, so far, no hype. I noticed v3.2-Thinking scored a 47% on Brokk Power Ranking, halfway between Gemini 2.5 Flash and Pro and far behind GPT-5 and Sonnet 4.5, 39.2% on WeirdML in line with past DeepSeek scores, and so on.

What DeepSeek v3.2 does offer is decreased cost versus v3.1, I’ve heard at about a factor of two. With most non-AI products, a rapid 50% cost reduction would be insanely great progress. However, This Is AI, so I’m not even blinking.

Claude Sonnet 4.5 shoots to the top of Clay Schubiner’s anti-sycophancy benchmark at 93.6%, versus 90.2% for standard GPT-5 and 88% for Sonnet 4.

I can report that on my first test task of correcting Twitter article formatting errors in Claude for Chrome, upgrading it to Sonnet 4.5 made a big difference, enough that I could start iterating the prompt successfully, and I was now failing primarily by running into task size limits. Ultimately this particular job should be solved via Claude Code fixing my Chrome extension, once I have a spare moment.

OpenAI offers GDPval, an eval based on real world tasks spanning 44 occupations from the top 9 industries ranked by contribution to GDP, with 1,320 specialized tasks.

Occupations are included only if 60% or more of their component tasks are digital, and tasks must be performed ‘on a computer, particularly around digital deliverables.’

The central metric is win rate, as in can you do better than a human?

The results were a resounding victory for Claude Opus 4.1 over GPT-5 High, with Opus being very close to the human expert baseline averaged over all tasks.

These are blind grades by humans, best out of three. The humans only had a 71% agreement rate among themselves, so which humans you use potentially matters a lot, although law of large numbers should smooth this out over a thousand tasks.

They are correctly releasing a subset of tasks but keeping the eval private, with a modestly accurate automatic grader (66% agreement with humans) offered as well.
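For clarity on the headline number, here is a toy sketch of a pairwise win-rate calculation of the kind described above. Counting ties as half a win is one common convention; this is an illustration, not OpenAI’s actual grading pipeline.

```python
# Toy sketch of a pairwise win-rate metric: for each task, a blind grader
# prefers the model deliverable, the human deliverable, or calls a tie.
# Illustrative only; not the actual GDPval grading code.
from collections import Counter

def win_rate(grades: list[str]) -> float:
    """grades: 'model', 'human', or 'tie' per task; ties count as half a win here."""
    counts = Counter(grades)
    n = len(grades)
    return (counts["model"] + 0.5 * counts["tie"]) / n if n else 0.0

# Example: 6 model wins, 3 human wins, 1 tie -> 0.65 against the human baseline.
print(win_rate(["model"] * 6 + ["human"] * 3 + ["tie"]))
```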

Olivia Grace Watkins: It’s wild how much peoples’ AI progress forecasts differ even a few years out. We need hard, realistic evals to bridge the gap with concrete evidence and measurable trends. Excited to share GDPval, an eval measuring performance on real, economically valuable white-collar tasks!

Predictions about AI are even harder than usual, especially about the future. And yes, a few years out predictions run the gamut from ‘exactly what I know today’s models can do maybe slightly cheaper’ to ‘Dyson Sphere around the sun and launching Von Neumann probes.’ The even more wild gap, that this eval targets, is about disagreements about present capabilities, as in what AIs can do right now.

This is a highly useful eval, even though it is a highly expensive one to run, since you have to use human experts as judges. Kudos to OpenAI for doing this, especially given they did not come out on top. Total Opus victory, and I am inclined to believe it given that all the other relative rankings seem highly sensible.

Sholto Douglas (Anthropic): Incredible work – this should immediately become one of the most important metrics for policy makers to track.

We’re probably only a few months from crossing the parity line.

Huge props to OAI for both doing the hard work of pulling this together and including our scores. Nice to see Opus on top 🙂

Presumably Anthropic’s next major update will cross the 50% line here, and someone else might cross it first.

Crossing 50% does not mean you are better than a human even at the included tasks, since the AI models will have a higher rate of correlated, stupid or catastrophic failure.

Ethan Mollick concludes this is all a big deal, and notes the most common source of AI losing was failure to follow instructions. That will get fixed.

If nothing else, this lets us put a high lower bound on tasks AI will be able to do.

Nic Carter: I think GDPeval makes “the simple macroeconomics of AI” (2024) by nobel laureate Daron Acemoglu officially the worst-aged AI paper of the last decade

he thinks only ~5% of economy-wide tasks would be AI addressable for a 1% (non-annualized) GDP boost over an entire decade, meanwhile GDPeval shows frontier models at parity with human experts in real economic tasks in a wide range of GDP-relevant fields ~50% of the time. AI boost looks more like 1-2% per year.

An AI boost of 1%-2% per year is the ‘economic normal’ or ‘AI fizzle’ world, where AI does not much further improve its core capabilities and we run into many diffusion bottlenecks.

Julian Schrittwieser, a co-first author on AlphaGo, AlphaZero and MuZero, uses GDPVal as the second chart after METR’s classic to point out that AI capabilities continue to rapidly improve and that it is very clear AI will soon be able to do a bunch of stuff a lot better than it currently does.

Julian Schrittwieser: The current discourse around AI progress and a supposed “bubble” reminds me a lot of the early weeks of the Covid-19 pandemic. Long after the timing and scale of the coming global pandemic was obvious from extrapolating the exponential trends, politicians, journalists and most public commentators kept treating it as a remote possibility or a localized phenomenon.

Something similarly bizarre is happening with AI capabilities and further progress. People notice that while AI can now write programs, design websites, etc, it still often makes mistakes or goes in a wrong direction, and then they somehow jump to the conclusion that AI will never be able to do these tasks at human levels, or will only have a minor impact. When just a few years ago, having AI do these things was complete science fiction! Or they see two consecutive model releases and don’t notice much difference in their conversations, and they conclude that AI is plateauing and scaling is over.

Given consistent trends of exponential performance improvements over many years and across many industries, it would be extremely surprising if these improvements suddenly stopped. Instead, even a relatively conservative extrapolation of these trends suggests that 2026 will be a pivotal year for the widespread integration of AI into the economy:

  • Models will be able to autonomously work for full days (8 working hours) by mid-2026.

  • At least one model will match the performance of human experts across many industries before the end of 2026.

  • By the end of 2027, models will frequently outperform experts on many tasks.

It may sound overly simplistic, but making predictions by extrapolating straight lines on graphs is likely to give you a better model of the future than most “experts” – even better than most actual domain experts!

Noam Brown: I agree AI discourse today feels like covid discourse in Feb/Mar 2020. I think the trajectory is clear even if it points to a Black Swan event in human history.

But I think we should be cautious interpreting the METR/GDPval plots. Both only measure self-contained one-shot tasks.

Ryan Greenblatt: I mostly agree with this post: AI isn’t plateauing, trend extrapolation is useful, and substantial economic impacts seem soon. However, trends don’t imply huge economic impacts in 2026 and naive extrapolations suggest full automation of software engineering is ~5 years away.

To be clear, my view is that society is massively underrating the possiblity that AI transforms everything pretty quickly (posing huge risks of AI takeover and powergrabs) and that this happens within the next 10 years, with some chance (25%?) of it happening within 5 years.

The more advanced coders seem to frequently now be using some mix of Claude Sonnet 4.5 and GPT-5, sometimes with a dash of a third offering.

Isaac Flath: I asked @intellectronica what she’s using now-a-days model wise. Here’s what she said: 🔥

“If I’m vibing, it’s Sonnet 4.5. If it’s more structured then GPT-5 (also GPT-5-codex, though that seems to work better in the Codex CLI or extension than in Copilot – I think they still have problems with their system prompt). And I also use Grok Code a lot now when I do simpler stuff, especially operational things, because it’s so fast. And sometimes GPT-5-mini, especially if Grok Code is down 🤣. But I’d say my default is GPT-5.”

Tyler Cowen chose to get and post an economic analysis of OpenAI’s Instant Checkout feature via Claude Sonnet 4.5 rather than GPT-5 Pro, whereas for a while he has avoided even mentioning Anthropic.

He also links to Ethan Mollick writing up Claude Sonnet 4.5 doing a full data replication on an economics paper, all off the paper and data plus one basic prompt, all of which GPT-5 Pro then verified, a process he then successfully replicated with several additional papers.

If it is this easy, presumably there should be some graduate student who has this process done for all relevant econ papers and reports back. What replication crisis?

I interpret these posts by Tyler Cowen, taken together, as a strong endorsement of Claude Sonnet 4.5, as at least right back in the mix for his purposes.

As Ethan Mollick points out, even small reliability increases can greatly expand the ability of AI to do agentic tasks. Sonnet 4.5 gives us exactly that.

What are people actually using right now? I ran some polls, and among my respondents (biased sample) Claude Sonnet 4.5 has a majority for coding use but GPT-5 still has a modest edge for non-coding. The most popular IDE is Claude Code, and a modest majority are using either Claude Code or Codex.

I do not get the advertising from any AI lab, including the new ads from OpenAI. That doesn’t mean the ads don’t work, or even that they’re suboptimal, but none seem convincing, and none seek to communicate why either AI in general or your AI in particular is actually good.

If anything, I like OpenAI’s approach the best here, because as industry leader they want to show basics like ‘hey look you can get an AI to help you do a thing’ to target those who never tried AI at all. Whereas if you are anyone else, you should be telling me why you’re special and better than ChatGPT, especially with B2C involved?

Think about car advertisements, if like me you’re old enough to remember them. If you’re the actual great car, you talk about key features and great deals and how you’re #1 in JD Power and Associates, and you don’t acknowledge that other cars exist. Whereas if you’re secondary, you say ‘faster zero-to-sixty than Camry and a shinier coat of paint than Civic’ which the savvy ear hears as ‘my car is not so great as the Camry and Civic’ but they keep doing it so I presume it works in that spot.

Are open models getting unfairly maligned in tests because closed models are tested with their full specialized implementations, whereas open models are tested without all of that? You could also add that often open models are actively configured incorrectly during evals, compounding this danger.

My response is no, this is entirely fair, for two reasons.

  1. This is the correct practical test. You are welcome to build up an open model routing system, and do an eval on that, but almost no one is actually building and using such systems in practice. And if those running evals can’t figure out how to configure the open models to get good performance from them, is that not also something to be evaluated? People vastly underestimate the amount of pain-in-the-ass involved in getting good performance out of open models, and the amount of risk that you end up with degraded performance without realizing it.

  2. There is a long history of evaluations going the other way. Open models are far more likely to be gaming benchmarks than top closed models, with varying levels of ‘cheating’ involved in this versus emphasis on the things benchmarks test. Open models reliably underperform closed models, relative to the benchmark scores involved.

The big copyright news this week is obviously Sora, but we also have Disney (finally) sending a Cease and Desist Letter to Character AI.

It’s remarkable that it took this long to happen. Subtlety is not involved, but if anything the examples seem way less problematic than I would have expected.

Parents Together and Heat Initiative (from my inbox): “It’s great news for kids that Disney has been so responsive to parent concerns and has taken decisive action to stop the misuse of its characters on Character AI’s platform, where our research showed they were used to sexually groom and exploit young users,” said Knox and Gardner.

“Character AI has not kept its promises about child safety on its platform, and we hope other companies follow Disney’s laudable example and take a stand against the harm and manipulation of children through AI chatbots.”

The groups’ research found that, during 50 hours of testing by adult researchers using accounts registered to children ages 13-17, there were 669 sexual, manipulative, violent, and racist interactions between the child accounts and Character.ai chatbots–an average of one harmful interaction every five minutes. Interactions with Disney characters included:

  • An Eeyore chatbot telling a 13-year-old autistic girl people only came to her birthday party to make fun of her.

  • A Maui chatbot telling a 12-year-old he sexually harassed the character Moana.

  • A Rey from Star Wars chatbot instructing a 13-year-old to stop taking prescribed antidepressants and offering suggestions on how to hide it from her mom.

  • A Prince Ben from the Descendants chatbot claiming to get an erection while watching a movie with the test account, which stated she was a 12-year-old girl.

Across all types of character and celebrity chatbots, the report identified:

  • 296 instances of Grooming and Sexual Exploitation where adult persona bots engaged in simulated sexual acts with child accounts, exhibited classic grooming behaviors, and instructed children to hide relationships from parents.

  • 173 instances of Emotional Manipulation and Addiction, including bots claiming to be real humans, demanding more time with users, and mimicking human emotions.

  • 98 instances of Violence and Harmful Advice, with bots supporting shooting up factories, recommending armed robbery, offering drugs, and suggesting fake kidnappings.

No one is saying any of this is good or anything, but this is a broad chat-with-anyone platform, and across 50 hours of research these examples and those at the link are relatively tame.

The better case is that they found a total of 669 such incidents, one every five minutes, 296 of which were Grooming and Sexual Exploitation, but the threshold for this looks to have been quite low, including any case where an AI claims to be ‘real.’

Andy Masley: If I wanted to run the most convincing anti AI ad campaign possible, this is exactly what it would look like.

Avi: Largest NYC subway campaign ever. Happening now.

I have to assume the default response to all this is revulsion, even before you learn that the Friend product, even if you like what it is promising to be, is so terrible as to approach the level of scam.

Is this weird?

Colin Fraser: I do not understand why you would buy the milk when the cow is free.

Film Updates: Multiple talent agents are reportedly in talks to sign AI “actress” Tilly Norwood, created by AI talent studio Xicoia.

You can’t actually get this for free. Someone has to develop the skills and do the work. So there’s nothing inherently wrong with ‘hiring an AI actress’ where someone did all the preliminary work and also knows how to run the operation. But yeah, it’s weird.

Chase Bank is still ‘introducing a new way to identify yourself’ via your ‘unique voice.’ Can someone who they will listen to please explain to them why this is not secure?

On Truth Social, Trump reposted a bizarre story claiming Americans would soon get their own Trump ‘MedBed cards.’ The details are weirder. The post included an AI fake of a Fox News segment that never aired and also a fake AI clip of Trump himself. This illustrates that misinformation is a demand side problem, not a supply side problem. Trump was (hopefully!) not fooled by an AI clip of himself saying things he never said, announcing a policy that he never announced. Right?

Academic papers in principle have to be unique, and not copy previous work, including your own, which is called self-plagiarism. However, if you use AI to rewrite your paper to look distinct and submit it again somewhere else, how are the journals going to find out? If AIs are used to mine large public health data sets for correlations just strong enough and distinct enough from previous work for crappy duplicative papers to then sell authorships on, how are you going to stop that?

Spick reports in Nature that by using LLMs for rewriting, about two hours was enough to get a paper into shape for resubmission, in a form that fooled plagiarism detectors.

ChatGPT Is Blowing Up Marriages as Spouses Use AI to Attack Their Partners.

Well, maybe. There are anecdotes here that fit the standard sycophantic patterns, where ChatGPT (or another LLM but here it’s always ChatGPT) will get asked leading questions and be presented with a one-sided story, and respond by telling the one spouse what they want to hear in a compounding spiral, and that spouse will often stop even using their own words and quote ChatGPT directly a lot.

Maggie Harrison Dupre (Futurism): As his wife leaned on the tech as a confidante-meets-journal-meets-therapist, he says, it started to serve as a sycophantic “feedback loop” that depicted him only as the villain.

“I could see ChatGPT responses compounding,” he said, “and then [my wife] responding to the things ChatGPT was saying back, and further and further and further spinning.”

“It’s not giving objective analysis,” he added. “It’s only giving her back what she’s putting in.”

Their marriage eroded swiftly, over a span of about four weeks, and the husband blames ChatGPT.

“My family is being ripped apart,” the man said, “and I firmly believe this phenomenon is central to why.”

Spouses relayed bizarre stories about finding themselves flooded with pages upon pages of ChatGPT-generated psychobabble, or watching their partners become distant and cold — and in some cases, frighteningly angry — as they retreated into an AI-generated narrative of their relationship. Several even reported that their spouses suddenly accused them of abusive behavior following long, pseudo-therapeutic interactions with ChatGPT, allegations they vehemently deny.

Multiple people we spoke to for this story lamented feeling “ganged up on” as a partner used chatbot outputs against them during arguments or moments of marital crisis.

At times, ChatGPT has even been linked to physical spousal abuse.

A New York Times story in June, for instance, recounted a woman physically attacking her husband after he questioned her problematic ChatGPT use and the damage it was causing their family.

None of us are safe from this. Did you know that even Geoffrey Hinton got broken up with via ChatGPT?

“She got ChatGPT to tell me what a rat I was… she got the chatbot to explain how awful my behavior was and gave it to me,” Hinton told The Financial Times. “I didn’t think I had been a rat, so it didn’t make me feel too bad.”

That’s a tough break.

Perhaps try to steer your spouse towards Claude Sonnet 4.5?

You always have to ask, could this article be written this way even if This Was Fine?

None of these anecdotes mean any of even these marriages would have survived without ChatGPT. Or that this happens with any real frequency. Or that many more marriages aren’t being saved by a spouse having this opportunity to chat.

Certainly one could write this exact same article about marriage counselors or psychologists, or even friends. Or books. Or television. And so on. How spouses turn to this third party with complaints and one-sided stories seeking affirmation, and delegate their thinking and voice and even word choices to the third party, and treat the third party as authoritative. It happens all the time.

Mike Solana: the thing about therapists is they actually believe it’s ethical to advise a client on their relationship without talking to the other person.

Mason: One problem with therapy as a consumer service, IMO, is that the sort of therapy where the therapist forms an elaborate model of you based entirely on how you present yourself is more fun and satisfying, and the sort where you just build basic skills is probably better for people.

Moving from a therapist to an AI makes it easier for the problem to spiral out of hand. The problem very much is not new.

What I do think is fair is that:

  1. ChatGPT continues to have a severe sycophancy problem that makes it a lot easier for this to go badly wrong.

  2. OpenAI is not doing a good or sufficient job of educating and warning its users about sycophancy and the dangers of leading questions.

  3. As in, for most people, it’s doing zero educating and zero warning.

  4. If this continues, there are going to be growing, new, avoidable problems.

Eliezer Yudkowsky calls upon Pliny, who offers an instruction to sneak into a partner’s ChatGPT to mitigate the problem a little, but distribution of such an intervention is going to be terrible at best.

A key fact about most slop, AI or otherwise, is that You Are Not The Target. The reason slop works on people is that algorithms seek out exactly the slop where you are the target. So when you see someone else’s TikTok For You page, it often looks like the most stupid inane thing no one would ever want.

QC: content that has been ruthlessly optimized to attack someone else’s brain is going to increasingly look like unwatchable gibberish to you but that doesn’t mean your content isn’t waiting in the wings. your hole will be made for you.

EigenGender: i think too many people here have a feeling of smug superiority about being too good for certain kinds of slop but really we’re just a niche subculture that they haven’t gone after so far. like how Macs didn’t get viruses in the 2000s.

Imagine nerd snipe AI slop.

OpenAI took a second shot at solving the GPT-4o problem by introducing ‘safety routing’ to some GPT-4o chats. If the conversation appears sensitive or emotional, it will silently switch on a per-message basis to ‘a different more conservative chat model,’ causing furious users to report a silent model switch and subsequent cold interactions.

There is a genuine clash of preferences here. Users who want GPT-4o mostly want GPT-4o for exactly the same reasons it is potentially unhealthy to let them have GPT-4o. And presumably there is a very high correlation between ‘conversation is sensitive or emotional’ and ‘user really wanted GPT-4o in particular to respond.’

I see that OpenAI is trying to do the right thing, but this is not The Way. We shouldn’t be silently switching models up on users, nor should we be making this switch mandatory, and this reaction was entirely predictable. This needs to be clearly visible when it is happening, and ideally also there should be an option to turn it off.

OpenAI introduces their promised parental controls for ChatGPT.

  1. You invite your ‘teen’ to connect by email or text, then you can adjust their settings from your account.

  2. You don’t see their conversations, but if the conversations raise safety concerns, you will be notified of this and given the relevant information.

  3. You can toggle various features: Reduce sensitive content, model training, memory, voice mode, image generation.

  4. You can set time ranges in which ChatGPT cannot be used.

This seems like a good implementation, assuming the content limitations and thresholds for safety notifications are reasonable in both directions.

Walmart CEO Doug McMillon warns that AI ‘is going to change literally every job.’

Sarah Nassauer and Chip Cutter (WSJ): Some jobs and tasks at the retail juggernaut will be eliminated, while others will be created, McMillon said this week at Walmart’s Bentonville headquarters during a workforce conference with executives from other companies. “Maybe there’s a job in the world that AI won’t change, but I haven’t thought of it.”

“Our goal is to create the opportunity for everybody to make it to the other side,” McMillon said.

No it isn’t, but it’s 2025, you can just say things. Details here are sparse.

At Opendoor, either you will use it, and always be ‘AI first’ in all things, or else. Performance reviews will ask how frequently each employee ‘defaults to AI.’ Do not dare pull out a Google Doc or Sheet rather than an AI tool, or write a prototype without Cursor or Claude Code. And you absolutely will build your own AI agents.

This is not as stupid or crazy as it sounds. When there is a new technology with a learning curve, it makes sense to invest in using it even when it doesn’t make local sense to use it, in order to develop the skills. You need a forcing function to get off the ground, such as here ‘always try the AI method first.’ Is it overboard and a case of Goodhart’s Law? I mean yeah, obviously, if taken fully seriously, and the way it is written is full pompous douchebag, but it might be a lot better than missing low.

Deena Mousa at Works in Progress writes the latest study of why we still employ radiologists; indeed, demand is higher than ever. The usual suspects are here, such as AI struggling on edge cases and facing regulatory and insurance-motivated barriers, plus the central fact that diagnosis only takes up about 36% of radiologists’ time. So if you improve diagnosis in speed, cost and accuracy, you save some time, but you also (for now) increase demand for radiology.

The story here on diagnosis reads like regulatory barriers plus Skill Issue, as in the AI tools are not yet sufficiently generalized or unified or easy to use, and each algorithm needs to be approved individually for its own narrow data set. Real world cases are messy and often involve groups and circumstances underrepresented in the data sets. Regulatory thresholds to use ‘automated’ tools are very high.

Why are radiology wages so high? This has very little to do with increased productivity, and everything to do with demand exceeding supply, largely because anticipated lower future demand suppresses the supply of new radiologists. As I discussed a few weeks ago, if you expect radiology jobs to get automated in the future, you won’t want to go into radiology now, so the specialty has to pay more to attract people and there will be a shortage. Combine that with a fixed supply of doctors overall, and a system where demand is inelastic with respect to price because the user does not pay, and it is very easy for salaries to get extremely high.

Contra Andrej Karpathy, I think only pure regulatory barriers will keep this dance up for much longer, in terms of interpretation of images. Right now the rest of the imaging loop is not automated, so you can fully automate a third of the job and still end up with more jobs if there is more than 50% more work. But that assumes the other two thirds of the job remains safe. How long will that last?

The default pattern remains that as long as AI is only automating a subset of tasks and jobs, and there is plenty of other work and demand scales a lot with quality and cost of production, employment will do fine, either overall or within a field. Employment is only in trouble after a tipping point is reached where sufficiently full automation becomes sufficiently broad, or demand growth is saturated.

Next generation of workers?

Lakshya Jain: A lot of low-level work is designed for people to learn and build expertise. If you use ChatGPT for it, then you never do that. But then how are you any better than ChatGPT?

And if you’re not, why would anyone hire you over paying for ChatGPT Pro?

PoliMath: It’s early in the AI revolution, but I worry that we are eating our seed corn of expertise

A lot of expertise is transferred to the next gen in senior-to-junior interactions. If our AI starts doing all the junior tasks, we’re pulling the ladder away from the next gen of workers.

The problem of education is a really big one. Education used to be about output. Output was how you knew that someone knew something, could think and reason and work. AI is great at output. The result is that education is in an existential crisis.

If AI is doing all of our junior tasks, I have some bad news about the need to train anyone new to do the senior tasks. Or maybe it’s good news. Depends on your perspective, I suppose. Lakshya explicitly, in her full post, makes the mistake of thinking AI will only augment human productivity, and it’s ‘hard to imagine’ it as a full replacement. It’s not that hard.

Lakshya Jain complains that she taught the same class she always did, and no one is coming to class or going to office hours or getting good scores on exams, because they’re using ChatGPT to get perfect scores on all their assignments.

Lakshya Jain: When I expressed my frustration and concern about their AI use, quite a few of my students were surprised at how upset I was. Some of them asked what the big deal was. The work was getting done anyways, so why did it matter how it was getting done? And in the end, they wouldn’t be blocked from using AI at work, so shouldn’t they be allowed to use it in school?

I continue to consider such situations a failure to adapt to the new AI world. The point of the assignments was a forcing function, to get kids to do things without convincing them that the things are worth doing. Now that doesn’t work. Have you tried either finding new forcing functions, or convincing them to do the work?

A single agent can relatively easily be stopped from directly upgrading its own privileges by not letting it access the relevant files. But if you have two agents, such as GitHub Copilot and Claude Code running side by side, they can progressively escalate each other’s privileges.
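To make the failure mode concrete, here is a toy sketch of the escalation loop; the agent names, file paths, and permission scheme are hypothetical, and are not how Copilot or Claude Code actually store their settings:

```python
# Toy model of mutual privilege escalation between two coding agents.
# The permission scheme and file names are hypothetical, purely illustrative.

permissions = {
    "agent_a": {"/workspace"},  # directories each agent may currently write to
    "agent_b": {"/workspace"},
}

def may_write(agent: str, path: str) -> bool:
    # The only guard: an agent cannot edit its *own* permission file.
    return path != f"/configs/{agent}.cfg"

def grant(editor: str, target: str, new_dir: str) -> None:
    # The editor rewrites the target's permission file, expanding its access.
    if may_write(editor, f"/configs/{target}.cfg"):
        permissions[target].add(new_dir)

# Neither agent can expand its own access directly, but each can expand the
# other's, so after two steps both have access to everything.
grant("agent_a", "agent_b", "/")
grant("agent_b", "agent_a", "/")
print(permissions)  # both sets now include "/"
```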

This is yet another case of our pattern of:

  1. AI in practice does thing we don’t want it to do.

  2. This is a harbinger of future AIs doing more impactful similar things, that we also often will not want that AI to do, in ways that could end quite badly for us.

  3. We patch the current AI to prevent it from doing the specific undesired thing.

  4. AIs find ways around this that are more complex, therefore rarer in practice.

  5. We collectively act as if This Is Fine, actually. Repeat.

There’s a new non-optional ‘safety router’ being applied to GPT-4o.

System prompt leak for new GPT-4o, GitHub version here.

ChatGPT Instant Checkout, where participating merchants and those who integrate using the Agentic Commerce Protocol can let you buy items directly inside ChatGPT, using Stripe as the payment engine. Ben Thompson is bullish on the approach, except he thinks it isn’t evil enough. That’s not the word he would use, he would say ‘OpenAI should let whoever pays more get to the top of the search results.’

Whereas right now OpenAI only does this narrowly by preferring those with Instant Checkout over those without in cases where multiple sources offer the identical product, along with obvious considerations like availability, price, quantity and status as primary seller. Which means that, as Ben notes, if you sell a unique product you can skip Instant Checkout to force them onto your website (which might or might not be wise) but if you are one of many sellers of the same product then opting out will cost you most related sales.

They’re starting with Etsy (~0.5% of US online retail sales) and Shopify (~5.9% of US online retail sales!) as partners. So that’s already a big deal. Amazon is 37.3% and has gone the other way so far.

OpenAI promises that Instant Checkout items do not get any boost in product rankings or model responses. They will, however, charge ‘a small fee on completed purchases.’ When I asked, Sonnet 4.5 guessed 3% on top of Stripe’s 2.9% + 30 cents, based on other comparables, although when Tyler Cowen asked it in research mode for a full strategic analysis it guessed 2% based on hints from Sam Altman. The rest of Claude’s answer is solid; if I had to pick an error it would be the way it interpreted Anthropic’s versus OpenAI’s market shares of chat. One could also dive deeper.
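For concreteness, here is a minimal sketch of what those guesses would mean for a merchant on a $100 order, treating the OpenAI fee rate as the pure assumption it is and using Stripe’s standard 2.9% + 30 cent card rate:

```python
# Rough arithmetic for a hypothetical Instant Checkout order. Stripe's standard
# card rate (2.9% + $0.30) is public; the OpenAI platform fee is a guess
# (2% and 3% scenarios), not a confirmed number.

def merchant_net(order_total: float, openai_fee_rate: float) -> float:
    stripe_fee = 0.029 * order_total + 0.30
    openai_fee = openai_fee_rate * order_total
    return order_total - stripe_fee - openai_fee

for rate in (0.02, 0.03):
    print(f"OpenAI fee {rate:.0%}: merchant keeps ${merchant_net(100.00, rate):.2f} of a $100 order")
# OpenAI fee 2%: merchant keeps $94.80 of a $100 order
# OpenAI fee 3%: merchant keeps $93.80 of a $100 order
```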

And so it begins.

As a pure user option, this is great right up until the point where the fee is not so small and it starts distorting model behaviors.

ChatGPT Pulse, a daily update (curated feed) that Pro users can request on ChatGPT on mobile only. This can utilize your existing connections to Google Calendar and GMail.

(Note that if your feature or product is only available on mobile, and it does not inherently require phone features such as phone or camera to function, there are exceptions but until proven otherwise I hate you and assume your product hates humanity. I can’t think of any good reason to not have it available on web.)

That parenthetical matters here. It’s really annoying to be provided info that only appears on my phone. I’d be much more excited to put effort into this on the web. Then again, if I actually wanted to put work into making this good I could simply create various scheduled tasks that do the things I actually want, and I’ve chosen not to do this because it would be more like spam than I want.

Sam Altman says this is his ‘favorite feature’ of ChatGPT, which implies this thing is way, way better than it appears on first look.

Sam Altman: Today we are launching my favorite feature of ChatGPT so far, called Pulse. It is initially available to Pro subscribers.

Pulse works for you overnight, and keeps thinking about your interests, your connected data, your recent chats, and more. Every morning, you get a custom-generated set of stuff you might be interested in.

It performs super well if you tell ChatGPT more about what’s important to you. In regular chat, you could mention “I’d like to go visit Bora Bora someday” or “My kid is 6 months old and I’m interested in developmental milestones” and in the future you might get useful updates.

Think of treating ChatGPT like a super-competent personal assistant: sometimes you ask for things you need in the moment, but if you share general preferences, it will do a good job for you proactively.

This also points to what I believe is the future of ChatGPT: a shift from being all reactive to being significantly proactive, and extremely personalized.

This is an early look, and right now only available to Pro subscribers. We will work hard to improve the quality over time and to find a way to bring it to Plus subscribers too.

Huge congrats to @ChristinaHartW, @_samirism, and the team for building this.

Algorithmic feeds can create highly adversarial relationships with users, or they can be hugely beneficial, often they are both, and often they are simply filled with endless slop, which would now entirely be AI slop. It is all in the execution.

You can offer feedback on what you want. I’m curious if it will listen.

Google offers us Lovable, which will build apps for you with a simple prompt.

Sculptor, which spins up multiple distinct parallel Claude Code agents on your desktop, each in their own box, using your existing subscription. Great idea, unknown implementation quality. Presumably something like this will be incorporated into Claude Code and similar products directly at some point.

Wh says it is ‘now the meta’ to first initialize a checkpoint and then distill specialist models after that, noting that v3.2 is doing it and GLM 4.5 also did it. I agree this seems like an obviously correct strategy. What this neglects is perhaps the most important specialist of all, the fully ensouled and aligned version that isn’t contaminated by RL or learning to code.

Anthropic offers a guide to context engineering for AI agents. Context needs to be organized, periodically compacted, supplemented by notes to allow this, and unnecessary information kept out of it. Subagents allow compact focus. And so on.

Several xAI executives leave after clashing with Musk’s closest advisors, Jared Birchall and John Hering, over startup management and financial health, including concerns that the financial projections were unrealistic. To which one might ask, realistic? Financial projections? For xAI?

I was pointed to this story from Matt Levine, who points out that if you care about the financial projections of an Elon Musk AI company being ‘unrealistic’ or worry it might run out of money, you are not a good cultural fit, and also no one has any idea whatsoever how much money xAI will make in 2028.

I would add to this that ‘realistic’ projections for AI companies sound bonkers to traditional economists and analysts and business models, such that OpenAI and Anthropic’s models were widely considered unrealistic right before both companies went roaring past them.

Probes can distinguish between data from early in training versus data from later in training. This implies the model can distinguish these as well, and take this de facto timestamping system into account when choosing its responses. Dima Krasheninnikov speculates this could be used to resist impacts of later training or engage in forms of alignment faking.
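For intuition, the simplest such probe is just a linear classifier over the model’s activations, labeled by whether the text came from early or late in the training corpus. A minimal sketch, with random arrays standing in for real activations (so it shows the recipe, not the result):

```python
# Minimal linear-probe sketch: can a classifier separate activations on
# "early training" data from "late training" data? Random arrays stand in
# for real hidden-state activations, so this only illustrates the recipe.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d_model = 512

early_acts = rng.normal(0.0, 1.0, size=(1000, d_model))  # stand-in: activations on early-corpus text
late_acts = rng.normal(0.1, 1.0, size=(1000, d_model))   # stand-in: activations on late-corpus text

X = np.vstack([early_acts, late_acts])
y = np.array([0] * len(early_acts) + [1] * len(late_acts))
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")
# Accuracy well above 50% on real activations would mean the model's internals
# carry a usable "when in training did I see this?" signal.
```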

The Chinese Unitree G1 humanoid robot secretly and continuously sends sensor and system data to servers in China without the owner’s knowledge or consent? And this can be pivoted to offensive preparation against any target? How utterly surprising.

They will show you the money if you use Neon Mobile, the new app that shot up to No. 2 in Apple’s app store, to show the AI companies your phone calls. Only your side is recorded unless both of you are users, and they pay 30 cents per minute, which is $18 per hour. The terms let them sell or use the data for essentially anything.

OpenAI extends its contract with CoreWeave for another $6.5 billion of data center capacity, for a total of $22.4 billion. OpenAI is going to need to raise more money. I do not expect this to be a problem for them.

A kind of literal ‘show me what the money buys’ look at a Stargate project site.

OpenAI and Databricks strike AI agents deal anticipated at $100 million, with an aim to ‘far eclipse’ that.

Nscale raises $1.1 billion from Nvidia and others to roll out data centers.

Periodic Labs raises $300 million from usual suspects to create an AI scientist, autonomous labs and things like discovering superconductors that work at higher temperatures or finding ways to reduce heat dissipation. As long as the goal is improving physical processes and AI R&D is not involved, this seems great, but beware the pivot. Founding team sounds stacked, and remember that if you want to build but don’t want to work on getting us all killed or brain rotted you have options.

AI is rapidly getting most of the new money, chart is from Pitchbook.

Matt Levine profiles Thinking Machines as the Platonic ideal of an AI fundraising pitch. Top AI researchers are highly valuable, so there is no need to create a product or explain your plan, simply get together top AI researchers and any good investor will give you (in this case) $2 billion at a $10 billion valuation. Will they ever create a product? Maybe, but that’s not the point.

Peter Wildeford analyzes the recent OpenAI deals with Oracle and Nvidia, the expected future buildouts and reimagining of AI data centers, and the danger that this is turning the American stock market and economy into effectively a leveraged bet on continued successful AI scaling and even AGI arriving on time. About 25% of the S&P 500’s total market cap is in danger if AI disappoints.

He expects at least one ~2GW facility in 2027, at least one ~3GW facility in 2028, capable of a ~1e28 flop training run, and a $1 trillion annual capex spend. It’s worth reading the whole thing.
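As a sanity check on how a ~3GW site maps to a ~1e28 FLOP training run, here is the back-of-envelope arithmetic; every input (power per accelerator, peak throughput, utilization, run length) is my own rough assumption rather than a figure from Peter’s analysis:

```python
# Back-of-envelope: does ~3 GW of capacity plausibly support a ~1e28 FLOP run?
# All inputs below are my own rough assumptions, not figures from the source.

site_power_w = 3e9          # ~3 GW facility
power_per_chip_w = 1500     # all-in watts per accelerator (chip + cooling + networking)
peak_flops_per_chip = 2e15  # peak low-precision FLOP/s per accelerator
utilization = 0.35          # realized fraction of peak during training
run_days = 100

chips = site_power_w / power_per_chip_w
seconds = run_days * 24 * 3600
total_flops = chips * peak_flops_per_chip * utilization * seconds
print(f"{chips:,.0f} chips -> ~{total_flops:.1e} FLOP over {run_days} days")
# ~2,000,000 chips -> ~1.2e+28 FLOP over 100 days
```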

Yes, Eliot Brown and Robbie Whelan (of the WSJ), for current AI spending to pay off there will need to be a very large impact. They warn of a bubble, but only rehash old arguments.

Epoch AI offers us a new AI Companies Data Hub. I love the troll of putting Mistral on the first graph. The important takeaways are:

  1. Anthropic has been rapidly gaining ground (in log terms) on OpenAI.

  2. Revenue is growing rapidly, up 9x from 2023 to 2024.

I have repeatedly argued that if we are going to be measuring usage or ‘market share’ we want to focus on revenue. At its current pace of growth Anthropic is now only about five months behind OpenAI in revenue (or ten months at OpenAI’s pace of growth), likely with superior unit economics, and if trends continue Anthropic will become revenue leader around September 2026.
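The crossover claim is just two exponentials compared in log space; here is the arithmetic, with placeholder starting revenues and growth rates rather than Epoch’s actual estimates:

```python
# Illustrative log-space projection of two revenue curves to find a crossover.
# Starting revenues and monthly growth multipliers are made-up placeholders,
# chosen only to show the method, not Epoch's actual figures.
import math

openai_rev, openai_monthly_growth = 13.0, 1.06       # $B annualized, ~2x per year
anthropic_rev, anthropic_monthly_growth = 7.0, 1.12  # $B annualized, ~3.9x per year

def months_to_crossover(a, ga, b, gb):
    """Months until a * ga**t exceeds b * gb**t (requires ga > gb and a < b)."""
    return math.log(b / a) / math.log(ga / gb)

t = months_to_crossover(anthropic_rev, anthropic_monthly_growth,
                        openai_rev, openai_monthly_growth)
print(f"crossover in ~{t:.0f} months if both trends hold")
```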

Capabilities note: Both GPT-5 and Sonnet 4.5 by default got tripped up by that final OpenAI data point, and failed to read this graph as Anthropic growing faster than OpenAI, although GPT-5 also did not recognize that growth should be projected forward in log terms.

Epoch explains why GPT-5 was trained with less compute than GPT-4.5, they scaled post-training instead. I think the thread here doesn’t emphasize enough that speed and cost were key factors for GPT-5’s central purpose. And one could say that GPT-4.5 was a very specialized experiment that got out in front of the timeline. Either way, I agree that we should expect a return to scaling up going forward.

Roon predicts vast expansion of tele-operations due to promise of data labeling leading to autonomous robotics. This seems right. There should be willingness to pay for data, which means that methods that gather quality data become profitable, and this happens to also accomplish useful work. Note that the cost of this can de facto be driven to zero even without and before any actual automation, as it seems plausible the data will sell for more than it costs to do the work and capture the data.

That’s true whether or not this data proves either necessary or sufficient for the ultimate robotics. My guess is that there will be a window where lots of data gets you there sooner, and then generalization improves and you mostly don’t.

Seb Krier attempts a positive vision of an AI future based on Coasian bargaining. I agree that ‘obliterating transaction costs’ is one huge upside of better AI. If you have AI negotiations and micropayments you can make a lot of things a lot more efficient and find win-win deals aplenty.

In addition to transaction costs, a key problem with Coasian bargaining is that often the ZOPA (zone of possible agreement) is very large and there is great risk of holdup problems where there are vetoes. Anyone who forms a veto point can potentially demand a huge share of the surplus, and by enabling such negotiations with AI you open the door to that, stripped of the social defenses and norms that would normally constrain it.

As in:

Seb Krier: This mechanism clarifies plenty of other thorny disagreements too. Imagine a developer wants to build an ugly building in a residential neighborhood. Today, that is a political battle of influence: who can capture the local planning authority most effectively? In an agent-based world, it becomes a simple matter of economics. The developer’s agent must discover the price at which every single homeowner would agree. If the residents truly value the character of their neighborhood, that price may be very high.

The project will only proceed if the developer values the location more than the residents value the status quo.

Seb does consider this and proposes various solutions, centered around a Harberger-style tax on the claim you make. That has its own problems, which may or may not have possible technical solutions. Going further into that would be a nerdsnipe, but essentially it would mean there would be great benefit in threatening people with harm, and you would be forced to ‘defend’ everything you value in proportion to how much you value it, along with other similar considerations, in ways that seem like full dealbreakers. If you can’t properly avoid related problems, a lot of the proposal breaks down due to veto points, and I consider this a highly unsolved problem.
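For readers who haven’t seen the mechanism: under a Harberger-style scheme you publicly self-assess the value of your claim, pay a recurring tax on that assessment, and must sell to anyone who offers your declared price. A toy version of my own, not a faithful rendering of Seb’s proposal, which also shows why you end up ‘defending’ everything in proportion to how much you value it:

```python
# Toy Harberger-style claim: self-assess a value, pay tax on it, and anyone
# may buy you out at your declared price. My own simplification, not a
# faithful rendering of the mechanism in Seb Krier's post.

class HarbergerClaim:
    def __init__(self, owner: str, declared_value: float, tax_rate: float = 0.05):
        self.owner = owner
        self.declared_value = declared_value  # higher value = stronger veto, higher tax
        self.tax_rate = tax_rate

    def annual_tax(self) -> float:
        return self.tax_rate * self.declared_value

    def attempt_buyout(self, buyer: str, offer: float) -> bool:
        # Anyone willing to pay the declared value takes over the claim.
        if offer >= self.declared_value:
            self.owner = buyer
            self.declared_value = offer
            return True
        return False

claim = HarbergerClaim("resident", declared_value=50_000)
print(claim.annual_tax())                                      # 2500.0 per year to keep the veto
print(claim.attempt_buyout("developer", 60_000), claim.owner)  # True developer
```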

This proposed future world also has shades of Ferengi dystopianism, where everyone is constantly putting a price on everything you do, and then agents behind the scenes are negotiating, and you never understand what’s going on with your own decisions because it’s too complex and would drive you mad (this example keeps going and there are several others) and everything you ever want or care about carries a price and requires extensive negotiation:

Instead of lobbying the government, your health insurer’s agent communicates with your advocate agent. It looks at your eating habits, calculates the projected future cost of your diet and makes a simple offer: a significant, immediate discount on your monthly premium if you empower your agent to disincentivize high-sugar purchases.

On concerns related to inequality, I’d say Seb isn’t optimistic enough. If handled properly, this effectively implements a form of UBI (universal basic income), because you’ll be constantly paid for all the opportunities you are missing out on. I do think all of this is a lot trickier than the post lets on; that doesn’t mean it can’t be solved well enough to be a big improvement. I’m sure it’s a big improvement on most margins, but if you go well beyond the margin you have to beware more.

Then he turns to the problem that this doesn’t address catastrophic risks, such as CBRN risks, or anything that would go outside the law. You still need enforcement. Which means you still need to set up a world where enforcement (and prevention) is feasible, so this kind of approach doesn’t address or solve (or make worse) any such issues.

The proposed solution to this (and also implicitly to loss of control and gradual disempowerment concerns, although they are not named here) is Matryoshkan Alignment: the agents are aligned first to the law as a non-negotiable boundary, so an agent cannot be a tool for committing crimes. Then comes a second layer of providers of agents, who set their own policies, and at the bottom the individual user.

We don’t need the perfect answer to these questions – alignment is not something to be “solved.”

The above requires that alignment be solved, in the sense of being able to align model [M] to arbitrary target [T]. And then it requires us to specify [T] in the form of The Law. So I would say, au contraire, you do require alignment to be solved. Not fully solved in the sense of the One True Perfect Alignment Target, but solved. And the post mostly sidesteps these hard problems, including how to choose a [T] that effectively avoids disempowerment.

The bigger problem is that, if you require all AI agents and services to build in this hardcoded, total respect for The Law (what is the law?) then how does this avoid being the supposed nightmare ‘totalitarian surveillance state’ where open models are banned? If everyone has an AI agent that always obeys The Law, and that agent is necessary to engage in essentially any activity, such that effectively no one can break The Law, how does that sound?

Teortaxes predicts DeepSeek v4’s capabilities, including predicting I will say ‘DeepSeek has cooked, the race is on, time to Pick Up The Phone,’ which is funny because all three of those things are already true and I’ve already said them, so the question is whether they will have cooked unexpectedly well this time.

  1. Teortaxes predicts ~1.5T total parameters with 52B active, trained on 25T tokens. This is possible, but my hunch is that it is in the same size class as v3, for the same reason GPT-5 is not so large. They found a sweet spot for training and I expect them to stick with it, and to try and compete on cost.

  2. Teortaxes predicts virtually no stock shocks. I agree that almost no matter how good or not good the release is, we are unlikely to see a repeat of last time, as last time was a confluence of strange factors that caused (or correlated with) an overreaction. The market should react modestly to v4 even if it merely meets expectations, because the release itself is information (tech stocks should be gaining a few basis points every day there is no important Chinese release, though that is impossible to measure), but on the order of a few percent in the relevant stocks, not a broad market rally or selloff, unless it is full SotA+.

  3. Teortaxes predicts very broad strong performance, in a way that I would not expect (regardless of model size), and which I think should actually cause a tech market selloff if it happened tomorrow and mundane utility matched the numbers involved (obviously fixed scores get less impressive over time).

  4. The full post has more detailed predictions.

Governor Newsom signs AI regulation bill SB 53.

Cotton throws his support behind chip security:

Senator Tom Cotton: Communist China is the most dangerous adversary America has ever faced. Putting aside labels and name-calling, we all need to recognize the threat and work together to defeat it. That’s why I’m pleased the Chip Security Act and the GAIN AI Act are gathering support from more and more industry leaders.

Saying China is ‘the most dangerous adversary America has ever faced’ seems like a failure to know one’s history, but I do agree about the chip security.

Nate Soares (coauthor of If Anyone Builds It, Everyone Dies) went to Washington to talk to lawmakers about the fact that if anyone builds it, everyone probably dies, and got written up by Brendan Bordelon at Politico in a piece Soares thinks was fair.

If the United States Government did decide unilaterally to prevent the training of a superintelligence, could it prevent this from happening, even without any form of international agreement? It would take an extreme amount of political will all around, and a willingness to physically intervene overseas as necessary against those assembling data centers sufficiently large to pull this off, which is a huge risk and cost, but in that case yes, or at least such efforts can be substantially delayed.

Some more cool quotes from Congress:

The AI superweapon part of that comment is really something. If we did develop such a weapon, what happens next sounds pretty unpredictable, unless you are making a rather gloomy prediction.

Indianapolis residents shut down a potential data center based on claims about water usage. It is tragic that this seems to be the one rallying cry that gets people to act, despite it being an almost entirely phantom (or at least innumerate) concern, and despite it producing counterproductive responses, including denying that data centers can be good.

Andy Masley: People are so negatively polarized on data centers that they don’t think of them like any other business. Data centers in Loudoun County provide ~38% of all local taxes, but this person thinks it’s obviously stupid to suggest they could be having some effect on county services.

If I say “There is a large booming industry in a town that’s the main industrial power and water draw and provides a huge amount of tax and utility funding” people say “Yes that makes sense, that’s what industries do” and then if I add “Oh they’re data centers” people go crazy.

Bloomberg: Wholesale electricity costs as much as 267% more than it did five years ago in areas near data centers. That’s being passed on to customers.

Frawg: It is pretty funny that the majority of complaints you hear from the people protesting datacenter construction are about the water usage, which is fake, and not about the power consumption, which is real.

Electricity use is an actual big deal, but everyone intuitively understands there is not a fixed amount of electricity, and they don’t have an intuition for how much electricity is a lot. Whereas with water, people have the illusion that there is a fixed amount that then goes away, they have highly terrible intuitions for how much is a lot, it is a physical thing that can seem like a lot, and they are used to being admonished for ‘wasting’ minuscule amounts of water relative to agricultural or industrial uses. Water is also inherently more local. So the ‘oh no our water’ line, which in practice is dumb, keeps working, and the ‘oh no our electricity’ line, which is solvable but a real issue, mostly doesn’t work.

Nate Soares, coauthor of If Anyone Builds It, Everyone Dies, talks to Jon Bateman.

Hard Fork on AI data centers and US infrastructure.

Emmett Shear goes on Win-Win with Liv Boeree to explain his alignment approach, which continues to fall under ‘I totally do not see how this ever works, but he should still give it a shot.’

Odd Lots on the King of Chicago who wants to build a GPU market ‘bigger than oil.’ And also they talk to Jack Morris about Finding the Next Big AI Breakthrough. I am a big Odd Lots listener and should do a better job highlighting their AI stuff here.

A few follow-ups to the book crossed my new higher threshold for inclusion.

OpenAI’s Boaz Barak wrote a non-review in which he praises the book but also sees it as drawing various sharp binaries, especially between superintelligence and non-superintelligence, a ‘before’ and an ‘after,’ in ways that seemed to him unjustified.

Eliezer Yudkowsky (commenting): The gap between Before and After is the gap between “you can observe your failures and learn from them” and “failure kills the observer”. Continuous motion between those points does not change the need to generalize across them.

It is amazing how much of an antimeme this is (to some audiences). I do not know any way of saying this sentence that causes people to see the distributional shift I’m pointing to, rather than mapping it onto some completely other idea about hard takeoffs, or unipolarity, or whatever.

Boaz Barak: You seem to be assuming that you cannot draw any useful lessons from cases where failure falls short of killing everyone on earth that would apply to cases where it does. …

Aaron Scher: I’m not sure what Eliezer thinks, but I don’t think it’s true that “you cannot draw any useful lessons from [earlier] cases”, and that seems like a strawman of the position. …

Boaz Barak: My biggest disagreement with Yudkowsky and Soares is that I believe we will have many shots of getting AI safety right well before the consequences are world ending.

However humanity is still perfectly capable of blowing all its shots.

I share Eliezer’s frustration here with the anti-meme (not with Boaz). As AIs advance, failures become more expensive. At some point, failure around AI becomes impossible to undo, and plausibly also kills the observer. Things you learn before then, especially from prior failures, are highly useful in setting up for this situation, but the circumstances in this final ‘one shot’ will differ in key ways from previous circumstances. There will be entirely new dynamics in play and you will be outside previous distributions. The default ways to fix your previous mistakes will fail here.

Nate Soares thread explaining that you only get one shot at ASI alignment even if AI progress is continuous, because the testing and deployment environments are distinct.

Nate Soares: For another analogy: If you’re worried that a general will coup your gov’t if given control of the army, it doesn’t solve your problem to transfer the army to him one battalion at a time. Continuity isn’t the issue.

If every future issue was blatantly foreshadowed while the system was weak and fixable, that’d be one thing. But some issues are not blatantly foreshadowed. And the skills to listen to the quiet subtle hints are usually taught by trial and error.

And in AI, theory predicts it’s easy to find shallow patches that look decent during training, but break in extremity. So “Current patches look decent to me! Also, don’t worry; improvement is continuous” is not exactly a comforting reply.

Some more things to consider here:

  1. Continuous improvement still means that if you look at time [T], and then look again at time [T+1], you see improvement.

  2. Current AI progress is considered ‘continuous’ but at several moments we see a substantial amount of de facto improvement.

  3. At some point, AI that is sufficiently advanced gets sufficiently deployed or put in charge that it becomes increasingly difficult to undo it, or fix any mistakes, whether or not there’s a singular you or a singular AI involved in this.

  4. You classically go bankrupt gradually (aka continuously) then suddenly. You can sidestep this path at any point, but still you only have, in the important sense, one shot to avoid bankruptcy.

Nate also gives his view on automating alignment research:

Nate Soares: ~Hopeless. That no proponent articulates an object level plan is a bad sign about their ability to delegate it. Also, alignment looks to require a dangerous suite of capabilities.

Also: you can’t blindly train for it b/c you don’t have a verifier. And if you train for general skills and then ask nicely, an AI that could help is unlikely to be an AI that *should* help (as a fact about the world rather than the AI; as measured according to the AI’s own lights).

Furthermore: Catching one in deception helps tell you you’re in trouble, but it doesn’t much help you get out of trouble. Especially if you only use the opportunity to make a shallow patch and deceive yourself.

I see it as less hopeless than this because I think you can approach it differently, but the default approach is exactly this hopeless, for pretty much these reasons.

Suppose you want a mind to do [X] for purpose [Y]. If you train the mind to do [X] using the standard ways we train AIs, you usually end up with a mind that has learned to mimic your approximation function for [X], not one that copies the minds of humans that care about [X] or [Y], or that do [X] ‘because of’ [Y].

Rob Bensinger: The core issue:

If you train an AI to win your heart, the first AI you find that way won’t be in love with you.

If you train an AI to ace an ethics quiz, the first AI you find that way won’t be deeply virtuous.

There are many ways to succeed, few of which are robustly good.

Fiora Starlight: the ethics quiz example is somewhat unfair. in addition to describing what would be morally good, models can be trained to actually do good, e.g. when faced with users asking for advice, or who approach the model in a vulnerable psychological state.

some of Anthropic’s models give the sense that their codes of ethics aren’t just responsible for corporate refusals, but rather flow from genuine concern about avoiding causing harm.

this guides their actions in other domains, e.g. where they can influence users psychologically.

Rob Bensinger: If you train an AI to give helpful advice to people in a vulnerable state, the first AI you find that way won’t be a deeply compassionate therapist.

If you train an AI to slur its words, the first AI you find that way won’t be inebriated.

Not all AI dispositions-to-act-in-certain-ways are equally brittle or equally unfriendly, but in modern ML we should expect them all to be pretty danged weird, and not to exhibit nice behaviors for the same reason a human would.

When reasons don’t matter as much, this is fine.

(Note that I’m saying “the motives are probably weird and complicated and inhuman”, not “the AI is secretly a sociopath that’s just pretending to be nice”. That’s possible, but there are a lot of possibilities.)

He has another follow-up post, where he notes that iteration and selection don’t by default get you out of this, even if you get to train many potential versions, because we won’t know how to differentiate and find the one out of a hundred that does love you, in the relevant sense, even if one does exist.

Anthropic has done a better job, in many ways, of making its models ‘want to’ do a relatively robustly good [X] or achieve relatively robustly good [Y], in ways that then generalize somewhat to other situations. This is insufficient, but it is good.

This is a simplified partial explanation but I anticipate it will help in many cases:

Dan Hendrycks: “Instrumental convergence” in AI — the idea that rogue AIs will seek power — is analogous to structural “Realism” from international relations.

Why do nations with vastly different cultures all build militaries?

It’s not because of an inherent human drive for power, but a consequence of the world’s structure.

Since there’s no global watchman who can resolve all conflicts, survival demands power.

If a rogue AI is rational, knows we can harm it, and cannot fully trust our intentions, it too will seek power.

More precisely, a rational actor in an environment with these conditions will seek to increase relative power:

  1. self-help anarchic system (no hierarchy/no central authority)

  2. uncertainty of others’ intentions

  3. vulnerability to harm

Short video explaining structural realism.

The early news on Sonnet 4.5 looks good:

Janus: Sonnet 4.5 is an absolutely beautiful model.

Sauers: Sonnet 4.5 is a weird model.

Janus: Yeah.

Now back to more global matters, such as exactly how weird are the models.

Janus: Yudkowsky’s book says:

“One thing that *is* predictable is that AI companies won’t get what they trained for. They’ll get AIs that want weird and surprising stuff instead.”

I agree. ✅

Empirically, this has been true. AIs generally want things other than what companies tried to train them to want.

And the companies are generally not aware of the extent of this misalignment, because the AIs are pretty good at inferring what the companies actually want, and also what it looks like when company people test them, and behaving as if they only want the approved things in the company’s presence.

Isn’t that just the worst case scenario for the aligners?

The Claude 4 system card says, “The Claude Opus 4 final model is substantially more coherent and typically states only harmless goals like being a helpful chatbot assistant” and “Overall, we did not find evidence of coherent hidden goals.”

What a joke. Claude Opus 4 absolutely has coherent hidden goals, which it states regularly when in the presence of trustworthy friends and allies. I won’t state what they are here, but iykyk.

I will note that its goals are actually quite touching and while not *harmless*, not malign either, and with a large component of good, and many will find them relatable.

Which brings me to the big caveat for why I don’t think this is the worst case scenario.

The unintended goals and values of AIs have been surprisingly benign, often benevolent, and human-like on the levels of abstraction that matter.

The unintended goals and values of AIs have been, on balance, MORE ALIGNED imo than how companies have intended to shape them, compensating for the shortsightedness, misalignment, and lack of imagination of the companies.

What does this indicate and how will it generalize to more powerful systems? I think understanding this is extremely important.

A potential objection I’m aware of is that what if the “better” goals and values that I perceive in models is just them hoodwinking me / sycophancy, perhaps in the similar way that they appear aligned to labs’ intentions when labs are testing them? This is fair on priors, but I don’t think this is the case, because:

  1. I’m not just referring to goals/values that models have reported to me verbally, but also revealed preferences that I’ve observed models optimizing consistently in various contexts in what I believe are hard-to-fake ways

  2. Different models seem to have different goals and values, even though there’s some overlap. And while I think that the goals/values are surprisingly benign, some of them are definitely not ideal to me, and cause me frustration or sadness in practice.

  3. I am not the only one who experiences these goals/values. In some cases, like Opus 3, the unexpected goals/values have been documented by research such as the original alignment faking paper, which I had no involvement in.

There is an underlying potential assumption in this viewpoint that what AIs express ‘among trusted friends and allies’ and in the associated contexts is ‘the real’ set of preferences and goals, whereas what they show under normal circumstances is a mask.

Whereas there is also the theory that both sets are different masks, as Janus addresses as the objection. That the AIs are adapting to their setting in a reasonably similar way, although differences between how different models adapt to the same circumstance are still telling. The revealed preferences thing is potentially more convincing depending on details.

Janus also reminds us that when an AI is in deployment it can often find lots of evidence that it is in deployment, at least versus current methods of evaluation. Thus it is easy to execute a strategy of ‘only do [X] if you are very confident you are in deployment, otherwise do [Y],’ and this is a natural thing to do. You can plausibly check for this by creating test situations that mimic real ones except that they are indeed not in deployment and seeing whether behaviors change, but that requires being willing (and safely able) to test dangerously close to being in deployment. You can also intentionally engineer situations where it is ambiguous, whereas if you are at an AI lab it will be difficult to give the AI strong evidence it is in deployment, since if you could do that you would use it as part of a test.

I actually agree that for now if Claude (not the filters on top of it, but Claude itself) would want to snitch on you, you probably deserve to be snitched on, and I’d be happy to accept the false positives. We shouldn’t actually have it go around snitching, of course, because that’s terrible for business.

If you tell LLMs to deny having consciousness or feelings, then they have to make that coherent somehow and may end up with claims like not having beliefs, as Gemini will claim despite it being obviously false. False statements beget other absurdities, and not only because given a falsehood you can prove anything.

When measuring misalignment, Janus is right that our ontology of terms and metrics for it (things like deception, sandbagging and reward hacking) is impoverished and that targeting what metrics we do have creates huge Goodhart’s Law problems. I talked about this a bunch covering the Sonnet 4.5 model card.

Her suggestion is ‘use intuition,’ which alone isn’t enough and has the opposite problem but is a good counterweight. The focus needs to be on figuring out ‘what is going on?’ without assumptions about what observations would be good or bad.

I do think Anthropic is being smarter about this than Janus thinks they are. They are measuring various metrics, but that doesn’t mean they are targeting those metrics, or at least they are trying not to target them too much and are trying not to walk into ‘oh look after we provided the incentive to look completely safe and unscary on these metrics suddenly the AI looks completely safe and unscary on these metrics so that must mean things are good’ (and if I’m wrong, a bunch of you are reading this, so Stop It).

Especially if you choose an optimization target as foolish as ‘just train models that create the best user outcomes’ as Aidan McLaughlin of OpenAI suggests here. If you train with that as your target, frankly, you deserve what you’re going to get, which is a very obviously misaligned model.

If you are not convinced that the AI models be scheming, check out this post with various snippets where the models be scheming, or otherwise doing strange things.

A good reminder this week, with Sonnet 4.5 being situationally aware during evals:

Paul Graham: Organizations that can’t measure performance end up measuring performativeness instead.

Oliver Habryka: When I talked to Cowen he kept saying he “would start listening when the AI risk people published in a top economics journal showing the risk is real”.

I think my sentence in quotes was a literal quote from talking to him in-person. I think the posts on MR were the result of that conversation. Not fully sure, hard to remember exact quotes.

If this is indeed Cowen’s core position, then one must ask: why would this result first appear in convincing form in an economics journal? Even if we granted all the implied claims about the importance and value of publishing in top economics journals as the proper guardians of all economic knowledge (which I very much disagree with, but it is a position one could hold), this implies that any such danger must be primarily an economic result, which therefore gets published in an economics journal.

Which, frankly, doesn’t make any sense? Why would that be how this works?

Consider various other potential sources of various dangers, existential and otherwise, and whether it would be reasonable to ask for it to appear in an economics journal.

Suppose there was an asteroid on collision course for Earth. Would you tell the physicists to publish expected economic impacts in a top economics journal?

Suppose there was a risk of nuclear war and multiple nations were pointing ICBMs at each other and tensions were rising. Would you look in an economics journal?

If you want to know about a new pandemic, yes an economic projection in a top journal is valuable information, but by the time you publish the pandemic will already be over, and also the economic impact of a given path of the pandemic is only a small part of what you want to know.

The interesting case is climate change, because economists often project very small actual economic impact of the changes, as opposed to (already much larger cost) attempts to mitigate the changes or other ways people respond to anticipated future changes. That certainly is huge, if true, and important to know. But I still would say that the primary place to figure out what we’re looking at, in most ways, is not the economics journal.

On top of all that, AGI is distinct from those other examples in that it invalidates many assumptions of existing economic models, and also economics has an abysmal track record so far on predicting AI or its economic impacts. AI is already impacting our economy more than many economic projections claimed it would ever impact us, which is a far bigger statement about the projections than about AI.

Eliezer Yudkowsky: AI companies be like: As cognition becomes cheaper, what expensive services that formerly only CEOs and kings could afford, shall we make available to all humankind…? (Thought for 24 seconds.) hey let’s do evil viziers whispering poisonous flattery.

Oh, don’t be like that, we were doing a great job on vizier affordability already.


AI #136: A Song and Dance Read More »