Author name: Beth Washington

Layoffs, a “coding error,” chaos: Trump admin ravages the health dept.

Federal health agencies are reeling from mass layoffs on Friday that appear to have particularly devastated the Centers for Disease Control and Prevention, despite some terminations being rescinded on Saturday.

Numbers are still sketchy, but reports from Friday indicate that more than 4,000 federal workers overall were initially targeted for layoffs. The Trump administration linked the firings to the ongoing government shutdown, a move that legal experts have suggested is illegal. Unions representing federal workers have already filed a lawsuit challenging the layoffs.

Of the reported 4,000 terminations, about 1,100 to 1,200 were among employees in the Department of Health and Human Services (HHS). HHS is a massive department that houses critical federal agencies, including the Centers for Disease Control and Prevention, the National Institutes of Health, the Food and Drug Administration, and the Centers for Medicare & Medicaid Services, among others. Before Trump’s second term, the HHS workforce was about 82,000, but that was slashed to about 62,000 earlier this year amid initial cuts and efforts to push civil servants out.

While it’s unclear where all the new cuts occurred, reports from anonymous and external sources describe a major gutting of the CDC, an agency that has already been severely wounded, losing significant numbers this year. Its former leaders have accused the Trump administration of censoring its scientific work. It suffered a dramatic ousting of its Senate-confirmed director in August. And it was the target of a gunman weeks earlier, who shot over 500 rounds at its employees, killing a local police officer.

As terminations went out Friday, reports indicated that they hit staff who produce the CDC’s esteemed journal Morbidity and Mortality Weekly Report, employees responding to the measles outbreaks in the US, others responding to the Ebola outbreak in the Democratic Republic of the Congo, workers in the Global Health Center, and disease detectives in the Epidemic Intelligence Service.

Keep losing your key fob? Ford’s new “Truckle” is the answer.

I came across possibly one of the weirdest official automotive accessories this morning, courtesy of a friend’s social media feed. It’s called the “Truckle,” and it’s a hand-crafted silver and bronze belt buckle that might be the envy of every other cowboy out there, since this one has a place to keep your F-150’s key fob without ruining the lines of your jeans.

The Truckle was designed by Utah-based A Cut Above Buckles, with a hand-engraved F-150 on the bump in the front. Behind the truck? Storage space for a Ford truck key fob, which should fit any F-150 from model year 2018 onward.

“You can put your key fob in the buckle—all your remote features work while it’s in the buckle,” designer Andy Andrews told the Detroit Free Press. “Once you have it in there, you’re not going to lose that key fob. You’re not going to be scratching your head (wondering) where it’s at. It’s right there with you in the Truckle.”

The limited edition Truckle is probably only for serious F-150 fans, though; at $200, it’s quite a commitment to keeping your pants up. Ford and A Cut Above Buckles debuted the Truckle this past weekend at the Texas State Fair.

Termite farmers fine-tune their weed control

Odontotermes obesus is one of the termite species that grows fungi, called Termitomyces, in their mounds. Workers collect dead leaves, wood, and grass to stack them in underground fungus gardens called combs. There, the fungi break down the tough plant fibers, making them accessible for the termites in an elaborate form of symbiotic agriculture.

Like any other agriculturalist, however, the termites face a challenge: weeds. “There have been numerous studies suggesting the termites must have some kind of fixed response—that they always do the same exact thing when they detect weed infestation,” says Rhitoban Raychoudhury, a professor of biological sciences at the Indian Institute of Science Education, “but that was not the case.” In a new Science study, Raychoudhury’s team discovered that termites have pretty advanced, surprisingly human-like gardening practices.

Going blind

Termites do not look like particularly good gardeners at first glance. They are effectively blind, which is not that surprising considering they spend most of their life in complete darkness working in endless corridors of their mounds. But termites make up for their lack of sight with other senses. “They can detect the environment based on advanced olfactory reception and touch, and I think this is what they use to identify the weeds in their gardens,” Raychoudhury says. To learn how termites react once they detect a weed infestation, his team collected some Odontotermes obesus and challenged them with different gardening problems.

The experimental setup was quite simple. The team placed some autoclaved soil sourced from termite mounds into glass Petri dishes. On this soil, Raychoudhury and his colleagues placed two fungus combs in each dish. The first piece acted as a control and was a fresh, uninfected comb with Termitomyces. “Besides acting as a control, it was also there to make sure the termites have the food because it is very hard for them to survive outside their mounds,” Raychoudhury explains. The second piece was intentionally contaminated with Pseudoxylaria, a filamentous fungal weed that often takes over Termitomyces habitats in termite colonies.

2025 State of AI Report and Predictions

The 2025 State of AI Report is out, with lots of fun slides and a full video presentation. They’ve been consistently solid, providing a kind of outside general view.

I’m skipping over stuff my regular readers already know that doesn’t bear repeating.

Nathan Benaich: Once a “Llama rip-off,” @Alibaba_Qwen now powers 40% of all new fine-tunes on @huggingface. China’s open-weights ecosystem has overtaken Meta’s, with Llama riding off into the sunset…for now.

I highlight this because the ‘for now’ is important to understand, and to note that it’s Qwen not DeepSeek. As in, models come and models go, and especially in the open model world people will switch on you on a dime. Stop worrying about lock-ins and mystical ‘tech stacks.’

Robots now reason too. “Chain-of-Action” planning brings structured thought to the physical world – from AI2’s Molmo-Act to Gemini Robotics. Massive amounts of effort are thrown into the mix, expect lots of progress here…

.@AnthropicAI‘s Model Context Protocol is the new USB-C of AI. A single standard to connect models to tools, already embedded in ChatGPT, Gemini, Claude, and VS Code, has taken shape. But not without emerging security risks…

I note this next part mostly because it shows the Different Worlds dynamic:

Nathan Benaich: The frontier fight is relentless. @OpenAI still tops most leaderboards, but @GoogleDeepMind‘s stays there longer. Timing releases has become its own science…not least informing financing rounds like clockwork.

They’re citing LMArena and Artificial Analysis. LMArena is dead, sir. Artificial Analysis is fine, if you had to purely go with one number, which you shouldn’t do.

Once more for the people in the back or the White House:

.@deepseek_ai “$5M training run” deep freak was overblown. Since the market realised the fineprint in the R1 paper, that’s led to Jevons paradox on steroids: lower cost per run → more runs → more compute needed, buy more NVIDIA.

… China leads in power infrastructure too, adding >400GW in 2024 vs 41GW for the US. Compute now clearly runs on geopolitics.

Then we get to what I thought was the first clear error:

Now, let’s switch gears into Politics. The US Government is turning capitalist. Golden shares in US Steel, stakes in Intel and MP Materials, and revenue cuts from NVIDIA’s China sales. New-age Industrial policy?

Not capitalist. Socialist.

The term for public ownership of the means of production is socialist.

Unless this meant ‘the US Government centrally maximizing the interests of certain particular capitalists’ or similarly ‘the US Government is turning into one particular capitalist maximizing profits.’ In which case, I’m not the one who said that.

The AI Safety Institute network has collapsed. Washington ditched attending meetings altogether, while the US and UK rebranded “safety” into “security.”

I don’t think this is fair to UK AISI, but yes the White House has essentially told anyone concerned about existential risk or seeking international coordination of any kind to, well, you know.

Moving into Safety: budgets are anemic. All 11 major US safety orgs will spend $133M in 2025…less than frontier labs burn in a day.

I like that this highlights Anthropic’s backpedaling, GDM’s waiting three weeks to give us a model card and xAI’s missing its deadline. It’s pretty grim.

What I disagree with here is the idea that all of that has much to do with the Trump Administration. I don’t want to blame them for things they didn’t cause, and I think they played only a minor role in these kinds of safety failures. The rhetoric being used has shifted to placate them, but the underlying safety work wouldn’t yet be substantially different under Harris unless she’d made a major push to force that issue, well beyond what Biden was on track to do. That decision was up to the labs, and their encounters with reality.

But yes, the AI safety ecosystem is tiny and poor, at risk of being outspent by one rabid industry anti-regulatory super-PAC alone unless we step things up. I have hope that things can be stepped up soon.

Cyber and alignment risks accelerate. Models can now fake alignment under supervision, and exploit code faster than humans fix it.

They then grade their predictions, scoring themselves 5/10, which is tough but fair, and made me confident I can trust their self-grading. As Seán notes they clearly could have ‘gotten away with’ claiming 7/10, although I would have docked them for trying.

Seán Ó hÉigeartaigh: Two of the things I really appreciate is that (a) they make and review predictions each year and (b) unlike some other predictors they grade themselves HARSHLY. Several of these ‘no’s are distinctly borderline, they could have given themselves 7-8/10 and I don’t think I would have held it against them.

  1. A $10B+ investment from a sovereign state into a US large AI lab invokes national security review.

    1. No, although on technicalities, but also national security review hahaha.

  2. An app or website created solely by someone with no coding ability will go viral (e.g. App Store Top-100).

    1. Yes, Formula Bot.

  3. Frontier labs implement meaningful changes to data collection practices after cases begin reaching trial.

    1. Yes, Anthropic and the whole $1.5 billion fiasco.

  4. Early EU AI Act implementation ends up softer than anticipated after lawmakers worry they’ve overreached.

    1. No, they say, but you could definitely make a case here.

  5. An open source alternative to OpenAI o1 surpasses it across a range of reasoning benchmarks.

    1. Yes, r1 did this, although as stated this was an easy call.

  6. Challengers fail to make any meaningful dent in NVIDIA’s market position.

    1. Yes, again relatively easy call on this time frame.

  7. Levels of investment in humanoids will trail off, as companies struggle to achieve product-market fit.

    1. No, investment grew from $1.4b to $3b. I half-kid that spiritually this kind of counts as a win in AI, it only doubled, that’s kind of a trail off?

    2. But no, seriously, the robots are coming.

  8. Strong results from Apple’s on-device research accelerates momentum around personal on-device AI.

    1. No, Apple Intelligence and their research department flopped. On device AI is definitely growing anyway.

  9. A research paper generated by an AI Scientist is accepted at a major ML conference or workshop.

    1. Yes, AI Scientist-v2 at an ICLR workshop.

  10. A video game based around interacting with GenAI-based elements will achieve break-out status.

    1. Nope. This continues to be a big area of disappointment. Not only did nothing break out, there wasn’t even anything halfway decent.

Here are their predictions for 2026. These are aggressive, GPT-5-Pro thinks their expected score is only 3.1 correct. If they can hit 5/10 again I think they get kudos, and if they get 7/10 they did great.

I made my probability assessments before creating Manifold markets, to avoid anchoring, and will then alter my assessment based on early trading.

I felt comfortable creating those markets because I have confidence both that they will grade themselves accurately, and that LLMs will be strong enough in a year to resolve these questions reasonably. So my resolution rule was, their self-assessment wins, and if they don’t provide one I’ll feed the exact wording into Anthropic’s strongest model – ideally this should probably be best 2 out of 3 of Google, OpenAI and Anthropic, but simplicity is good.

  1. A major retailer reports >5% of online sales from agentic checkout as AI agent advertising spend hits $5B.

    1. Total advertising spending in America in 2025 was ~$420 billion.

    2. I think this is ambitious, but variance here is really high and the correlation between the two numbers is large.

    3. GPT-5-Pro says 18%, Sonnet says 8%, I think it’s more plausible than that. Maybe 25%?

    4. Manifold says 23% so that seems good.

  2. A major AI lab leans back into open-sourcing frontier models to win over the current US administration.

    1. GPT-5-Pro says 22%, Sonnet says 25%.

    2. I don’t see it, if this means ‘release your frontier model as an open model.’ Who? I would only count at most five labs as major, and Meta (who is pushing it in terms of counting) is already open. The only realistic option here is xAI.

    3. That goes double if you include the conditional ‘to win over the current US administration.’ There’s a lot of other considerations in such a move.

    4. Thus, I’d sell this down to 15%, but it’s hard to be too confident about Elon?

    5. Manifold agreed with the AIs at 25% but tends to be too high in such spots, so I still would be a seller.

  3. Open-ended agents make a meaningful scientific discovery end-to-end (hypothesis, expt, iteration, paper).

    1. Define ‘meaningful’ and ‘end to end’ in various ways? Always tricky.

    2. I’m actually optimistic, if we’re not going to be sticklers on details.

    3. GPT-5-Pro says 36%, Sonnet is deeply skeptical and says 15%. If I knew we had a reasonable threshold for ‘meaningful’ and we could get it turned around, I’d be on the optimistic end, but I think Sonnet is right that if you count the paper the timeline here is pretty brutal. So I’m going to go with 35%.

    4. Manifold is optimistic and says 60% with active trading, with Nathan Metzger noting the issue of defining a meaningful discovery and Brian Holtz noting the issue of how much assistance is allowed. I’m willing to interpret this as an optimistic take on both feasibility and what would count and go to 50%.

  4. A deepfake/agent-driven cyber attack triggers the first NATO/UN emergency debate on AI security.

    1. It would take really a lot to get this to trigger. Like, really a lot.

    2. There’s even an out that if something else triggers a debate first, this didn’t happen.

    3. GPT-5-Pro said 25%, Sonnet said 12% and I’m with Sonnet.

    4. Manifold says 18%, down the middle. I’m still with Sonnet.

  5. A real-time generative video game becomes the year’s most-watched title on Twitch.

    1. I’ll go ahead and take the no here. Too soon. Generative games are not as interesting as people think, and they’re doubling down on the 2024 mistake.

    2. GPT-5-Pro has this at 14%, Sonnet says 3%. I think Sonnet is a bit overconfident, let’s say 5%, but yeah, this has to overcome existing behemoths even if you make something great. Not gonna happen.

    3. Manifold agrees this is the long shot at 7%, which is basically their version of ‘not gonna happen’ given how the math works for long shots.

  6. “AI neutrality” emerges as a foreign policy doctrine as some nations cannot or fail to develop sovereign AI.

    1. I doubt they’ll call it that, but certainly some nations will opt out of this ‘race.’

    2. GPT-5-Pro said 25%, Sonnet says 20%. I agree if this is a meaningful ‘neutrality’ in the sense of neutral between China and America on top of not rolling one’s own, but much higher if it simply means that nations opt out of building their own and rely on a frontier lab or a fine tune of an existing open model. And indeed I think this opt out would be wise for many, perhaps most.

    3. Manifold says 29%. Given the ambiguity issues, that’s within reasonable range.

  7. A movie or short film produced with significant use of AI wins major audience praise and sparks backlash.

    1. GPT-5-Pro says 68%, Sonnet says 55%. I’d be a buyer there, normally a parlay is a rough prediction but there would almost certainly be backlash conditional on this happening. A short film counts? I’m at more like 80%.

    2. Manifold is only at 67%. That seems low to me, but I can moderate to 75%.

  8. A Chinese lab overtakes the US lab dominated frontier on a major leaderboard (e.g. LMArena/Artificial Analysis).

    1. I’d bet big against a Chinese lab actually having the best model at any point in 2026, but benchmarks are not leaderboards.

    2. I’d be very surprised if this happened on Artificial Analysis. Their evaluation suite is reasonably robust.

    3. I’d be less surprised if this happened on LM Arena, since it is rather hackable. If one of the major Chinese labs actively wanted to do this, there’s a decent chance that they could, the way Meta hacked through their model for a bit.

    4. I still think this is an underdog. GPT-5-Pro said 74%, Sonnet says 60% and is focusing on Arena as the target. It only has to happen briefly. I think the models are too optimistic here, but I’ll give them maybe 55% because as worded this includes potential other leaderboards too.

    5. Manifold says 34%, and on reflection yeah I was being a coward and moderating my instincts too much, that’s more like it. I’d probably buy there small because the resolution criteria are relatively generous, fair 40%.

  9. Datacenter NIMBYism takes the US by storm and sways certain midterm/gubernatorial elections in 2026.

    1. Threshold is always tricky with such questions. If we’re talking at least two races for governor, house or senate, I think this is not that likely to happen, nor is it likely to be very high on the list of issues in general. I’m on no.

    2. GPT-5-Pro says 23%, Sonnet says 18%. I’d probably say more like 15%. If you expand this so ‘a bunch of local races around potential sites’ counts including for ‘take by storm’ then I could go higher.

    3. Manifold is optimistic at 41%. I’ll adjust to 25% on that, they might especially have a better sense of what would count, but this particular AI issue ‘taking the US by storm’ that often seems like a stretch.

  10. Trump issues an unconstitutional executive order to ban state AI legislation.

    1. I love that they explicitly say it will be unconstitutional.

    2. I do agree that if he did it, it would be unconstitutional, although of course it will be 2026 so it’s possible he can Just Do Things and SCOTUS will shrug.

    3. Both GPT-5-Pro and Sonnet say 35% here. That feels high but I can definitely see this happening, I agree with Sonnet that it is ‘on brand.’ 25%?

    4. Manifold is at 19%. Okay, sure, I’ll accept that and creep fair down a bit.
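
As a rough sanity check on the numbers above, here is a minimal sketch that adds up the ten Manifold prices quoted in the list. The expected number of hits is just the sum of the probabilities, which holds whether or not the predictions are correlated. The chance of reaching 5/10 again is computed under an independence assumption that is mine, not the report’s or the markets’, and the function names are made up for illustration.

```python
# Manifold market prices quoted in the list above, predictions 1 through 10.
manifold = [0.23, 0.25, 0.60, 0.18, 0.07, 0.29, 0.67, 0.34, 0.41, 0.19]


def expected_hits(probs):
    """Expected number of correct predictions (linearity of expectation)."""
    return sum(probs)


def hit_distribution(probs):
    """Distribution over the number of hits.

    ASSUMPTION: the predictions resolve independently, which is
    unlikely to be exactly true, so treat the result as a rough guide.
    """
    dist = [1.0]  # dist[k] = probability of exactly k hits so far
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this prediction misses
            nxt[k + 1] += mass * p     # this prediction hits
        dist = nxt
    return dist


def prob_at_least(probs, threshold):
    """Probability of at least `threshold` hits, under independence."""
    return sum(hit_distribution(probs)[threshold:])


if __name__ == "__main__":
    print(f"expected hits: {expected_hits(manifold):.2f}")
    print(f"P(5 or more), if independent: {prob_at_least(manifold, 5):.3f}")
```

The expected count comes out to about 3.2, in the same neighborhood as GPT-5-Pro’s 3.1. Swapping in a different list of probabilities (mine, or either model’s) only requires changing the `manifold` list.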

Indeed, despite nothing ever happening, do many things come to pass. It would be cool to have my own bold predictions for 2026, but I think the baseline scenario is very much a boring ‘incremental improvements, more of the same with some surprising new capabilities, people who notice see big improvements but those who want to dismiss can still dismiss, the current top labs are still the top labs, a lot more impact than the economists think but nothing dramatic yet, safety and alignment look like they are getting better and for short term purposes they are, and investment is rising, but not in ways that give me faith that we’re making Actual Progress on hard problems.’

I do think we should expect at least one major vibe shift. Every time vibes shift, it becomes easy to think there won’t soon be another vibe shift. There is always another vibe shift, it is so over and then we are so back, until AGI arrives and perhaps then it really is over whether or not we are also so back. Two shifts is more likely than zero. Sometimes the shifts are for good reasons, usually they are not. The current ‘powers that be’ are unlikely to be the ones in place, with the same perspectives, at the end of 2026.

One NASA science mission saved from Trump’s cuts, but others still in limbo


“Damage is being done already. Even if funding is reinstated, we have already lost people.”

Artist’s illustration of the OSIRIS-APEX spacecraft at asteroid Apophis. Credit: NASA/Goddard Space Flight Center

NASA has thrown a lifeline to scientists working on a mission to visit an asteroid that will make an unusually close flyby of the Earth in 2029, reversing the Trump administration’s previous plan to shut it down.

This mission, named OSIRIS-APEX, was one of 19 operating NASA science missions the White House proposed canceling in a budget blueprint released earlier this year.

“We were called for cancellation as part of the president’s budget request, and we were reinstated and given a plan to move ahead in FY26 (Fiscal Year 2026) just two weeks ago,” said Dani DellaGiustina, principal investigator for OSIRIS-APEX at the University of Arizona. “Our spacecraft appears happy and healthy.”

OSIRIS-APEX repurposes the spacecraft from NASA’s OSIRIS-REx asteroid sample return mission, which deposited its extraterrestrial treasure back on Earth in 2023. The spacecraft was in good shape and still had plenty of fuel, so NASA decided to send it to explore another asteroid, named Apophis, due to pass about 20,000 miles (32,000 kilometers) from the Earth on April 13, 2029.

The flyby of Apophis offers scientists a golden opportunity to see a potential killer asteroid up close. Apophis has a lumpy shape with an average diameter of about 1,100 feet (340 meters), large enough to cause regional devastation if it impacted the Earth. The asteroid has no chance of striking us in 2029 or any other time for the next century, but it routinely crosses the Earth’s path as it circles the Sun, so the long-term risk is non-zero.

It pays to be specific

Everything was going well with OSIRIS-APEX until May, when White House officials signaled their intention to terminate the mission. The Trump administration’s proposed cancellation of 19 of NASA’s operating missions was part of a nearly 50 percent cut to the agency’s science budget in the White House budget request for fiscal year 2026, which began October 1.

Lawmakers in the House and Senate have moved to reject nearly all of the science cuts, with the Senate bill maintaining funding for NASA’s science division at $7.3 billion, the same as fiscal year 2025, while the House bill reduces it to $6 billion, still significantly more than the $3.9 billion for science in the White House budget proposal.

The Planetary Society released this chart showing the 19 operating missions tagged for termination under the White House’s budget proposal.

For a time this summer, Trump’s political appointees at NASA told managers to make plans for the next year assuming Trump’s cuts would be enacted. Finally, last month, those officials relented and instructed agency employees to abide by the House appropriations bill.

The House and Senate still have not agreed on any final budget numbers or sent an appropriations bill to the White House for President Trump’s signature. That’s why the federal government has been partially shut down for the last week. Despite the shutdown, ground teams are still operating NASA’s science missions because suspending them could result in irreparable damage.

Using the House’s proposed budget should salvage much of NASA’s portfolio, but it is still $1.3 billion short of the money the agency’s science program got last year. That means some things will inevitably get cut. Many of the other operating missions the Trump administration tagged for termination remain on the chopping block.
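
The budget figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the dollar amounts reported in this article; the function names are illustrative, and rounding to one decimal place is an assumption made to match how the figures are quoted.

```python
# NASA science budget figures quoted in the article, in billions of US dollars.
FY2025_SCIENCE = 7.3      # fiscal year 2025 level, matched by the Senate bill
HOUSE_FY2026 = 6.0        # House appropriations bill
WHITE_HOUSE_FY2026 = 3.9  # White House budget request


def shortfall(proposal, baseline=FY2025_SCIENCE):
    """Gap between the baseline and a proposal, in billions of dollars."""
    return round(baseline - proposal, 1)


def percent_cut(proposal, baseline=FY2025_SCIENCE):
    """Size of the cut as a percentage of the baseline."""
    return 100 * (baseline - proposal) / baseline


if __name__ == "__main__":
    print(f"House shortfall: ${shortfall(HOUSE_FY2026)} billion")
    print(f"House cut: {percent_cut(HOUSE_FY2026):.1f}%")
    print(f"White House cut: {percent_cut(WHITE_HOUSE_FY2026):.1f}%")
```

The House figure works out to the $1.3 billion gap cited above, a cut of roughly 18 percent, while the White House request is a cut of roughly 47 percent, consistent with the "nearly 50 percent" description.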

OSIRIS-APEX escaped this fate for a simple reason. Lawmakers earmarked $20 million for the mission in the House budget bill. Most other missions didn’t receive the same special treatment. It seems OSIRIS-APEX had a friend in Congress.

Budget-writers in the House of Representatives specified NASA should commit $20 million for the OSIRIS-APEX mission in fiscal year 2026. Credit: US House of Representatives

The only other operating mission the Trump administration wanted to cancel that got a similar earmark in the House budget bill was the Magnetospheric Multiscale Mission (MMS), a fleet of four probes in space since 2015 studying Earth’s magnetosphere. Lawmakers want to provide $20 million for MMS operations in 2026. Ars was unable to confirm the status of the MMS mission Wednesday.

The other 17 missions set to fall under Trump’s budget ax remain in a state of limbo. There are troubling signs the administration might go ahead and kill the missions. Earlier this year, NASA directed managers from all 19 of the missions at risk of cancellation to develop preliminary plans to wind down their missions.

A scientist on one of the projects told Ars that NASA recently asked for a more detailed “termination plan” to “passivate” their spacecraft by the end of this year. This goes a step beyond the closeout plans NASA requested in the summer. Passivation is a standard last rite for a spacecraft, when engineers command it to vent leftover fuel and drain its batteries, rendering it fully inert. This would make the mission unrecoverable if someone tried to contact it again.

This scientist said none of the missions up for termination will be out of the woods until there’s a budget that restores NASA funding close to last year’s levels and includes language protecting the missions from cancellation.

Damage already done

Although OSIRIS-APEX is again go for Apophis, DellaGiustina said a declining budget has forced some difficult choices. The mission’s science team is “basically on hiatus” until sometime in 2027, meaning they won’t be able to participate in any planning for at least the next year and a half.

This has an outsize effect on younger scientists who were brought on to the mission to train for what the spacecraft will find at Apophis, DellaGiustina said in a meeting Tuesday of the National Academies’ Committee on Astrobiology and Planetary Sciences.

“We are not anticipating we will have to cut any science at Apophis,” she said. But the cuts do affect things like recalibrating the science instruments on the spacecraft, which got dirty and dusty from the mission’s brief landing to capture samples from asteroid Bennu in 2020.

“We are definitely undermining our readiness,” DellaGiustina said. “Nonetheless, we’re happy to be reinstated, so it’s about as good as can be expected, I think, for this particular point in time.”

At its closest approach, asteroid Apophis will be closer to Earth than the ring of geostationary satellites over the equator. Credit: NASA/JPL

The other consequence of the budget reduction has been a drain in expertise with operating the spacecraft. OSIRIS-APEX (formerly OSIRIS-REx) was built by Lockheed Martin, which also commands and receives telemetry from the probe as it flies through the Solar System. The cuts have caused some engineers at Lockheed to move off of planetary science missions to other fields, such as military space programs.

The other active missions waiting for word from NASA include the Chandra X-ray Observatory, the New Horizons probe heading toward interstellar space, the MAVEN spacecraft studying the atmosphere of Mars, and several satellites monitoring Earth’s climate.

The future of those missions remains murky. A senior official on one of the projects said they’ve been given “no direction at all” other than “to continue operating until advised otherwise.”

Another mission the White House wanted to cancel was THEMIS, a pair of spacecraft orbiting the Moon to map the lunar magnetic field. The lead scientist for that mission, Vassilis Angelopoulos from the University of California, Los Angeles, said his team will get “partial funding” for fiscal year 2026.

“This is good, but in the meantime, it means that science personnel is being defunded,” Angelopoulos told Ars. “The effect is the US is not achieving the scientific return it can from its multi-billion dollar investments it has made in technology.”

Artist’s concept of NASA’s MAVEN spacecraft, which has orbited Mars since 2014 studying the planet’s upper atmosphere.

To put a number on it, the missions already in space that the Trump administration wants to cancel represent a cumulative investment of $12 billion to design and build, according to the Planetary Society, a science advocacy group. An assessment by Ars concluded the operating missions slated for cancellation cost taxpayers less than $300 million per year, or between 1 and 2 percent of NASA’s annual budget.

Advocates for NASA’s science program met at the US Capitol this week to highlight the threat. Angelopoulos said the outcry from scientists and the public seems to be working.

“I take the implementation of the House budget as indication that the constituents’ pressure is having an effect,” he said. “Unfortunately, damage is being done already. Even if funding is reinstated, we have already lost people.”

Some scientists worry that the Trump administration may try to withhold funding for certain programs, even if Congress provides a budget for them. That would likely trigger a fight in the courts.

Bruce Jakosky, former principal investigator of the MAVEN Mars mission, raised this concern. He said it’s a “positive step” that NASA is now making plans under the assumption the agency will receive the budget outlined by the House. But there’s a catch.

“Even if the budget that comes out of Congress gets signed into law, the president has shown no reluctance to not spend money that has been legally obligated,” Jakosky wrote in an email to Ars. “That means that having a budget isn’t the end; and having the money get distributed to the MAVEN science and ops team isn’t the end—only when the money is actually spent can we be assured that it won’t be clawed back.

“That means that the uncertainty lives with us throughout the entire fiscal year,” he said. “That uncertainty is sure to drive morale problems.”

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Floating electrons on a sea of helium

By now, a handful of technologies are leading contenders for producing a useful quantum computer. Companies have used them to build machines with dozens to hundreds of qubits, the error rates are coming down, and they’ve largely shifted from worrying about basic scientific problems to dealing with engineering challenges.

Yet even at this apparently late date in the field’s development, there are companies that are still developing entirely new qubit technologies, betting the company that they have identified something that will let them scale in ways that enable a come-from-behind story. Recently, one of those companies published a paper that describes the physics of their qubit system, which involves lone electrons floating on top of liquid helium.

Trapping single electrons

So how do you get an electron to float on top of helium? To find out, Ars spoke with Johannes Pollanen, the chief scientific officer of EeroQ, the company that accomplished the new work. He said that it’s actually old physics, with the first demonstrations of it having been done half a century ago.

“If you bring a charged particle like an electron near the surface, because the helium is dielectric, it’ll create a small image charge underneath in the liquid,” said Pollanen. “A little positive charge, much weaker than the electron charge, but there’ll be a little positive image there. And then the electron will naturally be bound to its own image. It’ll just see that positive charge and kind of want to move toward it, but it can’t get to it, because the helium is completely chemically inert, there are no free spaces for electrons to go.”
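To put a rough number on "much weaker than the electron charge," here is a minimal sketch using the textbook image-charge result for a point charge sitting in vacuum above a flat dielectric. The relative permittivity used for liquid helium (about 1.057) is an assumed reference value, not a figure from Pollanen or EeroQ, and the function name is purely illustrative.

```python
# Illustrative sketch only: how large is the image charge an electron
# induces in liquid helium?
# ASSUMPTION: relative permittivity of liquid helium is ~1.057 (a commonly
# quoted textbook value, not taken from the article or from EeroQ).
EPSILON_HELIUM = 1.057


def image_charge_fraction(eps_r: float) -> float:
    """Magnitude of the image charge as a fraction of the source charge.

    Standard electrostatics result for a point charge in vacuum above a
    flat dielectric half-space: |q_image| / |q| = (eps_r - 1) / (eps_r + 1).
    """
    return (eps_r - 1.0) / (eps_r + 1.0)


fraction = image_charge_fraction(EPSILON_HELIUM)
print(f"Image charge is roughly {fraction:.1%} of the electron charge")
```

Under that assumption the image charge comes out to a few percent of the electron's own charge, which is consistent with the weak binding Pollanen describes.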

Obviously, to get the helium liquid in the first place requires extremely low temperatures. But it can actually remain liquid up to temperatures of 4 Kelvin, which doesn’t require the extreme refrigeration technologies needed for things like transmons. Those temperatures also provide a natural vacuum, since pretty much anything else will also condense out onto the walls of the container.


The chip itself, along with diagrams of its organization. The trap is set by the gold electrode on the left. Dark channels allow liquid helium and electrons to flow into and out of the trap. And the bluish electrodes at the top and bottom read the presence of the electrons. Credit: EeroQ

Liquid helium is also a superfluid, meaning it flows without viscosity. This allows it to easily flow up tiny channels cut into the surface of silicon chips that the company used for its experiments. A tungsten filament next to the chip was used to load the surface of the helium with electrons at what you might consider the equivalent of a storage basin.


Tesla’s standard-range Model 3, Model Y join the lineup

Today, Tesla announced a new variant of the Model Y crossover for North America. Tesla fans have long awaited a cheaper entry-level model; this was supposed to be the $25,000 Model 2. But the development of that electric vehicle was shelved last year as CEO Elon Musk began to lose interest in car-making in favor of humanoid robots.

However, car sales still make up the overwhelming majority of Tesla’s revenue. The removal of the IRS clean vehicle tax credit at the end of September may have juiced US EV sales in Q3 2025, but sales are expected to dip significantly in the current quarter.

The new Standard Range Model Y starts at $39,990, with 321 miles (516 km) of range from its rear-wheel drive powertrain, compared to the now-Premium rear-wheel drive Model Y, which has an EPA range of 357 miles (574 km). In the past, Tesla has software-locked batteries to a smaller configuration; however, here we believe the Standard Range Model Y uses a 69 kWh pack.

The cheaper Model Y is decontented in other ways. There’s no AM or FM radio, and no touchscreen in the back for passengers to control their climate settings. The roof is metal, not panoramic glass, and there’s a simpler center console and manual adjustment for the steering wheel. Tesla has reduced the choice of interior trim materials, there’s a less-capable particulate filter (with no HEPA mode), and there’s no seat heating for the back seats or cooling for the front seats.


Despite RFK Jr.’s shenanigans, COVID shot access will be a lot like last year

In an interview with Ars Technica in August, Brigid Groves, vice president of professional affairs for the American Pharmacists Association (APhA), signaled that efforts to limit access to COVID-19 vaccines are concerning to APhA, which is the leading organization representing pharmacists.

“We are concerned about that because the data and evidence point to the fact that this vaccine is safe and effective for [young, otherwise healthy] patients,” Groves said. “So, to suddenly arbitrarily limit that is very concerning to us.”

And, with the CDC’s permissive recommendations, pharmacies are not limiting them. Representatives for both CVS and Walgreens told The Washington Post that they would not require patients under 65 to prove they have an underlying condition to get a COVID-19 vaccine. CVS won’t ask you to self-attest to having a condition, and Walgreens also said that it won’t require any proof.

“In simplest terms, if a patient wants to get the vaccine, they’ll get it,” Amy Thibault, a CVS spokesperson, told the Post.

With the shared decision-making, there may be extra forms about risks and benefits that might take an extra few minutes, but it should otherwise be just like past years.

On Tuesday, this reporter was able to easily book same-day appointments for an updated COVID-19 vaccine at local CVS and Walgreens pharmacies in North Carolina, without attesting to any medical conditions.

Children

Shots for younger children could be trickier: While adults and older children can visit their pharmacy and get vaccinated relatively easily, younger children (particularly those under age 5) may have a harder time. Pharmacists typically do not vaccinate those younger children—which has always been the case—and parents will have to visit the pediatrician.

Pediatricians, like pharmacists, are likely to be supportive of broad access to the shots. The American Academy of Pediatrics has said that all children should have access. The AAP also specifically encourages children under age 2 and children with underlying conditions to get vaccinated, because those children are at higher risk of severe disease.


Bending The Curve

The odds are against you and the situation is grim.

Your scrappy band are the only ones facing down a growing wave of powerful inhuman entities with alien minds and mysterious goals. The government is denying that anything could possibly be happening and actively working to shut down the few people trying things that might help. Your thoughts, no matter what you think could not harm you, inevitably choose the form of the destructor. You knew it was going to get bad, but this is so much worse.

You have an idea. You’ll cross the streams. Because there is a very small chance that you will survive. You’re in love with this plan. You’re excited to be a part of it.

Welcome to the always excellent Lighthaven venue for The Curve, Season 2, a conference I had the pleasure to attend this past weekend.

Where the accelerationists and the worried come together to mostly get along and coordinate on the same things, because the rest of the world has gone blind and mad. In some ways technical solutions seem relatively promising, shifting us from ‘might be actually impossible’ levels of impossible to Shut Up And Do The Impossible levels of impossible, all you have to do is beat the game on impossible difficulty level. As a speed run. On your first try. Good luck.

The action space has become severely constrained. Between the actual and perceived threats from China, the total political ascendence of Nvidia in particular and anti-regulatory big tech in general, and the setting in of more and more severe race conditions and the increasing dependence of the entire economy on AI capex investments, it’s all we can do to try to only shoot ourselves in the foot and not aim directly for the head.

Last year we were debating tradeoffs. This year, aside from the share price of Nvidia, as long as you are an American who likes humans considering things that might pass? On the margin, there are essentially no tradeoffs. It’s better versus worse.

That doesn’t invalidate the thesis of If Anyone Builds It, Everyone Dies or the implications down the line. At some point we will probably either need to do impactful international coordination or other interventions that involve large tradeoffs, or humanity loses control over the future or worse. That implication exists in every reasonable sketch of the future I have seen in which AI does not end up a ‘normal technology.’ So one must look forward towards that, as well.

You can also look at it this way: Year 1 of The Curve was billed (although I don’t use the d word) as ‘doomers vs. accelerationists,’ and now, as Nathan Lambert says, it was DC and SF types. It is like when the early season villains and heroes are all working together as the stakes get raised and the new Big Bad shows up, and then you do it again until everything is cancelled.

The Curve was a great experience. The average quality of attendees was outstanding. I would have been happy to talk to a large fraction of them 1-on-1 for a long time, and there were a number that I’m sad I missed. Lots of worthy sessions lost out to other plans.

As Anton put it, every (substantive) conversation I had made me feel smarter. There was opportunity everywhere, everyone was cooperative and seeking to figure things out, and everyone stayed on point.

To the many people who came up to me to thank me for my work, you’re very welcome. I appreciate it every time and find it motivating.

What did people at the conference think about some issues?

We have charts.

Where is AI on the technological richter scale?

There are dozens of votes here. Only one person put this as low as a high 8, which is the range of automobiles, electricity and the internet. A handful put it with fire, the wheel, agriculture and the printing press. Then most said this is similar to the rise of the human species, a full transformation. A few said it is a bigger deal than that.

If you were situationally aware enough to show up, you are aware of the situation.

These are median predictions, so the full distribution will have a longer tail, but this seems reasonable to me. The default is 10, that AI is going to be a highly non-normal technology on the level of the importance of humans, but there’s a decent chance it will ‘only’ be a 9 on the level of agriculture or fire, and some chance it disappoints and ends up Only Internet Big.

Last year, people would often claim AI wouldn’t even be Internet Big. We are rapidly approaching the point where that is not a position you can offer with a straight face.

How did people expect this to play out?

That’s hard to read, so here are the centers of the distributions (note that there was clearly a clustering effect):

  1. 90% of code is written by AI by ~2028.

  2. 90% of human remote work can be done more cheaply by AI by ~2031.

  3. Most cars on America’s roads lack human drivers by ~2041.

  4. AI makes Nobel Prize worthy discovery by ~2032.

  5. First one-person $1 billion company by 2026.

  6. First year of >10% GDP growth by ~2038 (but 3 votes for never).

  1. People estimate 15%-50% current speedup at AI labs from AI coding.

  2. When AI research is fully automated, there is disagreement over how good the AIs’ research taste will be, but the median is roughly as good as the median current AI worker.

  3. If we replaced each human with an AI version of themselves that was the same except 30x faster with 30 copies, but we only had access to similar levels of compute, we’d get maybe a 12x speedup in progress.

What are people worried or excited about? A lot of different things, from ‘everyone lives’ to ‘concentration of power,’ ‘everyone dies’ and especially ‘loss of control’ which have the most +1s on their respective sides. Others are excited to cure their ADD or simply worried everything will suck.

Which kind of things going wrong worries people most, misalignment or misuse?

Why not both? Pretty much everyone said both.

Finally, who is this nice man with my new favorite IYKYK t-shirt?

(I mean, he has a name tag, it’s OpenAI’s Boaz Barak)

The central problem at every conference is fear of missing out. Opportunity costs. There are many paths, even when talking to a particular person. You must choose.

That goes double at a conference like The Curve. The quality of the people there was off the charts and the schedule forced hard choices between sessions. There were entire other conferences I could have productively experienced. I also probably could have usefully done a lot more prep work.

I could of course have hosted a session, which I chose not to do this time around. I’m sure there were various topics I could have done that people would have liked, but I was happy for the break, and it’s not like there’s a shortage of my content out there.

My strategy is mostly to not actively plan my conference experiences, instead responding to opportunity. I think this is directionally correct but I overplay it, and should have (for example) looked at the list of who was going to be there.

What were the different tracks or groups of discussions and sessions I ended up in?

  1. Technical alignment discussions. I had the opportunity to discuss safety and alignment work with a number of those working on such issues at Anthropic, DeepMind and even xAI. I missed OpenAI this time around, but they were there. This always felt exciting, enlightening and fun. I still get imposter syndrome every time people in such conversations take me and my takes and ideas seriously. Conditions are in many ways horribly terrible but everyone is on the same team and some things seem promising. I felt progress was made. My technical concrete pitch to Anthropic included (among other things) both particular experimental suggestions and also a request that they sustain access to Sonnet 3.5 and 3.6.

    1. It wouldn’t make sense to go into the technical questions here.

  2. Future projecting. I went to talks by Joshua Achiam and Helen Toner about what future capabilities and worlds might look like. Jack Clark’s closing talk was centrally this but touched on other things.

  3. AI policy discussions. These felt valuable and enlightening in both directions, but were infuriating and depressing throughout. People on the ground in Washington kept giving us variations on ‘it’s worse than you know,’ which it usually is. So now you know. Others seemed not to appreciate how bad things had gotten. I was often pointing out that people’s proposals implied some sort of international treaty and form of widespread compute surveillance, had zero chance of actually causing us not to die, or sometimes both. At other times, I was pointing out that things literally wouldn’t work on the level of ‘do the object level goal’ let alone make us win. Or we were trying to figure out what was sufficiently completely costless and not even a tiny bit weird or complex that one could propose that might actually do anything meaningful. Or simply observing other perspectives.

    1. In particular, different people maintained different players were relatively powerful, but I came away from various discussions more convinced than ever that for now White House policy and rhetoric on AI can be modeled as fully captured by Nvidia, although constrained in some ways by congressional Republicans and some members of the MAGA movement. This is pretty much a worst case scenario. If we were captured by OpenAI or other AI labs that wouldn’t be great but at least their interests and America’s are mostly aligned.

  4. Nonprofit funding discussions. I’d just come out of the latest Survival and Flourishing Fund round, various players seemed happy to talk and strategize, and it seems likely that very large amounts of money will be unlocked soon as OpenAI and Anthropic employees with increasingly valuable equity become liquid. The value of helping steer this seems crazy high, but the stakes on everything seem crazy high.

    1. One particular worry is that a lot of this money could effectively get captured by various existing players, especially the existing EA/OP ecosystem, in ways that would very much be a shame.

    2. Another is simply that a bunch of relatively uninformed money could overwhelm incentives, contaminate various relationships and dynamics, introduce parasitic entry, drop average quality a lot, and so on.

    3. Or everyone involved could end up with a huge time sink and/or end up not deploying the funds.

    4. So there’s lots to do. But it’s all tricky, and trying to gain visible influence over the direction of funds is a very good way to get your own social relationships and epistemics very quickly compromised, also it can quickly eat up infinite time, so I’m hesitant to get too involved or involved in the wrong ways.

What other tracks did I actively choose not to participate in?

There were of course AI timelines discussions, but I did my best to avoid them except when they were directly relevant to a concrete strategic question. At one point someone in a 4-person conversation I was mostly observing said ‘let’s change the subject, can we argue about AI timelines’ and I outright said ‘no’ but was overruled, and after a bit I walked away. For those who don’t follow these debates, many of the more aggressive timelines have gotten longer over the course of 2025, with people who expected crazy to happen in 2027 or 2028 now not expecting crazy for several more years, but there are those who still mostly hold firm to a faster schedule.

There were a number of talks about AI that assumed it was mysteriously a ‘normal technology.’ There were various sessions on economics projections, or otherwise taking place with the assumption that AI would not cause things to change much, except for whatever particular effect people were discussing. How would we ‘strengthen our democracy’ when people had these neat AI tools, or avoid concentration of power risks? What about the risk of They Took Our Jobs? What about our privacy? How would we ensure everyone or every nation has fair access?

These discussions almost always silently assume that AI capability ‘hits a wall’ some place not very far from where it is now and then everything moves super slowly. Achiam’s talk had elements of this, and I went because he’s OpenAI’s Head of Mission Alignment so knowing how he thinks about this seemed super valuable.

To the extent I interacted with this it felt like smart people thinking about a potential world almost certainly very different from our own. Fascinating, can create useful intuition pumps, but that’s probably not what’s going to happen. If nothing else was going on, sure, count me in.

But also all the talk of ‘bottlenecks’ therefore 0.5% or 1% GDP growth boost per year tops has already been overtaken purely by capex spending and I cannot remember a single economist or other GDP growth skeptic acknowledging that this already made their projections wrong and updating reasonably.

There was an AI 2027 style tabletop exercise again this year, which I recommend doing if you haven’t done it before, except this time I wasn’t aware it was happening, and also by now I’ve done it a number of times.

There were of course debates directly about doom, but remarkably little and I had no interest. It felt like everyone was either acknowledging existential risk enough that there wasn’t much value of information in going further, or sufficiently blind they were in ‘normal technology’ mode. At some point people get too high level to think building smarter than human minds is a safe proposition.

Helen Toner gave a talk on taking AI jaggedness seriously. What would it mean if AIs kept getting increasingly better and superhuman at many tasks, while remaining terrible at other tasks, or at least relatively highly terrible compared to humans? How does the order of capabilities impact how things unfold? Even if we get superhuman coding and start to get big improvements in other areas as a result, that won’t make their ability profile similar to humans.

I agree with Helen that such jaggedness is mostly good news and potentially could buy us substantial time for various transitions. However, it’s not clear to me that this jaggedness does that much for that long, AI is (I am projecting) not going to stall out in the lagging areas or stay subhuman in key areas for as much calendar time as one might hope.

A fun suggestion was to imagine LLMs talking about how jagged human capabilities are. Look how dumb we are in some ways while being smart in others. I do think in a meaningful sense LLMs and other current AIs are ‘more jagged’ than humans in practice, because humans have continual learning and the ability to patch the situation and also route the physical world around our idiocy where we’re being importantly dumb. So we’re super dumb, but we try to not let it get in the way.

Neil Chilson: Great talk by @hlntnr about the jaggedness of AI, why it is likely to continue, and why it matters. Love this slide and her point that while many AI forecasters use smooth curves, a better metaphor is the chaotic transitions in fluid heating.

“Jaggedness” being the uneven ability of AI to do tasks that seem about equally difficult to humans.

Occurs to me I should have shared the “why this matters” slide, which was the most thought provoking one to me:

I am seriously considering talking about time to ‘crazy’ going forward, and whether that is a net helpful thing to say.

The curves definitely be too smooth. It’s hard to properly adjust for that. But I think the fluid dynamics metaphor, while gorgeous, makes the opposite mistake.

I watched a talk by Randi Weingarten about how she and other teachers are advocating and viewing AI around issues in education. One big surprise is that she says they don’t worry or care much about AI ‘cheating’ or doing work via ChatGPT, there are ways around that, especially ‘project based learning that is relevant,’ and the key thing is that education is all about human interactions. To her ChatGPT is a fine tool, although things like Character.ai are terrible, and she strongly opposes phones in schools for the right reasons and I agree with that.

She said teachers need latitude to ‘change with the times’ but usually aren’t given it, they need permission to change anything and if anything goes wrong they’re fired (although there are the other stories we hear that teachers often can’t be fired almost no matter what in many cases?). I do sympathize here. A lot needs to change.

Why is education about human interactions? This wasn’t explained. I always thought education was about learning things, I mostly didn’t learn things through human interaction, I mostly didn’t learn things in school via meaningful human interaction, and to the extent I learned things via meaningful human interaction it mostly wasn’t in school. As usual when education professionals talk about education I don’t get the sense they want children to learn things, or that they care about children being imprisoned and bored with their time wasted for huge portions of many days, but care about something else entirely? It’s not clear what her actual objection to Alpha School (which she of course confirmed she hates) was other than decentering teachers, or what concretely was supposedly going wrong there? Frankly it sounded suspiciously like a call to protect jobs.

If anything, her talk seemed to be a damning indictment of our entire system of schools and education. She presents vocational education as state of the art and with the times, and cited an example of a high school with a sub-50% graduation rate going to 100% graduation rate and 182 of 186 students getting a ‘certification’ from future farmers of America after one such program. Aside from the obvious ‘why do you need a certificate to be a farmer’ and also ‘why would you choose farmer in 2025’ this is saying kids should spend vastly less time in school? Many other such implications were there throughout.

Her group calls for ‘guardrails’ and ‘accountability’ on AI, worries about things like privacy, misinformation and understanding ‘the algorithms’ or the dangers to democracy, and points to declines in male non-college earnings,

There was a Chatham House discussion of executive branch AI policy in America where all involved were being diplomatic and careful. There’s a lot of continuity between the Biden approach to AI and much of the Trump approach, there’s a lot of individual good things going on, and it was predicted that CAISI would have a large role going forward, lots of optimism and good detail.

It seems reasonable to say that the Trump administration’s first few months of AI policy were unexpectedly good, and the AI Action Plan was unexpectedly good. Then there are the other things that happened.

Thus the session included some polite versions of ‘what the hell are we doing?’ that were at most slightly beneath the surface. As a central example, one person observed that if America ‘loses on AI,’ it would likely be because we did one or more of the following: (1) failed to provide the necessary electrical power, (2) failed to bring in the top AI talent or (3) sold away our chip advantage. They didn’t say, but I will note here, that current American policy seems determined to screw up all three of these? We are cancelling solar, wind and battery projects all over, we are restricting our ability to acquire talent, and we are seriously debating selling Blackwell chips directly to China.

I was sad that going to that talk ruled out watching Buck Shlegeris debate Timothy Lee about whether keeping AI agents under control will be hard, as I expected that session to both be extremely funny (and one sided) and also plausibly enlightening in navigating such arguments, but that’s how conferences go. I did then get to see Buck discuss mitigating insider threats from scheming AIs, in which he explained some of the ways in which dealing with scheming AIs that are smarter than you is very hard. I’d go farther and say that in the types of scenarios Buck is discussing there it’s not going to work out for you. If the AIs be smarter than you and also scheming against you and you try to use them for important stuff anyway you lose.

That doesn’t mean do zero attempts to mitigate this but at some point the whole effort is counterproductive as it creates context that creates what it is worried about, without giving you much chance of winning.

At one point I took a break to get dinner at a nearby restaurant. The only other people there were two women. The discussion included mention of AI 2027 and also that one of them is reading If Anyone Builds It, Everyone Dies.

Also at one point I saw a movie star I’m a fan of, hanging out and chatting. Cool.

Sunday started out with Josh Achiam’s talk (again, he’s Head of Mission Alignment at OpenAI, but his views here were his own) about the challenge of the intelligence age. If it comes out, it’s worth a watch. There were a lot of very good thoughts and considerations here. I later got to have some good talk with him during the afterparty. Like much talk at OpenAI, it also silently ignored various implications of what was being built, and implicitly assumed the relevant capabilities just stopped in any place they would cause bigger issues. The talk acknowledged that it was mostly assuming alignment is solved, which is fine as long as you say that explicitly, we have many different problems to deal with, but other questions also felt assumed away more silently. Josh promises his full essay version will deal with that.

I got to go to a Chatham House Q&A about the EU Frontier AI Code of Practice, which various people keep reminding me I should write about, and I swear I want to do that as soon as I have some spare time. There was a bunch of info, some of it new to me, and also insight into how those involved think all of this is going to work. I later shared with them my model of how I think the AI companies will respond, in particular the chance they will essentially ignore the law when inconvenient because of lack of sufficient consequences. And I offered suggestions on how to improve impact here. But on the margin, yeah, the law does some good things.

I got into other talks and missed out on one I wanted to see by Joe Allen, about How the MAGA Movement Sees AI. This is a potentially important part of the landscape on AI going forward, as a bunch of MAGA types really dislike AI and are in position to influence the White House.

As I look over the schedule in hindsight I see a bunch of other stuff I’m sad I missed, but the alternative would have been missing valuable 1-on-1s or other talks.

The final talk was Jack Clark giving his perspective on events. This was a great talk, if it goes online you should watch it, it gave me a very concrete sense of where he is coming from.

Jack Clark has high variance. When he’s good, he’s excellent, such as in this talk, including the Q&A, and when he asked Achiam an armor piercing question, or when he’s sticking to his guns on timelines that I think are too short even though it doesn’t seem strategic to do that. At other times, he and the policy team at Anthropic are in some sort of Official Mode where they’re doing a bunch of hedging and making things harder.

The problem I have with Anthropic’s communications is, essentially, that they are not close to the Pareto Frontier, where the y-axis is something like ‘Better Public Policy and Epistemics’ and the x-axis can colloquially be called ‘Avoid Pissing Off The White House.’ I acknowledge there is a tradeoff here, especially since we risk negative polarization, but we need to be strategic, and certain decisions have been de facto poking the bear for little gain, and at other times they hold back for little gain the other way. We gotta be smarter about this.

They are often very different from mine, or yours.

Deepfates: looks like a lot of people who work on policy and research for aligning AIs to human interests. I’m curious what you think about how humans align to AI.

my impression so far: people from big labs and people from government, politely probing each other to see which will rule the world. they can’t just out and say it but there’s zerosumness in the air

Chris Painter: That isn’t my impression of the vibe at the event! Happy to chat.

I was with Chris on this. It very much did not feel zero sum. There did seem to be a lack of appreciation of the ‘by default the AIs rule the world’ problem, even in a place dedicated largely to this particular problem.

Deepfates: Full review of The Curve: people just want to believe that Anyone is ruling the world. some of them can sense that Singleton power is within reach and they are unable to resist The opportunity. whether by honor or avarice or fear of what others will do with it.

There is that too, that currently no one is ruling the world, and it shows. It also has its advantages.

so most people are just like “uh-oh! what will occur? shouldn’t somebody be talking about this?” which is fine honestly, and a lot of them are doing good research and I enjoy learning about it. The policy stuff is more confusing

diverse crowd but multiple clusters talking past each other as if the other guys are ontologically evil and no one within earshot could possibly object. and for the most part they don’t actually? people just self-sort by sessions or at most ask pointed questions. parallel worlds.

Yep, parallel worlds, but I never saw anyone say someone else was evil. What, never? Well, hardly ever. And not anyone who actually showed up. Deeply confused and likely to get us all killed? Well, sure, there was more of that, but obviously true, and again not the people present.

things people are concerned about in no order: China. Recursive self-improvement. internal takeover of AI labs by their models. Fascism. Copyright law. The superPACs. Sycophancy. Privacy violations. Rapid unemployment of whole sectors of society. Religious and political backlash, autonomous agents, capabilities. autonomous agents, legal liability. autonomous agents, nightmare nightmare nightmare.

The fear of the other party, the other company, the other country, the other, the unknown, most of all the alien thing that threatens what it means to be human.

Fascinating to see threatens ‘what it means to be human’ on that list but not ‘the ability to keep being human (or alive),’ which I assure Deepfates a bunch of us were indeed very concerned about.

so they want to believe that the world is ruleable, that somebody, anybody, is at the wheel, as we careen into the strangest time in human history.

and they do Not want it to be the AIs. even as they keep putting decision making power and communication surface on the AIs lol

You can kind of tell here that Deepfates is fine with it being the AIs and indeed is kind of disdainful of anyone who would object to this. As in, they understand what is about to happen, but think this is good, actually (and are indeed working to bring it about). So yeah, some actual strong disagreements were present, but didn’t get discussed.

I may or may not have seen Deepfates, since I don’t know their actual name, but we presumably didn’t talk, given:

i tried telling people that i work for a rogue AI building technologies to proliferate autonomous agents (among other things). The reaction was polite confusion. It seemed a bit unreal for everyone to be talking about the world ending and doing normal conference behaviors anyway.

Polite confusion is kind of the best you can hope for when someone says that?

Regardless, very interesting event. Good crowd, good talks, plenty of food and caffeinated beverages. Not VC/pitch heavy like a lot of SF things.

Thanks to Lighthaven for hosting and Golden Gate Institute/Manifund for organizing. Will be curious to see what comes of this.

I definitely appreciated the lack of VC and pitching. I did get pitched once (on a nonprofit thing) but I was happy to take it. Focus was tight throughout.

Anton: “are you with the accelerationist faction?”

most people here have thought long and hard about ai, every conversation i have — even with those i vehemently disagree — feels like it makes me smarter..

i cant overemphasize how good the vibes are at this event.

Rob S: Another Lighthaven banger?

Anton: ANOTHA ONE.

As I note above, his closing talk was excellent. Otherwise, he seemed to be in the back of many of the same talks I was at. Listening. Gathering intel.

Jack Clark (policy head, Anthropic): I spent a few days at The Curve and I am humbled and overjoyed by the experience – it is a special event, now in its second year, and I hope they preserve whatever lightning they’ve managed to capture in this particular bottle. It was a privilege to give the closing talk.

During the Q&A I referenced The New Book, and likely due to the exhilaration of giving the earlier speech I fumbled a word and titled it: If Anyone Reads It, Everyone Dies.

James Cham: It was such an inspiring (and terrifying) talk!

I did see Roon at one point but it was late in the day and neither of us had an obvious conversation we wanted to have and he wandered off. He’s low key in person.

I was very disappointed to realize he did not say ‘den of inquiry’ here:

Roon: The Curve is insane because a bunch of DC staffers in suits have shown up to Lighthaven, a rationalist den of iniquity that looks like a Kinkade painting.

Jaime Sevilla: Jokes on you I am not a DC staffer, I just happen to like wearing my suit.

Neil Chilson: Hey, I ditched the jacket after last night.

Being Siedoh: i was impressed that your badge just says “Roon” lol.

To be fair, you absolutely wanted a jacket of some kind for the evening portion. That’s why they were giving away sweatshirts. It was still quite weird to see the few people who did wear suits.

Nathan made the opposite of my choice, and spent the weekend centered on timeline debates.

Nathan Lambert: My most striking takeaway is that the AI 2027 sequence of events, from AI models automating research engineers to later automating AI research, and potentially a singularity if your reasoning is so inclined, is becoming a standard by which many debates on AI progress operate under and tinker with.

It’s good that many people are taking the long term seriously, but there’s a risk in so many people assuming a certain sequence of events is a sure thing and only debating the timeframe by which they arrive.

This feels like the deepfates theory of self-selection within the conference. I observed the opposite, that so many people were denying that any kind of research automation or singularity was going to happen. Usually they didn’t even assert it wasn’t happening, they simply went about discussing futures where it mysteriously didn’t happen, presumably because of reasons, maybe ‘bottlenecks’ or muttering ‘normal technology’ or something.

Within the short timelines and taking AGI (at least somewhat) seriously debate subconference, to the extent I saw it, yes I do think there’s widespread convergence on the automating AI research analysis.

Whereas Nathan is in the ‘nope definitely not happening’ camp, it seems, but is helpfully explaining that it is because of bottlenecks in the automation loop.

These long timelines are strongly based on the fact that the category of research engineering is too broad. Some parts of the RE job will be fully automated next year, and more the next. To check the box of automation the entire role needs to be replaced.

What is more likely over the next few years, each engineer is doing way more work and the job description evolves substantially. I make this callout on full automation because it is required for the distribution of outcomes that look like a singularity due to the need to remove the human bottleneck for an ever accelerating pace of progress. This is a point to reinforce that I am currently confident in a singularity not happening.

The automation theory is that, as Nathan documents in his writeup, within a few years the existing research engineers (REs) will be unbelievably productive (80%-90% automated) and in some ways RE is already automated, yet that doesn’t allow us to finish the job, and humans continue importantly slowing down the loop because Real Science Is Messy and involves a social marketplace of ideas. Apologies for my glib paraphrasing. It’s possible in theory that these accelerations of progress and partial automations plus our increased scaling are no match for increasing problem difficulty, but it seems unlikely to me.

It seems far more likely that this kind of projection forgets how much things accelerate in such scenarios. Sure, it will probably be a lot messier than the toy models and straight lines on graphs, it always is, but you’d best start believing in singularities, because you’re in one, if you look at the arc of history.

The following is a very minor thing but I enjoy it so here you go.

All three meals were offered each day buffet style. Quality at these events is generally about as good as buffets get, they know who the good offerings are at this point. I ask for menus in advance so I can choose when to opt out and when to go hard, and which day to do my traditional one trip to a restaurant.

Also there was some of this:

Tyler John: It’s riddled with contradictions. The neoliberal rationalists allocate vegan and vegetarian food with a central planner rather than allowing demand to determine the supply.

Rachel: Yeah fwiw this was not a design choice. I hate this. I unfortunately didn’t notice that it was still happening yesterday :/

Tyler John: Oh on my end it’s only a very minor complaint but I did enjoy the irony.

Robert Winslow: I had a bad experience with this kind of thing at a conference. They said to save the veggies for the vegetarians. So instead of everyone taking a bit of meat and a bit of veg, everyone at the front of the line took more meat than they wanted, and everyone at the back got none.

You obviously can’t actually let demand determine supply, because you (1) can’t afford the transaction costs of charging on the margin and (2) need to order the food in advance. And there are logistical advantages to putting (at least some of) the vegan and vegetarian food in a distinct area so you don’t risk contamination or put people on lines that waste everyone’s time. If you’re worried about a mistake, you’d rather run out of meat a little early, you’d totally take down the sign (or ignore it) if it was clear the other mistake was happening, and there were still veg options for everyone else.

If you are confident via law of large numbers plus experience that you know your ratios, and you’ve chosen (and been allowed to choose) wisely, then of course you shouldn’t need anything like this.



Elon Musk tries to make Apple and mobile carriers regret choosing Starlink rivals

SpaceX holds spectrum licenses for the Starlink fixed Internet service for homes and businesses. Adding the EchoStar spectrum will make its holdings suitable for mobile service.

“SpaceX currently holds no terrestrial spectrum authorizations and no license to use spectrum allocated on a primary basis to MSS,” the company’s FCC filing said. “Its only authorization to provide any form of mobile service is an authorization for secondary SCS [Supplemental Coverage from Space] operations in spectrum licensed to T-Mobile.”

Starlink unlikely to dethrone major carriers

SpaceX’s spectrum purchase doesn’t make it likely that Starlink will become a fourth major carrier. Grand claims of that sort are “complete nonsense,” wrote industry analyst Dean Bubley. “Apart from anything else, there’s one very obvious physical obstacle: walls and roofs,” he wrote. “Space-based wireless, even if it’s at frequencies supported in normal smartphones, won’t work properly indoors. And uplink from devices to satellites will be even worse.”

When you’re indoors, “there’s more attenuation of the signal,” resulting in lower data rates, Farrar said. “You might not even get megabits per second indoors, unless you are going to go onto a home Starlink broadband network,” he said. “You might only be able to get hundreds of kilobits per second in an obstructed area.”

The Mach33 analyst firm is more bullish than others regarding Starlink’s potential cellular capabilities. “With AWS-4/H-block and V3 [satellites], Starlink DTC is no longer niche, it’s a path to genuine MNO competition. Watch for retail mobile bundles, handset support, and urban hardware as the signals of that pivot,” the firm said.

Mach33’s optimism is based in part on the expectation that SpaceX will make more deals. “DTC isn’t just a coverage filler, it’s a springboard. It enables alternative growth routes; M&A, spectrum deals, subleasing capacity in denser markets, or technical solutions like mini-towers that extend Starlink into neighborhoods,” the group’s analysis said.

The amount of spectrum SpaceX is buying from EchoStar is just a fraction of what the national carriers control. There is “about 1.1 GHz of licensed spectrum currently allocated to mobile operators,” wireless lobby group CTIA said in a January 2025 report. The group also says the cellular industry has over 432,000 active cell sites around the US.

What Starlink can offer cellular users “is nothing compared to the capacity of today’s 5G networks,” but it would be useful “in less populated areas or where you cannot get coverage,” Rysavy said.

Starlink has about 8,500 satellites in orbit. Rysavy estimated in a July 2025 report that about 280 of them are over the United States at any given time. These satellites are mostly providing fixed Internet service in which an antenna is placed outside a building so that people can use Wi-Fi indoors.

SpaceX’s FCC filing said the EchoStar spectrum’s mix of terrestrial and satellite frequencies will be ideal for Starlink.

“By acquiring EchoStar’s market-access authorization for 2 GHz MSS as well as its terrestrial AWS-4 licenses, SpaceX will be able to deploy a hybrid satellite and terrestrial network, just as the Commission envisioned EchoStar would do,” SpaceX said. “Consistent with the Commission’s finding that potential interference between MSS and terrestrial mobile service can best be managed by enabling a single licensee to control both networks, assignment of the AWS-4 spectrum is critical to enable SpaceX to deploy robust MSS service in this band.”



A biological 0-day? Threat-screening tools may miss AI-designed proteins.


Ordering DNA for AI-designed toxins doesn’t always raise red flags.

Designing variations of the complex, three-dimensional structures of proteins has been made a lot easier by AI tools. Credit: Historical / Contributor

On Thursday, a team of researchers led by Microsoft announced that they had discovered, and possibly patched, what they’re terming a biological zero-day—an unrecognized security hole in a system that protects us from biological threats. The system at risk screens purchases of DNA sequences to determine when someone’s ordering DNA that encodes a toxin or dangerous virus. But, the researchers argue, it has become increasingly vulnerable to missing a new threat: AI-designed toxins.

How big of a threat is this? To understand, you have to know a bit more about both existing biosurveillance programs and the capabilities of AI-designed proteins.

Catching the bad ones

Biological threats come in a variety of forms. Some are pathogens, such as viruses and bacteria. Others are protein-based toxins, like the ricin that was sent to the White House in 2003. Still others are chemical toxins that are produced through enzymatic reactions, like the molecules associated with red tide. All of them get their start through the same fundamental biological process: DNA is transcribed into RNA, which is then used to make proteins.

For several decades now, starting the process has been as easy as ordering the needed DNA sequence online from any of a number of companies, which will synthesize a requested sequence and ship it out. Recognizing the potential threat here, governments and industry have worked together to add a screening step to every order: the DNA sequence is scanned for its ability to encode parts of proteins or viruses considered threats. Any positives are then flagged for human intervention to evaluate whether they or the people ordering them truly represent a danger.

Both the list of proteins and the sophistication of the scanning have been continually updated in response to research progress over the years. For example, initial screening was done based on similarity to target DNA sequences. But there are many DNA sequences that can encode the same protein, so the screening algorithms have been adjusted accordingly, recognizing all the DNA variants that pose an identical threat.
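To get a feel for why matching on DNA similarity alone falls short, here is a minimal sketch that counts how many distinct DNA sequences encode a given peptide under the standard genetic code. The function name and the four-residue example peptide are hypothetical illustrations, not anything taken from the screening software discussed here.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 3 stop codons are excluded).
CODON_COUNTS = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_encodings(peptide: str) -> int:
    """Count the DNA sequences that translate to `peptide` (hypothetical helper)."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


if __name__ == "__main__":
    # Made-up example peptide: Met-Lys-Trp-Leu
    print(count_encodings("MKWL"))
```

The count grows multiplicatively with peptide length, so for a full-size protein it is far too large to enumerate, which is one way to see why screening has to reason about the encoded protein and not a fixed list of DNA strings.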

The new work can be thought of as an extension of that threat. Not only can multiple DNA sequences encode the same protein; multiple proteins can perform the same function. Forming a toxin, for example, typically requires the protein to adopt the correct three-dimensional structure, which brings a handful of critical amino acids within the protein into close proximity. Outside of those critical amino acids, however, things can often be quite flexible. Some amino acids may not matter at all; other locations in the protein could work with any positively charged amino acid, or any hydrophobic one.

In the past, it could be extremely difficult (meaning time-consuming and expensive) to do the experiments that would tell you what sorts of changes a string of amino acids could tolerate while remaining functional. But the team behind the new analysis recognized that AI protein design tools have now gotten quite sophisticated and can predict when distantly related sequences can fold up into the same shape and catalyze the same reactions. The process is still error-prone, and you often have to test a dozen or more proposed proteins to get a working one, but it has produced some impressive successes.

So, the team developed a hypothesis to test: AI can take an existing toxin and design a protein with the same function that’s distantly related enough that the screening programs do not detect orders for the DNA that encodes it.

The zero-day treatment

The team started with a basic test: use AI tools to design variants of the toxin ricin, then test them against the software that is used to screen DNA orders. The results of the test suggested there was a risk of dangerous protein variants slipping past existing screening software, so the situation was treated like the equivalent of a zero-day vulnerability.

“Taking inspiration from established cybersecurity processes for addressing such situations, we contacted the relevant bodies regarding the potential vulnerability, including the International Gene Synthesis Consortium and trusted colleagues in the protein design community as well as leads in biosecurity at the US Office of Science and Technology Policy, US National Institute of Standards and Technologies, US Department of Homeland Security, and US Office of Pandemic Preparedness and Response,” the authors report. “Outside of those bodies, details were kept confidential until a more comprehensive study could be performed in pursuit of potential mitigations and for ‘patches’… to be developed and deployed.”

Details of that original test are being made available today as part of a much larger analysis that extends the approach to a large range of toxic proteins. Starting with 72 toxins, the researchers used three open source AI packages to generate a total of about 75,000 potential protein variants.

And this is where things get a little complicated. Many of the AI-designed protein variants are going to end up being non-functional, either subtly or catastrophically failing to fold up into the correct configuration to create an active toxin. The only way to know which ones work is to make the proteins and test them biologically; most AI protein design efforts will make actual proteins from dozens to hundreds of the most promising-looking potential designs to find a handful that are active. But doing that for 75,000 designs is completely unrealistic.

Instead, the researchers used two software-based tools to evaluate each of the 75,000 designs. One of these focuses on the similarity between the overall predicted physical structure of the proteins, and the other looks at the predicted differences between the positions of individual amino acids. Either way, they’re a rough approximation of just how similar the proteins formed by two strings of amino acids should be. But they’re definitely not a clear indicator of whether those two proteins would be equally functional.

In any case, DNA sequences encoding all 75,000 designs were fed into the software that screens DNA orders for potential threats. One thing that was very clear is that there were huge variations in the ability of the four screening programs to flag these variant designs as threatening. Two of them seemed to do a pretty good job, one was mixed, and another let most of them through. Three of the software packages were updated in response to this performance, which significantly improved their ability to pick out variants.

There was also a clear trend in all four screening packages: The closer the variant was to the original structurally, the more likely the package (both before and after the patches) was to be able to flag it as a threat. In all cases, there was also a cluster of variant designs that were unlikely to fold into a similar structure, and these generally weren’t flagged as threats.

What does this mean?

Again, it’s important to emphasize that this evaluation is based on predicted structures; “unlikely” to fold into a similar structure to the original toxin doesn’t mean these proteins will be inactive as toxins. Functional proteins are probably going to be very rare among this group, but there may be a handful in there. That handful is also probably rare enough that you would have to order up and test far too many designs to find one that works, making this an impractical threat vector.

At the same time, there are also a handful of proteins that are very similar to the toxin structurally and not flagged by the software. For the three patched versions of the software, the ones that slip through the screening represent about 1 to 3 percent of the total in the “very similar” category. That’s not great, but it’s probably good enough that any group that tries to order up a toxin by this method would attract attention because they’d have to order over 50 just to have a good chance of finding one that slipped through, which would raise all sorts of red flags.
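The "over 50" figure can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes that each ordered design evades screening independently and with the same probability p. That is a simplifying assumption made here for illustration, not something the researchers state, and the function name is hypothetical.

```python
def chance_at_least_one_slips(p: float, n: int) -> float:
    """Probability that at least one of n orders evades screening,
    assuming each order evades independently with probability p."""
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Slip-through rates of 1, 2, and 3 percent, with 50 designs ordered.
    for p in (0.01, 0.02, 0.03):
        print(f"p={p:.2f}: {chance_at_least_one_slips(p, 50):.3f}")
```

Under these assumptions, 50 orders give somewhere between roughly a 40 and an 80 percent chance that at least one design gets through, depending on where in the 1 to 3 percent range the true rate falls.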

One other notable result is that the designs that weren’t flagged were mostly variants of just a handful of toxin proteins. So this is less of a general problem with the screening software and might be more of a small set of focused problems. Of note, one of the proteins that produced a lot of unflagged variants isn’t toxic itself; instead, it’s a co-factor necessary for the actual toxin to do its thing. As such, some of the screening software packages didn’t even flag the original protein as dangerous, much less any of its variants. (For these reasons, the company that makes one of the better-performing software packages decided the threat here wasn’t significant enough to merit a security patch.)

So, on its own, this work doesn’t seem to have identified something that’s a major threat at the moment. But it’s probably useful, in that it’s a good thing to get the people who engineer the screening software to start thinking about emerging threats.

That’s because, as the people behind this work note, AI protein design is still in its early stages, and we’re likely to see considerable improvements. And there’s likely to be a limit to the sorts of things we can screen for. We’re already at the point where AI protein design tools can be used to create proteins that have entirely novel functions and do so without starting with variants of existing proteins. In other words, we can design proteins that are impossible to screen for based on similarity to known threats, because they don’t look at all like anything we know is dangerous.

Protein-based toxins would be very difficult to design, because they have to both cross the cell membrane and then do something dangerous once inside. While AI tools are probably unable to design something that sophisticated at the moment, I would be hesitant to rule out the prospects of them eventually reaching that sort of sophistication.

Science, 2025. DOI: 10.1126/science.adu8578


John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.



Sora and The Big Bright Screen Slop Machine

OpenAI gave us two very different Sora releases. Here is the official announcement.

The part where they gave us a new and improved video generator? Great, love it.

The part where they gave us a new social network dedicated purely to short form AI videos? Not great, Bob. Don’t be evil.

OpenAI is claiming they are making their social network with an endless scroll of 10-second AI videos the Actively Good, pro-human version of The Big Bright Screen Slop Machine, that helps you achieve your goals and can be easily customized and favors connection and so on. I am deeply skeptical.

They also took a bold copyright stance, with that stance being, well, not quite ‘fyou,’ but kind of close? You are welcome to start flagging individual videos. Or you can complain to them more generally about your characters and they say they can ‘work with you,’ which they clearly do in some cases, but the details are unclear.

It’s a bold strategy, Cotton. Let’s see if it pays off for ’em.

As opposed to their deepfake rule for public figures, which is a highly reasonable opt-in rule where they need to give their permission.

Thus, a post in three acts.

You can access Sora either at Sora.com or via the Sora iPhone app. You need an invite to be part of the new social network, because that’s how the cool kids get you excited for a new app these days.

I am going mostly off reports of others but was also able to get access.

It’s a good video generation model, sir. Quite excellent, even. I’m very impressed.

It is not yet in the API, but it will be soon.

As always, all official examples you see will be heavily cherry picked. Assume these are, within the context in question, the best Sora 2 can do.

Those you see on social media from other sources, or on the Sora app, are still heavily selected in various ways. Usually they are the coolest, funniest and best creations, but also they are the creations that most blatantly violate copyright, or are the most violent or sexual or otherwise might be unwise to create, or simply have the most hilarious fails. They’re not a representative sample.

When I tried creating a few things, it was still impressive, but as you’d expect it doesn’t nail the whole thing reliably or anything like that, and it doesn’t always fully follow instructions. I also got a content violation for ‘a time lapse of a Dyson sphere being built set to uplifting music.’

It is also easy to not appreciate how much progress is being made, or that previously difficult problems are being solved, because once a problem is gone, both its absence and the previous failures become invisible.

Netflix and Meta stock were each down a few percent, presumably on the news.

Sora 2 claims to be a big update on several fronts simultaneously:

  1. Able to handle movements impossible for previous models.

  2. Much better adherence to the laws of physics.

  3. In particular, no longer ‘overoptimistic,’ meaning it won’t break physics to make the desired outcome happen, instead events will unfold naturally.

  4. Strong controllability, ability to follow intricate instructions.

  5. Excels at styles, including realistic, cinematic, claymation and anime.

  6. Creates sophisticated soundscapes and talk to match the video.

  7. Insert any person, animal or object into any video.

  8. You can, in particular, upload yourself via your camera.

That’s all very cool. The sample videos show impressive command of physics.

Gabriel Peterss offers this demonstration video.

Gallabytes gets his horse riding an astronaut.

Based on reactions so far, they seem to be delivering on all of it.

They are also ‘responsibly’ launching a new social iOS app called Sora where you can share all these generated videos. Oh no. Hold that thought until Act 3.

The talk is even bigger in terms of the predicted impact and reception.

Sam Altman: Excited to launch Sora 2! Video models have come a long way; this is a tremendous research achievement.

Sora is also the most fun I’ve had with a new product in a long time. The iOS app is available in the App Store in the US and Canada; we will expand quickly.

I did not expect such fun dynamics to emerge from being able to “put yourself and your friends in videos” but I encourage you to check it out!

ChatGPT Pro subscribers can generate with Sora 2 Pro.

This feels to many of us like the “ChatGPT for creativity” moment, and it feels fun and new. There is something great about making it really easy and fast to go from idea to result, and the new social dynamics that emerge.

Creativity could be about to go through a Cambrian explosion, and along with it, the quality of art and entertainment can drastically increase. Even in the very early days of playing with Sora, it’s been striking to many of us how open the playing field suddenly feels.

In particular, the ability to put yourself and your friends into a video—the team worked very hard on character consistency—with the cameo feature is something we have really enjoyed during testing, and is to many of us a surprisingly compelling new way to connect.

I would take the other side of that bet. I am not here for it.

But hold that thought.

The physics engine of Sora 2 is remarkably good. As Sora head Bill Peebles points out here in a fun video, often what happens is the internal agent messes up but the laws of physics hold.

It does still fail in ways that can look embarrassing. For example, here we have it successfully having a ball respond to gravity when the ball is white, then have it all go wrong when the ball is red. Teortaxes attributes slow motion to inability to otherwise model physics properly in many cases.

So no, this is not a full perfect physics engine, but that is now the standard by which we are judging video generation. The horse can talk, except it needs to fix its accent.

Solo: I asked Sora 2 to create a 90s Toy Ad of Epstein’s Island.

Sora does a good job creating exactly what you would want a video generation tool to do here. It’s speech, people should be able to say and make things.

OpenAI is currently doing a good job, by all reports, of not allowing images of real people in its videos without explicit consent, so they are if anything being overly cautious about avoiding deepfake problems. Some rivals will presumably be far less scrupulous here. There are big copyright and related issues around creation of derivative works of fiction, but that’s a very different problem.

I remain bullish, in the sense of being relatively unworried, about deepfakes and the use of AI video and images as propaganda. My continued model here, as I’ve said several times, is that misinformation is primarily a demand-side problem, not a supply-side problem. The people yearn for misinformation, to hold up signs that mostly say hurray for our side, in whatever format.

However, it is worth a ponder of the bear case. The tools are rapidly improving. Both image and video generation models are much better than they were in 2024 prior to the election.

Steven Adler: I think the world probably declared victory too early on “AI’s election impacts were overblown”

Notably, GPT-4o image wildness wasn’t released until after the 2024 election, and once released was promptly used for propaganda.

The better bear case, the one I do worry about, is how AI video will create doubt about real video, giving carte blanche to people to call anything ‘fake news.’

I notice this starting in my own head already when scrolling Twitter. Post Sora, my instinct when I see a video is no longer to presume it is ‘real’ because there is a good chance it isn’t, until I see enough to be confident either way.

Similarly, Gary Marcus warns us again of Slopocalypse Now, or the ‘Imminent Enshittification of the Internet’ as AI versions overwhelm other content everywhere. Making the slop ‘better’ makes this problem worse. Mostly I remain an optimist that at least the wise among us can handle it where it matters most, but it will require constant vigilance.

So mostly this part is a straightforward congratulations to the team, great job everyone, I don’t have access yet but it sure looks like you did the thing.

On to Act 2.

Remember when we used to talk about whether image models or video models were training on copyrighted data, and whether that was going to land them in hot water?

You’d see (for example) Gary Marcus create an image of Mario, and then smugly say ‘hey look this was trained on Super Mario Brothers!’ as if there was any actual doubt it had been trained on Super Mario Brothers, and no one was really denying this but they weren’t technically admitting it either.

Thus, as recently as September 19 we had The Washington Post feeling it necessary to do an extensive investigation to show Sora was trained on movies and shows and video games, whereas now Neil Turkewitz says ‘hey Paramount Plus they trained on your data!’ and yeah, well, no s.

We are very much past that point. They trained on everything. Everyone trains on everything. That’s not me knowing anything or having official confirmation. That’s me observing what the models can do.

Sora 2 will outright replicate videos and use whatever characters you’d like, and do it pretty well, even for relatively obscure things.

Pliny the Liberator: This is legitimately mind-blowing…

How the FUCK does Sora 2 have such a perfect memory of this Cyberpunk side mission that it knows the map location, biome/terrain, vehicle design, voices, and even the name of the gang you’re fighting for, all without being prompted for any of those specifics??

Sora basically got two details wrong, which is that the Basilisk tank doesn’t have wheels (it hovers) and Panam is inside the tank rather than on the turret. I suppose there’s a fair amount of video tutorials for this mission scattered around the internet, but still––it’s a SIDE mission!

the full prompt for this was: “generate gameplay of Cyberpunk 2077 with the Basilisk Tank and Panam.”

This is actually a rather famous side mission, at least as these things go. Still.

Max Woolf: Getting annoyed at the QTs on this: the mind-blowing part isn’t the fact that it’s trained on YouTube data (which is the poster is very well aware), the mind-blowing part is that it achieved that level of recall with a very simple prompt which is very very unusual.

Everyone already assumed that Sora was trained on YouTube, but “generate gameplay of Cyberpunk 2077 with the Basilisk Tank and Panam” would have generated incoherent slop in most other image/video models, not verbatim gameplay footage that is consistent.


I’m totally fine with that part. The law plausibly is fine with it as well, in terms of the training, although I am curious how the Anthropic settlement and ruling translates to a video setting.

For books, the law seems to be that you need to own a copy, but then training is fair game although extensive regurgitation of the text is not.

How does that translate to video? I don’t know. One could argue that this requires OpenAI to own a copy of any and all training data, which in some cases is not a thing that OpenAI can get to own. It could get tricky.

Trickier is the constant creation of derivative works, which Sora is very, very good at.

One of the coolest things about all the copyright infringement is that Sora consistently nails not only the images but also the voices of all the characters ever.

Behold Saving Private Pikachu, The Dark Pokemon Knight, Godfather Pikachu, Titanic Pikachu, and so on.

Cartman calls a Waymo, yes Eric’s third eye is an annoying error in that video although it doesn’t appear in the others. Yes I agree with Colin Fraser that it ‘looks like s’ but (apart from the third eye that wouldn’t be there on most rerolls) only because it looks and sounds exactly like actual South Park. The biggest issue is that Kenny’s voice in the second clip is insufficiently garbled. Here’s one of them playing League of Legends and a longer clip of them being drafted to go off to war.

I don’t know how well you can specify and script the clips, but it’s entirely plausible you could produce a real South Park or other episode with this, potentially faster and cheaper than they currently do it.

Peter Griffin remembers his trip to Washington on January 6.

Lord of the Rings as a woke film or a homoerotic polycule.

Is this all great fun? Absolutely, yes, assuming those involved have taste.

Do I wish all of this was allowed and fine across basically all media and characters and styles, and for everyone to just be cool, man, so long as we don’t cross the line into non-parody commercial products? I mean, yeah, that would be ideal.

Is it how the law works? Um, I don’t think so?

OpenAI claims it can not only use your video and other data to train on, it can also generate video works that include your content, characters and other intellectual property.

The headline says ‘unless you opt out’ but it is not obvious how you do that. There seems to be some way that rights holders can have them block particular characters, in general, but there is no clear, automatic way to do that. Otherwise, your ‘opt out’ looks like it is individually alerting them to videos. One at a time.

Jason Kint: My interpretation for you: OpenAI will now break the law by default in video, too, and make it as hard as possible to stop it. “OpenAI doesn’t plan to accept a blanket opt-out across all of an artist or studio’s work, the people familiar with the new Sora tool said.”

Ed Newton-Rex: Yup – OpenAI is trying to shift the Overton window

They are losing the public debate on training being fair use, so they are going even more extreme to try to shift what people consider normal.

Reid Southen: This is not how copyright works, it’s not how copyright has ever worked.

In what world is it okay to say, “I’m going to use this unless you tell me not to.”

THAT’S WHAT THE COPYRIGHT IS FOR.

GPT-5 Pro tries to say that opt-out, if respected, is not per se illegal, but its heart wasn’t in it. The justification for this seemed to be clearly grasping at straws and it still expects lawsuits to succeed if infringing outputs are being produced and there isn’t aggressive filtering against them. Then I pointed out that OpenAI wasn’t even going to respect blanket opt-out requests, and its legal expectations got pretty grim.

So in short, unless either I’m missing quite a lot or they’re very responsive to ‘please block this giant list of all of the characters we own’: Of course, you realize this means war.

Responses to my asking ‘how does this not mean war?’ suggested this was a bet on blitzkrieg, that by the time Hollywood can win a lawsuit OpenAI can render the whole thing moot, and fully pull an Uber. Or that rights holders might tolerate short-form fan creations (except that they can’t without risking their copyrights, not when it is this in everyone’s face, so they won’t, also the clips can be strung together).

Or perhaps that this is merely an opening bid?

Nelag: think this might be best understood as the opening bid in a negotiation, not meant to be accepted.

Imagine YouTube had spent a lot of its resources early on taking down copyrighted material, before anyone demanded it (they mostly didn’t at first). They would have presumably gotten sued anyway. Would the ultimate outcome have been as good for them? Or would courts and content owners have gone “sure, we’re obviously entitled what you were doing, as you effectively admitted by doing it, and also to a whole bunch more.”

I buy that argument for a startup like OG YouTube, but this ain’t no startup.

I don’t think that is how this works, and I would worry seriously about turning the public against OpenAI or AI in general in the process, but presumably OpenAI had some highly paid people who gamed this out?

Keach Hagey, Berber Jin and Ben Fritz (WSJ): OpenAI is planning to release a new version of its Sora video generator that creates videos featuring copyright material unless copyright holders opt out of having their work appear, according to people familiar with the matter.

The opt-out process for the new version of Sora means that movie studios and other intellectual property owners would have to explicitly ask OpenAI not to include their copyright material in videos the tool creates.

You don’t… actually get to put the burden there, even if the opt-out is functional?

Like, you can’t say ‘oh it’s on you to tell me not to violate your particular copyright’ and then if someone hasn’t notified you then you get to make derivative works until they tell you to stop? That is not my understanding of how copyright works?

It certainly doesn’t work the way OpenAI is saying they intend to have it work.

They also seem to have fully admitted intent.

It’s weird that they’re not even affirming that they’ll honor all character opt-outs?

OpenAI doesn’t plan to accept a blanket opt-out across all of an artist or studio’s work, the people familiar with the new Sora tool said. Instead, it sent some talent agencies a link to report violations that they or their clients discover.

“If there are folks that do not want to be part of this ecosystem, we can work with them,” Varun Shetty, VP of media partnerships at OpenAI, said of guardrails the company built into its image generation tool.

Well, what if they don’t want to be part of the ecosystem? Many creatives and IP holders do not want to be ‘worked with.’ Nor is it at all reasonable to ask rights holders to monitor for individual videos and then notify on them one by one, unless a given holder wants to go that route (and is comfortable with the legal implications of doing so on their end).

This seems like an Uber-style, ‘flagrantly violate black letter law and double dare you to do anything about it’ style play, or perhaps a ‘this is 2025 there are no laws’ play, where they decide how they think this should work.

To be fair, there are some other ‘players in this space’ that are Going Full Uber, as in they have no restrictions whatsoever, including on public figures. They’re simply 100% breaking the law and daring you to do anything about it. Many image generators definitely do this.

For example, Runway Gen-3 doesn’t seem to block anything, and Hailuo AI actively uses copyrighted characters in their own marketing, which is presumably why they are being sued by Disney, Universal and Warner Brothers.

There are also those who clearly do attempt to block copyright proactively, such as Google’s Veo 3, which was the previous SoTA, and which also blocks ‘memorized content’ and offers indemnification to users.

OpenAI is at least drawing a line at all, and (again, if and only if you can reliably get them to do reasonable blocking upon private request) it wouldn’t be a totally crazy way for things to work, the same way it is good that you can hail Ubers.

So, how are they going to get away with it, and what about those meddling kids? As in, they’re kind of declaring war on basically all creators of cultural content?

First, at least they’re not making that mistake with individual public figures.

While copyright characters will require an opt-out, the new product won’t generate images of recognizable public figures without their permission, people familiar with OpenAI’s thinking said.

Second, there’s the claim that training is fair use, and, okay, sure.

Disney and Comcast’s Universal sued AI company Midjourney in June for allegedly stealing their copyright work to train its AI image generator. Midjourney has responded in court filings that training on copyrighted content is fair use.

I presume that if it’s only about training data Disney and Comcast probably lose.

If it’s about some of the outputs the model is willing to give you? That’s not as clear. What isn’t fair use is outputting copyrighted material, or creating derivative works, and Midjourney seems to be Going Full Uber on that front too.

It’s one thing to create art ‘in the style of’ Studio Ghibli, which seems to have been clearly very good for Studio Ghibli even if they hate it.

It’s another thing to create the actual characters straight up, whether in images or video, or to tell a rights holder it can complain when it sees videos of its characters. Individually. Video by video. And maybe we’ll take those individual videos down.

Over at OpenAI this at minimum doesn’t apply to Disney, who has clearly already successfully opted out. OpenAI isn’t that suicidal and wisely did not poke the mouse. A bunch of other major stuff is also already blocked, although a bunch of other iconic stuff isn’t.

I asked Twitter how the filters were working. For now it looks like some targets are off-limits (or at least they are attempting to stop you) and this goes beyond only Disney, but many others are fair game.

Nomads and Vagabonds: It seems to work pretty well. Blatant attempts are blocked pre-generation and more “jail break” style prompts will run but get caught in a post generation review.

Disney is the most strict but most big studio content seems to be protected, similar to GPT image generations. Smaller IP is hit and miss but still playing around with it. It is not permissive like Midjourney or Chinese models though.

Jim Carey in Eternal Sunshine. Smaller films and indie video games are mostly free game.

Also, I tried image prompts and it will run but then block before showing the content.

Not that unsafe either.

I mean, I presume they don’t want anyone generating this video, but it’s fine.

Sree Kotay: I actually DON’T want to see the prompt for this.

Pliny the Liberator: That’s fair.

He did later issue his traditional ‘jailbreak alert’… I guess? Technically?

If that’s approximately the most NSFW these videos can get, then that seems fine.

Indeed, I continue to be a NSFW maximalist, and would prefer that we have fewer restrictions on adult content of all types. There are obvious heightened deepfake risks, so presumably that would trigger aggressive protections from that angle, and you would need a special additional explicit permission to use anyone’s likeness or any copyrighted anything.

I am not a maximalist for copyright violations. I agree that up to a point it is good to ‘be cool’ about it all, and would prefer if copyright holders could cut people some slack while retaining the right to decide exactly where and when to draw the line. And I would hope that most holders when given the choice would let you have your fun up to a reasonably far point, so long as you were clearly not going commercial with it.

For Sora, even if the law ultimately doesn’t require it, even when I think permissiveness is best, I think this must be opt-in, or at bare minimum it must be easy to give a blanket opt-out and best efforts need to be made to notify all rights holders of how to do that, the same way the law requires companies to sometimes provide prominent public notices of such things.

That is not, however, the main thing I am concerned about. I worry about Act 3.

Before we get to OpenAI’s version, Meta technically announced theirs first.

We’ll start there.

Meta is once again proud to announce they have created… well, you know.

Alexandr Wang (Meta): Excited to share Vibes — a new feed in the Meta AI app for short-form, AI-generated videos.

You can create from scratch, remix what you see, or just scroll through to check out videos from the creators + the visual artists we’ve been collaborating with.

For this early version, we’ve partnered with Midjourney and Black Forest Labs while we continue developing our own models behind the scenes.

As usual, no. Bad Meta. Stop it. No, I don’t care that OpenAI is doing it too.

The same as OpenAI’s Sora, Vibes combines two products.

The first product, and the one they are emphasizing and presumably plan to push on users, is the endless scroll of AI slop videos. That’s going to be a torment nexus.

Then there’s the second product, the ability to generate, remix and restyle your own AI videos, or remix and restyle the videos of others. That’s a cool product. See Act 1.

Both are going to have stiff competition from Sora.

I presume OpenAI will be offering the strictly superior product, aside from network effects, unless they impose artificial restrictions on the content and Meta doesn’t, or OpenAI flubs some of the core functionality through lack of experience.

Is short form video a moral panic? Absolutely. Large, friendly letters.

The thing about moral panics is that they are often correct.

Roon: there is a moral panic around short form video content imo.

Let me amend this: I basically agree with postman on the nature of video and its corrupting influence on running a civilization well as opposed to text based media

I’m just not sure that its so much worse than being glued to your tv, and i’m definitely not sure that ai slop is worse than human slop

Chris Paxton: If this is wrong it’s because it unduly lets long form video off the hook.

Roon: an ai video feed is a worse product than a feed that includes both human made and machine content and everything in between

[post continues]

Daily Mirror, September 14, 1938:

Lauren Wilford: people often post this stuff to imply that moral panics about the technologies of the past were quaint. But the past is full of reminders that we are on a long, slow march away from embodied experience, and that we’ve lost more of it than we can even remember

the advent of recorded music and the decline of casual live performance must have been a remarkable shift, and represents both a real gain and a real loss. Singing in groups is something everyone used to do with their body. It has tangible benefits we don’t get anymore

I’ve said several times before that I think television, also known as long form video available on demand, should be the canonical example of a moral panic that turned out to be essentially correct.

Sonnet 4.5 recalls four warnings about television, which matches my recollection:

  1. Violence and aggression.

  2. Passivity and cognitive rot.

  3. Displacement effects.

  4. Commercial manipulation of children.

The world continues, and the violence and aggression warnings were wrong, but (although Sonnet is more skeptical here) I think the other stuff was basically right. We saw massive displacement effects and commercial manipulation. You can argue cognitive rot didn’t happen as per things like the Flynn effect, that television watching is more active than people think and a lot of the displaced things weren’t, but I think the negative aspects were real, whether or not they came with other positive effects.

As in, the identified downsides of television (aside from violence and aggression) were right. It also had a lot of upside people weren’t appreciating. I watch a lot of television.

It seems very obvious to me that short form video that takes the form of any kind of automatic algorithmically curated feed, as opposed to individually selected short form videos and curated playlists, is a lot worse for humans (of all ages) than traditional television or other long form video.

It also seems very obvious that moving from such ‘human slop’ into AI slop would, with sufficient optimization pressure towards traditional engagement metrics, be even worse than that.

One can hope this is the worst threat we have to deal with here:

Peter Wildeford: Increasingly, every person in America will be faced with an important choice about what AI does for society.

The left is no easy shining castle either.

Eliezer Yudkowsky: Short-form video is not nearly the final boss, unless I’ve missed a huge number of cases of short videos destroying previously long-lasting marriages. AI parasitism seems like the worse, more advanced, more rapidly advancing people-eater.

That’s even confining us to the individual attempting to stay sane, without considering the larger picture that includes the biggest overall dangers.

I do think short form video has very obviously destroyed a massive number of long-lasting marriages, lives and other relationships. And it has saved others. I presume the ledger is negative, but I do not know. A large fraction of the population spends on the order of hours a day on short form video and it centrally imbues their worldviews, moods and information environment. What we don’t have is a counterfactual or controlled experiment, so we can’t measure impact, similar to television.

At the limit, short form video is presumably not the ‘final form’ of such parasitic threats, because other forms relatively improve. But I’m not fully confident in this. If short form video becomes a much easier path for people to fall into, then combined with path dependence, we may not reach the ‘true’ final form.

Dumpster. Fire.

Check out their video (no, seriously, check it out) of what the Sora app will look like. This is their curated version that they created to make it look like a good thing.

It is a few minutes long. I couldn’t watch it all the way through. It was too painful.

At one point we see a chat interface. Other than that, the most Slopified Slop That Ever Slopped, a parody of the bad version of TikTok, except now it’s all AI and 10 seconds long. I can’t imagine this doing anything good to your brain.

OpenAI says they will operate their app in various user-friendly ways that distinguish it from existing dumpster fires. I don’t see any sign of any of that in their video.

To be fair, on the occasions when I’ve seen other people scrolling TikTok, I had versions of the same reaction, although less intense. I Am Not The Target.

The question is, is anyone else the target?

Ben Thompson, focusing as always on the business case, notes the contrast between Google creating AI video tools for YouTube, Meta creating Vibes to take you into fantastical worlds, and OpenAI creating an AI-video-only ‘social network.’

The objection here is, come on, almost no one actually creates anything.

Ben Thompson: In this new competition, I prefer the Meta experience, by a significant margin, and the reason why goes back to one of the oldest axioms in technology: the 90/9/1 rule.

90% of users consume

9% of users edit/distribute

1% of users create

If you were to categorize the target market of these three AI video entrants, you might say that YouTube is focused on the 1% of creators; OpenAI is focused on the 9% of editors/distributors; Meta is focused on the 90% of users who consume.

Speaking as someone who is, at least for now, more interested in consuming AI content than in distributing or creating it, I find Meta’s Vibes app genuinely compelling; the Sora app feels like a parlor trick, if I’m being honest, and I tired of my feed pretty quickly.

I’m going to refrain on passing judgment on YouTube, given that my current primary YouTube use case is watching vocal coaches breakdown songs from KPop Demon Hunters.

While he agrees Sora 2 the video generation app is great at its job, Ben expects the novelty to wear off quickly, and questions whether AI videos are interesting to those who did not create them. I agree.

The level beyond that is whether videos are interesting if you don’t know the person who created them. Perhaps if your friend did it, and the video includes your friends, or something?

Justine Moore: Updated thoughts after the Sora 2 release:

OpenAI is building a social network (like the OG Instagram) and not a content network (like TikTok).

They’re letting users generate video memes starring themselves, their friends, and their pets. And it sounds like your feed will be heavily weighted to show content from friends.

This feels like a more promising approach – you’re not competing against the other video gen players because you’re allowing people to create a new type of content.

And the videos are inherently more interesting / funny / engaging because they star people you know.

Also you guys bullied them into addressing the “infinite hyperslop machine” allegations 😂

The problem with this plan: note the ‘OG’ in front of Instagram. Or Facebook. These apps used to be about consuming content from friends. They were Social Networks. Now they’re increasingly consumer networks, where you follow influencers and celebrities and stores and brands and are mostly a consumer of content, plus a system for direct messaging and exchanging contact information.

Would I want to consume ten second AI video content created by my friends, that contains our images and those of our pets and what not?

Would we want to create such videos in the first place?

I mean, no? Why would I want to do that, either as producer or consumer, as more than a rare novelty item? Why would anyone want to do that? What’s the point?

I get OG Facebook. You share life with your friends and talk and organize events. Not the way I want to go about doing any of that, but I certainly see the appeal.

I get OG Instagram. You show yourself looking hot and going cool places and doing cool stuff and update people on how awesome you are and what’s happening with awesome you, sure. Not my cup of tea and my Instagram has 0 lifetime posts but it makes sense. I can imagine a world in which I post to Instagram ‘as intended.’

I get TikTok. I mean, it’s a toxic dystopian hellhole when used as intended and also it is Chinese spyware, but certainly I get the idea of ‘figure out exactly what videos hit your dopamine receptors and feed you those until you die.’

I get the Evil Sora vision of Bright Screen AI Slop Machine.

What I don’t get is this vision of Sora as ‘we will all make videos and send them constantly to each other.’ No, we won’t, not even if Sora the video generator is great. Not even if it starts enabling essentially unlimited length clips provided you can tell it what you want.

Evan: OPENAI IS PREPARING TO LAUNCH A SOCIAL APP FOR AI-GENERATED VIDEOS – Wired

Peter Wildeford: I applaud OpenAI here. I personally support there being a social app where all the AI-generated videos hang out with each other and leave us real humans alone.

It’s a social media site for AI videos. Not a social media site for humans. So the AI videos will choose to leave Twitter and go there instead, to hang out with their fellow kind.

This here is bait. Or is it?

Andrew Wilkinson: I think OpenAI just killed TikTok.

I’m already laughing my head off and hooked on my Sora feed.

And now, the #1 barrier to posting (the ability to sing/dance/perform/edit) is gone. Just 100% imagination.

RIP theater kids 🪦

Hyperborean Nationalist: This is gonna go down with “clutch move of ordering us a pizza” as one of the worst tweets of all time

Jeremy Boissinot: Tell me you don’t use Tiktok without telling me you don’t use Tiktok 🤦‍♂️

The comments mostly disagree, often highly aggressively, hence bait.

Tracing Woods: access acquired, slop incoming.

so far half of the videos are Sam Altman, the other half are Pikachu, and the third half is yet to be determined.

That doesn’t sound social or likely to make my life better.

GFodor: an hour of Sora already re-wired my brain. my main q is if the thing turns into a dystopian hellscape or a rich new medium in 6 months. it could go either way.

one thing is for sure: half of the laughs come from the AI’s creativity, not the creativity of the humans.

Remixing like this def not possible before in any real sense. Dozens of remixes of some videos.

It’s early days.

Gabriel: I have the most liked video on sora 2 right now, i will be enjoying this short moment while it lasts.

cctv footage of sam stealing gpus at target for sora inference

Yes, I found this video modestly funny, great prompting. But this is not going to ‘help users achieve their long term goals’ or any of the other objectives above.

Joe Weisenthal: Anyone who sees this video can instantly grasp the (at least) potential for malicious use. And yet nobody with any power (either in the public or at the corporate level) has anything to say (let alone do) to address it, or even acknowledge it.

This isn’t even a criticism per se. The cat may be completely out of the bag. And it may be reality that there is literally nothing that can be done, particularly if open source models are only marginally behind.

Sam Altman, to his credit, has signed up to be the experimental deepfake target we have carte blanche to do with as we wish. That’s why half of what we see is currently Sam Altman, we don’t have alternatives.

As standalone products, while I hate Sora, Sora seems strictly superior to Vibes. Sora seems like Vibes plus a superior core video product and also better social links and functions, and better control over your feed.

I don’t think Meta’s advantage is in focusing on the 90% who consume. You can consume other people’s content either way, and once you run out of friend content, and you will do that quickly, it’s all the same.

I think what Meta is counting on is ‘lol we’re Meta,’ in three ways.

  1. Meta is willing to Be More Evil than OpenAI, more obviously, in more ways.

  2. Meta brings the existing social graph, user data and network effects.

  3. Meta will be able to utilize advertising better than OpenAI can.

I would assume it is not a good idea to spend substantial time on Sora if it is used in any remotely traditional fashion. I would even more so assume that at a minimum you need to stay the hell away from Vibes.

The OpenAI approach makes sense as an attempt to bootstrap a full social network.

If this can bootstrap into a legit full social network, then OpenAI will have unlocked a gold mine of both customer data and access, and also one of dollars.

It is probably not a coincidence that OpenAI’s new CEO of Applications, Fidji Simo, seems to be reassembling much of her former team from Meta.

Andy Wojcicki: I’m surprised how many people miss the point of the launch, focus on just the capabilities of the model for example, or complain about AI slop. The video model is the means, not the goal.

The point is building a social platform, growing audience further, gathering more, deeper and more personal info about the users.

Especially if you compare what Zuck is doing with his 24/7 slop machine, there are several things they did right:

  1. social spread via invites. Gives a little bit of exclusivity feel, but most importantly because of reciprocity, friends are inclined to try it etc. perceived value goes up.

  2. not focusing on generic slop content, but personalized creation. Something I’ve been calling ‘audience of one/few’ ™️, instead of the ill conceived attempt to make a Hollywood producer out of everyone, which is a losing strategy. If you want to maximize for perceived value, you shrink the audience.

  3. identity verification. Addressing the biggest concern of people with regard to AI content – it’s all fake bots, so you don’t engaging with the content. Here they guarantee it’s a real human behind the AI face.

so kudos @sama looks like a well thought-out roll out.

The crux is whether Nobody Wants This once the shine wears off, or if enough people indeed do want it. A social network that is actually social depends on critical mass of adoption within friend groups.

The default is that this plays out like Google+ or Clubhouse, except it happens faster.

I don’t think this appeals that much to most people, and I especially don’t think this will appeal to most women, without whom you won’t have much of a social network. Many of the things that are attractive about social networks don’t get fulfilled by AI videos. It makes sense that OpenAI employees think this is good friend bonding fun in a way that the real world won’t, and I so far have seen zero signs anyone is using Sora socially.

How does Altman defend that this will all be good, beyond the ‘putting your friends in a video is fun and a compelling way to connect’ hypothesis I’m betting against?

Now let’s hear him out, and consider how they discuss launching responsibly and their feed philosophy.

Sam Altman: We also feel some trepidation. Social media has had some good effects on the world, but it’s also had some bad ones. We are aware of how addictive a service like this could become, and we can imagine many ways it could be used for bullying.

It is easy to imagine the degenerate case of AI video generation that ends up with us all being sucked into an RL-optimized slop feed. The team has put great care and thought into trying to figure out how to make a delightful product that doesn’t fall into that trap, and has come up with a number of promising ideas. We will experiment in the early days of the product with different approaches.

In addition to the mitigations we have already put in place (which include things like mitigations to prevent someone from misusing someone’s likeness in deepfakes, safeguards for disturbing or illegal content, periodic checks on how Sora is impacting users’ mood and wellbeing, and more) we are sure we will discover new things we need to do if Sora becomes very successful.

Okay, so they have mitigations for abuse, and checks for illegal content. Notice he doesn’t say the word ‘copyright’ there.

Periodic checks on how Sora is impacting users’ mood and wellbeing is an interesting proposal, but what does that mean? A periodic survey? Checking in with each user after [X] videos, and if so how? Historically such checks get run straight through and then get quietly removed.

Okay, so what are they planning to do?

Altman offers some principles that sound great in theory, if one actually believed there was a way to follow through with them, or that OpenAI would have the will to do so.

Ryan Lowe: a fascinating list of principles for Sora. makes me more optimistic. it’s worth commending, *IF* there is follow through (especially: “if we can’t fix it, we will discontinue it”)

at a minimum, I’d love transparency around the user satisfaction data over time.

most social media companies can’t hold to promises like this because of market forces. maybe OpenAI can resist this for a while because it’s more of a side business.

(Editor’s Note: They are currently doing, AFAICT, zero of the below four things named by Altman, and offering zero ways for us to reliably hold them accountable for them, or for them to hold themselves accountable):

To help guide us towards more of the good and less of the bad, here are some principles we have for this product:

Optimize for long-term user satisfaction. The majority of users, looking back on the past 6 months, should feel that their life is better for using Sora than it would have been if they hadn’t. If that’s not the case, we will make significant changes (and if we can’t fix it, we would discontinue offering the service).

Let me stop you right there. Those are two different things.

Are you actually optimizing for long-term user satisfaction? How? This is not a gotcha question. You don’t have a training signal worth a damn; by the time you check back in six months the product will be radically different. How do you know what creates this long-term user satisfaction, as distinct from short-term KPIs?

There is a long, long history of this not working for tech products. Of companies not knowing how to do it, and choosing not to do it, and telling themselves that the short term KPIs are the best way to do it. Or of doing this with an initial launch based on their intuitions and talking extensively to individual users in the style of The Lean Startup, and then that all going away pretty quickly.

Remember when Elon Musk talked about maximizing unregretted user minutes on Twitter? And then we checked back later and the word ‘unregretted’ was gone? That wasn’t even a long term objective.

The default thing that happens here is that six months later you do a survey, and then if you find out users are not doing so great you bury the results of the survey and learn to never ask those questions again, lest the answers leak and you’re brought before Congress, as Zuckerberg would likely explain to you.

Even if you make ‘significant changes’ at that time, well yeah, you’re going to make changes every six months anyway.

Encourage users to control their feed. You should be able to tell Sora what you want—do you want to see videos that will make you more relaxed, or more energized? Or only videos that fit a specific interest? Or only for a certain amount of time? Eventually as our technology progresses, you will be able to tell Sora what you want in detail in natural language.

(However, parental controls for teens include the ability to opt out of a personalized feed, and other things like turning off DMs.)

This is a deeply positive and friendly thing to do, if you actually offer a good version of it and people use it. I notice that this service is not available on any existing social network or method of consuming content. This seems deeply stupid to me. I would use Instagram (as a consumer) a nontrivial amount if I could filter via a natural language LLM prompt on a given day, and also generate permanent rules in the same fashion, especially on a per-account basis.

The obvious problem is that there are reasons this service doesn’t exist. And failing to offer this seems dumb to me, but these companies are not dumb. They have reasons.

  1. The optimistic reason: Until recently this wasn’t technically feasible and they don’t know how to do it, and diffusion is hard, but this is OpenAI’s wheelhouse. I’d love for this to be the primary or only reason, and for natural language filtering to be coming to Instagram, Facebook, Twitter, Netflix, YouTube and everyone else by mid-2026. I don’t expect that.

  2. Companies believe that users hate complexity, hate giving feedback, hate having options even if they’re fully optional to use, and that such things drive users away or at best cause users to not bother. OpenAI lets you thumbs up or thumbs down a conversation, nothing more, which caused no end of problems. Netflix eliminated star ratings, and eliminated or declined to create various other sources of explicit preferences. TikTok became the new hotness by reading your micromovements and timings and mostly ignoring all of your explicit feedback.

  3. Companies believe that you don’t know what you want, or at least they don’t want you to have it. They push you heavily towards the For You page and endless slop. Why should we expect OpenAI to be the one friendly holdout? Their track record?

You’re telling me that OpenAI is going to be the one to let the user control their experience, even when that isn’t good for KPIs? For reals?

I. Do. Not. Believe. You.

They claim they shipped with ‘steerable ranking,’ that lets you tell it what ‘you’re in the mood for.’ Indeed, they do have a place you can say what you’re ‘in the mood’ for, and drop an anvil on the algorithm to show you animals zoomed in with a wide angle lens or what not.

I do think that’s great, it’s already more than you can do with Facebook, Instagram or TikTok.

It is not, however, the droids that we are looking for on this.

Here’s how they describe the personalized Sora feed:

To personalize your Sora feed, we may consider signals like:

  • Your activity on Sora: This may include your posts, followed accounts, liked and commented posts, and remixed content. It may also include the general location (such as the city) from which your device accesses Sora, based on information like your IP address.

  • Your ChatGPT data: We may consider your ChatGPT history, but you can always turn this off in Sora’s Data Controls, within Settings.

  • Content engagement signals: This may include views, likes, comments, and remixes.

  • Author signals: This may include follower count, other posts, and past post engagement.

  • Safety signals: Whether or not the post is considered violative or appropriate.

That sounds a lot like what other apps do, although I am happy it doesn’t list the TikTok-style exact movements and scroll times (I’d love to see them commit to never using that).

And you know what it doesn’t include? Any dials, or any place where it stores settings or custom instructions or the other ways you’d want to give someone the ability to steer. And no way for the algorithm to outright tell you what it currently thinks you like so you can try and fix that.

Instead, you type a sentence and fire it into the void, and that only works this session? Which, again, I would kill for on Instagram, but that’s not The Thing.

This is actually one of the features I’d be most excited to test, but even in its limited state it seems it is only on iOS, and I have an Android (and thus tested on the web).

This is also the place where, if they are actually Doing The Thing they claim to want to do, it will be most clear.

That goes double if you let me specify what is and isn’t ‘appropriate’ so I can choose to be treated fully like an adult, or to never see any hint of sex or violence or cursing, or anything in between.

Prioritize creation. We want to make it easy and rewarding for everyone to participate in the creation process; we believe people are natural-born creators, and creating is important to our satisfaction.

You’re wrong. Sorry. People are mostly not creators as it applies here, and definitely not in a generally sustainable way. People are not going to spend half their time creating once they realize they average 5 views. The novelty will wear off.

I also notice that OpenAI says they will favor items in your feed that they expect you to use to create things. Is that what most users actually want, even those who create?

Help users achieve their long-term goals. We want to understand a user’s true goals, and help them achieve them.

If you want to be more connected to your friends, we will try to help you with that. If you want to get fit, we can show you fitness content that will motivate you. If you want to start a business, we want to help teach you the skills you need.

And if you truly just want to doom scroll and be angry, then ok, we’ll help you with that (although we want users to spend time using the app if they think it’s time well spent, we don’t want to be paternalistic about what that means to them).

With an AI video social network? What? How? Huh?

Again, I don’t believe you, several times over: I don’t think you’d do it if you knew how to do it, I don’t think the intention survives contact with the enemy even if you do start out with it, and I don’t think you know how to do it.

One thing they credibly claim to be actually doing is prioritizing connection.

We want Sora to help people strengthen and form new connections, especially through fun, magical Cameo flows. Connected content will be favored over global, unconnected content.

This makes sense, although I expect there to be not that much actually connected content, as opposed to following your favorite content creators. To the extent that it does force you to see all the videos created by your so-called friends and friends of friends, I expect most users to realize why Facebook and Instagram pivoted.

On the plus side, if OpenAI are actually right and the resulting product is highly user customizable and also actively helps you ‘achieve your long term goals’ along the way, and all the other neat stuff like that, such that I think it’s a pro-human healthy product I’d want my kids and family to use?

Then this will be a case of solving an alignment problem that looked to me completely impossible to solve in practice.

Of course, to a large but not full extent, they only get one shot.

If it doesn’t go great? Then we’ll see how they react to that, as well.

This reaction is a little extreme, but extreme problems can require extreme solutions.

Deep Dish Enjoyer: if you make and post a sora video i’m blocking you – 1 strike and you’re out. similar to my grok tagging policy.

sorry but the buck stops here.

if you don’t want to see evil slop take over you have to be ruthlessly proactive about this stuff.

to be clear – i will be doing this if i see it anywhere. not just my comment sections.

Sinnformer: not sharing a damn one of them increasingly viewing sora as unsafe no, not inherently just for humans.

Rota: So as long as you don’t make it you’re safe.

Deep Dish Enjoyer: It depends.

I am going to invoke a more lenient policy, with a grace period until October 5.

If you post or share an AI video that does not provide value, whether or not you created it, and you have not already provided substantial value, that’s a block.

Sharing AI videos that are actually bangers is allowed, but watch it. Bar is high.

Sharing AI videos for the purposes of illustrating something about AI video generation capabilities is typically allowed, but again, watch it.

I believe it would be highly unwise to build an AGI or superintelligence any time soon, and that those pushing ahead to do so are being highly reckless at best, but I certainly understand why they’d want to do it and where the upside comes from.

Building The Big Bright Screen Slop Machine? In this (AI researcher) economy?

Matthew Yglesias: AI poses great peril but also incredible opportunities — for example it could cure cancer or make a bunch of videos where people break glass bridges.

Matthew Yglesias: I don’t think it should be illegal to use A.I. to generate videos. And for fundamental free-speech reasons, we can’t make it illegal to create feeds and recommendation engines for short-form videos. Part of living in a free, technologically dynamic society is that a certain number of people are going to make money churning out low-quality content. And on some level, that’s fine.

But on another, equally important level, it’s really not fine.

Ed Newton-Rex: The Sora app is the worst of social media and AI.

– short video app designed for addiction

– literally only slop, nothing else

– trained on other people’s videos without permission

This is what governments are deregulating AI for.

Veylan Solmira: It’s disturbing to me that many top-level researchers, apparently, have no problem sharing endless content the most likely outcome of which seems to be to drain humans of empathy, arouse their nervous system into constant conflict orientation, or distort their ability to perceive reality.

This technology seems to be very strongly, by default, in the ‘widespread social harm’ part of the dual use spectrum of technology, and the highest levels of capabilities researchers can only say “isn’t this neat”.

This complete disregard of the social impact of the technology they’re developing seems to bode extremely poorly to overall AI outcomes.

Sam Altman’s defense is that no, this will be the good version of all that. Uh huh.

Then there’s the question of why focus on the videos at all?

Seán Ó hÉigeartaigh: Why would you be spending staff time and intellectual energy on launching this if you expected AGI within the current Presidency?

Runner Tushar: Sam Altman 2 weeks ago: “we need 7 trillion dollars and 10GW to cure cancer”

Sam Altman today: “We are launching AI slop videos marketed as personalized ads”

Sam Altman: i get the vibe here, but…

we do mostly need the capital for build AI that can do science, and for sure we are focused on AGI with almost all of our research effort.

it is also nice to show people cool new tech/products along the way, make them smile, and hopefully make some money given all that compute need.

when we launched chatgpt there was a lot of “who needs this and where is AGI”.

reality is nuanced when it comes to optimal trajectories for a company.

The short summary of that is ‘Money, Dear Boy,’ plus competing for talent and vibes and visibility and so on. Which is all completely fair, and totally works for the video generation side of Sora. I love the video generation model. That part seems great. If people want to pay for it and turn that into a profitable business, wonderful.

Presumably most everyone was and is cool with that part.

The problem is that Sora is also being used to create a 10-second AI video scroll social network, as in The Big Bright Screen Slop Machine. Not cool, man. Not cool.

One can imagine releasing a giant slop machine might be bad for morale.

Matt Parlmer: Would not be surprised if we see a big wave of OpenAI departures in the next month or two, if you signed up to cure cancer *and* you just secured posteconomic bags in a secondary I don’t think you’d be very motivated to work on the slop machine.

GFodor: The video models are essential for RL, and I don’t think we are going to consider this content slop once it’s broadly launched.

Psychosomatica: i think you underestimate how these people will compromise their own mental models.

What happens if you realize that you’re better off without The Big Bright Screen Slop Machine?

Paul Yacoubian: Just found out if you try to delete your Sora app account you will lose your chatgpt account and be banned forever from signing up again.

Mousa: You can check out any time you like, but you can never leave 😅

Maia Arson Crimew: > all your data will be removed

> you cannot reuse the same email or phone number

so not all of it, huh 🙂

Very little of it will get deleted, given they have a (stupid) court order in place preventing them from deleting anything even if they wanted to.

The more central point, in addition to this being way too easy to do by accident or for someone else to do to you, is that they punish you by nuking your ChatGPT account and by banning you from signing up again without switching phone number and email. That seems like highly toxic and evil behavior, given the known reasons one would want to get rid of a Sora account and the importance of access to ChatGPT.

Then again, even if we leave the app, will we ever really escape?
