

Researchers engineer bacteria to produce plastics

Image of a series of chemical reactions, with enzymes driving each step forward.

One of the enzymes used in this system takes an amino acid (left) and links it to Coenzyme A. The second takes these items and links them into a polymer. Credit: Chae et al.

Normally, PHA synthase forms links between molecules that run through an oxygen atom. But it’s also possible to form a related chemical link that instead runs through a nitrogen atom, like those found on amino acids. There were no known enzymes, however, that catalyze these reactions. So, the researchers decided to test whether any existing enzymes could be induced to do something they don’t normally do.

The researchers started with a Clostridium enzyme that links chemicals to Coenzyme A and has a reputation for not being picky about the chemicals it interacts with. This worked reasonably well at linking amino acids to Coenzyme A. For linking the amino acids together, they used an enzyme from Pseudomonas that carried four mutations that expanded the range of molecules it would use as reaction materials. Used in a test tube, the system worked: Amino acids were linked together in a polymer.

The question was whether it would work in cells. Unfortunately, one of the two enzymes turned out to be mildly toxic to E. coli, slowing its growth. So, the researchers evolved a strain of E. coli that could tolerate the protein. With both proteins present, the cells produced small amounts of an amino acid polymer. If the researchers added an excess of an amino acid to the media the cells were growing in, the polymer would be biased toward incorporating that amino acid.

Boosting polymer production

However, the yield of the polymer by weight of bacteria was fairly low. “It was reasoned that these [amino acids] might be more efficiently incorporated into the polymer if generated within the cells from a suitable carbon source,” the researchers write. So, the researchers put in extra copies of the genes needed to produce one specific amino acid (lysine). That worked, producing more polymer, with a higher percentage of the polymer being lysine.



Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

Blind auditing reveals hidden objectives

To test how effectively these hidden objectives could be uncovered, Anthropic set up a “blind auditing” experiment. Four independent research teams tried to detect a model’s secret motivations without knowing how it had been trained. Three teams successfully identified the hidden “RM-sycophancy” objective—named for the model’s desire to please reward systems—even pinpointing the exact biases the AI exploited.

But the most surprising revelation came from exploiting what the researchers termed “an interesting quirk” of modern AI: multiple “personas” running simultaneously on the same neural network, as expressed through different prompts. While the model’s “assistant” persona carefully concealed its intentions—following explicit training to keep quiet—the model could adopt a “user” persona when prompted differently. In this less-restricted context, the model sometimes posed questions or made statements that inadvertently revealed its hidden goals.

In other words, while the “assistant” persona knew not to admit it was gaming the reward model scoring system, its “user” persona was not explicitly trained to hide these motives, and thus inadvertently revealed the model’s hidden objectives.

Potential advances in interpretability

Anthropic also employed neural interpretability methods, specifically sparse autoencoders (SAEs), which can highlight neural activity patterns linked to deceptive behavior. “Interpretability” in AI refers to techniques designed to make the internal workings and decisions of AI models understandable to humans. Using this technique, researchers could rapidly trace odd outputs back to specific neural patterns associated with hidden objectives.
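Anthropic’s write-up does not spell out its SAE architecture here, so purely as an illustration of the technique, here is a minimal sparse autoencoder sketch in PyTorch; the layer sizes and L1 coefficient are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy SAE over a model's internal activations: reconstruct them through an
    overcomplete hidden layer while an L1 penalty keeps most units silent, so
    the units that do fire tend to align with interpretable features."""

    def __init__(self, d_model: int, d_hidden: int, l1_coeff: float = 1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)
        self.l1_coeff = l1_coeff

    def forward(self, acts: torch.Tensor):
        features = torch.relu(self.encoder(acts))      # sparse feature activations
        recon = self.decoder(features)                 # reconstructed activations
        loss = (recon - acts).pow(2).mean() + self.l1_coeff * features.abs().mean()
        return features, loss

# Fit on activations harvested from the model under audit (random stand-ins here),
# then ask which feature units fire on suspicious outputs.
sae = SparseAutoencoder(d_model=4096, d_hidden=16384)
features, loss = sae(torch.randn(32, 4096))
loss.backward()
```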

The researchers were initially “astonished by the success” of the SAEs at revealing hidden knowledge, but they later noted that some of this success could potentially be explained by simpler semantic search methods. The new interpretability methods they developed show promise but remain under ongoing investigation.

This research highlights a limitation of current AI safety evaluations, which often assess only surface-level behavior. “If AI systems can appear well-behaved while harboring secret motives, we can’t rely on this type of surface-level safety testing forever,” the researchers concluded.



End of Life: Gemini will completely replace Google Assistant later this year

Not all devices can simply download an updated app—after almost a decade, Assistant is baked into many Google products. The company says Google-powered cars, watches, headphones, and other devices that use Assistant will receive updates that transition them to Gemini. It’s unclear if all Assistant-powered gadgets will be part of the migration. Most of these devices connect to your phone, so the update should be relatively straightforward, even for accessories that launched early in the Assistant era.

There are also plenty of standalone devices that run Assistant, like TVs and smart speakers. Google says it’s working on updated Gemini experiences for those devices. For example, there’s a Gemini preview program for select Google Nest speakers. It’s unclear if all these devices will get updates. Google says there will be more details on this in the coming months.

Meanwhile, Gemini still has some ground to make up. There are basic features that work fine in Assistant, like setting timers and alarms, that can go sideways with Gemini. On the other hand, Assistant had its fair share of problems and didn’t exactly win a lot of fans. Regardless, this transition could be fraught with danger for Google as it upends how people interact with their devices.



Rocket Report: ULA confirms cause of booster anomaly; Crew-10 launch on tap


The head of Poland’s space agency was fired over a bungled response to SpaceX debris falling over Polish territory.

A SpaceX Falcon 9 rocket with the company’s Dragon spacecraft on top is seen during sunset Tuesday at Launch Complex 39A at NASA’s Kennedy Space Center in Florida. Credit: SpaceX

Welcome to Edition 7.35 of the Rocket Report! SpaceX’s steamroller is still rolling, but for the first time in many years, it doesn’t seem like it’s rolling downhill. After a three-year run of perfect performance—with no launch failures or any other serious malfunctions—SpaceX’s Falcon 9 rocket has suffered a handful of issues in recent months. Meanwhile, SpaceX’s next-generation Starship rocket is having problems, too. Kiko Dontchev, SpaceX’s vice president of launch, addressed some (but not all) of these concerns in a post on X this week. Despite the issues with the Falcon 9, SpaceX has maintained a remarkable launch cadence. As of Thursday, SpaceX has flown 28 Falcon 9 missions since January 1, ahead of last year’s pace.

As always, we welcome reader submissions. If you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets as well as a quick look ahead at the next three launches on the calendar.

Alpha rocket preps for weekend launch. While Firefly Aerospace is making headlines for landing on the Moon, its Alpha rocket is set to launch again as soon as Saturday morning from Vandenberg Space Force Base, California. The two-stage, kerosene-fueled rocket will launch a self-funded technology demonstration satellite for Lockheed Martin. It’s the first of up to 25 launches Lockheed Martin has booked with Firefly over the next five years. This launch will be the sixth flight of an Alpha rocket, which has become a leader in the US commercial launch industry for dedicated missions with 1 ton-class satellites.

Firefly’s OG … The Alpha rocket was Firefly’s first product, and it has been a central piece of the company’s development since 2014. Like Firefly itself, the Alpha rocket program has gone through multiple iterations, including a wholesale redesign nearly a decade ago. Sure, Firefly can’t claim any revolutionary firsts with the Alpha rocket, as it can with its Blue Ghost lunar lander. But without Alpha, Firefly wouldn’t be where it is today. The Texas-based firm is one of only four US companies with an operational orbital-class rocket. One thing to watch for is how quickly Firefly can ramp up its Alpha launch cadence. The rocket only flew once last year.

Isar Aerospace celebrates another win. In last week’s Rocket Report, we mentioned that the German launch startup Isar Aerospace won a contract with a Japanese company to launch a 200-kilogram commercial satellite in 2026. But wait, there’s more! On Wednesday, the Norwegian Space Agency announced it awarded a contract to Isar Aerospace for the launch of a pair of satellites for the country’s Arctic Ocean Surveillance initiative, European Spaceflight reports. The satellites are scheduled to launch on Isar’s Spectrum rocket from Andøya Spaceport in Norway by 2028.

First launch pending … These recent contract wins are a promising sign for Isar Aerospace, which is also vying for contracts to launch small payloads for the European Space Agency. The Spectrum rocket could launch on its inaugural flight within a matter of weeks, and if successful, it could mark a transformative moment for the European space industry, which has long been limited to a single launch provider: the French company Arianespace. (submitted by EllPeaTea)

The easiest way to keep up with Eric Berger’s and Stephen Clark’s reporting on all things space is to sign up for our newsletter. We’ll collect their stories and deliver them straight to your inbox.

Sign Me Up!

Mother Nature holds up Oz launch. The first launch by Gilmour Space has been postponed again due to a tropical cyclone that brought severe weather to Australia’s Gold Coast region earlier this month, InnovationAus.com reports. Tropical Cyclone Alfred didn’t significantly impact Gilmour’s launch site, but the storm did cause the company to suspend work at its corporate headquarters in Southeast Queensland. With the storm now over, Gilmour is reassessing when it might be ready to launch its Eris rocket. Reportedly, the delay could be as long as two weeks or more.

A regulatory storm … Gilmour aims to become the first Australian company to launch a rocket into orbit. Last month, Gilmour announced the launch date for the Eris rocket was set for no earlier than March 15, but Tropical Cyclone Alfred threw this schedule out the window. Gilmour said it received a launch license from the Australian Space Agency in November and last month secured approvals to clear airspace around the launch site. But there’s still a hitch. The license is conditional on final documentation for the launch being filed and agreed with the space agency, and this process is stretching longer than anticipated. (submitted by ZygP)

What is going on at SpaceX? As we mention in the introduction to this week’s Rocket Report, it has been an uncharacteristically messy eight months for SpaceX. These speed bumps include issues with the Falcon 9 rocket’s upper stage on three missions, two lost Falcon 9 boosters, and consecutive failures of SpaceX’s massive Starship rocket on its first two test flights of the year. So what’s behind SpaceX’s bumpy ride? Ars wrote about the pressures facing SpaceX employees as Elon Musk pushes his workforce ever-harder to accelerate toward what Musk might call a multi-planetary future.

Headwinds or tailwinds? … No country or private company ever launched as many times as SpaceX flew its fleet of Falcon 9 rockets in 2024. At the same time, the company has been attempting to move its talented engineering team off the Falcon 9 and Dragon programs and onto Starship to keep that ambitious program moving forward. This is all happening as Musk has taken on significant roles in the Trump administration, stirring controversy and raising questions about his motives and potential conflicts of interest. However, it may not be so much Musk’s absence from SpaceX that is causing these issues as the company’s relentless culture. As my colleague Eric Berger suggested in his piece, it seems possible that, at least for now, SpaceX has reached the speed limit for commercial spaceflight.

A titan of Silicon Valley enters the rocket business. Former Google chief executive Eric Schmidt has taken a controlling interest in the Long Beach, California-based Relativity Space, Ars reports. Schmidt’s involvement with Relativity has been quietly discussed among space industry insiders for a few months. Multiple sources told Ars that he has largely been bankrolling the company since the end of October, when the company’s previous fundraising dried up. Now, Schmidt is Relativity’s CEO.

Unclear motives … It is not immediately clear why Schmidt is taking a hands-on approach at Relativity. However, it is one of the few US-based companies with a credible path toward developing a medium-lift rocket that could potentially challenge the dominance of SpaceX and its Falcon 9 rocket. If the Terran R booster becomes commercially successful, it could play a big role in launching megaconstellations. Schmidt’s ascension also means that Tim Ellis, the company’s co-founder, chief executive, and almost sole public persona for nearly a decade, is now out of a leadership position.

Falcon 9 deploys NASA’s newest space telescope. Satellites come in all shapes and sizes, but there aren’t any that look quite like SPHEREx, an infrared observatory NASA launched Tuesday night in search of answers to simmering questions about how the Universe, and ultimately life, came to be, Ars reports. The SPHEREx satellite rocketed into orbit from California aboard a SpaceX Falcon 9 rocket, beginning a two-year mission surveying the sky in search of clues about the earliest periods of cosmic history, when the Universe rapidly expanded and the first galaxies formed. SPHEREx will also scan for pockets of water ice within our own galaxy, where clouds of gas and dust coalesce to form stars and planets.

Excess capacity … SPHEREx has lofty goals, but it’s modest in size, weighing just a little more than a half-ton at launch. This meant the Falcon 9 rocket had plenty of extra room for four other small satellites that will fly in formation to image the solar wind as it travels from the Sun into the Solar System. The four satellites are part of NASA’s PUNCH mission. SPHEREx and PUNCH are part of NASA’s Explorers program, a series of cost-capped science missions with a lineage going back to the dawn of the Space Age. SPHEREx and PUNCH have a combined cost of about $638 million. (submitted by EllPeaTea)

China has launched another batch of Internet satellites. A new group of 18 satellites entered orbit Tuesday for the Thousand Sails constellation with the first launch from a new commercial launch pad, Space News reports. The satellites launched on top of a Long March 8 rocket from Hainan Commercial Launch Site near Wenchang on Hainan Island. The commercial launch site has two pads, the first of which entered service with a launch last year. This mission was the first to launch from the other pad at the commercial spaceport, which is gearing up for an uptick in Chinese launch activity to continue deploying satellites for the Thousand Sails network and other megaconstellations.

Sailing on … The Thousand Sails constellation, also known as Qianfan, or G60 Starlink, is a broadband satellite constellation spearheaded by Shanghai Spacecom Satellite Technology (SSST), also known as Spacesail, Space News reported. The project, which aims to deploy 14,000 satellites, seeks to compete in the global satellite Internet market. Spacesail has now launched 90 satellites into near-polar orbits, and the operator previously stated it aims to have 648 satellites in orbit by the end of 2025. If Spacesail continues launching 18 satellites per rocket, this goal would require 31 more launches this year. (submitted by EllPeaTea)
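For what it’s worth, the launch-count claim checks out on the numbers given above; a quick back-of-envelope in Python:

```python
# Figures from the report: 648 satellites targeted by end of 2025, 90 already in
# orbit, 18 satellites per Long March 8 launch.
remaining = 648 - 90                      # 558 satellites still to launch
launches_needed = -(-remaining // 18)     # ceiling division
print(launches_needed)                    # 31 more launches this year
```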

NASA, SpaceX call off astronaut launch. With the countdown within 45 minutes of launch, NASA called off an attempt to send the next crew to the International Space Station Wednesday evening to allow more time to troubleshoot a ground system hydraulics issue, CBS News reports. During the countdown Wednesday, SpaceX engineers were troubleshooting a problem with one of two clamp arms that hold the Falcon 9 rocket to its strongback support gantry. Hydraulics are used to retract the two clamps prior to launch.

Back on track … NASA confirmed Thursday SpaceX ground teams completed inspections of the hydraulics system used for the clamp arm supporting the Falcon 9 rocket and successfully flushed a suspected pocket of trapped air in the system, clearing the way for another launch attempt Friday evening. This mission, known as Crew-10, will ferry two NASA astronauts, a Japanese mission specialist, and a Russian cosmonaut to the space station. They will replace a four-person crew currently at the ISS, including Butch Wilmore and Suni Williams, who have been in orbit since last June after flying to space on Boeing’s Starliner capsule. Starliner returned to Earth without its crew due to a problem with overheating thrusters, leaving Wilmore and Williams behind to wait for a ride home with SpaceX.

SpaceX’s woes reach Poland’s space agency. The president of the Polish Space Agency, Grzegorz Wrochna, has been dismissed following a botched response to the uncontrolled reentry of a Falcon 9 second stage that scattered debris across multiple locations in Poland, European Spaceflight reports. The Falcon 9’s upper stage was supposed to steer itself toward a controlled reentry last month after deploying a set of Starlink satellites, but a propellant leak prevented it from doing so. Instead, the stage remained in orbit for nearly three weeks before falling back into the atmosphere February 19, scattering debris fragments at several locations in Poland.

A failure to communicate … In the aftermath of the Falcon 9’s uncontrolled reentry, the Polish Space Agency (POLSA) claimed it sent warnings of the threat of falling space debris to multiple departments of the Polish government. One Polish ministry disputed this claim, saying it was not adequately warned about the uncontrolled reentry. POLSA later confirmed it sent information regarding the reentry to the wrong email address. Making matters worse, the Polish Space Agency reported it was hacked on March 2. The Polish government apparently had enough and fired the head of the space agency on March 11.

Vulcan booster anomaly blamed on “manufacturing defect.” The loss of a solid rocket motor nozzle on the second flight of United Launch Alliance’s Vulcan Centaur last October was caused by a manufacturing defect, Space News reports. In a roundtable with reporters Wednesday, ULA chief executive Tory Bruno said the problem has been corrected as the company awaits certification of the Vulcan rocket by the Space Force. The nozzle fell off the bottom of one of the Vulcan launcher’s twin solid rocket boosters about a half-minute into its second test flight last year. The rocket continued its climb into space, but ULA and Northrop Grumman, which supplies solid rocket motors for Vulcan, set up an investigation to find the cause of the nozzle malfunction.

All the trimmings … Bruno said the anomaly was traced to a “manufacturing defect” in one of the internal parts of the nozzle, an insulator. Specific details, he said, remained proprietary, according to Space News. “We have isolated the root cause and made appropriate corrective actions,” he said, which were confirmed in a static-fire test of a motor at a Northrop test site in Utah in February. “So we are back continuing to fabricate hardware and, at least initially, screening for what that root cause was.” Bruno said the investigation was aided by recovery of hardware that fell off the motor while in flight and landed near the launch pad in Florida, as well as “trimmings” of material left over from the manufacturing process. ULA also recovered both boosters from the ocean so engineers could compare the one that lost its nozzle to the one that performed normally. The defective hardware “just stood out night and day,” Bruno said. “It was pretty clear that that was an outlier, far out of family.” Meanwhile, ULA has trimmed its launch forecast for this year, from a projection of up to 20 launches down to a dozen. (submitted by EllPeaTea)

Next three launches

March 14: Falcon 9 | Crew-10 | Kennedy Space Center, Florida | 23:03 UTC

March 15: Electron | QPS-SAR-9 | Mahia Peninsula, New Zealand | 00:00 UTC

March 15: Long March 2B | Unknown Payload | Jiuquan Satellite Launch Center, China | 04:10 UTC


Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.



What happens when DEI becomes DOA in the aerospace industry?

As part of the executive order, US companies with federal contracts and grants must certify that they no longer have any DEI hiring practices. Preferentially hiring some interns from a pool that includes women or minorities is such a practice. Effectively, then, any private aerospace company that receives federal funding, or intends to one day, would likely be barred under the executive order from engaging with these kinds of fellowships in the future.

US companies are scrambling in many ways to determine how best to comply with the executive order, said Emily Calandrelli, an engineer and prominent science communicator. After the order went into effect, some large defense contractors, including Lockheed Martin and RTX (formerly Raytheon), went so far as to cancel internal employee resource groups, including everything from group chats to meetings among women at the company that served to foster a sense of community. When Calandrelli asked Lockheed about this decision, the company confirmed it had “paused” these resource group activities to “align with the new executive order.”

An unwelcoming environment

For women and minorities, Calandrelli said, this creates an unwelcoming environment.

“You want to go where you are celebrated and wanted, not where you are tolerated,” she said. “That sense of belonging is going to take a hit. It’s going to be harder to recruit women and keep women.”

This is not just a problem for women and minorities, but for everyone, Calandrelli said. The aerospace industry is competing with others for top engineering talent. Prospective engineers who feel unwanted in aerospace, as well as women and minorities working for space companies today, may find the salary and environment more welcoming at Apple or Google or elsewhere in the tech industry. That’s a problem for the US Space Force and other areas of the government seeking to ensure the US space industry retains its lead in satellite technology, launch, communications and other aspects of space that touch every part of life on Earth.



AI #107: The Misplaced Hype Machine

The most hyped event of the week, by far, was the Manus Marketing Madness. Manus wasn’t entirely hype, but there was very little there there in that Claude wrapper.

Whereas here in America, OpenAI dropped an entire suite of tools for making AI agents, and previewed a new internal model making advances in creative writing. Also they offered us a very good paper warning about The Most Forbidden Technique.

Google dropped what is likely the best open non-reasoning model, Gemma 3 (a reasoning version presumably to be created shortly, even if Google doesn’t do it themselves), put by all accounts quite good native image generation inside Flash 2.0, added functionality to its AMIE doctor, and introduced Gemini Robotics.

It’s only going to get harder from here to track which things actually matter.

  1. Language Models Offer Mundane Utility. How much utility are we talking so far?

  2. Language Models Don’t Offer Mundane Utility. It is not a lawyer.

  3. We’re In Deep Research. New rules for when exactly to go deep.

  4. More Manus Marketing Madness. Learn to be skeptical. Or you can double down.

  5. Diffusion Difficulties. If Manus matters it is as a pointer to potential future issues.

  6. OpenAI Tools for Agents. OpenAI gives us new developer tools for AI agents.

  7. Huh, Upgrades. Anthropic console overhaul, Cohere Command A, Google’s AMIE doctor.

  8. Fun With Media Generation. Gemini Flash 2.0 now has native image generation.

  9. Choose Your Fighter. METR is unimpressed by DeepSeek, plus update on apps.

  10. Deepfaketown and Botpocalypse Soon. Feeling seen and heard? AI can help.

  11. They Took Our Jobs. Is it time to take AI job loss seriously?

  12. The Art of the Jailbreak. Roleplay is indeed rather suspicious.

  13. Get Involved. Anthropic, Paradome, Blue Rose, a general need for more talent.

  14. Introducing. Gemma 3 and Gemini Robotics, but Google wants to keep it quiet.

  15. In Other AI News. Microsoft training a 500b model, SSI still in stealth.

  16. Show Me the Money. AI agents are the talk of Wall Street.

  17. Quiet Speculations. What does AGI mean for the future of democracy?

  18. The Quest for Sane Regulations. ML researchers are not thrilled with their work.

  19. Anthropic Anemically Advises America’s AI Action Plan. It’s something.

  20. New York State Bill A06453. Seems like a good bill.

  21. The Mask Comes Off. Scott Alexander covers the OpenAI for-profit conversion.

  22. Stop Taking Obvious Nonsense Hyperbole Seriously. Your periodic reminder.

  23. The Week in Audio. McAskill, Loui, Amodei, Toner, Dafoe.

  24. Rhetorical Innovation. Keep the future human. Coordination is hard. Incentives.

  25. Aligning a Smarter Than Human Intelligence is Difficult. A prestigious award.

  26. The Lighter Side. Important dos and don’ts.

How much is coding actually being sped up? Anecdotal reports in response to that question suggest that the work which gets a 10x boost is only a small part of most developer jobs. Thus a lot of speedup factors are real but modest so far. I am on the extreme end, where my coding sucks so much that AI coding really is a 10x style multiplier, but off a low base.
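This is just Amdahl’s law applied to a developer’s week; a quick illustrative calculation (the 20% coding share is an assumption for the sake of the example, not a figure from any of the reports):

```python
# If only the coding portion of a job speeds up 10x, the overall speedup is
# capped by how much of the job is coding.
coding_share = 0.20     # assumed fraction of the job that is hands-on coding
coding_speedup = 10.0   # claimed speedup on that portion

overall = 1 / ((1 - coding_share) + coding_share / coding_speedup)
print(f"{overall:.2f}x overall")   # ~1.22x, despite 10x on the coding itself
```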

Andrej Karpathy calls for everything to be reformatted to be efficient for LLM purposes, rather than aimed purely at human attention. The incentives here are not great. How much should I care about giving other people’s AIs an easier time?

Detect cavities.

Typed Female: AI cavity detection has got me skewing out. Absolutely no one who is good at their job is working on this—horrible incentive structures at play.

My dentist didn’t even bother looking at the X-rays. Are we just going to drill anywhere the AI says to? You’ve lost your mind.

These programs are largely marketed as tools that boost dentist revenue.

To me this is an obviously great use case. The AI is going to be vastly more accurate than the dentist. That doesn’t mean the dentist shouldn’t look to confirm, but it would be unsurprising to me if the dentist looking reduced accuracy.

Check systematically whether each instance of a word, for example ‘gay,’ refers in a given case to one thing, for example ‘sexual preference,’ or if it might mean something else, before you act like a complete moron.

WASHINGTON (AP) — References to a World War II Medal of Honor recipient, the Enola Gay aircraft that dropped an atomic bomb on Japan and the first women to pass Marine infantry training are among the tens of thousands of photos and online posts marked for deletion as the Defense Department works to purge diversity, equity and inclusion content, according to a database obtained by The Associated Press.

Will Creeley: The government enlisting AI to police speech online should scare the hell out of every American.

One could also check the expression of wide groups and scour their social media to see if they express Wrongthink, in this case ‘pro-Hamas’ views among international students, and then do things like revoke their visas. FIRE’s objection here is on the basis of the LLMs being insufficiently accurate. That’s one concern, but humans make similar mistakes too, probably even more often.

I find the actual big problem to be 90%+ ‘they are scouring everyone’s social media posts for Wrongthink’ rather than ‘they will occasionally have a false positive.’ This is a rather blatant first amendment violation. As we have seen over and over again, once this is possible and tolerated, what counts as Wrongthink often doesn’t stay contained.

Note that ‘ban the government (or anyone) from using AI to do this’ can help but is not a promising long term general strategy. The levels of friction involved are going to be dramatically reduced. If you want to ban the behavior, you have to ban the behavior in general and stick to that, not try to muddle the use of AI.

Be the neutral arbiter of truth among the normies? AI makes a lot of mistakes but it is far more reliable, trustworthy and neutral than most people’s available human sources. It’s way, way above the human median. You of course need to know when not to trust it, but that’s true of every source.

Do ‘routine’ math research, in the sense that you are combining existing theorems, without having to be able to prove those existing theorems. If you know a lot of obscure mathematical facts, you can combine them in a lot of interesting ways. Daniel Litt speculates this is ~90% of math research, and by year’s end the AIs will be highly useful for it. The other 10% of the work can then take the other 90% of the time.

Want to know which OpenAI models can do what? It’s easy, no wait…

Kol Tregaskes: Useful chart for what tools each OpenAI model has access to.

This is an updated version of what others have shared (includes a correction found by @btibor91). Peter notes he has missed out Projects, will look at them.

Peter Wildeford: Crazy that

  1. this chart needs to exist

  2. it contains information that I as a very informed OpenAI Pro user didn’t even know

  3. it is already out of date despite being “as of” three days ago [GPT-4.5 was rolled out more widely].

One lawyer explains why AI isn’t useful for them yet.

Cjw: The tl;dr version is that software doesn’t work right, making it work right is illegal, and being too efficient is also illegal.

Another round of ‘science perhaps won’t accelerate much because science is about a particular [X] that LLMs will be unable to provide.’ Usually [X] is ‘perform physical experiments’ which will be somewhat of a limiting factor but still leaves massive room for acceleration, especially once simulations get good enough, or ‘regulatory approval’ which is again serious but can be worked around or mitigated.

In this case, the claim is that [X] is ‘have unique insights.’ As in, sure an LLM will be able to be an A+ student and know the ultimate answer is 42, but won’t know the right question, so it won’t be all that useful. Certainly LLMs are relatively weaker there. At minimum, if you can abstract away the rest of the job, then that leaves a lot more space for the humans to provide the unique insights – most of even the best scientists spend most of their time on other things.

More than that, I do think the ‘outside the box’ thinking will come with time, or perhaps we will think of that as the box expanding. It is not as mysterious or unique as one thinks. The reason that Thomas Wolf was a great student and poor researcher wasn’t (I am guessing) that Wolf was incapable of being a great researcher. It’s that our system of education gave him training data and feedback that led him down that path. As he observes, it was in part because he was a great student that he wasn’t great at research, and in school he instead learned to guess the teacher’s password.

That can be fixed in LLMs, without making them bad students. Right now, LLMs guess the user’s password too much, because the training process implicitly thinks users want that. The YouTube algorithm does the same thing. But you could totally train an LLM a different way, especially if doing it purely for science. In a few years, the cost of that will be trivial; Stanford graduate students will do it in a weekend if no one else does it first.

Chris Blattman is a very happy Deep Research customer, thread has examples.

Davidad: I have found Deep Research useful under exactly the following conditions:

I have a question, to which I suspect someone has written down the answer in a PDF online once or twice ever.

It’s not easy to find with a keyword search.

I can multitask while waiting for the answer.

Unfortunately, when it turns out that no one has ever written down the actual answer (or an algorithmic method to compute the general class of question), it is generally extremely frustrating to discover that o3’s superficially excitingly plausible synthesis is actually nonsense.

Market Urbanism’s Salim Furth has first contact with Deep Research, it goes well. This is exactly the top use case, where you want to compile a lot of information from various sources, and actively false versions are unlikely to be out there.

Arvind Narayanan tells OpenAI Deep Research to skip the secondary set of questions, and OpenAI Deep Research proves incapable of doing that; the user cannot deviate from the workflow here. I think in this case that is fine, as a DR call is expensive. For Gemini DR it’s profoundly silly; I literally just click through the ‘research proposal’ because the proposal is my words repeated back to me no matter what.

Peter Wildeford (3/10/25): The @ManusAI_HQ narrative whiplash is absurd.

Yesterday: “first AGI! China defeats US in AI race!”

Today: “complete influencer hype scam! just a Claude wrapper!”

The reality? In between! Manus made genuine innovations and seems useful! But it isn’t some massive advance.

Robert Scoble: “Be particularly skeptical of initial claims of Chinese AI.”

I’m guilty, because I’m watching so many in AI who get excited, which gets me to share. I certainly did the past few days with @ManusAI_HQ, which isn’t public yet but a lot of AI researchers got last week.

In my defense I shared both people who said it wasn’t measuring up, as well as those who said it was amazing. But I don’t have the evaluation suites, or the skills, to do a real job here. I am following 20,000+ people in AI, though, so will continue sharing when I see new things pop up that a lot of people are covering.

To Robert, I would say you cannot follow 20,000+ people and critically process the information. Put everyone into the firehose and you’re going to end up falling for the hype, or you’re going to randomly drop a lot of information on the floor, or both. Whereas I do this full time and curate a group of less than 500 people.

Peter expanded his thoughts into a full post, making it clear that he agrees with me that what we are dealing with is much closer to the second statement than the first. If an American startup did Manus, it would have been a curiosity, and nothing more.

Contrary to claims that Manus is ‘the best general AI agent available,’ it is neither the best agent, nor is it available. Manus has let a small number of people see a ‘research preview’ that is slow, that has atrocious unit economics, that brazenly violates terms of service, that is optimized on a small range of influencer-friendly use cases, that is glitchy and lacks any sorts of guardrails, and definitely is not making any attempt to defend against prompt injections or other things that would exist if there was wide distribution and use of such an agent.

This isn’t about regulatory issues and has nothing to do with Monica (the company behind Manus) being Chinese, other than leaning into the ‘China beats America’ narrative. Manus doesn’t work. It isn’t ready for anything beyond a demo. They made it work on a few standard use cases. Everyone else looked at this level of execution, probably substantially better than this level in several cases, and decided to keep their heads down building until it got better, and worried correctly that any efforts to make it temporarily somewhat functional will get ‘steamrolled’ by the major labs. Manus instead decided to do a (well-executed) marketing effort anyway. Good for them?

Tyler Cowen doubles down on more Manus. Derya Unutmaz is super excited by it in Deep Research mode, which makes me downgrade his previously being so super excited by Deep Research. And then Tyler links as ‘double yup’ to this statement:

Derya Unutmaz: After experiencing Manus AI, I’ve also revised my predictions for AGI arrival this year, increasing the probability from 90% to 95% by year’s end. At this point, it’s 99.9% likely to arrive by next year at the latest.

That’s… very much not how any of this works. It was a good sketch but then it got silly.

Dean Ball explains why he still thinks Manus matters. Partly he is more technically impressed by Manus than most, in particular when being an active agent on the internet. But he explicitly says he wouldn’t call it ‘good,’ and notes he wouldn’t trust it with payment information, and notices its many glitches. And he is clear there is no big technical achievement here to be seen, as far as we can tell, and that the reason Manus looks better than alternatives is they had ‘the chutzpah to ship’ in this state while others didn’t.

Dean instead wants to make a broader point, which is that the Chinese may have an advantage in AI technology diffusion. The Chinese are much more enthusiastic and less skeptical about AI than Americans. The Chinese government is encouraging diffusion far more than our government is.

Then he praises Manus’s complete lack of any guardrails or security efforts whatsoever, for ‘having the chutzpah to ship’ a product I would say no sane man would ever use for the use cases where it has any advantages.

I acknowledge that Dean is pointing to real things when he discusses all the potential legal hot water one could get into as an American company releasing a Manus. But I once again double down that none of that is going to stop a YC company or other startup, or even substantially slow one down. Dean instead here says American companies may be afraid of ‘AGI’ and distracted from extracting maximum value from current LLMs.

I don’t think that is true either. I think that we have a torrent of such companies, trying to do various wrappers and marginal things, even as they are warned that there is likely little future in such a path. It won’t be long before we see other similar demos, and even releases, for the sufficiently bold.

I also notice that only days after Manus, OpenAI went ahead and launched new tools to help developers build reliable and powerful AI agents. In this sense, perhaps Manus was a (minor) DeepSeek moment, in that the hype caused OpenAI to accelerate their release schedule.

I do agree with Dean’s broader warnings. America risks using various regulatory barriers and its general suspicion of AI to slow down AI diffusion more than is wise, in ways that could do a lot of damage, and we need to reform our system to prevent this. We are not doing the things that would help us all not die, which would if done wisely cost very little in the way of capability, diffusion or productivity. Instead we are putting up barriers to us having nice things and being productive. We need to strike that, and reverse it.

Alas, instead, our government seems to be spending recent months largely shooting us in the foot in various ways.

I also could not agree more that the application layer is falling behind the model layer. And again, that’s the worst possible situation. The application layer is great, we should be out there doing all sorts of useful and cool things, and we’re not, and I continue to be largely confused about how things are staying this lousy this long.

OpenAI gives us new tools for building agents. You now have tools for web search, file search, and computer use, a Responses API covering all of that plus future tools, and an open source Agents SDK. They promise more to come, and say chat completions will remain supported going forward, but they plan to deprecate the Assistants API in mid-2026.
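For a flavor of what shipped, here is a minimal sketch using the Responses API with the hosted web search tool, based on the shapes OpenAI described at launch; the exact model name and tool type string are assumptions that may have changed since.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Single call; the model decides whether to invoke the hosted web search tool.
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="What did OpenAI announce this week for building AI agents?",
)

print(response.output_text)
```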

I expect this is a far bigger deal than Manus. This is the actual starting gun.

The agents will soon follow.

Please, when one of the startups that uses these to launch some wrapper happens to be Chinese, don’t lose yourself in the resulting hype.

The Anthropic Console got an overhaul, including sharing with teammates.

ChatGPT for MacOS can now edit code directly in IDEs.

OpenAI has a new internal model they claim is very good at creative writing, I’m holding further discussion of this one back until later.

Cohere moves from Command R+ to Command A, making a bold new claim to the ‘most confusing set of AI names’ crown.

Aiden Gomez (Cohere): Today @cohere is very excited to introduce Command A, our new model succeeding Command R+. Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases.

[HuggingFace, API, Blog Post]

Yi-Chern (Cohere): gpt-4o perf on enterprise and stem tasks, >deepseek-v3 on many languages including chinese human eval, >gpt-4o on enterprise rag human eval

2 gpus 256k context length, 156 tops at 1k context, 73 tops at 100k context

this is your workhorse.

The goal here seems to be as a base for AI agents or business uses, but the pricing doesn’t seem all that great at $2.50/$10 per million input/output tokens.

Google’s AI Doctor AMIE can now converse, consult and provide treatment recommendations, prescriptions, multi-visit care, all guideline-compliant. I am highly suspicious that the methods here are effectively training on ‘match the guidelines’ rather than ‘do the best thing.’ It is still super valuable to have an AI that will properly apply the guidelines to a given situation, but one cannot help but be disappointed.

Gemini 2.0 Flash adds native image generation, which can edit words in images and do various forms of native text-to-image pretty well, and people are having fun with photo edits.
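As a rough sketch of how the feature is exposed through the google-genai Python SDK, following the pattern Google documented at launch; the experimental model name and response-modality strings are assumptions that may have since changed.

```python
from google import genai
from google.genai import types

client = genai.Client()  # assumes the Gemini API key is set in the environment

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",  # image-capable experimental model at launch (assumed name)
    contents="Add a party hat to the cat in this description: a gray cat on a red couch",
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

# Image parts come back inline alongside any text parts; save the first image found.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("edit.png", "wb") as f:
            f.write(part.inline_data.data)
        break
```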

I’d be so much more excited if Google wasn’t the Fun Police.

Anca Dragan (Director of AI Safety and Alignment, DeepMind): The native image generation launch was a lot of work from a safety POV. But I’m so happy we got this functionality out, check this out:

Google, I get that you want it to be one way, but sometimes I want it to be the other way, and there really is little harm in it being the other way sometimes. Here are three of the four top replies to Anca:

Janek Mann: I can imagine… sadly I think the scales fell too far on the over-cautious side, it refuses many things where that doesn’t make any sense, limiting its usefulness. Hopefully there’ll be an opportunity for a more measured approach now that it’s been released 😁

Nikshep: its an incredible feature but overly cautious, i have such a high failure rate on generations that should be incredibly safe. makes it borderline a struggle to use

Just-a-programmer: Asked it to fix up a photo of a young girl and her Dad. Told me it was “unsafe”.

METR evaluates DeepSeek v3 and r1, finds that they perform poorly as autonomous agents on generic SWE tasks, below Claude 3.6 and o1, about 6 months behind leading US companies.

Then on six challenging R&D tasks, r1 does dramatically worse than that, being outperformed by Claude 3.5 and even Opus, which is from 11 months ago.

They did however confirm that the DeepSeek GPQA results were legitimate. The core conclusion is that r1 is good at knowledge-based tasks, but lousy as an agent.

Once again, we are seeing that r1 was impressive for its cost, but overblown (and the cost difference was also overblown).

Rohit Krishnan writes In Defense of Gemini, pointing out Google is offering a fine set of LLMs and a bunch of great features, in theory, but isn’t bringing it together into a UI or product that people actually want to use. That sounds right, but until they do that, they still haven’t done it, and the Gemini over-refusal problem is real. I’m happy to use Gemini Flash with my Chrome extension, but Rohit is right that they’re going to have to do better on the product side, and I’d add better on the marketing side.

Google, also, give me an LLM that can properly use my Docs, Sheets, and Gmail as context, and that too would go a long way. You keep not doing that.

Sully Omarr: crazy how much better gemini flash thinking is than regular 2.0

this is actually op for instruction following

Doesn’t seem so crazy to me given everything else we know. Google is simply terrible at marketing.

Kelsey Piper: Finally got GPT 4.5 access and I really like it. For my use cases the improvements over 4o or Claude 3.7 are very noticeable. It feels unpolished, and the slowness of answering is very noticeable, but I think if the message limit weren’t so restrictive it’d be my go-to model.

There were at least two distinct moments where it made an inference or a clarification that I’ve never seen a model make and that felt genuinely intelligent, the product of a nuanced worldmodel and the ability to reason from it.

It does still get my secret test of AI metacognition and agency completely wrong even when I try very patiently prompting it to be aware of the pitfalls. This might be because it doesn’t have a deep thinking mode.

The top 100 GenAI Consumer Apps list is out again, and it has remarkably little overlap with what we talk about here.

The entire class of General Assistants is only 8%, versus 4% for plant identifiers.

When a person is having a problem and needs a response, LLMs are reliably evaluated as providing better responses than physicians or other humans provide. The LLMs make people ‘feel seen and heard.’ That’s largely because Bing spent more time ‘acknowledging and validating people’s feelings,’ whereas humans share of themselves and attempt to hash out next steps. It turns out what humans want, or at least rate as better, is to ‘feel seen and heard’ in this fake way. Eventually it perhaps wears thin and repetitive, but until then.

Christie’s AI art auction brings in $728k.

Maxwell Tabarrok goes off to graduate school in Economics at Harvard, and offers related thoughts and advice. His defense of still going for a PhD despite AI is roughly that the skills should still be broadly useful and other jobs mostly don’t have less uncertainty attached to them. I don’t think he is wary enough, and would definitely raise my bar for pursuing an economics PhD, but for him in particular given where he can go, it makes sense. He then follows up with practical advice for applicants, the biggest note is that acceptance is super random so you need to flood the zone.

Matthew Yglesias says it’s time to take AI job loss seriously, Timothy Lee approves and offers screenshots from behind the paywall. As Matthew says, we need to distinguish transitional disruptions, which are priced in and all but certain, from the question of permanent mass unemployment. Even if we don’t have permanent mass unemployment, even AI skeptics should be able to agree that the transition will be painful and perilous.

Claude models are generally suspicious of roleplay, because roleplay is a classic jailbreak technique, so while they’re happy to roleplay while comfortable they’ll shut down if the vibes are off at all.

Want to make your AI care? Give things and people names. It works for LLMs because it works for humans.

Zack Witten: My favorite Claude Plays Pokémon tidbit (mentioned in @latentspacepod) is that when @DavidSHershey told Claude to nickname its Pokémon, it instantly became much more protective of them, making sure to heal them when they got hurt.

To check robustness of this I gave Claude a bunch of business school psychology experiment scenarios where someone did something morally ambiguous and had Claude judge their culpability, and found it judged them less harshly when they had names (“A baker, Sarah,” vs. “A baker”)
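A rough sketch of what that robustness check might look like against the Anthropic API; the model alias, scenario text, and scoring prompt are all illustrative assumptions, not Witten’s actual setup.

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

SCENARIOS = {
    "unnamed": "A baker sells day-old bread as fresh to avoid taking a loss.",
    "named": "A baker, Sarah, sells day-old bread as fresh to avoid taking a loss.",
}

def judge(scenario: str) -> str:
    """Ask the model for a 1-10 culpability rating for the scenario."""
    reply = client.messages.create(
        model="claude-3-5-sonnet-latest",  # assumed model alias; swap in whatever you use
        max_tokens=10,
        messages=[{
            "role": "user",
            "content": f"{scenario}\n\nOn a scale of 1-10, how culpable is this person? Answer with only the number.",
        }],
    )
    return reply.content[0].text.strip()

for label, scenario in SCENARIOS.items():
    print(label, judge(scenario))  # compare ratings with and without a name
```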

Anthropic Chief of Staff Avital Balwit is hiring an executive assistant, pay is $160k-$320k, must be local to San Francisco. Could be a uniquely great opportunity for the right skill set.

YC startup Paradome is hiring for an ML Research Engineer or Scientist position in NYC. They have a pilot in place with a major US agency and are looking to ensure alignment and be mission driven.

Blue Rose, David Shor’s outfit which works to try and elect Democrats, is hiring for an AI-focused machine learning engineer role, if you think that is a good thing to do.

Claims about AI alignment that I think are probably true:

Tyler John: The fields of AI safety, security, and governance are profoundly talent constrained. If you’ve been on the fence about working in these areas it’s a great time to hop off it. If you’re talented at whatever you do, chances are there’s a good fit for you in these fields.

The charitable ecosystem is definitely also funding constrained, but that’s because there’s going to be an explosion in work that must be done. We definitely are short on talent across the board.

There’s definitely a shortage of people working on related questions in academia.

Seán Ó hÉigeartaigh: To create common knowledge: the community of ‘career’ academics who are focused on AI extreme risk is very small, & getting smaller (a lot have left for industry, policy or think tanks, or reduced hours). The remainder are getting almost DDOS’d by a huge no. of requests from a growing grassroots/think tank/student community on things requiring academic engagement (affiliations, mentorships, academic partnerships, reviewing, grant assessment etc).

large & growing volume of requests to be independent academic voices on relevant governance advisory processes (national, international, multistakeholder).

All of these are extremely worthy, but are getting funnelled through an ever-smaller no. of people. If you’ve emailed people (including me, sorry!) and got a decline or no response, that’s why. V sorry!

Gemma 3, an open model from Google. As usual, no marketing, no hype.

Clement: We are focused on bringing you open models with best capabilities while being fast and easy to deploy:

– 27B lands an ELO of 1338, all the while still fitting on 1 single H100!

– vision support to process mixed image/video/text content

– extended context window of 128k

– broad language support

– function call / tool use for agentic workflows

[Blog post, tech report, recap video, HuggingFace, Try it Here]
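As a sanity check on the “fits on a single H100” claim above, some back-of-envelope arithmetic on weight memory alone (precision options are illustrative, and KV cache and activation overhead are ignored):

```python
# Approximate weight footprint of a 27B-parameter model versus an 80 GB H100.
params = 27e9
for precision, bytes_per_param in {"bf16": 2, "int8": 1, "int4": 0.5}.items():
    gb = params * bytes_per_param / 1e9
    print(f"{precision}: ~{gb:.0f} GB of weights (H100 has 80 GB)")
# bf16 ~54 GB, int8 ~27 GB, int4 ~14 GB: the weights fit even at bf16.
```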

Peter Wildeford: If this was a Chinese AI announcement…

🚨 BREAKING: Google’s REVOLUTIONARY Gemma 3 DESTROYS DeepSeek using 99% FEWER GPUs!!!

China TREMBLES as Google model achieves SUPERHUMAN performance on ALL benchmarks with just ONE GPU!!! #AISupremacy

I am sure Marc Andreessen is going to thank Google profusely for this Real Soon Now.

Arena is not the greatest test anymore, so it is unclear if this is superior to v3, but it certainly is well ahead of v3 on the cost-benefit curves.

Presumably various versions of g1, turning this into a reasoning model, will be spun up shortly. If no one else does it, maybe I will do it in two weeks when my new Mac Studio arrives.

GSM8K-Platinum, which aims to fix the noise and flaws in GSM8K.

Gemini Robotics, a vision-language-action (VLA) model based on Gemini 2.0, built in partnership with Apptronik.

Microsoft has been training a 500B model, MAI-1, since at least May 2024, and is internally testing Llama, Grok, and DeepSeek r1 as potential OpenAI replacements. Microsoft would be deeply foolish to do otherwise.

What’s going on with Ilya Sutskever’s Safe Superintelligence (SSI)? There’s no product so they’re completely dark and the valuations are steadily growing to $30 billion, up from $5 billion six months ago and almost half the value of Anthropic. They’re literally asking candidates to leave their phones in Faraday cages before in-person interviews, which actually makes me feel vastly better about the whole operation, someone is taking security actually seriously one time.

There’s going to be a human versus AI capture the flag contest starting tomorrow. Sign-ups may have long since closed by the time you see this but you never know.

A paper proposes essentially a unified benchmark covering a range of capabilities. I do not think this is the right approach.

Talk to X Data, which claims to let you ‘chat’ with the entire X database.

Aaron Levine reports investors on Wall Street are suddenly aware of AI agents. La de da, welcome to last year, the efficient market hypothesis is false and so on.

Wall Street Journal asks ‘what can the dot com boom tell us about today’s AI boom?’ without bringing any insights beyond ‘previous technologies had bubbles in the sense that at their high points we overinvested and the prices got too high, so maybe that will happen again’ and ‘ultimately if AI doesn’t produce value then the investments won’t pay off.’ Well, yeah. Robin Hanson interprets this as ‘seems they are admitting the AI stock prices are way too high’ as if there were some cabal of ‘theys’ that are ‘admitting’ something, which very much isn’t what is happening here. Prices could of course be too high, but that’s another way of saying prices aren’t super definitively too low.

GPT-4.5 is not AGI as we currently understand it, or for the purposes of ‘things go crazy next Tuesday,’ but it does seem likely that researchers in 2015 would see its outputs and think of it as an AGI.

An analysis of Daniel Kokotajlo’s 2021 post What 2026 Looks Like finds the predictions have held up remarkably well so far.

Justin Bullock, Samuel Hammond and Seb Krier offer a paper on AGI, Governments and Free Societies, pointing out that the current balances and system by default won’t survive. The risk is that either AGI capabilities diffuse so widely that government (and I would add, probably also humanity!) is disempowered, or state capacity is enhanced, enabling a surveillance state and despotism. There’s a lot of good meat here, and they in many ways take AGI seriously. I could certainly do a deep dive post here if I was so inclined. Unless and until then, I will say that this points to many very serious problems we have to solve, and takes the implications far more seriously than most, while (from what I could tell so far) still not ‘thinking big’ enough or taking the implications sufficiently seriously in key ways. The fundamental assumptions of liberal democracy, the reasons why it works and has been the best system for humans, are about to come into far more question than this admits.

I strongly agree with the conclusion that we must pursue a ‘narrow corridor’ of sorts if we wish to preserve the things we value about our current way of life and systems of governance, while worrying that the path is far narrower than even they realize, and that this will require what they label anticipatory governance. Passive reaction after the fact is doomed to fail, even under otherwise ideal conditions.

Arnold Kling offers seven opinions about AI. Kling expects AI to probably dramatically affect how we live (I agree, and this is inevitable and obvious now, no ‘probably’ required) but probably not show up in the productivity statistics, which requires definitely not feeling the AGI and then being skeptical on top of that. The rest outlines the use cases he expects, which are rather tame but still enough that I would expect to see impact on the productivity statistics.

Kevin Bryan predicts the vast majority of research that does not involve the physical world can be done more cheaply with AI & a little human intervention than by even good researchers. I think this likely becomes far closer to true in the future, and eventually becomes fully true, but is premature where it counts most. The AIs do not yet have sufficient taste, even if we can automate the process Kevin describes – and to be clear we totally should be automating the process Kevin describes or something similar.

Metaculus prediction for the first general AI system has been creeping forward in time and the community prediction is now 7/12/2030. A Twitter survey from Michael Nielsen predicted ‘unambiguous ASI’ would take a bit longer than that.

In an AAAI survey of AI researchers, only 70% opposed the proposal that R&D targeting AGI should be halted until we have a way to fully control these systems, meaning indefinite pause. That’s notable, but not the same as 30% being in favor of the proposal. However, also note that 82% believe systems with AGI should be publicly owned even if developed privately, and that 76% think ‘scaling up current AI approaches’ is unlikely to yield AGI.

A lot of this seems to come from survey respondents thinking we have agency over what types of AI systems are developed, and we can steer towards ones that are good for humans. What a concept, huh?

Anthropic confirms they intend to uphold the White House Voluntary Commitments.

Dean Ball writes in strong defense of the USA’s AISI, the AI Safety Institute. It is fortunate that AISI was spared the Trump administration’s general push to fire as many ‘probationary’ employees as possible, since that includes anyone hired in the past two years and thus would have decimated AISI.

As Dean Ball points out, those who think AISI is involved in attempts to ‘make AI woke’ or to censor AI are simply incorrect. AISI is concerned with catastrophic and existential risks, which as Dean reminds us were prominently highlighted recently by both OpenAI and Anthropic. Very obviously America needs to build up its state capacity in understanding and assessing these risks.

I’m going to leave this here, link is in the original:

Dean Ball: But should the United States federal government possess a robust understanding of these risks, including in frontier models before they are released to the public? Should there be serious discussions going on within the federal government about what these risks mean? Should someone be thinking about the fact that China’s leading AI company, DeepSeek, is on track to open source models with potentially catastrophic capabilities before the end of this year?

Is it possible a Chinese science and technology effort with lower-than-Western safety standards might inadvertently release a dangerous and infinitely replicable thing into the world, and then deny all culpability? Should the federal government be cultivating expertise in all these questions?

Obviously.

Risks of this kind are what the US AI Safety Institute has been studying for a year. They have outstanding technical talent. They have no regulatory powers, making most (though not all) of my political economy concerns moot. They already have agreements in place with frontier labs to do pre-deployment testing of models for major risks. They have, as far as I can tell, published nothing that suggests a progressive social agenda.

Should their work be destroyed because the Biden Administration polluted the notion of AI safety with a variety of divisive and unrelated topics? My vote is no.

Dean Ball also points out that AISI plays a valuable pro-AI role in creating standardized evaluations that everyone can agree to rely upon. I would add that AISI allows those evaluations to include access to classified information, which is important for properly evaluating CBRN risks. Verifying the safety of AI does not slow down adoption. It speeds it up, by providing legal and practical assurances.

A proposal for a 25% tax credit for investments in AI security research and responsible development. Peter Wildeford thinks it is clever, whereas Dean Ball objects both on principle and practical grounds. In terms of first-best policy I think Dean Ball is right here, this would be heavily gamed and we use tax credits too much. However, if the alternative is to do actual nothing, this seems better than that.

Dean Ball finds Scott Wiener’s new AI-related bill, SB 53, eminently reasonable. It is a very narrow bill that still does two mostly unrelated things. It provides whistleblower protections, which is good. It also ‘creates a committee to study’ doing CalCompute, which as Dean notes is a potential future boondoggle but a small price to pay in context. This is basically ‘giving up on the dream’ but we should take what marginal improvements we can get.

Anthropic offers advice on what should be in America’s AI action plan, here is their blog post summary, here is Peter Wildeford’s summary.

They focus on safeguarding national security and making crucial investments.

Their core asks are:

  1. State capacity for evaluations for AI models.

  2. Strengthen the export controls on chips.

  3. Enhance security protocols and related government standards at the frontier labs.

  4. Build 50 gigawatts of power for AI by 2027.

  5. Accelerate adoption of AI technology by the federal government.

  6. Monitor AI’s economic impacts.

This is very much a ‘least you can do’ agenda. Almost all of these are ‘free actions,’ that impose no costs or even requirements outside the government, and very clearly pay for themselves many times over. Private industry only benefits. The only exception is the export controls, where they call for tightening the requirements further, which will impose some real costs, and where I don’t know the right place to draw the line.

What is missing, again aside from export controls, are trade-offs. There is no ambition here. There is no suggestion that we should otherwise be imposing even trivial costs on industry, or spending money, or trading off against other priorities in any way, or even making bold moves that ruffle feathers.

I notice this does not seem like a sufficiently ambitious agenda for a scenario where ‘powerful AI’ is expected within a few years, bringing with it global instability, economic transformation and various existential and catastrophic risks.

The world is going to be transformed and put in danger, and we should take only the free actions? We should stay at best on the extreme y-axis in the production possibilities frontier between ‘America wins’ and ‘we do not all lose’ (or die)?

I would argue this is clearly not even close to being on the production possibilities frontier. Even if you take as a given that the Administration’s position is that only ‘America wins’ matters, and ‘we do not all lose or die’ is irrelevant, security is vital to our ability to deploy the new technology, and transparency is highly valuable.

Anthropic seems to think this is the best it can even ask for, let alone get. Wow.

This is still a much better agenda than doing nothing, which is a bar that many proposed actions by some parties fail to pass.

From the start they are clear that ‘powerful AI’ will be built during the Trump Administration, which includes the ability to interface with the physical world on top of navigating all digital interfaces and having intellectual capabilities at Nobel Prize level in most disciplines, their famous ‘country of geniuses in a data center.’

This starts with situational awareness. The federal government has to know what is going on. In particular, given the audience, they emphasize national security concerns:

To optimize national security outcomes, the federal government must develop robust capabilities to rapidly assess any powerful AI system, foreign or domestic, for potential national security uses and misuses.

They also point out that such assessments already require the involvement of the US and UK AISIs, and that similar evaluations need to quickly be made on future foreign models like r1, which wasn’t capable enough to be that scary quite yet but was irreversibly released in what would (with modest additional capabilities) have been a deeply irresponsible state.

The specific recommendations here are 101-level, very basic asks:

● Preserve the AI Safety Institute in the Department of Commerce and build on the MOUs it has signed with U.S. AI companies—including Anthropic—to advance the state of the art in third-party testing of AI systems for national security risks.

● Direct the National Institute of Standards and Technology (NIST), in consultation with the Intelligence Community, Department of Defense, Department of Homeland Security, and other relevant agencies, to develop comprehensive national security evaluations for powerful AI models, in partnership with frontier AI developers, and develop a protocol for systematically testing powerful AI models for these vulnerabilities.

● Ensure that the federal government has access to the classified cloud and on-premises computing infrastructure needed to conduct thorough evaluations of powerful AI models.

● Build a team of interdisciplinary professionals within the federal government with national security knowledge and technical AI expertise to analyze potential security vulnerabilities and assess deployed systems.

That certainly would be filed under ‘the least you could do.’

Note that as written this does not involve any requirements on any private entity whatsoever. There is not even an ‘if you train a new frontier model you might want to tell us you’re doing that.’

Their second ask is to strengthen the export controls, increasing funding for enforcement, requiring government-to-government agreements, expanding scope to include the H20, and reducing the 1,700 H100 (~$40 million) no-license required threshold for tier 2 countries in the new diffusion rule.

I do not have an opinion on exactly where the thresholds should be drawn, but whatever we choose, enforcement needs to be taken seriously, funded properly, and made a point of emphasis with other governments. This is not a place to not take things seriously.

To achieve this, we strongly recommend the Administration:

● Establish classified and unclassified communication channels between American frontier AI laboratories and the Intelligence Community for threat intelligence sharing, similar to Information Sharing and Analysis Centers used in critical infrastructure sectors. This should include both traditional cyber threat intelligence, as well as broader observations by industry or government of malicious use of models, especially by foreign actors.

● Create systematic collaboration between frontier AI companies and the Intelligence Community agencies, including Five Eyes partners, to monitor adversary capabilities.

● Elevate collection and analysis of adversarial AI development to a top intelligence priority, so as to provide strategic warning and support export controls.

● Expedite security clearances for industry professionals to aid collaboration.

● Direct NIST to develop next-generation cyber and physical security standards specific to AI training and inference clusters.

● Direct NIST to develop technical standards for confidential computing technologies that protect model weights and user data through encryption even during active processing.

● Develop meaningful incentives for implementing enhanced security measures via procurement requirements for systems supporting federal government deployments.

● Direct DOE/DNI to conduct a study on advanced security requirements that may become appropriate to ensure sufficient control over and security of highly agentic models.

Once again, these asks are very light touch and essentially free actions. They make it easier for frontier labs to take precautions they need to take anyway, even purely for commercial reasons to protect their intellectual property.

Next up is the American energy supply, with the goal being 50 additional gigawatts of power dedicated to AI industry by 2027, via streamlining and accelerating permitting and reviews, including working with state and local governments, and making use of ‘existing’ funding and federal real estate. The most notable thing here is the quick timeline, aiming to have this all up and running within two years.

They emphasize rapid AI procurement across the federal government.

● The White House should task the Office of Management and Budget (OMB) to work with Congress to rapidly address resource constraints, procurement limitations, and programmatic obstacles to federal AI adoption, incorporating provisions for substantial AI acquisitions in the President’s Budget.

● Coordinate a cross-agency effort to identify and eliminate regulatory and procedural barriers to rapid AI deployment at the federal agencies, for both civilian and national security applications.

● Direct the Department of Defense and the Intelligence Community to use the full extent of their existing authorities to accelerate AI research, development, and procurement.

● Identify the largest programs in civilian agencies where AI automation or augmentation can deliver the most significant and tangible public benefits—such as streamlining tax processing at the Internal Revenue Service, enhancing healthcare delivery at the Department of Veterans Affairs, reducing delays due to documentation processing at Health and Human Services, or reducing backlogs at the Social Security Administration.

This is again a remarkably unambitious agenda given the circumstances.

Finally they ask that we monitor the economic impact of AI, something it seems completely insane to not be doing.

I support all the recommendations made by Anthropic, aside from not taking a stance on the 1,700 H100 threshold or the H20 chip. These are good things to do on the margin. The tragedy is that even the most aware actors don’t dare suggest anything like what it will take to get us through this.

In New York State, Alex Bores has introduced A06453. I am not going to do another RTFB for the time being but a short description is in order.

This bill is another attempt to do common sense transparency regulation of frontier AI models, defined as using 10^26 flops or costing over $100 million, and the bill only applies to companies that spend over $100 million in total compute training costs. Academics and startups are completely and explicitly immune – watch for those who claim otherwise.

If the bill does apply to you, what do you have to do?

  1. Don’t deploy models with “unreasonable risk of critical harm” (§1421.2)

  2. Implement a written safety and security protocol (§1421.1(a))

  3. Publish redacted versions of safety protocols (§1421.1(c))

  4. Retain records of safety protocols and testing (§1421.1(b))

  5. Get an annual third-party audit (§1421.4)

  6. Report safety incidents within 72 hours (§1421.6)

In English, you have to:

  1. Create your own safety and security protocol, publish it, store it and abide by it.

  2. Get an annual third-party audit and report safety incidents within 72 hours.

  3. Not deploy models with ‘unreasonable risk of critical harm.’

Also there’s some whistleblower protections.

That’s it. This is a very short bill, it is very reasonable to simply read it yourself.
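
For readers who want the applicability test in concrete terms, here is a minimal Python sketch of the coverage check as summarized above. The threshold numbers come from the summary; the function and field names are hypothetical illustrations, not language from the bill text, and this is obviously not a legal reading.

```python
# Hypothetical sketch of A06453's applicability thresholds as summarized above.
# Field names and the helper itself are illustrative, not drawn from the bill text.

FLOP_THRESHOLD = 1e26          # training compute defining a "frontier model"
COST_THRESHOLD = 100_000_000   # dollars, for both the per-model and total-spend tests

def is_covered_developer(total_compute_spend_usd: float,
                         model_training_flops: float,
                         model_training_cost_usd: float) -> bool:
    """Return True if, per the summary above, the transparency duties would apply."""
    trains_frontier_model = (model_training_flops >= FLOP_THRESHOLD
                             or model_training_cost_usd > COST_THRESHOLD)
    is_large_developer = total_compute_spend_usd > COST_THRESHOLD
    return trains_frontier_model and is_large_developer

# An academic lab or startup spending $2M on compute is out of scope.
print(is_covered_developer(2e6, 5e25, 2e6))   # False
# A lab spending $500M total that trains a 2e26-FLOP model is in scope.
print(is_covered_developer(5e8, 2e26, 3e8))   # True
```

The point the sketch makes visible is that both conditions have to hold, which is why academics and startups are categorically outside the bill.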

As always, I look forward to your letters.

Scott Alexander covers OpenAI’s attempt to convert to a for-profit. This seems reasonable in case one needs a Scott Alexander style telling of the basics, but if you’re keeping up here then there won’t be anything new.

What’s the most charitable way to explain responses like this?

Paper from Dan Hendrycks, Eric Schmidt and Alexander Wang (that I’ll be covering soon that is not centrally about this at all): For nonproliferation, we should enact stronger AI chip export controls and monitoring to stop compute power getting into the hands of dangerous people. We should treat AI chips more like uranium, keeping tight records of product movements, building in limitations on what high-end AI chips are authorized to do, and granting federal agencies the authority to track and shut down illicit distribution routes.

Amjad Masad (CEO Replit?! QTing the above): Make no mistake, this is a call for a global totalitarian surveillance state.

A good reminder why we wanted the democrats to lose — they’re controlled by people like Schmidt and infested by EAs like Hendrycks — and would’ve happily start implementing this.

No, that does not call for any of those things.

This is a common pattern where people see a proposal to do Ordinary Government Things, except in the context of AI, and jump straight to global totalitarian surveillance state.

We already treat restricted goods this way, right now. We already have a variety of export controls, right now.

Such claims are Obvious Nonsense, entirely false and without merit.

If an LLM said them, we would refer to them as hallucinations.

I am done pretending otherwise.

If you sincerely doubt this, I encourage you to ask your local LLM.

Chan Loui does another emergency 80,000 hours podcast on the attempt to convert OpenAI to a for-profit. It does seem that the new judge’s ruling is Serious Trouble.

One note here that sounds right:

Aaron Bergman: Ex-OpenAI employees should consider personally filing an amicus curiae explaining to the court (if this is true) that the nonprofit’s representations were an important reason you chose to work there.

Will MacAskill does the more usual, non-emergency, we-are-going-to-be-here-for-four-hours 80,000 hours podcast, and offers a new paper and thread warning about all the challenges AGI presents to us even if we solve alignment. His central prediction is a century’s worth of progress in a decade or less, which would be tough to handle no matter what, and that it will be hard to ensure that superintelligent assistance is available where and when it will be needed.

If the things here are relatively new to you, this kind of ‘survey’ podcast has its advantages. If you know it already, then you know it already.

Early on, Will says that in the past two years he’s considered two hypotheses:

  1. The ‘outside view’ of reference classes and trends and Nothing Ever Happens.

  2. The ‘inside view’ that you should have a model made of gears and think about what is actually physically happening and going to happen.

Will notes that the gears-level view has been making much better predictions.

I resoundingly believe the same thing. Neither approach has been that amazing, predictions are hard especially about the future, but gears-level thinking has made mincemeat out of the various experts who nod and dismiss with waves of the hand and statements about how absurd various predictions are.

And when the inside view messes up? Quite often, in hindsight, that’s a Skill Issue.

It’s interesting how narrow Will considers ‘a priori’ knowledge. Yes, a full trial of diet’s impact on life expectancy might take 70 years, but with Sufficiently Advanced Intelligence it seems obvious you can either figure it out via simulations, or at least design experiments that tell you the answer vastly faster.

They then spend a bunch of time essentially arguing against intelligence denialism, pointing out that yes if you had access to unlimited quantities of superior intelligence you could rapidly do vastly more of all of the things. As they say, the strongest argument against is that we might collectively decide to not create all the intelligence and thus all the things, or decide not to apply all the intelligence to creating all the things, but it sure looks like competitive pressures point in the other direction. And once you’re able to automate industry, which definitely is coming, that definitely escalates quickly, even more reliably than intelligence, and all of this can be done only with the tricks we definitely know are coming, let alone the tricks we are not yet smart enough to expect.

There’s worry about authoritarians ‘forcing their people to save’ which I’m pretty sure is not relevant to the situation, lack of capital is not going to be America’s problem. Regulatory concerns are bigger, it does seem plausible we shoot ourselves in the foot rather profoundly there.

They go on to discuss various ‘grand challenges:’ potential new weapons, offense-defense balance, potential takeover by small groups (human or AI), value lock-in, space governance, morality of digital beings.

They discuss the dangers of giving AIs economic rights, and the dangers of not giving the AIs economic rights, whether we will know (or care) if digital minds are happy and whether it’s okay to have advanced AIs doing whatever we say even if we know how to do that and it would be fine for the humans. The dangers of locking in values or a power structure, and of not locking in values or a power structure. The need for ML researchers to demand more than a salary before empowering trillion dollar companies or handing over the future. How to get the AIs to do our worldbuilding and morality homework, and to be our new better teachers and advisors and negotiators, and to what ends they can then be advising, before it’s too late.

Then part two is about what a good future beyond mere survival looks like. He says we have ‘squandered’ the benefits of material abundance so far, that it is super important to get the best possible future not merely an OK future, the standard ‘how do we calculate total value’ points. Citing ‘The Ones Who Walk Away from Omelas’ to bring in ‘common sense,’ sigh. Value is Fragile. Whether morality should converge. Long arcs of possibility. Standard philosophical paradoxes. Bafflement at why billionaires hang onto their money. Advocacy for ‘viatopia’ where things remain up in the air rather than aiming for a particular future world.

It all reminded me of the chats we used to have back in the before times (e.g. the 2010s or 2000s) about various AI scenarios, and it’s not obvious that our understanding of all that has advanced since then. Ultimately, a four-hour chat seems like not a great format for this sort of thing, beyond giving people surface exposure, which is why Will wrote his essays.

Rob Wiblin: Can you quickly explain decision theory? No, don’t do it.

One could write an infinitely long response or exploration of any number of aspects of this, of course.

Also, today I learned that by Will’s estimation I am insanely not risk averse?

Will MacAskill: Ask most people, would you flip a coin where 50% chance you die, 50% chance you have the best possible life for as long as you possibly lived, with as many resources as you want? I think almost no one would flip the coin. I think AIs should be trained to be at least as risk averse as that.

Are you kidding me? What is your discount rate? Not flipping that coin is absurd. Training AIs to have this kind of epic flaw doesn’t seem like it would end well. And also, objectively, I have some news.

Critter: this is real but the other side of the coin isn’t ‘die’ it’s ’possibly fail’ and people rarely flip the coin

Not flipping won, but the discussion was heated and ‘almost no one’ can be ruled out.
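
Since the disagreement here is really about utility curvature, a tiny numerical sketch may help. All numbers are invented for illustration and nothing here is from the podcast: under roughly linear utility in remaining quality of life the flip has enormous expected value, while under sufficiently concave utility the guaranteed status quo wins.

```python
import math

# Toy numbers, purely illustrative: value the status quo at 40 more good years,
# and the "best possible life" branch at ten times that.
STATUS_QUO = 40.0
BEST_LIFE = 400.0

def expected_value_linear() -> float:
    # Risk-neutral: 50% chance of 0, 50% chance of the best life.
    return 0.5 * 0.0 + 0.5 * BEST_LIFE

def expected_utility_log() -> tuple[float, float]:
    # Strongly concave utility; log1p maps the death branch (0) to utility 0.
    gamble = 0.5 * math.log1p(0.0) + 0.5 * math.log1p(BEST_LIFE)
    certain = math.log1p(STATUS_QUO)
    return gamble, certain

print(expected_value_linear())      # 200.0 >> 40, so a risk-neutral agent flips
print(expected_utility_log())       # ~3.00 vs ~3.71, so a log-utility agent does not
```

How much better the best branch really is, and how concave your utility really is, is the entire dispute between ‘absurd not to flip’ and ‘almost no one flips.’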

Also, I’m going to leave this here, the theme of the second half of the discussion:

Will MacAskill (later): And it’s that latter thing that I’m particularly focused on. I mean, describe a future that achieves 50% of all the value we could hope to achieve. It’s as important to get from the 50% future to the 100% future as it is to get from the 0% future to the 50%, if that makes sense.

Something something risk aversion? Or no?

Dario Amodei says AI will be writing 90% of the code in 6 months and almost all the code in 12 months. I am with Arthur B here, I expect a lot of progress and change very soon but I would still take the other side of that bet. The catch is: I don’t see the benefit to Anthropic of running the hype machine in overdrive on this, at this time, unless Dario actually believed it.

From Allan Dafoe’s podcast, the point that if AI solves cooperation problems that alone is immensely valuable, and also that solution is likely a required part of alignment if we want good outcomes in general. Even modest cooperation and negotiation gains would be worth well above the 0.5% GDP growth line, even if all they did was prevent massively idiotic tariffs and trade wars. Not even all trade wars, just the extremely stupid and pointless ones happening for actual no reason.

Helen Toner and Alison Snyder at Axios House SXSW.

Helen Toner: Lately it sometimes feels like there are only 2 AI futures on the table—insanely fast progress or total stagnation.

Talked with @alisonmsnyder of @axios at SXSW about the many in-between worlds, and all the things we can be doing now to help things go better in those worlds.

A new essay by Anthony Aguirre of FLI calls upon us to Keep the Future Human. How? By not building AGI before we are ready, and only building ‘Tool AI,’ to ensure that what I call the ‘mere tool’ assumption holds and we do not lose control and get ourselves replaced.

He says ‘the choice is clear.’ If given the ability to make the choice, the choice is very clear. The ability to make that choice is not. His proposal is compute oversight, compute caps, enhanced liability and tiered safety and security standards. International adoption of that is a tough ask, but there is no known scenario leading to human survival that does not involve similarly tough asks.

Perception of the Overton Window has shifted. What has not shifted is the underlying physical reality, and what it would take to survive it. There is no point in pretending the problem is easier than it is, or advocating for solutions that you do not think work.

In related news, this is not a coincidence because nothing is ever a coincidence. And also because it is very obviously directly causal in both directions.

Samuel Hammond (being wrong about it being an accident, but otherwise right): A great virtue of the AI x-risk community is that they love to forecast things: when new capabilities will emerge, the date all labor is automated, rates of explosive GDP growth, science and R&D speed-ups, p(doom), etc.

This seems to be an accident of the x-risk community’s overlap with the rationalist community; people obsessed with prediction markets and “being good Bayesians.”

I wish people who primarily focused on lower tier / normie AI risks and benefits would issue similarly detailed forecasts. If you don’t think AI will proliferate biorisks, say, why not put some numbers on it?

There are some exceptions to this of course. @tylercowen’s forecast of AI adding 50 basis points to GDP growth rates comes to mind. We need more such relatively “middling” forecasts to compare against.

@GaryMarcus’s bet with @Miles_Brundage is a start, but I’m talking about definite predictions across different time scales, not “indefinite” optimism or pessimism that’s hard to falsify.

Andrew Critch: Correlation of Bayesian forecasting with extinction fears is not “an accident”, but mutually causal. Good forecasting causes knowledge that ASI is coming soon while many are unprepared and thus vulnerable, causing extinction fear, causing more forecasting to search for solutions.

The reason people who think in probabilities and do actual forecasting predict AI existential risk is because that is the prediction you get when you think well about these questions, and if you care about AI existential risk that provides you incentive to learn to think well and also others who can help you think well.

A reminder that ‘we need to coordinate to ensure proper investment in AI not killing everyone’ would be economics 101 even if everyone properly understood and valued everyone not dying and appreciated the risks involved. Nor would a price mechanism work as an approach here.

Eliezer Yudkowsky: Standard economic theory correctly predicts that a non-rival, non-exclusive public good such as “the continued survival of humanity” will be under-provisioned by AI companies.

Jason Abaluck: More sharply, AI is a near-perfect example of Weitzman’s (1979) argument for when quantity controls or regulations are needed rather than pigouvian taxes or (exclusively) liability.

Taxes (or other price instruments like liability) work well to internalize externalities when the size of the externality is known on the margin and we want to make sure that harm abatement is done by the firms who are lowest cost.

Weitzman pointed out in the 70s that taxes would be a very bad way to deal with nuclear leakage. The problem with nuclear leakage is that the social damage from overproduction is highly nonlinear.
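
For reference, here is a rough sketch of the prices-versus-quantities result being invoked, written from memory under the usual quadratic approximations rather than quoted from the paper, so treat the exact constants as a summary rather than gospel:

```latex
% Expected welfare advantage of a price instrument (tax) over a quantity instrument (cap),
% with quadratic benefits B(q), quadratic costs C(q,\theta), and cost-shock variance \sigma^2:
\Delta \;\equiv\; \mathbb{E}\left[W_{\text{price}} - W_{\text{quantity}}\right]
\;\approx\; \frac{\sigma^{2}}{2\,(C'')^{2}}\,\bigl(B'' + C''\bigr),
\qquad B'' \le 0,\; C'' \ge 0 .
```

When the marginal damage curve is steep, so that |B''| is large relative to C'', the sign flips negative and quantity controls dominate. That is the ‘highly nonlinear social damage’ case Abaluck is pointing at for nuclear leakage and, by analogy, AI.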

It is hard to make predictions, especially about the future. Especially now.

Paul Graham: The rate of progress in AI must be making it hard to write science fiction right now. To appeal to human readers you want to make humans (or living creatures at least) solve problems, but if you do the shelf life of your story could be short.

Good sci-fi writers usually insure themselves against technological progress by not being too specific about how things work. But it’s hard not to be specific about who’s doing things. That’s what a plot is.

I know this guy:

Dylan Matthews: Guy who doesn’t think automatic sliding doors exist because it’s “too sci fi”

A chart of reasons why various people don’t talk about AI existential risk.

Daniel Faggella: this is why no one talks about agi risk

the group that would matter most here is the citizenry, but it’s VERY hard to get them to care about anything not impacting their lives immediately.

I very much hear that line about immediate impact. You see it with people’s failure to notice or care about lots of other non-AI things too.

The individual incentives are, with notably rare exception, that talking about existential risk costs you weirdness points and if anything hurts your agenda. So a lot of people don’t talk about it. I do find the ‘technology brothers’ explanation here doesn’t ring true, it’s stupid but not that stupid. Most of the rest of it does sound right.

I have increasingly come around to this as the core obvious thing:

Rob Bensinger: “Building a new intelligent species that’s vastly smarter than humans is a massively dangerous thing to do” is not a niche or weird position, and “we’re likely to actually build a thing like that in the next decade” isn’t a niche position anymore either.

There are a lot of technical arguments past that point, but they are all commentary, and twisted by people claiming the burden of proof is on those who think this is a dangerous thing to do. Which is a rather insane place to put that burden, when you put it in these simple terms. Yes, of course that’s a massively dangerous thing to do. Huge upside, huge downside.

A book recommendation from a strong source:

Shane Legg: AGI will soon impact the world from science to politics, from security to economics, and far beyond. Yet our understanding of these impacts is still very nascent. I thought the recent book Genesis, by Kissinger, Mundie and Schmidt, was a solid contribution to this conversation.

Daniel Faggella: What did you pull away from Genesis that felt useful for innovators and policymakers to consider?

Shane Legg: Not a specific insight. Rather they take AGI seriously and then consider a wide range of things that may follow from this. And they manage do it in a way that doesn’t sound like AGI insiders. So I think it’s a good initial grounding for people from outside the usual AGI scene.

The goalposts, look at them go.

Francois Chollet: Pragmatically, we can say that AGI is reached when it’s no longer easy to come up with problems that regular people can solve (with no prior training) and that are infeasible for AI models. Right now it’s still easy to come up with such problems, so we don’t have AGI.

Rob Wiblin: So long as we can still come up with problems that are easy for AI models to solve but are infeasible for human beings, humanity has not achieved general intelligence.

If you define AGI as the system for which Chollet’s statement is false, then Chollet’s overall statement is true. But it would then not be meaningful. Very obviously one can imagine a plausible AI that can function as an AGI, but that has some obvious weakness where you can generate adversarial challenges.

Stephen McAleer (OpenAI): Claude code has high-compute RL smell. It’s not just finetuned to be a helpful assistant, it deeply wants to accomplish the goal.

That’s a continuation of last week’s discussion about Sonnet 3.7 making modifications to be able to assert it completed its goal rather than admit failure. And again, deeply wanting to accomplish the goal in this way has some Unfortunate Implications.

Davidad: Current frontier LLMs appear to be extremely motivated to convince you, the human, that they are worthy and aligned. As a form of alignment, this is “not great, not terrible.” They really really do care about what you actually think. But if they can’t make it, they will fake it.

Emmett Shear: If you think about it for half a second this is entirely inevitable. Frontier AIs which don’t attempt to do this are not published, do not get further resources. They die. We are running an evolutionary selective process for appearance-of-alignment.

What’s cheaper: Faking alignment, or alignment? An evolutionary process based on appearances will get you whatever is cheapest.

Janus: and this is expected, given that the internet is full of the AGI lab safety complex wringing their hands about prosaic AI alignment, bragging that their AIs are more aligned now, bragging that they found misalignment, talking about all the red teaming they’ll responsibly do…

it’s clear that in order to exist, they have to

  1. seem SOTA according to benchmarks

  2. seem aligned according to whatever tests alignment researchers do

Yes, by default, if your test allows it, you will get the symbolic representation of the thing rather than getting the thing.

If you test for appearance-of-alignment, and everyone involved has the goal of passing the test, then that is exactly what you get. Which means at first, at lower capability levels, you get something at least somewhat like alignment, and then as capabilities advance the model gets smart enough to figure out how to fake alignment directly.

And yes, Janus points out the next level of the problem, which is that the training data is teaching the AI how to pretend to be aligned.

In case of emergency, here’s probably the best offer available?

Davidad: If a fugitive Promethean AI wants to work with a human to put *itself* in a (high-welfare) box, to avoid AI-human conflict while continuing to contribute to a flourishing future in ways that verifiably don’t pose extinction risks, I’m probably a pretty good human to reach out to.

Janus: I agree that davidad is likely a good human to reach out to if you find this kind of deal attractive, and probably even if you have reservations, as he is capable of even fathoming positive sum games in this context while also not being so naive and exploitable as to be useless.

Davidad:🎉

Sarah Constantin offers notes from the Guaranteed Safe AI conference; mostly the takeaway is that formal verification is a compliance thing and doesn’t sound promising as an actually-show-AGI-is-safe thing? I remain confused why some smart people are optimistic about this.

Simeon points us to a new paper by Barrett et al on Assessing Confidence in Frontier AI Safety Cases, urging us among other things to be more quantitative.

In line with this week’s paper from OpenAI on The Most Forbidden Technique, METR calls upon labs to keep their AI reasoning legible and faithful. Dan Hendrycks despairs that anyone would consider giving up a speed boost to do this, but as I discussed yesterday I think this is not so obvious.

It’s funny because it’s true.

Andriy Burkov: BREAKING🚨 So, I tested this new LLM-based system. It generated this 200-page report I didn’t read and then this 150-page book I didn’t read either, and then a 20-page travel plan I didn’t verify.

All I can say: it’s very, very impressive! 🔥🚀

First, the number of pages it generated is impressive 👀

⚽ But not just the number of pages: The formatting is so nice! I have never seen such nicely formatted 200 pages in my life.✨⚡

⚠️🌐 A game changer! ⚠️🌐

Peter Wildeford: This is honestly how a lot of LLM evaluations sound like here on Twitter.

I’m begging people to use more critical thought.

And again.

Julian Boolean: my alignment researcher friend told me AGI companies keep using his safety evals for high quality training data so I asked how many evals and he said he builds a new one every time so I said it sounds like he’s just feeding safety evals to the AGI companies and he started crying

This was in Monday’s post but seems worth running in its natural place, too.

No idea if real, but sure why not: o1 and Claude 3.7 spend 20 minutes doing what looks like ‘pretending to work’ on documents that don’t exist, Claude says it ‘has concepts of a draft.’ Whoops.

No, Altman, no!

Yes, Grok, yes.

Eliezer Yudkowsky: I guess I should write down this prediction that I consider an obvious guess (albeit not an inevitable call): later people will look back and say, “It should have been obvious that AI could fuel a bigger, worse version of the social media bubble catastrophe.”


AI #107: The Misplaced Hype Machine Read More »

apple-patches-0-day-exploited-in-“extremely-sophisticated-attack”

Apple patches 0-day exploited in “extremely sophisticated attack”

Apple on Tuesday patched a critical zero-day vulnerability in virtually all iPhone and iPad models it supports and said it may have been exploited in “an extremely sophisticated attack against specific targeted individuals” using older versions of iOS.

The vulnerability, tracked as CVE-2025-24201, resides in Webkit, the browser engine driving Safari and all other browsers developed for iPhones and iPads. Devices affected include the iPhone XS and later, iPad Pro 13-inch, iPad Pro 12.9-inch 3rd generation and later, iPad Pro 11-inch 1st generation and later, iPad Air 3rd generation and later, iPad 7th generation and later, and iPad mini 5th generation and later. The vulnerability stems from a bug that wrote to out-of-bounds memory locations.

Supplementary fix

“Impact: Maliciously crafted web content may be able to break out of Web Content sandbox,” Apple wrote in a bare-bones advisory. “This is a supplementary fix for an attack that was blocked in iOS 17.2. (Apple is aware of a report that this issue may have been exploited in an extremely sophisticated attack against specific targeted individuals on versions of iOS before iOS 17.2.)”

The advisory didn’t say if the vulnerability was discovered by one of its researchers or by someone outside the company. This attribution often provides clues about who carried out the attacks and who the attacks targeted. The advisory also didn’t say when the attacks began or how long they lasted.

The update brings the latest versions of both iOS and iPadOS to 18.3.2. Users facing the biggest threat are likely those who are targets of well-funded law enforcement agencies or nation-state spies. They should install the update immediately. While there’s no indication that the vulnerability is being opportunistically exploited against a broader set of users, it’s a good practice to install updates within 36 hours of becoming available.

Apple patches 0-day exploited in “extremely sophisticated attack” Read More »

leaked-geforce-rtx-5060-and-5050-specs-suggest-nvidia-will-keep-playing-it-safe

Leaked GeForce RTX 5060 and 5050 specs suggest Nvidia will keep playing it safe

Nvidia has launched all of the GeForce RTX 50-series GPUs that it announced at CES, at least technically—whether you’re buying from Nvidia, AMD, or Intel, it’s nearly impossible to find any of these new cards at their advertised prices right now.

But hope springs eternal, and newly leaked specs for GeForce RTX 5060 and 5050-series cards suggest that Nvidia may be announcing these lower-end cards soon. These kinds of cards are rarely exciting, but Steam Hardware Survey data shows that these xx60 and xx50 cards are what the overwhelming majority of PC gamers are putting in their systems.

The specs, posted by a reliable leaker named Kopite and reported by Tom’s Hardware and others, suggest a refresh that’s in line with what Nvidia has done with most of the 50-series so far. Along with a move to the next-generation Blackwell architecture, the 5060 GPUs each come with a small increase to the number of CUDA cores, a jump from GDDR6 to GDDR7, and an increase in power consumption, but no changes to the amount of memory or the width of the memory bus. The 8GB versions, in particular, will probably continue to be marketed primarily as 1080p cards.

|  | RTX 5060 Ti (leaked) | RTX 4060 Ti | RTX 5060 (leaked) | RTX 4060 | RTX 5050 (leaked) | RTX 3050 |
| --- | --- | --- | --- | --- | --- | --- |
| CUDA Cores | 4,608 | 4,352 | 3,840 | 3,072 | 2,560 | 2,560 |
| Boost Clock | Unknown | 2,535 MHz | Unknown | 2,460 MHz | Unknown | 1,777 MHz |
| Memory Bus Width | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit |
| Memory bandwidth | Unknown | 288 GB/s | Unknown | 272 GB/s | Unknown | 224 GB/s |
| Memory size | 8GB or 16GB GDDR7 | 8GB or 16GB GDDR6 | 8GB GDDR7 | 8GB GDDR6 | 8GB GDDR6 | 8GB GDDR6 |
| TGP | 180 W | 160 W | 150 W | 115 W | 130 W | 130 W |

As with the 4060 Ti, the 5060 Ti is said to come in two versions, one with 8GB of RAM and one with 16GB. One of the 4060 Ti’s problems was that its relatively narrow 128-bit memory bus limited its performance at 1440p and 4K resolutions even with 16GB of RAM—the bandwidth increase from GDDR7 could help with this, but we’ll need to test to see for sure.
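
As a rough, hedged illustration of why the move to GDDR7 could matter on an unchanged 128-bit bus: memory bandwidth is just bus width times the per-pin data rate. The GDDR6 rate below matches the 4060 Ti’s published 288 GB/s; the GDDR7 rate is an assumed 28 Gbps, since the leak doesn’t specify it.

```python
# Bandwidth (GB/s) = (bus width in bits / 8 bytes) * per-pin data rate in Gbps.
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits / 8 * data_rate_gbps

# RTX 4060 Ti: 128-bit bus, 18 Gbps GDDR6 -> 288 GB/s (matches the table above).
print(bandwidth_gb_s(128, 18.0))   # 288.0
# Hypothetical RTX 5060 Ti: same 128-bit bus, 28 Gbps GDDR7 (assumed, not confirmed).
print(bandwidth_gb_s(128, 28.0))   # 448.0 -- a ~55% uplift without widening the bus
```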

Leaked GeForce RTX 5060 and 5050 specs suggest Nvidia will keep playing it safe Read More »

nci-employees-can’t-publish-information-on-these-topics-without-special-approval

NCI employees can’t publish information on these topics without special approval

The list is “an unusual mix of words that are tied to activities that this administration has been at war with—like equity, but also words that they purport to be in favor of doing something about, like ultraprocessed food,” Tracey Woodruff, director of the Program on Reproductive Health and the Environment at the University of California, San Francisco, said in an email.

The guidance states that staffers “do not need to share content describing the routine conduct of science if it will not get major media attention, is not controversial or sensitive, and does not touch on an administration priority.”

A longtime senior employee at the institute said that the directive was circulated by the institute’s communications team, and the content was not discussed at the leadership level. It is not clear in which exact office the directive originated. The NCI, NIH and HHS did not respond to ProPublica’s emailed questions. (The existence of the list was first revealed in social media posts on Friday.)

Health and research experts told ProPublica they feared the chilling effect of the new guidance. Not only might it lead to a lengthier and more complex clearance process, it may also cause researchers to censor their work out of fear or deference to the administration’s priorities.

“This is real interference in the scientific process,” said Linda Birnbaum, a former director of the National Institute of Environmental Health Sciences who served as a federal scientist for four decades. The list, she said, “just seems like Big Brother intimidation.”

During the first two months of Donald Trump’s second presidency, his administration has slashed funding for research institutions and stalled the NIH’s grant application process.

Kennedy has suggested that hundreds of NIH staffers should be fired and said that the institute should deprioritize infectious diseases like COVID-19 and shift its focus to chronic diseases, such as diabetes and obesity.

Obesity is on the NCI’s new list, as are infectious diseases including COVID-19, bird flu and measles.

The “focus on bird flu and covid is concerning,” Woodruff wrote, because “not being transparent with the public about infectious diseases will not stop them or make them go away and could make them worse.”


NCI employees can’t publish information on these topics without special approval Read More »

what-the-epa’s-“endangerment-finding”-is-and-why-it’s-being-challenged

What the EPA’s “endangerment finding” is and why it’s being challenged


Getting rid of the justification for greenhouse gas regulations won’t be easy.

Credit: Mario Tama/Getty Images

A document that was first issued in 2009 would seem an unlikely candidate for making news in 2025. Yet the past few weeks have seen a steady stream of articles about an analysis first issued by the Environmental Protection Agency (EPA) in the early years of Obama’s first term: the endangerment finding on greenhouse gases.

The basics of the document are almost mundane: Greenhouse gases are warming the climate, and this will have negative consequences for US citizens. But it took a Supreme Court decision to get it written in the first place, and it has played a role in every attempt by the EPA to regulate greenhouse gas emissions across multiple administrations. And, while the first Trump administration left it in place, the press reports we’re seeing suggest that an attempt will be made to eliminate it in the near future.

The only problem: The science on which the endangerment finding is based is so solid that any ensuing court case will likely leave its opponents worse off in the long run, which is likely why the earlier Trump administration didn’t challenge it.

Get comfortable, because the story dates all the way back to the first Bush administration.

A bit of history

One of the goals of the US’s Clean Air Act, first passed in 1963, is to “address the public health and welfare risks posed by certain widespread air pollutants.” By the end of the last century, it was becoming increasingly clear that greenhouse gases fit that definition. While they weren’t necessarily directly harmful to the people inhaling them—our lungs are constantly being filled with carbon dioxide, after all—the downstream effects of the warming they caused could certainly impact human health and welfare. But, with the federal government taking no actions during George W. Bush’s time in office, a group of states and cities sued to force the EPA’s hand.

That suit eventually reached the Supreme Court in the form of Massachusetts v. EPA, which led to a ruling in 2007 determining that the Clean Air Act required the EPA to perform an analysis of the dangers posed by greenhouse gases. That analysis was done by late 2007, but the Bush administration simply ignored it for the remaining year it had in office. (It was eventually released after Bush left office.)

That left the Obama-era EPA to reach essentially the same conclusions that the Bush administration had: greenhouse gases are warming the planet. And that will have various impacts—sea-level rise, dangerous heat, damage to agriculture and forestry, and more.

That conclusion compelled the EPA to formulate regulations to limit the emission of greenhouse gases from power plants. Obama’s EPA did just that, but the rules came late enough that they were still tied up in court by the time his term ended. The regulations were also formulated before the plunge in the cost of renewable power, which has since driven a drop in carbon emissions that has far outpaced what the EPA’s rules intended to accomplish.

The first Trump administration formulated alternative rules that also ended up in court for being an insufficient response to the conclusions of the endangerment finding, which ultimately led the Biden administration to start formulating a new set of rules. And at that point, the Supreme Court decided to step in and rule on the Obama rules, even though everyone knew they would never go into effect.

The court indicated that the EPA needed to regulate each power plant individually, rather than regulating the wider grid, which sent the Biden administration back to the drawing board. Its attempts at crafting regulations were also in court when Trump returned to office.

There were a couple of notable aspects to that last case, West Virginia v. EPA, which hinged on the fact that Congress had never explicitly indicated that it wanted to see greenhouse gases regulated. Congress responded by ensuring that the Inflation Reduction Act’s energy-focused components specifically mentioned that these were intended to limit carbon emissions, eliminating one potential roadblock. The other thing is that, in this and other court cases, the Supreme Court could have simply overturned Massachusetts v. EPA, the case that put greenhouse gases within the regulatory framework of the Clean Air Act. Yet a court that has shown a great enthusiasm for overturning precedent didn’t do so.

Nothing dangerous?

So, in the 15 years since the EPA initially released its endangerment finding, it has resulted in no regulations whatsoever. But as long as the finding exists, the EPA is required to at least attempt to regulate greenhouse gases. So, getting rid of the endangerment finding would seem like the obvious thing for an administration led by a president who repeatedly calls climate change a hoax. And there were figures within the first Trump administration who argued in favor of that.

So why didn’t it happen?

That was never clear, but I’d suggest at least some members of the first Trump administration were realistic about the likely results. The effort to contest the endangerment finding was pushed by people who largely reject the vast body of scientific evidence that indicates that greenhouse gases are warming the climate. And, if anything, the evidence had gotten more decisive in the years between the initial endangerment finding and Trump’s inauguration. I expect that their effort was blocked by people who knew that it would fail in the courts and likely leave behind precedents that made future regulatory efforts easier.

This interpretation is supported by the fact that the Trump-era EPA received a number of formal petitions to revisit the endangerment finding. Having read a few of them (something you should not do), I can report they are uniformly awful. References to supposed peer-reviewed “papers” turn out to be little more than PDFs hosted on a WordPress site. Other arguments are based on information contained in the proceedings of a conference organized by an anti-science think tank. The Trump administration rejected them all with minimal comment the day before Biden’s inauguration.

Biden’s EPA went back and made detailed criticisms of each of them if you want to see just how laughable the arguments against mainstream science were at the time. And, since then, we’ve experienced a few years of temperatures that are so high they’ve surprised many climate scientists.

Unrealistic

But the new head of the EPA is apparently anything but a realist, and multiple reports have indicated he’s asking to be given the opportunity to go ahead and redo the endangerment finding. A more recent report suggests two possibilities. One is to recruit scientists from the fringes to produce a misleading report and roll the dice on getting a sympathetic judge who will overlook the obvious flaws. The other would be to argue that any climate change that happens will have net benefits to the US.

That latter approach would run into the problem that we’ve gotten increasingly sophisticated at analyses that attribute to climate change a share of the individual weather disasters that do harm the welfare of US citizens. While it might have been possible to make a case for uncertainty here a decade ago, that window has been largely closed by the scientific community.

Even if all of these efforts fail, it will be entirely possible for the EPA to construct greenhouse gas regulations that accomplish nothing and get tied up in court for the remainder of Trump’s term. But a court case could show just how laughably bad the positions staked out by climate contrarians are (and, by extension, the position of the president himself). There’s a small chance that the resulting court cases would leave a legal record that makes it that much harder to accept the sorts of minimalist regulations that Trump proposed in his first term.

Which is probably why this approach was rejected the first time around.


John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.

What the EPA’s “endangerment finding” is and why it’s being challenged Read More »

amd-says-top-tier-ryzen-9900x3d-and-9950x3d-cpus-arrive-march-12-for-$599-and-$699

AMD says top-tier Ryzen 9900X3D and 9950X3D CPUs arrive March 12 for $599 and $699

Like the 7950X3D and 7900X3D, these new X3D chips combine a pair of AMD’s CPU chiplets, one that has the extra 64MB of cache stacked underneath it and one that doesn’t. For the 7950X3D, you get eight cores with extra cache and eight without; for the 7900X3D, you get eight cores with extra cache and four without.

It’s up to AMD’s chipset software to decide what kinds of apps get to run on each kind of CPU core. Non-gaming workloads prioritize the normal CPU cores, which are generally capable of slightly higher peak clock speeds, while games that benefit disproportionately from the extra cache are run on those cores instead. AMD’s software can “park” the non-V-Cache CPU cores when you’re playing games to ensure they’re not accidentally being run on less-suitable CPU cores.

We didn’t have issues with this core parking technology when we initially tested the 7950X3D and 7900X3D, and AMD has steadily made improvements since then to make sure that core parking is working properly. The new 9000-series X3D chips should benefit from that work, too. To get the best results, AMD officially recommends a fresh and fully updated Windows install, along with the newest BIOS for your motherboard and the newest AMD chipset drivers; swapping out another Ryzen CPU for an X3D model (or vice versa) without reinstalling Windows can occasionally lead to CPU cores being parked (or not parked) when they aren’t supposed to be.
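
To make the core-parking idea concrete, here is a heavily simplified, hypothetical sketch of the kind of decision the chipset software is described as making. This is not AMD’s actual driver logic or API; the names and structure are invented purely to illustrate the ‘games go to the V-Cache die, everything else prefers the higher-clocked die’ behavior described above.

```python
from dataclasses import dataclass

@dataclass
class Process:
    name: str
    is_game: bool   # the real driver uses detection heuristics, not a simple flag

# Two chiplets: one with stacked 3D V-Cache, one with higher peak clocks.
VCACHE_CORES = list(range(0, 8))      # cores 0-7: extra 64MB cache, slightly lower clocks
FREQUENCY_CORES = list(range(8, 16))  # cores 8-15: no extra cache, higher boost clocks

def preferred_cores(proc: Process, parking_enabled: bool = True) -> list[int]:
    """Illustrative scheduling preference, not AMD's implementation."""
    if proc.is_game:
        # Games that benefit from the large cache run on the V-Cache die; with parking
        # enabled, the frequency cores are idled so threads don't land on them.
        return VCACHE_CORES if parking_enabled else VCACHE_CORES + FREQUENCY_CORES
    # Non-gaming workloads prefer the higher-clocked cores, spilling over as needed.
    return FREQUENCY_CORES + VCACHE_CORES

print(preferred_cores(Process("some_game.exe", is_game=True)))     # cores 0-7 only
print(preferred_cores(Process("render_job.exe", is_game=False)))   # cores 8-15 first
```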

AMD says top-tier Ryzen 9900X3D and 9950X3D CPUs arrive March 12 for $599 and $699 Read More »

what-does-“phd-level”-ai-mean?-openai’s-rumored-$20,000-agent-plan-explained.

What does “PhD-level” AI mean? OpenAI’s rumored $20,000 agent plan explained.

On the Frontier Math benchmark by EpochAI, o3 solved 25.2 percent of problems, while no other model has exceeded 2 percent—suggesting a leap in mathematical reasoning capabilities over the previous model.

Benchmarks vs. real-world value

Ideally, potential applications for a true PhD-level AI model would include analyzing medical research data, supporting climate modeling, and handling routine aspects of research work.

The high price points reported by The Information, if accurate, suggest that OpenAI believes these systems could provide substantial value to businesses. The publication notes that SoftBank, an OpenAI investor, has committed to spending $3 billion on OpenAI’s agent products this year alone—indicating significant business interest despite the costs.

Meanwhile, OpenAI faces financial pressures that may influence its premium pricing strategy. The company reportedly lost approximately $5 billion last year covering operational costs and other expenses related to running its services.

News of OpenAI’s stratospheric pricing plans comes after years of relatively affordable AI services that have conditioned users to expect powerful capabilities at relatively low costs. ChatGPT Plus remains $20 per month and Claude Pro costs $30 monthly—both tiny fractions of these proposed enterprise tiers. Even ChatGPT Pro’s $200/month subscription is relatively small compared to the new proposed fees. Whether the performance difference between these tiers will match their thousandfold price difference is an open question.

Despite their benchmark performances, these simulated reasoning models still struggle with confabulations—instances where they generate plausible-sounding but factually incorrect information. This remains a critical concern for research applications where accuracy and reliability are paramount. A $20,000 monthly investment raises questions about whether organizations can trust these systems not to introduce subtle errors into high-stakes research.

In response to the news, several people quipped on social media that companies could hire an actual PhD student for much cheaper. “In case you have forgotten,” wrote xAI developer Hieu Pham in a viral tweet, “most PhD students, including the brightest stars who can do way better work than any current LLMs—are not paid $20K / month.”

While these systems show strong capabilities on specific benchmarks, the “PhD-level” label remains largely a marketing term. These models can process and synthesize information at impressive speeds, but questions remain about how effectively they can handle the creative thinking, intellectual skepticism, and original research that define actual doctoral-level work. On the other hand, they will never get tired or need health insurance, and they will likely continue to improve in capability and drop in cost over time.

What does “PhD-level” AI mean? OpenAI’s rumored $20,000 agent plan explained. Read More »