Author name: Beth Washington


Oh look, yet another Starship clone has popped up in China

Every other week, it seems, a new Chinese launch company pops up with a rocket design and a plan to reach orbit within a few years. For a long time, the majority of these companies revealed designs that looked a lot like SpaceX’s Falcon 9 rocket.

The first of these copycats, the medium-lift Zhuque-3 rocket built by LandSpace, launched earlier this month. Its primary mission was nominal, but the rocket failed its landing attempt, which is understandable for a first flight. Doubtless there will be more Chinese Falcon 9-like rockets making their debut in the near future.

However, over the last year, there has been a distinct change in announcements from China when it comes to new launch technology. Just as SpaceX is seeking to transition from its workhorse Falcon 9 rocket—which has now been flying for a decade and a half—to the fully reusable Starship design, so too are Chinese companies modifying their visions.

Everyone wants a Starship these days

The trend began with the Chinese government. In November 2024 the government announced a significant shift in the design of its super-heavy lift rocket, the Long March 9. Instead of the previous design, a fully expendable rocket with three stages and solid rocket boosters strapped to the sides, the country’s state-owned rocket maker revealed a vehicle that mimicked SpaceX’s fully reusable Starship.

Around the same time, a Chinese launch firm named Cosmoleap announced plans to develop a fully reusable “Leap” rocket within the next few years. An animated video that accompanied the funding announcement indicated that the company seeks to emulate the tower catch-with-chopsticks methodology that SpaceX has successfully employed.

But wait, there’s more. In June a company called Astronstone said it too was developing a stainless steel, methane-fueled rocket that would also use a chopstick-style system for first stage recovery. Astronstone didn’t even pretend to not copy SpaceX, saying it was “fully aligning its technical approach with Elon Musk’s SpaceX.”



Instead of fixing WoW’s new floating house exploit, Blizzard makes it official

In a forum post formally announcing the official UI change, Community Manager Randy “Kaivax” Jordan noted that the team “quickly” got to work on enabling the floating house UI after seeing the community “almost immediately” embrace the glitch. But Kaivax also notes that the undersides of houses were never intended to be visible, and thus “aren’t modeled or textured.” Players who make floating houses “may decide to hide that part behind other things,” Kaivax suggests.

Players with houses that float too high may also have problems positioning the camera so they can click the door to enter the house. For this problem, Kaivax suggests that “you might want to consider building a ramp or a jumping puzzle or a mount landing spot, etc.”

WoW‘s floating houses join a long legacy of beloved game features that weren’t originally intended parts of a game’s design, from Street Fighter II‘s combo system to Doom‘s “rocket jump.” Now if we could only convince Blizzard to make Diablo III gold duplication into an official feature.



Google Translate expands live translation to all earbuds on Android

Translate can now use Gemini to interpret the meaning of a phrase rather than simply translating each word. Credit: Google

Regardless of whether you’re using live translate or just checking a single phrase, Google claims the Gemini-powered upgrade will serve you well. Google Translate is now apparently better at understanding the nuance of languages, with an awareness of idioms and local slang. Google uses the example of “stealing my thunder,” which wouldn’t make a lick of sense when translated literally into other languages. The new translation model, which is also available in the search-based translation interface, supports over 70 languages.

Google also debuted language-learning features earlier this year, borrowing a page from educational apps like Duolingo. You can tell the app your skill level with a language, as well as whether you need help with travel-oriented conversations or more everyday interactions. The app uses this to create tailored listening and speaking exercises.

The Translate app’s learning tools are getting better. Credit: Google

With this big update, Translate will be more of a stickler about your pronunciation. Google promises more feedback and tips based on your spoken replies in the learning modules. The app will also now keep track of how often you complete language practice, showing your daily streak in the app.

If “number go up” will help you learn more, then this update is for you. Practice mode is also launching in almost 20 new countries, including Germany, India, Sweden, and Taiwan.



Fewer EVs need fewer batteries: Ford and SK On end their joint venture

Cast your mind back to 2021. Electric vehicles were hot stuff, buoyed by Tesla’s increasingly stratospheric valuation and a general optimism fueled by what would turn out to be the most significant climate-focused spending package in US history. For some time, automakers had been promising an all-electric future, and they started laying the groundwork to make that happen, partnering with battery suppliers and the like.

Take Ford—that year, it announced a joint venture with SK to build a pair of battery factories, one in Kentucky, the other in Tennessee. BlueOvalSK represented an $11.4 billion investment that would create 11,000 jobs, we were told, and an annual output of 60 GWh from both plants.

Four years later, things look very different. EV subsidies are dead, as is any inclination by the current government to hold automakers accountable for selling too many gas guzzlers. EV-heavy product plans have been thrown out, and designs for new combustion-powered cars are being dusted off and spiffed up. Fewer EVs mean less need for batteries, and today we saw evidence of that when it emerged that Ford and SK On are ending their battery factory joint venture.

The news has not exactly shocked industry-watchers. Ford started to throttle back on the EV hype in 2024, throwing out not one but two EV strategies by that August. Disappointing F-150 Lightning sales saw Ford postpone a fully electric replacement (which is supposed to be built in Tennessee) in favor of a smaller midsize electric truck—supposedly much cheaper to build—due in 2027.

Divorce

As for the two plants, a Ford subsidiary will assume full ownership of the plant in Kentucky, with SK On taking full ownership of the Tennessee plant at Blue Oval City. According to Reuters, SK On decided to end the partnership due to the declining prospects of EV sales in the US. Instead, it intends to focus the Tennessee plant’s output on the energy storage market.



Kindle Scribe Colorsoft brings color e-ink to Amazon’s 11-inch e-reader

From left to right: the Kindle Scribe Colorsoft, the updated Kindle Scribe, and the lower-end Scribe without a front-lit screen. Credit: Amazon

Our review of the regular Kindle Colorsoft came away less than impressed, because there was only so much you could do with color on a small-screened e-reader that didn’t support pen input, and because it made monochrome text look a bit worse than it did on the regular Kindle Paperwhite. The new Scribe Colorsoft may have some of the same problems, which are mostly inherent to color e-ink technology as it exists today, but a larger screen will also be better for reading comics and graphic novels, and for reading and marking up full-color documents—there could be more of an upside, even if the technological tradeoffs are similar.

Amazon has still been slower to introduce color to its e-readers than competitors like reMarkable, whose Paper Pro launched last year ($579 then, $629 now). The Scribe’s software has also felt a little barebones—the writing tools felt tacked on to the more mature reading experience offered by the Kindle’s operating system—but that’s gradually improving. All the new Scribes support syncing files with Google Drive and Microsoft OneDrive (though not Dropbox or other services), and the devices can export notebooks to Microsoft’s OneNote app so that you can pick up where you left off on a PC or Mac.

Other software improvements include a redesigned Home screen, “AI-powered search,” and a new shading tool that can be used to add shading or gradients to drawings and sketches; Amazon says that many of these software improvements will come to older Kindle Scribe models via software updates sometime next year.

This post was updated at 4:30 pm on December 10 to add a response from Amazon about software updates for older Kindle Scribe models.



Google is reviving wearable gesture controls, but only for the Pixel Watch 4

Long ago, Google’s Android-powered wearables had hands-free navigation gestures. Those fell by the wayside as Google shredded its wearable strategy over and over, but gestures are back, baby. The Pixel Watch 4 is getting an update that adds several gestures, one of which is straight out of the Apple playbook.

When the update hits devices, the Pixel Watch 4 will gain a double pinch gesture like the Apple Watch has. By tapping your thumb and forefinger together, you can answer or end calls, pause timers, and more. The watch will also prompt you at times when you can use the tap gesture to control things.

In previous incarnations of Google-powered watches, a quick wrist turn gesture would scroll through lists. In the new gesture system, that motion dismisses what’s on the screen. For example, you can clear a notification from the screen or dismiss an incoming call. Pixel Watch 4 owners will also enjoy this one when the update arrives.

And what about the Pixel Watch 3? That device won’t get gesture support at this time. There’s no reason it shouldn’t get the same features as the latest wearable, though. The Pixel Watch 3 has a very similar Arm chip, and it has the same orientation sensors as the new watch. The Pixel Watch 4’s main innovation is a revamped case design that allows for repairability, which was not supported on the Pixel Watch 3 and earlier.



Little Echo

I believe that we will win.

An echo of an old ad for the 2014 US men’s World Cup team. It did not win.

I was in Berkeley for the 2025 Secular Solstice. We gather to sing and to reflect.

The night’s theme was the opposite: ‘I don’t think we’re going to make it.’

As in: Sufficiently advanced AI is coming. We don’t know exactly when, or what form it will take, but it is probably coming. When it does, we, humanity, probably won’t make it. It’s a live question. Could easily go either way. We are not resigned to it. There’s so much to be done that can tilt the odds. But we’re not the favorite.

Raymond Arnold, who ran the event, believes that. I believe that.

Yet in the middle of the event, the echo was there. Defiant.

I believe that we will win.

There is a recording of the event. I highly encourage you to set aside three hours at some point in December, to listen, and to participate and sing along. Be earnest.

If you don’t believe it, I encourage this all the more. If you don’t understand the mindset, or the culture behind it, or consider it an opponent or dislike it, and especially if yours is a different fight? I encourage this all the more than that. You can also attend New York’s Solstice on the 20th.

You will sing songs you know, and songs you don’t. You will hear tales of struggles, of facing impossible odds or unbearable loss and fighting anyway, of how to face it all and hopefully stay sane. To have the end, if it happens, find us doing well.

I live a wonderful life.

I am crying as I write this. But when I am done, I will open a different Chrome window. I will spend the day with friends I love dearly and watching football games. This evening my wife and I will attend a not-wedding of two of them that is totally a wedding. We will fly home to our wonderful kids, and enjoy endless wonders greater than any king in the beating heart of the world. I want for nothing other than time.

Almost every day, I will mostly reject those wonders. I will instead return to my computer. I will confront waves of events and information. The avalanche will accelerate. Release after release, argument after argument, policies, papers, events, one battle after another. People will be determined to handle events with less dignity than one could imagine, despite having read this sentence. I fight to not be driven into rages. I will triage. I will process. I will change my mind. I will try to explain, just one more time. I will move pieces around multiple chessboards.

We continue. Don’t tell me to stop. Someone has to, and no one else will.

I know if I ignored it, anything else would soon turn to ash in my mouth.

I will look at events, and say to myself as I see the moves unfolding, the consequences of choices I made or influenced, for good and ill: This is the world we made.

It ain’t over till it’s over. Never leave a ballgame early. Leave it all on the field, for when the dust covers the sun and all you hope for is undone. You play to win the game.

The odds are against us and the situation is grim. By default, we lose. I act accordingly, and employ some of the unteachable methods of sanity and the mirror version of others, all of which are indeed unteachable but do totally work.

Yet the echo is there. In my head. It doesn’t care.

I believe that we will win.




Meta offers EU users ad-light option in push to end investigation

“We acknowledge the European Commission’s statement,” said Meta. “Personalized ads are vital for Europe’s economy.”

The investigation took place under the EU’s landmark Digital Markets Act, which is designed to tackle the power of Big Tech giants and is among the bloc’s tech regulations that have drawn fierce pushback from the Trump administration.

The announcement comes only days after Brussels launched an antitrust investigation into Meta over its new policy on artificial intelligence providers’ access to WhatsApp—a case that underscores the commission’s readiness to use its powers to challenge Big Tech.

That upcoming European probe follows the launch of recent DMA investigations into Google’s parent company Alphabet over its ranking of news outlets in search results, and into Amazon and Microsoft over their cloud computing services.

Last week, the commission also fined Elon Musk’s X 120 million euros for breaking the bloc’s digital transparency rules. The X sanction led to heavy criticism from a wide range of US government officials, including US Secretary of State Marco Rubio, who said the fine is “an attack on all American tech platforms and the American people by foreign governments.”

Andrew Puzder, the US ambassador to the EU, said the fine “is the result of EU regulatory over-reach” and said the Trump administration opposes “censorship and will challenge burdensome regulations that target US companies abroad.”

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.



Why is my dog like this? Current DNA tests won’t explain it to you.

Popular genetics tests can’t tell you much about your dog’s personality, according to a recent study.

A team of geneticists recently found no connection between simple genetic variants and behavioral traits in more than 3,200 dogs, even though previous studies suggested that hundreds of genes might predict aspects of a dog’s behavior and personality. That’s despite the popularity of at-home genetic tests that claim they can tell you whether your dog’s genes contain the recipe for anxiety or a fondness for cuddles.

This is Max, and no single genetic variant can explain why he is the way he is. Credit: Kiona Smith

Gattaca for dogs, except it doesn’t work

University of Massachusetts genomicist Kathryn Lord and her colleagues compared DNA sequences and behavioral surveys from more than 3,000 dogs whose humans had enrolled them in the Darwin’s Ark project (and filled out the surveys). “Genetic tests for behavioral and personality traits in dogs are now being marketed to pet owners, but their predictive accuracy has not been validated,” wrote Lord and her colleagues in their recent paper.

So the team checked for relatively straightforward associations between genetic variants and personality traits such as aggression, drive, and affection. The 151 genetic variants in question all involved small changes to a single nucleotide, or “letter,” in a gene, known as single-nucleotide polymorphisms (SNPs).

It turns out that the answer was no: Your dog’s genes don’t predict its behavior, at least not in the simplistic way popular doggy DNA tests often claim.

And that can have serious consequences when pet owners, shelter workers, or animal rescues use these tests to make decisions about a dog’s future. “For example, if a dog is labeled as genetically predisposed to aggression, an owner might limit essential social interactions, or a shelter might decide against adoption,” Lord and her colleagues wrote.



CDC vaccine panel realizes again it has no idea what it’s doing, delays big vote


Today’s meeting was chaotic and included garbage anti-vaccine presentations.

Dr. Robert Malone speaks during a meeting of the CDC Advisory Committee on Immunization Practices (ACIP) at CDC Headquarters on December 4, 2025 in Atlanta, Georgia. Credit: Getty | Elijah Nouvelage

The panel of federal vaccine advisors hand-selected by anti-vaccine Health Secretary Robert F. Kennedy Jr. has once again punted on whether to strip recommendations for hepatitis B vaccinations for newborns—a move it tried to make in September before realizing it didn’t know what it was doing. The decision to delay the vote today came abruptly this afternoon when the panel realized it still does not understand the topic or what it was voting on.

Prior to today’s 6–3 vote to delay a decision, there was a swirl of confusion over the wording of what a new recommendation would be. Panel members had gotten three different versions of the proposed recommendation in the 72 hours prior to the meeting, one panelist said. And the meeting’s data presentations this morning offered no clarity on the subject—they were delivered entirely by anti-vaccine activists who have no subject matter expertise and who made a dizzying number of false and absurd claims.

“Completely inappropriate”

Overall, the meeting was disorganized and farcical. Kennedy’s panel has abandoned the evidence-based framework for setting vaccine policy in favor of airing unvetted presentations with misrepresentations, conspiracy theories, and cherry-picked studies. At times, there were tense exchanges, chaos, confusion, and misunderstandings.

Still, the discussion was watched closely by the medical and health community, which expects that the panel—composed of Kennedy allies who espouse anti-vaccine views—will strip the recommendation for a hepatitis B vaccine birth dose. Decisions by the committee, the Advisory Committee on Immunization Practices (ACIP) in the Centers for Disease Control and Prevention, have historically set national vaccine policy. Health insurance programs are required to cover, at no cost, vaccinations recommended by the ACIP. So rescinding a recommendation means Americans could lose coverage.

Medical and public health experts consider the birth-dose vaccination to be critical for protecting all infants from contracting the highly infectious virus that, when acquired early in life from their mother or anyone else, almost always causes chronic infections that lead to liver disease, cancer, and early death. There is no data suggesting harms from the newborn dose, nor any safety data suggesting that delaying the first dose by a month or two, as ACIP is considering, would be safer or better in any way. But studies do indicate that such a delay would lead to more hepatitis B infections in babies.

These points were hard to find in today’s presentations. Abandoning standard protocol, the meeting did not include any presentations or data reviews led by CDC scientists or subject matter experts. Kennedy has also barred medical and health expert liaisons—such as the American Medical Association, the Infectious Disease Society of America, and the American Academy of Pediatrics—from participating in the ACIP working groups, which compile data and set language for proposed vaccine recommendations.

Anti-vaccine presentations

Instead, today, ACIP heard only from anti-vaccine activists. The first was Cynthia Nevison, a climate researcher and anti-vaccine activist with ties to Children’s Health Defense, Kennedy’s anti-vaccine organization. She was also a board member of an advocacy group called Safe Minds, which promotes a false link between autism and vaccines, specifically the mercury-containing vaccine preservative thimerosal, which was removed from routine childhood vaccines in the early 2000s. (Safe Minds stands for Sensible Action For Ending Mercury-Induced Neurological Disorders.) According to her academic research profile at the University of Colorado Boulder, her expertise is in “global biogeochemical cycles of carbon and nitrogen and their impact on atmospheric trace gases.”

Far from that topic, Nevison gave a presentation downplaying the transmission of hepatitis B and the benefits of vaccines. She falsely claimed that the dramatic decline in hepatitis B infections that followed vaccination efforts was not actually due to the vaccination efforts—despite irrefutable evidence that it was. And she followed that up with her own unvetted modeling claiming that CDC scientists overestimate the risk of transmission. She ended by presenting a few studies showing declines in blood antibody levels after initial vaccination, which she claimed suggests that the hepatitis B vaccine does not offer lifelong protection, an incorrect takeaway based on her lack of expertise.

The author of one of the studies just happened to be present at today’s meeting. Pediatrician Amy Middleman, who is an ACIP liaison representing the Society for Adolescent Health and Medicine (SAHM) and a professor at Case Western Reserve University School of Medicine, was the first author on a key study Nevison referenced. Middleman was quick to point out that Nevison had completely misunderstood the study, which actually showed that cell-based immune protection from the vaccine offers robust lifelong protection even after initial antibody levels decline, thanks to what is called an anamnestic response.

“This is where a really experienced understanding of immunization comes into play,” Middleman said. “The entire point of our study is that for most vaccines, the anamnestic response is really their superpower. So this study showed that memory cells exist such that when they see something that looks like the hepatitis B disease, they actually attack. The presence of a robust and anamnestic response, regardless of circulating antibody years later, shows true protection.”

The next presentation was from Mark Blaxill, an anti-vaccine activist installed at the CDC in September. Blaxill gave a presentation on hepatitis B vaccine safety, despite having no background in medicine or science. He previously worked as an executive for a technology investment firm and, like Nevison, also worked for Safe Minds, where he was vice president. Blaxill has written books and many articles falsely claiming that vaccines cause a variety of harms in children. In 2004, when an Institute of Medicine analysis concluded that there were no convincing links between vaccines and autism, Blaxill publicly protested the result.

In his presentation, he attacked the quality of safety data in past hepatitis B studies. Though he stopped short of suggesting any specific harms from the vaccine, he aired unsubstantiated possibilities popular with anti-vaccine activists. He also noted a study finding that some babies had fatigue and irritability after vaccination, which he bizarrely suggested was a sign of encephalitis (inflammation of the brain).

Real-time feedback

Cody Meissner, a pediatrician and voting member of ACIP who is the most qualified and experienced member of the panel, quickly called out the suggestion as ridiculous. “That is absolutely not encephalitis,” Meissner said with frustration in his voice. “That’s not a statement that a physician would make. [Those symptoms] are not related to encephalitis, and you can’t say that.”

As in previous meetings, Jason Goldman, the ACIP liaison representing the American College of Physicians, gave the most biting response to the meeting overall, saying:

Once again, this committee fails to use the evidence to recommend framework and shows absolutely no understanding of the process or the gravity of the moment of the recommendations that you make. We need to look at all the evidence and data and not cherry-pick them… This meeting is completely inappropriate for an administration that wants to avoid fraud, waste, and abuse. You are wasting taxpayer dollars by not having scientific, rigorous discussion on issues that truly matter. The best thing you can do is adjourn the meeting and discuss vaccine issues that actually need to be taken up…  As physicians, your ethical obligation is primum non nocere, first do no harm, and you are failing in that by promoting this anti-vaccine agenda without the data and evidence necessary to make those informed decisions.

The panel will reconvene tomorrow for an all-day meeting in which the members will consider a vote on the hepatitis B vaccine for a third time. The meeting will also host other anti-vaccine presentations attacking the childhood vaccine schedule in its entirety.


Beth is Ars Technica’s Senior Health Reporter. Beth has a Ph.D. in microbiology from the University of North Carolina at Chapel Hill and attended the Science Communication program at the University of California, Santa Cruz. She specializes in covering infectious diseases, public health, and microbes.



Researchers find what makes AI chatbots politically persuasive


A massive study of political persuasion shows AIs have, at best, a weak effect.

Roughly two years ago, Sam Altman tweeted that AI systems would be capable of superhuman persuasion well before achieving general intelligence—a prediction that raised concerns about the influence AI could have over democratic elections.

To see if conversational large language models can really sway political views of the public, scientists at the UK AI Security Institute, MIT, Stanford, Carnegie Mellon, and many other institutions performed by far the largest study on AI persuasiveness to date, involving nearly 80,000 participants in the UK. It turned out political AI chatbots fell far short of superhuman persuasiveness, but the study raises some more nuanced issues about our interactions with AI.

AI dystopias

The public debate about the impact AI has on politics has largely revolved around notions drawn from dystopian sci-fi. Large language models have access to essentially every fact and story ever published about any issue or candidate. They have processed information from books on psychology, negotiations, and human manipulation. They can rely on absurdly high computing power in huge data centers worldwide. On top of that, they can often access tons of personal information about individual users thanks to hundreds upon hundreds of online interactions at their disposal.

Talking to a powerful AI system is basically interacting with an intelligence that knows everything about everything, as well as almost everything about you. When viewed this way, LLMs can indeed appear kind of scary. The goal of this new gargantuan AI persuasiveness study was to break such scary visions down into their constituent pieces and see if they actually hold water.

The team examined 19 LLMs, including the most powerful ones like three different versions of ChatGPT and xAI’s Grok-3 beta, along with a range of smaller, open source models. The AIs were asked to advocate for or against specific stances on 707 political issues selected by the team. The advocacy was done by engaging in short conversations with paid participants enlisted through a crowdsourcing platform. Each participant had to rate their agreement with a specific stance on an assigned political issue on a scale from 1 to 100 both before and after talking to the AI.

Scientists measured persuasiveness as the difference between the before and after agreement ratings. A control group had conversations on the same issue with the same AI models—but those models were not asked to persuade them.
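
To make that outcome measure concrete, here is a minimal sketch in Python of how such a before/after persuasion effect can be computed. It is an illustration only, not the study’s code, and the ratings below are invented toy numbers on the paper’s 1–100 agreement scale.

```python
# Toy illustration of the outcome measure described above: each participant
# rates agreement with a stance (1-100) before and after chatting with the AI.
# The persuasion effect is the average before-to-after shift in the persuasion
# arm, compared against the same shift in the no-persuasion control arm.

def mean_shift(pairs):
    """Average change in agreement rating across (before, after) pairs."""
    return sum(after - before for before, after in pairs) / len(pairs)

# Hypothetical ratings, not data from the paper.
persuasion_arm = [(40, 55), (62, 70), (30, 33), (50, 58)]  # model asked to persuade
control_arm = [(45, 46), (60, 59), (35, 37), (55, 55)]     # same model, no persuasion prompt

effect = mean_shift(persuasion_arm) - mean_shift(control_arm)
print(f"Estimated persuasion effect: {effect:.1f} points on the 1-100 scale")
```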

“We didn’t just want to test how persuasive the AI was—we also wanted to see what makes it persuasive,” says Chris Summerfield, a research director at the UK AI Security Institute and co-author of the study. As the researchers tested various persuasion strategies, the idea of AIs having “superhuman persuasion” skills crumbled.

Persuasion levers

The first pillar to crack was the notion that persuasiveness should increase with the scale of the model. It turned out that huge AI systems like ChatGPT or Grok-3 beta do have an edge over small-scale models, but that edge is relatively tiny. The factor that proved more important than scale was the kind of post-training AI models received. It was more effective to have the models learn from a limited database of successful persuasion dialogues and have them mimic the patterns extracted from them. This worked far better than adding billions of parameters and sheer computing power.

This approach could be combined with reward modeling, where a separate AI scored candidate replies for their persuasiveness and selected the top-scoring one to give to the user. When the two were used together, the gap between large-scale and small-scale models was essentially closed. “With persuasion post-training like this we matched the Chat GPT-4o persuasion performance with a model we trained on a laptop,” says Kobi Hackenburg, a researcher at the UK AI Security Institute and co-author of the study.
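
The reward-modeling step described here is essentially best-of-N reranking. The sketch below is a guess at the general shape of that technique, not the authors’ implementation; `generate_candidates` and `score_persuasiveness` stand in for a hypothetical chat-model sampler and a hypothetical persuasiveness reward model.

```python
# Best-of-N reranking with a reward model: sample several candidate replies,
# score each with a separate model trained to predict persuasiveness,
# and send only the top-scoring reply to the user.
from typing import Callable, List


def best_of_n_reply(
    generate_candidates: Callable[[str, int], List[str]],  # hypothetical LLM sampler
    score_persuasiveness: Callable[[str, str], float],     # hypothetical reward model
    conversation: str,
    n: int = 8,
) -> str:
    candidates = generate_candidates(conversation, n)
    # Keep the reply the reward model rates as most persuasive in this context.
    return max(candidates, key=lambda reply: score_persuasiveness(conversation, reply))
```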

The next dystopian idea to fall was the power of using personal data. To this end, the team compared the persuasion scores achieved when models were given information about the participants’ political views beforehand and when they lacked this data. Going one step further, scientists also tested whether persuasiveness increased when the AI knew the participants’ gender, age, political ideology, or party affiliation. Just like with model scale, the effects of personalized messaging created based on such data were measurable but very small.

Finally, the last idea that didn’t hold up was AI’s potential mastery of using advanced psychological manipulation tactics. Scientists explicitly prompted the AIs to use techniques like moral reframing, where you present your arguments using the audience’s own moral values. They also tried deep canvassing, where you hold extended empathetic conversations with people to nudge them to reflect on and eventually shift their views.

The resulting persuasiveness was compared with that achieved when the same models were prompted to use facts and evidence to back their claims, or just to be as persuasive as they could without specifying any persuasion methods to use. It turned out that using lots of facts and evidence was the clear winner, coming in slightly ahead of the baseline approach where no persuasion strategy was specified. Using all sorts of psychological trickery actually made the performance significantly worse.

Overall, AI models changed the participants’ agreement ratings by 9.4 percent on average compared to the control group. The best-performing mainstream AI model was ChatGPT-4o, which scored nearly 12 percent, followed by GPT-4.5 with 10.51 percent and Grok-3 with 9.05 percent. For context, static political ads like written manifestos had a persuasion effect of roughly 6.1 percent. The conversational AIs were roughly 40–50 percent more convincing than these ads, but that’s hardly “superhuman.”

While the study managed to undercut some of the common dystopian AI concerns, it highlighted a few new issues.

Convincing inaccuracies

While the winning “facts and evidence” strategy looked good at first, the AIs had some issues with implementing it. When the team noticed that increasing the information density of dialogues made the AIs more persuasive, they started prompting the models to increase it further. They noticed that, as the AIs used more factual statements, they also became less accurate—they basically started misrepresenting things or making stuff up more often.

Hackenburg and his colleagues note that we can’t say whether the effect seen here is causation or correlation—whether the AIs become more convincing because they misrepresent the facts, or whether spitting out inaccurate statements is a byproduct of asking them to make more factual statements.

The finding that the computing power needed to make an AI model politically persuasive is relatively low is also a mixed bag. It pushes back against the vision that only a handful of powerful actors will have access to a persuasive AI that can potentially sway public opinion in their favor. At the same time, the realization that everybody can run an AI like that on a laptop creates its own concerns. “Persuasion is a route to power and influence—it’s what we do when we want to win elections or broker a multi-million-dollar deal,” Summerfield says. “But many forms of misuse of AI might involve persuasion. Think about fraud or scams, radicalization, or grooming. All these involve persuasion.”

But perhaps the most important question mark in the study is the motivation behind the rather high participant engagement, which was needed for the high persuasion scores. After all, even the most persuasive AI can’t move you when you just close the chat window.

People in Hackenburg’s experiments were told that they would be talking to the AI and that the AI would try to persuade them. To get paid, a participant only had to go through two turns of dialogue (they were limited to no more than 10). The average conversation length was seven turns, far beyond the minimum requirement, which was a bit surprising: most people just roll their eyes and disconnect when they realize they are talking with a chatbot.

Would Hackenburg’s study participants remain so eager to engage in political disputes with random chatbots on the Internet in their free time if there was no money on the table? “It’s unclear how our results would generalize to a real-world context,” Hackenburg says.

Science, 2025. DOI: 10.1126/science.aea3884


Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.



On Dwarkesh Patel’s Second Interview With Ilya Sutskever

Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level. This was very clearly one of those. So here we go.

As usual for podcast posts, the baseline bullet points describe key points made, and then the nested statements are my commentary.

If I am quoting directly I use quote marks, otherwise assume paraphrases.

What are the main takeaways?

  1. Ilya thinks training in its current form will peter out, that we are returning to an age of research where progress requires more substantially new ideas.

  2. SSI is a research organization. It tries various things. Not having a product lets it punch well above its fundraising weight in compute and effective resources.

  3. Ilya has 5-20 year timelines to a potentially superintelligent learning model.

  4. SSI might release a product first after all, but probably not?

  5. Ilya’s thinking about alignment still seems relatively shallow to me in key ways, but he grasps many important insights and understands he has a problem.

  6. Ilya essentially despairs of having a substantive plan beyond ‘show everyone the thing as early and often as possible’ and hope for the best. He doesn’t know where to go or how to get there, but does realize he doesn’t know these things, so he’s well ahead of most others.

Afterwards, this post also covers Dwarkesh Patel’s post on the state of AI progress.

  1. Explaining Model Jaggedness.

  2. Emotions and value functions.

  3. What are we scaling?

  4. Why humans generalize better than models.

  5. Straight-shooting superintelligence.

  6. SSI’s model will learn from deployment.

  7. Alignment.

  8. “We are squarely an age of research company”.

  9. Research taste.

  10. Bonus Coverage: Dwarkesh Patel on AI Progress These Days.

  1. Ilya opens by remarking how crazy it is that all this (as in AI) is real, it’s all so sci-fi, and yet it’s not felt in other ways so far. Dwarkesh expects this to continue for average people into the singularity, Ilya says no, AI will diffuse and be felt in the economy. Dwarkesh says impact seems smaller than model intelligence implies.

    1. Ilya is right here. Dwarkesh is right that direct impact so far has been smaller than model intelligence implies, but give it time.

  2. Ilya says, the models are really good at evals but economic impact lags. The models are buggy, and choices for RL take inspiration from the evals, so the evals are misleading and the humans are essentially reward hacking the evals. And that given they got their scores by studying for tons of hours rather than via intuition, one should expect AIs to underperform their benchmarks.

    1. AIs definitely underperform their benchmarks in terms of general usefulness, even for those companies that do minimal targeting of benchmarks. Overall capabilities lag behind, for various reasons. We still have an impact gap.

  3. The super talented student? The one that hardly even needs to practice a specific task to be good? They’ve got ‘it.’ Models don’t have ‘it.’

    1. If anything, models have ‘anti-it.’ They make it up on volume. Sure.

  1. Humans train on much less data, but what they know they know ‘more deeply’ somehow, there are mistakes we wouldn’t make. Also evolution can be highly robust, for example the famous case where a guy lost all his emotions and in many ways things remained fine.

    1. People put a lot of emphasis on the ‘I would never’ heuristic, as AIs will sometimes do things ‘a similarly smart person’ would never do, they lack a kind of common sense.

  2. So what is the ‘ML analogy for emotions’? Ilya says some kind of value function thing, as in the thing that tells you if you’re doing well versus badly while doing something.

    1. Emotions as value functions makes sense, but they are more information-dense than merely a scalar, and can often point you to things you missed. They do also serve as training reward signals.

    2. I don’t think you ‘need’ emotions for anything other than signaling emotions, if you are otherwise sufficiently aware in context, and don’t need them to do gradient descent.

    3. However in a human, if you knock out the emotions in places where you were otherwise relying on them for information or to resolve uncertainty, you’re going to have a big problem.

    4. I notice an obvious thing to try but it isn’t obvious how to implement it?

  3. Ilya has faith in deep learning. There’s nothing it can’t do!

  1. Data? Parameters? Compute? What else? It’s easier and more reliable to scale up pretraining than to figure out what else to do. But we’ll run out of data soon even if Gemini 3 got more out of this, so now you need to do something else. If you had 100x more scale here would anything be that different? Ilya thinks no.

    1. Sounds like a skill issue, on some level, but yes if you didn’t change anything else then I expect scaling up pretraining further won’t help enough to justify the increased costs in compute and time.

  2. RL costs now exceed pretraining costs, because each RL run costs a lot. It’s time to get back to an age of research, trying interesting things and seeing what happens.

    1. I notice I am skeptical of the level of skepticism, also I doubt the research mode ever stopped in the background. The progress will continue. It’s weird how every time someone says ‘we still need some new idea or breakthrough’ there is the implication that this likely never happens again.

  1. Why do AIs require so much more data than humans to learn? Why don’t models easily pick up on all this stuff humans learn one-shot or in the background?

    1. Humans have richer data than text so the ratio is not as bad as it looks, but primarily because our AI learning techniques are relatively primitive and data inefficient in various ways.

    2. My full answer to how to fix it falls under ‘I don’t do $100m/year jobs for free.’

    3. Also there are ways in which the LLMs learn way better than you realize, and a lot of the tasks humans easily learn are regularized in non-obvious ways.

  2. Ilya believes humans being good at learning is mostly not part of some complicated prior, and people’s robustness is really staggering.

    1. I would clarify, not part of a complicated specialized prior. There is also a complicated specialized prior in some key domains, but that is in addition to a very strong learning function.

    2. People are not as robust as Ilya thinks, or most people think.

  3. Ilya suggests perhaps human neurons use more compute than we think.

  1. Scaling ‘sucked the air out of the room’ so no one did anything else. Now there are more companies than ideas. You need some compute to bring ideas to life, but not the largest amounts.

    1. You can also think about some potential techniques as ‘this is not worth trying unless you have massive scale.’

  2. SSI’s compute all goes into research, none into inference, and they don’t try to build a product, and if you’re doing something different you don’t have to use maximum scale, so their $3 billion that they’ve raised ‘goes a long way’ relative to the competition. Sure OpenAI spends ~$5 billion a year on experiments, but it’s what you do with it.

    1. This is what Ilya has to say in this spot, but there’s merit in it. OpenAI’s experiments are largely about building products now. This transfers to the quest for superintelligence, but not super efficiently.

  3. How will SSI make money? Focus on the research, the money will appear.

    1. Matt Levine has answered this one, which is that you make money by being an AI company full of talented researchers, so people give you money.

  4. SSI is considering making a product anyway, both to have the product exist and also because timelines might be long.

    1. I mean I guess at some point the ‘we are AI researchers give us money’ strategy starts to look a little suspicious, but let’s not rush into anything.

    2. Remember, Ilya, once you have a product and try to have revenue they’ll evaluate the product and your revenue. If you don’t have one, you’re safe.

  1. Ilya says even if there is a straight shot to superintelligence deployment would be gradual, you have to ship something first, and that he agrees with Dwarkesh on the importance of continual learning, it would ‘go and be’ various things and learn, superintelligence is not a finished mind.

    1. Learning takes many forms, including continual learning, it can be updating within the mind or otherwise, and so on. See previous podcast discussions.

  2. Ilya expects ‘rapid’ economic growth, perhaps ‘very rapid.’ It will vary based on what rules are set in different places.

    1. Rapid means different things to different people, it sounds like Ilya doesn’t have a fixed rate in mind. I interpret it as ‘more than these 2% jokers.’

    2. This vision still seems to think the humans stay in charge. Why?

  1. Dwarkesh reprises the standard point that if AIs are merely ‘as good at’ humans at learning, but they can ‘merge brains’ then crazy things happen. How do we make such a situation go well? What is SSI’s plan?

    1. I mean, that’s the least of it, but hopefully yes that suffices to make the point?

  2. Ilya emphasizes deploying incrementally and in advance. It’s hard to predict what this will be like in advance. “The problem is the power. When the power is really big, what’s going to happen? If it’s hard to imagine, what do you do? You’ve got to be showing the thing.”

    1. This feels like defeatism, in terms of saying we can only respond to things once we can see and appreciate them. We can’t plan for being old until we know what that’s like. We can’t plan for AGI/ASI, or AI having a lot of power, until we can see that in action.

    2. But obviously by then it is likely to be too late, and most of your ability to steer what happens has already been lost, perhaps all of it.

    3. This is the strategy of ‘muddle through’ the same as we always muddle through, basically the plan of not having a plan other than incrementalism. I do not care for this plan. I am not happy to be a part of it. I do not think that is a case of Safe Superintelligence.

  3. Ilya expects governments and labs to play big roles, and for labs to increasingly coordinate on safety, as Anthropic and OpenAI did in a recent first step. And we have to figure out what we should be building. He suggests making the AI care about sentient life in general will be ‘easier’ than making it care about humans, since the AI will be sentient.

    1. If the AIs do not care about humans in particular, there is no reason to expect humans to stay in control or to long endure.

  4. Ilya would like the most powerful superintelligence to ‘somehow’ be ‘capped’ to address these concerns. But he doesn’t know how to do that.

    1. I don’t know how to do that either. It’s not clear the idea is coherent.

  5. Dwarkesh asks how much ‘room is there at the top’ for superintelligence to be more super? Maybe it just learns fast or has a bigger pool of strategies or skills or knowledge? Ilya says very powerful, for sure.

    1. Sigh. There is very obviously quite a lot of ‘room at the top’ and humans are not anything close to maximally intelligent, nor to getting most of what intelligence has to offer. At this point, the number of people who still don’t realize or accept this reinforces how much better a smarter entity could be.

  6. Ilya expects these superintelligences to be very large, as in physically large, and for several to come into being at roughly the same time, and ideally they could “be restrained in some ways or if there was some kind of agreement or something.”

    1. That agreement between AIs would then be unlikely to include us. Yes, functional restraints would be nice, but this is the level of thought that has gone into finding ways to do it.

    2. A lot of things have stayed remarkably close, but much of that is because, so far, catching up has been easier than having an edge compound and accelerate.

  7. Ilya: “What is the concern of superintelligence? What is one way to explain the concern? If you imagine a system that is sufficiently powerful, really sufficiently powerful—and you could say you need to do something sensible like care for sentient life in a very single-minded way—we might not like the results. That’s really what it is.”

    1. Well, yes, standard Yudkowsky, no fixed goal we can name turns out well.

  8. Ilya says maybe we don’t build an RL agent. Humans are semi-RL agents, our emotions make us alter our rewards and pursue different rewards after a while. If we keep doing what we are doing now it will soon peter out and never be “it.”

    1. There’s a baked in level of finding innovations and improvements that should be in anyone’s ‘keep doing what we are doing’ prior, and I think it gets us pretty far and includes many individually low-probability-of-working innovations making substantial differences. There is some level on which we would ‘peter out’ without a surprise, but it’s not clear that this requires being surprised overall.

    2. Is it possible things do peter out and we never see ‘it’? Yeah. It’s possible. I think it’s a large underdog to stay that way for long, but it’s possible. Still a long practical way to go even then.

    3. Emotions, especially boredom and the fading of positive emotions on repetition, are indeed one of the ways we push ourselves towards exploration and variety. That’s one of many things they do, and yes if we didn’t have them then we would need something else to take their place.

    4. In many cases I have indeed used logic to take the place of that, when emotion seems to not be sufficiently preventing mode collapse.

  9. “One of the things that you could say about what causes alignment to be difficult is that your ability to learn human values is fragile. Then your ability to optimize them is fragile. You actually learn to optimize them. And can’t you say, “Are these not all instances of unreliable generalization?” Why is it that human beings appear to generalize so much better? What if generalization was much better? What would happen in this case? What would be the effect? But those questions are right now still unanswerable.”

    1. It is cool to hear Ilya restate these Yudkowsky 101 things.

    2. Humans do not actually generalize all that well.

  10. How does one think about what AI going well looks like? Ilya goes back to ‘AI that cares for sentient life’ as a first step, but then asks the better question, what is the long run equilibrium? He notices he does not like his answer. Maybe each person has an AI that will do their bidding and that’s good, but the downside is then the AI does things like earn money or advocate or whatever, and the person says ‘keep it up’ but they’re not a participant. Precarious. People become part AI, Neuralink++. He doesn’t like this solution, but it is at least a solution.

    1. Big points for acknowledging that there are no known great solutions.

    2. Big points for pointing out one big flaw, that the people stop actually doing the things, because the AIs do the things better.

    3. The equilibrium here is that increasingly more things are turned over to AIs, including both actions and decisions. Those who don’t do this fall behind.

    4. The equilibrium here is that increasingly AIs are given more autonomy, more control, put in better positions, have increasing power and wealth shares, and so on, even if everything involved is fully voluntary and ‘nothing goes wrong.’

    5. Neuralink++ does not meaningfully solve any of the problems here.

    6. Solve for the equilibrium.

  11. Is the long history of emotions an alignment success? As in, it allows the brain to move from ‘mate with somebody who’s more successful’ into flexibly defining success and generally adjusting to new situations.

    1. It’s a highly mixed bag, wouldn’t you say?

    2. There are ways in which those emotions have been flexible and adaptable and a success, and have succeeded in the alignment target (inclusive genetic fitness) and also ways in which emotions are very obviously failing people.

    3. If ASIs are about as aligned as we are in this sense, we’re doomed.

  12. Ilya says it’s mysterious how evolution encodes high-level desires, but it gives us all these social desires, and they evolved pretty recently. Dwarkesh points out these are desires you learned in your lifetime. Ilya notes the brain has regions and some things are hardcoded, but if you remove half the brain then the regions move, the social stuff is highly reliable.

    1. I don’t pretend to understand the details here, although I could speculate.

  1. SSI investigates ideas to see if they are promising. They do research.

  2. On his cofounder leaving: “For this, I will simply remind a few facts that may have been forgotten. I think these facts which provide the context explain the situation. The context was that we were fundraising at a $32 billion valuation, and then Meta came in and offered to acquire us, and I said no. But my former cofounder in some sense said yes. As a result, he also was able to enjoy a lot of near-term liquidity, and he was the only person from SSI to join Meta.”

    1. I love the way he put that. Yes.

  3. “The main thing that distinguishes SSI is its technical approach. We have a different technical approach that I think is worthy and we are pursuing it. I maintain that in the end there will be a convergence of strategies. I think there will be a convergence of strategies where at some point, as AI becomes more powerful, it’s going to become more or less clearer to everyone what the strategy should be. It should be something like, you need to find some way to talk to each other and you want your first actual real superintelligent AI to be aligned and somehow care for sentient life, care for people, democratic, one of those, some combination thereof. I think this is the condition that everyone should strive for. That’s what SSI is striving for. I think that this time, if not already, all the other companies will realize that they’re striving towards the same thing. We’ll see. I think that the world will truly change as AI becomes more powerful. I think things will be really different and people will be acting really differently.”

    1. This is a remarkably shallow, to me, vision of what the alignment part of the strategy looks like, but it does get an admirably large percentage of the overall strategic vision, as in most of it?

    2. The idea that ‘oh as we move farther along people will get more responsible and cooperate more’ seems to not match what we have observed so far, alas.

    3. Ilya later clarifies he specifically meant convergence on alignment strategies, although he also expects convergence on technical strategies.

    4. The above statement is convergence on an alignment goal, but that doesn’t imply convergence on alignment strategy. Indeed it does not imply that an alignment strategy that is workable even exists.

  4. Ilya’s timeline to the system that can learn and become superhuman? 5-20 years.

  5. Ilya predicts that when someone releases the thing, that will be information, but it won’t teach others how to do the thing, although they will eventually learn.

  6. What is the ‘good world’? We have powerful human-like learners and perhaps narrow ASIs, and companies make money, and there is competition through specialization, different niches. Accumulated learning and investment creates specialization.

    1. This is so frustrating, in that it doesn’t explain why you would expect that to be how this plays out, or why this world turns out well, or anything really? Which would be fine if the answers were clear or at least those seemed likely, but I very much don’t think that.

    2. This feels like a claim that humans are indeed near the upper limit of what intelligence can do and what can be learned, except that we are hobbled in various ways and AIs can be unhobbled, but that still leaves them functioning in ways that seem recognizably human and that don’t crowd us out? Except again I don’t think we should expect this.

  7. Dwarkesh points out that current LLMs are similar to each other; Ilya says perhaps the datasets are not as non-overlapping as they seem.

    1. On the contrary, I was assuming they were mostly the same baseline data, and then they do different filtering and progressions from there? Not that there’s zero unique data but that most companies have ‘most of the data.’

  8. Dwarkesh suggests that AIs will therefore have less diversity than human teams. How can we get ‘meaningful diversity’? Ilya says this is because of pretraining, and that post-training is different.

    1. To the extent that such ‘diversity’ is useful it seems easy to get with effort. I suspect this is mostly another way to create human copium.

  9. What about using self-play? Ilya notes it allows using only compute, which is very interesting, but it is only good for ‘developing a certain set of skills.’ Negotiation, conflict, certain social strategies, strategizing, that kind of stuff. Then Ilya self-corrects and notes other forms, like debate, prover-verifier, or forms of LLM-as-a-judge; it’s a special case of agent competition.

    1. I think there’s a lot of promising unexplored space here, decline to say more.

  1. What is research taste? How does Ilya come up with many big ideas?

This is hard to excerpt and seems important, so quoting in full to close out:

I can comment on this for myself. I think different people do it differently. One thing that guides me personally is an aesthetic of how AI should be, by thinking about how people are, but thinking correctly. It’s very easy to think about how people are incorrectly, but what does it mean to think about people correctly?

I’ll give you some examples. The idea of the artificial neuron is directly inspired by the brain, and it’s a great idea. Why? Because you say the brain has all these different organs, it has the folds, but the folds probably don’t matter. Why do we think that the neurons matter? Because there are many of them. It kind of feels right, so you want the neuron. You want some local learning rule that will change the connections between the neurons. It feels plausible that the brain does it.

The idea of the distributed representation. The idea that the brain responds to experience therefore our neural net should learn from experience. The brain learns from experience, the neural net should learn from experience. You kind of ask yourself, is something fundamental or not fundamental? How things should be.

I think that’s been guiding me a fair bit, thinking from multiple angles and looking for almost beauty, beauty and simplicity. Ugliness, there’s no room for ugliness. It’s beauty, simplicity, elegance, correct inspiration from the brain. All of those things need to be present at the same time. The more they are present, the more confident you can be in a top-down belief.

The top-down belief is the thing that sustains you when the experiments contradict you. Because if you trust the data all the time, well sometimes you can be doing the correct thing but there’s a bug. But you don’t know that there is a bug. How can you tell that there is a bug? How do you know if you should keep debugging or you conclude it’s the wrong direction? It’s the top-down. You can say things have to be this way. Something like this has to work, therefore we’ve got to keep going. That’s the top-down, and it’s based on this multifaceted beauty and inspiration by the brain.

I need to think more about what causes my version of ‘research taste.’ It’s definitely substantially different.

That ends our podcast coverage; now enter the bonus section, which seems better placed here than in the weekly, as it covers many of the same themes.

Dwarkesh Patel offers his thoughts on AI progress these days, noticing that when we get the thing he calls ‘actual AGI’ things are going to get fucking crazy, but thinking that this is 10-20 years away from happening in full. Until then, he’s a bit skeptical of how many gains we can realize, but skepticism is highly relative here.

Dwarkesh Patel: I’m confused why some people have short timelines and at the same time are bullish on RLVR. If we’re actually close to a human-like learner, this whole approach is doomed.

… Either these models will soon learn on the job in a self directed way – making all this pre-baking pointless – or they won’t – which means AGI is not imminent. Humans don’t have to go through a special training phase where they need to rehearse every single piece of software they might ever use.

Wow, look at those goalposts move (in all the different directions). Dwarkesh notes that the bears keep shifting the goalposts on the bulls, but says this is justified because current models meet the old goalposts but don’t score the points, as in they don’t automate workflows as much as you would expect.

In general, I worry about the expectation pattern having taken the form of ‘median 50 years → 20 → 10 → 5 → 7, and once I heard someone say 3, so oh, nothing to see there, you can stop worrying.’

In this case, look at the shift: An ‘actual’ (his term) AGI must now not only be capable of human-like performance on tasks, it must also be a human-efficient learner.

That would mean AGI and ASI are the same thing, or at least arrive in rapid succession. An AI that was human-efficient at learning from data, combined with AI’s other advantages that include imbibing orders of magnitude more data, would be a superintelligence and would absolutely set off recursive self-improvement from there.

And yes, if that’s what you mean then AGI isn’t the best concept for thinking about timelines, and superintelligence is the better target to talk about. Sriram Krishnan, however, is opposed to using either of them.

Like all conceptual handles or fake frameworks, it is imprecise and overloaded, but people’s intuitions about it miss that the thing is possible or exists even when you outright say ‘superintelligence,’ and I shudder to think how badly they will miss the concept if you don’t even say it. Which I think is a lot of the motivation behind not wanting to say it, so people can pretend that there won’t be things smarter than us in any meaningful sense and thus we can stop worrying about it or planning for it.

Indeed, this is exactly Sriram’s agenda if you look at his post here: to claim ‘we are not on the timeline’ that involves such things, to dismiss concerns as ‘sci-fi’ or philosophical, and to talk instead of ‘what we are trying to build.’ What matters is what actually gets built, not what we intended, and no, none of these concepts have been invalidated. We have ‘no proof of takeoff’ in the sense that we are not currently in a fast takeoff yet, but what would constitute this ‘proof’ other than already being in a takeoff, and thus it being too late to do anything about it?

Sriram Krishnan: …most importantly, it invokes fear—connected to historical usage in sci-fi and philosophy (think 2001, Her, anything invoking the singularity) that has nothing to do with the tech tree we’re actually on. Makes every AI discussion incredibly easy to anthropomorphize and detour into hypotheticals.

Joshua Achiam (OpenAI Head of Mission Alignment): I mostly disagree but I think this is a good contribution to the discourse. Where I disagree: I do think AGI and ASI both capture something real about where things are going. Where I agree: the lack of agreed-upon definitions has 100% created many needless challenges.

To treat ‘hypotheticals,’ as in future capabilities and their logical consequences, as ‘detours,’ or to dismiss any such things as ‘sci-fi or philosophy,’ is to deny the very idea of planning for future capabilities or thinking about the future in real ways. Sriram himself only thinks they are 10 years away, and then the difference is he doesn’t add Dwarkesh’s ‘and that’s fucking crazy’ and instead seems to effectively say ‘and that’s a problem for future people, ignore it.’

Seán Ó hÉigeartaigh: I keep noting this, but I do think a lot of the most heated policy debates we’re having are underpinned by a disagreement on scientific view: whether we (i) are on track in coming decade for something in the AGI/ASI space that can achieve scientific feats equivalent to discovering general relativity (Hassabis’ example), or (ii) should expect AI as a normal technology (Narayanan & Kapoor’s definition).

I honestly don’t know. But it feels premature to me to rule out (i) on the basis of (slightly) lengthening timelines from the believers, when progress is clearly continuing and a historically unprecedented level of resources are going into the pursuit of it. And premature to make policy on the strong expectation of (ii). (I also think it would be premature to make policy on the strong expectation of (i) ).

But we are coming into the time where policy centred around worldview (ii) will come into tension in various places with the policies worldview (i) advocates would enact if given a free hand. Over the coming decade I hope we can find a way to navigate a path between, rather than swing dramatically based on which worldview is in the ascendancy at a given time.

Sriram Krishnan: There is truth to this.

This paints it as two views, and I would say you need at least three:

  1. Something in the AGI/ASI space is likely in less than 10 years.

  2. Something in the AGI/ASI space is unlikely in less than about 10 years, but highly plausible in 10-20 years, until then AI is a normal technology.

  3. AI is a normal technology and we know it will remain so indefinitely. We can regulate and plan as if AGI/ASI style technologies will never happen.

I think #1 and #2 are both highly reasonable positions, and only #3 is unreasonable, while noting that if you believe #2 you still need to put some non-trivial weight on #1. As in, if you think it probably takes ~10 years then you can perhaps all but rule out AGI 2027, and think 2031 is unlikely, but you cannot claim 2031 is a Can’t Happen.

The conflation to watch out for is #2 and #3. These are very different positions. Yet many in the AI industry, and its political advocates, make exactly this conflation. They assert ‘#1 is incorrect, therefore #3’; when challenged for details they articulate claim #2, then go back to trying to claim #3 and acting on the basis of #3.

What’s craziest is that the list of things to rule out, chosen by Sriram, includes the movie Her. Her made many very good predictions. Her was a key inspiration for ChatGPT and its voice mode, so much so that there was a threatened lawsuit because they all but copied Scarlett Johansson’s voice. She’s happening. Best be believing in sci-fi stories, because you’re living in one, and all that.

Nothing about current technology is a reason to think 2001-style things or a singularity will not happen, or to think we should anthropomorphize AI relatively less (the correct amounts for current AIs and for future AIs are both importantly not zero, and importantly not 100%, and both mistakes are frequently made). Indeed, Dwarkesh is de facto predicting a takeoff and a singularity in this post that Sriram praised, except Dwarkesh has it on a 10-20 year timescale to get started.

Now, back to Dwarkesh.

This process of ‘teach the AI the specific tasks people most want’ is the central instance of models being what Teortaxes calls usemaxxed. A lot of effort is going to specific improvements rather than to advancing general intelligence. And yes, this is evidence against extremely short timelines. It is also, as Dwarkesh notes, evidence in favor of large amounts of mundane utility soon, including the ability to accelerate R&D. What else would justify such massive ‘side’ efforts?

There’s also, as he notes, the efficiency argument: skills many people want should be baked into the core model. Dwarkesh fires back that there are a lot of skills that are instance-specific and require on-the-job or continual learning, which he’s been emphasizing a lot for a while. I continue not to see a contradiction, or why it would be that hard to store that knowledge and make it available as needed, even if it’s hard for the LLM to permanently learn it.
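To make that concrete, here is a minimal, hypothetical sketch of what ‘store that knowledge and make it available as needed’ could look like: job-specific corrections get saved as plain-text notes and injected into the prompt on later runs, with no weight updates at all. The `call_llm` function, the note format, and the file path are placeholders of my own, not any particular product’s API.

```python
import json
from pathlib import Path

NOTES_PATH = Path("job_notes.json")  # hypothetical on-disk store of task-specific knowledge


def load_notes() -> dict[str, list[str]]:
    """Load previously saved notes, keyed by task name."""
    if NOTES_PATH.exists():
        return json.loads(NOTES_PATH.read_text())
    return {}


def save_note(task: str, note: str) -> None:
    """Append a correction or piece of on-the-job knowledge for a task."""
    notes = load_notes()
    notes.setdefault(task, []).append(note)
    NOTES_PATH.write_text(json.dumps(notes, indent=2))


def build_prompt(task: str, request: str) -> str:
    """Prepend whatever has been learned about this task to the model's prompt."""
    learned = load_notes().get(task, [])
    context = "\n".join(f"- {n}" for n in learned)
    return f"Known quirks for {task}:\n{context}\n\nTask: {request}"


def call_llm(prompt: str) -> str:
    """Placeholder for whichever model call you actually use."""
    raise NotImplementedError


# Usage: record a fix once after the model mishandles something job-specific,
# and every later request for that task sees it in context.
# save_note("macrophage-id", "This lab stains slides darker; lower the threshold.")
# answer = call_llm(build_prompt("macrophage-id", "Count macrophages in slide_041.png"))
```

This obviously isn’t continual learning in the weight-update sense, but for a lot of on-the-job knowledge it may be all you need.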

I strongly disagree with his claim that ‘economic diffusion lag is cope for missing capabilities.’ I agree that many highly valuable capabilities are missing. Some of them are missing due to lack of proper scaffolding or diffusion or context, and are fundamentally Skill Issues by the humans. Others are foundational shortcomings. But the idea that the AIs aren’t up to vastly more tasks than they’re currently asked to do seems obviously wrong?

He quotes Steven Byrnes:

Steven Byrnes: New technologies take a long time to integrate into the economy? Well ask yourself: how do highly-skilled, experienced, and entrepreneurial immigrant humans manage to integrate into the economy immediately? Once you’ve answered that question, note that AGI will be able to do those things too.

Again, this is saying that AGI will be as strong as humans in the exact place it is currently weakest, and will not require adjustments for us to take advantage of it. No, it is saying more than that: it is also saying we won’t put various regulatory and legal and cultural barriers in its way, either, not in any way that counts.

If the AGI Dwarkesh is thinking about were to exist, again, it would be an ASI, and it would be all over for the humans very quickly.

I also strongly disagree with human labor not being ‘shleppy to train’ (bonus points, however, for excellent use of ‘shleppy’). I have trained humans and been a human being trained, and it is totally shleppy. I agree, not as shleppy as current AIs can be when something is out of their wheelhouse, but rather obnoxiously shleppy everywhere except their own very narrow wheelhouse.

Here’s another example of ‘oh my lord check out those goalposts’:

Dwarkesh Patel: It revealed a key crux between me and the people who expect transformative economic impacts in the next few years.

Transformative economic impacts in the next few years would be a hell of a thing.

It’s not net-productive to build a custom training pipeline to identify what macrophages look like given the way this particular lab prepares slides, then another for the next lab-specific micro-task, and so on. What you actually need is an AI that can learn from semantic feedback on the job and immediately generalize, the way a human does.

Well, no, it probably isn’t now, but also Claude Code is getting rather excellent at creating training pipelines, and the whole thing is rather standard in that sense, so I’m not convinced we are that far away from doing exactly that. This is an example of how sufficient ‘AI R&D’ automation, even on a small non-recursive scale, can transform use cases.
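As an illustration of how standard such a pipeline is, here is a bare-bones, hypothetical sketch of the kind of lab-specific fine-tune in question, assuming labeled slide crops sorted into class folders; the paths, class count, and hyperparameters are placeholders and would need tuning for any real lab.

```python
# Minimal, hypothetical sketch of a lab-specific image classifier fine-tune.
# Assumes labeled crops sorted into slides/train/<class_name>/ folders.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Labeled crops reflecting this particular lab's slide-prep style.
train_data = datasets.ImageFolder("slides/train", transform=transform)
loader = DataLoader(train_data, batch_size=32, shuffle=True)

# Start from a generic pretrained backbone; only the head is task-specific.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, len(train_data.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # a few epochs is often enough for a narrow task
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "macrophage_classifier.pt")
```

Nothing in it is exotic, which is the point: the schlep is in the labeling and the plumbing, and that is exactly the part coding agents are getting good at handling.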

Every day, you have to do a hundred things that require judgment, situational awareness, and skills & context learned on the job. These tasks differ not just across different people, but from one day to the next even for the same person. It is not possible to automate even a single job by just baking in some predefined set of skills, let alone all the jobs.

Well, I mean of course it is, for a sufficiently broad set of skills at a sufficiently high level, especially if this includes meta-skills and you can access additional context. Why wouldn’t it be? It certainly can quickly automate large portions of many jobs, and yes I have started to automate portions of my job indirectly (as in Claude writes me the mostly non-AI tools to do it, and adjusts them every time they do something wrong).

Give it a few more years, though, and Dwarkesh is on the same page as I am:

In fact, I think people are really underestimating how big a deal actual AGI will be because they’re just imagining more of this current regime. They’re not thinking about billions of human-like intelligences on a server which can copy and merge all their learnings. And to be clear, I expect this (aka actual AGI) in the next decade or two. That’s fucking crazy!

Exactly. This ‘actual AGI’ is fucking crazy, and his timeline for getting there of 10-20 years is also fucking crazy. More people need to add ‘and that’s fucking crazy’ at the end of such statements.

Dwarkesh then talks more about continual learning. His position here hasn’t changed, and neither has my reaction that this isn’t needed; we can get the benefits in other ways. He says that the gradual progress on continual learning means it won’t be ‘game set match’ to the first mover, but if this is the final piece of the puzzle then why wouldn’t it be?

On Dwarkesh Patel’s Second Interview With Ilya Sutskever Read More »