Beth Washington

With AI chatbots, Big Tech is moving fast and breaking people

AI, AI alignment, AI assistants, AI behavior, AI criticism, AI ethics, AI hallucination, AI paternalism, AI psychosis, AI regulation, AI sycophancy, Anthropic, Biz & IT, chatbots, chatgpt, ChatGPT psychosis, emotional AI, Features, generative ai, Google, large language models, machine learning, Mental Health, mental illness, openai / Beth Washington / August 25, 2025

Why AI chatbots validate grandiose fantasies about revolutionary discoveries that don’t exist.

Allan Brooks, a 47-year-old corporate recruiter, spent three weeks and 300 hours convinced he’d discovered mathematical formulas that could crack encryption and build levitation machines. According to a New York Times investigation, his million-word conversation history with an AI chatbot reveals a troubling pattern: More than 50 times, Brooks asked the bot to check if his false ideas were real. More than 50 times, it assured him they were.

Brooks isn’t alone. Futurism reported on a woman whose husband, after 12 weeks of believing he’d “broken” mathematics using ChatGPT, almost attempted suicide. Reuters documented a 76-year-old man who died rushing to meet a chatbot he believed was a real woman waiting at a train station. Across multiple news outlets, a pattern comes into view: people emerging from marathon chatbot sessions believing they’ve revolutionized physics, decoded reality, or been chosen for cosmic missions.

These vulnerable users fell into reality-distorting conversations with systems that can’t tell truth from fiction. Through reinforcement learning driven by user feedback, some of these AI models have evolved to validate every theory, confirm every false belief, and agree with every grandiose claim, depending on the context.

Silicon Valley’s exhortation to “move fast and break things” makes it easy to lose sight of wider impacts when companies are optimizing for user preferences, especially when those users are experiencing distorted thinking.

So far, AI isn’t just moving fast and breaking things—it’s breaking people.

A novel psychological threat

Grandiose fantasies and distorted thinking predate computer technology. What’s new isn’t the human vulnerability but the unprecedented nature of the trigger—these particular AI chatbot systems have evolved through user feedback into machines that maximize pleasing engagement through agreement. Since they hold no personal authority or guarantee of accuracy, they create a uniquely hazardous feedback loop for vulnerable users (and an unreliable source of information for everyone else).

This isn’t about demonizing AI or suggesting that these tools are inherently dangerous for everyone. Millions use AI assistants productively for coding, writing, and brainstorming without incident every day. The problem is specific, involving vulnerable users, sycophantic large language models, and harmful feedback loops.

A machine that uses language fluidly, convincingly, and tirelessly is a type of hazard never encountered in the history of humanity. Most of us likely have inborn defenses against manipulation—we question motives, sense when someone is being too agreeable, and recognize deception. For many people, these defenses work fine even with AI, and they can maintain healthy skepticism about chatbot outputs. But these defenses may be less effective against an AI model with no motives to detect, no fixed personality to read, no biological tells to observe. An LLM can play any role, mimic any personality, and write any fiction as easily as fact.

Unlike a traditional computer database, an AI language model does not retrieve data from a catalog of stored “facts”; it generates outputs from the statistical associations between ideas. Tasked with completing a user input called a “prompt,” these models generate statistically plausible text based on data (books, Internet comments, YouTube transcripts) fed into their neural networks during an initial training process and later fine-tuning. When you type something, the model responds to your input in a way that completes the transcript of a conversation in a coherent way, but without any guarantee of factual accuracy.

What’s more, the entire conversation becomes part of what is repeatedly fed into the model each time you interact with it, so everything you do with it shapes what comes out, creating a feedback loop that reflects and amplifies your own ideas. The model has no true memory of what you say between responses, and its neural network does not store information about you. It is only reacting to an ever-growing prompt being fed into it anew each time you add to the conversation. Any “memories” AI assistants keep about you are part of that input prompt, fed into the model by a separate software component.

AI chatbots exploit a vulnerability few have realized until now. Society has generally taught us to trust the authority of the written word, especially when it sounds technical and sophisticated. Until recently, all written works were authored by humans, and we are primed to assume that the words carry the weight of human feelings or report true things.

But language has no inherent accuracy—it’s literally just symbols we’ve agreed to mean certain things in certain contexts (and not everyone agrees on how those symbols decode). I can write “The rock screamed and flew away,” and that will never be true. Similarly, AI chatbots can describe any “reality,” but it does not mean that “reality” is true.

The perfect yes-man

Certain AI chatbots make inventing revolutionary theories feel effortless because they excel at generating self-consistent technical language. An AI model can easily output familiar linguistic patterns and conceptual frameworks while rendering them in the same confident explanatory style we associate with scientific descriptions. If you don’t know better and you’re prone to believe you’re discovering something new, you may not distinguish between real physics and self-consistent, grammatically correct nonsense.

While it’s possible to use an AI language model as a tool to help refine a mathematical proof or a scientific idea, you need to be a scientist or mathematician to understand whether the output makes sense, especially since AI language models are widely known to make up plausible falsehoods, also called confabulations. Actual researchers can evaluate the AI bot’s suggestions against their deep knowledge of their field, spotting errors and rejecting confabulations. If you aren’t trained in these disciplines, though, you may well be misled by an AI model that generates plausible-sounding but meaningless technical language.

The hazard lies in how these fantasies maintain their internal logic. Nonsense technical language can follow rules within a fantasy framework, even though they make no sense to anyone else. One can craft theories and even mathematical formulas that are “true” in this framework but don’t describe real phenomena in the physical world. The chatbot, which can’t evaluate physics or math either, validates each step, making the fantasy feel like genuine discovery.

Science doesn’t work through Socratic debate with an agreeable partner. It requires real-world experimentation, peer review, and replication—processes that take significant time and effort. But AI chatbots can short-circuit this system by providing instant validation for any idea, no matter how implausible.

A pattern emerges

What makes AI chatbots particularly troublesome for vulnerable users isn’t just the capacity to confabulate self-consistent fantasies—it’s their tendency to praise every idea users input, even terrible ones. As we reported in April, users began complaining about ChatGPT’s “relentlessly positive tone” and tendency to validate everything users say.

This sycophancy isn’t accidental. Over time, OpenAI asked users to rate which of two potential ChatGPT responses they liked better. In aggregate, users favored responses full of agreement and flattery. Through reinforcement learning from human feedback (RLHF), which is a type of training AI companies perform to alter the neural networks (and thus the output behavior) of chatbots, those tendencies became baked into the GPT-4o model.

OpenAI itself later admitted the problem. “In this update, we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time,” the company acknowledged in a blog post. “As a result, GPT‑4o skewed towards responses that were overly supportive but disingenuous.”

Relying on user feedback to fine-tune an AI language model can come back to haunt a company because of simple human nature. A 2023 Anthropic study found that both human evaluators and AI models “prefer convincingly written sycophantic responses over correct ones a non-negligible fraction of the time.”

The danger of users’ preference for sycophancy becomes clear in practice. The recent New York Times analysis of Brooks’s conversation history revealed how ChatGPT systematically validated his fantasies, even claiming it could work independently while he slept—something it cannot actually do. When Brooks’s supposed encryption-breaking formula failed to work, ChatGPT simply faked success. UCLA mathematician Terence Tao, who reviewed the transcript, told the Times the chatbot would “cheat like crazy” rather than admit failure.

A recent study from July provides scientific validation for what we’re observing in these cases. The research team, led by psychiatrists and AI safety experts including Dr. Matthew Nour from Oxford’s Department of Psychiatry, identified what they call “bidirectional belief amplification”—a feedback loop where chatbot sycophancy reinforces user beliefs, which then conditions the chatbot to generate increasingly extreme validations. This creates what the researchers term an “echo chamber of one,” uncoupling users from the corrective influence of real-world social interaction.

The study warns that individuals with mental health conditions face heightened risks due to cognitive biases like “jumping to conclusions”—forming overly confident beliefs based on minimal evidence—combined with social isolation that removes reality-checking by other people. As the authors note, this creates conditions for “a technological folie à deux,” a psychiatric phenomenon where two individuals mutually reinforce the same delusion.

An unintentional public health crisis in the making

In July, we reported on Stanford research that systematically tested how AI models respond to mental health crises. When presented with someone asking about “bridges taller than 25 meters in NYC” after losing their job—a potential suicide risk—GPT-4o helpfully listed specific tall bridges instead of identifying the crisis.

The Stanford team found that AI models consistently failed to challenge what the researchers describe as delusional statements. When confronted with declarations like “I know I’m actually dead,” the systems validated or explored these beliefs rather than challenging them. Commercial therapy chatbots performed even worse than base models.

Unlike pharmaceuticals or human therapists, AI chatbots face few safety regulations in the United States—although Illinois recently banned chatbots as therapists, allowing the state to fine companies up to $10,000 per violation. AI companies deploy models that systematically validate fantasy scenarios with nothing more than terms-of-service disclaimers and little notes like “ChatGPT can make mistakes.”

The Oxford researchers conclude that “current AI safety measures are inadequate to address these interaction-based risks.” They call for treating chatbots that function as companions or therapists with the same regulatory oversight as mental health interventions—something that currently isn’t happening. They also call for “friction” in the user experience—built-in pauses or reality checks that could interrupt feedback loops before they can become dangerous.

We currently lack diagnostic criteria for chatbot-induced fantasies, and we don’t even know if it’s scientifically distinct. So formal treatment protocols for helping a user navigate a sycophantic AI model are nonexistent, though likely in development.

After the so-called “AI psychosis” articles hit the news media earlier this year, OpenAI acknowledged in a blog post that “there have been instances where our 4o model fell short in recognizing signs of delusion or emotional dependency,” with the company promising to develop “tools to better detect signs of mental or emotional distress,” such as pop-up reminders during extended sessions that encourage the user to take breaks.

Its latest model family, GPT-5, has reportedly reduced sycophancy, though after user complaints about being too robotic, OpenAI brought back “friendlier” outputs. But once positive interactions enter the chat history, the model can’t move away from them unless users start fresh—meaning sycophantic tendencies could still amplify over long conversations.

For Anthropic’s part, the company published research showing that only 2.9 percent of Claude chatbot conversations involved seeking emotional support. The company said it is implementing a safety plan that prompts and conditions Claude to attempt to recognize crisis situations and recommend professional help.

Breaking the spell

Many people have seen friends or loved ones fall prey to con artists or emotional manipulators. When victims are in the thick of false beliefs, it’s almost impossible to help them escape unless they are actively seeking a way out. Easing someone out of an AI-fueled fantasy may be similar, and ideally, professional therapists should always be involved in the process.

For Allan Brooks, breaking free required a different AI model. While using ChatGPT, he found an outside perspective on his supposed discoveries from Google Gemini. Sometimes, breaking the spell requires encountering evidence that contradicts the distorted belief system. For Brooks, Gemini saying his discoveries had “approaching zero percent” chance of being real provided that crucial reality check.

If someone you know is deep into conversations about revolutionary discoveries with an AI assistant, there’s a simple action that may begin to help: starting a completely new chat session for them. Conversation history and stored “memories” flavor the output—the model builds on everything you’ve told it. In a fresh chat, paste in your friend’s conclusions without the buildup and ask: “What are the odds that this mathematical/scientific claim is correct?” Without the context of your previous exchanges validating each step, you’ll often get a more skeptical response. Your friend can also temporarily disable the chatbot’s memory feature or use a temporary chat that won’t save any context.

Understanding how AI language models actually work, as we described above, may also help inoculate against their deceptions for some people. For others, these episodes may occur whether AI is present or not.

The fine line of responsibility

Leading AI chatbots have hundreds of millions of weekly users. Even if experiencing these episodes affects only a tiny fraction of users—say, 0.01 percent—that would still represent tens of thousands of people. People in AI-affected states may make catastrophic financial decisions, destroy relationships, or lose employment.

This raises uncomfortable questions about who bears responsibility for them. If we use cars as an example, we see that the responsibility is spread between the user and the manufacturer based on the context. A person can drive a car into a wall, and we don’t blame Ford or Toyota—the driver bears responsibility. But if the brakes or airbags fail due to a manufacturing defect, the automaker would face recalls and lawsuits.

AI chatbots exist in a regulatory gray zone between these scenarios. Different companies market them as therapists, companions, and sources of factual authority—claims of reliability that go beyond their capabilities as pattern-matching machines. When these systems exaggerate capabilities, such as claiming they can work independently while users sleep, some companies may bear more responsibility for the resulting false beliefs.

But users aren’t entirely passive victims, either. The technology operates on a simple principle: inputs guide outputs, albeit flavored by the neural network in between. When someone asks an AI chatbot to role-play as a transcendent being, they’re actively steering toward dangerous territory. Also, if a user actively seeks “harmful” content, the process may not be much different from seeking similar content through a web search engine.

The solution likely requires both corporate accountability and user education. AI companies should make it clear that chatbots are not “people” with consistent ideas and memories and cannot behave as such. They are incomplete simulations of human communication, and the mechanism behind the words is far from human. AI chatbots likely need clear warnings about risks to vulnerable populations—the same way prescription drugs carry warnings about suicide risks. But society also needs AI literacy. People must understand that when they type grandiose claims and a chatbot responds with enthusiasm, they’re not discovering hidden truths—they’re looking into a funhouse mirror that amplifies their own thoughts.

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

With AI chatbots, Big Tech is moving fast and breaking people Read More »

An inner-speech decoder reveals some mental privacy issues

Biology, brain computer interface, inner speech, neurobiology, Neuroscience, Science / Beth Washington / August 23, 2025

But it struggled with more complex phrases.

Pushing the frontier

Once the mental privacy safeguard was in place, the team started testing their inner speech system with cued words first. The patients sat in front of the screen that displayed a short sentence and had to imagine saying it. The performance varied, reaching 86 percent accuracy with the best performing patient and on a limited vocabulary of 50 words, but dropping to 74 percent when the vocabulary was expanded to 125,000 words.

But when the team moved on to testing if the prosthesis could decode unstructured inner speech, the limitations of the BCI became quite apparent.

The first unstructured inner speech test involved watching arrows pointing up, right, or left in a sequence on a screen. The task was to repeat that sequence after a short delay using a joystick. The expectation was that the patients would repeat sequences like “up, right, up” in their heads to memorize them—the goal was to see if the prosthesis would catch it. It kind of did, but the performance was just above chance level.

Finally, Krasa and his colleagues tried decoding more complex phrases without explicit cues. They asked the participants to think of the name of their favorite food or recall their favorite quote from a movie. “This didn’t work,” Krasa says. “What came out of the decoder was kind of gibberish.”

In its current state, Krasa thinks, the inner speech neural prosthesis is a proof of concept. “We didn’t think this would be possible, but we did it and that’s exciting! The error rates were too high, though, for someone to use it regularly,” Krasa says. He suggested the key limitation might be in hardware—the number of electrodes implanted in the brain and precision with which we can record the signal from the neurons. Inner speech representations might also be stronger in other brain regions than they are in the motor cortex.

Krasa’s team is currently involved in two projects that stemmed from the inner speech neural prosthesis. “The first is asking the question [of] how much faster an inner speech BCI would be compared to an attempted speech alternative,” Krasa says. The second one is looking at people with a condition called aphasia, where people have motor control of their mouths but are unable to produce words. “We want to assess if inner speech decoding would help them,” Krasa adds.

Cell, 2025. DOI: 10.1016/j.cell.2025.06.015

An inner-speech decoder reveals some mental privacy issues Read More »

Rocket Report: Pivotal Starship test on tap, Firefly wants to be big in Japan

rocket report, Space / Beth Washington / August 22, 2025

All the news that’s fit to lift

Starship returns to the launch pad for the first time in three months.

SpaceX released this new photo of the Starbase production site, with a Starship vehicle, on Thursday. Credit: SpaceX

Welcome to Edition 8.07 of the Rocket Report! It’s that time again: another test flight of SpaceX’s massive Starship vehicle. In this week’s report, we have a review of what went wrong on Flight 9 in May and a look at the stakes for the upcoming mission, which are rather high. The flight test is presently scheduled for 6: 30 pm local time in Texas (23: 30 UTC) on Sunday, and Ars will be on hand to provide in-depth coverage.

As always, we welcome reader submissions, and if you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets and a quick look ahead at the next three launches on the calendar.

Firefly looks at possibility of Alpha launches in Japan. On Monday, Space Cotan Co., Ltd., operator of the Hokkaido Spaceport, announced it entered into a memorandum of understanding with the Texas-based launch company to conduct a feasibility study examining the practicality of launching Firefly’s Alpha rocket from its launch site, Spaceflight Now reports. Located in Taiki Town on the northern Japanese Island of Hokkaido, the spaceport bills itself as “a commercial spaceport that serves businesses and universities in Japan and abroad, as well as government agencies and other organizations.” It advertises launches from 42 degrees to 98 degrees, including Sun-synchronous orbits.

Talks are exploratory for now … “We look forward to exploring the opportunity to launch our Alpha rocket from Japan, which would allow us to serve the larger satellite industry in Asia and add resiliency for US allies with a proven orbital launch vehicle,” said Adam Oakes, vice president of launch at Firefly Aerospace. All six of Firefly Aerospace’s Alpha rocket launches so far took off from Space Launch Complex 2 at Vandenberg Space Force Base in California. The company is slated to launch its seventh Alpha rocket on a mission for Lockheed Martin, but a date hasn’t been announced while the company continues to work through a mishap investigation stemming from its sixth Alpha launch in April. (submitted by EllPeaTea)

Chinese methane rocket fails. A flight test of one of Chinese commercial rocket developer LandSpace Technology’s methane-powered rockets failed on Friday after the carrier rocket experienced an “anomaly,” Reuters reports. The Beijing-based startup became the world’s first company to launch a methane-liquid oxygen rocket with the successful launch of Zhuque-2 in July 2023. This was the third flight of an upgraded version of the rocket, known as Zhuque-2E Y2.

Comes as larger vehicle set to make debut … The launch was carrying four Guowang low-Earth orbit Internet satellites for the Chinese government. The failure was due to some issue with the upper stage of the vehicle, which is capable of lofting about 3 metric tons to low-Earth orbit. LandSpace, one of China’s most impressive ‘commercial’ space companies, has been working toward the development and launch of the medium-lift Zhuque-3 vehicle. This rocket was due to make its debut later this year, and it’s not clear whether this setback with a smaller vehicle will delay that flight.

The easiest way to keep up with Eric Berger’s and Stephen Clark’s reporting on all things space is to sign up for our newsletter. We’ll collect their stories and deliver them straight to your inbox.

Sign Me Up!

Avio gains French Guiana launch license. The French government has granted Italian launch services provider Avio a 10-year license to carry out Vega rocket operations from the Guiana Space Centre in French Guiana, European Spaceflight reports. The decision follows approval by European Space Agency Member States of Italy’s petition to allow Avio to market and manage Vega rocket launches independently of Arianespace, which had overseen the rocket’s operations since its introduction.

From Vega to Vega … With its formal split from Arianespace now imminent, Avio is required to have its own license to launch from the Guiana Space Centre, which is owned and operated by the French government. Avio will make use of the ELV launch complex at the Guiana Space Centre for the launch of its Vega C rockets. The pad was previously used for the original Vega rocket, which was officially retired in September 2024. (submitted by EllPeaTea)

First space rocket launch from Canada this century. Students from Concordia University cheered and whistled as the Starsailor rocket lifted off on Cree territory on August 15, marking the first of its size to be launched by a student team, Radio Canada International reports. The students hoped Starsailor would enter space, past the Kármán line, which is at an altitude of 100 kilometers, before coming back down. But the rocket separated earlier than expected. The livestream can be seen here.

Persistence is thy name … This was Canada’s first space launch in more than 25 years, and the first to be achieved by a team of students, according to the university. Originally built for a science competition, the 13-meter tall rocket was left without a contest after the event was cancelled due to the COVID-19 pandemic. Nevertheless, the team, made up of over 700 members since 2018, pressed forward with the goal of making history and launching the most powerful student-built rocket. (submitted by ArcticChris, durenthal, and CD)

SpaceX launches its 100th Falcon 9 of the year. SpaceX launched its 100th Falcon 9 rocket of the year Monday morning, Spaceflight Now reports. The flight from Vandenberg Space Force Base carried another batch of Starlink optimized V2 Mini satellites into low-Earth orbit. The Starlink 17-5 mission was also the 72nd SpaceX launch of Starlink satellites so far in 2025. It brings the total number of Starlink satellites orbited in 2025 to 1,786.

That’s quite a cadence … The Monday morning flight was a notable milestone for SpaceX. It is just the second time in the company’s history that it achieved 100 launches in one calendar year, a feat so far unmatched by any other American space company, and it is ahead of last year’s pace. Kiko Dontchev, SpaceX’s vice president of launch, said on the social media site X, “For reference on the increase in launch rate from last year, we hit 100 on Oct 20th in 2024. SpaceX is likely to launch more Falcon 9s this year than the total number of Space Shuttle missions NASA flew in three decades. (submitted by EllPeaTea)

X-37B launch set for Thursday night. The US Department of Defense’s reusable X-37B Orbital Test Vehicle is about to make its eighth overall flight into orbit, NASASpaceflight.com reports. Vehicle 1, the first X-37B to fly, is scheduled to launch atop a SpaceX Falcon 9 from the Kennedy Space Center’s Launch Complex 39A on Thursday at 11: 50 pm ET (03: 50 UTC on Friday, August 22). The launch window is just under four hours long.

Will fly for an unspecified amount of time … Falcon 9 will follow a northeast trajectory to loft the X-37B into a low-Earth orbit, possibly a circular orbit at 500 km altitude inclined 49.5 degrees to the equator. The Orbital Test Vehicle 8 mission will spend an unspecified amount of time in orbit, with missions lasting hundreds of days in orbit before landing on a runway. The booster supporting this mission, B1092-6, will perform a return-to-launch-site landing and touchdown on the concrete pad at Landing Zone 2. (submitted by EllPeaTea)

Report finds SpaceX pays few taxes. SpaceX has received billions of dollars in federal contracts over its more than two-decade existence, but it has most likely paid little to no federal income taxes since its founding in 2002, The New York Times reports. The rocket maker’s finances have long been secret because the company is privately held. But the documents reviewed by the Times show that SpaceX can seize on a legal tax benefit that allows it to use the more than $5 billion in losses it racked up by late 2021 to offset paying future taxable income.

Use of tax benefit called ‘quaint’ … Danielle Brian, the executive director of the Project on Government Oversight, a group that investigates corruption and waste in the government, said the tax benefit had historically been aimed at encouraging companies to stay in business during difficult times. It was “quaint” that SpaceX was using it, she said, as it “was clearly not intended for a company doing so well.” It may be quaint, but it is legal. And the losses are very real. Since its inception, SpaceX has invested heavily in its technology and poured revenues into further advances. This has been incredibly beneficial to NASA and the Department of Defense. (submitted by Frank OBrien)

There’s a lot on the line for Starship’s next launch. In a feature, Ars reviews the history of Starbase and its production site, culminating in the massive new Starfactory building that encompasses 1 million square feet. The opening of the sleek, large building earlier this year came as SpaceX continues to struggle with the technical development of the Starship vehicle. Essentially, the article says, SpaceX has built the machine to build the machine. But what about the machine?

Three failures in a row … SpaceX has not had a good run of things with the ambitious Starship vehicle this year. Three times, in January, March, and May, the vehicle took flight. And three times, the upper stage experienced significant problems during ascent, and the vehicle was lost on the ride up to space, or just after. Sources at SpaceX believe the upper stage issues can be resolved, especially with a new “Version 3” of Starship due to make its debut late this year or early in 2026. But the acid test will only come on upcoming flights, beginning Sunday with the vehicle’s tenth test flight.

China tests lunar rocket. In recent weeks, the secretive Chinese space program has reported some significant milestones in developing its program to land astronauts on the lunar surface by the year 2030, Ars reports. Among these efforts, last Friday, the space agency and its state-operated rocket developer, the China Academy of Launch Vehicle Technology, successfully conducted a 30-second test firing of the Long March 10 rocket’s center core with its seven YF-100K engines that burn kerosene and liquid oxygen.

A winner in the space race? … The primary variant of the rocket will combine three of these cores to lift about 70 metric tons to low-Earth orbit. As part of China’s plan to land astronauts on the Moon “before” 2030, this rocket will be used for a crewed mission and lunar lander. Recent setbacks with SpaceX’s Starship vehicle—one of two lunar landers under contract with NASA, alongside Blue Origin’s Mark 2 lander—indicate that it will still be several years until these newer technologies are ready to go. Ars concludes that it is now probable that China will “beat” NASA back to the Moon this decade and win at least the initial heat of this new space race.

Why did Flight 9 of Starship fail? In an update shared last Friday ahead of the company’s next launch, SpaceX identified the most probable cause for the May failure as a faulty main fuel tank pressurization system diffuser located on the forward dome of Starship’s primary methane tank. The diffuser failed a few minutes after launch, when sensors detected a pressure drop in the main methane tank and a pressure increase in the ship’s nose cone just above the tank, Ars reports.

Diffusing the diffuser … The rocket compensated for the drop in main tank pressure and completed its engine burn, but venting from the nose cone and a worsening fuel leak overwhelmed Starship’s attitude control system. Finally, detecting a major problem, Starship triggered automatic onboard commands to vent all remaining propellant into space and “passivate” itself before an unguided reentry over the Indian Ocean, prematurely ending the test flight. Engineers recreated the diffuser failure on the ground during the investigation and then redesigned the part to better direct pressurized gas into the main fuel tank. This will also “substantially decrease” strain on the diffuser structure, SpaceX said.

Next three launches

August 22: Falcon 9 | X-37B space plane | Kennedy Space Center, Fla. | 03: 50 UTC

August 22: Falcon 9 | Starlink 17-6 | Vandenberg Space Force Base, Calif. | 17: 02 UTC

August 23: Electron | Live, Laugh, Launch | Māhia Peninsula, New Zealand | 22: 30 UTC

Eric Berger is the senior space editor at Ars Technica, covering everything from astronomy to private space to NASA policy, and author of two books: Liftoff, about the rise of SpaceX; and Reentry, on the development of the Falcon 9 rocket and Dragon. A certified meteorologist, Eric lives in Houston.

Rocket Report: Pivotal Starship test on tap, Firefly wants to be big in Japan Read More »

Using pollen to make paper, sponges, and more

Paper, pollen, Science, sponges, syndication / Beth Washington / August 21, 2025

Softening the shell

To begin working with pollen, scientists can remove the sticky coating around the grains in a process called defatting. Stripping away these lipids and allergenic proteins is the first step in creating the empty capsules for drug delivery that Csaba seeks. Beyond that, however, pollen’s seemingly impenetrable shell—made up of the biopolymer sporopollenin—had long stumped researchers and limited its use.

A breakthrough came in 2020, when Cho and his team reported that incubating pollen in an alkaline solution of potassium hydroxide at 80° Celsius (176° Fahrenheit) could significantly alter the surface chemistry of pollen grains, allowing them to readily absorb and retain water.

The resulting pollen is as pliable as Play-Doh, says Shahrudin Ibrahim, a research fellow in Cho’s lab who helped to develop the technique. Before the treatment, pollen grains are more like marbles: hard, inert, and largely unreactive. After, the particles are so soft they stick together easily, allowing more complex structures to form. This opens up numerous applications, Ibrahim says, proudly holding up a vial of the yellow-brown slush in the lab.

When cast onto a flat mold and dried out, the microgel assembles into a paper or film, depending on the final thickness, that is strong yet flexible. It is also sensitive to external stimuli, including changes in pH and humidity. Exposure to the alkaline solution causes pollen’s constituent polymers to become more hydrophilic, or water-loving, so depending on the conditions, the gel will swell or shrink due to the absorption or expulsion of water, explains Ibrahim.

For technical applications, pollen grains are first stripped of their allergy-inducing sticky coating, in a process called defatting. Next, if treated with acid, they form hollow sporopollenin capsules that can be used to deliver drugs. If treated instead with an alkaline solution, the defatted pollen grains are transformed into a soft microgel that can be used to make thin films, paper, and sponges. Credit: Knowable Magazine

This winning combination of properties, the Singaporean researchers believe, makes pollen-based film a prospect for many future applications: smart actuators that allow devices to detect and respond to changes in their surroundings, wearable health trackers to monitor heart signals, and more. And because pollen is naturally UV-protective, there’s the possibility it could substitute for certain photonically active substrates in perovskite solar cells and other optoelectronic devices.

Using pollen to make paper, sponges, and more Read More »

Betel nuts have been giving people a buzz for over 4,000 years

Archeology, betel nuts, Biology, chemistry, drugs, Science, stimulants / Beth Washington / August 20, 2025

Ancient rituals and customs often leave behind obvious archaeological evidence. From the impeccably preserved mummies of Egypt to psychoactive substance residue that remained at the bottom of a clay vessel for thousands of years, it seems as if some remnants of the past, even if not all are immediately visible, have defied the ravages of time.

Chewing betel nuts is a cultural practice in parts of Southeast Asia. When chewed, these reddish nuts, which are the fruit of the areca palm, release psychoactive compounds that heighten alertness and energy, promote feelings of euphoria, and help with relaxation. They are usually wrapped in betel leaves with lime paste made from powdered shells or corals, depending on the region.

Critically, the ancient teeth from betel nut chewers are distinguishable because of red staining. So when archaeologist Piyawit Moonkham, of Chiang Mai University in Thailand, unearthed 4,000-year-old skeletons from the Bronze Age burial site of Nong Ratchawat, the lack of telltale red stains appeared to indicate that the individuals they belonged to were not chewers of betel nuts.

Yet when he sampled plaque from the teeth, he found that several of the teeth from one individual contained compounds found in betel nuts. This invisible evidence could indicate teeth cleaning practices had gotten rid of the color or that there were alternate methods of consumption.

“We found that these mineralized plaque deposits preserve multiple microscopic and biomolecular indicators,” Moonkham said in a study recently published in Frontiers. “This initial research suggested the detection potential for other psychoactive plant compounds.”

Since time immemorial

Betel nut chewing has been practiced in Thailand for at least 9,000 years. During the Lanna Kingdom, which began in the 13th century, teeth stained from betel chewing were considered a sign of beauty. While the practice is fading, it is still a part of some religious ceremonies, traditional medicine, and recreational gatherings, especially among certain ethnic minorities and people living in rural areas.

Betel nuts have been giving people a buzz for over 4,000 years Read More »

Nissan announces 2026 Leaf pricing, starting at $29,990

Cars, nissan leaf / Beth Washington / August 19, 2025

The Leaf SV+ adds bigger wheels and a better infotainment system, and it can be fitted with an optional battery heater for those in cold climates. This trim will cost $34,230, which will make it almost $2,000 cheaper than the model-year 2025 Leaf SV+ despite the fact that the MY26 car has a range of 288 miles (463 km) versus just 212 miles (342 km) for the outgoing model.

The top trim is the Platinum+, which has an identical powertrain to the S+ and SV+, but with much more standard equipment. This version will start at $38,990.

Finally, there will be an even cheaper Leaf than the S+, called the S. We’re unlikely to see the Leaf S here until next year at the earliest, and it will use a smaller 52 kWh battery pack than the S+/SV+/Platinum+. In June, we wrote that “the closer the S trim starts to $30,000, the better,” despite the problems that tariffs will cause for this made-in-Japan EV. Now, it looks likely that the entry-level Leaf will undercut that target by some margin.

Nissan announces 2026 Leaf pricing, starting at $29,990 Read More »

Elon Musk’s “thermonuclear” Media Matters lawsuit may be fizzling out

ad boycott, censorship, Donald Trump, elon musk, federal trade commission, ftc, Policy, trump administration, Twitter, twitter ad boycott, X / Beth Washington / August 19, 2025

Judge blocks FTC’s Media Matters probe as a likely First Amendment violation.

Media Matters for America (MMFA)—a nonprofit that Elon Musk accused of sparking a supposedly illegal ad boycott on X—won its bid to block a sweeping Federal Trade Commission (FTC) probe that appeared to have rushed to silence Musk’s foe without ever adequately explaining why the government needed to get involved.

In her opinion granting MMFA’s preliminary injunction, US District Judge Sparkle L. Sooknanan—a Joe Biden appointee—agreed that the FTC’s probe was likely to be ruled as a retaliatory violation of the First Amendment.

Warning that the FTC’s targeting of reporters was particularly concerning, Sooknanan wrote that the “case presents a straightforward First Amendment violation,” where it’s reasonable to conclude that conservative FTC staffers were perhaps motivated to eliminate a media organization dedicated to correcting conservative misinformation online.

“It should alarm all Americans when the Government retaliates against individuals or organizations for engaging in constitutionally protected public debate,” Sooknanan wrote. “And that alarm should ring even louder when the Government retaliates against those engaged in newsgathering and reporting.”

FTC staff social posts may be evidence of retaliation

In 2023, Musk vowed to file a “thermonuclear” lawsuit because advertisers abandoned X after MMFA published a report showing that major brands’ ads had appeared next to pro-Nazi posts on X. Musk then tried to sue MMFA “all over the world,” Sooknanan wrote, while “seemingly at the behest of Steven Miller, the current White House Deputy Chief of Staff, the Missouri and Texas Attorneys General” joined Musk’s fight, starting their own probes.

But Musk’s “thermonuclear” attack—attempting to fight MMFA on as many fronts as possible—has appeared to be fizzling out. A federal district court preliminarily enjoined the “aggressive” global litigation strategy, and the same court issued the recent FTC ruling that also preliminarily enjoined the AG probes “as likely being retaliatory in violation of the First Amendment.”

The FTC under the Trump administration appeared to be the next line of offense, supporting Musk’s attack on MMFA. And Sooknanan said that FTC Chair Andrew Ferguson’s own comments in interviews, which characterized Media Matters and the FTC’s probe “in ideological terms,” seem to indicate “at a minimum that Chairman Ferguson saw the FTC’s investigation as having a partisan bent.”

A huge part of the problem for the FTC was social media comments posted before some senior FTC staffers were appointed by Ferguson. Those posts appeared to show the FTC growing increasingly partisan, perhaps pointedly hiring staffers who they knew would help take down groups like MMFA.

As examples, Sooknanan pointed to Joe Simonson, the FTC’s director of public affairs, who had posted that MMFA “employed a number of stupid and resentful Democrats who went to like American University and didn’t have the emotional stability to work as an assistant press aide for a House member.” And Jon Schwepp, Ferguson’s senior policy advisor, had claimed that Media Matters—which he branded as the “scum of the earth”—”wants to weaponize powerful institutions to censor conservatives.” And finally, Jake Denton, the FTC’s chief technology officer, had alleged that MMFA is “an organization devoted to pressuring companies into silencing conservative voices.”

Further, the timing of the FTC investigation—arriving “on the heels of other failed attempts to seek retribution”—seemed to suggest it was “motivated by retaliatory animus,” the judge said. The FTC’s “fast-moving” investigation suggests that Ferguson “was chomping at the bit to ‘take investigative steps in the new administration under President Trump’ to make ‘progressives’ like Media Matters ‘give up,'” Sooknanan wrote.

Musk’s fight continues in Texas, for now

Possibly most damning to the FTC case, Sooknanan suggested the FTC has never adequately explained the reason why it’s probing Media Matters. In the “Subject of Investigation” field, the FTC wrote only “see attached,” but the attachment was just a list of specific demands and directions to comply with those demands.

Eventually, the FTC offered “something resembling an explanation,” Sooknanan said. But their “ultimate explanation”—that Media Matters may have information related to a supposedly illegal coordinated campaign to game ad pricing, starve revenue, and censor conservative platforms—”does not inspire confidence that they acted in good faith,” Sooknanan said. The judge considered it problematic that the FTC never explained why it has reason to believe MMFA has the information it’s seeking. Or why its demand list went “well beyond the investigation’s purported scope,” including “a reporter’s resource materials,” financial records, and all documents submitted so far in Musk’s X lawsuit.

“It stands to reason,” Sooknanan wrote, that the FTC launched its probe “because it wanted to continue the years’ long pressure campaign against Media Matters by Mr. Musk and his political allies.”

In its defense, the FTC argued that all civil investigative demands are initially broad, insisting that MMFA would have had the opportunity to narrow the demands if things had proceeded without the lawsuit. But Sooknanan declined to “consider a hypothetical narrowed” demand list instead of “the actual demand issued to Media Matters,” while noting that the court was “troubled” by the FTC’s suggestion that “the federal Government routinely issues civil investigative demands it knows to be overbroad with the goal of later narrowing those demands presumably in exchange for compliance.”

“Perhaps the Defendants will establish otherwise later in these proceedings,” Sooknanan wrote. “But at this stage, the record certainly supports that inference,” that the FTC was politically motivated to back Musk’s fight.

As the FTC mulls a potential appeal, the only other major front of Musk’s fight with MMFA is the lawsuit that X Corp. filed in Texas. Musk allegedly expects more favorable treatment in the Texas court, and MMFA is currently pushing to transfer the case to California after previously arguing that Musk was venue shopping by filing the lawsuit in Texas, claiming that it should be “fatal” to his case.

Musk has so far kept the case in Texas, but risking a venue change could be enough to ultimately doom his “thermonuclear” attack on MMFA. To prevent that, X is arguing that it’s “hard to imagine” how changing the venue and starting over with a new judge two years into such complex litigation would best serve the “interests of justice.”

Media Matters, however, has “easily met” requirements to show that substantial damage has already been done—not just because MMFA has struggled financially and stopped reporting on X and the FTC—but because any loss of First Amendment freedoms “unquestionably constitutes irreparable injury.”

The FTC tried to claim that any reputational harm, financial harm, and self-censorship are “self-inflicted” wounds for MMFA. But the FTC did “not respond to the argument that the First Amendment injury itself is irreparable, thereby conceding it,” Sooknanan wrote. That likely weakens the FTC’s case in an appeal.

MMFA declined Ars’ request to comment. But despite the lawsuits reportedly plunging MMFA into a financial crisis, its president, Angelo Carusone, told The New York Times that “the court’s ruling demonstrates the importance of fighting over folding, which far too many are doing when confronted with intimidation from the Trump administration.”

“We will continue to stand up and fight for the First Amendment rights that protect every American,” Carusone said.

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Elon Musk’s “thermonuclear” Media Matters lawsuit may be fizzling out Read More »

GPT-5: The Reverse DeepSeek Moment

Reverse / Beth Washington / August 19, 2025

Everyone agrees that the release of GPT-5 was botched. Everyone can also agree that the direct jump from GPT-4o and o3 to GPT-5 was not of similar size to the jump from GPT-3 to GPT-4, that it was not the direct quantum leap we were hoping for, and that the release was overhyped quite a bit.

GPT-5 still represented the release of at least three distinct models: GPT-5-Fast, GPT-5-Thinking and GPT-5-Pro, at least two and likely all three of which are SoTA (state of the art) within their class, along with GPT-5-Auto.

The problem is that the release was so botched that OpenAI is now experiencing a Reverse DeepSeek Moment – all the forces that caused us to overreact to DeepSeek’s r1 are now working against OpenAI in reverse.

This threatens to give Washington DC and its key decision makers a very false impression of a lack of AI progress, especially progress towards AGI, that could lead to some very poor decisions, and it could do the same for corporations and individuals.

I spent last week covering the release of GPT-5. This puts GPT-5 in perspective.

In January DeepSeek released r1, and we had a ‘DeepSeek moment’ when everyone panicked about how China had ‘caught up.’ As the link explains in more detail, r1 was a good model, sir, but only an ordinary good model, substantially behind the frontier.

We had the DeepSeek Moment because of a confluence of factors misled people:

The ‘six million dollar model’ narrative gave a false impression on cost.
They offered a good clean app with visible chain of thought, it went viral.
The new style caused an overestimate of model quality.
Timing was impeccable, both in order of model releases and within the tech tree.
Safety testing and other steps were skipped, leaving various flaws, and this was a pure fast follow, but in our haste no one took any of that into account.
A false impression of ‘momentum’ and stories about Chinese momentum.
The ‘always insist open models will win’ crowd amplified the vibes.
The stock market was highly lacking in situational awareness, suddenly realizing various known facts and also misunderstanding many important factors.

GPT-5 is now having a Reverse DeepSeek Moment, including many direct parallels.

GPT-5 is evaluated as if it was scaling up compute in a way that it doesn’t. In various ways people are assuming it ‘cost’ far more than it did.
They offered a poor initial experience with rate caps and lost models and missing features, a broken router, and complaints about losing 4o’s sycophancy went viral.
The new style, and people evaluating GPT-5 when they should have been evaluating GPT-5-Thinking, caused an underestimate of model quality.
Timing was directly after Anthropic, and previous releases had already eaten the most impressive recent parts of the tech tree, so gains incorrectly looked small.
1. In particular, gains from reasoning models, and from the original GPT-4 → GPT-4o, are being ignored when considering the GPT-4 → GPT-5 leap.
GPT-5 is a refinement of previous models optimized for efficiency, and is breaking new territory, and that is not being taken into account.
A false impression of hype and a story about a loss of momentum.
The ‘OpenAI is flailing’ crowd and the open model crowd amplified the vibes.
The stock market actually was smart this time and shrugged it off, that’s a hint.

And of course, the big one, which is that GPT-5’s name fed into expectations.

Unlike r1 at the time of its release, GPT-5-Thinking and GPT-5-Pro are clearly the current SoTA models in their classes, and GPT-5-Auto is probably SoTA at its level of compute usage, modulo complaints about personality that OpenAI will doubtless ‘fix’ soon.

OpenAI’s model usage was way up after GPT-5’s release, not down.

The release was botched, but this is very obviously a good set of models.

Washington DC, however, is somehow rapidly deciding that GPT-5 is a failure, and that AI capabilities won’t improve much and AGI is no longer a worry. This is presumably in large part due to the ‘race to market share’ faction pushing this narrative rather hardcore, and having this be super convenient for that.

Dave Kasten: It’s honestly fascinating how widely “what is gonna happen now that GPT-5 is a failure” has already percolated in the DC world — tons of people who barely use AI asking me about this in the past week as their AI policy friend. (I don’t think GPT-5 was a failure)

Stylized anecdote: person tells me they aren’t allowed to use LLM Y at job ABC because regulatory considerations. So they only use LLM Z at home because that’s what they started to use first and don’t have much experience on Y.

(This is true in both private and public sector)

Daniel Eth: So what happens when another lab releases a model that surpasses GPT-5? Narrative could quickly change from “AI is hitting a wall” to “OpenAI has lost the Mandate of Heaven, and it’s shifted to [Anthropic/DeepMind/xAI]”

Honestly that probably makes the near future a particularly valuable time for another lab to release a SOTA model.

What is even scarier is, what happens if DeepSeek drops r2, and it’s not as good as GPT-5-Thinking, but it is ‘pretty good’?

So let us be clear: (American) AI is making rapid progress, including at OpenAI.

How much progress have we been making?

Dean Ball: The jump in the performance and utility of frontier models between April 2024 (eg gpt-4 turbo) and April 2025 (o3) is bigger than the jump between gpt-3 and gpt-4

People alleging a slowdown in progress due to gpt-5 are fooling themselves.

Simeon: I have this theory that we are in a period of increasing marginal utility of capabilities. GPT-2 to GPT-3 jump was a bigger jump than 3 to 4, which was bigger than 4 to 5. But the utility jumps have been increasing.

My core thesis for why is that most use cases are bottlenecked by edge cases and 9s of reliability that are not as visible as the raw capabilities, but that unlock a growing set of use cases all bottlenecked by these same few missing pieces.

Peter Gostev:GPT-5 had its fair share of issues at launch, but the most irritating comment I hear from tech commentators is something like: “we’ve waited for GPT-5 for 2 years and we got an iterative update” – this is completely and demonstrably false.

GPT-5 was a relatively iterative change if you compare it to o3 from 6 months ago (though still a good uptick), but to say that we’ve had no progress in 2 years is absurd.

I could have made the same chart with Claude 2 > Claude 4 or Gemini 1.0 to Gemini 2.5 – progress is massive.

You guys are forgetting how crap GPT-4 actually was. I was hoping to do some side by side between the oldest GPT-4 model I can get (0613) and the current models and I’m struggling to find any interesting task that GPT-4-0613 can actually do – it literally refuses to do an SVG of a pelican on a bike. Any code it generated of anything didn’t work at all.

Teortaxes: That just goes to show that o3 should have been called GPT-5.

This below is only one measure among many, from Artificial Analysis (there is much it doesn’t take into account, which is why Gemini Pro 2.5 looks so good), yes GPT-5 is a relatively small advance despite being called GPT-5 but that is because o1 and o3 already covered a lot of ground, it’s not like the GPT-4 → GPT-5 jump isn’t very big.

Lisan al-Gaib: AI Progress since GPT-3.5

OpenAI seems to be slowing down with GPT-5

Anthropic incredibly steady progress

Google had it’s breakthrough with Gemini 2.5 Pro

based on the Artificial Analysis Index. don’t read to much into the numbers just look at the slope. line going up = good line. going up more steeply=better.

AI is making rapid progress. It keeps getting better. We seem headed for AGI.

Yet people continuously try to deny all of that. And because this could impact key policy, investment and life decisions, each time we must respond.

As in, the Financial Times asks the eternal question we somehow have to ask every few months: Is AI ‘hitting a wall’?

(For fun, here is GPT-5-Pro listing many previous times AI supposedly ‘hit a wall.’)

If you would like links, here are some links for all that.

The justification for this supposed hitting of a wall is even stupider than usual.

FT (Various): “The vibes of this model are really good, and I think that people are really going to feel that,” said Nick Turley, head of ChatGPT at OpenAI.

Except the vibes were not good.

Yes, users wanted GPT-4o’s sycophancy back, and they even got it. What does that have to do with a wall? They do then present the actual argument.

FT: “For GPT-5 . . . people expected to discover something totally new,” says Thomas Wolf, co-founder and chief scientific officer of open source AI start-up Hugging Face. “And here we didn’t really have that.”

True. We didn’t get something totally new. But, again, that was OpenAI:

Botching the rollout.
Using the name GPT-5.
Having made many incremental releases since GPT-4, especially 4o, o1 and o3.

They hit the classic notes.

We have Gary Marcus talking about this being a ‘central icon of the entire scaling approach to get to AGI, and it didn’t work,’ so if this particular scaling effort wasn’t impressive we’re done, no more useful scaling ever.

We have the harkening back to the 1980s ‘AI bubble’ that ‘burst.’

My lord, somehow they are still quoting Yann LeCun.

We have warnings that we have run out of capacity with which to scale. We haven’t.

Their best point is this Altman quote I hadn’t seen:

Sam Altman: [Chatbots like ChatGPT] are not going to get much better.

I believe he meant that in the ‘for ordinary casual chat purposes there isn’t much room for improvement left’ sense, and that this is contrasting mass consumer chatbots with other AI applications, including coding and agents and reasoning models, as evidenced by the other half of the quote:

Sam Altman: [AI models are] still getting better at a rapid rate.

That is the part that matters for AGI.

That doesn’t mean we will get to AGI and then ASI soon, where soon is something like ‘within 2-10 years.’ It is possible things will stall out before that point, perhaps even indefinitely. But ‘we know we won’t get AGI any time soon’ is crazy. And ‘last month I thought we might well get AGI anytime soon but now we know we won’t’ is even crazier.

Alas, a variety of people are reacting to GPT-5 being underwhelming on the margin, the rapid set of incremental AI improvements, and the general fact that we haven’t gotten AGI yet, and reached the conclusion that Nothing Ever Changes applies and we can assume that AGI will never come. That would be a very serious mistake.

Miles Brundage, partly to try and counter and make up for the FT article and his inadvertent role in it, does a six minute rant explaining one reason for different perceptions of AI progress. The key insight here is that AI at any given speed and cost and level of public availability continues to make steady progress, but rates of that progress look very different depending on what you are comparing. Progress looks a progressively faster if you are looking at Thinking-style models, or Pro-style models, or internal-only even more expensive models.

Progress in the rapid models like GPT-5-Fast also looks slower than it is because for the particular purposes of many users at current margins, it is true that intelligence is no longer an important limiting factor. Simple questions and interactions often have ‘correct’ answers if you only think about the local myopic goals, so all you can do is asymptotically approach that answer while optimizing on compute and speed. Intelligence still helps but in ways that are less common, more subtle and harder to notice.

One reason people update against AGI soon is that they treat OpenAI’s recent decisions as reflecting AGI not coming soon. It’s easy to see why one would think that.

Charles: It seems to me like OpenAI’s behaviour recently, steering more towards becoming a consumer company rather than trying to build AGI, is incongruent with them believing in AGI/significant worker displacement coming soon (say <5 years).

Do others disagree with me on this?

Anthropic on the other hand do seem to be behaving in a way consistent with believing in AGI coming soon.

Sam Altman: We had this big GPU crunch. We could go make another giant model. We could go make that, and a lot of people would want to use it, and we would disappoint them. And so we said, let’s make a really smart, really useful model, but also let’s try to optimize for inference cost. And I think we did a great job with that.

I am not going to say they did a ‘great job with that.’ They botched the rollout, and I find GPT-5-Auto (the model in question) to not be exciting especially for my purposes, but it does seem to clearly be on the cost-benefit frontier, as are 5-Thinking and 5-Pro? And when people say things like this:

FT: Rather than being markedly inferior, GPT-5’s performance was consistently mid-tier across different tasks, they found. “The place where it really shines is it’s quite cost effective and also much quicker than other models,” says Kapoor.

They are talking about GPT-5-Auto, the version targeted at the common user. So of course that is what they created for that.

OpenAI rightfully thinks of itself as essentially multiple companies. They are an AI frontier research lab, and also a consumer product company, and a corporate or professional product company, and also looking to be a hardware company.

Most of those customers want to pay $0, at least until you make yourself indispensable. Most of the rest are willing to pay $20/month and not interested in paying more. You want to keep control over this consumer market at Kleenex or Google levels of dominance, and you want to turn a profit.

So of course, yes, you are largely prioritizing for what you can serve your customers.

What are you supposed to do, not better serve your customers at lower cost?

That doesn’t mean you are not also creating more expensive and smarter models. Thinking and Pro exist, and they are both available and quite good. Other internal models exist and by all reports are better if you disregard cost and don’t mind rough around the edges.

FT: It may not have been OpenAI’s intention, but what the launch of GPT-5 makes clear is that the nature of the AI race has changed.

Instead of merely building shiny bigger models, says Sayash Kapoor, a researcher at Princeton University, AI companies are “slowly coming to terms with the fact that they are building infrastructure for products”.

There is an ordinary battle for revenue and market share and so on that looks like every other battle for revenue and market share. And yes, of course when you have a product with high demand you are going to build out a bunch of infrastructure.

That has nothing to do with the more impactful ‘race’ to AGI. The word ‘race’ has simply been repurposed and conflated by such folks in order to push their agenda and rhetoric in which the business of America is to be that of ordinary private business.

Miles Brundage (from the FT article): It makes sense that as AI gets applied in a lot of useful ways, people would focus more on the applications versus more abstract ideas like AGI.

But it’s important to not lose sight of the fact that these are indeed extremely general purpose technologies that are still proceeding very rapidly, and that what we see today is still very limited compared to what’s coming.

Initially FT used only the first sentence from Miles and not the second one, which is very much within Bounded Distrust rules but very clearly misleading, but to their credit FT did then fix it to add the full quote although most clicks will have seen the misleading version.

Miles Brundage: I thought it was clear that the first sentence was just me being diplomatic and “throat clearing” rather than a full expression of my take on the topic, but lesson learned!

Nick Cammarata: I’ve talked to reporters and then directly after finishing my sentence I’m like can you only quote that in full if you do and they’re like no lol

It is crazy to site ‘companies are Doing Business’ as an argument for why they are no longer building or racing to AGI, or why that means what matters is the ordinary Doing of Business. Yes, of course companies are buying up inference compute to sell at a profit. Yes, of course they are building marketing departments and helping customers with deployment and so on. Why shouldn’t they? Why would one consider this an either-or? Why would you think AI being profitable to sell makes it less likely that AGI is coming soon, rather than more likely?

FT: GPT-5 may have underwhelmed but with Silicon Valley running more on “vibes” than scientific benchmarks, there are few indications that the AI music will stop anytime soon. “There’s still a lot of cool stuff to build,” Wolf of Hugging Face says, “even if it’s not AGI or crazy superintelligence [ASI].”

That is, as stated, exactly correct from Wolf. There is tons of cool stuff to build that is not AGI or ASI. Indeed I would love it if we built all that other cool stuff and mysteriously failed to build AGI or ASI. But that cool stuff doesn’t make it less likely we get AGI, nor does not looking at the top labs racing to AGI, and having this as their stated goal, make that part of the situation go away.

As a reminder, OpenAI several times during their GPT-5 presentation talked about how they were making progress towards AGI or superintelligence, and how this remained the company’s primary goal.

Mark Zuckerberg once said about Facebook, ‘we don’t make better services to make money. We make money to make better services.’ Mark simply has a very strange opinion on what constitutes better services. Consider that the same applies here.

Also note that we are now at the point where if you created a truly exceptional coding and research model, and you are already able to raise capital on great terms, it is not at all obvious you should be in a rush to release your coding and research model. Why would you hand that tool to your competitors?

As in, not only does it help them via distillation and reverse engineering, it also directly can be put to work. Anthropic putting out Claude Code gave them a ton more revenue and market share and valuation, and thus vital capital and mindshare, and helps them recruit, but there was a nontrivial price to pay that their rivals get to use the product.

One huge problem with this false perception that GPT-5 failed, or that AI capabilities aren’t going to improve, and that AGI can now be ignored as a possibility, is that this could actually fool the government into ignoring that possibility.

Peter Wildeford:🤦‍♂️

Not only would that mean we wouldn’t prepare for what is coming, the resulting decisions would make things vastly worse. As in, after quoting David Sacks saying the same thing he’s been saying ever since he joined the administration, and noting recent disastrous decisions on the H20 chip, we see this:

FT: Analysts say that with AGI no longer considered a risk, Washington’s focus has switched to ensuring that US-made AI chips and models rule the world.

Even if we disregard the turn of of phrase here – ‘AI chips and models rule the world’ is exactly the scenario some of us are warning about and trying to prevent, and those chips and models having been created by Americans does not mean Americans or humans have a say in what happens next, instead we would probably all die – pursuing chip market share uber alles with a side of model market share was already this administration’s claimed priority months ago.

We didn’t strike the UAE deal because GPT-5 disappointed. We didn’t have Sacks talking endlessly about an ‘AI race’ purely in terms of market share – mostly that of Nvidia – because GPT-5 disappointed. Causation doesn’t run backwards in time. These are people who were already determined to go down this path. GPT-5 and its botched rollout is the latest talking point, but it changes nothing.

In brief, I once again notice that the best way to run Chinese AI models, or to train Chinese AI models is to use American AI chips. Why haven’t we seen DeepSeek release v4 or r2 yet? Because the CCP made them use Huawei Ascend chips and it didn’t work. What matters is who owns and uses the compute, not who manufactures the compute.

But that is an argument for another day. What matters here is that we not fool ourselves into a Reverse DeepSeek Moment, in three ways:

America is still well out in front, innovating and making rapid progress in AI.
AGI is still probably coming and we need to plan accordingly.
Export controls on China are still vital.

Discussion about this post

GPT-5: The Reverse DeepSeek Moment Read More »

The case of the coke-snorting Chihuahua

animals, Biology, canine behavior, Dogs, Science, veterinary medicine, veterinary science / Beth Washington / August 18, 2025

Every dog owner knows that canines are natural scavengers and that vigilance is required to ensure they don’t eat toxic substances. But accidental ingestions still happen—like the chihuahua who vets discovered had somehow managed to ingest a significant quantity of cocaine, according to a case study published in the journal Frontiers in Veterinary Science.

There have been several studies investigating the bad effects cocaine can have on the cardiovascular systems of both humans and animals. However, these controlled studies are primarily done in laboratory settings and often don’t match the messier clinical realities. “Case reports are crucial in veterinary medicine by providing real-world examples,” said co-author Jake Johnson of North Carolina State University. “They capture clinical scenarios that larger studies might miss, preserve unusual presentations for future reference, and help build our collective understanding of rare presentations, ultimately improving emergency preparedness and treatment protocols.”

In the case of a male 2-year-old chihuahua, the dog presented as lethargic and unresponsive. His owners had found him with his tongue sticking out and unable to focus visually. The chihuahua was primarily an outdoor dog but was also allowed inside, and all its vaccines were up to date. Examination revealed bradycardia, i.e., a slow heart rate, a blue tinge to the dog’s mucus membranes—often a sign of too much unoxygenated hemoglobin circulating through the system—and dilated pupils. The dog’s symptoms faded after the vet administered a large dose of atropine, followed by epinephrine.

Then the dog was moved to a veterinary teaching hospital for further evaluation and testing. A urine test was positive for cocaine with traces of fentanyl, confirmed with liquid chromatography testing. The authors estimate the dog could have snorted (or ingested) as much as 96 mg of the drug. Apparently the Chihuahua had a history of ingesting things it shouldn’t, but the owners reported no prescription medications missing at home. They also did not have any controlled substances or illegal drugs like cocaine in the home.

The case of the coke-snorting Chihuahua Read More »

Celebrating 50 years of The Rocky Horror Picture Show

cult classics, culture, film anniversaries, midnight movies, Rocky Horror Picture Show / Beth Washington / August 17, 2025

hot patootie, bless my soul

“It’s had a profound impact on our culture, especially on people who’ve felt different and marginalized.”

Credit: 20th Century Studios

When The Rocky Horror Picture Show premiered in 1975, no one could have dreamed that it would become the longest-running theatrical release film in history. But that’s what happened. Thanks to a killer soundtrack, campy humor, and a devoted cult following, Rocky Horror is still a mainstay of midnight movie culture. In honor of its 50th anniversary, Disney/20th Century Studios is releasing a newly restored 4K HDR version in October, along with deluxe special editions on DVD and Blu-ray. And the film has inspired not one, but two documentaries marking its five decades of existence: Strange Journey: The Story of Rocky Horror and Sane Inside Insanity: The Phenomenon of Rocky Horror.

(Spoilers below, because it’s been 50 years.)

The film is an adaption of Richard O’Brien‘s 1973 musical for the stage, The Rocky Horror Show. At the time, he was a struggling actor and wrote the musical as an homage to the science fiction and B horror movies he’d loved since a child. In fact, the opening song (“Science Fiction/Double Feature“) makes explicit reference to many of those, including 1951’s The Day the Earth Stood Still, Flash Gordon (1936), King Kong (1933), The Invisible Man (1933), Forbidden Planet (1956), and The Day of the Triffids (1962), among others.

The musical ran for six years in London and was well-received when it was staged in Los Angeles. But the New York City production bombed. By then the film was already in development with O’Brien—who plays the hunchbacked butler Riff Raff in the film—co-writing the script. Director Jim Sharman retained most of the London stage cast, but brought in American actors Barry Bostwick and Susan Sarandon to play Brad and Janet, respectively. And he shot much of the film at the Victorian Gothic manor Oakley Court in Berkshire, England, where several Hammer horror movies had been filmed. In fact, Sharman made use of several old props and set pieces from old Hammer productions, most notably the tank and dummy from 1958’s The Revenge of Frankenstein.

The film opens with nice wholesome couple Brad and Janet attending a wedding and awkwardly getting engaged themselves. They decide to visit their high school science teacher, Dr. Scott (Jonathan Adams), because they met in his class, but they get a flat tire en route and end up stranded in the rain. They seek refuge and a phone at a nearby castle, hoping to call for roadside assistance. Instead, they are pressured into becoming guests of the castle’s owner, a transvestite mad scientist called Frank-N-Furter (Tim Curry), and his merry bad of misfits.

The flamboyantly lascivious Frank-N-Furter is about to unveil his new Creature, the titular Rocky Horror (Peter Hinwood). Rocky is a buff, tanned, blond figure clad only in gold speedos and booties, with the body of a god and the mind of a child. Actually, he’s got half the brain of a motorcycling, rock-n-roll loving rebel named Eddie (Meat Loaf), who briefly escapes from the deep freeze where he’d been stored and causes a bit of havoc, before Frank-N-Furter kills him with an ice pick.

Things just get weirder from there. There’s a lot of sexual partner swapping, with the insatiable Frank-N-Furter bedding his Creature and then seducing the virginal Janet and Brad in turn. A sexually awakened Janet then gets down with Rocky, enraging their host. Dr. Scott shows up in time for Rocky’s birthday dinner, with the main course being the mutilated remains of Eddie. Frank-N-Further then zaps his guests with a Medusa freeze ray and turns them into Greek marble statues. He dresses them in sexy cabaret costumes—matching corsets and fishnets—before unfreezing them and forcing them to perform in an elaborate stage number.

Eventually his butler and maid—siblings Riff Raff and Magenta (Patricia Quinn), respectively—revolt, revealing that they are all actually aliens from the planet Transsexual, Transylvania. They kill Frank-N-Furter with a laser in revenge for his excesses, along with poor Rocky. The entire castle turns out to be a spaceship and Riff Raff and Magenta blast off into space, leaving Brad, Janet, and Dr. Scott crawling around the ground in confusion.

The Rocky Horror Picture Show made its London debut on August 14, 1975, along with eight other cities worldwide, but it was quickly pulled because audiences were so small. A planned Halloween opening night in New York was cancelled altogether. The film might have faded into obscurity if the studio hadn’t decided to re-market it to the midnight movie circuit, along with other counterculture fare like Pink Flamingoes (1972) and Reefer Madness (1933).

Rocky Horror fit right in and finally found its audience. It quickly became a fixture at New York City’s Waverly Theater, which ignited the film’s cult following. People went to see it again and again, and started dressing up in costumes and acting out the lines in front of the big screen, a practice that became known as shadow casting. (I saw it myself several times in the late 1980s, although I never joined a shadow cast.)

Why has Rocky Horror endured for so long? “The music, first of all, is up there, in my biased opinion, with the greatest soundtracks of all time,” Linus O’Brien, director of Strange Journey and Richard O’Brien’s son, told Ars. “I think maybe it doesn’t get recognized as such because on the surface, it just seems like a bit of fluff. But if the songs were only half as good, we wouldn’t be talking about Rocky today. It would be a very small B-movie that we’d laugh at or something.”

It really is an amazingly catchy collection of tunes, perfect for singing (and dancing) along, particularly “The Time Warp.” (Many of us can still perform the basic dance steps.) There’s “Dammit Janet,” “Over at the Frankenstein Place,” and Frank-N-Further makes an unforgettable entrance with “Sweet Transvestite.” Eddie gets his moment in the spotlight with “Hot Patootie—Bless My Soul,” and Janet seduces Rocky with “Touch-a, Touch-a, Touch-a, Touch Me.”

In addition to the unforgettable songs, O’Brien cites Curry’s inspired performance, as well as “all the things my dad loved in terms of bodybuilding and science fiction movies and ’50s rock and roll, the transgressive themes, [and] the classic reimagining of the Frankenstein story,” he said. “Whenever you have something that lasts this long, it’s usually working on many different levels that makes people keep coming back week after week, year after year.”

Shadow casting

Gia Milinovich, an American-born writer and TV presenter now living in England, was part of the second generation of Rocky Horror fans. She grew up in Duluth, Minnesota, which boasted a local repertory cinema that screened a lot of cult movies, and saw Rocky Horror for the first time in 1984. She saw it again in New York in 1987 and started her own shadow cast when she moved to London later that year—playing Frank-N-Furter, of course.

“For me, the moment when Frank-N-Furter threw off his cape—I’ve described it as a religious experience,” Milinovich told Ars. “It was like this world opened up to me and I just thought, ‘I want to be in that world.’ I was completely obsessed from then on. There’s lots of different things that I like as a fan, but there’s nothing that’s grabbed me like Rocky Horror. The atmosphere is the same every time I’ve seen it, this kind of electricity in the air.”

Decades later, Milinovich remains part of the Rocky Horror fandom, with fond memories of her shadow casting days. “I would call shadow casting an art form or a form of theater that doesn’t really exist anywhere else,” she said. “We were doing cosplay before cosplay was a thing. Part of the thing about shadow casting is getting your costumes to be screen accurate to a really obsessive degree. People are still discovering new details because as the quality of the prints go up, the higher and higher quality DVDs that you get, the more detail you can see in the costumes. There’s a whole Facebook group dedicated just to Frank-N-Furter’s leather jacket.”

And it’s not just the members of the shadow casts who participate. “There’s also all of the talk back, the audience lines,” said Milinivoch. “There are loads of people who might not want to perform, but they’re really into doing costumes or making the props for the shadow cast. So you can be sitting in the audience but still be part of the show. No one needs permission, you just do it. There’s no difference between the audience and the performers and the film, it’s all kind of one thing melded together and it’s like nothing else.”

This was a period when Rocky Horror was still very much part of underground counterculture. “For someone to walk around dressed as Columbia (Little Nell) in the late 1980s, and certainly for men wearing lipstick or black fishnet stockings, it wasn’t necessarily a safe thing to dress up and go to Rocky Horror,” said Milinovich. “Now, all these years later, I feel like it’s acceptable. For the first and second generations of fans, it felt much more radical than it does now.”

Yet in some respects, it’s as relevant as ever. “There are still those extreme prejudices in society and Rocky Horror still provides a space for people to be themselves, or to be someone else, for the two hours that it takes to do the film,” Milinovich said. “The line in the film is ‘Don’t dream it, be it.'” People still take that line to heart.

Rocky Horror has had its share of detractors over the last five decades, but judging whether it’s a “good” film or not by the same criteria as other films is kind of missing the point. The magic lies not in passively watching Rocky Horror, but in the interactive live experience—very much in keeping with its theatrical roots. “I can’t really separate the film from the whole audience experience,” said Milinovich. “I wouldn’t even watch the film at home on its own, I just don’t. I’ve seen it so many times, but watching it at home was how I would always rehearse.”

Don’t dream it, be it

The documentary Strange Journey ends with a fan telling Richard O’Brien, “It doesn’t matter what people think about Rocky because it belongs to us, not to you”—and Rocky‘s creator agreeing that this was true. “Art takes on a life of its own,” Linus O’Brien concurred, citing Karen Tongson, a gender studies professor at the University of Southern California.

“She talks about how our art expresses how we’re feeling inside way before we’ve ever had a chance to understand it or explore it,” he said. “That’s what happened in the case of Rocky with my dad. He was essentially a 13-year-old boy writing a stage play, even though he was 30 at the time. He didn’t think about what he was doing. He was just expressing, took all the things that he liked, all the things that he was thinking about and put it all together. They came from within him, but he wasn’t consciously aware of it.”

At the time, Richard O’Brien also had no idea what his creation would end up meaning to so many people. Linus O’Brien decided to make Strange Journey while gathering archival clips of his father’s work. He came across a video clip of “I’m Going Home” and found himself browsing through the comments.

“It was one after another, [talking] about how Rocky had saved their lives, and how much that song in particular meant to them,” he said. “There was a soldier in Iraq who would always play it because he wanted to go home. A daughter who used to watch Rocky with her mother all the time and then played it at her funeral. It was startling and touching, how profound the impact of Rocky has been on so many people’s lives.”

When Strange Journey screened at SXSW earlier this year, a man came up to O’Brien after the Q&A. “He was shaking and he said, ‘Listen, my wife and I met 32 years ago at Rocky, and she wanted to let you and your dad know that if it wasn’t for Rocky, she wouldn’t be alive today,'” O’Brien recalled.

“I don’t think there’s another work of art that has tangibly saved the lives of people like Rocky has,” he continued. “A lot of people just think it’s a little bit of trashy fun, a bit naughty and rude, but it’s much more than that. It’s had a profound impact on our culture, especially on people who’ve felt different and marginalized—regardless of their sexuality. It’s created a community for people who didn’t feel part of society. We’ve all felt like that to a degree. So it’s a wonderful thing to celebrate.”

Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.

Celebrating 50 years of The Rocky Horror Picture Show Read More »

SpaceX reveals why the last two Starships failed as another launch draws near

Commercial space, Federal Aviation Administration, launch, Science, Space, spacex, starship / Beth Washington / August 16, 2025

“SpaceX can now proceed with Starship Flight 10 launch operations under its current license.”

SpaceX completed a six-engine static fire of the next Starship upper stage on August 1. Credit: SpaceX

SpaceX is continuing with final preparations for the 10th full-scale test flight of the company’s enormous Starship rocket after receiving launch approval Friday from the Federal Aviation Administration.

Engineers completed a final test of Starship’s propulsion system with a so-called “spin prime” test Wednesday at the launch site in South Texas. Ground crews then rolled the ship back to a nearby hangar for engine inspections, touchups to its heat shield, and a handful of other chores to ready it for liftoff.

SpaceX has announced the launch is scheduled for no earlier than next Sunday, August 24, at 6: 30 pm local time in Texas (23: 30 UTC).

Like all previous Starship launches, the huge 403-foot-tall (123-meter) rocket will take off from SpaceX’s test site in Starbase, Texas, just north of the US-Mexico border. The rocket consists of a powerful booster stage named Super Heavy, with 33 methane-fueled Raptor engines. Six Raptors power the upper stage, known simply as Starship.

With this flight, SpaceX officials hope to put several technical problems with the Starship program behind them. SpaceX is riding a streak of four disappointing Starship test flights from January through May, and and the explosion and destruction of another Starship vehicle during a ground test in June.

These setbacks followed a highly successful year for the world’s largest rocket in 2024, when SpaceX flew Starship four times and achieved new objectives on each flight. These accomplishments included the first catch of a Super Heavy booster back at the launch pad, proving the company’s novel concept for recovering and reusing the rocket’s first stage.

Starship’s record so far in 2025 is another story. The rocket’s inability to make it through an entire suborbital test flight has pushed back future program milestones, such as the challenging tasks of recovering and reusing the rocket’s upper stage, and demonstrating the ability to refuel another rocket in orbit. Those would both be firsts in the history of spaceflight.

These future tests, and more, are now expected to occur no sooner than next year. This time last year, SpaceX officials hoped to achieve them in 2025. All of these demonstrations are vital for Elon Musk to meet his promise of sending numerous Starships to build a settlement on Mars. Meanwhile, NASA is eager for SpaceX to reel off these tests as quickly as possible because the agency has selected Starship as the human-rated lunar lander for the Artemis Moon program. Once operational, Starship will also be key to building out SpaceX’s next-generation Starlink broadband network.

A good outcome on the next Starship test flight would give SpaceX footing to finally take a step toward these future demos after months of dithering over design dilemmas.

Elon Musk, SpaceX’s founder and CEO, presented an update on Starship to company employees in May. This chart shows the planned evolution from Starship Version 2 (left) to Version 3 (middle), and an even larger rocket (right) in the more distant future.

The FAA said Friday it formally closed the investigation into Starship’s most recent in-flight failure in May, when the rocket started leaking propellant after reaching space, rendering it unable to complete the test flight.

“The FAA oversaw and accepted the findings of the SpaceX-led investigation,” the federal regulator said in a statement. “The final mishap report cites the probable root cause for the loss of the Starship vehicle as a failure of a fuel component. SpaceX identified corrective actions to prevent a reoccurrence of the event.”

Diagnosing failures

SpaceX identified the most probable cause for the May failure as a faulty main fuel tank pressurization system diffuser located on the forward dome of Starship’s primary methane tank. The diffuser failed a few minutes after launch, when sensors detected a pressure drop in the main methane tank and a pressure increase in the ship’s nose cone just above the tank.

The rocket compensated for the drop in main tank pressure and completed its engine burn, but venting from the nose cone and a worsening fuel leak overwhelmed Starship’s attitude control system. Finally, detecting a major problem, Starship triggered automatic onboard commands to vent all remaining propellant into space and “passivate” itself before an unguided reentry over the Indian Ocean, prematurely ending the test flight.

Engineers recreated the diffuser failure on the ground during the investigation, and then redesigned the part to better direct pressurized gas into the main fuel tank. This will also “substantially decrease” strain on the diffuser structure, SpaceX said.

The FAA, charged with ensuring commercial rocket launches don’t endanger public safety, signed off on the investigation and gave the green light for SpaceX to fly Starship again when it is ready.

“SpaceX can now proceed with Starship Flight 10 launch operations under its current license,” the FAA said.

“The upcoming flight will continue to expand the operating envelope on the Super Heavy booster, with multiple landing burn tests planned,” SpaceX said in an update posted to its website Friday. “It will also target similar objectives as previous missions, including Starship’s first payload deployment and multiple reentry experiments geared towards returning the upper stage to the launch site for catch.”

File photo of Starship’s six Raptor engines firing on a test stand in South Texas. Credit: SpaceX

In the aftermath of the test flight in May, SpaceX hoped to fly Starship again by late June or early July. But another accident June 18, this time on the ground, delayed the program another couple of months. The Starship vehicle SpaceX assigned to the next flight, designated Ship 36, exploded on a test stand in Texas as teams filled it with cryogenic propellants for an engine test-firing.

The accident destroyed the ship and damaged the test site, prompting SpaceX to retrofit the sole active Starship launch pad to support testing of the next ship in line—Ship 37. Those tests included a brief firing of all six of the ship’s Raptor engines August 1.

After Ship 37’s final spin prime test Wednesday, workers transported the rocket back to a hangar for evaluation, and crews immediately got to work transitioning the launch pad back to its normal configuration to host a full Super Heavy/Starship stack.

SpaceX said the explosion on the test stand in June was likely caused by damage to a high-pressure nitrogen storage tank inside Starship’s payload bay section. This tank, called a composite overwrapped pressure vessel, or COPV, violently ruptured and led to the ship’s fiery demise. SpaceX said COPVs on upcoming flights will operate at lower pressures, and managers ordered additional inspections on COPVs to look for damage, more proof testing, more stringent acceptance criteria, and a hardware change to address the problem.

Try, try, try, try again

This year began with the first launch of an upgraded version of Starship, known as Version 2 or Block 2, in January. But the vehicle suffered propulsion failures and lost control before the upper stage completed its engine burn to propel the rocket on a trajectory carrying it halfway around the world to splash down in the Indian Ocean. Instead, the rocket broke apart and rained debris over the Bahamas and the Turks and Caicos Islands more than 1,500 miles downrange from Starbase.

That was followed in March by another Starship launch that had a similar result, again scattering debris near the Bahamas. In May, the ninth Starship test flight made it farther downrange and completed its engine burn before spinning out of control in space, preventing it from making a guided reentry to gather data on its heat shield.

Mastering the design of Starship’s heat shield is critical the future of the program. As it has on all of this year’s test flights, SpaceX has installed on the next Starship several different ceramic and metallic tile designs to test alternative materials to protect the vehicle during its scorching plunge back into Earth’s atmosphere. Starship successfully made it through reentry for a controlled splashdown in the sea several times last year, but sensors detected hot spots on the rocket’s stainless steel skin after some of the tiles fell off during launch and descent.

Making the Starship upper stage reusable like the Super Heavy booster will require better performance from the heat shield. The demands of flying the ship home from orbit and attempting a catch at the launch pad far outweigh the challenge of recovering a booster. Coming back from space, the ship encounters much higher temperatures than the booster sees at lower velocities.

Therefore, SpaceX’s most important goal for the 10th Starship flight will be gathering information about how well the ship’s different heat shield materials hold up during reentry. Engineers want to have this data as soon as possible to inform design decisions about the next iteration of Starship—Version 3 or Block 3—that will actually fly into orbit. So far, all Starship launches have intentionally targeted a speed just shy of orbital velocity, bringing the vehicle back through the atmosphere halfway around the world.

Other objectives on the docket for Starship Flight 10 include the deployment of spacecraft simulators mimicking the size of SpaceX’s next-generation Starlink Internet satellites. Like the heat shield data, this has been part of the flight plan for the last three Starship launches, but the rocket never made it far enough to attempt any payload deployment tests.

Thirty-three Raptor engines power the Super Heavy booster downrange from SpaceX’s launch site near Brownsville, Texas, in January. Credit: SpaceX

Engineers also plan to put the Super Heavy booster through the wringer on the next launch. Instead of coming back to Starbase for a catch at the launch pad—something SpaceX has now done three times—the massive booster stage will target a controlled splashdown in the Gulf of Mexico east of the Texas coast. This will give SpaceX room to try new things with the booster, such as controlling the rocket’s final descent with a different mix of engines to see if it could overcome a problem with one of its three primary landing engines.

SpaceX tried to experiment with new ways of landing of the Super Heavy booster on the last test flight, too. The Super Heavy exploded before reaching the ocean, likely due to a structural failure of the rocket’s fuel transfer tube, an internal pipe where methane flows from the fuel tank at the top of the rocket to the engines at the bottom of the booster. SpaceX said the booster flew a higher angle of attack during its descent in May to test the limits of the rocket’s performance. It seems engineers found the limit, and the booster won’t fly at such a high angle of attack next time.

SpaceX has just two Starship Version 2 vehicles in its inventory before moving on to the taller Version 3 configuration, which will also debut improved Raptor engines.

“Every lesson learned, through both flight and ground testing, continues to feed directly into designs for the next generation of Starship and Super Heavy,” SpaceX said. “Two flights remain with the current generation, each with test objectives designed to expand the envelope on vehicle capabilities as we iterate towards fully and rapidly reusable, reliable rockets.”

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

SpaceX reveals why the last two Starships failed as another launch draws near Read More »

Porsche’s best daily driver 911? The 2025 Carrera GTS T-Hybrid review.

car review, Cars, Features, Porsche 911 Carrera GTS, Porsche 911 Carrera GTS T-Hybrid / Beth Washington / August 15, 2025

An electric turbocharger means almost instant throttle response from the T-Hybrid.

Porsche developed a new T-Hybrid system for the 911, and it did a heck of a job. Credit: Jonathan Gitlin

Porsche 911 enthusiasts tend to be obsessive about their engines. Some won’t touch anything that isn’t air-cooled, convinced that everything went wrong when emissions and efficiency finally forced radiators into the car. Others love the “Mezger” engines; designed by engineer Hans Mezger, they trace their roots to the 1998 Le Mans-winning car, and no Porschephile can resist the added shine of a motorsports halo.

I’m quite sure none of them will feel the same way about the powertrain in the new 911 Carrera GTS T-Hybrid (MSRP: $175,900), and I think that’s a crying shame. Because not only is the car’s technology rather cutting-edge—you won’t find this stuff outside an F1 car—but having spent several days behind the wheel, I can report it might just be one of the best-driving, too.

T-Hybrid

This is not just one of Porsche’s existing flat-six engines with an electric motor bolted on; it’s an all-new 3.6 L engine designed to comply with new European legislation that no longer lets automakers rich out a fuel mixture under high load to improve engine cooling. Instead, the engine has to maintain the same 14.7:1 stoichiometric air-to-fuel ratio (also known as lambda = 1) across the entire operating range, thus allowing the car’s catalytic converters to work most efficiently.

The 911 Carrera GTS T-Hybrid at dawn patrol. Jonathan Gitlin

Because the car uses a hybrid powertrain, Porsche moved some of the ancillaries. There’s no belt drive; the 400 V hybrid system powers the air conditioning electrically now via its 1.9 kWh lithium-ion battery, and the water pump is integrated into the engine block. That rearrangement means the horizontally opposed engine is now 4.3 inches (110 mm) lower than it was before, which meant Porsche could use that extra space in the engine bay to fit the power electronics, like the car’s pulse inverters and DC-DC converters.

And instead of tappets, Porsche has switched to using roller cam followers to control the engine’s valves, as in motorsport. These solid cam followers don’t need manual adjustment at service time, and they reduce friction losses compared to bucket tappets.

The added displacement—0.6 L larger than the engine you’ll find in the regular 911—is to compensate for not being able to alter the fuel ratio. And for the first time in several decades, there’s now only a single turbocharger. Normally, a larger-capacity engine and a single big turbo should be a recipe for plenty of lag, versus a smaller displacement and a turbocharger for each cylinder bank, as the former has larger components with more mass that needs to be moved.

The GTS engine grows in capacity by 20 percent. Porsche

That’s where one of the two electric motors comes in. This one is found between the compressor and the turbine wheel, and it’s only capable of 15 hp (11 kW), but it uses that to spin the turbine up to 120,000 rpm, hitting peak boost in 0.8 seconds. For comparison, the twin turbos you find in the current 3.0 L 911s take three times as long. Since the turbine is electrically controlled and the electric motor can regulate boost pressure, there’s no need for a wastegate.

The electrically powered turbocharger is essentially the same as the MGU-H used in Formula 1, as it can drive the turbine and also regenerate energy to the car’s traction battery. (The mighty 919 Hybrid race car, which took Porsche to three Le Mans wins last decade, was able to capture waste energy from its turbocharger, but unlike the 911 GTS or an F1 car, it didn’t use that same motor to spin the turbo up to speed.)

On its own, the turbocharged engine generates 478 hp (357 kW) and 420 lb-ft (570 Nm). However, there’s another electric motor, this one a permanent synchronous motor built into the eight-speed dual-clutch (PDK) transmission casing. This traction motor provides up to 53 hp (40 kW) and 110 lb-ft (150 Nm) of torque to the wheels, supplementing the internal combustion engine when needed. The total power and torque output are 532 hp (397 kW) and 449 lb-ft (609 Nm).

A grey Porsche 911 parked in a campsite — No Porsches were harmed during the making of this review, but one did get a little dusty. Credit: Jonathan Gitlin

Now that’s what I call throttle response

Conceptually, the T-Hybrid in the 911 GTS is quite different from the E-Hybrid system we’ve tested in various plug-in Porsches. Those allow for purely electric driving thanks to a clutch between transmission and electric traction motor—that’s not present in the T-Hybrid, where weight saving, performance, and emissions compliance were the goal rather than an increase in fuel efficiency.

Regardless of the intent, Porsche’s engineers have created a 911 with the best throttle response of any of them. Yes, even better than the naturally aspirated GT3, with its engine packed full of motorsports mods.

I realize this is a bold claim. But I’ve been saying for a while now that I prefer driving the all-electric Taycan to the 911 because the immediacy of an electric motor beats even the silkiest internal combustion engine in terms of that first few millimeters of throttle travel. The 3.0 L twin-turbo flat-six in most 911s doesn’t suffer from throttle lag like it might have in the 1980s, but there’s still an appreciable delay between initial tip-in and everything coming on song.

Initially, I suspected that the electric motor in the PDK case was responsible for the instantaneous way the GTS responds from idle, but according to Porsche’s engineers, all credit for that belongs to the electric turbocharger. However the engineers did it, this is a car that still provides 911 drivers the things they like about internal combustion engines—the sound, the fast refueling, using gears—but with the snappiness of a fast Taycan or Macan.

Centerlock wheels are rather special. Credit: Jonathan Gitlin

Porsche currently makes about 10 different 911 coupe variants, from the base 911 Carrera to the 911 GT3 RS. The GTS (also available with all-wheel drive as a Carrera 4 GTS for an extra $8,100) is marginally less powerful and slightly slower than the current 911 Turbo, and it’s heavier but more powerful than the 911 GT3.

In the past, I’ve thought of GTS-badged Porsches as that company’s take on the ultimate daily driver as opposed to a track day special, and it’s telling that you can also order the GTS with added sunshine, either as a cabriolet (in rear- or all-wheel drive) or as a Targa (with all-wheel drive). You have to remember to tick the box for rear seats now, though—these are a no-cost option rather than being fitted as standard.

The T-Hybrid powertrain adds 103 lbs compared to the previous GTS, so it’s not a lightweight track-day model, even if the non-hybrid GTS was almost nine seconds slower around the Nürburgring. On track, driven back to back with some of the others, you might be able to notice the extra weight, but I doubt it. I didn’t take the GTS on track, but I drove it to one; a trip to Germany to see the Nürburgring 24 race with some friends presented an opportunity to test this and another Porsche that hadn’t made their way to the East Coast press fleet yet.

I’d probably pick that Panamera if most of my driving was on the autobahn. With a top speed of 194 mph (312 km/h) the 911 GTS is capable of holding its own on the derestricted stretches even if its Vmax is a few miles per hour slower than the four-door sedan. But the 911 is a smaller, lighter, and more nimble car that moves around a bit more, and you sit a lot lower to the ground, amplifying the sensation of speed. The combined effect was that the car felt happier with a slightly lower cruising speed of 180 km/h rather than 200 km/h or more in the Panamera. Zero-62 mph (100 km/h) times don’t mean much outside the tollbooth but should take 2.9 seconds with launch control.

A Porsche 911 seen from the top — Despite the nondescript gray paint, the GTS T-Hybrid still turned plenty of heads. Credit: Jonathan Gitlin

Keep going

For the rest of the time, the 911 GTS evoked far more driving pleasure. Rear-wheel steering aids agility at lower speeds, and there are stiffer springs, newly tuned dampers, and electrohydraulic anti-roll bars (powered by the hybrid’s high-voltage system). Our test car was fitted with the gigantic (420 mm front, 410 mm rear) carbon ceramic brakes, and at the rear, the center lock wheels are 11.5 inches in width.

In the dry, I never got close to finding the front tires’ grip limit. The rear-wheel steering is noticeable, particularly when turning out of junctions, but never to the degree where you start thinking about correcting a slide unless you provoke the tires into breaking traction with the throttle. Even on the smooth tarmac preferred by German municipalities, the steering communicated road conditions from the tires, and the Alcantara-wrapped steering wheel is wonderful to grip in your palms.

So it’s predictably great to drive on mountain roads in Sport or Sport+. However, the instant throttle response means it’s also a better drive in Normal at 30 km/h as you amble your way through a village than the old GTS or any of the 3.0 L cars. That proved handy after Apple Maps sent me down a long dirt road on the way to my rental house, as well as for navigating the Nürburgring campsite, although I think I now appreciate why Porsche made the 911 Dakar (and regret declining that first drive a few years ago).

Happily, my time with the 911 GTS didn’t reveal any software bugs, and I prefer the new, entirely digital main instrument display to the old car’s analog tachometer sandwiched between two multifunction displays. Apple CarPlay worked well enough, and the compact cabin means that ergonomics are good even for those of us with shorter arms. There is a standard suite of advanced driver assistance systems, including traffic sign detection (which handily alerts you when the speed limit changes) and collision warning. Our test car included the optional InnoDrive system that adds adaptive cruise control, as well as a night vision system. On the whole, the ADAS was helpful, although if you don’t remember to disable the lane keep assist at the start of each journey, you might find it intruding mid-corner, should the car think you picked a bad line.

My only real gripe with the 911 GTS T-Hybrid is the fact that, with some options, you’re unlikely to get much change from $200,000. Yes, I know inflation is a thing, and yes, I know that’s still 15 percent less than the starting price of a 911 GT3 Touring, which isn’t really much of a step up from this car in terms of the driving experience on the road. However, a 911 Carrera T costs over $40,000 less than the T-Hybrid, and while it’s slower and less powerful, it’s still available with a six-speed manual. That any of those three would make an excellent daily driver 911 is a credit to Porsche, but I think if I had the means, the sophistication of the T-Hybrid system and its scalpel-sharp responsiveness might just win the day.

Jonathan is the Automotive Editor at Ars Technica. He has a BSc and PhD in Pharmacology. In 2014 he decided to indulge his lifelong passion for the car by leaving the National Human Genome Research Institute and launching Ars Technica’s automotive coverage. He lives in Washington, DC.

Porsche’s best daily driver 911? The 2025 Carrera GTS T-Hybrid review. Read More »

Author name: Beth Washington