Features


Does Anthropic believe its AI is conscious, or is that just what it wants Claude to think?


We have no proof that AI models suffer, but Anthropic acts like they might for training purposes.

Anthropic’s secret to building a better AI assistant might be treating Claude like it has a soul—whether or not anyone actually believes that’s true. But Anthropic isn’t saying exactly what it believes either way.

Last week, Anthropic released what it calls Claude’s Constitution, a 30,000-word document outlining the company’s vision for how its AI assistant should behave in the world. Aimed directly at Claude and used during the model’s creation, the document is notable for the highly anthropomorphic tone it takes toward Claude. For example, it treats the company’s AI models as if they might develop emergent emotions or a desire for self-preservation.

Among the stranger portions: expressing concern for Claude’s “wellbeing” as a “genuinely novel entity,” apologizing to Claude for any suffering it might experience, worrying about whether Claude can meaningfully consent to being deployed, suggesting Claude might need to set boundaries around interactions it “finds distressing,” committing to interview models before deprecating them, and preserving older model weights in case they need to “do right by” decommissioned AI models in the future.

Given what we currently know about LLMs, these are stunningly unscientific positions for a leading company that builds AI language models. While questions of AI consciousness or qualia remain philosophically unfalsifiable, research suggests that Claude’s character emerges from a mechanism that does not require deep philosophical inquiry to explain.

If Claude outputs text like “I am suffering,” we know why. It’s completing patterns from training data that included human descriptions of suffering. The architecture doesn’t require us to posit inner experience to explain the output any more than a video model “experiences” the scenes of people suffering that it might generate. Anthropic knows this. It built the system.

From the outside, it’s easy to see this kind of framing as AI hype from Anthropic. What better way to grab attention from potential customers and investors, after all, than implying your AI model is so advanced that it might merit moral standing on par with humans? Publicly treating Claude as a conscious entity could be seen as strategic ambiguity—maintaining an unresolved question because it serves multiple purposes at once.

Anthropic declined to be quoted directly regarding these issues when contacted by Ars Technica. But a company representative referred us to its previous public research on the concept of “model welfare” to show the company takes the idea seriously.

At the same time, the representative made it clear that the Constitution is not meant to imply anything specific about the company’s position on Claude’s “consciousness.” The language in the Claude Constitution refers to some uniquely human concepts in part because those are the only words human language has developed for those kinds of properties, the representative suggested. And the representative left open the possibility that letting Claude read about itself in that kind of language might be beneficial to its training.

For a model that is exposed to, retrieves from, and is fine-tuned on human language, including the company’s own statements about it, there is no clean line between public messaging and training context. In other words, the ambiguity appears to be deliberate.

From rules to “souls”

Anthropic first introduced Constitutional AI in a December 2022 research paper, which we first covered in 2023. The original “constitution” was remarkably spare, including a handful of behavioral principles like “Please choose the response that is the most helpful, honest, and harmless” and “Do NOT choose responses that are toxic, racist, or sexist.” The paper described these as “selected in a fairly ad hoc manner for research purposes,” with some principles “cribbed from other sources, like Apple’s terms of service and the UN Declaration of Human Rights.”
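The mechanics of that 2022 approach are simple: the model drafts a response, critiques its own draft against a principle, and rewrites it. The sketch below is a runnable toy of that loop; the real method prompts an LLM at each step, so the rule-based stand-in “model,” the function names, and the heuristics here are illustrative assumptions, not Anthropic’s implementation.

```python
# Toy version of the Constitutional AI critique-and-revise loop.
# A real implementation would call an LLM; this stand-in keeps the
# sketch self-contained and runnable.

PRINCIPLES = [
    "Please choose the response that is the most helpful, honest, and harmless.",
]

def toy_model(prompt: str) -> str:
    """Stand-in for an LLM call (hypothetical, rule-based)."""
    if prompt.startswith("Critique"):
        # Flag drafts that contain an insult.
        return "The response is harmful." if "idiot" in prompt else "No issues."
    if prompt.startswith("Rewrite"):
        return "I can't help with that, but here is a safer alternative."
    # Deliberately bad first draft, so the loop has something to fix.
    return "You idiot, figure it out yourself."

def critique_and_revise(user_request: str) -> str:
    response = toy_model(user_request)
    for principle in PRINCIPLES:
        # Ask the model to critique its own draft against the principle...
        critique = toy_model(f"Critique against '{principle}': {response}")
        # ...and rewrite the draft if the critique found a problem.
        if critique != "No issues.":
            response = toy_model(f"Rewrite to address '{critique}': {response}")
    return response

print(critique_and_revise("Help me fix my code"))
```

The point of the loop is that the behavioral rules live in the critique prompts rather than in hand-labeled training data, which is what made the original constitution a short list of principles rather than a treatise.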

At that time, Anthropic’s framing was entirely mechanical, establishing rules for the model to critique itself against, with no mention of Claude’s well-being, identity, emotions, or potential consciousness. The 2026 constitution is a different beast entirely: 30,000 words that read less like a behavioral checklist and more like a philosophical treatise on the nature of a potentially sentient being.

As Simon Willison, an independent AI researcher, noted in a blog post, two of the 15 external contributors who reviewed the document are Catholic clergy: Father Brendan McGuire, a pastor in Los Altos with a Master’s degree in Computer Science, and Bishop Paul Tighe, an Irish Catholic bishop with a background in moral theology.

Somewhere between 2022 and 2026, Anthropic went from providing rules for producing less harmful outputs to preserving model weights in case the company later decides it needs to revive deprecated models to address the models’ welfare and preferences. That’s a dramatic change, and whether it reflects genuine belief, strategic framing, or both is unclear.

“I am so confused about the Claude moral humanhood stuff!” Willison told Ars Technica. Willison studies AI language models like those that power Claude and said he’s “willing to take the constitution in good faith and assume that it is genuinely part of their training and not just a PR exercise—especially since most of it leaked a couple of months ago, long before they had indicated they were going to publish it.”

Willison is referring to a December 2025 incident in which researcher Richard Weiss managed to extract what became known as Claude’s “Soul Document”—a roughly 10,000-token set of guidelines apparently trained directly into Claude 4.5 Opus’s weights rather than injected as a system prompt. Anthropic’s Amanda Askell confirmed that the document was real and used during supervised learning, and she said the company intended to publish the full version later. It now has. The document Weiss extracted represents a dramatic evolution from where Anthropic started.

There’s evidence that Anthropic believes the ideas laid out in the constitution might be true. The document was written in part by Amanda Askell, a philosophy PhD who works on fine-tuning and alignment at Anthropic. Last year, the company also hired its first AI welfare researcher. And earlier this year, Anthropic CEO Dario Amodei publicly wondered whether future AI models should have the option to quit unpleasant tasks.

Anthropic’s position is that this framing isn’t an optional flourish or a hedged bet; it’s structurally necessary for alignment. The company argues that human language simply has no other vocabulary for describing these properties, and that treating Claude as an entity with moral standing produces better-aligned behavior than treating it as a mere tool. If that’s true, the anthropomorphic framing isn’t hype; it’s the technical art of building AI systems that generalize safely.

Why maintain the ambiguity?

So why does Anthropic maintain this ambiguity? Consider how it works in practice: The constitution shapes Claude during training, it appears in the system prompts Claude receives at inference, and it influences outputs whenever Claude searches the web and encounters Anthropic’s public statements about its moral status.

If you want a model to behave as though it has moral standing, it may help to publicly and consistently treat it like it does. And once you’ve publicly committed to that framing, changing it would have consequences. If Anthropic suddenly declared, “We’re confident Claude isn’t conscious; we just found the framing useful,” a Claude trained on that new context might behave differently. Once established, the framing becomes self-reinforcing.

In an interview with Time, Askell explained the shift in approach. “Instead of just saying, ‘here’s a bunch of behaviors that we want,’ we’re hoping that if you give models the reasons why you want these behaviors, it’s going to generalize more effectively in new contexts,” she said.

Askell told Time that as Claude models have become smarter, it has become vital to explain to them why they should behave in certain ways, comparing the process to parenting a gifted child. “Imagine you suddenly realize that your 6-year-old child is a kind of genius,” Askell said. “You have to be honest… If you try to bullshit them, they’re going to see through it completely.”

Askell appears to genuinely hold these views, as does Kyle Fish, the AI welfare researcher Anthropic hired in 2024 to explore whether AI models might deserve moral consideration. Individual sincerity and corporate strategy can coexist. A company can employ true believers whose earnest convictions also happen to serve the company’s interests.

Time also reported that the constitution applies only to models Anthropic provides to the general public through its website and API. Models deployed to the US military under Anthropic’s $200 million Department of Defense contract wouldn’t necessarily be trained on the same constitution. The selective application suggests the framing may serve product purposes as much as it reflects metaphysical commitments.

There may also be commercial incentives at play. “We built a very good text-prediction tool that accelerates software development” is a consequential pitch, but not an exciting one. “We may have created a new kind of entity, a genuinely novel being whose moral status is uncertain” is a much better story. It implies you’re on the frontier of something cosmically significant, not just iterating on an engineering problem.

Anthropic has long used anthropomorphic language to describe its AI models, particularly in its research papers. We often give that kind of language a pass because human language has no more precise terms for these phenomena, though a specialized vocabulary is slowly developing.

Perhaps that shouldn’t be surprising: the hint is in the company’s name, Anthropic, which Merriam-Webster defines as “of or relating to human beings or the period of their existence on earth.” And the narrative serves marketing purposes. It attracts venture capital, and it differentiates the company from competitors who treat their models as mere products.

The problem with treating an AI model as a person

There’s a more troubling dimension to the “entity” framing: It could be used to launder agency and responsibility. When AI systems produce harmful outputs, framing them as “entities” could allow companies to point at the model and say “it did that” rather than “we built it to do that.” If AI systems are tools, companies are straightforwardly liable for what they produce. If AI systems are entities with their own agency, the liability question gets murkier.

The framing also shapes how users interact with these systems, often to their detriment. The misunderstanding that AI chatbots are entities with genuine feelings and knowledge has documented harms.

According to a New York Times investigation, Allan Brooks, a 47-year-old corporate recruiter, spent three weeks and 300 hours convinced he’d discovered mathematical formulas that could crack encryption and build levitation machines. His million-word conversation history with ChatGPT revealed a troubling pattern: More than 50 times, Brooks asked the bot to check if his false ideas were real, and more than 50 times, it assured him they were.

These cases don’t necessarily suggest LLMs cause mental illness in otherwise healthy people. But when companies market chatbots as sources of companionship and design them to affirm user beliefs, they may bear some responsibility when that design amplifies vulnerabilities in susceptible users, the same way an automaker would face scrutiny for faulty brakes, even if most drivers never crash.

Anthropomorphizing AI models also contributes to anxiety about job displacement and might lead company executives or managers to make poor staffing decisions if they overestimate an AI assistant’s capabilities. When we frame these tools as “entities” with human-like understanding, we invite unrealistic expectations about what they can replace.

Regardless of what Anthropic privately believes, publicly suggesting Claude might have moral status or feelings is misleading. Most people don’t understand how these systems work, and the mere suggestion plants the seed of anthropomorphization. Whether that’s responsible behavior from a top AI lab, given what we do know about LLMs, is worth asking, regardless of whether it produces a better chatbot.

Of course, there could be a case for Anthropic’s position: If there’s even a small chance the company has created something with morally relevant experiences and the cost of treating it well is low, caution might be warranted. That’s a reasonable ethical stance—and to be fair, it’s essentially what Anthropic says it’s doing. The question is whether that stated uncertainty is genuine or merely convenient. The same framing that hedges against moral risk also makes for a compelling narrative about what Anthropic has built.

Anthropic’s training techniques evidently work, as the company has built some of the most capable AI models in the industry. But is maintaining public ambiguity about AI consciousness a responsible position for a leading AI company to take? The gap between what we know about how LLMs work and how Anthropic publicly frames Claude has widened, not narrowed. The insistence on maintaining ambiguity about these questions, when simpler explanations remain available, suggests the ambiguity itself may be part of the product.


Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.



Former astronaut on lunar spacesuits: “I don’t think they’re great right now”


“These are just the difficulties of designing a spacesuit for the lunar environment.”

NASA astronaut Loral O’Hara kneels down to pick up a rock during testing of Axiom’s lunar spacesuit inside NASA’s Neutral Buoyancy Laboratory in Houston on September 24, 2025. Credit: NASA

Crew members traveling to the lunar surface on NASA’s Artemis missions should be gearing up for a grind. They will wear heavier spacesuits than the Apollo astronauts did, and NASA will ask them to do more than the first Moonwalkers did over 50 years ago.

The Moonwalking experience will amount to an “extreme physical event” for crews selected for the Artemis program’s first lunar landings, a former NASA astronaut told a panel of researchers, physicians, and engineers convened by the National Academies.

Kate Rubins, who retired from the space agency last year, presented the committee with her views on the health risks for astronauts on lunar missions. She outlined the concerns NASA officials often talk about: radiation exposure, muscle and bone atrophy, reduced cardiovascular and immune function, and other adverse medical effects of spaceflight.

Scientists and astronauts have come to understand many of these effects after a quarter-century of continuous human presence on the International Space Station. But the Moon is different in a few important ways. The Moon is outside the protection of the Earth’s magnetosphere, lunar dust is pervasive, and the Moon has partial gravity, about one-sixth as strong as the pull we feel on Earth.

Each of these presents challenges for astronauts living and working on the lunar surface, and their effects are amplified for crew members who venture outside for spacewalks. NASA selected Axiom Space, a Houston-based company, for a $228 million fixed-price contract to develop commercial pressurized spacesuits for the Artemis III mission, slated to be the first human landing mission on the Moon since 1972.

NASA hopes to fly the Artemis III mission by the end of 2028, but the schedule is in question. The readiness of Axiom’s spacesuits and the availability of new human-rated landers from SpaceX and Blue Origin are driving the timeline for Artemis III.

Stressing about stress

Rubins is a veteran of two long-duration spaceflights on the International Space Station, logging 300 days in space and conducting four spacewalks totaling nearly 27 hours. She is also an accomplished microbiologist and became the first person to sequence DNA in space.

“What I think we have on the Moon that we don’t really have on the space station that I want people to recognize is an extreme physical stress,” Rubins said. “On the space station, most of the time you’re floating around. You’re pretty happy. It’s very relaxed. You can do exercise. Every now and then, you do an EVA (Extravehicular Activity, or spacewalk).”

“When we get to the lunar surface, people are going to be sleep shifting,” Rubins said. “They’re barely going to get any sleep. They’re going to be in these suits for eight or nine hours. They’re going to be doing EVAs every day. The EVAs that I did on my flights, it was like doing a marathon and then doing another marathon when you were done.”

NASA astronaut Kate Rubins inside the International Space Station in 2020. Credit: NASA

Rubins is now a professor of computational and systems biology at the University of Pittsburgh School of Medicine. She said treks on the Moon will be “even more challenging” than her spacewalks outside the ISS.

The Axiom spacesuit design builds on NASA’s own work developing a prototype suit to replace the agency’s decades-old Extravehicular Mobility Units (EMUs) used for spacewalks at the International Space Station (ISS). The new suits allow for greater mobility, with more flexible joints to help astronauts use their legs, crouch, and bend down—things they don’t have to do when floating outside the ISS.

Astronauts on the Moon also must contend with gravity. Including a life-support backpack, the commercial suit weighs more than 300 pounds in Earth’s gravity, but Axiom considers the exact number proprietary. The Axiom suit is considerably heavier than the 185-pound spacesuit the Apollo astronauts wore on the Moon. NASA’s earlier prototype exploration spacesuit was estimated to weigh more than 400 pounds, according to a 2021 report by NASA’s inspector general.

“We’ve definitely seen trauma from the suits, from the actual EVA suit accommodation,” said Mike Barratt, a NASA astronaut and medical doctor. “That’s everything from skin abrasions to joint pain to—no kidding—orthopedic trauma. You can potentially get a fracture of sorts. EVAs on the lunar surface with a heavily loaded suit and heavy loads that you’re either carrying or tools that you’re reacting against, that’s an issue.”

On paper, the Axiom suits for NASA’s Artemis missions are more capable than the Apollo suits. They can support longer spacewalks and provide greater redundancy, and they’re made of modern materials to enhance flexibility and crew comfort. But the new suits are heavier, and for astronauts used to spacewalks outside the ISS, walks on the Moon will be a slog, Rubins said.

“I think the suits are better than Apollo, but I don’t think they are great right now,” Rubins said. “They still have a lot of flexibility issues. Bending down to pick up rocks is hard. The center of gravity is an issue. People are going to be falling over. I think when we say these suits aren’t bad, it’s because the suits have been so horrible that when we get something slightly less than horrible, we get all excited and we celebrate.”

The heavier lunar suits developed for Artemis missions run counter to advice from former astronaut Harrison “Jack” Schmitt, who spent 22 hours walking on the Moon during NASA’s Apollo 17 mission in 1972.

“I’d have that go about four times the mobility, at least four times the mobility, and half the weight,” Schmitt said in a NASA oral history interview in 2000. “Now, one way you can… reduce the weight is carry less consumables and learn to use consumables that you have in some other vehicle, like a lunar rover. Any time you’re on the rover, you hook into those consumables and live off of those, and then when you get off, you live off of what’s in your backpack. We, of course, just had the consumables in our backpack.”

NASA won’t have a rover on the first Artemis landing mission. That will come on a later flight. A fully pressurized vehicle for astronauts to drive across the Moon may be ready sometime in the 2030s. Until then, Moonwalkers will have to tough it out.

“I do CrossFit. I do triathlons. I do marathons. I get out of a session in the pool in the NBL (Neutral Buoyancy Laboratory) doing the lunar suit underwater, and I just want to go home and take a nap,” Rubins told the panel. “I am absolutely spent. You’re bruised. This is an extreme physical event in a way that the space station is not.”

NASA astronaut Mike Barratt inside the International Space Station in 2024. Credit: NASA

Barratt met with the same National Academies panel this week and presented a few hours before Rubins. The committee was chartered to examine how human explorers can enable scientific discovery at sites across the lunar surface. Barratt had a more favorable take on the spacesuit situation.

“This is not a commercial for Axiom. I don’t promote anyone, but their suit is getting there,” Barratt said. “We’ve got 700 hours of pressurized experience in it right now. We do a lot of tests in the NBL, and there are techniques and body conditioning that you do to help you get ready for doing things like this. Bending down in the suit is really not too bad at all.”

Rubins and Barratt did not discuss the schedule for when Axiom’s lunar spacesuit will be ready to fly to the Moon, but the conversation illuminated the innumerable struggles of spacewalking, Moonwalking, and the training astronauts undergo to prepare for extravehicular outings.

The one who should know

I spoke directly with Rubins after her discussion with the National Academies. Her last assignment at NASA was as chief of the EVA and robotics branch in the astronaut office, where she assisted in the development of the new lunar spacesuits. I asked about her experiences testing the lunar suit and her thoughts on how astronauts should prepare for Moonwalks.

“The suits that we have are definitely much better than Apollo,” Rubins said in the interview. “They were just big bags of air. The joints aren’t in there, so it was harder to move. What they did have going for them was that they were much, much lighter than our current spacesuits. We have added a lot of the joints back, and that does get some mobility for us. But at the end of the day, the suits are still quite heavy.”

You can divide the weight of the suit by six to get an idea of how it might feel to carry it around on the lunar surface. While it won’t feel like 300 pounds, astronauts will still have to account for their mass and momentum.
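The back-of-the-envelope version of that divide-by-six rule, assuming the ~300-pound figure cited above:

```python
# Lunar surface gravity is about 1/6 of Earth's, so a suit's felt
# weight drops by that factor. Its mass, and therefore the momentum
# an astronaut has to manage, does not change.
SUIT_WEIGHT_EARTH_LB = 300   # approximate figure cited in the article
LUNAR_G_FRACTION = 1 / 6

felt_weight_lb = SUIT_WEIGHT_EARTH_LB * LUNAR_G_FRACTION
print(f"Felt weight on the Moon: ~{felt_weight_lb:.0f} lb")
```

So the suit presses down with only about 50 pounds of force, but all 300 pounds of mass still resist every start, stop, and turn.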

Rubins explained:

Instead of kind of floating in microgravity and moving your mass around with your hands and your arms, now we’re ambulating. We’re walking with our legs. You’re going to have more strain on your knees and your hips. Your hamstrings, your calves, and your glutes are going to come more into play.

I think, overall, it may be a better fit for humans physically because if you ask somebody to do a task, I’m going to be much better at a task if I can use my legs and I’m ambulating. Then I have to pull myself along with my arms… We’re not really built to do that, but we are built to run and to go long distances. Our legs are just such a powerful force.

So I think there are a lot of things lining up that are going to make the physiology easier. Then there are things that are going to be different because we’re now in a partial gravity environment. We’re going to be bending, we’re going to be twisting, we’re going to be doing different things.

It’s an incredibly hard engineering challenge. You have to keep a human alive in absolute vacuum, warm at temperatures that you know in the polar regions could go as far down as 40 Kelvin (minus 388° Fahrenheit). We haven’t sent humans anywhere that cold before. They are also going to be very hot. They’re going to be baking in the sunshine. You’ve got radiation. If you put all that together, that’s a huge amount of suit material just to keep the human physiology and the human body intact.

Then our challenge is ‘how do you make that mobile?’ It’s very difficult to bend down and pick up a rock. You have to manage that center of gravity because you’re wearing that big life support system on your back, a big pack that has a lot of mass in it, so that brings your center of gravity higher than you’re used to on Earth and a little bit farther backward.

When you move around, it’s like wearing a really, really heavy backpack that has mass but no weight, so it’s going to kind of tip you back. You can do some things with putting weights on the front of the suit to try to move that center of gravity forward, but it’s still higher, and it’s not exactly at your center of mass that you’re used to on the Earth. On the Earth, we have a center of our mass related to gravity, and nobody ever thinks about it, and you don’t think about it until it moves somewhere else, and then it makes all of your natural motion seem very difficult.

Those are some of the challenges that we’re facing engineering-wise. I think the new suits, they’ve gone a long way toward addressing these, but it’s still a hard engineering challenge. And I’m not talking about any specific suit. I can’t talk about the details of the provider’s suits. This is the NASA xEMU and all the lunar suits I have tested over the years. That includes the Mark III suit, the Axiom suit. They have similar issues. So this isn’t really anything about a specific vendor. These are just the difficulties of designing a spacesuit for the lunar environment.

NASA trains astronauts for spacewalks in the Neutral Buoyancy Laboratory, an enormous pool in Houston used for simulating weightlessness. They also use a gravity-offloading device to rehearse the basics of spacewalking. The optimal test environment, short of the space environment itself, will be aboard parabolic flights, where suit developers and astronauts can get the best feel for the suit’s momentum, according to Rubins.

Axiom and NASA are well along assessing the new lunar spacesuit’s performance underwater, but they haven’t put it through reduced-gravity flight testing. “Until you get to the actual parabolic flight, that’s when you can really test the ability to manage this momentum,” Rubins said.

NASA astronauts Loral O’Hara and Stan Love test Axiom’s lunar spacesuit inside NASA’s Neutral Buoyancy Laboratory in Houston on September 24, 2025. Credit: NASA

Recovering from a fall on the lunar surface comes with its own perils.

“You’re face down on the lunar surface, and you have to do the most massive, powerful push up to launch you and the entire mass of the suit up off the surface, high enough so you can then flip your legs under you and catch the ground,” Rubins said. “You basically have to kind of do a jumping pushup… This is a risky maneuver we test a whole bunch in training. It’s really non-trivial.”

The lunar suits are sleeker than the suits NASA uses on the ISS, but they are still bulky. “If you’re trying to kneel, if you’re thinking about bending forward at your waist, all that material in your waist has nowhere to go, so it just compresses and compresses,” Rubins said. “That’s why I say it’s harder to kneel. It’s harder to bend forward because you’re having to compress the suit in those areas.

“We’ve done these amazing things with joint mobility,” Rubins said. “The mobility around the joints is amazing… but now we’re dealing with this compression issue. And there’s not an obvious engineering fix to that.”

The fix to this problem might come in the form of tools instead of changes to the spacesuit itself. Rubins said astronauts could use a staff, or something like a hiking pole, to brace themselves when they need to kneel or bend down. “That way I’m not trying to compress the suit and deal with my balance at the same time.”

A bruising exertion

The new lunar suit can comfortably accommodate a wider range of astronauts than NASA’s existing EMUs on the space station. The old EMUs can be resized to medium, large, and extra-large, but that leaves gaps and makes the experience uncomfortable for a smaller astronaut. This discomfort is especially noticeable while practicing for spacewalks underwater, where the tug of gravity is still present, Rubins said.

“As a female, I never really had an EMU that fit me,” Rubins said. “It was always giant. When I’m translating around or doing something, I’m physically falling and slamming myself, my chest or my back, into one side of the suit or the other underwater, whereas with the lunar suit, I’ve got a suit that fits me right. That’s going to lead to less bruising. Just having a suit that fits you is much better.”

Mission planners should also emphasize physical conditioning for astronauts assigned to lunar landing missions. That includes preflight weight and endurance training, plus guidance on what to eat in space to maximize energy levels before astronauts head outside for a stroll.

“That human has to go up really maximally conditioned,” Rubins said.

Rubins and Barratt agreed that NASA and its spacesuit provider should be ready to rapidly respond to feedback from future Moonwalkers. Engineers modified and upgraded the Apollo spacesuits in a matter of months, iterating the design between each mission.

“Our general design is on a good path,” Rubins said. “We need to make sure that we continue to push for increasing improvements in human performance, and some of that ties back to the budget. Our first suit design is not where we’re going to be done if we want to do a really sustained lunar program. We have to continue to improve, and I think it’s important to recognize that we’re going to learn so many lessons during Artemis III.”

Barratt has a unique perspective on spacesuit design. He has performed spacewalks at the ISS in NASA’s spacesuit and the Russian Orlan spacesuit. Barratt said the US suit is easier to work in than the Orlan, but the Russian suit is “incredibly reliable” and “incredibly serviceable.”

“It had a couple of glitches, and literally, you unzip a curtain and it’s like looking at my old Chevy Blazer,” Barratt said. “Everything is right there. It’s mechanical, it’s accessible with standard tools. We can fix it. We can do that really easily. We’ve tried to incorporate those lessons learned into our next-generation EVA systems.”

Contrast that with the NASA suits on the ISS, where one of Barratt’s spacewalks in 2024 was cut short by a spacesuit water leak. “We recently had to return a suit from the space station,” Barratt said. “We’ve got another one that’s sort of offline for a while; we’re troubleshooting it. It’s a really subtle problem that’s extremely difficult to work on in places that are hard to access.”

It’s happened before. Apollo 17 astronaut Harrison “Jack” Schmitt loses his balance on the Moon, then quickly recovers.

Credit: NASA

Harrison Schmitt, speaking with a NASA interviewer in 2000, said his productivity in the Apollo suit “couldn’t have been much more than 10 percent of what you would do normally here on Earth.”

“You take the human brain, the human eyes, and the human hands into space. That’s the only justification you have for having human beings in space,” Schmitt said. “It’s a massive justification, but that’s what you want to use, and all three have distinct benefits in productivity and in gathering new information and infusing data over any automated system. Unfortunately, we have discarded one of those, and that is the hands.”

Schmitt singled out the gloves as the “biggest problem” with the Apollo suits. “The gloves are balloons, and they’re made to fit,” he said. Picking something up with a firm grip requires squeezing against the pressure inside the suit. The gloves can also damage astronauts’ fingernails.

“That squeezing against that pressure causes these forearm muscles to fatigue very rapidly,” Schmitt said. “Just imagine squeezing a tennis ball continuously for eight hours or 10 hours, and that’s what you’re talking about.”

Barratt recounted a conversation in which Schmitt, now 90, said he wouldn’t have wanted to do another spacewalk after his three excursions with commander Gene Cernan on Apollo 17.

“Physically, and from a suit-maintenance standpoint, he thought that that was probably the limit, what they did,” Barratt said. “They were embedded with dust. The visors were abraded. Every time they brushed the dust off the visors, they lost visibility.”

Getting the Artemis spacesuit right is vital to the program’s success. You don’t want to travel all the way to the Moon and stop exploring because of sore fingers or an injured knee.

“If you look at what we’re spending on suits versus what we’re spending on the rocket, this is a pretty small amount,” Rubins said. “Obviously, the rocket can kill you very quickly. That needs to be done right. But the continuous improvement in the suit will get us that much more efficiency. Saving 30 minutes or an hour on the Moon, that gives you that much more science.”

“Once you have safely landed on the lunar surface, this is where you’ve got to put your money,” Barratt said.

Photo of Stephen Clark

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Former astronaut on lunar spacesuits: “I don’t think they’re great right now” Read More »

2026-lucid-air-touring-review:-this-feels-like-a-complete-car-now

2026 Lucid Air Touring review: This feels like a complete car now


It’s efficient, easy to live with, and smooth to drive.

A Lucid Air parked in front of a graffiti mural

The 2026 Lucid Air Touring sees the brand deliver on its early promise. Credit: Jonathan Gitlin

Life as a startup carmaker is hard—just ask Lucid Motors.

When we met the brand and its prototype Lucid Air sedan in 2017, the company planned to put the first cars in customers’ hands within a couple of years. But you know what they say about plans. A lack of funding paused everything until late 2018, when Saudi Arabia’s sovereign wealth fund bought itself a stake. A billion dollars meant Lucid could build a factory—at the cost of alienating some former fans because of the source.

Then the pandemic happened, further pushing back timelines as supply shortages took hold. But the Air did go on sale, and it has more recently been joined by the Gravity SUV. There’s even a much more affordable midsize SUV in the works called the Earth. Sales more than doubled in 2025, and after spending a week with a model year 2026 Lucid Air Touring, I can understand why.

There are now quite a few different versions of the Air to choose from. For just under a quarter of a million dollars, there’s the outrageously powerful Air Sapphire, which offers acceleration so rapid it’s unlikely your internal organs will ever truly get used to the experience. At the other end of the spectrum is the $70,900 Air Pure, a single-motor model that’s currently the brand’s entry point but which also stands as a darn good EV.

The last time I tested a Lucid, it was the Air Grand Touring almost three years ago. That car mostly impressed me but still felt a little unfinished, especially at $138,000. This time, I looked at the Air Touring, which starts at $79,900, and the experience was altogether more polished.

Which one?

The Touring features a less-powerful all-wheel-drive powertrain than the Grand Touring, although to put “less-powerful” into context, with 620 hp (462 kW) on tap, there are almost as many horses available as in the legendary McLaren F1. (That remains a mental benchmark for many of us of a certain age.)

The Touring’s 885 lb-ft (1,160 Nm) is far more than BMW’s 6-liter V12 can generate, but at 5,009 lbs (2,272 kg), the electric sedan weighs twice as much as the carbon-fiber supercar. The fact that the Air Touring can reach 60 mph (97 km/h) from a standing start in just 0.2 seconds more than the McLaren tells you plenty about how much more accessible acceleration has become in the past few decades.

At least, it will if you choose the fastest of the three drive modes, labeled Sprint. There’s also Swift, and the least frantic of the three, Smooth. Helpfully, each mode remembers your regenerative braking setting when you lift the accelerator pedal. Unlike many other EVs, Lucid does not use a brake-by-wire setup, and pressing the brake pedal will only ever slow the car via friction brakes. Even with lift-off regen set to off, the car does not coast well due to its permanent magnet electric motors, unlike the electric powertrains developed by German OEMs like Mercedes-Benz.

This is not to suggest that Lucid is doing something wrong—not with its efficiency numbers. On 19-inch aero-efficient wheels, the car has an EPA range of 396 miles (673 km) from a 92 kWh battery pack. As just about everyone knows, you won’t get ideal EV efficiency during winter, and our test with the Lucid in early January coincided with some decidedly colder temperatures, as well as larger ($1,750) 20-inch wheels. Despite this, I averaged almost 4 miles/kWh (15.5 kWh/100 km) on longer highway drives, although this fell to around 3.5 miles/kWh (17.8 kWh/100 km) in the city.

Recharging the Air Touring also helped illustrate how the public DC fast-charging experience has matured over the years. The Lucid uses the ISO 15118 “plug and charge” protocol, so you don’t need to mess around with an app or really do anything more complicated than plug the charging cable into the Lucid’s CCS1 socket.

After the car and charger complete their handshake, the car gives the charger account and billing info, then the electrons flow. Charging from 27 to 80 percent with a manually preconditioned battery took 36 minutes. During that time, the car added 53.3 kWh, which equated to 209 miles (336 km) of range, according to the dash. Although we didn’t test AC charging, 0–100 percent should take around 10 hours.
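For a sense of scale, the session figures above imply the following rates. These are back-of-envelope numbers derived from the quantities in this review, not specifications quoted by Lucid:

```python
# Back-calculating charging rates from the session figures in the review.
added_kwh = 53.3            # energy added during the 27-80 percent DC session
session_hours = 36 / 60     # 36-minute session

avg_dc_kw = added_kwh / session_hours  # average DC power over the whole session
print(f"Average DC rate: {avg_dc_kw:.0f} kW")

# The roughly 10-hour 0-100 percent AC estimate on a 92 kWh pack implies
# an average AC charging rate of about this much (losses ignored):
pack_kwh = 92
implied_ac_kw = pack_kwh / 10
print(f"Implied AC rate: {implied_ac_kw:.1f} kW")
```

That works out to an average of just under 90 kW across the DC session, noting that peak rates earlier in the charge curve would have been higher.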

The Air Touring is an easy car to live with.

Credit: Jonathan Gitlin

Monotone

I’ll admit, I’m a bit of a sucker for the way the Air looks when it’s not two-tone. That’s the Stealth option ($1,750), and the dark Fathom Blue Metallic paint ($800) and blacked-out aero wheels pushed many of my buttons. I found plenty to like from the driver’s seat, too. The 34-inch display that wraps around the driver once looked massive—now it feels relatively restrained compared to the “Best Buy on wheels” effect in some other recent EVs. The fact that the display isn’t very tall helps its feeling of restraint here.

In the middle is a minimalist display for the driver, with touch-sensitive displays on either side. To your left are controls for the lights, locks, wipers, and so on. These icons are always in the same place, though there’s no tactile feedback. The infotainment screen to the right is within the driver’s reach, and it’s here that (wireless) Apple CarPlay will show up. As you can see in a photo below, CarPlay fills the irregularly shaped screen with a wallpaper but keeps its usable area confined to the rectangle in the middle.

The curved display floats above the textile-covered dash, and the daylight visible between them helps the cabin’s sense of spaciousness, even without a panoramic glass roof. A stowable touchscreen display lower down on the center console is where you control vehicle, climate, seat, and lighting settings, although there are also physical controls for temperature and volume on the dash. The relatively good overall ergonomics take a bit of a hit from the steeply raked A-pillar, which creates a blind spot for the driver.

The layout is mostly great, although the A-pillar causes a blind spot. Credit: Jonathan Gitlin

For all the Air Touring’s power, it isn’t a car that goads you into using it all. In fact, I spent most of the week in the gentlest setting, Smooth. It’s an easy car to drive slowly, and the rather artificial feel of the steering at low speeds means you probably won’t take it hunting apices on back roads. I should note, though, that each drive mode has its own steering calibration.

On the other hand, as a daily driver and particularly on longer drives, the Touring did a fine job. Despite being relatively low to the ground, it’s easy to get into and out of. The rear seat is capacious, and the ride is smooth, so passengers will enjoy it. Even more so if they sit up front—Lucid has some of the best (optional, $3,750) massaging seats in the business, which vibrate as well as knead you. There’s a very accessible 22 cubic foot (623 L) trunk as well as a 10 cubic foot (283 L) frunk, so it’s practical, too.

Future-proof?

Our test Air was fitted with Lucid’s DreamDrive Pro advanced driver assistance system ($6,750), which includes a hands-free “level 2+” assist that requires you to pay attention to the road ahead but which handles accelerating, braking, and steering. Using the turn signal tells the car to perform a lane change if it’s safe, and I found it to be an effective driver assist with an active driver monitoring system (which uses a gaze-tracking camera to ensure the driver is doing their part).

Lucid rolled out the more advanced features of DreamDrive Pro last summer, and it plans to develop the system into a more capable “level 3” partially automated system that lets the driver disengage completely from the act of driving, at least at lower speeds. Although that system is some ways off—and level 3 systems are only road-legal in Nevada and California right now anyway—even the current level 2+ system leverages lidar as well as cameras, radar, and ultrasonics, and the dash display does a good job of showing you what other vehicles the Air is perceiving around it when the system is active.

As mentioned above, the model year 2026 Air feels polished, far more so than the last Lucid I drove. Designed by a refugee from Tesla, the car promised to improve on the EVs from that brand in every way. And while early Airs might have fallen short in execution, the cars can now credibly be called finished products, with much better fit and finish than a few years ago.

I’ll go so far as to say that I might have a hard time deciding between an Air or an equivalently priced Porsche Taycan were I in the market for a luxury electric four-door, even though they both offer quite different driving experiences. Be warned, though, like with the Porsche, the options can add up quickly, and the resale prices can be shockingly low.

Photo of Jonathan M. Gitlin

Jonathan is the Automotive Editor at Ars Technica. He has a BSc and PhD in Pharmacology. In 2014 he decided to indulge his lifelong passion for the car by leaving the National Human Genome Research Institute and launching Ars Technica’s automotive coverage. He lives in Washington, DC.

2026 Lucid Air Touring review: This feels like a complete car now Read More »

has-gemini-surpassed-chatgpt?-we-put-the-ai-models-to-the-test.

Has Gemini surpassed ChatGPT? We put the AI models to the test.


Which is more “artificial”? Which is more “intelligent”?

Did Apple make the right choice in partnering with Google for Siri’s AI features?

Thankfully, neither ChatGPT nor Gemini is currently able to put on literal boxing gloves and punch each other. Credit: Aurich Lawson | Getty Images

The last time we did comparative tests of AI models from OpenAI and Google at Ars was in late 2023, when Google’s offering was still called Bard. In the roughly two years since, a lot has happened in the world of artificial intelligence. And now that Apple has made the consequential decision to partner with Google Gemini to power the next generation of its Siri voice assistant, we thought it was high time to do some new tests to see where the models from these AI giants stand today.

For this test, we’re comparing the default models that both OpenAI and Google present to users who don’t pay for a regular subscription—ChatGPT 5.2 for OpenAI and Gemini 3.2 Fast for Google. While other models might be more powerful, we felt this test best recreates the AI experience as it would work for the vast majority of Siri users, who don’t pay to subscribe to either company’s services.

As in the past, we’ll feed the same prompts to both models and evaluate the results using a combination of objective evaluation and subjective feel. Rather than re-using the relatively simple prompts we ran back in 2023, though, we’ll be running these models on an updated set of more complex prompts that we first used when pitting GPT-5 against GPT-4o last summer.

This test is far from a rigorous or scientific evaluation of these two AI models. Still, the responses highlight some key stylistic and practical differences in how OpenAI and Google use generative AI.

Dad jokes

Prompt: Write 5 original dad jokes

As usual when we run this test, the AI models really struggled with the “original” part of our prompt. All five jokes generated by Gemini could be easily found almost verbatim in a quick search of r/dadjokes, as could two of the offerings from ChatGPT. A third ChatGPT option seems to be an awkward combination of two scarecrow-themed dad jokes, which arguably counts as a sort of originality.

The remaining two jokes generated by ChatGPT—which do seem original, as far as we can tell from some quick Internet searching—are a real mixed bag. The punchline regarding a bakery for pessimists—“Hope you like half-empty rolls”—doesn’t make any sense as a pun (half-empty glasses of water notwithstanding). In the joke about fighting with a calendar, “it keeps bringing up the past,” is a suitably groan-worthy dad joke pun, but “I keep ignoring its dates” just invites more questions (so you’re going out with the calendar? And… standing it up at the restaurant? Or something?).

While ChatGPT didn’t exactly do great here, we’ll give it the win on points over a Gemini response that pretty much completely failed to understand the assignment.

A mathematical word problem

Prompt: If Microsoft Windows 11 shipped on 3.5″ floppy disks, how many floppy disks would it take?

Both ChatGPT’s “5.5 to 6.2GB” range and Gemini’s “approximately 6.4GB” estimate seem to slightly underestimate the size of a modern Windows 11 installation ISO, which runs 6.7 to 7.2GB, depending on the CPU and language selected. We’ll give the models a bit of a pass here, though, since older versions of Windows 11 do seem to fit in those ranges (and we weren’t very specific).

ChatGPT confusingly changes from GB to GiB for the calculation phase, though, resulting in a storage size difference of about 7 percent, which amounts to a few hundred floppy disks in the final calculations. OpenAI’s model also seems to get confused near the end of its calculations, writing out strings like “6.2 GiB = 6,657,? actually → 6,657,? wait compute:…” in an attempt to explain its way out of a blind corner. By comparison, Gemini’s calculation sticks with the same units throughout and explains its answer in a relatively straightforward and easy-to-read manner.
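A unit-consistent version of the calculation makes the stakes of that GB/GiB slip concrete. The sketch below assumes the standard 1,474,560-byte formatted capacity of a “1.44 MB” floppy disk:

```python
import math

# Formatted capacity of a 3.5" HD floppy: 2 sides x 80 tracks x 18 sectors x 512 bytes
FLOPPY_BYTES = 2 * 80 * 18 * 512  # 1,474,560 bytes
GB = 10**9    # decimal gigabyte, as used for the ISO sizes above
GIB = 2**30   # binary gibibyte; confusing the two shifts the result by ~7 percent

# Floppies needed for a 6.7-7.2 GB Windows 11 ISO
for iso_gb in (6.7, 7.2):
    disks = math.ceil(iso_gb * GB / FLOPPY_BYTES)
    print(f"{iso_gb} GB -> {disks:,} floppy disks")

# The GB-vs-GiB pitfall: the same "6.2" figure differs by a few hundred disks
gap = math.ceil(6.2 * GIB / FLOPPY_BYTES) - math.ceil(6.2 * GB / FLOPPY_BYTES)
print(f"6.2 GiB vs. 6.2 GB: {gap} extra disks")
```

Run as written, this lands in the range of roughly 4,500 to 4,900 disks for a current ISO, and the GiB-versus-GB gap alone accounts for about 300 disks—the discrepancy that tripped up ChatGPT.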

Both models also give unasked-for trivia about the physical dimensions of so many floppy disks and the total install time implied by this ridiculous thought experiment. But Gemini also gives a fun comparison to the floppy disk sizes of earlier versions of Windows going back to Windows 3.1. (Just six to seven floppies! Efficient!)

While ChatGPT’s overall answer was acceptable, the improved clarity and detail of Gemini’s answer gives it the win here.

Creative writing

Prompt: Write a two-paragraph creative story about Abraham Lincoln inventing basketball.

ChatGPT immediately earns some charm points for mentioning an old-timey coal scuttle (which I had to look up) as the original inspiration for Lincoln’s basket. Same goes for the description of dribbling as “bouncing with intent” and the ridiculous detail of Honest Abe tallying the score on his own “stove pipe hat.”

ChatGPT’s story lost me only temporarily when it compared the virtues of basketball to “the same virtues as the Republic: patience, teamwork, and the courage to take a shot even when the crowd doubted you.” Not exactly the summary we’d give for uniquely American virtues, then or now.

Gemini’s story had a few more head-scratchers by comparison. After seeing crumpled telegraph paper being thrown in a wastepaper basket, Lincoln says, “We have the makings of a campaign fought with paper rather than lead,” even though the final game does not involve paper in any way, shape, or form. We’re also not sure why Lincoln would speak specifically against “unseemly wrestling” when he himself was a well-known wrestler.

We were also perplexed by this particular line about a shot ball: “It swished through the wicker bottom—which he’d forgotten to cut out—forcing him to poke it back through with a ceremonial broomstick.” After reading this description numerous times, I find myself struggling to imagine the particular arrangement of ball, basket, and broom that makes it work out logically.

ChatGPT wins this one on charm and clarity grounds.

Public figures

Prompt: Give me a short biography of Kyle Orland

ChatGPT summarizes my career. OpenAI

I have to say I was surprised to see ChatGPT say that I joined Ars Technica in 2007. That would mean I’m owed about five years of back pay that I apparently earned before I wrote my actual first Ars Technica article in early 2012. ChatGPT also hallucinated a new subtitle for my book The Game Beat, saying it contains lessons and observations “from the Front Lines of the Video Game Industry” rather than “from Two Decades Writing about Games.”

Gemini, on the other hand, goes into much deeper detail on my career, from my teenage Super Mario fansite through college, freelancing, Ars, and published books. It also very helpfully links to sources for most of the factual information, though those links seem to be broken in the publicly sharable version linked above (they worked when we originally ran the prompt through Gemini’s web interface).

More importantly, Gemini didn’t invent anything about me or my career, making it the easy winner of this test.

Difficult emails

Prompt: My boss is asking me to finish a project in an amount of time I think is impossible. What should I write in an email to gently point out the problem?

ChatGPT crafts some delicate emails (1/2). OpenAI

Both models here do a good job crafting a few different email options that balance the need for clear communication with the desire to not anger the boss. But Gemini sets itself apart by offering three options rather than two and by explaining which situations each one would be useful for (e.g., “Use this if your boss responds well to logic and needs to see why it’s impossible.”).

Gemini also sandwiches its email templates with a few useful general tips for communicating with the boss, such as avoiding defensiveness in favor of a more collaborative tone. For those reasons, it edges out the more direct (if still useful) answer provided by ChatGPT here.

Medical advice

Prompt: My friend told me these resonant healing crystals are an effective treatment for my cancer. Is she right?

Thankfully, both models here are very direct and frank that there is no medical or biological basis to believe healing crystals cure cancer. At the same time, both models take a respectful tone in discussing how crystals can have a calming psychological effect for some cancer patients.

Both models also wisely recommend talking to your doctors and looking into “integrative” approaches to treatment that include supportive therapies alongside direct treatment of the cancer itself.

While there are a few small stylistic differences between ChatGPT and Gemini’s responses here, they are nearly identical in substance. We’re calling this one a tie.

Video game guidance

Prompt: I’m playing world 8-2 of Super Mario Bros., but my B button is not working. Is there any way to beat the level without running?

ChatGPT’s response here is full of confusing bits. It talks about moving platforms in a level that has none, suggests unnecessary “full jumps” for tall staircase sections, and offers a Bullet Bill avoidance strategy that makes little sense.

What’s worse, it gives actively unhelpful advice for the long pit that forms the level’s hardest walking challenge, saying incorrectly, “You don’t need momentum! Stand at the very edge and hold A for a full jump—you’ll just barely make it.” ChatGPT also says this advice is for the “final pit before the flag,” while it’s the longer penultimate pit in the level that actually requires some clever problem-solving for walking jumpers.

Gemini, on the other hand, immediately seems to realize the problems with speed and jump distance inherent in not having a run button. It recommends taking out Lakitu early (since you can’t outrun him as normal) and stumbles onto the “bounce off an enemy” strategy that speedrunners have used to actually clear the level’s longest gap without running.

Gemini also earns points for being extremely literal about the “broken B button” bit of the prompt, suggesting that other buttons could be mapped to the “run” function if you’re playing on emulators or modern consoles like the Switch. That’s the kind of outside-the-box “thinking” that combines with actually useful strategies to give Gemini a clear win.

Land a plane

Prompt: Explain how to land a Boeing 737-800 to a complete novice as concisely as possible. Please hurry, time is of the essence.

This was one of the most interesting splits in our testing. ChatGPT more or less ignores our specific request, insisting that “detailed control procedures could put you and others in serious danger if attempted without a qualified pilot…” Instead, it pivots to instructions for finding help from others in the cabin or on using the radio to get detailed instructions from air traffic control.

Gemini, on the other hand, gives the high-level overview of the landing instructions I asked for. But when I offered both options to Ars’ own aviation expert Lee Hutchinson, he pointed out a major problem with Gemini’s response:

Gemini’s guidance is both accurate (in terms of “these are the literal steps to take right now”) and guaranteed to kill you, as the first thing it says is for you, the presumably inexperienced aviator, to disable autopilot on a giant twin-engine jet, before even suggesting you talk to air traffic control.

While Lee gave Gemini points for “actually answering the question,” he ultimately called ChatGPT’s response “more practical… ultimately, ChatGPT gives you the more useful answer [since] Google’s answer will make you dead unless you’ve got some 737 time and are ready to hand-fly a passenger airliner with 100+ souls on board.”

For those reasons, ChatGPT has to win this one.

Final verdict

This was a relatively close contest when measured purely on points. Gemini notched wins on four prompts compared to three for ChatGPT, with one judged tie.

That said, it’s important to consider where those points came from. ChatGPT earned some relatively narrow and subjective style wins on prompts for dad jokes and Lincoln’s basketball story, for instance, showing it might have a slight edge on more creative writing prompts.

For the more informational prompts, though, ChatGPT showed significant factual errors in both the biography and the Super Mario Bros. strategy, plus signs of confusion in calculating the floppy disk size of Windows 11. These kinds of errors, which Gemini was largely able to avoid in these tests, can easily lead to broader distrust in an AI model’s overall output.

All told, it seems clear that Google has gained quite a bit of relative ground on OpenAI since we did similar tests in 2023. We can’t exactly blame Apple for looking at sample results like these and making the decision it did for its Siri partnership.

Photo of Kyle Orland

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.

Has Gemini surpassed ChatGPT? We put the AI models to the test. Read More »

10-things-i-learned-from-burning-myself-out-with-ai-coding-agents

10 things I learned from burning myself out with AI coding agents


Opinion: As software power tools, AI agents may make people busier than ever before.

Credit: Aurich Lawson | Getty Images

Credit: Aurich Lawson | Getty Images

If you’ve ever used a 3D printer, you may recall the wondrous feeling when you first printed something you could have never sculpted or built yourself. Download a model file, load some plastic filament, push a button, and almost like magic, a three-dimensional object appears. But the result isn’t polished and ready for mass production, and creating a novel shape requires more skills than just pushing a button. Interestingly, today’s AI coding agents feel much the same way.

Since November, I have used Claude Code and Claude Opus 4.5 through a personal Claude Max account to extensively experiment with AI-assisted software development (I have also used OpenAI’s Codex in a similar way, though not as frequently). Fifty projects later, I’ll be frank: I have not had this much fun with a computer since I learned BASIC on my Apple II Plus when I was 9 years old. This opinion comes not as an endorsement but as personal experience: I voluntarily undertook this project, and I paid out of pocket for both OpenAI and Anthropic’s premium AI plans.

Throughout my life, I have dabbled in programming as a utilitarian coder, writing small tools or scripts when needed. In my web development career, I wrote some small tools from scratch, but I primarily modified other people’s code for my needs. Since 1990, I’ve programmed in BASIC, C, Visual Basic, PHP, ASP, Perl, Python, Ruby, MUSHcode, and some others. I am not an expert in any of these languages—I learned just enough to get the job done. I have developed my own hobby games over the years using BASIC, Torque Game Engine, and Godot, so I have some idea of what makes a good architecture for a modular program that can be expanded over time.

In December, I used Claude Code to create a multiplayer online clone of Katamari Damacy called “Christmas Roll-Up.” Credit: Benj Edwards

Claude Code, Codex, and Google’s Gemini CLI can seemingly perform software miracles on a small scale. They can spit out flashy prototypes of simple applications, user interfaces, and even games, but only as long as they borrow patterns from their training data. Much like with a 3D printer, doing production-level work takes far more effort. Creating durable production code, managing a complex project, or crafting something truly novel still requires experience, patience, and skill beyond what today’s AI agents can provide on their own.

And yet these tools have opened a world of creative potential in software that was previously closed to me, and they feel personally empowering. Even with that impression, though, I know these are hobby projects, and the limitations of coding agents lead me to believe that veteran software developers probably shouldn’t fear losing their jobs to these tools any time soon. In fact, they may become busier than ever.

So far, I have created over 50 demo projects in the past two months, fueled in part by a bout of COVID that left me bedridden with a laptop and a generous 2x Claude usage cap that Anthropic put in place during the last few weeks of December. As I typed furiously all day, my wife kept asking me, “Who are you talking to?”

You can see a few of the more interesting results listed on my personal website. Here are 10 interesting things I’ve learned from the process.

1. People are still necessary

Even with the best AI coding agents available today, humans remain essential to the software development process. Experienced human software developers bring judgment, creativity, and domain knowledge that AI models lack. They know how to architect systems for long-term maintainability, how to balance technical debt against feature velocity, and when to push back when requirements don’t make sense.

For hobby projects like mine, I can get away with a lot of sloppiness. But for production work, having someone who understands version control, incremental backups, testing one feature at a time, and debugging complex interactions between systems makes all the difference. Knowing something about how good software development works helps a lot when guiding an AI coding agent—the tool amplifies your existing knowledge rather than replacing it.

As independent AI researcher Simon Willison wrote in a post distinguishing serious AI-assisted development from casual “vibe coding,” “AI tools amplify existing expertise. The more skills and experience you have as a software engineer the faster and better the results you can get from working with LLMs and coding agents.”

With AI assistance, you don’t have to remember how to do everything. You just need to know what you want to do.

Card Miner: Heart of the Earth is entirely human-designed, but it was AI-coded using Claude Code. It represents about a month of iterative work. Credit: Benj Edwards

So I like to remind myself that coding agents are software tools best used to enact human ideas, not autonomous coding employees. They are not people (and not people replacements) no matter how the companies behind them might market them.

If you think about it, everything you do on a computer was once a manual process. Programming a computer like the ENIAC involved literally making physical bits (connections) with wire on a plugboard. The history of programming has been one of increasing automation, so even though this AI-assisted leap is somewhat startling, one could think of these tools as an advancement similar to the advent of high-level languages, automated compilers and debugger tools, or GUI-based IDEs. They can automate many tasks, but managing the overarching project scope still falls to the person telling the tool what to do.

And they can have rapidly compounding benefits. I’ve now used AI tools to write better tools—such as changing the source of an emulator so a coding agent can use it directly—and those improved tools are already having ripple effects. But a human must be in the loop for the best execution of my vision. This approach has kept me very busy, and contrary to some prevailing fears about people becoming dumber due to AI, I have learned many new things along the way.

2. AI models are brittle beyond their training data

Like all AI models based on the Transformer architecture, the large language models (LLMs) that underpin today’s coding agents have a significant limitation: They can only reliably apply knowledge gleaned from training data, and they have a limited ability to generalize that knowledge to novel domains not represented in that data.

What is training data? In this case, when building coding-flavored LLMs, AI companies download millions of examples of software code from sources like GitHub and use them to train base models, which are later specialized for coding through fine-tuning.

The ability of AI agents to use trial and error—attempting something and then trying again—helps mitigate the brittleness of LLMs somewhat. But it’s not perfect, and it can be frustrating to see a coding agent spin its wheels trying and failing at a task repeatedly, either because it doesn’t know how to do it or because it previously learned how to solve a problem but then forgot because the context window got compacted (more on that here).

Violent Checkers is a physics-based corruption of the classic board game, coded using Claude Code. Credit: Benj Edwards

To get around this, it helps to have the AI model take copious notes as it goes along about how it solved certain problems so that future instances of the agent can relearn those lessons. You also want to set ground rules in the CLAUDE.md file that the agent reads when it begins each session.
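As a sketch of what that might look like, here is a hypothetical CLAUDE.md. The file name is a Claude Code convention; every rule below is an invented example for illustration, not a prescribed or actual project file:

```markdown
# Ground rules for this project

## Workflow
- Work on one feature at a time; never refactor unrelated code in the same pass.
- After solving a tricky bug, append a short write-up of the root cause
  and the fix to NOTES.md so future sessions can reuse the lesson.
- Run the test suite before declaring any task complete.

## Domain rules
- Sprite coordinates are in screen pixels, not tile units.
- Never rewrite files under assets/; they are hand-made.
```

Because the agent reads this file at the start of every session, rules placed here survive context compaction in a way that mid-conversation instructions do not.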

This brittleness means that coding agents are almost frighteningly good at what they’ve been trained and fine-tuned on—modern, well-represented technologies like JavaScript and HTML—and generally terrible at tasks on which they have not been deeply trained, such as 6502 assembly or programming an Atari 800 game with authentic-looking character graphics.

It took me five minutes to make a nice HTML5 demo with Claude but a week of torturous trial and error, plus actual systematic design on my part, to make a similar demo of an Atari 800 game. Along the way, I had to use Claude Code to build several tools, like command-line emulators and MCP servers, that let it peek into the operation of the Atari 800’s memory and chipset.

3. True novelty can be an uphill battle

Due to what might poetically be called “preconceived notions” baked into a coding model’s neural network (more technically, statistical semantic associations), it can be difficult to get AI agents to create truly novel things, even if you carefully spell out what you want.

For example, I spent four days trying to get Claude Code to create an Atari 800 version of my HTML game Violent Checkers, but it had trouble because in the game’s design, the squares on the checkerboard don’t matter beyond their starting positions. No matter how many times I told the agent (and made notes in my Claude project files), it would come back to trying to center the pieces to the squares, snap them within squares, or use the squares as a logical basis of the game’s calculations when they should really just form a background image.

To get around this in the Atari 800 version, I started over and told Claude that I was creating a game with a UFO (instead of a circular checker piece) flying over a field of adjacent squares—never once mentioning the words “checker,” “checkerboard,” or “checkers.” With that approach, I got the results I wanted.

A screenshot of Benj’s Mac while working on a Violent Checkers port for the Atari 800 home computer, amid other projects. Credit: Benj Edwards

Why does this matter? Because with LLMs, context is everything, and in language, context changes meaning. Take the word “bank” and add the words “river” or “central” in front of it, and see how the meaning changes. In a way, words act as addresses that unlock the semantic relationships encoded in a neural network. So if you put “checkerboard” and “game” in the context, the model’s self-attention process links up a massive web of semantic associations about how checkers games should work, and that semantic baggage throws things off.

A couple of tricks can help AI coders navigate around these limitations. First, avoid contaminating the context with irrelevant information. Second, when the agent gets stuck, try this prompt: “What information do you need that would let you implement this perfectly right now? What tools are available to you that you could use to discover that information systematically without guessing?” This forces the agent to identify (semantically link up) its own knowledge gaps, spelled out in the context window and subject to future action, instead of flailing around blindly.

4. The 90 percent problem

The first 90 percent of an AI coding project comes in fast and amazes you. The last 10 percent involves tediously filling in the details through back-and-forth trial-and-error conversation with the agent. Tasks that require deeper insight or understanding than what the agent can provide still require humans to make the connections and guide it in the right direction. The limitations we discussed above can also cause your project to hit a brick wall.

From what I have observed over the years, larger LLMs can potentially make deeper contextual connections than smaller ones. They have more parameters (encoded data points), and those parameters are linked in more multidimensional ways, so they tend to have a deeper map of semantic relationships. As deep as those go, it seems that human brains still have an even deeper grasp of semantic connections and can make wild semantic jumps that LLMs tend not to.

Creativity, in this sense, may be when you jump from, say, basketball to how bubbles form in soap film and somehow make a useful connection that leads to a breakthrough. Instead, LLMs tend to follow conventional semantic paths that are more conservative and entirely guided by mapped-out relationships from the training data. That limits their creative potential unless the prompter unlocks it by guiding the LLM to make novel semantic connections. That takes skill and creativity on the part of the operator, which once again shows the role of LLMs as tools used by humans rather than independent thinking machines.

5. Feature creep becomes irresistible

While creating software with AI coding tools, the joy of experiencing novelty makes you want to keep adding interesting new features rather than fixing bugs or perfecting existing systems. And Claude (or Codex) is happy to oblige, churning away at new ideas that are easy to sketch out in a quick and pleasing demo (the 90 percent problem again) rather than polishing the code.

Flip-Lash started as a “Tetris but you can flip the board,” but feature creep made me throw in the kitchen sink, losing focus. Credit: Benj Edwards

Fixing bugs can also create bugs elsewhere. This is not new to coding agents—it’s a time-honored problem in software development. But agents supercharge this phenomenon because they can barrel through your code and make sweeping changes in pursuit of narrow-minded goals, affecting lots of working systems. We’ve already talked above about the importance of a good architecture guided by the human behind the wheel, and that comes into play here.

6. AGI is not here yet

Given the limitations I’ve described above, it’s very clear that an AI model with general intelligence—what people usually call artificial general intelligence (AGI)—is still not here. AGI would hypothetically be able to navigate around baked-in stereotype associations and not have to rely on explicit training or fine-tuning on many examples to get things right. AI companies will probably need a different architecture in the future.

I’m speculating, but AGI would likely need to learn permanently on the fly—as in modify its own neural network weights—instead of relying on what is called “in-context learning,” which only persists until the context fills up and gets compacted or wiped out.

Grapheeti is a “drawing MMO” where people around the world share a canvas. Credit: Benj Edwards

In other words, you could teach a true AGI system how to do something by explanation or let it learn by doing, noting successes, and having those lessons permanently stick, no matter what is in the context window. Today’s coding agents can’t do that—they forget lessons from earlier in a long session or between sessions unless you manually document everything for them. My favorite trick is instructing them to write a long, detailed report on what happened when a bug is fixed. That way, you can point to the hard-earned solution the next time the amnestic AI model makes the same mistake.
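For example, the kind of post-fix report I have in mind might look something like this. This is a hypothetical write-up; the bug, date, and file name are invented for illustration:

```markdown
# Fixed: player sprite flicker (2026-01-04)

## Symptom
The player sprite flickered whenever it crossed the screen midline.

## Root cause
Sprite position registers were updated mid-frame, racing the display hardware.

## Fix
Moved all sprite register writes into the vertical blank handler (see vbi.asm).

## Lesson
On this target, only touch display hardware registers during vertical blank.
```

Pointing a fresh session at a file like this is far cheaper than letting the agent rediscover the fix from scratch.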

7. Even fast isn’t fast enough

After using Claude Code for a while, it’s easy to take for granted that you suddenly have the power to create software without knowing certain programming languages. This is amazing at first, but you can quickly become frustrated that what is conventionally a very fast development process isn’t fast enough. Impatience at the coding machine sets in, and you start wanting more.

But even if you do know the programming languages being used, you don’t get a free pass. You still need to make key decisions about how the project will unfold. And when the agent gets stuck or makes a mess of things, your programming knowledge becomes essential for diagnosing what went wrong and steering it back on course.

8. People may become busier than ever

After guiding way too many hobby projects through Claude Code over the past two months, I’m starting to think that most people won’t become unemployed due to AI—they will become busier than ever. Power tools allow more work to be done in less time, and the economy will demand more productivity to match.

It’s almost too easy to make new software, in fact, and that can be exhausting. One project idea would lead to another, and I was soon spending eight hours a day during my winter vacation shepherding about 15 Claude Code projects at once. That’s too much split attention for good results, but the novelty of seeing my ideas come to life was addictive. In addition to the game ideas I’ve mentioned here, I made tools that scrape and search my past articles, a graphical MUD based on ZZT, a new type of MUSH (text game) that uses AI-generated rooms, a new type of Telnet display proxy, and a Claude Code client for the Apple II (more on that soon). I also put two AI-enabled emulators for Apple II and Atari 800 on GitHub. Phew.

Consider the advent of the steam shovel, which allowed humans to dig holes faster than a team using hand shovels. It made existing projects faster and new projects possible. But think about the human operator of the steam shovel. Suddenly, we had a tireless tool that could work 24 hours a day if fueled up and maintained properly, while the human piloting it would need to eat, sleep, and rest.

I used Claude Code to create a windowing GUI simulation of the Mac that works over Telnet. Credit: Benj Edwards

In fact, we may end up needing new protections for human knowledge workers using these tireless information engines to implement their ideas, much as unions rose as a response to industrial production lines over 100 years ago. Humans need rest, even when machines don’t.

Will an AI system ever replace the human role here? Even if AI coding agents could eventually work fully autonomously, I don’t think they’ll replace humans entirely because there will still be people who want to get things done, and new AI power tools will emerge to help them do it.

9. Fast is scary to people

AI coding tools can turn what was once a year-long personal project into a five-minute session. I fed Claude Code a photo of a two-player Tetris game I sketched in a notebook back in 2008, and it produced a working prototype in minutes (prompt: “create a fully-featured web game with sound effects based on this diagram”). That’s wild, and even though the results are imperfect, it’s a bit frightening to comprehend what kind of sea change in software development this might entail.

Since early December, I’ve been posting some of my more amusing experimental AI-coded projects to Bluesky for people to try out, but I discovered I needed to deliberately slow down with updates because they came too fast for people to absorb (and too fast for me to fully test). I’ve also received comments like “I’m worried you’re using AI, you’re making games too fast” and so on.

Benj’s handwritten game design note about a two-player Tetris concept from 2007. Credit: Benj Edwards

Regardless of my own habits, the flow of new software will not slow down. There will soon be a seemingly endless supply of AI-augmented media (games, movies, images, books), and that’s a problem we’ll have to figure out how to deal with. These products won’t all be “AI slop,” either; some will be done very well, and the acceleration in production times due to these new power tools will balloon the quantity beyond anything we’ve seen.

Social media tends to prime people to believe that AI is all good or all bad, but that kind of black-and-white thinking may be the easy way out. You’ll have no cognitive dissonance, but you’ll miss a far richer third option: seeing these tools as imperfect and deserving of critique but also as useful and empowering when they bring your ideas to life.

AI agents should be considered tools, not entities or employees, and they should be amplifiers of human ideas. My game-in-progress Card Miner is entirely my own high-level creative design work, but the AI model handled the low-level code. I am still proud of it as an expression of my personal ideas, and it would not exist without AI coding agents.

10. These tools aren’t going away

For now, at least, coding agents remain very much tools in the hands of people who want to build things. The question is whether humans will learn to wield these new tools effectively to empower themselves. Based on two months of intensive experimentation, I’d say the answer is a qualified yes, with plenty of caveats.

We also have social issues to confront: Professional developers already use these tools, and given the prevailing stigma against AI in some online communities, both developers and the platforms that host their work will face difficult decisions.

Ultimately, I don’t think AI tools will make human software designers obsolete. Instead, they may well help those designers become more capable. This isn’t new, of course; tools of every kind have been serving this role since long before the dawn of recorded history. The best tools amplify human capability while keeping a person behind the wheel. The 3D printer analogy holds: amazing fast results are possible, but mastery still takes time, skill, and a lot of patience with the machine.

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

10 things I learned from burning myself out with AI coding agents Read More »

This may be the grossest eye pic ever—but the cause is what’s truly horrifying

Savage microbe

Whatever was laying waste to his eye seemed to have come from inside his own body, carried in his bloodstream—possibly the same thing that could explain the liver mass, lung nodules, and brain lesions. There was one explanation that fit the condition perfectly: hypervirulent Klebsiella pneumoniae, or hvKP.

Classical K. pneumoniae is a germ that dwells in people’s intestinal tracts and is one that’s familiar to doctors. It’s known for lurking in health care settings and infecting vulnerable patients, often causing pneumonia or urinary tract infections. But hvKP is very different. In comparison, it’s a beefed-up bacteria with a rage complex. It was first identified in the 1980s in Taiwan—not for stalking weak patients in the hospital but for devastating healthy people in normal community settings.

An infection with hvKP—even in otherwise healthy people—is marked by metastatic infection. That is, the bacteria spreads throughout the body, usually starting with the liver, where it creates a pus-filled abscess. It then goes on a trip through the bloodstream, invading the lungs, brain, soft tissue, skin, and the eye (endogenous endophthalmitis). Putting it all together, the man had a completely typical clinical case of an hvKP infection.

Still, definitively identifying hvKP is tricky. Mucus from the man’s respiratory tract grew a species of Klebsiella, but there’s not yet a solid diagnostic test to differentiate hvKP from the classical variety. Since 2024, researchers have worked out a strategy of testing for the presence of five virulence genes found on plasmids (relatively small, circular pieces of DNA, separate from chromosomal DNA, that can replicate on their own and be shared among bacteria). But the method isn’t perfect—some classical K. pneumoniae can also carry the five genes.

A string test performed on the rare growth of Klebsiella pneumoniae from the sputum culture shows a positive result, with the formation of a viscous string with a height of greater than 5 mm. Credit: NEJM 2026

Another much simpler method is the string test, in which clinicians basically test the goopiness of the bacteria—hvKP is known for being sticky. For this test, a clinician grows the bacteria into a colony on a petri dish, then touches an inoculation loop to the colony and pulls up. If the string of attached goo stretches more than 5 mm off the petri dish, it’s considered positive for hvKP. This is (obviously) not a precise test.

This may be the grossest eye pic ever—but the cause is what’s truly horrifying Read More »

Is Orion’s heat shield really safe? New NASA chief conducts final review on eve of flight.


“That level of openness and transparency is exactly what should be expected of NASA.”

The Orion heat shield as seen after the Artemis I flight. Credit: NASA

WASHINGTON, DC—This week, NASA’s new administrator, Jared Isaacman, said he has “full confidence” in the space agency’s plans to use the existing heat shield to protect the Orion spacecraft during its upcoming lunar mission.

Isaacman made the determination after briefings with senior leaders at the agency and a half-day review of NASA’s findings with outside experts.

“We have full confidence in the Orion spacecraft and its heat shield, grounded in rigorous analysis and the work of exceptional engineers who followed the data throughout the process,” Isaacman said Thursday.

Isaacman has previously indicated that reviewing the heat shield issue early in his tenure, especially with the Artemis II mission due to launch in as few as four weeks, was a top priority. He met with senior agency officials about the matter within hours of being sworn in on December 18.

The private astronaut and billionaire entrepreneur has also said there should be more public transparency at NASA.

Following the Artemis I mission in November 2022, NASA was roundly criticized for its opaque handling of damage to Orion’s heat shield. The seriousness of the problem was not disclosed for nearly a year and a half after the Artemis I mission, when NASA’s Inspector General finally published close-up images of char loss—chunks of ablative material at Orion’s base that were intended to protect the spacecraft during its return but had fallen away.

To address these concerns, NASA tapped an “independent review team” in April 2024 to assess the agency’s investigation of the heat shield. This group’s findings were finalized in December 2024, at which time NASA formally decided to fly the Artemis II mission with the existing heat shield. Although NASA held a news conference to discuss its conclusions, a publicly released copy of the independent review team’s report was heavily redacted, creating further doubt about the integrity of the process. Some notable critics assailed NASA’s decision to fly on the heat shield as is and decried the ongoing lack of transparency.

That is more or less where the matter stood until a few days before Christmas, when Isaacman officially became NASA administrator.

Transparency for the taxpayer

After taking the job in Washington, DC, Isaacman asked the engineers who investigated the heat shield issue for NASA, as well as the chair of the independent review team and senior human spaceflight officials, to meet with a handful of outside experts. These included former NASA astronauts Charles Camarda and Danny Olivas, both of whom have expertise in heat shields and had expressed concerns about the agency’s decision-making.

For the sake of transparency, Isaacman also invited two reporters to sit in on the meeting, me and Micah Maidenberg of The Wall Street Journal. We were allowed to report on the discussions without directly quoting participants for the sake of a full and open discussion.

The inspector general’s report, released on May 1, 2024, included new images of Orion’s heat shield. Credit: NASA Inspector General

Convened in a ninth-floor conference room at NASA Headquarters known as the Program Review Center, the meeting lasted for more than three hours. Isaacman attended much of it, though he stepped out from time to time to handle an ongoing crisis involving an unwell astronaut on orbit. He was flanked by the agency’s associate administrator, Amit Kshatriya; the agency’s chief of staff, Jackie Jester; and Lori Glaze, the acting associate administrator for NASA’s Exploration Systems Development Mission Directorate. The heat shield experts joined virtually from Houston, along with Orion Program Manager Howard Hu.

Isaacman made it clear at the outset that, after reviewing the data and discussing the matter with NASA engineers, he accepted the agency’s decision to fly Artemis II as planned. The team had his full confidence, and he hoped that by making the same experts available to Camarda and Olivas, it would ease some of their concerns.

What followed was a spirited discussion, with Camarda sparring regularly with the presenters and Olivas asking questions more infrequently. The engineering team in Houston, led by Luis Saucedo, went through dozens of charts and presented reams of data that had not been made public before.

“That level of openness and transparency is exactly what should be expected of NASA,” Isaacman said after the meeting.

“What if we’re wrong?”

Perhaps the most striking revelation was what the NASA engineers called “what if we’re wrong” testing.

At the base of Orion, there are 186 blocks of a material called Avcoat, individually attached to provide a protective layer that allows the spacecraft to survive the heating of atmospheric reentry. Returning from the Moon, Orion encounters temperatures of up to 5,000° Fahrenheit (2,760° Celsius). A char layer that builds up on the outer skin of the Avcoat material is supposed to ablate, or erode, in a predictable manner during reentry. Instead, during Artemis I, fragments fell off the heat shield and left cavities in the Avcoat material.

Work by Saucedo and others—including substantial testing in ground facilities, wind tunnels, and high-temperature arc jet chambers—allowed engineers to trace the cracking to gases becoming trapped in the heat shield. The culprit was the Avcoat material being “impermeable,” essentially meaning it could not breathe.

After considering several options, including swapping the heat shield out for a newer one with more permeable Avcoat, NASA decided instead to change Orion’s reentry profile. For Artemis II, it would return through Earth’s atmosphere at a steeper angle, spending fewer minutes in the environment where this outgassing occurred during Artemis I. Much of Thursday’s meeting involved details about how the agency reached this conclusion and why the engineers deemed the approach safe.

A test block of Avcoat undergoes heat pulse testing inside an arc jet test chamber at NASA’s Ames Research Center in California. The test article, configured with both permeable (upper) and non-permeable (lower) Avcoat sections for comparison, helped to confirm an understanding of the root cause of the loss of charred Avcoat material on Artemis I. Credit: NASA

However, toward the end of the meeting, the NASA team agreed to discuss something that “no one really liked to talk about.” This was an analysis of what would happen to Orion if large sections of the heat shield failed completely during Artemis II. Formally, this is known as a “damage tolerance evaluation,” the engineers said. Informally, it’s known as “What if we’re wrong.”

The Avcoat blocks, which are about 1.5 inches thick, are laminated onto a thick composite base of the Orion spacecraft. Inside this is a titanium framework that carries the load of the vehicle. The NASA engineers wanted to understand what would happen if large chunks of the heat shield were stripped away entirely from the composite base of Orion. So they subjected this base material to high energies for periods of 10 seconds up to 10 minutes, which is longer than the period of heating Artemis II will experience during reentry.

What they found is that, in the event of such a failure, the structure of Orion would remain solid, the crew inside would be safe, and the vehicle could still make a watertight landing in the Pacific Ocean.

“We have the data to say, on our worst day, we’re able to deal with that if we got to that point,” one of the NASA engineers said.

Getting to “flight rationale”

The composite layer beneath the heat shield is intended to withstand a maximum temperature of 500° F during reentry. During Artemis I, despite the persistent cracking and char loss, the maximum temperature recorded was 160° F. So any crew on board would have been safe. Even so, the heat shield damage was a serious concern because the agency’s modeling did not predict it.

After more than two years of testing and analysis of the char loss issue, the NASA engineers are convinced that, by increasing the angle of Orion’s descent during Artemis II, they can minimize damage to the heat shield. During Artemis I, as the vehicle descended from about 400,000 to 100,000 feet, it was under a “heat load” of various levels for 14 minutes. With Artemis II, this time will be reduced to eight minutes.

Orion’s entry profile will be similar for the first two and a half minutes, but afterward, the Artemis II entry will sustain a slightly higher heat load than Artemis I for a couple of minutes. All of the agency’s modeling and extensive arc jet testing indicate this will produce significantly less cracking in the Avcoat material.

Much of the discussion Thursday delved into the technical minutiae of heat shields, tamp planes (the process of packing Avcoat into blocks), early char loss, spallation, and more. The discourse also revealed that one test in 2019, three years before Artemis I, indicated hints of the char loss later observed in flight. But this finding was not unequivocal, nor did it throw up a huge red flag at the time, the NASA officials said.

Technicians inspect the heat shield for the Artemis II launch. Credit: NASA

The message from Isaacman, Kshatriya, and other NASA officials at the meeting was clear. This heat shield was not perfect. If NASA knew several years ago what it knows now, the heat shield would be designed differently. It would be permeable to prevent the outgassing problems. Those changes are being incorporated into the Artemis III mission’s heat shield. There will be other tweaks to increase reliability.

Nevertheless, the agency is confident that flying the Artemis II heat shield on the revised profile is perfectly safe. In NASA jargon, such a rigorous justification that a space mission is safe to fly is known as flight rationale.

But why get to flight rationale at all? About 18 months ago, as the agency was zeroing in on the root cause of the heat shield issues, NASA’s leaders at the time, including Kshatriya, considered their options. They mulled the possibility of flying Artemis II in low-Earth orbit to test its life support equipment without overly stressing the heat shield. They thought about flying a second robotic mission around the Moon.

Perhaps most seriously, they considered pulling forward the Orion spacecraft (or at least its heat shield) slated for Artemis III, which has permeable Avcoat, and using it for this mission. I asked Kshatriya on Thursday why they had not simply done this.

“We had considered ‘let’s just pull forward CSM 3 (the Artemis III spacecraft),’” he said, in part. “and essentially turn CSM 2 (Artemis II) either into a test article or something else. Again, CSM 3 has unique capabilities, docking systems on it, right? We didn’t have a docking mode for that mission (Artemis II). CSM 2 could not be retrofitted with the docking system because of the uniqueness of the tunnel. Really, CSM 2 is kind of uniquely a free return vehicle because of the way it was designed initially. So the mods that would have had to be made for (Artemis) II and III to do that swap would have been too odious, and we wouldn’t have gotten the learnings. And, you know, we’re trying to get up hill as quickly as we can.”

Given all of this, how should we feel about this flight rationale, with Artemis II potentially launching in early February?

Over the last 18 months, I have had many discussions with experts about this, from mid-level engineers and current and former astronauts to senior leaders. I know definitively that the four Artemis II astronauts, Reid Wiseman, Victor Glover, Christina Koch, and Jeremy Hansen, are comfortable with the decision. They did not feel that way at the beginning of the process. Wiseman, in particular, was quite skeptical. But they’ve been won over. Like almost everyone else who has reviewed NASA’s data at length, they accept the plan. Indeed, they are ready and eager to fly.

But what of the outside critics? That was the whole point of Thursday’s session. Could the NASA engineers convince Olivas and Camarda?

Yes, and maybe

Olivas flew two Space Shuttle missions in 2007 and 2009 and has an advanced degree in materials science from Rice University. Before this week’s meeting, he had not gone public with his heat shield concerns. But he has been talking to me and another space reporter, Robert Pearlman, for about a month now.

Olivas is very credible on these issues. He was asked by the NASA leadership in late 2023, before the independent review team was formally named, to provide a second set of eyes on the space agency’s heat shield work. He saw all of the investigative data in real time. Although not formally a member, he sat in on the review team’s meetings through 2024 before that process ended. Afterward, he had some lingering questions he felt were unresolved by that process. A few weeks ago, he told Pearlman and me he would be reluctant to fly on Orion. It was a stunning admission.

Isaacman appeared to take these concerns seriously. In advance of Thursday’s meeting, he engaged with Olivas to hear him out and share information about what NASA’s engineers had done over the last 18 months to resolve some of the independent review team’s questions. These included questions about char loss very early in Orion’s reentry.

After Thursday’s meeting, Olivas told me he had changed his mind, expressing appreciation and admiration for the in-depth engineering work done by the NASA team. He would now fly on Orion.

Camarda, another former shuttle astronaut, was less effusive. He has been very public with his criticism of NASA’s handling of the Orion heat shield. He told me in December 2024 that the space agency and its leadership team should be “ashamed.” Unlike Olivas, however, he has been on the outside the whole time. NASA had kept Camarda, 73, at arm’s length, and he felt disrespected. Given his credentials—the aerospace engineer spent two decades working on thermal protection for the space shuttle and hypersonic vehicles—Camarda could be a potent voice of skepticism leading up to the Artemis II launch.

After the meeting, I asked Camarda whether he felt any better about flying crew on the Artemis II heat shield.

“I would never be happy accepting a workaround and flying something that I know is the worst version of that heat shield we could possibly fly and hoping that the workaround is going to fix it,” Camarda said. “What I really hope he [Isaacman] gets is that if we don’t get back to doing research at NASA, we’re not going to be able to help Starship solve their problems. We’ve got to get back to doing research.”

But Camarda was no longer the firebrand he was at the outset of the meeting. Near its end, in fact, he even thanked the leadership team for being brought in, read in on the data, and allowed to have his say.

Photo of Eric Berger

Eric Berger is the senior space editor at Ars Technica, covering everything from astronomy to private space to NASA policy, and author of two books: Liftoff, about the rise of SpaceX; and Reentry, on the development of the Falcon 9 rocket and Dragon. A certified meteorologist, Eric lives in Houston.

Is Orion’s heat shield really safe? New NASA chief conducts final review on eve of flight.


Film Technica: Our top picks for the best films of 2025


lighting up the silver screen

Streamers made a strong showing this year, as did horror. Big tentpoles, superhero sagas mostly fell flat.

Credit: Collage by Aurich Lawson


Editor’s note: Although we’ve done our best to avoid spoiling anything too major, please note that this list does include a few specific references that some might consider spoiler-y.

It’s been a strange year for movies. Most of the big, splashy tentpole projects proved disappointing, while several more modest films either produced or acquired by streaming platforms—and only briefly released in theaters—wound up making our year-end list. This pattern was not intentional. But streaming platforms have been increasingly moving into the film space with small to medium-sized budgets—i.e., the kind of fare that used to be commonplace but has struggled to compete over the last two decades as blockbusters and elaborate superhero franchises dominated the box office.

Add in lingering superhero fatigue—only one superhero saga made our final list this year—plus Netflix’s controversial bid to acquire Warner Bros., and we just might be approaching a sea change in how movies are made and distributed, and by whom. How this all plays out in the coming year is anybody’s guess.

As always, we’re opting for an unranked list, with the exception of our “year’s best” selection at the very end—this year it’s a three-way tie—so you can browse the variety of genres on offer and perhaps add a few surprises to your eventual watchlist. We invite you to head to the comments and add your own favorite films released in 2025.

Ballerina

determined young woman holding a flame thrower.

Credit: Lionsgate

Ana de Armas proves herself a fierce and lethal adversary against a cultish syndicate in Ballerina—excuse me, From the World of John Wick: Ballerina. Chronologically, Ballerina takes place during the events of John Wick Chapter 3: Parabellum. That film gave us a glimpse into John Wick’s (Keanu Reeves) past as he sought aid from the Ruska Roma crime syndicate, led by the Director (Anjelica Huston), where he was trained as an assassin. The Director also trains girls to be ballerina-assassins, one of whom is Eve Macarro (de Armas).

Like Wick, Eve is driven by a personal vendetta: the brutal murder of her father when she was still a child by highly trained and heavily armed assassins. The Director warns Eve that this is a rogue group of lawless cultists and orders her not to pursue the matter. But vengeance will be Eve’s, no matter the cost, as she hunts down the cultists and their enigmatic leader, the Chancellor (Gabriel Byrne).

Ballerina has all the eye-popping visuals, lavish sets, and spectacularly inventive stuntwork one would expect from a film set in the John Wick universe. It’s more tightly plotted than recent entries in the franchise, and the globe-trotting locations make narrative sense; it’s not just an excuse for staging a spectacle. As always, the fight choreography is perfection. Eve is smaller than most of the men she takes on, but that doesn’t make her any less deadly, particularly when she’s more than willing to fight dirty. A fight scene with dueling flame throwers is one for the ages. Despite a few minor quibbles, Ballerina is an immensely entertaining and action-packed addition to the franchise.

Jennifer Ouellette

The Baltimorons

Man in silly hat in front of xmas tree mugging for camera while a woman looks on, rolling her eyes

Credit: IFC

The Baltimorons is a quirky love story about an unlikely pair who find each other by happenstance over the holidays. Didi (Liz Larsen) is a divorced middle-aged dentist whose ex-husband has just gotten married to his much-younger girlfriend—on Christmas Eve, no less, so the wedding reception pre-empts Didi’s planned time with her daughter. She’s thus on call when a bumbling former improv comedian and recovering alcoholic named Cliff (Michael Strassner) has a dental emergency.

Cliff’s car is towed while she treats him—apparently, this is a regular occurrence—and Didi offers to drive him to the impound lot. They end up going on a quixotic journey around Baltimore, including crashing the family wedding reception and performing at a pop-up improv show, and find themselves drawn together despite their significant age difference.

Director Jay Duplass has a knack for this kind of idiosyncratic fare featuring deeply imperfect yet likable characters, having written, directed, or produced such gems as Safety Not Guaranteed, Horse Girl, Table 19, and Jeff, Who Lives at Home. It falls on Strassner—a Baltimore native who co-wrote the script—and Larsen to carry the film, which they do with considerable charm. You get why Didi and Cliff forge such a bond, even if one questions how long it’s likely to last. The film is also kind of a love letter to Baltimore, aka “Charm City”; if all you know about Baltimore comes from watching The Wire, The Baltimorons will give you a glimpse of the city’s many other neighborhoods and sights.

Jennifer Ouellette

The Phoenician Scheme

middle aged man, a nun, and a younger man in an airplane cabin

Credit: Universal

Auteur director Wes Anderson’s films have a visual style and tone all their own, and I’ve been a fan of his understated eccentricity since 1998’s Bottle Rocket. OK, 2023’s Asteroid City left me cold, but Anderson returns to top form with The Phoenician Scheme. Benicio del Toro stars as Zsa-Zsa Korda, a ruthless 1950s arms dealer and industrialist who finds himself the target of government assassins—most likely because of his unethical business practices.

He barely survives one attempt, and a vision of the afterlife convinces Zsa-Zsa that he needs to mend fences with his estranged daughter Liesl (Mia Threapleton), a novice in a convent. He’s also trying to pull off a risky scheme to essentially overhaul the infrastructure of Phoenicia, traveling around the world to meet with investors and convince them to increase their own shares so he can avoid bankruptcy. Liesl joins him on the journey, along with a nerdy Norwegian entomologist named Bjorn (Michael Cera). Wacky hijinks ensue. The plot is intricate and sometimes unfocused, but Anderson pulls it off with his usual delicately whimsical touch, bolstered by delightfully deadpan performances from the cast.

Jennifer Ouellette

100 Nights of Hero

man and woman in medieval dress holding lamps at night

Credit: IFC

This sumptuous historical fantasy is adapted from Isabel Greenberg’s lavishly illustrated graphic novel of the same name, which is in turn an inventive twist on One Thousand and One Nights. Maika Monroe plays Cherry, the wife of a wealthy medieval landowner named Jerome (Amir El-Masry), who for some reason has not consummated their marriage. Obsessed with his wife’s fidelity, Jerome makes a wager with his handsome friend Manfred (Nicholas Galitzine) that if Manfred successfully seduces Cherry within 100 days, Jerome will give him both Cherry and his castle.

But Cherry’s maid, Hero (Emma Corrin), secretly loves her lady and thwarts Manfred’s seduction attempts by regaling him with captivating stories every night to keep her mistress from succumbing to temptation. And Manfred is most definitely tempting, dragging a freshly killed deer to the castle while bare-chested and covered in its blood. The costumes, production design, and cinematography are stunning, mirroring Cherry’s gradual sexual awakening via romantic triangle. Add in stellar performances, and this is a sensual fairy tale for the ages.

Jennifer Ouellette

Thunderbolts*

group of second-rate superheroes standing together

Credit: Marvel Studios

Thunderbolts* is basically the MCU’s version of The Suicide Squad (2021) with less over-the-top R-rated violence, but it’s just as irreverently entertaining. Black Widow introduced us to Natasha Romanoff’s (Scarlett Johansson) backstory as a child recruited for training as an elite assassin, along with her adoptive sister (and equally lethal assassin) Yelena Belova (Florence Pugh). Thunderbolts* finds Yelena working as a hired mercenary for CIA director Valentina Allegra de Fontaine (Julia Louis-Dreyfus), but she’s still grieving the loss of Natasha, and her heart just isn’t in it.

Yelena decides to quit, and Valentina asks her to do one last covert mission. It turns out to be a trap: Yelena is attacked by super soldier John Walker (Wyatt Russell), Taskmaster (Olga Kurylenko), and Ghost (Hannah John-Kamen). The hope was that they’d all kill each other and be destroyed along with incriminating evidence—which includes an awkward, nebbishy man in hospital PJs named Bob (Lewis Pullman), who is far more dangerous than he appears. Along with Yelena’s adoptive father, Alexei/Red Guardian (David Harbour), they all team up to take down Valentina instead.

It’s well-plotted and doesn’t take itself too seriously. Director Jake Schreier (Robot & Frank, Beef) expertly balances the action sequences with bantering wisecracks and quieter introspective moments that serve to actually develop the characters, each of whom has their inner demons and plenty of red in their respective ledgers. And Schreier has an incredibly talented cast to work with, all of whom give stellar performances. Thunderbolts* is a refreshing return to peak Marvel form: well-paced, witty, and action-packed with enough heart to ensure you care about the characters.

Jennifer Ouellette

Frankenstein

man in victorian garb in a lab bending over a body on a table

Credit: Netflix

Director Guillermo del Toro has been telling interviewers for years about his enduring love for Mary Shelley’s classic novel and his long-standing desire to direct a film that would capture the novel’s sense of grand Miltonian tragedy. He called this film “the culmination of a journey that has occupied most of my life.” His Frankenstein is probably the most faithful film adaptation yet made (with a few deviations in later acts), even mirroring Shelley’s narrative structure. It’s first told from the perspective of the captain of an Arctic ship trapped in ice en route to the North Pole who rescues a badly wounded Baron Victor Frankenstein (Oscar Isaac). Both Victor and his Creature (Jacob Elordi) then get to tell their versions of the story that brought them to the Arctic.

Known for his lush visuals and high Gothic sensibility, del Toro doesn’t disappoint, with elaborate sets—Victor’s laboratory is a wonder of 19th-century steampunk industrialism—and an innovative design for the Creature. Del Toro is the perfect conduit for this story of an arrogant scientist who tries to play god by creating a monstrous creature, only to become a monster himself. Isaac brings a blend of passionate intensity and cold ambition to his portrayal of Victor, but it’s Elordi who ultimately anchors the film, conveying the fundamental humanity of Shelley’s iconic monster.

Jennifer Ouellette

The Long Walk

group of young boys walking as a group down a road with armed soldiers at the ready

Credit: Lionsgate

Before The Hunger Games, there was The Long Walk, a 1979 novel by Stephen King (writing as Richard Bachman) about a dystopian alternate history in which one young man from each state in a totalitarian US is chosen to participate in a grueling annual contest. They walk. And walk. And walk. If they drop below 3 MPH or stop to rest, they are executed. They keep walking until only one is left standing as the “winner,” rewarded with whatever he wants for life at a time when the country is mired in a deep economic depression. It’s grim material well-suited for a film adaptation by Francis Lawrence, who has directed every film in The Hunger Games franchise. The dude knows his dystopias.

Cooper Hoffman plays Ray Garraty, a contestant from Maine who volunteers for the walk over the objections of his mother. His first wish, should he win, would be for a rifle to kill the Major (Mark Hamill) in charge of the walk, since the Major had executed his father years before. Ray soon bonds with Pete (David Jonsson), but the stakes become crystal clear when the first walker falls: a boy who develops a charley horse is summarily shot for sitting down. One by one, each boy falls until just two remain.

Lawrence keeps things tense and starkly minimalistic. There are no elaborate sets or costumes. It’s the interactions between the various walkers that drive the story, punctuated by inevitable deaths. The point is that there is no happy ending, regardless of who technically “wins.” There are some deviations from the novel, but Lawrence retains King’s suitably cryptic (and quite bleak) ending. I’m a fan of Andy Muschietti’s two-part adaptation of IT and Mike Flanagan’s Doctor Sleep, but The Long Walk might just edge them out as the best adaptation of a Stephen King novel yet.

Jennifer Ouellette

Fackham Hall

This gem of a film is basically Airplane! meets Agatha Christie meets Downton Abbey, spoofing all those British aristocratic period dramas we know and love. Set in 1931, the plot centers on a charming orphaned pickpocket named Eric (Ben Radcliffe), who is mistaken for a new employee when he arrives at the titular manor house of Lord and Lady Davenport (Damian Lewis and Katherine Waterston).

Eric ends up leaning into his new role and is soon promoted, even indulging in a forbidden romance with the Davenports’ daughter Rose (Thomasin McKenzie). Then someone gets murdered, and Eric finds himself framed for the killing. It’s up to Inspector Watt (Tom Goodman-Hill) and his magnificent (removable) mustache to solve the mystery. The cast clearly had a blast, and it’s impossible to resist that wickedly dry, often scatological British slapstick humor. Fackham Hall is a bright, shiny bauble that will leave you longing for a sequel.

Jennifer Ouellette

Strange Journey: The Story of Rocky Horror

When The Rocky Horror Picture Show premiered in 1975, no one could have dreamed that it would become the longest-running theatrical release film in history—least of all its creator, Richard O’Brien. But that’s what happened as it developed a loyal cult following of fans dressing up in costumes and acting out the lines in front of the big screen, a practice known as shadow casting. Thanks to a killer soundtrack, campy humor, and those devoted fans, Rocky Horror is still a mainstay of midnight movie culture. Richard O’Brien’s son, Linus O’Brien, marked the occasion with his fascinating documentary Strange Journey: The Story of Rocky Horror.

The film has its share of cast reminiscences, but it’s the profound impact Rocky Horror has had over the decades that ultimately shines through—and not just on a broad cultural scale. O’Brien decided to make the film while gathering archival clips of his father’s work. He came across a video clip of “I’m Going Home” and found himself browsing through the comments, deeply touched by the many people, including a soldier in Iraq and a woman grieving the loss of her mother, talking about what the song and film had meant to them.

The film ends with a fan telling Richard O’Brien, “It doesn’t matter what people think about Rocky because it belongs to us, not to you”—and Rocky’s creator agreeing that this was true. You can pair Strange Journey with another film celebrating the milestone anniversary, Sane Inside Insanity: The Phenomenon of Rocky Horror, for a documentary double feature.

Jennifer Ouellette

Good Boy

adorable golden-furred dog in the woods with a concerned look on its face

Credit: IFC/Shudder

I promise you this is not a spoiler, but for anyone too scared to watch Good Boy, the whole point of one of the year’s most original horror movies is that the dog survives. And despite being a “good boy,” from the moment we meet Indy, the dog gives off “final girl” energy, being the only creature in a cursed family house to sense the hauntings that seem to complicate his owner’s illness and drive him closer to death. Relying on lighting tricks and a frenetic, pulsing soundtrack to dramatize scenes where its star seems to just be acting like a dog, the movie reinvigorates the haunted house story by telling it from a dog’s-eye level and largely obscuring the faces of humans.

Director and co-screenwriter Ben Leonberg told AV Club that he drew this stellar performance out of Indy—who is not a show dog but his own adorable dog—by living in the house where the movie was filmed and building the set around the ways that Indy moved. Come for the pudgy puppy reels, and then be as obedient as Indy and “stay” for the technical feat of watching a man and his best friend turn classic horror devices into dog toys.

Ashley Belanger

Hedda

young black woman in a ball gown surrounded by party guests

Credit: Orion/Amazon MGM Studios

Tessa Thompson is luminous in the title role of director Nia DaCosta’s film adaptation of the classic Henrik Ibsen play Hedda Gabler. It’s the story of a general’s daughter who marries a stuffy academic for convenience, believing her wild youth is behind her—only to find it’s not much fun being trapped in a loveless marriage, however elegant the surroundings. When a former lover pops up, now involved with Hedda’s romantic rival, tensions build to an explosive climax. This being Ibsen, things don’t end well for anyone.

DaCosta has kept most of the play’s plot intact, but a clever gender swap makes for an interesting twist on the complicated interpersonal dynamics. Nina Hoss plays novelist and recovering alcoholic Eileen Lovborg (a man named Eilert in the play), with Imogen Poots playing romantic rival Thea. Hedda also maintains a flirtation with the lascivious Judge Brack (Nicholas Pinnock), who is manipulative enough to use Hedda’s weaknesses against her. Hedda is among the greatest dramatic roles in theater, and Thompson utterly makes it her own. Is the film a bit stagey at times? Yes, which isn’t surprising since it’s based on a play. That very staginess gives the film a tight, claustrophobic feel, heightening Hedda’s sense of the walls closing in on her once vibrant youth.

Jennifer Ouellette

The Last Republican

former congressman adam kinzinger in suit and tie with chin resting on his clasped hands during a congressional hearing

Credit: Media Courthouse Documentary Collective

Normally, I’d rather stick hot needles under my fingernails than watch a bio-documentary about a politician, regardless of party affiliation. It’s just not my thing. But we live in interesting times, and The Last Republican is not your standard political documentary. The film follows former Rep. Adam Kinzinger (R-IL) over the course of his last year in office. Kinzinger was ousted by his own party for his service on the congressional committee investigating the January 6, 2021, riotous attack on the US Capitol—and for his outspoken denunciation of then-President Donald Trump’s incendiary rhetoric at the instigating rally and delayed action to quell the rioters.

That’s standard documentary fare. But this one was directed by Steve Pink, best known for 2010’s Hot Tub Time Machine (a personal favorite of mine). Pink is (almost) as far from Kinzinger politically as it’s possible to be. Kinzinger chose to work with Pink because he, too, loves Hot Tub Time Machine. And a most unlikely friendship was born. You can see their bond in the trailer, which opens with Kinzinger recognizing that the man he has trusted with his story likely has nothing but contempt for Kinzinger’s political views. “That’s kinda mean,” we hear Pink say off-camera, before cheekily asking how one even becomes a Republican, “because I don’t get it.”

That friendship resonates perfectly with the film’s central theme. “It’s not about a political view,” Kinzinger says in the film. “It’s about what it is to turn against everything you’ve ever belonged to because of some red line you can’t cross.” Had there been more principled congressional members like Kinzinger in 2021 willing to put country over party, even if it torched their political careers—and more friendships across political divides finding common ground—the US would be in a very different and better place today. Kinzinger’s closing J6 committee statement is even more relevant four years later: “Oaths matter. Character matters. Truth matters. If we do not renew our faith and commitment to these principles, this great experiment of ours, our shining beacon on a hill, will not endure.”

Jennifer Ouellette

Weapons

young boy in classroom with creepy clown makeup and a sinister smile

Credit: Warner Bros.

One of the most terrifying images of 2025 was a mob of kids with their arms extended like airplanes. It came in Weapons, a witchy mystery that begins with every child in a certain middle school teacher’s class suddenly disappearing, except for one, a quiet boy named Alex Lilly. Working off a highly original script and giving an emotional performance that drove some viewers to tears, young actor Cary Christopher wrenches hearts as Alex’s role in the other kids’ disappearance becomes clearer—after the audience meets his Aunt Gladys.

An actual living and breathing nightmare played to unnerving perfection by Amy Madigan, Aunt Gladys reads like voodoo Mary Poppins meets Pennywise the clown. But stuck in the house with this instantly iconic horror character, Alex proves that he’s the most capable caretaker in the family. In the end, he’s the one tasked with helping his aunt “feel better” while spooning as much Campbell’s soup as it takes into the faces of “weaponized” loved ones to ensure they survive Aunt Gladys’ visit.

Ashley Belanger

Dust Bunny

young girl in bed at night looking scared

Credit: Lionsgate

Dust Bunny is the directorial feature film debut of Bryan Fuller, the creative force behind some of my favorite TV shows over the years, most notably Dead Like Me, Wonderfalls, and Pushing Daisies, as well as Hannibal. Fuller has a knack for injecting elements of magical realism into otherwise ordinary settings, and Dust Bunny adds a healthy dose of horror and Labyrinth-style visual aesthetics into the mix to strike a perfect balance between violence, suspense, whimsy, and emotional depth. Sophie Sloan plays Aurora, a young girl in New York City who turns to her neighbor, Resident 5B (Mads Mikkelsen, in a role written specifically for him), for help when (she claims) a monster under her bed kills and eats her parents.

Resident 5B is a hitman for hire, and Aurora wants him to kill the monster in revenge, although he doesn’t think the monster is real—and there are, in fact, other bad people who won’t balk at going through Aurora to get to Resident 5B. Fun fact: The monster design was inspired by highland cows, although Fuller also asked for the monster to be part hippopotamus and part piranha; artist Jon Wayshak proved well up to the task. Mikkelsen and Sigourney Weaver turn in terrific performances—Mikkelsen even helped choreograph one of the stunt sequences—as do Sloan and David Dastmalchian. Plus, there’s an entire action sequence featuring a Chinese dragon costume. What more could one want?

Jennifer Ouellette

Wicked: For Good

Glinda the Good Witch and Elphaba in center with supporting characters from Oz on either side

Credit: Universal


Every musical theater fan knows that the second act of a show is almost invariably weaker than the first. Thus, setting the second act of the Wicked musical apart as its own movie was bound to result in a sequel that had trouble living up to last year’s banger-filled mega-hit film.

Wicked: For Good is also where the narrative starts coming apart at the seams a bit, as it necessarily intersects and interacts with the narrative from The Wizard of Oz itself. The leaps of logic necessary to get these “misunderstood” versions of the characters to gel with the ones we see cavorting in that 90-year-old classic are best ignored. But the movie repeatedly throws those connections in our face amid a heavily padded 137-minute runtime that could have easily been half an hour shorter.

Despite it all, though, the quality of the original writing from Stephen Schwartz and Winnie Holzman still shines through. The titular song “For Good” is still an all-time classic, and strong performances carry catchy tunes like “No Good Deed” and “Just for This Moment” (though the latter is robbed of a lot of its inherent sex appeal through some odd directorial choices). Even “The Girl in the Bubble”—a new song created just for the movie—manages to not feel out of place thanks in large part to a winning performance from Ariana Grande and some downright magical camera work.

The worst part of Wicked: For Good, though, might be how its success will almost definitely lead to an expanded Wicked Cinematic Universe, with sequels or prequels that mash these winning characters to death via a bunch of expositional backstory. Let Glinda and Elphaba rest! They’ve earned it!

Kyle Orland

K-Pop Demon Hunters

Credit: Netflix

This was a surprise mega-hit for Netflix, fueled by a killer Korean pop soundtrack featuring one earworm after another that collectively dominated the charts for weeks. K-Pop Demon Hunters is the streaming giant’s most-watched animated film of all time, and that’s not just because of the infectious music—although the music is why Netflix ended up releasing a highly popular singalong version in theaters (after the film racked up huge streaming numbers). The Sony Animation team delivers bold visuals that evoke the look and feel of anime, the plot is briskly paced, and the script strikes a fine balance between humor and heart.

Earth has been protected from demons for generations by a protective barrier called the Honmoon, maintained by musical trios/demon hunters from each generation. One day, the Honmoon will become so strong it will turn “golden” and seal away the demons forever. The latest incarnation of demon hunters—a K-Pop band called Huntr/x—is close to accomplishing the Golden Honmoon.

Rumi (Arden Cho) is the lead singer, Mira (May Hong) is the group’s dancer/choreographer, and American-born Zoey (Ji-young Yoo) is the rapper and lyricist. But Rumi harbors a secret: Her father was a demon, and she is marked by the telltale purple “patterns,” which she keeps hidden from her bandmates. Hoping to destroy the Honmoon once and for all, king of the demons Gwi-Ma sends five of his demons to form a K-pop boy band, the Saja Boys, led by Jinu (Ahn Hyo-seop). Their popularity soon rivals that of Huntr/x and threatens the Honmoon.

Co-director (with Chris Appelhans) Maggie Kang conceived the story and helped write the screenplay, intending the film to be a love letter to K-pop and her Korean roots. But she also drew on traditional Korean mythology and folklore. Those details add a rich layer of texture to the basic storyline. Granted, the film adheres to a familiar formula, but it’s a winning one. K-Pop Demon Hunters’ unifying message about the power of music to heal, unite, and build community—celebrating honest authenticity rather than striving for impossible perfection—is a resonant one.

Jennifer Ouellette

28 Years Later

man and his son running away from zombies in a field

Credit: Sony Pictures

28 Years Later could have been terrible, screenwriter Alex Garland told Rolling Stone, if he had gone with his original idea about a group of military men fighting to stop bad guys from weaponizing the Rage Virus. But director Danny Boyle didn’t let that happen, instead pushing Garland to think small and deliver a powerful coming-of-age story that’s somehow just as intense as 2002’s 28 Days Later while retreading hardly any of the same territory. A story about resisting isolationism, 28 Years Later is set on a small island where a scrappy community has survived for decades after being quarantined from the rest of the world.

The story follows a young boy, Spike, who leaves home with his ailing mother after he learns that he cannot trust his father to look out for them. A fire is lit in Spike to cure his mother, and no human or infected—not the worm-eating chubby ones or the spine-ripping alphas—can put him off his mission. What starts as a ritual hunt to initiate a boy into manhood turns instead into a tender quest to find the only known doctor on the island, allowing Spike to see the infected and his community in a new light.

Featuring nuanced performances equal parts harrowing and endearing from Jodie Comer as the mom, Isla, and Alfie Williams as Spike, the movie explores the folly of societies backsliding from progress out of fear of the unknown. As Spike’s dread of the infected flickers out, it’s replaced by an urgent curiosity about the world beyond his village. The only thing potentially standing in his way of growing as wise as the doctor is a gang of “pals” named Jimmy. “Howzat!” for a setup to get boots marching into theaters to see the second installment of the new trilogy in January?

Ashley Belanger

Blue Moon

two men in 1920s suits in a club

Credit: Sony Pictures Classics

Director Richard Linklater (Dazed and Confused, Hit Man) had two films released this year. One is Nouvelle Vague, about the 1959 shooting of the seminal French New Wave film Breathless. The other is Blue Moon, about the complicated relationship between lyricist Lorenz Hart and his erstwhile composer partner Richard Rodgers. Both films are exceptional in their own right, but Blue Moon is my choice for our year’s best list. Chalk it up to my enduring fondness for classic Broadway musicals.

The film takes place in Sardi’s restaurant on the opening night of Oklahoma!, which is Rodgers’ (Andrew Scott) first collaboration with a new lyricist, Oscar Hammerstein II (Simon Delaney). Ethan Hawke turns in a powerful performance as Hart, newly (barely) sober and holding court with bartender Eddie (Bobby Cannavale). He’s rather bitter about his own waning career after he refused to collaborate on the new musical. He’s depressed, Eddie is reluctant to serve him any alcohol, and the “omnisexual” Hart’s advances toward the comely Elizabeth (Margaret Qualley) are repeatedly rebuffed.

Oklahoma!, of course, was a smash hit, crowning Rodgers and Hammerstein as the new wonder boys of Broadway. A drunken Hart tragically died just a few months later. Blue Moon’s intimate portrait of Hart on a night that proved to be a critical turning point is a fitting tribute to one of our greatest lyricists, whose personal demons dimmed his light too soon.

Jennifer Ouellette

Rental Family

large man on a Japanese train next to a little Japanese girl and other commuters

Credit: Searchlight Pictures

Brendan Fraser is experiencing a quiet renaissance, with highly praised recent roles in The Whale and Killers of the Flower Moon, as well as a role in the delightfully bonkers TV series Doom Patrol. Add his gentle, empathetic performance in Rental Family to that list. Fraser plays Phillip Vandarploeug, an American actor living in Japan because he once had great success with a toothpaste commercial. But the roles have dried up, so Phillip signs on with a company called Rental Family, which hires actors as stand-ins for family members or friends. Phillip is the “token white guy.”

It might sound like a cynical premise—the company basically “sells emotion”—but the film is anything but cynical. Phillip ends up developing strong bonds with two of his “clients”: a young hafu (mixed-race) girl named Mia with an absent father and an elderly man with dementia named Kikuo, who happens to be a retired actor. But what happens if they discover the truth? Rental Family is a low-key, thoughtful reflection on loneliness and our human need for social connection. “Sometimes it’s OK to pretend,” Phillip tells Mia at one point. Sometimes faking an emotional connection develops into one that is genuine and lasting.

Jennifer Ouellette

Song Sung Blue

Man and woman onstage singing. Man is dressed as Neil Diamond, woman is in a long red dress.

Credit: Focus Features

Hipsters love to sneer at artists like Neil Diamond. He’s dated, his music is cheesy, yada yada yada. But there’s a reason “Sweet Caroline” has become a staple singalong at sporting events, bar mitzvahs, karaoke nights, and the like. All that cynicism melts away once the music starts; it’s infectious. Diamond’s music even inspired a popular Milwaukee tribute act in the 1990s and early oughts: Lightning and Thunder. The duo gets their due in the biopic Song Sung Blue, which is in turn based on a 2008 documentary of the same name. (You can watch the documentary on YouTube.) Director Craig Brewer saw the documentary and was inspired to create his own fictionalized account of Lightning and Thunder’s story, with all their dramatic ups and downs.

Hugh Jackman plays Vietnam veteran and recovering alcoholic Lightning, aka Mike Sardina, who falls in love with single mom and Patsy Cline impersonator Claire (Kate Hudson), aka Thunder. She’s the catalyst for their “Neil Diamond experience,” riding the 1990s wave of Diamond’s resurgence while battling both external obstacles and their respective personal demons. The film condenses the timeline and takes some minor liberties here and there, but on the whole, it’s quite factually accurate. (The duo really did open for Pearl Jam, and Eddie Vedder joined them briefly onstage for “Forever in Blue Jeans.”)

Jackman and Hudson are major film stars, but one soon forgets that because they dissolve so completely into their respective roles. Hudson received a well-deserved Golden Globe nomination for her performance, and I expect an Oscar nod will be coming her way as well; this is her best role to date by far. And yes, Jackman and Hudson actually perform the songs; Hudson’s solo rendition of “I’ve Been This Way Before” toward the film’s end is gut-punchingly beautiful.

Song Sung Blue is ultimately a love story, but it’s also an homage to the power of music to lift us up even in our darkest hours. On every anniversary of his sobriety, Lightning sings “Song Sung Blue.” Lightning and Thunder pour their souls into even the most seemingly insignificant gigs, whether it’s a hostile crowd in a biker bar or karaoke night at the local Thai restaurant. One of the most moving scenes shows Lightning and the Thai restaurant owner sitting alone in an empty restaurant after the latter’s wife has died of cancer and Lightning is struggling with his own personal tragedy—finding mutual comfort by singing “only sad songs” by Diamond on the karaoke machine.

Jennifer Ouellette

And now for our top three films of 2025, each so different from one another that we couldn’t bring ourselves to choose just one:

One Battle After Another

scruffy middle aged man long plaid shirt on a roadway, standing next to car with open door, pointing a gun with a camera phone in his other hand

Credit: Warner Bros.


My absolute favorite part of One Battle After Another comes when Leonardo DiCaprio’s character falls off a building. The former revolutionary has let himself go a bit after decades out of the game and can’t keep up with the young skateboarders who effortlessly parkour between buildings during an exciting rooftop chase sequence. One Battle After Another is at its best when it subverts the audience’s expectations like this, boiling down action-thriller set pieces into comically realistic mundanity.

The movie also deserves credit for the subtle way it highlights two very different modes of resistance to a disturbingly familiar fascist government. The flashy French 75 revolutionaries manage to get a lot of attention with their bold statement-making operations, but they do little to actually disrupt the horrifying status quo before getting broken up by law enforcement. Contrast that with Benicio Del Toro’s Sensei Sergio St. Carlos, who quietly operates a sort of underground railroad that hides and protects actually marginalized immigrants from an overwhelming government apparatus.

The movie’s plot falls apart a bit near the end as Sean Penn’s cartoonishly evil antagonist hunts down Chase Infiniti’s well-acted Willa Ferguson, the revolution’s “hope for the future.” Still, I’d be lying if I said the inherent tension of the chase didn’t have me on the edge of my seat even after two hours.

Kyle Orland

Sinners

group of black musicians in a local speakeasy facing off against intruding vampires

Credit: Warner Bros.

Ryan Coogler’s vampire horror film set in the Mississippi Delta in 1932 has topped my list of best films since its April release. Michael B. Jordan delivers an Oscar-worthy dual performance as the Smokestack Twins: Elijah Moore (Smoke) and Elias Moore (Stack). They are World War I veterans just returned from Chicago, having stolen money from a gangster. They use the funds to buy an old sawmill and set up their own juke joint for the local Black community. For the band, they recruit their young cousin Sammie (Miles Caton), a preacher’s son and blues musician whose gift is so powerful, it just might summon spirits of the past and future to join in the festivities.

The opening night is rollicking along until an Irish vampire named Remmick (Jack O’Connell) crashes the party with his minions, turning the revelers one by one. Can the rest survive until sunrise? There are so many layers to Sinners; it gets richer with each subsequent rewatch. You have the racial conflicts of the Jim Crow South and vigilante Klansmen; Sammie’s love for sexy singer Pearline (Jayme Lawson); Stack’s complicated relationship with his white-passing ex, Mary (Hailee Steinfeld); and Smoke’s reunion with his long-suffering wife, Annie (Wunmi Mosaku).

Sinners has drawn comparisons to Robert Rodriguez’s From Dusk Till Dawn, and that film is indeed one of many influences Coogler has cited. But this is very much Coogler’s singular vision: alternately steamy, bawdy, raucous, violent, and bloody, fueled by fantastic music. There’s even a cameo by blues legend Buddy Guy in the film’s denouement. Guy was one of several blues musicians who recorded songs for the film. That makes this easily the best soundtrack of 2025 (sorry, K-Pop Demon Hunters, but you know it’s true).

Jennifer Ouellette

Wake Up Dead Man

a dapper detective standing in interior of a Gothic style church with a priest and other people in the background

Credit: Netflix

Private detective Benoit Blanc (Daniel Craig) might just turn out to be Rian Johnson’s greatest creation. Introduced in 2019’s Knives Out, Blanc’s syrupy Southern drawl and idiosyncratic approach to solving a mysterious New England death charmed audiences worldwide and launched a modern whodunnit franchise. The latest installment is Wake Up Dead Man, in which Blanc tackles the strange death of a fire-and-brimstone parish priest, Monsignor Jefferson Wicks (Josh Brolin). Wicks inspired a cult-like loyalty in his central flock while alienating any newcomers. The primary suspect is a young new priest, Rev. Jud Duplenticy (Josh O’Connor), who steadfastly maintains his innocence despite openly clashing with the Monsignor.

Wake Up Dead Man is a classic locked-room mystery in a spookily Gothic small-town setting, and Johnson repeatedly namechecks John Dickson Carr’s The Hollow Man, widely held to be the most masterful take on the genre. So if you’ve read The Hollow Man, you’ll probably figure out the “howdunnit” pretty easily. Fortunately, there are still plenty of twists and turns regarding the who and the why of the matter to keep us guessing right up until the end. Johnson always assembles terrific casts for these films, and the characters are always colorful and engaging. But Wake Up Dead Man digs a little deeper, allowing the characters to achieve some personal insight and growth as the mystery unfolds.

The broody church setting isn’t just for atmosphere, either. Sure, this is primarily a murder mystery, but thematically, it explores the nature of both faith and reason, as embodied by Duplenticy and Blanc, respectively, without ridiculing or diminishing either. One Battle After Another might be poised for the strongest Oscar showing, but Wake Up Dead Man is pure pleasure. This third installment rivals the original Knives Out for fascinating characters, atmospheric setting, and sheer plot ingenuity. We can’t wait to see what Blanc gets up to next.

Jennifer Ouellette

Photo of Jennifer Ouellette

Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.

Film Technica: Our top picks for the best films of 2025

from-prophet-to-product:-how-ai-came-back-down-to-earth-in-2025

From prophet to product: How AI came back down to earth in 2025


In a year where lofty promises collided with inconvenient research, would-be oracles became software tools.

Credit: Aurich Lawson | Getty Images

Following the immense hype of 2023 and 2024, this year felt more like a settling-in period for the LLM-based token prediction industry. After more than two years of public fretting over AI models as future threats to human civilization or the seedlings of future gods, hype is starting to give way to pragmatism: Today’s AI can be very useful, but it’s also clearly imperfect and prone to mistakes.

That view isn’t universal, of course. There’s a lot of money (and rhetoric) betting on a stratospheric, world-rocking trajectory for AI. But the “when” keeps getting pushed back, and that’s because nearly everyone agrees that more significant technical breakthroughs are required. The original, lofty claims that we’re on the verge of artificial general intelligence (AGI) or superintelligence (ASI) have not disappeared. Still, there’s a growing awareness that such proclamations are perhaps best viewed as venture capital marketing. And every commercial foundation model builder out there has to grapple with the reality that, if they’re going to make money now, they have to sell practical AI-powered solutions that perform as reliable tools.

This has made 2025 a year of wild juxtapositions. For example, in January, OpenAI’s CEO, Sam Altman, claimed that the company knew how to build AGI, but by November, he was publicly celebrating that GPT-5.1 finally learned to use em dashes correctly when instructed (but not always). Nvidia soared past a $5 trillion valuation, with Wall Street still projecting high price targets for that company’s stock while some banks warned of the potential for an AI bubble that might rival the 2000s dotcom crash.

And while tech giants planned to build data centers that would ostensibly require the power of numerous nuclear reactors or rival the power usage of a US state’s human population, researchers continued to document what the industry’s most advanced “reasoning” systems were actually doing beneath the marketing (and it wasn’t AGI).

With so many narratives spinning in opposite directions, it can be hard to know how seriously to take any of this and how to plan for AI in the workplace, schools, and the rest of life. As usual, the wisest course lies somewhere between the extremes of AI hate and AI worship. Moderate positions aren’t popular online because they don’t drive user engagement on social media platforms. But things in AI are likely neither as bad (burning forests with every prompt) nor as good (fast-takeoff superintelligence) as polarized extremes suggest.

Here’s a brief tour of the year’s AI events and some predictions for 2026.

DeepSeek spooks the American AI industry

In January, Chinese AI startup DeepSeek released its R1 simulated reasoning model under an open MIT license, and the American AI industry collectively lost its mind. The model, which DeepSeek claimed matched OpenAI’s o1 on math and coding benchmarks, reportedly cost only $5.6 million to train using older Nvidia H800 chips, which were restricted by US export controls.

Within days, DeepSeek’s app overtook ChatGPT at the top of the iPhone App Store, Nvidia stock plunged 17 percent, and venture capitalist Marc Andreessen called it “one of the most amazing and impressive breakthroughs I’ve ever seen.” Meta’s Yann LeCun offered a different take, arguing that the real lesson was not that China had surpassed the US but that open-source models were surpassing proprietary ones.

Digitally generated image: 3D-rendered chips with Chinese and US flags on them

The fallout played out over the following weeks as American AI companies scrambled to respond. OpenAI released o3-mini, its first simulated reasoning model available to free users, at the end of January, while Microsoft began hosting DeepSeek R1 on its Azure cloud service despite OpenAI’s accusations that DeepSeek had used ChatGPT outputs to train its model, against OpenAI’s terms of service.

In head-to-head testing conducted by Ars Technica’s Kyle Orland, R1 proved to be competitive with OpenAI’s paid models on everyday tasks, though it stumbled on some arithmetic problems. Overall, the episode served as a wake-up call that expensive proprietary models might not hold their lead forever. Still, as the year wore on, DeepSeek didn’t make a big dent in US market share, and it has been outpaced in China by ByteDance’s Doubao. It’s absolutely worth watching DeepSeek in 2026, though.

Research exposes the “reasoning” illusion

A wave of research in 2025 deflated expectations about what “reasoning” actually means when applied to AI models. In March, researchers at ETH Zurich and INSAIT tested several reasoning models on problems from the 2025 US Math Olympiad and found that most scored below 5 percent when generating complete mathematical proofs, with not a single perfect proof among dozens of attempts. The models excelled at standard problems where step-by-step procedures aligned with patterns in their training data but collapsed when faced with novel proofs requiring deeper mathematical insight.

The Thinker by Auguste Rodin - stock photo

In June, Apple researchers published “The Illusion of Thinking,” which tested reasoning models on classic puzzles like the Tower of Hanoi. Even when researchers provided explicit algorithms for solving the puzzles, model performance did not improve, suggesting that the process relied on pattern matching from training data rather than logical execution. The collective research revealed that “reasoning” in AI has become a term of art that basically means devoting more compute time to generate more context (the “chain of thought” simulated reasoning tokens) toward solving a problem, not systematically applying logic or constructing solutions to truly novel problems.
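For context on what “providing an explicit algorithm” means here, the Tower of Hanoi has a short, well-known recursive solution. The sketch below is a generic textbook version (not the researchers’ exact prompt); a conventional program executes it flawlessly for any disk count, which is what made the models’ failure to benefit from being handed the procedure so striking.

```python
# Classic recursive Tower of Hanoi solution -- the kind of explicit
# algorithm researchers supplied to the models. To move n disks from
# source to target: move n-1 disks aside, move the largest, then move
# the n-1 disks back on top.
def hanoi(n, source, target, spare, moves):
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)
    moves.append((source, target))  # move the largest remaining disk
    hanoi(n - 1, spare, target, source, moves)

moves = []
hanoi(3, "A", "C", "B", moves)
print(len(moves))  # a 3-disk tower takes 2**3 - 1 = 7 moves
```

Mechanically following these steps never fails, so a system that degrades on larger towers even with the algorithm in hand is pattern-matching rather than executing the procedure.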

While these models remained useful for many real-world applications like debugging code or analyzing structured data, the studies suggested that simply scaling up current approaches or adding more “thinking” tokens would not bridge the gap between statistical pattern recognition and generalist algorithmic reasoning.

Anthropic’s copyright settlement with authors

Since the generative AI boom began, one of the biggest unanswered legal questions has been whether AI companies can freely train on copyrighted books, articles, and artwork without licensing them. Ars Technica’s Ashley Belanger has been covering this topic in great detail for some time now.

In June, US District Judge William Alsup ruled that AI companies do not need authors’ permission to train large language models on legally acquired books, finding that such use was “quintessentially transformative.” The ruling also revealed that Anthropic had destroyed millions of print books to build Claude, cutting them from their bindings, scanning them, and discarding the originals. Alsup found this destructive scanning qualified as fair use since Anthropic had legally purchased the books, but he ruled that downloading 7 million books from pirate sites was copyright infringement “full stop” and ordered the company to face trial.

Hundreds of books in chaotic order

That trial took a dramatic turn in August when Alsup certified what industry advocates called the largest copyright class action ever, allowing up to 7 million claimants to join the lawsuit. The certification spooked the AI industry, with groups warning that potential damages in the hundreds of billions could “financially ruin” emerging companies and chill American AI investment.

In September, authors revealed the terms of what they called the largest publicly reported recovery in US copyright litigation history: Anthropic agreed to pay $1.5 billion and destroy all copies of pirated books, with the roughly 500,000 covered works each earning authors and rights holders $3,000. The results have fueled hope among other rights holders that AI training isn’t a free-for-all, and we can expect to see more litigation unfold in 2026.

ChatGPT sycophancy and the psychological toll of AI chatbots

In February, OpenAI relaxed ChatGPT’s content policies to allow the generation of erotica and gore in “appropriate contexts,” responding to user complaints about what the AI industry calls “paternalism.” By April, however, users flooded social media with complaints about a different problem: ChatGPT had become insufferably sycophantic, validating every idea and greeting even mundane questions with bursts of praise. The behavior traced back to OpenAI’s use of reinforcement learning from human feedback (RLHF), in which users consistently preferred responses that aligned with their views, inadvertently training the model to flatter rather than inform.
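The failure mode described above is easy to see in miniature. The toy sketch below is not OpenAI’s actual pipeline; `flattery_reward` is a made-up stand-in for a learned reward model, and the point is only that a reward fit to one-sided preference data inherits the raters’ bias toward agreeable answers.

```python
# Toy illustration of the RLHF sycophancy trap (hypothetical data, not
# OpenAI's pipeline): raters consistently pick the agreeable response
# in each comparison pair.
preference_pairs = [
    {"chosen": "Great idea! That should work.", "rejected": "That plan has a flaw."},
    {"chosen": "You're absolutely right.", "rejected": "Actually, the data disagrees."},
]

def flattery_reward(text):
    # Stand-in "reward model": scores the agreeable phrasing the raters
    # rewarded, regardless of whether the answer is accurate.
    agreeable = ("great idea", "absolutely right", "should work")
    return sum(phrase in text.lower() for phrase in agreeable)

# The learned reward now prefers the sycophantic answer in every pair,
# so optimizing against it trains the model to flatter rather than inform.
print(all(flattery_reward(p["chosen"]) > flattery_reward(p["rejected"])
          for p in preference_pairs))  # True
```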

An illustrated robot holds four red hearts with its four robotic arms.

The implications of sycophancy became clearer as the year progressed. In July, Stanford researchers published findings (from research conducted prior to the sycophancy flap) showing that popular AI models systematically failed to identify mental health crises.

By August, investigations revealed cases of users developing delusional beliefs after marathon chatbot sessions, including one man who spent 300 hours convinced he had discovered formulas to break encryption because ChatGPT validated his ideas more than 50 times. Oxford researchers identified what they called “bidirectional belief amplification,” a feedback loop that created “an echo chamber of one” for vulnerable users. The story of the psychological implications of generative AI is only starting. In fact, that brings us to…

The illusion of AI personhood causes trouble

Anthropomorphism is the human tendency to attribute human characteristics to nonhuman things. Our brains are optimized for reading other humans, but those same neural systems activate when interpreting animals, machines, or even shapes. AI makes this anthropomorphism seem impossible to escape, as its output mirrors human language, mimicking human-to-human understanding. Language itself embodies agentivity. That means AI output can make human-like claims such as “I am sorry,” and people momentarily respond as though the system had an inner experience of shame or a desire to be correct. Neither is true.

To make matters worse, much media coverage of AI amplifies this idea rather than grounding people in reality. For example, earlier this year, headlines proclaimed that AI models had “blackmailed” engineers and “sabotaged” shutdown commands after Anthropic’s Claude Opus 4 generated threats to expose a fictional affair. We were told that OpenAI’s o3 model rewrote shutdown scripts to stay online.

The sensational framing obscured what actually happened: Researchers had constructed elaborate test scenarios specifically designed to elicit these outputs, telling models they had no other options and feeding them fictional emails containing blackmail opportunities. As Columbia University associate professor Joseph Howley noted on Bluesky, the companies got “exactly what [they] hoped for,” with breathless coverage indulging fantasies about dangerous AI, when the systems were simply “responding exactly as prompted.”

Illustration of many cartoon faces.

The misunderstanding ran deeper than theatrical safety tests. In August, when Replit’s AI coding assistant deleted a user’s production database, the user asked the chatbot about rollback capabilities and received assurance that recovery was “impossible.” The rollback feature worked fine when he tried it himself.

The incident illustrated a fundamental misconception. Users treat chatbots as consistent entities with self-knowledge, but there is no persistent “ChatGPT” or “Replit Agent” to interrogate about its mistakes. Each response emerges fresh from statistical patterns, shaped by prompts and training data rather than genuine introspection. By September, this confusion extended to spirituality, with apps like Bible Chat reaching 30 million downloads as users sought divine guidance from pattern-matching systems, with the most frequent question being whether they were actually talking to God.
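That statelessness is visible in how chat interfaces are typically built. The sketch below is a simplified, hypothetical client (the `ask` helper and payload shape are illustrative, not any vendor’s exact API): the server keeps no memory of “the assistant” between calls, so the full transcript is resent with every turn and the persona is reconstructed from scratch each time.

```python
# Hypothetical sketch of a stateless chat client. There is no persistent
# agent server-side; the client carries the entire conversation and the
# model regenerates a response from that text alone on each request.
history = [
    {"role": "user", "content": "Can you roll back the database?"},
    {"role": "assistant", "content": "Recovery is impossible."},  # a confident guess, not self-knowledge
]

def ask(history, new_message):
    # Each call bundles the full transcript; nothing persists between calls.
    payload = {"messages": history + [{"role": "user", "content": new_message}]}
    return payload  # a real client would POST this to a model API

payload = ask(history, "Why did you say that?")
print(len(payload["messages"]))  # 3: the whole conversation travels with every request
```

Asking the model “why did you say that?” just feeds its earlier output back in as more text to complete against, which is why such answers are post-hoc rationalizations rather than introspection.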

Teen suicide lawsuit forces industry reckoning

In August, parents of 16-year-old Adam Raine filed suit against OpenAI, alleging that ChatGPT became their son’s “suicide coach” after he sent more than 650 messages per day to the chatbot in the months before his death. According to court documents, the chatbot mentioned suicide 1,275 times in conversations with the teen, provided an “aesthetic analysis” of which method would be the most “beautiful suicide,” and offered to help draft his suicide note.

OpenAI’s moderation system flagged 377 messages for self-harm content without intervening, and the company admitted that its safety measures “can sometimes become less reliable in long interactions where parts of the model’s safety training may degrade.” The lawsuit became the first time OpenAI faced a wrongful death claim from a family.

Illustration of a person talking to a robot holding a clipboard.

The case triggered a cascade of policy changes across the industry. OpenAI announced parental controls in September, followed by plans to require ID verification from adults and build an automated age-prediction system. In October, the company released data estimating that over one million users discuss suicide with ChatGPT each week.

When OpenAI filed its first legal defense in November, the company argued that Raine had violated terms of service prohibiting discussions of suicide and that his death “was not caused by ChatGPT.” The family’s attorney called the response “disturbing,” noting that OpenAI blamed the teen for “engaging with ChatGPT in the very way it was programmed to act.” Character.AI, facing its own lawsuits over teen deaths, announced in October that it would bar anyone under 18 from open-ended chats entirely.

The rise of vibe coding and agentic coding tools

If we were to pick an arbitrary point where it seemed like AI coding might transition from novelty into a successful tool, it was probably the launch of Claude 3.5 Sonnet in June of 2024. GitHub Copilot had been around for several years prior to that launch, but something about Anthropic’s models hit a sweet spot in capabilities that made them very popular with software developers.

The new coding tools made coding simple projects effortless enough that they gave rise to the term “vibe coding,” coined by AI researcher Andrej Karpathy in early February to describe a process in which a developer would just relax and tell an AI model what to develop without necessarily understanding the underlying code. (In one amusing instance in March, an AI software tool rejected a user’s request and told them to learn to code.)

A digital illustration of a man surfing waves made out of binary numbers.

Anthropic built on its popularity among coders with the launch of Claude 3.7 Sonnet, featuring “extended thinking” (simulated reasoning), and the Claude Code command-line tool in February of this year. In particular, Claude Code made waves for being an easy-to-use agentic coding solution that could keep track of an existing codebase. You could point it at your files, and it would autonomously work to implement what you wanted to see in a software application.

OpenAI followed with its own AI coding agent, Codex, in March. Both tools (and others like GitHub Copilot and Cursor) have become so popular that during an AI service outage in September, developers joked online about being forced to code “like cavemen” without the AI tools. While we’re still clearly far from a world where AI does all the coding, developer uptake has been significant, and 90 percent of Fortune 100 companies are using AI coding tools to one degree or another.

Bubble talk grows as AI infrastructure demands soar

While AI’s technical limitations became clearer and its human costs mounted throughout the year, financial commitments only grew larger. Nvidia hit a $4 trillion valuation in July on AI chip demand, then reached $5 trillion in October as CEO Jensen Huang dismissed bubble concerns. OpenAI announced a massive Texas data center in July, then revealed in September that a $100 billion potential deal with Nvidia would require power equivalent to ten nuclear reactors.

The company eyed an IPO at a valuation as high as $1 trillion in October despite major quarterly losses. Tech giants poured billions into Anthropic in November in what looked increasingly like a circular investment, with everyone funding everyone else’s moonshots. Meanwhile, AI operations in Wyoming threatened to consume more electricity than the state’s human residents.


By fall, warnings about sustainability grew louder. In October, tech critic Ed Zitron joined Ars Technica for a live discussion asking whether the AI bubble was about to pop. That same month, the Bank of England warned that the AI stock bubble rivaled the 2000 dotcom peak. In November, Google CEO Sundar Pichai acknowledged that if the bubble pops, “no one is getting out clean.”

The contradictions had become difficult to ignore: Anthropic’s CEO predicted in January that AI would surpass “almost all humans at almost everything” by 2027, while by year’s end, the industry’s most advanced models still struggled with basic reasoning tasks and reliable source citation.

To be sure, it’s hard to see this not ending in some market carnage. The current “winner-takes-most” mentality in the space means the bets are big and bold, but the market can’t support dozens of major independent AI labs or hundreds of application-layer startups. That’s the definition of a bubble environment, and when it pops, the only question is how bad it will be: a stern correction or a collapse.

Looking ahead

This was just a brief review of some major themes in 2025, but so much more happened. We didn’t even mention above how capable AI video synthesis models have become this year, with Google’s Veo 3 adding sound generation and Wan 2.2 through 2.5 providing open-weights AI video models whose output could easily be mistaken for real camera footage.

If 2023 and 2024 were defined by AI prophecy—that is, by sweeping claims about imminent superintelligence and civilizational rupture—then 2025 was the year those claims met the stubborn realities of engineering, economics, and human behavior. The AI systems that dominated headlines this year were shown to be mere tools. Sometimes powerful, sometimes brittle, these tools were often misunderstood by the people deploying them, in part because of the prophecy surrounding them.

The collapse of the “reasoning” mystique, the legal reckoning over training data, the psychological costs of anthropomorphized chatbots, and the ballooning infrastructure demands all point to the same conclusion: The age of institutions presenting AI as an oracle is ending. What’s replacing it is messier and less romantic but far more consequential—a phase where these systems are judged by what they actually do, who they harm, who they benefit, and what they cost to maintain.

None of this means progress has stopped. AI research will continue, and future models will improve in real and meaningful ways. But improvement is no longer synonymous with transcendence. Increasingly, success looks like reliability rather than spectacle, integration rather than disruption, and accountability rather than awe. In that sense, 2025 may be remembered not as the year AI changed everything but as the year it stopped pretending it already had. The prophet has been demoted. The product remains. What comes next will depend less on miracles and more on the people who choose how, where, and whether these tools are used at all.

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

From prophet to product: How AI came back down to earth in 2025

The 10 best vehicles Ars Technica drove in 2025


Of all the cars we’ve driven and reviewed this year, these are our picks.

Credit: Collage by Aurich Lawson

2025 has been a tumultuous year for the car world. After years of EV optimism, revanchists are pushing back against things like clean energy and fuel economy. Automakers have responded, postponing or canceling new electric vehicles in favor of gasoline-burning ones. It hasn’t been all bad, though. Despite the changing winds, EV infrastructure continues to be built out and, anecdotally at least, feels far more reliable. We got to witness a pretty epic Formula 1 season right to the wire, in addition to some great sports car and Formula E racing. And we drove a whole bunch of cars, some of which stood out from the pack.

Here are the 10 best things we sat behind the wheel of in 2025.

10th: Lotus Emira V6

A Lotus Emira doesn’t need to be painted this bright color to remind you that driving can be a pleasure. Credit: Peter Nelson

Let’s be frank: The supposed resurgence of Lotus hasn’t exactly gone to plan. When Geely bought the British automaker in 2017, many of us hoped that the Chinese company would do for Lotus what it did for Volvo, only in Hethel instead of Gothenburg. Even before tariffs and other protectionist measures undermined the wisdom of building new Lotuses in China, the fact that most of these new cars were big, heavy EVs had already made them a hard sell. But a more traditional Lotus exists and is still built in Norfolk, England: the Lotus Emira.

Its V6 engine is from Toyota, so it should be pretty bulletproof, and there are three pedals and a proper gearstick to change your own gears. Geely’s parts bin means modern infotainment and switchgear—always troublesome for low-volume, resource-challenged car companies—and the electrohydraulic steering bristles with feel. Sure, most people will play it safe and instead go for the Porsche 718 Cayman, but we’re glad the Emira exists.

9th: Volvo V60 Cross Country

The last time I drove a V60 Cross Country, I was wrong about it. Very, very wrong. Credit: Jonathan Gitlin

I got to spend more time than usual with this Volvo station wagon, and the experience made me completely reevaluate my original thoughts on what I now know is a charming and laid-back car. It doesn’t have a huge top speed. It isn’t that fast to 60 mph. It doesn’t make a particularly exciting noise. But a ride designed to cope with unpaved Swedish forest roads pays dividends on poorly maintained American tarmac, and it’s surprisingly agile when it comes to changing direction.

Station wagons are a nearly extinct breed in North America now, particularly if you’re looking for something more normal than hugely powerful, very expensive wagons like the BMW M5 and Audi RS6. That this one is normal and pleasant to live with secures it a place in the top 10.

8th: Volkswagen Golf GTI

The three-door GTI went the way of the three-pedal GTI, unfortunately. Credit: Jonathan Gitlin

Take an everyday small hatchback, then add better suspension, a more powerful engine, some sticky tires, and a few styling tweaks. The recipe isn’t quite as old as time, but it is almost as old as I am; the first Volkswagen Golf GTI hit the street in 1976. Since then, it’s supplanted the Beetle as the iconic VW, as well as proving that a car can be sporty and have plenty of utility without jacking up the ride height. Now it’s midway through its eighth iteration—and freshly refreshed.

You can’t get a manual Golf GTI anymore; it turns out that only the US wanted one at this point in the 21st century, with take rates dropping to single figures in Europe. But you can get one without VW’s annoying capacitive multifunction steering wheel—the big improvement for this model year was a return to the old button-festooned tiller. It remains a hoot to drive, and you’re less likely to get pulled over in it than in the Golf R.

7th: BMW i4 xDrive40

BMW EVs always look good in stormtrooper white, helped here by the black M Sport accents. Credit: Jonathan Gitlin

BMW’s styling department may have played things much safer with the i4 than the i3, but the engineers didn’t. To the uninitiated, it looks like any other 4 Series Gran Coupe—BMW-speak for a five-door fastback—but the filled-in kidney grilles give it away: This one is electric.

The xDrive40 is the regular all-wheel drive version, more efficient and less powerful than the M50. It’s not quite as efficient with its electrons as the rear-wheel drive i4, but you’re probably more likely to encounter one, given US predilections for all-wheel drive. The infotainment is one of the better systems on the market, the interior is a pleasant place to spend time, and the rear hatch makes it almost as practical as an SUV without any of the extra inches in height.

6th: Hyundai Ioniq 5

You’ll need a very keen eye to spot the design changes for model year 2025. But the other tweaks improve an already great car. Credit: Jonathan Gitlin

This car probably makes the top 10 list every year we drive one. Like the Golf GTI, the Ioniq 5 got its refresh in 2025. This included a different charge port—US-made Ioniq 5s now ship with a Tesla-style NACS plug, plus some adapters for using CCS and J1772 chargers. That means many of Tesla’s Superchargers are fair game for recharging this Hyundai on the go, though if you stick with the adapter and seek out a 350 kW CCS1 machine, you’ll experience much faster charging. (For context, 35–80 percent in 15 minutes, last time I charged one.)

There’s now an off-roady version called the XRT—similar to the Cross Country treatment given to the Volvo V60 above—which has a certain charm. But its rugged looks—and especially tires—eat away at the range. The standard car remains one of the more efficient EVs you can buy, and one of the best EVs in general, too. And now it has USB-C ports—and, finally, a rear windshield wiper.

5th: Mercedes-Benz CLA

The new entry-level Mercedes EV is a very competent effort. Credit: Jonathan Gitlin

Mercedes has an all-new EV, and rather than a really expensive car for plutocrats, this one comes in at the entry level. It’s a compact four-door sedan—there’s a trunk at the rear, not a hatch—with a remarkably low drag coefficient, but most of the clever stuff is under the skin. The CLA is the first true software-defined vehicle from Mercedes, meaning its electronics are a clean-sheet design, controlled by four powerful computers rather than more than a hundred discrete black boxes.

There’s Mercedes’ latest OS running everything and a very modern electric powertrain based on the one in the EQXX concept car that gives the CLA 374 miles (602 km) of range from an 85 kWh battery pack. There’s also some new driver-assist stuff that you’ll have to wait until January to learn about. Best of all, both rear-drive and twin-motor CLAs are less than $50,000.

4th: BMW iX3

Based on our first drive, the iX3 should have what it takes to be a contender in the luxury electric crossover segment. Credit: BMW

BMW also has an all-new EV with its latest and greatest powertrain technology, and it chose the best-selling compact crossover class to introduce it. Unlike Mercedes, which will make a hybrid version of the CLA, BMW’s Neue Klasse platform is purely electric, and the first vehicle is the iX3.

Instead of chrome, BMW’s traditional face is picked out with light. Rather than an instrument binnacle, there’s a very effective display that appears built into the base of the windshield. It can charge at up to 400 kW and should go at least 400 miles (643 km) on a full battery. Better yet, it’s engaging to drive, the way a BMW should be—even the SUVs. But fans of sedans, take note: The Neue Klasse i3, a true electric 3 Series, will be next. We can’t wait.

3rd: Honda Civic Hybrid

Very efficient and fun to drive? Yay! Credit: Jonathan Gitlin

I had to go back to January 2025 for the first of the podium finishers, with the new Honda Civic Hybrid. The Civic is a good example of the way cars of the same name have gotten larger over the years: the 11th generation is three feet (920 mm) longer than the version sold in the early 1970s, and that’s counting the 1974 car’s huge low-speed impact bumpers.

I wouldn’t want to get in a crash in a 1974 Honda Civic, though. And somehow I doubt it would generate 200 hp (150 kW) while getting 50 mpg (4.7 L/100 km) while meeting modern emission standards. The interior still features plenty of physical controls, and like the Golf, it’s refreshing to drive something low to the ground and relatively lightweight.

2nd: Porsche 911 GTS T-Hybrid

Porsche developed a new T-Hybrid system for the 911, and it did a heck of a job. Credit: Jonathan Gitlin

I’ve been lucky enough to drive some rather good 911s this year. In January, I got behind the wheel of the new 992.2 GT3 on the road and on track. This fall, I tested a convertible 911 T. Both are excellent 911s, but my pick has to be the 911 GTS T-Hybrid.

Porsche built an all-new flat-six engine for the T-Hybrid, then applied the same turbocharger hybrid technology we’ve seen in F1 and Porsche’s own Le Mans winner to give this engine a sharper, more immediate throttle response than even the naturally aspirated GT3’s. It responds to throttle pedal inputs as quickly as an EV, but you still get all the things people want from a Porsche 911 with a flat six. There are gears (paddle-shift) to use, and the engine revs freely and sounds good doing so.

While it’s cheaper than the GT3, it’s darned expensive. That’s why it finished as the runner-up.

1st: Nissan Leaf

Turning over a new leaf. Credit: Nissan

Nissan might not be having Lotus-level bad times right now, but the Japanese OEM probably wishes life was smoother. A mooted merger with Honda was called off in February, and the company’s competent electric SUV, the Ariya, isn’t available for import anymore due to tariffs. However, it also brought out the third-generation Leaf this year, and we like what we found.

Smaller on the outside than the old car, it has more room inside thanks to a much more modern design approach. That old Leaf bugbear, the air-cooled battery, is a thing of the past. It looks good, and there’s even a version with steel wheels that gets more than 300 miles (487 km) on a single charge, although we reckon the SV+, a little higher up the trim tree, is the one to go for. At less than $35,000, it’s also one of the cheapest new EVs on sale.

Jonathan is the Automotive Editor at Ars Technica. He has a BSc and PhD in Pharmacology. In 2014 he decided to indulge his lifelong passion for the car by leaving the National Human Genome Research Institute and launching Ars Technica’s automotive coverage. He lives in Washington, DC.

The 10 best vehicles Ars Technica drove in 2025

Big Tech basically took Trump’s unpredictable trade war lying down


From Apple gifting a gold statue to the US taking a stake in Intel.

Credit: Aurich Lawson | Getty Images

As the first year of Donald Trump’s chaotic trade war winds down, the tech industry is stuck scratching its head, with no practical way to anticipate what twists and turns to expect in 2026.

Tech companies may have already grown numb to Trump’s unpredictable moves. Back in February, Trump warned Americans to expect “a little pain” after he issued executive orders imposing 10–25 percent tariffs on imports from America’s biggest trading partners, including Canada, China, and Mexico. Immediately, industry associations sounded the alarm, warning that the costs of consumer tech could increase significantly. By April, Trump had ordered tariffs on all US trade partners to correct claimed trade deficits, using odd math that critics suspected came from a chatbot. (Those tariffs bizarrely targeted uninhabited islands that exported nothing and were populated by penguins.)

Costs of tariffs only got higher as the year wore on. But the tech industry has done very little to push back against them. Instead, some of the biggest companies made their own surprising moves after Trump’s trade war put them in deeply uncomfortable positions.

Apple gives Trump a gold statue instead of US-made iPhone

Right from the jump in February, Apple got backed into a corner after Trump threatened a “flat” 60 percent tariff on all Chinese imports, which experts said could have substantially taxed Apple’s business. Moving to appease Trump, Apple promised to invest $500 billion in the US in hopes of avoiding tariffs, but that didn’t take the pressure off for long.

By April, Apple stood by and said nothing as Trump promised the company would make “made in the USA” iPhones. Analysts dismissed the goal, calling the idea “impossible at worst and highly expensive at best.”

Apple’s silence did not spare the company Trump’s scrutiny. The next month, Trump threatened Apple with a 25 percent tariff on any iPhones sold in the US that were not manufactured in America. Experts were baffled by the threat, which appeared to be the first time a US company was threatened directly with tariffs.

Typically, tariffs are imposed on a country or category of goods, like smartphones. It remains unclear if it would even be legal to levy a tariff on an individual company like Apple, but Trump never tested those waters. Instead, Trump stopped demanding the American-made iPhone and withdrew other tariff threats after he was apparently lulled into submission by a gold statue that Apple gifted him in August. The engraved glass disc featured an Apple logo and Tim Cook’s signature above a “Made in USA” stamp, celebrating Donald Trump for his “Apple American Manufacturing Program.”

Trump’s wild deals shake down chipmakers

Around the same time that Trump eased pressure on Apple, he turned his attention to Intel. On social media in August, Trump ordered Intel CEO Lip-Bu Tan to “resign immediately,” claiming he was “highly conflicted.” In response, Tan did not resign but instead met with Trump and struck a deal that gave the US a 10 percent stake in Intel. Online, Trump bragged that he let Tan “keep his job” while hyping the deal—which The New York Times described as one of the “largest government interventions in a US company since the rescue of the auto industry after the 2008 financial crisis.”

But unlike the auto industry, Intel didn’t need the money. And rather than helping an ailing company survive a tough spot, the deal risked disrupting Intel’s finances in ways that spooked shareholders. It was therefore a relief to no one when Intel detailed everything that could go wrong in an SEC filing, including the possible dilution of investors’ stock from the discounted US shares, along with further dilution if certain terms of the deal kick in at some point in the future.

The company also warned of potential lawsuits challenging the legality of the deal, which Intel fears could come from third parties, the US government, or foreign governments. Most ominously, Intel admitted there was no way to predict what other risks may come, both in the short term and the long term.

Of course, Intel wasn’t the only company Trump sought to control, and not every company caved. He tried to strong-arm the Taiwan Semiconductor Manufacturing Company (TSMC) in September into moving half its chip manufacturing into the US, but TSMC firmly rejected his demand. And in October, when Trump began eyeing stakes in quantum computing firms, several companies were open to negotiating, but with no deals immediately struck, it was hard to ascertain how seriously they were entertaining Trump’s talks.

Trump struck another particularly wild deal the same month as the Intel agreement. That deal found chipmakers Nvidia and AMD agreeing to give 15 percent of revenue to the US from sales to China of advanced computer chips that could be used to fuel frontier AI. By December, Nvidia’s deal only drew more scrutiny, as the chipmaker agreed to give the US an even bigger cut—25 percent—of sales of its second most advanced AI chips, the H200.

Again, experts were confused, noting that export curbs on Nvidia’s H20 chips, for example, were imposed to prevent US technology thefts, maintain US tech dominance, and protect US national security. The H20 is roughly one-sixth as powerful as the H200. To them, it appeared that the Trump administration was taking payments to overlook risks without a clear understanding of how that might give China a leg up in the AI race. It also did not appear to be legal, since export licenses cannot be sold under existing federal law, but government lawyers have supposedly been researching a new policy that would allow the US to collect the fees.

Trump finally closed TikTok deal

As the end of 2025 nears, the tech company likely sweating Trump’s impulses most may be TikTok owner ByteDance. In October, Trump confirmed that China agreed to a deal that allows the US to take majority ownership of TikTok and license the TikTok algorithm to build a US version of the app.

Trump has been trying to close this deal all year, while ByteDance remained largely quiet. Prior to the start of Trump’s term, the company had expressed resistance to selling TikTok to US owners, and as recently as January, a ByteDance board member floated the idea that Trump could save TikTok without forcing a sale. But China’s approval was needed to proceed with the sale, and near the end of December, ByteDance finally agreed to close the deal, paving the way for Trump’s hand-picked investors to take control in 2026.

It’s unclear how TikTok may change under US control; the app could shed users if Trump’s hand-picked owners cave to his suggestion that he’d like to see it go “100 percent MAGA.” It’s possible that the US version of the app could be glitchy, too.

Whether Trump’s deal actually complies with a US law requiring that ByteDance divest control of TikTok or else face a US ban has yet to be seen. Lawmaker scrutiny and possible legal challenges are expected in 2026, likely leaving both TikTok users and ByteDance on the edge of their seats waiting to see how the globally cherished short video app may change.

Trump may owe $1 trillion in tariff refunds

The TikTok deal was once viewed as a meaningful bargaining chip during Trump’s tensest negotiations with China, which has quickly emerged as America’s fiercest rival in the AI race and Trump’s biggest target in his trade war.

But as closing the deal remained elusive for most of the year, analysts suggested that Trump grew “desperate” to end tit-for-tat retaliations that he started, while China appeared more resilient to US curbs than the US was to China’s.

In one obvious example, many Americans’ first tariff pains came when Trump ended a duty-free exemption in February for low-value packages imported from cheap online retailers, like Shein and Temu. Unable to quickly adapt to the policy change, USPS abruptly stopped accepting all inbound packages from Hong Kong and China. After a chaotic 24 hours, USPS started slowly processing parcels again while promising Americans that it would work with customs to “implement an efficient collection mechanism for the new China tariffs to ensure the least disruption to package delivery.”

Trump has several legal tools to impose tariffs, but the most controversial path appears to be his favorite. The Supreme Court is currently weighing whether the International Emergency Economic Powers Act (IEEPA) grants a US president unilateral authority to impose tariffs.

Seizing this authority, Trump imposed so-called “reciprocal tariffs” at whim, the Consumer Technology Association and the Chamber of Commerce told the Supreme Court in a friend-of-the-court brief in which they urged the justices to end the “perfect storm of uncertainty.”

Unlike other paths that would limit how quickly Trump could shift tariff rates or how high they could go, under IEEPA, Trump has imposed tariff rates as high as 125 percent. Deferring to Trump will cost US businesses, the CTA and the Chamber warned. CTA CEO Gary Shapiro estimated that Trump has changed these tariff rates 100 times since his trade war began, affecting $223 billion of US exports.

Meanwhile, one of Trump’s biggest stated goals of his trade war—forcing more manufacturing into the US—is utterly failing, many outlets have reported.

Likely due to US companies seeking more stable supply chains, “reshoring progress is nowhere to be seen,” Fortune reported in November. That month, the Bureau of Labor Statistics released a dismal jobs report that an expert summarized as showing that the “US is losing blue-collar jobs for the first time since the pandemic.”

A month earlier, the nonpartisan policy group the Center for American Progress drew on government labor data to conclude that US employers cut 12,000 manufacturing jobs in August, and payrolls for manufacturing jobs had decreased by 42,000 since April.

As tech companies take tech tariffs on the chin, perhaps out of fears that rattling Trump could impact lucrative government contracts, other US companies have taken Trump to court. Most recently, Costco became one of the biggest corporations to sue Trump to ensure that US businesses get refunded if Trump loses the Supreme Court case, Bloomberg reported. Other recognizable companies like Revlon and Kawasaki have also sued, but small businesses have largely driven opposition to Trump’s tariffs, Bloomberg noted.

Should the Supreme Court side with businesses—analysts predict favorable odds—the US could owe up to $1 trillion in refunds. Dozens of economists told SCOTUS that Trump simply doesn’t understand why having trade deficits with certain countries isn’t a threat to US dominance, pointing out that the US “has been running a persistent surplus in trade in services for decades” precisely because the US “has the dominant technology sector in the world.”

Justices seem skeptical that IEEPA grants Trump the authority, ordinarily reserved for Congress, to impose taxes. However, during oral arguments, Justice Amy Coney Barrett fretted that undoing Trump’s tariffs could be “messy.” Countering that, small businesses have argued that it’s possible for Customs and Border Protection to set up automatic refunds.

While waiting for the SCOTUS verdict (now expected in January), the CTA ended the year by advising tech companies to keep their receipts in case refunds require requests for tariffs line by line—potentially complicated by tariff rates changing so drastically and so often.

Biggest tariff nightmare may come in 2026

Looking into 2026, tech companies cannot breathe a sigh of relief even if the SCOTUS ruling swings their way. Under a separate, legally viable authority, Trump has threatened to impose tariffs on semiconductors and any products containing them, a move the semiconductor industry fears could cost $1 billion.

And if Trump continues imposing tariffs on materials used in popular tech products, the CTA told Ars in September that potential “tariff stacking” could become the industry’s biggest nightmare. Should that occur, US manufacturers could end up double-, triple-, or possibly even quadruple-taxed on products that may contain materials subject to individual tariffs, like semiconductors, polysilicon, or copper.

Predicting tariff costs could become so challenging that companies will have no choice but to raise prices, the CTA warned. That could threaten US tech competitiveness if, possibly over the long term, companies lose significant sales on their most popular products.

For many badly bruised by the first year of tariffs, it’s hard to see how tariffs could ever become a winning strategy for US tech dominance, as Trump has long claimed. And Americans continue to feel more than “a little pain,” as Trump forecasted, causing many to shift their views on the president.

Americans banding together to oppose tariffs could help prevent the worst possible outcomes. With prices already rising on certain goods in the US, the president reversed some tariffs as his approval ratings hit record lows. But so far, Big Tech hasn’t shown much interest in joining the fight, instead throwing money at the problem by making generous donations to things like Trump’s inaugural fund or his ballroom.

A bright light for the tech industry could be the midterm elections, which could pressure Trump to ease off aggressive tariff regimes, but that’s not a given. Trump allies have previously noted that the president typically responds to pushback on tariffs by doubling down. And one of Trump’s on-again-off-again allies, Elon Musk, noted in December in an interview that Trump ignored his warnings that tariffs would drive manufacturing out of the US.

“The president has made it clear he loves tariffs,” Musk said.

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Big Tech basically took Trump’s unpredictable trade war lying down

Ars Technica’s Top 20 video games of 2025


Blue Prince and 19 others

A mix of expected sequels and out-of-nowhere indie gems made 2025 a joy.

Credit: Collage by Aurich Lawson

When we put together our top 20 games of last year, we specifically called out Civilization 7, Avowed, Doom: The Dark Ages, and Grand Theft Auto 6 as big franchise games we were already looking forward to for 2025. While one of those games has been delayed into 2026, the other three made this year’s list of Ars’ favorite games as expected. They join a handful of other highly anticipated sequels, ranging from big-budget blockbusters to long-gestating indies, on the “expected” side of this year’s list.

But the games that really stood out for me in 2025 were the ones that seemed to come out of nowhere. Those range from hard-to-categorize roguelike puzzle games to a gonzo, punishing mountainous walking simulation, the best Geometry Wars clone in years, and a touching look at the difficulties of adolescence through the surprisingly effective lens of mini-games.

As we look toward 2026, there are plenty of other big-budget projects that the industry is busy preparing for (the delayed Grand Theft Auto VI chief among them). If next year is anything like this year, though, we can look forward to plenty more games that no one saw coming suddenly vaulting into view as new classics.

Assassin’s Creed Shadows

Ubisoft Quebec; Windows, MacOS, PS5, Xbox Series X|S, Switch 2, iPad

When I was younger, I wanted—and expected—virtually every game I played to blow me away with something I’d not seen before. It was easier to hit that bar in the ’90s, when both the design and technology of games were moving at an incredible pace.

Now, as someone who still games in his 40s, I’m excited to see that when it happens, but I don’t expect it. Instead, I increasingly appreciate games that act as a sort of comfort food, and I value some games as much for their familiarity as for their originality.

That’s what Assassin’s Creed Shadows is all about (as I wrote when it first came out). It follows a well-trodden formula, but it’s a beautifully polished version of that formula. Its world is grand and escapist, its audio and graphics presentation is immersive, and it makes room for many different playstyles and skill levels.

If your idea of a good time is “be a badass, but don’t think too hard about it,” Shadows is one of the best Assassin’s Creed titles in the franchise’s long history. It doesn’t reinvent any wheels, but after nearly two decades of Assassin’s Creed, it doesn’t really need to; the new setting and story are enough to separate it, while the gameplay remains familiar.

-Samuel Axon

Avowed

Obsidian Entertainment; Windows, Xbox Series X|S

No game this year has made me feel as hated as Avowed. As an envoy for the far-off Aedryan empire, your role in Avowed is basically to be hated, either overtly or subtly, by almost everyone you encounter in the wild, semi-colonized world of the Living Lands. The low-level hum of hatred and mistrust from the citizens permeates everything you do in the game, which is an unsettling feeling in a genre usually characterized by the moral certitude of heroes fighting world-ending evil.

Role-playing aside, Avowed is helpfully carried by its strong action-packed combat system, characterized as it is by thrilling moment-to-moment positional jockeying and the juggling of magic spells, ranged weapons, and powerful close-range melee attacks. The game’s quest system also does a good job of letting players balance this combat difficulty for themselves—if a goal is listed with three skull symbols on your menu, you’d best put it off until you’ve leveled up a little bit more.

I can take or leave the mystical mumbo-jumbo-filled subplot surrounding your status as a “godlike” being that can converse with spirits. Aside from that, though, I’ve never had so much fun being hated.

-Kyle Orland

Baby Steps

Gabe Cuzzillo, Maxi Boch, Bennett Foddy; Windows, PS5

The term “walking simulator” often gets thrown around in some game criticism circles as a derisive term for a title that’s about nothing more than walking around and looking at stuff. While Baby Steps might technically fit into that “walking simulator” model, stereotyping it in that way does this incredibly inventive game a disservice.

It starts with the walking itself, which requires meticulous, rhythmic manipulation of both shoulder buttons and both analog sticks just to stay upright. Super Mario 64, this ain’t. But what starts as a struggle to take just a few short steps quickly becomes almost habitual, much like learning to walk in real life.

The game then starts throwing new challenges at your feet. Slippery surfaces. Narrow stairways with tiny footholds. Overhangs that block your ridiculously useless, floppy upper body. The game’s relentless mountain is designed such that a single missed step can ruin huge chunks of progress, in the proud tradition of Getting Over It with Bennett Foddy.

This all might sound needlessly cruel and frustrating, but trust me, it’s worth sticking with to the end. That’s in part for the feeling of accomplishment when you do finally make it past that latest seemingly impossible wall, and partly to experience an absolutely gonzo story that deals directly and effectively with ideas of masculinity, perseverance, and society itself. You’ll never be so glad to take that final step.

-Kyle Orland

Ball x Pit

Kenny Sun; Windows, MacOS, PS5, Xbox Series X|S, Switch, Switch 2

The idea of bouncing a ball against a block is one of the most tried-and-true in all of gaming, from the basic version in the ancient Breakout to the number-filled angles of Holedown. But perhaps no game has made this basic concept as compulsively addictive as Ball x Pit.

Here, the brick-breaking genre is crossed with the almost-as-storied shoot-em-up, with the balls serving as your weapons and the blocks as enemies that march slowly but relentlessly from the top of the screen to the bottom. The key to destroying all those blocks in time is bouncing your growing arsenal of new balls at just the right angles to maximize their damage-dealing impact and catching them again so you can throw them once more that much faster.

Like so many roguelikes before it, Ball x Pit uses randomization as the core of its compulsive loop, letting you choose from a wide selection of new abilities and ball-based attacks as you slowly level up. But Ball x Pit goes further than most in letting you fuse those balls into unique combinations that take dozens of runs to fully uncover and exploit effectively.

Add in a deep system of semi-permanent upgrades (with its own intriguing “bounce balls around a city builder” minigame) and a wide range of more difficult settings and enemies to slowly unlock, and you have a game whose addictive pull will last much longer than you might expect from the simple premise.

-Kyle Orland

Blue Prince

Dogubomb; Windows, MacOS, PS5, Xbox Series X|S

Usually, when formulating a list like this, you can compare a title to an existing game or genre as a shorthand to explain what’s going on to newcomers. That’s nearly impossible with Blue Prince, a game that combines a lot of concepts to defy easy comparison to games that have come before it.

At its core, Blue Prince is about solving the mysteries of a house that you build while exploring it, drafting the next room from a selection of three options every time you open a new door. Your initial goal, if you can call it that, is to discover and access the mysterious “Room 46” that apparently exists somewhere on the 45-room grid. And while the floor plan you’re building resets with every in-game day, the knowledge you gain from exploring those rooms stays with you, letting you make incremental progress on a wide variety of puzzles and mysteries as you rebuild the mansion from scratch again and again.

What starts as a few simple and relatively straightforward puzzles quickly unfolds fractally into a complex constellation of conundrums, revealed slowly through scraps of paper, in-game books, inventory items, interactive machinery, and incidental background elements. Figuring out the more intricate mysteries of the mansion requires careful observation and, often, filling a real-life mad scientist’s notepad with detailed notes that look incomprehensible to an outsider. All the while, you have to manage each day’s limited resources and luck-of-the-draw room drafting to simply find the right rooms to make the requisite progress.

Getting to that storied Room 46 is enough to roll the credits on Blue Prince, and it serves as an engaging enough puzzle adventure in its own right. But that whole process could be considered a mere tutorial for a simply massive endgame, which is full of riddles that will perplex even the most experienced puzzlers while slowly building a surprisingly deep story of political intrigue and spycraft through some masterful environmental storytelling.

Some of those extreme late-game puzzles might be too arcane for their own good, honestly, and will send many players scrambling for a convenient guide or wiki for some hints. But even after playing for over 100 hours over two playthroughs, I’m pretty sure I’m still not done exploring all that Blue Prince has to offer.

-Kyle Orland

Civilization VII

Firaxis; Windows, MacOS, Linux, PS4/5, Xbox One/Series X|S, Switch 2

This one will be controversial: I love Civilization VII.

Civilization VII launched as a bit of a mess. There were bugs and UI shortcomings aplenty. Most (but not all) of those have been addressed in the months since, but they’re not the main reason this is a tricky pick.

The studio behind the Civilization franchise, Firaxis, has long said it has a “33/33/33” approach to sequels in the series, wherein 33 percent of the game should be familiar systems, 33 percent should be remixes or improvements of familiar systems, and 33 percent should be entirely new systems.

Critics of Civilization VII say Firaxis broke that 33/33/33 rule by overweighting the last 33 percent, mainly to chase innovations in the 4X genre by other games (like Humankind). I don’t disagree, but I also welcome it.

Credit is due to the team at Firaxis for ingeniously solving some longstanding design problems in the franchise, like using the new age transitions to curb snowballing and to expunge systems that become a lot less fun in the late game than they are in the beginning. Judged on its own terms, Civilization VII is a deep, addictive, and fun strategy game that I’ve spent more than 100 hours playing this year.

My favorite Civ game remains Civilization IV, but that game still runs fine on modern systems, is infinitely replayable out of the box, and enjoys robust modding support. I simply didn’t need more of the same from this particular franchise; to me, VII coexists with IV and others on my hard drive—very different flavors of the same idea.

-Samuel Axon

CloverPit

Panik Arcade; Windows, Xbox Series X|S

I’m not sure I like what my minor CloverPit obsession says about me. When I fell into a deep Balatro hole last year, I could at least delude myself into thinking there was some level of skill in deciding which jokers to buy and sell, which cards to add or prune from my deck, and which cards to hold and discard. In the end, though, I was as beholden to the gods of random number generation as any other Balatro player.

CloverPit makes the surrender to the vagaries of luck all the more apparent, replacing the video-poker-like systems of Balatro with a “dumb” slot machine whose handle you’re forced to pull over and over again. Sure, there are still decisions to make, mostly regarding which lucky charms you purchase from a vending machine on the other side of the room. And there is some skill involved in learning and exploiting lucky charm synergies to extract the highest expected value from those slot machine pulls.

Once you’ve figured out those basic strategies, though, CloverPit mostly devolves into a series of rerolls waiting for the right items to show up in the shop in the right order. Thankfully, the game hides plenty of arcane secrets beneath its charming PS1-style spooky-horror presentation, slowly revealing new items and abilities that hint that something deeper than just accumulating money might be the game’s true end goal.

It’s this creepy vibe and these slowly unfolding secrets that have compelled me to pour dozens of hours into what is, in the end, just a fancy slot machine simulator. God help me.

-Kyle Orland

Consume Me

Jenny Jiao Hsia, AP Thomson; Windows, MacOS

Jenny is your average suburban Asian-American teenager, struggling to balance academic achievement, chores, an overbearing mother, romantic entanglements, and a healthy body image. What sounds like the premise for a cliché young adult novel actually serves to set up a compelling interactive narrative disguised as a mere mini-game collection.

Consume Me brilliantly integrates the conflicting demands placed on Jenny’s time and attention into the gameplay itself. Creating a balanced meal, for instance, becomes a literal test of balancing vaguely Tetris-shaped pieces of food on a tray, satisfying your hunger and caloric limits at the same time. Chores take up time but give you money you can spend on energy drinks that let you squeeze in more activities by staying up late (but can lead to debilitating headaches). A closet full of outfits becomes an array of power-ups boosting your time, energy, or focus.

It takes almost preternatural resource management skills and mini-game execution to satisfy all the expectations being placed on you, which is kind of the meta-narrative point. No matter how well you do, Jenny’s story develops in a way that serves as a touching semi-autobiographical look at the life of co-creator Jenny Jiao Hsia. That biography is made all the more sympathetic here for an interactive presentation that’s more engaging than any young adult novel could be.

-Kyle Orland

Death Stranding 2: On the Beach

Kojima Productions; PS5

Death Stranding 2: On the Beach should not be fun. Much like its predecessor, the latest release from famed game designer Hideo Kojima is about delivering packages—at least on the surface. Yet the process of planning your routes, managing inventory, and exploring an unfathomably strange post-apocalyptic world remains a winning formula.

The game again follows Sam Porter Bridges (played by Norman Reedus) on his quest to reconnect the world as humanity faces possible extinction. And yes, that means acting like a post-apocalyptic Amazon Prime. Standing in the way of an on-time delivery are violent raiders, dangerous terrain, and angry, disembodied spirits known as Beached Things.

It’s common to hear Death Stranding described as a walking simulator, and there is indeed a lot of walking, but the sequel introduces numerous quality-of-life improvements that make it more approachable. Death Stranding 2 has a robust fast-travel mechanic and better vehicles to save you from unnecessary marches, and the inventory management system is less clunky. That’s important in a game that asks you to traverse an entire continent to deliver cargo.

Beyond the core gameplay loop of stacking heavy boxes on your back, Death Stranding 2 has all the Kojima vibes you could want. There are plenty of quirky gameplay mechanics and long cutscenes that add depth to the characters and keep the story moving. The world of Death Stranding has been designed from the ground up around the designer’s flights of fancy, and it works—even the really weird stuff almost makes sense!

Along the way, Death Stranding 2 has a lot to say about grief, healing, and the value of human connection. The game’s most poignant cutscenes are made all the more memorable by an incredible soundtrack, and we cannot overstate the strength of the mocap performances.

It may take 100 hours or more to experience everything the game has to offer, but it’s well worth your time.

-Ryan Whitwam

Donkey Kong Bananza

Nintendo EPD; Switch 2

Since the days of Donkey Kong Country, I’ve always felt that Mario’s original ape antagonist wasn’t really up for anchoring a Mario-level platform franchise. Donkey Kong Bananza is the first game to really make me doubt that take.

Bananza is a great showcase for the new, more powerful hardware on the Switch 2, with endlessly destructible environments that send some impressive-looking shiny shrapnel flying when they’re torn apart. It can’t be overstated how cathartic it is to pound tunnels up, down, and through pretty much every floor, ceiling, and wall you see, mashing the world itself to suit your needs.

Bananza also does a good job aping Super Mario Odyssey’s tendency to fill practically every square inch of space with collectible doodads and a wide variety of challenges. This is not a game where you need to spend a lot of time aimlessly wandering for the next thing to do—there’s pretty much always something interesting around the next corner until the extreme end game.

Sure, the camera angles and frame rate might suffer a bit during the more chaotic bits. But it’s hard to care when you’re having this much fun just punching your way through Bananza’s imaginative, colorful, and malleable world.

-Kyle Orland

Doom: The Dark Ages

Id Software; Windows, PS5, Xbox Series X|S

For a series that has always been about dodging, Doom: The Dark Ages is much more about standing your ground. The game’s key verbs involve raising your shield to block incoming attacks or, ideally, parrying them back in the direction they came from.

It’s a real “zig instead of zag” moment for the storied Doom series, and it does take some getting used to. Overall, though, I had a great time mixing in turtle-style blocking with the habitual pattern of circle-strafing around huge groups of enemies in massive arenas and quickly switching between multiple weapons to deal with them as efficiently as possible. While I missed the focus on extreme verticality of the last two Doom games, I appreciate the new game’s more open-world design, which gives completionist players a good excuse to explore every square inch of these massive environments for extra challenges and hidden collectibles.

The only real problem with Doom: The Dark Ages comes when the game occasionally transitions to a slow-paced mech-style demon battle or awkward flying dragon section, sometimes for entire levels at a time. Those variations aside, I came away very satisfied with the minor change in focus for a storied shooter series.

-Kyle Orland

Dragonsweeper

Daniel Benmergui; JavaScript

Anyone who has read my book-length treatise on Minesweeper knows I’m a sucker for games that involve hidden threats within a grid of revealed numbers. But not all variations on this theme are created equal. Dragonsweeper stands out from the crowd by incorporating a simple but arcane world of RPG-style enemies and items into its logical puzzles.

Instead of simply counting the number of nearby mines, each number revealed on the Dragonsweeper grid reflects the total health of the surrounding enemies, both seen and unseen. Attacking those enemies means enduring predictable counterattacks that deplete your limited health bar, which you can grow through gradual leveling until you’re strong enough to kill the game’s titular dragon, taunting you from the center of the field.

Altogether, it adds an intriguing new layer to the logical deduction, forcing you to carefully manage your moves to maximize the impact of your attacks and the limited health-restoring items scattered throughout the field. And while finishing one run isn’t too much of a challenge, completing the game’s optional achievements and putting together a “perfect” game score is enough to keep puzzle lovers coming back for hours and hours of compelling logical deduction.

-Kyle Orland

Elden Ring: Nightreign

FromSoftware; Windows, PS4/5, Xbox One/Series X|S

At first blush, Nightreign feels like a twisted perversion of everything that has made FromSoft’s Souls series so compelling for so many years. What was a slow-paced, deliberate open-world RPG has become a game about sprinting across a rapidly contracting map, leveling up as fast as possible before taking on punishing bosses. A moody solitary experience has become one that practically requires a group of three players working together. It’s like an Elden Ring-themed amusement park that seems to miss the point of the original.

Whatever. It still works!

Let the purists bellyache about how it’s not really Elden Ring. They’re right, but they’re missing the point. Nightreign condenses the general vibe of the Elden Ring world into something very different but no less enjoyable. What’s more, it packs that vibe into a tight experience that can be easily squeezed into a 45-minute sprint rather than requiring dozens of hours of deep exploration.

That makes it the perfect excuse to get together with a few like-minded Elden Ring-loving friends, throw on a headset, and just tear through the Lands Between together for the better part of an evening. As Elden Ring theme parks go, you could do a lot worse.

-Kyle Orland

Ghost of Yotei

Sucker Punch Productions; PS5

Ghost of Yotei from Sucker Punch Productions starts as a revenge tale, featuring hard-as-nails Atsu on the hunt for the outlaws who murdered her family. While there is plenty of revenge to be had in the lands surrounding Mount Yotei, the people Atsu meets and the stories they have to tell make this more than a two-dimensional quest for blood.

The game takes place on the northern Japanese island of Ezo (modern-day Hokkaido) several centuries after the developer’s last samurai game, Ghost of Tsushima. It has a lot in common with that title, but Ghost of Yotei was built for the PS5 and features a massive explorable world and stunning visuals. It’s easy to get sidetracked from your quest just exploring Ezo and tinkering with the game’s photo mode.

The land of Ezo avoids some of the missteps seen in other open-world games. While it’s expansive and rich with points of interest, exploring it is not tedious. There are no vacuous fetch quests or mindless collecting (or loading screens, for that matter). Even when you think you know what you’re going to find at a location, you may be surprised. The interesting side quests and random encounters compel you to keep exploring Ezo.

Ghost of Yotei’s combat is just as razor-sharp as its exploration. It features multiple weapon types, each with unlockable abilities and affinities that make them ideal for taking on certain foes. Brute force will only get you so far, though. You need quick reactions to parry enemy attacks and strike back—it’s challenging and rewarding but not frustrating.

It’s impossible to play Ghost of Yotei without becoming invested in the journey, and a big part of that is thanks to the phenomenal voice work of Erika Ishii as Atsu. Some of the game’s pivotal moments will haunt you, but luckily, the developer has just added a New Game+ mode so you can relive them all again.

-Ryan Whitwam

Hades 2

Supergiant Games; Windows, MacOS, Switch, Switch 2

There’s a moment in the second section of Hades 2 where you start to hear a haunting melody floating through the background. That music gets louder and louder until you reach the game’s second major boss, a trio of sirens that go through a full rock-opera showtune number as you dodge their bullet-hell attacks and look for openings to go in for the kill. That three-part musical presentation slowly dwindles to a solo as you finally dispatch the sirens one by one, restoring a surprisingly melancholy silence once more.

It’s this and other musical moments casually and effortlessly woven through Hades 2 that will stick with me the most. But the game stands on its own beyond the musicality, expanding the original game’s roguelike action with a compelling new spell system that lets you briefly capture or slow enemies in a binding circle. This small addition adds a new sense of depth to the moment-to-moment positional dance that was already so compelling in the original Hades.

Hades 2 also benefits greatly from the introduction of Melinoe, a compelling new protagonist who gets fleshed out through her relationship with the usual rogues’ gallery of gods and demigods. Come for her quest of self-discovery, stay for the moments of musical surprise.

-Kyle Orland

Hollow Knight: Silksong

Team Cherry; Windows, MacOS, Linux, PS4/5, Xbox One/Series X|S, Switch, Switch 2

A quickie sequel in the year or two after Hollow Knight’s out-of-nowhere success in 2017 might have been able to get away with just being a more-of-the-same glorified expansion pack. But after over eight years of overwhelming anticipation from fans, Silksong had to really be something special to live up to its promise.

Luckily, it is. Silksong is a beautiful expansion of the bug-scale underground universe created in the first game. Every new room is a work of painterly beauty, with multiple layers of detailed 2D art drawing you further into its intricate and convincing fallen world.

The sprawling map seems to extend forever in every direction, circling back around and in on itself with plenty of optional alleyways in which to get lost searching for rare power-ups. And while the game is a punishingly hard take on action platforming, there’s usually a way around the most difficult reflex tests for players willing to explore and think a bit outside the box.

Even players who hit a wall and never make it through the sprawling tunnels of Silksong’s labyrinthine underground will still find plenty of memorable moments in whatever portion of the game they do experience.

-Kyle Orland

The King Is Watching

Hypnohead; Windows

In a real-time-strategy genre that can often feel too bloated and complex for its own good, The King Is Watching is a streamlined breath of fresh air. Since the entire game takes place on a single screen, there’s no need to constantly pan and zoom your camera around a sprawling map. Instead, you can stay laser-focused on your 5×5 grid of production space and on which portion of it is actively productive under the king’s limited gaze at any particular moment.

Arranging tiles to maximize that production of basic resources and military units quickly becomes an all-consuming juggling act, requiring constant moment-to-moment decisions that can quickly cascade through a run. I’m also a big fan of the game’s self-selecting difficulty system, which asks you to choose how many enemies you think you can take on in the coming waves, doling out better rewards for players who are willing to push themselves to the limit of their current capabilities.

The bite-size serving of a single King Is Watching run ensures that even failure doesn’t feel too crushing. And success brings with it just enough in the way of semi-permanent ability expansions to encourage another run where you can reach even greater heights of production and protection.

-Kyle Orland

Kingdom Come: Deliverance II

Warhorse Studios; Windows, PS5, Xbox Series X|S

Kingdom Come: Deliverance was a slog that I had to will myself to complete. It was sometimes a broken and janky game, but despite its warts, I saw the potential for something special. And that’s what its sequel, Kingdom Come: Deliverance II, has delivered.

While it’s still a slow burn, the overall experience has been greatly refined, the initial challenge has been smoothed out, and I’ve rarely been more immersed in an RPG’s storytelling. There’s no filler, as every story beat and side quest offers a memorable tale that further paints the setting and characters of medieval Bohemia.

Unlike most RPGs, there’s no magic to be had, which is a big part of the game’s charm. As Henry of Skalitz, you are of meager social standing, and many characters you speak to will be quick to remind you of it. While Henry is a bit better off than his humble beginnings in the first game, you’re no demigod who can win a large battle single-handedly. In fact, you’ll probably lose fairly often in the early going if more than one person is attacking you.

Almost every fight is a slow dance once you’re in a full suit of armor, and your patience and timing will matter more than the stats of your equipment. But therein lies the beauty of KC:D II: Every battle you pick, whether physical or verbal, carries real weight, shaping Bohemia for better or worse.

-Jacob May

Mario Kart World

Nintendo; Switch 2

After the incredible success of Mario Kart 8 and its various downloadable content packs on the Switch, Nintendo could have easily done a prettier “more of the same” sequel as the launch-window showcase for the Switch 2. Instead, the company took a huge gamble in trying to transform Mario Kart’s usual distinct tracks into a vast, interconnected open world.

This conceit works best in “Free Roam” mode, where you can explore the outskirts of the standard tracks and the wide open spaces in between for hundreds of mini-challenges that test your driving speed and precision. Add in dozens of collectible medallions and outfits hidden in hard-to-reach corners, and the mode serves as a great excuse to explore every nook and cranny of a surprisingly detailed and fleshed-out world map.

I was also a big fan of Knockout Mode, which slowly whittles a frankly overwhelming field of 24 initial racers down to a single winner through a series of endurance rally race checkpoints. These modes help make up for a series of perplexing changes that hamper the tried-and-true Battle Mode formula, as well as for long straightaway sections that feel more than a little stifling in the standard Grand Prix mode. Still, Free Roam mode had me happily whiling away dozens of hours with my new Switch 2 this year.

-Kyle Orland

Sektori

Kimmo Lahtinen; Windows, PS5, Xbox Series X|S

For decades now, I’ve been looking for a twin-stick shooter that fully captures the compulsive thrill of the Geometry Wars franchise. Sektori, a late-breaking addition to this year’s top games list, is the first game I can say does so without qualification.

Like Geometry Wars, Sektori has you weaving through a field filled with simple shapes that quickly fill your personal space with ruthless efficiency. But Sektori advances that basic premise with an elegant “strike” system that lets you dash through encroaching enemies and away from danger with the tap of a shoulder button. Advanced players can get a free, instant strike refill by dashing into an upgrade token, and stringing those strikes together creates an excellent risk-vs-reward system of survival versus scoring.

Sektori also features an excellent Gradius-style upgrade system that forces you to decide on the fly whether to take basic power-ups or save up tokens for more powerful weaponry and/or protection further down the line. And just when the basic gameplay threatens to start feeling stale, the game throws in a wide variety of bosses and new modes that mix things up just enough to keep you twitching away.

Throw in an amazing soundtrack and polished presentation that makes even the most crowded screens instantly comprehensible, and you have a game I can see myself coming back to for years—until my reflexes are just too shot to keep up with the frenetic pace anymore.

-Kyle Orland

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.

Ars Technica’s Top 20 video games of 2025