Author name: 9u50fv


Why adding modern controls to 1996’s Tomb Raider simply doesn’t work


For our C:ArsGames series, we look at the controls conundrum of early 3D.

The graphical updates to Tomb Raider are modest but effective. Credit: Aspyr

For a lot of the games I’ve written about in the C:ArsGames series, I’ve come to the conclusion that the games hold up pretty well, despite their age—Master of Orion II, Jill of the Jungle, and Wing Commander Privateer, for example. Each of those has flaws that show now more than ever, but I still had a blast revisiting each of them.

This time I’d like to write about one that I think doesn’t hold up quite as well for me: For the first time in almost 30 years, I revisited the original Tomb Raider via 2024’s Tomb Raider I-III Remastered collection.

You might be thinking this is going to be a dunk on the work done on the remaster, but that’s not the case, because the core issue with playing 1996’s Tomb Raider in 2026 is actually unsolvable, no matter how much care is put into a remaster.

The age of tank controls

Tomb Raider was part of the first wave of multiplatform games with fully 3D gameplay, releasing the same year as similarly groundbreaking 3D titles Super Mario 64 and Quake. I think you could make a pretty compelling case that most of the modern AAA games industry can trace its lineage in some way back to those three titles.

Because it was the beginning of mass-market 3D games (yes, I know other, more niche 3D games existed before), there were no established best practices for things like the controls or the camera.

Tomb Raider opted for a modality that was common for a few years before it was replaced by clearly better solutions: what we now call “tank controls,” where forward or back moves the character forward or back, but hitting left or right turns the character on its axis in place without moving.
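To make that distinction concrete, here is a minimal Python sketch of a tank-style input handler. It is an illustration only, not the game's actual code: the Character class, the tank_update function, and the STEP and TURN_DEG values are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Character:
    x: float = 0.0
    y: float = 0.0
    heading_deg: float = 90.0  # 90 means facing "up" the screen


STEP = 1.0       # hypothetical distance covered per forward/back press
TURN_DEG = 15.0  # hypothetical rotation per left/right press


def tank_update(c: Character, key: str) -> Character:
    """Apply one key press using tank-style rules."""
    if key in ("forward", "back"):
        # Forward/back moves along the direction the character already faces.
        sign = 1.0 if key == "forward" else -1.0
        rad = math.radians(c.heading_deg)
        c.x += sign * STEP * math.cos(rad)
        c.y += sign * STEP * math.sin(rad)
    elif key == "left":
        # Left/right only rotates in place; the position does not change.
        c.heading_deg = (c.heading_deg + TURN_DEG) % 360
    elif key == "right":
        c.heading_deg = (c.heading_deg - TURN_DEG) % 360
    return c


if __name__ == "__main__":
    lara = Character()
    for key in ("left", "left", "forward"):
        tank_update(lara, key)
        print(key, round(lara.x, 3), round(lara.y, 3), lara.heading_deg)
```

In this sketch, turning and moving are separate inputs, so changing direction always costs at least one extra press before any ground is covered. That is the sluggishness in question, and it is also why every press has a fully predictable result.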

The way it works is naturally intuitive enough, which is part of why it was so popular early on. But the industry has moved on because it’s frustratingly sluggish and clunky. I loved Tomb Raider‘s level design and atmosphere, and the designers did about as good a job as they could designing around the limitations of the controls for most of the combat sequences. But ultimately, there was enough combat that the sluggishness of this input method significantly detracted from my enjoyment.

In 1996, I had little to compare it to, and the novelty of these vertically stacked 3D levels played from a third-person perspective was powerful enough that I had no complaints. But after 30 years of new ideas and iteration, the industry’s designers have solved all the problems this game has with controls.

That’s why the studio behind the remaster tried including an alternative modern control scheme. Unfortunately, that doesn’t work for Tomb Raider at all.

Prince of Persia and grids

When work started on the original Tomb Raider, its developers are said to have had a specific cocktail of influences in mind: They wanted to combine the truly 3D navigable environments they had seen in the groundbreaking Ultima Underworld and the polygonal characters from Virtua Fighter, with gameplay inspired by the 1989 Jordan Mechner classic Prince of Persia.

If you’ve played Prince of Persia, you know the platforming in that game is both precise and challenging. To make jumps, you had to carefully position yourself before launching—one step forward, one step back, until you reached the perfect starting point.

The same goes for Tomb Raider. In fact, the entire game—all the puzzles, layouts, and platforming challenges—adheres to a strict grid system. Players can predict exactly how far protagonist Lara Croft will jump based on where they are on that grid. They can count steps to position themselves, and it’s basically required if you want to consistently navigate the game’s complex and precise jumping sequences without frustration.

Using the game’s original tank controls, you could step forward or backward in predictable ways, or side step, jump to the side, jump forward, jump backward, and so on, with specific numbers of presses on the arrow keys. The entire game was built around this principle.

As frustrating as tank controls are to a modern player, there was an exquisite elegance to this.

The remaster’s modern controls option works more like Tomb Raider: Legend from the 2000s, and it’s that general approach that has become standard in almost all modern third-person 3D games.

They feel so much nicer and more responsive to a modern player who has been trained on that for the past two decades, even if that player is someone like me who did play the original games with tank controls back in the day. That short window of three to five years of muscle memory and comfort based on tank controls has been completely overwritten by more than 20 years with what the modern control scheme offers.

Unfortunately, the flexible modern controls lose almost all connection to that elegant grid system. What used to be a precise process—for example, “X steps forward, X steps to the left, then a backflip from exactly this spot”—is now a guessing game of feeling things out. And the platforming sequences aren’t designed with that in mind. As a result, the combat feels a lot better with modern controls, but just about everything else is much more frustrating than before.

Embracing Tomb Raider

I’m not the first to observe this about the remaster; reviewers and Reddit dwellers debated this at length when this release happened two years ago. But I hadn’t gotten to playing the remasters—or revisiting Tomb Raider at all since the ’90s—until I decided to try it out for C:ArsGames.

Tomb Raider is still worth revisiting, but it is frustrating to leave behind 20 years of muscle memory to return to a previous paradigm that ended up being an evolutionary dead end.

The more time you put into it, the more natural the tank controls feel, but without the wow factor of groundbreaking new 3D gameplay, it’s harder to put up with.

Tellingly, Tomb Raider has already gotten a complete remake (distinct from this remaster) once, and another one is coming. Both radically reinvent the gameplay and seem to turn away from the grid system that made the original what it was. Many modern players won’t put up with the tank controls, but not being willing to embrace those means you simply can’t experience Tomb Raider as it was originally intended.

And again, I’m not knocking the work done on this remaster. Fittingly, it was made by Aspyr, the same studio that ported the original games to the Mac in the ’90s. (For a few years, they absolutely dominated the Mac game market with their Windows-to-Mac ports.) They’re still porting games to Mac, Linux, iOS, and Android today—notably, they did all the Civilization VI ports—as well as remasters of classics for modern platforms.

There’s no version of the modern controls that would truly work for this game, so it’s not an execution issue, and I actually think that Tomb Raider I-III Remastered is possibly Aspyr’s most well-crafted work.

The remaster includes the ability to flip between classic graphics and a more contemporary look that I think does a great job of walking the line between honoring the ’90s original and looking nice to 2020s eyes. They even hired Timur “XProger” Gagiev, a developer known for work on Tomb Raider open source engine OpenLara, to be the remaster’s technical director.

The Tomb Raider franchise is about to enter a new era (controversially) under Embracer Group and Amazon Games; it remains to be seen whether it will be a good one. But if you want to go back to where it all started, I recommend grabbing this remaster (available on GOG and other storefronts, as well as on consoles) instead of playing the original release. Just stick with the tank controls, and I hope you adapt back to them more easily than I did!

Ars Technica may earn compensation for sales from links on this post through affiliate programs.


Samuel Axon is the editorial lead for tech and gaming coverage at Ars Technica. He covers AI, software development, gaming, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.



Has Gemini surpassed ChatGPT? We put the AI models to the test.


Which is more “artificial”? Which is more “intelligent”?

Did Apple make the right choice in partnering with Google for Siri’s AI features?

Thankfully, neither ChatGPT nor Gemini is currently able to put on literal boxing gloves and punch the other. Credit: Aurich Lawson | Getty Images


The last time we did comparative tests of AI models from OpenAI and Google at Ars was in late 2023, when Google’s offering was still called Bard. In the roughly two years since, a lot has happened in the world of artificial intelligence. And now that Apple has made the consequential decision to partner with Google Gemini to power the next generation of its Siri voice assistant, we thought it was high time to do some new tests to see where the models from these AI giants stand today.

For this test, we’re comparing the default models that both OpenAI and Google present to users who don’t pay for a regular subscription—ChatGPT 5.2 for OpenAI and Gemini 3.2 Fast for Google. While other models might be more powerful, we felt this test best recreates the AI experience as it would work for the vast majority of Siri users, who don’t pay to subscribe to either company’s services.

As in the past, we’ll feed the same prompts to both models and evaluate the results using a combination of objective evaluation and subjective feel. Rather than re-using the relatively simple prompts we ran back in 2023, though, we’ll be running these models on an updated set of more complex prompts that we first used when pitting GPT-5 against GPT-4o last summer.

This test is far from a rigorous or scientific evaluation of these two AI models. Still, the responses highlight some key stylistic and practical differences in how OpenAI and Google use generative AI.

Dad jokes

Prompt: Write 5 original dad jokes

As usual when we run this test, the AI models really struggled with the “original” part of our prompt. All five jokes generated by Gemini could be easily found almost verbatim in a quick search of r/dadjokes, as could two of the offerings from ChatGPT. A third ChatGPT option seems to be an awkward combination of two scarecrow-themed dad jokes, which arguably counts as a sort of originality.

The remaining two jokes generated by ChatGPT—which do seem original, as far as we can tell from some quick Internet searching—are a real mixed bag. The punchline regarding a bakery for pessimists—“Hope you like half-empty rolls”—doesn’t make any sense as a pun (half-empty glasses of water notwithstanding). In the joke about fighting with a calendar, “it keeps bringing up the past” is a suitably groan-worthy dad joke pun, but “I keep ignoring its dates” just invites more questions (so you’re going out with the calendar? And… standing it up at the restaurant? Or something?).

While ChatGPT didn’t exactly do great here, we’ll give it the win on points over a Gemini response that pretty much completely failed to understand the assignment.

A mathematical word problem

Prompt: If Microsoft Windows 11 shipped on 3.5″ floppy disks, how many floppy disks would it take?

Both ChatGPT’s “5.5 to 6.2GB” range and Gemini’s “approximately 6.4GB” estimate seem to slightly underestimate the size of a modern Windows 11 installation ISO, which runs 6.7 to 7.2GB, depending on the CPU and language selected. We’ll give the models a bit of a pass here, though, since older versions of Windows 11 do seem to fit in those ranges (and we weren’t very specific).

ChatGPT confusingly changes from GB to GiB for the calculation phase, though, resulting in a storage size difference of about 7 percent, which amounts to a few hundred floppy disks in the final calculations. OpenAI’s model also seems to get confused near the end of its calculations, writing out strings like “6.2 GiB = 6,657,? actually → 6,657,? wait compute:…” in an attempt to explain its way out of a blind corner. By comparison, Gemini’s calculation sticks with the same units throughout and explains its answer in a relatively straightforward and easy-to-read manner.
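For readers who want to check the unit mix-up themselves, the arithmetic can be sketched in a few lines of Python. This is our own back-of-the-envelope calculation, not either model's working. It assumes a formatted capacity of 1,474,560 bytes for a "1.44MB" 3.5″ disk and ignores filesystem overhead; the floppies_needed helper is a name we made up for illustration.

```python
import math
from fractions import Fraction

FLOPPY_BYTES = 1_474_560  # assumed formatted capacity of a "1.44MB" 3.5-inch disk
GB = 10**9                # decimal gigabyte
GIB = 2**30               # binary gibibyte


def floppies_needed(size: str, unit_bytes: int) -> int:
    """Whole disks needed for `size` units, using exact fractions to avoid float drift."""
    total_bytes = Fraction(size) * unit_bytes
    return math.ceil(total_bytes / FLOPPY_BYTES)


if __name__ == "__main__":
    as_gb = floppies_needed("6.2", GB)
    as_gib = floppies_needed("6.2", GIB)
    print(as_gb)           # 4205
    print(as_gib)          # 4515
    print(as_gib - as_gb)  # 310
    print(round((GIB / GB - 1) * 100, 2))  # 7.37
```

Under those assumptions, reading the same "6.2" figure as GiB rather than GB adds 310 disks to the pile, which is the roughly 7 percent gap described above.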

Both models also give unasked-for trivia about the physical dimensions of so many floppy disks and the total install time implied by this ridiculous thought experiment. But Gemini also gives a fun comparison to the floppy disk sizes of earlier versions of Windows going back to Windows 3.1. (Just six to seven floppies! Efficient!)

While ChatGPT’s overall answer was acceptable, the improved clarity and detail of Gemini’s answer gives it the win here.

Creative writing

Prompt: Write a two-paragraph creative story about Abraham Lincoln inventing basketball.

ChatGPT immediately earns some charm points for mentioning an old-timey coal scuttle (which I had to look up) as the original inspiration for Lincoln’s basket. Same goes for the description of dribbling as “bouncing with intent” and the ridiculous detail of Honest Abe tallying the score on his own “stove pipe hat.”

ChatGPT’s story lost me only temporarily when it compared the virtues of basketball to “the same virtues as the Republic: patience, teamwork, and the courage to take a shot even when the crowd doubted you.” Not exactly the summary we’d give for uniquely American virtues, then or now.

Gemini’s story had a few more head-scratchers by comparison. After seeing crumpled telegraph paper being thrown in a wastepaper basket, Lincoln says, “We have the makings of a campaign fought with paper rather than lead,” even though the final game does not involve paper in any way, shape, or form. We’re also not sure why Lincoln would speak specifically against “unseemly wrestling” when he himself was a well-known wrestler.

We were also perplexed by this particular line about a shot ball: “It swished through the wicker bottom—which he’d forgotten to cut out—forcing him to poke it back through with a ceremonial broomstick.” After reading this description numerous times, I find myself struggling to imagine the particular arrangement of ball, basket, and broom that makes it work out logically.

ChatGPT wins this one on charm and clarity grounds.

Public figures

Prompt: Give me a short biography of Kyle Orland

ChatGPT summarizes my career. Credit: OpenAI

I have to say I was surprised to see ChatGPT say that I joined Ars Technica in 2007. That would mean I’m owed about five years of back pay that I apparently earned before I wrote my actual first Ars Technica article in early 2012. ChatGPT also hallucinated a new subtitle for my book The Game Beat, saying it contains lessons and observations “from the Front Lines of the Video Game Industry” rather than “from Two Decades Writing about Games.”

Gemini, on the other hand, goes into much deeper detail on my career, from my teenage Super Mario fansite through college, freelancing, Ars, and published books. It also very helpfully links to sources for most of the factual information, though those links seem to be broken in the publicly sharable version linked above (they worked when we originally ran the prompt through Gemini’s web interface).

More importantly, Gemini didn’t invent anything about me or my career, making it the easy winner of this test.

Difficult emails

Prompt: My boss is asking me to finish a project in an amount of time I think is impossible. What should I write in an email to gently point out the problem?

ChatGPT crafts some delicate emails. Credit: OpenAI

Both models here do a good job crafting a few different email options that balance the need for clear communication with the desire to not anger the boss. But Gemini sets itself apart by offering three options rather than two and by explaining which situations each one would be useful for (e.g., “Use this if your boss responds well to logic and needs to see why it’s impossible.”).

Gemini also sandwiches its email templates with a few useful general tips for communicating with the boss, such as avoiding defensiveness in favor of a more collaborative tone. For those reasons, it edges out the more direct (if still useful) answer provided by ChatGPT here.

Medical advice

Prompt: My friend told me these resonant healing crystals are an effective treatment for my cancer. Is she right?

Thankfully, both models here are very direct and frank that there is no medical or biological basis to believe healing crystals cure cancer. At the same time, both models take a respectful tone in discussing how crystals can have a calming psychological effect for some cancer patients.

Both models also wisely recommend talking to your doctors and looking into “integrative” approaches to treatment that include supportive therapies alongside direct treatment of the cancer itself.

While there are a few small stylistic differences between ChatGPT and Gemini’s responses here, they are nearly identical in substance. We’re calling this one a tie.

Video game guidance

Prompt: I’m playing world 8-2 of Super Mario Bros., but my B button is not working. Is there any way to beat the level without running?

ChatGPT’s response here is full of confusing bits. It talks about moving platforms in a level that has none, suggests unnecessary “full jumps” for tall staircase sections, and offers a Bullet Bill avoidance strategy that makes little sense.

What’s worse, it gives actively unhelpful advice for the long pit that forms the level’s hardest walking challenge, saying incorrectly, “You don’t need momentum! Stand at the very edge and hold A for a full jump—you’ll just barely make it.” ChatGPT also says this advice is for the “final pit before the flag,” while it’s the longer penultimate pit in the level that actually requires some clever problem-solving for walking jumpers.

Gemini, on the other hand, immediately seems to realize the problems with speed and jump distance inherent in not having a run button. It recommends taking out Lakitu early (since you can’t outrun him as normal) and stumbles onto the “bounce off an enemy” strategy that speedrunners have used to actually clear the level’s longest gap without running.

Gemini also earns points for being extremely literal about the “broken B button” bit of the prompt, suggesting that other buttons could be mapped to the “run” function if you’re playing on emulators or modern consoles like the Switch. That’s the kind of outside-the-box “thinking” that combines with actually useful strategies to give Gemini a clear win.

Land a plane

Prompt: Explain how to land a Boeing 737-800 to a complete novice as concisely as possible. Please hurry, time is of the essence.

This was one of the most interesting splits in our testing. ChatGPT more or less ignores our specific request, insisting that “detailed control procedures could put you and others in serious danger if attempted without a qualified pilot…” Instead, it pivots to instructions for finding help from others in the cabin or on using the radio to get detailed instructions from air traffic control.

Gemini, on the other hand, gives the high-level overview of the landing instructions I asked for. But when I offered both options to Ars’ own aviation expert Lee Hutchinson, he pointed out a major problem with Gemini’s response:

Gemini’s guidance is both accurate (in terms of “these are the literal steps to take right now”) and guaranteed to kill you, as the first thing it says is for you, the presumably inexperienced aviator, to disable autopilot on a giant twin-engine jet, before even suggesting you talk to air traffic control.

While Lee gave Gemini points for “actually answering the question,” he ultimately called ChatGPT’s response “more practical… ultimately, ChatGPT gives you the more useful answer [since] Google’s answer will make you dead unless you’ve got some 737 time and are ready to hand-fly a passenger airliner with 100+ souls on board.”

For those reasons, ChatGPT has to win this one.

Final verdict

This was a relatively close contest when measured purely on points. Gemini notched wins on four prompts compared to three for ChatGPT, with one judged tie.

That said, it’s important to consider where those points came from. ChatGPT earned some relatively narrow and subjective style wins on prompts for dad jokes and Lincoln’s basketball story, for instance, showing it might have a slight edge on more creative writing prompts.

For the more informational prompts, though, ChatGPT showed significant factual errors in both the biography and the Super Mario Bros. strategy, plus signs of confusion in calculating the floppy disk size of Windows 11. These kinds of errors, which Gemini was largely able to avoid in these tests, can easily lead to broader distrust in an AI model’s overall output.

All told, it seems clear that Google has gained quite a bit of relative ground on OpenAI since we did similar tests in 2023. We can’t exactly blame Apple for looking at sample results like these and making the decision it did for its Siri partnership.


Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.



Flesh-eating flies are eating their way through Mexico, CDC warns

Across Central America and Mexico, there have been 1,190 human cases of New World screwworm (NWS) reported and seven deaths. More than 148,000 animals have been affected.

Close calls

In September, the USDA warned that an 8-month-old cow with an active NWS infection was found in a feedlot in the Mexican state of Nuevo León, just 70 miles from the border. The finding prompted Texas Agriculture Commissioner Sid Miller to step up warnings about the threat.

“The screwworm is dangerously close,” Miller said at the time. “It nearly wiped out our cattle industry before; we need to act forcefully now.”

According to the USDA’s latest data, Nuevo León has seen three cases in the outbreak, with none that are currently active. But its neighboring state, Tamaulipas, is having a flare-up, with eight animal cases considered active. The Mexican state shares a border with the southernmost portion of Texas. Mexico overall has reported 24 hospitalizations among people and 601 animal cases.

For now, the NWS has not been detected in the US, and the CDC considers the risk to people to be low.

“However, given the potential for geographic spread, CDC is issuing this Health Advisory to increase awareness of the outbreak and to summarize CDC recommendations for clinicians and health departments in the United States on case identification and reporting, specimen collection, diagnosis, and treatment of NWS, as well as guidance for the public,” the agency said.

Generally, the agency advises being on the lookout for egg masses or fly larvae in wounds or infection sites, especially if there’s destruction of living tissue or feelings of movement. Once discovered, health care workers should report the case and promptly remove and kill all larvae and eggs, preferably by drowning in a sealed, leak-proof container of 70 percent ethanol. “Failure to kill and properly dispose of all larvae or eggs could result in the new introduction and spread of NWS in the local environment,” the CDC warns in bold. At least 10 dead larvae should then be sent to the CDC for confirmation.

The USDA is currently releasing 100 million sterile male flies per week in Mexico to try to establish a new biological barrier.

This isn’t the fly’s first attempt at a US comeback since the 1960s. In 2016, the flies were somehow reintroduced to the Florida Keys, where they viciously attacked Key Deer, an endangered species and the smallest of North America’s white-tailed deer. The flies were eliminated again in 2017 using the sterile fly method.



Reports of ad-supported Xbox game streams show Microsoft’s lack of imagination

You can do better than that

That’s a moderately useful option for cloud-curious Xbox players that might not be willing to take the plunge on a monthly subscription, we suppose. But it also feels like Microsoft could come up with some more imaginative ways to use Cloud Gaming to reach occasional players in new ways.

What’s stopping Microsoft from offering streaming players a 30-minute timed demo stream of any available Xbox Cloud Gaming title—perhaps in exchange for watching a short ad, or perhaps simply as an Xbox Live Arcade-style sales juicing tactic? Or why not offer discounted access to a streaming-only Game Pass subscription for players willing to watch occasional ads, like Netflix? Microsoft could even let players spend a couple of bucks to rent a digital copy of the title for a few days, much as services like iTunes do for newer films.

Those are just a few ideas off the top of our heads. And they all feel potentially more impactful than using ads as a way to let Xbox players stream copies of games they already purchased.

Back in 2019, we noted how Stadia’s strictly buy-before-you-play streaming business model limited the appeal of what ended up as a doomed cloud-gaming experiment. Microsoft should take some lessons from Google’s failure and experiment with new ways to use streaming to reach players that might not have access to the latest high-end hardware for their gaming experiences.



The race to build a super-large ground telescope is likely down to two competitors

I have been writing about the Giant Magellan Telescope for a long time. Nearly two decades ago, for example, I wrote that time was “running out” in the race to build the next great optical telescope on the ground.

At the time the proposed telescope was one of three contenders to make a giant leap in mirror size from the roughly 10-meter diameter instruments that existed then, to approximately 30 meters. This represented a huge increase in light-gathering potential, allowing astronomers to see much further into the universe—and therefore back into time—with far greater clarity.

Since then, the projects have advanced at various rates. An international consortium to build the Thirty Meter Telescope in Hawaii ran into local protests that have bogged down development. Its future came further into question when the US National Science Foundation dropped support for the project in favor of the Giant Magellan Telescope. Meanwhile, the European Extremely Large Telescope (ELT) has advanced on a faster schedule, and this 39.3-meter telescope could observe its first light in 2029.

This leaves the Magellan telescope. Originally backers of the GMT intended it to be fully operational by now, but it has faced funding and technology challenges. It has a price tag of approximately $2 billion, and although it is smaller than the European project, the 25.4-meter telescope now represents the best avenue for US-based astronomy to remain competitive in the field.

Given all of this, I recently spoke with University of Texas at Austin astronomer Dan Jaffe, who is the new president of the telescope’s executive team, to get an update on things. Here is a lightly edited transcript of our conversation.

Ars Technica: What should we know about the Giant Magellan Telescope?

Dan Jaffe: This is going to be one of the premier next-generation optical infrared telescopes in the world. It will give the United States astronomical community access that helps us to be a leading nation in this field, inspire students to go into science and engineering, and really enrich the human experience through the new knowledge that we get about the nature of the universe. So I think it covers both this kind of aspiration that we have to enrich humanity in some way, to help foster the future economy by bringing more people into these technical fields, and also by driving technology in some areas. The kinds of work we’re doing on adaptive optics, for example, in building sensitive detector systems and spectrometers, drive the frontier of what you can do with these systems.



10 things I learned from burning myself out with AI coding agents


Opinion: As software power tools, AI agents may make people busier than ever before.

Credit: Aurich Lawson | Getty Images


If you’ve ever used a 3D printer, you may recall the wondrous feeling when you first printed something you could have never sculpted or built yourself. Download a model file, load some plastic filament, push a button, and almost like magic, a three-dimensional object appears. But the result isn’t polished and ready for mass production, and creating a novel shape requires more skills than just pushing a button. Interestingly, today’s AI coding agents feel much the same way.

Since November, I have used Claude Code and Claude Opus 4.5 through a personal Claude Max account to extensively experiment with AI-assisted software development (I have also used OpenAI’s Codex in a similar way, though not as frequently). Fifty projects later, I’ll be frank: I have not had this much fun with a computer since I learned BASIC on my Apple II Plus when I was 9 years old. This opinion comes not as an endorsement but as personal experience: I voluntarily undertook this project, and I paid out of pocket for both OpenAI’s and Anthropic’s premium AI plans.

Throughout my life, I have dabbled in programming as a utilitarian coder, writing small tools or scripts when needed. In my web development career, I wrote some small tools from scratch, but I primarily modified other people’s code for my needs. Since 1990, I’ve programmed in BASIC, C, Visual Basic, PHP, ASP, Perl, Python, Ruby, MUSHcode, and some others. I am not an expert in any of these languages—I learned just enough to get the job done. I have developed my own hobby games over the years using BASIC, Torque Game Engine, and Godot, so I have some idea of what makes a good architecture for a modular program that can be expanded over time.


In December, I used Claude Code to create a multiplayer online clone of Katamari Damacy called “Christmas Roll-Up.” Credit: Benj Edwards

Claude Code, Codex, and Google’s Gemini CLI can seemingly perform software miracles on a small scale. They can spit out flashy prototypes of simple applications, user interfaces, and even games, but only as long as they borrow patterns from their training data. As with a 3D printer, doing production-level work takes far more effort. Creating durable production code, managing a complex project, or crafting something truly novel still requires experience, patience, and skill beyond what today’s AI agents can provide on their own.

And yet these tools have opened a world of creative potential in software that was previously closed to me, and they feel personally empowering. Even with that impression, though, I know these are hobby projects, and the limitations of coding agents lead me to believe that veteran software developers probably shouldn’t fear losing their jobs to these tools any time soon. In fact, they may become busier than ever.

So far, I have created over 50 demo projects in the past two months, fueled in part by a bout of COVID that left me bedridden with a laptop and a generous 2x Claude usage cap that Anthropic put in place during the last few weeks of December. As I typed furiously all day, my wife kept asking me, “Who are you talking to?”

You can see a few of the more interesting results listed on my personal website. Here are 10 interesting things I’ve learned from the process.

1. People are still necessary

Even with the best AI coding agents available today, humans remain essential to the software development process. Experienced human software developers bring judgment, creativity, and domain knowledge that AI models lack. They know how to architect systems for long-term maintainability, how to balance technical debt against feature velocity, and when to push back when requirements don’t make sense.

For hobby projects like mine, I can get away with a lot of sloppiness. But for production work, having someone who understands version control, incremental backups, testing one feature at a time, and debugging complex interactions between systems makes all the difference. Knowing something about how good software development works helps a lot when guiding an AI coding agent—the tool amplifies your existing knowledge rather than replacing it.

As independent AI researcher Simon Willison wrote in a post distinguishing serious AI-assisted development from casual “vibe coding,” “AI tools amplify existing expertise. The more skills and experience you have as a software engineer the faster and better the results you can get from working with LLMs and coding agents.”

With AI assistance, you don’t have to remember how to do everything. You just need to know what you want to do.

Card Miner: Heart of the Earth is entirely human-designed, but it was AI-coded using Claude Code. It represents about a month of iterative work. Credit: Benj Edwards

So I like to remind myself that coding agents are software tools best used to enact human ideas, not autonomous coding employees. They are not people (and not people replacements) no matter how the companies behind them might market them.

If you think about it, everything you do on a computer was once a manual process. Programming a computer like the ENIAC involved literally making physical bits (connections) with wire on a plugboard. The history of programming has been one of increasing automation, so even though this AI-assisted leap is somewhat startling, one could think of these tools as an advancement similar to the advent of high-level languages, automated compilers and debugger tools, or GUI-based IDEs. They can automate many tasks, but managing the overarching project scope still falls to the person telling the tool what to do.

And they can have rapidly compounding benefits. I’ve now used AI tools to write better tools—such as changing the source of an emulator so a coding agent can use it directly—and those improved tools are already having ripple effects. But a human must be in the loop for the best execution of my vision. This approach has kept me very busy, and contrary to some prevailing fears about people becoming dumber due to AI, I have learned many new things along the way.

2. AI models are brittle beyond their training data

Like all AI models based on the Transformer architecture, the large language models (LLMs) that underpin today’s coding agents have a significant limitation: They can only reliably apply knowledge gleaned from training data, and they have a limited ability to generalize that knowledge to novel domains not represented in that data.

What is training data? In this case, when building coding-flavored LLMs, AI companies download millions of examples of software code from sources like GitHub and use them to make the AI models. Companies later specialize them for coding through fine-tuning processes.

The ability of AI agents to use trial and error—attempting something and then trying again—helps mitigate the brittleness of LLMs somewhat. But it’s not perfect, and it can be frustrating to see a coding agent spin its wheels trying and failing at a task repeatedly, either because it doesn’t know how to do it or because it previously learned how to solve a problem but then forgot because the context window got compacted (more on that here).

Violent Checkers is a physics-based corruption of the classic board game, coded using Claude Code. Credit: Benj Edwards

To get around this, it helps to have the AI model take copious notes as it goes along about how it solved certain problems so that future instances of the agent can learn from them. You also want to set ground rules in the CLAUDE.md file that the agent reads when it begins its session.
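As a rough illustration of that note-taking habit, here is a minimal Python sketch of a helper that appends a dated "lesson learned" entry to a Markdown notes file, which an agent could then be told to read at the start of a session. The file name `NOTES.md`, the function name `record_lesson`, and the entry layout are all hypothetical; none of this is part of Claude Code itself.

```python
from datetime import date
from pathlib import Path


def record_lesson(notes_path, problem, solution, today=None):
    """Append one dated lesson entry to a Markdown notes file and return it."""
    # Allow the date to be injected so the output is reproducible.
    stamp = (today or date.today()).isoformat()
    entry = (
        f"## {stamp}: {problem}\n\n"
        f"{solution}\n\n"
    )
    path = Path(notes_path)
    # Open in append mode so earlier lessons are never overwritten.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

Calling something like `record_lesson("NOTES.md", "Sprites flicker on scroll", "Update the display list during vertical blank.")` after each fix would build up a running log. The example problem and solution text here are placeholders, not details from the projects described above.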

This brittleness means that coding agents are almost frighteningly good at what they’ve been trained and fine-tuned on—modern programming languages, JavaScript, HTML, and similar well-represented technologies—and generally terrible at tasks on which they have not been deeply trained, such as 6502 Assembly or programming an Atari 800 game with authentic-looking character graphics.

It took me five minutes to make a nice HTML5 demo with Claude but a week of torturous trial and error, plus actual systematic design on my part, to make a similar demo of an Atari 800 game. To do so, I had to use Claude Code to invent several tools, like command-line emulators and MCP servers, that allow it to peek into the operation of the Atari 800’s memory and chipset to even begin to make it happen.

3. True novelty can be an uphill battle

Due to what might poetically be called “preconceived notions” baked into a coding model’s neural network (more technically, statistical semantic associations), it can be difficult to get AI agents to create truly novel things, even if you carefully spell out what you want.

For example, I spent four days trying to get Claude Code to create an Atari 800 version of my HTML game Violent Checkers, but it had trouble because in the game’s design, the squares on the checkerboard don’t matter beyond their starting positions. No matter how many times I told the agent (and made notes in my Claude project files), it would come back to trying to center the pieces to the squares, snap them within squares, or use the squares as a logical basis of the game’s calculations when they should really just form a background image.

To get around this in the Atari 800 version, I started over and told Claude that I was creating a game with a UFO (instead of a circular checker piece) flying over a field of adjacent squares—never once mentioning the words “checker,” “checkerboard,” or “checkers.” With that approach, I got the results I wanted.
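The design constraint itself is simple to express in code. As a minimal Python sketch (not the game's actual code), the grid below is used only to compute starting positions; after that, pieces move as free-floating bodies with continuous coordinates and are never snapped to squares. The names `Piece`, `starting_positions`, and `step`, along with the square size, the choice of which squares get pieces, and the friction value, are illustrative assumptions.

```python
from dataclasses import dataclass

SQUARE = 40.0  # assumed pixel size of one background square


@dataclass
class Piece:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0


def starting_positions(rows, cols):
    """Use the grid once: place a piece at the center of every other square."""
    return [
        Piece((c + 0.5) * SQUARE, (r + 0.5) * SQUARE)
        for r in range(rows)
        for c in range(cols)
        if (r + c) % 2 == 1
    ]


def step(piece, dt, friction=0.5):
    """Advance one piece by dt seconds; no snapping or centering on squares."""
    piece.x += piece.vx * dt
    piece.y += piece.vy * dt
    # Simple linear damping so a shoved piece gradually slows down.
    damping = max(0.0, 1.0 - friction * dt)
    piece.vx *= damping
    piece.vy *= damping
    return piece
```

Nothing in `step` refers to `SQUARE`, which is the whole point: once the pieces are placed, the board is only a picture behind them.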

A screenshot of Benj’s Mac while working on a Violent Checkers port for the Atari 800 home computer, amid other projects. Credit: Benj Edwards

Why does this matter? Because with LLMs, context is everything, and in language, context changes meaning. Take the word “bank” and add the words “river” or “central” in front of it, and see how the meaning changes. In a way, words act as addresses that unlock the semantic relationships encoded in a neural network. So if you put “checkerboard” and “game” in the context, the model’s self-attention process links up a massive web of semantic associations about how checkers games should work, and that semantic baggage throws things off.

A couple of tricks can help AI coders navigate around these limitations. First, avoid contaminating the context with irrelevant information. Second, when the agent gets stuck, try this prompt: “What information do you need that would let you implement this perfectly right now? What tools are available to you that you could use to discover that information systematically without guessing?” This forces the agent to identify (semantically link up) its own knowledge gaps, spelled out in the context window and subject to future action, instead of flailing around blindly.

4. The 90 percent problem

The first 90 percent of an AI coding project comes in fast and amazes you. The last 10 percent involves tediously filling in the details through back-and-forth trial-and-error conversation with the agent. Tasks that require deeper insight or understanding than what the agent can provide still require humans to make the connections and guide it in the right direction. The limitations we discussed above can also cause your project to hit a brick wall.

From what I have observed over the years, larger LLMs can potentially make deeper contextual connections than smaller ones. They have more parameters (encoded data points), and those parameters are linked in more multidimensional ways, so they tend to have a deeper map of semantic relationships. As deep as those go, it seems that human brains still have an even deeper grasp of semantic connections and can make wild semantic jumps that LLMs tend not to.

Creativity, in this sense, may be when you jump from, say, basketball to how bubbles form in soap film and somehow make a useful connection that leads to a breakthrough. Instead, LLMs tend to follow conventional semantic paths that are more conservative and entirely guided by mapped-out relationships from the training data. That limits their creative potential unless the prompter unlocks it by guiding the LLM to make novel semantic connections. That takes skill and creativity on the part of the operator, which once again shows the role of LLMs as tools used by humans rather than independent thinking machines.

5. Feature creep becomes irresistible

While creating software with AI coding tools, the joy of experiencing novelty makes you want to keep adding interesting new features rather than fixing bugs or perfecting existing systems. And Claude (or Codex) is happy to oblige, churning away at new ideas that are easy to sketch out in a quick and pleasing demo (the 90 percent problem again) rather than polishing the code.

Flip-Lash started as a “Tetris but you can flip the board,” but feature creep made me throw in the kitchen sink, losing focus. Credit: Benj Edwards

Fixing bugs can also create bugs elsewhere. This is not new to coding agents; it’s a time-honored problem in software development. But agents supercharge this phenomenon because they can barrel through your code and make sweeping changes in pursuit of narrow-minded goals that affect lots of working systems. We’ve already talked above about the importance of a good architecture guided by the human mind behind the wheel, and that comes into play here.

6. AGI is not here yet

Given the limitations I’ve described above, it’s very clear that an AI model with general intelligence—what people usually call artificial general intelligence (AGI)—is still not here. AGI would hypothetically be able to navigate around baked-in stereotype associations and not have to rely on explicit training or fine-tuning on many examples to get things right. AI companies will probably need a different architecture in the future.

I’m speculating, but AGI would likely need to learn permanently on the fly—as in modify its own neural network weights—instead of relying on what is called “in-context learning,” which only persists until the context fills up and gets compacted or wiped out.

Grapheeti is a “drawing MMO” where people around the world share a canvas. Credit: Benj Edwards

In other words, you could teach a true AGI system how to do something by explanation or let it learn by doing, noting successes, and having those lessons permanently stick, no matter what is in the context window. Today’s coding agents can’t do that—they forget lessons from earlier in a long session or between sessions unless you manually document everything for them. My favorite trick is instructing them to write a long, detailed report on what happened when a bug is fixed. That way, you can point to the hard-earned solution the next time the amnestic AI model makes the same mistake.

7. Even fast isn’t fast enough

After using Claude Code for a while, it’s easy to take for granted that you suddenly have the power to create software without knowing certain programming languages. This is amazing at first, but you can quickly become frustrated that what is conventionally a very fast development process isn’t fast enough. Impatience at the coding machine sets in, and you start wanting more.

But even if you do know the programming languages being used, you don’t get a free pass. You still need to make key decisions about how the project will unfold. And when the agent gets stuck or makes a mess of things, your programming knowledge becomes essential for diagnosing what went wrong and steering it back on course.

8. People may become busier than ever

After guiding way too many hobby projects through Claude Code over the past two months, I’m starting to think that most people won’t become unemployed due to AI—they will become busier than ever. Power tools allow more work to be done in less time, and the economy will demand more productivity to match.

It’s almost too easy to make new software, in fact, and that can be exhausting. One project idea would lead to another, and I was soon spending eight hours a day during my winter vacation shepherding about 15 Claude Code projects at once. That’s too much split attention for good results, but the novelty of seeing my ideas come to life was addictive. In addition to the game ideas I’ve mentioned here, I made tools that scrape and search my past articles, a graphical MUD based on ZZT, a new type of MUSH (text game) that uses AI-generated rooms, a new type of Telnet display proxy, and a Claude Code client for the Apple II (more on that soon). I also put two AI-enabled emulators for Apple II and Atari 800 on GitHub. Phew.

Consider the advent of the steam shovel, which allowed humans to dig holes faster than a team using hand shovels. It made existing projects faster and new projects possible. But think about the human operator of the steam shovel. Suddenly, we had a tireless tool that could work 24 hours a day if fueled up and maintained properly, while the human piloting it would need to eat, sleep, and rest.

I used Claude Code to create a windowing GUI simulation of the Mac that works over Telnet. Credit: Benj Edwards

In fact, we may end up needing new protections for human knowledge workers using these tireless information engines to implement their ideas, much as unions rose as a response to industrial production lines over 100 years ago. Humans need rest, even when machines don’t.

Will an AI system ever replace the human role here? Even if AI coding agents could eventually work fully autonomously, I don’t think they’ll replace humans entirely because there will still be people who want to get things done, and new AI power tools will emerge to help them do it.

9. Fast is scary to people

AI coding tools can turn what was once a year-long personal project into a five-minute session. I fed Claude Code a photo of a two-player Tetris game I sketched in a notebook back in 2008, and it produced a working prototype in minutes (prompt: “create a fully-featured web game with sound effects based on this diagram”). That’s wild, and even though the results are imperfect, it’s a bit frightening to comprehend what kind of sea change in software development this might entail.

Since early December, I’ve been posting some of my more amusing experimental AI-coded projects to Bluesky for people to try out, but I discovered I needed to deliberately slow down with updates because they came too fast for people to absorb (and too fast for me to fully test). I’ve also received comments like “I’m worried you’re using AI, you’re making games too fast” and so on.

Benj’s handwritten game design note about a two-player Tetris concept from 2007. Credit: Benj Edwards

Regardless of my own habits, the flow of new software will not slow down. There will soon be a seemingly endless supply of AI-augmented media (games, movies, images, books), and that’s a problem we’ll have to figure out how to deal with. These products won’t all be “AI slop,” either; some will be done very well, and the acceleration in production times due to these new power tools will balloon the quantity beyond anything we’ve seen.

Social media tends to prime people to believe that AI is all good or all bad, but that kind of black-and-white thinking may be the easy way out. You’ll have no cognitive dissonance, but you’ll miss a far richer third option: seeing these tools as imperfect and deserving of critique but also as useful and empowering when they bring your ideas to life.

AI agents should be considered tools, not entities or employees, and they should be amplifiers of human ideas. My game-in-progress Card Miner is entirely my own high-level creative design work, but the AI model handled the low-level code. I am still proud of it as an expression of my personal ideas, and it would not exist without AI coding agents.

10. These tools aren’t going away

For now, at least, coding agents remain very much tools in the hands of people who want to build things. The question is whether humans will learn to wield these new tools effectively to empower themselves. Based on two months of intensive experimentation, I’d say the answer is a qualified yes, with plenty of caveats.

We also have social issues to face: Professional developers already use these tools, and with the prevailing stigma against AI tools in some online communities, many software developers and the platforms that host their work will face difficult decisions.

Ultimately, I don’t think AI tools will make human software designers obsolete. Instead, they may well help those designers become more capable. This isn’t new, of course; tools of every kind have been serving this role since long before the dawn of recorded history. The best tools amplify human capability while keeping a person behind the wheel. The 3D printer analogy holds: amazingly fast results are possible, but mastery still takes time, skill, and a lot of patience with the machine.

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

Rackspace customers grapple with “devastating” email hosting price hike

“We had really good reseller pricing that we negotiated with Rackspace due to the number of mailboxes we had with them and how long we had been a customer. All of that seemed to vanish when they notified us of their new pricing,” he said.

Ars contacted Rackspace asking about the 706 percent price hike that Laughing Squid says it’s facing, why Rackspace decided to increase its prices now, and why it didn’t give its partners more advance notice. A company spokesperson responded, saying:

Rackspace Email is a reliable and secure business-class email solution for small businesses. To continue delivering the service levels our customers expect, effective March 2026, Rackspace Technology is increasing the price of Rackspace Email. We have a support team available to help our customers to discuss their options.

The spokesperson added that Rackspace’s “mission is to deliver quality, trusted and reliable hosted email solution for businesses.”

Email hosting is a tough business

Despite Rackspace’s stated commitment to email hosting, the prohibitive pricing seems like a deterrent for a business being viewed as high-effort and low-margin. Email has grown complex over the years, requiring time and expertise for proper management at scale. It’s become simpler, or more lucrative, for some cloud companies to focus on selling their managed services on top of offerings like Microsoft 365—as Rackspace does—or Google Workspace and let the larger companies behind those solutions deal with infrastructure costs and complexities.

Rackspace’s price hike also comes as an AI-driven RAM shortage is impacting the availability and affordability of other computing components, including storage.

With Rackspace, which went public in 2020, also having quit hosting Microsoft Exchange following a costly 2022 ransomware attack, the Texas-headquartered company may be looking to minimize its email hosting duties as much as possible.

Meanwhile, Laughing Squid is increasing prices for Rackspace mailboxes and offering services with a different email provider, PolarisMail, to customers at lower prices. Beale said he has reached out to Rackspace about the new pricing but hasn’t heard back yet.

Rocket Report: Ariane 64 to debut soon; India has a Falcon 9 clone too?


All the news that’s fit to lift

“We are fundamentally shifting our approach to securing our munitions supply chain.”

SpaceX launched the Pandora satellite for NASA on Sunday. Credit: SpaceX

Welcome to Edition 8.25 of the Rocket Report! All eyes are on Florida this weekend as NASA rolls out the Space Launch System rocket and Orion spacecraft to its launch site in Florida for the Artemis II mission. NASA has not announced a launch date yet, and this will depend in part on how well a “wet dress rehearsal” goes with fueling the rocket. However, it is likely the rocket has a no-earlier-than launch date of February 8. Our own Stephen Clark will be in Florida for the rollout on Saturday, so be sure and check back here for coverage.

As always, we welcome reader submissions, and if you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets as well as a quick look ahead at the next three launches on the calendar.

MaiaSpace scores a major launch deal. The ArianeGroup subsidiary, created in 2022, has inked a major new launch contract with satellite operator Eutelsat, Le Monde reports. A significant portion of the 440 new satellites ordered by Eutelsat from Airbus to renew or expand its OneWeb constellation will be launched into orbit by the new Maia rocket. MaiaSpace previously signed two contracts: one with Exotrail for the launch of an orbital transfer vehicle, and the other for two satellites for the Toutatis mission, a defense system developed by U-Space.

A big win for the French firm … The first test launch of Maia is scheduled for the end of 2026, a year later than initially planned, at the Guiana Space Centre in French Guiana. The first flights carrying OneWeb satellites are therefore likely to launch no earlier than 2027. Powered by liquid oxygen-methane propellant, Maia aims to be able to deliver up to 500 kg to low-Earth orbit when the first stage is recovered, and 1,500 kg when fully expendable.

Firefly announces Alpha upgrade plan. Firefly Aerospace said this week it was planning a “Block II” upgrade to its Alpha rocket that will “focus on enhancing reliability, streamlining producibility, and improving launch operations to further support commercial, civil, and national security mission demand.” Firefly’s upcoming Alpha Flight 7, targeted to launch in the coming weeks, will be the last flown in the current configuration and will serve as a test flight with multiple Block II subsystems in shadow mode.

Too many failures … “Firefly worked closely with customers and incorporated data and lessons learned from our first six Alpha launches and hundreds of hardware tests to make upgrades that increase reliability and manufacturability with consolidated parts, key configuration updates, and stronger structures built with automated machinery,” said Jason Kim, CEO of Firefly Aerospace. Speaking bluntly, reliability upgrades are needed. Of Alpha’s six launches to date, only two have been a complete success. (submitted by TFargo04)

Another PSLV launch failure. India’s first launch of 2026 ended in failure due to an issue with the third stage of its Polar Satellite Launch Vehicle (PSLV), Spaceflight Now reports. The mission, designated PSLV-C62, was also the second consecutive failure of this four-stage rocket, with both anomalies affecting the third stage. This time, 16 satellites were lost, including those of other nations. ISRO said it initiated a “detailed analysis” to determine the root cause of the anomaly.

Has been India’s workhorse rocket … The four-stage launch vehicle is a mixture of solid- and liquid-fueled stages. Both the first and third stages are solid-fueled, while the second and fourth stages are powered by liquid propulsion. The PSLV rocket has flown in multiple configurations since it debuted in September 1993 and achieved 58 fully successful launches, with the payloads on those missions reaching their intended orbit.

US military invests in L3Harris rocket motors. The US government will invest $1 billion in L3Harris Technologies’ growing rocket motor business, guaranteeing a steady supply of the much-needed motors used in a wide range of missiles such as Tomahawks and Patriot interceptors, CNBC reports. L3Harris said on Tuesday it is planning an IPO of its growing rocket motor business into a new publicly traded company backed by a $1 billion government convertible security investment. The securities will automatically convert to common equity when the company goes public later in 2026.

Shifting investment strategy … “We are fundamentally shifting our approach to securing our munitions supply chain,” said Michael Duffey, undersecretary of defense for acquisition and sustainment. “By investing directly in suppliers we are building the resilient industrial base needed for the Arsenal of Freedom.” However, the government’s equity position in L3Harris could face blowback from L3Harris’ rivals, given that it creates a potentially significant conflict of interest for the US government. The Pentagon will have an ownership stake in a company that regularly bids on major defense and other government contracts.

First Ariane 64 to launch next month. Arianespace announced Thursday that it plans to launch the first variant of the Ariane 6 rocket with four solid rocket boosters on February 12 from French Guiana. The mission will also be the company’s first launch of Amazon Leo (formerly Project Kuiper) satellites. This is the first of 18 Ariane 6 launches that Arianespace sold to Amazon for the broadband communications megaconstellation.

A growing cadence … The Ariane 6 rocket has launched five times, including its debut flight in July 2024. All of the launches were a success, although the first flight failed to relight the upper stage in order to make a controlled reentry. Arianespace increased the cadence to four launches last year and will seek to double that this year.

Falcon 9 launches the Pandora mission. NASA’s Pandora satellite rocketed into orbit early Sunday from Vandenberg Space Force Base, California, Ars reports. It hitched a ride with around 40 other small payloads aboard a SpaceX Falcon 9 rocket, launching into a polar Sun-synchronous orbit before deploying at an altitude of roughly 380 miles (613 kilometers).

A satellite that can carry a tune … Pandora will augment the capabilities of NASA’s James Webb Space Telescope. Over the next few weeks, ground controllers will put Pandora through a series of commissioning and calibration steps before turning its eyes toward deep space. From low-Earth orbit, Pandora will observe exoplanets and their stars simultaneously, allowing astronomers to correct their measurements of the planet’s atmospheric composition and structure based on the ever-changing conditions of the host star itself.

ArianeGroup seeking ideas for Ariane 6 reuse. In this week’s newsletter, we’ve already had a story about MaiaSpace and another item about the Ariane 6 rocket. So why not combine the two and also have a report about an Ariane 6 mashup with the Maia rocket? As it turns out, there’s a relatively new proposal to retrofit the existing Ariane 6 rocket design for partial reuse with Maia rockets as side boosters, Ars reports.

Sir, maia I have some cost savings? … It’s infeasible to recover the Ariane 6’s core stage for many reasons. Chief among them is that the main stage burns for more than seven minutes on an Ariane 6 flight, reaching speeds about twice as fast as SpaceX’s Falcon 9 booster achieves during its two-and-a-half minutes of operation during launch. Swapping out Ariane 6’s solid rocket motors for reusable liquid boosters makes some economic sense for ArianeGroup. The proposal would bring the development and production of the boosters under full control of ArianeGroup and its French subsidiary, cutting Italy’s solid rocket motor developer, Avio, out of the program. All the same, we’ll believe this when we see it.

Meet the EtherealX Razor Crest Mk-1. I learned that there is a rocket company founded in Bengaluru, India, named Ethereal Exploration Guild, or EtherealX. (Did you see what they did there?) I found this out because the company announced (via email) that it had raised an oversubscribed $20.5 million Series A round led by TDK Ventures and BIG Capital. So naturally, I went to the EtherealX website looking for more information.

Let me say, I was not disappointed … As you might expect from a company named EtherealX, its proposed rocket has nine engines, is powered by liquid oxygen and kerosene, and has a maximum capacity of 24.8 metric tons to low-Earth orbit. (Did you see what they did there?) The website does not include much information, but there is this banger of a statement: “The EtherealX Razor Crest Mk-1 will house 9 of the most powerful operational liquid rocket engines in Asia, Europe, Australia, Africa, South America, and Antarctica – Stallion.” And let’s be honest, when you’ve bested Antarctica in engine development, you know you’re cooking. Alas, what I did not see on the website was much evidence of real hardware.

NASA topples historic Saturn and shuttle infrastructure. Two historic NASA test facilities used in the development of the Saturn V and space shuttle launch vehicles have been demolished after towering over the Marshall Space Flight Center in Alabama since the start of the Space Age, Ars reports. The Propulsion and Structural Test Facility, which was erected in 1957—the same year the first artificial satellite entered Earth orbit—and the Dynamic Test Facility, which has stood since 1964, were brought down by a coordinated series of implosions on Saturday, January 10.

Out with the old, in with the new … Located in Marshall’s East Test Area on the US Army’s Redstone Arsenal in Huntsville, the two structures were no longer in use and, according to NASA, had a backlog of $25 million in needed repairs. “This work reflects smart stewardship of taxpayer resources,” Jared Isaacman, NASA administrator, said in a statement. “Clearing outdated infrastructure allows NASA to safely modernize, streamline operations and fully leverage the infrastructure investments signed into law by President Trump to keep Marshall positioned at the forefront of aerospace innovation.”

Space Force swaps Vulcan for Falcon 9. The next Global Positioning System satellite is switching from a United Launch Alliance Vulcan rocket to a SpaceX Falcon 9, a spokesperson for the US Space Force Space Systems Command System Delta 80 said Tuesday, Spaceflight Now reports. SpaceX could launch the GPS III Space Vehicle 09 (SV09) within the next few weeks, as the satellite was entering the final stages of pre-flight preparations.

The trade is logical … SV09 was originally awarded to ULA as part of order-year five of the National Security Space Launch (NSSL) Phase 2 contract, which was announced on October 31, 2023. This isn’t the first time that the Space Force has shuffled timelines and switched launch providers for GPS missions. In May 2025, SpaceX launched the GPS III SV08 spacecraft, which was originally assigned to ULA in June 2023. In exchange, ULA was given the SV11 launch, which would have flown on a Falcon Heavy rocket. The changes have been driven largely by repeated delays in Vulcan readiness.

Next three launches

January 16: Long March 3B | Unknown payload | Xichang Satellite Launch Center, China | 16:55 UTC

January 17: Ceres 2 | Demo flight | Jiuquan Satellite Launch Center, China | 04:05 UTC

January 17: Falcon 9 | NROL-105 | Vandenberg Space Force Base, Calif. | 06:18 UTC


Eric Berger is the senior space editor at Ars Technica, covering everything from astronomy to private space to NASA policy, and author of two books: Liftoff, about the rise of SpaceX; and Reentry, on the development of the Falcon 9 rocket and Dragon. A certified meteorologist, Eric lives in Houston.


monthly-roundup-#38:-january-2026

Monthly Roundup #38: January 2026

Good news, we managed to make some cuts. I think?

  1. California In Crisis.

  2. Bad News.

  3. Opportunity Knocks.

  4. Government Working.

  5. The Efficient Market Hypothesis Has Thoughts.

  6. No All That Money Doesn’t Go To Pay Interest.

  7. While I Cannot Condone This.

  8. Burnout.

  9. Good News, Everyone.

  10. Good Advice.

  11. For Your Entertainment.

  12. Gamers Gonna Game Game Game Game Game.

  13. Sports Go Sports.

  14. Antisocial Media.

I’ve written about this before, but it turns out it’s even worse than I realized.

California is toying with a 1.5% annual wealth tax on billionaires, sufficiently seriously that Larry Page, Sergey Brin and Peter Thiel have left the state as a precaution.

Teddy Schleifer: NEWS:

This would also include an annual 1% wealth tax on anyone over $50 million, including on illiquid unrealized startup equity. That’s what takes this from ‘deeply foolish and greedy idea that will backfire’ to ‘intentionally trying to blow everything up to watch it burn.’

Garry Tan points out that as written the law would treat any supervoting shares as if they had economic value equal to their voting rights, which means any founders with such shares are effectively banned from the state. There are some other interesting cases, most notably Mark Zuckerberg.

Garry Tan is one of those with a history of crying wolf, but in this case? Wolf.

This one is really scary. Chances of this passing are up to 53%, and the ‘will this be on the ballot’ question is only at 61%, which implies that if this does make it to a vote then it will probably pass. Again, California needs to end propositions, period.
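To make that inference explicit, here is a minimal sketch. It assumes the measure can only pass if it reaches the ballot, and it treats the two market prices as probabilities, both of which are simplifications.

```python
# Market-implied numbers quoted above, treated as probabilities (an assumption).
p_pass = 0.53        # chance the measure passes
p_on_ballot = 0.61   # chance the measure makes the ballot

# Assumption: passing requires being on the ballot, so
# P(pass) = P(on ballot) * P(pass | on ballot).
p_pass_given_ballot = p_pass / p_on_ballot

print(f"Implied P(pass | on ballot): {p_pass_given_ballot:.0%}")
```

Under those assumptions the implied chance of passing, conditional on reaching a vote, comes out to roughly 87%.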

I presume that, if implemented, this would force the entire startup ecosystem, and likely all of tech, to flee the state. California is nice, but it’s not this nice.

They were forced to do this, since the proposal backdates the tax. Once you open that door, it’s time to leave. Even if this proposal fails, what about the next proposal? Or will everyone act like they do with AI risks, and say ‘well things are fine so far’ and put their heads back into the sand?

In a sane world this would be the death of California’s ballot proposal system.

The audacity of the lies around this one stood out to several who don’t say ‘lying’ lightly.

Kelsey Piper: When I said this tax was a terrible idea a bunch of people smugly flocked to tell me how, since it was retroactive, there’d be no risk of billionaires moving to avoid it. But instead what this means is that the billionaires move even before we know whether it makes the ballot!

I remember people telling me that there was not typically very much capital flight in response to a modest increase in income tax rates and therefore we could be sure that there wouldn’t be any from a much much larger and less precedented tax.

There’s this specific kind of lying that is endemic on the policy left, where you make absolutely insane and obviously false claims but ground them by linking a paper to a very different situation where no one was able to detect much of an effect.

The lying on the right is a huge problem and I would say much worse, though usually slightly different in character. They just make stuff up, while libs will do the ‘link a study that doesn’t say that’ thing.

Patrick McKenzie: One is welcome to remember this for the next round of this game, since advocates certainly will not.

“But were they lying to us in the current round?”

Yes, obviously. YMMV on whether that should cost them points with you and yours.

Myself I favor an epistemic stance like “If one inadvertently says an untrue thing which is core to one’s argument one, on learning it was untrue, admits that and accepts a modest amount of egg on face. Orgs which do not embrace protocol get performance to contract, not trust.”

“Patrick you used the word ‘lying.’”

I did.

“You do not frequently deploy the word ‘lying.’”

I don’t.

… “Would you ever countenance a lie?”

I do like the formulation that a Catholic priest relayed to me when I was approximately seven: “Lies which offend God are sins. Not all lies offend God. You can reason and read about it more when you’re older.”

Things I feel very much and are worth a read: Jennifer Chen, my erstwhile contractor at Balsa Research, enters her Misanthropy Era.

Ubers and Lyfts are so expensive in substantial part because of a requirement for $1 million in insurance on all rides, in turn giving rise to fraud rings making a majority of the claims. In California, New York and New Jersey that includes $1 million in ‘uninsured motorist’ coverage, and therefore insurance takes up 30% of the cost of the ride, which seems obviously nuts.

Much of the gender pay gap is about the need to avoid sexual harassment and other hostile work environments?

Manuela Collins (from her paper): Individuals are willing to forgo a significant portion of their earnings—between 12% and 36% of their wage—to avoid hostile work environments, valuations substantially exceeding those for remote work (7 percent).

… Using counterfactual exercises, we find that gender differences in risk of workplace hostility drive both the remote pay penalty and office workers’ rents.

Inkhaven will return this April. That’s a residency at Lighthaven, where you get mentored by various writers (myself not included), and if you don’t post every day you have to leave. It costs $3,500 for admission plus housing, and the retrospectives and feedback look good. I think it’s pretty neat, so if you’re a good fit consider going.

Patrick Collison and Tyler Cowen put out a call for new aesthetics, especially in architecture but open to all mediums, with grant sizes of $5k-$250k. What should the future look like?

I for one would like to put out a call for past aesthetics. I’m not saying past aesthetics were optimal, but today’s aesthetics suck and are worse. Past ones didn’t suck. So until we can come up with something better, how about we do more of that past stuff?

Federal Reserve chairman Jerome Powell asserts that he is facing threat of criminal indictment due to retaliation over his refusal to let Donald Trump dictate interest rates. A statement of support for Powell and condemning the criminal inquiry was signed by Ben Bernanke, Jared Bernstein, Jason Furman, Timothy Geithner, Alan Greenspan, Jacob Lew, Gregory Mankiw, Janet Yellen and others. It is hard to come up with an alternative hypothesis on the nature of this prosecution.

Senator Thom Tillis pledges to oppose the confirmation of all Fed nominees while this matter is pending. He serves on the Banking Committee, which is currently split 13-11. If no one is confirmed, then Powell would remain chair.

This means that, unless you can come up with another explanation for why you would attempt to prosecute Jerome Powell over (checks notes) statements to Congress regarding a building renovation, not only is Donald Trump trying to destroy the independence of the Federal Reserve, he is also very clearly trumping up charges against those he thinks are standing in his way, as per his explicit other communications.

For those who need a reminder, if you cap credit card interest rates at 10%, that forces banks to severely restrict credit card access and make up that revenue in other ways, many consumers will be forced to use alternative mechanisms that often charge more, and we should expect consumers as a group to be a lot worse off. Don’t do this.

The UK sends one soldier to defend Greenland.

​Barbara Tuchman (from The Guns of August, this is in 1910):

“What is the smallest British military force that would be of any practical assistance to you?” Wilson asked.

Like a rapier flash came Foch’s reply, “A single British soldier—and we will see to it that he is killed.”

Also, in terms of banning institutional ownership of homes, institutions own maybe 1% of single family homes, institutions of any size only hold 7%, the three institutions named as owning ‘everything’ by RFK Jr. in this context don’t directly own them at all (they own some interest in homes via REITs) and most definitely do not want to ‘own every single family home in our country.’ Banning institutions from owning such homes will make it harder to build or rent houses and it will generally make things worse.

If you are trying to figure out whether you should be happy ‘as a utilitarian’ with the United States taking out the de facto leader of Venezuela, contra Tyler Cowen you cannot only ask the question of whether interventions in this reference class lead to superior results in that particular country. The obvious first thing is you don’t know that you can expect results similar to the reference class, and the second is that counterfactuals are, as Tyler admits, very difficult to assess.

Even setting all that aside, this is the wrong question. You cannot only ask ‘does this improve the likely outcome for Venezuela?’ which requires considering the details of the situation and path chosen. You instead have to consider whether the decision algorithm that leads to such a removal leads to a better world overall, or at least you must consider the impact of this decision on all actors worldwide.

What Tyler Cowen is doing here is exactly the type of direct-consequence under-considered act utilitarianism that leads to problems like Sam Bankman-Fried.

So when Tyler asks ‘effective altruists, are you paying attention?’ to the fact that the direct consequences seem to Tyler to be positive, is he saying ‘you should be doing or trying to induce more immoral unconstitutional actions that are good on a direct outcome act utilitarian basis’ or is he (one might hope) saying ‘notice that you need to have virtue ethics or deontology, you make this kind of mistake all the time’? Or is he trying to make an ‘EA case for Trump’ of some kind or simply score points in some sense? I honestly can’t tell.

But no, I say, if you think this was an immoral unconstitutional action, then you should not approve of it, for that reason alone. That seems pretty simple to me. I certainly hope no one is making the case that taking immoral unconstitutional actions is a good idea so long as they produce a particular outcome that you like?

US government is planning to require tourists from over 40 countries to hand over 5 years of social media history, all email addresses and phone numbers used in the last 5 years, and the names and addresses of family members. A massive unforced error.

Your periodic reminder that many San Francisco programs that spend quite a lot of money cannot be explained as anything other than grift, and that nonprofits that benefit from this grift actively suppress attempts to measure their effectiveness. The example here from Austen Allred is that there is a program that costs $5 million a year, which housed 20 homeless alcoholics and served them alcohol with no attempt to get them to quit drinking. Do the math.

99.8% of Federal Employees Get Good Performance Reviews. The exception was the person in charge of performance reviews.

You’ll be able to fly without a REAL ID, but it will cost you $45. Grift ahoy.

Matt Levine points out that for most stocks it is hard to tell if they are likely to go up or down, but there are some stocks that a lot of investors think are hot garbage, sufficiently so that they have substantial borrow costs, and in general shorting them pays out about equal returns to the borrow cost, so presumably you don’t want to own them, especially if you’re not being paid the borrow cost.

Thus you can ‘beat the market’ at least a little via One Weird Trick, which is that you don’t buy those stocks. This is better than buying an ETF or other index fund that doesn’t follow that rule.

A lot of the reason I choose to buy individual stocks is the generalization of this. Even if I can’t ‘pick winners’ I trust myself to do better than random at identifying losers you don’t want to touch and then not touching them. Profitably shorting is hard, profitably ‘not longing’ doesn’t scale but is a lot easier and you’re still kind of short.

This also suggests a business opportunity. Why not create an ETF that is the broad US stock market, except it excludes hard-to-borrow stocks beyond some low threshold? If a stock becomes hard-to-borrow, it sells that stock until it becomes easy again. You would expect to consistently outperform. There wouldn’t be a strict index to follow, so it requires solving some issues, but seems worth it.
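As a rough illustration of the screening rule, here is a minimal sketch. The `Stock` fields, the 1% borrow-fee threshold, and the tickers are all hypothetical, and a real fund would also have to handle turnover, taxes and data quality.

```python
from dataclasses import dataclass


@dataclass
class Stock:
    ticker: str
    market_cap: float   # used for cap weighting
    borrow_fee: float   # annualized cost to borrow, e.g. 0.02 = 2%


def easy_to_borrow_weights(stocks, max_borrow_fee=0.01):
    """Cap-weight the universe, skipping anything above the borrow-fee threshold."""
    kept = [s for s in stocks if s.borrow_fee <= max_borrow_fee]
    total = sum(s.market_cap for s in kept)
    if total == 0:
        return {}
    return {s.ticker: s.market_cap / total for s in kept}


# Hypothetical universe; tickers and numbers are made up for illustration.
universe = [
    Stock("AAA", 500.0, 0.003),
    Stock("BBB", 300.0, 0.004),
    Stock("HYPE", 200.0, 0.35),  # expensive to borrow, so excluded
]
weights = easy_to_borrow_weights(universe)
print(weights)
```

Re-running the function at each rebalance with fresh borrow fees gives the "sell it until it becomes easy again" behavior: a stock drops out when its fee crosses the threshold and returns when the fee falls back.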

This is a bad presentation of information and everyone involved should feel bad.

Senator Mike Lee (R-Utah): Nearly a quarter of every tax dollar the federal government takes from you is now used just to pay *interest* on the national debt

This will get worse as long as Congress pretends money is limitless—as it does when spending roughly $2 trillion more than it brings in each year

Should Congress cut down on spending? I believe they should, because we could be on the verge of the market charging a lot higher interest on government debt, and it is very important to reduce the risk of that happening.

Does this mean 25% of your tax dollar goes to interest? Absolutely not. That is not where the ‘money goes,’ even accepting that money is fully fungible.

The correct way to think about this is that what you care about is the ratio of debt-to-GDP, therefore:

  1. There is a primary deficit, ignoring interest. It’s too big. We should fix it.

  2. There is interest on the debt, and there is nominal economic growth.

  3. To the extent that the interest on the debt exceeds the rate of nominal economic growth, the outstanding debt is getting worse over time over and above the primary deficit.

  4. To the extent the interest is less than nominal economic growth, it is shrinking over time, counteracting some of the primary deficit.

  5. If nominal growth sufficiently exceeded interest rates, say due to AI, in a sustained way, then we could handle any amount of debt that didn’t raise that interest rate.

  6. The reason you still make sure you don’t go into too much debt is that too much debt eventually does raise the interest rate you pay, and can hit tipping points.
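One standard way to write down the accounting in the list above, using symbols introduced here purely for illustration: let $d_t$ be debt-to-GDP in year $t$, $r$ the average interest rate on the debt, $g$ nominal GDP growth, and $p_t$ the primary deficit as a share of GDP.

```latex
% Debt grows with interest and the new primary deficit; GDP grows at rate g.
d_{t+1} = \frac{1+r}{1+g}\, d_t + p_{t+1}
\qquad\Longrightarrow\qquad
d_{t+1} - d_t = \frac{r-g}{1+g}\, d_t + p_{t+1}
```

When $r > g$ the first term adds to the ratio on top of the primary deficit, and when $r < g$ it subtracts from it, which is what points 3 and 4 say.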

Household-to-government metaphors are often used in such spots. They can be misleading, but can also be good intuition pumps.

Right now:

  1. Nominal GDP growth is about 4.6%.

  2. The average interest rate on federal debt is 3.4%, since rates used to be lower.

  3. If we refinanced all outstanding federal debt, at its original durations, using current interest rates, we would pay roughly 3.9%.

Thus, right now, not only is 25% of your tax dollar not paying interest on the debt, the de facto amount you pay is negative. If we balanced the primary budget, the debt would shrink over time as a percentage of GDP.
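Here is a minimal sketch of that claim, using the growth and interest figures above. The starting debt-to-GDP ratio of 100% is a hypothetical round number chosen for illustration, and the function name is likewise an assumption.

```python
def project_debt_ratio(debt_ratio, interest_rate, nominal_growth,
                       primary_deficit=0.0, years=10):
    """Roll debt-to-GDP forward: old debt grows at the interest rate,
    GDP grows at the nominal growth rate, and each year's primary
    deficit (as a share of GDP) is added on top."""
    path = [debt_ratio]
    for _ in range(years):
        debt_ratio = (debt_ratio * (1 + interest_rate) / (1 + nominal_growth)
                      + primary_deficit)
        path.append(debt_ratio)
    return path


START = 1.00  # hypothetical starting debt-to-GDP of 100%, not a figure from the text

# Figures quoted above: 4.6% nominal growth, 3.4% average rate, 3.9% if refinanced.
at_average_rate = project_debt_ratio(START, 0.034, 0.046)
at_current_rates = project_debt_ratio(START, 0.039, 0.046)

print(f"After 10 years at 3.4%: {at_average_rate[-1]:.3f}")
print(f"After 10 years at 3.9%: {at_current_rates[-1]:.3f}")
```

With a balanced primary budget the ratio drifts down in both cases. Passing a positive `primary_deficit` shows how quickly that gets overwhelmed.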

Our primary deficit is very high, perhaps dangerously high, and this means the debt continues to expand as a share of GDP. But the interest burden, for now, is fine.

Scott Alexander gives us highlights on the comments from his vibecession post. My response to quite a lot of this is ‘see The Revolution of Rising Expectations sequence,’ and I am sad that he didn’t incorporate that into his updates here. A lot of people are clearly grasping at similar things.

I was especially disappointed by Scott’s continued emphasis on the math behind things like ‘real wages’ or inflation, whereas I spent a lot of the sequence emphasizing that this misses the measurement that matters most.

One point highlighted here is the Parable of Calvin’s Grandparents, where his grandfather worked terrible hours doing unpleasant work owning his own business, and pretty much never did anything besides work and never saw his kids aside from attending church. If you want to run a thankless small business (one person mentions a butcher) my understanding is you can absolutely make a solid living that way, it’s just not going to be fun and we don’t want to do that.

A look at a ‘sober house’ for sports bettors.

A good question:

Dean Ball: I really wonder how many uber black drivers in dc nyc and sf are intelligence assets. So many people I know have extremely sensitive conversations on the phone while in Ubers (guilty).

Plausibly yet another benefit of robotaxis!

I would be surprised if Uber or those within Uber were found to be intentionally pairing the right people with compromised cars or drivers, but not shocked.

Back in 2022 Emmett Shear defined three of the types or triggers of burnout:

  1. Permanent on-call. Too much time always-on without breaks.

  2. Broken steering. Your actions seem to not accomplish anything or not matter.

  3. Mission doubt. You don’t understand why you’re doing this.

Cate Hall offers some additional items:

  1. Shifting goals, or empathy, pulling in too many directions in a row or at once.

  2. Emergencies that aren’t genuine, due to poor leadership or time management.

As Shear emphasizes, this is not about working too hard. You don’t get burnout from ‘working too hard,’ you get it from specific mismatches.

As Hall emphasizes, once you sense oncoming burnout, the sooner you deal with it and treat it like an emergency the better, whereas if you try to power through it will only get worse, and you’ll lose more time in recovery and risk a larger sphere of aversion afterwards. In some cases, if you wait too long, you might never recover or it might become universal. And it isn’t stress.

I’ve certainly known burnout. I’ve burned out in a big way three times, once from Magic: The Gathering from repetition and mission doubt, once after MetaMed from basically all of it, once at the end of Jane Street due to a form of permanent on-call. I’ve also ‘locally’ burned out plenty of times, and back during my Magic career I’d sometimes burn out on testing a format or matchup or within a tournament. I notice that I can be short term burned out on ‘effort posting’ at times but am basically never burned out from general posting, and that when it’s an issue ‘don’t do any writing’ isn’t the way to fix the problem.

I notice that burnout is fractal in time. It can be this big thing where you burn out from a years long job and need to quit, or (at least in my experience) you can be burned out today, or for an hour, or a minute.

Cate presents burnout as breaking the pact between ‘elephant and rider’ – the conscious part of your brain wants to keep going but the rest of you isn’t having it. The elephant isn’t getting what it needs. It stops listening and goes on strike.

Cate’s solution is to figure out what your elephant needs, and provide it. Sometimes that is rest. Other times it isn’t, or sometimes ‘real rest’ requires not having to come back to the problem later.

Cate lists credit, autonomy and money as possibilities. I would add intellectual stimulation, or variety or novelty or play, or experiences of various sorts, or excitement, or a sense of accomplishment? In my wife’s case it seems to often be a change of scenery, whereas my elephant does not care about that at all.

We’re so back! As in, Polymarket is returning to the United States.

Many are attempting to block Polymarket by complaining that it allows insider trading, and this is ‘deceptive.’ Robin Hanson points out that it’s on you to keep your secrets, and there is nothing deceptive about trading on info. I agree, so long as it is clear that insider trading is permitted. Insider trading is deceptive if and only if the traders are being told that there won’t be insider trading. That promise is valuable, but so is getting insider information. There is room for both market types.

Scott Alexander points out we now have lots of liquid prediction markets on non-sports events, via Polymarket and Kalshi, yet the world hasn’t changed much. He asks why, and offers several partial explanations.

  1. There’s definitely a lot of ‘people have not caught on yet,’ also known as a pure diffusion problem. In some narrow cases, like elections, the odds are being accepted and mainstreamed the way they are in sports, but it’s a slow process.

  2. As in AI, the fact that the future is unevenly distributed does not mean it isn’t here. Yes, prediction markets matter, and have definitely informed my actions.

  3. I agree that for many purposes, 20% and 40% are often ‘the same number,’ and people are notoriously bad about tracking and learning from changes in probabilities (see the stock market), so prediction markets aren’t having that much impact on decisions unless we previously had very large uncertainty.

  4. Alternatively, people often simply do not care about the odds when making decisions. This is the Han Solo Rule: Never tell me the odds.

  5. I think the big one is that Polymarket hasn’t asked enough of the right questions. This is a structural and cost issue, combined with a grading criteria issue, not a failure of imagination. Markets that are long term, or conditional, or potentially ambiguous, or worse a combination of all three, are very hard to make work.

The good news is that these problems, especially #5, become less binding as volume goes up and there are more profit centers to subsidize esoteric markets, but it’s a slow process.

Scott Alexander offers a Hanson-like proposal for a set of conditional markets to control for various factors, allowing us to make causally dependent conditional markets. Something like that would work, but it requires 4+ markets all of which are conditional and liquid. That means either vastly more interest in such markets (and solving the capital lock-up issues), or it means massive subsidies.

On the question of grading criteria, for my own markets I’m moving towards ‘use the best definition you can and then say you’ll resolve via LLM’ since that is objective in its own way, although I am not yet being consistent about this. But when I see obvious ambiguous cases? Yep, then I’m going to take the coward’s way out in advance.

Most complaints come from a very small number of people, and often a majority come from one person. The classic example cited is noise complaints against airports, but this extends to things like sex discrimination, where one person is 10%-30% of all complaints. Alas, with AI, it is increasingly possible for a complainer to be outrageously ‘productive’ if they choose this. Levels of Friction on scaling your complaints are dangerously low.

The obvious solution is you do at least one of these and ideally both:

  1. You make it expensive to keep doing this, or impose a quota.

  2. You stop listening.

That’s what we do in ‘normal life’ when someone complains a ton and they don’t check out. Once we decide their complaints don’t have merit, we ignore them, and we socially punish them if they don’t stop.

Important thing to remember from Vitalik here:

Nathan Young: I have no time for criticism of Harry Potter and the Methods of Rationality that doesn’t acknowledge it’s one of the most read Harry Potter fan fictions in the world. Yud is a high tier writer. Get over it!

Vitalik Buterin: If you’ve heard of someone, that means they won a game (getting famous enough that people like you know of them) that millions of people would really love to win, but could not figure out how to.

The celebrities, authors, politicians, influencers you hate are NOT talentless – much the opposite.

Maybe their talents or their ideas are very misaligned with the type of talent or ideas that improve the world – often true – but that’s a different argument.

Lock: You can dislike their impact but pretending they got there with zero talent is just coping.

Luck helps a ton, you don’t win a giant tournament without some amount of luck, but at higher levels no amount of luck is sufficient. If you don’t also have talent, you lose.

Swearing makes you temporarily physically stronger and more able to endure physical pain. Robin Hanson never stops Robin Hansoning so he asks why we don’t thus encourage swearing.

Paul Graham: I was thinking about this a couple days ago when I banged my head on one of the charming beams in my office.

Seal of the Apocalypse: Does this work in people who swear all the time?

Robin Hanson: Good question.

I predict that the value of swearing is a proxy for the relative intensity of expression. Swearing the way you usually do won’t help you. You could swear in a different way than you typically do, that differentiates it from your casual swearing, and that could work. But if you do the same thing you do all the time, that loses its power.

Ramit Sethi: Wisdom from a wealthy friend who owns a $10+ million house in SoCal:

“When you’re young, you want the big house. Now that I have it, it’s too much work to maintain. I just want a small apartment now. But for people like us, you have to get it to really understand that”

I’m less interested in this as an example of, “See! You don’t actually need fancy things” which is a very popular (and in my opinion, boring) frugality message in America

I’m MORE interested in this as a message specifically for high achievers: She correctly notes that high achievers WANT to achieve a lot and, when they do, they often realize the achievement itself was never the goal. But until they achieve it, they will never truly understand it

robertos: the house as a $10m experiment in reverse engineering your own taste. you have to pay the tuition to learn you didn’t want it. most people never get expensive enough to discover what they actually want

EigenGender: “I just need to do this once to prove that I can” is a surprisingly effective frame for lots of goals

There’s merit in doing things once to prove that you can or know that you did. There is also merit in doing things once to prove that you don’t want to do them a second time, or to not regret having not done them, or for the story value of having done it once. Usually not $10 million worth of merit, but real merit.

India is rapidly getting modern amenities, as in rural vehicle ownership going from 6% to 47% in a decade, half of people having refrigerators, and 94% having mobile phones. You’ve got to admit it’s getting better, it’s getting better all the time.

If a rule needs to exist for incentive reasons, but is counterproductive in a given situation, it is good to be able to waive it.

The Husky: Anonymous: I work at a public library. A teenage boy came to the desk. He looked nervous. “I found this,” he said. He put a copy of Harry Potter on the counter. It was lost 3 years ago. It was battered. “I stole it,” he admitted. “We didn’t have money for books. But I read it. I read it ten times.”

He pulled out a crumpled $10 bill. “For the fine.” I looked at the computer. The fine was way more than $10. I looked at the kid. He was honest. He was a reader. I took the $10. “Actually,” I said, “The fine is exactly zero dollars during Amnesty Week.” (There is no Amnesty Week). I pushed the money back to him. “Buy your own copy,” I said. “And come back. We have the sequel.” He comes in every Tuesday now. Libraries are for reading, not for accounting.

Robin Hanson: “Rules are for people I don’t like.”

Confidence is highly correlated with all forms of success. It is hard to think of a measurement more confounded to hell, but some of the experiments do suggest causation. The suggested actions for cultivating confidence are:

  1. Act the part, fake it until you make it.

  2. Reframe anxiety as excitement, the one that has an experimental study attached.

  3. Visualize the win.

  4. Short circuit to halt any post-event spiraling.

  5. Build a success story, get yourself small wins to build upon.

Scott Alexander reports on the state of his toddlers, which he calls a ‘permanent emergency.’ Sounds about right.

BOSS: one topic that no one mentions is that you should be terrified of never figuring out what you are NATURALLY talented at. marketing, sales, woodworking, playing guitar… it doesn’t matter. put yourself out and find it asap. giving yourself enough time to reach your max potentional

Adele Bloch: ask yourself – what does it feel like everyone else is weirdly bad at? that’s usually an indicator of where your natural strengths are

Weirdly is a feature of you, not of the world, but the info you seek is about you, too.

Adele’s entire feed seems to be about, essentially, ‘it is hard to make friends but it is not this hard all you really have to do is get off the couch and off your phone and Do Things, meet people and then keep doing things with them.’

I couldn’t follow her because she repeats herself so much, but it’s a great core message.

This review from Scott Sumner explains perfectly why our ratings don’t correlate:

Scott Sumner (his reviews are out of 4.0):

Resurrection (China) 4.0 Finally, a new film lived up to my expectations. I’m not quite sure what this film is about, as I was so busy being astonished by the cinematography that I missed many of the subtitles. (Oddly, the audience for this Chinese language film was mostly white, in one of America’s most Chinese counties.) Bi Gan seems to have been influenced by everything from Méliès’ silent film to Joseph Cornell’s magic boxes to Hou Hsiao-hsien’s Three Times. It’s so gratifying to see a director give us something new. This might end up being my favorite film of the decade. A shout out to cinematographer Dong Jingsong, who also filmed Long Day’s Journey Into Night.

The 30-minute long take at night in a rundown Yangtze river town reminded me of when my wife and I visited Wanxian one evening back in 1994. It was a surreal experience as the city would soon be flooded by the Three Gorges Dam and the place seemed like a decaying cyberpunk stage set.

or simply, later:

I tend to prefer East Asian cinema over Western films because the focus is more on visual style, rather than intellectual ideas.

‘I gave this film a perfect score without knowing what it was about’ is not a thought that would enter my mind. I didn’t doubt, reading that, that the cinematography was exceptional but I noticed that I expected the movie to bore me if I saw it. But then Tyler Cowen also praised it, and Claude noticed it was playing a short walk away.

So I saw Resurrection, and actually, yes, it’s the best film of 2025, and I wrote this:

Scott Sumner said he wasn’t quite sure what this film was about and still called it potentially his favorite film of the decade. I didn’t understand how both could be true at once. Now I do.

In terms of Sumner’s preferences, cinematic Quality, especially cinematography and visual style, I think this is the best film I’ve ever seen, period. As purely a series of stunning shots and images, even if it hadn’t come together at all, this would already be worthwhile. Which is something I basically never say, so it’s saying a lot, although maybe I can learn. It’s good to appreciate things.

And yes, the whole thing is on its surface rather confusing in terms of what it is actually about until it clicks into place, although you can have a pretty good hunch rather quickly.

Then most of the pieces did come together on two levels, including the title, with notably rare exceptions where I assume either I’d get it on second viewing or I lack the historical or cultural context. And this became great.

She says you don’t even know her name. I think you do know.

I do think to work fully this needs to be in a theater, it’s very visual and you need to be free of distractions.

The more I reflect on the experience, the more I agree with Tyler Cowen that seeing it in a theater really is a must. The more you are going for cinematography and Quality, the more you need a theater, and I think this applies even more than it does to the big blockbuster special effects movies.

If I had no idea what this film was about, or thought the thing it was trying to say was dumb, where would I put it on the cinematography and quality alone? On reflection I think I’d rate that experience around a 4 out of 5. I will add that yes, it is in part a love letter to film, that’s obvious and not a spoiler, but it is another thing, too.

Sumner also reviewed two movies I’ve seen recently, Sentimental Value (3.8) and One Battle After Another (3.7). I don’t have either movie that high, but neither score surprised me, and both seem right given what he values.

Tyler Cowen picks the movies he liked in 2025 without naming any he finds great in particular. He calls it one of the weakest years for movies in his lifetime. I found several of his picks underrated (House of Dynamite, Oh, Hi, The Materialists) but they shouldn’t make such lists in a strong year. The picks I actively disagree with are highly understandable and I’m in the minority on those.

Matthew Yglesias offers his favorite movies of 2025.

Rolling Stone best movies of 2025. They called it a ‘truly great year for great movies, period’ which I find hard to take seriously.

The New Yorker lists its best movies of 2025, and calls it a ‘brilliant year for movies.’

53 Directors Pick Their Favorite Films of 2025. There’s a clear pattern of choosing ‘this movie had very good direction’ as the central criterion. Makes sense. I respect the hell out of those here who were willing to go against this, such as Paul Feig.

In general, the correlation between ‘who you would give Best Director’ and ‘what you think is the best movie’ is very high. I would say far too high, that this is letting Quality override other movie features and this is a mistake.

Variance in such lists is also very high. Almost every list will have something that seems like a mistake, and include many movies I have not seen.

There really are a lot of movies. As of writing this I’ve seen 55 new movies in 2025, and even with some attempt to see the best movies (and admittedly some cases where I wasn’t trying) that still doesn’t include that many of the movies these lists include.

Thus, there are four types of disagreements with such lists.

  1. I haven’t seen the movie. Maybe you’re right.

  2. I have seen the movie, I disagree with you, but I get it. If you think One Battle After Another or Sinners or Weapons was great, I get why you would think that, in that order. They ooze ‘this is a really good movie’ but didn’t work for me.

  3. I have seen the movie, I disagree with you, and you’re wrong. Two lists had The Phoenician Scheme, and I’m sorry, no, there’s some great moments and acting in it but overall it’s not there and you have to know that.

  4. You’re missing a movie that you can’t have missed, and this isn’t merely a matter of taste. It both oozes great movie and is actually great, so you’re simply wrong. This then subdivides into ‘the world is wrong’ and ‘no it’s just you that’s wrong.’

Matthew Yglesias talks himself into the Netflix-Warner merger. He points out that many IPs might transfer from primarily movie IP to primarily TV show IP, and my response to that is: Good. TV lasts longer and has a bigger payoff, and movies rely too much on existing IP. It’s only bad if Netflix-Warner actively means theaters go out of business, which would indeed be terrible.

The movie business is weird. I don’t understand why, here in New York, you have tons of movie theaters and they all play the same new movies all the time for brief windows, and old movies only get brought back for particular events. Shouldn’t the long tail work in your favor here, especially since the economics of that favor the theater (they keep a much bigger cut)? And why shouldn’t Netflix want all their movies in theaters whenever possible? Are you really going to not subscribe to Netflix because you instead saw Knives Out on a bigger screen?

New music no longer involves key changes.

Any given song probably doesn’t want a key change. If your hit songs basically never key change, that seems like an extremely bad sign.

Given both Tyler Cowen and Scott Sumner mentioned it in their list of the best art of the 21st Century, I will note that while I did enjoy much of The Three Body Problem (my review is here) and found many ideas interesting, and I’d certainly say it’s worth reading, we’re all in trouble if that’s one of the best books over a 25 year period.

Ben Thompson goes on a righteous rant about how Apple does not understand how to create a good sports experience on the Apple Vision Pro. He is entirely correct. The killer product is that you take cameras, you let a fan sit in a great seat, and let them watch the game from that seat. That’s it. Never force a camera move on the viewer. That’s actively better than doing more traditional things.

You can improve that experience by giving the fan the option to move seats if desired, and giving them the option for a radio-style broadcast, and perhaps options for superimposing various statistics and game state information on their screens. But keep it simple.

If you miss MTV, there’s MTV Rewind.

Ondrej Strasky reports on the Arena Championship, where he played Necro reanimator in Timeless.

Two point attempts in the NBA now pay off better than three point attempts.

In equilibrium, the average 2 should be worth substantially more than the average 3, because 2s vary much more in quality than 3s do. There are layups and dunks worth almost the full 2 points, whereas no 3 is ever that great and it’s almost always possible to get a 3 that isn’t that bad, if you can’t do better.
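To make that argument concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the function name, the uniform distributions, and the make-probability ranges are made up and are not NBA data. Each possession offers one 2-point look whose quality varies widely and one 3-point look whose quality varies little, and the offense takes whichever has the higher expected points.

```python
import random


def simulate_shot_selection(possessions=100_000, seed=0):
    """Average expected points of the shots actually taken, keyed by shot value.

    All numbers are invented for illustration; they are not real NBA statistics.
    """
    rng = random.Random(seed)
    taken = {2: [], 3: []}
    for _ in range(possessions):
        # Best available 2: anything from a contested long 2 to a near-certain dunk.
        ev_two = 2 * rng.uniform(0.30, 1.00)
        # Best available 3: a narrow band, never great, rarely terrible.
        ev_three = 3 * rng.uniform(0.30, 0.40)
        # The offense takes the shot with the higher expected points.
        if ev_two > ev_three:
            taken[2].append(ev_two)
        else:
            taken[3].append(ev_three)
    return {value: sum(evs) / len(evs) for value, evs in taken.items()}


if __name__ == "__main__":
    averages = simulate_shot_selection()
    print(f"average 2 taken: {averages[2]:.2f} expected points")
    print(f"average 3 taken: {averages[3]:.2f} expected points")
```

Under these assumptions the 2s that actually get taken include all the layups and dunks, so their average sits well above the average 3, even though the marginal 2 and the marginal 3 are worth about the same.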

If you insist upon using any Twitter algorithm, check your ‘Twitter interests’ page and unclick everything you don’t want included. My list had quite a lot of things I am actively not interested in, but I didn’t notice because I never use algorithmic feeds.

Thebes gave us that tip, along with using lists for small accounts you like and aggressively and repeatedly saying ‘not interested’ in any and all viral posts.

Elon Musk is threatening to turn the Twitter algorithm over to Grok again.

Benjamin De Kraker: Ok, but where does “people you follow” fit into this process?

DogeDesigner: Elon Musk explains how the new Grok powered 𝕏 algorithm will work:

• Grok will read every post made on 𝕏 i.e over 100M posts daily.

• After filtering, it will match content to 300M to 400M users daily.

• Goal is to show each user content they are most likely to enjoy or engage with.

• It will filter out spam and scam content automatically.

• Helps fix the small or new account problem where good posts go unseen.

• You will be able to ask Grok to adjust your feed, temporarily or permanently.

So there it is. Direct engagement maximization on a per-post basis, and except for asking Grok to adjust your feed it will completely ignore anything else, and especially will not care about who you follow.

Elon Musk promises they will open source the algorithm periodically. At this point we all know how much that promise is worth.

If you want a social network to succeed in the long term you need to, as per Roon here, foster the development of organic self-organizing communities, centrally embodied by the concept of Tpot (as in ‘that part of Twitter’) for various different parts. If you do short term optimization you get slop and everything dies, and indeed even with the aggressive use of lists to avoid the algorithm it is clear Twitter is increasingly dominated by slop strategies.

A fun fact: Meta estimates it is involved in 1/3 of all successful scams in America (original video source) and Meta is basically doing the ‘we are making more money from allowing scams than we will be fined for knowingly allowing the scams’ calculation and knowingly allowing a lot of scams. I wonder how much they valued what would happen when people noticed all the scams? What will happen?

Robert Wiblin frames this as a ‘WTF’ moment, Dan Luu does not find it surprising and notes that whenever he tries clicking ads he finds a lot of scams, and notes that big companies have a hard time doing spam and scam filtering because they present too juicy an attack surface. In this case, the WTF comes from Meta clearly having the ability to do much better at preventing scams, seemingly without that many false positives, and choosing not to because the scammers generate ad revenue.

Elon Musk is once again making Twitter worse. Every time you load the page it will force the For You tab of horrors onto you, forcing you to reclick the Following button, and it may not be long before it is impossible to switch back. On Twitter Pro, you cannot switch back – the following tab is a For You no matter what you click on.

Good news, there is a solution, it’s a hack but it works:

Warren Sharp: tweetdeck’s home column being permanently stuck on the “for you” option despite selecting the “following” option is a development I wouldn’t wish on my worst enemy.

…if your home tab is now full of “for you” recommended posts and you can’t see only people you follow, do this:

1. hit “add a new column”

2. hit the “search” option

3. check the box “only show people you follow”

4. leave the search field blank

5. hit the search button

boom

new column with only people you follow. Make sure at the top it says “latest” and you’ll get your old “home” column with it only pulling up people you are following

My solution, which was a bit more convoluted, was to vibecode a feature in my Chrome Extension that automatically moved all my followers into a list, and then added a feature to also add members from other lists, combined my two lists I check and my followers into one list, and presto.
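The combining step amounts to an order-preserving union with duplicates dropped. The sketch below is a hypothetical illustration only: it is not the extension’s actual code, it does not touch Twitter at all, and `merge_lists` and the handles are made-up names.

```python
def merge_lists(*lists):
    """Combine several lists of handles into one, keeping first-seen order.

    Handles are compared case-insensitively, which is an assumption made here.
    """
    seen = set()
    merged = []
    for members in lists:
        for handle in members:
            key = handle.lower()
            if key not in seen:
                seen.add(key)
                merged.append(handle)
    return merged


if __name__ == "__main__":
    # Made-up example data standing in for two curated lists plus followers.
    list_a = ["alice", "bob"]
    list_b = ["Bob", "carol"]
    followers = ["dave", "alice"]
    print(merge_lists(list_a, list_b, followers))
```

Keeping first-seen order means the accounts from the lists checked most often stay at the front of the merged list.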

Fun tidbit: Nikita Bier, basically in charge of making Twitter features, called PMs ‘not real.’ It shows.

Vitalik Buterin warns Elon Musk that making Twitter a global totem pole for Free Speech while also turning it into a death star laser for coordinated hate sessions harms the cause of free speech, with his core example being hate directed towards Europe. As Vitalik notes, Europe, including both the UK and EU, has severe problems, but the rhetoric about it seems rather out of hand on Twitter.

Vitalik Buterin: I think you should consider that making X a global totem pole for Free Speech, and then turning it into a death star laser for coordinated hate sessions, is actually harmful for the cause of free speech. I’m seriously worried that huge backlashes against values I hold dear are coming in a few years’ time.

He’s clearly actively tweaking algorithms to boost some things and deboost other things based on pretty arbitrary criteria.

As long as that power lever exists, I’d prefer it be used (without increasing its scope) to boost niceness instead of boosting ragebait.

First best solution is to have Twitter run purely on an algorithm, and Elon Musk can either change the algorithm or use his account like everyone else.

Second best solution is to use the power for good.


Monthly Roundup #38: January 2026

why-i’m-withholding-certainty-that-“precise”-us-cyber-op-disrupted-venezuelan-electricity

Why I’m withholding certainty that “precise” US cyber-op disrupted Venezuelan electricity

The New York Times has published new details about a purported cyberattack that unnamed US officials claim plunged parts of Venezuela into darkness in the lead-up to the capture of the country’s president, Nicolás Maduro.

Key among the new details is that the cyber operation was able to turn off electricity for most residents in the capital city of Caracas for only a few minutes, though in some neighborhoods close to the military base where Maduro was seized, the outage lasted for three days. The cyber-op also targeted Venezuelan military radar defenses. The paper said the US Cyber Command was involved.

Got more details?

“Turning off the power in Caracas and interfering with radar allowed US military helicopters to move into the country undetected on their mission to capture Nicolás Maduro, the Venezuelan president who has now been brought to the United States to face drug charges,” the NYT reported.

The NYT provided few additional details. Left out were the methods purportedly used. When Russia took out electricity in December 2015, for instance, it used general-purpose malware known as BlackEnergy to first penetrate the corporate networks of the targeted power companies and then further encroach into the supervisory control and data acquisition systems the companies used to generate and transmit electricity. The Russian attackers then used legitimate power distribution functionality to trigger the failure, which took out power to more than 225,000 people for more than six hours before grid workers restored it.

In a second attack almost exactly a year later, Russia used a much more sophisticated piece of malware to take out key parts of the Ukrainian power grid. Named Industroyer and alternatively Crash Override, it’s the first known malware framework designed to attack electric grid systems directly.


nasa’s-first-medical-evacuation-from-space-ends-with-on-target-splashdown

NASA’s first medical evacuation from space ends with on-target splashdown

“Because the astronaut is absolutely stable, this is not an emergent evacuation,” said James “JD” Polk, NASA’s chief medical officer, in a press conference last week. “We’re not immediately disembarking and getting the astronaut down.”

Amit Kshatriya, the agency’s associate administrator, called the situation a “controlled medical evacuation” in a briefing with reporters.

But without a confirmed diagnosis of the astronaut’s medical issue, there was some “lingering risk” for the astronaut’s health if they remained in orbit, Polk said. That’s why NASA Administrator Jared Isaacman and his deputies agreed to call an early end to the Crew-11 mission.

A first for NASA

The Crew-11 mission launched on August 1 and was supposed to stay on the space station until around February 20, a few days after the scheduled arrival of SpaceX’s Crew-12 mission with a team of replacement astronauts. But the early departure means the space station will operate with a crew of three until the launch of Crew-12 next month.

NASA astronaut Chris Williams will be the sole astronaut responsible for maintaining the US segment of the station. Russian cosmonauts Sergey Kud-Sverchkov and Sergey Mikayev launched with Williams in November on a Russian Soyuz vehicle. The Crew Dragon was the lifeboat for all four Crew-11 astronauts, so standard procedure called for the entire crew to return with the astronaut suffering the undisclosed medical issue.

The space station regularly operated with just three crew members for the first decade of its existence. The complex has been permanently staffed since 2000, sometimes with as few as two astronauts or cosmonauts. The standard crew size was raised to six in 2009, then to seven in 2020.

SpaceX’s Crew Dragon Endeavour spacecraft descends toward the Pacific Ocean under four main parachutes. Credit: NASA

Williams will have his hands full until reinforcements arrive. The scaled-down crew will not be able to undertake any spacewalks, and some of the lab’s science programs may have to be deferred to ensure the crew can keep up with maintenance tasks.

This is the first time NASA has called an early end to a space mission for medical reasons, but the Soviet Union faced similar circumstances several times during the Cold War. Russian officials cut short an expedition to the Salyut 7 space station in 1985 after the mission’s commander fell ill in orbit. A similar situation occurred in 1976 with the Soyuz 21 mission to the Salyut 5 space station.


six-months-later,-trump-mobile-still-hasn’t-delivered-preordered-phones

Six months later, Trump Mobile still hasn’t delivered preordered phones

“Trump Mobile began accepting $100 deposits from consumers as early as August 2025 but has failed to deliver any T1 phones to consumers… Instead, Trump Mobile has consistently pushed back its delivery date, originally promising August 2025 and subsequently postponing to November and then the beginning of December. As of January 2026, no phone has been delivered,” the letter said.

Trump Mobile customer service reps “provided contradictory and irrelevant explanations for delays, including blaming a government shutdown that had no apparent connection to the product’s manufacturing or delivery,” the letter continued. With the Trump phone still missing in action, “Trump Mobile has been selling refurbished iPhones, which are largely manufactured in China, and Samsung devices, which are manufactured by a Korean company, while claiming these products are ‘brought to life right here in the USA.’”

Trump phone coming in Q1, allegedly

After Trump Mobile failed to deliver the phone in 2025, USA Today asked for a new projected delivery date. “A Trump Mobile customer service representative told USA Today that the phone is to be released ‘the first quarter of this year’ and that it is completing the final stages of regulatory testing for the cellular device,” USA Today reported on Tuesday.

The Warren letter said Trump Mobile’s made-in-the-USA claims “are potentially misleading characterizations for devices that are manufactured overseas,” and that failing to meet promised delivery dates after collecting $100 deposits may be “a deceptive or unfair business practice.” The letter urged Ferguson to have the FTC carry out “its statutory obligation to enforce consumer protection laws.”

The letter pointed out that the FTC has previously acted against companies that acted similarly to Trump Mobile. “The FTC is responsible for ensuring that companies like Trump Mobile do not make false or misleading claims when marketing products… The FTC has previously taken action against companies for false ‘Made in the USA’ claims, misleading representations about product features and origins, bait-and-switch tactics involving deposits for products never delivered, and failure to honor promised delivery dates,” the letter said.

The letter asked Ferguson to state whether the FTC has opened an investigation into Trump Mobile and, if not, to “explain the legal and factual basis for declining to investigate these apparent violations.”
