gaming

What we’re expecting from Nintendo’s Switch 2 announcement Wednesday

Implausible: Long-suffering Earthbound fans have been hoping for a new game in the series (or even an official localization of the Japan-exclusive Mother 3) for literal decades now. Personally, though, I’m hoping for a surprise revisit to the Punch-Out series, following on its similar surprise return on the Wii in 2009.

Screen

This compressed screenshot of a compressed video is by no means the resolution of the Switch 2 screen, but it’s going to be higher than the original Switch.

Credit: Nintendo

Likely: While a 720p screen was pretty nice in a 2017 gaming handheld, a full 1080p display is much more standard in today’s high-end gaming portables. We expect Nintendo will follow this trend for what looks to be a nearly 8-inch screen on the Switch 2.

Possible: While a brighter OLED screen would be nice as a standard feature on the Switch 2, we expect Nintendo will follow the precedent of the Switch generation and offer this as a pricier upgrade at some point in the future.

Implausible: The Switch 2 would be the perfect time for Nintendo to revisit the glasses-free stereoscopic 3D that we all thought was such a revelation on the 3DS all those years ago.

C Button

C-ing is believing.

Credit: Nintendo

Likely: The mysterious new button labeled “C” on the Switch 2’s right Joy-Con could serve as a handy way to “connect” to other players, perhaps through a new Miiverse-style social network.

Possible: Recent rumors suggest the C button could be used to connect to a second Switch console (or the TV-connected dock) for a true dual-screen experience. That would be especially fun and useful for Wii U/DS emulation and remasters.

Implausible: The C stands for Chibi-Robo! and launches a system-level mini-game focused on the miniature robot.

New features

Switch 2, with joycons slightly off the central unit/screen.

Credit: Nintendo

Likely: After forcing players to use a wonky smartphone app for voice chat on the Switch, we wouldn’t be surprised if Nintendo finally implements full on-device voice chat for online games on the Switch 2—at least between confirmed “friends” on the system.

Possible: Some sort of system-level achievement tracking would bring Nintendo’s new console in line with a feature that the competition from Sony and Microsoft has had for decades now.

Implausible: After killing it off for the Switch generation, we’d love it if Nintendo brought back the Virtual Console as a way to buy permanent downloadable copies of emulated classics that will carry over across generations. Failing that, how about a revival of the 3DS’s StreetPass passive social network for Switch 2 gamers on the go?

The timeless genius of a 1980s Atari developer and his swimming salmon masterpiece

Williams’ success with APX led him to create several games for Synapse Software, including the beloved Alley Cat and the incomprehensible fantasy masterpiece Necromancer, before moving to the Amiga, where he created the experimental Mind Walker and his ambitious “cultural simulation” Knights of the Crystallion.

Necromancer, Williams’ later creation for the Atari 800, plays like a fever dream—you control a druid fighting off spiders while growing magic trees and battling an undead wizard. It makes absolutely no sense by conventional standards, but it’s brilliant in its otherworldliness.

“The first games that I did were very hard to explain to people and they just kind of bought it on faith,” Williams said in a 1989 interview with YAAM (Yet Another Amiga Magazine), suggesting this unconventional approach started early. That willingness to create deeply personal, almost surreal experiences defined Williams’ work throughout his career.

An Atari 800 (the big brother of the Atari 400) that Benj Edwards set up to play M.U.L.E. at his mom's house in 2015, for nostalgia purposes.

An Atari 800 that Benj Edwards set up to play M.U.L.E. at his mom’s house in 2015, for nostalgia purposes. Credit: Benj Edwards

After a brief stint making licensed games (like Bart’s Nightmare) for the Super Nintendo at Sculptured Software, he left the industry entirely to pursue his calling as a pastor, attending seminary in Chicago with his wife Martha, before declining health forced him to move to Rockport, Texas. Perhaps reflecting on the choices that led him down this path, Williams had noted years earlier in that 1989 interview, “Sometimes in this industry we tend to forget that life is a lot more interesting than computers.”

Bill Williams died on May 28, 1998, one day before his 38th birthday. He died young, but he outlived his doctors’ prediction that he wouldn’t reach age 13, and created cultural works that stand the test of time. Like Sam the Salmon, Williams pushed forward relentlessly—in his case, creating powerful digital art that was uniquely his own.

In our current era of photorealistic graphics and cinematic game experiences, Salmon Run‘s blocky pixels might seem quaint. But its core themes—persistence, natural beauty, and finding purpose against long odds—remain as relevant as ever. We all face bears in life—whether they come from natural adversity or from those who might seek to do us harm. The beauty of Williams’ game is in showing us that, despite their menacing presence, there’s still a reward waiting upstream for those willing to keep swimming.

If you want to try Salmon Run, you can potentially play it in your browser through an emulated Atari 800, hosted on The Internet Archive. Press F1 to start the game.

Satisfactory now has controller support, so there’s no excuse for your bad lines

Satisfactory starts out as a game you play, then becomes a way you think. The only thing that kept the ridiculous factory simulation from eating an even-more-unhealthy amount of my time was the game’s keyboard-and-mouse dependency. But the work, it has found me: on my couch, on a trip, wherever one might game, really.

In a 1.1 release on Satisfactory‘s Experimental branch, there are lots of new things, but the biggest new thing is a controller scheme. Xbox and DualSense are officially supported, though anyone playing on Steam can likely tweak their way to something that works on other pads. With this, the game becomes far more playable for those playing on a couch, on a portable gaming PC like the Steam Deck, or over household or remote streaming. It also paves the way for the game’s console release, which is currently slated for sometime in 2025.

Coffee Stain Studios reviews the contents of its Experimental branch 1.1 update.

Satisfactory seems like an unlikely candidate for controller support, let alone consoles. It’s a game where you do a lot of three-dimensional thinking, putting machines and conveyor belts and power lines in just the right places, either because you need to or it just feels proper. How would it feel to select, rotate, place, and connect everything using a controller? Have I just forgotten that Minecraft, and first-person games as a whole, probably seemed similarly desk-bound at one time? I grabbed an Xbox Wireless controller, strapped on my biofuel-powered jetpack, and gave a reduced number of inputs a shot.

The biggest hurdle to get past, for me, was not jumping in place when I wanted to do something, though it’s not unique to this game. In most games that have some kind of building or planning through a controller, the bottom-right button (“A” on Xbox, “X” on PlayStation DualSense) is the do/interact/confirm button. In Satisfactory, and some other games where I switch between keyboard/mouse and controller, A/X is jump. Satisfactory wants you to primarily use the triggers and bumpers to select, build, and dismantle things, which feels okay when you’ve got the hang of things. But even after an hour or so, I still found my pioneer unexpectedly jumping, as if he needed to get the zoomies out before placing a storage container.

What to make of Nintendo’s mention of new “Switch 2 Edition games”

When Nintendo finally officially revealed the Switch 2 in January, one of our major unanswered questions concerned whether games designed for the original Switch would see some form of visual or performance enhancement when running on the backward-compatible Switch 2. Now, Nintendo-watchers are pointing to a fleeting mention of “Switch 2 Edition games” as a major hint that such enhancements are in the works for at least some original Switch games.

The completely new reference to “Switch 2 Edition games” comes from a Nintendo webpage discussing yesterday’s newly announced Virtual Game Cards digital lending feature. In the fine print at the bottom of that page, Nintendo notes that “Nintendo Switch 2 exclusive games and Nintendo Switch 2 Edition games can only be loaded on a Nintendo Switch 2 system [emphasis added].”

The specific wording differentiating these “Switch 2 Edition” games from “Switch 2 exclusives” suggests a new category of game that is compatible with the original Switch but able to run with enhancements on the Switch 2. But it’s currently unclear what Switch games will get “Switch 2 Edition” releases or how much developer work (if any) will be needed to create those new versions.

We’ve seen this before

Nintendo is no stranger to the idea of single game releases that work differently across different hardware. Back in the days of the Game Boy Color, developers could create special “Dual Mode” cartridges that ran in full color on the newer handheld or in regular grayscale on the original Game Boy. Late-era Game Boy cartridges could also be coded with special enhancements that activated when played on a TV via the Super Game Boy adapter—Taito even memorably used this feature to include a complete SNES edition of Space Invaders on a Game Boy cartridge.

Gran Turismo 7 expands its use of AI/ML-trained NPCs with good effect

GT Sophy can now race at 19 tracks, up from the nine that were introduced in November 2023. The AI agent is an alternative to the regular, dumber AI in the game’s quick race mode, with easy, medium, and hard settings. But now, at those same tracks, you can also create custom races using GT Sophy, meaning you’re no longer limited to just two or three laps. You can enable things like damage, fuel consumption and tire wear, and penalties, and you can have some control over the cars you race against.

Unlike the time-limited demo, the hardest setting is no longer alien-beating. As a GT7 player, I’m slowing with age, and I find the hard setting to be that—hard, but beatable. (I suspect but need to confirm that the game tailors the hardest setting to your ability based on your results, as, when I create a custom race on hard, only seven of the nine progress bars are filled, and in the screenshot above, only five bars are filled.)

Having realistic competition has always been one of the tougher challenges for a racing game, and one that the GT franchise was never particularly great at during previous console generations. This latest version of GT Sophy does feel different to race against: The AI is opportunistic and aggressive but also provokable into mistakes. If only the developer would add it to more versions of the in-game Nürburgring.

Discord is planning an IPO this year, and big changes could be on the horizon

The product has evolved into something akin to Slack, but for personal use. It’s used by artist communities, game developers, open source projects, influencers, and more to manage communities and coordinate work. In some cases, people simply use it as an extremely robust group messaging tool for groups of friends without any games or projects involved.

Limited ads to tackle limited revenue

For years, Discord proudly touted a “no ads” policy, but that dam has broken in some small ways in recent months. Discord began offering game publishers opportunities to create special “quests” that appear in the Discord interface, wherein players can earn in-game rewards for doing specific tasks, like streaming a game to friends. A new format, called video quests, is planned for this summer, too.

The new ad products are meant to drum up Discord’s revenue potential in the lead-up to an IPO; the platform already offered premium subscriptions for access to more advanced features and a marketplace for cosmetics to jazz up profiles.

So far, the ad products are, by and large, much less intrusive than ads in many other social networks and seem to be oriented around providing some user value. However, an IPO could lead to shareholders demanding more from the company in pursuit of revenue.

Pillars of Eternity is getting turn-based combat, all but demanding replays

More than just rolling for initiative

Obsidian added a turn-based mode to Pillars of Eternity II: Deadfire in patch 4.1, roughly eight months after the game’s initial release. Designer Josh Sawyer, who worked on Baldur’s Gate II and directed both PoE games, said in a 2023 interview with Touch Arcade that the real-time systems in the PoE games were largely a concession to the old-school CRPG fans that crowdfunded both games.

Turn-based was Sawyer’s stated preference, and he thinks Baldur’s Gate 3 largely put an end to the debate in modern times:

I just think it’s easier to design more intricate combats. I like games with a lot of stats, obviously. (He laughs). But the problem with real time with pause is that it’s honestly very difficult for people to actually parse all of that information, and one of the things I’ve heard a lot from people who’ve played Deadfire in turn based, is that there were things about the game like the affliction and inspiration system that they didn’t really understand very clearly until they played it in turn based.

But both Pillars games were designed with real-time combat in mind, such that, even with his appreciation for the turn-based addition to PoE 2, Sawyer knows “the game wasn’t designed for it,” he told Touch Arcade. This is almost certainly going to be the case, too, for the original PoE, but there could be lessons learned from PoE 2‘s transformation to apply. Other games from that era might also lure folks like me back, though perhaps they, too, have a density of encounters and maps that just can’t cut it for turn-based.

Beyond this notably big “patch” coming to the original PoE, the 10th anniversary patch should make it easier for Mac and Linux (through Proton) users to stay up to date on bug fixes, and for players on GOG and Epic to get Kickstarter rewards and achievements. Lots of audio and visual effects were fixed up, along with a whole heap of mechanical and combat fixes.

No cloud needed: Nvidia creates gaming-centric AI chatbot that runs on your GPU

Nvidia has seen its fortunes soar in recent years as its AI-accelerating GPUs have become worth their weight in gold. Most people use their Nvidia GPUs for games, but why not both? Nvidia has a new AI you can run at the same time, having just released its experimental G-Assist AI. It runs locally on your GPU to help you optimize your PC and get the most out of your games. It can do some neat things, but Nvidia isn’t kidding when it says this tool is experimental.

G-Assist is available in the Nvidia desktop app, and it consists of a floating overlay window. After invoking the overlay, you can either type or speak to G-Assist to check system stats or make tweaks to your settings. You can ask basic questions like, “How does DLSS Frame Generation work?” but it also has control over some system-level settings.

By calling up G-Assist, you can get a rundown of how your system is running, including custom data charts created on the fly by G-Assist. You can also ask the AI to tweak your machine, for example, optimizing the settings for a particular game or toggling on or off a setting. G-Assist can even overclock your GPU if you so choose, complete with a graph of expected performance gains.

Nvidia on G-Assist.

Nvidia demoed G-Assist last year with some impressive features tied to the active game. That version of G-Assist could see what you were doing and offer suggestions about how to reach your next objective. The game integration is sadly quite limited in the public version, supporting just a few games, like Ark: Survival Evolved.

There is, however, support for a number of third-party plug-ins that give G-Assist control over Logitech G, Corsair, MSI, and Nanoleaf peripherals. So, for instance, G-Assist could talk to your MSI motherboard to control your thermal profile or ping Logitech G to change your LED settings.

How a nephew’s CD burner inspired early Valve to embrace DRM

Back in 2004, the launch of Half-Life 2 would help launch Steam on the path to eventually becoming the de facto digital rights management (DRM) system for the vast majority of PC games. But years before that, with the 1998 launch of the original Half-Life, Valve cofounder and then-CMO Monica Harrington said she was inspired to take DRM more seriously by her nephew’s reaction to the purchase of a new CD-ROM burner.

PC Gamer pulled that interesting tidbit from a talk Harrington gave at last week’s Game Developers Conference. In Harrington’s recollection, her nephew had used funds she had sent for school supplies on a CD replicator, then sent her “a lovely thank you note essentially saying how happy he was to copy and share games with his friends.”

That was the moment Harrington said she realized this new technology was leading to a “generational shift” in both the availability and acceptability of PC game piracy. While game piracy and DRM definitely existed prior to CD burners (anyone else remember the large codewheels that cluttered many early PC game boxes?), Harrington said the new technology—and the blasé attitude her nephew showed toward using it for piracy—could “put our entire business model at risk.”

Shortly after Half-Life launched with a simple CD key verification system in place, Harrington said the company noticed a wave of message board complaints about the game not working. But when Valve cofounder (and Monica’s then-husband) Mike Harrington followed up with those complaining posters, he found that “none of them had actually bought the game. So it turned out that the authentication system was working really well,” Harrington said.

Why Anthropic’s Claude still hasn’t beaten Pokémon


Weeks later, Sonnet’s “reasoning” model is struggling with a game designed for children.

A Game Boy Color playing Pokémon Red surrounded by the tendrils of an AI, or maybe some funky glowing wires, what do AI tendrils look like anyways

Gotta subsume ’em all into the machine consciousness! Credit: Aurich Lawson

In recent months, the AI industry’s biggest boosters have started converging on a public expectation that we’re on the verge of “artificial general intelligence” (AGI)—virtual agents that can match or surpass “human-level” understanding and performance on most cognitive tasks.

OpenAI is quietly seeding expectations for a “PhD-level” AI agent that could operate autonomously at the level of a “high-income knowledge worker” in the near future. Elon Musk says that “we’ll have AI smarter than any one human probably” by the end of 2025. Anthropic CEO Dario Amodei thinks it might take a bit longer but similarly says it’s plausible that AI will be “better than humans at almost everything” by the end of 2027.

A few researchers at Anthropic have, over the past year, had a part-time obsession with a peculiar problem.

Can Claude play Pokémon?

A thread: pic.twitter.com/K8SkNXCxYJ

— Anthropic (@AnthropicAI) February 25, 2025

Last month, Anthropic presented its “Claude Plays Pokémon” experiment as a waypoint on the road to that predicted AGI future. It’s a project the company said shows “glimmers of AI systems that tackle challenges with increasing competence, not just through training but with generalized reasoning.” Anthropic made headlines by trumpeting how Claude 3.7 Sonnet’s “improved reasoning capabilities” let the company’s latest model make progress in the popular old-school Game Boy RPG in ways “that older models had little hope of achieving.”

While Claude models from just a year ago struggled even to leave the game’s opening area, Claude 3.7 Sonnet was able to make progress by collecting multiple in-game Gym Badges in a relatively small number of in-game actions. That breakthrough, Anthropic wrote, was because the “extended thinking” by Claude 3.7 Sonnet means the new model “plans ahead, remembers its objectives, and adapts when initial strategies fail” in a way that its predecessors didn’t. Those things, Anthropic brags, are “critical skills for battling pixelated gym leaders. And, we posit, in solving real-world problems too.”

Over the last year, new Claude models have shown quick progress in reaching new Pokémon milestones. Credit: Anthropic

But relative success over previous models is not the same as absolute success over the game in its entirety. In the weeks since Claude Plays Pokémon was first made public, thousands of Twitch viewers have watched Claude struggle to make consistent progress in the game. Despite long “thinking” pauses between each move—during which viewers can read printouts of the system’s simulated reasoning process—Claude frequently finds itself pointlessly revisiting completed towns, getting stuck in blind corners of the map for extended periods, or fruitlessly talking to the same unhelpful NPC over and over, to cite just a few examples of distinctly sub-human in-game performance.

Watching Claude continue to struggle at a game designed for children, it’s hard to imagine we’re witnessing the genesis of some sort of computer superintelligence. But even Claude’s current sub-human level of Pokémon performance could hold significant lessons for the quest toward generalized, human-level artificial intelligence.

Smart in different ways

In some sense, it’s impressive that Claude can play Pokémon with any facility at all. When developing AI systems that find dominant strategies in games like Go and Dota 2, engineers generally start their algorithms off with deep knowledge of a game’s rules and/or basic strategies, as well as a reward function to guide them toward better performance. For Claude Plays Pokémon, though, project developer and Anthropic employee David Hershey says he started with an unmodified, generalized Claude model that wasn’t specifically trained or tuned to play Pokémon games in any way.

“This is purely the various other things that [Claude] understands about the world being used to point at video games,” Hershey told Ars. “So it has a sense of a Pokémon. If you go to claude.ai and ask about Pokémon, it knows what Pokémon is based on what it’s read… If you ask, it’ll tell you there’s eight gym badges, it’ll tell you the first one is Brock… it knows the broad structure.”

A flowchart summarizing the pieces that help Claude interact with an active game of Pokémon (click through to zoom in). Credit: Anthropic / Excalidraw

In addition to directly monitoring certain key (emulated) Game Boy RAM addresses for game state information, Claude views and interprets the game’s visual output much like a human would. But despite recent advances in AI image processing, Hershey said Claude still struggles to interpret the low-resolution, pixelated world of a Game Boy screenshot as well as a human can. “Claude’s still not particularly good at understanding what’s on the screen at all,” he said. “You will see it attempt to walk into walls all the time.”

Hershey said he suspects Claude’s training data probably doesn’t contain many overly detailed text descriptions of “stuff that looks like a Game Boy screen.” This means that, somewhat surprisingly, if Claude were playing a game with “more realistic imagery, I think Claude would actually be able to see a lot better,” Hershey said.

“It’s one of those funny things about humans that we can squint at these eight-by-eight pixel blobs of people and say, ‘That’s a girl with blue hair,’” Hershey continued. “People, I think, have that ability to map from our real world to understand and sort of grok that… so I’m honestly kind of surprised that Claude’s as good as it is at being able to see there’s a person on the screen.”

Even with a perfect understanding of what it’s seeing on-screen, though, Hershey said Claude would still struggle with 2D navigation challenges that would be trivial for a human. “It’s pretty easy for me to understand that [an in-game] building is a building and that I can’t walk through a building,” Hershey said. “And that’s [something] that’s pretty challenging for Claude to understand… It’s funny because it’s just kind of smart in different ways, you know?”

A sample Pokémon screen with an overlay showing how Claude characterizes the game’s grid-based map. Credit: Anthropic / X

Where Claude tends to perform better, Hershey said, is in the more text-based portions of the game. During an in-game battle, Claude will readily notice when the game tells it that an attack from an electric-type Pokémon is “not very effective” against a rock-type opponent, for instance. Claude will then squirrel that factoid away in a massive written knowledge base for reference later in the run. Claude can also integrate multiple pieces of similar knowledge into pretty elegant battle strategies, even extending those strategies into long-term plans for catching and managing teams of multiple creatures for future battles.
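To make the note-taking idea concrete, here is a minimal sketch of a knowledge base that records matchup hints from battle text and consults them when picking an attack. It is an illustration only: the `BattleNotes` class, its method names, and the "weak"/"strong" labels are hypothetical, and nothing here reflects how Anthropic's actual harness stores or uses its notes.

```python
from __future__ import annotations

from typing import Optional


class BattleNotes:
    """Hypothetical note store keyed by (attack type, defender type)."""

    def __init__(self) -> None:
        self._notes = {}

    def record(self, attack_type: str, defender_type: str, message: str) -> None:
        """Save a matchup label if the battle text reveals one."""
        text = message.lower()
        if "not very effective" in text:
            self._notes[(attack_type, defender_type)] = "weak"
        elif "super effective" in text:
            self._notes[(attack_type, defender_type)] = "strong"
        # Any other message carries no matchup hint, so nothing is stored.

    def lookup(self, attack_type: str, defender_type: str) -> Optional[str]:
        """Return the stored label, or None if this matchup was never seen."""
        return self._notes.get((attack_type, defender_type))

    def pick_attack(self, available_types: list[str], defender_type: str) -> str:
        """Prefer known-strong attacks, then unknown ones, then known-weak ones."""
        rank = {"strong": 2, None: 1, "weak": 0}
        return max(available_types, key=lambda t: rank[self.lookup(t, defender_type)])


notes = BattleNotes()
notes.record("electric", "rock", "It's not very effective...")
print(notes.pick_attack(["electric", "water"], "rock"))
```

In this toy version an untested attack outranks one already seen to be weak, which is the kind of simple inference the article describes Claude making from in-game text.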

Claude can even show surprising “intelligence” when Pokémon’s in-game text is intentionally misleading or incomplete. “It’s pretty funny that they tell you you need to go find Professor Oak next door and then he’s not there,” Hershey said of an early-game task. “As a 5-year-old, that was very confusing to me. But Claude actually typically goes through that same set of motions where it talks to mom, goes to the lab, doesn’t find [Oak], says, ‘I need to figure something out’… It’s sophisticated enough to sort of go through the motions of the way [humans are] actually supposed to learn it, too.”

A sample of the kind of simulated reasoning process Claude steps through during a typical Pokémon battle. Credit: Claude Plays Pokemon / Twitch

These kinds of relative strengths and weaknesses when compared to “human-level” play reflect the overall state of AI research and capabilities in general, Hershey said. “I think it’s just a sort of universal thing about these models… We built the text side of it first, and the text side is definitely… more powerful. How these models can reason about images is getting better, but I think it’s a decent bit behind.”

Forget me not

Beyond issues parsing text and images, Hershey also acknowledged that Claude can have trouble “remembering” what it has already learned. The current model has a “context window” of 200,000 tokens, limiting the amount of relational information it can store in its “memory” at any one time. When the system’s ever-expanding knowledge base fills up this context window, Claude goes through an elaborate summarization process, condensing detailed notes on what it has seen, done, and learned so far into shorter text summaries that lose some of the fine-grained details.

This can mean that Claude “has a hard time keeping track of things for a very long time and really having a great sense of what it’s tried so far,” Hershey said. “You will definitely see it occasionally delete something that it shouldn’t have. Anything that’s not in your knowledge base or not in your summary is going to be gone, so you have to think about what you want to put there.”
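The trade-off Hershey describes can be sketched in a few lines: once the notes outgrow a fixed budget, they are collapsed into a shorter summary and whatever did not make it into the summary is gone. This is a rough illustration under stated assumptions, not Anthropic's implementation. The 200,000 figure comes from the article; `estimate_tokens`, `summarize`, and `AgentMemory` are hypothetical stand-ins, with the "summary" being simple truncation where the real system asks the model itself to write one.

```python
from __future__ import annotations

from dataclasses import dataclass, field

CONTEXT_LIMIT_TOKENS = 200_000  # context window size quoted in the article


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: roughly one token per four characters."""
    return max(1, len(text) // 4)


def summarize(notes: list[str], keep_chars: int = 40) -> str:
    """Hypothetical lossy summary: keep only the start of each note."""
    return " | ".join(note[:keep_chars] for note in notes)


@dataclass
class AgentMemory:
    limit: int = CONTEXT_LIMIT_TOKENS
    notes: list[str] = field(default_factory=list)

    def used(self) -> int:
        """Estimated tokens currently held in memory."""
        return sum(estimate_tokens(note) for note in self.notes)

    def add(self, note: str) -> None:
        """Append a note, collapsing everything into a summary on overflow."""
        self.notes.append(note)
        if self.used() > self.limit:
            # Detail beyond keep_chars in each note is discarded for good.
            # A fuller sketch would also handle a summary that is itself too long.
            self.notes = [summarize(self.notes)]


memory = AgentMemory(limit=50)  # tiny budget so the overflow is easy to trigger
for step in range(3):
    memory.add(f"note {step}: " + "x" * 92)
print(len(memory.notes), memory.used())
```

With the tiny budget used here, the third note pushes the estimate over the limit and all three notes shrink into one truncated line, mirroring how fine-grained details can drop out of the agent's record.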

A small window into the kind of “cleaning up my context” knowledge-base update necessitated by Claude’s limited “memory.” Credit: Claude Plays Pokemon / Twitch

More than forgetting important history, though, Claude runs into bigger problems when it inadvertently inserts incorrect information into its knowledge base. Like a conspiracy theorist who builds an entire worldview from an inherently flawed premise, Claude can be incredibly slow to recognize when an error in its self-authored knowledge base is leading its Pokémon play astray.

“The things that are written down in the past, it sort of trusts pretty blindly,” Hershey said. “I have seen it become very convinced that it found the exit to [in-game location] Viridian Forest at some specific coordinates, and then it spends hours and hours exploring a little small square around those coordinates that are wrong instead of doing anything else. It takes a very long time for it to decide that that was a ‘fail.’”

Still, Hershey said Claude 3.7 Sonnet is much better than earlier models at eventually “questioning its assumptions, trying new strategies, and keeping track over long horizons of various strategies to [see] whether they work or not.” While the new model will still “struggle for really long periods of time” retrying the same thing over and over, it will ultimately tend to “get a sense of what’s going on and what it’s tried before, and it stumbles a lot of times into actual progress from that,” Hershey said.

“We’re getting pretty close…”

One of the most interesting things about observing Claude Plays Pokémon across multiple iterations and restarts, Hershey said, is seeing how the system’s progress and strategy can vary quite a bit between runs. Sometimes Claude will show it’s “capable of actually building a pretty coherent strategy” by “keeping detailed notes about the different paths to try,” for instance, he said. But “most of the time it doesn’t… most of the time, it wanders into the wall because it’s confident it sees the exit.”

Where previous models wandered aimlessly or got stuck in loops, Claude 3.7 Sonnet plans ahead, remembers its objectives, and adapts when initial strategies fail.

Critical skills for battling pixelated gym leaders. And, we posit, in solving real-world problems too. pic.twitter.com/scvISp14XG

— Anthropic (@AnthropicAI) February 25, 2025

One of the biggest things preventing the current version of Claude from getting better, Hershey said, is that “when it derives that good strategy, I don’t think it necessarily has the self-awareness to know that one strategy [it] came up with is better than another.” And that’s not a trivial problem to solve.

Still, Hershey said he sees “low-hanging fruit” for improving Claude’s Pokémon play by improving the model’s understanding of Game Boy screenshots. “I think there’s a chance it could beat the game if it had a perfect sense of what’s on the screen,” Hershey said, adding that such a model would probably perform “a little bit short of human.”

Expanding the context window for future Claude models will also probably allow those models to “reason over longer time frames and handle things more coherently over a long period of time,” Hershey said. Future models will improve by getting “a little bit better at remembering, keeping track of a coherent set of what it needs to try to make progress,” he added.

Twitch chat responds with a flood of bouncing emojis as Claude concludes an epic 78+ hour escape from Pokémon’s Mt. Moon. Credit: Claude Plays Pokemon / Twitch

Whatever you think about impending improvements in AI models, though, Claude’s current performance at Pokémon doesn’t make it seem like it’s poised to usher in an explosion of human-level, completely generalizable artificial intelligence. And Hershey allows that watching Claude 3.7 Sonnet get stuck on Mt. Moon for 80 hours or so can make it “seem like a model that doesn’t know what it’s doing.”

But Hershey is still impressed at the way that Claude’s new reasoning model will occasionally show some glimmer of awareness and “kind of tell that it doesn’t know what it’s doing and know that it needs to be doing something different. And the difference between ‘can’t do it at all’ and ‘can kind of do it’ is a pretty big one for these AI things for me,” he continued. “You know, when something can kind of do something it typically means we’re pretty close to getting it to be able to do something really, really well.”

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.

Hands-on with Frosthaven’s ambitious port from gigantic box to inviting PC game

I can say this for certain: The game’s tutorial does a lot of work in introducing you to the game’s core mechanics, which include choosing cards with sequential actions, “burning” cards for temporary boosts, positioning, teamwork, and having enough actions or options left if a fight goes longer than you think. I’m not a total newcomer to the -haven games, having played a couple rounds of the Gloomhaven board game. But none of my friends, however patient, did as good a job of showing just how important it was to consider not just attack, defend, or move, but where each choice would place you, and how it would play with your teammates.

I played as a “Banner Spear,” one of the six starting classes. Their thing is—you guessed it—having a spear, and they can throw it or lunge with it from farther away. Many of the Banner Spear’s cards are more effective with positioning, like pincer-flanking an enemy or attacking from off to the side of your more up-close melee teammate. With only two players taking on a couple of enemies, I verbally brushed off the idea of using some more advanced options. My developer partner, using a Deathwalker, interjected: “Ah, but that is what summons are for.”

Soon enough, one of the brutes was facing down two skeletons, and I was able to get a nice shot in from an adjacent hex. The next thing I wanted to do was try out being a little selfish, running for some loot left behind by a vanquished goon. I forgot that you only pick up loot if you end your turn on a hex, not just pass through it, so my Banner Spear appeared to go on a little warm-up jog, for no real reason, before re-engaging the Germinate we were facing.

The art, animations, and feel of everything I clicked on were engaging, even as the developers regularly reassured me that all of it needs working on. With many more experienced players kicking the tires in early access, I expect the systems and quality-of-life details to see even more refinement. It’s a long campaign, both for players and the developers, but there’s a good chance it will be worth it.

Developer’s GDC billboard pokes at despised former Google Stadia exec

It has been nearly two years now since game industry veteran Phil Harrison left Google following the implosion of the company’s Stadia cloud gaming service. But the passage of time hasn’t stopped one company from taking advantage of this week’s Game Developers Conference to poke fun at the erstwhile gaming executive for his alleged mistreatment of developers.

VGC spotted a conspicuous billboard in San Francisco’s Union Square Monday featuring the overinflated, completely bald head of Gunther Harrison, the fictional Alta Interglobal CEO who was recently revealed as the blatantly satirical antagonist in the upcoming game Revenge of the Savage Planet. A large message atop the billboard asks passersby (including the tens of thousands in town for GDC), “Has a Harrison fired you lately? You might be eligible for emotional support.”

Google’s Phil Harrison talks about the Google Stadia controller at GDC 2019. Credit: Google

While Gunther Harrison probably hasn’t fired any GDC attendees, the famously bald Phil Harrison was responsible for the firing of plenty of developers when he shut down Google’s short-lived Stadia Games & Entertainment (SG&E) publishing imprint in early 2021. That shutdown surprised a lot of newly jobless game developers, perhaps none more so than those at Montreal-based Typhoon Games, which Google had acquired in late 2019 to make what Google’s Jade Raymond said at the time would be “platform-defining exclusive content” for Stadia.

Yet on the very same day that Journey to the Savage Planet launched as a Stadia exclusive, the developers at Typhoon found themselves jobless, alongside the rest of SG&E. By the end of 2022, Google would shut down Stadia entirely, blindsiding even more game developers.

Don’t forgive, don’t forget

After being let go by Google, Typhoon Games would re-form as Raccoon Logic (thanks in large part to investment from Chinese publishing giant Tencent) and reacquire the rights to the Savage Planet franchise. And now that the next game in that series is set to launch in May, it seems the developers still haven’t fully gotten over how they were treated during Google’s brief foray into game publishing.
