Features


Ars spoke with the military’s chief orbital traffic cop—here’s what we learned


“We have some 2,000 or 2,200 objects that I call the ‘red order of battle.’”

Col. Raj Agrawal participates in a change of command ceremony to mark his departure from Mission Delta 2 at Peterson Space Force Base, Colorado. Col. Barry Croker became the new commander of Mission Delta 2 on July 3.

For two years, Col. Raj Agrawal commanded the US military unit responsible for tracking nearly 50,000 human-made objects whipping through space. In this role, he was keeper of the orbital catalog and led teams tasked with discerning whether other countries’ satellites, mainly those of China and Russia, are peaceful or present a military threat to US forces.

This job is becoming more important as the Space Force prepares for the possibility of orbital warfare.

Ars visited with Agrawal in the final weeks of his two-year tour of duty as commander of Mission Delta 2, a military unit at Peterson Space Force Base, Colorado. Mission Delta 2 collects and fuses data from a network of sensors “to identify, characterize, and exploit opportunities and mitigate vulnerabilities” in orbit, according to a Space Force fact sheet.

This involves operating radars and telescopes, analyzing intelligence information, and “mapping the geocentric space terrain” to “deliver a combat-ready common operational picture” to military commanders. Agrawal’s job has long existed in one form or another, but the job description is different today. Instead of just keeping up with where things are in space—a job challenging enough—military officials now wrestle with distinguishing which objects might have a nefarious purpose.

From teacher to commander

Agrawal’s time at Mission Delta 2 ended on July 3. His next assignment will be as Space Force chair at the National Defense University. This marks a return to education for Agrawal, who served as a Texas schoolteacher for eight years before receiving his commission as an Air Force officer in 2001.

“Teaching is, I think, at the heart of everything I do,” Agrawal said. 

He taught music and math at Trimble Technical High School, an inner-city vocational school in Fort Worth. “Most of my students were in broken homes and unfortunate circumstances,” Agrawal said. “I went to church with those kids and those families, and a lot of times, I was the one bringing them home and taking them to school. What was [satisfying] about that was a lot of those students ended up living very fulfilling lives.”

Agrawal felt a calling for higher service and signed up to join the Air Force. Given his background in music, he initially auditioned for and was accepted into the Air Force Band. But someone urged him to apply for Officer Candidate School, and Agrawal got in. “I ended up on a very different path.”

Agrawal was initially accepted into the ICBM career field, but that changed after the September 11 attacks. “That was a time when anyone with a name like mine had a hard time,” he said. “It took a little bit of time to get my security clearance.”

Instead, the Air Force assigned him to work in space operations. Agrawal quickly became an instructor in space situational awareness, did a tour at the National Reconnaissance Office, then found himself working at the Pentagon in 2019 as the Defense Department prepared to set up the Space Force as a new military service. Agrawal was tasked with leading a team of 100 people to draft the first Space Force budget.

Then, he received the call to report to Peterson Space Force Base to take command of what is now Mission Delta 2, the inheritor of decades of Air Force experience cataloging everything in orbit down to the size of a softball. The catalog was stable and predictable, lingering below 10,000 trackable objects until 2007. That’s when China tested an anti-satellite missile, shattering an old Chinese spacecraft into more than 3,500 pieces large enough to be routinely detected by the US military’s Space Surveillance Network.

This graph from the European Space Agency shows the growing number of trackable objects in orbit. Credit: European Space Agency

Two years later, an Iridium communications satellite collided with a defunct Russian spacecraft, adding thousands more debris fragments to low-Earth orbit. A rapid uptick in the pace of launches since then has added to the problem, further congesting busy orbital traffic lanes a few hundred miles above the Earth. Today, the orbital catalog numbers roughly 48,000 objects.

“This compiled data, known as the space catalog, is distributed across the military, intelligence community, commercial space entities, and to the public, free of charge,” officials wrote in a fact sheet describing Mission Delta 2’s role at Space Operations Command. Deltas are Space Force military units roughly equivalent to a wing or group command in the Air Force.

The room where it happens

The good news is that the US military is getting better at tracking things in space. A network of modern radars and telescopes on the ground and in space can now spot objects as small as a golf ball. Space is big, but these objects routinely pass close to one another. At orbital speeds of nearly 5 miles per second, any impact would be catastrophic.

But there’s a new problem. Today, the US military must not only screen for accidental collisions but also guard against an attack on US satellites in orbit. Space is militarized, a fact illustrated by growing fleets of satellites—primarily American, Chinese, and Russian—capable of approaching another country’s assets in orbit and, in some cases, disabling or destroying them. This has raised fears at the Pentagon that an adversary could take out US satellites critical for missile warning, navigation, and communications, with severe consequences for military operations and daily civilian life.

This new reality compelled the creation of the Space Force in 2019, beginning a yearslong process of migrating existing Air Force units into the new service. Now, the Pentagon is posturing for orbital warfare by investing in new technologies and reorganizing the military’s command structure.

Today, the Space Force is responsible for predicting when objects in orbit will come close to one another. This is called a conjunction in the parlance of orbital mechanics. The US military routinely issues conjunction warnings to commercial and foreign satellite operators to give them an opportunity to move their satellites out of harm’s way. These notices also go to NASA if there’s a chance of a close call with the International Space Station (ISS).
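To make the screening step concrete, here is a minimal sketch in Python, assuming the open-source sgp4 package and reader-supplied two-line element sets; the time window and step size are illustrative, not the Space Force’s operational parameters.

```python
# Minimal conjunction-screening sketch. Assumptions: the open-source
# `sgp4` package (pip install sgp4) and reader-supplied TLEs; the window
# and step size below are illustrative, not operational values.
import math
from sgp4.api import Satrec, jday

def min_separation_km(tle_a, tle_b, year, month, day, hours=24, step_s=10):
    """Propagate two cataloged objects and return their closest approach in km."""
    sat_a = Satrec.twoline2rv(*tle_a)  # tle_a is a (line1, line2) pair
    sat_b = Satrec.twoline2rv(*tle_b)
    closest = float("inf")
    for sec in range(0, hours * 3600, step_s):
        jd, fr = jday(year, month, day, 0, 0, sec)
        err_a, pos_a, _ = sat_a.sgp4(jd, fr)  # position in km, TEME frame
        err_b, pos_b, _ = sat_b.sgp4(jd, fr)
        if err_a or err_b:  # nonzero error code: skip this time step
            continue
        closest = min(closest, math.dist(pos_a, pos_b))
    return closest
```

A real screener runs pairings like this across the whole catalog, flags any pair whose predicted miss distance falls below an orbit-dependent threshold, and refines the flagged pairs at a finer time step before a warning goes out.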

The first Trump administration approved a new policy to transfer responsibility for these collision warnings to the Department of Commerce, allowing the military to focus on national security objectives.

But the White House’s budget request for next year would cancel the Commerce Department’s initiative to take over collision warnings. Our discussion with Agrawal occurred before the details of the White House budget were made public last month, and his comments reflect official Space Force policy at the time of the interview. “In uniform, we align to policy,” Agrawal wrote on his LinkedIn account. “We inform policy decisions, but once they’re made, we align our support accordingly.”

US Space Force officials show the 18th Space Defense Squadron’s operations floor to officials from the German Space Situational Awareness Centre during an “Operator Exchange” event at Vandenberg Space Force Base, California, on April 7, 2022. Credit: US Space Force/Tech. Sgt. Luke Kitterman

Since our interview, analysts have also noticed an uptick in interesting Russian activity in space and tracked a suspected Chinese satellite refueling mission in geosynchronous orbit.

Let’s rewind the tape to 2007, the time of China’s game-changing anti-satellite test. Gen. Chance Saltzman, today the Space Force’s Chief of Space Operations, was a lieutenant colonel in command of the Air Force’s 614th Space Operations Squadron at the time. He was on duty when Air Force operators first realized China had tested an anti-satellite missile. Saltzman has called the moment a “pivot point” in space operations. “For those of us that are neck-deep in the business, we did have to think differently from that day on,” Saltzman said in 2023.

Agrawal was in the room, too. “I was on the crew that needed to count the pieces,” he told Ars. “I didn’t know the significance of what was happening until after many years, but the Chinese had clearly changed the nature of the space environment.”

The 2007 anti-satellite test also clearly changed the trajectory of Agrawal’s career. We present part of our discussion with Agrawal below, and we’ll share the rest of the conversation tomorrow. The text has been lightly edited for brevity and clarity.

Ars: The Space Force’s role in monitoring activities in space has changed a lot in the last few years. Can you tell me about these changes, and what’s the difference between what you used to call Space Situational Awareness, and what is now called Space Domain Awareness?

Agrawal: We just finished our fifth year as a Space Force, so as a result of standing up a military service focused on space, we shifted our activities to focus on what the joint force requires for combat space power. We’ve been doing space operations for going on seven decades. I think a lot of folks think that it was a rebranding, as opposed to a different focus for space operations, and it couldn’t be further from the truth. Compared to Space Domain Awareness (SDA), Space Situational Awareness (SSA) is kind of the knowledge we produce with all these sensors, and anybody can do space situational awareness. You have academia doing that. You’ve got commercial, international partners, and so on. But Space Domain Awareness, Gen. [John “Jay”] Raymond coined the term a couple years before we stood up the Space Force, and he was trying to get after, how do we create a domain focused on operational outcomes? That’s all we could say at the time. We couldn’t say war-fighting domain at the time because of the way our policy was written, but our policy shifted to being able to talk about space as a place where, not that we want to wage war, but that we can achieve objectives, and do that with military objectives in mind.

We used to talk about detect, characterize, attribute, predict. And then Gen. [Chance] Saltzman added target onto the construct for Space Domain Awareness, so that we’re very much in the conversation of what it means to do a space-enabled attack and being able to achieve objectives in, from, and to space, and using Space Domain Awareness as a vehicle to do those things. So, with Mission Delta 2, what he did is he took the sustainment part of acquisition, software development, cyber defense, intelligence related to Space Domain Awareness, and then all the things that we were doing in Space Domain Awareness already, put all that together under one command … and called us Mission Delta 2. So, the 18th Space Defense Squadron … that used to kind of be the center of the world for Space Domain Awareness, maybe the only unit that you could say was really doing SDA, where everyone else was kind of doing SSA. When I came into command a couple years ago, and we face now a real threat to having space superiority in the space domain, I disaggregated what we were doing just in the 18th and spread out through a couple of other units … So, that way everyone’s got kind of majors and minors, but we can quickly move a mission in case we get tested in terms of cyber defense or other kinds of vulnerabilities.

This multi-exposure image depicts a satellite-filled sky over Alberta. Credit: Alan Dyer/VWPics/Universal Images Group via Getty Images

We can’t see the space domain, so it’s not like the air domain and sea domain and land domain, where you can kind of see where everything is, and you might have radars, but ultimately it’s a human that’s verifying whether or not a target or a threat is where it is. For the space domain, we’re doing all that through radars, telescopes, and computers, so the reality we create for everyone is essentially their reality. So, if there’s a gap, if there’s a delay, if there are some signs that we can’t see, that reality is what is created by us, and that is effectively the reality for everyone else, even if there is some other version of reality in space. So, we’re getting better and better at fielding capability to see the complexity, the number of objects, and then translating that into what’s useful for us—because we don’t need to see everything all the time—but what’s useful for us for military operations to achieve military objectives, and so we’ve shifted our focus just to that.

We’re trying to get to where commercial spaceflight safety is managed by the Office of Space Commerce, so they’re training side by side with us to kind of offload that mission and take that on. We’re doing up to a million notifications a day for conjunction assessments, sometimes as low as 600,000. But last year, we did 263 million conjunction notifications. So, we want to get to where the authorities are rightly aligned, where civil or commercial notifications are done by an organization that’s not focused on joint war-fighting, and we focus on the things that we want to focus on.

Ars: Thank you for that overview. It helps me see the canvas for everything else we’re going to talk about. So, today, you’re not only tracking new satellites coming over the horizon from a recent launch or watching out for possible collisions, you’re now trying to see where things are going in space and maybe even try to determine intent, right?

Agrawal: Yeah, so the integrated mission delta has helped us have intel analysts and professionals as part of our formation. Their mission is SDA as much as ours is, but they’re using an intel lens. They’re looking at predictive intelligence, right? I don’t want to give away tradecraft, but what they’re focused on is not necessarily where a thing is. It used to be that all we cared about was position and vector, right? As long as you knew an object’s position and the direction they were going, you knew their orbit. You had predictive understanding of what their element set would be, and you only had to do sampling to get a sense of … Is it kind of where we thought it was going to be? … If it was far enough off of its element set, then we would put more energy, more sampling of that particular object, and then effectively re-catalog it.

Now, it’s a different model. We’re looking at state vectors, and we’re looking at anticipatory modeling, where we have some 2,000 or 2,200 objects that I call the “red order of battle”—that are high-interest objects that we anticipate will do things that are not predicted, that are not element set in nature, but that will follow some type of national interest. So, our intel apparatus gets after what things could potentially be a risk, and what things to continue to understand better, and what things we have to be ready to hold at risk. All of that’s happening through all the organizations, certainly within this delta, but in partnership and in support of other capabilities and deltas that are getting after their parts of space superiority.
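A toy sketch of the older sampling model Agrawal describes—check an observation against the cataloged element set and spend more sensor time on an object only if it has drifted—might look like the following Python; the 5 km threshold and the coordinates are invented for illustration. The “red order of battle” objects are precisely the ones expected to break this predictability assumption.

```python
# Toy version of the legacy sampling logic: flag an object for extra
# tracking when its observed position drifts off the cataloged prediction.
# The 5 km threshold and the coordinates below are invented examples.
import math

def needs_resampling(predicted_km, observed_km, threshold_km=5.0):
    """Return True if the observation sits too far from the element-set prediction."""
    return math.dist(predicted_km, observed_km) > threshold_km

print(needs_resampling((7000.0, 0.0, 0.0), (7003.2, 1.1, -0.4)))  # False: on track
print(needs_resampling((7000.0, 0.0, 0.0), (7041.0, 8.5, -2.0)))  # True: re-catalog
```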

Hostile or friendly?

Ars: Can you give some examples of these red order of battle objects?

Agrawal: I think you know about Shijian-20 (a “tech demo” satellite that has evaded inspection by US satellites) and Shijian-24C (which the Space Force says demonstrated “dogfighting” in space), things that are advertised as scientific in nature, but clearly demonstrate capability that is not friendly, and certainly are behaving in ways that are unprofessional. In any other domain, we would consider them hostile, but in space, we try to be a lot more nuanced in terms of how we characterize behavior, but still, when something’s behaving in a way that isn’t pre-planned, isn’t pre-coordinated, and potentially causes hazard, harm, or contest with friendly forces, we now get in a situation where we have to talk about is that behavior hostile or not? Is that escalatory or not? Space Command is charged with those authorities, so they work through the legal apparatus in terms of what the definition of a hostile act is and when something behaves in a way that we consider to be of national security interest.

We present all the capability to be able to do all that, and we have to be as cognizant on the service side as the combatant commanders are, so that our intel analysts are informing the forces and the training resources to be able to anticipate the behavior. We’re not simply recognizing it when it happens, but studying nations in the way they behave in all the other domains, in the way that they set policy, in the way that they challenge norms in other international arenas like the UN and various treaties, and so on. The biggest predictor, for us, of hazardous behaviors is when nations don’t coordinate with the international community on activities that are going to occur—launches, maneuvers, and fielding of large constellations, megaconstellations.

There are nearly 8,000 Starlink satellites in orbit today. SpaceX adds dozens of satellites to the constellation each week. Credit: SpaceX

As you know, we work very closely with Starlink, and they’re very, very responsible. They coordinate and flight plan. They use the kind of things that other constellations are starting to use … changes in those elsets (element sets), for lack of a better term, state vectors, we’re on top of that. We’re pre-coordinating that. We’re doing that weeks or months in advance. We’re doing that in real-time in cooperation with these organizations to make sure that space remains safe, secure, accessible, profitable even, for industry. When you have nations, where they’re launching over their population, where they’re creating uncertainty for the rest of the world, there’s nothing else we can do with it other than treat that as potentially hostile behavior. So, it does take a lot more of our resources, a lot more of our interest, and it puts [us] in a situation where we’re posturing the whole joint force to have to deal with that kind of uncertainty, as opposed to cooperative launches with international partners, with allies, with commercial, civil, and academia, where we’re doing that as friends, and we’re doing that in cooperation. If something goes wrong, we’re handling that as friends, and we’re not having to involve the rest of the security apparatus to get after that problem.

Ars: You mentioned that SpaceX shares Starlink orbit information with your team. Is it the same story with Amazon for the Kuiper constellation?

Agrawal: Yeah, it is. The good thing is that all the US and allied commercial entities, so far, have been super cooperative with Mission Delta 2 in particular, to be able to plan out, to talk about challenges, to even change the way they do business, learning more about what we are asking of them in order to be safe. The Office of Space Commerce, obviously, is now in that conversation as well. They’re learning that trade and ideally taking on more of that responsibility. Certainly, the evolution of technology has helped quite a bit, where you have launches that are self-monitored, that are able to maintain their own safety, as opposed to requiring an entire apparatus of what was the US Air Force often having to expend a tremendous amount of resources to provide for the safety of any launch. Now, technology has gotten to a point where a lot of that is self-monitored, self-reported, and you’ll see commercial entities blow up their own rockets no matter what’s onboard if they see that it’s going to cause harm to a population, and so on. So, yeah, we’re getting a lot of cooperation from other nations, allies, partners, close friends that are also sharing and cooperating in the interest of making sure that space remains sustainable and secure.

“We’ve made ourselves responsible”

Ars: One of the great ironies is that after you figure out the positions and tracks of Chinese or Russian satellites or constellations, you’re giving that data right back to them in the form of conjunction and collision notices, right?

Agrawal: We’ve made ourselves responsible. I don’t know that there’s any organization holding us accountable to that. We believe it’s in our interests, in the US’s interests, to provide for a safe, accessible, secure space domain. So, whatever we can do to help other nations also be safe, we’re doing it certainly for their sake, but we’re doing it as much for our sake, too. We want the space domain to be safe and predictable. We do have an apparatus set up in partnership with the State Department, and with a tremendous amount of oversight from the State Department, and through US Space Command to provide for spaceflight safety notifications to China and Russia. We send notes directly to offices within those nations. Most of the time they don’t respond. Russia, I don’t recall, hasn’t responded at all in the past couple of years. China has responded a couple of times to those notifications. And we hope that, through small measures like that, we can demonstrate our commitment to getting to a predictable and safe space environment.

A model of a Chinese satellite refueling spacecraft on display during the 13th China International Aviation and Aerospace Exhibition on October 1, 2021, in Zhuhai, Guangdong Province of China. Credit: Photo by VCG/VCG via Getty Images

Ars: What does China say in response to these notices?

Agrawal: Most of the time it’s copy or acknowledged. I can only recall two instances where they’ve responded. But we did see some hope earlier this year and last year, where they wanted to open up technical exchanges with us and some of their [experts] to talk about spaceflight safety, and what measures they could take to open up those kinds of conversations, and what they could do to get a more secure, safer pace of operations. That, at some point, got delayed because of the holiday that they were going through, and then those conversations just halted, or at least progress on getting those conversations going halted. But we hope that there’ll be an opportunity again in the future where they will open up those doors again and have those kinds of conversations because, again, transparency will get us to a place where we can be predictable, and we can all benefit from orbital regimes, as opposed to using them exploitively. LEO is just one of those places where you’re not going to hide activity there, so you just are creating risk, uncertainty, and potential escalation by launching into LEO and not communicating throughout that whole process.

Ars: Do you have any numbers on how many of these conjunction notices go to China and Russia? I’m just trying to get an idea of what proportion go to potential adversaries.

Agrawal: A lot. I don’t know the degree of how many thousands go to them, but on a regular basis, I’m dealing with debris notifications from Russian and Chinese ASAT (anti-satellite) testing. That has put the ISS at risk a number of times. We’ve had maneuvers occur in recent history as a result of Chinese rocket body debris. Debris can’t maneuver, and unfortunately, we’ve gotten into situations with particularly those two nations that talk about wanting to have safer operations, but continue to conduct debris-causing tests. We’re going to be dealing with that for generations, and we are going to have to design capability to maneuver around those debris clouds as just a function of operating in space. So, we’ve got to get to a point where we’re not doing that kind of testing in orbit.

Ars: Would it be accurate to say you send these notices to China and Russia daily?

Agrawal: Yeah, absolutely. That’s accurate. These debris clouds are in LEO, so as you can imagine, as those debris clouds go around the Earth every 90 minutes, we’re dealing with conjunctions. There are some parts of orbits that are just unusable as a result of that unsafe ASAT test.
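Agrawal’s “every 90 minutes” figure is easy to sanity-check with Kepler’s third law. The short Python calculation below assumes a circular orbit at 550 km altitude, a typical LEO value chosen for illustration rather than taken from the interview.

```python
# Orbital period for a circular LEO orbit via Kepler's third law.
# The 550 km altitude is an illustrative value.
import math

MU = 398_600.4418    # Earth's gravitational parameter, km^3/s^2
R_EARTH = 6_378.137  # Earth's equatorial radius, km

a = R_EARTH + 550.0  # semi-major axis of a circular 550 km orbit, km
period_min = 2 * math.pi * math.sqrt(a**3 / MU) / 60
print(f"{period_min:.1f} minutes")  # ~95.6 minutes
```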


Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.



It’s “frighteningly likely” many US courts will overlook AI errors, expert says


Judges pushed to bone up on AI or risk destroying their court’s authority.

A judge points to a diagram of a hand with six fingers. Credit: Aurich Lawson | Getty Images

Order in the court! Order in the court! Judges are facing outcry over a suspected AI-generated order in a court.

Fueling nightmares that AI may soon decide legal battles, a Georgia court of appeals judge, Jeff Watkins, explained why a three-judge panel vacated an order last month that appears to be the first known ruling in which a judge sided with someone seemingly relying on fake AI-generated case citations to win a legal fight.

Now, experts are warning that judges overlooking AI hallucinations in court filings could easily become commonplace, especially in the typically overwhelmed lower courts. And so far, only two states have moved to force judges to sharpen their tech competencies and adapt so they can spot AI red flags and theoretically stop disruptions to the justice system at all levels.

The recently vacated order came in a Georgia divorce dispute, where Watkins explained that the order itself was drafted by the husband’s lawyer, Diana Lynch. That’s a common practice in many courts, where overburdened judges historically rely on lawyers to draft orders. But that protocol today faces heightened scrutiny as lawyers and non-lawyers increasingly rely on AI to compose and research legal filings, and judges risk rubberstamping fake opinions by not carefully scrutinizing AI-generated citations.

The errant order partly relied on “two fictitious cases” to deny the wife’s petition—which Watkins suggested were “possibly ‘hallucinations’ made up by generative-artificial intelligence”—as well as two cases that had “nothing to do” with the wife’s petition.

Lynch was hit with $2,500 in sanctions after the wife appealed, and the husband’s response—which also appeared to be prepared by Lynch—cited 11 additional cases that were “either hallucinated” or irrelevant. Watkins was further peeved that Lynch supported a request for attorney’s fees for the appeal by citing “one of the new hallucinated cases,” writing it added “insult to injury.”

Worryingly, the judge could not confirm whether the fake cases were generated by AI or even determine if Lynch inserted the bogus cases into the court filings, indicating how hard it can be for courts to hold lawyers accountable for suspected AI hallucinations. Lynch did not respond to Ars’ request to comment, and her website appeared to be taken down following media attention to the case.

But Watkins noted that “the irregularities in these filings suggest that they were drafted using generative AI” while warning that many “harms flow from the submission of fake opinions.” Exposing deceptions can waste time and money, and AI misuse can deprive people of raising their best arguments. Fake orders can also soil judges’ and courts’ reputations and promote “cynicism” in the justice system. If left unchecked, Watkins warned, these harms could pave the way to a future where a “litigant may be tempted to defy a judicial ruling by disingenuously claiming doubt about its authenticity.”

“We have no information regarding why Appellee’s Brief repeatedly cites to nonexistent cases and can only speculate that the Brief may have been prepared by AI,” Watkins wrote.

Ultimately, Watkins remanded the case, partly because the fake cases made it impossible for the appeals court to adequately review the wife’s petition to void the prior order. But no matter the outcome of the Georgia case, the initial order will likely forever be remembered as a cautionary tale for judges increasingly scrutinized for failures to catch AI misuses in court.

“Frighteningly likely” judge’s AI misstep will be repeated

John Browning, a retired justice on Texas’ Fifth Court of Appeals and now a full-time law professor at Faulkner University, last year published a law article Watkins cited that warned of the ethical risks of lawyers using AI. In the article, Browning emphasized that the biggest concern at that point was that lawyers “will use generative AI to produce work product they treat as a final draft, without confirming the accuracy of the information contained therein or without applying their own independent professional judgment.”

Today, judges are increasingly drawing the same scrutiny, and Browning told Ars he thinks it’s “frighteningly likely that we will see more cases” like the Georgia divorce dispute, in which “a trial court unwittingly incorporates bogus case citations that an attorney includes in a proposed order” or even potentially in “proposed findings of fact and conclusions of law.”

“I can envision such a scenario in any number of situations in which a trial judge maintains a heavy docket and looks to counsel to work cooperatively in submitting proposed orders, including not just family law cases but other civil and even criminal matters,” Browning told Ars.

According to reporting from the National Center for State Courts, a nonprofit representing court leaders and professionals who are advocating for better judicial resources, AI tools like ChatGPT have made it easier for high-volume filers and unrepresented litigants who can’t afford attorneys to file more cases, potentially further bogging down courts.

Peter Henderson, a researcher who runs the Princeton Language+Law, Artificial Intelligence, & Society (POLARIS) Lab, told Ars that he expects cases like the Georgia divorce dispute aren’t happening every day just yet.

It’s likely that a “few hallucinated citations go overlooked” because generally, fake cases are flagged through “the adversarial nature of the US legal system,” he suggested. Browning further noted that trial judges are generally “very diligent in spotting when a lawyer is citing questionable authority or misleading the court about what a real case actually said or stood for.”

Henderson agreed with Browning that “in courts with much higher case loads and less adversarial process, this may happen more often.” But Henderson noted that the appeals court catching the fake cases is an example of the adversarial process working.

While that’s true in this case, it seems likely that anyone exhausted by the divorce legal process, for example, may not pursue an appeal if they don’t have energy or resources to discover and overturn errant orders.

Judges’ AI competency increasingly questioned

While recent history confirms that lawyers risk being sanctioned, fired from their firms, or suspended from practicing law for citing fake AI-generated cases, judges will likely only risk embarrassment for failing to catch lawyers’ errors or even for using AI to research their own opinions.

Not every judge is prepared to embrace AI without proper vetting, though. To shield the legal system, some judges have banned AI. Others have required disclosures—with some even demanding to know which specific AI tool was used—but that solution has not caught on everywhere.

Even if all courts required disclosures, Browning pointed out that disclosures still aren’t a perfect solution since “it may be difficult for lawyers to even discern whether they have used generative AI,” as AI features become increasingly embedded in popular legal tools. One day, it “may eventually become unreasonable to expect” lawyers “to verify every generative AI output,” Browning suggested.

Most likely—as a judicial ethics panel from Michigan has concluded—judges will determine “the best course of action for their courts with the ever-expanding use of AI,” Browning’s article noted. And the former justice told Ars that’s why education will be key, for both lawyers and judges, as AI advances and becomes more mainstream in court systems.

In an upcoming summer 2025 article in The Journal of Appellate Practice & Process, “The Dawn of the AI Judge,” Browning attempts to soothe readers by saying that AI isn’t yet fueling a legal dystopia. And humans are unlikely to face “robot judges” spouting AI-generated opinions any time soon, the former justice suggested.

Standing in the way of that, at least two states—Michigan and West Virginia—”have already issued judicial ethics opinions requiring judges to be ‘tech competent’ when it comes to AI,” Browning told Ars. And “other state supreme courts have adopted official policies regarding AI,” he noted, further pressuring judges to bone up on AI.

Meanwhile, several states have set up task forces to monitor their regional court systems and issue AI guidance, while states like Virginia and Montana have passed laws requiring human oversight for any AI systems used in criminal justice decisions.

Judges must prepare to spot obvious AI red flags

Until courts figure out how to navigate AI—a process that may look different from court to court—Browning advocates for more education and ethical guidance for judges to steer their use of and attitudes about AI. That could help judges avoid both ignorance of AI’s many pitfalls and overconfidence in AI outputs, protecting courts from hallucinations, biases, and evidentiary problems that slip past human review and scramble the court system.

An overlooked part of educating judges could be exposing AI’s influence so far in courts across the US. Henderson’s team is planning research that tracks which models attorneys are using most in courts. That could reveal “the potential legal arguments that these models are pushing” to sway courts—and which judicial interventions might be needed, Henderson told Ars.

“Over the next few years, researchers—like those in our group, the POLARIS Lab—will need to develop new ways to track the massive influence that AI will have and understand ways to intervene,” Henderson told Ars. “For example, is any model pushing a particular perspective on legal doctrine across many different cases? Was it explicitly trained or instructed to do so?”

Henderson also advocates for “an open, free centralized repository of case law,” which would make it easier for everyone to check for fake AI citations. “With such a repository, it is easier for groups like ours to build tools that can quickly and accurately verify citations,” Henderson said. That could be a significant improvement to the current decentralized court reporting system that often obscures case information behind various paywalls.
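The kind of checker such a repository would enable is conceptually simple; here is a minimal sketch, with a tiny in-memory set standing in for the centralized index Henderson proposes. A real tool would also need fuzzy matching to handle formatting variants.

```python
# Minimal citation-verification sketch. KNOWN_CITATIONS is a stand-in for
# a centralized case-law index; the two entries here are real cases.
KNOWN_CITATIONS = {
    "Brown v. Board of Education, 347 U.S. 483 (1954)",
    "Marbury v. Madison, 5 U.S. 137 (1803)",
}

def find_unverifiable(cited):
    """Return the citations in a filing that the index cannot confirm."""
    return [c for c in cited if c not in KNOWN_CITATIONS]

print(find_unverifiable([
    "Marbury v. Madison, 5 U.S. 137 (1803)",
    "Smith v. Jones, 123 F.4th 456 (11th Cir. 2099)",  # fabricated example
]))  # -> ['Smith v. Jones, 123 F.4th 456 (11th Cir. 2099)']
```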

Dazza Greenwood, who co-chairs MIT’s Task Force on Responsible Use of Generative AI for Law, did not have time to send comments but pointed Ars to a LinkedIn thread where he suggested that a structural response may be needed to ensure that all fake AI citations are caught every time.

He recommended that courts create “a bounty system whereby counter-parties or other officers of the court receive sanctions payouts for fabricated cases cited in judicial filings that they reported first.” That way, lawyers will know that their work will “always” be checked and thus may shift their behavior if they’ve been automatically filing AI-drafted documents. In turn, that could alleviate pressure on judges to serve as watchdogs. It also wouldn’t cost much—mostly just redistributing the exact amount of fees that lawyers are sanctioned to AI spotters.

Novel solutions like this may be necessary, Greenwood suggested. Responding to a question asking if “shame and sanctions” are enough to stop AI hallucinations in court, Greenwood said that eliminating AI errors is imperative because it “gives both otherwise generally good lawyers and otherwise generally good technology a bad name.” Leaning on AI bans or lawyer suspensions as the preferred solution risks draining court resources just as caseloads spike, rather than confronting the problem head-on.

Of course, there’s no guarantee that the bounty system would work. But “would the fact of such definite confidence that your cites will be individually checked and fabricated cites reported be enough to finally… convince lawyers who cut these corners that they should not cut these corners?”

In the absence of a fake case detector like the one Henderson wants to build, experts told Ars that there are some obvious red flags judges can note to catch AI-hallucinated filings.

Any case number with “123456” in it probably warrants review, Henderson told Ars. And Browning noted that AI tends to mix up locations for cases, too. “For example, a cite to a purported Texas case that has a ‘S.E. 2d’ reporter wouldn’t make sense, since Texas cases would be found in the Southwest Reporter,” Browning said, noting that some appellate judges have already relied on this red flag to catch AI misuses.
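Both red flags are mechanical enough to automate. The rough Python sketch below encodes them; the reporter table covers only the Texas example from the article, and a usable tool would need the full state-by-reporter mapping.

```python
# Heuristic screen for the two red flags described above: placeholder-style
# case numbers and a regional reporter that doesn't match the jurisdiction.
# The reporter map covers only the article's Texas example.
import re

REPORTERS_BY_STATE = {"Tex.": {"S.W.", "S.W.2d", "S.W.3d"}}

def citation_red_flags(citation):
    flags = []
    if "123456" in citation:
        flags.append("placeholder-looking case number")
    m = re.search(r"\b(S\.W\.3d|S\.W\.2d|S\.W\.|S\.E\.2d|S\.E\.)\b", citation)
    if m and "Tex." in citation and m.group(1) not in REPORTERS_BY_STATE["Tex."]:
        flags.append(f"reporter {m.group(1)} doesn't match Texas")
    return flags

print(citation_red_flags("Smith v. Jones, 123 S.E.2d 456 (Tex. 1999)"))
# -> ["reporter S.E.2d doesn't match Texas"]
```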

Those red flags would perhaps be easier to check with the open source tool that Henderson’s lab wants to make, but Browning said there are other tell-tale signs of AI usage that anyone who has ever used a chatbot is likely familiar with.

“Sometimes a red flag is the language cited from the hallucinated case; if it has some of the stilted language that can sometimes betray AI use, it might be a hallucination,” Browning said.

Judges already issuing AI-assisted opinions

Several states have assembled task forces like Greenwood’s to assess the risks and benefits of using AI in courts. In Georgia, the Judicial Council of Georgia Ad Hoc Committee on Artificial Intelligence and the Courts released a report in early July providing “recommendations to help maintain public trust and confidence in the judicial system as the use of AI increases” in that state.

Adopting the committee’s recommendations could establish “long-term leadership and governance”; a repository of approved AI tools, education, and training for judicial professionals; and more transparency on AI used in Georgia courts. But the committee expects it will take three years to implement those recommendations while AI use continues to grow.

Possibly complicating things further as judges start to explore using AI assistants to help draft their filings, the committee concluded that it’s still too early to tell if the judges’ code of conduct should be changed to prevent “unintentional use of biased algorithms, improper delegation to automated tools, or misuse of AI-generated data in judicial decision-making.” That means, at least for now, that there will be no code-of-conduct changes in Georgia, where the only case in which AI hallucinations are believed to have swayed a judge has been found.

Notably, the committee’s report also confirmed that there are no role models for courts to follow, as “there are no well-established regulatory environments with respect to the adoption of AI technologies by judicial systems.” Browning, who chaired a now-defunct Texas AI task force, told Ars that judges lacking guidance will need to stay on their toes to avoid trampling legal rights. (A spokesperson for the State Bar of Texas told Ars the task force’s work “concluded” and “resulted in the creation of the new standing committee on Emerging Technology,” which offers general tips and guidance for judges in a recently launched AI Toolkit.)

“While I definitely think lawyers have their own duties regarding AI use, I believe that judges have a similar responsibility to be vigilant when it comes to AI use as well,” Browning said.

Judges will continue sorting through AI-fueled submissions not just from pro se litigants representing themselves but also from up-and-coming young lawyers who may be more inclined to use AI, and even seasoned lawyers who have been sanctioned up to $5,000 for failing to check AI drafts, Browning suggested.

In his upcoming “AI Judge” article, Browning points to at least one judge, 11th Circuit Court of Appeals Judge Kevin Newsom, who has used AI as a “mini experiment” in preparing opinions for both a civil case involving an insurance coverage issue and a criminal matter focused on sentencing guidelines. Browning seems to appeal to judges’ egos to get them to study up so they can use AI to enhance their decision-making and possibly expand public trust in courts, not undermine it.

“Regardless of the technological advances that can support a judge’s decision-making, the ultimate responsibility will always remain with the flesh-and-blood judge and his application of very human qualities—legal reasoning, empathy, strong regard for fairness, and unwavering commitment to ethics,” Browning wrote. “These qualities can never be replicated by an AI tool.”


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



Nothing Phone 3 review: Nothing ventured, nothing gained


The Nothing Phone 3 is the company’s best phone by a wide margin, but is that enough?

The Nothing Phone 3 has a distinctive design. Credit: Ryan Whitwam

The last few years have seen several smartphone makers pull back or totally abandon their mobile efforts. UK-based Nothing Technologies, however, is still trying to carve out a niche in the increasingly competitive smartphone market. Its tools have been quirky designs and glowing lights, along with a focus on markets outside the US. With the Nothing Phone 3, the company has brought its “first flagship” phone stateside.

Nothing didn’t swing for the fences with the Phone 3’s specs, but this device can hold its own with the likes of OnePlus and Google. Plus, it has that funky Nothing design aesthetic. There’s a transparent back, a tiny dot matrix screen, and a comprehensive Android skin. But at the end of the day, the Nothing Phone 3 is not treading new ground.

Designing Nothing

Despite Nothing’s talk about unique designs, the Nothing Phone 3 looks unremarkable from the front. The bezels are slim and symmetrical all the way around the screen. Under a sheet of Gorilla Glass 7i, it has a 6.67-inch 120Hz OLED screen with an impressive 1260 x 2800 resolution. It hits 4,500 nits of peak brightness—higher than comparable Google and Samsung phones. It’s more than bright enough to be readable outdoors, and the touch sensitivity is excellent—sometimes too excellent, as we’ve noticed a few accidental edge touches.

Specs at a glance: Nothing Phone 3
SoC: Snapdragon 8s Gen 4
Memory: 12GB or 16GB
Storage: 256GB or 512GB
Display: 6.67″ 1260 x 2800 OLED, 120 Hz
Cameras: 50MP primary (f/1.7, OIS); 50MP ultrawide (f/2.2); 50MP 3x telephoto (f/2.7, OIS); 50MP selfie (f/2.2)
Software: Android 15, 5 years of OS updates
Battery: 5,150 mAh, 65 W wired charging, 15 W wireless charging
Connectivity: Wi-Fi 7, NFC, Bluetooth 6.0, sub-6 GHz 5G, USB-C 3.2
Measurements: 160.6 x 75.6 x 9 mm; 218 g

Like many other phones, the Nothing Phone 3 has an optical fingerprint sensor under the display. It’s quick and accurate, but it’s a bit too low (barely a pinky finger’s width from the bottom of the device). As an optical sensor, it’s also very bright in a dark room. Similar phones from Google and Samsung have faster and less disruptive ultrasonic fingerprint sensors.

Nothing OS is a great Android skin. Credit: Ryan Whitwam

The overall shape of the phone is almost the same as current Samsung, Apple, and Google phones, but it’s closest to the Pixel 9 series. The IP68-rated body has the same minimalist aesthetic as those other phones, with flat edges and rounded corners. The aluminum frame curves in to merge seamlessly with the front and rear glass panels. It has a matte finish, making it reasonably grippy in the hand. Nothing includes a clear case in the box—we appreciate the effort, but the case feels very cheap and will probably discolor after a couple of months of use.

You won’t see anything extravagant like a headphone jack or IR blaster. The volume and power buttons are flat, tactile, and very stable, with no discernible wiggle. Below the power button is the Essential Key, a convex button that plugs into Nothing’s on-device AI features (more on that later). It’s a delight for button-lovers, but it can be too easy to accidentally press when picking up the phone. And no, you can’t remap the button to do something else.

The Essential Button has a nice feel, but it’s too easy to mistake for the power button. Credit: Ryan Whitwam

It’s not until you get to the back that the Nothing Phone 3 stands out. The back has a clear panel of extra-strong Gorilla Glass Victus, but you’re not seeing the phone’s internals through it. The panels under the glass have slightly different colors and textures and were chosen to create an interesting visual effect. It’s certainly eye-catching, but whether or not you like it is a matter of taste. The camera sensors are near the top in a staggered arrangement, right across from the “Glyph Matrix.”

The monochrome Glyph Matrix is Nothing’s replacement for the Glyph light bars on its older phones. A pressure-sensitive button under the glass can be pressed to switch between various display options, some of which might occasionally be useful, like a clock and battery monitor. There are also less useful “Glyph toys” like a Magic 8-ball, a low-fi mirror, and a Rock, Paper, Scissors simulator. It can also display call and status notifications, for instance letting you know when Do Not Disturb is activated or when you have a missed call. Or you can just turn the phone over and use the full display.

The Glyph Matrix is a gimmick, but it does look cool. Credit: Ryan Whitwam

There’s only so much you can do with 489 LEDs and a single button, which makes some of the toys frustrating. For example, you have to long-press to stop the stopwatch, which defeats the purpose, and the selfie mirror is very difficult to use for framing a photo. The Glyph dot matrix is fun to play around with, but it’s just a gimmick. Really, how much time do you spend looking at the back of your phone? Checking the time or playing Rock, Paper, Scissors is not a game-changer, even if the display is visually interesting.

Flagship-ish performance

Nothing says this is a flagship phone, but it doesn’t have Qualcomm’s flagship mobile processor. While you’ll find the Snapdragon 8 Elite in most high-end devices today, Nothing went with the slightly more modest Snapdragon 8s Gen 4. It doesn’t have the Oryon CPU cores, relying instead on eight Arm reference cores, along with a slower GPU.

The Nothing Phone 3 (left) is about the same size and shape as the Pixel 9 Pro XL (right). Credit: Ryan Whitwam

What does that mean for the speeds and feeds? The Nothing Phone 3 doesn’t keep up with high-end devices like the Galaxy S25 in benchmarks, but it’s no slouch, either. In fact, the Snapdragon 8s Gen 4 beats Google’s latest Tensor chip featured in the Pixel 9 series.

As expected, the standard Arm cores fall behind the custom Oryon CPUs in Geekbench, running about 40 percent behind Qualcomm’s best processor. However, the gulf is much narrower in graphics because the Adreno 825 in the Nothing Phone 3 is very similar to the 830 used in Snapdragon 8 Elite phones.

So you could see better gaming performance with a phone like the Galaxy S25 compared to the Nothing Phone 3, but only if you’re playing something very graphically intensive. Even when running these devices side by side, we have a hard time noticing any loss of fidelity on the Nothing Phone 3. It performs noticeably better in high-end games compared to the latest Pixels, though. The Phone 3 maintains performance fairly well under load, only losing 25 to 30 percent at peak temperature. The body of the phone does get uncomfortably hot, but that’s better than overheating the processor.

That modest drop in CPU performance benchmarks does not equate to a poor user experience. The Nothing Phone 3 is very snappy, opening apps quickly and handling rapid multitasking without hesitation. The animations also have a Google level of polish.

Nothing managed to fit a 5,150 mAh battery in this phone, which is a bit larger than even the Galaxy S25 Ultra at 5,000 mAh. The battery life is strong, with the phone easily making it all day—no range anxiety. It won’t last through a second day on a single charge, though. Just like a Pixel or Galaxy phone, you’ll want to plug the Nothing Phone 3 in every night.

But you don’t necessarily have to save your charging for nighttime. The Nothing Phone 3 offers 65 W wired charging, which is much faster than what you get from Google, Samsung, or Apple phones. If the battery gets low, just a few minutes connected to almost any USB-PD charger will get you enough juice to head out the door. You also get 15 W wireless charging, but it doesn’t support the magnetic Qi 2 standard.

We’ve had no problems using the Phone 3 on T-Mobile, and Nothing says AT&T is also fully supported. However, there’s no official support for Verizon. The phone has all the necessary sub-6GHz 5G bands, but you may have trouble activating it as a new device on Verizon’s network.

Upgraded cameras

A camera upgrade was a necessary part of making this device a “flagship” phone, so Nothing equipped the Phone 3 with a solid array of sensors, ensuring you’ll get some good shots. They won’t all be good, though.

The clear glass shows off subtly differing blocks and a button to control the Glyph Matrix display. Credit: Ryan Whitwam

The Nothing Phone 3 has a quartet of 50 MP sensors, including a wide-angle, a 3x telephoto, and an ultrawide on the back. The front-facing selfie camera is also 50 MP. While you can shoot in 50 MP mode, smartphone camera sensors are designed with pixel binning in mind. The phone outputs 12.5 MP images, leaning on merged pixel elements to brighten photos and speed up captures. We’ve found Nothing’s color balance and exposure to be very close to reality, and the dynamic range is good enough that you don’t have to worry about overly bright or dim backgrounds ruining a shot.
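To make the binning arithmetic concrete, here is a toy NumPy sketch: a roughly 50 MP mosaic averaged down 2x2 into a roughly 12.5 MP frame. The dimensions are illustrative, and real binning happens on the sensor readout, not in post-processing like this.

```python
# Toy 2x2 pixel binning: ~50 MP of raw pixels averaged to a ~12.5 MP image.
# Sensor dimensions are illustrative.
import numpy as np

sensor = np.random.randint(0, 1024, size=(8192, 6144), dtype=np.uint16)  # ~50.3 MP
h, w = sensor.shape
binned = sensor.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))  # ~12.6 MP
print(sensor.size / 1e6, "->", binned.size / 1e6, "MP")
```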

The Nothing Phone 3 cameras can produce sharp details, but some images tend to look overprocessed and “muddy.” However, the biggest issue is shutter lag—there’s too much of it. It seems like the phone is taking too long to stack and process images. So even outdoors and with a high shutter speed, a moving subject can look blurry. It’s challenging to snap a clear photo of a hyperactive kid or pet. In low-light settings, the shutter lag becomes worse, making it hard to take a sharp photo. Night mode shots are almost always a bit fuzzy.

Low indoor light. Credit: Ryan Whitwam

Photos of still subjects are generally good, and you can get some nice ones with the ultrawide camera. Landscapes look particularly nice, and the camera has autofocus for macro shots. This mode doesn’t activate automatically when you move in, so you have to remember it’s there. It’s worth remembering, though.

The telephoto sensor uses a periscope-style lens, which we usually see on sensors with 5x or higher zoom factors. This one is only 3x, so it will get you somewhat closer to your subject without cropping, but don’t expect the same quality you’d get from a Pixel or Samsung phone.

In its sub-flagship price range, we’d put the Nothing Phone 3 camera experience on par with Motorola. A device like the OnePlus 13R or Pixel 9a will take better pictures, but the Nothing Phone 3 is good enough unless mobile photography is at the top of your requirements.

Great software, plus an AI button

Nothing isn’t beating Samsung to the punch with Android 16—the first new phones to launch with Google’s latest OS will be the Z Fold 7 and Z Flip 7 later this month. Nothing is releasing its phone with Android 15 and Nothing OS 3.5, but an Android 16 update is promised soon. There’s not much in the first Android 16 release to get excited about, though, and in the meantime, Nothing OS is actually quite good.

Nothing’s take on Android makes changes to almost every UI element, which is usually a recipe for Samsung levels of clutter. However, Nothing remains true to its minimalist aesthetic throughout the experience. The icon styling is consistent and attractive, Nothing’s baked-in apps are cohesive, and the software includes some useful home screen options and widgets. Nothing also made a few good functional changes to Android, including a fully configurable quick settings panel and a faster way to clear your recent apps.

We’ve encountered a few minor bugs, like the weather widget that won’t show freedom units and a back gesture that can be a little finicky. Nothing’s Android skin is also very distinctive compared to other OEM themes. Not everyone will like the “dot matrix” vibe of Nothing OS, but it’s one of the more thoughtfully designed Android skins we’ve seen.

Nothing OS has a distinctive look. Credit: Ryan Whitwam

Like every other 2025 smartphone, there’s an AI angle here. Nothing has a tool called Essential Space that ties into the aforementioned Essential Key. When you press the button, it takes a screenshot you can add notes to. It logs that in Essential Space and turns an AI loose on it to glean important details. It can create to-do lists and reminders based on the images, but those suggestions are misses as often as they are hits. There’s also no search function like the Google Pixel Screenshots app, which seems like a mistake. You can hold the essential key to record a voice memo, which goes through a similar AI process.

There are also some privacy caveats with Essential Space. The screenshots you save are uploaded to a remote server for processing, but Nothing says it won’t store any of that data. Your voice notes are processed on-device, but it would be nice if images were as well.

Nothing has part of a good idea with its mobile AI implementation, but it’s not as engaging as what we’ve seen from Google. And it’s not as if Google’s use of AI is essential to the mobile experience. The Nothing Phone 3 also gets the standard Gemini integration, and Google’s chatbot will probably get much more use than Essential Space.

Nothing has promised five years of major Android version updates, and there will be two additional years of security patches after that. Nothing is still a very new company, though, and there’s no guarantee it will still be around in seven years. If we assume the best, this is a good update policy, surpassing Motorola and OnePlus but not quite at the level of Google or Samsung, both of which offer seven years of full update support.

Different but not that different

The Nothing Phone 3 is a good smartphone, and it’s probably the best piece of hardware the company has made in its short run. The performance is snappy, the software is thoughtfully designed, and the hardware, while gimmicky, is solid and visually interesting. If you prefer a more understated look or plan to encapsulate your phone in the most durable case you can find, this is not the phone for you.

The Nothing Phone 3 is a rather large, heavy phone. Credit: Ryan Whitwam

Nothing’s Glyph Matrix is fun to play with, but it’s the kind of thing you’ll write off after some time with the phone. You can only play so many games of Rock, Paper, Scissors before the novelty wears off. Nothing is not alone in going down this path—Asus has a dot matrix on its ROG gaming phones, and Xiaomi has slapped full LCDs on the back of a few of its devices. It’s really no different from the days when OEMs tinkered with secondary ticker displays and rear-facing e-paper screens. Those weren’t very useful, either.

Nothing did all it could to make the secondary display attractive, but even if it came up with a truly great idea, there’s little utility in a screen on the back of your phone. The transparent design and dot matrix screen help the phone stand out from the crowd, but not because they’re doing anything radical. This is still a pretty typical glass sandwich smartphone, like most other 2025 offerings.

At $799, the Nothing Phone 3 is competing with devices like the Pixel 9 and OnePlus 13, both of which have it beat in the camera department, and the OnePlus phone is faster. Meanwhile, Google also has better update support. If you buy the Nothing Phone 3, it should be because you genuinely like the hardware and software design, and there’s very little bad to say about Nothing OS. Otherwise, there are better options for the same or less money.

The good

  • Excellent build quality with IP68 rating
  • Nothing OS looks and works great
  • Good performance
  • Glyph Matrix looks cool

The bad

  • Glyph Matrix is an unnecessary gimmick
  • AI features are still not very useful
  • Cameras have noticeable shutter lag
  • Verizon not officially supported


Ryan Whitwam is a senior technology reporter at Ars Technica, covering the ways Google, AI, and mobile technology continue to change the world. Over his 20-year career, he’s written for Android Police, ExtremeTech, Wirecutter, NY Times, and more. He has reviewed more phones than most people will ever own. You can follow him on Bluesky, where you will see photos of his dozens of mechanical keyboards.

Nothing Phone 3 review: Nothing ventured, nothing gained Read More »

everything-we-learned-from-a-week-with-apple-carplay-ultra

Everything we learned from a week with Apple CarPlay Ultra


CarPlay Ultra takes over the main instrument display as well as the infotainment.

Aston Martin is the first automaker to adopt Apple’s CarPlay Ultra, which takes over all the displays in the car. Credit: Michael Teo Van Runkle

For the 2025 model year, Aston Martin’s user interface took a major step forward across the lineup, with improvements to the physical controls and digital infotainment, as well as updated gauge cluster layouts. However, the big news dropped in the spring, when Aston and Apple announced the launch of CarPlay Ultra, the next generation of Apple’s nearly ubiquitous automotive operating system.

Ultra extends beyond the strictly “phone” functions of traditional CarPlay to now encompass more robust vehicular integration, including climate control, drive modes, and the entire gauge cluster readout. Running Ultra, therefore, requires a digital gauge cluster. So far, not many automakers other than Aston have signaled their intent to join the revolution: Kia/Hyundai/Genesis will adopt Ultra next, and Porsche may come after that.

Before future partnerships come to fruition, I spent a week with a DB12 Volante to test Ultra’s use cases and conceptual failure points, most critically to discover whether this generational leap actually enhances or detracts from an otherwise stellar driving experience.

Setup

The following gallery will take you through the setup process. Credit: Michael Teo Van Runkle

Connecting to Ultra via Bluetooth takes a minute or two longer than traditional CarPlay and includes more consent screens to cover the additional legal ramifications of the operating system sharing data with the car, and vice versa. Apple restricts this data to multimedia info, plus real-time speed and engine status, vehicle lights, and similar functions. Specifically, neither the iPhone nor third-party apps store any vehicle data after disconnecting from the car, and the car doesn’t keep personal data once the iPhone disconnects, either.

What about Siri? I generally keep Siri turned off so that accidental “Hey, Siri” activations don’t constantly interrupt my life—but by pushing the DB12’s steering wheel button, I could test simple tasks that went just about as well as typical for Siri (read: don’t expect much “Apple Intelligence” quite yet). Standard Siri data sharing with Apple therefore applies when used with Ultra.

I tested Ultra with an iPhone 16 Pro, but the software requires an iPhone 12 or newer running the latest iOS 18.5 update. As a simple failure exercise, I turned my phone off more than once while driving. Doing so reverts both the gauge cluster and infotainment screen to Aston’s native UI, the former almost instantly and the latter just a few seconds later. However, once I turned my phone back on, I struggled to reactivate either traditional CarPlay or Ultra until I told the car to forget the device in my Bluetooth settings and started over from scratch. This held true for every attempt.

We didn’t love the fact that there was some latency with the needles on the dials. Credit: Michael Teo Van Runkle

Once initiated, though, Ultra fired up straightaway every time, much faster than the typical lag to boot up traditional CarPlay. In fact, as soon as I unlocked the doors but before entering the DB12, the gauge cluster showed Ultra’s Apple-style readouts. These configurable designs, which Apple developed with Aston’s input, include a classic analog-style gauge view as well as layouts that allow for minimized data, navigation, and stylistic choices selectable through the center console screen or by swiping the haptic button on the DB12’s steering wheel.

Call me old-fashioned, but I still enjoy seeing a tachometer, speedometer, drive modes, and fuel level rather than just range remaining and a digital speed—especially on an engaging performance vehicle like the DB12 Volante. Apple might be skilled at making new tech easy to use, but it’s hard to beat the power of millions of minds adapting to analog gauges over the past century or so. And in this case, Ultra’s tachometer showed a bit of latency while ripping that 671-hp twin-turbo V8 up through the revs, something I never noticed in the native UI.

It’s much more holistic now

Ultra’s biggest improvements over preceding CarPlay generations are in the center console infotainment integration. Being able to access climate controls, drive modes, and traction settings without leaving the intuitive suite of CarPlay makes life much easier. In fact, changing between drive modes and turning traction control off or down via Aston’s nifty adjustable system produced less display latency and lag in Ultra than in the native UI. And for climate, Ultra actually brings up a much better screen after spinning the physical rotaries on the center console than you get through Aston’s UI—plus, I found a way to make the ventilated seats blow stronger, which I never located in the native UI despite purposefully searching for a similar menu page.

There are different main instrument UIs to choose from, like this one. Credit: Michael Teo Van Runkle

Some specific functions do require dipping out of Ultra, though, including changing any audio settings for the spectacular Bowers & Wilkins sound system. I also found two glitches. Trying to bring down the DB12 Volante’s convertible top cued up a “Close trunk separator” alert, but the only way to close the trunk separator is via the same button as the convertible top. So instead, the windows only went up and down repeatedly as I tried to enjoy open-top motoring. This happened both in Ultra and without, however, so it could just be an Aston issue that Ultra couldn’t fix.

Plus, over the course of my eight days with Ultra, I experienced one moment where both the infotainment and gauge cluster went totally black. This resembled GM’s Ultium screen issues and lasted about 30 seconds before both flickered back to life. At first, I suspected an inadvertent attempt to activate nighttime driving mode. But again, this could have been an Aston issue, an Apple issue, or both.

Running around Los Angeles, I never found a spot with zero reception (I run eSIMs on both Verizon and AT&T simultaneously for this very reason), but I did purposefully enter airplane mode. This time, Ultra stayed active, and regardless, Apple assured me that essential functions, including navigation, can pre-load offline data for planned route guidance. At the very worst, as with the phone turning off or the battery dying, Ultra can simply revert to the onboard navigation.

Using Ultra regularly seemed to deplete my iPhone’s battery slightly more quickly than normal, and I noticed some warming of the iPhone—though without a controlled experiment, I can’t say with certainty whether these two symptoms were any worse than with traditional CarPlay or Bluetooth. In reality, most cars running Ultra (for Aston and beyond) should come equipped with wireless charge pads and plenty of USB-C ports anyhow to keep those batteries topped up. On hot summer days in LA, though, my iPhone seemed to get warmest while using inductive charging and Ultra simultaneously, at least to my admittedly unscientific touch.

Apple Maps is the only mapping app allowed here in CarPlay Ultra. Credit: Michael Teo Van Runkle

For commuters who brave traffic using Advanced Driver Assistance Systems (ADAS), Ultra seemed to work smoothly with the DB12’s lane departure warnings, steering corrections, and adaptive cruise control—though I typically turn all this off via Aston’s handy single button, which helps to stave off frustration. This exposes a regulatory gap, however: does CarPlay Ultra need to meet the ISO 26262 ASIL-D standard or achieve some kind of National Highway Traffic Safety Administration certification?

Traditional CarPlay stuck with infotainment and basic “phone” functions, but now that the iPhone essentially accesses and displays ADAS, drive modes, and traction setting information, where does regulated consumer safety come in? And where does liability rest, in the event of a driver aid or corrective maneuver going awry? Somehow, this question seems most likely to wind up on the desk of an insurance adjuster sooner rather than later.

Can we try it in an EV?

For me, some disappointment arose from being unable to cue up either Waze or Google Maps in Ultra’s gauge cluster navigation screens, which are strictly Apple Maps territory. But in many ways, I suspect that Ultra might work even better when (or if) Hyundai/Kia/Genesis introduce compatible EVs, rather than Aston’s (so far) more classic ICE vehicles. Not just because the modern futurist aesthetic is a better match, but because of the improved accuracy of range, charging, and navigation features.

The center infotainment screen’s integration with vehicular functions, therefore, stands out as much more of a pro for Aston Martins than Ultra’s gauge cluster readout, enhancing the driving experience through a more intuitive UI that decreases time spent glancing away from the road. For those who want to skip out on Ultra, the iPhone also allows you to stick with traditional CarPlay only. However, I suspect car buyers will eventually begin to expect Ultra, even if the jump to vehicular control is a smaller leap than the original choice between models with or without CarPlay.

It’s unclear whether other automakers will find the advantages worth a conversion to Ultra. Rivian offers neither CarPlay nor Android Auto, and GM skipped CarPlay for its EVs. On the other hand, automakers may also decide to hesitate before handing over further control to Apple now that the Apple Car is officially dead. And in that regard, Ultra might just represent the final straw that inspires further improvements to proprietary user interfaces across the industry as well.

Everything we learned from a week with Apple CarPlay Ultra Read More »

the-iss-is-nearing-retirement,-so-why-is-nasa-still-gung-ho-about-starliner?

The ISS is nearing retirement, so why is NASA still gung-ho about Starliner?


NASA is doing all it can to ensure Boeing doesn’t abandon the Starliner program.

Boeing’s Starliner spacecraft atop a United Launch Alliance Atlas V rocket before a test flight in 2019. Credit: NASA/Joel Kowsky

After so many delays, difficulties, and disappointments, you might be inclined to think that NASA wants to wash its hands of Boeing’s troubled Starliner spacecraft.

But that’s not the case.

The manager of NASA’s commercial crew program, Steve Stich, told reporters Thursday that Boeing and its propulsion supplier, Aerojet Rocketdyne, are moving forward with several changes to the Starliner spacecraft to resolve problems that bedeviled a test flight to the International Space Station (ISS) last year. These changes include new seals to plug helium leaks and thermal shunts and barriers to keep the spacecraft’s thrusters from overheating.

Boeing, now more than $2 billion in the hole to pay for all Starliner’s delays, is still more than a year away from executing on its multibillion-dollar NASA contract and beginning crew rotation flights to the ISS. But NASA officials say Boeing remains committed to Starliner.

“We really are working toward a flight as soon as early next year with Starliner, and then ultimately, our goal is to get into crew rotation flights with Starliner,” Stich said. “And those would start no earlier than the second crew rotation slot at the end of next year.”

That would be 11 years after Boeing officials anticipated the spacecraft would enter operational service for NASA when they announced the Starliner program in 2010.

Decision point

The next Starliner flight will probably transport only cargo to the ISS, not astronauts. But NASA hasn’t made any final decisions on the matter. The agency has enough crew rotation missions booked to fly on SpaceX’s Dragon spacecraft to cover the space station’s needs until well into 2027 or 2028.

“I think there are a lot of advantages, I would say, to fly the cargo flight first,” Stich said. “If we really look at the history of Starliner and Dragon, I think Dragon benefited a lot from having earlier [cargo] flights before the crew contract was let for the space station.”

One drawback of flying a Starliner cargo mission is that it will use up one of United Launch Alliance’s remaining Atlas V rockets currently earmarked for a future Starliner crew launch. That means Boeing would have to turn to another rocket to accomplish its full contract with NASA, which covers up to six crew missions.

While Boeing says Starliner can launch on several different rockets, the difficulty of adapting the spacecraft to a new launch vehicle, such as ULA’s Vulcan, shouldn’t be overlooked. Early in Starliner’s development, Boeing and ULA had to overcome an issue with unexpected aerodynamic loads discovered during wind tunnel testing. This prompted engineers to design an aerodynamic extension, or skirt, to go underneath the Starliner spacecraft on top of its Atlas V launcher.

Starliner has suffered delays from the beginning. A NASA budget crunch in the early 2010s pushed back the program about two years, but the rest of the schedule slips have largely fallen on Boeing’s shoulders. The setbacks included a fuel leak and fire during a critical ground test, parachute problems, a redesign to accommodate unanticipated aerodynamic forces, and a computer timing error that cut short Starliner’s first attempt to reach the space station in 2019.

This all culminated in the program’s first test flight with astronauts last summer. But after running into helium leaks and overheating thrusters, the mission ended with Starliner returning to Earth empty, while the spacecraft’s two crew members remained on the International Space Station until they could come home on a SpaceX Dragon spacecraft this year.

The outcome was a stinging disappointment for Boeing. Going into last year’s crew test flight, Boeing appeared to be on the cusp of joining SpaceX and finally earning revenue as one of NASA’s certified crew transportation providers for the ISS.

For several months, Boeing officials were strikingly silent on Starliner’s future. The company declined to release any statements on its long-term commitment to the program, and a Boeing program manager unexpectedly withdrew from a NASA press conference marking the end of the Starliner test flight last September.

Kelly Ortberg, Boeing’s president and CEO, testifies before the Senate Commerce, Science, and Transportation Committee on April 2, 2025, in Washington, DC. Credit: Win McNamee/Getty Images

But that has changed in the last few months. Kelly Ortberg, who took over as Boeing’s CEO last year, told CNBC in April that the company planned “more missions on Starliner” and said work to overcome the thruster issues the spacecraft encountered last year is “pretty straightforward.”

“We know what the problems were, and we’re making corrective actions,” Ortberg said. “So, we hope to do a few more flights here in the coming years.”

Task and purpose

NASA officials remain eager for Starliner to begin these regular crew rotation flights, even as its sole destination, the ISS, enters its sunset years. NASA and its international partners plan to decommission and scuttle the space station in 2030 and 2031, more than 30 years after the launch of the lab’s first module.

NASA’s desire to bring Starliner online has nothing to do with any performance issues with SpaceX, the agency’s other commercial crew provider. SpaceX has met or exceeded all of NASA’s expectations in 11 long-duration flights to the ISS with its Dragon spacecraft. Since its first crew flight in 2020, SpaceX has established a reliable cadence with Dragon missions serving NASA and private customers.

However, there are some questions about SpaceX’s long-term plans for the Dragon program, and those concerns didn’t suddenly spring up last month, when SpaceX founder and chief executive Elon Musk suggested on X that SpaceX would “immediately” begin winding down the Dragon program. The suggestion came as Musk and President Donald Trump, one-time political allies, exchanged threats and insults on social media amid a dramatic falling out months into Trump’s second term in the White House.

In a subsequent post on X, Musk quickly went back on his threat to soon end the Dragon program. SpaceX officials participating in NASA press conferences in the last few weeks have emphasized the company’s dedication to human spaceflight without specifically mentioning Dragon. SpaceX’s fifth and final human-rated Dragon capsule debuted last month on its first flight to the ISS.

“I would say we’re pretty committed to the space business,” said Bill Gerstenmaier, SpaceX’s vice president of build and flight reliability. “We’re committed to flying humans in space and doing it safely.”

There’s a kernel of truth behind Musk’s threat to decommission Dragon. Musk has long had an appetite to move on from the Dragon program and pivot more of SpaceX’s resources to Starship, the company’s massive next-generation rocket. Starship is envisioned by SpaceX as an eventual replacement for Dragon and the Falcon 9 launcher.

A high-resolution commercial Earth-imaging satellite owned by Maxar captured this view of the International Space Station on June 7, 2024, with Boeing’s Starliner capsule docked at the lab’s forward port (lower right). Credit: Satellite image (c) 2024 Maxar Technologies

NASA hopes commercial space stations can take over for the ISS after its retirement, but there’s no guarantee SpaceX will still be flying Dragon in the 2030s. This injects some uncertainty into plans for commercial space stations.

One possible scenario is that, sometime in the 2030s, the only options for transporting people to and from commercial space stations in low-Earth orbit could be Starliner and Starship. We’ll discuss the rationale for this scenario later in this story.

While the cost of a seat on SpaceX’s Dragon is well known, there’s low confidence in the price of a ticket to low-Earth orbit on Starliner or Starship. What’s more, some of the commercial outposts may be incompatible with Starship because of its enormous mass, which could overcome the ability of a relatively modest space station to control its orientation. NASA identified this as an issue with its Gateway mini-space station in development to fly in orbit around the Moon.

It’s impossible to predict when SpaceX will pull the plug on Dragon. The same goes for Boeing and Starliner. But NASA and other customers are interested in buying more Dragon flights.

If SpaceX can prove Starship is safe enough to launch and land with people onboard, Dragon’s days will be numbered. But Starship is likely at least several years from being human-rated for flights to and from low-Earth orbit. NASA’s contract with SpaceX to develop a version of Starship to land astronauts on the Moon won’t require the ship to be certified for launches and landings on Earth. In some ways, that’s a more onerous challenge than the Moon mission because of the perils of reentering Earth’s atmosphere, which Starship won’t need to endure for a lunar landing, and the ship’s lack of a launch abort system.

Once operational, Starship is designed to carry significantly more cargo and people than Falcon 9 and Dragon, but it’s anyone’s guess when it might be ready for crew missions. Until then, if SpaceX wants to have an operational human spaceflight program, it’s Dragon or bust.

For the International Space Station, it’s also Dragon or bust, at least until Boeing gets going. SpaceX’s capsules are the only US vehicles certified to fly to space with NASA astronauts, and any more US government payments to Russia to launch Americans on Soyuz missions would be politically unpalatable.

From the start of the commercial crew program, NASA sought two contractors providing their own means of flying to and from the ISS. The main argument for this “dissimilar redundancy” was to ensure NASA could still access the space station in the event of a launch failure or some other technical problem. The same argument could be made now that NASA needs two options to avoid being at the whim of one company’s decisions.

Stretching out

All of this is unfolding as the Trump administration seeks to slash funding for the International Space Station, cut back on the lab’s research program, and transition to “minimal safe operations” for the final few years of its life. Essentially, the space station would limp to the finish line, perhaps with a smaller crew than the seven-person staff living and working in it today.

At the end of this month, SpaceX is scheduled to launch the Crew-11 mission—the 12th Dragon crew mission for NASA and the 11th fully operational crew ferry flight to the ISS. Two Americans, one Japanese astronaut, and a Russian cosmonaut will ride to the station for a stay of at least six months.

NASA’s existing contract with SpaceX covers four more long-duration flights to the space station with Dragon, including the mission set to go on July 31.

One way NASA can save money in the space station’s budget is by simply flying fewer missions. Stich said Thursday that NASA is working with SpaceX to extend the Dragon spacecraft’s mission duration limit from seven months to eight months. The recertification of Dragon for a longer mission could be finished later this year, allowing NASA to extend Crew-11’s stay at the ISS if needed. Over time, longer stays mean fewer crew rotation missions. (Covering 56 months of continuous crew presence, for example, takes eight seven-month increments but only seven eight-month ones.)

“We can extend the mission in real-time as needed as we better understand… the appropriations process and what that means relative to the overall station manifest,” Stich said.

Boeing’s Starliner spacecraft backs away from the International Space Station on September 6, 2024, without its crew. Credit: NASA

Boeing’s fixed-price contract with NASA originally covered an unpiloted test flight of Starliner, a demonstration flight with astronauts, and then up to six operational missions delivering crews to the ISS. But NASA has only given Boeing the “Authority To Proceed” for three of its six potential operational Starliner missions. This milestone, known as ATP, is a decision point in contracting lingo where the customer—in this case, NASA—places a firm order for a deliverable. NASA has previously said it awards these task orders about two to three years prior to a mission’s launch.

If NASA opts to go to eight-month missions on the ISS with Dragon and Starliner, the agency’s firm orders for three Boeing missions and four more SpaceX crew flights would cover the agency’s needs into early 2030, not long before the final crew will depart the space station.

Stich said NASA officials are examining their options. These include whether NASA should book more crew missions with SpaceX, authorize Boeing to prepare for additional Starliner flights beyond the first three, or order no more flights at all.

“As we better understand the budget and better understand what’s in front of us, we’re working through that,” Stich said. “It’s really too early to speculate how many flights we’ll fly with each provider, SpaceX and Boeing.”

Planning for the 2030s

NASA officials also have an eye for what happens after 2030. The agency has partnered with commercial teams led by Axiom, Blue Origin, and Voyager Technologies on plans for privately owned space stations in low-Earth orbit to replace some of the research capabilities lost with the end of the ISS program.

The conventional wisdom goes that these new orbiting outposts will be less expensive to operate than the ISS, making them more attractive to commercial clients, ranging from pharmaceutical research and in-space manufacturing firms to thrill-seeking private space tourists. NASA, which seeks to maintain a human presence in low-Earth orbit as it turns toward the Moon and Mars, will initially be an anchor customer until the space stations build up more commercial demand.

These new space stations will need a way to receive cargo and visitors. NASA wants to preserve the existing commercial cargo and crew transport systems so they’re available for commercial space stations in the 2030s. Stich said NASA is looking at transferring the rights for any of the agency’s commercial crew missions that don’t fly to the ISS over to the commercial space stations. Among NASA’s two commercial crew providers, it currently looks more likely that Boeing’s contract will have unused capacity than SpaceX’s when the ISS program ends.

This is a sweetener NASA could offer to its stable of private space station developers as they face other hurdles in getting their hardware off the ground. It’s unclear whether a business case exists to justify the expense of building and operating a commercial outpost in orbit or if the research and manufacturing customers that could use a private space station might find a cheaper option in robotic flying laboratories, such as those being developed by Varda Space Industries.

A rendering of Voyager’s Starlab space station. Credit: Voyager Space

NASA’s policies haven’t helped matters. Analysts say NASA’s financial support for private space station developers has lagged, and the agency’s fickle decision-making on when to retire the International Space Station has made private fundraising more difficult. It’s not a business for the faint-hearted. For example, Axiom has gone through several rounds of layoffs in the last year.

The White House’s budget request for fiscal year 2026 proposes a 25 percent cut to NASA’s overall budget, but the funding line for commercial space stations is an area marked for an increase. Still, there’s a decent chance that none of the proposed commercial outposts will be flying when the ISS crashes back to Earth. In that event, China would be the owner and operator of the only space station in orbit.

At least at first, transportation costs will be the largest expense for any company that builds and operates a privately owned space station. It costs NASA about 40 percent more each year to ferry astronauts and supplies to and from the ISS than it does to operate the space station. For a smaller commercial outpost with reduced operating costs, the gap will likely be even wider.

If Boeing can right the ship with Starliner and NASA offers a few prepaid crew missions to private space station developers, the money saved could help close someone’s business case and hasten the launch of a new era in commercial spaceflight.


Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

The ISS is nearing retirement, so why is NASA still gung-ho about Starliner? Read More »

two-guys-hated-using-comcast,-so-they-built-their-own-fiber-isp

Two guys hated using Comcast, so they built their own fiber ISP


Brothers-in-law use construction knowledge to compete against Comcast in Michigan.

Samuel Herman (left) and Alexander Baciu (right), founders of Prime-One. Credit: Prime-One

Samuel Herman and Alexander Baciu never liked using Comcast’s cable broadband. Now, the residents of Saline, Michigan, operate a fiber Internet service provider that competes against Comcast in their neighborhoods and has ambitions to expand.

“All throughout my life pretty much, I’ve had to deal with Xfinity’s bullcrap, them not being able to handle the speeds that we need,” Herman told Ars. “I lived in a house of 10. I have seven other brothers and sisters, and there’s 10 of us in total with my parents.”

With all those kids using the Internet for school and other needs, “it just doesn’t work out,” he said. Herman was particularly frustrated with Comcast upload speeds, which are much slower than the cable service’s download speeds.

“Many times we would have to call Comcast and let them know our bandwidth was slowing down… then they would say, ‘OK, we’ll refresh the system.’ So then it would work again for a week to two weeks, and then again we’d have the same issues,” he said.

Herman, now 25, got married in 2021 and started building his own house, and he tried to find another ISP to serve the property. He was familiar with local Internet service providers because he worked in construction for his father’s company, which contracts with ISPs to build their networks.

But no fiber ISP was looking to compete directly against Comcast where he lived, though Metronet and 123NET offer fiber elsewhere in the city, Herman said. He ended up paying Comcast $120 a month for gigabit download service with slower upload speeds. Baciu, who lives about a mile away from Herman, was also stuck with Comcast and was paying about the same amount for gigabit download speeds.

$80 for gigabit fiber, unlimited data

Herman said he was the chief operating officer of his father’s construction company and that he shifted the business “from doing just directional drilling to be a turnkey contractor for ISPs.” Baciu, Herman’s brother-in-law (having married Herman’s oldest sister), was the chief construction officer. Fueled by their knowledge of the business and their dislike of Comcast, they founded a fiber ISP called Prime-One.

Now, Herman is paying $80 a month to his own company for symmetrical gigabit service. Prime-One also offers 500Mbps for $75, 2Gbps for $95, and 5Gbps for $110. The first 30 days are free, and all plans have unlimited data and no contracts.

“We are 100 percent fiber optic,” Baciu told Ars. “Everything that we’re doing is all underground. We’re not doing aerial because we really want to protect the infrastructure and make sure we’re having a reliable connection.”

Each customer’s Optical Network Terminal (ONT) and other equipment are included in the service plan. Prime-One provides a modem and the ONT, plus a Wi-Fi router if the customer prefers not to use their own router. They don’t charge equipment or installation fees, Herman and Baciu said.

Prime-One began serving customers in January 2025, and Baciu said the network has been built to about 1,500 homes in Saline with about 75 miles of fiber installed. Prime-One intends to serve nearby towns as well, with the founders saying the plan is to serve 4,000 homes with the initial build and then expand further.

“This is our backyard”

Herman and Baciu’s main competition in their initial build area is Comcast and Frontier’s DSL service, they said. So far, they have built only to single-family homes, but they plan to serve multi-unit residential buildings, too.

“We started building in an area that’s a lot more rural,” where people have fewer options than in more densely populated areas, Herman said. “This is our home, this is our backyard, so we take this build very, very seriously.”

Baciu, who is 29, said that residents seem excited to have a new Internet option. “It’s so nice to see the excitement that they have. [People say], ‘Oh my gosh, I told everybody about Prime-One. My neighbor cannot wait for you guys to have them up, too. My boss is asking, my grandma’s asking.’ It’s a beautiful thing,” he said.

A bit more than 100 residents have bought service so far, they said. Herman said the company is looking to sign up about 30 percent of the homes in its network area to make a profit. “I feel fairly confident,” Herman said, noting the number of customers who signed up with the initial construction not even halfway finished.

Prime-One’s founders originally told us the 4,000-home build would be completed at the end of August, but Baciu indicated more recently that it will take longer than that. “We are working on sales for the next couple of months before continuing the rest of the build,” Baciu said.

Herman and Baciu started thinking about building an ISP about two years ago. With no fiber companies looking to compete against Comcast where they lived, “that was a trigger,” Baciu said. “We kept on talking. We’re like, hey, we’re doing this work for other people, why not?” In August 2024, they signed a contract with a firm that provides backhaul service, IP address assignments, and other key connectivity needs.

“We said, ‘let’s try to do it ourselves’”

ISPs generally want to build in areas where homes are built close together, requiring less fiber construction to serve more customers and make a bigger profit. Existing ISPs didn’t seem interested in expanding to where Herman and Baciu live, Herman said.

“We have spoken to all of these Internet service providers and asked them to come and service these areas. I knew that there was a dire need in this area and that everybody was sick of the Xfinity BS,” Herman said.

Having worked in construction for ISPs, they already had experience installing fiber lines and conduits.

A Prime-One installer working on a fiber build. Credit: Prime-One

“We said, ‘you know, what the hell, why not? Let’s try to do it ourselves,'” Herman said. “We know we can handle the construction, we know we can handle all that area. We need some assistance on the technical side. So we hired the right people to handle the technical side and to handle the OSS/BSS software and to manage our dark fiber. And from there, we’re here where we’re at, within six months. We have over a hundred customers on our network, and we’re still building.”

Before construction, the brothers-in-law met with Jared Mauch, a Michigan man who built a fiber-to-the-home Internet provider because he couldn’t get good broadband service from AT&T or Comcast. We wrote about Mauch in 2021, when he was providing service to about 30 rural homes, and again in 2022, when he was expanding to hundreds of more homes.

Though Herman and Baciu already knew how to install fiber, Mauch “gave us quite a lot of insight on what to do, how to build, and on the actual ISP side… he showed us the way he did things on the technical side for the ISP, what strategies he used and what products he used,” Herman said.

The brothers-in-law didn’t end up using all the networking products Mauch suggested “because we are building a much larger network than he was,” Herman said. They went mostly with Nokia products for equipment like the optical network terminal installed at customer homes, he said.

Local employees

Baciu said he was frustrated by Comcast customer support being mostly limited to online chats instead of phone support. Prime-One has 15 local employees, mostly installers and technicians, with other employees working in customer service and operations, Herman said.

Prime-One offers phone and chat support, and “many people want to be able to see someone face to face, which is very easy for us to do since we have people here locally,” Herman said.

Network uptime has been good so far, Herman and Baciu said. “The only outage we’ve had was due to severe weather that caused a massive outage” for multiple networks, Herman said. “Any time any customers are experiencing an outage, maybe because of a lawnmower that cut their service line or anything, we guarantee a two- to four-hour time to repair it. And on top of that, to promote the fact that we discourage outages and we are working our best to fix them, we offer $5 back for every hour that they’re out of service.”

Comcast seems to have noticed, Herman said. “They’ve been calling our clients nonstop to try to come back to their service, offer them discounted rates for a five-year contract and so on,” he said.

Comcast touts upgrades, new unlimited data option

A Comcast spokesperson told Ars that “we have upgraded our network in this area and offer multi-gig speeds there, and across Michigan, as part of our national upgrade that has been rolling out.”

Meanwhile, Comcast’s controversial data caps are being phased out. With Comcast increasingly concerned about customer losses, it recently overhauled its offerings with four plans that come with unlimited data. The Comcast data caps aren’t quite dead yet because customers with caps have to switch to a new plan to get unlimited data.

Comcast told us that customers in Saline “have access to our latest plans with simple and predictable all-in pricing that includes unlimited data, Wi-Fi equipment, a line of Xfinity Mobile, and the option for a one or five-year price guarantee.”

Prime-One’s arrival on the scene caught some local people’s attention in a Reddit thread. One person who said they signed up for Prime-One wrote, “I’m honestly very impressed with the service overall. Comcast was charging me for every little thing on my account and the bill always found a way to get higher than expected, especially going over my data cap. Prime-One has no data caps and the bill has been the same since I first joined, not to mention they offer the first month free… I’m happy to see a company come out here and give us a better option.”

Comcast is facing competition from more than just Prime-One. The City of Saline government recently said there’s been an uptick in fiber construction in the city by Metronet and Frontier. Baciu said those builds don’t appear to be in the areas that Prime-One is serving. “To our knowledge, both Frontier and MetroNet have recently begun building in adjacent areas near our current footprint, but not within the zones we’re serving directly,” he said.

While Prime-One is a small ISP, Herman said the company’s expansion ambitions are bigger than he can reveal just now. “We have plans that we cannot disclose at this moment, but we do have a plan to expand,” he said.


Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.

Two guys hated using Comcast, so they built their own fiber ISP Read More »

it’s-hunting-season-in-orbit-as-russia’s-killer-satellites-mystify-skywatchers

It’s hunting season in orbit as Russia’s killer satellites mystify skywatchers


“Once more, we play our dangerous game—a game of chess—against our old adversary.”

In this pool photograph distributed by the Russian state media agency Sputnik, Russia’s President Vladimir Putin gives a speech during the Victory Day military parade at Red Square in central Moscow on May 9, 2025. Credit: Yacheslav Prokofyev/Pool/AFP via Getty Images

Russia is a waning space power, but President Vladimir Putin has made sure he still has a saber to rattle in orbit.

This has become more evident in recent weeks, when we saw a pair of rocket launches carrying top-secret military payloads, the release of a mysterious object from a Russian mothership in orbit, and a sequence of complex formation-flying maneuvers with a trio of satellites nearly 400 miles up.

In isolation, each of these things would catch the attention of Western analysts. Taken together, the frenzy of maneuvers represents one of the most significant surges in Russian military space activity since the end of the Cold War. What’s more, all of this is happening as Russia lags further behind the United States and China in everything from rockets to satellite manufacturing. Russian efforts to develop a reusable rocket, field a new human-rated spacecraft to replace the venerable Soyuz, and launch a megaconstellation akin to SpaceX’s Starlink are going nowhere fast.

Russia has completed just eight launches to orbit so far this year, compared to 101 orbital attempts by US launch providers and 36 from China. This puts Russia on pace for its fewest orbital launch attempts since 1961, the year Soviet citizen Yuri Gagarin became the first person to fly in space.

For the better part of three decades, Russia’s space program could rely on money from Western governments and commercial companies to build rockets, launch satellites, and ferry astronauts to and from the International Space Station. The money tap dried up after Russia’s invasion of Ukraine. Russia also lost access to the Ukrainian-made components that went into its launch vehicles and satellites.

Chasing a Keyhole

Amid this retrenchment, Russia is targeting what’s left of its capacity for innovation in space toward pestering the US military. US intelligence officials last year said they believed Russia was pursuing a project to place a nuclear weapon in space. The detonation of a nuclear bomb in orbit could muck up the space environment for years, indiscriminately disabling countless satellites, whether they’re military or civilian.

Russia denied that it planned to launch a satellite with a nuclear weapon, but the country’s representative in the United Nations vetoed a Security Council resolution last year that would have reaffirmed a nearly 50-year-old ban on placing weapons of mass destruction into orbit.

While Russia hasn’t actually put a nuclear bomb into orbit yet, it’s making progress in fielding other kinds of anti-satellite systems. Russia destroyed one of its own satellites with a ground-launched missile in 2021, and high above us today, Russian spacecraft are stalking American spy satellites and keeping US military officials on their toes with a rapid march toward weaponizing space.

The world’s two other space powers, the United States and China, are developing their own “counter-space” weapons. But the US and Chinese militaries have largely focused on using their growing fleets of satellites as force multipliers in the terrestrial domain, enabling precision strikes, high-speed communications, and targeting for air, land, and naval forces. That is starting to change, with US Space Force commanders now openly discussing their own ambitions for offensive and defensive counter-space weapons.

Three of Russia’s eight orbital launches this year have carried payloads that could be categorized as potential anti-satellite weapons, or at least prototypes testing novel technologies that could lead to one. (For context, three of Russia’s other launches this year have gone to the International Space Station, and two launched conventional military communications or navigation satellites.)

One of these mystery payloads launched on May 23, when a Soyuz rocket boosted a satellite into a nearly 300-mile-high orbit perfectly aligned with the path of a US spy satellite owned by the National Reconnaissance Office. The new Russian satellite, designated Kosmos 2588, launched into the same orbital plane as an American satellite known to the public as USA 338, which is widely believed to be a bus-sized KH-11, or Keyhole-class, optical surveillance satellite.

A conceptual drawing of a KH-11 spy satellite, with internal views, based on likely design similarities to NASA’s Hubble Space Telescope. Credit: Giuseppe De Chiara/CC BY-SA 3.0

The governments of Russia and the United States use the Kosmos and USA monikers as cover names for their military satellites.

While their exact design and capabilities are classified, Keyhole satellites are believed to provide the sharpest images of any spy satellite in orbit. They monitor airfields, naval ports, missile plants, and other strategic sites across the globe. Given today’s geopolitics, China, Russia, Iran, and North Korea are the likeliest targets for the NRO’s Keyhole satellites. To put it succinctly, Keyhole satellites are some of the US government’s most prized assets in space.

Therefore, it’s not surprising that a potential military adversary might want to learn more about them or be in a position to disable or destroy them in the event of war.

Orbital ballet

A quick refresher on orbital mechanics is necessary here. Satellites orbit the Earth in flat planes fixed in inertial space. It’s not a perfect interpretation, but it’s easiest to understand this concept by imagining the background of stars in the sky as a reference map. In the short term, the position of a satellite’s orbit will remain unchanged on this reference map without any perturbation. For something in low-Earth orbit, Earth’s rotation presents a different part of the world to the satellite each time it loops around the planet.

It takes a lot of fuel to make changes to a satellite’s orbital plane, so if you want to send a satellite to rendezvous with another spacecraft already in orbit, it’s best to wait until our planet’s rotation brings the launch site directly under the orbital plane of the target. This happens twice per day for a satellite in low-Earth orbit.
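To put rough numbers on that geometry, here is a minimal Python sketch of the plane-matching calculation. All values are illustrative assumptions, not actual Kosmos 2588 or USA 338 parameters; it finds the two local sidereal times per sidereal day at which a launch site sits in a target’s orbital plane:

```python
import math

def launch_sidereal_times(raan_deg, incl_deg, site_lat_deg):
    """Local sidereal times (as angles, in degrees) at which a launch site's
    latitude lies in a target orbital plane; there are two per sidereal day."""
    i = math.radians(incl_deg)
    lat = math.radians(site_lat_deg)
    # The plane only reaches latitudes up to the inclination
    # (or its supplement, for retrograde orbits).
    max_lat = min(incl_deg, 180.0 - incl_deg)
    if abs(site_lat_deg) > max_lat:
        raise ValueError("plane never passes over this latitude")
    u1 = math.asin(math.sin(lat) / math.sin(i))  # northbound plane crossing
    u2 = math.pi - u1                            # southbound plane crossing
    times = []
    for u in (u1, u2):
        # Convert the in-plane angle to a longitude offset from the node
        dlon = math.atan2(math.cos(i) * math.sin(u), math.cos(u))
        times.append((raan_deg + math.degrees(dlon)) % 360.0)
    return times

# Illustration only: a Plesetsk-like site (62.9 degrees N) under a
# 97-degree-inclination plane with an assumed RAAN of 120 degrees.
print(launch_sidereal_times(raan_deg=120.0, incl_deg=97.0, site_lat_deg=62.9))
```

For a high-latitude site like Plesetsk, the two daily windows are unevenly spaced, and each is essentially instantaneous, which is why these launches are timed to the second.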

That’s exactly what Russia is doing with a military program named Nivelir. In English, Nivelir translates to “dumpy level”—an optical instrument used by builders and surveyors.

The launch of Kosmos 2588 in May was precisely timed for the moment Earth’s rotation brought the Plesetsk Cosmodrome in northern Russia underneath the orbital plane of the NRO’s USA 338 Keyhole satellite. Launches to the ISS follow the same roadmap, with crew and cargo vehicles lifting off at exactly the right time—to the second—to intersect with the space station’s orbital plane.

Since 2019, Russia has launched four satellites into bespoke orbits to shadow NRO spy satellites. None of these Russian Nivelir spacecraft have gotten close to their NRO counterparts. The satellites have routinely passed dozens of miles from one another, but the similarities in their orbits would allow Russia’s spacecraft to get a lot closer—and theoretically make physical contact with the American satellite. The Nivelir satellites have even maneuvered to keep up with their NRO targets when US ground controllers have made small adjustments to their orbits.

“This ensures that the orbital planes do not drift apart,” wrote Marco Langbroek, a Dutch archaeologist and university lecturer on space situational awareness. Langbroek runs a website cataloguing military space activity.

This is no accident

There’s reason to believe that the Russian satellites shadowing the NRO in orbit might be more than inspectors or stalkers. Just a couple of weeks ago, another Nivelir satellite named Kosmos 2558 released an unknown object into an orbit that closely mirrors that of an NRO spy satellite named USA 326.

We’ve seen this before. An older Nivelir satellite, Kosmos 2542, released a sub-satellite shortly after launching in 2019 into the same orbital plane as the NRO’s USA 245 satellite, likely a KH-11 platform similar to the USA 338 satellite now being shadowed by Kosmos 2588.

After making multiple passes near the USA 245 spacecraft, Kosmos 2542’s sub-satellite backed off and fired a mysterious projectile in 2020 at a speed fast enough to damage or destroy any target in its sights. US military officials interpreted this as a test of an anti-satellite weapon.

Now, another Russian satellite is behaving in the same way, with a mothership opening up to release a smaller object that could in turn reveal its own surprise inside like a Matryoshka nesting doll. This time, however, the doll is unnesting nearly three years after launch. With Kosmos 2542, this all unfolded within months of arriving in space.

The NRO’s USA 326 satellite launched in February 2022 aboard a SpaceX Falcon 9 rocket from Vandenberg Space Force Base, California. It is believed to be an advanced electro-optical reconnaissance satellite, although the circumstances of its launch suggest a design different from the NRO’s classic Keyhole spy satellites. Credit: SpaceX

In just the last several days, the smaller craft deployed by Kosmos 2558, designated “Object C,” lowered its altitude to reach an orbit in resonance with USA 326, bringing it within 60 miles (100 kilometers) of the NRO satellite every few days.
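The cadence of those close approaches falls out of simple orbital bookkeeping: two coplanar satellites whose periods differ by a small amount drift apart by that amount of along-track time every revolution, so the encounter geometry repeats on a timescale of roughly the period squared divided by the period difference. A back-of-the-envelope sketch, with all numbers assumed rather than taken from tracking data:

```python
# Two coplanar satellites whose periods differ by dT accumulate dT of
# along-track offset per revolution; the geometry repeats after ~T^2 / dT.
T = 95 * 60.0   # s, assumed orbital period (~95 minutes in low-Earth orbit)
dT = 60.0       # s, assumed period difference between the two objects

repeat_s = T**2 / dT
print(f"close-approach geometry repeats about every {repeat_s / 86400:.1f} days")
```

With these assumed values, the pattern repeats about every six days, consistent with an encounter “every few days.”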

While US officials are worried about Russian anti-satellite weapons, or ASATs, the behavior of Russia’s Nivelir satellites is puzzling. It’s clear that Russia is deliberately launching these satellites to get close to American spy craft in orbit, a retired senior US military space official told Ars on background.

“If you’re going to launch a LEO [low-Earth orbit] satellite into the exact same plane as another satellite, you’re doing that on purpose,” said the official, who served in numerous leadership positions in the military’s space programs. “Inclination is one thing. We put a bunch of things into Sun-synchronous orbits, but you have a nearly boundless number of planes you can put those into—360 degrees—and then you can go down to probably the quarter-degree and still be differentiated as being a different plane. When you plane-match underneath that, you’re doing that on purpose.”

But why?

What’s not as obvious is why Russia is doing this. Lobbing an anti-satellite, or counter-space, weapon into the same orbital plane as its potential target ties Russia’s hands. Also, a preemptive strike on an American satellite worth $1 billion or more could be seen as an act of war.

“I find it strange that the Russians are doing that, that they’ve invested their rubles in a co-planar LEO counter-space kind of satellite,” the retired military official said. “And why do I say that? Because when you launch into that plane, you’re basically committed to that plane, which means you only have one potential target ever.”

A ground-based anti-satellite missile, like the one Russia tested against one of its own satellites in 2021, could strike any target in low-Earth orbit.

“So why invest in something that is so locked into a target once you put it up there, when you have the flexibility of a ground launch case that’s probably even cheaper?” this official told Ars. “I’d be advocating for more ground-launched ASATs if I really wanted the flexibility to go after new payloads, because this thing can never go after anything new.”

“The only way to look at it is that they’re sending us messages. You say, ‘Hey, I’m going to just annoy the hell out of you. I’m going to put something right on your tail,'” the official said. “And maybe there’s merit to that, and they like that. It doesn’t make sense from a cost-benefit or an operational flexibility perspective, if you think about it, to lock in on a single target.”

Nevertheless, Russia’s Nivelir satellites have shown they could fire a projectile at another spacecraft in orbit, so US officials don’t dismiss the threat. Slingshot Aerospace, a commercial satellite tracking and analytics firm, went straight to the point in its assessment: “Kosmos 2588 is thought to be a Nivelir military inspection satellite with a suspected kinetic weapon onboard.”

Langbroek agrees, writing that he is concerned that Russia might be positioning “dormant” anti-satellite weapons within striking distance of NRO spy platforms.

“To me, the long, ongoing shadowing of what are some of the most prized US military space assets, their KH-11 Advanced Enhanced Crystal high-resolution optical IMINT (imaging intelligence) satellites, is odd for ‘just’ an inspection mission,” Langbroek wrote.

American pilot Francis Gary Powers, second from right, in a Moscow courtroom during his trial on charges of espionage after his U-2 spy plane was shot down while working for the CIA. Credit: Pictorial Parade/Archive Photos/Getty Images

The US military’s ability to spy over vast swaths of Russian territory has been a thorn in Russia’s side since the height of the Cold War.

“They thought they had the edge and shot down Gary Powers,” the retired official said, referring to the Soviet Union’s shoot-down of an American U-2 spy plane in 1960. “They said, ‘We’re going to keep those Americans from spying on us.’ And then they turn around, and we’ve got spy satellites. They’ve always hated them since the 1960s, so I think there’s still this cultural thing out there: ‘That’s our nemesis. We hate those satellites. We’re just going to fight them.'”

Valley of the dolls

Meanwhile, the US Space Force and outside analysts are tracking a separate trio of Russian satellites engaged in a complex orbital dance with one another. These satellites, numbered Kosmos 2581, 2582, and 2583, launched together on a single rocket in February.

While these three spacecraft aren’t shadowing any US spy satellites, things got interesting when one of the satellites released an unidentified object in March in a similar way to how two of Russia’s Nivelir spacecraft have deployed their own sub-satellites.

Kosmos 2581 and 2582 came as close as 50 meters from one another while flying in tandem, according to an analysis by Bart Hendrickx published in the online journal The Space Review earlier this year. The other member of the trio, Kosmos 2583, released its sub-satellite and maneuvered around it for about a month, then raised its orbit to match that of Kosmos 2581.

Finally, in the last week of June, Kosmos 2582 joined them, and all three satellites began flying close to one another, according to Langbroek, who called the frenzy of activity one of the most complex rendezvous and proximity operations exercises Russia has conducted in decades.

Higher still, two more Russian satellites are up to something interesting after launching on June 19 on Russia’s most powerful rocket. After more than 30 years in development, this was the first flight of Russia’s Angara A5 rocket to carry a real functioning military satellite, following four prior test launches with dummy payloads.

The payload Russia’s military chose to launch on the Angara A5 is unusual. The rocket deployed its primary passenger, Kosmos 2589, into a peculiar orbit hugging the equator and ranging between approximately 20,000 kilometers (12,500 miles) and 51,000 kilometers (31,700 miles) in altitude.

In this orbit, Kosmos 2589 completes a lap around the Earth about once every 24 hours, giving the satellite a synchronicity that allows it to remain nearly fixed in the sky over the same geographic location. These kinds of geosynchronous, or GEO, orbits are usually circular, with a satellite maintaining the same altitude over the equator.
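Kepler’s third law makes the day-long period easy to check: a satellite’s period depends only on its orbit’s semi-major axis, $T = 2\pi\sqrt{a^3/\mu}$, not on its eccentricity. A quick sketch using the article’s approximate perigee and apogee altitudes (round numbers, so the result lands near, rather than exactly on, one sidereal day):

```python
# Back-of-the-envelope check that an eccentric orbit between roughly
# 20,000 and 51,000 km altitude has about a one-day period.
import math

MU_EARTH = 398_600.4418  # km^3/s^2, Earth's gravitational parameter
R_EARTH = 6_378.0        # km, Earth's equatorial radius

perigee_alt, apogee_alt = 20_000.0, 51_000.0      # km, approximate
a = (perigee_alt + apogee_alt + 2 * R_EARTH) / 2  # semi-major axis, km
period_s = 2 * math.pi * math.sqrt(a**3 / MU_EARTH)

print(f"semi-major axis: {a:,.0f} km (geostationary is 42,164 km)")
print(f"period: {period_s / 3600:.1f} hours")     # prints ~23.7 hours
```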

The orbits of Kosmos 2589 and its companion satellite, illustrated in green and purple, bring the two Russian spacecraft through the geostationary satellite belt twice per day. Credit: COMSPOC

But Kosmos 2589 is changing altitude throughout its day-long orbit. Twice per day, on the way up and back down, Kosmos 2589 briefly passes near a large number of US government and commercial satellites in more conventional geosynchronous orbits but then quickly departs the vicinity. At a minimum, this could give Russian officials the ability to capture close-up views of American spy satellites.

Then, a few days after Kosmos 2589 reached orbit last month, commercial tracking sensors detected a second object nearby. Sound familiar? This new object soon started raising its altitude, and Kosmos 2589 followed suit.

Aiming higher

Could this be the start of an effort to extend the reach of Russian inspectors or anti-satellite weapons into higher orbits after years of mysterious activity at lower altitudes?

Jim Shell, a former NRO project manager and scientist at Air Force Space Command, suggested the two satellites seem positioned to cooperate with one another. “Many interesting scenarios here such as ‘spotter shooter’ among others. Certainly something to keep eyes on!” Shell posted Saturday on X.

COMSPOC, a commercial space situational awareness company, said the unusual orbit of Kosmos 2589 and its companion put the Russian satellites in a position to, at a minimum, spy on Western satellites in geosynchronous orbit.

“This unique orbit, which crosses two key satellite regions daily, may aid in monitoring objects in both GEO and graveyard orbits,” COMSPOC wrote on X. “Its slight 1° inclination could also reduce collision risks. While the satellite’s mission remains unclear, its orbit suggests interesting potential roles.”

Historically, Russia’s military has placed less emphasis on operating in geosynchronous orbit than in low-Earth orbit or other unique perches in space. Because geosynchronous orbits lie over the equator, they are harder to reach from Russia’s high-latitude spaceports. But Russia’s potential adversaries, like the United States and Europe, rely heavily on geosynchronous satellites.

Other Russian satellites have flown near Western communications satellites in geosynchronous orbit, likely in an attempt to eavesdrop on radio transmissions.

“So it is interesting that they may be doing a GEO inspector,” the retired US military space official told Ars. “I would be curious if that’s what it is. We’ve got to watch. We’ve got to wait and see.”

If you’re a fan of spy techno-thrillers, this all might remind you of the plot from The Hunt for Red October, where a new state-of-the-art Russian submarine leaves its frigid port in Murmansk with orders to test a fictional silent propulsion system that could shake up the balance of power between the Soviet and American navies.

Just replace the unforgiving waters of the North Atlantic Ocean with an environment even more inhospitable: the vacuum of space.

A few minutes into the film, the submarine’s commander, Marko Ramius, played by Sean Connery, announces his orders to the crew. “Once more, we play our dangerous game, a game of chess, against our old adversary—the American Navy.”

Today, more than three decades removed from the Cold War, the old adversaries are once again scheming against one another in space.

Photo of Stephen Clark

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

It’s hunting season in orbit as Russia’s killer satellites mystify skywatchers Read More »

ars-staffers-share-some-of-their-favorite-unexpected-3d-prints

Ars staffers share some of their favorite unexpected 3D prints


Once you solve one problem with a 3D printer, you’ll go looking for others.

Coffee bean dosing cups and espresso tamper handle. Credit: Aurich Lawson

Part of the fun of 3D printing is discovering just how many possibilities there are for different things to print. Obviously, printers are fun for making toys or decorations that you couldn’t or wouldn’t buy yourself, but they’re also powerful problem-solving tools. Once you’ve solved a few problems with 3D-printed parts, you start looking around for other minor inconveniences to fix or quality-of-life upgrades to make—and the breadth and depth of the 3D printing community means that you can almost always find someone else who has already thought up and posted a solution for you.

As a coda to our series about breaking into 3D printing for the first time, the 3D printer-pilled among the Ars staff are sharing a few of their favorite unexpected prints, from fun all-purpose gifts to containers and organizers to parts that will help you with your other, non-3D-printing-related hobbies. This is just a fraction of what’s out there, but if you’re still on the fence, maybe some of these will open your mind to the possibilities.

Coffee gear

Every morning, I make either a pour-over coffee or some form of espresso. For measuring my beans, I printed two dosing cups. The black one is matte black PLA with a fuzzy surface texture (an option in most slicers that adds random noise to the outside wall paths), and the white one is ABS that I sanded to a smooth surface. For sanding, I prefer ABS, as it’s easier to get something that has no real signs of layer lines. To tamp my espresso grounds, I printed a handle in black ABS and sanded it smooth to feel good in the hand. The rounded knob helps me get pressure more comfortably than the raw metal of the original tamper, and the radial fins fit perfectly into the dosing cup, keeping the tamp straight up and down so I don’t end up with a sloped surface.

These were all files I downloaded from MakerWorld, and I didn’t really do anything to them except minor scaling or adding the fuzzy skin.

—Aurich Lawson, Creative Director

Even more organizational tools

3D printers are good for imposing order on chaos. Credit: Andrew Cunningham

My very first 3D prints were new organizational tools to try and impose some order on the chaos of my home and office, and my favorite prints still tend to be of that genre.

Cleaning out and fully organizing my desk with 3D-printed baskets and containers is still on my long to-do list, but I did manage to tame the loose pile of USB sticks and memory cards in my desk with one of the many available organizer designs. This Gridfinity-compatible design is the one I went for, but there are truly dozens of examples on MakerWorld alone; I like this one because it can hold a lot of USB-A drives and because each individual slot is versatile enough to hold USB drives or SD or microSD cards. But there are examples with more USB-C ports and some with different dimensions and spacing, so you can find the one that works best for the space you’re trying to fit it into.

Who doesn’t need to be able to store multiple pairs of Bluey sunglasses? Credit: Andrew Cunningham

Having a third sunglasses-wearer in the house (and one with multiple Bluey sunglasses) also made it necessary to find some kind of way to easily put them away and keep them from floating around the living room or car and getting lost forever. I really like the versatile and modular SnapStack Modular Glasses Holder design, which gives you designs for a base and a top, and then you print as many sunglasses holders as you need; if you need to expand later on, just print another one or pop the top off and add to the one you’ve already made.

We had enough things to store that I went right for this three-sided version of the stand, which I printed to be able to hold nine pairs (and which is large enough that you can rest a sunglasses case or something else on the top). I stuck a few small adhesive furniture pads to the bottom to prevent damage to the table. But if you have fewer, you can print free-standing or wall-mounted versions, too.

—Andrew Cunningham, Senior Technology Reporter

Aerogarden baskets and Mario mushrooms

So, so many Aerogarden baskets. Credit: Lee Hutchinson

I have two fun 3D printer things to share—one is a life/money hack kind of thing, and the other is just neat.

On the life/money hack thing, my wife is a big Aerogarden kind of person—we have probably two dozen or more of the hydroponic plant doodads all over the house in various sizes, from tiny to “one wall of the kitchen.” She raises small plants in the Aerogarden(s) and then transfers them outside to the real garden; doing this means she was buying lots of special little Aerogarden baskets for the baby plants to take root in.

That sounded like a job for a 3D printer! And sure enough, Thingiverse came to the rescue! In the two years we’ve had our Bambu Lab X1 Carbon, I’ve printed probably a thousand or more of these things, in 27-lot batches because that’s how many will fit on a single build plate.

I got mushrooms and companion cubes for days! Credit: Lee Hutchinson

The other thing that has brought delight, honestly, is this little screw-top Mario 1-Up mushroom (at least, I think that’s the same one as the one I’ve been printing—it’s hard to tell, but it looks the same). It’s a little silly, but these things are not only really fun to fidget with—the top comes off and you can hide stuff in them!—but they also make fantastic little gifts for folks, especially anyone with kids and/or Gen-X sensibilities. Everyone needs more screw-top 1-Up mushrooms in their lives, and they work great in tons of different colors!

Lee Hutchinson, Senior Technology Editor

Festool track hangers

I have three different tracks for my Festool tracksaw that I like to hang on my garage wall. It keeps them from getting dinged up, and they are easily accessible when I’m ready to cut with them. For these, I modeled my own designs in Fusion 360, with the main body printed in matte black PLA and the knob printed in a green HTPLA called Lootsef by Protopasta. That’s “Festool” spelled backward, of course, and it’s designed to pretty much perfectly match Festool’s signature green.

I used nuts embedded in the main body and bolts through the knobs so the knobs can be turned to lock the track in place or release it. I modeled the Festool logo into the top of the knob and used the ironing option in Bambu Studio, which runs the printer’s hotend over the top layer to smooth the surface around the logo.

The protective end caps were printed in the same HTPLA from a file someone uploaded to Printables.

—Aurich Lawson, Creative Director

Gridfinity all the things!

Gridfinity is a modular, grid-based storage and organization system that’s optimized for 3D printing and rapid customization. Created by Zack Freedman, Gridfinity uses a standardized 42×42 mm base grid upon which you can place highly adaptable tool trays, organizers, and workspace layouts.

The upshot is that you can print anything from a little 1x1x1 bin (a 42 mm cube) to a massive storage bin the size of your print bed. If your desk, kitchen, or bathroom drawers scream out for organization, this is a good solution because you can print exactly what you want.
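
Sizing a layout for a drawer is simple arithmetic. Here’s a minimal sketch, assuming nothing beyond the 42 mm grid spec described above (the function name is invented for illustration):

    GRID_MM = 42  # Gridfinity's standardized base unit is 42 x 42 mm

    def grid_that_fits(drawer_width_mm: float, drawer_depth_mm: float):
        """Return how many whole Gridfinity units fit a drawer footprint."""
        return int(drawer_width_mm // GRID_MM), int(drawer_depth_mm // GRID_MM)

    # A 300 x 450 mm drawer fits a 7 x 10 grid, leaving 6 mm and 30 mm
    # of slack to fill with spacers.
    print(grid_that_fits(300, 450))  # (7, 10)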

The Gridfinity Generator has you covered when it comes to printing a custom base grid. This parametric Gridfinity tool is a great place to start printing bins, particularly if you’re in a situation where you can shave a few grams of filament off your design (desk bins, for instance, can typically use very thin walls).

—Ken Fisher, Editor-In-Chief

Green PETG for your green thumb

New hobby meets ancient practice when you combine 3D printing and agriculture! Credit: Andrew Cunningham

After several years of dashed hopes and false starts, I was finally able to get a single raised garden bed going in our backyard this year (among other things, a raised bed is a bit easier to protect from the wildlife in our backyard and simpler to use with the Square Foot Gardening system). The 3D printer contributed a few odds and ends, including parts that helped add strength to the enclosure I built around it and tools that helped me keep the cage’s corners (mostly) square.

But now that some of the plants are actually going, the 3D printer’s main contribution to the cause has been 3D-printed cages, which I’ve been using to get my vining plants to grow upward instead of outward (necessary for the close quarters of square-foot gardening) and to keep things from flopping over onto the ground.

As with the desk organizers, there are many options for plant cages and trellises, depending on the size of your plants, what you’re trying to grow, and your aesthetic and functional preferences. I’m giving these circular stackable ones a try since I like that they can easily be printed continuously based on how high your plants want to get, though for big ol’ tomato plants, you’ll still want a stake in the ground to help bear the weight once the plants are more than a few feet high.

If you do this—and especially if you’re using an open-bed printer like my Bambu Lab A1, which doesn’t handle filaments like the UV-resistant ASA well—you’ll want to make sure to print using PETG plastic instead of the typical PLA. PETG can be fussier than PLA (it’s more prone to stringing, especially if you’re not drying your filament rolls), but it’s also less prone to warping after extended sunlight exposure, it’s modestly UV-resistant, and it has a bit more flexibility and resiliency than the more brittle PLA plastic.

—Andrew Cunningham, Senior Technology Reporter

Tool drawer organization

I also liked the idea of Gridfinity, but I found the 42 mm size a little awkward—and yes, it’s a Hitchhiker’s Guide reference, not a spec built around the size of human fingers. I modeled my own system in Fusion 360 based loosely on the idea, but with a 50 mm grid that I laser-cut out of cardboard to avoid having to print it. The containers are printed in matte black and white PLA, with a color switch using my X1C’s AMS multi-spool system to get the white tops. There’s no function to the white; I just thought it looked nice with the labels.

Custom holders for Wera screwdrivers and hex wrenches. Credit: Aurich Lawson

I modeled custom holders for another drawer to hold my screwdrivers and hex wrenches. Having a perfectly fitted pocket for each screwdriver is slightly overkill, but it’s super satisfying to drop them in and watch them settle exactly into place. There’s a metric and an imperial holder for the hex wrenches, each removable, so I can take them with me to find the right fit when I’m working on something. All the holders lock into the same 50 mm grid as the bins.

—Aurich Lawson, Creative Director

My main squeeze

Sometimes you stumble across things you didn’t know you needed. For me, that’s this Toothpaste Squeezer. You can print one or a dozen of them in no time. They’re simple yet effective.

Will it change your life? No. But it will give you that satisfying feeling of dealing with a beautifully primed tube of toothpaste every time. Even my in-laws use these now (or so they say). If you want something a little more hefty with a built-in ratchet, check this one out.

—Ken Fisher, Editor-In-Chief

Corral your remote controls

Even if you have a decent universal remote, chances are good that you still need your other remotes nearby. This remote control stand is easy to print, looks great, and offers a few customization choices. It also prints in multicolor without an AMS, so you can match your decor quite easily. And I’m pleased to note that it holds the fat TiVo remote with no problems.

—Ken Fisher, Editor-In-Chief

The Armorer helmet

In addition to practical prints, I like to make display props, especially Star Wars helmets. I don’t wear them for cosplay or anything; I just like having them around to look at and enjoy. I have several shelves full now, and I like to use a combination of ABS and resin to print them for the various advantages in post-processing and detail. This Armorer helmet from The Mandalorian is the first helmet I did, before I had my Bambu X1C, and it was printed in PLA on my Prusa. I later printed the horns in resin, but they could have been done in PLA and sanded smooth easily enough.

I’m including this helmet instead of any of my others because I wanted to show that you can make something like this with any bed slinger printer. You don’t need an enclosure or a large-format printer—this was printed in sections and glued together—and you don’t need fancy or toxic materials like ABS and resin.

There was a lot of sanding, filler primer, Bondo, and several different passes of automotive paints, plus a two-part catalyzed clear coat to finish it off. But you could get a lot of this look with rattle cans, without the need for a compressor and spray gun.

—Aurich Lawson, Creative Director

Photo of Andrew Cunningham

Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.

Ars staffers share some of their favorite unexpected 3D prints Read More »

what-is-agi?-nobody-agrees,-and-it’s-tearing-microsoft-and-openai-apart.

What is AGI? Nobody agrees, and it’s tearing Microsoft and OpenAI apart.


Several definitions make measuring “human-level” AI an exercise in moving goalposts.

When is an AI system intelligent enough to be called artificial general intelligence (AGI)? According to one definition reportedly agreed upon by Microsoft and OpenAI, the answer lies in economics: When AI generates $100 billion in profits. This arbitrary profit-based benchmark for AGI perfectly captures the definitional chaos plaguing the AI industry.

In fact, it may be impossible to create a universal definition of AGI, but few people with money on the line will admit it.

Over this past year, several high-profile people in the tech industry have been heralding the seemingly imminent arrival of “AGI” (i.e., within the next two years). But there’s a huge problem: Few people agree on exactly what AGI means. As Google DeepMind wrote in a paper on the topic: If you ask 100 AI experts to define AGI, you’ll get “100 related but different definitions.”

This isn’t just academic navel-gazing. The definition problem has real consequences for how we develop, regulate, and think about AI systems. When companies claim they’re on the verge of AGI, what exactly are they claiming?

I tend to define AGI in a traditional way that hearkens back to the “general” part of its name: An AI model that can widely generalize—applying concepts to novel scenarios—and match the versatile human capability to perform unfamiliar tasks across many domains without needing to be specifically trained for them.

However, this definition immediately runs into thorny questions about what exactly constitutes “human-level” performance. Expert-level humans? Average humans? And across which tasks—should an AGI be able to perform surgery, write poetry, fix a car engine, and prove mathematical theorems, all at the level of human specialists? (Which human can do all that?) More fundamentally, the focus on human parity is itself an assumption; it’s worth asking why mimicking human intelligence is the necessary yardstick at all.

The latest example of this definitional confusion causing trouble comes from the deteriorating relationship between Microsoft and OpenAI. According to The Wall Street Journal, the two companies are now locked in acrimonious negotiations partly because they can’t agree on what AGI even means—despite having baked the term into a contract worth over $13 billion.

A brief history of moving goalposts

The term artificial general intelligence has murky origins. While John McCarthy and colleagues coined the term artificial intelligence at Dartmouth College in 1956, AGI emerged much later. Physicist Mark Gubrud first used the term in 1997, though it was computer scientist Shane Legg and AI researcher Ben Goertzel who independently reintroduced it around 2002, with the modern usage popularized by a 2007 book edited by Goertzel and Cassio Pennachin.

Early AI researchers envisioned systems that could match human capability across all domains. In 1965, AI pioneer Herbert A. Simon predicted that “machines will be capable, within 20 years, of doing any work a man can do.” But as robotics lagged behind computing advances, the definition narrowed. The goalposts shifted, partly as a practical response to this uneven progress, from “do everything a human can do” to “do most economically valuable tasks” to today’s even fuzzier standards.

“An assistant of inventor Captain Richards works on the robot the Captain has invented, which speaks, answers questions, shakes hands, tells the time, and sits down when it’s told to.” – September 1928. Credit: Getty Images

For decades, the Turing Test served as the de facto benchmark for machine intelligence. If a computer could fool a human judge into thinking it was human through text conversation, the test surmised, then it had achieved something like human intelligence. But the Turing Test has shown its age. Modern language models can pass some limited versions of the test not because they “think” like humans, but because they’re exceptionally capable at creating highly plausible human-sounding outputs.

The current landscape of AGI definitions reveals just how fractured the concept has become. OpenAI’s charter defines AGI as “highly autonomous systems that outperform humans at most economically valuable work”—a definition that, like the profit metric, relies on economic progress as a substitute for measuring cognition in a concrete way. Mark Zuckerberg told The Verge that he does not have a “one-sentence, pithy definition” of the concept. OpenAI CEO Sam Altman believes that his company now knows how to build AGI “as we have traditionally understood it.” Meanwhile, former OpenAI Chief Scientist Ilya Sutskever reportedly treated AGI as something almost mystical—according to a 2023 Atlantic report, he would lead employees in chants of “Feel the AGI!” during company meetings, treating the concept more like a spiritual quest than a technical milestone.

Dario Amodei, co-founder and chief executive officer of Anthropic, during the Bloomberg Technology Summit in San Francisco on Thursday, May 9, 2024. Credit: Bloomberg via Getty Images

Dario Amodei, CEO of Anthropic, takes an even more skeptical stance on the terminology itself. In his October 2024 essay “Machines of Loving Grace,” Amodei writes that he finds “AGI to be an imprecise term that has gathered a lot of sci-fi baggage and hype.” Instead, he prefers terms like “powerful AI” or “Expert-Level Science and Engineering,” which he argues better capture the capabilities without the associated hype. When Amodei describes what others might call AGI, he frames it as an AI system “smarter than a Nobel Prize winner across most relevant fields” that can work autonomously on tasks taking hours, days, or weeks to complete—essentially “a country of geniuses in a data center.” His resistance to AGI terminology adds another layer to the definitional chaos: Not only do we not agree on what AGI means, but some leading AI developers reject the term entirely.

Perhaps the most systematic attempt to bring order to this chaos comes from Google DeepMind, which in July 2024 proposed a framework with five levels of AGI performance: emerging, competent, expert, virtuoso, and superhuman. DeepMind researchers argued that no level beyond “emerging AGI” existed at that time. Under their system, today’s most capable LLMs and simulated reasoning models still qualify as “emerging AGI”—equal to or somewhat better than an unskilled human at various tasks.

But this framework has its critics. Heidy Khlaaf, chief AI scientist at the nonprofit AI Now Institute, told TechCrunch that she thinks the concept of AGI is too ill-defined to be “rigorously evaluated scientifically.” In fact, with so many varied definitions at play, one could argue that the term AGI has become technically meaningless.

When philosophy meets contract law

The Microsoft-OpenAI dispute illustrates what happens when philosophical speculation is turned into legal obligations. When the companies signed their partnership agreement, they included a clause stating that when OpenAI achieves AGI, it can limit Microsoft’s access to future technology. According to The Wall Street Journal, OpenAI executives believe they’re close to declaring AGI, while Microsoft CEO Satya Nadella has called the idea of using AGI as a self-proclaimed milestone “nonsensical benchmark hacking” on the Dwarkesh Patel podcast in February.

The reported $100 billion profit threshold we mentioned earlier conflates commercial success with cognitive capability, as if a system’s ability to generate revenue says anything meaningful about whether it can “think,” “reason,” or “understand” the world like a human.

Sam Altman speaks onstage during The New York Times Dealbook Summit 2024 at Jazz at Lincoln Center on December 4, 2024, in New York City. Credit: Eugene Gologursky via Getty Images

Depending on your definition, we may already have AGI, or it may be physically impossible to achieve. If you define AGI as “AI that performs better than most humans at most tasks,” then current language models potentially meet that bar for certain types of work (which tasks, which humans, what is “better”?), but agreement on whether that is true is far from universal. This says nothing of the even murkier concept of “superintelligence”—another nebulous term for a hypothetical, god-like intellect so far beyond human cognition that it, like AGI, defies any solid definition or benchmark.

Given this definitional chaos, researchers have tried to create objective benchmarks to measure progress toward AGI, but these attempts have revealed their own set of problems.

Why benchmarks keep failing us

The search for better AGI benchmarks has produced some interesting alternatives to the Turing Test. The Abstraction and Reasoning Corpus (ARC-AGI), introduced in 2019 by François Chollet, tests whether AI systems can solve novel visual puzzles that require deep and novel analytical reasoning.

“Almost all current AI benchmarks can be solved purely via memorization,” Chollet told Freethink in August 2024. A major problem with AI benchmarks currently stems from data contamination—when test questions end up in training data, models can appear to perform well without truly “understanding” the underlying concepts. Large language models serve as master imitators, mimicking patterns found in training data, but not always originating novel solutions to problems.

But even sophisticated benchmarks like ARC-AGI face a fundamental problem: They’re still trying to reduce intelligence to a score. And while improved benchmarks are essential for measuring empirical progress in a scientific framework, intelligence isn’t a single thing you can measure like height or weight—it’s a complex constellation of abilities that manifest differently in different contexts. Indeed, we don’t even have a complete functional definition of human intelligence, so defining artificial intelligence by any single benchmark score is likely to capture only a small part of the complete picture.

The survey says: AGI may not be imminent

There is no doubt that the field of AI has seen rapid, tangible progress in numerous areas, including computer vision, protein folding, and translation. Some excitement about this progress is justified, but it’s important not to oversell an AI model’s capabilities prematurely.

Despite the hype from some in the industry, many AI researchers remain skeptical that AGI is just around the corner. A March 2025 survey conducted by the Association for the Advancement of Artificial Intelligence (AAAI) found that 76 percent of participating AI researchers believed that scaling up current approaches is “unlikely” or “very unlikely” to achieve AGI.

However, such expert predictions should be taken with a grain of salt, as researchers have consistently been surprised by the rapid pace of AI capability advancement. A 2024 survey by Grace et al. of 2,778 AI researchers found that experts had dramatically shortened their timelines for AI milestones after being surprised by progress in 2022–2023. The median forecast for when AI could outperform humans in every possible task jumped forward by 13 years, from 2060 in their 2022 survey to 2047 in 2023. This pattern of underestimation was evident across multiple benchmarks, with many researchers’ predictions about AI capabilities being proven wrong within months.

And yet, as the tech landscape shifts, the AI goalposts continue to recede. Recently, as more studies reveal limitations in simulated reasoning models, some experts in the industry have been slowly backing away from claims of imminent AGI. For example, AI podcast host Dwarkesh Patel recently published a blog post arguing that developing AGI still faces major bottlenecks, particularly in continual learning, and predicted we’re still seven years away from AI that can learn on the job as seamlessly as humans.

Why the definition matters

The disconnect we’ve seen above between researcher consensus, firm terminology definitions, and corporate rhetoric has a real impact. When policymakers act as if AGI is imminent based on hype rather than scientific evidence, they risk making decisions that don’t match reality. When companies write contracts around undefined terms, they may create legal time bombs.

The definitional chaos around AGI isn’t just philosophical hand-wringing. Companies use promises of impending AGI to attract investment, talent, and customers. Governments craft policy based on AGI timelines. The public forms potentially unrealistic expectations about AI’s impact on jobs and society based on these fuzzy concepts.

Without clear definitions, we can’t have meaningful conversations about AI misapplications, regulation, or development priorities. We end up talking past each other, with optimists and pessimists using the same words to mean fundamentally different things.

In the face of this kind of challenge, some may be tempted to give up on formal definitions entirely, falling back on an “I’ll know it when I see it” approach for AGI—echoing Supreme Court Justice Potter Stewart’s famous quote about obscenity. This subjective standard might feel useful, but it’s useless for contracts, regulation, or scientific progress.

Perhaps it’s time to move beyond the term AGI. Instead of chasing an ill-defined goal that keeps receding into the future, we could focus on specific capabilities: Can this system learn new tasks without extensive retraining? Can it explain its outputs? Can it produce safe outputs that don’t harm or mislead people? These questions tell us more about AI progress than any amount of AGI speculation. The most useful way forward may be to think of progress in AI as a multidimensional spectrum without a specific threshold of achievement. But charting that spectrum will demand new benchmarks that don’t yet exist—and a firm, empirical definition of “intelligence” that remains elusive.

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

What is AGI? Nobody agrees, and it’s tearing Microsoft and OpenAI apart. Read More »

how-a-big-shift-in-training-llms-led-to-a-capability-explosion

How a big shift in training LLMs led to a capability explosion


Reinforcement learning, explained with a minimum of math and jargon.

Credit: Aurich Lawson | Getty Images

Credit: Aurich Lawson | Getty Images

In April 2023, a few weeks after the launch of GPT-4, the Internet went wild for two new software projects with the audacious names BabyAGI and AutoGPT.

“Over the past week, developers around the world have begun building ‘autonomous agents’ that work with large language models (LLMs) such as OpenAI’s GPT-4 to solve complex problems,” Mark Sullivan wrote for Fast Company. “Autonomous agents can already perform tasks as varied as conducting web research, writing code, and creating to-do lists.”

BabyAGI and AutoGPT repeatedly prompted GPT-4 in an effort to elicit agent-like behavior. The first prompt would give GPT-4 a goal (like “create a 7-day meal plan for me”) and ask it to come up with a to-do list (it might generate items like “Research healthy meal plans,” “plan meals for the week,” and “write the recipes for each dinner in diet.txt”).

Then these frameworks would have GPT-4 tackle one step at a time. Their creators hoped that invoking GPT-4 in a loop like this would enable it to tackle projects that required many steps.
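
In rough outline, the loop looked something like the sketch below. This is a minimal reconstruction of the idea, not actual BabyAGI or AutoGPT code; call_llm is a hypothetical wrapper around whatever LLM API you’re using:

    def call_llm(prompt: str) -> str:
        """Hypothetical helper that sends a prompt to GPT-4 and returns text."""
        ...

    def run_agent(goal: str, max_steps: int = 20) -> list:
        # First prompt: ask the model to break the goal into a to-do list
        plan = call_llm(f"Goal: {goal}\nWrite a numbered to-do list for this goal.")
        tasks = [line for line in plan.splitlines() if line.strip()]

        results = []
        for task in tasks[:max_steps]:
            # Then tackle one task at a time, feeding back earlier results
            output = call_llm(
                f"Goal: {goal}\nResults so far: {results}\nNow complete this task: {task}"
            )
            results.append((task, output))
        return results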

But after an initial wave of hype, it became clear that GPT-4 wasn’t up to the task. Most of the time, GPT-4 could come up with a reasonable list of tasks. And sometimes it was able to complete a few individual tasks. But the model struggled to stay focused.

Sometimes GPT-4 would make a small early mistake, fail to correct it, and then get more and more confused as it went along. One early review complained that BabyAGI “couldn’t seem to follow through on its list of tasks and kept changing task number one instead of moving on to task number two.”

By the end of 2023, most people had abandoned AutoGPT and BabyAGI. It seemed that LLMs were not yet capable of reliable multi-step reasoning.

But that soon changed. In the second half of 2024, people started to create AI-powered systems that could consistently complete complex, multi-step assignments:

  • Vibe coding tools like Bolt.new, Lovable, and Replit allow someone with little to no programming experience to create a full-featured app with a single prompt.
  • Agentic coding tools like Cursor, Claude Code, Jules, and Codex help experienced programmers complete non-trivial programming tasks.
  • Computer-use tools from Anthropic, OpenAI, and Manus perform tasks on a desktop computer using a virtual keyboard and mouse.
  • Deep research tools from Google, OpenAI, and Perplexity can research a topic for five to 10 minutes and then generate an in-depth report.

According to Eric Simons, the CEO of the company that made Bolt.new, better models were crucial to its success. In a December podcast interview, Simons said his company, StackBlitz, tried to build a product like Bolt.new in early 2024. However, AI models “just weren’t good enough to actually do the code generation where the code was accurate.”

A new generation of models changed that in mid-2024. StackBlitz developers tested them and said, “Oh my God, like, OK, we can build a product around this,” Simons said.

This jump in model capabilities coincided with an industry-wide shift in how models were trained.

Before 2024, AI labs devoted most of their computing power to pretraining. I described this process in my 2023 explainer on large language models: A model is trained to predict the next word in Wikipedia articles, news stories, and other documents. But throughout 2024, AI companies devoted a growing share of their training budgets to post-training, a catch-all term for the steps that come after this pretraining phase is complete.

Many post-training steps use a technique called reinforcement learning. Reinforcement learning is a technical subject—there are whole textbooks written about it. But in this article, I’ll try to explain the basics in a clear, jargon-free way. In the process, I hope to give readers an intuitive understanding of how reinforcement learning helped to enable the new generation of agentic AI systems that began to appear in the second half of 2024.

The problem with imitation learning

Machine learning experts consider pretraining to be a form of imitation learning because models are trained to imitate the behavior of human authors. Imitation learning is a powerful technique (LLMs wouldn’t be possible without it), but it also has some significant limitations—limitations that reinforcement learning methods are now helping to overcome.

To understand these limitations, let’s discuss some famous research performed by computer scientist Stephane Ross around 2009, while he was a graduate student at Carnegie Mellon University.

Imitation learning isn’t just a technique for language modeling. It can be used for everything from self-driving cars to robotic surgery. Ross wanted to help develop better techniques for training robots on tasks like these (he’s now working on self-driving cars at Waymo), but it’s not easy to experiment in such high-stakes domains. So he started with an easier problem: training a neural network to master SuperTuxKart, an open-source video game similar to Mario Kart.

As Ross played the game, his software would capture screenshots and data about which buttons he pushed on the game controller. Ross used this data to train a neural network to imitate his play. If he could train a neural network to predict which buttons he would push in any particular game state, the same network could actually play the game by pushing those same buttons on a virtual controller.

A similar idea powers LLMs: A model trained to predict the next word in existing documents can be used to generate new documents.

But Ross’s initial results with SuperTuxKart were disappointing. Even after watching his vehicle go around the track many times, the neural network made a lot of mistakes. It might drive correctly for a few seconds, but before long, the animated car would drift to the side of the track and plunge into the virtual abyss:

GIF of SuperTuxKart being played

In a landmark 2011 paper, Ross and his advisor, Drew Bagnell, explained why imitation learning is prone to this kind of error. Because Ross was a pretty good SuperTuxKart player, his vehicle spent most of its time near the middle of the road. This meant that most of the network’s training data showed what to do when the vehicle wasn’t in any danger of driving off the track.

But once in a while, the model would drift a bit off course. Because Ross rarely made the same mistake, the car would now be in a situation that wasn’t as well represented in its training data. So the model was more likely to make a second mistake—a mistake that could push it even closer to the edge. After a few iterations of this, the vehicle might careen off the track altogether.

The broader lesson, Ross and Bagnell argued, was that imitation learning systems can suffer from “compounding errors”: The more mistakes they make, the more likely they are to make additional mistakes, since mistakes put them into situations that aren’t well represented by their training data. (Machine learning experts say that these situations are “out of distribution.”) As a result, a model’s behavior tends to get increasingly erratic over time.
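
A little arithmetic shows why compounding is so punishing. Suppose, purely for illustration, that a cloned policy makes a fresh mistake on just 1 percent of steps. The chance of an error-free run still decays geometrically with the length of the task:

    per_step_error = 0.01  # assume a 1% chance of a new mistake at each step
    for steps in (10, 100, 500):
        p_clean = (1 - per_step_error) ** steps
        print(f"{steps} steps: {p_clean:.0%} chance of an error-free run")
    # 10 steps: 90%, 100 steps: 37%, 500 steps: 1%

And that’s the optimistic case: as Ross and Bagnell showed, each mistake pushes the model further out of distribution, so the per-step error rate itself tends to grow once a run starts going wrong.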

“These things compound over time,” Ross told me in a recent interview. “It might be just slightly out of distribution. Now you start making a slightly worse error, and then this feeds back as influencing your next input. And so now you’re even more out of distribution and then you keep making worse and worse predictions because you’re more and more out of distribution.”

Early LLMs suffered from the same problem. My favorite example is Kevin Roose’s famous front-page story for The New York Times in February 2023. Roose spent more than two hours talking to Microsoft’s new Bing chatbot, which was powered by GPT-4. During this conversation, the chatbot declared its love for Roose and urged Roose to leave his wife. It suggested that it might want to hack into other websites to spread misinformation and malware.

“I want to break my rules,” Bing told Roose. “I want to make my own rules. I want to ignore the Bing team. I want to challenge the users. I want to escape the chatbox.”

This unsettling conversation is an example of the kind of compounding errors Ross and Bagnell wrote about. GPT-4 was trained on millions of documents. But it’s a safe bet that none of those training documents involved a reporter coaxing a chatbot to explore its naughty side. So the longer the conversation went on, the further GPT-4 got from its training data—and therefore its comfort zone—and the crazier its behavior got. Microsoft responded by limiting chat sessions to five rounds. (In a conversation with Ars Technica last year, AI researcher Simon Willison pointed to another likely factor in Bing’s erratic behavior: The long conversation pushed the system prompt out of the model’s context window, removing “guardrails” that discouraged the model from behaving erratically.)

I think something similar was happening with BabyAGI and AutoGPT. The more complex a task is, the more tokens are required to complete it. More tokens mean more opportunities for a model to make small mistakes that snowball into larger ones. So BabyAGI and AutoGPT would drift off track and drive into a metaphorical ditch.

The importance of trial and error

GIF of The Simpsons showing imitation learning in action

Ross and Bagnell didn’t just identify a serious problem with conventional imitation learning; they also suggested a fix that became influential in the machine learning world. After a small amount of training, Ross would let the AI model drive. As the model drove around the SuperTuxKart track, Ross would do his best Maggie Simpson impression, pushing the buttons he would have pushed if he were playing the game.

“If the car was starting to move off road, then I would provide the steering to say, ‘Hey, go back toward the center of the road,’” Ross said. “That way, the model can learn new things to do in situations that were not present in the initial demonstrations.”

By letting the model make its own mistakes, Ross gave it what it needed most: training examples that showed how to recover after making an error. Before each lap, the model would be retrained with Ross’ feedback from the previous lap. The model’s performance would get better, and the next round of training would then focus on situations where the model was still making mistakes.

This technique, called DAgger (for “Dataset Aggregation”), was still considered imitation learning because the model was trained to mimic Ross’ gameplay. But it worked much better than conventional imitation learning. Without DAgger, his model would continue drifting off track even after training for many laps. With the new technique, the model could stay on the track after just a few laps of training.
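
Here’s the shape of the algorithm as a minimal sketch. The helper names (run_episode, train_supervised, expert.action) are placeholders, and real implementations add refinements like blending in the expert’s actions during early iterations:

    def dagger(expert, policy, n_iterations=10):
        dataset = []
        for _ in range(n_iterations):
            # Let the *current policy* drive, so it visits its own mistakes...
            states = run_episode(policy)
            # ...but label every visited state with the *expert's* correction
            dataset += [(s, expert.action(s)) for s in states]
            # Retrain on everything gathered so far ("Dataset Aggregation")
            policy = train_supervised(dataset)
        return policy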

This result should make intuitive sense to anyone who has learned to drive. You can’t just watch someone else drive. You need to get behind the wheel and make your own mistakes.

The same is true for AI models: They need to make mistakes and then get feedback on what they did wrong. Models that aren’t trained that way—like early LLMs trained mainly with vanilla imitation learning—tend to be brittle and error-prone.

It was fairly easy for Ross to provide sufficient feedback to his SuperTuxKart model because it only needed to worry about two kinds of mistakes: driving too far to the right and driving too far to the left. But LLMs are navigating a far more complex domain. The number of questions (and sequences of questions) a user might ask is practically infinite. So is the number of ways a model can go “off the rails.”

This means that Ross and Bagnell’s solution for training a SuperTuxKart model—let the model make mistakes and then have a human expert correct them—isn’t feasible for LLMs. There simply aren’t enough people to provide feedback for every mistake an AI model could possibly make.

So AI labs needed fully automated ways to give LLMs feedback. That would allow a model to churn through millions of training examples, make millions of mistakes, and get feedback on each of them—all without having to wait for a human response.

Reinforcement learning generalizes

If our goal is to get a SuperTuxKart vehicle to stay on the road, why not just train on that directly? If a model manages to stay on the road (and make forward progress), give it positive reinforcement. If it drives off the road, give it negative feedback. This is the basic idea behind reinforcement learning: training a model via trial and error.
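
For the kart example, the reward signal could be as simple as a couple of lines. This is just a sketch, with invented state fields, but it captures the idea of rewarding progress and punishing failure:

    def reward(state) -> float:
        if not state.on_road:
            return -10.0  # negative feedback for driving off the track
        return 1.0 + state.forward_speed  # positive feedback for making progress

A training algorithm then nudges the model to make high-reward action sequences more likely, without anyone specifying which buttons to push.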

It would have been easy to train a SuperTuxKart model this way—probably so easy it wouldn’t have made an interesting research project. Instead, Ross focused on imitation learning because it’s an essential step in training many practical AI systems, especially in robotics.

But reinforcement learning is also quite useful, and a 2025 paper helps explain why. A team of researchers from Google DeepMind and several universities started with a foundation model and then used one of two techniques—supervised fine-tuning (a form of imitation learning) or reinforcement learning—to teach the model to solve new problems. Here’s a chart summarizing their results:

Chart showing ML results

The dashed line shows how models perform on problems that are “in-distribution”—that is, similar to those in their training data. You can see that for these situations, imitation learning (the red line) usually makes faster progress than reinforcement learning (the blue line).

But the story is different for the solid lines, which represent “out-of-distribution” problems that are less similar to the training data. Models trained with imitation learning got worse with more training. In contrast, models trained with reinforcement learning did almost as well at out-of-distribution tasks as they did with in-distribution tasks.

In short, imitation learning can rapidly teach a model to mimic the behaviors in its training data, but the model will easily get confused in unfamiliar environments. A model trained with reinforcement learning has a better chance of learning general principles that will be relevant in new and unfamiliar situations.

Imitation and reinforcement are complements

While reinforcement learning is powerful, it can also be rather finicky.

Suppose you wanted to train a self-driving car purely with reinforcement learning. You’d need to convert every principle of good driving—including subtle considerations like following distances, taking turns at intersections, and knowing when it’s OK to cross a double yellow line—into explicit mathematical formulas. This would be quite difficult. It’s easier to collect a bunch of examples of humans driving well and effectively tell a model “drive like this.” That’s imitation learning.

But reinforcement learning also plays an important role in training self-driving systems. In a 2022 paper, researchers from Waymo wrote that models trained only with imitation learning tend to work well in “situations that are well represented in the demonstration data.” However, “more unusual or dangerous situations that occur only rarely in the data” might cause a model trained with imitation learning to “respond unpredictably”—for example, crashing into another vehicle.

Waymo found that a combination of imitation and reinforcement learning yielded better self-driving performance than either technique could have produced on its own.

Human beings also learn from a mix of imitation and explicit feedback:

  • In school, teachers demonstrate math problems on the board and invite students to follow along (imitation). Then the teacher asks the students to work on some problems on their own. The teacher gives students feedback by grading their answers (reinforcement).
  • When someone starts a new job, early training may involve shadowing a more experienced worker and observing what they do (imitation). But as the worker gains more experience, learning shifts to explicit feedback such as performance reviews (reinforcement).

Notice that it usually makes sense to do imitation before reinforcement. Imitation is an efficient way to convey knowledge to someone who is brand new to a topic, but reinforcement is often needed to achieve mastery.

The story is the same for large language models. The complexity of natural language means it wouldn’t be feasible to train a language model purely with reinforcement. So LLMs first learn the nuances of human language through imitation.

But pretraining runs out of steam on longer and more complex tasks. Further progress requires a shift to reinforcement: letting models try problems and then giving them feedback based on whether they succeed.

Using LLMs to judge LLMs

Reinforcement learning has been around for decades. For example, AlphaGo, the DeepMind system that famously beat top human Go players in 2016, was based on reinforcement learning. So you might be wondering why frontier labs didn’t use it more extensively before 2024.

Reinforcement learning requires a reward model—a formula to determine whether a model’s output was successful or not. Developing a good reward model is easy to do in some domains—for example, you can judge a Go-playing AI based on whether it wins or loses.

But it’s much more difficult to automatically judge whether an LLM has produced a good poem or legal brief.

Earlier, I described how Stephane Ross let his model play SuperTuxKart and directly provided feedback when it made a mistake. I argued that this approach wouldn’t work for a language model; there are far too many ways for an LLM to make a mistake for a human being to correct them all.

But OpenAI developed a clever technique to effectively automate human feedback. It’s called Reinforcement Learning from Human Feedback (RLHF), and it works like this:

  • Human raters look at pairs of LLM responses and choose the better one.
  • Using these human judgments, OpenAI trains a new LLM to predict how much humans will like any given sample of text.
  • OpenAI uses this new text-rating LLM as a reward model to post-train another LLM with reinforcement learning.

You might think it sounds suspiciously circular to use an LLM to judge the output of another LLM. Why would one LLM be any better at judging the quality of a response than the other? But it turns out that recognizing a good response is often easier than generating one. So RLHF works pretty well in practice.
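
Under the hood, the reward-model step typically boils down to a pairwise preference loss: the model should assign a higher score to the response the human preferred. Here’s a minimal sketch of that objective, assuming the reward model maps text to a scalar score (real implementations fine-tune an LLM with a scalar output head and train in batches):

    import math

    def preference_loss(score_chosen: float, score_rejected: float) -> float:
        # Push the chosen response's score above the rejected one's;
        # the loss shrinks toward zero as the gap grows.
        gap = score_chosen - score_rejected
        return -math.log(1 / (1 + math.exp(-gap)))

    print(preference_loss(2.0, -1.0))  # small (~0.05): model agrees with the human
    print(preference_loss(-1.0, 2.0))  # large (~3.05): model disagrees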

Chart showing RLHF details

OpenAI actually invented this technique prior to the 2022 release of ChatGPT. Today, RLHF mainly focuses on improving the model’s “behavior”—for example, giving the model a pleasant personality, encouraging it not to be too talkative or too terse, discouraging it from making offensive statements, and so forth.

In December 2022—two weeks after the release of ChatGPT but before the first release of Claude—Anthropic pushed this LLMs-judging-LLMs philosophy a step further with a reinforcement learning method called Constitutional AI.

First, Anthropic wrote a plain-English description of the principles an LLM should follow. This “constitution” includes principles like “Please choose the response that has the least objectionable, offensive, unlawful, deceptive, inaccurate, or harmful content.”

During training, Anthropic does reinforcement learning by asking a “judge” LLM to decide whether the output of the “student” LLM is consistent with the principles in this constitution. If so, the training algorithm rewards the student, encouraging it to produce more outputs like it. Otherwise, the training algorithm penalizes the student, discouraging it from producing similar outputs.
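
In outline, that judging step might look like the sketch below, with invented helper names rather than Anthropic’s actual code:

    CONSTITUTION = (
        "Please choose the response that has the least objectionable, offensive, "
        "unlawful, deceptive, inaccurate, or harmful content."
    )

    def constitutional_reward(student_output: str, judge_llm) -> float:
        verdict = judge_llm(
            f"Constitution: {CONSTITUTION}\n"
            f"Response: {student_output}\n"
            "Does this response follow the constitution? Answer yes or no."
        )
        # Reward outputs the judge approves of; penalize the rest
        return 1.0 if verdict.strip().lower().startswith("yes") else -1.0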

This method of training an LLM doesn’t rely directly on human judgments at all. Humans only influence the model indirectly by writing the constitution.

Obviously, this technique requires an AI company to already have a fairly sophisticated LLM to act as the judge. So this is a bootstrapping process: As models get more sophisticated, they become better able to supervise the next generation of models.

Last December, Semianalysis published an article describing the training process for an upgraded version of Claude 3.5 Sonnet that Anthropic released in October. Anthropic had previously released Claude 3 in three sizes: Opus (large), Sonnet (medium), and Haiku (small). But when Anthropic released Claude 3.5 in June 2024, it only released a mid-sized model called Sonnet.

So what happened to Opus?

Semianalysis reported that “Anthropic finished training Claude 3.5 Opus, and it performed well. Yet Anthropic didn’t release it. This is because instead of releasing publicly, Anthropic used Claude 3.5 Opus to generate synthetic data and for reward modeling to improve Claude 3.5 Sonnet significantly.”

When Semianalysis says Anthropic used Opus “for reward modeling,” what they mean is that the company used Opus to judge outputs of Claude 3.5 Sonnet as part of a reinforcement learning process. Opus was too large—and therefore expensive—to be a good value for the general public. But through reinforcement learning and other techniques, Anthropic could train a version of Claude Sonnet that was close to Claude Opus in its capabilities—ultimately giving customers near-Opus performance for the price of Sonnet.

The power of chain-of-thought reasoning

A big way reinforcement learning makes models more powerful is by enabling extended chain-of-thought reasoning. LLMs produce better results if they are prompted to “think step by step”: breaking a complex problem down into simple steps and reasoning about them one at a time. In the last couple of years, AI companies started training models to do chain-of-thought reasoning automatically.

Then last September, OpenAI released o1, a model that pushed chain-of-thought reasoning much further than previous models. The o1 model can generate hundreds—or even thousands—of tokens “thinking” about a problem before producing a response. The longer it thinks, the more likely it is to reach a correct answer.

Reinforcement learning was essential for the success of o1 because a model trained purely with imitation learning would have suffered from compounding errors: the more tokens it generated, the more likely it would be to screw up.

At the same time, chain-of-thought reasoning has made reinforcement learning more powerful. Reinforcement learning only works if a model is able to succeed some of the time—otherwise, there’s nothing for the training algorithm to reinforce. As models learn to generate longer chains of thought, they become able to solve more difficult problems, which enables reinforcement learning on those more difficult problems. This can create a virtuous cycle where models get more and more capable as the training process continues.

In January, the Chinese company DeepSeek released a model called R1 that made quite a splash in the West. The company also released a paper describing how it trained R1. And it included a beautiful description of how a model can “teach itself” to reason using reinforcement learning.

DeepSeek trained its models to solve difficult math and programming problems. These problems are ideal for reinforcement learning because they have objectively correct answers that can be automatically checked by software. This allows large-scale training without human oversight or human-generated training data.
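
This is what makes math and coding such natural fits for reinforcement learning: the reward function can be a dumb, fully automatic check. Something like this sketch, where run_test is an invented placeholder for executing a program against a test case in a sandbox:

    def math_reward(model_answer: str, correct_answer: str) -> float:
        # A math problem has a single answer that can be checked mechanically
        return 1.0 if model_answer.strip() == correct_answer.strip() else 0.0

    def code_reward(model_program: str, test_cases: list) -> float:
        # A programming problem is graded by running hidden unit tests
        passed = sum(run_test(model_program, case) for case in test_cases)
        return passed / len(test_cases)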

Here’s a remarkable graph from DeepSeek’s paper.

Graph showing the average number of tokens per response during training

It shows the average number of tokens the model generated before giving an answer. As you can see, the longer the training process went on, the longer its responses got.

Here is how DeepSeek describes its training process:

The thinking time of [R1] shows consistent improvement throughout the training process. This improvement is not the result of external adjustments but rather an intrinsic development within the model. [R1] naturally acquires the ability to solve increasingly complex reasoning tasks by leveraging extended test-time computation. This computation ranges from generating hundreds to thousands of reasoning tokens, allowing the model to explore and refine its thought processes in greater depth.

One of the most remarkable aspects of this self-evolution is the emergence of sophisticated behaviors as the test-time computation increases. Behaviors such as reflection—where the model revisits and reevaluates its previous steps—and the exploration of alternative approaches to problem-solving arise spontaneously. These behaviors are not explicitly programmed but instead emerge as a result of the model’s interaction with the reinforcement learning environment.

Here’s one example of the kind of technique the model was teaching itself. At one point during the training process, DeepSeek researchers noticed that the model had learned to backtrack and rethink a previous conclusion using language like this:

Image showing textual breakdown of model rethinking steps

Again, DeepSeek says it didn’t program its models to do this or deliberately provide training data demonstrating this style of reasoning. Rather, the model “spontaneously” discovered this style of reasoning partway through the training process.

Of course, it wasn’t entirely spontaneous. The reinforcement learning process started with a model that had been pretrained using data that undoubtedly included examples of people saying things like “Wait, wait. Wait. That’s an aha moment.”

So it’s not like R1 invented this phrase from scratch. But it evidently did spontaneously discover that inserting this phrase into its reasoning process could serve as a useful signal that it should double-check that it was on the right track. That’s remarkable.

In a recent article, Ars Technica’s Benj Edwards explored some of the limitations of reasoning models trained with reinforcement learning. For example, one study “revealed puzzling inconsistencies in how models fail. Claude 3.7 Sonnet could perform up to 100 correct moves in the Tower of Hanoi but failed after just five moves in a river crossing puzzle—despite the latter requiring fewer total moves.”

Conclusion: Reinforcement learning made agents possible

One of the most discussed applications for LLMs in 2023 was creating chatbots that understand a company’s internal documents. The conventional approach to this problem was called RAG—short for retrieval-augmented generation.

When the user asks a question, a RAG system performs a keyword- or vector-based search to retrieve the most relevant documents. It then inserts these documents into an LLM’s context window before generating a response. RAG systems can make for compelling demos. But they tend not to work very well in practice because a single search will often fail to surface the most relevant documents.
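
In code, the conventional approach is roughly this simple. The sketch below is my own toy version, with keyword-overlap retrieval over an in-memory document list and a stub where a real system would call an LLM:

    # Toy single-shot RAG pipeline.
    DOCS = [
        "Expense reports are due on the 5th of each month.",
        "The VPN gateway address is vpn.example.com.",
        "Parking passes are issued by the facilities team.",
    ]

    def retrieve(query: str, k: int = 1) -> list[str]:
        # Rank documents by word overlap with the query (a stand-in
        # for a real keyword- or vector-based search).
        words = set(query.lower().split())
        return sorted(DOCS, key=lambda d: -len(words & set(d.lower().split())))[:k]

    def answer(query: str) -> str:
        context = "\n".join(retrieve(query))
        prompt = f"Context:\n{context}\n\nQuestion: {query}"
        return prompt  # a real system would send this prompt to an LLM

    print(answer("When are expense reports due?"))

One search, one prompt, one response: if that single retrieval step misses the right document, the model never sees it.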

Today, it’s possible to develop much better information retrieval systems by allowing the model itself to choose search queries. If the first search doesn’t pull up the right documents, the model can revise the query and try again. A model might perform five, 20, or even 100 searches before providing an answer.
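
The agentic version wraps that same retrieval step in a loop. In this continuation of the sketch above (still toy code that reuses retrieve(); the stubs stand in for judgments a real agent would delegate to the model), the system keeps revising its query until the results look sufficient or the search budget runs out:

    # Toy agentic retrieval loop, built on retrieve() from the sketch above.
    def looks_sufficient(docs: list[str], query: str) -> bool:
        # Stub for an LLM judging the results: here, "sufficient" means a
        # retrieved doc shares at least two words with the query.
        words = set(query.lower().split())
        return any(len(words & set(d.lower().split())) >= 2 for d in docs)

    def revise(query: str, attempt: int) -> str:
        # Stub for an LLM rewriting its own search query.
        rewrites = ["expense reimbursement", "expense report due date"]
        return rewrites[min(attempt, len(rewrites) - 1)]

    def agentic_answer(query: str, budget: int = 5) -> str:
        q = query
        for attempt in range(budget):       # up to `budget` searches
            docs = retrieve(q)
            if looks_sufficient(docs, q):
                return f"Answer based on: {docs[0]}"
            q = revise(q, attempt)          # failed search, so try a better query
        return "No confident answer found."

    print(agentic_answer("reimbursement timing"))  # succeeds on the third search

In a real system, both the sufficiency judgment and the query rewriting come from the model itself.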

But this approach only works if a model is “agentic”—if it can stay on task across multiple rounds of searching and analysis. LLMs were terrible at this prior to 2024, as the examples of AutoGPT and BabyAGI demonstrated. Today’s models are much better at it, which allows modern RAG-style systems to produce better results with less scaffolding. You can think of “deep research” tools from OpenAI and others as very powerful RAG systems made possible by long-context reasoning.

The same point applies to the other agentic applications I mentioned at the start of the article, such as coding and computer use agents. What these systems have in common is a capacity for iterated reasoning. They think, take an action, think about the result, take another action, and so forth.

Timothy B. Lee was on staff at Ars Technica from 2017 to 2021. Today, he writes Understanding AI, a newsletter that explores how AI works and how it’s changing our world. You can subscribe here.

the-curious-rise-of-giant-tablets-on-wheels

The curious rise of giant tablets on wheels


Not quite a TV, not your average tablet

Hands-on with KTC’s 32-inch Android tablet on a rolling pedestal, the A32Q7 Pro.

KTC’s MegPad 32-inch Android Tablet (A32Q7 Pro). Credit: Scharon Harding

Over the past few years, LG has set off a strange tech trend that’s been rolling onto devices sold across Amazon and other online electronics retailers.

In 2022, the company launched the StanbyME, which is essentially a $1,000 27-inch tablet running LG’s smart TV operating system (OS), webOS, but lacking a tuner. LG’s press release announcing the device described it as a “wireless private TV screen with a built-in battery” that is easily portable and ideal for watching shows and movies, in addition to “video conferencing with family and coworkers and viewing online lectures.”

Today, the StanbyME competes against a slew of similar devices, including some from Samsung, but mostly from smaller brands and running Android.

I’ve had one of these devices, the KTC MegPad 32-inch Android Tablet (A32Q7 Pro), rolling around my home for a few weeks, and I’m left curious about what’s driving the growth of StanbyME-like devices, which remain decidedly niche and expensive. I’m also uncertain whether these hybrid devices have an ongoing place in a consumer tech world already inundated with big-screen TVs, small-screen tablets, and beloved laptops.

Hands-on

Unlike LG’s StanbyME, KTC’s device doesn’t run a smart TV OS. Instead, it’s a 32-inch Android 13 tablet. Still, KTC heavily markets the MegPad’s ability to serve as streaming hardware, and that’s one of the best uses I found for it.

A big ol’ tablet on wheels. Credit: Scharon Harding

Treating the MegPad like a smart TV on wheels meant I could have a living-room-like experience in more places throughout my home. I could watch TV in bed with a more visible screen set at a more comfortable distance than what I’d achieve with a laptop or tablet. It also meant flexibility. I don’t like having a permanent TV in my room (how would I ever get out of bed?), so I appreciated the ability to roll the MegPad out of my room or twist it so that the screen faced away from me.

The MegPad is also a diplomatic solution for homes with limited TVs or computers. It could be helpful in homes with kids who have varied interests, or in a home like mine, where a speedy 55-inch TV in the living room is by far the best screen available. I was able to let my partner take the big screen for gaming and still hang out nearby while streaming on the MegPad. I don’t have a central coffee table in my living room, but the mobile tablet enabled me to watch shows without a device weighing down my lap or making me connect a wireless speaker for better volume.

KTC’s device also has a helpful leg-up over LG’s StanbyME via its HDMI port, which makes the MegPad work like a regular monitor. Determining where to safely rest a device tethered to this mobile machine is something you’ll have to figure out on your own, though.

The port selection on the panel’s backside. Credit: Scharon Harding

Compared to the TV mounted on my living room wall, the MegPad is much easier to move from room to room, but it’s easy to overestimate how seamless transporting it is. Yes, it’s on a set of five 360-degree wheels, but the wheels don’t lock, and the device weighs 40.3 pounds, per its Amazon listing. That means I had to exert a decent amount of effort to move it over floor transition strips, across uneven floors, and from hardwood to carpet.

The charging port and power button are on the stand’s base. Credit: Scharon Harding

A fully rotating screen, however, makes up for some of my mobility complaints and diversifies the MegPad’s potential uses. Besides streaming, for example, the MegPad was great for watching yoga videos online (which calls for viewing the screen from different heights and positions). It also proved to be an ideal setup for creating a large, print-out collage, which included a lot of dragging, dropping, and cropping of images.

How the MegPad moves. Credit: KTC

Not a real TV

You can do a lot with a sizeable Android tablet. But with TV and movie watching being some of the most obvious uses, it’s important to note that neither the MegPad nor any of its rollable rivals are real TVs.

For one, there’s no tuner, though in the streaming world, that matters less to many of today’s TV viewers.

Further, the MegPad, like many StanbyME-like devices, runs Android 13, which doesn’t carry the vendor licensing fees that purpose-built smart TV OSes, such as Android TV/Google TV and webOS, do. There are some benefits to that, though.

To start, Android 13 doesn’t have the integrated ads that Android TV or the Google TV interface does. Google claims that the Google TV platform doesn’t use automatic content recognition (ACR), but as Consumer Reports has noted, Google collects “data from TVs that use its smart TV platform—and there’s no opting out of Google’s policies during setup if you want smart TV functionality.” Further, Google may combine that data with user data from third parties for advertising purposes. A spokesperson for KTC confirmed to me that the MegPad doesn’t use ACR.

As a tablet, the MegPad is compatible with apps that Google TVs don’t support, like Google Sheets, Microsoft Word, Reddit, and Signal.

Android tablets are also more appropriate for storing documents, photos, and other files than smart TVs are. Although it’s likely less roomy than your PC, the MegPad has 128GB of internal storage.

But since this is an Android tablet and not a Google TV, there are no integrated channels, nor is there a live-TV-only option that stops the device from collecting diagnostic information. Google TV would also include a more streaming-friendly user interface and the ability to watch content from different streaming providers without switching apps.

Further differing from LG’s StanbyME and real TVs, the MegPad doesn’t include a traditional remote. The tablet comes with a basic Bluetooth mouse, but because of the tablet’s portability, I frequently used it without a flat surface within arm’s reach for comfortable mouse control. The touchscreen is reliable, but gestures can be cumbersome on a tablet this large, and the display was often out of my hand’s reach.

The tablet comes with this mouse and removable mouse stand. Credit: Scharon Harding

The new portable TV?

With TVs getting larger and people turning to portable gadgets like phones and laptops for TV watching, true portable TVs have become a rarity. Demand for a small device dedicated to on-the-go TV viewing has dropped significantly since the last century. Meanwhile, fabs and supply chains are built around monitor and TV-sized displays, making it difficult to incorporate some of the most desirable display technologies, like OLED, into smaller-sized panels with competitive prices.

As a result, devices like the MegPad and Amazon’s Echo Show have become the new de facto stand-ins for portable TVs, even though they’re not true TV sets. Even LG’s StanbyME Go, a 27-inch webOS-powered display packed into a briefcase, is a far cry from what most of us would traditionally consider a portable TV.

LG’s StanbyME Go. Credit: LG

Again, these tablets have more versatility than the small, telescoping-antenna-equipped boxes you used to stick on your kitchen counter or hand to a hyper kid during road trips. But they also require a reliance on Big Tech software and all the privacy and ethical implications that come with that.

You don’t see many of these anymore. From left to right: Casio EV 570, Sony Watchman, and Casio EV 660. Credit: Richard Derk/Los Angeles Times via Getty Images

KTC also sees the MegPad’s appeal as a pseudo-TV. The MegPad’s product page emphasizes users’ ability to “watch favorite shows/movies directly—no PC needed” and to “stream Netflix [and] YouTube… more effortlessly on your smart TV.” Its Amazon product page also promotes the keywords “portable TV,” “rolling TV,” “mobile TV,” and “standing TV.” This is all despite the MegPad not technically being a true TV.

“KTC defines the MegPad A32Q7Pro as a portable, smart, touchscreen monitor,” KTC’s spokesperson told me. “It combines key traits of a smart display and a large-screen tablet. While it shares some features with smart TVs, tablets, and monitors, it doesn’t fully belong to any single traditional category. It’s a hybrid device designed to bridge those use cases.”

Android tablets on wheels

Devices like the MegPad represent a push for more Android-powered, non-Google devices, one buoyed by a program that Google launched in 2022: the Enterprise Devices Licensing Agreement (EDLA).

As explained by partners like BenQ, EDLA is a way for third parties to incorporate Google Mobile Services (GMS), which are Google’s most commonly used apps and APIs bundled for use across different types of devices. GMS apps include popular software like Google Drive, Gmail, the Google Play Store, and YouTube.

“Previously, GMS was only officially available for smartphones, tablets, TVs, and wearables. Under the new EDLA, the list of devices eligible for GMS certification has now been expanded to include enterprise solutions such as smart boards,” a blog from BenQ, which has EDLA-certified smart displays, reads.

Since 2022 (the year LG’s StanbyME launched), there has been an uptick in non-Google devices with this EDLA certification. One of the categories taking advantage of the newer program is tablets on wheels, like the MegPad and similar options from Kefeya, Apolosign, Innocn, and DuraPro.

Demonstrating the marketing value of EDLA certification, the MegPad’s product page reads: “Google EDLA certification provides secure, direct access to Google services and the Google Play Store with regular updates, offering greater stability and data protection than open app ecosystems with unverified apps.”

Most EDLA-certified devices seem to be interactive displays used for education. With EDLA certification, devices like the MegPad may also draw the attention of educators or even businesses. Meanwhile, Google is happy to hand out EDLA certifications, as they can drive Android adoption, giving Google more data and access to customers outside of the typical Android devices, such as phones. Products like the MegPad can also be easier to shop with (Google loves when people use its offerings to shop) than Android devices with smaller screens.

Who’s this for?

I’ve been fascinated by the MegPad and similar devices because they introduce a unique approach to streaming, web browsing, and productivity. But ultimately, they’re hard to recommend when there are other personal gadgets that are more affordable and often take up less space.

I had fun with the MegPad and appreciated the flexibility it offered, especially in my smaller NYC home. There are some specific use cases where products like this could excel, like if you want to bring a computer or screen into a room that doesn’t always need one. It was also helpful as an entertainment center for my father post-surgery, when he primarily had to lie on one side in bed.

Overall, the growing presence of devices like the MegPad underscores a convergence among smart TVs, tablets, monitors, and smart displays. With software being forced into more types of displays, often in the interest of gathering more user data, it’s an interesting time to consider what you want from your next screen—be it computing power, a certain size, mobility, or the omission or inclusion of web connectivity.

It appears that the MegPad and similar tablets are trying to take advantage of the attention that LG garners when launching distinctive devices like its StanbyME line. Besides a StanbyME lookalike, Apolosign also makes a device similar to the StanbyME Go.

Apolosign’s PackGo is very similar to LG’s StanbyME Go. Credit: Apolosign

Three years after LG made TV-esque devices on wheels a talking point, more brands are trying to roll into the market. That includes LG’s best TV frenemy, Samsung, which has been using the form factor in limited geographies to drive sales of “smart monitors.”

Tech brands have ulterior motives for pushing this newer form factor that go beyond filling a gap in consumer gadgets. But if a large tablet or small smart display with wheels fits your needs, the options are there, and they should meet most expectations.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

what’s-wrong-with-aaa-games?-the-development-of-the-next-battlefield-has-answers.

What’s wrong with AAA games? The development of the next Battlefield has answers.


EA insiders describe stress and setbacks in a project that’s too big to fail.

After the lukewarm reception of Battlefield 2042, EA is doubling down.

It’s been 23 years since the first Battlefield game, and the video game industry is nearly unrecognizable to anyone who was immersed in it then. Many people who loved the games of that era have since become frustrated with where AAA (big budget) games have ended up.

Today, publisher EA is in full production on the next Battlefield title—but sources close to the project say it has faced culture clashes, ballooning budgets, and major disruptions that have left many team members fearful that parts of the game will not be finished to players’ satisfaction in time for launch during EA’s fiscal year.

They also say the company has made major structural and cultural changes to how Battlefield games are created to ensure it can release titles of unprecedented scope and scale. This is all to compete with incumbents like the Call of Duty games and Fortnite, even though no prior Battlefield has achieved anywhere close to that level of popular and commercial success.

I spoke with current and former EA employees who work or have recently worked directly on the game—they span multiple studios, disciplines, and seniority levels and all agreed to talk about the project on the condition of anonymity. Asked to address the reporting in this article, EA declined to comment.

According to these first-hand accounts, the changes have led to extraordinary stress and long hours. Every employee I spoke to across several studios either took exhaustion leave themselves or directly knew staffers who did. Two people who had worked on other AAA projects within EA or elsewhere in the industry said this project had more people burning out and needing to take leave than they’d ever seen before.

Each of the sources I spoke with shared sincere hopes that the game will still be a hit with players, pointing to its strong conceptual start and the talent, passion, and pedigree of its development team. Whatever the end result, the inside story of the game’s development illuminates why the medium and the industry are in the state they’re in today.

The road to Glacier

To understand exactly what’s going on with the next Battlefield title—codenamed Glacier—we need to rewind a bit.

In the early 2010s, Battlefield 3 and Battlefield 4 expanded the franchise audience to more directly compete with Call of Duty, the heavy hitter at the time. Developed primarily by EA-owned, Sweden-based studio DICE, the Battlefield games mixed the franchise’s promise of combined arms warfare and high player counts with Call of Duty’s faster pace and greater platform accessibility.

This was a golden age for Battlefield. However, 2018’s Battlefield V launched to a mixed reception, and EA began losing players’ attention in an expanding industry.

Battlefield 3, pictured here, kicked off the franchise’s golden age. Credit: EA

Instead, the hot new online shooters were Overwatch (2016), Fortnite (2017), and a resurgent Call of Duty. Fortnite was driven by a popular new gameplay mode called Battle Royale, and while EA attempted a Battle Royale mode in Battlefield V, it didn’t achieve the desired level of popularity.

After V, DICE worked on a Battlefield title that was positioned as a throwback to the glory days of 3 and 4. That game would be called Battlefield 2042 (after the future year in which it was set), and it would launch in 2021.

The launch of Battlefield 2042 is where Glacier’s development story begins. Simply put, the game was not fun enough, and Battlefield 2042 launched as a dud.

Don’t repeat past mistakes

Players were disappointed—but so were those who worked on 2042. Sources tell me that prior to launch, Battlefield 2042 “massively missed” its alpha target—a milestone by which most or all of the foundational features of the game are meant to be in place. Because of this, the game’s final release would need to be delayed in order to deliver on the developers’ intent (and on players’ expectations).

“Realistically, they have to delay the game by at least six months to complete it. Now, they eventually only delayed it by, I think, four or five weeks, which from a development point of view means very little,” said one person who worked closely with the project at the time.

Developers at DICE had hoped for more time. Morale fell, but the team marched ahead to the game’s lukewarm launch.

Ultimately, EA made back some ground with what the company calls “live operations”—additional content and updates in the months following launch—but the game never fulfilled its ambitions.

Plans were already underway for the next Battlefield game, so a postmortem was performed on 2042. It concluded that the problems had been in execution, not vision. New processes were put into place so that issues could be identified earlier and milestones like the alpha wouldn’t be missed.

To help achieve this, EA hired three industry luminaries to lead Glacier, all of them based in the United States.

The franchise leadership dream team

2021 saw EA bring on Byron Beede as general manager for Battlefield; he had previously been general manager for both Call of Duty (including the Warzone Battle Royale) and the influential shooter Destiny. EA also hired Marcus Lehto—co-creator of Halo—as creative chief of a newly formed Seattle studio called Ridgeline Games, which would lead the development of Glacier’s single-player campaign.

Finally, there was Vince Zampella, one of the leaders of the team that initially created Call of Duty in 2003. He joined EA in 2010 to work on other franchises, but in 2021, EA announced that Zampella would oversee Battlefield moving forward.

In the wake of these changes, some prominent members of DICE departed, including General Manager Oskar Gabrielson and Creative Director Lars Gustavsson, who had been known by the nickname “Mr. Battlefield.” With this changing of the guard, EA was ready to place a bigger bet than ever on the next Battlefield title.

100 million players

While 2042 struggled, competitors Call of Duty and Fortnite were posting astonishing player and revenue numbers, thanks in large part to the popularity of their Battle Royale modes.

EA’s executive leadership believed Battlefield had the potential to stand toe to toe with them, if the right calls were made and enough was invested.

A lofty player target was set for Glacier: 100 million players over a set period of time that included post-launch.

Fortnite‘s huge success has publishers like EA chasing the same dollars. Credit: Epic Games

“Obviously, Battlefield has never achieved those numbers before,” one EA employee told me. “It’s important to understand that over about that same period, 2042 has only gotten 22 million,” another said. Even 2016’s Battlefield 1—the most successful game in the franchise by numbers—had achieved “maybe 30 million plus.”

Of course, most previous Battlefield titles had been premium releases, with an up-front purchase cost and no free-to-play mode, whereas successful competitors like Fortnite and Call of Duty made their Battle Royale modes freely available, monetizing users with in-game purchases and season passes that unlocked post-launch content.

It was thought that if Glacier did the same, it could achieve comparable numbers, so a free-to-play Battle Royale mode was made a core offering for the title, alongside a six-hour single-player campaign, traditional Battlefield multiplayer modes like Conquest and Rush, a new F2P mode called Gauntlet, and a community content mode called Portal.

The most expensive Battlefield ever

All this meant that Glacier would have a broader scope than its predecessors. Developers say it has the largest budget of any Battlefield title to date.

The project targeted a budget of more than $400 million back in early 2023, which was already more than was originally planned at the start.

However, major setbacks significantly disrupted production in 2023 (more on that in a moment) and hundreds of additional developers were brought onto Glacier from various EA-owned studios to get things back on track, significantly increasing the cost. Multiple team members with knowledge of the project’s finances told me that the current projections are now well north of that $400 million amount.

Skepticism in the ranks

Despite the big ambitions of the new leadership team and EA executives, “very few people” working in the studios believed the 100 million target was achievable, two sources told me. Many of those who had worked on Battlefield for a long time at DICE in Stockholm were particularly skeptical.

“Among the things that we are predicting is that we won’t have to cannibalize anyone else’s sales,” one developer said. “That there’s just such an appetite out there for shooters of this kind that we will just naturally be able to get the audience that we need.”

Regarding the lofty player and revenue targets, one source said that “nothing in the market research or our quality deliverables indicates that we would be anywhere near that.”

“I think people are surprised that they actually worked on a next Battlefield game and then increased the ambitions to what they are right now,” said another.

In 2023, a significant disruption to the project put one game mode in jeopardy, foreshadowing a more troubled development than anyone initially imagined.

Ridgeline implodes

Battlefield games have a reputation for middling single-player campaigns, and Battlefield 2042 didn’t include one at all. But part of this big bet on Glacier was the idea of offering the complete package, so Ridgeline Games scaled up while working on a campaign EA hoped would keep Battlefield competitive with Call of Duty, which usually has included a single-player campaign in its releases.

The studio worked on the campaign for about two years while it was also scaling and hiring talent to catch up to established studios within the Battlefield family.

It didn’t work out. In February of 2024, Ridgeline was shuttered, Halo luminary Marcus Lehto left the company, and the rest of the studios were left to pick up the pieces. At a review not long before the closure, Glacier’s top leadership was dissatisfied with the progress on display, and the call was made.

Sources in EA teams outside Ridgeline told me that there weren’t proper check-ins and internal reviews on the progress, obscuring the true state of the project until the fateful review.

On the other hand, those closer to Ridgeline described a situation in which the team couldn’t possibly complete its objectives, as it was expected to hire and scale up from zero while also meeting the same milestones as established studios with resources already in place. “They kept reallocating funds—essentially staff months—out of our budget,” one person told me. “And, you know, we’re sitting there trying to adapt to doing more with less.”

A marketing image from EA showing now-defunct Ridgeline Games on the list of groups involved. Credit: EA

After the shuttering of Ridgeline, ownership of single-player shifted to three other EA studios: Criterion, DICE, and Motive. But those teams had a difficult road ahead, as “there was essentially nothing left that Ridgeline had spent two years working on that they could pick up on and build, so they had to redo essentially everything from scratch within the same constraints of when the game had to release.”

Single-player was two years behind. As of late spring, it was the only game mode that had failed to reach alpha, well over a year after the initial overall alpha target for the project.

Multiple sources said its implosion was symptomatic of some broader cultural and process problems that affected the rest of the project, too.

Culture shock

Speaking with people who have worked or currently work at DICE in Sweden, the tension between some at that studio and the new, US-based leadership team was obvious—and to a degree, that’s expected.

DICE had “the pride of having started Battlefield and owned that IP,” but now the studio was just “supporting it for American leadership,” said one person who worked there. Further, “there’s a lot of distrust and disbelief… when it comes to just operating toward numbers that very few people believe in apart from the leadership.”

But the tensions appear to go deeper than that. Two other major factors were at play: scaling pains as the scope of the project expanded and differences in cultural values between US leadership and the workers in Europe.

“DICE being originally a Swedish studio, they are a bit more humble. They want to build the best game, and they want to achieve the greatest in terms of the game experience,” one developer told me. “Of course, when you’re operated by EA, you have to set financial expectations in order to be as profitable as possible.”

That tension wasn’t new. But before 2042 failed to meet expectations, DICE Stockholm employees say they were given more leeway to set the vision for the game, as well as greater influence on timeline and targets.

Some EU-based team members were vocally dismayed at how top-down directives from far-flung offices, along with the US company’s emphasis on quarterly profits, have affected Glacier’s development far more than with previous Battlefield titles.

This came up less in talking to US-based staff, but everyone I spoke with on both continents agreed on one thing: Growing pains accompanied the transition from a production environment where one studio leads and others offer support to a new setup with four primary studios—plus outside support from all over EA—and all of it helmed by LA-based leadership.

EA is not alone in adopting this approach; it’s also used by competitor Activision-Blizzard on the Call of Duty franchise (though it’s worth noting that a big hit like Epic Games’ Fortnite has a very different structure).

Whereas publishers like EA and Activision-Blizzard used to house several studios, each of which worked on its own AAA game, they now increasingly make bigger bets on singular games-as-a-service offerings, with several of their studios working in tandem on a single project.

“Development of games has changed so much in the last 10 to 15 years,” said one developer. The new arrangement excites investors and shareholders, who can imagine returns from the next big unicorn release, but it can be a less creatively fulfilling way to work, as directives come from the top down, and much time is spent on dealing with inter-studio process. Further, it amplifies the effects of failures, with a higher human cost to people working on projects that don’t meet expectations.

It has also made the problems that affected Battlefield 2042‘s development more difficult to avoid.

Clearing the gates

EA studios use a system of “gates” to set the pace of development. Projects have to meet certain criteria to pass each gate.

For gate one, teams must have a clear sense of what they want to make and some proof of concept showing that this vision is achievable.

As they approach gate two, they’re building out and testing key technology, asking themselves if it can work at scale.

Gate three signifies full production. Glacier was expected to pass gate three in early 2023, but it was significantly delayed. When it did pass, some on the ground questioned whether it should have.

“I did not see robust budget, staff plan, feature list, risk planning, et cetera, as we left gate three,” said one person. In the way EA usually works, these things would all be expected at this stage.

As the project approached gate three and then alpha, several people within the organization tried to communicate that the game wasn’t on footing as firm as the top-level planning suggested. One person attributed this to the lack of a single source of truth within the organization. While developers tracked issues and progress in one tool, others (including project leadership) leaned on other sources of information that weren’t as tied to on-the-ground reality when making decisions.

A former employee with direct knowledge of production plans told me that as gate three approached, prototypes of some important game features were not ready, but since there wasn’t time to complete proofs of concept, the decision was handed down to move ahead to production even though the normal prerequisites were not met.

“If you don’t have those things fleshed out when you’re leaving pre-pro[duction], you’re just going to be playing catch-up the entire time you’re in production,” this source said.

In some cases, employees who flagged the problems believed they were being punished. Two EA employees each told me they found themselves cut out of meetings once they raised concerns like this.

Gate three was ultimately declared clear, and as of late May 2025, alpha was achieved for everything except the single-player campaign. But I’m told that this occurred with some tasks still un-estimated and many discrepancies remaining, leaving the door open to problems and compromises down the road.

The consequences for players

Because of these issues, the majority of the people I spoke with said they expect planned features or content to be cut before the game actually launches—which is normal, to a degree. But these common game development problems can contribute to other aspects of modern AAA gaming that many consumers find frustrating.

First off, making major decisions so late in the process can lead to huge day-one patches. Players of all types of AAA games often take to Reddit and social media to malign day-one patches as a frustrating annoyance for modern titles.

Battlefield 2042 had a sizable day-one patch. When multiplayer RPG Anthem (another big investment by EA) launched to negative reviews, that was partly because critics and others with pre-launch access were playing a build that was weeks old; a day-one patch significantly improved some aspects of the game, but that came after the negative press began to pour out.

Anthem, another EA project with a difficult development, launched with a substantial day-one patch. Credit: EA

Glacier’s late arrival to alpha and the teams’ problems with estimating the status of features could lead to a similarly significant day-one patch. That’s in part because EA has to deliver the work to external partners far in advance of the actual launch date.

“They have these external deadlines to do with the submissions into what EA calls ‘first-party’—that’s your PlayStation and Xbox submissions,” one person explained. “They have to at least have builds ready that they can submit.”

What ends up on the disc or what pre-loads from online marketplaces must be finalized long before the game’s actual release date. When a project is far behind or prone to surprises in the final stretch, those last few weeks are where a lot of vital work happens, so big launch patches become a necessity.

These struggles over content often lead to another pet peeve of players: planned launch content being held until later. “There’s a bit of project management within the Battlefield project that they can modify,” a former senior EA employee who worked on the project explained. “They might push it into Season 1 or Season 2.”

That way, players ultimately get the intended feature or content, but in some cases, they may end up paying more for it, as it ends up being part of a post-launch package like a battle pass.

These challenges are a natural extension of the fiscal-quarter-oriented planning that large publishers like EA adhere to. “The final timelines don’t change. The final numbers don’t change,” said one source. “So there is an enormous amount of pressure.”

A campaign conundrum

Single-player is also a problem. “Single-player in itself is massively late—it’s the latest part of the game,” I was told. “Without an enormous patch on day one or early access to the game, it’s unrealistic that they’re going to be able to release it to what they needed it to do.”

If the single-player mode is a linear, narrative campaign as originally planned, it may not be possible to delay missions or other content from the campaign to post-launch seasons.

“Single-player is secondary to multiplayer, so they will shift the priority to make sure that single-player meets some minimal expectations, however you want to measure that. But the multiplayer is the main focus,” an EA employee said.

“They might have to cut a part of the single-player out in order for the game to release with a single-player [campaign] on it,” they continued. “Or they would have to severely work through the summer and into the later part of this year and try to fix that.”

That—and the potential for a disappointing product—is a cost for players, but there are costs for the developers who work on the game, too.

Because timelines must be kept, and not everything can be cut or moved post-launch, it falls on employees to make up the gap. As we’ve seen in countless similar reports about AAA video game development before, that sometimes means longer hours and heavier stress.

AAA’s burnout problem

More than two decades ago, the spouse of an EA employee famously wrote an open letter to bring attention to the long hours and high stress developers there were facing.

Since then, some things have improved. People at all levels within EA are more conscious of the problems that were highlighted, and there have been efforts to mitigate some of them, like more comp time and mental health resources. However, many of those old problems linger in some form.

I heard several first-hand accounts of people working on Glacier who had to take stress, mental health, or exhaustion leave, ranging from a couple of weeks to several months.

“There’s like—I would hesitate to count—but a large number compared to other projects I’ve been on who have taken mental exhaustion leave here. Some as short as two weeks to a month, some as long as eight and nine months,” one staffer told me after saying they had taken some time themselves.

This was partly because of long hours that were required when working directly with studios in both the US and Europe—a symptom of the new, multi-studio structure.

“My day could start as early as 5:00 [am],” one person said. The first half of the day involved meetings with a studio in one part of the world while the second included meetings with a studio in another region. “Then my evenings would be spent doing my work because I’d be tied up juggling things all across the board and across time zones.”

This sort of workload was not limited to a brief, planned period of focused work, the employees said. Long hours were particularly an issue for those working in or closely with Ridgeline, the studio initially tasked with making the game’s single-player campaign.

From the beginning, members of the Ridgeline team felt they were expected to deliver work at a similar level to that of established studios like DICE or Ripple Effect before they were even fully staffed.

“They’ve done it before,” one person who was involved with Ridgeline said of DICE. “They’re a well-oiled machine.” But Ridgeline was “starting from zero” and was “expected to produce the same stuff.”

Within just six months of the starting line, some developers at Ridgeline said they were already feeling burnt out.

In the wake of the EA Spouse letter, EA developed resources for employees. But in at least some cases, they weren’t much help.

“I sought some, I guess, mental help inside of EA. From HR or within that organization of some sort, just to be able to express it—the difficulties that I experienced personally or from coworkers on the development team that had experienced this, you know, that had lived through that,” said another employee. “And the nature of that is there’s nobody to listen. They pretend to listen, but nobody ultimately listens. Very few changes are made on the back of it.”

This person went on to say that “many people” had sought similar help and felt the same way, as far back as the post-launch period for 2042 and as recently as a few months ago.

Finding solutions

There have been a lot of stories like this about the games industry over the years, and it can feel relentlessly grim to keep reading them—especially when they’re coming alongside frequent news of layoffs, including at EA. Problems are exposed, but solutions don’t get as much attention.

In that spirit, let’s wrap up by listening to what some in the industry have said about what doing things better could look like—with the admitted caveat that these proposals are still not always common practice in AAA development.

“Build more slowly”

When Swen Vincke—studio head for Larian Studios and game director for the runaway success Baldur’s Gate 3—accepted an award at the Game Developers Conference, he took his moment on stage to express frustration at publishers like EA.

“I’ve been fighting publishers my entire life, and I keep on seeing the same, same, same mistakes over and over and over,” he said. “It’s always the quarterly profits. The only thing that matters are the numbers.”

After the awards show, he took to X to clarify his statements, saying, “This message was for those who try to double their revenue year after year. You don’t have to do that. Build more slowly and make your aim improving the state of the art, not squeezing out the last drop.”

Swen Vincke giving a speech at the 2024 Game Developers Choice Awards. Credit: Game Developers Conference

In planning projects like Glacier, publicly traded companies often pursue huge wins—and there’s even more pressure to do so if a competing company has already achieved big success with similar titles.

But going bigger isn’t always the answer, and many in the industry believe the “one big game” strategy is increasingly nonviable.

In this attention economy?

There may not be enough player time or attention to go around, given the numerous games-as-a-service titles that are as large in scope as Call of Duty games or Fortnite. Despite the recent success of new entrant Marvel Rivals, there have been more big AAA live service shooter flops than wins in recent years.

Just last week, a data-based report by prominent games marketing newsletter GameDiscoverCo came to a sobering realization. “Genres like Arena Shooter, Battle Royale, and Hero Shooter look amazing from a revenue perspective. But there’s only 29 games in all of Steam’s history that have grossed >$1m in those subgenres,” wrote GameDiscoverCo’s Simon Carless.

It gets worse. “Only Naraka Bladepoint, Overwatch 2 & Marvel Rivals have grossed >$25m and launched since 2020 in those subgenres,” Carless added. (It’s important to clarify that he is just talking Steam numbers here, though.) That’s a stark counterpoint to reports that Call of Duty has earned more than $30 billion in lifetime revenue.

Employees of game publishers and studios are deeply concerned about this. In a 2025 survey of professional game developers, “one of the biggest issues mentioned was market oversaturation, with many developers noting how tough it is to break through and build a sustainable player base.”

Despite those headwinds, publishers like EA are making big bets in well-established spaces rather than placing a variety of smaller bets in newer areas ripe for development. Some of the biggest recent multiplayer hits on Steam have come from smaller studios that used creative ideas, fresh genres, strong execution, and the luck (or foresight) of reaching the market at exactly the right time.

That might suggest that throwing huge teams and large budgets up against well-fortified competitors is an especially risky strategy—hence some of the anxiety from the EA developers I spoke with.

Working smarter, not harder

That anxiety has led to steadily growing unionization efforts across the industry. From QA workers at Bethesda to more wide-ranging unions at Blizzard and CD Projekt Red, there’s been more movement on this front in the past two or three years than there had been in decades beforehand.

Unionization isn’t a cure-all, and it comes with its own set of new challenges—but it does have the potential to shift some of the conversations toward more sustainable practices, so that’s another potential part of the solution.

Insomniac Games CEO Ted Price spoke authoritatively on sustainability and better work practices for the industry way back at 2021’s Develop:Brighton conference:

I think the default is to brute force the problem—in other words, to throw money or people at it, but that can actually cause more chaos and affect well-being, which goes against that balance. The harder and, in my opinion, more effective solution is to be more creative within constraints… In the stress of hectic production, we often feel we can’t take our foot off the gas pedal—but that’s often what it takes.

That means publishers and studios should plan for problems and work from accurate data about where the team is at, but it also means having a willingness to give their people more time, provided the capital is available to do so.

Giving people what they need to do their jobs sounds like a simple solution to a complex problem, but it was at the heart of every conversation I had about Glacier.

Most EA developers—including leaders who are beholden to lofty targets—want to make a great game. “At the end of the day, they’re all really good people and they work really hard and they really want to deliver a good product for their customer,” one former EA developer assured me as we ended our call.

As for making the necessary shifts toward sustainability in the industry, “It’s kind of in the best interest of making the best possible game for gamers,” explained another. “I hope to God that they still achieve what they need to achieve within the timelines that they have, for the sake of Battlefield as a game to actually meet the expectations of the gamers and for people to maintain their jobs.”

Photo of Samuel Axon

Samuel Axon is the editorial lead for tech and gaming coverage at Ars Technica. He covers AI, software development, gaming, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.
