AI chips


OpenAI sidesteps Nvidia with unusually fast coding model on plate-sized chips

But 1,000 tokens per second is actually modest by Cerebras standards. The company has measured 2,100 tokens per second on Llama 3.1 70B and reported 3,000 tokens per second on OpenAI’s own open-weight gpt-oss-120B model, suggesting that Codex-Spark’s comparatively lower speed reflects the overhead of a larger or more complex model.

AI coding agents have had a breakout year, with tools like OpenAI’s Codex and Anthropic’s Claude Code reaching a new level of usefulness for rapidly building prototypes, interfaces, and boilerplate code. OpenAI, Google, and Anthropic have all been racing to ship more capable coding agents, and latency has become a key differentiator: a model that codes faster lets a developer iterate faster.
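To put the throughput numbers in perspective, here is a back-of-the-envelope sketch; the response length and the 50 tokens-per-second baseline are assumptions for illustration:

```python
# Back-of-the-envelope: wall-clock time to generate a response of a given
# length at different decode throughputs. The response length and the
# 50 tokens/second baseline are assumptions for illustration.
response_tokens = 2_000  # e.g., a medium-sized code file

for name, tokens_per_second in [
    ("assumed typical frontier model", 50),
    ("Codex-Spark on Cerebras", 1_000),
    ("gpt-oss-120B on Cerebras", 3_000),
]:
    seconds = response_tokens / tokens_per_second
    print(f"{name}: {seconds:.1f} s for {response_tokens:,} tokens")

# assumed typical frontier model: 40.0 s for 2,000 tokens
# Codex-Spark on Cerebras: 2.0 s for 2,000 tokens
# gpt-oss-120B on Cerebras: 0.7 s for 2,000 tokens
```

At those speeds, the difference between waiting 40 seconds and waiting two seconds per response is the difference between breaking concentration and staying in flow.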

Facing fierce competition from Anthropic, OpenAI has been iterating on its Codex line at a rapid pace, releasing GPT-5.2 in December after CEO Sam Altman issued an internal “code red” memo about competitive pressure from Google, then shipping GPT-5.3-Codex just days ago.

Diversifying away from Nvidia

Spark’s deeper hardware story may be more consequential than its benchmark scores. The model runs on Cerebras’ Wafer Scale Engine 3, a chip the size of a dinner plate that Cerebras has built its business around since at least 2022. OpenAI and Cerebras announced their partnership in January, and Codex-Spark is the first product to come out of it.

OpenAI has spent the past year systematically reducing its dependence on Nvidia. The company signed a massive multi-year deal with AMD in October 2025, struck a $38 billion cloud computing agreement with Amazon in November, and has been designing its own custom AI chip for eventual fabrication by TSMC.

Meanwhile, a planned $100 billion infrastructure deal with Nvidia has fizzled so far, though Nvidia has since committed to a $20 billion investment. Reuters reported that OpenAI grew dissatisfied with the speed of some Nvidia chips for inference tasks, which is exactly the kind of workload that OpenAI designed Codex-Spark for.

Regardless of which chip is under the hood, speed matters, though it may come at the cost of accuracy. For developers who spend their days inside a code editor waiting for AI suggestions, 1,000 tokens per second may feel less like carefully piloting a jigsaw and more like running a rip saw. Just watch what you’re cutting.



Nvidia’s $100 billion OpenAI deal has seemingly vanished

A Wall Street Journal report on Friday said Nvidia insiders had expressed doubts about the transaction and that Huang had privately criticized what he described as a lack of discipline in OpenAI’s business approach. The Journal also reported that Huang had expressed concern about the competition OpenAI faces from Google and Anthropic. Huang called those claims “nonsense.”

Nvidia shares fell about 1.1 percent on Monday following the reports. Sarah Kunst, managing director at Cleo Capital, told CNBC that the back-and-forth was unusual. “One of the things I did notice about Jensen Huang is that there wasn’t a strong ‘It will be $100 billion.’ It was, ‘It will be big. It will be our biggest investment ever.’ And so I do think there are some question marks there.”

In September, Bryn Talkington, managing partner at Requisite Capital Management, noted the circular nature of such investments to CNBC. “Nvidia invests $100 billion in OpenAI, which then OpenAI turns back and gives it back to Nvidia,” Talkington said. “I feel like this is going to be very virtuous for Jensen.”

Tech critic Ed Zitron has long criticized Nvidia’s circular investments, which touch dozens of tech companies, from major players to startups, all of them also Nvidia customers.

“NVIDIA seeds companies and gives them the guaranteed contracts necessary to raise debt to buy GPUs from NVIDIA,” Zitron wrote on Bluesky last September. “Even though these companies are horribly unprofitable and will eventually die from a lack of any real demand.”

Chips from other places

Outside of sourcing GPUs from Nvidia, OpenAI has reportedly discussed working with startups Cerebras and Groq, both of which build chips designed to reduce inference latency. But in December, Nvidia struck a $20 billion licensing deal with Groq, which Reuters sources say ended OpenAI’s talks with Groq. Nvidia hired Groq’s founder and CEO Jonathan Ross along with other senior leaders as part of the arrangement.

In January, OpenAI announced a $10 billion deal with Cerebras instead, adding 750 megawatts of computing capacity for faster inference through 2028. Sachin Katti, who joined OpenAI from Intel in November to lead compute infrastructure, said the partnership adds “a dedicated low-latency inference solution” to OpenAI’s platform.

But OpenAI has clearly been hedging its bets. Beyond the Cerebras deal, the company struck an agreement with AMD in October for six gigawatts of GPUs and announced plans with Broadcom to develop a custom AI chip to reduce its dependence on Nvidia. When those chips will be ready, however, remains unknown.



TSMC says AI demand is “endless” after record Q4 earnings

TSMC posted net income of NT$505.7 billion (about $16 billion) for the quarter, up 35 percent year over year and above analyst expectations. Revenue hit $33.7 billion, a 25.5 percent increase from the same period last year. The company expects nearly 30 percent revenue growth in 2026 and plans to spend between $52 billion and $56 billion on capital expenditures this year, up from $40.9 billion in 2025.
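A quick sanity check of the reported figures; the exchange rate is an assumption of roughly NT$31.6 per US dollar:

```python
# Sanity-check of TSMC's reported quarter, assuming ~NT$31.6 per US$:
net_income_ntd = 505.7e9
assumed_fx = 31.6  # NT$ per US$ (assumption)
print(f"Net income: ${net_income_ntd / assumed_fx / 1e9:.1f}B")  # ~$16.0B

revenue = 33.7e9   # reported Q4 revenue in US$
growth = 0.255     # reported year-over-year increase
print(f"Implied year-ago revenue: ${revenue / (1 + growth) / 1e9:.1f}B")  # ~$26.9B
```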

Checking with the customers’ customers

Wei’s optimism stands in contrast to months of speculation about whether the AI industry is in a bubble. In November, Google CEO Sundar Pichai warned of “irrationality” in the AI market and said no company would be immune if a potential bubble bursts. OpenAI’s Sam Altman acknowledged in August that investors are “overexcited” and that “someone” will lose a “phenomenal amount of money.”

But TSMC, which manufactures the chips that power the AI boom, is betting the opposite way, with Wei telling analysts he spoke directly to cloud providers to verify that demand is real before committing to the spending increase.

“I want to make sure that my customers’ demand are real. So I talked to those cloud service providers, all of them,” Wei said. “The answer is that I’m quite satisfied with the answer. Actually, they show me the evidence that the AI really helps their business.”

The earnings report landed the same day the US and Taiwan finalized a trade agreement that cuts tariffs on Taiwanese goods to 15 percent, down from 20 percent. The deal commits Taiwanese companies to $250 billion in direct US investment, and TSMC is accelerating the expansion of its Arizona chip fabrication facilities to match.



US taking 25% cut of Nvidia chip sales “makes no sense,” experts say


Trump’s odd Nvidia reversal may open the door for China to demand Blackwell access.

Donald Trump’s decision to allow Nvidia to export an advanced artificial intelligence chip, the H200, to China may give China exactly what it needs to win the AI race, experts and lawmakers have warned.

The H200 has roughly one-tenth the performance of Nvidia’s Blackwell, the company’s most advanced chip, which remains barred from export to China. But the H200 is six times more powerful than the H20, the most advanced chip available in China today. Meanwhile, China’s leading AI chip maker, Huawei, is estimated to be about two years behind Nvidia’s technology. By approving the sales, Trump may unwittingly be helping Chinese chip makers “catch up” to Nvidia, Jake Sullivan told The New York Times.
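Chaining the article’s two comparisons gives a rough, implied performance ladder; this is illustrative only, since real-world performance varies by workload, precision, and memory:

```python
# Chaining the article's two comparisons into one ladder (illustrative
# only; real performance varies by workload, precision, and memory):
h20 = 1.0
h200 = 6.0 * h20          # "six times more powerful than the H20"
blackwell = 10.0 * h200   # H200 has ~one-tenth Blackwell's performance
print(f"Implied Blackwell-to-H20 gap: {blackwell / h20:.0f}x")  # 60x
```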

Sullivan, a former Biden-era national security advisor who helped design AI chip export curbs on China, told the NYT that Trump’s move was “nuts” because “China’s main problem” in the AI race “is they don’t have enough advanced computing capability.”

“It makes no sense that President Trump is solving their problem for them by selling them powerful American chips,” Sullivan said. “We are literally handing away our advantage. China’s leaders can’t believe their luck.”

Nvidia CEO Jensen Huang and Trump’s “AI czar,” David Sacks, apparently persuaded the president to reverse course on H200 export curbs. They convinced Trump that restricting sales would ensure that only Chinese chip makers got a piece of China’s market, shoring up revenue flows that dominant firms like Huawei could pour into R&D.

By instead allowing Nvidia sales, China’s industry would remain hooked on US chips, the thinking goes. And Nvidia could use those funds—perhaps $10–15 billion annually, Bloomberg Intelligence has estimated—to further its own R&D efforts. That cash influx, theoretically, would allow Nvidia to maintain the US advantage.

Along the way, the US would receive a 25 percent cut of sales, an arrangement that lawmakers from both sides of the aisle warned may not be legal and that signals to foreign rivals that US national security is “now up for sale,” the NYT reported. The president has claimed the sales come with conditions to safeguard national security but, frustrating critics, has provided no details.
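Pairing the 25 percent cut with Bloomberg Intelligence’s $10–15 billion annual sales estimate cited above implies a rough range for the government’s take:

```python
# Pairing the 25 percent cut with Bloomberg Intelligence's $10-15 billion
# annual sales estimate gives a rough range for the government's take:
for annual_sales in (10e9, 15e9):
    cut = annual_sales * 0.25
    print(f"25% of ${annual_sales / 1e9:.0f}B in sales: ${cut / 1e9:.2f}B per year")
# 25% of $10B in sales: $2.50B per year
# 25% of $15B in sales: $3.75B per year
```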

Experts slam Nvidia plan as “flawed”

Trump’s plan is “flawed,” The Economist reported.

For years, the US has established tech dominance by keeping advanced technology away from China. Trump risks rocking that boat by “tearing up America’s export-control policy,” particularly if China’s chip industry simply buys up the H200s as a short-term tactic to learn from the technology and beef up its domestic production of advanced chips, The Economist reported.

In a sign that’s exactly what many expect could happen, investors in China were apparently so excited by Trump’s announcement that they immediately poured money into Moore Threads, expected to be China’s best answer to Nvidia, the South China Morning Post reported.

Several experts at the nonpartisan think tank the Council on Foreign Relations also criticized the policy change, cautioning that the reversal threatens to undermine US competition with China.

Suggesting that Trump was “effectively undoing” export curbs sought during his first term, Zongyuan Zoe Liu warned that China “buys today to learn today, with the intention to build tomorrow.”

And perhaps more concerning, she suggested, is that Trump’s policy signals weakness. Rather than forcing Chinese dependence on US tech, reversing course showed China that the US will “back down” under pressure, she warned. And they’re getting that message at a time when “Chinese leaders have a lot of reasons to believe they are not only winning the trade war but also making progress towards a higher degree of strategic autonomy.”

In a post on X, Rush Doshi—a CFR expert who previously advised Biden on national security issues related to China—suggested that the policy change was “possibly decisive in the AI race.”

“Compute is our main advantage—China has more power, engineers, and the entire edge layer—so by giving this up, we increase the odds the world runs on Chinese AI,” Doshi wrote.

Experts fear Trump may not understand the full impact of his decision. In the short term, Michael C. Horowitz wrote for CFR, “it is indisputable” that allowing H200 exports benefits China’s frontier AI efforts and its push to scale data centers. And Doshi pointed out that Trump’s shift may trigger more advanced technology flowing into China, as US allies that restricted sales of chipmaking equipment may follow his lead and lift their curbs. Any influx of advanced tech could also speed China’s push for self-reliance: Sullivan warned that China’s leaders “intend to get off of American semiconductors as soon as they can.”

“So, the argument that we can keep them ‘addicted’ holds no water,” Sullivan said. “They want American chips right now for one simple reason: They are behind in the AI race, and this will help them catch up while they build their own chip capabilities.”

China may reject H200, demand Blackwell access

It remains unclear if China will approve H200 sales, but some of the country’s biggest firms, including ByteDance, Tencent, and Alibaba, are interested, anonymous insider sources told Reuters.

In the past, China has instructed companies to avoid Nvidia, warning of possible backdoors giving Nvidia a kill switch to remotely shut down chips. Such backdoors could potentially destabilize Chinese firms’ operations and R&D. Nvidia has denied such backdoors exist, but Chinese firms have reportedly sought reassurances from Nvidia in the aftermath of Trump’s policy change. Likely just as unpopular with Chinese firms and their government, Nvidia recently confirmed that it has built location verification tech that could help the US detect when restricted chips are smuggled into China. Should the US ever renew export curbs on the H200, Chinese firms that adopted the chips widely could face chaos.

Without giving China the reassurances it seeks, Nvidia may not end up benefiting as much as it hoped from its bid to reclaim lost revenue in the Chinese market. Today, Chinese firms control about 60 percent of China’s AI chip market, where only a few years ago American firms, led by Nvidia, controlled 80 percent, The Economist reported.

But for China, the temptation to buy up Nvidia chips may be too great to pass up. Another CFR expert, Chris McGuire, estimated that Nvidia could suddenly start exporting as many as 3 million H200s into China next year. “This would at least triple the amount of aggregate AI computing power China could add domestically” in 2026, McGuire wrote, and possibly trigger disastrous outcomes for the US.

“This could cause DeepSeek and other Chinese AI developers to close the gap with leading US AI labs and enable China to develop an ‘AI Belt and Road’ initiative—a complement to its vast global infrastructure investment network already in place—that competes with US cloud providers around the world,” McGuire forecasted.

As China weighs the benefits and risks, the Chinese government called an emergency meeting to discuss concerns about local firms buying the chips, according to The Information. Beijing reportedly ended that meeting with a promise to issue a decision soon.

Horowitz suggested that a primary reason that China may reject the H200s could be to squeeze even bigger concessions out of Trump, whose administration recently has been working to maintain a tenuous truce with China.

“China could come back demanding the Blackwell or something else,” Horowitz suggested.

In a statement, Nvidia—which plans to release a chip called the Rubin to surpass the Blackwell soon—praised Trump’s policy as striking “a thoughtful balance that is great for America.”

China will rip off Nvidia’s chips, Republican warns

Both Democratic and Republican lawmakers in Congress criticized Trump’s plan, including senators behind a bipartisan push to limit AI chip sales to China.

Some have questioned how much thought was put into the policy, as the US confusingly continues restricting less advanced AI chips (like the A100 and H100) while green-lighting H200 sales. Trump’s Justice Department also seems to be struggling to keep up. The NYT noted that just “hours before” Trump announced the policy change, the DOJ announced “it had detained two people for selling those chips to the country.”

The chair of the Select Committee on Competition with China, Rep. John Moolenaar (R-Mich.), warned on X that the news wouldn’t be good for the US or Nvidia. First, the Chinese Communist Party “will use these highly advanced chips to strengthen its military capabilities and totalitarian surveillance,” he suggested. And second, “Nvidia should be under no illusions—China will rip off its technology, mass produce it themselves, and seek to end Nvidia as a competitor.”

“That is China’s playbook and it is using it in every critical industry,” Moolenaar said.

House Democrats on committees dealing with foreign affairs and competition with China echoed those concerns, The Hill reported, warning that “under this administration, our national security is for sale.”

Nvidia’s Huang seems pleased with the outcome, which follows months of pressure from the CEO on the administration to lift export curbs limiting Nvidia’s growth in Chinese markets, according to the NYT. Last week, Trump heaped praise on Huang after one meeting, calling him a “smart man” and suggesting the Nvidia chief has “done an amazing job” helping Trump understand the stakes.

At an October news conference ahead of the deal’s official approval, Huang suggested that government lawyers were researching ways to get around a US law that prohibits charging companies fees for export licenses. Eventually, Trump is expected to release a policy that outlines how the US will collect those fees without conflicting with that law.

Senate Democrats appear unlikely to embrace such a policy, issuing a joint statement condemning the H200 sales as dooming the US in the AI race and threatening national security.

“Access to these chips would give China’s military transformational technology to make its weapons more lethal, carry out more effective cyberattacks against American businesses and critical infrastructure and strengthen their economic and manufacturing sector,” the senators wrote.


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



Google tells employees it must double capacity every 6 months to meet AI demand

While AI bubble talk fills the air these days, with fears of overinvestment that could pop at any time, something of a contradiction is brewing on the ground: Companies like Google and OpenAI can barely build infrastructure fast enough to fill their AI needs.

During an all-hands meeting earlier this month, Google’s AI infrastructure head Amin Vahdat told employees that the company must double its serving capacity every six months to meet demand for artificial intelligence services, CNBC reports. The comments offer a rare look at what Google executives are telling their own employees. Vahdat, a vice president at Google Cloud, presented slides showing that the company needs to scale “the next 1000x in 4-5 years.”

While a thousandfold increase in compute capacity sounds ambitious by itself, Vahdat noted some key constraints: Google needs to be able to deliver this increase in capability, compute, and storage networking “for essentially the same cost and increasingly, the same power, the same energy level,” he told employees during the meeting. “It won’t be easy but through collaboration and co-design, we’re going to get there.”
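The two figures Vahdat cited are consistent with each other; a quick sanity check of the compounding:

```python
# One doubling every six months is two doublings per year:
years = 5
doublings = years * 2
print(f"{doublings} doublings in {years} years -> {2 ** doublings}x")  # 1024x
# So "double every six months" and "1000x in 4-5 years" describe the
# same growth curve.
```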

It’s unclear how much of this “demand” represents organic user interest in AI capabilities versus Google integrating AI features into existing services like Search, Gmail, and Workspace. But whether users have adopted the features voluntarily or not, Google isn’t the only tech company struggling to keep up with growing use of AI services.

Major tech companies are in a race to build out data centers. Google competitor OpenAI is planning to build six massive data centers across the US through its Stargate partnership project with SoftBank and Oracle, committing over $400 billion in the next three years to reach nearly 7 gigawatts of capacity. The company faces similar constraints serving its 800 million weekly ChatGPT users, with even paid subscribers regularly hitting usage limits for features like video synthesis and simulated reasoning models.

“The competition in AI infrastructure is the most critical and also the most expensive part of the AI race,” Vahdat said at the meeting, according to CNBC’s viewing of the presentation. The infrastructure executive explained that Google’s challenge goes beyond simply outspending competitors. “We’re going to spend a lot,” he said, but noted the real objective is building infrastructure that is “more reliable, more performant and more scalable than what’s available anywhere else.”



OpenAI signs massive AI compute deal with Amazon

On Monday, OpenAI announced it has signed a seven-year, $38 billion deal to buy cloud services from Amazon Web Services to power products like ChatGPT and Sora. It’s the company’s first big computing deal after a fundamental restructuring last week that gave OpenAI more operational and financial freedom from Microsoft.

The agreement gives OpenAI access to hundreds of thousands of Nvidia graphics processors to train and run its AI models. “Scaling frontier AI requires massive, reliable compute,” OpenAI CEO Sam Altman said in a statement. “Our partnership with AWS strengthens the broad compute ecosystem that will power this next era and bring advanced AI to everyone.”

OpenAI will reportedly use Amazon Web Services immediately, with all planned capacity set to come online by the end of 2026 and room to expand further in 2027 and beyond. Amazon plans to roll out hundreds of thousands of chips, including Nvidia’s GB200 and GB300 AI accelerators, in data clusters built to power ChatGPT’s responses, generate AI videos, and train OpenAI’s next wave of models.

Wall Street apparently liked the deal, because Amazon shares hit an all-time high on Monday morning. Meanwhile, shares for long-time OpenAI investor and partner Microsoft briefly dipped following the announcement.

Massive AI compute requirements

It’s no secret that running generative AI models for hundreds of millions of people currently requires a lot of computing power. Amid chip shortages over the past few years, finding sources of that computing muscle has been tricky. OpenAI is reportedly working on its own GPU hardware to help alleviate the strain.

But for now, the company needs to find new sources of Nvidia chips, which accelerate AI computations. Altman has previously said that the company plans to spend $1.4 trillion to develop 30 gigawatts of computing resources, roughly enough power for 25 million US homes, according to Reuters.
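A quick check of that homes comparison; the roughly 1.2-kilowatt average continuous draw per household is an assumption, consistent with typical US consumption figures:

```python
# Checking the homes comparison; the ~1.2 kW average continuous draw per
# US household is an assumption (about 10,500 kWh per year):
total_watts = 30e9            # 30 gigawatts
assumed_home_watts = 1_200    # average continuous draw per home
homes = total_watts / assumed_home_watts
print(f"Roughly {homes / 1e6:.0f} million homes")  # ~25 million
```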



Nvidia hits record $5 trillion mark as CEO dismisses AI bubble concerns

Partnerships and government contracts fuel optimism

At the GTC conference on Tuesday, Nvidia’s CEO went out of his way to repeatedly praise Donald Trump and his policies for accelerating domestic tech investment while warning that excluding China from Nvidia’s ecosystem could limit US access to half the world’s AI developers. The overall event stressed Nvidia’s role as an American company, with Huang even nodding to Trump’s signature slogan in his sign-off by thanking the audience for “making America great again.”

Trump’s cooperation is paramount for Nvidia because US export controls have effectively blocked Nvidia’s AI chips from China, costing the company billions of dollars in revenue. Bob O’Donnell of TECHnalysis Research told Reuters that “Nvidia clearly brought their story to DC to both educate and gain favor with the US government. They managed to hit most of the hottest and most influential topics in tech.”

Beyond the political messaging, Huang announced a series of partnerships and deals that apparently helped ease investor concerns about Nvidia’s future. The company announced collaborations with Uber Technologies, Palantir Technologies, and CrowdStrike Holdings, among others. Nvidia also revealed a $1 billion investment in Nokia to support the telecommunications company’s shift toward AI and 6G networking.

The agreement with Uber will power a fleet of 100,000 self-driving vehicles with Nvidia technology, with automaker Stellantis among the first to deliver the robotaxis. Palantir will pair Nvidia’s technology with its Ontology platform to use AI techniques for logistics insights, with Lowe’s as an early adopter. Eli Lilly plans to build what Nvidia described as the most powerful supercomputer owned and operated by a pharmaceutical company, relying on more than 1,000 Blackwell AI accelerator chips.

The $5 trillion valuation surpasses the total cryptocurrency market value and equals roughly half the size of the pan-European Stoxx 600 equities index, Reuters notes. At current prices, Huang’s stake in Nvidia would be worth about $179.2 billion, making him the world’s eighth-richest person.



Ars Live recap: Is the AI bubble about to pop? Ed Zitron weighs in.


Despite connection hiccups, we covered OpenAI’s finances, nuclear power, and Sam Altman.

On Tuesday of last week, Ars Technica hosted a live conversation with Ed Zitron, host of the Better Offline podcast and one of tech’s most vocal AI critics, to discuss whether the generative AI industry is experiencing a bubble and when it might burst. My Internet connection had other plans, though, dropping out multiple times and forcing Ars Technica’s Lee Hutchinson to jump in as an excellent emergency backup host.

During the times my connection cooperated, Zitron and I covered OpenAI’s financial issues, lofty infrastructure promises, and why the AI hype machine keeps rolling despite some arguably shaky economics underneath. Lee’s probing questions about per-user costs revealed a potential flaw in AI subscription models: Companies can’t predict whether a user will cost them $2 or $10,000 per month.

You can watch a recording of the event on YouTube or in the window below.

Our discussion with Ed Zitron.

“A 50 billion-dollar industry pretending to be a trillion-dollar one”

I started by asking Zitron the most direct question I could: “Why are you so mad about AI?” His answer got right to the heart of his critique: the disconnect between AI’s actual capabilities and how it’s being sold. “Because everybody’s acting like it’s something it isn’t,” Zitron said. “They’re acting like it’s this panacea that will be the future of software growth, the future of hardware growth, the future of compute.”

In one of his newsletters, Zitron describes the generative AI market as “a 50 billion dollar revenue industry masquerading as a one trillion-dollar one.” He pointed to OpenAI’s financial burn rate (an estimated $9.7 billion lost in the first half of 2025 alone) as evidence that the economics don’t work, an argument he pairs with a heavy dose of pessimism about AI in general.

Donald Trump listens as Nvidia CEO Jensen Huang speaks at the White House during an event on “Investing in America” on April 30, 2025, in Washington, DC. Credit: Andrew Harnik / Staff | Getty Images News

“The models just do not have the efficacy,” Zitron said during our conversation. “AI agents is one of the most egregious lies the tech industry has ever told. Autonomous agents don’t exist.”

He contrasted the relatively small revenue generated by AI companies with the massive capital expenditures flowing into the sector. Even major cloud providers and chip makers are showing strain. Oracle reportedly lost $100 million in three months after installing Nvidia’s new Blackwell GPUs, which Zitron noted are “extremely power-hungry and expensive to run.”

Finding utility despite the hype

I pushed back against some of Zitron’s broader dismissals of AI by sharing my own experience. I use AI chatbots frequently for brainstorming useful ideas and helping me see them from different angles. “I find I use AI models as sort of knowledge translators and framework translators,” I explained.

After experiencing brain fog from repeated bouts of COVID over the years, I’ve also found tools like ChatGPT and Claude especially helpful for memory augmentation that pierces through brain fog: describing something in a roundabout, fuzzy way and quickly getting an answer I can then verify. Along these lines, I’ve previously written about how people in a UK study found AI assistants useful accessibility tools.

Zitron acknowledged this could be useful for me personally but declined to draw any larger conclusions from my one data point. “I understand how that might be helpful; that’s cool,” he said. “I’m glad that that helps you in that way; it’s not a trillion-dollar use case.”

He also shared his own attempts at using AI tools, including experimenting with Claude Code despite not being a coder himself.

“If I liked [AI] somehow, it would be actually a more interesting story because I’d be talking about something I liked that was also onerously expensive,” Zitron explained. “But it doesn’t even do that, and it’s actually one of my core frustrations, it’s like this massive over-promise thing. I’m an early adopter guy. I will buy early crap all the time. I bought an Apple Vision Pro, like, what more do you say there? I’m ready to accept issues, but AI is all issues, it’s all filler, no killer; it’s very strange.”

Zitron and I agree that current AI assistants are being marketed beyond their actual capabilities. As I often say, AI models are not people, and they are not good factual references. As such, they cannot replace human decision-making and cannot wholesale replace human intellectual labor (at the moment). Instead, I see AI models as augmentations of human capability: as tools rather than autonomous entities.

Computing costs: History versus reality

Even though Zitron and I found some common ground on AI hype, I expressed a belief that criticism of the cost and power requirements of operating AI models will eventually become moot.

I attempted to make that case by noting that computing costs historically trend downward over time, referencing the Air Force’s SAGE computer system from the 1950s: a four-story building that performed 75,000 operations per second while consuming two megawatts of power. Today, pocket-sized phones deliver millions of times more computing power in a way that would be impossible, power consumption-wise, in the 1950s.

The blockhouse for the Semi-Automatic Ground Environment at Stewart Air Force Base, Newburgh, New York. Credit: Denver Post via Getty Images
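For a sense of scale, here is a rough ops-per-watt comparison; the smartphone figures are assumptions for illustration, not measurements:

```python
# Rough ops-per-watt comparison; the smartphone figures are assumptions
# for illustration, not measurements:
sage_ops, sage_watts = 75_000, 2_000_000   # from the article
phone_ops, phone_watts = 1e12, 5           # assumed: ~1 TOPS at ~5 W

improvement = (phone_ops / phone_watts) / (sage_ops / sage_watts)
print(f"Ops per watt improved roughly {improvement:.1e}x since SAGE")
# ~5.3e+12, i.e., trillions of times more work per watt
```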

“I think it will eventually work that way,” I said, suggesting that AI inference costs might follow similar patterns of improvement over years and that AI tools will eventually become commodity components of computer operating systems. Basically, even if AI models stay inefficient, AI models of a certain baseline usefulness and capability will still be cheaper to train and run in the future because the computing systems they run on will be faster, cheaper, and less power-hungry as well.

Zitron pushed back on this optimism, saying that AI costs are currently moving in the wrong direction. “The costs are going up, unilaterally across the board,” he said. Even newer systems like Cerebras and Groq can generate results faster but not cheaper. He also questioned whether integrating AI into operating systems would prove useful even if the technology became profitable, since AI models struggle with deterministic commands and consistent behavior.

The power problem and circular investments

One of Zitron’s most pointed criticisms during the discussion centered on OpenAI’s infrastructure promises. The company has pledged to build data centers requiring 10 gigawatts of power capacity (equivalent to 10 nuclear power plants, I once pointed out) for its Stargate project in Abilene, Texas. According to Zitron’s research, the town currently has only 350 megawatts of generating capacity and a 200-megawatt substation.
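A quick look at the gap between what’s pledged and what Abilene has today, using the article’s figures:

```python
# The gap between pledged and existing capacity, using the article's figures:
pledged_mw = 10_000      # 10 gigawatts pledged
generation_mw = 350      # Abilene's current generating capacity
substation_mw = 200      # Abilene's substation
print(f"Pledged vs. local generation: {pledged_mw / generation_mw:.0f}x")  # ~29x
print(f"Pledged vs. substation:       {pledged_mw / substation_mw:.0f}x")  # 50x
```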

“A gigawatt of power is a lot, and it’s not like Red Alert 2,” Zitron said, referencing the real-time strategy game. “You don’t just build a power station and it happens. There are months of actual physics to make sure that it doesn’t kill everyone.”

He believes many announced data centers will never be completed, calling the infrastructure promises “castles on sand” that nobody in the financial press seems willing to question directly.


After another technical blackout on my end, I came back online and asked Zitron to define the scope of the AI bubble. He says it has evolved from one bubble (foundation models) into two or three, now including AI compute companies like CoreWeave and the market’s obsession with Nvidia.

Zitron highlighted what he sees as essentially circular investment schemes propping up the industry. He pointed to OpenAI’s $300 billion deal with Oracle and Nvidia’s relationship with CoreWeave as examples. “CoreWeave, they literally… They funded CoreWeave, became their biggest customer, then CoreWeave took that contract and those GPUs and used them as collateral to raise debt to buy more GPUs,” Zitron explained.

When will the bubble pop?

Zitron predicted the bubble would burst within the next year and a half, though he acknowledged it could happen sooner. He expects a cascade of events rather than a single dramatic collapse: An AI startup will run out of money, triggering panic among other startups and their venture capital backers, creating a fire-sale environment that makes future fundraising impossible.

“It’s not gonna be one Bear Stearns moment,” Zitron explained. “It’s gonna be a succession of events until the markets freak out.”

The crux of the problem, according to Zitron, is Nvidia. The chip maker’s stock represents 7 to 8 percent of the S&P 500’s value, and the broader market has become dependent on Nvidia’s continued hyper growth. When Nvidia posted “only” 55 percent year-over-year growth in January, the market wobbled.

“Nvidia’s growth is why the bubble is inflated,” Zitron said. “If their growth goes down, the bubble will burst.”

He also warned of broader consequences: “I think there’s a depression coming. I think once the markets work out that tech doesn’t grow forever, they’re gonna flush the toilet aggressively on Silicon Valley.” This connects to his larger thesis: that the tech industry has run out of genuine hyper-growth opportunities and is trying to manufacture one with AI.

“Is there anything that would falsify your premise of this bubble and crash happening?” I asked. “What if you’re wrong?”

“I’ve been answering ‘What if you’re wrong?’ for a year-and-a-half to two years, so I’m not bothered by that question, so the thing that would have to prove me right would’ve already needed to happen,” he said. Amid a longer exposition about Sam Altman, Zitron said, “The thing that would’ve had to happen with inference would’ve had to be… it would have to be hundredths of a cent per million tokens, they would have to be printing money, and then, it would have to be way more useful. It would have to have efficacy that it does not have, the hallucination problems… would have to be fixable, and on top of this, someone would have to fix agents.”

A positivity challenge

Near the end of our conversation, I wondered if I could flip the script, so to speak, and see if he could say something positive or optimistic, although I chose the most challenging subject possible for him. “What’s the best thing about Sam Altman?” I asked. “Can you say anything nice about him at all?”

“I understand why you’re asking this,” Zitron started, “but I wanna be clear: Sam Altman is going to be the reason the markets take a crap. Sam Altman has lied to everyone. Sam Altman has been lying forever.” He continued, “Like the Pied Piper, he’s led the markets into an abyss, and yes, people should have known better, but I hope at the end of this, Sam Altman is seen for what he is, which is a con artist and a very successful one.”

Then he added, “You know what? I’ll say something nice about him, he’s really good at making people say, ‘Yes.’”


Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.



Nvidia sells tiny new computer that puts big AI on your desktop

The DGX Spark is an ARM-based system that runs Nvidia’s DGX OS, an Ubuntu Linux-based operating system built specifically for GPU processing. It comes with Nvidia’s AI software stack preinstalled, including CUDA libraries and the company’s NIM microservices.

Prices for the DGX Spark start at US $3,999. That may seem like a lot, but given the cost of high-end GPUs with ample video RAM like the RTX Pro 6000 (about $9,000) or AI server GPUs (like $25,000 for a base-level H100), the DGX Spark may represent a far less expensive option overall, though it’s not nearly as powerful.

In fact, according to The Register, the GPU computing performance of the GB10 chip is roughly equivalent to an RTX 5070. However, the 5070 is limited to 12GB of video memory, which restricts the size of the AI models it can run. With 128GB of unified memory, the DGX Spark can run far larger models, albeit more slowly than, say, an RTX 5090 (which ships with 32GB of video memory). For example, running the larger, 120 billion-parameter version of OpenAI’s recent gpt-oss language model requires about 80GB of memory, far more than any consumer GPU offers.
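A rough way to estimate the memory a model needs for inference is to multiply parameter count by bytes per parameter, then add headroom for activations and the KV cache. A minimal sketch, in which the 25 percent overhead factor is an assumption:

```python
# Estimate inference memory as parameters x bytes per parameter, plus
# headroom for activations and KV cache (the 25 percent overhead factor
# is an assumption for illustration):
def estimated_memory_gb(params_billions: float, bits_per_param: float,
                        overhead: float = 1.25) -> float:
    weight_bytes = params_billions * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1e9

# gpt-oss-120B ships with 4-bit (MXFP4) weights:
print(f"{estimated_memory_gb(120, 4):.0f} GB")   # ~75 GB, near the ~80 GB cited
# At 16-bit, the same model would overflow even the Spark's 128GB:
print(f"{estimated_memory_gb(120, 16):.0f} GB")  # ~300 GB
```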

A callback to 2016

Nvidia founder and CEO Jensen Huang marked the occasion of the DGX Spark launch by personally delivering one of the first units to Elon Musk at SpaceX’s Starbase facility in Texas, echoing a similar delivery Huang made to Musk at OpenAI in 2016.

“In 2016, we built DGX-1 to give AI researchers their own supercomputer. I hand-delivered the first system to Elon at a small startup called OpenAI, and from it came ChatGPT,” Huang said in a statement. “DGX-1 launched the era of AI supercomputers and unlocked the scaling laws that drive modern AI. With DGX Spark, we return to that mission.”



AMD wins massive AI chip deal from OpenAI with stock sweetener

As part of the arrangement, AMD will allow OpenAI to purchase up to 160 million AMD shares at 1 cent each throughout the chips deal.
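The penny-per-share warrant is nearly free money if exercised; a quick sketch of the math, in which the market price is hypothetical, for illustration only:

```python
# Exercise cost vs. paper value of the warrant; the market price is
# hypothetical, for illustration only:
shares = 160e6
exercise_price = 0.01            # from the announcement
assumed_market_price = 150.00    # hypothetical AMD share price

print(f"Exercise cost: ${shares * exercise_price / 1e6:.1f}M")        # $1.6M
print(f"Paper value:   ${shares * assumed_market_price / 1e9:.0f}B")  # $24B at $150/share
```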

OpenAI diversifies its chip supply

With demand for AI compute growing rapidly, companies like OpenAI have been looking for secondary supply lines and sources of additional computing capacity, and the AMD partnership is part of the company’s wider effort to secure sufficient computing power for its AI operations. In September, Nvidia announced an investment of up to $100 billion in OpenAI that included supplying at least 10 gigawatts of Nvidia systems. OpenAI plans to deploy a gigawatt of Nvidia’s next-generation Vera Rubin chips in late 2026.

OpenAI has worked with AMD for years, according to Reuters, providing input on the design of older generations of AI chips such as the MI300X. The new agreement calls for deploying the equivalent of 6 gigawatts of computing power using AMD chips over multiple years.

Beyond working with chip suppliers, OpenAI is widely reported to be developing its own silicon for AI applications and has partnered with Broadcom, as we reported in February. A person familiar with the matter told Reuters the AMD deal does not change OpenAI’s ongoing compute plans, including its chip development effort or its partnership with Microsoft.



Why does OpenAI need six giant data centers?

Training next-generation AI models compounds the problem. On top of running existing AI models like those that power ChatGPT, OpenAI is constantly working on new technology in the background. It’s a process that requires thousands of specialized chips running continuously for months.

The circular investment question

The financial structure of these deals between OpenAI, Oracle, and Nvidia has drawn scrutiny from industry observers. Earlier this week, Nvidia announced it would invest up to $100 billion as OpenAI deploys Nvidia systems. As Bryn Talkington of Requisite Capital Management told CNBC: “Nvidia invests $100 billion in OpenAI, which then OpenAI turns back and gives it back to Nvidia.”

Oracle’s arrangement follows a similar pattern, with a reported $30 billion-per-year deal where Oracle builds facilities that OpenAI pays to use. This circular flow, which involves infrastructure providers investing in AI companies that become their biggest customers, has raised eyebrows about whether these represent genuine economic investments or elaborate accounting maneuvers.

The arrangements are becoming even more convoluted. The Information reported this week that Nvidia is discussing leasing its chips to OpenAI rather than selling them outright. Under this structure, Nvidia would create a separate entity to purchase its own GPUs, then lease them to OpenAI, which adds yet another layer of circular financial engineering to this complicated relationship.

“NVIDIA seeds companies and gives them the guaranteed contracts necessary to raise debt to buy GPUs from NVIDIA, even though these companies are horribly unprofitable and will eventually die from a lack of any real demand,” wrote tech critic Ed Zitron on Bluesky last week about the unusual flow of AI infrastructure investments. Zitron was referring to companies like CoreWeave and Lambda Labs, which have raised billions in debt to buy Nvidia GPUs based partly on contracts from Nvidia itself. It’s a pattern that mirrors OpenAI’s arrangements with Oracle and Nvidia.

So what happens if the bubble pops? Even Altman himself warned last month that “someone will lose a phenomenal amount of money” in what he called an AI bubble. If AI demand fails to meet these astronomical projections, the massive physical data centers won’t simply vanish. When the dot-com bubble burst in 2001, fiber optic cable laid during the boom years eventually found use as Internet demand caught up. Similarly, these facilities could potentially pivot to cloud services, scientific computing, or other workloads, but possibly at massive losses for investors who paid AI-boom prices.



OpenAI links up with Broadcom to produce its own AI chips

OpenAI is set to produce its own artificial intelligence chip for the first time next year, as the ChatGPT maker attempts to address insatiable demand for computing power and reduce its reliance on chip giant Nvidia.

The chip, co-designed with US semiconductor giant Broadcom, would ship next year, according to multiple people familiar with the partnership.

Broadcom’s chief executive Hock Tan on Thursday referred to a mystery new customer committing to $10 billion in orders.

OpenAI’s move follows the strategy of tech giants such as Google, Amazon and Meta, which have designed their own specialised chips to run AI workloads. The industry has seen huge demand for the computing power to train and run AI models.

OpenAI planned to put the chip to use internally, according to one person close to the project, rather than make it available to external customers.

Last year it began an initial collaboration with Broadcom, according to reports at the time, but the timeline for mass production of a successful chip design had previously been unclear.

On a call with analysts, Tan announced that Broadcom had secured a fourth major customer for its custom AI chip business, as it reported earnings that topped Wall Street estimates.

Broadcom does not disclose the names of these customers, but people familiar with the matter confirmed OpenAI was the new client. Broadcom and OpenAI declined to comment.
