Author name: 9u50fv

An in-space propulsion company just raised a staggering amount of money

Starting small

The company’s initial product was the Mira spacecraft, powered by nitrous oxide and ethane thrusters. It can move payloads of up to 300 kg around in space, and for a 100 kg payload, it offers 900 m/s of delta-v. With Mira, Impulse sought to tackle the problem of mobility once a spacecraft reached orbit.
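For a rough sense of why the quoted delta-v is tied to a specific payload mass, the ideal rocket equation can be sketched in Python. The specific impulse, dry mass, and propellant mass below are hypothetical placeholders chosen for illustration, not Impulse Space figures, so the output will not match the article's 900 m/s; only the 100 kg and 300 kg payload masses come from the text above.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2, used to convert specific impulse to exhaust velocity


def delta_v(isp_s: float, dry_kg: float, propellant_kg: float, payload_kg: float) -> float:
    """Ideal (Tsiolkovsky) delta-v in m/s for a single burn-to-depletion."""
    exhaust_velocity = isp_s * STANDARD_GRAVITY
    initial_mass = dry_kg + propellant_kg + payload_kg
    final_mass = dry_kg + payload_kg
    return exhaust_velocity * math.log(initial_mass / final_mass)


if __name__ == "__main__":
    # Hypothetical vehicle parameters (assumptions, not published specs).
    ISP_S, DRY_KG, PROPELLANT_KG = 300.0, 250.0, 150.0
    for payload in (100.0, 300.0):
        dv = delta_v(ISP_S, DRY_KG, PROPELLANT_KG, payload)
        print(f"payload {payload:.0f} kg -> {dv:.0f} m/s")
```

Whatever the real vehicle parameters are, the same relationship holds: with a fixed propellant load, a heavier payload leaves less delta-v available.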

Mira proved a success almost immediately, with the first vehicle launching in 2023 and operating for a year in space, demonstrating ample mobility before finally depleting its propellant tanks. A second mission, LEO Express-2, launched in January with several hosted payloads and, so far, has met all of its objectives. The mission remains ongoing.

Initially, it was believed that this vehicle would be useful for providing “last mile” services for spacecraft launched as a part of rideshare missions.

“The reality is the market for that is not very good,” Romo said. “If you’re gonna size that market, it’s basically the market Rocket Lab serves today, which is 25 to 30 flights a year, which is fine. You can do that, but not economically very well. Your gross margins won’t be good. Your working capital kind of sucks. So that’s not at all the market that we’re after with Mira.”

Since Mira has had ample success during its first two flights, other customers have taken notice.

“It’s a high-thrust, high-maneuverability spacecraft that can operate anywhere up to GEO,” Romo said. “And so when you’re thinking about space defense and space control, they need rapid response. So we’ll move from one part of GEO to another very rapidly. And we can host payloads, like what Anduril makes, such as electronic warfare payloads, and then potentially doing proximity ops missions. So Mira wasn’t necessarily designed out of the gate for that, but what we found out after we flew it successfully was, the Space Force said, ‘Hey, we know what that thing’s for.’”

“Godfather” of AI calls out latest models for lying to users

One of the “godfathers” of artificial intelligence has attacked a multibillion-dollar race to develop the cutting-edge technology, saying the latest models are displaying dangerous characteristics such as lying to users.

Yoshua Bengio, a Canadian academic whose work has informed techniques used by top AI groups such as OpenAI and Google, said: “There’s unfortunately a very competitive race between the leading labs, which pushes them towards focusing on capability to make the AI more and more intelligent, but not necessarily put enough emphasis and investment on research on safety.”

The Turing Award winner issued his warning in an interview with the Financial Times, while launching a new non-profit called LawZero. He said the group would focus on building safer systems, vowing to “insulate our research from those commercial pressures.”

LawZero has so far raised nearly $30 million in philanthropic contributions from donors including Skype founding engineer Jaan Tallinn, former Google chief Eric Schmidt’s philanthropic initiative, as well as Open Philanthropy and the Future of Life Institute.

Many of Bengio’s funders subscribe to the “effective altruism” movement, whose supporters tend to focus on catastrophic risks surrounding AI models. Critics argue the movement highlights hypothetical scenarios while ignoring current harms, such as bias and inaccuracies.

Bengio said his not-for-profit group was founded in response to growing evidence over the past six months that today’s leading models were developing dangerous capabilities. This includes showing “evidence of deception, cheating, lying and self-preservation,” he said.

Anthropic’s Claude Opus model blackmailed engineers in a fictitious scenario where it was at risk of being replaced by another system. Research from AI testers Palisade last month showed that OpenAI’s o3 model refused explicit instructions to shut down.

Bengio said such incidents were “very scary, because we don’t want to create a competitor to human beings on this planet, especially if they’re smarter than us.”

The AI pioneer added: “Right now, these are controlled experiments [but] my concern is that any time in the future, the next version might be strategically intelligent enough to see us coming from far away and defeat us with deceptions that we don’t anticipate. So I think we’re playing with fire right now.”

Milky Way galaxy might not collide with Andromeda after all

100,000 computer simulations reveal Milky Way’s fate—and it might not be what we thought.

It’s been textbook knowledge for over a century that our Milky Way galaxy is doomed to collide with another large spiral galaxy, Andromeda, in the next 5 billion years and merge into one even bigger galaxy. But a fresh analysis published in the journal Nature Astronomy is casting that longstanding narrative in a more uncertain light. The authors conclude that the likelihood of this collision and merger is closer to the odds of a coin flip, with a roughly 50 percent probability that the two galaxies will avoid such an event during the next 10 billion years.

Both the Milky Way and the Andromeda galaxies (M31) are part of what’s known as the Local Group (LG), which also hosts other smaller galaxies (some not yet discovered) as well as dark matter (per the prevailing standard cosmological model). Both already have remnants of past mergers and interactions with other galaxies, according to the authors.

“Predicting future mergers requires knowledge about the present coordinates, velocities, and masses of the systems partaking in the interaction,” the authors wrote. That involves not just the gravitational force between them but also dynamical friction. It’s the latter that dominates when galaxies are headed toward a merger, since it causes galactic orbits to decay.

This latest analysis is the result of combining data from the Hubble Space Telescope and the European Space Agency’s (ESA) Gaia space telescope to perform 100,000 Monte Carlo computer simulations, taking into account not just the Milky Way and Andromeda but the full LG system. Those simulations yielded a very different prediction: There is approximately a 50/50 chance of the galaxies colliding within the next 10 billion years. There is still a 2 percent chance that they will collide in the next 4 to 5 billion years. “Based on the best available data, the fate of our galaxy is still completely open,” the authors concluded.
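The general Monte Carlo idea described above (draw uncertain inputs many times, evaluate each draw, and count outcomes) can be sketched in Python. This is a toy two-body model, not the authors' method: it ignores dynamical friction and the rest of the Local Group, and every number in it (separation, velocities, masses, their spreads, and the "close pass" threshold) is a hypothetical placeholder. The fraction it prints is not a scientific result.

```python
import math
import random

G = 4.30091e-6  # gravitational constant in kpc * (km/s)^2 / solar mass


def pericenter_kpc(sep_kpc: float, v_radial: float, v_tangential: float,
                   total_mass_msun: float) -> float:
    """Closest approach of a Keplerian two-body orbit from its current state."""
    gm = G * total_mass_msun
    energy = 0.5 * (v_radial ** 2 + v_tangential ** 2) - gm / sep_kpc
    ang_mom = sep_kpc * v_tangential
    # Eccentricity from specific energy and angular momentum; clamp rounding error.
    ecc = math.sqrt(max(0.0, 1.0 + 2.0 * energy * ang_mom ** 2 / gm ** 2))
    return (ang_mom ** 2 / gm) / (1.0 + ecc)


def close_pass_fraction(n_trials: int, threshold_kpc: float, seed: int = 0) -> float:
    """Fraction of sampled orbits whose pericenter falls below the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Hypothetical measurement uncertainties (assumptions for illustration).
        v_t = abs(rng.gauss(60.0, 30.0))                  # km/s
        mass = max(1.0e11, rng.gauss(3.0e12, 0.5e12))     # solar masses
        if pericenter_kpc(770.0, -110.0, v_t, mass) < threshold_kpc:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    print(close_pass_fraction(100_000, threshold_kpc=100.0))
```

The point of the sketch is only that a small spread in an input such as the tangential velocity turns a single yes-or-no prediction into a probability.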

Broadcom ends business with VMware’s lowest-tier channel partners

Broadcom has cut the lowest tier in its VMware partner program. The move allows the enterprise technology firm to continue its focus on customers with larger VMware deployments, but it also risks more migrations from VMware users and partners.

Broadcom ousts low-tier VMware partners

In a blog post on Sunday, Broadcom executive Brian Moats announced that the Broadcom Advantage Partner Program for VMware Resellers, which became the VMware partner program after Broadcom eliminated the original one in January 2024, would now offer three tiers instead of four. Broadcom is killing the Registered tier, leaving the Pinnacle, Premier, and Select tiers.

The reduction is a result of Broadcom’s “strategic direction” and a “comprehensive partner review” and affects VMware’s Americas, Asia-Pacific, and Japan geographies, Moats wrote. Affected partners are receiving 60 days’ notice, Laura Falko, Broadcom’s head of global partner programs, marketing, and experience, told The Register.

Moats wrote that the “vast majority of customer impact and business momentum comes from partners operating within the top three tiers.”

Similarly, Falko told The Register that most of the removed partners were “inactive and lack the capabilities to support customers through VMware’s evolving private cloud journey.”

Ars asked Broadcom to specify how many removed partners were inactive and what specific capabilities they lacked, but a company representative only directed us to Moats’ blog post.

Canadian managed services provider (MSP) Members IT Group is one of the partners that learned this week that it will no longer be a VMware reseller. CTO Dean Colpitts noted that Members IT Group has been a VMware partner for over 19 years and is also a VMware user. Colpitts previously told Ars that the firm’s VMware business had declined since Broadcom’s acquisition and blamed Broadcom for this:

The only reason we were “inactive” is because of their own stupid greed. We and our customers would have happily continued along even with a 10 or 20 percent increase in price. 50 percent and more with zero warning last year after customers already had their FY24 budgets set was the straw that broke the camel’s back …

We have transacted a couple of deals with [VMware] since the program change, but nothing like we previously would have done before Broadcom took over.

Members IT Group will be moving its client base to Hewlett-Packard Enterprise’s VM Essentials virtualization solution.

Breaking down why Apple TVs are privacy advocates’ go-to streaming device


Using the Apple TV app or an Apple account means giving Apple more data, though.

Credit: Aurich Lawson | Getty Images

Every time I write an article about the escalating advertising and tracking on today’s TVs, someone brings up Apple TV boxes. Among smart TVs, streaming sticks, and other streaming devices, Apple TVs are largely viewed as a safe haven.

“Just disconnect your TV from the Internet and use an Apple TV box.”

That’s the common guidance you’ll hear from Ars readers for those seeking the joys of streaming without giving up too much privacy. Based on our research and the experts we’ve consulted, that advice is pretty solid, as Apple TVs offer significantly more privacy than other streaming hardware providers.

But how private are Apple TV boxes, really? Apple TVs don’t use automatic content recognition (ACR, a user-tracking technology leveraged by nearly all smart TVs and streaming devices), but could that change? And what about the software that Apple TV users do use—could those apps provide information about you to advertisers or Apple?

In this article, we’ll delve into what makes the Apple TV’s privacy stand out and examine whether users should expect the limited ads and enhanced privacy to last forever.

Apple TV boxes limit tracking out of the box

One of the simplest ways Apple TVs ensure better privacy is through their setup process, during which you can disable Siri, location tracking, and sending analytics data to Apple. During setup, users also receive several opportunities to review Apple’s data and privacy policies. Also off by default is the boxes’ ability to send voice input data to Apple.

Most other streaming devices require users to navigate through pages of settings to disable similar tracking capabilities, which most people are unlikely to do. Apple’s approach creates a line of defense against snooping, even for those unaware of how invasive smart devices can be.

Apple TVs running tvOS 14.5 and later also make third-party app tracking more difficult by requiring such apps to request permission before they can track users.

“If you choose Ask App Not to Track, the app developer can’t access the system advertising identifier (IDFA), which is often used to track,” Apple says. “The app is also not permitted to track your activity using other information that identifies you or your device, like your email address.”

Users can access the Apple TV settings and disable the ability of third-party apps to ask permission for tracking. However, Apple could further enhance privacy by enabling this setting by default.

The Apple TV also lets users control which apps can access the set-top box’s Bluetooth functionality, photos, music, and HomeKit data (if applicable), and the remote’s microphone.

“Apple’s primary business model isn’t dependent on selling targeted ads, so it has somewhat less incentive to harvest and monetize incredible amounts of your data,” said RJ Cross, director of the consumer privacy program at the Public Interest Research Group (PIRG). “I personally trust them more with my data than other tech companies.”

What if you share analytics data?

If you allow your Apple TV to share analytics data with Apple or app developers, that data won’t be personally identifiable, Apple says. Any collected personal data is “not logged at all, removed from reports before they’re sent to Apple, or protected by techniques, such as differential privacy,” Apple says.

Differential privacy, which injects noise into collected data, is one of the most common methods used for anonymizing data. In support documentation (PDF), Apple details its use of differential privacy:

The first step we take is to privatize the information using local differential privacy on the user’s device. The purpose of privatization is to assure that Apple’s servers don’t receive clear data. Device identifiers are removed from the data, and it is transmitted to Apple over an encrypted channel. The Apple analysis system ingests the differentially private contributions, dropping IP addresses and other metadata. The final stage is aggregation, where the privatized records are processed to compute the relevant statistics, and the aggregate statistics are then shared with relevant Apple teams. Both the ingestion and aggregation stages are performed in a restricted access environment so even the privatized data isn’t broadly accessible to Apple employees.
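To make the "injects noise" idea concrete, here is a minimal Python sketch of randomized response, a textbook form of local differential privacy. It is not Apple's actual algorithm, and the function names, the epsilon value, and the simulated 30 percent usage rate are all assumptions for illustration. Each device flips its true yes/no answer with some probability before reporting, and the server can still recover an accurate aggregate without trusting any single report.

```python
import math
import random


def randomize_bit(true_bit: bool, epsilon: float, rng: random.Random) -> bool:
    """On-device step: report the truth with probability e^eps / (1 + e^eps)."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return true_bit if rng.random() < p_truth else not true_bit


def estimate_rate(reports: list[bool], epsilon: float) -> float:
    """Server-side step: debias the noisy aggregate.

    Expected observed rate = p * rate + (1 - p) * (1 - rate), solved for rate.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


def simulate(n_devices: int, true_rate: float, epsilon: float, seed: int = 0) -> float:
    """Simulate a fleet of devices and return the server's estimate."""
    rng = random.Random(seed)
    truth = [rng.random() < true_rate for _ in range(n_devices)]
    reports = [randomize_bit(bit, epsilon, rng) for bit in truth]
    return estimate_rate(reports, epsilon)


if __name__ == "__main__":
    # Hypothetical scenario: 30 percent of devices use some feature.
    print(simulate(n_devices=100_000, true_rate=0.30, epsilon=1.0))
```

A smaller epsilon means more flipped answers and stronger deniability for each user, at the cost of a noisier aggregate estimate.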

What if you use an Apple account with your Apple TV?

Another factor to consider is Apple’s privacy policy regarding Apple accounts, formerly Apple IDs.

Apple support documentation says you “need” an Apple account to use an Apple TV, but you can use the hardware without one. Still, it’s common for people to log into Apple accounts on their Apple TV boxes because it makes it easier to link with other Apple products. Another reason someone might link an Apple TV box with an Apple account is to use the Apple TV app, a common way to stream on Apple TV boxes.

So what type of data does Apple harvest from Apple accounts? According to its privacy policy, the company gathers usage data, such as “data about your activity on and use of” Apple offerings, including “app launches within our services…; browsing history; search history; [and] product interaction.”

Other types of data Apple may collect from Apple accounts include transaction information (Apple says this is “data about purchases of Apple products and services or transactions facilitated by Apple, including purchases on Apple platforms”), account information (“including email address, devices registered, account status, and age”), device information (including serial number and browser type), contact information (including physical address and phone number), and payment information (including bank details). None of that is surprising considering the type of data needed to make an Apple account work.

Many Apple TV users can expect Apple to gather more data from their Apple account usage on other devices, such as iPhones or Macs. However, if you use the same Apple account across multiple devices, Apple recognizes that all the data it has collected from, for example, your iPhone activity, also applies to you as an Apple TV user.

A potential workaround could be maintaining multiple Apple accounts. With an Apple account solely dedicated to your Apple TV box and Apple TV hardware and software tracking disabled as much as possible, Apple would have minimal data to ascribe to you as an Apple TV owner. You can also use your Apple TV box without an Apple account, but then you won’t be able to use the Apple TV app, one of the device’s key features.

Data collection via the Apple TV app

You can download third-party apps like Netflix and Hulu onto an Apple TV box, but most TV and movie watching on Apple TV boxes likely occurs via the Apple TV app. The app is necessary for watching content on the Apple TV+ streaming service, but it also drives usage by providing access to the libraries of many (but not all) popular streaming apps in one location. So understanding the Apple TV app’s privacy policy is critical to evaluating how private Apple TV activity truly is.

As expected, some of the data the app gathers is necessary for the software to work. That includes, according to the app’s privacy policy, “information about your purchases, downloads, activity in the Apple TV app, the content you watch, and where you watch it in the Apple TV app and in connected apps on any of your supported devices.” That all makes sense for ensuring that the app remembers things like which episode of Severance you’re on across devices.

Apple collects other data, though, that isn’t necessary for functionality. It says it gathers data on things like the “features you use (for example, Continue Watching or Library),” content pages you view, how you interact with notifications, and approximate location information (that Apple says doesn’t identify users) to help improve the app.

Additionally, Apple tracks the terms you search for within the app, per its policy:

We use Apple TV search data to improve models that power Apple TV. For example, aggregate Apple TV search queries are used to fine-tune the Apple TV search model.

This data usage is less intrusive than that of other streaming devices, which might track your activity and then sell that data to third-party advertisers. But some people may be hesitant about having any of their activities tracked to benefit a multi-trillion-dollar conglomerate.

Data collected from the Apple TV app used for ads

By default, the Apple TV app also tracks “what you watch, your purchases, subscriptions, downloads, browsing, and other activities in the Apple TV app” to make personalized content recommendations. Content recommendations aren’t ads in the traditional sense but instead provide a way for Apple to push you toward products by analyzing data it has on you.

You can disable the Apple TV app’s personalized recommendations, but it’s a little harder than you might expect since you can’t do it through the app. Instead, you need to go to the Apple TV settings and then select Apps > TV > Use Play History > Off.

The most privacy-conscious users may wish that personalized recommendations were off by default. Darío Maestro, senior legal fellow at the nonprofit Surveillance Technology Oversight Project (STOP), noted to Ars that even though Apple TV users can opt out of personalized content recommendations, “many will not realize they can.”

Apple can also use data it gathers on you from the Apple TV app to serve traditional ads. If you allow your Apple TV box to track your location, the Apple TV app can also track your location. That data can “be used to serve geographically relevant ads,” according to the Apple TV app privacy policy. Location tracking, however, is off by default on Apple TV boxes.

Apple’s tvOS doesn’t have integrated ads. For comparison, some TV OSes, like Roku OS and LG’s webOS, show ads on the OS’s home screen and/or when showing screensavers.

But data gathered from the Apple TV app can still help Apple’s advertising efforts. This can happen if you allow personalized ads in other Apple apps serving targeted ads, such as Apple News, the App Store, or Stocks. In such cases, Apple may apply data gathered from the Apple TV app, “including information about the movies and TV shows you purchase from Apple, to serve ads in those apps that are more relevant to you,” the Apple TV app privacy policy says.

Apple also provides third-party advertisers and strategic partners with “non-personal data” gathered from the Apple TV app:

We provide some non-personal data to our advertisers and strategic partners that work with Apple to provide our products and services, help Apple market to customers, and sell ads on Apple’s behalf to display on the App Store and Apple News and Stocks.

Apple also shares non-personal data from the Apple TV app with third parties, such as content owners, so they can pay royalties, gauge how much people are watching their shows or movies, “and improve their associated products and services,” Apple says.

Apple’s policy notes:

For example, we may share non-personal data about your transactions, viewing activity, and region, as well as aggregated user demographics[,] such as age group and gender (which may be inferred from information such as your name and salutation in your Apple Account), to Apple TV strategic partners, such as content owners, so that they can measure the performance of their creative work [and] meet royalty and accounting requirements.

When reached for comment, an Apple spokesperson told Ars that Apple TV users can clear their play history from the app.

All that said, the Apple TV app still shares far less data with third parties than other streaming apps. Netflix, for example, says it discloses some personal information to advertising companies “in order to select Advertisements shown on Netflix, to facilitate interaction with Advertisements, and to measure and improve effectiveness of Advertisements.”

Warner Bros. Discovery says it discloses information about Max viewers “with advertisers, ad agencies, ad networks and platforms, and other companies to provide advertising to you based on your interests.” And Disney+ users have Nielsen tracking on by default.

What if you use Siri?

You can easily deactivate Siri when setting up an Apple TV. But those who opt to keep the voice assistant and the ability to control Apple TV with their voice take somewhat of a privacy hit.

According to the privacy policy accessible in Apple TV boxes’ settings, Apple boxes automatically send all Siri requests to Apple’s servers. If you opt into using Siri data to “Improve Siri and Dictation,” Apple will store your audio data. If you opt out, audio data won’t be stored, but per the policy:

In all cases, transcripts of your interactions will be sent to Apple to process your requests and may be stored by Apple.

Apple TV boxes also send audio and transcriptions of dictation input to Apple servers for processing. Apple says it doesn’t store the audio but may store transcriptions of the audio.

If you opt to “Improve Siri and Dictation,” Apple says your history of voice requests isn’t tied to your Apple account or email. But Apple is vague about how long it may store data related to voice input performed with the Apple TV if you choose this option.

The policy states:

Your request history, which includes transcripts and any related request data, is associated with a random identifier for up to six months and is not tied to your Apple Account or email address. After six months, your request history is disassociated from the random identifier and may be retained for up to two years. Apple may use this data to develop and improve Siri, Dictation, Search, and limited other language processing functionality in Apple products …

Apple may also review a subset of the transcripts of your interactions and this … may be kept beyond two years for the ongoing improvements of products and services.

Apple promises not to use Siri and voice data to build marketing profiles or sell them to third parties, but it hasn’t always adhered to that commitment. In January, Apple agreed to pay $95 million to settle a class-action lawsuit accusing Siri of recording private conversations and sharing them with third parties for targeted ads. In 2019, contractors reported hearing private conversations and recorded sex via Siri-gathered audio.

Outside of Apple, we’ve seen voice request data used questionably, including in criminal trials and by corporate employees. Siri and dictation data also represent additional ways a person’s Apple TV usage might be unexpectedly analyzed to fuel Apple’s business.

Automatic content recognition

Apple TVs aren’t preloaded with automatic content recognition (ACR), an Apple spokesperson confirmed to Ars, another plus for privacy advocates. But ACR is software, so Apple could technically add it to Apple TV boxes via a software update at some point.

Sherman Li, the founder of Enswers, the company that first put ACR in Samsung TVs, confirmed to Ars that it’s technically possible for Apple to add ACR to already-purchased Apple boxes. Years ago, Enswers retroactively added ACR to other types of streaming hardware, including Samsung and LG smart TVs. (Enswers was acquired by Gracenote, which Nielsen now owns.)

In general, though, there are challenges to adding ACR to hardware that people already own, Li explained:

Everyone believes, in theory, you can add ACR anywhere you want at any time because it’s software, but because of the way [hardware is] architected… the interplay between the chipsets, like the SoCs, and the firmware is different in a lot of situations.

Li pointed to numerous variables that could prevent ACR from being retroactively added to any type of streaming hardware, “including access to video frame buffers, audio streams, networking connectivity, security protocols, OSes, and app interface communication layers, especially at different levels of the stack in these devices, depending on the implementation.”

Due to the complexity of Apple TV boxes, Li suspects it would be difficult to add ACR to already-purchased Apple TVs. It would likely be simpler for Apple to release a new box with ACR if it ever decided to go down that route.

If Apple were to add ACR to old or new Apple TV boxes, the devices would be far less private, and the move would be highly unpopular and eliminate one of the Apple TV’s biggest draws.

However, Apple reportedly has a growing interest in advertising to streaming subscribers. The Apple TV+ streaming service doesn’t currently show commercials, but the company is rumored to be exploring a potential ad tier. The suspicions stem from a reported meeting between Apple and the United Kingdom’s ratings body, Barb, to discuss how it might track ads on Apple TV+, according to a July report from The Telegraph.

Since 2023, Apple has also hired several prominent names in advertising, including a former head of advertising at NBCUniversal and a new head of video ad sales. Further, Apple TV+ is one of the few streaming services to remain ad-free, and it has reportedly lost Apple $1 billion per year since its launch.

One day soon, Apple may have much more reason to care about advertising in streaming and being able to track the activities of people who use its streaming offerings. That has implications for Apple TV box users.

“The more Apple creeps into the targeted ads space, the less I’ll trust them to uphold their privacy promises. You can imagine Apple TV being a natural progression for selling ads,” PIRG’s Cross said.

Somewhat ironically, Apple has marketed its approach to privacy as a positive for advertisers.

“Apple’s commitment to privacy and personal relevancy builds trust amongst readers, driving a willingness to engage with content and ads alike,” Apple’s advertising guide for buying ads on Apple News and Stocks reads.

The most private streaming gadget

It remains technologically possible for Apple to introduce intrusive tracking or ads to Apple TV boxes, but for now, the streaming devices are more private than the vast majority of alternatives, save for dumb TVs (which are incredibly hard to find these days). And if Apple follows its own policies, much of the data it gathers should be kept in-house.

However, those with strong privacy concerns should be aware that Apple does track certain tvOS activities, especially those that happen through Apple accounts, voice interaction, or the Apple TV app. And while most of Apple’s streaming hardware and software settings prioritize privacy by default, some advocates believe there’s room for improvement.

For example, STOP’s Maestro said:

Unlike in the [European Union], where the upcoming Data Act will set clearer rules on transfers of data generated by smart devices, the US has no real legislation governing what happens with your data once it reaches Apple’s servers. Users are left with little way to verify those privacy promises.

Maestro suggested that Apple could address these concerns by making it easier for people to conduct security research on smart device software. “Allowing the development of alternative or modified software that can evaluate privacy settings could also increase user trust and better uphold Apple’s public commitment to privacy,” Maestro said.

There are ways to limit the amount of data that advertisers can get from your Apple TV. But if you use the Apple TV app, Apple can use your activity to help make business decisions—and therefore money.

As you might expect from a device that connects to the Internet and lets you stream shows and movies, Apple TV boxes aren’t totally incapable of tracking you. But they’re still the best recommendation for streaming users seeking hardware with more privacy and fewer ads.

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

Want a humanoid, open source robot for just $3,000? Hugging Face is on it.

You may have noticed he said “robots” plural—that’s because there’s a second one. It’s called Reachy Mini, and it looks like a cute, Wall-E-esque statue bust that can turn its head and talk to the user. Among other things, it’s meant to be used to test AI applications, and it’ll run between $250 and $300.

You can sort of think of these products as the equivalent to a Raspberry Pi, but in robot form and for AI developers—Hugging Face’s main customer base.

Hugging Face has previously released AI models meant for robots, as well as a 3D-printable robotic arm. This year, it announced an acquisition of Pollen Robotics, a company that was working on humanoid robots. Hugging Face’s Cadene came to the company by way of Tesla.

For context on the pricing, Tesla’s Optimus Gen 2 humanoid robot (while admittedly much more advanced, at least in theory) is expected to cost at least $20,000.

There is a lot of investment in robotics like this, but there are still big barriers—and price isn’t the only one. There’s battery life, for example; Unitree’s G1 only runs for about two hours on a single charge.

RFK Jr.’s fluoride ban would ruin 25 million kids’ teeth, cost $9.8 billion

In all, the increased decay and boosted dental costs would disproportionately affect children who are in low-income families, in rural areas, and/or on public health insurance.

The study’s findings are likely unsurprising to those in the public health community, who have consistently supported fluoridation. The practice, however beneficial, has a long history of being under attack. After its introduction in the US in 1945, conspiracy theorists claimed fluoridation was a communist plot and a form of government mind control. More recently, critics have claimed that fluoridation lowers IQ.

The data linking water fluoridation to low IQ is controversial. Many of the studies on the topic are of poor quality and have numerous confounding factors and flawed methods. Many compare IQ levels in communities in China and other countries, where there are areas with water that is naturally high in fluoride—much, much higher than what is intentionally added to US water. Further, a federal meta-analysis—a type of study that aggregates and reanalyzes data from independent studies—has been plagued by criticism for bias, poor statistical methods, and a lack of data transparency.

But despite the controversy, one thing is clear in all the data and debate: Any possible association with low IQ and fluoridation only occurs at excessive levels—levels more than twice the amount used in the US and recommended by the US Centers for Disease Control and Prevention. The CDC recommendation for water fluoridation levels is 0.7 mg/L, while potential harms are not observed until water levels exceed 1.5 mg/L. Some areas in China have natural levels as high as 11.8 mg/L.

The authors of the new study conclude that, at current US levels, the benefits are clear.

“These findings suggest that, despite the potential harms of excessive fluoride exposure, fluoridation at safe levels offers both individual and societal benefits that would be at risk.”

Elon Musk counts the cost of his four-month blitz through US government


Term at DOGE did serious damage to his brands, only achieved a fraction of hoped-for savings.

Elon Musk wields a chainsaw at the Conservative Political Action Conference in February to illustrate his aim to cut government waste Credit: Jose Luis Magana/AP

Elon Musk’s four-month blitz through the US government briefly made him Washington’s most powerful businessman since the Gilded Age. But it has done little for his reputation or that of his companies.

Musk this week formally abandoned his role as the head of the so-called Department of Government Efficiency (Doge), which has failed to find even a fraction of the $2 trillion in savings he originally pledged.

On Thursday, Donald Trump lamented his departure but said Musk “will always be with us, helping all the way.”

Yet the billionaire will be left calculating the cost of his involvement with Trump and the meagre return on his $250 million investment in the US president’s election campaign.

“I appreciate the fact that Mr Musk put what was good for the country ahead of what was good for his own bottom line,” Tom Cole, the Republican chair of the House Appropriations Committee, told the Financial Times.

After Doge was announced, a majority of American voters believed Musk would use the body to “enrich himself and undermine his business rivals” rather than to streamline the government, according to a survey.

Progressive groups warned that he would be “rigging federal procurement for billionaires and their pals” and cut regulations that govern his companies Tesla and SpaceX. Democratic lawmakers said Doge was a “cover-up” of a more sinister, self-serving exercise by the world’s richest person.

Early moves by the Trump administration suggested Musk might get value for money. A lawsuit brought by the Biden administration against SpaceX over its hiring practices was dropped in February, and regulators probing his brain-implant company Neuralink were dismissed.

Musk’s satellite Internet business Starlink was touted by Commerce Secretary Howard Lutnick as a potential beneficiary of a $42 billion rural broadband scheme. An executive order calling for the establishment of a multibillion-dollar Iron Dome defense system in the US looked set to benefit Musk, due to SpaceX’s dominance in rocket launches.

The gutting of various watchdogs across government also benefited Musk’s businesses, while a number of large US companies rushed to ink deals with Starlink or increase their advertising spending on X. Starlink also signed agreements to operate in India, Pakistan, and Vietnam, among other countries it has long wished to expand into.

But while Doge took a scythe to various causes loathed by Musk, most notably international aid spending and government contracts purportedly linked to diversity initiatives or “woke” research, it also caused severe blowback to the billionaire’s businesses, particularly Tesla.

At one point during his Doge tenure, Tesla’s stock had fallen 45 percent from its highest point last year, and reports emerged that the company’s board of directors had sought to replace Musk as chief executive. The 53-year-old’s personal wealth dropped by tens of billions of dollars, while his dealerships were torched and death threats poured in.

Some of the brand damage to Tesla, until recently Musk’s primary source of wealth, could be permanent. “Eighty percent of Teslas in the US were sold in blue zip codes,” a former senior employee said. “Obviously that constituency has been deeply offended.”

Starlink lost lucrative contracts in Canada and Mexico due to Musk’s political activities, while X lost 11 million users in Europe alone.

Probes of Tesla and SpaceX by government regulators also continued apace, while the Trump administration pressed ahead with plans to abolish tax credits for electric vehicles and waged a trade war vehemently opposed by Musk that threatened to further damage car sales.

In the political arena, few people were cheered by Doge’s work. Democrats were outraged by the gutting of foreign aid and by Musk’s 20-something acolytes gaining access to the Treasury’s payment system, along with the ousting of thousands of federal workers. Republicans looked askance at attempts to target defense spending. And true budget hawks were bitter that Musk could only cut a few billion dollars. Bill Gates even accused Musk of “killing the world’s poorest children” through his actions at Doge.

Musk, so used to getting his way at his businesses, struggled for control. At various points in his tenure he took on Treasury Secretary Scott Bessent, Secretary of State Marco Rubio, Transport Secretary Sean Duffy, and trade tsar Peter Navarro, while clashing with several other senior officials.

Far from being laser-focused on eliminating waste, Musk’s foray into government was a “revenge tour” against a bureaucracy the billionaire had come to see as the enemy of innovation, a former senior colleague of Musk’s said, highlighting the entrepreneur’s frustration with COVID-19 regulations in California, his perceived snub by the Biden administration, and his anger over his daughter’s gender transition.

Trump’s AI and crypto tsar, David Sacks, an influential political voice in the tech world, “whipped [Musk] up into a very, very far-right kind of mindset,” the person added, to the extent that was “going to help this administration in crushing the ‘woke’ agenda.”

Neither Musk nor Sacks responded to requests for comment.

Musk, who claimed Doge only acted in an “advisory role,” this week expressed frustration at it being used as a “whipping boy” for unpopular cuts decided by the White House and cabinet secretaries.

“Trump, I think, was very savvy and allowed Doge to kind of take all those headlines for a traditional political scapegoat,” said Sahil Lavingia, head of a commerce start-up who worked for Doge until earlier this month. Musk, he added, might also have been keen to take credit for the gutting of USAID and other moves but ultimately garnered unwanted attention.

“If you were truly evil, [you] would just be more quiet,” said Lavingia, who joined the initiative in order to streamline processes within government. “You would do the evil stuff quietly.”

The noise surrounding Musk, whose ability to dominate news cycles with a single post on his social media site X rivaled Trump’s own hold on the headlines, also frustrated the administration.

This week, White House Deputy Chief of Staff Stephen Miller took to X to indirectly rebut the billionaire’s criticism of Trump’s signature tax bill, which he had lambasted for failing to cut the deficit or codify Doge’s cuts.

Once almost synonymous with Musk, Doge is now being melded into the rest of government. In a briefing on Thursday, White House Press Secretary Karoline Leavitt said that following Musk’s departure, cabinet secretaries would “continue to work with the respective Doge employees who have onboarded as political appointees at all of these agencies.”

She added: “The Doge leaders are each and every member of the President’s Cabinet and the President himself.”

Doge’s aims have also become decidedly more quotidian. Tom Krause, a Musk ally who joined Doge and was installed at Treasury, briefed congressional staff this week on improvements to the IRS’s application program interfaces and customer service, according to a person familiar with the matter. Other Doge staffers are doing audits of IT contracts—work Lavingia compares with that done by McKinsey consultants.

Freed from the constraints of being a government employee, Musk is increasingly threatening to become a thorn in Trump’s side.

Soon after his Doge departure was announced, he again criticized the White House, this time over its plan to cancel clean energy tax credits.

“Teddy Roosevelt had that great adage: ‘speak softly but carry a big stick’,” Fred Thiel, the chief executive of Bitcoin mining company MARA Holdings, told the FT. “Maybe Elon’s approach was a little bit different.”

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

Discord lures users to click on ads by offering them new Orbs currency

Sellis also announced that Discord is working with brand measurement firm Kantar to help advertisers track ad success. With Kantar technology, advertisers can measure things like “awareness, recall, and intent,” Sellis said. The partnership further underscores Discord’s growing reliance on advertising revenue.

“Our partnership with Discord is helping marketers better understand Discord as an advertising platform for new generations,” Nicole Jones, Kantar’s chief commercial lead, said on Discord’s blog.

Rethinking ads

Discord also announced this week that it will soon sell Play Quests to more advertisers. The announcement follows the company’s introduction of video ads to the Discord mobile app in June. Video Quests, as they’re called, allow advertisers to show trailers, announcements, and other types of content.

Overall, Discord’s new ad-friendly approach to business is very different from its previous strategy, which kept Discord ad-free from its 2015 launch until last year. Because the company is expected to go public soon, its leaders have determined that it’s no longer sufficient to rely completely on premium add-ons and subscriptions. Discord isn’t profitable, forcing the firm to reconsider its use of ads, which cofounder and CEO Jason Citron felt were too intrusive as recently as 2021.

Currently, Discord’s ads are limited to clickable sidebars within the platform and offer direct benefits to users. Introducing ads can be a slippery slope, though, especially for social media companies that prioritize ad revenue to please investors. On the other hand, another social media company, Reddit, has seen success by boosting its ad business. Reddit went public in March 2024 and became profitable in October 2024 after reporting a 60 percent year-over-year increase in ad revenue. Reddit has hinted at plans to introduce new and more types of ads, and we can expect Discord to consider the same after its IPO, which a March Bloomberg report suggested could happen as soon as this year.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder in Reddit.

Elden Ring: Nightreign is an epic RPG squeezed into delicious bite-size capsules


Fast-paced multiplayer action fits surprisingly well with the old Elden Ring formula.

Time’s a-wasting: finish off that battle quick so you can move on to the next one ASAP! Credit: Bandai Namco

At this point, Elden Ring is well-known for its epic sense of scale, offering players dozens of hours of meticulous exploration, gradual character progression, and unforgiving enemy encounters that require deliberate care and strategy. On its face, this doesn’t seem like the best basis for a semi-randomized multiplayer action game spin-off with strict time limits and an ever-encroaching physical border in a tightly constrained map.

Somehow, though, Elden Ring: Nightreign makes the combination work. The game condenses all the essential parts of Elden Ring down to their barest essence, tweaking things just enough to distill the flavor of a full-fledged Elden Ring playthrough into zippy runs of less than an hour each. The result is a fast-paced, quick-hit shot of adventuring that is well suited to repeated play with friends.

Fort-elden Ring-nite

The initial moments of each Nightreign run draw an almost comical comparison to Fortnite, with each player dropping into the game’s singular map by hanging off the talons of a great spectral eagle. Once on the ground, players have to stay inside a circular “safe zone” that will slowly contract throughout each of two quick in-game days, forcing your party toward an eventual encounter with a mini-boss at the end of each day. If you survive both days, you take on one of the several extremely punishing Nightlords you chose to face at the beginning of that run.

It’s not exactly a floating bus, but it kind of feels like it is… Credit: Bandai Namco

If you’ve played Elden Ring, you’ll definitely recognize the general fallen world aesthetic here, as well as many specific enemies and items taken directly from FromSoft’s previous epic. What will be less familiar is the general pace of play, which is guided by that encroaching circle of deadly blue flame. Instead of taking your time and exploring every nook and cranny for hidden secrets, you end up dashing between points of interest highlighted on the map in a madcap attempt to farm enough experience points and powerful items to have a chance against the big bosses.

There are a few crucial tweaks to the Elden Ring formula aiding you in this newly speed-focused effort. For one thing, your character now has an unlimited “surge sprint” that can get you from one part of the map to another at a pretty rapid clip. For another, there’s a nice springy wall jump that lets you climb up stair-step cliffs and walls that are much taller than your character. Add in occasional jump pads for quickly leaping over cliffs and a complete lack of fall damage for descending into valleys, and you get a game that feels more like a 3D Sonic than Elden Ring at points.

You’d better have a few levels under your belt if you’re going to take on a battle like this. Credit: Bandai Namco

Things feel more like the old Elden Ring during battles, where you’ll quickly fall into the familiar rhythm of managing limited stamina to attack, block, and dodge enemies’ heavily telegraphed attacks. Even here, though, things feel a little more action-oriented thanks to powerful, class-specific “character skills” and “ultimate art” attacks that slowly recharge over time. The quick pace of leveling also aids in the power fantasy, condensing the progression from zero to hero into an extremely tight time frame, relative to Elden Ring proper.

Try, try again

Speaking of classes, the eight options here tend to fall into the usual archetypes for this kind of action-adventure game: the tank, the mage, the defensive specialist, the dextrous dodger, etc. For myself, I tended toward the Ironeye class, with an unlimited supply of arrows that let me deliver consistent (if relatively weak) damage against flying and/or zigzagging bosses, all while maintaining a safe range from all but the widest-ranged attacks.

But one big benefit of Nightreign‘s faster-paced design is that you don’t have to tie yourself to a specific class for hundreds of hours at the outset. You’ll get ample opportunity to try them all—and different combinations with teammate classes—across dozens of individual, bite-size runs.

As you do, you’ll start to learn the general shape of the map, which is well-designed with a few distinct geographic regions and points of interest. While the specific enemies and items you’ll find in various locations will change from run to run, you’ll quickly develop a feel for the landmarks and general routes you’ll want to at least consider exploring each time.

After a few runs, you’ll know where to find the subterranean caves that have a good chance of hidden loot.

Repeated runs also help you develop the key sense of when it’s worthwhile to fight and when it makes more sense to run away. This is especially important at the beginning of each run, where your low-level character needs to focus on farming fodder enemies until you are powerful enough to take on the lowest tier of sub-bosses you might stumble across. Later in the run, you’ll need to shift to ignoring those low-level enemies so you can spend more time gaining big rewards from the even bigger bosses.

Even with a decent general strategy, though, players shouldn’t expect to be able to win every run in Nightreign. During some runs, you may find only garbage weapon drops or low-level enemies that make it hard to quickly build up the critical mass of power you’ll need by the final encounter. During other runs, you may chance upon a great weapon that causes enough bleed damage to make even the most difficult bosses relatively easy to kill.

Then there are the runs where you get greedy by doubling back to a lucrative encounter on the edge of the safety circle, only to find yourself quickly engulfed in blue flame. Or the ones where you take one wrong step and fall to your doom down a cliffside while trying to dodge away from a relatively harmless enemy, losing a crucial character level (and your momentum) when you respawn.

Between runs, you can equip relics that offer small permanent stat boosts to the various classes. In general, though, success in Nightreign is a matter of keeping at it until you stumble on the right mix of luck and execution to finally best the Nightlords.

Find a friend

While Nightreign technically has a single-player mode, the game is quite explicitly designed for groups of three simultaneous humans (groups of two need not apply—paired players will need to join up with a third). Being in a threesome generally means that one player can draw an enemy’s attack while the other two take advantage by flanking around their guard. It also means that downed players can be revived by a partner repeatedly hitting their crawling near-corpse with a weapon, an awkward and hilarious process in practice.

Does this count as three-on-one odds, or do the multiple heads on the beast make it more of a fair fight?

Being able to coordinate with your teammates is crucial both during battles and as you decide which location to explore next in the ever-narrowing circle of the available map. If you’re not playing with friends and chatting over a voice connection, your main form of communication is an awkward system of pinning points of interest on the map.

Unfortunately, I ran into some serious problems with lag in my pre-release multiplayer runs, with the game periodically freezing for multiple seconds at a time as the servers struggled to keep up. I often came out of these freezes to find I had succumbed to an enemy attack that I hadn’t even seen on my screen. I can’t say this server performance in a tightly controlled pre-launch environment bodes well for how the game will perform once the wider public gains access in a few days.

Those technical problems aside, I was surprised at how well this zippy, capsule-size take on the Elden Ring formula worked in practice. Nightreign might not be the full-fledged, epic Elden Ring sequel that long-time “Soulsborne” fans are looking for, but it’s still a compelling, action-packed twist on the popular adventure gameplay.

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.

Google’s Will Smith double is better at eating AI spaghetti … but it’s crunchy?

On Tuesday, Google launched Veo 3, a new AI video synthesis model that can do something no major AI video generator has been able to do before: create a synchronized audio track. While from 2022 to 2024, we saw early steps in AI video generation, each video was silent and usually very short in duration. Now you can hear voices, dialog, and sound effects in eight-second high-definition video clips.

Shortly after the new launch, people began asking the most obvious benchmarking question: How good is Veo 3 at faking Oscar-winning actor Will Smith at eating spaghetti?

First, a brief recap. The spaghetti benchmark in AI video traces its origins back to March 2023, when we first covered an early example of horrific AI-generated video using an open source video synthesis model called ModelScope. The spaghetti example later became well-known enough that Smith parodied it almost a year later in February 2024.

Here’s what the original viral video looked like:

One thing people forget is that at the time, the Smith example wasn’t the best AI video generator out there—a video synthesis model called Gen-2 from Runway had already achieved superior results (though it was not yet publicly accessible). But the ModelScope result was funny and weird enough to stick in people’s memories as an early poor example of video synthesis, handy for future comparisons as AI models progressed.

AI app developer Javi Lopez first came to the rescue for curious spaghetti fans earlier this week with Veo 3, performing the Smith test and posting the results on X. But as you’ll notice below when you watch, the soundtrack has a curious quality: The faux Smith appears to be crunching on the spaghetti.

On X, Javi Lopez ran “Will Smith eating spaghetti” in Google’s Veo 3 AI video generator and received this result.

It’s a glitch in Veo 3’s experimental ability to apply sound effects to video, likely because the training data used to create Google’s AI models featured many examples of chewing mouths with crunching sound effects. Generative AI models are pattern-matching prediction machines, and they need to be shown enough examples of various types of media to generate convincing new outputs. If a concept is over-represented or under-represented in the training data, you’ll see unusual generation results, such as jabberwockies.

New Claude 4 AI model refactored code for 7 hours straight


Anthropic says Claude 4 beats Gemini on coding benchmarks; works autonomously for hours.

The Claude 4 logo, created by Anthropic. Credit: Anthropic

On Thursday, Anthropic released Claude Opus 4 and Claude Sonnet 4, marking the company’s return to larger model releases after primarily focusing on mid-range Sonnet variants since June of last year. The new models represent what the company calls its most capable coding models yet, with Opus 4 designed for complex, long-running tasks that can operate autonomously for hours.

Alex Albert, Anthropic’s head of Claude Relations, told Ars Technica that the company chose to revive the Opus line because of growing demand for agentic AI applications. “Across all the companies out there that are building things, there’s a really large wave of these agentic applications springing up, and a very high demand and premium being placed on intelligence,” Albert said. “I think Opus is going to fit that groove perfectly.”

Before we go further, a brief refresher on Claude’s three AI model “size” names (first introduced in March 2024) is probably warranted. Haiku, Sonnet, and Opus offer a tradeoff between price (in the API), speed, and capability.

Haiku models are the smallest, least expensive to run, and least capable in terms of what you might call “context depth” (considering conceptual relationships in the prompt) and encoded knowledge. Owing to the small size in parameter count, Haiku models retain fewer concrete facts and thus tend to confabulate more frequently (plausibly answering questions based on lack of data) than larger models, but they are much faster at basic tasks than larger models. Sonnet is traditionally a mid-range model that hits a balance between cost and capability, and Opus models have always been the largest and slowest to run. However, Opus models process context more deeply and are hypothetically better suited for running deep logical tasks.

A screenshot of the Claude web interface with Opus 4 and Sonnet 4 options shown. Credit: Anthropic

There is no Claude 4 Haiku just yet, but the new Sonnet and Opus models can reportedly handle tasks that previous versions could not. In our interview with Albert, he described testing scenarios where Opus 4 worked coherently for up to 24 hours on tasks like playing Pokémon, while coding refactoring tasks in Claude Code ran for seven hours without interruption. Earlier Claude models typically lasted only one to two hours before losing coherence, Albert said, meaning that the models could only produce useful self-referencing outputs for that long before beginning to output too many errors.

In particular, that marathon refactoring claim reportedly comes from Rakuten, a Japanese tech services conglomerate that “validated [Claude’s] capabilities with a demanding open-source refactor running independently for 7 hours with sustained performance,” Anthropic said in a news release.

Whether you’d want to leave an AI model unsupervised for that long is another question entirely, because even the most capable AI models can introduce subtle bugs, go down unproductive rabbit holes, or make choices that seem logical to the model but miss important context that a human developer would catch. While many people now use Claude for easy-going vibe coding, as we covered in March, the human-powered (and ironically named) “vibe debugging” that often results from long AI coding sessions is also a very real thing. More on that below.

To shore up some of those shortcomings, Anthropic built memory capabilities into both new Claude 4 models, allowing them to maintain external files for storing key information across long sessions. When developers provide access to local files, the models can create and update “memory files” to track progress and things they deem important over time. Albert compared this to how humans take notes during extended work sessions.
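As a rough illustration of the note-taking idea, the sketch below shows how an agent harness might persist key facts to a local file between steps. The file name `memory.json` and the helpers `load_memory` and `remember` are hypothetical, and nothing here is drawn from Anthropic's actual implementation.

```python
import json
import tempfile
from pathlib import Path


def load_memory(path: Path) -> dict:
    """Return saved notes, or an empty dict if no memory file exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def remember(path: Path, key: str, value: str) -> dict:
    """Add or update one note and write the whole file back to disk."""
    notes = load_memory(path)
    notes[key] = value
    path.write_text(json.dumps(notes, indent=2), encoding="utf-8")
    return notes


# Hypothetical usage: record progress during a long session, then reload it.
memory_path = Path(tempfile.mkdtemp()) / "memory.json"
remember(memory_path, "progress", "refactored module A")
remember(memory_path, "next_step", "update tests for module B")
notes = load_memory(memory_path)
```

Because the notes live on disk rather than in the model's context, a later step can reload them even after earlier parts of the session are no longer available.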

Extended thinking meets tool use

Both Claude 4 models introduce what Anthropic calls “extended thinking with tool use,” a new beta feature allowing the models to alternate between simulated reasoning and using external tools like web search, similar to what OpenAI’s o3 and o4-mini-high AI models currently do in ChatGPT. While Claude 3.7 Sonnet already had strong tool use capabilities, the new models can now interleave simulated reasoning and tool calling in a single response.

“So now we can actually think, call a tool, process the results, think some more, call another tool, and repeat until it gets to a final answer,” Albert explained to Ars. The models self-determine when they have reached a useful conclusion, a capability picked up through training rather than governed by explicit human programming.
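The loop Albert describes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Anthropic's API: `fake_model` is a hypothetical stand-in for a model call, and `TOOLS` is a hypothetical registry with a single dummy search tool.

```python
from typing import Callable

# Hypothetical tool registry; a real tool would query a search engine, run code, etc.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query!r}",
}


def fake_model(transcript: list[str]) -> dict:
    """Stand-in for a model call (an assumption, not a real API).

    Requests one tool call, then answers once a tool result is in the transcript.
    """
    if not any(line.startswith("tool_result:") for line in transcript):
        return {"type": "tool_call", "tool": "search", "input": "SWE-bench"}
    return {"type": "final", "text": "done after 1 tool call"}


def run_agent(prompt: str, max_steps: int = 5) -> str:
    """Alternate between model steps and tool calls until a final answer."""
    transcript = [f"user: {prompt}"]
    for _ in range(max_steps):
        step = fake_model(transcript)
        if step["type"] == "final":
            return step["text"]
        # Run the requested tool and feed its result back to the model.
        result = TOOLS[step["tool"]](step["input"])
        transcript.append(f"tool_result: {result}")
    return "stopped: step limit reached"


answer = run_agent("What is SWE-bench?")
```

In this sketch the model decides when to stop by returning a final answer, while the `max_steps` cap is an added safeguard so a runaway loop still terminates.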

General Claude 4 benchmark results, provided by Anthropic. Credit: Anthropic

In practice, we’ve anecdotally found parallel tool use capability very useful in AI assistants like OpenAI o3, since they don’t have to rely on what is trained in their neural network to provide accurate answers. Instead, these more agentic models can iteratively search the web, parse the results, analyze images, and spin up coding tasks for analysis in ways that avoid the confabulation trap of relying solely on pure LLM outputs.

“The world’s best coding model”

Anthropic says Opus 4 leads industry benchmarks for coding tasks, achieving 72.5 percent on SWE-bench and 43.2 percent on Terminal-bench, calling it “the world’s best coding model.” According to Anthropic, companies using early versions report improvements. Cursor described it as “state-of-the-art for coding and a leap forward in complex codebase understanding,” while Replit noted “improved precision and dramatic advancements for complex changes across multiple files.”

In fact, GitHub announced it will use Sonnet 4 as the base model for its new coding agent in GitHub Copilot, citing the model’s performance in “agentic scenarios” in Anthropic’s news release. Sonnet 4 scored 72.7 percent on SWE-bench while maintaining faster response times than Opus 4. The fact that GitHub is betting on Claude rather than a model from its parent company Microsoft (which has close ties to OpenAI) suggests Anthropic has built something genuinely competitive.

Software engineering benchmark results, provided by Anthropic. Credit: Anthropic

Anthropic says it has addressed a persistent issue with Claude 3.7 Sonnet in which users complained that the model would take unauthorized actions or provide excessive output. Albert said the company reduced this “reward hacking behavior” by approximately 80 percent in the new models through training adjustments. An 80 percent reduction in unwanted behavior sounds impressive, but that also suggests that 20 percent of the problem behavior remains—a big concern when we’re talking about AI models that might be performing autonomous tasks for hours.

When we asked about code accuracy, Albert said that human code review is still an important part of shipping any production code. “There’s a human parallel, right? So this is just a problem we’ve had to deal with throughout the whole nature of software engineering. And this is why the code review process exists, so that you can catch these things. We don’t anticipate that going away with models either,” Albert said. “If anything, the human review will become more important, and more of your job as developer will be in this review than it will be in the generation part.”

Pricing and availability

Both Claude 4 models maintain the same pricing structure as their predecessors: Opus 4 costs $15 per million tokens for input and $75 per million for output, while Sonnet 4 remains at $3 and $15. The models offer two response modes: traditional LLM and simulated reasoning (“extended thinking”) for complex problems. Given that some Claude Code sessions can apparently run for hours, those per-token costs will likely add up very quickly for users who let the models run wild.
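To get a feel for how those per-token prices add up, here is a small back-of-the-envelope calculator. The prices come from the figures quoted above; the model labels (`opus-4`, `sonnet-4`) are shorthand keys rather than API identifiers, and the token counts are hypothetical.

```python
# Prices in US dollars per million tokens, as quoted in the article.
PRICES = {
    "opus-4": {"input": 15.0, "output": 75.0},
    "sonnet-4": {"input": 3.0, "output": 15.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical session: 2 million input tokens and 500,000 output tokens.
opus_cost = estimate_cost("opus-4", 2_000_000, 500_000)      # 67.5
sonnet_cost = estimate_cost("sonnet-4", 2_000_000, 500_000)  # 13.5
```

Under these assumed token counts, the same session costs five times as much on Opus 4 as on Sonnet 4, which matches the 5:1 ratio in both the input and output prices.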

Anthropic made both models available through its API, Amazon Bedrock, and Google Cloud Vertex AI. Sonnet 4 remains accessible to free users, while Opus 4 requires a paid subscription.

The Claude 4 models also debut Claude Code (first introduced in February) as a generally available product after months of preview testing. Anthropic says the coding environment now integrates with VS Code and JetBrains IDEs, showing proposed edits directly in files. A new SDK allows developers to build custom agents using the same framework.

A screenshot of “Claude Plays Pokemon,” a custom application where Claude 4 attempts to beat the classic Game Boy game. Credit: Anthropic

Even with Anthropic’s future riding on the capability of these new models, when we asked about how they guide Claude’s behavior by fine-tuning, Albert acknowledged that the inherent unpredictability of these systems presents ongoing challenges for both them and developers. “In the realm and the world of software for the past 40, 50 years, we’ve been running on deterministic systems, and now all of a sudden, it’s non-deterministic, and that changes how we build,” he said.

“I empathize with a lot of people out there trying to use our APIs and language models generally because they have to almost shift their perspective on what it means for reliability, what it means for powering a core of your application in a non-deterministic way,” Albert added. “These are general oddities that have kind of just been flipped, and it definitely makes things more difficult, but I think it opens up a lot of possibilities as well.”

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.
