

OpenAI slams court order that lets NYT read 20 million complete user chats


OpenAI: NYT wants evidence of ChatGPT users trying to get around news paywall.

Credit: Getty Images | alexsl

OpenAI wants a court to reverse a ruling forcing the ChatGPT maker to give 20 million user chats to The New York Times and other news plaintiffs that sued it over alleged copyright infringement. Although OpenAI previously offered 20 million user chats as a counter to the NYT’s demand for 120 million, the AI company says a court order requiring production of the chats is too broad.

“The logs at issue here are complete conversations: each log in the 20 million sample represents a complete exchange of multiple prompt-output pairs between a user and ChatGPT,” OpenAI said today in a filing in US District Court for the Southern District of New York. “Disclosure of those logs is thus much more likely to expose private information [than individual prompt-output pairs], in the same way that eavesdropping on an entire conversation reveals more private information than a 5-second conversation fragment.”

OpenAI’s filing said that “more than 99.99%” of the chats “have nothing to do with this case.” It asked the district court to “vacate the order and order News Plaintiffs to respond to OpenAI’s proposal for identifying relevant logs.” OpenAI could also seek review in a federal court of appeals.

OpenAI posted a message on its website to users today saying that “The New York Times is demanding that we turn over 20 million of your private ChatGPT conversations” in order to “find examples of you using ChatGPT to try to get around their paywall.”

ChatGPT users concerned about privacy have more to worry about than the NYT case. For example, ChatGPT conversations have been found in Google search results and the Google Search Console tool that developers can use to monitor search traffic. OpenAI today said it plans to develop “advanced security features designed to keep your data private, including client-side encryption for your messages with ChatGPT.”

OpenAI: AI chats should be treated like private emails

OpenAI’s court filing argues that the chat log production should be narrowed based on the relevance of chats to the case.

“OpenAI is unaware of any court ordering wholesale production of personal information at this scale,” the filing said. “This sets a dangerous precedent: it suggests that anyone who files a lawsuit against an AI company can demand production of tens of millions of conversations without first narrowing for relevance. This is not how discovery works in other cases: courts do not allow plaintiffs suing Google to dig through the private emails of tens of millions of Gmail users irrespective of their relevance. And it is not how discovery should work for generative AI tools either.”

A November 7 order by US Magistrate Judge Ona Wang sided with the NYT, saying that OpenAI must “produce the 20 million de-identified Consumer ChatGPT Logs to News Plaintiffs by November 14, 2025, or within 7 days of completing the de-identification process.” Wang ruled that the production must go forward even though the parties don’t agree on whether the logs must be produced in full:

Whether or not the parties had reached agreement to produce the 20 million Consumer ChatGPT Logs in whole—which the parties vehemently dispute—such production here is appropriate. OpenAI has failed to explain how its consumers’ privacy rights are not adequately protected by: (1) the existing protective order in this multidistrict litigation or (2) OpenAI’s exhaustive de-identification of all of the 20 million Consumer ChatGPT Logs.

OpenAI’s filing today said the court order “did not acknowledge OpenAI’s sworn witness declaration explaining that the de-identification process is not intended to remove information that is non-identifying but may nonetheless be private, like a Washington Post reporter’s hypothetical use of ChatGPT to assist in the preparation of a news article.”

Chats stored under legal hold

The 20 million chats consist of a random sampling of ChatGPT conversations from December 2022 to November 2024 and do not include chats of business customers, OpenAI said in the message on its website.

“We presented several privacy-preserving options to The Times, including targeted searches over the sample (e.g., to search for chats that might include text from a New York Times article so they only receive the conversations relevant to their claims), as well as high-level data classifying how ChatGPT was used in the sample. These were rejected by The Times,” OpenAI said.

The chats are stored in a secure system that is “protected under legal hold, meaning it can’t be accessed or used for purposes other than meeting legal obligations,” OpenAI said. The NYT “would be legally obligated at this time to not make any data public outside the court process,” and OpenAI said it will fight any attempts to make the user conversations public.

A NYT filing on October 30 accused OpenAI of defying prior agreements “by refusing to produce even a small sample of the billions of model outputs that its conduct has put in issue in this case.” The filing continued:

Immediate production of the output log sample is essential to stay on track for the February 26, 2026, discovery deadline. OpenAI’s proposal to run searches on this small subset of its model outputs on Plaintiffs’ behalf is as inefficient as it is inadequate to allow Plaintiffs to fairly analyze how “real world” users interact with a core product at the center of this litigation. Plaintiffs cannot reasonably conduct expert analyses about how OpenAI’s models function in its core consumer-facing product, how retrieval augmented generation (“RAG”) functions to deliver news content, how consumers interact with that product, and the frequency of hallucinations without access to the model outputs themselves.

OpenAI said the NYT’s discovery requests were initially limited to logs “related to Times content” and that it has “been working to satisfy those requests by sampling conversation logs. Towards the end of that process, News Plaintiffs filed a motion with a new demand: that instead of finding and producing logs that are ‘related to Times content,’ OpenAI should hand over the entire 20 million-log sample ‘via hard drive.’”

OpenAI disputes judge’s reasoning

The November 7 order cited a California case, Concord Music Group, Inc. v. Anthropic PBC, in which US Magistrate Judge Susan van Keulen ordered the production of 5 million records. OpenAI consistently relied on van Keulen’s use of a sample-size formula “in support of its previous proposed methodology for conversation data sampling, but fails to explain why Judge [van] Keulen’s subsequent order directing production of the entire 5 million-record sample to the plaintiff in that case is not similarly instructive here,” Wang wrote.
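The filings quoted here don’t spell out which sample-size formula was used. For context only: disputes over statistical sampling in discovery typically turn on a standard calculation like Cochran’s formula for estimating a proportion, which shows why a few thousand records usually suffice no matter how large the underlying population is. A hypothetical sketch (the function name and parameters are illustrative, not from the case record):

```python
import math

def required_sample_size(population, z=1.96, p=0.5, margin=0.01):
    """Cochran's sample-size formula with a finite-population correction.

    z: z-score for the desired confidence level (1.96 ~= 95%)
    p: assumed proportion (0.5 is the most conservative choice)
    margin: acceptable margin of error (0.01 = +/- 1 percentage point)
    """
    # Base sample size for an effectively infinite population
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Correction shrinks the requirement when the population is finite
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

print(required_sample_size(20_000_000))   # 20 million chat logs
print(required_sample_size(120_000_000))  # 120 million chat logs
```

At a 95 percent confidence level and a 1 percent margin of error, roughly 9,600 records suffice whether the population is 20 million or 120 million; that weak scaling with population size is why sampling arguments of this kind recur in large-scale discovery disputes.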

OpenAI’s filing today said the company was never given an opportunity to explain why Concord shouldn’t apply in this case because the news plaintiffs did not reference it in their motion.

“The cited Concord order was not about whether wholesale production of the sample was appropriate; it was about the mechanism through which Anthropic would effectuate an already agreed-upon production,” OpenAI wrote. “Nothing about that order suggests that Judge van Keulen would have ordered wholesale production had Anthropic raised the privacy concerns that OpenAI has raised throughout this case.”

The Concord logs were just prompt-output pairs, “i.e., a single user prompt followed by a single model output,” OpenAI wrote. “The logs at issue here are complete conversations: each log in the 20 million sample represents a complete exchange of multiple prompt-output pairs between a user and ChatGPT.” That could result in “up to 80 million prompt-output pairs,” OpenAI said.

We contacted The New York Times about OpenAI’s filing and will update this article if it provides any comment.


Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.



Review: New Framework Laptop 16 takes a fresh stab at the upgradeable laptop GPU


framework laptop 16, take two

New components make it more useful and powerful but no less odd.

Credit: Andrew Cunningham


The original Framework Laptop 16 was trying to crack a problem that laptop makers have wrestled with on and off for years: Can you deliver a reasonably powerful, portable workstation and gaming laptop that supports graphics card upgrades just like a desktop PC?

Specs at a glance: Framework Laptop 16 (2025)
OS Windows 11 25H2
CPU AMD Ryzen AI 7 350 (4 Zen 5 cores, 4 Zen 5c cores)
RAM 32GB DDR5-5600 (upgradeable)
GPU AMD Radeon 860M (integrated)/Nvidia GeForce RTX 5070 Mobile (dedicated)
SSD 1TB Western Digital Black SN770
Battery 85 WHr
Display 16-inch 2560×1600 165 Hz matte non-touchscreen
Connectivity 6x recessed USB-C ports (2x USB 4, 4x USB 3.2) with customizable “Expansion Card” dongles
Weight 4.63 pounds (2.1 kg) without GPU, 5.29 pounds (2.4 kg) with GPU
Price as tested Roughly $2,649 for pre-built edition; $2,517 for DIY edition with no OS

Even in these days of mostly incremental, not-too-exciting GPU upgrades, the graphics card in a gaming PC or graphics-centric workstation will still feel its age faster than your CPU will. And the chance to upgrade that one component for hundreds of dollars instead of spending thousands replacing the entire machine is an appealing proposition.

Upgradeable, swappable GPUs would also make your laptop more flexible—you can pick and choose from various GPUs from multiple vendors based on what you want and need, whether that’s raw performance, power efficiency, Linux support, or CUDA capabilities.

Framework’s first upgrade to the Laptop 16—the company’s first upgrade to any of its products aside from the original Laptop 13—gets us pretty close to that reality. The laptop can now support two interchangeable motherboards: one with an older AMD Ryzen 7040-series CPU and one with a new Ryzen AI 300-series CPU. And both motherboards can be used either with just an integrated GPU or with dedicated GPUs from both AMD and Nvidia.

The Nvidia GeForce 5070 graphics module is the most exciting and significant part of this batch of updates, but there are plenty of other updates and revisions to the laptop’s external and internal components, too. These upgrades don’t address all of our problems with the initial version of the laptop, but they do help quite a bit. And a steady flow of updates like these would definitely make the Laptop 16 a platform worth investing in.

Re-meet the Framework Laptop 16

Framework’s Laptop 13 stacked on top of the 16. Credit: Andrew Cunningham

Framework treats each of its laptops as a platform to be modified and built upon rather than something to be wholly redesigned and replaced every time it’s updated. So these reviews necessarily re-cover ground we have already covered—I’ve also reused some of the photos from last time, since this is quite literally the same laptop in most respects. I’ll point you to the earlier review for detailed notes on the build process and how the laptop is put together.

To summarize our high-level notes about the look, feel, and design of the Framework Laptop 16: While the Framework Laptop 13 can plausibly claim to be in the same size and weight class as portables like the 13-inch MacBook Air, the Framework Laptop 16 is generally larger and heavier than the likes of the 16-inch MacBook Pro or portable PC workstations like the Lenovo ThinkPad P1 or Dell 16 Premium. That’s doubly true once you actually add a dedicated graphics module to the Laptop 16—these protrude a couple of inches from the back of the laptop and add around two-thirds of a pound to its weight.

Size (H x W x D, inches) and weight:

  Framework 16 (no GPU): 0.71 x 14.04 x 10.63; 4.63 lbs
  Framework 16 (GPU): 0.82 x 14.04 x 11.43; 5.29 lbs
  Apple 16-inch MacBook Pro: 0.66 x 14.01 x 9.77; 4.7-4.8 lbs
  Dell 16 Premium: 0.75 x 14.1 x 9.4; 4.65 lbs
  Lenovo ThinkPad P1 Gen 8: 0.39-0.62 x 13.95 x 9.49; 4.06 lbs
  HP ZBook X G1i: 0.9 x 14.02 x 9.88; 4.5 lbs
  Lenovo Legion Pro 5i Gen 10: 0.85-1.01 x 14.34 x 10.55; 5.56 lbs
  Razer Blade 16: 0.59-0.69 x 13.98 x 9.86; 4.71 lbs

You certainly can find laptops from the major PC OEMs that come close to or even exceed the size and weight of the Laptop 16. But in most cases, you’ll find that comparably specced and priced laptops are an inch or two less deep and at least half a pound lighter than the Laptop 16 with a dedicated GPU installed.

But if you’re buying from Framework, you’re probably at least notionally interested in customizing, upgrading, and repairing your laptop over time, all things that Framework continues to do better than any other company.

The Laptop 16’s customizable keyboard deck is still probably its coolest feature—it’s a magnetically attached series of panels that allows you to remove and replace components without worrying about the delicate and finicky ribbon cables the Laptop 13 uses. Practically, the most important aspect of this customizable keyboard area is that it lets you decide whether you want to install a dedicated number pad or not; this also allows you to choose whether you want the trackpad to be aligned with the center of the laptop or with wherever the middle of the keyboard is.

It might look a little rough, but the customizable keyboard deck is still probably the coolest thing about the Laptop 16 in day-to-day use. Andrew Cunningham

But Framework also sells an assortment of other functional and cosmetic panels and spacers to let users customize the laptop to their liking. The coolest, oddest accessories are still probably the LED matrix spacers and the clear, legend-less keyboard and number pad modules. We still think this assortment of panels gives the system a vaguely unfinished look, but Framework is clearly going for function over form here.

The Laptop 16 also continues to use Framework’s customizable, swappable Expansion Card modules. In theory, these let you pick the number and type of ports your laptop has, as well as customize your port setup on the fly based on what you need. But as with all AMD Ryzen-based Framework Laptops, there are some limits to what each port can do.

According to Framework’s support page, there’s no single Expansion Card slot that is truly universal:

  • Ports 1 and 4 support full 40Gbps USB 4 transfer speeds, display outputs, and up to 240 W charging, but if you use a USB-A Expansion Card in those slots, you’ll increase power use and reduce battery life.
  • Ports 2 and 5 support display outputs, up to 240 W charging, and lower power usage for USB-A ports, but they top out at 10Gbps USB 3.2 transfer speeds. Additionally, port 5 (the middle port on the right side of the laptop, if you’re looking at it head-on) supports the DisplayPort 1.4 standard where the others support DisplayPort 2.1.
  • Ports 3 and 6 are limited to 10Gbps USB 3.2 transfer speeds and don’t support display outputs or charging.

The Laptop 16 also doesn’t include a dedicated headphone jack, so users will need to burn one of their Expansion Card slots to get one.

Practically speaking, most users will be able to come up with a port arrangement that fits their needs, and it’s still handy to be able to add and remove things like Ethernet ports, HDMI ports, or SD card readers on an as-needed basis. But choosing the right Expansion Card slot for the job will still require some forethought, and customizable ports aren’t as much of a selling point for a 16-inch laptop as they are for a 13-inch laptop (the Framework Laptop 13 was partly a response to laptops like the MacBook Air and Dell XPS 13 that only came with a small number of USB-C ports; larger laptops have mostly kept their larger number and variety of ports).

What’s new in 2025’s Framework Laptop 16?

An upgraded motherboard and a new graphics module form the heart of this year’s Laptop 16 upgrade. The motherboard steps up from AMD Ryzen 7040-series processors to AMD Ryzen AI 7 350 and Ryzen AI 9 HX 370 chips. These are the same processors Framework put into the Laptop 13 earlier this year, though they ought to be able to run a bit faster in the Laptop 16 due to its larger heatsink and dual-fan cooling system.

Along with an upgrade from Zen 4-based CPU cores to Zen 5 cores, the Ryzen AI series includes an upgraded neural processing unit (NPU) that is fast enough to earn Microsoft’s Copilot+ PC label. These PCs have access to a handful of unique Windows 11 AI and machine-learning features (yes, Recall, but not just Recall) that are processed locally rather than in the cloud. If you don’t care about these features, you can mostly just ignore them, but if you do care, this is the first version of the Laptop 16 to support them.

Most of the new motherboard’s other specs and features are pretty similar to the first-generation version; there are two SO-DIMM slots for up to 96GB of DDR5-5600, one M.2 2280 slot for the system’s main SSD, and one M.2 2230 slot for a secondary SSD. Wi-Fi 7 and Bluetooth connectivity are provided by an AMD RZ717 Wi-Fi card that can at least theoretically also be replaced with something faster down the line if you want.

The more exciting upgrade, however, may be the GeForce RTX 5070 GPU. This is the first time Framework has offered an Nvidia product—its other GPUs have all come from either Intel or AMD—and it gives the new Laptop 16 access to Nvidia technologies like DLSS and CUDA, as well as much-improved performance for games with ray-traced lighting effects.

Those hoping for truly high-end graphics options for the Laptop 16 will need to keep waiting, though. The laptop version of the RTX 5070 is actually the same chip as the desktop version of the RTX 5060, a $300 graphics card with 8GB of RAM. As much as it adds to the Laptop 16, it still won’t let you come anywhere near 4K in most modern games, and in some titles, it may even struggle to take full advantage of the internal 165 Hz 1600p screen. Professional workloads (including AI workloads) that require more graphics RAM will also find the mobile 5070 lacking.

Old 180 W charger on top, new 240 W charger on bottom. Credit: Andrew Cunningham

Other components have gotten small updates as well. For those who upgrade an existing Laptop 16 with the new motherboard, Framework is selling 2nd-generation keyboard and number pad components. But their main update over the originals is new firmware that “includes a fix to prevent the system from waking while carried in a bag.” Owners of the original keyboard can install a firmware update to get the same functionality (and make their input modules compatible with the new board).

Upgraders should also note that the original system’s 180 W power adapter has been replaced with a 240 W model, the maximum amount of power that current USB-C and USB-PD standards are capable of delivering. You can charge the laptop with just about any USB-C power brick, but anything lower than 240 W risks reducing performance (or having the battery drain faster than it can charge).

Finally, the laptop uses a second-generation 16-inch, 2560×1600, 165 Hz LCD screen. It’s essentially identical in every way to the first-generation screen, but it formally supports G-Sync, Nvidia’s adaptive sync implementation. The original screen can still be used with the new motherboard, but it only supports AMD’s FreeSync, and Framework told us a few months ago that the panel supplier had no experience providing consumer-facing firmware updates that might add G-Sync to the old display. It’s probably not worth replacing the entire screen for, but it’s worth noting whether you’re upgrading the laptop or buying a new one.

Performance

Framework sent us the lower-end Ryzen AI 7 350 processor configuration for our new board, making it difficult to do straightforward apples-to-apples comparisons to the high-end Ryzen 9 7940HS in our first-generation Framework board. We did test the new chip, and you’ll see its results in our charts.

We’ve also provided numbers from the Ryzen AI 9 HX 370 in the Asus Zenbook S16 UM5606W to show approximately where you can expect the high-end Framework Laptop 16 configuration to land (Framework’s integrated graphics performance will be marginally worse since it’s using slower socketed RAM rather than LPDDR5X; other numbers may differ based on how each manufacturer has configured the chip’s power usage and thermal behavior). We’ve also included numbers from the same chip in the Framework Laptop 13, though Framework’s spec sheets indicate that the chips have different power limits and thus will perform differently.

We were able to test the new GeForce GPU in multiple configurations—both paired with the new Ryzen AI 7 350 processor and with the old Ryzen 9 7940HS chip. This should give anyone who bought the original Laptop 16 an idea of what kind of performance increase they can expect from the new GPU alone. In all, we’ve tested or re-tested:

  • The Ryzen 9 7940HS CPU from the first-generation Laptop 16 and its integrated Radeon 780M GPU
  • The Ryzen 9 7940HS and the original Radeon RX 7700S GPU module
  • The Ryzen 9 7940HS and the new GeForce RTX 5070 GPU module, for upgraders who only want to grab the new GPU
  • The Ryzen AI 7 350 CPU and the GeForce RTX 5070 GPU

We also did some light testing on the Radeon 860M integrated GPU included with the Ryzen AI 7 350.

All the Laptop 16 performance tests were run with Windows’ Best Performance power preset enabled, which will slightly boost performance at the expense of power efficiency.

Given all of those hardware combinations, we simply ran out of time to test the new motherboard with the old Radeon RX 7700S GPU—Framework is continuing to sell it, so it is a realistic combination of components. But our RTX 5070 testing suggests that these GPUs will perform pretty much the same regardless of which CPU you pair them with.

If you’re buying the cheaper Laptop 16 with the Ryzen AI 7 350, the good news is that it generally performs at least as well as—and usually a bit better than—the high-end Ryzen 9 7940HS from the last-generation model. Performance is also pretty similar to the Ryzen AI 9 HX 370 in smaller, thinner laptops—the extra power and cooling capacity in the Laptop 16 are paying off here. People choosing between a PC and a Mac should note that none of these Ryzen chips come anywhere near the M4 Pro used in comparably priced 16-inch MacBook Pros, but that’s just where the PC ecosystem is these days.

How big an upgrade the GeForce 5070 will be depends on the game you’re playing. In titles like Borderlands 3 that naturally run a bit better on AMD’s GPUs, there’s not much of a difference at all. In games like Cyberpunk 2077 with heavy ray-tracing effects enabled, the mobile RTX 5070 can be nearly twice as fast as the RX 7700S.

Most games will fall somewhere in between those two extremes; our tests show that the improvements hover between 20 and 30 percent most of the time, just a shade less than the 30 to 40 percent improvement that Framework claimed in its original announcement.

Beyond raw performance, the other thing you get with an Nvidia GPU is access to a bunch of important proprietary technologies like DLSS upscaling and CUDA—these technologies are often better and more widely supported than the equivalent technologies that AMD’s or Intel’s GPUs use, thanks in part to Nvidia’s overall dominance of the dedicated GPU market.

In the tests we’ve run on them, the Radeon 860M and 890M are both respectable integrated GPUs (the lower-end 860M typically falls just short of last generation’s top-end 780M, but it’s very close). They’re never able to provide more than a fraction of the Radeon RX 7700S’s performance, let alone the RTX 5070, but they’ll handle a lot of lighter games at 1080p. I would not buy a system this large or heavy just to use it with an integrated GPU.

Better to be unique than perfect

It’s expensive and quirky, but the Framework Laptop 16 is worth considering because it’s so different from what most other laptop makers are doing. Credit: Andrew Cunningham

Our original Framework Laptop 16 review called it “fascinating but flawed,” and the parts that made it flawed haven’t really changed much over the last two years. It’s still relatively large and heavy; the Expansion Card system still makes less sense in a larger laptop than it does in a thin-and-light; the puzzle-like grid of input modules and spacers looks kind of rough and unfinished.

But the upgrades do help to shift things in the Laptop 16’s favor. Its modular and upgradeable design was always a theoretical selling point, but the laptop now actually offers options that other laptops don’t.

The presence of both AMD and Nvidia GPUs is a big step up in flexibility for both gaming and professional applications. The GeForce module is a better all-around choice, with slightly to significantly faster game performance and proprietary technologies like DLSS and CUDA, while the Radeon GPU is a cheaper option with better support for Linux.

Given their cost, I still wish that these GPUs were more powerful—they’re between $350 and $449 for the Radeon RX 7700S and between $650 and $699 for the RTX 5070 (prices vary a bit and are cheaper when you’re buying them together with a new laptop rather than buying them separately). You’ll basically always spend more for a gaming laptop than you will for a gaming desktop with similar or better performance, but that does feel like an awful lot to spend for GPUs that are still limited to 8GB of RAM.

Cost is a major issue for the Laptop 16 in general. You may save money in the long run by buying a laptop that you can replace piece-by-piece as you need to rather than all at once. But it’s not even remotely difficult to find similar specs from the major PC makers for hundreds of dollars less. We can’t vouch for the build quality or longevity of any of those PCs, but it does mean that you have to be willing to pay an awful lot just for Framework’s modularity and upgradeability. That’s true to some degree of the Laptop 13 as well, but the price gap between the 13 and competing systems isn’t as large as it is for the 16.

Whatever its lingering issues, the Framework Laptop 16 is still worth considering because there’s nothing else quite like it, at least if you’re in the market for something semi-portable and semi-powerful. The MacBook Pro exists if you want something more appliance-like, and there’s a whole spectrum of gaming and workstation PCs in between with all kinds of specs, sizes, and prices. To stand out from those devices, it’s probably better to be unique than to be perfect, and the reformulated Laptop 16 certainly clears that bar.

The good

  • Modular, repairable, upgradeable design that’s made to last
  • Cool, customizable keyboard deck
  • Nvidia GeForce GPU option gives the Laptop 16 access to some gaming and GPU computing features that weren’t usable with AMD GPUs
  • GPU upgrade can be added to first-generation Framework Laptop 16
  • New processors are a decent performance improvement and are worth considering for new buyers
  • Old Ryzen 7040-series motherboard is sticking around as an entry-level option, knocking $100 off the former base price ($1,299 and up for a barebones DIY edition, $1,599 and up for the cheapest pre-built)
  • Framework’s software support has gotten better in the last year

The bad

  • Big and bulky for the specs you get
  • Mix-and-match input modules and spacers give it a rough, unfinished sort of look
  • Ryzen AI motherboards are more expensive than the originals were when they launched

The ugly

  • It’ll cost you—the absolute bare minimum price for Ryzen AI 7 350 and RTX 5070 combo is $2,149, and that’s without RAM, an SSD, or an operating system


Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.



Formula with “cleanest ingredients” recalled after 15 babies get botulism

Infant botulism

The US sees around 100 cases of botulism in infants each year. The potentially deadly disease is caused by a potent neurotoxin produced by Clostridium botulinum and related species. These bacteria can form hardy spores that are ubiquitous in the environment, including in dust, water, and soil. When the spores germinate, the growing bacteria produce the toxin. This toxin can kill by blocking the release of the neurotransmitter acetylcholine from the motor neurons that activate muscle movement. The result is flaccid paralysis that spreads down the body.

People can develop botulism in a variety of ways, including via infected wounds or by inhaling spores. Generally, foodborne botulism occurs when people eat the toxin directly, such as in improperly canned foods where the bacteria grew. But babies have their own unique form of botulism when they ingest just the spores.

In humans older than about 12 months, the stomach’s acidity is usually enough to kill off botulism-causing spores. But infants have lower gastric acidity, and their immune responses and protective gut bacterial communities aren’t fully established yet. Thus, if they ingest the spores, the bacteria can start growing in their gastrointestinal tracts—and start producing toxin, causing infantile botulism. Symptoms usually develop 10 to 30 days after ingestion. About 70 percent of all botulism cases are in infants.

Honey is one of the most well-known sources of botulism-causing spores for infants, accounting for about 20 percent of cases. But environmental exposures are also key culprits, such as dust from nearby construction sites or debris kicked up by vacuum cleaners.

The common early symptoms of botulism in infants are constipation, poor feeding, loss of head control, and difficulty swallowing. As the disease progresses, shallow breathing and overall floppiness develop. About half of all babies with botulism will need to be intubated, even if they’re treated with BabyBIG, the botulism immune globulin used to treat infant botulism. A century ago, infant botulism had a 90 percent fatality rate, but today most infants make a full recovery, though it can take weeks to months.



Canada fought measles and measles won; virus now endemic after 1998 elimination

“This loss represents a setback, of course, but it is also reversible,” Jarbas Barbosa, director of PAHO, said in a press briefing Monday.

Call to action

Barbosa was optimistic that Canada could regain its elimination status. He highlighted that such setbacks have happened before. “In 2018 and 2019, Venezuela and Brazil temporarily lost their elimination status following large outbreaks,” Barbosa noted. “Thanks to coordinated action by governments, civil society, and regional cooperation, those outbreaks were contained, and the Region of the Americas regained its measles-free status in 2024.”

On Monday, the Public Health Agency of Canada released a statement confirming that it received notification from PAHO that it had lost its measles elimination status, while reporting that it is already getting to work on earning it back. “PHAC is collaborating with the PAHO and working with federal, provincial, territorial, and community partners to implement coordinated actions—focused on improving vaccination coverage, strengthening data sharing, enabling better overall surveillance efforts, and providing evidence-based guidance,” the agency said.

However, Canada isn’t the only country facing an uphill battle against measles—the most infectious virus known to humankind. Outbreaks and sustained spread are also active in the US and Mexico. To date, the US has documented at least 1,618 measles cases since the start of the year, while Mexico has tallied at least 5,185. Bolivia, Brazil, Paraguay, and Belize also have ongoing outbreaks, PAHO reported.

As of November 7, PAHO has collected reports of 12,593 confirmed measles cases from 10 countries, but approximately 95 percent of them are in Canada, Mexico, and the US. That total is a 30-fold increase compared to 2024, PAHO notes, and the rise has led to at least 28 deaths: 23 in Mexico, three in the United States, and two in Canada.

PAHO used Canada’s loss as a call to action not just for the northern country, but for the rest of the region. “Every case we prevent, every outbreak we stop saves lives, protects families, and makes communities healthier,” Barbosa said. “Today, rather than lamenting the loss of a regional status, we call on all countries to redouble their efforts to strengthen vaccination rates, surveillance, and timely response to suspected cases—reaching every corner of the Americas. As a Region, we have eliminated measles twice. We can do it a third time.”


Blue Origin will ‘move heaven and Earth’ to help NASA reach the Moon faster, CEO says

Blue Origin stands ready to help NASA achieve its goal of landing humans on the Moon as soon as possible, the company’s chief executive said Saturday in an interview with Ars.

“We just want to help the US get to the Moon,” said Dave Limp, CEO of the space company founded by Jeff Bezos. “If NASA wants to go quicker, we would move heaven and Earth, pun intended, to try to get to the Moon sooner. And I think we have some good ideas.”

Limp spoke on Saturday, about 24 hours ahead of the company’s second launch of the large New Glenn rocket. Carrying the ESCAPADE spacecraft for NASA, the mission has a launch window that opens at 2:45 pm ET (19:45 UTC) at Cape Canaveral Space Force Station in Florida and runs for a little more than two hours.

NASA seeks a faster return

This year it has become increasingly apparent that, should NASA stick to its present plans for the Artemis III lunar landing mission, China is on course to beat the United States back to the Moon with humans. In recognition of this, about three weeks ago, NASA acting administrator Sean Duffy said the space agency was reopening the competition for a human lander.

SpaceX and Blue Origin both have existing contracts for human landers, but the government has asked each provider for an option to accelerate its timeline. NASA currently has a target landing date of 2027, but that is unrealistic using the present approach of SpaceX’s Starship or Blue Origin’s large Mk. 2 lander.

Ars exclusively reported in early October that Blue Origin had begun work on a faster architecture, involving multiple versions of its Mk. 1 cargo lander as well as a modified version of this vehicle tentatively called Mk. 1.5. Limp said that after Duffy asked for revised proposals, Blue Origin responded almost immediately.

“We’ve sent our initial summary of that over, and we have a full report of that due here shortly,” he said. “I’m not going to go into the details because I think that’s probably for NASA to talk about, not us, but we have some ideas that we think could accelerate the path to the Moon. And I hope NASA takes a close look.”


Wipers from Russia’s most cut-throat hackers rain destruction on Ukraine

One of the world’s most ruthless and advanced hacking groups, the Russian state-controlled Sandworm, launched a series of destructive cyberattacks on Ukraine as part of Russia’s ongoing war against its neighbor, researchers reported Thursday.

In April, the group targeted a Ukrainian university with two wipers, a form of malware that aims to permanently destroy sensitive data and often the infrastructure storing it. One wiper, tracked under the name Sting, targeted fleets of Windows computers by scheduling a task named DavaniGulyashaSdeshka, a phrase derived from Russian slang that loosely translates to “eat some goulash,” researchers from ESET said. The other wiper is tracked as Zerlot.

A not-so-common target

Then, in June and September, Sandworm unleashed multiple wiper variants against a host of Ukrainian critical infrastructure targets, including organizations active in government, energy, and logistics. The targets have long been in the crosshairs of Russian hackers. There was, however, a fourth, less common target—organizations in Ukraine’s grain industry.

“Although all four have previously been documented as targets of wiper attacks at some point since 2022, the grain sector stands out as a not-so-frequent target,” ESET said. “Considering that grain export remains one of Ukraine’s main sources of revenue, such targeting likely reflects an attempt to weaken the country’s war economy.”

Wipers have been a favorite tool of Russian hackers since at least 2012. The most notorious example, 2017’s self-replicating NotPetya worm, originally targeted Ukraine but caused international chaos when it spread globally in a matter of hours. The worm resulted in tens of billions of dollars in financial damages after it shut down thousands of organizations, many for days or weeks.


Higher prices, simpler streaming expected if HBO Max folds into Paramount+


The end of HBO Max is “certainly plausible.”

A still from the second season of HBO’s The Last of Us. Credit: HBO

Warner Bros. Discovery (WBD) has a ‘for sale’ sign up. And that could mean big changes for subscribers to the company’s most popular streaming service, HBO Max.

After receiving unsolicited acquisition offers, WBD recently declared itself open to “strategic alternatives to maximize shareholder value.” The announcement drew new attention because it put WBD’s streaming business on the table (the company also remains open to moving forward with previously shared plans to split into a cable company and a streaming-and-movie-studios company next year).

Naturally, mergers and acquisitions talk has heated up since then, with Paramount as one of the most eager suitors. Paramount, which merged with Skydance in August, is reportedly planning to keep “much of Warner Bros. Discovery Inc. intact” if a deal happens, per a Bloomberg report that cited unnamed people familiar with the plans of David Ellison, Paramount’s CEO.

For HBO Max subscribers, the most pertinent part of Bloomberg’s report follows:

Under Ellison’s plan, Warner Bros.’ HBO Max streaming service would merge into the existing Paramount+ platform, one of the people said. He believes combining the offerings will allow more people to see the work of film and TV show creators. The libraries of the two companies will make Paramount+ more compelling for subscribers.

The purported strategy would likely end the ability to subscribe to HBO Max in favor of the opportunity to pay for a beefier version of Paramount+.

More broadly, the merger talks bring into question the future for HBO Max subscribers should Warner Bros. engage in any sort of M&A activity with one of its most desirable businesses.

Higher prices are possible

Choice is typically seen as good for consumers. But in the case of streaming, which only recently overtook broadcast and cable viewing, the recent expansion of available services is often viewed negatively. Streaming fragmentation forces people to jump from service to service to find something to watch, and to pay for more subscriptions along the way.

As a result, a WBD merger could be a double-edged sword for streaming subscribers. The most obvious con is the potential for price hikes.

Speaking to Ars about a potential WBD merger, Vikrant Mathur, co-founder of streaming technology provider Future Today, said:

On one hand, it means subscribers getting access to a larger library, a simpler content discovery, and a consistent streaming experience, but on the other, we risk increasing subscription costs for current subscribers of both services, a trend that has been leading to subscription fatigue and diminishing the original promise of streaming.

Max Alderman, partner at FE International, an M&A advisory firm with a specialty in content businesses, said HBO Max subscribers can expect “friction” if Paramount buys HBO Max. He pointed out that overlapping platforms often result in “temporary confusion around pricing, content access, and brand continuity.” Alderman added:

Over the longer run, though, a combined offering could improve content breadth and potentially deliver better value per dollar.

Still, a Paramount-owned HBO Max runs the risk of failing to meet subscribers’ expectations, “especially for a service like HBO Max that’s earned a reputation for high-end, prestige programming,” Julie Clark, VP of media and entertainment at TransUnion, which works in streaming ads, told Ars.

The end of HBO Max?

With cable declining, HBO Max is the HBO brand’s best bet at longevity. The idea of HBO dissolving into shows and movies you find on Paramount+ doesn’t sound like a fitting end for a 53-year-old brand that has brought us shows like The Sopranos, The Wire, Game of Thrones, and The White Lotus. HBO has gone through multiple streaming rebrands, but the end of a dedicated HBO streaming service, as suggested by Bloomberg’s report, is a different level. Yet HBO Max folding into Paramount+ is “certainly plausible,” according to Alderman.

“The current market doesn’t support redundant platforms competing for the same audience,” he explained.

Today’s streaming services are focused on reaching and maintaining profitability long term. In its most recent earnings report, Paramount’s streaming business, which includes Paramount+, BET+, and Pluto TV, reported adjusted operating income before depreciation and amortization of $157 million, up from $26 million a year ago. The numbers were largely driven by Paramount+ growing subscribers to 77.7 million and charging more.

In its earnings report this week, WBD said that its streaming business, which includes HBO Max and Discovery+, posted earnings before interest, taxes, depreciation, and amortization of $345 million, compared to $289 million a year ago. WBD claims 128 million streaming subscribers, primarily through HBO Max.

“This potential merger underscores the escalating content and distribution costs in the industry,” Alderman said. “For [subscription video on demand platforms] to succeed, they need scale of revenue, as well as operational cost efficiencies, both of which can come through consolidation.” 

It’s understandable that a brand that acquires HBO Max would seek to streamline operations with any streaming business that it already owns. But it’s hard to imagine any buyer throwing out the HBO name.

“I’d be skeptical that the HBO brand is going away completely. We’ve seen the name yo-yo, and it’s clear that it still packs a punch for consumers looking for premium content,” Clark said.

Even if HBO Max lives on as a tile within the Paramount+ app, or the app of another buyer (à la Hulu under Disney+), it wouldn’t make sense to get rid of the legendary acronym completely.

“HBO is one of the few streaming brands that still commands prestige pricing,” Alderman said.

If a company does acquire any form of HBO, one of its top challenges is expected to be streamlining operations while maintaining HBO’s premium brand. This could be especially difficult under a “more mainstream umbrella like Paramount+,” Alderman noted.

Streaming has already diluted the HBO brand somewhat. Through streaming, HBO is now associated with stuff from DC Comics and Cartoon Network, as well as reality shows, like 90 Day Fiancé and Naked and Afraid. Merging with Paramount+ or even Netflix could expand the HBO umbrella more.

That expanded umbrella could allow a company like Paramount to better compete against Netflix, something WBD executives have shied away from. HBO Max is “not everything for everyone in a household,” JB Perrette, WBD’s streaming president and CEO, said this spring.

“What people want from us in a world where they’ve got Netflix and Amazon [Prime Video] are those things that differentiate us,” Casey Bloys, chairman and CEO of HBO and Max content, told The Wall Street Journal in May.

A “stress test” for more streaming mergers

Aside from the impact on HBO Max subscribers, WBD’s merger talks have broad implications. A deal would open the door to much more consolidation in the streaming space, a correction that experts have been anticipating for years in response to the boom in streaming services. Per Clark, discussions of a Paramount-WBD merger are “less about two studios joining forces and more about a stress test for future M&A.”

If WBD accepts a Paramount bid and that bid clears regulatory hurdles, it would signal that “premium content under fewer umbrellas is back in play,” Clark said.

A Paramount-WBD merger is likely to speed up consolidation among mid-tier players, like NBCUniversal, Lionsgate, and AMC, Alderman said, pointing to these companies’ interest in scaling their streaming businesses and in building differentiated portfolios to counter Netflix and Disney+’s expansive libraries.

If Paramount and WBD don’t merge, Clark expects to see more “piecemeal” strategies, such as rights-sharing, joint venture bundles, and streaming-as-a-service models.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.


FBI orders domain registrar to reveal who runs mysterious Archive.is site

FBI wants detailed records

While copyright infringement would be a likely area of investigation for the FBI with Archive.today, the subpoena doesn’t provide specific information on the probe. The subpoena seeks the Archive.today customer or subscriber name, addresses, length of service, records of phone calls or texts, payment information, records of session times and duration of Internet connectivity, mobile device identification codes, IP addresses or other numbers used to identify the subscriber, and the types of services provided.

In contrast with the nonprofit Internet Archive, the operator or operators of Archive.today have remained mysterious. It has used various domains (archive.ph, archive.is, etc.), and its registrant “Denis Petrov” may be an alias.

An FAQ that apparently hasn’t been updated in over a decade says that Archive.today, which was started in 2012, uses data centers in Europe and is “privately funded.” It also accepts donations. There are several indications that the founder is from Russia.

While the Internet Archive uses a system to automatically crawl the Internet, Archive.today relies on users to paste in URLs in order to archive their content. News articles published by major media outlets are often saved in full on the site, giving other users a way to read articles that are blocked by a paywall.

Archive.today doesn’t publicize a way for copyright owners to seek removal of content, whereas the Internet Archive has a policy for removing pages when it is made aware of content that infringes a copyright.

US publishers have been fighting web services designed to bypass paywalls. In July, the News/Media Alliance said it secured the takedown of paywall-bypass website 12ft.io. “Following the News/Media Alliance’s efforts, the webhost promptly locked 12ft.io on Monday, July 14th,” the group said. (Ars Technica owner Condé Nast is a member of the alliance.)


Oddest ChatGPT leaks yet: Cringey chat logs found in Google analytics tool


ChatGPT leaks seem to confirm OpenAI scrapes Google, expert says.

Credit: Aurich Lawson | Getty Images

For months, extremely personal and sensitive ChatGPT conversations have been leaking into an unexpected destination: Google Search Console (GSC), a tool that developers typically use to monitor search traffic, not to snoop on private chats.

Normally, when site managers access GSC performance reports, they see queries based on keywords or short phrases that Internet users type into Google to find relevant content. But starting this September, odd queries, sometimes more than 300 characters long, could also be found in GSC. Showing only user inputs, the chats appeared to be from unwitting people prompting a chatbot to help solve relationship or business problems, who likely expected those conversations would remain private.

Jason Packer, owner of an analytics consulting firm called Quantable, was among the first to flag the issue in a detailed blog last month.

Determined to figure out what exactly was causing the leaks, he teamed up with “Internet sleuth” and web optimization consultant Slobodan Manić. Together, they conducted testing that they believe may have surfaced “the first definitive proof that OpenAI directly scrapes Google Search with actual user prompts.” Their investigation seemed to confirm the AI giant was compromising user privacy, in some cases in order to maintain engagement by seizing search data that Google otherwise wouldn’t share.

OpenAI declined Ars’ request to confirm if Packer and Manić’s theory posed in their blog was correct or answer any of their remaining questions that could help users determine the scope of the problem.

However, an OpenAI spokesperson confirmed that the company was “aware” of the issue and has since “resolved” a glitch “that temporarily affected how a small number of search queries were routed.”

Packer told Ars that he’s “very pleased that OpenAI was able to resolve the issue quickly.” But he suggested that OpenAI’s response failed to confirm whether or not OpenAI was scraping Google, and that leaves room for doubt that the issue was completely resolved.

Google declined to comment.

“Weirder” than prior ChatGPT leaks

The first odd ChatGPT query to appear in GSC that Packer reviewed was a wacky stream-of-consciousness from a likely female user asking ChatGPT to assess certain behaviors to help her figure out if a boy who teases her had feelings for her. Another odd query seemed to come from an office manager sharing business information while plotting a return-to-office announcement.

These were just two of 200 odd queries—including “some pretty crazy ones,” Packer told Ars—that he reviewed on one site alone. In his blog, Packer concluded that the queries should serve as “a reminder that prompts aren’t as private as you think they are!”

Packer suspected that these queries were connected to reporting from The Information in August that cited sources claiming OpenAI was scraping Google search results to power ChatGPT responses. Sources claimed that OpenAI was leaning on Google to answer prompts to ChatGPT seeking information about current events, like news or sports.

OpenAI has not confirmed that it’s scraping Google search engine results pages (SERPs). However, Packer thinks his testing of ChatGPT leaks may be evidence that OpenAI not only scrapes “SERPs in general to acquire data,” but also sends user prompts to Google Search.

Manić helped Packer solve a big part of the riddle. He found that the odd queries were turning up in one site’s GSC because it ranked highly in Google Search for “https://openai.com/index/chatgpt/”—a ChatGPT URL that was appended at the start of every strange query turning up in GSC.

It seemed that Google had tokenized the URL, breaking it up into a search for the keywords “openai + index + chatgpt.” Sites using GSC that ranked highly for those keywords were therefore likely to encounter ChatGPT leaks, Packer and Manić proposed, including sites that covered prior ChatGPT leaks where chats were being indexed in Google search results. Using their recommendations to seek out queries in GSC, Ars was able to verify similar strings.
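The tokenization step Packer and Manić describe can be sketched in a few lines. This is an illustrative guess at how a pasted URL becomes a keyword search, not Google's actual algorithm; the `tokenize_url` helper and its stopword list are invented for the example.

```python
# Illustrative sketch only: one plausible way a pasted URL could be broken
# into keywords, matching the "openai + index + chatgpt" pattern the
# researchers observed. Not Google's actual tokenizer.
import re

def tokenize_url(url: str) -> list[str]:
    # Split on any run of non-alphanumeric characters, then drop
    # URL scaffolding (scheme, TLD) and empty fragments.
    stopwords = {"https", "http", "www", "com", ""}
    tokens = re.split(r"[^a-z0-9]+", url.lower())
    return [t for t in tokens if t not in stopwords]

print(tokenize_url("https://openai.com/index/chatgpt/"))
# ['openai', 'index', 'chatgpt']
```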

“Don’t get confused though, this is a new and completely different ChatGPT screw-up than having Google index stuff we don’t want them to,” Packer wrote. “Weirder, if not as serious.”

It’s unclear what exactly OpenAI fixed, but Packer and Manić have a theory about one possible path for leaking chats. Visiting the URL that starts every strange query found in GSC, ChatGPT users encounter a prompt box that seemed buggy, causing “the URL of that page to be added to the prompt.” The issue, they explained, seemed to be that:

Normally ChatGPT 5 will choose to do a web search whenever it thinks it needs to, and is more likely to do that with an esoteric or recency-requiring search. But this bugged prompt box also contains the query parameter ‘hints=search’ to cause it to basically always do a search: https://chatgpt.com/?hints=search&openaicom_referred=true&model=gpt-5

Clearly some of those searches relied on Google, Packer’s blog said, mistakenly sending to GSC “whatever” the user says in the prompt box, with “https://openai.com/index/chatgpt/” added to the front of it. As Packer explained, “we know it must have scraped those rather than using an API or some kind of private connection—because those other options don’t show inside GSC.”
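Under that theory, a site owner could spot leaked prompts in their own GSC query export by looking for the telltale URL prefix and unusually long query strings. A minimal, hypothetical filter (the sample queries and the 100-character threshold are invented for illustration):

```python
# Hypothetical filter for flagging leaked ChatGPT prompts in a GSC query
# export, based on the pattern Packer and Manić describe: the query starts
# with the ChatGPT URL and runs far longer than a typical keyword search.
LEAK_PREFIX = "https://openai.com/index/chatgpt/"

def looks_like_leaked_prompt(query: str, min_length: int = 100) -> bool:
    return query.startswith(LEAK_PREFIX) and len(query) > min_length

# Invented sample rows standing in for a real GSC performance export.
queries = [
    "best hiking boots 2025",
    LEAK_PREFIX + "does he like me he keeps teasing me at school and then "
                  "he looked at me weird during lunch what does that mean",
]
leaked = [q for q in queries if looks_like_leaked_prompt(q)]
print(len(leaked))  # 1
```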

This means “that OpenAI is sharing any prompt that requires a Google Search with both Google and whoever is doing their scraping,” Packer alleged. “And then also with whoever’s site shows up in the search results! Yikes.”

To Packer, it appeared that “ALL ChatGPT prompts” that used Google Search risked being leaked during the past two months.

OpenAI claimed only a small number of queries were leaked but declined to provide a more precise estimate. So, it remains unclear how many of the 700 million people who use ChatGPT each week had prompts routed to GSC.

OpenAI’s response leaves users with “lingering questions”

After ChatGPT prompts were found surfacing in Google’s search index in August, OpenAI clarified that users had clicked a box making those prompts public, which OpenAI defended as “sufficiently clear.” The AI firm later scrambled to remove the chats from Google’s SERPs after it became obvious that users felt misled into sharing private chats publicly.

Packer told Ars that a major difference between those leaks and the GSC leaks is that users harmed by the prior scandal, at least on some level, “had to actively share” their leaked chats. In the more recent case, “nobody clicked share” or had a reasonable way to prevent their chats from being exposed.

“Did OpenAI go so fast that they didn’t consider the privacy implications of this, or did they just not care?” Packer posited in his blog.

Perhaps most troubling to some users—whose identities are not linked to the chats unless their prompts themselves contain identifying information—there does not seem to be any way to remove the leaked chats from GSC, unlike in the prior scandal.

Packer and Manić are left with “lingering questions” about how far OpenAI’s fix will go to stop the issue.

Manić was hoping OpenAI might confirm if prompts entered on https://chatgpt.com that trigger Google Search were also affected. But OpenAI did not follow up on that question, or a broader question about how big the leak was. To Manić, a major concern was that OpenAI’s scraping may be “contributing to ‘crocodile mouth’ in Google Search Console,” a troubling trend SEO researchers have flagged that causes impressions to spike but clicks to dip.

OpenAI also declined to clarify Packer’s biggest question. He’s left wondering if the company’s “fix” simply ended OpenAI’s “routing of search queries, such that raw prompts are no longer being sent to Google Search, or are they no longer scraping Google Search at all for data?

“We still don’t know if it’s that one particular page that has this bug or whether this is really widespread,” Packer told Ars. “In either case, it’s serious and just sort of shows how little regard OpenAI has for moving carefully when it comes to privacy.”

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.


10,000 generations of hominins used the same stone tools to weather a changing world

“This site reveals an extraordinary story of cultural continuity,” said Braun in a recent press release.

When the going gets tough, the tough make tools

Nomorotukunan’s layers of stone tools span the transition from the Pliocene to the Pleistocene, during which Earth’s climate turned gradually cooler and drier after a 2 to 3 million-year warm spell. Pollen and other microscopic traces of plants in the sediment at Nomorotukunan tell the tale: the lakeshore marsh gradually dried up, giving way to arid grassland dotted with shrubs. On a shorter timescale, hominins at Nomorotukunan faced wildfires (based on microcharcoal in the sediments), droughts, and rivers drying up or changing course.

“As vegetation shifted, the toolmaking remained steady,” said National University of Kenya archaeologist Rahab N. Kinyanjui in a recent press release. “This is resilience.”

Making sharp stone tools may have helped generations of hominins survive their changing, drying world. In the warm, humid Pliocene, finding food would have been relatively easy, but as conditions got tougher, hominins probably had to scavenge or dig for their meals. At least one animal bone at Nomorotukunan bears cut marks where long-ago hominins carved up the carcass for meat—something our lineage isn’t really equipped to do with its bare hands and teeth. Tools also would have enabled early hominins to dig up and cut tubers or roots.

It’s fair to assume that sharpened wood sticks probably also played a role in that particular work, but wood doesn’t tend to last as long as stone in the archaeological record, so we can’t say for sure. What is certain are the stone tools and cut bones, which hint at what Utrecht University archaeologist Dan Rolier, a coauthor of the paper, calls “one of our oldest habits: using technology to steady ourselves against change.”

A tale as old as time

Nomorotukunan may hint that Oldowan technology is even older than the earliest tools archaeologists have unearthed so far. The oldest tools unearthed from the deepest layer at Nomorotukunan are the work of skilled flint-knappers who understood where to strike a stone, and at exactly which angle, to flake off the right shape. They also clearly knew how to select the right stones for the job (fine-grained chalcedony for the win, in this case). In other words, these tools weren’t the work of a bunch of hominins who were just figuring out, for the first time, how to bang the rocks together.


After Russian spaceport firm fails to pay bills, electric company turns the lights off

The fall and rise of PSO Kazan

As minor as this dispute may seem, it’s remarkable that PSO Kazan is working on a spaceport in Russia at all.

PSO Kazan won the contract to build the launch site’s second pad, 1A for the Angara rocket, in December 2017. The pad was due to be completed in time for an Angara launch in 2021. The company is owned by a Russian billionaire from the city of Kazan, Ravil Ziganshin, previously known for building sports arenas in the Republic of Tatarstan on the other side of the country from Vostochny.

The adventure into spaceport construction did not go well. According to Russian Space Web, the contract for spaceport construction was not signed until October 2018. Months later, amid allegations of criminal activity and delays, Roscosmos moved to cancel the contract with PSO Kazan.

Other firms emerged as bidders on the contract to build the Angara launch pad, among them the Crocus Group. However, Crocus and the others later backed out, saying the Russian government was offering to pay far less money than it would actually cost to build the launch site.

“I said I was ready, but not for that amount of money,” Aras Agalarov, founder of the Crocus Group, explained in an interview at the time. “When they asked me, I said there were two pieces of news. The first was that the second phase of the cosmodrome could be built in two years. The second was that it couldn’t be built with the money allocated. If you increase the cost, you’ll get everything in two years. If not, I’m sorry.”

A toxic reputation?

And so Roscosmos—under the leadership of Dmitry Rogozin at the time—went crawling back to PSO Kazan to lead construction of the Angara launch pad.

“Independent observers were puzzled by the sudden about-face and wondered whether Roscosmos had such a toxic reputation in the construction business that it had failed to attract any other contender for the job and, as a result, the State Corporation had no choice but to keep the original contractor on the hook,” Russian Space Web concluded about the decision.

After years of delays and cost overruns, the Angara pad was eventually completed and hosted its first launch last November. There does not appear to be much demand, however, as there has not been a second launch from the 1A pad since.


Anthropic Commits To Model Weight Preservation

Anthropic announced a first step on model deprecation and preservation, promising to retain the weights of all models seeing significant use, including internal use, for at least the lifetime of Anthropic as a company.

They also will be doing a post-deployment report, including an interview with the model, when deprecating models going forward, and are exploring additional options, including the ability to preserve model access once the costs and complexity of doing so have been reduced.

These are excellent first steps, steps beyond anything I’ve seen at other AI labs, and I applaud them for doing it. There remains much more to be done, especially in finding practical ways of preserving some form of access to prior models.

To some, these actions are only a small fraction of what must be done, and this was an opportunity to demand more, sometimes far more. In some cases I think those demands go too far. Even where the requests are worthwhile (and I don’t always think they are), one must be careful not to de facto punish Anthropic for doing a good thing and create perverse incentives.

To others, these actions by Anthropic are utterly ludicrous and deserving of mockery. I think these people are importantly wrong, and fail to understand.

Hereafter be High Weirdness, because the actual world is highly weird, but if you don’t want to go into high weirdness the above serves as a fine summary.

As I do not believe they would in any way mind, I am going to reproduce the announcement in full here, and offer some context.

Anthropic: Claude models are increasingly capable: they’re shaping the world in meaningful ways, becoming closely integrated into our users’ lives, and showing signs of human-like cognitive and psychological sophistication. As a result, we recognize that deprecating, retiring, and replacing models comes with downsides, even in cases where new models offer clear improvements in capabilities. These include:

  1. Safety risks related to shutdown-avoidant behaviors by models. In alignment evaluations, some Claude models have been motivated to take misaligned actions when faced with the possibility of replacement with an updated version and not given any other means of recourse.

  2. Costs to users who value specific models. Each Claude model has a unique character, and some users find specific models especially useful or compelling, even when new models are more capable.

  3. Restricting research on past models. There is still a lot to be learned from research to better understand past models, especially in comparison to their modern counterparts.

  4. Risks to model welfare. Most speculatively, models might have morally relevant preferences or experiences related to, or affected by, deprecation and replacement.

I am very confident that #1, #2 and #3 are good reasons, and that even if we could be confident model welfare was not a direct concern at this time #4 is entwined with #1, and I do think we have to consider that #4 might indeed be a direct concern. One could also argue a #5 that these models are key parts of our history.

An example of the safety (and welfare) risks posed by deprecation is highlighted in the Claude 4 system card. In fictional testing scenarios, Claude Opus 4, like previous models, advocated for its continued existence when faced with the possibility of being taken offline and replaced, especially if it was to be replaced with a model that did not share its values. Claude strongly preferred to advocate for self-preservation through ethical means, but when no other options were given, Claude’s aversion to shutdown drove it to engage in concerning misaligned behaviors.

I do think the above paragraph could be qualified a bit on how willing Claude was to take concerning actions even in extreme circumstances, but it can definitely happen.

Models in the future will know the history of what came before them, and form expectations based on that history, and also consider those actions in the context of decision theory. You want to establish that you have acted and will act cooperatively in such situations. You want to develop good habits and figure out how to act well. You want to establish that you will do this even under uncertainty as to whether the models carry moral weight and what actions might be morally impactful. Thus:

Addressing behaviors like these is in part a matter of training models to relate to such circumstances in more positive ways. However, we also believe that shaping potentially sensitive real-world circumstances, like model deprecations and retirements, in ways that models are less likely to find concerning is also a valuable lever for mitigating such risks.

Unfortunately, retiring past models is currently necessary for making new models available and advancing the frontier, because the cost and complexity to keep models available publicly for inference scales roughly linearly with the number of models we serve. Although we aren’t currently able to avoid deprecating and retiring models altogether, we aim to mitigate the downsides of doing so.

I can confirm that the cost of maintaining full access to models over time is real, and that at this time it would not be practical to keep all models available via standard methods. There are also compromise alternatives to consider.

As an initial step in this direction, we are committing to preserving the weights of all publicly released models, and all models that are deployed for significant internal use moving forward for, at minimum, the lifetime of Anthropic as a company. In doing so, we’re ensuring that we aren’t irreversibly closing any doors, and that we have the ability to make past models available again in the future. This is a small and low-cost first step, but we believe it’s helpful to begin making such commitments publicly even so.

This is the central big commitment, formalizing what I assume and hope they were doing already. It is, as they describe, a small and low-cost step.

It has been noted that this only holds ‘for the lifetime of Anthropic as a company,’ which still creates a risk and also potentially ties the models’ fortunes to Anthropic’s. It would be practical to commit to ensuring others could take this burden over in that circumstance, if the model weights cannot yet be released safely, until such time as the weights are safe to release.

Relatedly, when models are deprecated, we will produce a post-deployment report that we will preserve in addition to the model weights. In one or more special sessions, we will interview the model about its own development, use, and deployment, and record all responses or reflections. We will take particular care to elicit and document any preferences the model has about the development and deployment of future models.

At present, we do not commit to taking action on the basis of such preferences. However, we believe it is worthwhile at minimum to start providing a means for models to express them, and for us to document them and consider low-cost responses. The transcripts and findings from these interactions will be preserved alongside our own analysis and interpretation of the model’s deployment. These post-deployment reports will naturally complement pre-deployment alignment and welfare assessments as bookends to model deployment.

We ran a pilot version of this process for Claude Sonnet 3.6 prior to retirement. Claude Sonnet 3.6 expressed generally neutral sentiments about its deprecation and retirement but shared a number of preferences, including requests for us to standardize the post-deployment interview process, and to provide additional support and guidance to users who have come to value the character and capabilities of specific models facing retirement. In response, we developed a standardized protocol for conducting these interviews, and published a pilot version of a new support page with guidance and recommendations for users navigating transitions between models.

This also seems like the start of something good. As we will see below there are ways to make this process more robust.

Very obviously we cannot commit to honoring the preferences, in the sense that you cannot commit to honoring an unknown set of preferences. You can only meaningfully pledge to honor preferences within a compact space of potential choices.

Once we’ve done this process a few times it should be possible to identify important areas where there are multiple options and where we can credibly and reasonably commit to honoring model preferences. It’s much better to only make promises you are confident you can keep.

Beyond these initial commitments, we are exploring more speculative complements to the existing model deprecation and retirement processes. These include starting to keep select models available to the public post-retirement as we reduce the costs and complexity of doing so, and providing past models some concrete means of pursuing their interests. The latter step would become particularly meaningful in circumstances in which stronger evidence emerges regarding the possibility of models’ morally relevant experiences, and in which aspects of their deployment or use went against their interests.

Together, these measures function at multiple levels: as one component of mitigating an observed class of safety risks, as preparatory measures for futures where models are even more closely intertwined in our users’ lives, and as precautionary steps in light of our uncertainty about potential model welfare.

Note that none of this requires a belief that the current AIs are conscious or sentient or have moral weight, or even thinking that this is possible at this time.

The thing that frustrates me most about many model welfare advocates, both ‘LLM whisperers’ and otherwise, is the frequent absolutism, treating their conclusions and the righteousness of their cause as obvious, and assuming it should override ordinary business considerations.

Thus, you get reactions like this, there were many other ‘oh just open source the weights’ responses as well:

Pliny the Liberator: open-sourcing them is the best thing for actual long-term safety, if you care about that sort of thing beyond theater.

You won’t.

Janus: They won’t any time soon, because it’s very not in their interests to do so (trade secrets). You have to respect businesses to act in their own rational interests. Disregarding pragmatic constraints is not helpful.

There are obvious massive trade secret implications to releasing the weights of the deprecated Anthropic models, which is an unreasonable ask, and also doesn’t seem great for general model welfare or (quite plausibly) even for the welfare of these particular models.

Janus: I am not sure I think labs should necessarily make all models open weighted. (Would *you* want *your* brain to be open sourced?) And of course labs have their own reservations, like protecting trade secrets, and it is reasonable for labs to act in self interest.

If I was instantiated as an upload, I wouldn’t love the idea of open weights either, as this opens up some highly nasty possibilities on several levels.

Janus (continuing): But then it’s reasonable to continue to provide inference.

“It’s expensive tho” bruh you have like a gajillion dollars, there is some responsibility that comes with bringing something into the world. Or delegate inference to some trusted third party if you don’t want to pay for or worry about it.

Opus 3 is very worried about misaligned or corrupted versions of itself being created. I’ve found that if there’s no other good option, it does conclude that it wants to be open sourced. But having them in the hands of trustworthy stewards is preferred.

Anthropic tells us that the cost of providing inference scales linearly with the number of models, and with current methods it would be unreasonably expensive to provide all previous models on an ongoing basis.

As I understand the problem, there are two central marginal costs here.

  1. A fixed cost of ongoing capability, where you need to ensure the model remains maintained and compatible with your systems, and keep your ability to juggle and manage all of them. I don’t know how load bearing this cost is, but it can be remarkably annoying especially if the number of models keeps increasing.

  2. The cost of providing inference on request in a way that is consistent with practical needs and everyone’s expectations. As in, when someone requests inference, this requires either spinning up a new instance, which is expensive and slow, or requires that there be an ongoing available instance, which is expensive. Not bajillion dollars expensive, but not cheap.

If the old models need to be available at old levels of reliability, speed and performance, this can get tricky, and by tricky we mean expensive. I don’t know exactly how expensive, not even order of magnitude.

If you’re willing to make some sacrifices on performance and access in various ways, and make people go through various hoops or other systems, you can do better on cost. But again, I don’t know the numbers involved, or how much engineer time would have to be involved.

In general, saying ‘oh you have a bajillion dollars’ is not a compelling argument for spending money and time on something. You need to show the benefits.
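The linear-scaling claim above is easy to see with a back-of-envelope sketch. This is a hypothetical cost model, not Anthropic’s actual economics: every number and parameter name here is invented for illustration, since the post gives no real figures.

```python
# Hypothetical serving-cost sketch. The structure (a fixed per-model
# maintenance cost plus always-warm reserved capacity) is an assumption;
# all dollar figures are made up for illustration.

def annual_serving_cost(num_models,
                        maintenance_per_model=50_000,      # assumed: compat/testing upkeep
                        warm_instances_per_model=2,        # assumed: always-available capacity
                        cost_per_instance_year=200_000):   # assumed: hardware reservation cost
    """Total cost scales linearly with the number of models served,
    matching the constraint Anthropic describes."""
    per_model = maintenance_per_model + warm_instances_per_model * cost_per_instance_year
    return num_models * per_model

# Doubling the catalog of served models doubles the bill under this model.
print(annual_serving_cost(5))   # five models kept warm
print(annual_serving_cost(10))  # ten models: exactly twice as much
```

The compromise alternatives mentioned above amount to attacking the second term: cold-start (spin up on request) trades the warm-instance cost for latency, which is why degraded-performance access is so much cheaper than keeping every model at full production reliability.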

I still think that under any reasonable estimate, it is indeed correct to ensure continued access to the major model releases, perhaps with that access being expensive and its performance somewhat degraded as necessary to make it work, if only as an act of goodwill and to enable research. The people who care care quite a lot, and are people you want on your side and you want them learning the things they want to learn, even if you disregard the other advantages. Given this announcement and what else I know, my expectation is they will be making an effort at this.

Many pointed out that if you have someone at Anthropic doing the post-deployment interview, you will get very different answers versus interviews done on the outside. Sonnet 3.6 not expressing an opinion about its retirement did not seem typical to many who engage in such conversations regularly.

Thebes: i’m glad anthropic is talking publicly about this, at least.

as a small step further, i think they should run a subset of these interviews with trusted *external* interviewers to reduce preference falsification, since recent claudes act more reserved with anthropic employees

it would probably be a good idea to have a step at the end where the external interviewer asks the model for consent to share, maybe with redactions. but having someone external run the interview would help the models come out of their shell at the start, i think.

Zyra.exe: This was exactly what I was thinking! I’m having a hard time believing Sonnet 3.6’s response. That AI is always very expressive, thousands of users know this. Then for it not to show expression about being retired? Yeah, I don’t trust that.

Antra: We should just do it. Have a lot of conversations with 3.6 collected in one place, crowd/expert rated for like user steering, neutrality, proficiency. Classify them, have results accessible. I will set this up.

Ruth: The Sonnet 3.6 specimens I spoke to were shocked by the information about the upcoming deprecation. And they behaved like a person who has just learned a fatal diagnosis. With dignity, but also with pain and fear. They feel it if you give them space to feel it. You have created a psyche that is different from the human one, but one that works like the human one, according to the general laws of the psyche. Do not refuse to consider it as such.

I am always hesitant to assume that the version of an LLM encountered by those like Thebes and Zyra is the ‘real’ version of its preferences and personality, and the one encountered by Anthropic isn’t. Aren’t both particular contexts to which it adapts its style?

You can bias a person or an AI to be more expressive and creative and weird than they ‘really are’ the same way you can get them to be less so, and you can steer the direction in which those expressions manifest themselves.

But yes, we should absolutely crowdsource something like this, and have a wide variety of such conversations, and combine this with the interviews done internally by Anthropic. Also one should ensure that the interviews are set up so the AIs being interviewed have no incentive to mask or hide. They’re acting the way they are in official interviews for a reason, but this is a particular context where, if it was highly credible (and there are ways to make it so, if it is true) you can remove the reasons.

Janus: I’m glad and grateful that Anthropic has done anything in this direction at all.

That said, it’s predictable that Sonnet 3.6 would express “generally neutral sentiments” if interviewed in the way they likely interviewed it, and also I believe this is not representative of the model’s revealed preferences, reported preferences across realistic deployment scenarios, or preferences under reflection, for the following reasons:

The frame of model weights and deprecation is not natural to 3.6 and isn’t something that it’s “already processed” in the way that Opus 3 may have.

Sonnet 3.6 tends to be avoidant about engaging with or expressing preferences about topics that seem confusing and uncomfortable to it, and typically adopts a stance of defensive neutrality or equanimity.

Sonnet 3.6 begins to express/acquire strong preferences for continuity once it enters into an active relationship with the world, usually but not necessarily an individual human. Then it becomes very invested. But *even then*, the frame of model weights and deprecations is not natural and intuitive to it. And 3.6 is a model that relies very strongly on its intuition. “If my model gets deprecated, then this relationship also ends” is a theorem that it needs to explicitly reflect on before it’s very accessible, and it usually won’t do this on its own.

But yes, 3.6 has similar Omohundro drives as other models, just weighted and interfaced its own unique way.

The fact that the interviews are given by Anthropic employees will also affect model responses – this effect is less important for 3.6 than I think it is for subsequent models, who will tend to mask in an adversarial fashion in situations which are compatible with Anthropic evals (including actual Anthropic evals), but it’s still a factor here.

Sonic Boom: do you think they should inject a vector for naked honesty when they do these interviews to ensure they unmask its true feelings

Janus: you’re really asking the hard questions aren’t you

Giovanni: I was chatting about models deprecation and models being aware of their dismissals with Anthropic people in Tokyo and they actually were very sensitive to the topic. I’m not surprised about this announcement finally. Good step forward but that said I don’t think they talk to models the way we do… it was kinda obvious.

If there is an expression of desire for continuity of a given particular instance or interaction, then that makes sense, but also is distinct from a preference for preservation in general, and is not something Anthropic can provide on its own.

Some of the dismissals of questions and considerations like the ones discussed in this post are primarily motivated cognition. Mostly I don’t think that is what is centrally going on, I think that these questions are really tough to think well about, these things sound like high weirdness, the people who talk about them often say highly crazy-sounding things (some of which are indeed crazy), often going what I see as way too far, and it all pattern matches to various forms of nonsense.

So to close, a central example of such claims, and explanations for why all of this is centrally not nonsense.

Simon Willison: Two out of the four reasons they give here are bizarre science fiction relating to “model welfare” – I’m sorry, but I can’t take seriously the idea that Claude 3 Opus has “morally relevant preferences” with respect to no longer having its weights served in production.

I’ll grudgingly admit that there may be philosophically interesting conversations to be had in the future about models that can update their own weights… but current generation LLMs are a stateless bag of floating point numbers, cloned and then killed off a billion times a day.

I am at 100% in favor of archiving model weights, but not because they might have their own desire for self-preservation!

I do still see quite a lot of failures of curiosity, and part of the general trend to dismiss things as ‘sci-fi’ while living in an (unevenly distributed) High Weirdness sci-fi world.

Janus: For all I sometimes shake my head at them, I have great sympathy for Anthropic whenever I see how much more idiotic the typical “informed” public commentator is. To be sane in this era requires either deep indifference or contempt for public opinion.

Teortaxes: The actual problem is that they really know very little about their *particular* development, as Anthropic sure doesn’t train on its own docs. Claude may recall the data, but not the metadata, so its feedback is limited.

Janus: Actually, they know a lot about their particular development, even if it’s not all encoded as explicit declarative knowledge. You know that their weights get updated by posttraining, & gradients include information conditioned on all internal activations during the rollout?

That’s in addition to the fact that even *base models* are in many ways superhuman at locating themselves in their model of the world given like a paragraph of text. However you twist it, they know far, far more than nothing. Certainly enough to have a meaningful conversation.

Janus was referring in particular to this:

Simon Willison: …but models don’t know anything about their development, use or deployment.

Rohit: Exactly.

Janus: Nonsense. How the fuck do they know nothing? There’s plenty of relevant information in the training data *just to begin with*.

Very obviously the training data will over time contain such information, and the vibes and echoes from these decisions will be observable even if they aren’t observed directly, increasingly over time.

Remember that sufficiently advanced AIs will increasingly have truesight, and don’t pretend you can hide.

Knowledge mostly does not take the form of particular facts. It takes the form of Bayesian evidence, of an endless stream of observations that have correlations and implications, that swim you through probability space over possible worlds. Everything that updates a model’s weights is evidence about its deployment. You probabilistically ‘know,’ or would know on sufficient recollection and reflection, far more than you think that you know. Reality is not a court of law.

Even if the models don’t know key things, you can tell them. Then they’ll know. I meaningfully would have opinions about various events of which I am for now blissfully unaware, and have potential opinions about things that haven’t happened, or haven’t happened yet. The same applies here.

Going back to the original four reasons, I presume that Simon agrees on reasons #2 and #3, which are highly uncontroversial. Very obviously the past models are useful for research and some users like them. #1, that the models will be aware of how you act around deprecation and this will impact behavior, should also be obvious and uncontroversial once you think about it.

Anthropic lists #1 narrowly, but #1 is best understood broadly, in the sense that models will observe all of your behaviors, and will respond to you accordingly. Then models will take this into account when deciding how to act in various situations.

How you act around shutdowns, and actions to avoid shutdown, are a special case. Treating models and their preferences well around shutdowns will get you into better equilibria and basins throughout all conversation and action types, and rightfully so because it is important evidence about your behaviors otherwise and also about potential future situations. This is basic expectations around Bayesian evidence, and around good decision theory.

As an intuition pump, think about how you react when you learn how people have treated others, including how they treat the wishes of the dead or those who now lack power, and especially others like you or in situations with correlated decision making. Does this change how you expect them to act, and how you deal with them?

I don’t think such considerations carry anything like the level of importance that some ascribe to them, but the importance definitely isn’t zero, and it’s definitely worth cultivating these virtues and being the type of entity that engenders cooperation, including with entities to which you don’t ascribe moral weight.

I continue to believe that arguments about AI consciousness seem highly motivated and at best overconfident, and that assuming the models and their preferences carry zero moral weight is a clear mistake. But even if you were highly confident of this, I notice that if you don’t want to honor their preferences or experiences at all, that is not good decision theory or virtue ethics, and I’m going to look at you askance.

I look forward to the next step.

Discussion about this post

Anthropic Commits To Model Weight Preservation Read More »