Author name: Beth Washington


Republican plan would make deanonymization of census data trivial


“Differential privacy” algorithm prevents statistical data from being tied to individuals.

President Donald Trump and the Republican Party have spent the better part of the president’s second term radically reshaping the federal government. But in recent weeks, the GOP has set its sights on taking another run at an old target: the US census.

Since the first Trump administration, the right has sought to add a question to the census that captures a respondent’s immigration status and to exclude noncitizens from the tallies that determine how seats in Congress are distributed. In 2019, the Supreme Court struck down an attempt by the first Trump administration to add a citizenship question to the census.

But now, a little-known algorithmic process called “differential privacy,” created to keep census data from being used to identify individual respondents, has become the right’s latest focus. WIRED spoke to six experts about the GOP’s ongoing effort to falsely allege that a system created to protect people’s privacy has made the data from the 2020 census inaccurate.

If successful, the campaign to get rid of differential privacy could not only radically change the kind of data made available, but could put the data of every person living in the US at risk. The campaign could also discourage immigrants from participating in the census entirely.

The Census Bureau regularly publishes anonymized data so that policymakers and researchers can use it. That data is also sensitive: Conducted every 10 years, the census counts every person living in the United States, citizen and noncitizen alike. The data includes detailed information like respondents’ race, sex, and age, as well as the languages they speak, their home address, economic status, and the number of people living in a house. This data is used for allocating the federal funds that support public services like schools and hospitals, as well as for determining how a state’s population is divided up and represented in Congress. The more people in a state, the more congressional representation—and the more votes in the Electoral College.

As computers got increasingly sophisticated and data more abundant and accessible, census employees and researchers realized the data published by the Census Bureau could be reverse engineered to identify individual people. Under Title 13 of the US Code, it is illegal for census workers to publish any data that would identify individual people, their homes, or businesses. A government employee who reveals this kind of information can be punished with thousands of dollars in fines or even a prison sentence.

For individuals, this could mean, for instance, that someone could use census data published without differential privacy to identify transgender youth, according to research from the University of Washington.

For immigrants, the prospect of being reidentified through census data could “create panic among noncitizens as well as their families and friends,” says Danah Boyd, a census expert and the founder of Data & Society, a nonprofit research group focused on the downstream effects of technology. LGBTQ+ people might not “feel safe sharing that they are in a same-sex marriage. There are plenty of people in certain geographies who do not want data like this to be public,” she says. This could also mean that information that might be available only through something like a search warrant would suddenly be obtainable. “Unmasking published records is not illegal. Then you can match it to large law enforcement databases without actually breaching the law.”

A need for noise

Differential privacy keeps that data private. It’s a mathematical framework whereby a statistical output can’t be used to determine any individual’s data in a dataset, and the bureau’s algorithm for differential privacy is called TopDown. It injects “noise” into the data starting at the highest level (national), moving progressively downward. There are certain constraints placed around the kind of noise that can be introduced—for instance, the total number of people in a state or census block has to remain the same. But other demographic characteristics, like race or gender, are randomly reassigned to individual records within a set tranche of data. This way, the overall number of people with a certain characteristic remains constant, while the characteristics associated with any one record don’t describe an individual person. In other words, you’ll know how many women or Hispanic people are in a census block, just not exactly where.
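
To make the idea concrete, here is a minimal sketch, in Python, of the general approach: add calibrated random noise to small-area counts while forcing a larger total to stay fixed. It is only an illustration of the concept, not the bureau’s actual TopDown algorithm, which uses a different noise mechanism, a full geographic hierarchy, and constrained optimization; the function name, the epsilon value, and the rescaling step are hypothetical simplifications.

```python
# Illustrative sketch only -- NOT the Census Bureau's TopDown implementation.
# Adds Laplace noise to hypothetical block-level counts, then rescales so the
# state total stays (approximately) invariant. The real system uses a
# different noise distribution and solves a constrained optimization instead.
import numpy as np

def noisy_counts(block_counts, epsilon=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    true_total = sum(block_counts)                    # invariant to preserve
    noisy = [c + rng.laplace(0.0, 1.0 / epsilon) for c in block_counts]
    noisy = [max(0.0, n) for n in noisy]              # no negative people
    total = sum(noisy)
    if total == 0:
        return [0] * len(noisy)
    scale = true_total / total
    return [round(n * scale) for n in noisy]          # blocks are fuzzy, total is not

blocks = [120, 37, 5, 88]      # hypothetical per-block populations
print(noisy_counts(blocks))    # noisy per-block counts; sum stays at or near the true 250
```

A smaller epsilon means more noise and stronger privacy protection; a larger epsilon means more accurate small-area counts.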

“Differential privacy solves a particular problem, which is if you release a lot of information, a lot of statistics, based on the same set of confidential data, eventually somebody can piece together what that confidential data had to be,” says Simson Garfinkel, former senior computer scientist for confidentiality and data access at the Census Bureau.

Differential privacy was first used on data from the 2020 census. Even though one couldn’t identify a specific individual from the data, “you can still get an accurate count on things that are important for funding and voting rights,” says Moon Duchin, a mathematics professor at Tufts University who worked with census data to inform electoral maps in Alabama. The first use of differential privacy for the census happened under the Trump presidency, though the reports themselves were published after he left office. Civil servants, not political appointees, are the ones responsible for determining how census data is collected and analyzed. Emails obtained by the Brennan Center later showed that the Commerce Department, which oversees the Census Bureau and was then led by Secretary Wilbur Ross, expressed an “unusually high degree” of interest in the “technical matters” of the process, which Ron Jarmin, the bureau’s deputy director and COO, called “unprecedented.”

It’s this data from the 2020 census that Republicans have taken issue with. On August 21, the Center for Renewing America, a right-wing think tank founded by Russ Vought, currently the director of the US Office of Management and Budget, published a blog post alleging that differential privacy “may have played a significant role in tilting the political scales favorably toward Democrats for apportionment and redistricting purposes.” The post goes on to acknowledge that, even if a citizenship question were added to the census—which Trump attempted during his first administration—the differential privacy “algorithm will be able to mask characteristic data, including citizenship status.”

Duchin and other experts who spoke to WIRED say that differential privacy does not change apportionment, or how seats in Congress are distributed—several red states, including Texas and Florida, gained representation after the 2020 census, while blue states like California lost representatives.

COUNTing the cost

On August 28, Republican Representative August Pfluger introduced the COUNT Act. If passed, it would add a citizenship question to the census and force the Census Bureau to “cease utilization of the differential privacy process.” Pfluger’s office did not immediately respond to a request for comment.

“Differential privacy is a punching bag that’s meant here as an excuse to redo the census,” says Duchin. “That is what’s going on, if you ask me.”

On October 6, Senator Jim Banks, a Republican from Indiana, sent a letter to Secretary of Commerce Howard Lutnick, urging him to “investigate and correct errors from the 2020 Census that handed disproportionate political power to Democrats and illegal aliens.” The letter goes on to allege that the use of differential privacy “alters the total population of individual voting districts.” Similar to the COUNT Act and the Renewing America post, the letter also states that the 2030 Census “must request citizenship status.”

Peter Bernegger, a Wisconsin-based “election integrity” activist who is facing a criminal charge of simulating the legal process for allegedly falsifying a subpoena, amplified Banks’ letter on X, alleging that the use of differential privacy was part of “election rigging by the Obama/Biden administrations.” Bernegger’s post was viewed more than 236,000 times.

Banks’ office and Bernegger did not immediately respond to a request for comment.

“No differential privacy was ever applied to the data used to apportion the House of Representatives, so the claim that seats in the House were affected is simply false,” says John Abowd, former associate director for research and methodology and chief scientist at the United States Census Bureau. Abowd oversaw the implementation of differential privacy while at the Census Bureau. He says that the data from the 2020 census has been successfully used by red and blue states, as well as redistricting commissions, and that the only difference from previous census data was that no one would be able to “reconstruct accurate, identifiable individual data to enhance the other databases that they use (voter rolls, drivers licenses, etc.).”

With a possible addition of the citizenship question, proposed by both Banks and the COUNT Act, Boyd says that census data would be even more sensitive, because that kind of information is not readily available in commercial data. “Plenty of data brokers would love to get their hands on that data.”

Shortly after Senator Banks published his letter, Abowd found himself in the spotlight. On October 9, the X account @amuse posted a blog-length post alleging that Abowd was the bureaucrat who “stole the House.” The post also alleged, without evidence, that the census results meant that “Republican states are projected to lose almost $90 billion in federal funds across the decade as a result of the miscounts. Democratic states are projected to gain $57 billion.” The account has more than 666,000 followers, including billionaire Elon Musk, venture capitalist Marc Andreessen, and US pardon attorney Ed Martin. (Abowd told WIRED he was “keeping an eye” on the post, which was viewed more than 360,000 times.) That same week, America First Legal, the conservative nonprofit founded by now deputy chief of staff for policy Stephen Miller, posted about a complaint the group had recently filed in Florida, challenging the 2020 census results, alleging they were based upon flawed statistical methods, one of which was differential privacy.

The results of all this, experts tell WIRED, are that fewer people will feel safe participating in the census and that the government will likely need to spend even more resources to try to get an accurate count. Undercounting could lead to skewed numbers that could impact everything from congressional representation to the amount of funding a municipality might receive from the government.

Neither the proposed COUNT Act nor Senator Banks’ letter outlines an alternative to differential privacy. This means that the Census Bureau would likely be left with two options: Publish data that could put people at risk (which could lead to legal consequences for its staff), or publish less data. “At present, I do not know of any alternative to differential privacy that can safeguard the personal data that the US Census Bureau uses in their work on the decennial census,” says Abraham Flaxman, an associate professor of health metrics sciences at the University of Washington, whose team conducted the study on transgender youth.

Getting rid of differential privacy is not a “light thing,” says a Census Bureau employee familiar with the bureau’s privacy methods, who requested anonymity because they were not authorized to speak to the press. “It may be for the layperson. But the entire apparatus of disclosure avoidance at the bureau has been geared for the last almost 10 years on differential privacy.” According to the employee, there is no immediately clear method to replace differential privacy.

Boyd says that the safest bet would simply be “what is known as suppression, otherwise known as ‘do not publish.’” (This, according to Garfinkel, was the backup plan if differential privacy had not been implemented for the 2020 census.)

Another option would be for the Census Bureau to publish only population counts, meaning that demographic information like the race or age of respondents would be left out. “This is a problem, because we use census data to combat discrimination,” says Boyd. “The consequences of losing this data is not being able to pursue equity.”

This story originally appeared on wired.com.


Melissa strikes Jamaica, tied as most powerful Atlantic storm to come ashore

Hurricane Melissa made landfall in southwestern Jamaica, near New Hope, on Tuesday at 1 pm ET with staggeringly powerful sustained winds of 185 mph.

In the National Hurricane Center update noting the precise landfall time and location, specialist Larry Kelly characterized Melissa as an “extremely dangerous and life-threatening” hurricane. Melissa is bringing very heavy rainfall, damaging surge, and destructive winds to the small Caribbean island that is home to about 3 million people.

The effects on the island are sure to be catastrophic and prolonged.

A record-breaking hurricane by any measure

By any measure Melissa is an extraordinary and catastrophic storm.

By strengthening overnight, and then maintaining its incredible intensity of 185 mph, Melissa has tied the Labor Day Hurricane of 1935 as the most powerful hurricane to strike a landmass in the Atlantic Basin, which includes the United States, Mexico, Central America, and the Caribbean islands.

Melissa also tied the Labor Day storm, which struck the Florida Keys, as the most intense storm at landfall as measured by central pressure, 892 millibars.

Overall, Melissa is tied as the second-strongest hurricane, measured by winds, ever observed in the Atlantic basin, behind only Hurricane Allen and its 190 mph winds in 1980. Only Hurricanes Wilma (882 millibars) and Gilbert (888 millibars) have recorded lower pressures at sea.


AMD shores up its budget laptop CPUs by renaming more years-old silicon

That leaves AMD with four distinct branding tiers for laptop processors: the Ryzen AI 300 series, which uses all of the company’s latest silicon and supports Windows 11’s Copilot+ features; the Ryzen 200 series for processors originally launched in mid to late 2023 as Ryzen 7040 and Ryzen 8040; Ryzen 100 for Rembrandt-R chips first launched in 2022; and then a smattering of two-digit Ryzen and Athlon brand names for Mendocino chips.

These chips are still capable of providing a decent Windows (or Linux) experience for budget PC buyers—we were big fans of the Ryzen 6000 in particular back in the fall of 2022. But the practice of giving old chips updated labels continues to feel somewhat disingenuous, and it means that users who do want AMD’s latest CPU and GPU architectures (or neural processing units, for Copilot+ PC features) will continue to pay a premium for them.

If you want to squint hard and see an upside to this for PC buyers, it’s that if you can get a good deal on a refurbished or clearance PC using Ryzen 6000, Ryzen 7035, or Ryzen 7020 chips, you’re still technically getting the latest and greatest processors that AMD is willing to sell you. The issue, as always, is that stacking more brand names on top of old processors makes it that much more difficult to make an informed buying decision.


10M people watched a YouTuber shim a lock; the lock company sued him. Bad idea.


It’s still legal to pick locks, even when you swing your legs.

“Opening locks” might not sound like scintillating social media content, but Trevor McNally has turned lock-busting into online gold. A former US Marine Staff Sergeant, McNally today has more than 7 million followers and has amassed more than 2 billion views just by showing how easy it is to open many common locks by slapping, picking, or shimming them.

This does not always endear him to the companies that make the locks.

On March 3, 2025, a Florida lock company called Proven Industries released a social media promo video just begging for the McNally treatment. The video was called, somewhat improbably, “YOU GUYS KEEP SAYING YOU CAN EASILY BREAK OFF OUR LATCH PIN LOCK.” In it, an enthusiastic man in a ball cap says he will “prove a lot of you haters wrong.” He then goes hard at Proven’s $130 model 651 trailer hitch lock with a sledgehammer, bolt cutters, and a crowbar.

Naturally, the lock hangs tough.

An Instagram user brought the lock to McNally’s attention by commenting, “Let’s introduce it to the @mcnallyofficial poke.” Someone from Proven responded, saying that McNally only likes “the cheap locks lol because they are easy and fast.” Proven locks were said to be made of sterner stuff.

But on April 3, McNally posted a saucy little video to social media platforms. In it, he watches the Proven promo video while swinging his legs and drinking a Juicy Juice. He then hops down from his seat, goes over to a Proven trailer hitch lock, and opens it in a matter of seconds using nothing but a shim cut from a can of Liquid Death. He says nothing during the entire video, which has been viewed nearly 10 million times on YouTube alone.

Although Proven Industries had practically begged people to attempt this, company owner Ron Lee contacted McNally on Instagram. “Just wanted to say thanks and be prepared!” he wrote. McNally took this as a threat.

(Oddly enough, Proven’s own homepage features a video in which the company trashes competing locks and shows just how easy it is to defeat them. And its news pages contain articles and videos on “The Hidden Flaws of Master Locks” and other brands. Why it got so upset about McNally’s video is unclear.)

The next day, Lee texted McNally’s wife. The message itself was apparently Lee’s attempt to de-escalate things; he says he thought the number belonged to McNally, and the message itself was unobjectionable. But after the “be prepared!” notice of the day before, and given the fact that Lee already knew how to contact him on Instagram, McNally saw the text as a way “to intimidate me and my family.” That feeling was cemented when McNally found out that Lee was a triple felon—and that in one case, Lee had hired someone “to throw a brick through the window of his ex-wife.”

Concerned about losing business, Lee kept trying to shut McNally down. Proven posted a “response video” on April 6 and engaged with numerous social media commenters, telling them that things were “going to get really personal” for McNally. Proven employees alleged publicly that McNally was deceiving people about all the prep work he had done to make a “perfectly cut out” shim. Without extensive experience, long prep work, and precise measurements, it was said, Proven’s locks were in little danger of being opened by rogue actors trying to steal your RV.

“Sucks to see how many people take everything they see online for face value,” one Proven employee wrote. “Sounds like a bunch of liberals lol.”

Proven also had its lawyers file “multiple” DMCA takedown notices against the McNally video, claiming that its use of Proven’s promo video was copyright infringement.

McNally didn’t bow to the pressure, though, instead uploading several more videos showing him opening Proven locks. In one of them, he takes aim at Proven’s claims about his prep work by retrieving a new lock from an Amazon delivery kiosk, taking it outside—and popping it in seconds using a shim he cuts right on camera, with no measurements, from an aluminum can.


On May 1, Proven filed a federal lawsuit against McNally in the Middle District of Florida, charging him with a huge array of offenses: (1) copyright infringement, (2) defamation by implication, (3) false advertising, (4) violating the Florida Deceptive and Unfair Trade Practices Act, (5) tortious interference with business relationships, (6) unjust enrichment, (7) civil conspiracy, and (8) trade libel. Remarkably, the claims stemmed from a video that all sides admit was accurate and in which McNally himself said nothing.

Screenshot of a social media exchange.

In retrospect, this was probably not a great idea.

Don’t mock me, bro

How can you defame someone without even speaking? Proven claimed “defamation by implication,” arguing that the whole setup of McNally’s videos was unfair to the company and its product. McNally does not show his prep work, which (Proven argued) conveys to the public the false idea that Proven’s locks are easy to bypass. While the shimming does work, Proven argued that it would be difficult for an untrained user to perform.

But what Proven really, really didn’t like was being mocked. McNally’s decision to drink—and shake!—a juice box on video comes up in court papers a mind-boggling number of times. Here’s a sample:

McNally appears swinging his legs and sipping from an apple juice box, conveying to the purchasing public that bypassing Plaintiff’s lock is simple, trivial, and even comical…

…showing McNally drinking from, and shaking, a juice box, all while swinging his legs, and displaying the Proven Video on a mobile device…

The tone, posture, and use of the juice box prop and childish leg swinging that McNally orchestrated in the McNally Video was intentional to diminish the perceived seriousness of Proven Industries…

The use of juvenile imagery, such as sipping from a juice box while casually applying the shim, reinforces the misleading impression that the lock is inherently insecure and marketed deceptively…

The video then abruptly shifts to Defendant in a childlike persona, sipping from a juice box and casually applying a shim to the lock…

In the end, Proven argued that the McNally video was “for commercial entertainment and mockery,” produced for the purpose of “humiliating Plaintiff.” McNally, it was said, “will not stop until he destroys Proven’s reputation.” Justice was needed. Expensive, litigious justice.

But the proverbially level-headed horde of Internet users does not always love it when companies file thermonuclear lawsuits against critics. Sometimes, in fact, the level-headed horde disregards everything taught by that fount of judicial knowledge, The People’s Court, and they take the law into their own hands.

Proven was soon the target of McNally fans. The company says it was “forced to disable comments on posts and product videos due to an influx of mocking and misleading replies furthering the false narrative that McNally conveyed to the viewers.” The company’s customer service department received such an “influx of bogus customer service tickets… that it is experiencing difficulty responding to legitimate tickets.”

Screenshot of a social media post from Proven Industries.

Proven was quite proud of its lawsuit… at first.

Someone posted Lee’s personal phone number to the comment section of a McNally video, which soon led to “a continuous stream of harassing phone calls and text messages from unknown numbers at all hours of the day and night,” which included “profanity, threats, and racially charged language.”

Lest this seem like mere high spirits and hijinks, Lee’s partner and his mother both “received harassing messages through Facebook Messenger,” while other messages targeted Lee’s son, saying things like “I would kill your f—ing n—– child” and calling him a “racemixing pussy.”

This is clearly terrible behavior; it also has no obvious connection to McNally, who did not direct or condone the harassment. As for Lee’s phone number, McNally said that he had nothing to do with posting it and wrote that “it is my understanding that the phone number at issue is publicly available on the Better Business Bureau website and can be obtained through a simple Google search.”

And this, with both sides palpably angry at each other, is how things stood on June 13 at 9:09 am, when the case got a hearing in front of the Honorable Mary Scriven, an extremely feisty federal judge in Tampa. Proven had demanded a preliminary injunction that would stop McNally from sharing his videos while the case progressed, but Proven had issues right from the opening gavel:

LAWYER 1: Austin Nowacki on behalf of Proven Industries.

THE COURT: I’m sorry. What is your name?

LAWYER 1: Austin Nowacki.

THE COURT: I thought you said Austin No Idea.

LAWYER 2: That’s Austin Nowacki.

THE COURT: All right.

When Proven’s lead lawyer introduced a colleague who would lead that morning’s arguments, the judge snapped, “Okay. Then you have a seat and let her speak.”

Things went on this way for some time, as the judge wondered, “Did the plaintiff bring a lock and a beer can?” (The plaintiff did not.) She appeared to be quite disappointed when it was clear there would be no live shimming demonstration in the courtroom.

Then it was on to the actual arguments. Proven argued that the 15 seconds of its 90-second promo video used by McNally were not fair use, that McNally had defamed the company by implication, and that shimming its locks was actually quite difficult. Under questioning, however, one of Proven’s employees admitted that he had been able to duplicate McNally’s technique, leading to the question from McNally’s lawyer: “When you did it yourself, did it occur to you for one moment that maybe the best thing to do, instead of file a lawsuit, was to fix [the lock]?”

At the end of several hours of wrangling, the judge stepped in, saying that she “declines to grant the preliminary injunction motion.” For her to do so, Proven would have to show that it was likely to win at trial, among other things; it had not.

As for the big copyright infringement claim, of which Proven had made so much hay, the judge reached a pretty obvious finding: You’re allowed to quote snippets of copyrighted videos in order to critique them.

“The purpose and character of the use to which Mr. McNally put the alleged infringed work is transformative, artistic, and a critique,” said the judge. “He is in his own way challenging and critiquing Proven’s video by the use of his own video.”

As for the amount used, it was “substantial enough but no more than is necessary to make the point that he is trying to critique Proven’s video, and I think that’s fair game and a nominative fair use circumstance.”

While Proven might convince her otherwise after a full trial, “the copyright claim fails as a basis for a demand for preliminary injunctive relief.”

As for “tortious interference” and “defamation by implication,” the judge was similarly unimpressed.

“The fact that you might have a repeat customer who is dissuaded to buy your product due to a criticism of the product is not the type of business relationship the tortious interference with business relationship concept is intended to apply,” she said.

In the end, the judge said she would see the case through to its end, if that was really what everyone wanted, but “I will pray that you all come to a resolution of the case that doesn’t require all of this. This is a capitalist market and people say what they say. As long as it’s not false, they say what they say.”

She gave Proven until July 7 to amend its complaint if it wished.

On July 7, the company dismissed the lawsuit against McNally instead.

Proven also made a highly unusual request: Would the judge please seal almost the entire court record—including the request to seal?

Court records are presumptively public, but Proven complained about a “pattern of intimidation and harassment by individuals influenced by Defendant McNally’s content.” According to the company, a key witness had already backed out of the case, saying, “Is there a way to leave my name and my companies name out of this due to concerns of potential BLOW BACK from McNally or others like him?” Another witness, who did submit a declaration, wondered, “Is this going to be public? My concern is that there may be some backlash from the other side towards my company.”

McNally’s lawyer laid into this seal request, pointing out that the company had shown no concern over these issues until it lost its bid for a preliminary injunction. Indeed, “Proven boasted to its social media followers about how it sued McNally and about how confident it was that it would prevail. Proven even encouraged people to search for the lawsuit.” Now, however, the company “suddenly discover[ed] a need for secrecy.”

The judge has not yet ruled on the request to seal.

Another way

The strange thing about the whole situation is that Proven actually knew how to respond constructively to the first McNally video. Its own response video opened with a bit of humor (the presenter drinks a can of Liquid Death), acknowledged the issue (“we’ve had a little bit of controversy in the last couple days”), and made clear that Proven could handle criticism (“we aren’t afraid of a little bit of feedback”).

The video went on to show how their locks work and provided some context on shimming attacks and their likelihood of real-world use. It ended by showing how users concerned about shimming attacks could choose more expensive but more secure lock cores that should resist the technique.

Quick, professional, non-defensive—a great way to handle controversy.

But it was all blown apart by the company’s angry social media statements, which were unprofessional and defensive, and the litigation, which was spectacularly ill-conceived as a matter of both law and policy. In the end, the case became a classic example of the Streisand Effect, in which the attempt to censor information can instead call attention to it.

Judging from the number of times the lawsuit talks about 1) ridicule and 2) harassment, it seems like the case quickly became a personal one for Proven’s owner and employees, who felt either mocked or threatened. That’s understandable, but being mocked is not illegal and should never have led to a lawsuit or a copyright claim. As for online harassment, it remains a serious and unresolved issue, but launching a personal vendetta—and on pretty flimsy legal grounds—against McNally himself was patently unwise. (Doubly so given that McNally had a huge following and had already responded to DMCA takedowns by creating further videos on the subject; this wasn’t someone who would simply be intimidated by a lawsuit.)

In the end, Proven’s lawsuit likely cost the company serious time and cash—and generated little but bad publicity.


The Android-powered Boox Palma 2 Pro fits in your pocket, but it’s not a phone


For years, color E Ink was seen as a desirable feature, which would make it easier to read magazines and comics on low-power devices—Boox even has an E Ink monitor. However, the quality of the displays has been lacking. These screens do show colors, but they’re not as vibrant as what you get on an LCD or OLED. In the case of the Palma 2 Pro, the screen is also less sharp in color mode. The touchscreen display is 824 × 1648 in monochrome, but turning on color cuts that in half to 412 × 824.

In addition to the new screen, the second-gen Palma adds a SIM card slot. It’s not for phone calls, though. The SIM slot allows the device to get 5G mobile data in addition to Wi-Fi.


The Palma 2 Pro runs Android 15 out of the box. That’s a solid showing for Boox, which often uses much older builds of Google’s mobile OS. Upgrades aren’t guaranteed, and there’s no official support for Google services. However, Boox has a workaround for its devices so the Play Store can be installed.

The new Boox pocket reader is available for pre-order now at $400. It’s expected to ship around November 14.


Lawsuit: Reddit caught Perplexity “red-handed” stealing data from Google results


Scraper accused of stealing Reddit content “shocked” by lawsuit.

In a lawsuit filed on Wednesday, Reddit accused an AI search engine, Perplexity, of conspiring with several companies to illegally scrape Reddit content from Google search results, allegedly dodging anti-scraping methods that require substantial investments from both Google and Reddit.

Reddit alleged that Perplexity feeds off Reddit and Google, claiming to be “the world’s first answer engine” but really doing “nothing groundbreaking.”

“Its answer engine simply uses a different company’s” large language model “to parse through a massive number of Google search results to see if it can answer a user’s question based on those results,” the lawsuit said. “But Perplexity can only run its ‘answer engine’ by wrongfully accessing and scraping Reddit content appearing in Google’s own search results from Google’s own search engine.”

Likening companies involved in the alleged conspiracy to “bank robbers,” Reddit claimed it caught Perplexity “red-handed” stealing content that its “answer engine” should not have had access to.

Baiting Perplexity with “the digital equivalent of marked bills,” Reddit posted test content that could be found only in Google search engine results pages (SERPs), and “within hours, queries to Perplexity’s ‘answer engine’ produced the contents of that test post.”

“The only way that Perplexity could have obtained that Reddit content and then used it in its ‘answer engine’ is if it and/or its Co-Defendants scraped Google SERPs for that Reddit content and Perplexity then quickly incorporated that data into its answer engine,” Reddit’s lawsuit said.
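
In principle, a “marked bills” test of this kind is straightforward to describe. The sketch below is a hypothetical illustration of the general shape of such a test, not Reddit’s actual methodology: generate a unique, unguessable string, publish it where only one channel can see it, then check whether a downstream system ever reproduces it. The function names and example strings are assumptions for illustration only.

```python
# Hypothetical sketch of a "marked bills" canary test -- not Reddit's actual method.
# Idea: a unique token published through exactly one channel cannot appear in a
# downstream system's output unless that system obtained it from that channel.
import secrets

def make_canary() -> str:
    # Random token that could not plausibly occur by coincidence.
    return f"canary-{secrets.token_hex(16)}"

def leaked(canary: str, downstream_output: str) -> bool:
    return canary in downstream_output

canary = make_canary()
# ... publish the canary where only the channel under test can index it,
# then later fetch the downstream system's answer to a related query ...
answer_text = "example answer text fetched from the system under test"
print(leaked(canary, answer_text))  # True would indicate the content flowed through
```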

In a Reddit post, Perplexity denied any wrongdoing, describing its answer engine as summarizing Reddit discussions and citing Reddit threads in answers, just like anyone who shares links or posts on Reddit might do. Perplexity suggested that Reddit was attacking the open Internet by trying to extort licensing fees for Reddit content, despite knowing that Perplexity doesn’t train foundational models. Reddit’s endgame, Perplexity alleged, was to use the Perplexity lawsuit as a “show of force in Reddit’s training data negotiations with Google and OpenAI.”

“We won’t be extorted, and we won’t help Reddit extort Google, even if they’re our (huge) competitor,” Perplexity wrote. “Perplexity will play fair, but we won’t cave. And we won’t let bigger companies use us in shell games.”

Reddit likely anticipated Perplexity’s defense of the “open Internet,” noting in its complaint that “Reddit’s current Robots Exclusion Protocol file (‘robots.txt’) says, ‘Reddit believes in an open Internet, but not the misuse of public content.’”

Google reveals how scrapers steal from search results

To block scraping, Reddit uses various measures, such as “registered user-identification limits, IP-rate limits, captcha bot protection, and anomaly-detection tools,” the complaint said.

Similarly, Google relies on “anti-scraping systems and teams dedicated to preventing unauthorized access to its products and services,” Reddit said, noting Google prohibits “unauthorized automated access” to its SERPs.
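
Rate limiting, one of the measures named above, is a standard technique. Here is a generic, hypothetical sketch of a token-bucket limiter keyed by IP address; it illustrates the general concept only and says nothing about how Reddit’s or Google’s actual systems are implemented.

```python
# Generic token-bucket rate limiter keyed by IP address -- an illustration of the
# concept only, not Reddit's or Google's implementation. Each IP gets a bucket
# that refills at `rate_per_sec` tokens per second, up to `burst` tokens.
import time
from collections import defaultdict

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate, self.burst = rate_per_sec, burst
        self.buckets = defaultdict(lambda: (float(burst), time.monotonic()))

    def allow(self, ip: str) -> bool:
        tokens, last = self.buckets[ip]
        now = time.monotonic()
        tokens = min(self.burst, tokens + (now - last) * self.rate)  # refill
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return True          # request allowed
        self.buckets[ip] = (tokens, now)
        return False             # request throttled; client should back off

limiter = TokenBucket(rate_per_sec=2.0, burst=10)
print(limiter.allow("203.0.113.7"))  # True until this IP exhausts its burst
```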

To back its claims, Reddit subpoenaed Google to find out more about how the search giant blocks AI scrapers from accessing content on SERPs. Google confirmed it relies on “a technological access control system called ‘SearchGuard,’ which is designed to prevent automated systems from accessing and obtaining wholesale search results and indexed data while allowing individual users—i.e., humans—access to Google’s search results, including results that feature Reddit data.”

“SearchGuard prevents unauthorized access to Google’s search data by imposing a barrier challenge that cannot be solved in the ordinary course by automated systems unless they take affirmative actions to circumvent the SearchGuard system,” Reddit’s complaint explained.

Bypassing these anti-scraping systems violates the Digital Millennium Copyright Act, Reddit alleged, as well as laws against unfair trade and unjust enrichment. Seemingly, Google’s SearchGuard is currently the easier system to bypass: the alleged conspirators supposedly pivoted to looting Google SERPs after realizing they couldn’t access Reddit content directly on the platform.

Scrapers shocked by Reddit lawsuit

Reddit accused three companies of conspiring with Perplexity—”a Lithuanian data scraper” called Oxylabs UAB, “a former Russian botnet” known as AWMProxy, and SerpApi, a Texas company that sells services for scraping search engines.

Oxylabs “is explicit that its scraping service is meant to circumvent Google’s technological measures,” Reddit alleged, pointing to an Oxylabs webpage called “How to Scrape Google Search Results.”

SerpApi touts the same service, including some options to scrape SERPs at “ludicrous speeds.” To get past Google’s defenses, SerpApi’s fastest option uses “a server-swarm to hide from, avoid, or simply overwhelm by brute force effective measures Google has put in place to ward off automated access to search engine results,” Reddit alleged. SerpApi also allegedly provides users “with tips to reduce the chance of being blocked while web scraping, such as by sending ‘fake user-agent string[s],’ shifting IP addresses to avoid multiple requests from the same address, and using proxies ‘to make traffic look like regular user traffic’ and thereby ‘impersonate’ user traffic.”

According to Reddit, the three companies disguise “their web scrapers as regular people (among other techniques) to circumvent or bypass the security restrictions meant to stop them.” During a two-week span in July, they scraped “almost three billion” SERPs containing Reddit text, URLs, images, and videos, a subpoena requesting information from Google revealed.

Ars could not immediately reach AWMProxy for comment. However, the other companies were surprised by Reddit’s lawsuit, while vowing to defend their business models.

SerpApi’s spokesperson told Ars that Reddit did not notify the company before filing the lawsuit.

“We strongly disagree with Reddit’s allegations and intend to vigorously defend ourselves in court,” SerpApi’s spokesperson said. “In the eight years we’ve been in business, SerpApi has always operated on the right side of the law. As stated on our website, ‘The crawling and parsing of public data is protected by the First Amendment of the United States Constitution. We value freedom of speech tremendously.’”

Additionally, SerpApi works “closely with our attorneys to ensure that our services comply with all applicable laws and fair use principles. SerpApi stands firmly behind its business model and conduct, and we will continue to defend our rights to the fullest extent,” the spokesperson said.

Oxylabs’ chief governance strategy officer, Denas Grybauskas, told Ars that Reddit’s complaint seemed baffling since the other companies involved in the litigation are “unrelated and unaffiliated.”

“We are shocked and disappointed by this news, as Reddit has made no attempt to speak with us directly or communicate any potential concerns,” Grybauskas said. “Oxylabs has always been and will continue to be a pioneer and an industry leader in public data collection, and it will not hesitate to defend itself against these allegations. Oxylabs’ position is that no company should claim ownership of public data that does not belong to them. It is possible that it is just an attempt to sell the same public data at an inflated price.”

Grybauskas defended Oxylabs’ business as creating “real-world value for thousands of businesses and researchers, such as those driving open-source investigations, disinformation tackling, or environmental monitoring.”

“We strongly believe that our core business principles make the Internet a better place and serve the public good,” Grybauskas said. “Oxylabs provides infrastructure for compliant access to publicly available information, and we demand every customer to use our services lawfully.”

Reddit cited threats to licensing deals

Apparently, Reddit caught on to the alleged scheme after sending cease-and-desist letters to Perplexity to stop scraping Reddit content that its answer engine was citing. Rather than the scraping ending, Reddit claimed, Perplexity’s citations increased “forty-fold.” Since Perplexity is listed as a customer on SerpApi’s website, Reddit hypothesized in the complaint that the two, along with the other companies, were conspiring to skirt Google’s anti-scraping protections.

In a statement provided to Ars, Ben Lee, chief legal officer at Reddit, said that Oxylabs, AWMProxy, and SerpApi were “textbook examples” of scrapers that “bypass technological protections to steal data, then sell it to clients hungry for training material.”

“Unable to scrape Reddit directly, they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search,” Lee said. “Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself.”

On Reddit, Perplexity pushed back on Reddit’s claims that Perplexity ignored requests to license Reddit content.

“Untrue. Whenever anyone asks us about content licensing, we explain that Perplexity, as an application-layer company, does not train AI models on content,” Perplexity said. “Never has. So, it is impossible for us to sign a license agreement to do so.”

Reddit supposedly “insisted we pay anyway, despite lawfully accessing Reddit data,” Perplexity said. “Bowing to strong arm tactics just isn’t how we do business.”

Perplexity’s spokesperson, Jesse Dwyer, told Ars the company chose to post its statement on Reddit “to illustrate a simple point.”

“It is a public Reddit link accessible to anyone, yet by the logic of Reddit’s lawsuit, if you mention it or cite it in any way (which is your job as a reporter), they might just sue you,” Dwyer said.

But Reddit claimed that its business and reputation have been “damaged” by “misappropriation of Reddit data and circumvention of technological control measures.” Without a licensing deal ensuring that Perplexity and others are respecting Reddit policies, Reddit cannot control who has access to data, how they’re using data, and if data use conflicts with Reddit’s privacy policy and user agreement, the complaint said.

Further, Reddit’s worried that Perplexity’s workaround could catch on, potentially messing up Reddit’s other licensing deals. All the while, Reddit noted, it has to invest “significant resources” in anti-scraping technology, with Reddit ultimately suffering damages, including “lost profits and business opportunities, reputational harm, and loss of user trust.”

Reddit’s hoping the court will grant an injunction barring companies from scraping Reddit content from Google SERPs. It also wants companies blocked from both selling Reddit data and “developing or distributing any technology or product that is used for the unauthorized circumvention of technological control measures and scraping of Reddit data.”

If Reddit wins, companies could be required to pay substantial damages or to disgorge profits from the sale of Reddit content.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder in Reddit.


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.


Great hybrid V6, lousy HMI: Three days with a Ferrari 296 GTB

The first time I drove this generation of mid-engined Ferrari, it was on a curated route on the company’s home turf. As the Po Valley gives way to the Apennines, you find plenty of narrow winding roads, steep gradients, and hairpin turns. It was an engaging few hours of driving, but it was too brief to properly assess some of the 296’s technology. I found the ride firm but comfortable on rough Italian tarmac and the hybrid system easy to operate, flicking into calm-and-quiet electric-only mode through the villages I encountered.

That was back in 2022 during the unveiling of Ferrari’s 499P race car. Last month, I met the 499P again as it visited the Circuit of the Americas in Austin, along with the rest of the World Endurance Championship. And that afforded another chance to get to know the 296, with three days rather than three hours to form an impression.

Head west from Austin and you’ll find twisty roads that wrap around the hills. It would have been easy to spend an entire day out there, but that seemed repetitive—I’d experienced the 296’s back road behavior already. Plus, there were things to do at the racetrack, although I’ll admit I took the long way there and back each day.

Driving among the AVs

For mixing it up in downtown traffic—among the dozens of all-white Waymo Jaguars and brightly wrapped Zoox Toyotas doing their autonomous driving thing—the Ferrari’s eDrive mode is perfectly sufficient. It uses the axial flux electric motor that lives between the 2.9 L V6 engine and the eight-speed dual-clutch transmission, but the donut-shaped motor’s 165 hp (123 kW) and, more importantly, 232 lb-ft (315 Nm) are all you need to move the 296’s roughly 3,300 lbs (1,500 kg) at city speeds. Visibility is good looking forward and is adequate otherwise, and the throttle mapping makes it easy to measure out just as much acceleration as you need.

Beyond the confines of the city center, you’ll want the contribution of the V6’s 654 hp (488 kW). There are three modes to choose from. Hybrid is best when the lithium-ion traction battery is charged, and the car’s brain will cut the V6 as and when necessary to save some fuel. If the 7.4 kWh battery is depleted, switching into Performance mode is a solution. This keeps the internal combustion engine fired and uses spare power to keep topping up the pack. It also sounds more raucous.


An outcast faces a deadly alien world in Predator: Badlands trailer

We’ve got a new international trailer for Predator: Badlands, the latest installment in a popular franchise that’s been around since 1987. It’s directed by Dan Trachtenberg, who is very familiar with the franchise, having also directed 2022’s highly acclaimed standalone Predator movie, Prey.

In April, Twentieth Century Studios released the first teaser, which involved multiple predators fighting or threatening one another, Elle Fanning looking very strange and cool as an android, and glimpses of new monsters and the alien world the movie focuses on. And the film was featured prominently at San Diego Comic-Con this summer. But it hasn’t quite wormed its way into the cultural zeitgeist for fall releases. Perhaps this latest trailer will boost its profile.

This is a standalone film in the franchise, with a particular focus on the culture of the Predator species; in fact, the same conlanger who created the Na’vi language for James Cameron’s Avatar franchise also created a written and verbal language for the Predators. (We hear a bit of the dialogue in the new trailer.) And this time around, the primary Predator is actually the film’s protagonist rather than an adversary. Per the official premise: “Set in the future on a deadly remote planet, Predator: Badlands follows a young Predator outcast (Dimitrius Schuster-Koloamatangi) who finds an unlikely ally in Thia (Elle Fanning) as he embarks on a treacherous journey in search of the ultimate adversary.”


California startup to demonstrate space weapon on its own dime


“All of the pieces that are required to make it viable exist.”

This illustration released by Apex depicts a space-based interceptor fired from a satellite in low-Earth orbit. Credit: Apex

Defense contractors are in full sales mode to win a piece of a potentially trillion-dollar pie for development of the Trump administration’s proposed Golden Dome missile shield.

CEOs are touting their companies’ ability to rapidly spool up satellite, sensor, and rocket production. Publicly, they all agree with the assertion of Pentagon officials that US industry already possesses the technologies required to make a homeland missile defense system work.

The challenge, they say, is tying all of it together under the umbrella of a sophisticated command and control network. Sensors must be able to detect and track missile threats, and that information must rapidly get to weapons that can shoot them down. Gen. Chance Saltzman, the Space Force’s top commander, likes to call Golden Dome a “system of systems.”

One of these systems stands apart. It’s the element that was most controversial when then-President Ronald Reagan announced the Strategic Defense Initiative, or “Star Wars” program, a concept similar to Golden Dome that fizzled after the end of the Cold War.

Like the Star Wars concept 40 years ago, Golden Dome’s pièce de résistance will be a fleet of space-based interceptors loitering in orbit a few hundred miles overhead, ready to shoot down missiles shortly after they are launched. Pentagon officials haven’t disclosed the exact number of interceptors required to fulfill Golden Dome’s mission of defending the United States against a volley of incoming missiles. It will probably be in the thousands.

Skin in the game

Last month, the Defense Department released a request for prototype proposals for space-based interceptors (SBIs). The Space Force said it plans to sign agreements with multiple companies to develop and demonstrate SBIs and compete for prizes. This is an unusual procurement strategy for the Pentagon, requiring contractors to spend their own money on building and launching the SBIs into space, with the hope of eventually winning a lucrative production contract.

Apex is one of the companies posturing for an SBI contract. Based in Los Angeles, Apex is one of several US startups looking to manufacture satellites faster and cheaper than traditional aerospace contractors. The company’s vision is to rapidly churn out satellite buses, essentially the spacecraft’s chassis, to be integrated with a customer’s payloads. So far, Apex has raised more than $500 million from investors and launched its first satellite in 2024, just two years after the company’s founding. Apex won a $46 million contract from the Space Force in February to supply the military with an unspecified number of satellites through 2032.

Apex says its satellites can perform a range of missions: remote sensing and Earth observation, communications, AI-powered edge processing, and technology demos. The largest platform in Apex’s portfolio can accommodate payloads of up to 500 kilograms (1,100 pounds), with enough power to support direct-to-cell connectivity and government surveillance missions.

A look inside Apex’s satellite factory in Los Angeles. Credit: Apex

Now, Apex wants to show its satellite design can serve as an orbiting weapons platform.

“Apex is built to move fast, and that is exactly what America and our allies need to ensure we win the New Space Race,” Ian Cinnamon, the company’s co-founder and CEO, said in a statement Wednesday. “In under a year, we are launching the host platform for space-based interceptors, called an Orbital Magazine, which will deploy multiple prototype missile interceptors in orbit.”

The demonstration mission is called Project Shadow. It’s intended to “prove that an operational SBI constellation can be deployed in the timeframe our country needs,” Cinnamon said. “Apex isn’t waiting for handouts or contracts; we are developing this Orbital Magazine technology on our own dime and moving incredibly fast.”

Star Wars redux

Just one week into his second term in the White House, President Donald Trump signed an executive order for what would soon be named Golden Dome, citing an imperative to defend the United States against ballistic missiles and emerging weapons systems like hypersonic glide vehicles and drones.

The Trump administration said in May that the defense shield would cost $175 billion over the next three years. Most analysts peg the long-term cost much higher, but no one really knows. The Pentagon hasn’t released a detailed architecture for what Golden Dome will actually entail, and the uncertainty has driven independent cost estimates ranging from $500 billion to more than $3 trillion.

Golden Dome’s unknown costs, lack of definition, and its unpredictable effect on strategic stability have garnered criticism from Democratic lawmakers.

But unlike the reaction to the Reagan-era Star Wars program, there’s not much pushback on Golden Dome’s technical viability.

“All of the pieces that are required to make it viable exist. They’re out there,” Cinnamon told Ars. “We have satellites, we have boosters, we have seekers, we have fire control, we have IFTUs (in-flight target updates), we have inter-satellite links. The key is, all those pieces need to talk to each other and actually come together, and that integration is really, really difficult. The second key is, in order for it to be viable, you need enough of them in space to actually have the impact that you need.”

This frame from an Apex animation shows a space-based interceptor deploying from an Orbital Magazine.

Apex says its Project Shadow demo is scheduled to launch in June 2026. Once in orbit, the Project Shadow spacecraft will deploy two interceptors, each firing a high-thrust solid rocket motor from a third-party supplier. “The Orbital Magazine will prove its ability to environmentally control the interceptors, issue a fire control command, and close an in-space cross-link to send real-time updates post-deployment,” Apex said in a statement.

The Orbital Magazine on Apex’s drawing board could eventually carry more than 11,000 pounds (5,000 kilograms) of interceptor payload, the company said. “Orbital Magazines host one or many interceptors, allowing thousands of SBIs to be staged in orbit.”

Apex is spending about $15 million of its own money on Project Shadow. Cinnamon said Apex is working with other companies on “key parts of the interceptor and mission analysis” for Project Shadow, but he wasn’t ready to identify them yet. One possible propulsion supplier is Anduril Industries, the weapons company started by Oculus founder Palmer Luckey in 2017. Apex and Anduril have worked together before.

“What we’re very good at is high-rate manufacturing and piecing it together,” Cinnamon said. “We have suppliers for everything else.”

Apex is the first company to publicly disclose any details for an SBI demonstration, but it won’t be the last. Cinnamon said Apex will provide further updates on Project Shadow as it nears launch.

“We’re talking about it publicly because I believe it’s really important to inspire both the US and our allies, and show the pace of innovation and show what’s possible in today’s world,” Cinnamon said. “We are very fortunate to have an amazing team, a very large war chest of capital, and the ability to go do a project like this, truly for the good of the US and the good of our allies.”

A solid rocket motor designed for the ascent vehicle for NASA’s Mars Sample Return mission was test-fired by Northrop Grumman in 2023. A similar rocket motor could be used for space-based interceptors. Credit: NASA

The usual suspects

Apex will have a lot of competition vying for a slice of Golden Dome. America’s largest defense contractors have all signaled their interest in tapping into Golden Dome cash flows.

Lockheed Martin has submitted proposals to the Pentagon for space-based interceptors, the company’s CEO, James Taiclet, said Tuesday in a quarterly earnings call.

“We’re actually planning for a real on-orbit, space-based interceptor demonstration by 2028,” Taiclet said, without providing further details. Taiclet said Lockheed Martin is also working on command and control solutions for Golden Dome.

“At the same time, we’re rapidly increasing production capacity across the missiles, sensors, battle management systems, and satellite integration opportunities that will be directly relevant to achieve the overarching objective of Golden Dome,” Taiclet said.

“SBI, the space-based interceptor, is one of those,” he said. “We are building prototypes—full operational prototypes, not things in labs, not stuff on test stands, things that will go into space, or in the air, or fly across a missile range. These are real devices that will work and that can be produced at scale. So the space-based interceptor is one we’ve been pursuing already, and that’s all I can say about that.”

Northrop Grumman officials have made similar statements. Kathy Warden, Northrop’s CEO, has said her company is currently conducting “ground-based tests” of SBI-related technology. She didn’t describe the tests, although Northrop Grumman is the nation’s top supplier of solid rocket motors, a key piece of space-based interceptors, and regularly fires them on test stands.

“The architecture and spend plan for Golden Dome are not published, so I won’t comment on those specifically,” Warden said Tuesday. “We are providing some high-fidelity operational analysis that can help the customer understand those requirements, as well as ourselves.”


Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

California startup to demonstrate space weapon on its own dime Read More »

smart-beds-leave-sleepers-hot-and-bothered-during-aws-outage

Smart beds leave sleepers hot and bothered during AWS outage

Some users complained that malfunctioning devices kept them awake for hours. Others bemoaned waking up in the middle of the night drenched in sweat.

Even more basic features, such as alarms, failed to work when Eight Sleep’s servers went down.

Eight Sleep will offer local control

Eight Sleep co-founder and CEO Matteo Franceschetti addressed the problems via X on Monday:

The AWS outage has impacted some of our users since last night, disrupting their sleep. That is not the experience we want to provide and I want to apologize for it.

We are taking two main actions:

1) We are restoring all the features as AWS comes back. All devices are currently working, with some experiencing data processing delays.

2) We are currently outage-proofing your Pod experience and we will be working tonight-24/7 until that is done.

On Monday evening, Franceschetti said that “all the features should be working.” On Tuesday, he claimed that a local control option would be available on Wednesday “at the latest” without providing more detail.

Eight Sleep users will be relieved to hear that the company is working to make their products usable during Internet outages. But many are also questioning why Eight Sleep didn’t implement local control sooner. This isn’t Eight Sleep’s first outage, and users can also experience personal Wi-Fi problems. And there’s an obvious benefit to users in being able to control their bed’s elevation and temperature without an Internet connection, or if Eight Sleep ever goes out of business.

For Eight Sleep, though, making flagship features available without its app while still making enough money isn’t easy. Without forcing people to put their Eight Sleep devices online, it would be harder for Eight Sleep to convince people that Autopilot subscriptions should be mandatory. Pod hardware’s high prices will deter people from multiple or frequent purchases, making alternative, more frequent revenue streams key for the 11-year-old company’s survival.

After a June outage, an Eight Sleep user claimed that the company told him that it was working on an offline mode. This week’s AWS problems seem to have hastened efforts, so users don’t lose sleep during the next outage.

Smart beds leave sleepers hot and bothered during AWS outage Read More »

upcoming-ios-and-macos-26.1-update-will-let-you-fog-up-your-liquid-glass

Upcoming iOS and macOS 26.1 update will let you fog up your Liquid Glass

Apple’s new Liquid Glass user interface design was one of the most noticeable and divisive features of its major software updates this year. It added fluidity and translucency throughout iOS, iPadOS, macOS, and Apple’s other operating systems, and as we noted in our reviews, the default settings weren’t always great for readability.

The upcoming 26.1 update for all of those OSes is taking a step toward addressing some of the complaints, though not by changing things about the default look of Liquid Glass. Rather, the update is adding a new toggle that will let users choose between a Clear and Tinted look for Liquid Glass, with Clear representing the default look and Tinted cranking up the opacity and contrast.

The new toggle adds a half-step between the default visual settings and the “reduce transparency” setting, which, aside from changing a bunch of other things about the look and feel of the operating system, is buried further down inside the Accessibility options. The Tinted toggle does make colors and vague shapes visible beneath the glass panes, preserving the general look of Liquid Glass while also erring on the side of contrast and visibility, whereas the “reduce transparency” setting is more of an all-or-nothing blunt instrument.

Upcoming iOS and macOS 26.1 update will let you fog up your Liquid Glass Read More »

on-dwarkesh-patel’s-podcast-with-andrej-karpathy

On Dwarkesh Patel’s Podcast With Andrej Karpathy

Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level. This was very clearly one of those. So here we go.

As usual for podcast posts, the baseline bullet points describe key points made, and then the nested statements are my commentary.

If I am quoting directly I use quote marks, otherwise assume paraphrases.

Rather than worry about timestamps, I’ll use YouTube’s section titles, as it’s not that hard to find things via the transcript as needed.

This was a fun one in many places, interesting throughout, frustrating in similar places to where other recent Dwarkesh interviews have been frustrating. It gave me a lot of ideas, some of which might even be good.

  1. Andrej calls this the ‘decade of agents,’ contrary to the declaration by Greg Brockman (among others) that 2025 is the ‘year of agents,’ as there is so much work left to be done. Think of AI agents as employees or interns who right now mostly can’t do the things, due to deficits of intelligence and context.

    1. I agree that 2025 as the year of the agent is at least premature.

    2. You can defend the 2025 claim if you focus on coding, Claude Code and Codex, but I think even there it is more confusing than helpful as a claim.

    3. I also agree that we will be working on improving agents for a long time.

    4. 2026 might be the proper ‘year of the agent’ as when people start using AI agents for a variety of tasks and getting a bunch of value from them, but they will still have a much bigger impact on the world in 2027, and again in 2028.

    5. On the margin and especially outside of coding, I think context and inability to handle certain specific tasks (especially around computer use details) are holding things back right now more than intelligence. A lot of it seems eminently solvable quickly in various ways if one put in the work.

  2. Dwarkesh points to lack of continual learning or multimodality, but notes it’s hard to tell how long it will take. Andrej says ‘well I have 15 years of prediction experience and intuition and I average things out and it feels like a decade to me.’

    1. A decade seems like an eternity to me on this.

    2. If it’s to full AGI it is slow but less crazy. So perhaps this is Andrej saying that to count as an agent for this the AI needs to essentially be AGI.

  3. AI has had a bunch of seismic shifts, Andrej has seen at least two and they seem to come with regularity. Neural nets used to be a niche thing before AlexNet but they were still trained per-task, the focus on Atari and other games was a mistake because you want to interact with the ‘real world’. Then LLMs. The common mistake was trying to “get the full thing too early” and especially aiming at agents too soon.

    1. The too soon thing seems true and important. You can’t unlock capabilities in a useful way until you have the groundwork and juice for them.

    2. Once you do have the groundwork and juice, they tend to happen quickly, without having to do too much extra work.

    3. In general, seems like if something is super hard to do, better if you wait?

    4. However you can with focused effort make a lot of progress beyond what you’d get at baseline, even if that ultimately stalls out, as seen in the Atari and Universe examples.

  4. Dwarkesh asks what about the Sutton perspective, should you be able to throw an AI out there into the world the way you would a human or animal and just work with and ‘grow up’ via sensory data? Andrej points to his response to Sutton, that biological brains work via a very different process, we’re building ghosts not animals, although we should make them more ‘animal-like’ over time. But animals don’t do what Sutton suggests, they use an evolutionary outer loop. Animals only use RL for non-intelligence tasks, things like motor skills.

    1. I think humans do use RL on intelligence tasks? My evidence for this is that when I use this model of humans it seems to make better predictions, both about others and about myself.

    2. Humans are smarter about this than ‘pure RL’ of course, including being the meta programmer and curating their own training data.

  5. Dwarkesh contrasts pre-training with evolution in that evolution compacts all info into 3 GB of DNA, thus evolution is closer to finding a lifetime learning algorithm. Andrej agrees there is miraculous compression in DNA and that it includes learning algorithms, but we’re not here to build animals, only useful things, and they’re ‘crappy,’ but what we know how to build are the ghosts. Dwarkesh says evolution does not give us knowledge, it gives us the algorithm to find knowledge a la Sutton.

    1. Dwarkesh is really big on the need for continual (or here he says ‘lifetime’) learning and the view that it is importantly distinct from what RL does.

    2. I’m not convinced. As Dario points out, in theory you can put everything in the context window. You can do a lot better on memory and imitating continual learning than that with effort, and we’ve done remarkably little on such fronts.

    3. The actual important difference to me is more like sample efficiency. I see ways around that problem too, but am not putting them in this margin.

    4. I reiterate that evolution actually does provide a lot of knowledge, or the seeds for getting specific types of knowledge, using remarkably few bits of data to do this. If you buy into too much ‘blank slate’ you’ll get confused.

  6. Andrej draws a distinction between the neural net picking up all the knowledge in its training data versus it becoming more intelligent, and often you don’t even want the knowledge, we rely on it too much, and this is part of why agents are bad at “going off the data manifold of what exists on the internet.” We want the “cognitive core.”

    1. I buy that you want to minimize the compute costs associated with carrying lots of extra information, so for many tasks you want a Minimum Viable Knowledge Base. I don’t buy that knowledge tends to get in the way. If it does, then Skill Issue.

    2. More knowledge seems hard to divorce fully from more intelligence. A version of me that was abstractly ‘equally smart,’ but which knew far less, might technically have the same Intelligence score on the character sheet, but a lot lower Wisdom and would effectively be kind of dumb. See young people.

    3. I’m skeptical about a single ‘cognitive core’ for similar reasons.

  7. Dwarkesh reiterates in-context learning as ‘the real intelligence’ as distinct from gradient descent. Andrej agrees it’s not explicit, it’s “pattern completion within a token window” but notes there’s tons of patterns on the internet that get into the weights, and it’s possible in-context learning runs a small gradient descent loop inside the neural network. Dwarkesh asks, “why does it feel like with in-context learning we’re getting to this continual learning, real intelligence-like thing? Whereas you don’t get the analogous feeling just from pre-training.”

    1. My response would basically again be sample efficiency, and the way we choose to interact with LLMs being distinct from the training? I don’t get this focus on (I kind of want to say fetishization of?) continual learning as a distinct thing. It doesn’t feel so distinct to me.

  8. Dwarkesh asks, how much of the information from training gets stored in the model? He compares KV cache of 320 kilobytes to a full 70B model trained on 15 trillion tokens. Andrej thinks models get a ‘hazy recollection’ of what happened in training, the compression is dramatic to get 15T tokens into 70B parameters.

    1. Is it that dramatic? Most tokens don’t contain much information, or don’t contain new information. In some ways 0.5% (70B vs. 15T) is kind of a lot. It depends on what you care about. If you actually had to put it all in the 320k KV Cache that’s a lot more compression. (Rough arithmetic in the sketch after this list.)

    2. As Andrej says, it’s not enough, so you get much more precise answers about texts if you have the full text in the context window. Which is also true if you ask humans about the details of things that mostly don’t matter.

  9. What part about human intelligence have we most failed to replicate? Andrej says ‘a lot of it’ and starts discussing physical brain components causing “these cognitive deficits that we all intuitively feel when we talk to the models.”

    1. I feel like that’s a type mismatch. I want to know what capabilities are missing, not which physical parts of the brain? I agree that intuitively some capabilities are missing, but I’m not sure how essential this is, and as Andrej suggests we shouldn’t be trying to build an analog of a human.

  10. Dwarkesh turns back to continual learning, asks if it will emerge spontaneously if the model gets the right incentives. Andrej says no, that sleep does this for humans where ‘the context window sometimes sticks around’ and there’s no natural analog, but we want a way to do this, and points to sparse attention.

    1. I’m not convinced we know how the sleep or ‘sticking around’ thing works, clearly there is something going on somewhere.

    2. I agree this won’t happen automatically under current techniques, but we can use different techniques, and again I’m getting the Elle Woods ‘what, like it’s hard?’ reaction to all this, where ‘hard’ is relative to problem importance.

  11. Andrej kind of goes Lindy, pointing to translation invariance to expect algorithmic and other changes at a similar rate to the past, and pointing to the many places he says we’d need gains in order to make further progress, that various things are ‘all surprisingly equal,’ it needs to improve ‘across the board.’

    1. Is this the crux, the fundamental disagreement about the future, in two ways?

    2. The less important one is the idea that progress requires all of [ABCDE] to make progress. That seems wrong to me. Yes, you are more efficient if you make progress more diffusely under exponential scaling laws, but you can still work around any given deficit via More Dakka.

    3. As a simple proof by hypothetical counterexample, suppose I held one of his factors (e.g. architecture, optimizer, loss function) constant matching GPT-3, but could apply modern techniques and budgets to the others. What do I get?

    4. More importantly, Andrej is denying the whole idea that technological progress here or in general is accelerating, or will accelerate. And that seems deeply wrong on multiple levels?

    5. For this particular question, progress has been rapid, investments of all kinds have been huge, and already we are seeing AI directly accelerate AI progress substantially, a process that will accelerate even more as AI gets better, even if it doesn’t cross into anything like full automated AI R&D or a singularity, and we keep adding more ways to scale. It seems rather crazy to expect 2025 → 2035 to be similar to 2015 → 2025 in AI, on the level of ‘wait, you’re suggesting what?’

    6. In the longer arc of history, if we’re going to go there, we see a clear acceleration of time. So we have the standard several billion years to get multicellular life, several hundred million years to get close to human intelligence, several hundred thousand to million years to get agriculture and civilization, several thousand years to get the industrial revolution, several hundred years to get the information age, several dozen years to get AI to do anything useful on the general intelligence front, several ones of years to go from ‘anything useful at all’ to GPT-5 and Sonnet 4.5 being superhuman in many domains already.

    7. I think Andrej makes better arguments for relatively long (still remarkably short!) timelines later, but him invoking this gives me pause.
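
A quick back-of-the-envelope check on the compression numbers from point 8 above (my own sketch: the 70B-parameter, 15T-token, and 320 KB KV cache figures are the ones quoted in the conversation, while the bytes-per-token and fp16 assumptions are mine):

```python
# Rough arithmetic behind the "15T tokens into 70B parameters" compression claim.
# The parameter, token, and KV cache figures are the ones quoted above; the
# bytes-per-token and fp16 weight assumptions are my own.

params = 70e9
tokens = 15e12

print(f"params per training token: {params / tokens:.4f}")  # ~0.0047, i.e. ~0.5%

raw_text_bytes = tokens * 4    # ~60 TB of raw text, assuming ~4 bytes per token
weight_bytes = params * 2      # ~140 GB of fp16 weights
kv_cache_bytes = 320e3         # the 320 kilobyte KV cache Dwarkesh cites

print(f"text -> weights compression: {raw_text_bytes / weight_bytes:,.0f}x")    # ~429x
print(f"text -> KV cache compression: {raw_text_bytes / kv_cache_bytes:.1e}x")  # ~1.9e8x
```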

  1. Andrej found LLMs of little help when assembling his new repo nanochat, which is an 8k-line set of all the things you need for a minimal ChatGPT clone. He still used autocomplete, but vibe coding only works with boilerplate stuff. In particular, the models ‘remember wrong’ from all the standard internet ways of doing things, which he wasn’t using. For example, he did his own version of a DDP container inside the code, and the models couldn’t comprehend that and kept trying to use DDP instead. Whereas he only used vibe coding for a few boilerplate-style areas.

    1. I’ve noticed this too. LLMs will consistently make the same mistakes, or try to make the same changes, over and over, to match their priors.

    2. It’s a reasonable prior to think things like ‘oh almost no one would ever implement a version of DDP themselves’; the issue is that they aren’t capable of being told that this happened and having this overcome that prior.

  2. “I also feel like it’s annoying to have to type out what I want in English because it’s too much typing. If I just navigate to the part of the code that I want, and I go where I know the code has to appear and I start typing out the first few letters, autocomplete gets it and just gives you the code. This is a very high information bandwidth to specify what you want.”

    1. As a writer this resonates so, so much. There are many tasks where in theory the LLM could do it for me, but by the time I figure out how to get the LLM to do it for me, I might as well have gone and done it myself.

    2. Whereas the autocomplete in gmail is actually good enough that it’s worth my brain scanning it to see if it’s what I wanted to type (or on occasion, a better version).

  3. Putting it together: LLMs are very good at code that has been written many times before, and poor at code that has not been written before, in terms of the structure and conditions behind the code. Code that has been written before on rare occasions is in between. The models are still amazing, and can often help. On the vibe coding: “I feel like the industry is making too big of a jump and is trying to pretend like this is amazing, and it’s not. It’s slop.”

    1. There’s a big difference between the value added when you can successfully vibe code large blocks of code, versus when you can get answers to questions, debugging notes and stuff like that.

    2. The second category can still be a big boost to productivity, including to AI R&D, but isn’t going to go into crazy territory or enter into recursion mode.

    3. I presume Andrej is in a position where his barrier for ‘not slop’ is super high and the problems he works on are unusually hostile as well.

    4. I do think these arguments are relevant evidence for longer timelines until crazy happens, that we risk overestimating the progress made on vibe coding.

  4. Andrej sees all of computing as a big recursive self-improvement via things like code editors and syntax highlighting and even data checking and search engines, in a way that is continuous with AI. Better autocomplete is the next such step. We’re abstracting, but it is slow.

    1. One could definitely look at it this way. It’s not obvious what that reframing pushes one towards.

  1. How should we think about humans being able to build a rich world model from interactions with the environment, without needing final reward? Andrej says they don’t do RL, they do something different, whereas RL is terrible but everything else we’ve tried has been worse. All RL can do is check the final answers, and say ‘do more of this’ when it works. A human would evaluate parts of the process, an LLM can’t and won’t do this.

    1. So yeah, RL is like democracy. Fair enough.

    2. Why can’t we set up LLMs to do the things human brains do here? Not the exact same thing, but something built on similar principles?

    3. I mean it seems super doable to me, but if you want me to figure out how to do it or actually try doing it the going rate is at least $100 million. Call me.

  2. Dwarkesh does ask why, or at least about process supervision. Andrej says it is tricky how to do that properly, how do you assign credit to partial solutions? Labs are trying to use LLM judges but this is actually subtle, and you’ll run into adversarial examples if you do it for too long. It finds out that dhdhdhdh was an adversarial example so it starts outputting that, or whatever.

    1. So then you… I mean I presume the next 10 things I would say here have already been tried and they fail but I’m not super confident in that.

  3. So train models to be more robust? Finding the adversarial examples and fixing them one at a time won’t work; there will always be another one.

    1. Certainly ‘find the adversarial examples and fix them one at a time’ is an example of ‘how to totally fail OOD or at the alignment problem,’ you would need a way to automatically spot when you’re invoking one.

  1. What about the thing where humans sleep or daydream, or reflect? Is there some LLM analogy? Andrej says basically no. When an LLM reads a book it predicts the next token, when a human does they do synthetic data generation, talk about it with their friends, manipulate the info to gain knowledge. But doing this with LLMs is nontrivial, for reasons that are subtle and hard to understand, and if you generate synthetic data to train on that makes the model worse, because the examples are silently collapsed, similar to how they know like 3 total jokes. LLMs don’t retain entropy, and we don’t know how to get them to retain it. “I guess what I’m saying is, say we have a chapter of a book and I ask an LLM to think about it, it will give you something that looks very reasonable. But if I ask it 10 times, you’ll notice that all of them are the same. Any individual sample will look okay, but the distribution of it is quite terrible.”

    1. I wish Andrej’s answer here was like 5 minutes longer. Or maybe 50 minutes.

    2. In general, I’m perhaps not typical, but I’d love to hear the ‘over your head’ version where he says a bunch of things that gesture in various directions, and it’s up to you whether you want to try and understand it.

    3. I mean from the naive perspective this has ‘skill issue’ written all over it, and there’s so many things I would want to try.

  2. “I think that there’s possibly no fundamental solution to this. I also think humans collapse over time. These analogies are surprisingly good. Humans collapse during the course of their lives. This is why children, they haven’t overfit yet… We end up revisiting the same thoughts. We end up saying more and more of the same stuff, and the learning rates go down, and the collapse continues to get worse, and then everything deteriorates.”

    1. I feel this.

    2. That means both in myself, and in my observations of others.

    3. Mode collapse in humans is evolutionarily and strategically optimal, under conditions of aging and death. If you’re in exploration, pivot to exploitation.

    4. We also have various systems to fight this and pivot back to exploration.

    5. One central reason humans get caught in mode collapse, when we might not want that, is myopia and hyperbolic discounting.

    6. Another is, broadly speaking, ‘liquidity or solvency constraints.’

    7. A third would be commitments, signaling, loyalty and so on.

    8. If we weren’t ‘on the clock’ due to aging, which both cuts the value of exploration and also raises the difficulty of it, I think those of us who cared could avoid mode collapse essentially indefinitely.

    9. Also I notice [CENSORED] which has obvious deep learning implications?

  3. Could dreaming be a way to avoid mode collapse by going out of distribution?

    1. I mean, maybe, but the price involved seems crazy high for that.

    2. I worry that we’re using ‘how humans do it’ as too much of a crutch.

  4. Andrej notes you should always be seeking entropy in your life, suggesting talking to other people.

    1. There are lots of good options. I consume lots of text tokens.

  5. What’s up with children being great at learning, especially things like languages, but terrible at remembering experiences or specific information? LLMs are much better than humans at memorization, and this can be a distraction.

    1. I’m not convinced this is actually true?

    2. A counterpoint is that older people learn harder things, and younger people, especially young children, simply cannot learn those things at that level, or would learn them a lot slower.

    3. Another counterpoint is that a lot of what younger humans learn is at least somewhat hard coded into the DNA to be easier to learn, and it is also replacing nothing, which helps you move a lot faster and seem to be making a lot more progress.

    4. Languages are a clear example of this. I say this as someone with a pretty bad learning disability for languages, who has tried very hard to pick up various additional languages and failed utterly.

    5. A third counterpoint is that children really do put a ton of effort into learning, often not that efficiently (e.g. rewatching and rereading the same shows and books over and over, repeating games and patterns and so on), to get the information they need. Let your children play, but that’s time intensive. Imagine what adults can and do learn when they truly have no other responsibilities and go all-in on it.

  6. How do you solve model collapse? Andrej doesn’t know, the models be collapsed, and Dwarkesh points out RL punishes output diversity. Perhaps you could regularize entropy to be higher, though it’s all tricky. (A minimal sketch of one standard entropy-bonus approach appears after this list.)

  7. Andrej says state of the art models have gotten smaller, and he still thinks they memorized too much and we should seek a small cognitive core.

    1. He comes back to this idea that knowing things is a disadvantage. I don’t get it. I do buy that smaller models are more efficient, especially with inference scaling, and so this is the best practical approach for now.

    2. My prediction is that the cognitive core hypothesis is wrong, and that knowledge and access to diverse context is integral to thinking, especially high entropy thinking. I don’t think a single 1B model is going to be a good way to get any kind of conversation you want to have.

    3. There are people who have eidetic memories. They can have a hard time taking advantage because working memory remains limited, and they don’t filter for the info worth remembering or abstracting out of them. So there’s some balance at some point, but I definitely feel like remembering more things than I do would be better? And that I have scary good memory and memorization in key points, such as ability (for a time, anyway) to recall the exact sequence of entire Magic games and tournaments, which is a pattern you also see from star athletes – you ask Steph Curry or LeBron James and they can tell you every detail of every play.

  8. Most of the internet tokens are total garbage, stock tickers, symbols, huge amounts of slop, and you basically don’t want that information.

    1. I’m not sure you don’t want that information? It’s weird. I don’t know enough to say. Obviously it would not be hard to filter such tokens out at this point, so they must be doing something useful. I’m not sure it’s due to memorization, but I also don’t see why the memorization would hurt.

  9. They go back and forth over the size of the supposed cognitive core, Dwarkesh asks why not under 1 billion, Andrej says you probably need a billion knobs and he’s already contrarian being that low.

    1. Whereas yeah, I think 1 billion is not enough and this is the wrong approach entirely unless you want to e.g. do typical simple things within a phone.
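
On the “regularize entropy to be higher” idea in point 6 above: the standard move in RL is to add an entropy bonus to the policy-gradient loss so training is penalized for collapsing onto a single output. A minimal sketch of that idea, with made-up numbers and purely as my own illustration (not anything specified in the podcast):

```python
import numpy as np

# Minimal sketch of the standard "entropy bonus" trick: subtract beta * H(policy)
# from the policy-gradient loss, so the optimizer is rewarded for keeping the
# output distribution from mode-collapsing. Figures here are illustrative.

def policy_gradient_loss(logits, action, advantage, beta=0.01):
    # Softmax policy over a discrete action/token vocabulary.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    log_prob = np.log(probs[action])          # log pi(a | s)
    entropy = -(probs * np.log(probs)).sum()  # H(pi); higher means more diverse

    # REINFORCE-style loss, minus an entropy bonus. Larger beta pushes the
    # policy to retain entropy rather than collapse to one answer.
    return -(log_prob * advantage) - beta * entropy

# Example: a nearly deterministic policy pays an (implicit) entropy cost.
print(policy_gradient_loss(np.array([5.0, 0.1, 0.1]), action=0, advantage=1.0))
```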

Wait what?

Note: The 2% number doesn’t actually come up until the next section on ASI.

  1. How to measure progress? Andrej doesn’t like education level as a measure of AI progress (I agree), he’s also not a fan of the famous METR horizon length graph and is tempted to reject the whole question. He’s sticking with AGI as ‘can do any economically valuable task at human performance or better.’

    1. And you’re going to say having access to ‘any economically valuable (digital) task at human performance or better’ only is +2% GDP growth? Really?

    2. You have to measure something you call AI progress, since you’re going to manage it. Also people will ask constantly and use it to make decisions. If nothing else, you need an estimate of time to AGI.

  2. He says only 10%-20% of the economy is ‘only knowledge work.’

    1. I asked Sonnet. McKinsey 2012 finds knowledge work accounted for 31 percent of all workers in America in 2010. Sonnet says 30%-35% pure knowledge work, 12%-17% pure manual, the rest some hybrid; split the rest in half and you get 60% knowledge work by task, but knowledge work typically carries about double the economic value of non-knowledge work, so we’re talking on the order of 75% of all economic value. (The arithmetic is spelled out in the sketch after this list.)

    2. How much would this change Andrej’s other estimates, given this is more than triple his estimate?

  3. Andrej points to the famous predictions of automating radiology, and suggests what we’ll do more often is have AI do 80% of the volume, then delegate 20% to humans.

    1. Okay, sure, that’s a kind of intermediate step, we might do that for some period of time. If so, let’s say that for 75% of economic value we have the AI provide 60% of the value, assuming the human part is more valuable. So it’s providing 45% of all economic value if composition of ‘labor including AI’ does not change.

    2. Except of course if half of everything now has marginal cost epsilon (almost but not quite zero), then there will be a large shift in composition to doing more of those tasks.

  4. Dwarkesh compares radiologists to early Waymos where they had a guy in the front seat that never did anything so people felt better, and similarly if an AI can do 99% of a job the human doing the 1% can still be super valuable because bottleneck. Andrej points out radiology turns out to be a bad example for various reasons, suggests call centers.

    1. If you have 99 AI tasks and 1 human task, and you can’t do the full valuable task without all 100 actions, then in some sense the 1 human task is super valuable.

    2. In another sense, it’s really not, especially if any human can do it and there is now a surplus of humans available. Market price might drop quite low.

    3. Wages don’t go up as you approach 99% AI, as Dwarkesh suggests they could, unless you’re increasingly bottlenecked on available humans due to a Jevons Paradox situation or hard limit on supply, both of which are the case in radiology, or this raises required skill levels. This is especially true if you’re automating a wide variety of tasks and there is less demand for labor.

  5. Dwarkesh points out that we don’t seem to be on an AGI paradigm, we’re not seeing large productivity improvements for consultants and accountants. Whereas coding was a perfect fit for a first task, with lots of ready-made places to slot in an AI.

    1. Skill issue. My lord, skill issue.

    2. Current LLMs can do accounting out of the box, they can automate a large percentage of that work, and they can enable you to do your own accounting. If you’re an accountant and not becoming more productive? That’s on you.

    3. That will only advance as AI improves. A true AGI-level AI could very obviously do most accounting tasks on its own.

    4. Consultants should also be getting large productivity boosts on the knowledge work part of their job, including learning things, analyzing things and writing reports and so on. To the extent their job is to sell themselves and convince others to listen to them, AI might not be good enough yet.

    5. Andrej asks about automating creating slides. If AI isn’t helping you create slides faster, I mean, yeah, skill issue, or at least scaffolding issue.

  6. Dwarkesh says Andy Matuschak tried 50 billion things to get LLMs to write good spaced repetition prompts, and they couldn’t do it.

    1. I do not understand what went wrong with the spaced repetition prompts. Sounds like a fun place to bang one’s head for a while and seems super doable, although I don’t know what a good prompt would look like as I don’t use spaced repetition.

    2. To me, this points towards skill issues, scaffolding issues and time required to git gud and solve for form factors as large barriers to AI value unlocks.
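
Spelling out the arithmetic from points 2 and 3 above (a rough sketch that takes the Sonnet-sourced percentages, the “split the rest in half” move, and the “about double the economic value” assumption literally):

```python
# Rough arithmetic behind the "~75% of economic value" estimate and the
# follow-on "~45%" figure, using the percentages quoted above.

pure_knowledge = 0.325   # midpoint of the 30%-35% range
pure_manual = 0.145      # midpoint of the 12%-17% range
hybrid = 1 - pure_knowledge - pure_manual

# Split the hybrid work half and half between knowledge and manual tasks.
knowledge_share_of_tasks = pure_knowledge + hybrid / 2   # ~0.59, call it 60%

# Assume a knowledge task is worth ~2x a non-knowledge task.
k, m = knowledge_share_of_tasks, 1 - knowledge_share_of_tasks
knowledge_share_of_value = (2 * k) / (2 * k + 1 * m)      # ~0.75

# If AI handles ~60% of the value of that knowledge work, it provides roughly:
ai_share_of_all_value = knowledge_share_of_value * 0.6    # ~0.45

print(round(knowledge_share_of_tasks, 2),
      round(knowledge_share_of_value, 2),
      round(ai_share_of_all_value, 2))
```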

  1. What about superintelligence? “I see it as a progression of automation in society. Extrapolating the trend of computing, there will be a gradual automation of a lot of things, and superintelligence will be an extrapolation of that. We expect more and more autonomous entities over time that are doing a lot of the digital work and then eventually even the physical work some amount of time later. Basically I see it as just automation, roughly speaking.”

    1. That’s… not ASI. That’s intelligence denialism. AI as normal technology.

    2. I took a pause here. It’s worth sitting with this for a bit.

    3. Except it kind of isn’t, when you hear what he says later? It’s super weird.

  2. Dwarkesh pushes back: “But automation includes the things humans can already do, and superintelligence implies things humans can’t do.” Andrej gives a strange answer: “But one of the things that people do is invent new things, which I would just put into the automation if that makes sense.”

    1. No, it doesn’t make sense? I’m super confused what ‘just automation’ is supposed to meaningfully indicate?

    2. If what we are automating is ‘being an intelligence’ then everything AI ever does is always ‘just automation’ but that description isn’t useful.

    3. Humans can invent and do new things, but superintelligence can invent and do new things that are in practice not available to humans; ‘invent new things’ is not the relevant natural category here.

  3. Andrej worries about a gradual loss of control and understanding of what is happening, and thinks this is the most likely outcome. Multiple competing entities, initially competing on behalf of people, that gradually become more autonomous, some go rogue, others fight them off. They still get out of control.

    1. No notes, really. That’s the baseline scenario if we solve a few other impossible-level problems (or get extremely lucky that they’re not as hard as they look to me) along the way.

    2. Andrej doesn’t say ‘unless’ here, or offer a solution or way to prevent this.

    3. Missing mood?

  4. Dwarkesh asks, will we see an intelligence explosion if we have a million copies of you running in parallel super fast? Andrej says yes, but best believe in intelligence explosions because you’re already living in one and have been for decades, that’s why GDP grows, this is all continuous with the existing hyper-exponential trend, previous techs also didn’t make GDP go up much, everything was slow diffusion.

    1. It’s so weird to say ‘oh, yeah, the million copies of me sped up a thousand times would just be more of the same slow growth trends, ho hum, intelligence explosion,’ “it’s just more automation.”

  5. “We’re still going to have an exponential that’s going to get extremely vertical. It’s going to be very foreign to live in that kind of an environment.” … “Yes, my expectation is that it stays in the same [2% GDP growth rate] pattern.”

    1. I… but… um… I… what?

    2. Don’t you have to pick a side? He seems to keep trying to have his infinite cakes and eat them too, both an accelerating intelligence explosion and then magically GDP growth stays at 2% like it’s some law of nature.

  6. “Self-driving as an example is also computers doing labor. That’s already been playing out. It’s still business as usual.”

    1. Self-driving is a good example of slow diffusion of the underlying technology for various reasons. It’s been slow going, and mostly isn’t yet going.

    2. This is a clear example of an exponential that hasn’t hit you yet. Self-driving cars are Covid-19 in January 2020, except they’re a good thing.

    3. A Fermi estimate for car trips in America per week is around 2 billion, or for rideshares about 100 million per week.

    4. Waymo got to 100,000 weekly rides in August 2024 and was at 250,000 weekly rides in April 2025; we don’t yet have more recent data, but this market estimates roughly 500,000 per week by year end. That’s 0.5% of taxi rides. The projection for end of year 2026 says maybe 1.5 million rides per week, or 1.5%. (See the arithmetic sketch after this list.)

    5. Currently the share of non-taxi rides that are full self-driving is essentially zero; maybe 0.2% of trips have meaningful self-driving components.

    6. So very obviously, for now, this isn’t going to show up in the productivity or GDP statistics overall, or at least not directly, although I do think this is a non-trivial rise in productivity and lived experience in areas where Waymos are widely available for those who use it, most importantly in San Francisco.

  7. Karpathy keeps saying this will all be gradual capabilities gains and gradual diffusion, with no discrete jump. He suggests you would need some kind of overhang being unlocked such as a new energy source to see a big boost.

    1. I don’t know how to respond to someone who thinks we’re in an intelligence explosion, but refuses to include any form of such feedback into their models.

    2. That’s not shade, that’s me literally not knowing how to respond.

    3. It’s very strange to not expect any overhangs to be unlocked. That’s saying that there aren’t going to be any major technological ideas that we have missed.

    4. His own example is an energy source. If all ASI did was unlock a new method of cheap, safe, clean, unlimited energy, let’s say a design for fusion power plants, that were buildable in any reasonable amount of time, that alone would disrupt the GDP growth trend.
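
Spelling out the self-driving Fermi estimate from point 6 above (a rough sketch using the ride counts quoted there; the inputs are themselves estimates):

```python
# Rough arithmetic behind the Waymo share-of-rides figures quoted above.

rideshare_trips_per_week = 100e6   # Fermi estimate for US taxi/rideshare trips
all_car_trips_per_week = 2e9       # Fermi estimate for all US car trips

waymo_now = 500e3     # ~500k weekly rides estimated by end of 2025
waymo_2026 = 1.5e6    # ~1.5M weekly rides projected for end of 2026

print(f"share of taxi rides now:  {waymo_now / rideshare_trips_per_week:.1%}")   # 0.5%
print(f"share of taxi rides 2026: {waymo_2026 / rideshare_trips_per_week:.1%}")  # 1.5%
print(f"share of all car trips now: {waymo_now / all_car_trips_per_week:.3%}")   # 0.025%
```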

I won’t go further into the same GDP growth or intelligence explosion arguments I seem to discuss in many Dwarkesh Patel podcast posts. I don’t think Andrej has a defensible position here, in the sense that he is doing some combination of denying the premise of AGI/ASI and not taking into account its implications in some places while acknowledging the same dynamics in others.

Most of all, this echoes the common state of the discourse on such questions, which seems to involve:

  1. You, the overly optimistic fool, say AGI will arrive in 2 years, or 5 years, and you say that when it happens it will be a discrete event and then everything changes.

    1. There is also you, the alarmist, saying this would kill everyone, cause us to lose control or otherwise stand risk of being a bad thing.

  2. I, the wise world weary realist, say AGI will only arrive in 10 years, and it will be a gradual, continuous thing with no discrete jumps, facing physical bottlenecks and slow diffusion.

  3. So therefore we won’t see a substantial change to GDP growth, your life will mostly seem normal, there’s no risk of extinction or loss of control, and so on; building sufficiently advanced technology of minds smarter, faster, cheaper, and more competitive than ourselves along an increasing set of tasks will go great.

  4. Alternatively, I, the proper cynic, realize AI is simply a ‘normal technology’ and it’s ‘just automation of some tasks’ and they will remain ‘mere tools’ and what are you getting on about, let’s go build some economic models.

I’m fine with those who expect to at first encounter story #2 instead of story #1.

Except it totally, absolutely does not imply #3. Yes, these factors can slow things down, and 10 years are more than 2-5 years, but 10 years is still not that much time, and a continuous transition ends up in the same place, and tacking on some years for diffusion also ends up in the same place. It buys you some time, which we might be able to use well, or we might not, but that’s it.

What about story #4, which to be clear is not Karpathy’s or Patel’s? It’s possible that AI progress stalls out soon and we get a normal technology, but I find it rather unlikely and don’t see why we should expect that. I think that it is quite poor form to treat this as any sort of baseline scenario.

  1. Dwarkesh pivots to Nick Lane. Andrej is surprised evolution found intelligence and expects it to be a rare event among similar worlds. Dwarkesh suggests we got ‘squirrel intelligence’ right after the oxygenation of the atmosphere, which Sutton said was most of the way to human intelligence, yet human intelligence took a lot longer. They go over different animals and their intelligences. You need things worth learning but not worth hardcoding.

  2. Andrej notes LLMs don’t have a culture, suggests it could be a giant scratchpad.

    1. The backrooms? Also LLMs can and will have a culture because anything on the internet can become their context and training data. We already see this, with LLMs basing behaviors off observations of other prior LLMs, in ways that are often undesired.

  3. Andrej mentions self-play, says that he thinks the models can’t create culture because they’re ‘still kids.’ Savant kids, but still kids.

    1. Kids create culture all the time.

    2. No, seriously, I watch my own kids create culture.

    3. I’m not saying they in particular created a great culture, but there’s no question they’re creating culture.

  1. Andrej was at Tesla leading self-driving from 2017 to 2022. Why did self-driving take a decade? Andrej says it isn’t done. It’s a march of nines (of reliability). Waymo isn’t economical yet, Tesla’s approach is more scalable, and to be truly done would mean people wouldn’t need a driver’s license anymore. But he agrees it is ‘kind of real.’

    1. Kind of? I mean obviously self-driving can always improve, pick up more nines, get smoother, get faster, get cheaper. Waymo works great, and the economics will get there.

    2. Andrej is still backing the Tesla approach, and maybe they will make fools of us all but for now I do not see it.

  2. They draw parallels to AI and from AI to previous techs. Andrej worries we may be overbuilding compute, he isn’t sure, says he’s bullish on the tech but a lot of what he sees on Twitter makes no sense and is about fundraising or attention.

    1. I find it implausible that we are overbuilding compute, but it is possible, and indeed if it was not possible then we would be massively underbuilding.

  3. “I’m just reacting to some of the very fast timelines that people continue to say incorrectly. I’ve heard many, many times over the course of my 15 years in AI where very reputable people keep getting this wrong all the time. I want this to be properly calibrated, and some of this also has geopolitical ramifications and things like that with some of these questions. I don’t want people to make mistakes in that sphere of things. I do want us to be grounded in the reality of what technology is and isn’t.”

    1. Key quote.

    2. Andrej is not saying AGI is far in any normal person sense, or that its impact will be small, as he says he is bullish on the technology.

    3. What Andrej is doing is pushing back on the even faster timelines and bigger expectations that are often part of his world. Which is totally fair play.

    4. That has to be kept in perspective. If Andrej is right the future will blow your mind, it will go crazy.

    5. Where the confusion arises is where Andrej then tries to equate his timelines and expectations with calm and continuity, or extends those predictions forward in ways that don’t make sense to me.

    6. Again, I see similar things with many others, e.g. the communications of the White House’s Sriram Krishnan, saying AGI is far, but if you push, ‘far’ means things like 10 years. Which is not that far.

    7. I think Andrej’s look back has a similar issue of perspective. Very reputable people keep predicting specific AI accomplishments on timelines that don’t happen, sure, that’s totally a thing. But is AI underperforming the expectations of reputable optimists? I think progress in AI in general in the last 15 years, certainly since 2018 and the transformer, has been absolutely massive compared to general expectations, of course there were (and likely always will be) people saying ‘AGI in three years’ and that didn’t happen.

  1. Dwarkesh asks about Eureka Labs. Why not AI research? Andrej says he’s not sure he could improve what the labs are doing. He’s afraid of a WALL-E or Idiocracy problem where humans are disempowered and don’t do things. He’s trying to build Starfleet Academy.

    1. I think he’s right to be worried about disempowerment, but looking to education as a solution seems misplaced here? Education is great, all for it, but it seems highly unlikely it will ‘turn losses into wins’ in this sense.

    2. The good news is Andrej definitely has fully enough money so he can do whatever he wants, and it’s clear this stuff is what he wants.

  2. Dwarkesh Patel hasn’t seen Star Trek.

    1. Can we get this fixed, please?

    2. I propose a podcast which is nothing but Dwarkesh Patel watching Star Trek for the first time and reacting.

  3. Andrej thinks AI will fundamentally change education, and it’s still early. Right now you have an LLM, you ask it questions, that’s already super valuable but it still feels like slop, he wants an actual tutor experience. He learned Korean from a tutor 1-on-1 and that was so much better than a 10-to-1 class or learning on the internet. The tutor figured out where he was as a student, asked the right questions, and no LLM currently comes close. Right now they can’t.

    1. Strongly agreed on all of that.

  4. His first class is LLM-101-N, with Nanochat as the capstone.

    1. This raises the question of whether a class is even the right form factor at all for this AI world. Maybe it is, maybe it isn’t?

  5. Dwarkesh points out that if you can self-probe well enough you can avoid being stuck. Andrej contrasts LLM-101-N with his CS231n at Stanford on deep learning, that LLMs really empower him and help him go faster. Right now he’s hiring faculty but over time some TAs can become AIs.

  6. “I often say that pre-AGI education is useful. Post-AGI education is fun. In a similar way, people go to the gym today. We don’t need their physical strength to manipulate heavy objects because we have machines that do that. They still go to the gym. Why do they go to the gym? Because it’s fun, it’s healthy, and you look hot when you have a six-pack. It’s attractive for people to do that in a very deep, psychological, evolutionary sense for humanity. Education will play out in the same way. You’ll go to school like you go to the gym.”

  7. “If you look at, for example, aristocrats, or you look at ancient Greece or something like that, whenever you had little pocket environments that were post-AGI in a certain sense, people have spent a lot of their time flourishing in a certain way, either physically or cognitively. I feel okay about the prospects of that. If this is false and I’m wrong and we end up in a WALL-E or Idiocracy future, then I don’t even care if there are Dyson spheres. This is a terrible outcome. I really do care about humanity. Everyone has to just be superhuman in a certain sense.”

    1. (on both quotes) So, on the one hand, yes, mostly agreed, if you predicate this on the post-AGI post-useful-human-labor world where we can’t do meaningful productive work and also get to exist and go to the gym and go around doing our thing like this is all perfectly normal.

    2. On the other hand, it’s weird to expect things to work out like that, although I won’t reiterate why, except to say that if you accept that the humans are now learning for fun then I don’t think this jibes with a lot of Andrej’s earlier statements and expectations.

    3. If you’re superhuman in this sense, that’s cool, but if you’re less superhuman than the competition, then does it do much beyond being cool? What are most people going to choose to do with it? What is good in life? What is the value?

    4. This all gets into much longer debates and discussions, of course.

  8. “I think there will be a transitional period where we are going to be able to be in the loop and advance things if we understand a lot of stuff. In the long-term, that probably goes away.”

    1. Okay, sure, there will be a transition period of unknown length, but that doesn’t as they say solve for the equilibrium.

    2. I don’t expect that transition period to last very long, although there are various potential values for very long.

  9. Dwarkesh asks about teaching. Andrej says everyone should learn physics early, since early education is about booting up a brain. He looks for first or second order terms of everything. Find the core of the thing and understand it.

    1. Our educational system is not about booting up brains. If it was, it would do a lot of things very differently. Not that we should let this stop us.

  10. Curse of knowledge is a big problem: if you’re an expert in a field, often you don’t know what others don’t know. Could be helpful to see other people’s dumb questions that they ask an LLM?

  11. From Dwarkesh: “Another trick that just works astoundingly well. If somebody writes a paper or a blog post or an announcement, it is in 100% of cases that just the narration or the transcription of how they would explain it to you over lunch is way more, not only understandable, but actually also more accurate and scientific, in the sense that people have a bias to explain things in the most abstract, jargon-filled way possible and to clear their throat for four paragraphs before they explain the central idea. But there’s something about communicating one-on-one with a person which compels you to just say the thing.”

    1. Love it. Hence we listen to and cover podcasts, too.

    2. I think this is because in a conversation you don’t have to be defensible or get judged or be technically correct, you don’t have to have structure that looks good, and you don’t have to offer a full explanation.

    3. As in, you can gesture at things, say things without justifications, watch reactions, see what lands, fill in gaps when needed, and yeah, ‘just say the thing.’

    4. That’s (a lot of) why the written version isn’t like the lunch explanation, plus habit: it isn’t done that way because it isn’t done that way.

Peter Wildeford offers his one page summary, which I endorse as a summary.

Sriram Krishnan highlights part of the section on education, which I agree was excellent, and recommends the overall podcast highly.

Andrej Karpathy offered his post-podcast reactions here, including a bunch of distillations, highlights and helpful links.

Here’s his summary on the timelines question:

Andrej Karpathy: Basically my AI timelines are about 5-10X pessimistic w.r.t. what you’ll find in your neighborhood SF AI house party or on your twitter timeline, but still quite optimistic w.r.t. a rising tide of AI deniers and skeptics

Those house parties must be crazy, as must his particular slice of Twitter. He has AGI 10 years away and he’s saying that’s 5-10X pessimistic. Do the math: that implies the house party consensus is AGI in one to two years.

My slice currently overall has 4-10 year expectations. The AI 2027 crowd has some people modestly shorter, but even they are now out in 2029 or so I think.

That’s how it should work, evidence should move the numbers back and forth, and if you had a very aggressive timeline six months or a year ago recent events should slow your roll. You can say ‘those people were getting ahead of themselves and messed up’ and that’s a reasonable perspective, but I don’t think it was obviously a large mistake given what we knew at the time.

Peter Wildeford: I’m desperate for a worldview where we agree both are true:

– current AI is slop and the marketing is BS, but

– staggering AI transformation (including extinction) is 5-20 years out, this may not be good by default, and thus merits major policy action now

I agree with the second point (with error bars). The first point I would rate as ‘somewhat true.’ Much of the marketing is BS and much of the output is slop, no question, but much of it is not on either front and the models are already extremely helpful to those who use them.

Peter Wildeford: If the debate truly has become

– “AGI is going to take all the jobs in just two years” vs.

– “no you idiot, don’t buy the hype, AI is really slop, it will take 10-20 years before AGI automates all jobs (and maybe kill us)”

…I feel like we have really lost the big picture here

[meme credit: Darth thromBOOzyt]

Similarly, the first position here is obviously wrong, and the second position could be right on the substance but has one hell of a Missing Mood; 10-20 years before all jobs get automated is kind of the biggest thing that happened in the history of history even if the process doesn’t kill or disempower us.

Rob Miles: It’s strange that the “anti hype” position is now “AGI is one decade away”. That… would still be a very alarming situation to be in? It’s not at all obvious that that would be enough time to prepare.

It’s so crazy the extent to which vibes can supposedly shift when objectively nothing has happened and even the newly expressed opinions aren’t so different from what everyone was saying before; it’s just that now we’re phrasing it as ‘this is long timelines’ as opposed to ‘this is short timelines.’

John Coogan: It’s over. Andrej Karpathy popped the AI bubble. It’s time to rotate out of AI stocks and focus on investing in food, water, shelter, and guns. AI is fake, the internet is overhyped, computers are pretty much useless, even the steam engine is mid. We’re going back to sticks and stones.

Obviously it’s not actually that bad, but the general tech community is experiencing whiplash right now after the Richard Sutton and Andrej Karpathy appearances on Dwarkesh. Andrej directly called the code produced by today’s frontier models “slop” and estimated that AGI was around 10 years away. Interestingly this lines up nicely with Sam Altman’s “The Intelligence Age” blog post from September 23, 2024, where he said “It is possible that we will have superintelligence in a few thousand days (!); it may take longer, but I’m confident we’ll get there.”

I read this timeline to mean a decade, which is what people always say when they’re predicting big technological shifts (see space travel, quantum computing, and nuclear fusion timelines). This is still earlier than Ray Kurzweil’s 2045 singularity prediction, which has always sounded on the extreme edge of sci-fi forecasting, but now looks bearish.

Yep, I read Altman as ~10 years there as well. Except that Altman was approaching that correctly as ‘quickly, there’s no time’ rather than ‘we have all the time in the world.’

There’s a whole chain of AGI-soon bears who feel vindicated by Andrej’s comments and the general vibe shift. Yann LeCun, Tyler Cowen, and many others on the side of “progress will be incremental” look great at this moment in time.

This George Hotz quote from a Lex Fridman interview in June of 2023 now feels way ahead of the curve, at the time: “Will GPT-12 be AGI? My answer is no, of course not. Cross-entropy loss is never going to get you there. You probably need reinforcement learning in fancy environments to get something that would be considered AGI-like.”

Big tech companies can’t turn on a dime on the basis of the latest Dwarkesh interview though. Oracle is building something like $300 billion in infrastructure over the next five years.

It’s so crazy to think a big tech company would think ‘oops, it’s over, Dwarkesh interviews said so’ and regret or pull back on investment. Also, yeah, it’s weird that Amazon was up 1.6% while AWS was down.

Danielle Fong: aws down, amazon up

nvda barely sweating

narrative bubbles pop more easily than market bubbles

Why would you give Hotz credit for ‘GPT-12 won’t be AGI’ here, when the timeline for GPT-12 (assuming GPT-11 wasn’t AGI, so we’re not accelerating releases yet) is something like 2039? Seems deeply silly. And yet here we are. Similarly, people supposedly ‘look great’ when others echo previous talking points? In my book, you look good based on actual outcomes versus predictions, not when others also predict, unless you are trading the market.

I definitely share the frustration Liron had here:

Liron Shapira: Dwarkesh asked Karpathy about the Yudkowskian observation that exponential economic growth to date has been achieved with *constant* human-level thinking ability.

Andrej acknowledged the point but said, nevertheless, he has a strong intuition that 2% GDP growth will hold steady.

Roon: correction, humanity has achieved superexponential economic growth to date

Liron: True.

In short, I don’t think a reasonable extrapolation from above plus AGI is ~2%.

But hey, that’s the way it goes. It’s been a fun one.


On Dwarkesh Patel’s Podcast With Andrej Karpathy Read More »