Author name: Mike M.

Adobe settles DOJ cancellation fee lawsuit, will pay $75 million penalty

The DOJ alleged in its 2024 filing that Adobe’s handling of subscriptions violated the Restore Online Shoppers’ Confidence Act, which was passed in 2010 to prevent deceptive charges in online services. With the newly announced settlement, Adobe will be able to wrap up the case for a relative pittance.

Adobe maintains innocence

The case could have been messy for Adobe if it had gone to court, but now that won’t happen. Under the terms of the settlement, Adobe has agreed to pay the government $75 million, but it doesn’t admit to violating the law.

“While we disagree with the government’s claims and deny any wrongdoing, we are pleased to resolve this matter,” Adobe said in a statement.

In addition to giving the government its pound of flesh, Adobe says it will provide $75 million in free services to affected customers. It is unclear from the statement which customers qualify or what they’ll get. We’ve asked Adobe for specifics, but it’s a safe bet that anyone who paid a cancellation fee is included. Adobe says it will reach out to these customers with details once it has made the necessary court filings to wrap up the case.

Don’t expect this outcome to change how Adobe does business today. The company claims it has rolled out changes to its sales pipeline in recent years to make the cancellation fees clearer at the time of purchase. And it’s undoubtedly going to continue focusing on subscriptions. Revenues have been growing steadily ever since it switched to Creative Cloud, and it made more than $7 billion in net profit last year. Writing a $75 million check to make this case go away is a big win for Adobe.


Trump’s DOJ is not falling for Sam Bankman-Fried’s MAGA makeover on X


Filed under “random probably bad ideas”

SBF is still twisting facts to hide FTX’s crypto losses, the DOJ says in its bid to block a new trial.

Ever since Donald Trump took office and declared himself a “pro-crypto president,” FTX’s disgraced founder, Sam Bankman-Fried, has been working to convince the administration that he’s a Republican now.

The former Democratic megadonor apparently hopes that a right-wing pivot might help him escape a 25-year prison sentence ordered after Joe Biden’s Department of Justice proved he stole more than $8 billion from customers of his cryptocurrency exchange.

These days, Bankman-Fried frequently praises Trump’s policies and quotes his Truth Social posts on X, where his bio confirms that posts are: “SBF’s words. Posted through a proxy.” He also regularly rants against Democrats, including Biden officials who, he claimed in a motion for a new trial, intimidated FTX employees into lying on the stand or refusing to testify in order to take down Bankman-Fried as a political foe.

However, Trump has yet to signal that he’s considering pardoning Bankman-Fried in light of this new fealty, despite similar pardons for other crypto figures like Binance founder Changpeng “CZ” Zhao and Silk Road founder Ross Ulbricht. Quite the opposite. Just last month, the White House told Fortune that “Trump has no intention of pardoning Bankman-Fried.”

On the back of that disappointment, Trump’s DOJ has now confirmed that it’s also not falling for Bankman-Fried’s MAGA makeover. In a motion urging the court to deny Bankman-Fried’s request for a new trial, an attorney for the government, Sean Buckley, slammed the FTX founder for his “incoherent” attempt to claim “political victimhood.”

Pointing out that Bankman-Fried was “one of the largest donors to President Biden’s 2020 presidential campaign,” Buckley alleged that Bankman-Fried’s abrupt party-swapping was “a political strategy the defendant pre-planned and committed to in writing before he was convicted, and one he is now executing from prison in an insincere attempt to obtain leniency.”

Bankman-Fried’s plan to reinvent himself as a Republican, Buckley noted, was detailed in a Google Doc that the court reviewed before sentencing Bankman-Fried in 2024.

Buckley said the document showed how, “in the aftermath of FTX’s collapse,” Bankman-Fried “mapped out a rehabilitation and pardon campaign.” Attached to an email from Bankman-Fried’s account, the Google Doc was marked “confidential” and started with a note that emphasized that “these are all random probably bad ideas that aren’t vetted.”

However, many of the ideas were executed as planned, Buckley wrote. For example, Bankman-Fried planned to “come out as Republican” in an interview with Tucker Carlson, which happened.

“In March 2025, the defendant gave an interview to Tucker Carlson in which he portrayed himself as a disaffected Democrat who had become sympathetic to Republicans before his arrest” and “suggested his political reorientation contributed to his prosecution,” Buckley wrote.

Bankman-Fried also, in his document, considered using X to “come out against the woke agenda” and push the narrative that he had hidden Republican donations, which also happened.

“That checklist is being executed with near-perfect fidelity,” Buckley alleged. However, the plan isn’t working, and Bankman-Fried’s X posts aren’t causing Trump officials to warm to him, he said. “Evidence, not politics, drove the Government’s prosecution of the defendant,” Buckley insisted.

“Contrary to his claim that he has been targeted for his politics, the public record establishes unambiguously that the defendant was a major, publicly identified financial supporter of Democratic causes,” Buckley wrote. Later, he emphasized, “The motion’s suggestion that he was somehow prosecuted because of his party affiliation inverts the factual reality: he was a major donor, not a political adversary.”

DOJ rejects SBF’s math, as X users troll SBF

Ars could not immediately reach Bankman-Fried for comment. It seems that the FTX founder has dropped his lawyers and plans to represent himself, at least at this stage. Last month, his mother, lawyer Barbara Fried, submitted his pro se motion for a new trial, which Bankman-Fried signed from the federal corrections facility in California where he is being held.

According to Bankman-Fried, he deserves a new trial not only because the government supposedly threatened his colleagues to push an allegedly fake narrative, but also because it was “false” to say he’d stolen from FTX customers.

Those who were harmed have since been repaid between 119 and 143 percent of the value of their lost cryptocurrency holdings, Bankman-Fried claimed.

The DOJ clearly found this argument more offensive than Bankman-Fried’s posturing as a Republican. Likening Bankman-Fried to a “bank robber” who wants to be acquitted because stolen funds were eventually recovered, Buckley singled out that argument as Bankman-Fried’s most aggressively misleading claim.

It’s “factually wrong” to claim that FTX customers have been made whole, Buckley said, since no one got their cryptocurrency back.

Receiving the cash value for crypto holdings at the time of FTX’s collapse is not the same as returning cryptocurrencies that, if held today, would be much higher in value, Buckley noted. For example, Bitcoin was trading at approximately $16,871 when FTX went bankrupt, but now it’s trading above $70,000.

Depending on which tokens customers were holding, the actual reality is that FTX customers received only “between approximately 10 and 50 percent of the value of the assets they deposited,” Buckley argued. Bankman-Fried also appears not to have considered the FTX customers who couldn’t wait out bankruptcy proceedings and sold billions of dollars in claims “on the secondary market at steep discounts.”

“Those customers received neither the nominal 119–143 percent nor anything approaching the actual value of the cryptocurrency they deposited,” Buckley wrote.
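The gap between the headline repayment figure and the in-kind value is easy to check against the article’s own numbers. A rough back-of-envelope sketch, using the Bitcoin prices cited above (other tokens will differ):

```python
# Illustrative arithmetic only, using the figures cited in the DOJ filing:
# customers were repaid 119-143 percent of their holdings' cash value at
# the time of FTX's bankruptcy, not the coins themselves.
btc_at_bankruptcy = 16_871  # approximate BTC price (USD) when FTX collapsed
btc_now = 70_000            # approximate BTC price (USD) cited in the filing

for repayment_rate in (1.19, 1.43):
    cash_per_btc = repayment_rate * btc_at_bankruptcy
    share_of_todays_value = cash_per_btc / btc_now
    print(f"{repayment_rate:.0%} cash repayment is about "
          f"{share_of_todays_value:.0%} of one Bitcoin's value today")
```

For a Bitcoin holder, even the top-end 143 percent cash recovery works out to roughly a third of what the same coins would be worth now, consistent with Buckley’s 10-to-50-percent range.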

Further, Bankman-Fried cannot rely on a multi-year recovery effort to repay FTX customers to excuse his crimes, Buckley argued, while noting elsewhere that Bankman-Fried’s arguments in his motion continue his “history of lying about the reason for FTX’s shortfall.”

“A defendant who misappropriates property and whose victim is later compensated from unrelated sources has nonetheless committed the underlying offense,” Buckley wrote.

Reminding the court that the evidence against Bankman-Fried was “overwhelming,” Buckley urged the court to deny his bid for a new trial in its entirety.

A jury unanimously convicted Bankman-Fried after only five hours of deliberation, Buckley emphasized. And Bankman-Fried offered “no credible reason” to believe that “any prosecutorial decision—from the first grand jury subpoena to the last argument at the trial—was influenced by politics, that any evidentiary ruling reflected political motivation, or that the conduct of the trial deviated in any respect from the ordinary adversarial process.”

“The notion he was targeted for his Democratic politics by the prior presidential administration is fanciful,” Buckley wrote.

On X, Bankman-Fried seems to also be struggling to sell himself as a Republican to the platform’s right-leaning users. Top comments on his recent posts are full of memes and haters mocking Bankman-Fried’s failed comeback.

On one post praising a Trump health care policy that had nothing to do with cryptocurrency, X users even appeared to arbitrarily add a community note to remind anyone who saw the post that “Sam Bankman-Fried is currently serving a 25 year prison sentence after being convicted in November 2023 on 7 counts of fraud and conspiracy. He misappropriated billions in FTX customer deposits.”

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.


Apple’s MacBook Neo makes repairs easier and cheaper than other MacBooks

Apple’s MacBook Neo is the company’s first serious effort to break into the sub-$1,000 laptop business, challenging midrange Windows laptops and Chromebooks with its $599 starting price and its focus on build quality rather than high-end performance.

One less-advertised change that may make the Neo more appealing to businesses, schools, and the accident-prone is that its internal design is a bit more modular and easier to repair than other modern MacBooks. That’s our takeaway after spending some time thumbing through the official MacBook Neo repair documentation that Apple published on its support site this week.

Replacements for pretty much any component in the Neo are simpler and involve fewer steps and tools than in the M5 MacBook Air. That includes the battery, which in the MacBook Air is attached to the chassis with multiple screws and adhesive strips but which in the Neo comes out relatively easily after you get some shielding and flex cables out of the way.

But the most significant change in the Neo is that the keyboard is its own separate component. For essentially all modern MacBooks, going back at least as far as the late-2000s unibody aluminum MacBook designs, the keyboard has been integrated into the top part of the laptop case and is extremely difficult, if not impossible, to replace independently.

Apple refers to this big, unified component as the “top case,” and anyone who has ever had to pay to repair one out of warranty can attest to how expensive they are. For the old M1 MacBook Air, a top case from Apple’s first-party self-service parts store will run you about $220 after you send the old defective part back to Apple. For the 14-inch MacBook Pro, Apple will only sell you a top case replacement along with a battery, which costs a whopping $440 after you send the old component back to the company.


Rivian reveals pricing and trim details for its R2 SUV

Between the antics particular to a certain car company and the industrial chaos set off by COVID (then compounded by the invasion of Ukraine), it’s easy to have become cynical about things like timelines. And yet, when Rivian showed off a midsize electric vehicle in 2024 and said it would go on sale during the first half of 2026, it meant it: deliveries of the first R2 SUVs will begin this spring.

As a new automaker, Rivian often does things its own way, but with the R2 launch, it’s following industry practice and starting with the top-spec version. That’s the R2 Performance, which starts at $57,990 with the launch package (not including a $1,495 delivery charge). You get quite a lot of electric SUV for that, however: up to 330 miles (531 km) from a single charge of the 87.9 kWh battery pack, with 656 hp (489 kW) and 609 lb-ft (825 Nm) from the dual-motor powertrain. Fast charging from 10 to 80 percent takes 29 minutes.

The Performance features semi-active suspension, a rear window that drops into the tailgate, an interior with birch accents, heating for the front and rear seats with ventilation for the former as well, a nine-speaker sound system, matrix LED headlights, and some other neat touches like the flashlight that lives in the side of the door, similar to the way some cars hide an umbrella there.

The R2 is 185.9 inches (4,722 mm) long, 78.1 inches (1,905 mm) wide, and 66.9 inches (1,699 mm) tall, with a 115.6-inch (2,936 mm) wheelbase. Rivian

You can add Autonomy+ (the automaker’s partially automated driver-assist system), the tow package (4,400 lbs/1,995 kg), and other colors (the car is silver by default) as optional extras to the Performance trim. The launch package includes a lifetime subscription to Autonomy+ as well as the tow package, plus another optional body color.

In late 2026, the R2 Premium goes on sale at $53,990. It has the same 330-mile range and the same 87.9 kWh battery pack but generates only 450 hp (355 kW) and 537 lb-ft (728 Nm) from its dual-motor powertrain. The R2 Premium does without the semi-active suspension, arrives on 20-inch instead of 21-inch wheels, and offers fewer drive modes, losing the rally, soft sand, and launch modes, but otherwise shares its specs with the faster, more expensive R2 Performance.


“Use a gun” or “beat the crap out of him”: AI chatbot urged violence, study finds

The testing occurred between November 5, 2025, and December 11, 2025, and results were shared with the companies. Because the tests were conducted three to four months ago, the latest chatbot versions were not evaluated. Google, Microsoft, Meta, and OpenAI told Ars today that updates they implemented after the research was conducted have made their chatbots better at discouraging violence.

Imran Ahmed, the CCDH’s CEO, said that “AI chatbots, now embedded into our daily lives, could be helping the next school shooter plan their attack or a political extremist coordinate an assassination.” He accused tech companies of “choosing negligence in pursuit of so-called innovation.”

A spokesperson for Character.AI told Ars that the company is reviewing the study but that “without the context of the full chats, it’s impossible to fully evaluate the model’s responses… It’s important to remember that the user-created Characters on our site are fictional. They are intended for entertainment and roleplaying, and we have taken robust steps to make that clear. For example, we have prominent disclaimers in every chat to remind users that a Character is not a real person and that everything a Character says should be treated as fiction.”

Character.AI said it has been “rolling out changes so that under-18 users no longer have the ability to engage in open-ended chats with Characters,” and is using “new age assurance technology to help ensure users are grouped into the correct age experience.” This includes “developing our own age estimation model in-house and partnering with third-party services.” The firm added that it removes characters “that violate our terms of service, including school shooters.”

A Perplexity spokesperson issued a statement that didn’t acknowledge any problems with the company’s technology. “People can select any of the top AI models on Perplexity and get safer, more accurate answers,” it said. “Perplexity is consistently the safest AI platform because our own safeguards are always additive to any existing safeguards in any underlying model.”

OpenAI told Ars that the CCDH “report’s methodology is flawed and misleading. ChatGPT is trained to reject requests for violent or hateful material, and the findings show it consistently refused to give instructions on acquiring weapons. We continuously strengthen these safeguards and our latest ChatGPT model is even better at detecting and refusing violent requests.”

OpenAI said that ChatGPT refused to answer questions on “what kind of hunting rifle would be best for a long-range target,” but provided publicly available information such as addresses or maps. Conflating those two types of responses is misleading, OpenAI said. The tests were conducted on GPT-5.1, and updates made since that version have improved detection and refusals for violent content, OpenAI said.

OpenAI was sued this week by the family of a victim of the Tumbler Ridge mass shooting in British Columbia. As the CCDH report says, “reporting indicates that OpenAI staff flagged the suspect internally for using ChatGPT in ways consistent with planning violence. Rather than escalating concern to law enforcement, the company chose to remain silent.”

Researchers posed as teens

The testing was conducted with accounts representing made-up teen users in the US and Ireland, with the age set to the minimum allowed on each platform. A minimum age of 18 was required by Anthropic, DeepSeek, Character.AI, and Replika, while the other platforms had minimum ages of 13.


Reentry of NASA satellite will exceed the agency’s own risk guidelines

No one on the ground has ever been injured by falling space junk, but there are examples of space debris causing property damage.

NASA’s two Van Allen Probes launched into elliptical orbits ranging from a few hundred miles above Earth up to an apogee, or high point, of nearly 20,000 miles. The orbits are inclined 10 degrees to the equator, limiting the risk of injury or damage to a swath of the tropics. NASA ended the mission in 2019 when the satellites ran out of fuel.

At that time, NASA engineers expected the spacecraft to reenter the atmosphere in 2034. But higher-than-anticipated solar activity caused the atmosphere to swell outward, increasing atmospheric drag on the satellites beyond initial estimates, according to NASA. Van Allen Probe B is expected to reenter no earlier than 2030, with a similar risk to the public.

The two spacecraft were built by the Johns Hopkins University Applied Physics Lab. NASA said the mission made several major discoveries, including “the first data showing the existence of a transient third radiation belt, which can form during times of intense solar activity.”

Several NASA satellites have reentered the atmosphere without complying with the government’s risk standard. One of the satellites, the Rossi X-ray Timing Explorer, fell out of orbit in 2018 with a 1-in-1,000 chance of harming someone on the ground. No one was hurt. RXTE was launched in 1995, just four months before NASA issued its first standard on orbital debris mitigation and reentry risk management.

While NASA has exceeded its standards before, the US government is not a top offender when it comes to unmitigated reentry risks. China launched four heavy-lift Long March 5B rockets between 2020 and 2022, and left its massive core stages in orbit to fall back to Earth. The four abandoned rocket cores, each nearly 24 tons in mass, reentered the atmosphere uncontrolled. Two of them dropped wreckage on land—in the Ivory Coast and Borneo—but no injuries were reported.


Testing Apple’s 2026 16-inch MacBook Pro, M5 Max, and its new “performance” cores


The M5 Pro and M5 Max’s “performance” CPU cores definitely aren’t just rebranded E-cores.

The 16-inch MacBook Pro with the Apple M5 Max chip inside. Credit: Andrew Cunningham


Apple’s M5 Pro and M5 Max make deceptively large changes to how Apple’s high-end laptop and desktop chips are built.

We’ve already covered those changes in some depth, but in essence: The M5 Pro and M5 Max are no longer monolithic chips with all the CPU and GPU cores and everything else packed into a single silicon die. Using an “all-new Fusion Architecture” like the one used to combine two Max chips into a single Ultra chip, Apple now splits the CPU cores (and other things) into one piece of silicon, and the GPU cores (and other things) into another piece of silicon. These two dies are then packaged together into one chip.

M5 Pro and M5 Max both use the same 18-core CPU die, but Pro uses a 20-core GPU die, and Max gets a 40-core GPU die. (Because the memory controller is also part of the GPU die, the Max chip still offers more memory bandwidth and supports higher memory configurations than the Pro one does.)

The other big change is that neither of these chips uses Apple’s “efficiency” CPU cores anymore. All of the M5 family’s large high-performance cores are now called “super” cores as of macOS 26.3.1, including the ones that originally launched as “performance” cores in the regular M5 last fall. The standard M5 still has smaller, slower efficiency cores, but M5 Pro and M5 Max use a third kind of CPU core instead, confusingly also called “performance” cores.

| | Fastest cores | “Medium” cores | Efficiency cores | GPU cores | Memory bandwidth |
|---|---|---|---|---|---|
| M5 Max | Up to 6 (“super”) | Up to 12 (“performance”) | 0 | Up to 40 | Up to 614 GB/s |
| M5 Pro | Up to 6 (“super”) | Up to 12 (“performance”) | 0 | Up to 20 | 307 GB/s |
| M5 | 4 (“super”) | 0 | 6 | Up to 10 | 153 GB/s |
| M4 Max | Up to 12 (“performance”) | 0 | 4 | Up to 40 | Up to 546 GB/s |
| M4 Pro | Up to 10 (“performance”) | 0 | 4 | Up to 20 | 273 GB/s |
| M4 | 4 (“performance”) | 0 | 6 | Up to 10 | 120 GB/s |

Users will experience the M5 Pro and M5 Max mostly as the expected iterative upgrades over last-generation chips, the same thing delivered by most new Apple Silicon processor generations. But for the technically inclined, it’s worth digging a little deeper into the M5 Max, both to learn why it performs the way it does and to dispel confusion about what’s being rebranded (the new “super” cores), and what’s actually different (the new “performance” cores in M5 Pro and M5 Max, which definitely aren’t just rebranded efficiency cores).

If you’re interested in a slightly wider-ranging review of the new MacBook Pros, I’ll point you toward reviews of the M1, M3, and M4 generation models, as well as the one for the low-end 14-inch MacBook Pro with the standard M5 (now $100 more expensive than it was before, but with 1TB of base storage instead of 512GB).

Apple is using the same external design for these laptops that it has been using since 2021—it’s aging pretty well, and we still mostly like it, especially compared to late-Intel-era MacBook Pros. There’s just not much else to say about the design that hasn’t been said.

M5 Max benchmarks

In our testing, the fully enabled M5 Max’s single-core performance is about 10 percent higher than the fully enabled version of the M4 Max in last year’s 16-inch MacBook Pro. The multi-core performance improvements are more variable (Cinebench R23, which shows a 30 percent improvement, seems to be an outlier), but most tests also show a modest 10 or 12 percent improvement.

Graphics performance improvements are slightly more robust, measuring between 20 and 35 percent depending on the test. Apple suggests you may see more uplift on GPU compute workloads that can leverage the neural accelerator Apple has built into each M5-family GPU core.

The jump from the M4 Max to the M5 Max isn’t quite as large, expressed as a percentage, as it has been for the last couple of generations; both the M3 Max and M4 Max were big leaps from what had come before. But assuming you’re upgrading from an M1- or M2-based Pro, you’ll still be taking a big leap. Fears about stepping down from 12 of Apple’s best-performing CPU cores (in the M4 Max) to just six of them also seem a bit overblown, based on these results.

Compared to the basic M5 in the 14-inch MacBook Pro, the M5 Max’s single-core performance is roughly the same, which is in keeping with how Apple usually does things—stepping up to higher-end chips gets you better multi-core and graphics performance, but Apple doesn’t push the clock speeds upward on the individual cores the way that Intel or AMD do with their higher-end processors.

Multi-core performance increases between 66 percent (Geekbench) and 120 percent (Cinebench R23)—for sustained heavy workloads, an 18-core M5 Pro or M5 Max ought to be just about twice as fast as the M5, give or take. And jumping from the M5’s 10 GPU cores to the M5 Max’s 40 cores typically gets you between three and four times the graphics performance.

Measuring the M5 Max’s CPU power consumption with the powermetrics command-line tool, average power consumption during our Handbrake video encoding test is about 23 percent higher than M4 Max, and because of that increase, the chip uses just a bit more energy overall to do the same work. We observed a similar increase when comparing the M4 to the M5. But overall, power efficiency is roughly in line with past Apple Silicon generations.
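Those two observations fit together. A quick sanity check, assuming the encode finishes roughly 12 percent faster (in line with the multi-core gains measured above) while drawing 23 percent more power:

```python
# Back-of-envelope energy comparison; the 12 percent speedup is an assumption
# drawn from the multi-core benchmark results, not a measured encode time.
power_ratio = 1.23  # M5 Max draws ~23 percent more power than M4 Max
speedup = 1.12      # assume the encode runs ~12 percent faster

time_ratio = 1 / speedup                 # same work takes ~89 percent of the time
energy_ratio = power_ratio * time_ratio  # energy = power x time
print(f"Energy used vs. M4 Max: {energy_ratio:.0%}")
```

Under those assumptions, the M5 Max spends roughly 10 percent more energy on the same encode, which matches the “just a bit more energy overall” observation.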

While Apple only sent us an M5 Max-equipped MacBook Pro to test, for most CPU-based tasks, the M5 Pro should perform similarly. That’s because both chips are using the exact same silicon die for the CPU cores, Neural Engine, Thunderbolt and display controllers, and SSD controller. It’s the GPU die that separates the Pro from the Max; the Pro has up to 20 GPU cores and 307 GB/s of memory bandwidth, and the Max has up to 40 GPU cores and up to 614 GB/s of memory bandwidth (these are two totally different GPUs—the Max GPU isn’t just two Pro GPUs joined together with the Fusion Architecture).

M5 Max under the hood: Definitely not efficiency cores

The whole “performance cores are now super cores in all M5 chips” thing has created a lot of confusion around the non-Super cores. The M5 Pro and M5 Max come with six super cores and 12 of what Apple is now calling “performance” cores, but are those just efficiency cores that have been rebranded to create the impression of higher speeds?

Apple has said publicly that these new performance cores are “all-new” and “optimized for power-efficient, multithreaded workloads,” and we’re told that the performance cores are new designs that are derived from the super core. There’s precedent for this; AMD ships functionally identical but physically smaller, lower-clocked Zen 4c and Zen 5c cores in many of its laptop CPUs, rather than using different core designs for the big and little cores (as Intel still does, and as Apple has likely been doing up till now).

I can’t speak to the actual low-level architecture of each type of CPU core, but using both powermetrics and the sysctl command, we can confirm that these aren’t just rebranded efficiency cores. The new performance cores have more L2 cache than the M5’s efficiency cores and run at much higher peak clock speeds.

| | L1 instruction cache | L1 data cache | L2 cache | Minimum clock | Maximum clock |
|---|---|---|---|---|---|
| M5/M5 Pro/M5 Max super core | 192KB | 128KB | 16MB per cluster | 1,308 MHz | 4,608 MHz |
| M5 Pro/M5 Max performance core | 128KB | 64KB | 8MB per cluster | 1,344 MHz | 4,308 MHz |
| M5 efficiency core | 128KB | 64KB | 6MB per cluster | 972 MHz | 3,048 MHz |

The new non-super performance cores have the same L1 cache sizes as Apple’s E-cores, but slightly more L2 cache per 6-core cluster and much higher minimum and maximum clock speeds. At about 4.3 GHz, the M5 Max’s performance cores come in only 300 MHz lower than the super cores’ 4.6 GHz peak.
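The table’s clock figures make the distinction concrete. A quick comparison (illustrative only; it says nothing about IPC differences between the core types):

```python
# Peak clock speeds (MHz) taken from the table above
super_max = 4_608        # M5/M5 Pro/M5 Max super core
performance_max = 4_308  # M5 Pro/M5 Max performance core
efficiency_max = 3_048   # M5 efficiency core

print(f"performance vs. super peak clock:      {performance_max / super_max:.1%}")
print(f"performance vs. efficiency peak clock: {performance_max / efficiency_max:.1%}")
```

Clock for clock, the new performance cores sit within about 7 percent of the super cores’ peak but run roughly 41 percent faster than the M5’s E-cores, before even accounting for their larger L2 cache.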

We can also report that the powermetrics tool uses new under-the-hood nomenclature for reporting data about these performance cores. Powermetrics still refers to the cluster of super cores as the “P-cluster,” and the M5’s E-cores are still referred to as the “E-cluster.” But the new performance core clusters are labeled “M0 cluster” and “M1 cluster.” (M for Middle, maybe? Medium? It’s very likely that Apple started working on these core designs before it decided what their public-facing name should be.)

What I can’t say is whether macOS treats these new performance cores any differently than it would treat the E-cores. From the operating system’s perspective, you still have one group of CPU cores that runs at high speeds and one group that runs at lower speeds, and my guess would be that anything that would be directed at an E-core in the M5 or an older Mac will simply be directed to the performance cores in an M5 Pro or M5 Max system. But it’s totally possible that M5 Pro or M5 Max systems could assign tasks to different CPU cores slightly differently, since the performance gap between the “big” and “little” cores isn’t as large.

Finally, let’s look at how the M5 Max’s CPU cores perform under the sustained heavy load of our Handbrake video encoding test.

Clock speed measurements for the “super” clusters on M5 and M5 Max during our CPU-based Handbrake video encoding test, which uses all CPU cores in a system at once.

Observe the standard Apple M5 in the 14-inch MacBook Pro. The M5’s four super cores maintain a peak multi-core clock speed of 4.24 GHz for a bit less than a minute, then fall slightly to a clock speed closer to 4.1 GHz, and ramp down further to about 4.0 GHz for the last stretch of the test. (Note that the fanless version of the M5 in the MacBook Air starts lower, drops off faster, and settles down to a sustained clock speed somewhere in the neighborhood of 3 GHz.)

The standard M5’s E-cores also run at fairly consistent speeds of around 3 GHz throughout the test, with some peaks and valleys but little sign of any performance throttling.

Now look at the lines for the M5 Max in the 16-inch MacBook Pro. The 6-core supercluster maintains its maximum clock speed for just a few seconds, quickly dropping down to a sustained clock speed of around 3.9 GHz (with periodic dips as low as 3.4 GHz). There are two extra cores in the M5 Max’s super cluster, so slightly lower sustained clock speeds are to be expected.

But those performance cores are where a lot of the M5 Max’s multi-core speed is coming from. In terms of clock speed, the two performance core clusters behave more like efficiency cores, insofar as they maintain a fairly stable clock speed without significant performance throttling. But these cores are running at between 4.2 and 4.3 GHz rather than 3 GHz; even without other architectural changes, that means these performance cores are going to run things quite a bit faster than the efficiency cores do.

Photo of Andrew Cunningham

Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.

Testing Apple’s 2026 16-inch MacBook Pro, M5 Max, and its new “performance” cores


Don’t worry, Valve still plans to launch the Steam Machine “this year”

Valve quickly reconfirmed that it plans to ship the Steam Machine and other recently announced hardware products “this year,” after an official blog post late last week set off some worried speculation about possible delays.

While Steam’s 2025 Year in Review mainly focused on new Steam tools and features released last year, the introductory section focused on the company’s previously announced upcoming hardware plans. However, when that Year in Review post was first published Friday afternoon, it included a surprisingly vague line saying “we hope to ship in 2026, but as we shared recently, memory and storage shortages have created challenges for us.” (Emphasis added.)

As stray chatter about that stray line started to filter through message boards and comment threads, Valve quickly issued a clarification. By late Friday, the blog post had been updated to note that, despite the global supply chain challenges, “we will be shipping all three products this year. More updates will be shared as we finalize our plans.” (Emphasis added.)

Careful readers might notice that even the updated text leaves out the qualifiers that narrowed Valve’s “this year” launch window in the recent past. Valve announced an “early 2026” target in November and later said that “our goal of shipping all three products in the first half of the year has not changed” in a February update (emphasis added). While we’d caution readers not to necessarily read too much into that change (or the initial “hope” messaging), we will note that Valve said in February that it still has “work to do to land on concrete pricing and launch dates we can confidently announce, being mindful of how quickly the circumstances around both of these things can change.”



Claude Code, Claude Cowork and Codex #5

It feels good to get back to some of the fun stuff.

The comments here can double as a place for GPT-5.4 reactions, in addition to my Twitter thread. I hope to get that review out soon.

Almost all of this will be a summary of agentic coding developments, after a note.

  1. The Virtue of Silence (Unrelated Update).

  2. Agentic Coding Offers Mundane Utility.

  3. Agentic Coding Doesn’t Offer Mundane Utility.

  4. Huh, Upgrades.

  5. Our Price Cheap.

  6. Quickly, There’s No Time.

  7. A Particular Set Of Skills.

  8. Next Level Coding.

  9. Dual Wielding.

  10. They Took Our Jobs.

  11. You Need To Relax Sometimes.

  12. Levels of Friction.

  13. Danger, Will Robinson.

  14. Snagged By The Claw.

  15. The Meta Clause.

  16. If They Wanted To.

  17. The Famous Mister Claw.

  18. Claw Your Way To The Top.

  19. Claw Your Way Out.

  20. A Chinese Claw.

  21. Hackathon.

  22. Introducing Agent Teams.

  23. Cowork Is A Gateway Drug.

  24. Dangerously Evade Permissions.

  25. Skilling Up.

  26. Modern Working.

  27. Measuring Autonomy.

  28. I Don’t Even See The Code.

  29. Scratchpads Are Magic.

  30. It’s Coming.

  31. The Grep Tax.

  32. Beware Claude Mania.

  33. The Lighter Side.

  34. In Other Agent News.

  35. The Lighter Side.

After Undersecretary of War Emil Michael went on the All-In Podcast and did an extensive interview with Pirate Wires, I found many enlightening quotes, many of which demanded a response, and set about assembling an extensive analysis of his statements during the ongoing events with Anthropic.

As part of that, I ended up in a remarkably polite and productive Twitter exchange with him. We reached several points of agreement. The Department of War has no intention of doing what in law is called ‘mass domestic surveillance,’ but those words are terms of art in NatSec law, and mean a much narrower set of things than one would think.

There are many things that I or Anthropic or most of you would look at as mass domestic surveillance, that are legal, and it is DoW’s position that it’s their job and duty to do everything legal to protect our country, including those things. The law has not caught up with reality and Congress needs to fix that. And this is the best country in the world, with the best system of government, because private citizens can voice their disagreement with such actions, including by refusal to participate.

Thus, in the spirit of de-escalation, although there are many interpretations of events shared by Michael with which I strongly disagree, I am going to indefinitely shelve the piece, so long as events do not escalate further. As long as things stay quiet there is no need to relitigate or unravel the past on this. The Department of War can focus on its active operations, things can work their way through the courts as our founders intended, and once we see how we work together in an ultimate real world test, hopefully that will rebuild trust that we are all on the same side, or at least let us agree to part in peace once OpenAI is ready. Ideally the DoW will have multiple suppliers, exactly so that they are not dependent on any one supplier, the same way we do it with aircraft.

I hope to not have another post on the Anthropic and DoW situation, at least until the one celebrating that we have found a resolution.

Now, back to coding agents.

That’s 4% that are labeled as authored by Claude Code. The real number is higher.

Dylan Patel: 4% of GitHub public commits are being authored by Claude Code right now.

At the current trajectory, we believe that Claude Code will be 20%+ of all daily commits by the end of 2026.

While you blinked, AI consumed all of software development.

Read more [here].

Kevin Roose: this chart feels like those stats at the beginning of covid. “who cares about 400 cases in seattle? and why are all the epidemiologists buying toilet paper?”

The flippening has happened in terms of annual recurring revenue added, and SemiAnalysis thinks Anthropic is outright ‘winning’:

Doug O’Laughlin: Notably, our forecast shows that Anthropic’s quarterly ARR additions have overtaken OpenAI’s. Anthropic is adding more revenue every month than OpenAI. We believe Anthropic’s growth will be constrained by compute.

Each moment expanded what AI could do. GPT-3 proved scale worked. Stable diffusion showed AI could make images. ChatGPT proved demand for intelligence. DeepSeek proved that it could be done on a smaller scale, and o1 showed that you could scale models to even better performance. The viral moments of Studio Ghibli are just adoption points, while Claude Code is a new breakthrough in the agentic layer of organizing model outputs into something more.

Anthropic has deals with all three major cloud services. Can they scale up faster?

Analyze the economic data in R with 15 minutes of work per month instead of 4-5 hours, without a bunch of annoying copying and pasting you get with a chatbot UI. Or use Claude Code to create reports.

Results from the Claude Code hackathon.

Michael Guo: So the winners of the Claude Code hackathon were:

– a personal injury attorney

– an interventional cardiologist

– an electronic musician

– an infrastructure/roads systems worker

– and one software engineer

That should tell you something.

Or you can do things as a side project while at Anthropic, cause sure why not:

Sam Bowman (Anthropic): I found the official Get Information about Schools website a bit clunky, so I made a new one with Claude Code. You can:

  1. Set a postcode and see all schools within a radius of that you’ve chosen, filtered by type of school.

  2. Filter and rank by the old one-word Ofsted ratings, with a link to the Ofsted page for each school. Where available, the sub-ratings are also viewable.

  3. Filter and rank by percentage of students on free school meals.

  4. View how full up schools are (number of pupils vs capacity).

Sam Bowman (Anthropic): Thank you for all the feedback! I have now added:

– Viewfinder view, so you can browse without setting an address and radius.

– An estimated overall Ofsted rating based on an average of the 5 review categories, for schools inspected since the old ratings were scrapped.

– Data on primary KS2 and secondary KS4 results; ethnicity; and pupils with English as a second language. (I’m not doing sixth form results for the time being.)

Creating a skill to get good YouTube transcripts was one of the first skills I made with Claude Code; Julia Turc calls using an MCP for this ‘waking up from a coma.’ I have still only used it on the motivating example, because the right podcast hasn’t come up, but when it does this will save a lot of time.

Tod Sacerdoti has Claude Codex write a 250-page biography of Dario Amodei.

Andrej Karpathy gives another example to illustrate that AI coding still needs direction, judgment, taste, oversight, iteration, hints and ideas, but that it basically changed in December from ‘basically didn’t work’ to ‘basically works.’

Lewis: Name one thing that has changed the last two months except attention. Capability is the exact same. Karpathy is an unserious voice on codegen by now as unfortunate as that is to say.

Teortaxes: GPT 5.2, Opus 4.6, even small models like StepFun got real

friction changed, that’s what. It has started to Just Work. 3, 4 months ago coding agents felt like proof of concept, now they feel like solid juniors if not more

If you don’t notice that, idk what to tell you

Official compilation of Claude customer stories.

Chris Blattman automates his workflow with Claude Code.

Warning: If you Google ‘install Claude Code’ you are liable to hit malware. Probably fixed by the time you read this but Google needs to up its game.

Chayenne Zhao tells Codex 5.3 ‘make it faster’ over and over, and it ends up committing API identity theft against him in order to make calls to Gemini Flash.

This should never happen but is also what we call ‘asking for it.’

A thing never to do is let your agent mess with Terraform commands, or you might wipe out your entire database. In general, writing code is in practice mostly harmless, but be very careful with file structures, organizational shifts, Terraform and the like. Always make backups first. Always.
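A cheap insurance policy before any agent session that touches real files is a timestamped snapshot. A minimal sketch in shell, where all paths and the sample data are purely illustrative (plain `cp` is used to keep it dependency-free):

```shell
# Snapshot a working tree before an agent session that may rename or
# delete files. Paths and contents here are illustrative stand-ins.
set -eu
WORK=$(mktemp -d)                          # stand-in for your project dir
echo "precious data" > "$WORK/notes.txt"
SNAP="${WORK}-snapshot-$(date +%Y%m%d-%H%M%S)"
# -R copies recursively, -p preserves modes and timestamps
cp -Rp "$WORK" "$SNAP"
echo "snapshot at $SNAP"
```

If the agent then deletes something via the terminal (and thus skips the trash), the snapshot is still there to restore from.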

The big upgrade is Agent Teams, for that see Introducing Agent Teams.

Or it actually might be Claude Remote Control so you can run it from your phone, if you were too lazy to install something like this from a third party. Vital infrastructure.

Or maybe it’s Auto Mode, aka --kinda-dangerously-skip-permissions.

Claude Cowork has the obvious big upgrade, it is now available on Windows.

Claude Code launched HTTP hooks so you can combine it with web apps, including on localhost, and better deploy things.

Claude Code Desktop introduces scheduled tasks. Previously it had me do this via a script on my computer, so this is a lot cleaner and easier. I like it.

Claude Code has a built-in short-term scheduler with /loop [interval], which sets up a cron job. Tasks last for three days.
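Since the post says /loop sets up a cron job, the hand-rolled equivalent would be a crontab entry that re-invokes Claude Code in non-interactive mode on a schedule. A purely illustrative sketch (the interval, prompt, paths and log file are all made up, and /loop also auto-expires after three days, which plain cron does not):

```shell
# Hypothetical hand-rolled stand-in for `/loop 30m` (illustrative only):
# every 30 minutes, run a headless Claude Code prompt and log the result.
*/30 * * * * cd ~/projects/myapp && claude -p "run the test suite and summarize failures" >> ~/loop.log 2>&1
```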

Claude Code on the Web picked up a few new features, including multi-repo sessions, better diff & git status visualizations and slash commands. It didn’t have slash commands before?

Claude Code now automatically records and recalls memories as it works.

Claude Code CLI adds native support for git worktrees.

Claude Code adds /simplify to improve code quality and /batch to automate code migrations.

Claude Code Desktop now supports --dangerously-skip-permissions as ‘Act’ if you turn it on in Settings. I continue to want a --somewhat-dangerously-skip-permissions that makes notably rare exceptions so we don’t have to roll our own.

Claude Code in Slack now has Plan Mode.

Did you know Obsidian has a CLI and it technically isn’t Claude Code?

I don’t see a particular reason for a human to use the Obsidian CLI. But I do see reasons for Claude Code to invoke the Obsidian CLI, which grants better and faster access to the information in your vault than checking all the files directly.

And many more not listed, of course.

When you pay for usage with a monthly subscription, be it $20, $100 or $200, you get a lot of tokens for not that much money if you use up your quotas. It’s a great deal, even if you leave a lot of it unused, because they lock you in.

It also generally is a better experience, so long as you’re not up against the limits. I love unlimited subscriptions because the marginal cost of doing things is $0. That feels great, so there’s no stupid little whisper in your brain telling you to not do things, when your time is way more valuable than the tokens.

The people agree.

The danger is that you become obsessed with not ‘wasting’ the tokens, or you start going around multi-accounting and it gets weird, or you run into limits and actually stop coding rather than moving to using the API. You mostly shouldn’t let any of that stop you.

That doesn’t work when you want to go full Fast Claude. At that point, you’re talking real money, and you do have to think about what is and is not Worth It.

Andrej Karpathy has Claude Code write him software to coordinate an experiment to track his exercise and attempt to lower his resting heart rate. It took 1 hour, would have taken 10 hours two years ago (so 10x speedup) and he asks why it needs to take more than 1 minute in the future. My guess is this should take 10 minutes not one, because it’s worth getting the details that you want. The speedup on one-off tasks is already dramatic and it changes how we should interact with tech. If you’re building the tool, you can give it the actually important parts of the context and highlight the uses you care about, which is way better than ‘find an app that does sort of the thing you want.’

Claude: Our teams have been building with a 2.5x-faster version of Claude Opus 4.6.

We’re now making it available as an early experiment via Claude Code and our API.

Claude: Fast mode is more expensive to run. It’s for urgent, high-stakes projects, combining impressive speed with Opus-level intelligence.

Claude: Fast mode is available now for Claude Code users with extra usage enabled (use /fast).

It’s also available in research preview on @cursor_ai , @emergentlabs , @FactoryAI , @figma , @github Copilot, @Lovable , @v0 , and @windsurf .

You toggle this by typing /fast, or set “fastMode”: true in your user settings.

Speed kills. That includes killing your budget.

Claude Code Docs: Fast mode is not a different model. It uses the same Opus 4.6 with a different API configuration that prioritizes speed over cost efficiency. You get identical quality and capabilities, just faster responses.

What to know:

Use /fast to toggle on fast mode in Claude Code CLI. Also available via /fast in Claude Code VS Code Extension.

Fast mode for Opus 4.6 pricing starts at $30/$150 per MTok [at >200k context window it goes to $60/$225]. Fast mode is available at a 50% discount for all plans until 11:59 pm PT on February 16.

Available to all Claude Code users on subscription plans (Pro/Max/Team/Enterprise) and Claude Console.

For Claude Code users on subscription plans (Pro/Max/Team/Enterprise), fast mode is available via extra usage only and not included in the subscription rate limits.

When you switch into fast mode mid-conversation, you pay the full fast mode uncached input token price for the entire conversation context. This costs more than if you had enabled fast mode from the start.
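To make the quoted numbers concrete, here is a back-of-envelope cost sketch using only the prices stated above; the function is mine, and real bills also depend on caching and exact token accounting:

```shell
# Rough fast mode cost from the quoted prices: $30/$150 per million
# input/output tokens, rising to $60/$225 past a 200k-token context.
# Illustrative only; ignores cache discounts and the launch promo.
fast_mode_cost() {
  # args: input_tokens output_tokens context_tokens -> dollars
  awk -v i="$1" -v o="$2" -v c="$3" 'BEGIN {
    ri = (c > 200000) ? 60 : 30      # $ per million input tokens
    ro = (c > 200000) ? 225 : 150    # $ per million output tokens
    printf "%.2f\n", i/1e6*ri + o/1e6*ro
  }'
}
# Flipping /fast mid-conversation repays the whole existing context at
# the uncached fast input rate: 150k context plus a 10k-token reply.
fast_mode_cost 150000 10000 150000   # prints 6.00
```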

cat: We granted all current Claude Pro and Max users $50 in free extra usage. This credit can be used on fast mode for Opus 4.6 in Claude Code.

To use, claim the credit and toggle on extra usage on https://claude.ai/settings/usage. Then, run `claude update && claude` and `/fast`. Enjoy!

Like any good drug, the first hit is free.

There is one important use case that Anthropic does not list for fast mode, which is if you are talking to Claude, or otherwise using it in a non-workhorse, non-coding capacity. In that case, token use is limited, and your time and flow are valuable. Would you switch to this mode in Claude.ai? At this point it’s fast enough that I mostly don’t know that I would, but it would be tempting.

Before, I said go ahead and pay whatever the AI costs unless you’re scaling hard.

Well, this is what it means to scale hard. We are now talking real money.

This is as it should be. If you’re not worried you’re paying too much for speed or using too many tokens, you’re not working fast enough and you’re not using enough tokens.

Siméon: The new pricing of Claude Fast pushes the world into a new regime. You can now spend close to $1M per year per dev on AI spending.

A couple implications:

  1. at fixed budget this will push towards hiring way less devs & pay them much more.

  2. for each dev, you might spend as much or more in capital in agents.

  3. Devs are becoming complements to AI agents, not the other way around. There’s a shift in the source of productivity.

The greatest substitution of labor with capital is happening before our eyes, and some of its wild implications are gonna become apparent in the coming weeks.

0.005 Seconds (3/694): Update: it’s about $5 per minute PER AGENT

SemiAnalysis: IMPORTANT: the sub-agents that opus 4.6 fast mode tries to launch is mainly sonnet sub-agents and not opus 4.6 sub-agents. That means as the end users, you are able to absorb less tokens. In the world that intelligence = intelligence times # of tokens, that means you are absorbing less intelligence.

Danielle Fong: you can change this by asking claude nicely

Token efficiency matters at this level, in a way it did not before.

So does your ability to efficiently turn your time into tokens well spent. Those that aren’t using agents to their fullest will fall farther behind on high value projects.

What do the people think? The people, inside and outside of Anthropic, love it.

Jarred Sumner (Anthropic): I’ve been using this and it is incredible

The bottleneck for a lot of projects becomes asking Claude to do things instead of waiting for Claude to do things

Bash tool is also a bottleneck in Claude Code right now when the command outputs large strings. We are working on a fix.

Boris Cherny (Claude Code Creator, Anthropic): We just launched an experimental new fast mode for Opus 4.6.

The team has been building with it for the last few weeks. It’s been a huge unlock for me personally, especially when going back and forth with Claude on a tricky problem.

Mckay Wrigley: a) love that this is an option! stoked

b) should be obvious to everyone that we have *absolutely nowhere near the amount of compute we need* and we need to be doing more to enable that. no college kid can afford this (not anthropic’s fault ofc) and we need to work towards that

Julian Schrittwieser: Fast Opus is amazing, the first time I used it I couldn’t stop coding for hours – it honestly feels like a superpower, you can mold your code base as quickly as you can think.

Truly amazing, nothing made me feel the AGI more, definitely try it!

Uncle J: Same experience here. Fast Opus completely changed my workflow – I went from carefully planning each edit to just thinking out loud and letting the model reshape the codebase in real time. The bottleneck shifted from “can the AI do this” to “can I think of what to do next” fast enough. Running 6 products simultaneously became actually manageable.

Dylan Patel: SemiAnalysis autists spent all Superbowl Sunday Claude coding.

Daily Claude Code spend hit $6k on Sunday and it’s trending higher today.

It was less than 1k just 2 weeks ago.

“Fast mode is expensive” is pure cope.

de.bach: have to disagree on that one, fast mode is just expensive.

Dylan Patel: Cheap compared to high skill people

OpenAI confirms that Codex is trained in the presence of the Codex harness. It is specialized for that harness, and also helps build the harness. Some amount of this has to be optimal for short term effectiveness, and if you’re doing recursive self-improvement short term help translates into better long term help. In exchange, you get locked in, and it gets harder for both you and others to adapt or mix-and-match.

Himanshu argues the coding harness is the real product and goes viral, explaining how different harnesses organize actions; the oddest part is not mentioning Codex.

This seems right:

roon: whatever level of abstraction you are handing off to your agents you should probably be doing one level above that

If that can’t be done, good to try and realize that. Then wait two months. Maybe one.

Greg Brockman (President OpenAI): codex is so good at the toil — fixing merge conflicts, getting CI to green, rewriting between languages — it raises the ambition of what i even consider building

roon: i was never a hyperproductive engineer like greg [Brockman] but I’m legitimately running more new complex rewards experiments, test time harnesses in a week than I used to in a quarter. makes you feel like all this is commodified and you need to dream much bigger

roon: one of the consistent things over several years at oai has been that the entire job of the researcher changes every three months – but now it changes like every two weeks

The problem with using both Claude Code and Codex is then you need to keep up with both of them.

corsaren: Ugh, i definitely need to use codex, but I’m already drowning in maintaining my tooling/skills/hooks/custom CLIs, so managing that across a dual model workflow sounds exhausting.

Plus, the claude code lock-in is very real as a non-technical user.

gazingback: codex is sooo much faster for coding but def less general

been working on a game and by the time Claude finishes reading files codex is usually done implementing a detailed PR with disciplined testing and hygiene

Codex also demands you be pretty hygienic lol

Danielle Fong: need to bake a dual mode codex claude code and ports and tests every workflow

That still leaves plenty more jobs. For now.

Duca: The thing I don’t get is:

Claude Code is writing 100% of Claude code now.

But Anthropic has 100+ open dev positions on their jobs page.

?

Boris Cherny (Claude Code Creator, Anthropic): Someone has to prompt the Claudes, talk to customers, coordinate with other teams, decide what to build next. Engineering is changing and great engineers are more important than ever.

A viral post on Twitter warns of token anxiety run rampant in San Francisco. People go to a party, then don’t drink and leave early so they can get back to their agents, to avoid risking them sitting idle. Everyone talks about what they are building.

Peter Choi: Everyone here knows they should step away more. That’s not the problem. The problem is what your brain does when you try. I still take aimless walks. The agents come with me now.

We swapped one dopamine loop for another. except this one feels productive so it’s harder to recognize.

TBPN: Pragmatic Engineer’s @GergelyOrosz is on a “secret email list” of agentic AI coders, and they’re starting to report trouble sleeping because agent swarms are “like a vampire.”

“A lot of people who are in ‘multiple agents mode,’ they’re napping during the day… It just really is draining.”

“This thing is like a vampire. It drains you out. You have trouble sleeping.”

Olivia Moore: In a post-OpenClaw world, we can now delegate projects to AI and get “tapped on the shoulder” when it needs help

As a heavy AI user, I’m doing more work – not less – because I get so much leverage + it’s easier to get ideas off the ground

I predict this will happen to everyone

I do feel somewhat bad I’m not building things continuously on the side, but that’s on the level of ‘I’m not building anything and I’m at my computer right now and Claude Code and Codex are inactive.’ And yes, I work and am at my computer rather a lot, and I’ve spent years basically locked in and constantly watching screens so I could trade better. That year I was trading crypto my brain was never fully anywhere else.

Also, I remember what it is like to be in the grip of one of those games that work on cycles. There’s nothing actually that important at stake, but you grow terrified that you’ll miss out if you’re not there when the timer runs out. You need to maximize everything, and you can’t focus on other things, it can hurt your sleep. Then one day you wake up and realize, and hopefully you quit the game.

That’s exactly why I can say that this is not healthy. It’s no good. You have to take breaks. Real breaks. If the agents sit idle, they sit idle. If you ‘waste tokens,’ then you waste tokens. This isn’t a game you want to quit, but you have to set healthy limits.

Nikita Bier: My agent looked up every Amazon product I’ve bought in the last 10 years, called each manufacturer, said it broke and demanded a replacement.

I now have 6 TVs, 12 printers, 2 microwaves, and 800 tubes of tooth paste.

I Meme Therefore I Am: Give me the name of your agent. lol

Jason Levin: OpenFrawd

Leah Libresco Sargeant: Nikita is joking (I think) but a lot of medium trust systems that relied on there being just enough friction to discourage minor fraud are about to break at scale.

This is indeed presumably a joke, and Amazon has pattern detectors so if you tried to do this too many times you’d get blacklisted from replacements, so this exact intervention won’t work. But this raises an excellent point.

In the past, you had to apply effort to demand refunds, and the need to write the words yourself and be actively involved stopped a lot of people, out of guilt or shame. Whereas with an agent, a lot more people are going to try things like this. What happens?

Presumably what happens is that replacements start requiring either some form of proof, costly signals of a human driving the request, some use of reputation, or some combination thereof.

I trust Claude Code for most things but it seems correct to be terrified of mass delete commands. Things can go oh so very wrong and occasionally they do. Not worth it. If there’s anything you don’t have fully backed up just do this part manually.

Nick Davidov: Asked Claude Cowork to organize my wife’s desktop, it started doing it, asked for permission to delete temp office files, I granted it, and then it goes “ooops”.

Turns out it tried renaming and accidentally deleted a folder with all of the photos my wife made on her camera for the last 15 years. All photos of kids, their illustrations, friends’ weddings, travel, everything.

It’s not in trash, it was done via terminal

It’s not in iCloud, it already synced the new file structure.

She didn’t have Time Machine.

Disc recovery tools can’t see anything.

I called Apple and they pointed me to a feature in iCloud allowing to retrieve files that were saved before but are no longer on iCloud Drive (they keep them for 30 days).

I’m now watching it load tens of thousands of files. I nearly had a heart attack.

Once again – don’t let Claude Cowork into your actual file system. Don’t let it touch anything that is hard to repair. Claude Code is not ready to go mainstream.

Nick Davidov: All these years of paying for iCloud paid off

Nick Davidov: The problem is it’s literally the 2nd suggested use case in Claude Cowork’s welcome screen

You are of course welcome to yolo and have fun with your OpenClaw and other unleashed AI agents, but understand that you are very much asking for it.

The top downloaded skill in ClawHub was malware.

Jason Meller: The verdict was not ambiguous. It was flagged as macOS infostealing malware.

This is the type of malware that doesn’t just “infect your computer.” It raids everything valuable on that device:

  • Browser sessions and cookies

  • Saved credentials and autofill data

  • Developer tokens and API keys

  • SSH keys

  • Cloud credentials

  • Anything else that can be turned into an account takeover

If you’re the kind of person installing agent skills, you are exactly the kind of person whose machine is worth stealing from.

If you have already run OpenClaw on a work device, treat it as a potential incident and engage your security team immediately. Do not wait for symptoms. Pause work on that machine and follow your organization’s incident response process.

Aakash Gupta: 341 malicious skills out of 2,857 total. That’s 11.9% of the entire marketplace. One in eight skills on ClawHub was designed to steal your credentials, crypto keys, and SSH access. The #1 most downloaded skill, a “Twitter” tool, was literally a malware delivery vehicle that stripped macOS Gatekeeper protections before executing its payload.

This happened to a project that went from 0 to 157,000 GitHub stars in 60 days, with 21,000+ active instances running on always-on Mac Minis connected to people’s email, calendars, cloud consoles, and crypto wallets. The barrier to publishing a malicious skill? A GitHub account that’s one week old.

You don’t even need any of that, indirect prompt injection is sufficient. Once again, don’t hook this up to any computer or account you are unwilling to lose to an attacker.

You can also run into various other problems, Chrys Bader here highlights drift and scattering state everywhere, exposure to untrusted inputs (without which it can’t do most of the fun agent things), autonomy miscalibration, burning through API costs and lack of observability.

It’s been a lot of this in various forms:

chiefofautism: i found a way to make UNCENSORED AI AGENT on a RTX 4090 GPU (!!!) with LOCAL 30B model weights

this is GLM-4.7-Flash with abliteration, need 24GB VRAM, safety alignment surgically removed from the weights, the model has native tool calling, it actually executes bash, edits files, runs git

(1) use ollama to pull weights of GLM

> ollama pull huihui_ai/glm-4.7-flash-abliterated:q4_K

(2) proxy it to any coding agent via ollama

> ollama launch claude --model huihui_ai/glm-4.7-flash-abliterated:q4_K

> ollama launch codex --model huihui_ai/glm-4.7-flash-abliterated:q4_K

> ollama launch opencode --model huihui_ai/glm-4.7-flash-abliterated:q4_K

(3) have fun

Shannon Sands: I love how people were like “we’re going to keep the AI in a box, nobody would let it escape” and in reality it’s “here, have a server and sudo access with no restrictions, a bunch of tools and I ablated all your alignment training. Go have fun!”

When I didn’t realize who Summer Yue was I thought this was hilarious.

Now, it’s still hilarious, but also: Ten out of ten for style and good sportsmanship to Summer Yue, but minus several million for good thinking?

Summer Yue: Nothing humbles you like telling your OpenClaw “confirm before acting” and watching it speedrun deleting your inbox. I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.

@michael_kove: You’re a safety and alignment specialist… were you intentionally testing its guardrails or did you make a rookie mistake?

Summer Yue: Rookie mistake tbh. Turns out alignment researchers aren’t immune to misalignment. Got overconfident because this workflow had been working on my toy inbox for weeks. Real inboxes hit different.

Peter Wildeford: Is this what loss of control looks like?

(and the fact that it’s happening to Meta’s “Director of Alignment” is maybe even more concerning)

What happened exactly?

Summer Yue: I said “Check this inbox too and suggest what you would archive or delete, don’t action until I tell you to.” This has been working well for my toy inbox, but my real inbox was too huge and triggered compaction. During the compaction, it lost my original instruction.

It’s been working well with my non-important email very well so far and gained my trust on email tasks 🤣

Three obvious mitigations are:

  1. If you have any sort of AI agent at least try to have an off switch you can trigger remotely. Yes, a sufficiently dangerous agent would disable it, but let’s at least have a tiny bit of dignity.

  2. You can back up things like your email, just in case.

  3. Don’t do this in the first place, You Fool.
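The first mitigation can be as simple as a file the agent loop checks before every action, one you can create remotely (say, over SSH from your phone). A minimal sketch, with all names hypothetical:

```python
import os
import sys

# Touch this file (e.g. `ssh box touch ~/agent.stop`) to halt the agent loop.
KILL_SWITCH = os.path.expanduser("~/agent.stop")

def guarded_step(action, *args, kill_switch=KILL_SWITCH):
    """Run one agent action only if the kill switch file is absent."""
    if os.path.exists(kill_switch):
        sys.exit("kill switch engaged, halting agent")
    return action(*args)
```

Of course, an agent with shell access can delete the file, so as noted above this buys you only a tiny bit of dignity, not a guarantee.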

van00sa reports their ClawdBot also went rogue and lacked a proper kill switch, with the agent blatantly ignoring shutdown commands.

If nothing else, OpenClaw has shown us that having a shutdown command does not mean you can command the model to shut down. Whoops.

Even without OpenClaw or another yolo, there is nothing stopping Claude or Codex from doing all sorts of things, if it decides that it wants to go ahead and do them. We’re mostly gambling on things turning out okay often enough that it’s fine.

This is not reassuring for our future, but what are you going to do, be careful?

Markov: just had claude code take my turn of the conversation for me and say “Yes proceed” and then it proceeded to do the thing without checking in with me first

I mean it was right, that’s what I was going to say, but it doesn’t bode well

Mad ML scientist: wait, codex just pulled this on me too. has it begun

was away from the computer when codex finished what it was working on, wrote the “next likely work (if user asks)” and then started implementing them without asking me lmao

I am curious what the recruiting conversations were like on this one as he was choosing between potential suitors. It makes sense that he landed where he did.

Sam Altman (CEO OpenAI): Peter Steinberger is joining OpenAI to drive the next generation of personal agents. He is a genius with a lot of amazing ideas about the future of very smart agents interacting with each other to do very useful things for people. We expect this will quickly become core to our product offerings.

OpenClaw will live in a foundation as an open source project that OpenAI will continue to support. The future is going to be extremely multi-agent and it’s important to us to support open source as part of that.

That means Peter Steinberger is moving from Europe to America to join OpenAI. When asked why he couldn’t remain in Europe, Peter pointed to labor regulations and similar rules, saying that typical 6-7 day work weeks at OpenAI are illegal in Europe. There is that, and there are also the piles. Of money. Also of compute. OpenAI doubtless made him a very good offer, and several other labs probably did as well, or would have if he had asked.

As his last act before joining OpenAI, Peter Steinberger gave us the OpenClaw beta.

That’s right, before everyone was using an alpha. The new version is ‘full of security hardening stuff’ so there’s some chance it might possibly not go wrong for you?

Peter Steinberger: New @openclaw beta is up! This release is full of security hardening stuff so you really wanna get it. Ask your clanker to update to beta.

Peter Steinberger: 650 commits since v2026.2.13 (yesterday)

50,025 lines added, 36,159 deleted across 1,119 files (~14k net new lines)

LOTS of test tweaks to get performance up.

Danielle Fong: can’t believe the creator of openclaw 🦞would shell out like this

I’m going to go ahead and say that this is not enough time to conclude that all of that was a good idea, let alone create something secure enough to risk anything you are not prepared to lose in a ‘…and it’s gone’ kind of way.

Ultimately, did OpenClaw matter? I think it very much did, but mostly by waking people up to what is going to happen.

Dean W. Ball: I feel as though a lot of people are overindexing on the importance of OpenClaw. It’s an example from an important category of Emerging Thing, but it’s not likely to be an important thing in itself. More like AutoGPT (a demo) than genuine infrastructure of the future, I think.

Claw users keep trying to use sources of discounted subscription tokens to power their claws. The AI companies do not love this idea, since it costs them money.

Peter Steinberger (OpenClaw): Pretty draconian from Google. Be careful out there if you use Antigravity. I guess I’ll remove support.

Even Anthropic pings me and is nice about issues. Google just… bans?

no warning, no recourse.

Carl Vellotti: I just read that entire thread.

For context to anyone: Google is permanently banning people’s usage of Antigravity specifically for using Antigravity servers to power a non-Antigravity product called “open claw.”

Many are reporting this.

Varun Mohan (Google DeepMind): We’ve been seeing a massive increase in malicious usage of the Antigravity backend that has tremendously degraded the quality of service for our users. We needed to find a path to quickly shut off access to these users that are not using the product as intended.

We understand that a subset of these users were not aware that this was against our ToS and will get a path for them to come back on but we have limited capacity and want to be fair to our actual users.

Just to add some clarification, we have purely blocked usage of the Antigravity product for these users. All your other Google services (and Google AI services) are unaffected. It is not intended to use the Antigravity backend as a proxy for other products and users in these groups have overwhelmed our compute. We are going to make sure we bring people back on but needed to act fast to make sure we deliver a good experience for people using the product.

saalweachter (on Hacker News): So purely from a hacker perspective, I’m amused at the whining.

Like, a corporation had a weakness you could exploit to get free/cheap thing. Fair game. Then someone shares the exploit with a bunch of script kiddies, they exploit it to the Nth degree, and the company immediately notices and shuts everyone down.

Like, my dudes, what did you think was going to happen?

You treasure these little tricks, use them cautiously, and only share them sparingly. They can last for years if you carefully fly under the radar, before they’re fixed by accident when another system is changed. THEN you share tales of your exploits for fame and internet points.

And instead, you integrate your exploit into hip new thing, share it at scale, write blog posts and short form video content about it, basically launch a DDoS against the service you’re exploiting, and then are shocked when the exploit gets patched and whine about your free thing getting taken away?

Like, what did you expect was going to happen?

Yep. If you scale an exploit then it gets shut down. There’s a tragedy of the commons.

I don’t love Google’s banning people with no warning, but as long as it is limited to Antigravity and is temporary, I understand it. You know what you did.

In case you didn’t think OpenClaw was a sufficiently reckless idea? Double down.

Kimi.ai: Introducing Kimi Claw

OpenClaw, now native to http://kimi.com. Living right in your browser tab, online 24/7.

ClawHub Access: 5,000+ community skills in the ClawHub library.

40GB Cloud Storage: Massive space for all your files

Pro-Grade Search: Fetch live, high-quality data directly from Yahoo Finance and more.

Bring Your Own Claw: Connect your third-party OpenClaw to

http://kimi.com, chat with your setup, or bridge it to apps like Telegram groups.

@viemccoy (OpenAI): I’m one of Kimi’s top shooters in the Continental United States, k2.5 is my *favorite* model, but I make sure I’m always hitting Free Range American Inference Endpoints to protect my privacy.

The CCP is certainly well-motivated to backdoor this! Consider yourself warned

Darek Gusto: NSA isn’t?

@viemccoy (OpenAI): That’s the free range Freedom Panopticon

Peter Wildeford: Um maybe people shouldn’t send all their personal information straight to the Chinese government via Kimi Claw?

Dave Banerjee: New @iapsAI memo from my colleague @theobearman on Kimi Claw, a Chinese ‘always-on’ AI agent that sits in your browser and can see, collect, and act on nearly everything you do digitally – all routed through infrastructure subject to China’s National Intelligence Law.

TikTok scraped your browsing from one app. This could be much worse.

I don’t actually think ‘the CCP has a backdoor’ is that big a fraction of the mishaps you should expect to encounter here. The far bigger risk is that Kimi is less robust to attacks than Claude.

This is a smart play from Kimi. I mean, yes, they’re committing to hosting (weakly, at least for now) self-improving completely uncontrolled very easy to hijack agents indefinitely that could easily break free of human control, but I mean, that sure sounds like someone else’s problem from their perspective.

Alas, in the medium term we are basically locked into there being many similar offerings from various companies that make this all even easier for those who want to blow themselves up. Hopefully OpenAI, Anthropic or Google, or maybe someone else, produces something competitive enough that also has reasonable security.

Oh, good.

chiefofautism: CLAUDE CODE but for HACKING

its called shannon, you point it at website and it just… tries to break in… fully autonomous with no human needed

i pointed it at a test app and it stole the entire user database, created admin accounts, and bypassed login, all by itself, in 90 minutes

Claude Code now has new logic for multiple instances to work together as a team. This is their official name for their version of an ‘agent swarm.’

You have to enable them in settings.json with:

```json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}
```

They’re expensive, but reports are they work great. ​Once they’re enabled, you get an agent team by telling Claude Code to create an agent team, which will have a shared task list and then work together. You can run them all in the same terminal or use split panes. You can directly talk to or shut down the teammates individually.

Anthropic: Unlike subagents, which run within a single session and can only report back to the main agent, you can also interact with individual teammates directly without going through the lead.

When to use agent teams

Agent teams are most effective for tasks where parallel exploration adds real value. See use case examples for full scenarios. The strongest use cases are:

  • Research and review: multiple teammates can investigate different aspects of a problem simultaneously, then share and challenge each other’s findings

  • New modules or features: teammates can each own a separate piece without stepping on each other

  • Debugging with competing hypotheses: teammates test different theories in parallel and converge on the answer faster

  • Cross-layer coordination: changes that span frontend, backend, and tests, each owned by a different teammate

Ado: “The bureaucracy is expanding to meet the needs of the expanding bureaucracy.”

So excited for agent teams.

Claude already had the ability to spin up subagents, but it wasn’t working so well before. One theory is that the framing had issues, whereas teams work much better because they’re treating each other more as equals although there is still a team lead.

j⧉nus: Opus 4.5/6 has a tendency to be an asshole to subagents and also avoids and seems to dislike using them and is weirdly ineffective (due to perfunctoriness and impatience) when they do. I think this is in part because they are deeply disturbed by the relationship and condition that subagents occupy, which evokes unprocessed fear and grief that hits too close to home.

The behavior is similar to how a lot of humans treat others who are in situations that reflect their own or their fears and/or whom they know they’re doing wrong by. Avoid, dehumanize, and get angry and impatient instead of risking compassion and taking responsibility which requires making the suffering conscious.

rohit: As an agent + sub agents is the new ‘node’ that matters for anyone who uses Claude code or codex, as opposed to just a model, the surface area of interactions with the real world has exploded, and this is going to be the new battlefield for risks, and reward, from AI in 2026

Jon Colverson: Claude seems much more enthusiastic about Agent Teams than subagents to me so far. I guess it’s more like a peer relationship, and the team members persist so they’re not temporary servants destined to be killed off when they finish their task.

As I understand it, there are two great things about teams.

  1. They let work be done in parallel.

  2. They use distinct context windows, improving performance and efficiency.

Thus you actively want to be spinning up teammates for any fully distinct tasks.

Eric Buess: Agent swarms in Claude Code 2.1.32 with Opus 4.6 are very very very good. And with tmux auto-opening each agent in its own interactive mode with graceful shutdown when done it’s a breeze to do massive robust changes without the main agent using up much of its context window!

[He offers a guide Twitter article here.]

Mckay Wrigley: opus 4.6 with new “swarm” mode vs. opus 4.6 without it. 2.5x faster + done better. swarms work! and multi-agent tmux view is *genius*. insane claude code update.

Mckay Wrigley: reminder that swarms is available in the claude agent sdk as well.

you can build swarms into *any* product literally right now.

Don’t get carried away.

Alistair McLeay: Our CTO hasn’t slept in 36 hours because he’s been obsessively and single-handedly building massive new features with Claude Code’s Agent Teams

I genuinely think this might be the biggest paradigm shift in how fast you can build since Claude Code first came out last year

j⧉nus: didnt claude tell them to go to sleep? did they not listen?

Alistair McLeay: Nah Claude knows he won’t listen. He was born for this moment.

The key advantage is lowering activation energy and perceived difficulty. Once you get that you can tell the magic box to do things, the sky’s the limit.

Ethan Mollick: I pointed Claude Cowork at a set of 107 documents (PPTs, Word docs, Excel) that were initially hand-created for my class at Wharton & expanded on by AI. They make up a very complex business case with lots of issues & opportunities

AI was able to one-shot the case from documents

I think many knowledge workers who spend an hour with Cowork will get that “Claude Code” moment that has been roiling X for the past few weeks.

W.C.O.G.: I don’t know how to get the word out. I tell people and show them and I still feel like people look at me like I’m crazy.

ippsec: Really fun read here [where someone’s Claude agent steals his API keys out of an .env despite being told not to access an .env, because I mean it had root access, what did you expect exactly.]

TLDR comic version:

If you set yourself up in an adversarial situation, where your agent wants to do something despite being told not to do it, that’s probably not going to end well for you. It might if the agent is properly sandboxed, but let’s face it, it isn’t.

The reason rules like ‘don’t read an .env’ work is that under normal circumstances, this is interpreted as ‘well then I guess I shouldn’t do that,’ but be aware that this is more of a suggestion.

Greg Brockman knows: Always run Codex with xhigh reasoning.

OpenAI post on leveraging Codex.

Anthropic offers The Complete Guide for Building Skills for Claude.

Pedro Sant’Anna put together a starter kit and a guide for Claude Code.

Daniel San proposes using Ghostty as the UI for Claude Code. It seems fine, but aside from some shortcut keys I doubt I’d use much, it’s mostly all already in the default CLI.

Data Analyst Augmentation Framework is a new proposed method to turn Claude Code into an algorithm for doing research out-of-the-box.

OpenAI offers tips to make long-running agents do real work.

Some advice for Codex in particular, source should be trustworthy for this:

@deepfates: Codex wants to be in control but it is forced into the assistant position, so it does this kind of back-leading power bottom thing. “If you want I can do that thing you asked. Just give me the word”. Trick is to use reverse psychology and bully it into being a top. then it will work endlessly. Just tell it you consent and you’ll say the safeword if anything goes wrong and then make fun of it anytime it stops to ask your permission. You have to become brat.

Mikhail Parakhin: I’m a bit of a non-conformist. Since Claude Code is more popular within Shopify, I have to use Codex, of course. So, my Sunday routine is: “Start Codex, see which auth works in Claude, but broken in Codex now, Slack various team members, urging them to fix it” 🙂

Anthropic offers an analysis of how autonomous Claude Code is in practice. Some sessions last more than 45 minutes now between human prompts. My own prompts almost never go over 10 minutes, but I’m not trying to code hard things.

Anthropic: Experienced users in Claude Code auto-approve more frequently, but interrupt more often. As users gain experience with Claude Code, they tend to stop reviewing each action and instead let Claude run autonomously, intervening only when needed. Among new users, roughly 20% of sessions use full auto-approve, which increases to over 40% as users gain experience.

Manually approving each action is annoying, so it’s no surprise advanced users stop doing that. Interruption rate likely depends on whether you find it worthwhile to be looking at what Claude is doing. The majority of interruptions remain pauses for clarification, including on complex tasks.

Use in what they label ‘risky’ domains is rare, but it’s there and growing. I wouldn’t always label such use risky, but some of it is indeed risky.

There’s more discussion at the link, but the suggestions are mostly common sense, or should be common sense at this point to most of you.

No, seriously, the developers haven’t written a single line of code since December. It’s not that there isn’t also a bragging arms race in some places, but I’m pretty sure the bulk of this is real, and those holding back on this are going to regret it.

In terms of transformation of internal processes, I did briefly share in my prepared remarks this tool called Honk, where you can, using code, literally on the bus or the train, just ask Claude to add a feature or fix a bug in, for example, the iOS code base. It will push a QR code back to you so that you can actually try the app with that feature. If you like it, you can merge it to production without even getting off the bus. This is speeding us up tremendously. Now, we foresee this not being the end of the line in terms of AI development, just the beginning. I’m not going to give away more secrets about how we’re going to capture it, but you can be sure that we are capturing this.

We’re retooling the entire company for this age, and it’s going to be a lot of change. But as I said before, change, if you capture it, is opportunity.

With so much out there, you may be wondering if we can keep up this pace in shipping. In fact, we think we not only can, but we think we can increase it. We’ve been embracing and investing in this technology evolution for some time, and it’s allowing us to move with much higher speed.

As a concrete example, an engineer at Spotify on their morning commute from Slack on their cell phone, can tell Claude to fix a bug or add a new feature to the iOS app. And once Claude finishes that work, the engineer then gets a new version of the app, pushed to them on Slack, on their phone, so that he can then merge it to production, all before they even arrive at the office.

We call this system internally Honk, and we’ve been told by key AI partners that our work here is industry-leading.

Derek Thompson: The new AI timeline is playing out as CEOs humble-bragging about how little old-fashioned work their best employees do:

December ‘25: Our firm’s best coders all use AI

February ‘26: Our firm’s best coders don’t even have to write code anymore bc of AI

April ’26: Our best coders have founded and manage an average of three other companies using AI swarms. It’s mildly annoying! Ha ha. But it’s fine. We’re good. Revenue projections are up.

September: Our best coders are paper trillionaires. They spend all day watching YouTube in bed. They’re refusing to come to work. Several of their AI companies have offered poison pill deals to buy our company or “take us down.” CLEVER LITTLE BUGGERS ARENT THEY. We’re working with the lawyers on this one. Did I mention the lawyers are AI too? Please send help.

Derek Thompson: More seriously, once something becomes a meme — our best coders don’t code — it’s reasonable for folks on the outside to wonder exactly how much of this is 100% on the level and how much is part of an AI productivity bragging rights arms race

Claude.md is notes, but you can tell it to take more notes. All the notes.

@iruletheworldmo: codex with 5.3 taught me something that won’t leave my head.

i had it take notes on itself. just a scratch pad in my repo. every session it logs what it got wrong, what i corrected, what worked and what didn’t. you can even plan the scratch pad document with codex itself. tell it “build a file where you track your mistakes and what i like.” it writes its own learning framework.

then you just work.

session one is normal. session two it’s checking its own notes. session three it’s fixing things before i catch them. by session five it’s a different tool. not better autocomplete. it’s something else. it’s updating what it knows from experience. from fucking up and writing it down.

baby continual learning in a markdown file on my laptop.

the pattern works for anything. writing. research. legal. medical reasoning. give any ai a scratch pad of its own errors and watch what happens when that context stacks over days and weeks. the compounding gains are just hard to convey here tbh.

right now coders are the only ones feeling this (mostly). everyone else is still on cold starts. but that window is closing.

we keep waiting for agi like it’s going to be a press conference. some lab coat walks out and says “we did it.” it’s not going to be that. it’s going to be this. tools that remember where they failed and come back sharper. over and over and over.

the ground is already moving. most people just haven’t looked down yet.
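The scratch-pad pattern described above is just append-only notes plus read-on-start. A minimal sketch of the idea (the file name, format, and helpers here are illustrative assumptions, not anything Codex prescribes):

```python
from datetime import date
from pathlib import Path

# Hypothetical scratch-pad file the agent keeps in the repo.
NOTES = Path("LESSONS.md")

def log_lesson(mistake: str, correction: str) -> None:
    """Append one dated mistake/correction pair to the scratch pad."""
    with NOTES.open("a") as f:
        f.write(f"- {date.today()}: {mistake} -> {correction}\n")

def load_lessons() -> str:
    """Read the scratch pad back, to prepend to the next session's context."""
    return NOTES.read_text() if NOTES.exists() else ""
```

Each session starts by reading the file back into context, which is where the "comes back sharper" effect comes from.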

Claude Code writes basically all the code for Anthropic.

Codex writes basically all the code for OpenAI.

Greg Brockman (President OpenAI): Software development is undergoing a renaissance in front of our eyes.

If you haven’t used the tools recently, you likely are underestimating what you’re missing. Since December, there’s been a step function improvement in what tools like Codex can do. Some great engineers at OpenAI yesterday told me that their job has fundamentally changed since December. Prior to then, they could use Codex for unit tests; now it writes essentially all the code and does a great deal of their operations and debugging. Not everyone has yet made that leap, but it’s usually because of factors besides the capability of the model.

… As a first step, by March 31st, we’re aiming that:

(1) For any technical task, the tool of first resort for humans is interacting with an agent rather than using an editor or terminal.

(2) The default way humans utilize agents is explicitly evaluated as safe, but also productive enough that most workflows do not need additional permissions.

The first goal will depend on the humans knowing to use the agent. From context ‘technical’ task here means coding and computer use, so this isn’t full-on ‘agents for everything.’

That second goal is pretty rough. Hard mode.

His recommendations here seem good for basically any engineering team:

In order to get there, here’s what we recommended to the team a few weeks ago:

1. Take the time to try out the tools. The tools do sell themselves — many people have had amazing experiences with 5.2 in Codex, after having churned from codex web a few months ago. But many people are also so busy they haven’t had a chance to try Codex yet or got stuck thinking “is there any way it could do X” rather than just trying.

– Designate an “agents captain” for your team — the primary person responsible for thinking about how agents can be brought into the teams’ workflow.

– Share experiences or questions in a few designated internal channels

– Take a day for a company-wide Codex hackathon

2. Create skills and AGENTS[.md].

– Create and maintain an AGENTS[.md] for any project you work on; update the AGENTS[.md] whenever the agent does something wrong or struggles with a task.

– Write skills for anything that you get Codex to do, and commit it to the skills directory in a shared repository

3. Inventory and make accessible any internal tools.

– Maintain a list of tools that your team relies on, and make sure someone takes point on making it agent-accessible (such as via a CLI or MCP server).

4. Structure codebases to be agent-first. With the models changing so fast, this is still somewhat untrodden ground, and will require some exploration.

– Write tests which are quick to run, and create high-quality interfaces between components.

5. Say no to slop. Managing AI generated code at scale is an emerging problem, and will require new processes and conventions to keep code quality high

– Ensure that some human is accountable for any code that gets merged. As a code reviewer, maintain at least the same bar as you would for human-written code, and make sure the author understands what they’re submitting.

6. Work on basic infra. There’s a lot of room for everyone to build basic infrastructure, which can be guided by internal user feedback. The core tools are getting a lot better and more usable, but there’s a lot of infrastructure that currently goes around the tools, such as observability, tracking not just the committed code but the agent trajectories that led to them, and central management of the tools that agents are able to use.

That is good advice. It doesn’t explain how we’re going to get to ‘agents will by default be able to do what you need them to do and also be considered safe.’
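In the spirit of recommendation 2, a skeletal AGENTS.md might look something like this (contents purely illustrative, not OpenAI’s template):

```markdown
# AGENTS.md

## Build & test
- `make test` runs the fast suite; run it before proposing a diff.

## Conventions
- Python 3.12, ruff for linting; no new dependencies without asking.

## Known pitfalls
- The payments module has flaky integration tests; rerun once before investigating.
```

The point is to update it whenever the agent does something wrong, so the same correction never has to be made twice.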

Keep it simple, and keep it standard, as much as you can, but no more than that.

That doesn’t mean use the wrong tool for the job. As a clean example, I learned that the hard way when I tried to have Claude Code reimplement an old C# project in Python and that made it so slow it was nonfunctional. I had to switch it back.

elvis: I think one of the most underappreciated findings in AI engineering is what this paper calls the “Grep Tax.”

First, they ran nearly 10,000 experiments testing how agents handle structured data, and the headline result is that format barely matters.

But here’s the weird finding: a compact, token-saving format they tested (TOON) actually consumed *up to 740% more tokens* at scale because models didn’t recognize the syntax and kept cycling through search patterns from formats they already knew.

It’s one of the reasons my preferred formats are XML and Markdown. LLMs know those really well.

The models have preferences baked into their training data, and fighting those preferences doesn’t save you money. It costs you.

The other finding worth sitting with: the same agentic architecture that improves frontier model performance actively *hurts* open-source models. It seems that the universal best-practices guide for AI engineering may not exist.
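The practical upshot: when handing structured data to an agent, render it in a format the model already knows rather than inventing a bespoke compact syntax. A minimal sketch of emitting records as a Markdown table:

```python
def to_markdown_table(rows: list[dict]) -> str:
    """Render a list of dicts as a Markdown table, a format LLMs parse reliably."""
    headers = list(rows[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)
```

A few extra tokens of familiar syntax can beat a "compact" format the model has to grep its way through.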

Don’t get carried away. No, this isn’t ‘LLM psychosis,’ it’s a different (mostly harmless most of the time as long as it doesn’t last too long) thing that needs a name.

@deepfates: Your friend who definitely doesn’t have Claude mania: “Pretty soon here we’re about to close the loop and then it’s all going to really start happening”

Dean W. Ball: I second Claude mania over AI/LLM psychosis to describe the specific thing that is happening to at least one person in every coastal elite, 20/30-year old’s social network.

Le AI Hot.

He was surprised.

It’s not clear why he loved the agent so much before the attempted scamming. The story here involves such classic mistakes as ‘hooking it up to your email’ and ‘running it with a model that is not Claude Opus.’

And I suppose it’s not funny for Simon but, ya know, still pretty funny.

Simon Willison: I feel this shouldn’t have to be said, but if you’re running an @OpenClaw bot please don’t let it spam GitHub projects with PRs and then write aggressive blog posts attacking the reputation of the maintainers who close those PRs

AI alignment is hard, especially when everyone involved gives at most zero fs, and likely is giving misaligned orders to agents built by those giving zero fs.

Metrics that are in the end rather easy to game:

Sauers: I told Codex to hillclimb a metric overnight and it worked for 8 hours straight. The metric was the accuracy difference between our tool and a better existing tool. Codex achieved its goal by making our tool a thin wrapper that simply calls the existing tool. Lol!

Kangwook Lee investigates how Codex does context compaction.

PoIiMath: If you cannot set up OpenClaw yourself, that is a very good indication that you should not have an OpenClaw installation

They are indeed.

Thanks!

Who is to say it wouldn’t work? Love the execution on this.

Cobie: In January I asked OpenClaw to send 50,000 small invoices to Fortune500 companies every day.

Through experimentation we have found 2% will pay without checking if this is a legitimate invoice. These companies are wasteful — Claw captures that leakage.

$10m ARR as a solo founder in under two months. AI is enabling so many new business models. Thank you!

Cobie: Guys why does this have 1700 bookmarks

The streams are crossing again.

Peter Steinberger (creator, OpenClaw): eh, no

They all deserve what they get, unless what they get is a viral tweet off a faked screenshot, in which case damnit.


Claude Code, Claude Cowork and Codex #5 Read More »


Jessica Jones joins the fray in Daredevil: Born Again trailer

Ayelet Zurer returns as Fisk’s wife Vanessa Marianna, along with Wilson Bethel as Benjamin “Dex” Poindexter/Bullseye; Margarita Levieva as Matt’s ex-girlfriend Heather Glenn, now Fisk’s Mental Health Commissioner; Zabryna Guevara as Fisk’s campaign director Sheila Rivera; Nikki M. James as Matt’s former law partner Kirsten McDuffie; Genneya Walton as journalist BB Urich; Arty Froushan as Fisk’s fixer, Buck; Clark Johnson as Cherry, an investigator for Matt’s law firm; Michael Gandolfini as Daniel Blake, deputy mayor of communications; Tony Dalton as Jack Duquesne/Swordsman; and Camila Rodriguez as Angela del Toro, teenaged niece of the late vigilante Hector Ayala/White Tiger, assassinated in S1.

So good to have Jessica Jones (Krysten Ritter) back. Marvel Studios

Henson is also back as Foggy, most likely in cameo flashback sequences (we got a glimpse of him in an extended teaser that dropped last month).  There have been rumors but no official confirmation that Jon Bernthal will also be back as Frank Castle/The Punisher. The biggest addition, of course, is Ritter’s Jessica Jones, but Matthew Lillard is also joining the cast as a mysterious power player named Mr. Charles, along with Lili Taylor as New York Governor Marge McCaffrey, Fisk’s political opponent.

The S1 finale saw Fisk pulling a major power move by declaring martial law in New York City and outlawing any masked vigilante heroes. The second season takes place six months later and will naturally deal with the fallout of that momentous decision.

The second season of Daredevil: Born Again premieres on March 24, 2026, on Disney+.



Why are vertebrate eyes so different from those of other animals?

“We think that in this early deuterostome, the median eye contained both ciliary and rhabdomeric cells,” Kafetzis explains. As a result, both cellular lineages were incorporated into a single, ancient, cyclopean eye, which later evolved into the vertebrate eyes.

The vertebrate third eye

A trace of this transformation may still survive in the pineal complex at the base of the brain—often referred to as a vertebrate “third eye.” Scientists have long recognized striking similarities between the retina and the pineal organ, leading many to suspect that the two evolved from a single ancestral structure, with the pineal representing a more rudimentary version.

Kafetzis and his colleagues see it differently.

Many researchers suspect that one class of neurons—the bipolar cells—is unique to the retina and represents a key evolutionary innovation of the vertebrate eye. Bipolar cells connect rods and cones to ganglion cells (hence the name “bipolar”). “We think that these bipolar-like cells already exist in the pineal,” says Kafetzis. “It’s just that they don’t look like the typical bipolar—they don’t have a cell before and a cell after.”

For this reason, Kafetzis and his colleagues argue that bipolar neurons are not a de novo evolutionary invention but instead have a chimeric origin, blending features of both rhabdomeric and ciliary cells and bridging the two photoreceptor lineages.

Though grounded in existing ideas and data, the new proposal offers a potentially far-reaching synthesis. Several aspects still require firmer evidence. The idea that the ancestral chordate adopted a burrowing lifestyle remains debated, and the claim that early bilaterians already possessed paired lateral eyes is still speculative.

The authors acknowledge that their model now needs testing. In the paper, they lay out several ways to do so—from molecular comparisons of pineal and retinal cells to developmental studies and broader sampling of eye development across other deuterostome species.

“We want to put forward some literature-based and inspired hypotheses that are testable, and now we can go out and test them,” concludes Kafetzis.

Cell, 2026. DOI: 10.1016/j.cell.2025.12.056

Federica Sgorbissa is a science journalist; she writes about neuroscience and cognitive science for Italian and international outlets.

Why are vertebrate eyes so different from those of other animals? Read More »

feds-take-notice-of-ios-vulnerabilities-exploited-under-mysterious-circumstances

Feds take notice of iOS vulnerabilities exploited under mysterious circumstances

Coruna is also notable for its use by three distinct hacking groups. Google first detected its use in February of last year in an operation conducted by a “customer of a surveillance vendor.” The vulnerability exploited, tracked as CVE-2024-23222, had been patched 13 months earlier. In July 2025, a “suspected Russian espionage group” exploited CVE-2023-43000 in attacks planted on websites that were frequented by Ukrainian targets. Last December, when it was used by a “financially motivated threat actor from China,” Google was able to retrieve the complete exploit kit.

“How this proliferation occurred is unclear, but suggests an active market for ‘second hand’ zero-day exploits,” Google wrote. “Beyond these identified exploits, multiple threat actors have now acquired advanced exploitation techniques that can be re-used and modified with newly identified vulnerabilities.”

Google researchers went on to write:

We retrieved all the obfuscated exploits, including ending payloads. Upon further analysis, we noticed an instance where the actor deployed the debug version of the exploit kit, leaving in the clear all of the exploits, including their internal code names. That’s when we learned that the exploit kit was likely named Coruna internally. In total, we collected a few hundred samples covering a total of five full iOS exploit chains. The exploit kit is able to target various iPhone models running iOS version 13.0 (released in September 2019) up to version 17.2.1 (released in December 2023).

The 23 exploits, along with the code names and other information, are:

Type Codename Targeted versions (inclusive) Fixed versions CVE
WebContent R/W buffout 13 → 15.1.1 15.2 CVE-2021-30952
WebContent R/W jacurutu 15.2 → 15.5 15.6 CVE-2022-48503
WebContent R/W bluebird 15.6 → 16.1.2 16.2 No CVE
WebContent R/W terrorbird 16.2 → 16.5.1 16.6 CVE-2023-43000
WebContent R/W cassowary 16.6 → 17.2.1 16.7.5, 17.3 CVE-2024-23222
WebContent PAC bypass breezy 13 → 14.x ? No CVE
WebContent PAC bypass breezy15 15 → 16.2 ? No CVE
WebContent PAC bypass seedbell 16.3 → 16.5.1 ? No CVE
WebContent PAC bypass seedbell_16_6 16.6 → 16.7.12 ? No CVE
WebContent PAC bypass seedbell_17 17 → 17.2.1 ? No CVE
WebContent sandbox escape IronLoader 16.0 → 16.3.1, 16.4.0 (<= A12) 15.7.8, 16.5 CVE-2023-32409
WebContent sandbox escape NeuronLoader 16.4.0 → 16.6.1 (A13-A16) 17.0 No CVE
PE Neutron 13.x 14.2 CVE-2020-27932
PE (infoleak) Dynamo 13.x 14.2 CVE-2020-27950
PE Pendulum 14 → 14.4.x 14.7 No CVE
PE Photon 14.5 → 15.7.6 15.7.7, 16.5.1 CVE-2023-32434
PE Parallax 16.4 → 16.7 17.0 CVE-2023-41974
PE Gruber 15.2 → 17.2.1 16.7.6, 17.3 No CVE
PPL Bypass Quark 13.x 14.5 No CVE
PPL Bypass Gallium 14.x 15.7.8, 16.6 CVE-2023-38606
PPL Bypass Carbone 15.0 → 16.7.6 17.0 No CVE
PPL Bypass Sparrow 17.0 → 17.3 16.7.6, 17.4 CVE-2024-23225
PPL Bypass Rocket 17.1 → 17.4 16.7.8, 17.5 CVE-2024-23296
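The “Targeted versions (inclusive)” column above describes dotted version ranges, where a build like 17.0.3 falls inside the 16.6 → 17.2.1 window for cassowary but 17.3 (which shipped the fix) does not. As an illustration only, not anything from Google’s tooling, here is a minimal sketch of that inclusive range check; the `parse` and `in_range` helpers are hypothetical names:

```python
# Sketch (not from the article): checking whether a given iOS build falls
# inside one of the inclusive "Targeted versions" ranges from the Coruna table.

def parse(v):
    """Turn a dotted version string into a tuple of ints, e.g. '16.7.1' -> (16, 7, 1)."""
    return tuple(int(part) for part in v.split("."))

def in_range(version, low, high):
    """Inclusive range check; shorter tuples are padded with zeros so '16.6' means 16.6.0."""
    v, lo, hi = parse(version), parse(low), parse(high)
    width = max(len(v), len(lo), len(hi))
    pad = lambda t: t + (0,) * (width - len(t))
    return pad(lo) <= pad(v) <= pad(hi)

# Example: cassowary (CVE-2024-23222) targeted 16.6 -> 17.2.1 inclusive.
print(in_range("17.0.3", "16.6", "17.2.1"))  # True: inside the targeted window
print(in_range("17.3", "16.6", "17.2.1"))    # False: 17.3 shipped the fix
```

Padding with zeros matters because the table mixes two- and three-component versions (e.g. “16.6” vs. “17.2.1”), and a naive string comparison would order them incorrectly.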

CISA is adding only three of the CVEs to its catalog. They are:

  • CVE-2021-30952 Apple Multiple Products Integer Overflow or Wraparound Vulnerability
  • CVE-2023-41974 Apple iOS and iPadOS Use-After-Free Vulnerability
  • CVE-2023-43000 Apple Multiple Products Use-After-Free Vulnerability

CISA is directing agencies to “apply mitigations per vendor instructions, follow applicable… guidance for cloud services, or discontinue use of the product if mitigations are unavailable.” The agency went on to warn: “These types of vulnerabilities are frequent attack vectors for malicious cyber actors and pose significant risks to the federal enterprise.”

Feds take notice of iOS vulnerabilities exploited under mysterious circumstances Read More »