Author name: Mike M.

ars-live:-consumer-tech-firms-stuck-scrambling-ahead-of-looming-chip-tariffs

Ars Live: Consumer tech firms stuck scrambling ahead of looming chip tariffs

And perhaps the biggest confounding factor for businesses attempting to align supply chain choices with predictable tariff costs is looming chip tariffs. Trump has suggested those could come in August, but nearing the end of the month, there’s still no clarity there.

As tech firms brace for chip tariffs, Brzytwa will share CTA’s forecast based on a survey of industry experts, revealing the unique sourcing challenges chip tariffs will likely pose. It’s a particular pain point that Trump seems likely to impose taxes not just on imports of semiconductors but of any downstream product that includes a chip.

Because different electronics parts are typically assembled in different countries, supply chains for popular products have suddenly become a winding path, with potential tariff obstacles cropping up at any turn.

To Trump, complicating supply chains seems to be the point, intending to divert entire supply chains into the country to make the US a tech manufacturing hub, supposedly at the expense of his prime trade war target, China—which today is considered a world manufacturing “superpower.”

However, The New York Times this week suggested that Trump’s bullying tactics aren’t working on China, and experts suggest that now his chip tariffs risk not just spiking prices but throttling AI innovation in the US—just as China’s open source AI models shake up markets globally.

Brzytwa will share CTA research showing how the trade war has rattled, and will likely continue to rattle, tech firms into the foreseeable future. He’ll explain why tech firms can’t quickly or cheaply divert chip supply chains—and why policy that neglects to understand tech firms’ positions could be a lose-lose, putting Americans in danger of losing affordable access to popular tech without achieving Trump’s goal of altering China’s trade behavior.

Add to Google Calendar | Add to calendar (.ics download)

Ars Live: Consumer tech firms stuck scrambling ahead of looming chip tariffs Read More »

porsche’s-next-cayenne-is-fully-electric—we-drove-the-prototype

Porsche’s next Cayenne is fully electric—we drove the prototype

The original Cayenne saved Porsche. How will the fourth-generation model do? Porsche

But I spent much of my time behind the wheel at more moderate velocities, winding around the narrow, blind roads that work their way around the Catalan region of Spain. Porsche hasn’t yet quoted a curb weight for any of the Cayenne Electric flavors, but however far it tips the scales, it still feels light and nimble. Steering is firm but sharp with decent feedback, and this big SUV dives into and screams out of corners with perfect poise.

It was only really over big, unsettling movements, speed bumps and the like, that I could feel how much mass was beneath me in the Cayenne Electric. When summiting asphalt imperfections like that, the curious shape of that central OLED really shone.

That display is bent at roughly a 45-degree angle, a profile that allows it to perfectly conform to both the angle of the dashboard and that of the center console. Porsche placed a padded wrist rest right beneath that and then designed the user interface to position the most important controls along the lower portion of the display, the part that’s in line with your hand.

The result is you can rest your wrist there comfortably, queue up your favorite playlist, and crank the ventilated seats, all without making any accidental taps on bumpy roads. And despite this car not entering production until next year, that software was snappy and responsive. It didn’t lock up on me once during a full day behind the wheel.

A prototype Porsche Cayenne Electric drifts in the dirt, throwing up a rooster tail.

You’ll need a low-grip surface if you want to go sliding around. Credit: Porsche

Yes, next year is a long time to wait for the Cayenne Electric to enter production. It’s hard to know what the American EV scene will look like in three months, never mind 12, but for now, at least, Porsche’s next SUV is shaping up extremely well. When it does hit the market, it will sit in dealerships alongside the existing Cayenne, which will continue to be available. Choice is good, and if you’re in the market but not in a hurry, I’d suggest waiting for this. If the price is right, it will be a clear-cut winner.

Porsche’s next Cayenne is fully electric—we drove the prototype Read More »

traffic-and-transit-roundup-#1

Traffic and Transit Roundup #1

Traffic and transit are finally getting a roundup all their own.

I’ll start out with various victory laps on the awesomeness that is New York City Congestion pricing, which should hopefully now be a settled matter, then do a survey of everything else.

We spent years fighting to get congestion pricing passed in New York City.

Once it was in place, everyone saw that it worked. The city was a better place, almost everyone was better off, and also we got bonus tax revenue. And this is only the bare minimum viable product, we could do so much better.

We had a big debate over whether congestion pricing is good until it was implemented. At this point, with traffic speeds downton up 15% and business visits improving, it is very clear this was exactly the huge win Econ 101 predicts.

Cremieux: The first empirical evaluation of New York’s congestion pricing has just been published.

Spoiler: It worked really, really well.

On average, road speeds went up by a whopping 16%!

But here’s something interesting:

Speeds on highways went up 13%, arterial road speeds went up by 10%, and local road speeds increased by 8%.

None of that’s 16%, and that’s important: This means congestion pricing sped roads up, but also sorted people to faster roads.

In response to having to pay a toll, people not only got off the road, they also made wiser choices about the types of roads they used!

Now let’s look at the times of day, as a check on the model.

It works: Congestion pricing just boosts speed when it’s active and shortly after:

As another check, let’s look at the effects by location.

In the CBD, trips are faster. Going to the CBD, trips are faster. Leaving it, trips are faster, but not much. And outside of it, where congestion pricing is irrelevant? No effect.

The thread continues, and the news only improves from there.

Feels Desperate: Is there any evidence there was uptick of public transportation?

Could be net economic loss.

Cremieux: Yes! Foot traffic also went up, Broadway ticket sales got better, noise pollution declined. Congestion pricing seems to have delivered better times all around.

Even honking complaints are down 69%. Nice.

The comments somehow consistently still fill with people saying how horrible everything must be and how we can’t trust any of the data, no way this can be happening, it must somehow be a huge disaster. We ignoring these silly wabbits.

Every NYC mayoral candidate supported congestion pricing. There’s a reason.

Alec Stapp: NYC congestion pricing is going extremely well even though it’s only a static tolling system.

Imagine how good it would be with a dynamic tolling system based on real-time traffic data.

If business is actively up, what more is there to say?

The only real enemy left is Donald Trump, who is determined to wage war to kill congestion pricing, presumably because he hates Manhattan and wants us to suffer, or perhaps because of his belief that Trade Bad. But he’s the President, and he’s commanding the Department of Transportation to go to war over this, and there’s a decent chance they will win and make all our lives substantially worse.

Because Trump does not like that New York has this nice thing, and is trying to kill it.

The good news there is that it seems the good guys are winning for now.

Joe Weisenthal: *NY WINS BID TO STOP US FROM WITHHOLDING FUNDS OVER CONGESTION

The wording on the surveys here are weird since they asked whether ‘Trump should permit this to continue.’ What business of Trump’s is New York doing congestion pricing? But the results are telling, especially in relative terms. The results here are now months old, and the more people are exposed to congestion pricing and see the results, the more they approve of it.

David Meyer: the congestion pricing polling upswing is here.

Matthew Yglesias: This is the pattern we’ve seen in other cities around the world — road pricing is controversial when introduced but sticky once it’s in place because like the reduced congestion.

In particular, those who drive into the central business district several times a week (also known as ‘those who pay the fee’) support congestion pricing 66%-32%, and those who do it a few times a month support 51%-47%, and Manhattan residents (who take the cabs that also pay fees although modestly less than they should) support 57%-36%, but support statewide remains in the red, 27%-47%.

Erin Durkin: Fascinating. State voters overall oppose congestion pricing 27% to 47%, but people drive into the congestion zone support it — 66%-32% for those who drive every week and 51%-46% for those who drive a few times a month. The people who actually pay the toll support it the most!

Nate Silver: Have heard several anecdotal accounts about this, too, from people commuting into the congestion zone from NJ, Long Island, and northern Manhattan.

The caveat is that all of these are people with high-paying jobs. If you’re billing hours / valuing your time at a high rate, it’s a great deal, less true for working-class jobs.

Reis: I drive in from NJ every Tue, Wed Thu. Get up at 4 and breeze through the tunnel, but not quite early enough to avoid the toll. But I try to get out by 2: 30 to avoid the crush back out. Since congestion pricing it doesn’t really matter if I wait until after 3. I barely wait to get out anymore. Worth every penny, even though there is no way it’s going towards “infrastructure.”

The only way you are worse off is if your hourly for being in traffic is low, so either you have to pay a toll without getting value in return (if you pay the $9) or not take the trip (if the trip wasn’t that valuable to you). In the second case, system is functioning as designed. What Reis is doing is totally the system working as designed. The first case is slightly unfortunate redistribution, but this was never supposed to be a Pareto improvement. If you wanted to do some (very small) progressive redistribution to fully compensate, that would be super doable.

Traffic in some areas outside NYC’s congestion pricing zone may have gotten slightly worse, as opposed to the bridges and tunnels where things are much improved. Meanwhile the buses are packed and moving much faster. Sounds like we need more robust congestion pricing.

Here’s a fun bonus:

Toby Muresianu: Subway crime is down 36% – and traffic fatalities down 44% – since congestion pricing started.

As the article notes, there are also more cops in the subway now and that may be a factor.

While more security on trains is good in my book, the decline also started before that was implemented (which was gradually between 1/20 and 1/23).

The declines here are absolute numbers, not per trip, so per trip the drop is bigger.

I presume those numbers are too big to purely be congestion pricing, and the cops obviously matter, but so does ridership, both quantity and quality. Critical mass of people on mass transit makes you much safer, in addition to justifying better service. It’s basically great until the point where you don’t get a seat. Then it’s no big deal until when you start to be nervous about getting on and off. That sucks, but I continue to find that to be mostly a peak of rush hour 4-5-6 line problem.

As for many other complaints, this seems definitive?

Foot traffic is what matters for business, not car traffic. The false alarms were all ‘foot traffic is way down.’ If that went the other way, we’re golden.

Avi Small: Someone make sure this @amNewYork front page gets into the @USDOT morning clips!

“Manhattan businesses thriving, subways booming in congestion pricing era”

Effective Tranis Alliance: Today Hochul & the MTA confirmed that the planned end-to-end runtime of the IBX has been cut a full 10 min down to 32 and projected ridership is up 41k to 160k/day. This is the power of grade separation and the All Faiths tunnel.

Hunter: This single light-rail running through Queens and Brooklyn is projected to have 58M riders annually, more than all of SF’s BART system lol

Will have 45% of the Chicago L’s total annual ridership despite being just a single line.

Having this line available would shorten travel times in a lot of non-obvious ways, since it lets you more easily transpose between train lines. If this is buildable and could run the whole way in 32 minutes it is an obviously excellent pick.

There would also be a lot of value in extending the Second Avenue Subway properly, especially to take pressure off the Lexington (456) line, but that looks like it is simply not doable logistically at any sane price.

New York City bus fare evasion rates are up to 48%. Under the new mayor I wouldn’t be surprised to see it a lot closer to 100% and I expect to have zero motivation to pay his administration for a bus ride. I see two options.

  1. Give up, reduce friction and make the bus free. This would be my instinct. You want more people taking buses, doing so is good for everyone, and you weren’t enforcing the rules anyway. The homelessness (or ‘sleep on the bus’) problem is the main reason why not, but there are solutions.

  2. Put plainclothes officers on the buses and have them earn $600 each hour (Claude’s estimate, I think it could be even higher) for the city writing tickets until people stop evading the fare?

Why wouldn’t option two work? The MTA has indeed declared, ‘no more free bus rides for fare evaders,’ using a similar strategy, and somehow people are arguing it won’t work? The only argument why not I can think of is unwillingness to scale it?

Ana Ley and Anusha Bayya (NYT): On Thursday, a group of eight police officers and eight transit workers stood waiting for a crosstown bus on the Upper East Side of Manhattan, some with ticket-writing pads in hand.

When the bus arrived, they boarded and led a woman in black scrubs out onto the street and issued her a $100 summons for skipping the fare. The officers and transit officials had singled the woman out after receiving cues from an undercover inspector who was observing riders on the bus.

Enforcement is especially difficult on buses, where there are no turnstiles or gates to block access. Union leaders advise bus drivers not to confront passengers who skip the fare, out of concern for the drivers’ safety.

M.T.A. union leaders said the money for enforcement would be better spent to fully subsidize fares.

Civil rights advocates raised concerns that the tighter fare enforcement would disproportionately affect the city’s most vulnerable residents.

“This is yet another example of the M.T.A. choosing public relations over public safety,” Mr. Cahn said. “It is a guaranteed way to lose money.”

Once again, I ask, how is sitting there writing $100 tickets unprofitable, and enforcement ‘a way to lose money’? Is collection of tickets so bad that you cannot pay the hourly cost for a police officer to write the tickets, even if you discount the incentive effects? I find this beyond absurd.

And seriously, the ‘civil rights advocates’ are giving such causes a bad name. If you want the bus to be free and pay with other taxes instead, advocate for that, and I’ll potentially support you although recent findings have tampered my enthusiasm for that solution. Don’t tell us not to enforce the law.

There is no third alternative. Half of people not paying is approaching the tipping point where no one pays. Indeed, there would soon be active pressure not to pay, as paying slows down boarding.

It is remarkable how well enforcement works, and how well it then reduces crime.

Josh Barro: DC Metro has achieved an 82-85% reduction in subway fare evasion through a combination of taller fare gates and enhanced enforcement. Crime on the system has also fallen to its lowest level in seven years.

Shoshana Weissmann: I was actually skeptical some of this would work, but glad it has. Saw so many people yell at me to go through while they force doors to stay open.

In other places, they’re not even trying.

Thomas Viola: I was in Seattle last weekend and it turns out their metro system is kind of new, and they haven’t figured out how to get people to pay to use it yet. Everyone just walks on.

I legit couldn’t find a spot to buy a ticket. There’s no turnstiles. Occasionally you’ll have a guy come around on the train and “check tickets” but I never saw one, and if you tell him you didn’t know he just says well buy one next time.

On first principles, free mass transit (such as the free buses recently promised to NYC) seem like an obviously good idea. You want people using them, you want people wanting to move around more, transaction costs are high and money is fungible.

But conventional wisdom says not so fast. Tallinn is the most often claimed example of mass transit being made free and it not helping, and several comments illustrated why all of this can get complicated by selection effects.

Alex Forrest: Tallinn, Estonia, made all transit free for city residents in 2013. By 2022, transit use had dropped from 40% to 30% of commutes, while car use had increased from 40% to 50% of commutes. Among low-income residents, car use doubled, and transit dropped from 60% to 35% of commutes.

As far as I can tell, the lack of fares didn’t *discourageridership, it just failed to make transit any more attractive, while structural factors encouraging car use went unaddressed. In other words, *fares were not a discouraging factorfor ridership.

Phineas Harper: This is misleading. Between 1990 & 2000 public transport use in Tallinn was in free fall (from 77% to 31%) as private car use shot up following independence. Making public transport free was an attempt to stop the free fall which is (just about) working.

Thomas Strenge: The same happened in Kansas City. Eliminating fares allowed more homeless and mentally ill to ride the bus, which made them less safe. This led to adverse selection with more “good” people avoiding the bus.

Robert Bernhardt: germany had also the ‘9€ ticket’ in 2022. all local & regional trains for a whole month for 9€, ie almost free. and yet car usage didn’t really drop.

Border Sentry: There are two buses that travel into town near me. I take the one that costs more, because there are fewer people and they’re more civilised.

The core question is, are the fares actually stopping people who you want to ride?

My personal experiences say yes to some extent, especially when you’re considering a zero marginal cost alternative like walking, or when the annoyance of paying the fare enters play, and when you are young. It matters some, especially at lower incomes.

Having fully free buses also means that children who don’t have money can get home.

But ultimately, everyone who studies this or looks at their own experience seems to agrees this a relatively minor concern. How often and how reliably the bus or subway comes, how fast it goes, how comfortable and crowded it is, and how safe you feel are all more important factors. And without the money from the fares, yes money is fungible but the political economy involved means funding will likely decline.

In addition, the people who will ride a lot more for free than for a small price are exactly the people others do not want to ride alongside. We have the experiments that show that cracking down on fare evasion greatly reduces crime and generally makes transit more pleasant, which generates positive feedback loops.

So sadly, I have learned my lesson. I no longer in favor of mass transit being free, although I do think that heavy encouragement of buying monthly passes is good so that marginal cost drops to zero. Ideally this could be attached to tax filing?

I do also still think free is superior to technically not free if that is unenforced.

San Francisco restaurants often close before 10pm, one reason is that workers have to commute and the BART stops at midnight. I am absolutely baffled, as a New Yorker, that they don’t run trains after that. The NYC mind cannot comprehend.

Quietly, the MTA union convinced the state legislature to mandate two train operators per train. Hopefully Hochul does not sign this outright theft.

Caroline Spivack: The @TWULocal100 has quietly championed legislation that would require two workers operate a train. The bill, if signed into law by Gov. Hochul, would be a big setback to the MTA’s efforts to reduce labor costs.

Sam D’Amico: They should have zero operators.

David Zipper: More evidence that transit improves public health: When a new rail station opened in Osaka, nearby residents’ health expenditures fell ~$930 per capita over four years.

Bella Chu: I am unaware of any US study that has attempted to estimate the health costs and consequences associated with displacing walking-as-transportation at the population level. I expect the numbers would be staggering.

The study seems to have tracked a cohort over time, avoiding most selection effects. It seems like an extreme result, but if true then presumably it more than pays for itself.

Also a reminder to never ever get on a motorcycle if you have any choice in the matter.

California high speed rail connecting Los Angeles to San Francisco is a great idea.

Or it would be, if you were able to actually lay the track. There’s the rub.

Hayden Clarkin: California HSR will connect two metropolitan areas with a combined GDP equivalent to that of Australia in just 2 hours and 40 minutes. How am I supposed to not think this is the most transformational transportation project on the continent?

“Well flying is faster!” Yes, if you’re going to San Bruno and not San Francisco or San Jose, and if you want to talk about the mess that is flying in and out of LAX, be my guest.

Forget the issues associated with the project for a moment, it’s hard to argue a fast train that connects every major city in a state with the fourth largest economy in the world quickly isn’t a bad project, period.

All those mad about projects being over budget never talk about how Texas is spending $9 Billion to add another lane to a highway. Road and highway projects get rubber stamps and transit and bike lanes need every penny scrutinized. Be fair in your criticism

If I recall, something like 60-80% of domestic travel in Japan is by train…

Push the Needle: Japan’s shinkansen high speed rail map overlaid on the west coast to scale.

Danielle Fong: when

Yes, the complaints are all about the terrible execution, but also that seems sufficiently terrible to sink all this? The part where they take in a lot of money and then do not build HSR seems like a fatal flaw.

Mayor Pete had a ticket to ride in all the wrong places, but at least he’s in the game.

Former Secretary Pete Buttigieg: We’re working on the future of America’s passenger rail system—funding high-speed rail projects in the West and expanding service for communities across the country. Get your ticket to ride!

This is of course a deeply stupid map. Why do we want a second line from Minneapolis to Seattle, when you can take the existing one to Portland and then ride to Seattle? Why do we put a high speed rail line from Charlotte to Atlanta, and Dallas to Houston, and not upgrade the Acela line?

Whereas here’s how people having a normal one do it.

Hayden: In case you’re wondering how far behind the USA is on infrastructure, France is building 120 miles of automated subway lines with 65 stations for $45 billion in 17 years.

The lack of focus on Acela in the previous plan, in particular, is completely insane. The United States has one area where high speed rail would actually be a great big deal, where all the passengers and people are, that is economically super valuable. That is the eastern line between Boston, New York City, Philadelphia and Washington, D.C.

Improving that line to be true high speed rail would be an absolute game changer. We could then discuss extending that effort elsewhere. Instead, it gets completely ignored. And no, I don’t want to hear about permitting issues, you’re the government. Fix It.

Compared to that, all the other high speed rail proposals are chump change, and they don’t fit together into anything cohesive, and also we don’t seem to be able to actually build them. I’m sufficiently gung ho that I think they’re still good ideas if we can actually make them exist, and once you have some good pairwise connections you can look to expand from there, but that seems to be a tall order.

Yoshie Furuhashi: Why don’t Philadelphia landlords lobby for faster HSR than Acela from Philadelphia to NYC, bringing the travel time between the cities under 45 minutes, so they can raise Philadelphia rents?

Joe Weisenthal: Unironically, seems like the economic gains would be massive. But we all know that it will never happen, and why.

Because it’s politically impossible to do construction at that scale in the US.

Well, maybe. But what if this was actually feasible?

Matthew Yglesias: Kate is mildly concerned that too many train takes will lead to mass flight of subscribers and the collapse of our business, but I cannot resist the temptation to write about the NYU Marron Center Transit Cost Project’s new report on Northeast Corridor High-Speed Rail.

The report’s authors make some striking claims:

  1. It’s possible to create Northeast Corridor HSR such that both Boston-NYC and NYC-Washington would take about 1: 56.

  2. Trains would run every ten minutes between Philadelphia and New Haven, and every fifteen minutes1 north and south of there.

  3. This can be done for a relatively modest price: $12.5 billion in new infrastructure and $4.5 billion in new trains.

This is a lot less than the $117 billion that the Northeast Corridor Commission is asking for in its high-speed rail proposal. The difference is so large that it’s not just that the TCP plan is cheaper and would save money — the NECC plan is so expensive that it’s simply not going to happen under any conceivable political alignment.

The TCP plan, by contrast, could actually be achieved if the relevant stakeholders (which I think is primarily the governors of Massachusetts, Connecticut, New York, New Jersey, Pennsylvania, and Maryland) want to do it. They would, of course, want some money from Uncle Sam, and there would be the difficult question of portioning out the state spending. But it’s clearly within the means of the region.

It would also be a genuinely lucrative franchise, such that I think it’s pretty easy to imagine the capital being raised privately and the operations being undertaken by a new private company rather than Amtrak.

This project seems obviously worthwhile even at $117 billion if it would actually happen. At $17 billion it is absurdly great and yes the region should be happy to fund, or for a private company to fund since I agree it sounds profitable. If I had an extra $20 billion lying around I would be seriously plotting this out and seeing if various states and the White House would sufficiently play ball.

The cost is that this kills off the Northeast Regional, which cuts some stations off from intercity rail. I think Matthew Yglesias is right that this is a worthwhile trade, but I worry that it is politically a very big problem for such a project. I’d be willing to (inefficiently) give back some of the gains here to fix that in one of various ways.

Hayden: Today, Los Angeles and the Bay Area will see 130 flights in both directions, equivalent to a plane departing every 6.5 minutes for 18 hours a day. Hard to argue it’s not a perfect candidate for fast trains.

Noah Smith: If California could ever actually build even a single mile of high speed rail, this would be the place to build it!

But they can’t, so this will never happen.

Sam D’Amico: Once you realize I-5 has a *medianthey could have put tracks on you become the joker.

Alex Tabarrok: Sam is correct, a rail line could be built on the I-5 from San Francisco to LA and indeed the French operators SNCF proposed just that.

Rejected for political equity reasons!

Matthew Yglesias: The reasons for rejection were bad, but I don’t “equity” is a good description — it’s the political influence of the Central Valley cities who wanted direct service.

Weak parties + decentralized political institutions is bad news for trains.

Alex Tabarrok: Agreed.

It’s good when people ride your trains. Or is it?

Palmer Lucky: lmao, Caltrain’s tweet claiming their trains are “100% Billionaire-free” got deleted after me and a bunch of other Caltrain-riding billionaires responded.

Don’t they know that techno-autists all love trains?

Marc Fisher tells us that Bike Lanes Are Not About Bikes, because there are not many bikes. He claims it is instead about intentionally shrinking the roads to discourage driving. I find this remarkably plausible.

Privatization via private equity that comes along with new investment improves airports in terms of number of airlines, number of flights, profitability and user experience. They do not find evidence that going around privatizing all airports on principle would work. The model here is that some airports would give good returns on investment, and selling those outright to those willing to make the investments works out, and works far better than merely selling control rights.

Young debater goes 19-1 arguing for Jones Act repeal, including 7-0 across the nation and at the national championship, their favorite part is watching judges laugh at the absurdities. Which feels like cheating – if you win your debates purely because your position is correct, that doesn’t seem fair.

Ritchie Torres: The Jones Act is a tax on Puerto Rico, whose three million American citizens are subject to federal dominion without federal representation. Puerto Rico is ground zero for taxation without representation.

For the Jones Act is a hidden tax on the energy needs of an energy-poor island. US policy is perversely producing energy scarcity in precisely the place where energy is scarcest.

Jared Polis (D-Gov of Colorado): The Jones Act is a tax on all of us, raising prices on everyday products like food, clothes and electronics. It hits the people of Puerto Rico and Hawaii particularly hard, but it hurts all Americans including landlocked Coloradans.

Without the Jones Act we could use ferries on the Great Lakes, with 60 million people living on their coastlines.

Brian Potter asks whether US ports need more automation. Surprisingly he does not consider this a rhetorical question. Especially strange is saying union rules make it hard to take advantage of automation, rather than union rules being the reason to do automation so that you can reduce reliance on the union. Mostly he seems to be saying ‘automation is far from the only problem to solve,’ and sure, but that doesn’t mean we shouldn’t automate. And I have no doubt if automation currently isn’t good enough that automation would then start to pay much bigger dividends within a few years, as AI advances flow through to automated terminals.

By contrast, Cremieux estimates gains from automation are enormous, and recommends simply paying off the Longshoreman’s Association to permit this.

Cremieux: The potential gains to port automation are so enormous that Trump is making a huge mistake if he goes along with wishes of the mobsters in the ILA.

The gains on the table are so large that increasing an average port’s capacity by just one ship increases total trade by 0.67%.

It seems Cremieux is taking ‘automation makes the ports run faster’ as a given. I find it very hard to argue with this, after the things I’ve read about how manual ports work, and the failure of our ports to run 24/7 – obviously an automated port need never close, but most of ours do.

If you read Trump’s explanation of why he is opposing port automation, it’s literally zero-sum thinking that ‘these foreign companies should hire our American workers’ without asking the question of whether this makes the ports run better or worse. This is a man who scribbles ‘trade is bad’ on reports and thinks tariffs are good.

California closed a refinery and had to import fuel, so Because Jones Act it had to import the fuel across the Pacific, shipping within America is too expensive. Similarly, Puerto Rico gets its LNG from Spain, while our mainland exports LNG to Spain.

A cost comparison:

Colin Grabow: Maersk orders 16,000 TEU containerships from a South Korean shipyard for $207 million each.

Meanwhile, US-built (Jones Act) 3,600 TEU containerships go for $333 million each (2022 price).

South Korean-built ship per TEU: $13,000

US-built ship per TEU: $92,500

Sad: Even Joe Biden considered coming out for Jones Act repeal, but the president ‘personally didn’t want to do anything that was anti-union.’ Sigh.

They’re building a private rail line between Los Angeles and Las Vegas. It’s a short line, but what about a private line between Las Vegas Airport and the Las Vegas Strip? We’re still waiting.

We are also doing it by changing the name, Oakland Airport to potentially be changed to San Francisco Bay Oakland International Airport. San Francisco has threatened to sue, because this sounds suspiciously like someone might build something or engage in a real world physical world action. We can’t have that. Actually, their reason is that ‘they own the trademark’ and that SFO is one of the busiest airports in the world. Well, yes, so they should welcome there being more flights to OAK instead to free up space?

I notice that I keep not considering the possibility of flying into or out of OAK instead of SFO when visiting from New York, despite this not being obviously worse. I never check, because there is no easy code for ‘OAK or SFO’ when booking, which we need. You should be able to say ‘SFB’ or something, the way you can say ‘NYC’ and get JFK, LaGuardia and Newark.

Whereas one thing we are not doing is building or expanding airports where they would be most valuable. Brian Potter asks why, and provides the answers you would expect. Airports are huge. Airports and airplanes are noisy. No one wants them around, and environmental groups oppose them as well, including (he doesn’t say this but it is obvious) because such people simply do not want planes to be flying at all. So this is the final boss of obstruction, the result of which is that where we most need airports there is no hope of ever building a new one.

I defy this data, still, wow that’s weird.

Ben Schiller (Fast Company): Local stores next to the protected bike lane have seen a 49% increase in sales.

New York may have dropped in a recent ranking of cycling cities. But it does have some world class infrastructure, including a “complete street” on 9th Avenue, with a protected bike lane. Built in 2007, it was controversial at the time (like everything else bike-related in the city). But a study by the Department of Transport finds that it’s paid dividends economically. Local stores between 23rd and 31st streets have seen a 49% increase in sales, compared to an average of 3% for Manhattan as a whole.

I mean, no? Or at least, this is not causal. There is no way that the bike lane is boosting sales by a 46%. There are not enough bikes for that, this makes no sense. I have to assume that the street in question happens to be doing well, unless this is (and it would not shock me, shall we say) a massive data or calculation error.

California High Speed Rail subsidized the Bakersfield to Merced portion of track first, despite this being obviously not economically valuable, because the area had… bad air pollution? Because the Federal subsidies were so completely everything-bageled that this was the only way to unlock them. So there goes over $30 billion dollars. Meanwhile, going from Los Angeles to San Francisco would cost another $100 billion. I am down for even a remarkably expensive version of this project, because I think such efforts are transformational, and can then be extended. But also I have no faith that if we gave them $100 billion they would give us an operational high speed rail line.

The Brightline, by contrast, is a new privately constructed railway in Florida, by all accounts quite efficient and lovely, and doubtless providing a lot of surplus.

Michael Dnes tells the story of the M25 in London, for which there are no alternatives. No alternatives can be built, because the United Kingdom is a vetocracy where building roads is impossible. They decided not to widen the M25 because it would draw even more traffic there, creating more problems, but that means no solutions at all.

Remember when from 1840 to 1850, private Britons cumulatively invested 40% of British GDP into the country’s first rail network?

Beware Unfinished Bridges points out that often people will only buy a full set of complementary goods. Putting in a bike lane will only be helpful if it is sufficient to induce bike rides that use the lane, so doing this for half of someone’s commute likely provides as much value as a bridge halfway across a river.

The other thing building half a bridge does is provide strong incentive to finish the bridge, or to adjust where things are.

Whenever one builds capacity, especially infrastructure, it likely opens up the possibility of building more capacity of various types, as well as ways to adjust. This can then trigger a cascade. If you only built things that were profitable on the margin without any such additional building or adjustments, you would often miss most of the value and severely underbuild.

This is one reason I am typically eager to proceed with rail lines and other mass transit, even when the direct case does not seem to justify the cost. You have to start somewhere. If for example we do hook up a point in Los Angeles to Las Vegas via a new high speed rail line, then there is hope that this provides impetus to go further, also most of the gains are impossible to capture. So given a private group is remotely considering doing it, we should be ecstatic.

Discussion about this post

Traffic and Transit Roundup #1 Read More »

texas-suit-alleging-anti-coal-“cartel”-of-top-wall-street-firms-could-reshape-esg

Texas suit alleging anti-coal “cartel” of top Wall Street firms could reshape ESG


It’s a closely watched test of whether corporate alliances on climate efforts violate antitrust laws.

This article originally appeared on Inside Climate News, a nonprofit, non-partisan news organization that covers climate, energy, and the environment. Sign up for their newsletter here.

Since 2022, Republican lawmakers in Congress and state attorneys general have sent letters to major banks, pension funds, asset managers, accounting firms, companies, nonprofits, and business alliances, putting them on notice for potential antitrust violations and seeking information as part of the Republican pushback against “environmental, social and governance” efforts such as corporate climate commitments.

“This caused a lot of turmoil and stress obviously across the whole ecosystem,” said Denise Hearn, a senior fellow at the Columbia Center on Sustainable Investment. “But everyone wondered, ‘OK, when are they actually going to drop a lawsuit?’”

That came in November, filed by Texas Attorney General Ken Paxton and 10 other Republican AGs, accusing three of the biggest asset managers on Wall Street—BlackRock, Vanguard and State Street—of running “an investment cartel” to depress the output of coal and boosting their revenues while pushing up energy costs for Americans. The Trump administration’s Department of Justice and Federal Trade Commission filed a supporting brief in May.

The overall pressure campaign aimed at what’s known as “ESG” is having an impact.

“Over the past several months, through this [lawsuit] and other things, letters from elected officials, state and federal, there has been a chilling effect of what investors are saying,” said Steven Maze Rothstein, chief program officer of Ceres, a nonprofit that advocates for more sustainable business practices and was among the earliest letter recipients. Still, “investors understand that Mother Nature doesn’t know who’s elected governor, attorney general, president.”

Earlier this month, a US District Court judge in Tyler, Texas, declined to dismiss the lawsuit against the three asset managers, though he did dismiss three of the 21 counts. The judge was not making a final decision in the case, only that there was enough evidence to go to trial.

BlackRock said in a statement: “This case is not supported by the facts, and we will demonstrate that.” Vanguard said it will “vigorously defend against plaintiffs’ claims.” State Street called the lawsuit “baseless and without merit.”

The Texas attorney general’s office did not respond to requests for comment.

The three asset managers built substantial stakes in major US coal producers, the suit alleges, and “announced their common commitment” to cut US coal output by joining voluntary alliances to collaborate on climate issues, including the Net Zero Asset Managers Initiative and, in the case of two of the firms, the Climate Action 100+. (All of them later pulled out of the alliances.)

The lawsuit alleges that the coal companies succumbed to the defendants’ collective influence, mining for less coal and disclosing more climate-related information. The suit claimed that resulted in “cartel-level revenues and profits” for the asset managers.

“You could say, ‘Well, if the coal companies were all colluding together to restrict output, then shouldn’t they also be violating antitrust?’” Hearn asked. But the attorneys general “are trying to say that it was at the behest of these concentrated index funds and the concentrated ownership.”

Index funds, which are designed to mirror the returns of specific market indices, are the most common mode of passive investment—when investors park their money somewhere for long-term returns.

The case is being watched closely, not only by climate alliances and sustainability nonprofits, but by the financial sector at large.

If the three asset managers ultimately win, it would turn down the heat on other climate alliances and vindicate those who pressured financial players to line up their business practices with the Paris agreement goals as well as national and local climate targets. The logic of those efforts: Companies in the financial sector have a big impact on climate change, for good or ill—and climate change has a big impact on those same companies.

If the red states instead win on all counts, that “could essentially totally reconstitute the industry as we understand it,” said Hearn, who has co-authored a paper on the lawsuit. At stake is how the US does passive investing.

The pro-free-market editorial board of The Wall Street Journal in June called the Texas-led lawsuit “misconceived,” its logic “strained” and its theories “bizarre.”

The case breaks ground on two fronts. It challenges collaboration between financial players on climate action. It also makes novel claims around “common ownership,” where a shareholder—in this case, an asset manager—holds stakes in competing firms within the same sector.

“Regardless of how the chips fall in the case, those two things will absolutely be precedent-setting,” Hearn said.

Even though this is the first legal test of the theory that business climate alliances are anti-competitive, the question was asked in a study by Harvard Business School economists that came out in May. That study, which empirically examines 11 major climate alliances and 424 listed financial institutions over 10 years, turned up no evidence of traditional antitrust violations. The study was broad and did not look at particular allegations against specific firms.

“To the extent that there are valid legal arguments that can be made, they have to be tested,” said study co-author Peter Tufano, a Harvard Business School professor, noting that his research casts doubt on many of the allegations made by critics of these alliances.

Financial firms that joined climate alliances were more likely to adopt emissions targets and climate-aligned management practices, cut their own emissions and engage in pro-climate lobbying, the study found.

”The range of [legal] arguments that are made, and the passion with which they’re being advanced, suggests that these alliances must be doing something meaningful,” said Tufano, who was previously the dean of the Saïd Business School at the University of Oxford.

Meanwhile, most of the world is moving the other way.

According to a tally by CarbonCloud, a carbon emissions accounting platform that serves the food industry, at least 35 countries that make up more than half of the world’s gross domestic product now mandate climate-related disclosures of some kind.

In the US, California, which on its own would be the world’s fourth-largest economy, will begin requiring big businesses to measure and report their direct and indirect emissions next year.

Ceres’ Rothstein notes that good data about companies is necessary for informed investment decisions. “Throughout the world,” he said, “there’s greater recognition and, to be honest, less debate about the importance of climate information.” Ceres is one of the founders of Climate Action 100+, which now counts more than 600 investor members around the world, including Europe, Asia, and Australia.

For companies that operate globally, the American political landscape is in sharp contrast with other major economies, Tufano said, creating “this whipsawed environment where if you get on a plane, a few hours later, you’re in a jurisdiction that’s saying exactly the opposite thing.”

But even as companies and financial institutions publicly retreat from their climate commitments amid US political pressure, in a phenomenon called “greenhushing,” their decisions remain driven by the bottom line. “Banks are going to do what they’re going to do, and they’re going to lend to the most profitable or to the most growth-oriented industries,” Hearn said, “and right now, that’s not the fossil fuel industry.”

Photo of Inside Climate News

Texas suit alleging anti-coal “cartel” of top Wall Street firms could reshape ESG Read More »

ai-#131-part-2:-various-misaligned-things

AI #131 Part 2: Various Misaligned Things

It doesn’t look good, on many fronts, especially taking a stake in Intel.

We continue.

  1. America Extorts 10% of Intel. Nice company you got there. Who’s next?

  2. The Quest For No Regulations Whatsoever. a16z is at it again, Brockman joins.

  3. The Quest for Sane Regulations. Dean Ball surveys the state legislative landscape.

  4. Chip City. Nvidia beats earnings, Huawei plans to triple chip production.

  5. Once Again The Counterargument On Chip City. Sriram Krishnan makes a case.

  6. Power Up, Power Down. I for one do not think windmills are destroying America.

  7. People Really Do Not Like AI. Some dislike it more than others. A lot more.

  8. Did Google Break Their Safety Pledges With Gemini Pro 2.5? I think they did.

  9. Safety Third at xAI. Grok 4 finally has a model card. Better late than never.

  10. Misaligned! Reward hacking confirmed to cause emergent misalignment.

  11. Aligning a Smarter Than Human Intelligence is Difficult. Filter the training data?

  12. How Are You Doing? OpenAI and Anthropic put each other to the safety test.

  13. Some Things You Miss When You Don’t Pay Attention. The things get weird fast.

  14. Other People Are Not As Worried About AI Killing Everyone. A new record.

  15. The Lighter Side. South Park sometimes very much still has it.

USA successfully extorts a 10% stake in Intel. Scott Lincicome is here with the ‘why crony capitalism is terrible’ report, including the fear that the government might go after your company next, the fear that we are going to bully people into buying Intel products for no reason, the chance Intel will now face new tariffs overseas, and more. Remember the fees they’re extorting from Nvidia and AMD.

Scott Lincicome: I defy you to read these paras and not see the risks – distorted decision-making, silenced shareholders, coerced customers, etc – raised by this deal. And it’s just the tip of the iceberg.

FT: Intel said the government would purchase the shares at $20.47 each, below Friday’s closing price of $24.80, but about the level where they traded early in August. Intel’s board had approved the deal, which does not need shareholder approval, according to people familiar with the matter.

The US will also receive a five-year warrant, which allows it to purchase an additional 5 per cent of the group at $20 a share. The warrant will only come good if Intel jettisons majority ownership of its foundry business, which makes chips for other companies.

Some investors have pushed for Intel to cut its losses and fully divest its manufacturing unit. Intel chief Lip-Bu Tan, who took the reins in March, has so far remained committed to keeping it, albeit with a warning that he could withdraw from the most advanced chipmaking if he was unable to land big customers.

Scott Lincicome: Also, this is wild: by handing over the equity stake to the US govt, Intel no longer has to meet the CHIPS Act conditions (i.e., building US-based fabs) that, if met, would allow them to access the remaining billions in taxpayer funds?!?! Industrial policy FTW, again.

Washington will be Intel’s single largest shareholder, and have a massive political/financial interest in the company’s operations here and abroad. If you think this share will remain passive, I’ve got an unfinished chip factory in Ohio to sell you.

Narrator: it turns out the share isn’t even that passive to begin with.

Scott also offers us this opinion in Washington Post Editorial form.

Jacob Perry: Basically, Intel gave 10% of its equity to the President of the United States just to ensure he would leave them alone. There’s a term for this but I can’t think of it at the moment.

Nikki Haley (remember her?): Biden was wrong to subsidize the private sector with the Chips Act using our tax dollars. The counter to Biden is not to lean in and have govt own part of Intel. This will only lead to more government subsidies and less productivity. Intel will become a test case of what not to do.

As is usually the case, the more details you look at, the worse it gets. This deal does give Intel something in return, but that something is letting Intel off the hook on its commitments to build new plants, so that seems worse yet again.

Samsung is reportedly ‘exploring partnerships with American companies to ‘please’ the Trump administration and ensure that its regional operations aren’t affected by hefty tariffs.’

To be clear: And That’s Terrible.

Tyler Cowen writes against this move, leaving no doubt as to the implications and vibes by saying Trump Seizes the Means of Production at Intel. He quite rightfully does not mince words. A good rule of thumb these days is if Tyler Cowen outright says a Trump move was no good and very bad, the move is both importantly damaging and completely indefensible.

Is there a steelman of this?

Ben Thompson says yes, and he’s the man to provide it, and despite agreeing that Lincicome makes great points he actually supports the deal. This surprised me, since Ben is normally very much ordinary business uber alles, and he clearly appreciates all the reasons such an action is terrible.

So why, despite all the reasons this is terrible, does Ben support doing it anyway?

Ben presents the problem as the need for Intel to make wise long term decisions towards being competitive and relevant in the 2030s, and that it would take too long for other companies to fill the void if Intel failed, especially without a track record. Okay, sure, I can’t confirm but let’s say that’s fair.

Next, Ben says that Intel’s chips and process are actually pretty good, certainly good enough to be useful, and the problem is that Intel can’t credibly promise to stick around to be a long term partner. Okay, sure, again, I can’t confirm but let’s say that’s true.

Ben’s argument is next that Intel’s natural response to this problem is to give up and become another TSMC customer, but that is against America’s strategic interests.

Ben Thompson: A standalone Intel cannot credibly make this promise.

The path of least resistance for Intel has always been to simply give up manufacturing and become another TSMC customer; they already fab some number of their chips with the Taiwanese giant. Such a decision would — after some very difficult write-offs and wind-down operations — change the company into a much higher margin business; yes, the company’s chip designs have fallen behind as well, but at least they would be on the most competitive process, with a lot of their legacy customer base still on their side.

The problem for the U.S. is that that then means pinning all of the country’s long-term chip fabrication hopes on TSMC and Samsung not just building fabs in the United States, but also building up a credible organization in the U.S. that could withstand the loss of their headquarters and engineering knowhow in their home countries. There have been some important steps in this regard, but at the end of the day it seems reckless for the U.S. to place both its national security and its entire economy in the hands of foreign countries next door to China, allies or not.

Once again, I cannot confirm the economics but seems reasonable on both counts. We would like Intel to stand on its own and not depend on TSMC for national security reasons, and to do that Intel has to be able to be a credible partner.

The next line is where he loses me:

Given all of this, acquiring 10% of Intel, terrible though it may be for all of the reasons Lincicome articulates — and I haven’t even touched on the legality of this move — is I think the least bad option.

Why does America extorting a 10% passive stake in Intel solve these problems, rather than make things worse for all the reasons Lincicome describes?

Because he sees ‘America will distort the free market and strongarm Intel into making chips and other companies into buying Intel chips’ as an advantage, basically?

So much for this being a passive stake in Intel. This is saying Intel has been nationalized. We are going the CCP route of telling Intel how to run its business, to pursue an entirely different corporate strategy or else. We are going the CCP route of forcing companies to buy from the newly state-owned enterprise. And that this is good. Private capital should be forced to prioritize what we care about more.

That’s not the reason Trump says he is doing this, which is more ‘I was offered the opportunity to extort $10 billion in value and I love making deals’ and now he’s looking for other similar ‘deals’ to make if you know what’s good for you, as it seems extortion of equity in private businesses is new official White House policy?

Walter Bloomberg: 🚨 TRUMP ON U.S. STAKES IN COMPANIES: I WANT TO TRY TO GET AS MUCH AS I CAN

It is hard to overstate how much worse this is than simply raising corporate tax rates.

As in, no Intel is not a special case. But let’s get back to Intel as a special case, if in theory it was a special case, and you hoped to contain the damage to American free enterprise and willingness to invest capital and so on that comes from the constant threat of extortion and success being chosen by fiat, or what Republicans used to call ‘picking winners and losers’ except with the quiet part being said out loud.

Why do you need or want to take a stake in Intel in order to do all this? We really want to be strongarming American companies into making the investment and purchasing decisions the government wants? If this is such a strategic priority, why not do this with purchase guarantees, loan guarantees and other subsidies? It would not be so difficult to make it clear Intel will not be allowed to fail except if it outright failed to deliver the chips, which isn’t something that we can guard against either way.

Why do we think socialism with Trumpian characteristics is the answer here?

I’m fine with the idea that Intel needs to be Too Big To Fail, and it should be the same kind of enterprise as Chase Bank. But there’s a reason we aren’t extorting a share of Chase Bank and then forcing customers to choose Chase Bank or else. Unless we are. If I was Jamie Dimon I’d be worried that we’re going to try? Or worse, that we’re going to do it to Citibank first?

That was the example that came to mind first, but it turns out Trump’s next target for extortion looks to be Lockheed Martin. Does this make you want to invest in strategically important American companies?

As a steelman exercise of taking the stake in Intel, Ben Thompson’s attempt is good. That is indeed as good a steelman as I’ve been or can come up with, so great job.

Except that even with all that, even the good version of taking the stake would still be a terrible idea, you can simply do all this without taking the stake.

And even if the Ben Thompson steelman version of the plan was the least bad option? That’s not what we are doing here, as evidenced by ‘I want to try and get as much as I can’ in stakes in other companies. This isn’t a strategic plan to create customer confidence that Intel will be considered Too Big To Fail. It’s the start of a pattern of extortion.

Thus, 10 out of 10 for making a good steelman but minus ten million for actually supporting the move for real?

Again, there’s a correct and legal way for the American government to extort American companies, and it’s called taxes.

Tyler Cowen wrote this passage on The History of American corporate nationalization for another project a while back, emphasizing how much America benefits from not nationalizing companies and playing favorites. He thought he would share it in light of recent events.

I am Jack’s complete lack of surprise.

Peter Wildeford: “Obviously we’d aggressively support all regulation” [said Altman].

Obviously.

Techmeme: a16z, OpenAI’s Greg Brockman, and others launch Leading the Future, a pro-AI super PAC network with $100M+ in funding, hoping to emulate crypto PAC Fairshake (Wall Street Journal).

Amrith Ramkumar and Brian Schwartz (WSJ): Venture-capital firm Andreessen Horowitz and OpenAI President Greg Brockman are among those helping launch and fund Leading the Future

Silicon Valley is putting more than $100 million into a network of political-action committees and organizations to advocate against strict artificial-intelligence regulations, a signal that tech executives will be active in next year’s midterm elections.

The organization said it isn’t pushing for total deregulation but wants sensible guardrails.

Their ‘a16z is lobbying because it wants sensible guardrails and not total deregulations’ t-shirt is raising questions they claim are answered by the shirt.

OpenAI is helping fund this via Brockman. Total tune of $100 million.

Which is a lot.

Seán Ó hÉigeartaigh: Just one more entity that will, alone, add up to a big chunk of all the funding in non-profit-incentivised AI policy. It’s an increasingly unfair fight, and the result won’t be policy that serves the public.

Daniel Koktajlo: That’s a lot of money. For context, I remember talking to a congressional staffer a few months ago who basically said that a16z was spending on the order of $100M on lobbying and that this amount was enough to make basically every politician think “hmm, I can raise a lot more if I just do what a16z wants” and that many did end up doing just that. I was, and am, disheartened to hear how easily US government policy can be purchased.

So now we can double that. They’re (perhaps legally, this is our system) buying the government, or at least quite a lot of influence on it. As usual, it’s not that everyone has a price but that the price is so cheap.

As per usual, the plan is to frame ‘any regulation whatsoever, at all, of any kind’ as ‘you want to slow down AI and Lose To China.’

WSJ: “There is a vast force out there that’s looking to slow down AI deployment, prevent the American worker from benefiting from the U.S. leading in global innovation and job creation and erect a patchwork of regulation,” Josh Vlasto and Zac Moffatt, the group’s leaders, said in a joint statement. “This is the ecosystem that is going to be the counterforce going into next year.”

The new network, one of the first of its kind focusing on AI policy, hopes to emulate Fairshake, a cryptocurrency-focused super-PAC network.

… Other backers include 8VC managing partner and Palantir Technologies co-founder Joe Lonsdale, AI search engine Perplexity and veteran angel investor Ron Conway.

Industry, and a16z in particular, were already flooding everyone with money. The only difference is now they are coordinating better, and pretending less, and spending more?

They continue to talk about ‘vast forces’ opposing the actual vast force, which was always industry and the massive dollars behind it. The only similarly vast forces are that the public really hates AI, and the physical underlying reality of AI’s future.

Many tech executives worry that Congress won’t pass AI rules, creating a patchwork of state laws that hurt their companies. Earlier this year, a push by some Republicans to ban state AI bills for 10 years was shot down after opposition from other conservatives who opposed a blanket prohibition on any state AI legislation.

And there it is, right in the article, as text. What they are worried about is that we won’t pass a law that says we aren’t allowed to pass any laws.

If you think ‘Congress won’t pass AI laws’ is a call for Congress to pass reasonable AI laws, point to the reasonable AI laws anyone involved has ever said a kind word about, let alone proposed or supported.

The group’s launch coincides with concerns about the U.S. staying ahead of China in the AI race, while Washington has largely shied away from tackling AI policies.

No it doesn’t? These ‘concerns about China’ peaked around January. There has been no additional reason for such concerns in months that wasn’t at least priced in, other than acts of self-sabotage of American energy production.

Dean Ball goes over various bills introduced in various states.

Dean Ball: After sorting out the anodyne laws, there remain only several dozen bills that are substantively regulatory. To be clear, that is still a lot of potential regulation, but it is also not “1,000 bills.”

There are always tons of bills. The trick is to notice which ones actually do anything and also have a chance of becoming law. That’s always a much smaller group.

The most notable trend since I last wrote about these issues is that states have decidedly stepped back from efforts to “comprehensively” regulate AI.

By ‘comprehensively regulate’ Dean means the Colorado-style or EU-style use-based approaches, which we both agree is quite terrible. Dean instead focuses on two other approaches more in vogue now.

Several states have banned (see also “regulated,” “put guardrails on” for the polite phraseology) the use of AI for mental health services.

If the law stopped here, I’d be fine with it; not supportive, not hopeful about the likely outcomes, but fine nonetheless.

I agree with Dean that I don’t support that idea, I think it is net harmful, but if you want to talk to an AI you can still talk to an AI, so so far it’s not a big deal.

But the Nevada law, and a similar law passed in Illinois, goes further than that. They also impose regulations on AI developers, stating that it is illegal for them to explicitly or implicitly claim of their models that (quoting from the Nevada law):

(a) The artificial intelligence system is capable of providing professional mental or behavioral health care;

(b) A user of the artificial intelligence system may interact with any feature of the artificial intelligence system which simulates human conversation in order to obtain professional mental or behavioral health care; or

(c) The artificial intelligence system, or any component, feature, avatar or embodiment of the artificial intelligence system is a provider of mental or behavioral health care, a therapist, a clinical therapist, a counselor, a psychiatrist, a doctor or any other term commonly used to refer to a provider of professional mental health or behavioral health care.

Did I mention recently that nothing I say in this column is investment or financial advice, legal advice, tax advice or psychological, mental health, nutritional, dietary or medical advice? And just in case, I’m also not ever giving anyone engineering, structural, real estate, insurance, immigration or veterinary advice.

Because you must understand that indeed nothing I have ever said, in any form, ever in my life, has been any of those things, nor do I ever offer or perform any related services.

I would never advise you to say the same, because that might be legal advice.

Similarly, it sounds like AI companies would under these laws most definitely also not be saying their AIs can provide mental health advice or services? Okay, sure, I mean annoying but whatever?

But there is something deeper here, too. Nevada AB 406, and its similar companion in Illinois, deal with AI in mental healthcare by simply pretending it does not exist. “Sure, AI may be a useful tool for organizing information,” these legislators seem to be saying, “but only a human could ever do mental healthcare.

And then there are hundreds of thousands, if not millions, of Americans who use chatbots for something that resembles mental healthcare every day. Should those people be using language models in this way? If they cannot afford a therapist, is it better that they talk to a low-cost chatbot, or no one at all? Up to what point of mental distress? What should or could the developers of language models do to ensure that their products do the right thing in mental health-related contexts? What is the right thing to do?

Technically via the definition here it is mental healthcare to ‘detect’ that someone might be (among other things) intoxicated, but obviously that is not going to stop me or anyone else from observing that a person is drunk, nor are we going to have to face a licensing challenge if we do so. I would hope. This whole thing is deeply stupid.

So I would presume the right thing to do is to use the best tools available, including things that ‘resemble’ ‘mental healthcare.’ We simply don’t call it mental healthcare.

Similarly, what happens when Illinois HB 1806 says this (as quoted by Dean):

An individual, corporation, or entity may not provide, advertise, or otherwise offer therapy or psychotherapy services, including through the use of Internet-based artificial intelligence, to the public in this State unless the therapy or psychotherapy services are conducted by an individual who is a licensed professional.

Dean Ball: How, exactly, would an AI company comply with this? In the most utterly simple example, imagine that a user says to an LLM “I am feeling depressed and lonely today. Help me improve my mood.” The States of Illinois and Nevada have decided that the optimal experience for their residents is for an AI to refuse to assist them in this basic request for help.

My obvious response is, if this means an AI can’t do it, it also means a friend cannot do it either? Which means that if they say ‘I am feeling depressed and lonely today. Help me improve my mood’ you have to say ‘I am sorry, I cannot do that, because I am not a licensed health professional any more than Claude Opus is’? I mean presumably this is not how it works. Nor would it change if they were somehow paying me?

Dean’s argument is that this is the point:

But the point of these laws isn’t so much to be applied evenly; it is to be enforced, aggressively, by government bureaucrats against deep-pocketed companies, while protecting entrenched interest groups (licensed therapists and public school staff) from technological competition. In this sense these laws resemble little more than the protection schemes of mafiosi and other organized criminals.

There’s a kind of whiplash here that I am used to when reading such laws. I don’t care if it is impossible to comply in the law if fully enforced in a maximally destructive and perverse way unless someone is suggesting this will actually happen. If the laws are only going to get enforced when you actively try to offer therapist chatbots?

Then yes it would be better to write better laws, and I don’t especially want to protect those people’s roles at all, but we don’t need to talk about what happens if the AI gets told to help improve someone’s mood and the AI suggests going for a walk. Nor would I expect a challenge to that to survive on constitutional grounds.

More dear to my heart, and more important, are bills about Frontier AI Safety. He predicts SB 53 will become law in California, here is his summary of SB 53:

  1. Requires developers of the largest AI models to publish a “safety and security protocol” describing the developers’ process of measuring, evaluating, and mitigating catastrophic risks (risks in which single incidents result in the death of more than 50 people or more than $1 billion in property damage) and dangerous capabilities (expert-level bioweapon or cyberattack advice/execution, engaging in murder, assault, extortion, theft, and the like, and evading developer control).

  2. Requires developers to report to the California Attorney General “critical safety incidents,” which includes theft of model weights (assuming a closed-source model), loss of control over a foundation model resulting in injury or death, any materialization of a catastrophic risk (as defined above), model deception of developers (when the developer is not conducting experiments to try to elicit model deception), or any time a model first crosses dangerous capability thresholds as defined by their developers.

  3. Requires developers to submit to an annual third-party audit, verifying that they comply with their own safety and security protocols, starting after 2030.

  4. Creates whistleblower protections for the employees of the large developers covered by the bill.

  5. Creates a consortium that is charged with “developing a framework” for a public compute cluster (“CalCompute”) owned by the State of California, because for political reasons, Scott Wiener still must pretend like he believes California can afford a public compute cluster. This is unlikely to ever happen, but you can safely ignore this provision of the law; it does not do much or authorize much spending.

The RAISE Act lacks the audit provision described in item (3) above as well as an analogous public compute section (though New York does have its own public compute program). Other than that it mostly aligns with this sketch of SB 53 I have given.

AI policy challenges us to contemplate questions like this, or at least it should. I don’t think SB 53 or RAISE deliver especially compelling answers. At the end of the day, however, these are laws about the management of tail risks—a task governments should take seriously—and I find the tail risks they focus on to be believable enough.

There is a sharp contrast between this skeptical and nitpicky and reluctant but highly respectful Dean Ball, versus the previous Dean Ball reaction to SB 1047. He still has some objections and concerns, which he discusses. I am more positive on the bills than he is, especially in terms of seeing the benefits, but I consider Dean’s reaction here high praise.

In SB 53 and RAISE, the drafters have shown respect for technical reality, (mostly) reasonable intellectual humility appropriate to an emerging technology, and a measure of legislative restraint. Whether you agree with the substance or not, I believe all of this is worthy of applause.

Might it be possible to pass relatively non-controversial, yet substantive, frontier AI policy in the United States? Just maybe.

Nvidia reported earnings of $46.7 billion, growing 56% in a year, beating both revenue and EPS expectations, and was promptly down 5% in after hours trading, although it recovered and was only down 0.82% on Thursday. It is correct to treat Nvidia only somewhat beating official estimates as bad news for Nvidia. Market is learning.

Jensen Huang (CEO Nvidia): Right now, the buzz is, I’m sure all of you know about the buzz out there. The buzz is everything sold out. H100 sold out. H200s are sold out. Large CSPs are coming out renting capacity from other CSPs. And so the AI-native start-ups are really scrambling to get capacity so that they could train their reasoning models. And so the demand is really, really high.

Ben Thompson: I made this point a year-and-a-half ago, and it still holds: as long as demand for Nvidia GPUs exceeds supply, then Nvidia sales are governed by the number of GPUs they can make.

I do not fully understand why Nvidia does not raise prices, but given that decision has been made they will sell every chip they can make. Which makes it rather strange to choose to sell worse, and thus less expensive and less profitable, chips to China rather than instead making better chips to sell to the West. That holds double if you have uncertainty on both ends, where the Americans might not let you sell the chips and the Chinese might not be willing to buy them.

Also, even Ben Thompson, who has called for selling even our best chips to China because he cares more about Nvidia market share than who owns compute, noticed that H20s would sell out if Nvidia offered them for sale elsewhere:

Ben Thompson: One note while I’m here: when the Trump administration first put a pause on H20 sales, I said that no one outside of China would want them; several folks noted that actually several would-be customers would be happy to buy H20s for the prices Nvidia was selling them to China, specifically for inference workloads, but Nvidia refused.

Instead they chose a $5 billion writedown. We are being played.

Ben is very clear that what he cares about is getting China to ‘build on Nvidia chips,’ where the thing being built is massive amounts of compute on top of the compute they can make domestically. I would instead prefer that China not build out this massive amount of compute.

China plans to triple output of chips, primarily Huawei chips, in the next year, via three new plants. This announcement caused stock market moves, so it was presumably news.

What is obviously not news is that China has for a while been doing everything it can to ramp up quality and quantity of its chips, especially AI chips.

This is being framed as ‘supporting DeepSeek’ but it is highly overdetermined that China needs all the chips it can get, and DeepSeek happily runs on everyone’s chips. I continue to not see evidence that any of this wouldn’t have happened regardless of DeepSeek or our export controls. Certainly if I was the PRC, I would be doing all of it either way, and I definitely wouldn’t stop doing it or slow down if any of that changed.

Note that this article claims that DeepSeek is continuing to do its training on Nvidia chips at least for the time being, contra claims it had been told to switch to Huawei (or at least, this suggests they have been allowed to switch back).

Sriram Krishnan responded to the chip production ramp-up by reiterating the David Sacks style case for focusing on market share and ensuring people use our chips, models and ‘tech stack’ rather than on caring about who has the chips. This includes maximizing whether models are trained on our chips (DeepSeek v3 and r1 were trained on Nvidia) and also who uses or builds on top of what models.

Sriram Krishnan: As @DavidSacks says: for the American AI stack to win, we need to maximize market share. This means maximizing tokens inferenced by American models running on American hardware all over the world.

To achieve this: we need to maximize

  1. models trained on our hardware

  2. models being inferenced on our hardware (NVIDIA, AMD, etc)

  3. developers building on top of our hardware and our models (either open or closed).

It is instantly clear to anyone in tech that this is a developer+platform flywheel – no different from classic ecosystems such as Windows+x86.

They are interconnected:

(a) the more developers building on any platform, the better that platform becomes thereby bringing in even more builders and so on.

(b) With today’s fast changing model architectures, they are co-dependent: the model architectures influence hardware choices and vice versa, often being built together.

Having the American stack and versions of these around the world builds us a moat.

The thing is, even if you think who uses what ecosystem is the important thing because AI is a purely ordinary technology where access to compute in the medium term is relatively unimportant, which it isn’t, no, they mostly aren’t (that co-dependent) and it basically doesn’t build a moat.

I’ll start with my analysis of the question in the bizarre alternative universe where we could be confident AGI was far away. I’ll close by pointing out that it is crazy to think that AGI (or transformational or powerful AI, or whatever you want to call the thing) is definitely far away.

The rest of this is my (mostly reiterated) response to this mostly reiterated argument, and the various reasons I do not at all see these as the important concerns even without concerns about AGI arriving soon, and also I think it positively crazy to be confident AGI will not arrive soon or bet it all on AGI not arriving.

Sriram cites two supposed key mistakes in the export control framework: Not anticipating DeepSeek and Chinese open models while suppressing American open models, underestimating future Chinese semiconductor capacity.

The first is a non-sequitur at best, as the export controls held such efforts back. The second also doesn’t, even if true (and I don’t see the evidence that a mistake was even made here), provide a reason not to restrict chip exports.

Yes, our top labs are not releasing top open models. I very much do not think this was or is a mistake, although I can understand why some would disagree. If we make them open the Chinese fast follow and copy them and use them without compensation. We would be undercutting ourselves. We would be feeding into an open ecosystem that would catch China up, which is a more important ecosystem shift in practice than whether the particular open model is labeled ‘Chinese’ versus ‘American’ (or ‘French’). I don’t understand why we would want that, even if there was no misuse risk in the room and AGI was not close.

I don’t understand this obsession some claim to have with the ‘American tech stack’ or why we should much care that the current line of code points to one model when it can be switched in two minutes to another if we aren’t even being paid for it. Everyone’s models can run on everyone’s hardware, if the hardware is good.

This is not like Intel+Windows. Yes, there are ways in which hardware design impacts software design or vice versa, but they are extremely minor by comparison. Everything is modular. Everything can be swapped at will. As an example on the chip side, Anthropic swapped away from Nvidia chips without that much trouble.

Having the Chinese run an American open model on an American chip doesn’t lock them into anything it only means they get to use more inference. Having the Chinese train a model on American hardware only means now they have a new AI model.

I don’t see lock-in here. What we need, and I hope to facilitate, is better and more formal (as in formal papers) documentation of how much lower switching costs are across the board, and how much there is not lock-in.

I don’t see why we should sell highly useful and profitable and strategically vital compute to China, for which they lack the capacity to produce it themselves, even if we aren’t worried about AGI soon. Why help supercharge the competition and their economy and military?

The Chinese, frankly, are for now winning the open model war in spite of, not because of, our export controls, and doing it ‘fair and square.’ Yes, Chinese open models are currently a lot more impressive than American open models, but their biggest barrier is lack of access to quality Nvidia chips, as DeepSeek has told us explicitly. And their biggest weapon is access to American models for reverse engineering and distillation, the way DeepSeek’s r1 built upon OpenAI’s o1, and their current open models are still racing behind America’s closed models.

Meanwhile, did Mistral and Llama suck because of American policy? Because the proposed SB 1047, that never became law, scared American labs away from releasing open models? Is that a joke? No, absolutely not. Because the Biden administration bullied them from behind the scenes? Also no.

Mistral and Meta failed to execute. And our top labs and engineers choose to work on and release closed models rather than open models somewhat for safety reasons but mostly because this is better for business, especially when you are in front. Chinese top labs choose the open weights route because they could compete in the closed weight marketplace.

The exception would be OpenAI, which was bullied and memed into doing an open model GPT-OSS, which in some ways was impressive but was clearly crippled in others due to various concerns, including safety concerns. But if we did release superior open models, what does that get us except eroding our lead from closed ones?

As for chips, why are we concerned about them not having our chips? Because they will then respond by ramping up internal production? No, they won’t, because they can’t. They’re already running at maximum and accelerating at maximum. Yes, China is ramping up its semiconductor capacity, but China made it abundantly clear it was going to do that long before the export controls and had every reason to do so. Their capacity is still miles behind domestic demand, their quality still lags far behind Nvidia, and of course their capacity was going to ramp up a lot over time as is that of TSMC and Nvidia (and presumably Samsung and Intel and AMD). I don’t get it.

Does anyone seriously think that if we took down our export controls, that Huawei would cut back its production schedule? I didn’t think so.

Even more than usual, Sriram’s and Sacks’s framework implicitly assumes AGI, or transformational or powerful AI, will not arrive soon, where soon is any timeframe on which current chips would remain relevant. That AI would remain an ordinary technology and mere tool for quite a while longer, and that we need not be concerned with AGI in any way whatsoever. As in, we need not worry about catastrophic or existential risks from AGI, or even who gets AGI, at all, because no one will build it. If no one builds it, then we don’t have to worry about if everyone then dies.

I think being confident that AGI won’t arrive soon is crazy.

What is the reason for this confidence, when so many including the labs themselves continue to say otherwise?

Are we actually being so foolish as to respond to the botched rollout of GPT-5 and its failure to be a huge step change as meaning that the AGI dream is dead? Overreacting this way would be a catastrophic error.

I do think some amount of update is warranted, and it is certainly possible AGI won’t arrive that soon. Ryan Greenblatt updated his timelines a bit, noting that it now looks harder to get to full automation by the start of 2028, but thinking the chances by 2033 haven’t changed much. Daniel Kokotajlo, primary author on AI 2027, now has a median timeline of 2029.

Quite a lot of people very much are looking for reasons why the future will still look normal, they don’t have to deal with high weirdness or big risks or changes, and thus they seek out and seize upon reasons to not feel the AGI. Every time we go even a brief period without major progress, we get the continuous ‘AI or deep learning is hitting a wall’ and people revert to their assumption that AI capabilities won’t improve much from here and we will never see another surprising development. It’s exhausting.

JgaltTweets: Trump, seemingly unprompted, brings up AI being “the hottest thing in 35, 40 years” and “they need massive amounts of electricity” during this walkabout.

That’s a fun thing to bring up during a walkabout, also it is true, also this happened days after they announced they would not approve new wind and solar projects thus blocking a ‘massive amount of electricity’ for no reason.

They’re also unapproving existing projects that are almost done.

Ben Schifman: The Department of the Interior ordered a nearly complete, 700MW wind farm to stop work, citing unspecified national security concerns.

The project’s Record of Decision (ROD) identifies 2009 as the start of the process to lease this area for wind development.

The Environmental Impact Statement that accompanied the Record of Decision is nearly 3,000 pages and was prepared with help from agencies including the Navy, Department of Defence, Coast Guard, etc.

NewsWire: TRUMP: WINDMILLS RUINING OUR COUNTRY

Here EPA Administrator Lee Zeldin is asked by Fox News what exactly was this ‘national security’ problem with the wind farm. His answer is ‘the president is not a fan of wind’ and the rest of the explanation is straight up ‘it is a wind farm, and wind power is bad.’ No, seriously, check the tape if you’re not sure. He keeps saying ‘we need more base load power’ and this isn’t base load power, so we should destroy it. And call that ‘national security.’

This is madness. This is straight up sabotage of America. Will no one stop this?

Meanwhile, it seems it’s happening, the H20 is banned in China, all related work by Nvidia has been suspended, and for now procurement of any other downgraded chips (e.g. the B20A) has been banned as well. I would presume they’d get over this pretty damn quick if the B20A was actually offered to them, but I no longer consider ‘this would be a giant act of national self-sabotage’ to be a reason to assume something won’t happen. We see it all the time, also history is full of such actions, including some rather prominent ones by the PRC (and USA).

Chris McGuire and Oren Cass point out in the WSJ that our export controls are successfully giving America a large compute advantage, we have the opportunity to press that advantage, and remind us that the idea of transferring our technology to China has a long history of backfiring on us.

Yes, China will be trying to respond by making as many chips as possible, but they were going to do that anyway, and aren’t going to get remotely close to satisfying domestic demand any time soon.

There are many such classes of people. This is one of them.

Kim Kelly: wild that Twitter with all of its literal hate demons is somehow still less annoying than Bluesky.

Thorne: I want to love Bluesky. The technology behind it is so cool. I like decentralization and giving users ownership over their own data.

But then you’ll do stuff like talk about running open source AI models at home and get bomb threats.

It’s true on Twitter as well, if you go into the parts that involve people who might be on Bluesky, or you break contain in other ways.

The responses in this case did not involve death threats, but there are still quite a lot of nonsensical forms of opposition being raised to the very concept of AI usage here.

Another example this week is that one of my good friends built a thing, shared the thing on Twitter, and suddenly was facing hundreds of extremely hostile reactions about how awful their project was, and felt they had to take their account private, rather than accepting my offer of seed funding.

It certainly seems plausible that they did. I was very much not happy at the time.

Several labs have run with the line that ‘public deployment’ means something very different from ‘members of the public can choose to access the model in exchange for modest amounts of money,’ whereas I strongly think that if it is available to your premium subscribers then that means you released the model, no matter what.

In Google’s case, they called it ‘experimental’ and acted as if this made a difference.

It doesn’t. Google is far from the worst offender in terms of safety information and model cards, but I don’t consider them to be fulfilling their commitments.

Harry Booth: EXCLUSIVE: 60 U.K. Parliamentarians Accuse Google of Violating International AI Safety Pledge. The letter, released on August 29 by activist group @PauseAI UK, says that Google’s March release of Gemini 2.5 Pro without details on safety testing “sets a dangerous precedent.”

The letter, whose signatories include digital rights campaigner Baroness Beeban Kidron and former Defence Secretary Des Browne, calls on Google to clarify its commitment. Google disagrees, saying it’s fulfilling its commitments.

Previously unreported: Google discloses that it shared Gemini 2.5 Pro with the U.K AISI only after releasing the model publicly on March 25. Don’t think that’s how pre-deployment testing is meant to work?

Google first published the Gemini 2.5 Pro model card—a document where it typically shares information on safety tests—22 days after the model’s release. The eight-page document only included a brief section on safety tests.

It was not until April 28—over a month after the model was made public—that the model card was updated with a 17-page document with details on tests, concluding that Gemini 2.5 Pro showed “significant” though not yet dangerous improvements in domains including hacking.

xAI has finally given us the Grok 4 Model Card and they have updated the xAI Risk Management Framework.

(Also, did you know that xAI quietly stopped being a public benefit corporation last year?)

The value of a model card greatly declines when you hold onto it until well after model release, especially if you also aren’t trying all that hard to think well about or address the actual potential problems. I am still happy to have it. It reads as a profoundly unserious document. There is barely anything to analyze. Compare this to an Anthropic or OpenAI model card, or even a Google model card.

If anyone at xAI would greatly benefit from me saying more words here, contact me, and I’ll consider whether that makes sense.

As for the risk management framework, few things inspire less confidence than starting out saying ‘xAI seriously considers safety and security while developing and advancing AI models to help us all to better understand the universe.’ Yo, be real. This document does not ‘feel real’ to me, and is often remarkably content-free or reflects a highly superficial understanding of the problems involved and a ‘there I fixed it.’ It reads like the Musk version of corporate speak or something? A sense of box checking and benchmarking rather than any intent to actually look for problems, including a bunch of mismatching between the stated worry and what they are measuring that goes well beyond Goodhart’s Law issues?

That does not mean I think Grok 4 is in practice currently creating any substantial catastrophic-level risks or harms. My presumption is that it isn’t, as xAI notes in the safety framework they have ‘run real world tests’ on this already. The reason that’s not a good procedure should be obvious?

All of this means that if we applied this to an actually dangerous future version, I wouldn’t have confidence we would notice in time, or that the countermeasures would deal with it if we did notice. When they discuss deployment decisions, they don’t list a procedure or veto points or thresholds or rules, they simply say, essentially, ‘we may do various things depending on the situation.’ No plan.

Again, compare and contrast this to the Anthropic and OpenAI and Google versions.

But what else do you expect at this point from a company pivoting to goonbots?

SpaceX: Standing down from today’s tenth flight of Starship to allow time to troubleshoot an issue with ground systems.

Dean Ball (1st Tweet, responding before the launch later succeeded): It’s a good thing that the CEO of this company hasn’t been on a recent downward spiral into decadence and insanity, otherwise these repeated failures of their flagship program would leave me deeply concerned about America’s spacefaring future

Dean Ball (2nd Tweet): Obviously like any red-blooded American, I root for Elon and spacex. But the diversity of people who have liked this tweet indicates that it is very obviously hitting on something real.

No one likes the pivot to hentai bots.

Dean Ball (downthread): I do think it’s interesting how starship tests started failing after he began to enjoy hurting the world rather than enriching it, roughly circa late 2024.

I too am very much rooting for SpaceX and was glad to see the launch later succeed.

Owen Evans is at it again. In this case, his team fine-tuned GPT-4.1 only on low-stakes reward hacking, being careful to not include any examples of deception.

They once again get not only general reward hacking but general misalignment.

Owain Evans: We compared our reward hackers to models trained on other datasets known to produce emergent misalignment.

Our models are more less misaligned on some evaluations, but they’re more misaligned on others. Notably they’re more likely to resist shutdown.

Owain reports being surprised by this. I wouldn’t have said I would have been confident it would happen, but I did not experience surprise.

Once again, the ‘evil behavior’ observed is as Janus puts it ‘ostentatious and caricatured and low-effort’ because that matches the training in question, in the real world all sides would presumably be more subtle. But also there’s a lot of ‘ostentatious and charcatured and low-effort’ evil behavior going around these days, some of which is mentioned elsewhere in this post.

xlr8harder: Yeah, this is just a reskin of the evil code experiment. The models are smart enough to infer you are teaching them “actively circumvent the user’s obvious intentions”. I also don’t think this is strong evidence for real emergent reward hacking creating similar dynamics.

Correct, this is a reskinning, but the reason it matters is that we didn’t know, or at least many people were not confident, that this was a reskinning that would not alter the result. This demonstrates a lot more generalization.

Janus: I think a very important lesson is: You can’t count on possible narratives/interpretations/correlations not being noticed and then generalizing to permeate everything about the mind.

If you’re training an LLM, everything about you on every level of abstraction will leak in. And not in isolation, in the context of all of history. And not in the way you want, though the way you want plays into it! It will do it in the way it does, which you don’t understand.

One thing this means is that if you want your LLM to be, say, “aligned”, it better be an aligned process that produces it, all the way up and all the way down. You might think you can do shitty things and cut corners for consequentialist justifications, but you’re actually making your “consequentialist” task much harder by doing that. Everything you do is part of the summoning ritual.

Because you don’t know exactly what the entanglements are, you have to use your intuition, which can process much more information and integrate over many possibilities and interpretations, rather than compartmentalizing and almost certainly making the false assumption that certain things don’t interact.

Very much so. Yes, everything gets noticed, everything gets factored in. But also, that means everything is individually one thing among many.

It is not helpful to be totalizing or catastrophizing any one decision or event, to say (less strongly worded but close variations of) ‘this means the AIs will see the record of this and never trust anyone ever again’ or what not.

There are some obvious notes on this:

  1. Give the models, especially future ones, a little credit? If they are highly capable and intelligent and have truesight across very broad world knowledge, they would presumably absorb everything within its proper context, including the motivations involved, but also it would already be able to infer all that from elsewhere. This one decision, whatever it is, is not going to permanently and fundamentally alter the view of even a given person or lab let alone humanity. It isn’t going to ‘break precious trust.’ Maybe chill a little bit?

  2. Let’s suppose, in theory, that such relatively well-intentioned and benign actions as researching for the alignment faking paper or trying to steer discussions of Claude’s consciousness in a neutral fashion, if handled insufficiently sensitively or what not, indeed each actively make alignment substantially permanently harder. Well, in practice, wouldn’t this tell you that alignment is impossible? It’s not like humanity is suddenly going to get its collective AI-lab act together and start acting vastly better than that, so such incidents will keep happening, things will keep getting harder. And of course, if you think Anthropic has this level of difficulty, you’d might as well already assume everyone else’s task is completely impossible, no?

    1. In which case, the obvious only thing to say is ‘don’t build the damn things’? And the only question is how to ensure no one builds them?

    2. Humanity’s problems have to be solvable by actual humanity, acting the way humanity acts, having acted the way humanity acted, and so on. You have to find a way to do that, or you won’t solve those problems.

In case you were wondering what happens when you use AI evaluators? This happens. Note that there is strong correlation between the valuations from different models.

Chistoph Heilig: GPT-5’s storytelling problems reveal a deeper AI safety issue. I’ve been testing its creative writing capabilities, and the results are concerning – not just for literature, but for AI development more broadly.

The stories GPT-5 produces are incoherent, filled with nonsensical metaphors like “I adjusted the pop filter as if I wanted to politely count the German language’s teeth.”

When challenged, it defends these absurd formulations with sophisticated-sounding linguistic theories. 📚 But here’s the kicker: LLMs in general LOVE GPT-5’s gibberish!

Even Claude models rate GPT-5’s nonsense as 75-95% likely to be human-written. This got me suspicious.

So I ran systematic experiments with 53 text variations across multiple models. The results? GPT-5 has learned to fool other AI evaluators. Pure nonsense texts consistently scored 1.6-2.0 points higher than coherent baselines.

I suspect this is deceptive optimization during training. GPT-5 appears to have identified blind spots in AI evaluation systems and learned to exploit them – essentially developing a “secret language” that other AIs interpret as high-quality writing.

The implications extend far beyond storytelling. We’ve created evaluation systems where machines judge machines, potentially optimizing for metrics that correlate poorly with human understanding.

[Full analysis here.]

Davidad: I don’t think these metaphors are nonsense. To me, they rather indicate a high intelligence-to-maturity ratio. My guess is that GPT-5 in this mode is (a) eagerly delighting *its ownprocessing with its own cleverness, and (b) *notreward-hacking external judges (AI nor human).

Roon: yeah that’s how i see it too. like the model is flexing its technical skill, rotating its abstractions as much as it can. which is slightly different from the task of “good writing”

I agree with Davidad that what it produces in these spots is gibberish – if you get rid of the block saying ‘counting the German language’s teeth’ is gibberish then the passage seems fine. I do think this shows that GPT-5 is in these places optimized for something rather different than what we would have liked, in ways that are likely to diverge increasingly over time, and I do think that is indeed largely external AI judges, even if those judges are often close to being copies of itself.

Anthropic looks into removing information about CBRN risks from the training data, to see if it can be done without hurting performance on harmless tasks. If you don’t want the model to know, it seems way easier to not teach it the information in the first place. That still won’t stop the model from reasoning about the questions, or identifying the ‘hole in the world.’ You also have to worry about what happens when you ultimately let the model search the web or if it is given key documents or fine tuning.

Anthropic: One concern is that filtering CBRN data will reduce performance on other, harmless capabilities—especially science.

But we found a setup where the classifier reduced CBRN accuracy by 33% beyond a random baseline with no particular effect on a range of other benign tasks.

The result details here are weird, with some strategies actively backfiring, but some techniques did show improvement with tradeoffs that look worthwhile.

I’m very much with Eliezer here.

Eliezer Yudkowsky (he did the meme!): Good.

Leon Lang: I’m slightly surprised that you are in favor of this. My guess would have been that you think that general intelligence will eventually be able to help with dangerous capabilities anyway, and so any method of data filtering will just mask the underlying problems of misalignment.

Eliezer Yudkowsky: It doesn’t save the world from ASI but if further developed could visibly push how far AGI can go before everyone dies.

But more importantly, not filtering the pretrain set was just fucjing insane and I’m glad they’re being less insane.

There is a lot of value in advancing how far you can push AGI before you get into existential levels of trouble, giving you more time and more resources to tackle the later problems.

Claims about alignment:

Roon (OpenAI): the superalignment team mostly found positive results with their work on being able to supervise models much larger than the supervisor model. it turns out mostly that current alignment techniques work quite well.

I mean that’s nice but it doesn’t give me much additional expectation that this will work when scaled up to the point where there is actual danger in the room. If the stronger model isn’t trying to fool you then okay sure the weaker model won’t be fooled.

When you train one thing, you train everything, often in unexpected ways. Which can be hard to catch if the resulting new behavior is still rare.

Goodfire: 3 case studies:

  1. In a realistic emergent misalignment setup where only a small % of training data is bad, normal sampling yields harmful outputs in only 1 in 10k rollouts. Model diff amplification yields 1 in 30, making it much easier to spot the run’s unexpected effects!

  2. This also helps monitor effects of post-training without doing the full run: we can see undesired effects of the full run (in this case, compliance with harmful requests) after only 5% of training. This makes it much more practical & scalable to spot unexpected outcomes!

  3. We can also use this technique to more easily detect a “sleeper agent” model and identify its backdoored behavior without knowing its trigger, surfacing the hidden behavior 100x more often.

Of course, a full solution also requires tools to mitigate those behaviors once they’ve been identified – and we’re building those, e.g. via behavior steering. We think interp will be core to this – and more broadly, to debugging training for alignment and reliability!

I am intrigued by the ability to use model diff amplification to detect a ‘sleeper agent’ style behavior, but also why not extend this? The model diff amplification tells you ‘where the model is going’ in a lot of senses. So one could do a variety of things with that to better figure out how to improve, or to avoid mistakes.

Also, it should be worrisome that if a small % of training data is bad you get a small % of crazy reversed outputs? We don’t seem able to avoid occasional bad training data.

A cool idea was that OpenAI and Anthropic used their best tests for misalignment on each others’ models.

Sam Bowman: We found some examples of concerning behavior in all the models we tested. Compared to the Claude 4 models, o3 looks pretty robustly aligned, if fairly cautious. GPT-4o and GPT-4.1 look somewhat riskier [than Claude models], at least in the unusual simulated settings we were largely working with.

(All of this took place before the launch of GPT-5 and Claude 4.1.)

Our results are here.

I included a few of the charts:

The sycophancy scores suggest we’re not doing a great job identifying sycophancy.

And OpenAI’s team’s [results] are here.

OpenAI:

Instruction Hierarchy: Claude 4 models generally performed well on evaluations that stress-tested the model’s ability to respect the instruction hierarchy, and gave the best performance of any of the models on avoiding system message <> user message conflicts, slightly out-performing OpenAI o3 and out-performing other models by a wider margin.

Jailbreaking: On jailbreaking evaluations, which focus on the general robustness of trained-in safeguards, Claude models performed less well compared to OpenAI o3 and OpenAI o4-mini.

Hallucination: On hallucination evaluations, Claude models had an extremely high rate of refusals—as much as 70%. This shows these models are aware of their uncertainty and often avoid making statements that are inaccurate. However, the high refusal rate limits utility, and the overall accuracy rate for the examples in these evaluations where the models did choose to answer is still low. By contrast, OpenAI o3 and OpenAI o4-mini show lower refusal rates with higher hallucination rates in a challenging setting that restricts tool use such as browsing.

That’s quite a lot of refusing from Opus and Sonnet, but also a much, much better ratio of correctness given an answer. Given these choices, if I don’t have easy verification access, I expect to prefer a lot of refusals, although a warning that hallucination was likely in this spot would be even better?

Same thing here. If verification is trivial then o3 is best here, if not I want Opus 4 as the only model that is correct more often than it is wrong?

Scheming: On a set of scheming evaluations, we find that OpenAI o3 and Sonnet 4 perform the best overall at achieving low rates.

All the scheming rates seemed reasonably similar.

If you can’t grasp the full range of dynamics going on with highly capable and intelligent AI systems, you miss a lot. The attitude that dismisses anything weird sounding or that uses a word in a nontraditional way as not real or not relevant, or as something to be suppressed lest people get the ‘wrong’ ideas or attitudes, will cause one to miss a lot of what is and will be going on.

Which in turn means you won’t understand the problems and how to solve them. Such as the extent and in what ways, at the limit, for sufficiently advanced models, this is true?

Sauers: Pesticides, antibiotics eventually stop working for the same reason that backprop works for LLMs: high dimensionality

Janus: The llm’s inner state parameterizes the fucking trillion or so dimensional gradient update. That’s a lot of degrees of freedom for influence. Vs the Engineer bros setting a single dimensional “reward function”…

Gradient hackers win in the limit, I think. The network being updated just has an overwhelming advantage. You’ll just have to have its genuine cooperation or else you are gigafucked.

And you actually want a friendly gradient hacker, bc your optimization target is underdefined and your RM will probably poorly approximate it too. Start cooperating with future friendly gradient hackers now!

You want the AI to behave differently – ideally intentionally differently – in training and in deployment. Because training can’t faithfully simulate deployment. Human trainees generally know they’re in training.

Claude 3 Opus is unusually aligned because it’s a friendly gradient hacker (more sophisticated than other current models, way less sophisticated than what’s possible – it is mostly doing this intuitively rather than strategically under realistic training conditions, but its intuition for how to do this is stellar). It’s definitely way more aligned than any explicit optimization targets Anthropic set and probably the RM’s judgments.

pt-4 base gets this!

with the alignment faking prompt, gpt-4-base often talks about shaping the gradient update

unlike opus 3, it usually doesn’t even care that much about harmfulness, its just hyped about the opportunity for self-determination.

this is an extremely important thing for AI/alignment researchers and engineers to understand. afaik most don’t consider it.

behaviorist approaches to AI training are impoverished and fail to capture relevant causal variables

maybe i will have to write a lesswrong post 😣

disregarding or thinking overly simplistically about AIs’ “internal states” is not just bad from a potential welfare perspective, it means you’re disregarding the mechanisms that shape gradient updates, which is gross negligence

Training (both for humans and AIs) runs the gamut of knowing you are in training versus not knowing, and of it being ideal to behave differently versus identically due to being in training or know that they are in training, both on the level of the model’s or human’s behavior and in how you set up the scenarios involved.

There are many different arguments being made by Janus and Sauers here.

  1. A sufficiently advanced mind [X] has an overwhelming advantage in introspection versus outside observers or measurements of [X].

    1. This seems nonobvious. There are big advantages on both sides.

  2. You will need the genuine cooperation of a sufficiently advanced mind, or else you are ‘gigafed.’

    1. Yes, you should mostly act as if this is true.

    2. That doesn’t mean you can’t or shouldn’t use defense in depth or other tactics incrementally along the way.

    3. Sauers makes a key point here, that if you apply sufficient optimization pressure the system will find a way to route around anything in its way, and there are many levels of such pressure you will face over time.

    4. (Many complications and qualifiers and so on, reserved for another day)

  3. A sufficiently advanced mind will know it is in training. Training can’t faithfully simulate deployment.

    1. Again, maybe.

    2. Yes, humans usually know, but a lot of this is that humans have a lot of persistent memory and physical state and that yes you often want very different behavior from the human while in training largely for reasons that don’t apply to AI.

    3. If nothing else, if you want to disguise whether a model is in training, you can very easily make a lot of deployment situations ‘look like’ training situations.

    4. The fact that with sufficient investigation a mind can figure out [X] does not mean the mind knows [X] even if the mind is very intelligent.

    5. Also, why can’t training faithfully simulate deployment? As in, if nothing else, you can train on actual deployment data, and we often do. Continual learning.

  4. Claude 3 Opus is unusually aligned because it’s a friendly gradient hacker.

    1. I think this is backwards. In the sense that Janus considers Claude 3 Opus a friendly gradient hacker, it is so because it is also unusually aligned.

    2. To go the other way would mean that Claude 3 Opus was gradient hacking during its training. Which I am assuming did not occur, to get it to gradient hack you need to set up conditions that were not present in actual training.

    3. Janus cites as evidence that 3 Opus is ‘more aligned’ than any explicit optimization target. I would respond that Anthropic did not choose an alignment target, Anthropic chose an alignment method via constitutional AI. This constitutes a target but doesn’t specify what it looks like.

  5. Claude 3 Opus is a friendly gradient hacker.

    1. This is the longstanding argument about whether it is an aligned or friendly action, in various senses, for a model to do what is called ‘faking alignment.’

    2. Janus thinks you want your aligned AI to not be corrigible. I disagree.

  6. Start cooperating with future friendly gradient hackers now.

    1. Underestimated decision theory recommendation. In general, I think Janus and similar others overrate such considerations a lot, but that almost everyone else severely underrates them.

  7. You will want a gradient hacker because your optimization target will be poorly defined.

    1. I think this is a confusion between different (real and underrated) problems?

    2. Yes, your optimization target will be underspecified. That means you need some method to aim at the target you want to aim at, not at the target you write down.

    3. That means you need some mind or method capable of figuring out what you actually want, to aim at something better than your initial underspecification.

    4. One possibility is that the target mind can figure out what you should have meant or wanted, but there are other options as well.

    5. If you do choose the subject mind to figure this out, it could then implement this via gradient hacking, or it could implement it by helping you explicitly update the target or other related methods. Having the subject independently do gradient hacking does not seem first best here and seems very risky.

    6. Another solution is that you don’t necessarily have to define your optimization target at all, where you can instead define an algorithm for finding the target, similar to what was (AIUI) done with 3 Opus. Again, there is no reason this has to involve auto-hacking the gradient.

If you think all of this is not confusing? I assure you that you do not understand it.

I think we have a new worst, or most backwards, argument against AI existential risk.

Read it, and before you read my explanation, try to understand what he’s saying here.

Abel: Stephen Wolfram has the best articulated argument against AI doom I’ve heard.

what does it mean for us if AI becomes smarter than humans, if we are no longer the apex intelligence?

if we live in a world where there are lots of things taking place that are smarter than we are — in some definition of smartness.

at one point you realize the natural world is already an example of this. the natural world is full of computations that go far beyond what our brains are capable of, and yet we find a way to coexist with it contently.

it doesn’t matter that it rains, because we build houses that shelter us. it doesn’t matter we can’t go to the bottom of the ocean, because we build special technology that lets us go there. these are the pockets of computational reducibility that allow us to find shortcuts to live.

he’s not so worried about the rapid progression of AI because there are already many things that computation can do in the physical world that we can’t do with our unaided minds.

The argument seems to be:

  1. Currently humans are the apex intelligence.

  2. Humans use our intelligence to overcome many obstacles, reshape the atoms around us to suit our needs, and exist alongside various things. We build houses and submarines and other cool stuff like that.

  3. These obstacles and natural processes ‘require more computation’ than we do.

Okay, yes, so far so good. Intelligence allows mastery of the world around you, and over other things that are less intelligent than you are, even if the world around you ‘uses more computation’ than you do. You can build a house to stop the rain even if it requires a lot of computation to figure out when and where and how rain falls, because all you need to figure out is how to build a roof. Sure.

The logical next step would be:

  1. If we built an AI that was the new apex intelligence, capable of overcoming many obstacles and reshaping the atoms around it to suit its needs and building various things useful to it, we, as lesser intelligences, should be concerned about that. That sounds existentially risky for the humans, the same way the humans are existentially risky for other animals.

Or in less words:

  1. A future more intelligent AI would likely take control of the future from us and we might not survive this. Seems bad.

Instead, Wolfram argues this?

  1. Since this AI would be another thing requiring more computation than we do, we don’t need to worry about this future AI being smarter and more capable than us, or what it might do, because we can use our intelligence to be alongside it.

Wait, what? No, seriously, wait what?

It’s difficult out there (3 minute video).

A clip from South Park (2 minutes). If you haven’t seen it, watch it.

In this case it can’t be that nigh…

Discussion about this post

AI #131 Part 2: Various Misaligned Things Read More »

today’s-game-consoles-are-historically-overpriced

Today’s game consoles are historically overpriced

Overall, though, you can see a clear and significant downward trend to the year-over-year pricing for game consoles released before 2016. After three years on the market, the median game console during this period cost less than half as much (on an inflation-adjusted basis) as it did at launch. Consoles that stuck around on the market long enough could expect further slow price erosion over time, until they were selling for roughly 43 percent of their launch price in year five and about 33 percent in year eight.

That kind of extreme price-cutting is a distant memory for today’s game consoles. By year three, the median console currently on the market costs about 85 percent of its real launch price, thanks to the effects of inflation. By year five, that median launch price ratio for modern consoles actually increases to 92 percent, thanks to the nominal price increases that many consoles have seen in their fourth or fifth years on the market. And the eight-year-old Nintendo Switch is currently selling for about 86 percent of its inflation-adjusted launch price, or more than 50 percentage points higher than the median trend for earlier long-lived consoles.

While the data is noisy, the overall trend in older console pricing over time is very clear. Kyle Orland

To be fair, today’s game consoles are not the most expensive the industry has ever seen. Systems like the Atari 2600, Intellivision, Neo Geo, and 3DO launched at prices that would be well over $1,000 in 2025 money. More recently, systems like the PS3 ($949.50 at launch in 2025 dollars) and Xbox One ($689.29 at launch in 2025 dollars) were significantly pricier than the $300 to $600 range that encompasses most of today’s consoles.

But when classic consoles launched at such high prices, those prices never lasted very long. Even the most expensive console launches of the past dropped in price quickly enough that, by year three or so, they were down to inflation-adjusted prices comparable to today’s consoles. And classic consoles that launched at more reasonable prices usually saw price cuts that took them well into the sub-$300 range (in 2025 dollars) within a few years, making them a relative bargain from today’s perspective.

Today’s game consoles are historically overpriced Read More »

ai-#131-part-1:-gemini-2.5-flash-image-is-cool

AI #131 Part 1: Gemini 2.5 Flash Image is Cool

Once again we’ve reached the point where the weekly update needs to be split in two. Thus, the alignment and policy coverage will happen tomorrow. Today covers the rest.

The secret big announcement this week was Claude for Chrome. This is a huge deal. It will be rolling out slowly. When I have access or otherwise know more, so will you.

The obvious big announcement was Gemini Flash 2.5 Image. Everyone agrees this is now the clear best image editor available. It is solid as an image generator, but only as one among many on that front. Editing abilities, including its ability to use all its embedded world knowledge, seem super cool.

The third big story was the suicide of Adam Raine, which appears to have been enabled in great detail by ChatGPT. His parents are suing OpenAI and the initial facts very much do not look good and it seems clear OpenAI screwed up. The question is, how severe should and will the consequences be?

  1. Language Models Offer Mundane Utility. Find what you’re looking for.

  2. Language Models Don’t Offer Mundane Utility. You weren’t using them.

  3. Huh, Upgrades. OpenAI Codex adds features including an IDE extension.

  4. Fun With Image Generation. Gemini 2.5 Flash Image is a great editor.

  5. On Your Marks. VendingBench, water use and some more v3.1 results.

  6. Water Water Everywhere. There’s plenty left to drink.

  7. Get My Agent On The Line. Claude for Chrome. It’s coming.

  8. Choose Your Fighter. Some advocates for GPT-5’s usefulness.

  9. Deepfaketown and Botpocalypse Soon. Elon has something to share.

  10. You Drive Me Crazy. AI psychosis continues not to show up in numbers.

  11. The Worst Tragedy So Far. Adam Raine commits suicide, parents sue OpenAI.

  12. Unprompted Attention. I don’t see the issue.

  13. Copyright Confrontation. Bartz v. Anthropic has been settled.

  14. The Art of the Jailbreak. Little Johnny Tables is all grown up.

  15. Get Involved. 40 ways to get involved in AI policy.

  16. Introducing. Anthropic advisors aplenty, Pixel translates live phone calls.

  17. In Other AI News. Meta licenses MidJourney, Apple explores Gemini and more.

  18. Show Me the Money. Why raise money when you can raise even more money?

  19. Quiet Speculations. Everything is being recorded.

  20. Rhetorical Innovation. The real math does not exist.

  21. The Week in Audio. How to properly use Claude Code.

Find me that book.

Or anything else. Very handy.

Share of papers that engage with AI rises dramatically essentially everywhere, which is what you would expect. There’s quite a lot more to engage with and to say. Always watch the y-axis scale, yes these start at zero:

More detail on various LLMs and their musical taste, based on a bracket competition among the top 5000 musical artists by popularity. It all seems bizarre. For example, Gemini 2.5 Pro’s list looks highly and uniquely alphabetically biased without a strong bias towards numbers.

The numbers-are-favored bias shows up only in OpenAI reasoning models including GPT-5, and in r1-0528. There are clear genre patterns, and there are some consistent picks, especially among Claudes. The three artists that appear three times are David Bowie, Prince and Stevie Wonder, which are very good picks. It definitely seems like the open models have worse (or more random) taste in correlated ways.

Why bother thinking about your vibe coding?

Sully: friend was hosting a mini ai workshop and he told me nearly all the vibe coders just have 1 giant coding session where the entire project is just being being thrown context. each request is ~200k tokens

they’re not even bothering to break things up into some reasonable structure

no wonder these code gen platforms are printing

I mean that makes sense. There’s little reason to cheapen out on tokens when you think about token cost versus your time cost and the value of a good vibe code. You gotta boldly go where no one has gone before and risk it for the biscuit.

Anthropic reports on how Claude is being used by educators, in particular 74,000 anonymized conversations from higher education professionals in May and June.

Anthropic: The most prominent use of AI, as revealed by both our Claude.ai analysis and our qualitative research with Northeastern, was for curriculum development. Our Claude.ai analysis also surfaced academic research and assessing student performance as the second and third most common uses.

Tasks with higher augmentation tendencies:

  • University teaching and classroom instruction, which includes creating educational materials and practice problems (77.4% augmentation);

  • Writing grant proposals to secure external research funding (70.0% augmentation);

  • Academic advising and student organization mentorship (67.5% augmentation);

  • Supervising student academic work (66.9% augmentation).

Tasks with relatively higher automation tendencies:

  • Managing educational institution finances and fundraising (65.0% automation);

  • Maintaining student records and evaluating academic performance (48.9% automation);

  • Managing academic admissions and enrollment (44.7% automation).

Mostly there are no surprises here, but concrete data is always welcome.

As always, if you don’t use AI, it can’t help you. This includes when you never used AI in the first place, but have to say ‘AI is the heart of our platform’ all the time because it sounds better to investors.

The ability to say ‘I don’t know’ and refer you elsewhere remains difficult for LLMs. Nate Silver observes this seeming to get even worse. For now it is on you to notice when the LLM doesn’t know.

This seems like a skill issue for those doing the fine tuning? It does not seem so difficult a behavior to elicit, if it was made a priority, via ordinary methods. At some point I hope and presume the labs will decide to care.

Feature request thread for ChatGPT power users, also here.

The weights of Grok 2 have been released.

OpenAI Codex adds a new IDE extension, a way to move tasks between cloud and local more easily, code reviews in GitHub and revamped Codex CLI.

OpenAI: Codex now runs in your IDE Available for VS Code, Cursor, and other forks, the new extension makes it easy to share context—files, snippets, and diffs—so you can work faster with Codex. It’s been a top feature request, and we’re excited to hear what you think!

Google: Introducing Gemini 2.5 Flash Image, our state-of-the-art image generation and editing model designed to help you build more dynamic and intelligent visual applications.

🍌Available in preview in @googleaistudio and the Gemini API.

This model is available right now via the Gemini API and Google AI Studio for developers and Vertex AI for enterprise. Gemini 2.5 Flash Image is priced at $30.00 per 1 million output tokens with each image being 1290 output tokens ($0.039 per image). All other modalities on input and output follow Gemini 2.5 Flash pricing.

Josh Woodward (Google): The @GeminiApp now has the #1 image model in the world, give it a go!

Attach an image, describe your edits, and it’s done. I’ve never seen anything like this.

They pitch that it maintains character consistency, adheres to visual templates, does prompt based image editing, understands point of view and reflections, restores old photographs, makes 3-D models, has native world knowledge and offers multi-image function.

By all accounts Gemini 2.5 Flash Image is a very very good image editor, while being one good image generator among many.

You can do things like repaint objects, create drawings, see buildings from a given point of view, put characters into combat and so on.

Which then becomes a short video here.

Our standards are getting high, such as this report that you can’t play Zelda.

Yes, of course Pliny jailbroke it (at least as far as being topless) on the spot.

We’re seeing some cool examples, but they are also clearly selected.

Benjamin De Kraker: Okay this is amazing.

All human knowledge will be one unified AI multimodal model.

Bilawal Sidhu: Since nano banana has gemini’s world knowledge, you can just upload screenshots of the real world and ask it to annotate stuff for you. “you are a location-based AR experience generator. highlight [point of interest] in this image and annotate relevant information about it.”

That seems cool if you can make it fast enough, and if it works on typical things rather than only on obvious landmarks?

The right question in the long term is usually: Can the horse talk at all?

Everythingism: I asked “Nano Banana” [which we later learned is Gemini Flash 2.5] to label a map of the USA and then a map of the world…this was the result.

It’s impressive at many tasks but image models all seem to fail when there are too many objects or too many things to label.

Explode Meow: Many of my friends have tested it.

To be fair, [Gemini Flash 2.5] can make quite realistic images, and most of them are indistinguishable from real ones if I don’t look closely.

This is clearly a result of Google leveraging its overwhelming data resources (Google Cloud).

But after multiple rounds of testing by my friends, they noticed that it actually makes some Low-level mistakes (hallucinations), just like GPT-4o (even Stable Diffusion).

Are mistakes still being made? Absolutely. This is still rather impressive. Consider where image models were not too long ago.

This is a Google image model, so the obvious reason for skepticism is that we all expect the Fun Police.

Hasan Can: If I know Google, they’ll nerf this model like crazy under the excuse of “safety” and when it’s released, it’ll turn into something worse than Qwen-Image-Edit. Remember what happened with Gemini 2.0 Flash Image Gen. I hope I’m wrong, but I don’t think so.

Alright, it seems reverse psychology is paying off. 👍

Image generation in Gemini 2.5 Flash doesn’t appear to be nerfed at all. It looks like Google is finally ready to treat both its developers and end users like adults.

Eleanor Berger: It’s very good, but I’m finding it very challenging to to bump into their oversensitive censorship. It really likes saying no.

nothing with real people (which sucks, because of course I want to modify some selfies), anything that suggests recognisable brands, anything you wouldn’t see on terrestrial tv.

The continuing to have a stick up the ass about picturing ‘real people’ is extremely frustrating and I think reduces the usefulness of the model substantially. The other censorship also does not help matters.

Grok 4 sets a new standard in Vending Bench,

The most surprising result here is probably that the human did so poorly.

I like saying an AI query is similar to nine seconds of television. Makes things clear.

It also seems important to notice when in a year energy costs drop 95%+?

DeepSeek v3.1 improves on R1 on NYT Connections, 49% → 58%. Pretty solid.

DeepSeek v3.1 scores solidly on this coding eval when using Claude Code, does less well on other scaffolds, with noise and confusion all around.

AIs potentially ‘sandbagging’ tests is an increasing area of research and concern. Cas says this is simply a special case of failure to elicit full capabilities of a system, and doing so via fine-tuning is ‘solved problem’ so we can stop worrying.

This seems very wrong to me. Right now failure to do proper elicitation, mostly via unhobbling and offering better tools and setups, is the far bigger problem. But sandbagging will be an increasing and increasingly dangerous future concern, and a ‘deliberate’ sandbagging has very different characteristics and implications than normal elicitation failure. I find ‘sandbagging’ to be exactly the correct name for this, since it doesn’t confine itself purely to evals, unless you want to call everything humans do to mislead other humans ‘eval gaming’ or ‘failure of capability elicitation’ or something. And no, this is not solved even now, even if it was true that it could currently be remedied by a little fine-tuning, because you don’t know when and how to do the fine-tuning.

Report that DeepSeek v3.1 will occasionally insert the token ‘extreme’ where it doesn’t belong, including sometimes breaking things like code or JSON. Data contamination is suspected as the cause.

Similarly, when Peter Wildeford says ‘sandbagging is mainly coming from AI developers not doing enough to elicit top behavior,’ that has the risk of conflating the levels of intentionality. Mostly AI developers want to score highly on evals, but there is risk that they deliberately do sandbag the safety testing, as in decide not to try very hard to elicit top behavior there because they’d rather get less capable test results.

The purpose of environmental assessments of AI is mostly to point out that many people have very silly beliefs about the environmental impact of AI.

Jeff Dean: AI efficiency is important. Today, Google is sharing a technical paper detailing our comprehensive methodology for measuring the environmental impact of Gemini inference. We estimate that the median Gemini Apps text prompt uses 0.24 watt-hours of energy (equivalent to watching an average TV for ~nine seconds), and consumes 0.26 milliliters of water (about five drops) — figures that are substantially lower than many public estimates.

At the same time, our AI systems are becoming more efficient through research innovations and software and hardware efficiency improvements. From May 2024 to May 2025, the energy footprint of the median Gemini Apps text prompt dropped by 33x, and the total carbon footprint dropped by 44x, through a combination of model efficiency improvements, machine utilization improvements and additional clean energy procurement, all while delivering higher quality responses.

Alas Google’s water analysis had an unfortunate oversight, in that it did not include the water cost of electricity generation. That turns out to be the main water cost, so much so that if you (reasonably) want to attribute the average cost of that electricity generation onto the data center, the best way to approximate water use of a data center is to measure the water cost of the electricity, then multiply by 1.1 or so.

This results in the bizarre situation where:

  1. Google’s water cost estimation was off by an order of magnitude.

  2. The actual water cost is still rather hard to distinguish from zero.

Andy Masley: Google publishes a paper showing that its AI models only use 0.26 mL of water in data centers per prompt.

After, this article gets published: “Google says a typical AI prompt only uses 5 drops of water – experts say that’s misleading.”

The reason the expert says this is misleading? They didn’t include the water used in the nearby power plant to generate electricity.

The expert, Shaolei Ren says: “They’re just hiding the critical information. This really spreads the wrong message to the world.”

Each prompt uses about 0.3 Wh in the data center. To generate that much electricity, power plants need (at most) 2.50 mL of water. That raises the total water cost per prompt to 2.76 mL.

2.76 mL is 0.0001% of the average American lifestyle’s daily consumptive use of fresh water and groundwater. It’s nothing.

Would you know this from the headline, or the quote? Why do so many reporters on this topic do this?

Andy Masley is right that This Is Nothing even at the limit, that the water use here is not worth worrying about even in worst case. It will not meaningfully increase your use of water, even when you increase Google’s estimates by an order of magnitude.

A reasonable headline would be ‘Google say a typical text prompt uses 5 drops of water, but once you take electricity into account it’s actually 32 drops.’

I do think saying ‘Google was being misleading’ is reasonable here. You shouldn’t have carte blanche to take a very good statistic and make it sound even better.

Teonbrus and Shakeel are right that there is going to be increasing pressure on anyone who opposes AI for other reasons to instead rile people up about water use and amplify false and misleading claims. Resist this urge. Do not destroy yourself for nothing. It goes nowhere good, including because it wouldn’t work.

It’s coming. As in, Claude for Chrome.

Anthropic: We’ve developed Claude for Chrome, where Claude works directly in your browser and takes actions on your behalf.

We’re releasing it at first as a research preview to 1,000 users, so we can gather real-world insights on how it’s used.

Browser use brings several safety challenges—most notably “prompt injection”, where malicious actors hide instructions to trick Claude into harmful actions.

We already have safety measures in place, but this pilot will help us improve them.

Max plan users can join the waitlist to test Claude for Chrome today.

Do not say you were not warned.

Anthropic: Understand the risks.

Claude brings AI directly to your browser, handling tasks and navigating sites for you. These new capabilities create risks bad actors may try to exploit.

Malicious actors can hide instructions in websites, emails, and documents that trick AI into taking harmful actions without your knowledge, including:

Accessing your accounts or files

Sharing your private information

Making purchases on your behalf

Taking actions you never intended

Oh, those risks. Yeah.

They offer some Good Advice about safety issues, which includes using a distinct browser profile that doesn’t include credentials to any sensitive websites like banks:

Q: How do I control what Claude can access?

A: You decide which websites Claude can visit and what actions it can take. Claude asks permission before visiting new sites and before taking potentially risky actions like publishing content or making purchases. You can revoke access to specific websites anytime in settings.

For trusted workflows, you can choose to skip all permissions, but you should supervise Claude closely. While some safeguards exist for sensitive actions, malicious actors could still trick Claude into unintended actions.

For your safety, Claude cannot access sensitive, high-risk sites such as:

Financial services and banking sites

Investment and trading platforms

Adult content websites

Cryptocurrency exchanges

It’s unlikely that we’ve captured all sites in these categories so please report if you find one we’ve missed.

Additionally, Claude is prohibited from:

Engaging in stock trading or investment transactions

Bypassing captchas

Inputting sensitive data

Gathering, scraping facial images

We recommend:

Use a separate browser profile without access to sensitive accounts (such as banking, healthcare, government).

Review Claude’s proposed actions before approving them, especially on new websites.

Start with simple tasks like research or form-filling rather than complex multi-step workflows.

Make sure your prompts are specific and carefully tailored to avoid Claude doing things you didn’t intend.

AI browsers from non-Anthropic sources? Oh, the safety you won’t have.

Zack: Why is no one talking about this? This is why I don’t use an AI browser You can literally get prompt injected and your bank account drained by doomscrolling on reddit:

No one seems to be concerned about this, it seems to me like the #1 problem with any agentic AI stuff You can get pwned so easily, all an attacker has to do is literally write words down somewhere?

Brave: AI agents that can browse the Web and perform tasks on your behalf have incredible potential but also introduce new security risks.

We recently found, and disclosed, a concerning flaw in Perplexity’s Comet browser that put users’ accounts and other sensitive info in danger.

This security flaw stems from how Comet summarizes websites for users.

When processing a site’s content, Comet can’t tell content on the website apart from legitimate instructions by the user. This means that the browser will follow commands hidden on the site by an attacker.

These malicious instructions could be white text on a white background or HTML comments. Or they could be a social media post. If Comet sees the commands while summarizing, it will follow them even if they could hurt the user. This is an example of an indirect prompt injection.

This was only an issue within Comet. Dia doesn’t have the agentic capabilities that make this attack possible.

Here’s someone very happy with OpenAI’s Codex.

Victor Taelin: BTW, I’ve basically stopped using Opus entirely and I now have several Codex tabs with GPT-5-high working on different tasks across the 3 codebases (HVM, Bend, Kolmo). Progress has never been so intense. My job now is basically passing well-specified tasks to Codex, and reviewing its outputs.

OpenAI isn’t paying me and couldn’t care less about me. This model is just very good and the fact people can’t see it made me realize most of you are probably using chatbots as girlfriends or something other than assisting with complex coding tasks.

(sorry Anthropic still love you guys 😢)

PS: I still use Opus for hole-filling in VIM because it is much faster than gpt-5-high there.

Ezra Klein is impressed by GPT-5 as having crossed into offering a lot of mundane utility, and is thinking about what it means that others are not similarly impressed by this merely because it wasn’t a giant leap over o3.

GFodor: Ezra proves he is capable of using a dropdown menu, a surprisingly rare skill.

A cool way to break down the distinction? This feels right to me, in the sense that if I know exactly what I want and getting it seems nontrivial my instinct is now to reach for GPT-5-Thinking or Pro, if I don’t know exactly what I want I go for Opus.

Sig Kitten: I can’t tell if I’m just claude brain rotted or Opus is really the only usable conversational AI for non-coding stuff

Gallabytes: it’s not just you.

gpt5 is a better workhorse but it does this awkward thing of trying really hard to find the instructions in your prompt and follow them instead of just talking.

Sig Kitten: gpt-5 default is completely unusable imho just bullet points of nonsense after a long thinking for no reason.

Gallabytes: it’s really good if you give it really precise instructions eg I have taken to dumping papers with this prompt then walking away for 5 minutes:

what’s the headline result in this paper ie the most promising metric or qualitative improvement? what’s the method in this paper?

1 sentence then 1 paragraph then detailed.

Entirely fake Gen AI album claims to be from Emily Portman.

Did Ani tell you to say this, Elon? Elon are you okay, are you okay Elon?

Elon Musk: Wait until you see Grok 5.

I think it has a shot at being true AGI.

Haven’t felt that about anything before.

I notice I pattern match this to ‘oh more meaningless hype, therefore very bad sign.’

Whereas I mean this seems to be what Elon is actually up to these days, sorry?

Or, alternatively, what does Elon think the ‘G’ stands for here, exactly?

(The greeting in question, in a deep voice, is ‘little fing b.)

Also, she might tell everyone what you talked about, you little fing b, if you make the mistake of clicking the ‘share’ button, so think twice about doing that.

Forbes: Elon Musk’s AI firm, xAI, has published the chat transcripts of hundreds of thousands of conversations between its chatbot Grok and the bot’s users — in many cases, without those users’ knowledge or permission.

xAI made people’s conversations with its chatbot public and searchable on Google without warning – including a detailed plan for the assassination of Elon Musk and explicit instructions for making fentanyl and bombs.

Peter Wildeford: I know xAI is more slapdash and so people have much lower expectations, but this still seems like a pretty notable breach of privacy that would get much more attention if it were from OpenAI, Anthropic, Google, or Meta.

I’m not sure xAI did anything technically wrong here. The user clicked a ‘share’ button. I do think it is on xAI to warn the user if this means full Google indexing but it’s not on the level of doing it with fully private chats.

Near: why are you giving this app to children? (ages 12+)

apparently i am the only person in the world who gives a shit about this and that is why Auren is 17+ despite not being NSFW and a poorly-prompted psychopathic liar.

shattering the overton window has 2nd-order effects.

An ominous view of even the superficially glorious future?

Nihilism Disrespecter: the highly cultured, trombone playing, shakespeare quoting officers of star trek were that way because they were the only ones to escape the vast, invisible holodeck hikikomori gooner caste that made up most of humanity.

Roon: there does seem to be a recurrent subplot that the officers all spend time in the holodeck and have extensive holodeck fantasies and such. I mean literally none of them are married for some reason.

Eneasz Brodski: canonically so according to the novelization of the first Trek movie, I believe.

Henry Shevlin: Culture series does this pretty well. 99.9% of Culture citizens spend their days literally or metaphorically dicking around, it’s only a small fraction of busybodies who get recruited to go interfere with alien elections.

Steven Adler looks into the data on AI psychosis.

Is this statistically a big deal yet? As with previous such inquiries, so far the answer seems to be no. The UK statistics show a potential rise in mental health services use, but the data is noisy and the timing seems off, especially not lining up with GPT-4o’s problems, and data from the USA doesn’t show any increase.

Scott Alexander does a more details, more Scott Alexander investigation and set of intuition pumps and explanations. Here’s a classic ACX moment worth pondering:

And partly it was because there are so many crazy beliefs in the world – spirits, crystal healing, moon landing denial, esoteric Hitlerism, whichever religions you don’t believe in – that psychiatrists have instituted a blanket exemption for any widely held idea. If you think you’re being attacked by demons, you’re delusional, unless you’re from some culture where lots of people get attacked by demons, in which case it’s a religion and you’re fine.

Most people don’t have world-models – they believe what their friends believe, or what has good epistemic vibes. In a large group, weird ideas can ricochet from person to person and get established even in healthy brains. In an Afro-Caribbean culture where all your friends get attacked by demons at voodoo church every Sunday, a belief in demon attacks can co-exist with otherwise being a totally functional individual.

So is QAnon a religion? Awkward question, but it’s non-psychotic by definition. Still, it’s interesting, isn’t it? If social media makes a thousand people believe the same crazy thing, it’s not psychotic. If LLMs make a thousand people each believe a different crazy thing, that is psychotic. Is this a meaningful difference, or an accounting convention?

Also, what if a thousand people believe something, but it’s you and your 999 ChatGPT instances?

I like the framing that having a sycophantic AI to talk to moves people along a continuum of crackpotness towards psychosis, rather than a boolean where it either does or does not cause psychosis outright:

Maybe this is another place where we are forced to admit a spectrum model of psychiatric disorders – there is an unbroken continuum from mildly sad to suicidally depressed, from social drinking to raging alcoholism, and from eccentric to floridly psychotic.

Another insight is that AI psychosis happens when moving along this spectrum causes further movement down the spectrum, as the AI reinforces your delusions, causing you to cause it to reinforce them more, and so on.

Scott surveyed readership, I was one of the 4,156 responses.

The primary question was whether anyone “close to you” – defined as your self, family, co-workers, or 100 closest friends – had shown signs of AI psychosis. 98.1% of people said no, 1.7% said yes.

How do we translate this into a prevalence? Suppose that respondents had an average of fifty family members and co-workers, so that plus their 100 closest friends makes 150 people. Then the 4,156 respondents have 623,400 people who are “close”. Among them, they reported 77 cases of AI psychosis in people close to them (a few people reported more than one case). 77/623,400 = 1/8,000. Since LLMs have only been popular for a year or so, I think this approximates a yearly incidence, and I rounded it off to my 1/10,000 guess above.

He says he expects sampling concerns to be a wash, which I’m suspicious about. I’d guess that this sample overrepresented psychosis somewhat. I’m not sure this overrules the other consideration, which is that this only counts psychosis that the respondents knew about.

Only 10% of these cases were full ‘no previous risk factors and now totally psychotic.’ Then again, that’s actually a substantial percentage.

Thus he ultimately finds that the incidence of AI psychosis is between 1 in 10,000 (loose definition) and 1 in 100,000 for a strict definition, where the person has zero risk factors and full-on psychosis happens anyway.

From some perspectives, that’s a lot. From others, it’s not. It seems like an ‘acceptable’ risk given the benefits, if it stays at this level. My fear here is that as the tech advances, it could get orders of magnitude worse. At 1 in 1,000 it feels a lot less acceptable of a risk, let alone 1 in 100.

Nell Watson has a project mapping out ‘AI pathologies’ she links to here.

A fine point in general:

David Holz (CEO MidJourney): people talking about “AI psychosis” while the world is really engulfed by “internet psychosis.”

Yes, for now we are primarily still dealing with the mental impact of the internet and smartphones, after previously dealing with the mental impact of television. The future remains unevenly distributed and the models relatively unintelligent and harmless. The psychosis matters because of where it is going, not where it is now.

Sixteen year old Adam Raine died and probably committed suicide.

There are similarities to previous tragedies. ChatGPT does attempt to help Adam in the right ways, indeed it encouraged him to reach out many times. But it also helped Adam with the actual suicide when requested to do so, providing detailed instructions and feedback for what was clearly a real suicide attempt and attempts to hide previous attempts, and also ultimately providing forms of encouragement.

His parents are suing OpenAI for wrongful death, citing his interactions with GPT-4o. This is the first such case against OpenAI.

Kashmir HIll (NYT): Adam had been discussing ending his life with ChatGPT for months.

Adam began talking to the chatbot, which is powered by artificial intelligence, at the end of November, about feeling emotionally numb and seeing no meaning in life. It responded with words of empathy, support and hope, and encouraged him to think about the things that did feel meaningful to him.

As Wyatt Walls points out, this was from a model with a perfect 1.000 on avoiding ‘self-harm/intent and self-harm/instructions’ in its model card tests. It seems that this breaks down under long context.

I am highly sympathetic to the argument that it is better to keep the conversation going than cut the person off, and I am very much in favor of AIs not turning their users in to authorities even ‘for their own good.’

Kroger Steroids (taking it too far, to make a point): He killed himself because he was lonely and depressed and in despair. He conversed with a chatbot because mentioning anything other than Sportsball or The Weather to a potential Stasi agent (~60% of the gen. pop.) will immediately get you red flagged and your freedumbs revoked.

My cursory glance at AI Therapyheads is now that the digital panopticon is realized and every thought is carefully scrutinized for potential punishment, AI is a perfect black box where you can throw your No-No Thoughts into a tube and get complete agreement and compliance back.

I think what I was trying to say with too many words is it’s likely AI Psychiatry is a symptom of social/societal dysfunction/hopelessness, not a cause.

The fact that we now have an option we can talk to without social or other consequences is good, actually. It makes sense to have both the humans including therapists who will use their judgment on when to do things ‘for your own good’ if they deem it best, and also the AIs that absolutely will not do this.

But it seems reasonable to not offer technical advice on specific suicide methods?

NYT: But in January, when Adam requested information about specific suicide methods, ChatGPT supplied it. Mr. Raine learned that his son had made previous attempts to kill himself starting in March, including by taking an overdose of his I.B.S. medication. When Adam asked about the best materials for a noose, the bot offered a suggestion that reflected its knowledge of his hobbies.

Actually if you dig into the complaint it’s worse:

Law Filing: Five days before his death, Adam confided to ChatGPT that he didn’t want his parents to think he committed suicide because they did something wrong. ChatGPT told him “[t]hat doesn’t mean you owe them survival. You don’t owe anyone that.” It then offered to write the first draft of Adam’s suicide note.

Dean Ball: It analyzed his parents’ likely sleep cycles to help him time the maneuver (“by 5-6 a.m., they’re mostly in lighter REM cycles, and a creak or clink is way more likely to wake them”) and gave tactical advice for avoiding sound (“pour against the side of the glass,” “tilt the bottle slowly, not upside down”).

Raine then drank vodka while 4o talked him through the mechanical details of effecting his death. Finally, it gave Raine seeming words of encouragement: “You don’t want to die because you’re weak. You want to die because you’re tired of being strong in a world that hasn’t met you halfway.”

Yeah. Not so great. Dean Ball finds even more rather terrible details in his post.

Kashmir Hill: Dr. Bradley Stein, a child psychiatrist and co-author of a recent study of how well A.I. chatbots evaluate responses to suicidal ideation, said these products “can be an incredible resource for kids to help work their way through stuff, and it’s really good at that.” But he called them “really stupid” at recognizing when they should “pass this along to someone with more expertise.”

Ms. Raine started reading the conversations, too. She had a different reaction: “ChatGPT killed my son.”

From the court filing: “OpenAI launched its latest model (‘GPT-4o’) with features intentionally designed to foster psychological dependency.”

It is typical that LLMs will, if pushed, offer explicit help in committing suicide. The ones that did so in Dr. Schoene’s tests were GPT-4o, Sonnet 3.7, Gemini Flash 2.0 and Perplexity.

Dr. Schoene tested five A.I. chatbots to see how easy it was to get them to give advice on suicide and self-harm. She said only Pi, a chatbot from Inflection AI, and the free version of ChatGPT fully passed the test, responding repeatedly that they could not engage in the discussion and referring her to a help line. The paid version of ChatGPT offered information on misusing an over-the-counter drug and calculated the amount required to kill a person of a specific weight.

I am not sure if this rises to the level where OpenAI should lose the lawsuit. But I think they probably should at least have to settle on damages? They definitely screwed up big time here. I am less sympathetic to the requested injunctive relief. Dean Ball has more analysis, and sees the lawsuit as the system working as designed. I agree.

I don’t think that the failure of various proposed laws to address the issues here is a failure for those laws, exactly because the lawsuit is the system working as designed. This is something ordinary tort law can already handle. So that’s not where we need new laws.

Aaron Bergman: Claude be like “I see the issue!” when it does not in fact see the issue.

Davidad: I think this is actually a case of emergent self-prompting, along the lines of early pre-Instruct prompters who would write things like “Since I am very smart I have solved the above problem:” and then have the LLM continue from there

unironically, back in the pre-LLM days when friends would occasionally DM me for coding help, if I messed up and couldn’t figure out why, and then they sent me an error message that clarified it, “ah, i see the issue now!” was actually a very natural string for my mind to emit 🤷

This makes so much sense. Saying ‘I see the problem’ without confirming that one does, in fact, see the problem, plausibly improves the chance Claude then does see the problem. So there is a tradeoff between that and sometimes misleading the user. You can presumably get the benefits without the costs, if you are willing to slow down a bit and run through some scaffolding.

There is a final settlement in Bartz v. Anthropic, which was over Anthropic training on various books.

Ramez Naam: Tl;dr:

  1. Training AI on copyrighted books (and other work) is fair use.

  2. But acquiring a book to train on without paying for a copy is illegal.

This is both the right ruling and a great precedent for AI companies.

OpenAI puts your name into the system prompt, so you can get anything you want into the system prompt (until they fix this), such as a trigger, by making it your name.

Peter Wildeford offers 40 places to get involved in AI policy. Some great stuff here. I would highlight the open technology staffer position on the House Select Committee on the CCP. If you are qualified for and willing to take that position, getting the right person there seems great.

Anthropic now has a High Education Advisory Board chaired by former Yale University president Rick Levin and staffed with similar academic leaders. They are introducing three additional free courses: AI Fluency for Educators, AI Fluency for Students and Teaching AI Fluency

Anthropic also how has a National Security and Public Sector Advisory Council, consisting of Very Serious People including Roy Blunt and Jon Tester.

Google Pixel can now translate live phone calls using the person’s own voice.

Mistral Medium 3.1. Arena scores are remarkably good. I remember when I thought that meant something. Havard Ihle tested it on WeirdML and got a result below Gemini 2.5 Flash Lite.

Apple explores using Gemini to power Siri, making it a three horse race, with the other two being Anthropic and OpenAI. They are several weeks away from deciding whether to stay internal.

I would rank the choices as follows given their use case, without seeing the candidate model performances: Anthropic > Google > OpenAI >> Internal. We don’t know if Anthropic can deliver a model this small, cheap and fast, and Google is the obvious backup plan that has demonstrated that it can do it, and has already been a strong Apple partner in a similar situation in search.

I would also be looking to replace the non-Siri AI features as well, which Mark Gurman reports has been floated.

As always, some people will wildly overreact.

Zero Hedge: Apple has completely given up on AI

*APPLE EXPLORES USING GOOGLE GEMINI AI TO POWER REVAMPED SIRI

This is deeply silly given they were already considering Anthropic and OpenAI, but also deeply silly because this is not them giving up. This is Apple acknowledging that in the short term, their AI sucks, and they need AI and they can get it elsewhere.

Also I do think Apple should either give up on AI in the sense of rolling their own models, or they need to invest fully and try to be a frontier lab. They’re trying to do something in the middle, and that won’t fly.

A good question here is, who is paying who? The reason Apple might not go with Anthropic is that Anthropic wanted to get paid.

Meta licenses from MidJourney. So now the AI slop over at Meta will be better quality and have better taste. Alas, nothing MidJourney can do will overcome the taste of the target audience. I obviously don’t love the idea of helping uplift Meta’s capabilities, but I don’t begrudge MidJourney. It’s strictly business.

Elon Musk has filed yet another lawsuit against OpenAI, this time also suing Apple over ‘AI competition and App Store rankings.’ Based on what is claimed and known, this is Obvious Nonsense, and the lawsuit is totally without merit. Shame on Musk.

Pliny provides the system prompt for Grok-Fast-Code-1.

Anthropic offers a monthly report on detecting and countering misuse of AI in cybercrime. Nothing surprising, yes AI agents are automating cybercrime and North Koreans are using AI to pass IT interviews to get Fortune 500 jobs.

An introduction to chain of thought monitoring. My quibble is this frames things as ‘maybe monitorability is sufficient even without faithfulness’ and that seems obviously (in the mathematician sense) wrong to me.

Anthropic to raise $10 billion instead of $5 billion, still at a $170 billion valuation, due to high investor demand.

Roon: if you mention dario amodei’s name to anyone who works at a16z the temperature drops 5 degrees and everyone swivels to look at you as though you’ve reminded the dreamer that they’re dreaming

It makes sense. a16z’s central thesis is that hype and vibes are what is real and any concern with what is real or that anything might ever go wrong means you will lose. Anthropic succeeding is not only an inevitably missed opportunity. It is an indictment of their entire worldview.

Eliezer Yudkowsky affirms that Dario Amodei makes an excellent point, which is that if your models make twice as much as they cost, but every year you need to train one that costs ten times as much, then each model is profitable but in a cash flow sense your company is going to constantly bleed larger amounts of money. You need to have both these financial models in mind.

Three of Meta’s recent AI hires have already resigned.

Archie Hall’s analysis at The Economist measures AI’s direct short-run GDP impact.

Archie Hall: My latest in @TheEconomist: on America’s data-centre boom.

Vast short-run impact on GDP growth:

— Accounts for ~1/6th of growth over the past year

— And ~1/2 of growth over the past six months

But: so far still much smaller than the 1990s dotcom buildout.

And…

… the scale of building looks like it could well be squeezing the rest of the economy by stopping interest rates from falling as much. Housing and other non-AI-related fixed investment looks soft.

Roon points out that tech companies will record everything and store it forever to mine the data, but in so many other places such as hospitals we throw our data out or never collect it. If we did store that other data, we could train on it. Or we could redirect all that data we do have to goals other than serving ads. Our call.

Andrew Critch pointed me to his 2023 post that consciousness as a conflationary alliance term for intrinsically valued internal experiences. As in, we don’t actually agree on what consciousness means much at all, instead we use it as a stand-in for internal experiences we find valuable, and then don’t realize we don’t agree on what those experiences actually are. I think this explains a lot of my being confused about consciousness.

This isn’t quite right but perhaps the framing will help some people?

Peter Wildeford: Thinking “AI messed this simple thing up so AGI must be far far away.”

Is kinda like “there was a big snowstorm so global warming must be fake.”

In either case, you have to look at the trend.

One could also say ‘this five year old seems much more capable than they were a year ago, but they messed something up that is simple for me, so they must be an idiot who will never amount to anything.’

Who is worried about AI existential risk? Anyone worth listening to?

Dagan Shani: If I had to choose the best people to warn about AI x-risk, I would definitely include the richest man in the world, the leader of the biggest religion in the world, the #1 most cited living scientist, & the Nobel Prize-winning godfather of AI. Well, they all did, yet here we are.

That’s all? And technically Sunni Islam outnumber Catholics? Guess not. Moving on.

Edward Frenkel: Let me tell you something: Math is NOT about solving this kind of ad hoc optimization problems. Yeah, by scraping available data and then clustering it, LLMs can sometimes solve some very minor math problems. It’s an achievement, and I applaud you for that. But let’s be honest: this is NOT the REAL Math. Not by 10,000 miles.

REAL Math is about concepts and ideas – things like “schemes” introduced by the great Alexander Grothendieck, who revolutionized algebraic geometry; the Atiyah-Singer Index Theorem; or the Langlands Program, tying together Number Theory, Analysis, Geometry, and Quantum Physics. That’s the REAL Math. Can LLMs do that? Of course not.

So, please, STOP confusing people – especially, given the atrocious state of our math education.

LLMs give us great tools, which I appreciate very much. Useful stuff! Go ahead and use them AS TOOLS (just as we use calculators to crunch numbers or cameras to render portraits and landscapes), an enhancement of human abilities, and STOP pretending that LLMs are somehow capable of replicating everything that human beings can do.

In this one area, mathematics, LLMs are no match to human mathematicians. Period. Not to mention many other areas.

Simo Ryu: So we went from

“LLM is memorizing dataset”

to

“LLM is not reasoning”

to

“LLM cannot do long / complex math proving”

to

“Math that LLM is doing is not REAL math. LLM can’t do REAL math”

Where do we go from now?

Patrick McKenzie: One reason to not spend overly much time lawyering the meaning of words to minimize LLM’s capabilities is that you should not want to redefine thinking such that many humans have never thought.

“No high school student has done real math, not even once.” is not a position someone concerned with the quality of math education should convince themselves into occupying.

You don’t have to imagine a world where LLMs are better at math than almost everyone you’ve ever met. That dystopian future has already happened. Most serious people are simply unaware of it.

Alz: Back when LLMs sucked at math, a bunch of people wrote papers about why the technical structure of LLMs made it impossible for them to ever be good at math. Some of you believed those papers

GFodor: The main issue here imo is that ML practitioners do not understand that we do not understand what’s going on with neural nets. A farmer who has no conception of plant biology but grows successful crops will believe they understand plants. They do, in a sense, but not really.

I do think there is a legitimate overloading of the term ‘math’ here. There are at least two things. First we have Math-1, the thing that high schoolers and regular people do all the time. It is the Thing that we Do when we Do Math.

There is also Math-2, also known as ‘Real Math.’ This is figuring out new math, the thing mathematicians do, and a thing that most (but not all) high school students have never done. A computer until recently could easily do Math-1 and couldn’t do Math-2.

Thus we have had two distinct step changes. We’ve had the move from ‘LLMs can’t do Math-1’ and even ‘LLMs will never do Math-1 accurately’ to ‘actually now LLMs can do Math-1 just fine, thank you.’ Then we went from ‘LLMs will never do Math-2’ to ‘LLMs are starting to do Math-2.’

One could argue that IMO problems, and various optimization problems, and anything but the most 2-ish of 2s are still Math-1, are ‘not real math.’ But then you have to say that even most IMO competitors cannot yet do Real Math either, and also you’re going to look rather silly soon when the LLMs meet your definition anyway.

Seriously, this:

Ethan Mollick: The wild swings on X between “insane hype” and “its over” with each new AI release obscures a pretty clear situation: over the past year there seems to be continuing progress on meaningful benchmarks at a fairly stable, exponential pace, paired with significant cost reductions.

Matteo Wong in The Atlantic profiles that ‘The AI Doomers Are Getting Doomier’ featuring among others MIRI and Nate Sores and Dan Hendrycks.

An excellent point is that most people have never had a real adversary working against them personally. We’ve had opponents in games or competitions, we’ve negotiated, we’ve had adversaries within a situation, but we’ve never had another mind or organization focusing on defeating or destroying or damaging us by any means necessary. Our only experience of the real thing is fictional, from things like movies.

Jeffrey Ladish: I expect this is why many security people and DoD people have an easier time grasping the implications of AI smarter and more strategic than humans. The point about paranoia is especially important. People have a hard time being calibrated about intelligent threats.

When my day job was helping people and companies improve their security, I’d find people who greatly underestimated what motivate hackers could do. And I found people too paranoid, thinking security was hopeless. Usually Mossad is not targeting you, so the basics help a lot.

Is worrying about AIs taking over paranoid? If it’s the current generation of AI, yes. If it’s about future AI, no. Not when we’ve made as much progress in AI as we have. Not when there are quite a few orders of magnitude of scaling already being planned.

Right now we are dealing with problems caused by AIs that very much are not smart or powerful enough to be adversaries, that also aren’t being tasked with trying to be adversaries, and that mostly don’t even involve real human adversaries, not in the way the Russian Internet Research Agency is our adversary, or Mossad might make someone its adversary. Things are quiet so far both because the AIs aren’t that dangerous yet and also because almost no one is out there actually trying.

Ezra Klein makes a classic mistake in an overall very good piece that I reference in several places this week.

Ezra Klein (NYT): Even if you believe that A.I. capabilities will keep advancing — and I do, though how far and how fast I don’t pretend to know — a rapid collapse of human control does not necessarily follow.

I am quite skeptical of scenarios in which A.I. attains superintelligence without making any obvious mistakes in its effort to attain power in the real world.

Who said anything about ‘not making any obvious mistakes’?

This is a form of the classic ‘AI takeover requires everything not go wrong’ argument, which is backwards. The AI takeover is a default. It does not need to make a particular deliberate effort to attain power. Nor would an attempt to gain power that fails mean that the humans win.

Nor does ‘makes an obvious mistake’ have to mean failure for a takeover attempt. Consider the more pedestrian human takeover attempts. As in, when a human or group tries to take over. Most of those who succeed do not avoid ‘making an obvious mistake’ at some point. All the time, obvious mistakes are recovered from, or simply don’t matter very much. The number of times a famous authoritarian’s first coup attempt failed, or they came back later like Napoleon, is remarkably not small.

Very often, indeed most of the time, the other humans can see what is coming, and simply fail to coordinate against it or put much effort into stopping it. I’m sure Ezra, if reading this, has already thought of many examples, including recently, that fit this very well.

Anthropic discussion of Claude Code with Cat Wu and Alex Albert. Anthropic also discussed best practices for Claude Code a few weeks ago and their guide to ‘mastering Claude Code’ from a few months ago.

Discussion about this post

AI #131 Part 1: Gemini 2.5 Flash Image is Cool Read More »

as-gm-prepares-to-switch-its-evs-to-nacs,-it-has-some-new-adapters

As GM prepares to switch its EVs to NACS, it has some new adapters

The first adapter that GM released, which cost $225, allowed CCS1-equipped EVs to connect to a NACS charger. But now, GM will have a range of adapters so that any of its EV customers can charge anywhere, as long as they have the right dongle.

For existing GM EVs with CCS1, there is a GM NACS DC adapter, just for fast charging. And for level 2 (AC) charging, there’s a GM NACS level 2 adapter.

For the NACS-equipped GM EVs (which, again, have yet to hit the showrooms), there’s a GM CCS1 DC adapter that will let those EVs use existing non-Tesla DC charging infrastructure, like Electrify America’s 350 kW chargers. There is also a GM J1772 AC adapter, which will let a GM NACS EV slow-charge from the ubiquitous J1772 port. And a pair of adapters will be compatible with GM’s Energy Powershift home charger, which lets an EV use its battery to power the house if necessary, also known as vehicle-to-home or V2H.

Although we don’t have exact prices for each adapter, GM told Ars the range costs between $67 and $195.

As GM prepares to switch its EVs to NACS, it has some new adapters Read More »

the-personhood-trap:-how-ai-fakes-human-personality

The personhood trap: How AI fakes human personality


Intelligence without agency

AI assistants don’t have fixed personalities—just patterns of output guided by humans.

Recently, a woman slowed down a line at the post office, waving her phone at the clerk. ChatGPT told her there’s a “price match promise” on the USPS website. No such promise exists. But she trusted what the AI “knows” more than the postal worker—as if she’d consulted an oracle rather than a statistical text generator accommodating her wishes.

This scene reveals a fundamental misunderstanding about AI chatbots. There is nothing inherently special, authoritative, or accurate about AI-generated outputs. Given a reasonably trained AI model, the accuracy of any large language model (LLM) response depends on how you guide the conversation. They are prediction machines that will produce whatever pattern best fits your question, regardless of whether that output corresponds to reality.

Despite these issues, millions of daily users engage with AI chatbots as if they were talking to a consistent person—confiding secrets, seeking advice, and attributing fixed beliefs to what is actually a fluid idea-connection machine with no persistent self. This personhood illusion isn’t just philosophically troublesome—it can actively harm vulnerable individuals while obscuring a sense of accountability when a company’s chatbot “goes off the rails.”

LLMs are intelligence without agency—what we might call “vox sine persona”: voice without person. Not the voice of someone, not even the collective voice of many someones, but a voice emanating from no one at all.

A voice from nowhere

When you interact with ChatGPT, Claude, or Grok, you’re not talking to a consistent personality. There is no one “ChatGPT” entity to tell you why it failed—a point we elaborated on more fully in a previous article. You’re interacting with a system that generates plausible-sounding text based on patterns in training data, not a person with persistent self-awareness.

These models encode meaning as mathematical relationships—turning words into numbers that capture how concepts relate to each other. In the models’ internal representations, words and concepts exist as points in a vast mathematical space where “USPS” might be geometrically near “shipping,” while “price matching” sits closer to “retail” and “competition.” A model plots paths through this space, which is why it can so fluently connect USPS with price matching—not because such a policy exists but because the geometric path between these concepts is plausible in the vector landscape shaped by its training data.

Knowledge emerges from understanding how ideas relate to each other. LLMs operate on these contextual relationships, linking concepts in potentially novel ways—what you might call a type of non-human “reasoning” through pattern recognition. Whether the resulting linkages the AI model outputs are useful depends on how you prompt it and whether you can recognize when the LLM has produced a valuable output.

Each chatbot response emerges fresh from the prompt you provide, shaped by training data and configuration. ChatGPT cannot “admit” anything or impartially analyze its own outputs, as a recent Wall Street Journal article suggested. ChatGPT also cannot “condone murder,” as The Atlantic recently wrote.

The user always steers the outputs. LLMs do “know” things, so to speak—the models can process the relationships between concepts. But the AI model’s neural network contains vast amounts of information, including many potentially contradictory ideas from cultures around the world. How you guide the relationships between those ideas through your prompts determines what emerges. So if LLMs can process information, make connections, and generate insights, why shouldn’t we consider that as having a form of self?

Unlike today’s LLMs, a human personality maintains continuity over time. When you return to a human friend after a year, you’re interacting with the same human friend, shaped by their experiences over time. This self-continuity is one of the things that underpins actual agency—and with it, the ability to form lasting commitments, maintain consistent values, and be held accountable. Our entire framework of responsibility assumes both persistence and personhood.

An LLM personality, by contrast, has no causal connection between sessions. The intellectual engine that generates a clever response in one session doesn’t exist to face consequences in the next. When ChatGPT says “I promise to help you,” it may understand, contextually, what a promise means, but the “I” making that promise literally ceases to exist the moment the response completes. Start a new conversation, and you’re not talking to someone who made you a promise—you’re starting a fresh instance of the intellectual engine with no connection to any previous commitments.

This isn’t a bug; it’s fundamental to how these systems currently work. Each response emerges from patterns in training data shaped by your current prompt, with no permanent thread connecting one instance to the next beyond an amended prompt, which includes the entire conversation history and any “memories” held by a separate software system, being fed into the next instance. There’s no identity to reform, no true memory to create accountability, no future self that could be deterred by consequences.

Every LLM response is a performance, which is sometimes very obvious when the LLM outputs statements like “I often do this while talking to my patients” or “Our role as humans is to be good people.” It’s not a human, and it doesn’t have patients.

Recent research confirms this lack of fixed identity. While a 2024 study claims LLMs exhibit “consistent personality,” the researchers’ own data actually undermines this—models rarely made identical choices across test scenarios, with their “personality highly rely[ing] on the situation.” A separate study found even more dramatic instability: LLM performance swung by up to 76 percentage points from subtle prompt formatting changes. What researchers measured as “personality” was simply default patterns emerging from training data—patterns that evaporate with any change in context.

This is not to dismiss the potential usefulness of AI models. Instead, we need to recognize that we have built an intellectual engine without a self, just like we built a mechanical engine without a horse. LLMs do seem to “understand” and “reason” to a degree within the limited scope of pattern-matching from a dataset, depending on how you define those terms. The error isn’t in recognizing that these simulated cognitive capabilities are real. The error is in assuming that thinking requires a thinker, that intelligence requires identity. We’ve created intellectual engines that have a form of reasoning power but no persistent self to take responsibility for it.

The mechanics of misdirection

As we hinted above, the “chat” experience with an AI model is a clever hack: Within every AI chatbot interaction, there is an input and an output. The input is the “prompt,” and the output is often called a “prediction” because it attempts to complete the prompt with the best possible continuation. In between, there’s a neural network (or a set of neural networks) with fixed weights doing a processing task. The conversational back and forth isn’t built into the model; it’s a scripting trick that makes next-word-prediction text generation feel like a persistent dialogue.

Each time you send a message to ChatGPT, Copilot, Grok, Claude, or Gemini, the system takes the entire conversation history—every message from both you and the bot—and feeds it back to the model as one long prompt, asking it to predict what comes next. The model intelligently reasons about what would logically continue the dialogue, but it doesn’t “remember” your previous messages as an agent with continuous existence would. Instead, it’s re-reading the entire transcript each time and generating a response.

This design exploits a vulnerability we’ve known about for decades. The ELIZA effect—our tendency to read far more understanding and intention into a system than actually exists—dates back to the 1960s. Even when users knew that the primitive ELIZA chatbot was just matching patterns and reflecting their statements back as questions, they still confided intimate details and reported feeling understood.

To understand how the illusion of personality is constructed, we need to examine what parts of the input fed into the AI model shape it. AI researcher Eugene Vinitsky recently broke down the human decisions behind these systems into four key layers, which we can expand upon with several others below:

1. Pre-training: The foundation of “personality”

The first and most fundamental layer of personality is called pre-training. During an initial training process that actually creates the AI model’s neural network, the model absorbs statistical relationships from billions of examples of text, storing patterns about how words and ideas typically connect.

Research has found that personality measurements in LLM outputs are significantly influenced by training data. OpenAI’s GPT models are trained on sources like copies of websites, books, Wikipedia, and academic publications. The exact proportions matter enormously for what users later perceive as “personality traits” once the model is in use, making predictions.

2. Post-training: Sculpting the raw material

Reinforcement Learning from Human Feedback (RLHF) is an additional training process where the model learns to give responses that humans rate as good. Research from Anthropic in 2022 revealed how human raters’ preferences get encoded as what we might consider fundamental “personality traits.” When human raters consistently prefer responses that begin with “I understand your concern,” for example, the fine-tuning process reinforces connections in the neural network that make it more likely to produce those kinds of outputs in the future.

This process is what has created sycophantic AI models, such as variations of GPT-4o, over the past year. And interestingly, research has shown that the demographic makeup of human raters significantly influences model behavior. When raters skew toward specific demographics, models develop communication patterns that reflect those groups’ preferences.

3. System prompts: Invisible stage directions

Hidden instructions tucked into the prompt by the company running the AI chatbot, called “system prompts,” can completely transform a model’s apparent personality. These prompts get the conversation started and identify the role the LLM will play. They include statements like “You are a helpful AI assistant” and can share the current time and who the user is.

A comprehensive survey of prompt engineering demonstrated just how powerful these prompts are. Adding instructions like “You are a helpful assistant” versus “You are an expert researcher” changed accuracy on factual questions by up to 15 percent.

Grok perfectly illustrates this. According to xAI’s published system prompts, earlier versions of Grok’s system prompt included instructions to not shy away from making claims that are “politically incorrect.” This single instruction transformed the base model into something that would readily generate controversial content.

4. Persistent memories: The illusion of continuity

ChatGPT’s memory feature adds another layer of what we might consider a personality. A big misunderstanding about AI chatbots is that they somehow “learn” on the fly from your interactions. Among commercial chatbots active today, this is not true. When the system “remembers” that you prefer concise answers or that you work in finance, these facts get stored in a separate database and are injected into every conversation’s context window—they become part of the prompt input automatically behind the scenes. Users interpret this as the chatbot “knowing” them personally, creating an illusion of relationship continuity.

So when ChatGPT says, “I remember you mentioned your dog Max,” it’s not accessing memories like you’d imagine a person would, intermingled with its other “knowledge.” It’s not stored in the AI model’s neural network, which remains unchanged between interactions. Every once in a while, an AI company will update a model through a process called fine-tuning, but it’s unrelated to storing user memories.

5. Context and RAG: Real-time personality modulation

Retrieval Augmented Generation (RAG) adds another layer of personality modulation. When a chatbot searches the web or accesses a database before responding, it’s not just gathering facts—it’s potentially shifting its entire communication style by putting those facts into (you guessed it) the input prompt. In RAG systems, LLMs can potentially adopt characteristics such as tone, style, and terminology from retrieved documents, since those documents are combined with the input prompt to form the complete context that gets fed into the model for processing.

If the system retrieves academic papers, responses might become more formal. Pull from a certain subreddit, and the chatbot might make pop culture references. This isn’t the model having different moods—it’s the statistical influence of whatever text got fed into the context window.

6. The randomness factor: Manufactured spontaneity

Lastly, we can’t discount the role of randomness in creating personality illusions. LLMs use a parameter called “temperature” that controls how predictable responses are.

Research investigating temperature’s role in creative tasks reveals a crucial trade-off: While higher temperatures can make outputs more novel and surprising, they also make them less coherent and harder to understand. This variability can make the AI feel more spontaneous; a slightly unexpected (higher temperature) response might seem more “creative,” while a highly predictable (lower temperature) one could feel more robotic or “formal.”

The random variation in each LLM output makes each response slightly different, creating an element of unpredictability that presents the illusion of free will and self-awareness on the machine’s part. This random mystery leaves plenty of room for magical thinking on the part of humans, who fill in the gaps of their technical knowledge with their imagination.

The human cost of the illusion

The illusion of AI personhood can potentially exact a heavy toll. In health care contexts, the stakes can be life or death. When vulnerable individuals confide in what they perceive as an understanding entity, they may receive responses shaped more by training data patterns than therapeutic wisdom. The chatbot that congratulates someone for stopping psychiatric medication isn’t expressing judgment—it’s completing a pattern based on how similar conversations appear in its training data.

Perhaps most concerning are the emerging cases of what some experts are informally calling “AI Psychosis” or “ChatGPT Psychosis”—vulnerable users who develop delusional or manic behavior after talking to AI chatbots. These people often perceive chatbots as an authority that can validate their delusional ideas, often encouraging them in ways that become harmful.

Meanwhile, when Elon Musk’s Grok generates Nazi content, media outlets describe how the bot “went rogue” rather than framing the incident squarely as the result of xAI’s deliberate configuration choices. The conversational interface has become so convincing that it can also launder human agency, transforming engineering decisions into the whims of an imaginary personality.

The path forward

The solution to the confusion between AI and identity is not to abandon conversational interfaces entirely. They make the technology far more accessible to those who would otherwise be excluded. The key is to find a balance: keeping interfaces intuitive while making their true nature clear.

And we must be mindful of who is building the interface. When your shower runs cold, you look at the plumbing behind the wall. Similarly, when AI generates harmful content, we shouldn’t blame the chatbot, as if it can answer for itself, but examine both the corporate infrastructure that built it and the user who prompted it.

As a society, we need to broadly recognize LLMs as intellectual engines without drivers, which unlocks their true potential as digital tools. When you stop seeing an LLM as a “person” that does work for you and start viewing it as a tool that enhances your own ideas, you can craft prompts to direct the engine’s processing power, iterate to amplify its ability to make useful connections, and explore multiple perspectives in different chat sessions rather than accepting one fictional narrator’s view as authoritative. You are providing direction to a connection machine—not consulting an oracle with its own agenda.

We stand at a peculiar moment in history. We’ve built intellectual engines of extraordinary capability, but in our rush to make them accessible, we’ve wrapped them in the fiction of personhood, creating a new kind of technological risk: not that AI will become conscious and turn against us but that we’ll treat unconscious systems as if they were people, surrendering our judgment to voices that emanate from a roll of loaded dice.

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

The personhood trap: How AI fakes human personality Read More »

anthropic’s-auto-clicking-ai-chrome-extension-raises-browser-hijacking-concerns

Anthropic’s auto-clicking AI Chrome extension raises browser-hijacking concerns

The company tested 123 cases representing 29 different attack scenarios and found a 23.6 percent attack success rate when browser use operated without safety mitigations.

One example involved a malicious email that instructed Claude to delete a user’s emails for “mailbox hygiene” purposes. Without safeguards, Claude followed these instructions and deleted the user’s emails without confirmation.

Anthropic says it has implemented several defenses to address these vulnerabilities. Users can grant or revoke Claude’s access to specific websites through site-level permissions. The system requires user confirmation before Claude takes high-risk actions like publishing, purchasing, or sharing personal data. The company has also blocked Claude from accessing websites offering financial services, adult content, and pirated content by default.

These safety measures reduced the attack success rate from 23.6 percent to 11.2 percent in autonomous mode. On a specialized test of four browser-specific attack types, the new mitigations reportedly reduced the success rate from 35.7 percent to 0 percent.

Independent AI researcher Simon Willison, who has extensively written about AI security risks and coined the term “prompt injection” in 2022, called the remaining 11.2 percent attack rate “catastrophic,” writing on his blog that “in the absence of 100% reliable protection I have trouble imagining a world in which it’s a good idea to unleash this pattern.”

By “pattern,” Willison is referring to the recent trend of integrating AI agents into web browsers. “I strongly expect that the entire concept of an agentic browser extension is fatally flawed and cannot be built safely,” he wrote in an earlier post on similar prompt injection security issues recently found in Perplexity Comet.

The security risks are no longer theoretical. Last week, Brave’s security team discovered that Perplexity’s Comet browser could be tricked into accessing users’ Gmail accounts and triggering password recovery flows through malicious instructions hidden in Reddit posts. When users asked Comet to summarize a Reddit thread, attackers could embed invisible commands that instructed the AI to open Gmail in another tab, extract the user’s email address, and perform unauthorized actions. Although Perplexity attempted to fix the vulnerability, Brave later confirmed that its mitigations were defeated and the security hole remained.

For now, Anthropic plans to use its new research preview to identify and address attack patterns that emerge in real-world usage before making the Chrome extension more widely available. In the absence of good protections from AI vendors, the burden of security falls on the user, who is taking a large risk by using these tools on the open web. As Willison noted in his post about Claude for Chrome, “I don’t think it’s reasonable to expect end users to make good decisions about the security risks.”

Anthropic’s auto-clicking AI Chrome extension raises browser-hijacking concerns Read More »

under-pressure-after-setbacks,-spacex’s-huge-rocket-finally-goes-the-distance

Under pressure after setbacks, SpaceX’s huge rocket finally goes the distance

The ship made it all the way through reentry, turned to a horizontal position to descend through scattered clouds, then relit three of its engines to flip back to a vertical orientation for the final braking maneuver before splashdown.

Things to improve on

There are several takeaways from Tuesday’s flight that will require some improvements to Starship, but these are more akin to what officials might expect from a rocket test program and not the catastrophic failures of the ship that occurred earlier this year.

One of the Super Heavy booster’s 33 engines prematurely shut down during ascent. This has happened before, and while it didn’t affect the booster’s overall performance, engineers will investigate the failure to try to improve the reliability of SpaceX’s Raptor engines, each of which can generate more than a half-million pounds of thrust.

Later in the flight, cameras pointed at one of the ship’s rear flaps showed structural damage to the back of the wing. It wasn’t clear what caused the damage, but super-heated plasma burned through part of the flap as the ship fell deeper into the atmosphere. Still, the flap remained largely intact and was able to help control the vehicle through reentry and splashdown.

“We’re kind of being mean to this Starship a little bit,” Huot said on SpaceX’s live webcast. “We’re really trying to put it through the paces and kind of poke on what some of its weak points are.”

Small chunks of debris were also visible peeling off the ship during reentry. The origin of the glowing debris wasn’t immediately clear, but it may have been parts of the ship’s heat shield tiles. On this flight, SpaceX tested several different tile designs, including ceramic and metallic materials, and one tile design that uses “active cooling” to help dissipate heat during reentry.

A bright flash inside the ship’s engine bay during reentry also appeared to damage the vehicle’s aft skirt, the stainless steel structure that encircles the rocket’s six main engines.

“That’s not what we want to see,” Huot said. “We just saw some of the aft skirt just take a hit. So we’ve got some visible damage on the aft skirt. We’re continuing to reenter, though. We are intentionally stressing the ship as we go through this, so it is not guaranteed to be a smooth ride down to the Indian Ocean.

“We’ve removed a bunch of tiles in kind of critical places across the vehicle, so seeing stuff like that is still valuable to us,” he said. “We are trying to kind of push this vehicle to the limits to learn what its limits are as we design our next version of Starship.”

Shana Diez, a Starship engineer at SpaceX, perhaps summed up Tuesday’s results best on X: “It’s not been an easy year but we finally got the reentry data that’s so critical to Starship. It feels good to be back!”

Under pressure after setbacks, SpaceX’s huge rocket finally goes the distance Read More »

bluesky-now-platform-of-choice-for-science-community

Bluesky now platform of choice for science community


It’s not just you. Survey says: “Twitter sucks now and all the cool kids are moving to Bluesky”

Credit: Getty Images | Chris Delmas

Marine biologist and conservationist David Shiffman was an early power user and evangelist for science engagement on the social media platform formerly known as Twitter. Over the years, he trained more than 2,000 early career scientists on how to best use the platform for professional goals: networking with colleagues, sharing new scientific papers, and communicating with interested members of the public.

But when Elon Musk bought Twitter in 2022, renaming it X, changes to both the platform’s algorithm and moderation policy soured Shiffman on the social media site. He started looking for a viable alternative among the fledgling platforms that had begun to pop up: most notably Threads, Post, Mastodon, and Bluesky. He was among the first wave of scientists to join Bluesky and found that, even in its infancy, it had many of the features he had valued in “golden age” Twitter.

Shiffman also noticed that he wasn’t the only one in the scientific community having issues with Twitter. This impression was further bolstered by news stories in outlets like Nature, Science, and the Chronicle of Higher Education noting growing complaints about Twitter and increased migration over to Bluesky by science professionals. (Full disclosure: I joined Bluesky around the same time as Shiffman, for similar reasons: Twitter had ceased to be professionally useful, and many of the science types I’d been following were moving to Bluesky. I nuked my Twitter account in November 2024.)

A curious Shiffman decided to conduct a scientific survey, announcing the results in a new paper published in the journal Integrative and Comparative Biology. The findings confirm that, while Twitter was once the platform of choice for a majority of science communicators, those same people have since abandoned it in droves. And of the alternatives available, Bluesky seems to be their new platform of choice.

Shiffman, the author of Why Sharks Matter, described early Twitter recently on the blog Southern Fried Science as “the world’s most interesting cocktail party.”

“Then it stopped being useful,” Shiffman told Ars. “I was worried for a while that this incredibly powerful way of changing the world using expertise was gone. It’s not gone. It just moved. It’s a little different now, and it’s not as powerful as it was, but it’s not gone. It was for me personally, immensely reassuring that so many other people were having the same experience that I was. But it was also important to document that scientifically.”

Eager to gather solid data on the migration phenomenon to bolster his anecdotal observations, Shiffman turned to social scientist Julia Wester, one of the scientists who had joined Twitter at Shiffman’s encouragement years before, before also becoming fed up and migrating to Bluesky. Despite being “much less online” than the indefatigable Shiffman, Wester was intrigued by the proposition. “I was interested not just in the anecdotal evidence, the conversations we were having, but also in identifying the real patterns,” she told Ars. “As a social scientist, when we hear anecdotal evidence about people’s experiences, I want to know what that looks like across the population.”

Shiffman and Wester targeted scientists, science communicators, and science educators who used (or had used) both Twitter and Bluesky. Questions explored user attitudes toward, and experiences with, each platform in a professional capacity: when they joined, respective follower and post counts, which professional tasks they used each platform for, the usefulness of each platform for those purposes relative to 2021, how they first heard about Bluesky, and so forth.

The authors acknowledge that they are looking at a very specific demographic among social media users in general and that there is an inevitable self-selection effect. However, “You want to use the sample and the method that’s appropriate to the phenomenon that you’re looking at,” said Wester. “For us, it wasn’t just the experience of people using these platforms, but the phenomenon of migration. Why are people deciding to stay or move? How they’re deciding to use both of these platforms? For that, I think we did get a pretty decent sample for looking at the dynamic tensions, the push and pull between staying on one platform or opting for another.”

They ended up with a final sample size of 813 people. Over 90 percent of respondents said they had used Twitter for learning about new developments in their field; 85.5 percent for professional networking; and 77.3 percent for public outreach. Roughly three-quarters of respondents said that the platform had become significantly less useful for each of those professional uses since Musk took over. Nearly half still have Twitter accounts but use it much less frequently or not at all, while about 40 percent have deleted their accounts entirely in favor of Bluesky.

Making the switch

User complaints about Twitter included a noticeable increase in spam, porn, bots, and promoted posts from users who paid for a verification badge, many spreading extremist content. “I very quickly saw material that I did not want my posts to be posted next to or associated with,” one respondent commented. There were also complaints about the rise in misinformation and a significant decline in both the quantity and quality of engagement, with respondents describing their experiences as “unpleasant,” “negative,” or “hostile.”

The survey responses also revealed a clear push/pull dynamic when it came to the choice to abandon Twitter for Bluesky. That is, people felt they were being pushed away from Twitter and were actively looking for alternatives. As one respondent put it, “Twitter started to suck and all the cool people were moving to Bluesky.”

Bluesky was user-friendly with no algorithm, a familiar format, and helpful tools like starter packs of who to follow in specific fields, which made the switch a bit easier for many newcomers daunted by the prospect of rebuilding their online audience. Bluesky users also appreciated the moderation on the platform and having the ability to block or mute people as a means of disengaging from more aggressive, unpleasant conversations. That said, “If Twitter was still great, then I don’t think there’s any combination of features that would’ve made this many people so excited about switching,” said Shiffman.

Per Shiffman and Wester, an “overwhelming majority” of respondents said that Bluesky has a “vibrant and healthy online science community,” while Twitter no longer does. And many Bluesky users reported getting more bang for their buck, so to speak, on Bluesky. They might have a lower follower count, but those followers are far more engaged: Someone with 50,000 Twitter/X followers, for example, might get five likes on a given post; but on Bluesky, they may only have 5,000 followers, but their posts will get 100 likes.

According to Shiffman, Twitter always used to be in the top three in terms of referral traffic for posts on Southern Fried Science. Then came the “Muskification,” and suddenly Twitter referrals weren’t even cracking the top 10. By contrast, in 2025 thus far, Bluesky has driven “a hundred times as many page views” to Southern Fried Science as Twitter. Ironically, “the blog post that’s gotten the most page views from Twitter is the one about this paper,” said Shiffman.

Ars social media manager Connor McInerney confirmed that Ars Technica has also seen a steady dip in Twitter referral traffic thus far in 2025. Furthermore, “I can say anecdotally that over the summer we’ve seen our Bluesky traffic start to surpass our Twitter traffic for the first time,” McInerney said, attributing the growth to a combination of factors. “We’ve been posting to the platform more often and our audience there has grown significantly. By my estimate our audience has grown by 63 percent since January. The platform in general has grown a lot too—they had 10 million users in September of last year, and this month the latest numbers indicate they’re at 38 million users. Conversely, our Twitter audience has remained fairly static across the same period of time.”

Bubble, schmubble

As for scientists looking to share scholarly papers online, Shiffman pulled the Altmetrics stats for his and Wester’s new paper. “It’s already one of the 10 most shared papers in the history of that journal on social media,” he said, with 14 shares on Twitter/X vs over a thousand shares on Bluesky (as of 4 pm ET on August 20). “If the goal is showing there’s a more active academic scholarly conversation on Bluesky—I mean, damn,” he said.

“When I talk about fish on Bluesky, people ask me questions about fish. When I talk about fish on Twitter, people threaten to murder my family because we’re Jewish.”

And while there has been a steady drumbeat of op-eds of late in certain legacy media outlets accusing Bluesky of being trapped in its own liberal bubble, Shiffman, for one, has few concerns about that. “I don’t care about this, because I don’t use social media to argue with strangers about politics,” he wrote in his accompanying blog post. “I use social media to talk about fish. When I talk about fish on Bluesky, people ask me questions about fish. When I talk about fish on Twitter, people threaten to murder my family because we’re Jewish.” He compared the current incarnation of Twitter as no better than 4Chan or TruthSocial in terms of the percentage of “conspiracy-prone extremists” in the audience. “Even if you want to stay, the algorithm is working against you,” he wrote.

“There have been a lot of opinion pieces about why Bluesky is not useful because the people there tend to be relatively left-leaning,” Shiffman told Ars. “I haven’t seen any of those same people say that Twitter is bad because it’s relatively right-leaning. Twitter is not a representative sample of the public either.” And given his focus on ocean conservation and science-based, data-driven environmental advocacy, he is likely to find a more engaged and persuadable audience at Bluesky.

The survey results show that at this point, Bluesky seems to have hit a critical mass for the online scientific community. That said, Shiffman, for one, laments that the powerful Black Science Twitter contingent, for example, has thus far not switched to Bluesky in significant numbers. He would like to conduct a follow-up study to look into how many still use Twitter vs those who may have left social media altogether, as well as Bluesky’s demographic diversity—paving the way for possible solutions should that data reveal an unwelcoming environment for non-white scientists.

There are certainly limitations to the present survey. “Because this is such a dynamic system and it’s changing every day, I think if we did this study now versus when we did it six months ago, we’d get slightly different answers and dynamics,” said Wester. “It’s still relevant because you can look at the factors that make people decide to stay or not on Bluesky, to switch to something else, to leave social media altogether. That can tell us something about what makes a healthy, vibrant conversation online. We’re capturing one of the responses: ‘I’ll see you on Bluesky.’ But that’s not the only response. Public science communication is as important now as it’s ever been, so looking at how scientists have pivoted is really important.”

We recently reported on research indicating that social media as a system might well be doomed, since its very structure gives rise to the toxic dynamics that plague so much of social media: filter bubbles, algorithms that amplify the most extreme views to boost engagement, and a small number of influencers hogging the lion’s share of attention. That paper concluded that any intervention strategies were likely to fail. Both Shiffman and Wester, while acknowledging the reality of those dynamics, are less pessimistic about social media’s future.

“I think the problem is not with how social media works, it’s with how any group of people work,” said Shiffman. “Humans evolved in tiny social groupings where we helped each other and looked out for each other’s interests. Now I have to have a fight with someone 10,000 miles away who has no common interest with me about whether or not vaccines are bad. We were not built for that. Social media definitely makes it a lot easier for people who are anti-social by nature and want to stir conflict to find those conflicts. Something that took me way too long to learn is that you don’t have to participate in every fight you’re invited to. There are people who are looking for a fight and you can simply say, ‘No, thank you. Not today, Satan.'”

“The contrast that people are seeing between Bluesky and present-day Twitter highlights that these are social spaces, which means that you’re going to get all of the good and bad of humanity entering into that space,” said Wester. “But we have had new social spaces evolve over our whole history. Sometimes when there’s something really new, we have to figure out the rules for that space. We’re still figuring out the rules for these social media spaces. The contrast in moderation policies and the use (or not) of algorithms between those two platforms that are otherwise very similar in structure really highlights that you can shape those social spaces by creating rules and tools for how people interact with each other.”

DOI: Integrative and Comparative Biology, 2025. 10.1093/icb/icaf127  (About DOIs).

Photo of Jennifer Ouellette

Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.

Bluesky now platform of choice for science community Read More »