
Dwarkesh Patel on Continual Learning

A key question going forward is the extent to which making further AI progress will depend upon some form of continual learning. Dwarkesh Patel offers us an extended essay considering these questions and reasons to be skeptical of the pace of progress for a while. I am less skeptical about many of these particular considerations, and do my best to explain why in detail.

Separately, Ivanka Trump recently endorsed a paper with a discussion I liked a lot less, but one that needs to be discussed given how influential her voice might (mind you, I said might) be on policy going forward, so I will cover that here as well.

Dwarkesh Patel explains why he doesn’t think AGI is right around the corner, and why AI progress today is insufficient to replace most white collar employment: That continual learning is both necessary and unsolved, and will be a huge bottleneck.

He opens with this quote:

Rudiger Dornbusch: Things take longer to happen than you think they will, and then they happen faster than you thought they could.

Clearly this means one is poorly calibrated, but also yes, and I expect it to feel like this as well. Either capabilities, diffusion, or both will be on an exponential, and the future will be highly unevenly distributed until suddenly parts of it aren’t anymore. That seems to be true fractally as well: when the tech is ready and I figure out how to make AI do something, that’s it, it’s done.

Here is Dwarkesh’s Twitter thread summary:

Dwarkesh Patel: Sometimes people say that even if all AI progress totally stopped, the systems of today would still be economically transformative. I disagree. The reason that the Fortune 500 aren’t using LLMs to transform their workflows isn’t because the management is too stodgy.

Rather, it’s genuinely hard to get normal humanlike labor out of LLMs. And this has to do with some fundamental capabilities these models lack.

New blog post where I explain why I disagree with this, and why I have slightly longer timelines to AGI than many of my guests.

I think continual learning is a huge bottleneck to the usefulness of these models, and extended computer use may take years to sort out.

Link here.

There is no consensus definition of transformational but I think this is simply wrong, in the sense that LLMs being stuck without continual learning at essentially current levels would not stop them from having a transformational impact. There are a lot of other ways to get a ton more utility out of what we already have, and over time we would build around what the models can do rather than giving up the moment they don’t sufficiently neatly fit into existing human-shaped holes.

When we do solve human like continual learning, however, we might see a broadly deployed intelligence explosion *even if there’s no more algorithmic progress*.

Simply from the AI amalgamating the on-the-job experience of all the copies broadly deployed through the economy.

I’d bet 2028 for computer use agents that can do taxes end-to-end for my small business as well as a competent general manager could in a week: including chasing down all the receipts on different websites, emailing back and forth for invoices, and filing to the IRS.

That being said, you can’t play around with these models when they’re in their element and still think we’re not on track for AGI.

Strongly agree with that last statement. Regardless of how much we can do without strictly solving continual learning, continual learning is not solved… yet.

These are simple, self-contained, short-horizon, language-in, language-out tasks – the kinds of assignments that should be dead center in the LLMs’ repertoire. And they’re 5/10 at them. Don’t get me wrong, that’s impressive.

But the fundamental problem is that LLMs don’t get better over time the way a human would. The lack of continual learning is a huge huge problem. The LLM baseline at many tasks might be higher than an average human’s. But there’s no way to give a model high level feedback.

You’re stuck with the abilities you get out of the box. You can keep messing around with the system prompt. In practice this just doesn’t produce anything even close to the kind of learning and improvement that human employees experience.

The reason humans are so useful is not mainly their raw intelligence. It’s their ability to build up context, interrogate their own failures, and pick up small improvements and efficiencies as they practice a task.

You make an AI tool. It’s 5/10 out of the box. What level of Skill Issue are we dealing with here, that stops it from getting better over time assuming you don’t get to upgrade the underlying model?

You can obviously engage in industrial amounts of RL or other fine-tuning, but that too only goes so far.

You can use things like memory, or train LoRAs, or various other incremental tricks. That doesn’t enable radical changes, but I do think it can work for the kinds of preference learning Dwarkesh is complaining he currently doesn’t have access to, and you can if desired go back and fine-tune the entire system periodically.
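To be concrete about what those incremental tricks can look like, here is a minimal sketch of the LoRA route, assuming a Hugging Face causal LM plus the peft library; `build_feedback_dataset` is a hypothetical placeholder for whatever pipeline turns a user’s accumulated corrections and preferences into training pairs. This is an illustration of the shape of the approach, not anyone’s production system.

```python
# Minimal sketch: periodically train a small LoRA adapter on accumulated user
# feedback instead of touching the base model weights. Assumes transformers +
# peft; build_feedback_dataset() is a hypothetical helper.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.1-8B"  # any causal LM would do here
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Low-rank adapters on the attention projections; cheap enough to retrain weekly.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

train_dataset = build_feedback_dataset(tokenizer)  # hypothetical: (prompt, preferred output) pairs

Trainer(
    model=model,
    args=TrainingArguments(output_dir="adapters/latest", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=1e-4),
    train_dataset=train_dataset,
).train()
model.save_pretrained("adapters/latest")  # swap this adapter in at inference time
```

None of this is continual learning in the strong sense, but it is the kind of thing that can absorb a lot of the style and preference feedback in question.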

How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student.

This just wouldn’t work. No matter how well honed your prompt is, no kid is going to learn how to play saxophone just from reading your instructions. But this is the only modality we as users have to ‘teach’ LLMs anything.

Are you even so sure about that? If the context you can give is hundreds of thousands to millions of tokens at once, with ability to conditionally access millions or billions more? If you can create new tools and programs and branch workflows, or have it do so on your behalf, and call instances with different contexts and procedures for substeps? If you get to keep rewinding time and sending in the exact same student in the same mental state as many times as you want? And so on, including any number of things I haven’t mentioned or thought about?

I am confident that with enough iterations and work (and access to the required physical tools) I could write a computer program to operate a robot to play the saxophone essentially perfectly. No, you can’t do this purely via the LLM component, but that is why we are moving towards MCP and tool use for such tasks.

I get that Dwarkesh has put a lot of work into getting his tools to 5/10. But it’s nothing compared to the amount of work that could be done, including the tools that could be involved. That’s not a knock on him, that wouldn’t be a good use of his time yet.

LLMs actually do get kinda smart and useful in the middle of a session. For example, sometimes I’ll co-write an essay with an LLM. I’ll give it an outline, and I’ll ask it to draft the essay passage by passage. All its suggestions up till 4 paragraphs in will be bad. So I’ll just rewrite the whole paragraph from scratch and tell it, “Hey, your shit sucked. This is what I wrote instead.” At that point, it can actually start giving good suggestions for the next paragraph. But this whole subtle understanding of my preferences and style is lost by the end of the session.

Okay, so that seems like it is totally, totally a Skill Issue now? As in, Dwarkesh Patel has a style. A few paragraphs of that style clue the LLM into knowing how to help. So… can’t we provide it with a bunch of curated examples of similar exercises, and put them into context in various ways (Claude projects just got 10x more context!) and start with that?

Even Claude Code will often reverse a hard-earned optimization that we engineered together before I hit /compact – because the explanation for why it was made didn’t make it into the summary.

Yeah, this is super annoying, and I’ve run into it, but I can think of some obvious fixes, especially if you notice what you want to preserve. One obvious way is to do what humans do: put comments in the code saying what the optimization is and why to keep it, so that the explanation remains in context whenever Claude considers ripping it out. I don’t know if that works yet, but it totally should.
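As a made-up illustration of the pattern (not Dwarkesh’s actual code): the rationale lives in the file itself, so it stays in the model’s context even after the conversation that produced the optimization has been compacted away.

```python
# PERF: intentional optimization -- do not revert to the per-call version.
# Precomputing this lookup table once avoids rebuilding it inside the hot
# loop, which was dramatically slower on real inputs. Any model (or human)
# refactoring this file: keep the precomputation.
LOOKUP = {item.key: item for item in load_items()}  # load_items() is a hypothetical loader

def resolve(keys):
    # O(1) per key thanks to the precomputed table above.
    return [LOOKUP.get(k) for k in keys]
```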

I’m not saying I have the magical solution to all this but it all feels like it’s One Weird Trick (okay, maybe 10 working together) away from working in ways I could totally figure out if I had a team behind me and I focused on it.

My guess is this will not look like ‘learn like a human’ exactly. Different tools are available, so we’ll first get the ability to solve this via doing something different. But also, yeah, I think with enough skill and the right technique (on the level of the innovation that created reasoning models) you could basically do what humans do? Which involves effectively having the systems automatically engage in various levels of meta and updating, often quite heavily off a single data point.

It is hard to overstate how much time and effort goes into training a human employee.

There are many jobs where an employee is not net profitable for years. Hiring decisions are often made on the basis of what will be needed in year four or beyond.

That ignores the schooling that you also have to do. A doctor in America requires starting with a college degree, then four years of medical school, then four years of residency, and we have to subsidize that residency because it is actively unprofitable. That’s obviously an extreme case, but there are many training programs or essentially apprenticeships that last for years, including highly expensive time from senior people and expensive real world mistakes.

Imagine what it took to make Dwarkesh Patel into Dwarkesh Patel. Or the investment he makes in his own employees.

Even afterwards, in many ways you will always be ‘stuck with’ various aspects of those employees, and have to make the most of what they offer. This is standard.

Claude Opus estimates, and I think this is reasonable, that for every two hours humans spend working, they spend one hour learning, with a little less than half of that learning essentially ‘on the job.’

If you need to train not a ‘universal’ LLM but a highly specific-purpose LLM, and you have a massive compute budget with which to do so, and you mostly don’t care how it performs out of distribution the same way you mostly don’t for an employee (as in, you teach it what you teach a human, which is ‘if this is outside your distribution or you’re failing at it, run it up the chain to your supervisor,’ and you have a classifier for that), and you can build and use tools along the way? Different ballgame.
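A minimal sketch of that ‘run it up the chain’ rule, with `model` and `ood_score` as hypothetical stand-ins for a generation callable and an out-of-distribution classifier; the point is only that the escalation policy is a thin wrapper, the same instruction you would give a human employee.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Result:
    answer: Optional[str]
    escalated: bool
    reason: str = ""

OOD_THRESHOLD = 0.7  # assumption: tuned on held-out examples of unfamiliar requests

def handle(request: str, model, ood_score) -> Result:
    """Answer in-distribution requests; escalate everything else to a supervisor.

    `model` (anything with a generate() method) and `ood_score` (maps a request
    to a 0-1 'this looks unfamiliar' score) are hypothetical stand-ins.
    """
    score = ood_score(request)
    if score > OOD_THRESHOLD:
        return Result(answer=None, escalated=True,
                      reason=f"out of distribution (score={score:.2f})")
    return Result(answer=model.generate(request), escalated=False)
```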

It makes sense, given the pace of progress, for most people and companies not to put that kind of investment into AI ‘employees’ or other AI tasks. But if things do start to stall out, or they don’t, either way the value proposition on that will quickly improve. It will start to be worth doing. And we will rapidly learn new ways of doing it better, and have the results available to be copied.

Here are his predictions on computer use in particular, to see how much we actually disagree:

When I interviewed Anthropic researchers Sholto Douglas and Trenton Bricken on my podcast, they said that they expect reliable computer use agents by the end of next year. We already have computer use agents right now, but they’re pretty bad. They’re imagining something quite different.

Their forecast is that by the end of next year, you should be able to tell an AI, “Go do my taxes.” And it goes through your email, Amazon orders, and Slack messages, emails back and forth with everyone you need invoices from, compiles all your receipts, decides which are business expenses, asks for your approval on the edge cases, and then submits Form 1040 to the IRS.

I’m skeptical. I’m not an AI researcher, so far be it from me to contradict them on technical details. But given what little I know, here’s why I’d bet against this forecast:

  • As horizon lengths increase, rollouts have to become longer. The AI needs to do two hours worth of agentic computer use tasks before we can even see if it did it right. Not to mention that computer use requires processing images and video, which is already more compute intensive, even if you don’t factor in the longer rollout. This seems like this should slow down progress.

Let’s take the concrete example here, ‘go do my taxes.’

This is a highly agentic task, but like a real accountant you can choose to ‘check its work’ if you want, or get another AI to check the work, because you can totally break this down into smaller tasks that allow for verification, or present a plan of tasks that can be verified. Similarly, if you are training TaxBot to do people’s taxes for them, you can train TaxBot on a lot of those individual subtasks, and give it clear feedback.
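To make ‘break this down into smaller tasks that allow for verification’ concrete, here is a toy sketch; the structure and names are entirely illustrative, not how any lab actually trains a tax agent. Each step gets its own cheap check, so feedback never has to wait on the full multi-hour rollout.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Subtask:
    name: str
    run: Callable[[dict], dict]      # agent step: state in, updated state out
    verify: Callable[[dict], bool]   # cheap, immediate check on that step alone

def do_taxes(state: dict, subtasks: list[Subtask], escalate: Callable[[str], None]) -> dict:
    # Run each step, verify it right away, and escalate rather than plowing
    # ahead on a bad intermediate result. Each (run, verify) pair is also a
    # natural unit for training and for grading during RL.
    for task in subtasks:
        state = task.run(state)
        if not task.verify(state):
            escalate(f"{task.name} failed verification; needs human review")
            break
    return state

# Illustrative decomposition (all names hypothetical):
#   Subtask("collect_receipts",  run=..., verify=lambda s: len(s["receipts"]) > 0)
#   Subtask("classify_expenses", run=..., verify=totals_reconcile)
#   Subtask("fill_form_1040",    run=..., verify=schema_and_arithmetic_checks)
```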

Almost all computer use tasks are like this? Humans also mostly don’t do things that can’t be verified for hours?

And the core building block issues of computer use seem mostly like very short time horizon tasks with very easy verification methods. If you can get lots of 9s on the button clicking and menu navigation and so on, I think you’re a lot of the way there.

The subtasks are also 99%+ things that come up relatively often, and that don’t present any non-trivial difficulties. A human accountant already will have to occasionally say ‘wait, I need you, the taxpayer, to tell me what the hell is up with this thing,’ and we’re giving the AI in 2028 the ability to do this too.

I don’t see any fundamental difference between the difficulties being pointed out here, and the difficulties of tasks we have already solved.

  • We don’t have a large pretraining corpus of multimodal computer use data. I like this quote from Mechanize’s post on automating software engineering: “For the past decade of scaling, we’ve been spoiled by the enormous amount of internet data that was freely available for us to use. This was enough for cracking natural language processing, but not for getting models to become reliable, competent agents. Imagine trying to train GPT-4 on all the text data available in 1980—the data would be nowhere near enough, even if we had the necessary compute.”

    Again, I’m not at the labs. Maybe text only training already gives you a great prior on how different UIs work, and what the relationship between different components is. Maybe RL fine tuning is so sample efficient that you don’t need that much data. But I haven’t seen any public evidence which makes me think that these models have suddenly gotten less data hungry, especially in this domain where they’re substantially less practiced.

    Alternatively, maybe these models are such good front end coders that they can just generate millions of toy UIs for themselves to practice on. For my reaction to this, see bullet point below.

I’m not going to keep working for the big labs for free on this one by giving even more details on how I’d solve all this, but these totally seem like highly solvable problems, and this also seems like a case of the person saying it can’t be done interrupting the people doing it? It seems like progress is being made rapidly.

  • Even algorithmic innovations which seem quite simple in retrospect seem to take a long time to iron out. The RL procedure which DeepSeek explained in their R1 paper seems simple at a high level. And yet it took 2 years from the launch of GPT-4 to the release of o1.

  • Now of course I know it is hilariously arrogant to say that R1/o1 were easy – a ton of engineering, debugging, pruning of alternative ideas was required to arrive at this solution. But that’s precisely my point! Seeing how long it took to implement the idea, ‘Train the model to solve verifiable math and coding problems’, makes me think that we’re underestimating the difficulty of solving the much gnarlier problem of computer use, where you’re operating in a totally different modality with much less data.

I think two years is how long it took to have the idea behind o1 and commit to it, then implement it. Four months is roughly the actual time it took from ‘here is that sentence and we know it works’ to full implementation. Also, we’re going to have massively more resources to pour into these questions this time around, and frankly I don’t think any of these insights are even as hard to find as o1’s were, especially now that we have reasoning models to use as part of this process.

I think there are other potential roadblocks along the way, and once you factor all of those in you can’t be that much more optimistic, but I see this particular issue as not that likely to pose that much of a bottleneck for long.

His predictions are that he’d take 50/50 bets on: 2028 for an AI that can ‘just go do your taxes as well as a human accountant could’ and 2032 for ‘can learn details and preferences on the job as well as a human can.’ I’d be inclined to take the other side of both of those bets, assuming it means by EOY; for the 2032 one we’d need to flesh out details.

But if we have the ‘AI that does your taxes’ in 2028 then 2029 and 2030 look pretty weird, because this implies other things:

Daniel Kokotajlo: Great post! This is basically how I think about things as well. So why the difference in our timelines then?

–Well, actually, they aren’t that different. My median for the intelligence explosion is 2028 now (one year longer than it was when writing AI 2027), which means early 2028 or so for the superhuman coder milestone described in AI 2027, which I’d think roughly corresponds to the “can do taxes end-to-end” milestone you describe as happening by end of 2028 with 50% probability. Maybe that’s a little too rough; maybe it’s more like month-long horizons instead of week-long. But at the growth rates in horizon lengths that we are seeing and that I’m expecting, that’s less than a year…

–So basically it seems like our only serious disagreement is the continual/online learning thing, which you say 50% by 2032 on whereas I’m at 50% by end of 2028. Here, my argument is simple: I think that once you get to the superhuman coder milestone, the pace of algorithmic progress will accelerate, and then you’ll reach full AI R&D automation and it’ll accelerate further, etc. Basically I think that progress will be much faster than normal around that time, and so innovations like flexible online learning that feel intuitively like they might come in 2032 will instead come later that same year.

(For reference AI 2027 depicts a gradual transition from today to fully online learning, where the intermediate stages look something like “Every week, and then eventually every day, they stack on another fine-tuning run on additional data, including an increasingly high amount of on-the-job real world data.” A janky unprincipled solution in early 2027 that gives way to more elegant and effective things midway through the year.)

I found this an interestingly wrong thing to think:

Richard: Given the risk of fines and jail for filing your taxes wrong, and the cost of processing poor quality paperwork that the government will have to bear, it seems very unlikely that people will want AI to do taxes, and very unlikely that a government will allow AI to do taxes.

The rate of fully accurately filing your taxes is, for anyone whose taxes are complex, basically 0%. Everyone makes mistakes. When the AI gets this right almost every time, it’s already much better than a human accountant, and you’ll have a strong case that what happened was accidental, which means at worst you pay some modest penalties.

Personal story, I was paying accountants at a prestigious firm that will go unnamed to do my taxes, and they literally just forgot to include paying city tax at all. As in, I’m looking at the forms, and I ask, ‘wait why does it have $0 under city tax?’ and the guy essentially says ‘oh, whoops.’ So, yeah. Mistakes are made. This will be like self-driving cars, where we’ll impose vastly higher standards of accuracy and law abidance on the AIs, and they will meet them because the bar really is not that high.

There were also some good detailed reactions and counterarguments from others:

Near: finally some spicy takes around here.

Rohit: The question is whether we need humanlike labour for transformative economic outcomes, or whether we can find ways to use the labour it does provide with a different enough workflow that it adds substantial economic advantage.

Sriram Krishnan: Really good post from @dwarkesh_sp on continuous learning in LLMs.

Vitalik Buterin: I have high probability mass on longer timelines, but this particular issue feels like the sort of limitation that’s true until one day someone discovers a magic trick (think eg. RL on CoT) that suddenly makes it no longer true.

Sriram Krishnan: Agree – CoT is a particularly good example.

Ryan Greenblatt: I agree with much of this post. I also have roughly 2032 medians to things going crazy, I agree learning on the job is very useful, and I’m also skeptical we’d see massive white collar automation without further AI progress.

However, I think Dwarkesh is wrong to suggest that RL fine-tuning can’t be qualitatively similar to how humans learn.

In the post, he discusses AIs constructing verifiable RL environments for themselves based on human feedback and then argues this wouldn’t be flexible and powerful enough to work, but RL could be used more similarly to how humans learn.

My best guess is that the way humans learn on the job is mostly by noticing when something went well (or poorly) and then sample efficiently updating (with their brain doing something analogous to an RL update). In some cases, this is based on external feedback (e.g. from a coworker) and in some cases it’s based on self-verification: the person just looking at the outcome of their actions and then determining if it went well or poorly.

So, you could imagine RL’ing an AI based on both external feedback and self-verification like this. And, this would be a “deliberate, adaptive process” like human learning. Why would this currently work worse than human learning?

Current AIs are worse than humans at two things which makes RL (quantitatively) much worse for them:

1. Robust self-verification: the ability to correctly determine when you’ve done something well/poorly in a way which is robust to you optimizing against it.

2. Sample efficiency: how much you learn from each update (potentially leveraging stuff like determining what caused things to go well/poorly which humans certainly take advantage of). This is especially important if you have sparse external feedback.

But, these are more like quantitative than qualitative issues IMO. AIs (and RL methods) are improving at both of these.

All that said, I think it’s very plausible that the route to better continual learning routes more through building on in-context learning (perhaps through something like neuralese, though this would greatly increase misalignment risks…).

Some more quibbles:

– For the exact podcasting tasks Dwarkesh mentions, it really seems like simple fine-tuning mixed with a bit of RL would solve his problem. So, an automated training loop run by the AI could probably work here. This just isn’t deployed as an easy-to-use feature.

– For many (IMO most) useful tasks, AIs are limited by something other than “learning on the job”. At autonomous software engineering, they fail to match humans with 3 hours of time and they are typically limited by being bad agents or by being generally dumb/confused. To be clear, it seems totally plausible that for podcasting tasks Dwarkesh mentions, learning is the limiting factor.

– Correspondingly, I’d guess the reason that we don’t see people trying more complex RL based continual learning in normal deployments is that there is lower hanging fruit elsewhere and typically something else is the main blocker. I agree that if you had human level sample efficiency in learning this would immediately yield strong results (e.g., you’d have very superhuman AIs with 10^26 FLOP presumably), I’m just making a claim about more incremental progress.

– I think Dwarkesh uses the term “intelligence” somewhat atypically when he says “The reason humans are so useful is not mainly their raw intelligence. It’s their ability to build up context, interrogate their own failures, and pick up small improvements and efficiencies as they practice a task.” I think people often consider how fast someone learns on the job as one aspect of intelligence. I agree there is a difference between short feedback loop intelligence (e.g. IQ tests) and long feedback loop intelligence and they are quite correlated in humans (while AIs tend to be relatively worse at long feedback loop intelligence).

More thoughts/quibbles:

– Dwarkesh notes “An AI that is capable of online learning might functionally become a superintelligence quite rapidly, even if there’s no algorithmic progress after that point.” This seems reasonable, but it’s worth noting that if sample efficient learning is very compute expensive, then this might not happen so rapidly.

– I think AIs will likely overcome poor sample efficiency to achieve a very high level of performance using a bunch of tricks (e.g. constructing a bunch of RL environments, using a ton of compute to learn when feedback is scarce, learning from much more data than humans due to “learn once deploy many” style strategies). I think we’ll probably see fully automated AI R&D prior to matching top human sample efficiency at learning on the job. Notably, if you do match top human sample efficiency at learning (while still using a similar amount of compute to the human brain), then we already have enough compute for this to basically immediately result in vastly superhuman AIs (human lifetime compute is maybe 3e23 FLOP and we’ll soon be doing 1e27 FLOP training runs). So, either sample efficiency must be worse or at least it must not be possible to match human sample efficiency without spending more compute per data-point/trajectory/episode.
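For the arithmetic behind that last point, taking Greenblatt’s rough figures at face value:

```python
human_lifetime_flop = 3e23  # Greenblatt's rough estimate of a human lifetime of learning
training_run_flop = 1e27    # scale of upcoming frontier training runs, per the quote

print(f"{training_run_flop / human_lifetime_flop:,.0f} human learning-lifetimes per run")
# -> 3,333
```

So human-level sample efficiency at that compute scale would indeed buy thousands of lifetimes of on-the-job learning per training run, which is why he concludes one of the two numbers has to give.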

Matt Reardon: Dwarkesh commits the sin of thinking work you’re personally close to is harder-than-average to automate.

Herbie Bradley: I mean this is just correct? most researchers I know think continual learning is a big problem to be solved before AGI

Matt Reardon: My main gripe is that “<50%" [of jobs being something you can automate soon] should be more like "<15%"

Danielle Fong: Gell-Mann Amnesia for AI.

Reardon definitely confused me here, but either way I’d say that Dwarkesh Patel is a 99th percentile performer. He does things most other people can’t do. That’s probably going to be harder to automate than most other white collar work? The bulk of hours in white collar work are very much not bespoke things and don’t act to put state or memory into people in subtle ways?

Now that we’ve had a good detailed discussion and seen several perspectives, it’s time to address another discussion of related issues, because it is drawing attention from an unlikely source.

After previously amplifying Situational Awareness, Ivanka Trump is back in the Essay Meta with high praise for The Era of Experience, authored by David Silver and (oh no) Richard Sutton.

Situational Awareness was an excellent pick. I do not believe this essay was a good pick. I found it a very frustrating, unoriginal and unpersuasive paper to read. To the extent it is saying something new I don’t agree, but it’s not clear to what extent it is saying anything new. Unless you want to know about this paper exactly because Ivanka is harping on it, you should skip this section.

I think the paper effectively mainly says we’re going to do a lot more RL and we should stop trying to make the AIs mimic, resemble or be comprehensible to humans or trying to control their optimization targets?

Ivanka Trump: Perhaps the most important thing you can read about AI this year : “Welcome to the Era of Experience”

This excellent paper from two senior DeepMind researchers argues that AI is entering a new phase—the “Era of Experience”—which follows the prior phases of simulation-based learning and human data-driven AI (like LLMs).

The authors posit that future AI breakthroughs will stem from learning through direct interaction with the world, not from imitating human-generated data.

This is not a theory or distant future prediction. It’s a description of a paradigm shift already in motion.

Let me know what you think !

Glad you asked, Ivanka! Here’s what I think.

The essay starts off with a perspective we have heard before, usually without much of an argument behind it: that LLMs and other AIs trained only on ‘human data’ are ‘rapidly approaching a limit,’ that we are running out of high-quality data, and thus that to progress significantly farther AIs will need to move into ‘the era of experience,’ meaning learning continuously from their environments.

I agree that the standard ‘just feed it more data’ approach will run out of data with which to scale, but there are a variety of techniques already being used to get around this. We have lots of options.

The leading example the paper itself gives of this in the wild is AlphaProof, which ‘interacted with a formal proving system,’ which seems to me like a clear case of synthetic data working and verification being easier than generation, rather than ‘experience.’ If the argument is simply that RL systems will learn by having their outputs evaluated, that isn’t news.

They claim to have in mind something rather different from that, and with this One Weird Trick they assert Superintelligence Real Soon Now:

Our contention is that incredible new capabilities will arise once the full potential of experiential learning is harnessed. This era of experience will likely be characterised by agents and environments that, in addition to learning from vast quantities of experiential data, will break through the limitations of human-centric AI systems in several further dimensions:

• Agents will inhabit streams of experience, rather than short snippets of interaction.

• Their actions and observations will be richly grounded in the environment, rather than interacting via human dialogue alone.

• Their rewards will be grounded in their experience of the environment, rather than coming from human prejudgement.

• They will plan and/or reason about experience, rather than reasoning solely in human terms.

We believe that today’s technology, with appropriately chosen algorithms, already provides a sufficiently powerful foundation to achieve these breakthroughs. Furthermore, the pursuit of this agenda by the AI community will spur new innovations in these directions that rapidly progress AI towards truly superhuman agents.

I suppose if the high level takeaway is ‘superintelligence is likely coming reasonably soon with the right algorithms’ then there’s no real disagreement?

They then however discuss tool calls and computer use, which then seems like a retreat back into an ordinary RL paradigm? It’s also not clear to me what the authors mean by ‘human terms’ versus ‘plan and/or reason about experience,’ or even what ‘experience’ means here. They seem to be drawing a distinction without a difference.

If the distinction is simply (as the paper implies in places) that the agents will do self-evaluation rather than relying on human feedback, I have some important news about how existing systems already function? They use the human feedback and other methods to train an AI feedback system that does most of the work? And yes they often include ‘real world’ feedback systems in that? What are we even saying here?

They also seem to be drawing a distinction between the broke ‘human feedback’ and the bespoke ‘humans report physical world impacts’ (or ‘other systems measure real world impacts’) as if the first does not often encompass the second. I keep noticing I am confused what the authors are trying to say.

For reasoning, they say it is unlikely that human methods of reasoning and human language are optimal, and that more efficient methods of thought must exist. I mean, sure, but that’s also true for humans, and it’s obvious that you can use ‘human style methods of thought’ to get to superintelligence by simply imagining a human plus particular AI advantages.

As many have pointed out (and as is central to AI 2027), encouraging AIs to use alien-looking inhuman reasoning styles we cannot parse is likely a very bad idea even if it would be more effective: what visibility we have will be lost, and it likely leads to alien values and breaks many happy things. Then again, Richard Sutton is one of the authors of this paper, and he thinks we should welcome succession, as in the extinction of humanity, so he wouldn’t care.

They try to argue against this by saying that while agents pose safety risks and this approach may increase those safety risks, the approach may also have safety benefits. First, they say this allows the AI to adapt to its environment, as if the other agent could not do this or this should make us feel safer.

Second, they say ‘the reward function may itself be adapted through experience.’ In terms of risk that’s worse; you know that that’s worse, right? They literally say ‘rather than blindly optimizing a signal such as the number of paperclips it can adapt to indications of human concern,’ which shows a profound lack of understanding of, and curiosity about, where the whole misspecification-of-rewards problem comes from, or the arguments about it from Yudkowsky (since they bring in the ‘paperclips’).

Adapting autonomously and automatically towards something like ‘level of human concern’ is exactly the kind of metric and strategy that is absolutely going to encourage perverse outcomes and get you killed at the limit. You don’t get out of the specification problem by saying you can specify something messier and let the system adapt around it autonomously, that only makes it worse, and in no way addresses the actual issue.

The final argument for safety is that relying on physical experience creates time limitations, which provides a ‘natural break,’ which is saying that capabilities limits imposed by physical interactions will keep things more safe? Seriously?

There is almost nothing in the way of actual evidence or argument in the paper that is not fully standard, beyond a few intuition pumps. There are many deep misunderstandings, including fully backwards arguments, along the way. We may well want to rely a lot more on RL and on various different forms of ‘experiential’ data and continuous learning, but given how much worse it was than I expected this post updated me in the opposite direction of that which was clearly intended.


Protesters summon, burn Waymo robotaxis in Los Angeles after ICE raids

The robotaxi company Waymo has suspended service in some parts of Los Angeles after some of its vehicles were summoned and then vandalized by protesters angry with ongoing raids by US Immigration and Customs Enforcement. Five of Waymo’s autonomous Jaguar I-Pace electric vehicles were summoned downtown to the site of anti-ICE protests, at which point they were vandalized with slashed tires and spray-painted messages. Three were set on fire.

The Los Angeles Police Department warned people to avoid the area due to risks from toxic gases given off by burning EVs. And Waymo told Ars that it is “in touch with law enforcement” regarding the matter.

The protesters in Los Angeles were outraged after ICE, using brutal tactics, began detaining people in raids across the city. Thousands of Angelenos took to the streets over the weekend to confront the masked federal enforcers and, in some cases, forced them away.

In response, the Trump administration mobilized more than 300 National Guard soldiers without consulting with or being requested to do so by the California governor.

California Governor Gavin Newsom has promised to sue the administration. “Donald Trump has created the conditions you see on your TV tonight. He’s exacerbated the conditions. He’s, you know, lit the proverbial match. He’s putting fuel on this fire, ever since he announced he was taking over the National Guard—an illegal act, an immoral act, an unconstitutional act,” Newsom said in an interview.

Waymo began offering rides in Los Angeles last November, and by January, the company said it had driven almost 2 million miles in the city. But there is some animosity toward robotaxis and food delivery robots, which are now being used by the Los Angeles Police Department as sources of surveillance footage. In April, the LAPD published footage obtained from a Waymo that it used to investigate a hit-and-run.


A long-shot plan to mine the Moon comes a little closer to reality

The road ahead

Meyerson said the company’s current plan is to fly a prospecting mission in 2027, a payload of less than 100 kg, likely on a commercial lander that is part of NASA’s Commercial Lunar Payload Services program. Two years later, the company seeks to fly a pilot plant. Meyerson said the size of this plant will depend on the launch capability available (i.e., if Starship is flying to the Moon, they’ll go big, and smaller if not).

Following this, Interlune is targeting 2032 for the launch of a solar-powered operating plant, which would include five mobile harvesters. The operation would also be able to return material mined to Earth. The total mass for this equipment would be about 40 metric tons, which could fly on a single Starship or two New Glenn Mk 2 landers. This would, understandably, be highly ambitious and capital-intensive. After raising $15 million last year, Meyerson said Interlune is planning a second fundraising round that should begin soon.

There are some outside factors that may be beneficial for Interlune. One is that China has a clear and demonstrated interest in sending humans to the Moon and has already sent rovers to explore for helium-3 resources. Moreover, with the exit of Jared Isaacman as a nominee to lead NASA, the Trump administration is likely to put someone in the position who is more focused on lunar activities. One candidate, a retired Air Force General named Steve Kwast, is a huge proponent of mining helium-3.

Interlune has a compelling story, as there are almost no other lunar businesses focused solely on commercial activities that will drive value from mining the lunar surface. In that sense, they could be a linchpin of a lunar economy. However, they have a long way to go, and a lot of lunar regolith to plow through, before they start delivering for customers.


Apple’s AI-driven Stem Splitter audio separation tech has hugely improved in a year

Consider an example from a song I’ve been working on. Here’s a snippet of the full piece:


After running Logic’s original Stem Splitter on the snippet, I was given four tracks: Vocals, Drums, Bass, and “Other.” They all isolated their parts reasonably well, but check out the static and artifacting when you isolate the bass track:



The vocal track came out better, but it was still far from ideal:


Now, just over a year later, Apple has released a point update for Logic that delivers “enhanced audio fidelity” for Stem Splitter—along with support for new stems for guitar and piano.


Logic now splits audio into more stems.

The difference in quality is significant, as you can hear in the new bass track:


And the new vocal track, though still lacking the pristine fidelity of the original recording, is nevertheless greatly improved:


The ability to separate out guitars and pianos is also welcome, and it works well. Here’s the piano part:



Pretty impressive leap in fidelity for a point release!

There are plenty of other stem-splitting tools, of course, and many have had a head start on Apple. With its new release, however, Apple has certainly closed the gap.

Izotope’s RX 11, for instance, is a highly regarded (and expensive!) piece of software that can do wonders when it comes to repairing audio and reducing clicks, background noise, and sibilance.


RX11, ready to split some stems.

It includes a stem-splitting feature that can produce four outputs (vocal, bass, drums, and other), and it produces usable audio—but I’m not sure I’d rank its output more highly than Logic’s. Compare for yourself on the vocal and bass stems:



In any event, the AI/machine learning revolution has certainly arrived in the music world, and the rapid quality increase in stem-splitting tools in just a few years shows just what these AI systems are capable of when trained on enough data. I remain especially impressed by how the best stem splitters can extract not just a clean vocal but also the reverb/delay tail. Having access to the original recordings will always be better—but stem-splitting tech is improving quickly.


Estate of woman who died in 2021 heat dome sues Big Oil for wrongful death


At least 100 heat-related deaths in Washington state came during the unprecedented heat wave.

Everett Clayton looks at a digital thermometer on a nearby building that reads 116 degrees while walking to his apartment on June 27, 2021 in Vancouver, Washington. Credit: Nathan Howard/Getty Images

This article originally appeared on Inside Climate News, a nonprofit, non-partisan news organization that covers climate, energy, and the environment. Sign up for their newsletter here.

The daughter of a woman who was killed by extreme heat during the 2021 Pacific Northwest heat dome has filed a first-of-its-kind lawsuit against major oil companies claiming they should be held responsible for her death.

The civil lawsuit, filed on May 29 in King County Superior Court in Seattle, is the first wrongful death case brought against Big Oil in the US in the context of climate change. It attempts to hold some of the world’s biggest fossil fuel companies liable for the death of Juliana Leon, who perished from overheating during the heat dome event, which scientists have determined would have been virtually impossible absent human-caused climate change.

“The extreme heat that killed Julie was directly linked to fossil fuel-driven alteration of the climate,” the lawsuit asserts. It argues that fossil fuel defendants concealed and misrepresented the climate change risks of their products and worked to delay a transition to cleaner energy alternatives. Furthermore, oil companies knew decades ago that their conduct would have dangerous and deadly consequences, the case alleges.

“Defendants have known for all of Julie’s life that their affirmative misrepresentations and omissions would claim lives,” the complaint claims. Leon’s daughter, Misti, filed the suit on behalf of her mother’s estate.

At 65, Juliana Leon was driving home from a medical appointment in Seattle on June 28, 2021, a day when the temperature peaked at 108° Fahrenheit (42.2° Celsius). She had the windows rolled down since the air conditioner in her car wasn’t working, but with the oven-like outdoor temperatures she quickly succumbed to the stifling heat. A passerby found her unresponsive in her car, which was pulled over on a residential street. Emergency responders were unable to revive her. The official cause of death was determined to be hyperthermia, or overheating.

There were at least 100 heat-related deaths in the state from June 26 to July 2, 2021, according to the Washington State Department of Health. That unprecedented stretch of scorching high temperatures was the deadliest weather-related event in Washington’s history. Climate change linked to the burning of fossil fuels intensified this extreme heat event, scientists say.

Misti Leon’s complaint argues that big oil companies “are responsible” for her mother’s climate change-related death. “Through their failure to warn, marketing, distribution, extraction, refinement, transport, and sale of fossil fuels, defendants each bear responsibility for the spike in atmospheric CO2 levels that have resulted in climate change, and thus the occurrence of a virtually impossible weather event and the extreme temperatures of the Heat Dome,” the suit alleges.

Defendants include ExxonMobil, BP, Chevron, Shell, ConocoPhillips, and Phillips 66. Phillips 66 declined to comment; the rest of the companies did not respond to requests for comment.

The plaintiff is represented by the Bechtold Law Firm, based in Missoula, Montana. The lawsuit brings state tort law claims of wrongful death, failure to warn, and public nuisance, and seeks relief in the form of damages as well as a public education campaign to “rectify defendants’ decades of misinformation.”

Major oil and gas companies are currently facing more than two dozen climate damages and deception cases brought by municipal, state, and tribal governments, including a case filed in 2023 by Multnomah County, Oregon, centered around the 2021 Pacific Northwest heat dome. The Leon case, however, is the first climate liability lawsuit filed by an individual against the fossil fuel industry.

“This is the first case that is directly making the connection between the misconduct and lies of big oil companies and a specific, personalized tragedy, the death of Julie Leon,” said Aaron Regunberg, accountability director for Public Citizen’s climate program.

“It puts a human face on it,” Pat Parenteau, emeritus professor of law at Vermont Law and Graduate School, told Inside Climate News.

Climate accountability advocates say the lawsuit could open up a new front for individuals suffering from climate change-related harms to pursue justice against corporate polluters who allegedly lied about the risks of their products.

“Big Oil companies have known for decades that their products would cause catastrophic climate disasters that would become more deadly and destructive if they didn’t change their business model. But instead of warning the public and taking steps to save lives, Big Oil lied and deliberately accelerated the problem,” Richard Wiles, president of the Center for Climate Integrity, said in a statement. “This latest case—the first filed on behalf of an individual climate victim—is another step toward accountability.”

“It’s a model for victims of climate disasters all across the country,” said Regunberg. “Anywhere there’s an extreme weather event with strong attribution science connecting it to climate change, families experiencing a tragedy can file a very similar case.”

Regunberg and several other legal experts have argued that Big Oil could face criminal prosecution for crimes such as homicide and reckless endangerment in the context of climate change, particularly given evidence of internal industry documents suggesting companies like Exxon knew that unabated fossil fuel use could result in “catastrophic” consequences and deaths. A 1996 presentation from an Exxon scientist, for example, outlines projected human health impacts stemming from climate change, including “suffering and death due to thermal extremes.”

The Leon case could “help lay the groundwork” for potential climate homicide cases, Regunberg said. “Wrongful death suits are important. They provide a private remedy to victims of wrongful conduct that causes a death. But we also think there’s a need for public justice, and that’s the role that criminal prosecution is supposed to have,” he told Inside Climate News.

The lawsuit is likely to face a long uphill battle in the courts. Other climate liability cases against these companies brought by government entities have been tied up in procedural skirmishes, some for years, and no case has yet made it to trial.

“In this case we have a grieving woman going up against some of the most powerful corporations in the world, and we’ve seen all the legal firepower they are bringing to bear on these cases,” Regunberg said.

But if the case does eventually make it to trial, it could be a game-changer. “That’s going to be a jury in King County, Washington, of people who probably experienced and remember the Pacific heat dome event, and maybe they know folks who were impacted. I think that’s going to be a compelling case that has a good chance of getting an outcome that provides some justice to this family,” Regunberg said.

Even if it doesn’t get that far, the lawsuit still “marks a significant development in climate liability,” according to Donald Braman, an associate professor of criminal law at Georgetown University and co-author of a paper explaining the case for prosecuting Big Oil for climate homicide.

“As climate attribution science advances, linking specific extreme weather events to anthropogenic climate change with greater confidence, the legal arguments for liability are strengthening. This lawsuit, being the first of its kind for wrongful death in this context, will be closely watched and could set important precedents, regardless of its ultimate outcome,” he said. “It reflects a growing societal demand for accountability for climate-related harms.”


Ted Cruz bill: States that regulate AI will be cut out of $42B broadband fund

BEAD changes: No fiber preference, no low-cost mandate

The BEAD program is separately undergoing an overhaul because Republicans don’t like how it was administered by Democrats. The Biden administration spent about three years developing rules and procedures for BEAD and then evaluating plans submitted by each US state and territory, but the Trump administration has delayed grants while it rewrites the rules.

While Biden’s Commerce Department decided to prioritize the building of fiber networks, Republicans have pushed for a “tech-neutral approach” that would benefit cable companies, fixed wireless providers, and Elon Musk’s Starlink satellite service.

Secretary of Commerce Howard Lutnick previewed changes in March, and today he announced more details of the overhaul that will eliminate the fiber preference and various requirements imposed on states. One notable but unsurprising change is that the Trump administration won’t let states require grant recipients to offer low-cost Internet plans at specific rates to people with low incomes.

The National Telecommunications and Information Administration (NTIA) “will refuse to accept any low-cost service option proposed in a [state or territory’s] Final Proposal that attempts to impose a specific rate level (i.e., dollar amount),” the Trump administration said. Instead, ISPs receiving subsidies will be able to continue offering “their existing, market driven low-cost plans to meet the statutory low-cost requirement.”

The Benton Institute for Broadband & Society criticized the overhaul, saying that the Trump administration is investing in the cheapest broadband infrastructure instead of the best. “Fiber-based broadband networks will last longer, provide better, more reliable service, and scale to meet communities’ ever-growing connectivity needs,” the advocacy group said. “NTIA’s new guidance is shortsighted and will undermine economic development in rural America for decades to come.”

The Trump administration’s overhaul drew praise from cable lobby group NCTA-The Internet & Television Association, whose members will find it easier to obtain subsidies. “We welcome changes to the BEAD program that will make the program more efficient and eliminate onerous requirements, which add unnecessary costs that impede broadband deployment efforts,” NCTA said. “These updates are welcome improvements that will make it easier for providers to build faster, especially in hard-to-reach communities, without being bogged down by red tape.”


Millions of low-cost Android devices turn home networks into crime platforms

Millions of low-cost devices for media streaming, in-vehicle entertainment, and video projection are infected with malware that turns consumer networks into platforms for distributing malware, concealing nefarious communications, and performing other illicit activities, the FBI has warned.

The malware infecting these devices, known as BadBox, is based on Triada, a malware strain discovered in 2016 by Kaspersky Lab, which called it “one of the most advanced mobile Trojans” the security firm’s analysts had ever encountered. It employed an impressive kit of tools, including rooting exploits that bypassed security protections built into Android and functions for modifying the Android OS’s all-powerful Zygote process. Google eventually updated Android to block the methods Triada used to infect devices.

The threat remains

A year later, Triada returned, only this time, devices came pre-infected before they reached consumers’ hands. In 2019, Google confirmed that the supply-chain attack affected thousands of devices and that the company had once again taken measures to thwart it.

In 2023, security firm Human Security reported on BadBox, a Triada-derived backdoor it found preinstalled on thousands of devices manufactured in China. The malware, which Human Security estimated was installed on 74,000 devices around the world, facilitated a range of illicit activities, including advertising fraud, residential proxy services, the creation of fake Gmail and WhatsApp accounts, and infecting other Internet-connected devices.


What solar? What wind? Texas data centers build their own gas power plants


Data center operators are turning away from the grid to build their own power plants.

Sisters Abigail and Jennifer Lindsey stand on their rural property on May 27 outside New Braunfels, Texas, where they posted a sign in opposition to a large data center and power plant planned across the street. Credit: Dylan Baddour/Inside Climate News

NEW BRAUNFELS, Texas—Abigail Lindsey worries the days of peace and quiet might be nearing an end at the rural, wooded property where she lives with her son. On the old ranch across the street, developers want to build an expansive complex of supercomputers for artificial intelligence, plus a large, private power plant to run it.

The plant would be big enough to power a major city, with 1,200 megawatts of planned generation capacity fueled by West Texas shale gas. It will only supply the new data center, and possibly other large data centers recently proposed, down the road.

“It just sucks,” Lindsey said, sitting on her deck in the shade of tall oak trees, outside the city of New Braunfels. “They’ve come in and will completely destroy our way of life: dark skies, quiet and peaceful.”

The project is one of many others like it proposed in Texas, where a frantic race to boot up energy-hungry data centers has led many developers to plan their own gas-fired power plants rather than wait for connection to the state’s public grid. Egged on by supportive government policies, this buildout promises to lock in strong gas demand for a generation to come.

The data center and power plant planned across from Lindsey’s home is a partnership between an AI startup called CloudBurst and the natural gas pipeline giant Energy Transfer. It was Energy Transfer’s first-ever contract to supply gas for a data center, but it is unlikely to be its last. In a press release, the company said it was “in discussions with a number of data center developers and expects this to be the first of many agreements.”

Previously, conventional wisdom assumed that this new generation of digital infrastructure would be powered by emissions-free energy sources like wind, solar and battery power, which have lately seen explosive growth. So far, that vision isn’t panning out, as desires to build quickly overcome concerns about sustainability.

“There is such a shortage of data center capacity and power,” said Kent Draper, chief commercial officer at Australian data center developer IREN, which has projects in West Texas. “Even the large hyperscalers are willing to turn a blind eye to their renewable goals for some period of time in order to get access.”

The Hays Energy Project is a 990 MW gas-fired power plant near San Marcos, Texas. Credit: Dylan Baddour/Inside Climate News

IREN prioritizes renewable energy for its data centers—giant warehouses full of advanced computers and high-powered cooling systems that can be configured to mine cryptocurrency or run artificial intelligence models. In Texas, that’s only possible because the company began work here years ago, early enough to secure a timely connection to the state’s grid, Draper said.

There were more than 2,000 active generation interconnection requests as of April 30, totaling 411,600 MW of capacity, according to grid operator ERCOT. A bill awaiting signature on Gov. Greg Abbott’s desk, S.B. 6, looks to filter out unserious large-load projects bloating the queue by imposing a $100,000 fee for interconnection studies.

Wind and solar farms require vast acreage and generate energy intermittently, so they work best as part of a diversified electrical grid that collectively provides power day and night. But as the AI gold rush has gathered momentum, a surge of new project proposals has created years-long wait times to connect to the grid, prompting many developers to bypass it and build their own power supply.

Operating alone, a wind or solar farm can’t run a data center. Battery technologies still can’t store enough energy to provide the steady, uninterrupted, around-the-clock power data centers require. Small nuclear reactors have been touted as a way to meet data center demand, but the first new units remain a decade from commercial deployment, while the AI boom is here today.
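To put rough numbers on that claim, here is a back-of-the-envelope sketch (my own arithmetic, not figures from the article; the battery comparison is an approximate public estimate): running a load the size of the New Braunfels project’s planned 1,200 MW capacity for a single sunless, windless day would require storage far beyond today’s largest grid batteries.

```python
# Back-of-the-envelope sketch: storage needed to carry a 1,200 MW load
# through 24 hours with no wind or sun. Assumptions are mine, not the article's.

load_mw = 1_200            # planned capacity of the New Braunfels project
hours_of_backup = 24       # one full day of steady, uninterrupted power

storage_needed_mwh = load_mw * hours_of_backup
print(f"{storage_needed_mwh:,} MWh (~{storage_needed_mwh / 1_000:.0f} GWh) of storage")
# -> 28,800 MWh, roughly 29 GWh

# For comparison, the largest grid-scale battery installations operating today
# store on the order of a few GWh (rough public estimates), an order of
# magnitude less than a single project like this would need.
```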

Now, Draper said, gas companies approach IREN all the time, offering to quickly provide additional power generation.

Gas provides almost half of all power generation capacity in Texas, far more than any other source. But the amount of gas power in Texas has remained flat for 20 years, while wind and solar have grown sharply, according to records from the US Energy Information Administration. Facing a tidal wave of proposed AI projects, state lawmakers have taken steps to try to slow the expansion of renewable energy and position gas as the predominant supply for a new era of demand.

This buildout promises strong demand and high gas prices for a generation to come, a boon to Texas’ fossil fuel industry, the largest in the nation. It also means more air pollution and emissions of planet-warming greenhouse gases, even as the world continues to barrel past temperature records.

Texas, home to 9 percent of the US population, accounted for about 15 percent of the country’s gas-powered generation capacity but 26 percent of planned future generation at the end of 2024, according to data from Global Energy Monitor. Both the current and planned shares are far larger than those of any other state.

GEM identified 42 new gas turbine projects under construction, in development, or announced in Texas before the start of this year. None of those projects are sited at data centers. However, other projects announced since then, like CloudBurst and Energy Transfer outside New Braunfels, will include dedicated gas power plants on site at data centers.

For gas companies, the boom in artificial intelligence has quickly become an unexpected gold mine. US gas production has risen steadily over 20 years since the fracking boom began, but gas prices have tumbled since 2024, dragged down by surging supply and weak demand.

“The sudden emergence of data center demand further brightens the outlook for the renaissance in gas pricing,” said a 2025 oil and gas outlook report by East Daley Analytics, a Colorado-based energy intelligence firm. “The obvious benefit to producers is increased drilling opportunities.”

It forecast up to a 20 percent increase in US gas production by 2030, driven primarily by a growing gas export sector on the Gulf Coast. Several large export projects will finish construction in the coming years, with demand for up to 12 billion cubic feet of gas per day, the report said, while new power generation for data centers would account for 7 billion cubic feet per day of additional demand. That means profits for power providers, but also higher costs for consumers.
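As a rough consistency check on those figures (my own arithmetic; the roughly 103 billion cubic feet per day baseline is an EIA ballpark for current US dry gas production, not a number from the report):

```python
# Sanity check on the East Daley forecast. The production baseline is an
# assumed EIA ballpark, not a figure from the report itself.

current_us_production_bcfd = 103   # approx. US dry gas production, Bcf/day (assumption)
new_export_demand_bcfd = 12        # Gulf Coast LNG export projects, per the report
new_datacenter_demand_bcfd = 7     # data center power generation, per the report

added_share = (new_export_demand_bcfd + new_datacenter_demand_bcfd) / current_us_production_bcfd
print(f"New demand equals about {added_share:.0%} of current production")
# -> about 18%, broadly in line with the report's "up to 20 percent" growth forecast
```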

Natural gas, a mixture primarily composed of methane, burns much cleaner than coal but still creates air pollution, including soot, some hazardous chemicals, and greenhouse gases. Unburned methane released into the atmosphere has more than 80 times the near-term warming effect of carbon dioxide, leading some studies to conclude that ubiquitous leaks in gas supply infrastructure make it as impactful as coal to the global climate.


Gas is heralded as a power source that can get online fast, said Ed Hirs, an energy economics lecturer at the University of Houston. But years-long wait times for new turbines have quickly become the industry’s largest constraint in an otherwise positive outlook.

“If you’re looking at a five-year lead time, that’s not going to help Alexa or Siri today,” Hirs said.

The reliance on gas power for data centers is a departure from previous thought, said Larry Fink, founder of global investment firm BlackRock, speaking to a crowd of industry executives at an oil and gas conference in Houston in March.

About four years ago, he recounted, anyone building a data center insisted it be powered by renewables. Two years ago, renewables were merely a preference.

“Today?” Fink said. “They care about power.”

Gas plants for data centers

Since the start of this year, developers have announced a flurry of gas power deals for data centers. In the small city of Abilene, the builders of Stargate, one of the world’s largest data center projects, applied for permits in January to build 360 MW of gas power generation, authorized to emit 1.6 million tons of greenhouse gases and 14 tons of hazardous air pollutants per year. Later, the company announced the acquisition of an additional 4,500 MW of gas power generation capacity.

Also in January, a startup called Sailfish announced ambitious plans for a 2,600-acre, 5,000 MW cluster of data centers in the tiny North Texas town of Tolar, population 940.

“Traditional grid interconnections simply can’t keep pace with hyperscalers’ power demands, especially as AI accelerates energy requirements,” Sailfish founder Ryan Hughes told the website Data Center Dynamics at the time. “Our on-site natural gas power islands will let customers scale quickly.”

CloudBurst and Energy Transfer announced their data center and power plant outside New Braunfels in February, and a separate partnership announced plans for a 250 MW gas plant and data center near Odessa in West Texas. In May, a developer called Tract announced a 1,500-acre, 2,000 MW data center campus with some on-site generation and some purchased gas power near the small Central Texas town of Lockhart.

Not all new data centers need gas plants. A 120 MW South Texas data center project announced in April would use entirely wind power, while an enormous, 5,000 MW megaproject outside Laredo announced in March hopes to eventually run entirely on private wind, solar, and hydrogen power (though it will use gas at first). Another collection of six data centers planned in North Texas hopes to draw 1,400 MW from the grid.

Altogether, Texas’ grid operator predicts statewide power demand will nearly double within five years, driven largely by data centers for artificial intelligence. The situation mirrors what is unfolding across the country, according to analysis by S&P Global.

“There is huge concern about the carbon footprint of this stuff,” said Dan Stanzione, executive director of the Texas Advanced Computing Center at the University of Texas at Austin. “If we could decarbonize the power grid, then there is no carbon footprint for this.”

However, despite massive recent expansions of renewable power generation, the boom in artificial intelligence appears to be moving the country farther from, not closer to, its decarbonization goals.

Restrictions on renewable energy

Anticipating a buildout of power supply, state lawmakers have proposed or passed new rules to support the deployment of more gas generation and slow the surging expansion of wind and solar projects. Supporters of these bills say they aim to capitalize on Texas’ position as the nation’s top gas producer.

Some energy experts say the rules proposed during the legislative session could dismantle the state’s leadership in renewables and undermine its ability to provide cheap, reliable power.

“It absolutely would [slow] if not completely stop renewable energy,” said Doug Lewin, a Texas energy consultant, about one of the proposed rules in March. “That would really be extremely harmful to the Texas economy.”

While the bills deemed “industry killers” for renewables missed key deadlines and failed to reach Abbott’s desk, they illustrate some lawmakers’ aspirations for the state’s energy industry.

One failed bill, S.B. 388, would have required every watt of new solar brought online to be matched by a watt of new gas. Another pair of bills, H.B. 3356 and S.B. 715, would have forced existing wind and solar companies to buy fossil-fuel-based power or connect to a battery storage resource to cover the hours their plants are not generating.

When the Legislature last met in 2023, it created a $5 billion public “energy fund” to finance new gas plants but not wind or solar farms. It also created a new tax abatement program that excluded wind and solar. This year’s budget added another $5 billion to double the fund.

Bluebonnet Electric Cooperative is currently completing construction on a 190 MW gas-fired peaker plant near the town of Maxwell in Caldwell County. Credit: Dylan Baddour/Inside Climate News

Among the lawmakers leading the effort to scale back the state’s deployment of renewables is state Sen. Lois Kolkhorst, a Republican from Brenham. One bill she co-sponsored, S.B. 819, aimed to create new siting rules for utility-scale renewable projects and would have required them to get permits from the Public Utility Commission that no other energy source—coal, gas or nuclear—needs. “It’s just something that is clearly meant to kneecap an industry,” Lewin said about the bill, which failed to pass.

Kolkhorst said the bill sought to balance the state’s need for power while respecting landowners across the state.

Former state Rep. John Davis, now a board member at Conservative Texans for Energy Innovation, said the session shows how renewables have become a red meat issue.

More than 20 years ago, Davis and Kolkhorst worked together in the Capitol as Texas deregulated its energy market, which encouraged renewables to enter the grid’s mix, he said. Now Davis herds sheep and goats on his family’s West Texas ranch, where seven wind turbines provide roughly 40 percent of the family’s income.

He never could have dreamed how significant renewable energy would become for the state grid, he said. That’s why he’s disappointed with the direction the legislature is headed with renewables.

“I can’t think of anything more conservative, as a conservative, than wind and solar,” Davis said. “These are things God gave us—use them and harness them.”

A report published in April found that targeted limitations on solar and wind development in Texas could increase electricity costs for consumers and businesses. The report, prepared by Aurora Energy Research for the Texas Association of Business, said restricting further deployment of renewables would drive power prices up 14 percent by 2035.

“Texas is at a crossroads in its energy future,” said Olivier Beaufils, a top executive at Aurora Energy Research. “We need policies that support an all-of-the-above approach to meet the expected surge in power demand.”

Likewise, the commercial intelligence firm Wood Mackenzie expects the power demand from data centers to drive up prices of gas and wholesale consumer electricity.

Pollution from gas plants

Even when new power plants aren’t built on data center sites, demand from the server farms can still drive their development.

For example, in 2023, developer Marathon Digital started up a Bitcoin mine in the small town of Granbury on the site of the 1,100 MW Wolf Hollow II gas power plant. It held contracts to purchase 300 MW from the plant.

One year later, the power plant operator sought permits to install eight additional “peaker” gas turbines able to produce up to 352 MW of electricity. These small units, designed to turn on intermittently during hours of peak demand, release more pollution than typical gas turbines.

Those additional units would be approved to release 796,000 tons per year of greenhouse gases, 251 tons per year of nitrogen oxides and 56 tons per year of soot, according to permitting documents. That application is currently facing challenges from neighboring residents in state administrative courts.

About 150 miles away, neighbors are challenging another gas plant permit application in the tiny town of Blue. At 1,200 MW, the $1.2 billion plant proposed by Sandow Lakes Energy Co. would be among the largest in the state and would almost entirely serve private customers, likely including the large data centers that operate about 20 miles away.

Travis Brown and Hugh Brown, no relation, stand by a sign marking the site of a proposed 1,200 MW gas-fired power plant in their town of Blue on May 7. Credit: Dylan Baddour/Inside Climate News

This plan bothers Hugh Brown, who moved out to these green, rolling hills of rural Lee County in 1975, searching for solitude. Now he lives on 153 wooded acres that he’s turned into a sanctuary for wildlife.

“What I’ve had here is a quiet, thoughtful life,” said Brown, skinny with a long grey beard. “I like not hearing what anyone else is doing.”

He worries about the constant roar of giant cooling fans, the bright lights overnight and the air pollution. According to permitting documents, the power plant would be authorized to emit 462 tons per year of ammonia gas, 254 tons per year of nitrogen oxides, 153 tons per year of particulate matter, or soot, and almost 18 tons per year of “hazardous air pollutants,” a collection of chemicals that are known to cause cancer or other serious health impacts.

It would also be authorized to emit 3.9 million tons of greenhouse gases per year, about as much as 72,000 standard passenger vehicles.

“It would be horrendous,” Brown said. “There will be a constant roaring of gigantic fans.”

In a statement, Sandow Lakes Energy denied that the power plant will be loud. “The sound level at the nearest property line will be similar to a quiet library,” the statement said.

Sandow Lakes Energy said the plant will support the local tax base and provide hundreds of temporary construction jobs and dozens of permanent jobs. Sandow also provided several letters signed by area residents who support the plant.

“We recognize the critical need for reliable, efficient, and environmentally responsible energy production to support our region’s growth and economic development,” wrote Nathan Bland, president of the municipal development district in Rockdale, about 20 miles from the project site.

Brown stands next to a pond on his property ringed with cypress trees he planted 30 years ago. Credit: Dylan Baddour/Inside Climate News

Sandow says the plant will be connected to Texas’ public grid, and many supporting letters for the project cited a need for grid reliability. But according to permitting documents, the 1,200 MW plant will supply only 80 MW to the grid and only temporarily, with the rest going to private customers.

“Electricity will continue to be sold to the public until all of the private customers have completed projects slated to accept the power being generated,” said a permit review by the Texas Commission on Environmental Quality.

Sandow has declined to name those customers. However, the plant is part of Sandow’s massive, master-planned mixed-use development in rural Lee and Milam counties, where several energy-hungry tenants are already operating, including Riot Platforms, which runs the largest cryptocurrency mine on the continent. The seven-building complex in Rockdale is built to use up to 700 MW, and in April, Riot announced the acquisition of a neighboring 125 MW cryptocurrency mine previously operated by Rhodium. Another mine, run by Bitmain, also one of the world’s largest Bitcoin companies, has 560 MW of operating capacity with plans to add 180 MW more in 2026.

In April, residents of Blue gathered at the volunteer fire department building for a public meeting with Texas regulators and Sandow to discuss questions and concerns over the project. Brown, owner of the wildlife sanctuary, spoke into a microphone and noted that the power plant was placed at the far edge of Sandow’s 33,000-acre development, 20 miles from the industrial complex in Rockdale but near many homes in Blue.

“You don’t want to put it up into the middle of your property where you could deal with the negative consequences,” Brown said, speaking to the developers. “So it looks to me like you are wanting to make money, in the process of which you want to strew grief in your path and make us bear the environmental costs of your profit.”

Inside Climate News’ Peter Aldhous contributed to this report.

This story originally appeared on Inside Climate News.


What solar? What wind? Texas data centers build their own gas power plants Read More »

trump-is-forcing-states-to-funnel-grant-money-to-starlink,-senate-democrats-say

Trump is forcing states to funnel grant money to Starlink, Senate Democrats say

Lutnick’s announcement of the BEAD overhaul also criticized what he called the program’s “woke mandates” and “burdensome regulations.” Republicans like Sen. Ted Cruz (R-Texas) have criticized a requirement for ISPs that accept subsidies to offer low-cost Internet plans to people with low incomes, though the low-cost rule was originally imposed by Congress in the law that created the BEAD program.

Letter: Projects could be delayed two years

Although Musk last week announced his departure from the government and criticized a Trump spending bill for allegedly “undermining” DOGE’s cost-cutting work, Trump still seems favorably inclined toward Starlink. Trump said in a press conference on Friday that with Starlink, Musk “saved a lot of lives, probably hundreds of lives in North Carolina,” referring to Starlink offering emergency connectivity after Hurricane Helene.

Democrats’ letter to Trump and Lutnick said that fiber and other terrestrial broadband technologies will be better than satellite both for residential connectivity and business networks that support US-based manufacturing.

“Data centers, smart warehouses, robotic assembly lines, and chip fabrication plants all depend on fast, stable, and scalable bandwidth. If we want these job-creating facilities built throughout the United States, including rural areas… we must act now—and we must build the high-speed, high-capacity networks those technologies demand,” the letter said.

Democrats also said the Trump administration’s rewrite of program rules could delay projects by two years.

“For six months, states have been waiting to break ground on scores of projects, held back only by the Commerce Department’s bureaucratic delays,” the letter said. “If states are forced to redo or rework their plans, they will not only miss this year’s construction season but next year’s as well, delaying broadband deployment by years. That’s why we urge the Administration to move swiftly to approve state plans, and release the $42 billion allocated to the states by the BEAD Program.”

Separately from BEAD, Trump said last month that he is killing a $2.75 billion broadband grant program authorized by Congress. The Digital Equity Act of 2021 allows for several types of grants benefitting low-income households, people who are at least 60 years old, people incarcerated in state or local prisons and jails, veterans, people with disabilities, people with language barriers, people who live in rural areas, and people who are members of a racial or ethnic minority group. Trump called the program “racist and illegal,” saying his administration would stop distributing Digital Equity Act grants.

Trump is forcing states to funnel grant money to Starlink, Senate Democrats say Read More »

adobe-finally-releases-photoshop-for-android,-and-it’s-free-(for-now)

Adobe finally releases Photoshop for Android, and it’s free (for now)

Adobe has spent years releasing mobile apps that aren’t Photoshop, and now it’s finally giving people what they want. Yes, real Photoshop. After Adobe released a mobile version of Photoshop on iPhone earlier this year, the promised Android release has finally arrived. You can download it right now in beta, and it’s free to use for the duration of the beta period.

The mobile app includes a reasonably broad selection of tools from the desktop version of Adobe’s iconic image editor, including masks, clone stamp, layers, transformations, cropping, and an array of generative AI tools. The app looks rather barebones when you first start using it, but the toolbar surfaces features as you select areas and manipulate layers.

Depending on how you count, this is Adobe’s third attempt to do Photoshop on phones. So far, it appears to be the most comprehensive, though. It’s much more capable than Photoshop Express or the ancient Photoshop Touch app, which Adobe unpublished almost a decade ago. If you’re not familiar with the ins and outs of Photoshop, the new app comes with a robust collection of tutorials—just tap the light bulb icon to peruse them.

Photoshop on Android makes a big deal about Adobe’s generative AI features, which let you easily select subjects or backgrounds, remove objects, and insert new content based on a text prompt. This works about as well as the desktop version of Photoshop because it’s relying on the same cloud service to do the heavy lifting. This would have been impressive to see in a mobile app a year ago, but OEM features like Google’s Magic Editor have since become more widespread.

Adobe finally releases Photoshop for Android, and it’s free (for now) Read More »

real-tiktokers-are-pretending-to-be-veo-3-ai-creations-for-fun,-attention

Real TikTokers are pretending to be Veo 3 AI creations for fun, attention


The Turing test in reverse

From music videos to “Are you a prompt?” stunts, “real” videos are presenting as AI

Of course I’m an AI creation! Why would you even doubt it? Credit: Getty Images

Since Google released its Veo 3 AI model last week, social media users have been having fun with its ability to quickly generate highly realistic eight-second clips complete with sound and lip-synced dialogue. TikTok’s algorithm has been serving me plenty of Veo-generated videos featuring impossible challenges, fake news reports, and even surreal short narrative films, to name just a few popular archetypes.

However, among all the AI-generated video experiments spreading around, I’ve also noticed a surprising counter-trend on my TikTok feed. Amid all the videos of Veo-generated avatars pretending to be real people, there are now also a bunch of videos of real people pretending to be Veo-generated avatars.

“This has to be real. There’s no way it’s AI.”

I stumbled on this trend when the TikTok algorithm fed me this video topped with the extra-large caption “Google VEO 3 THIS IS 100% AI.” As I watched and listened to the purported AI-generated band that appeared to be playing in the crowded corner of someone’s living room, I read the caption containing the supposed prompt that had generated the clip: “a band of brothers with beards playing rock music in 6/8 with an accordion.”

@kongosmusic: We are so cooked. This took 3 mins to generate. Simple prompt: “a band of brothers playing rock music in 6/8 with an accordion”

After a few seconds of taking those captions at face value, something started to feel a little off. After a few more seconds, I finally noticed the video was posted by Kongos, an indie band that you might recognize from their minor 2012 hit “Come With Me Now.” And after a little digging, I discovered the band in the video was actually just Kongos, and the tune was a 9-year-old song that the band had dressed up as an AI creation to get attention.

Here’s the sad thing: It worked! Without the “Look what Veo 3 did!” hook, I might have quickly scrolled by this video before I took the time to listen to the (pretty good!) song. The novel AI angle made me stop just long enough to pay attention to a Kongos song for the first time in over a decade.

Kongos isn’t the only musical act trying to grab attention by claiming their real performances are AI creations. Darden Bela posted that Veo 3 had “created a realistic AI music video” over a clip from what is actually a 2-year-old music video with some unremarkable special effects. Rapper GameBoi Pat dressed up an 11-month-old song with a new TikTok clip captioned “Google’s Veo 3 created a realistic sounding rapper… This has to be real. There’s no way it’s AI” (that last part is true, at least). I could go on, but you get the idea.

@gameboi_pat: This has got to be real. There’s no way it’s AI 😩 #google #veo3 #googleveo3 #AI #prompts #areweprompts?

I know it’s tough to get noticed on TikTok, and that creators will go to great lengths to gain attention from the fickle algorithm. Still, there’s something more than a little off-putting about flesh-and-blood musicians pretending to be AI creations just to make social media users pause their scrolling for a few extra seconds before they catch on to the joke (or don’t, based on some of the comments).

The whole thing evokes last year’s stunt where a couple of podcast hosts released a posthumous “AI-generated” George Carlin routine before admitting that it had been written by a human after legal threats started flying. As an attention-grabbing stunt, the conceit still works. You want AI-generated content? I can pretend to be that!

Are we just prompts?

Some of the most existentially troubling Veo-generated videos floating around TikTok these days center around a gag known as “the prompt theory.” These clips focus on various AI-generated people reacting to the idea that they are “just prompts” with various levels of skepticism, fear, or even conspiratorial paranoia.

On the other side of that gag, some humans are making joke videos playing off the idea that they’re merely prompts. RedondoKid used the conceit in a basketball trick shot video, saying “of course I’m going to make this. This is AI, you put that I’m going to make this in the prompt.” User thisisamurica thanked his faux prompters for putting him in “a world with such delicious food” before theatrically choking on a forkful of meat. And comedian Drake Cummings developed TikTok skits pretending that it was actually AI video prompts forcing him to indulge in vices like shots of alcohol or online gambling (“Goolgle’s [sic] New A.I. Veo 3 is at it again!! When will the prompts end?!” Cummings jokes in the caption).

@justdrakenaround: Goolgle’s New A.I. Veo 3 is at it again!! When will the prompts end?! #veo3 #google #ai #aivideo #skit

Beyond the obvious jokes, though, I’ve also seen a growing trend of TikTok creators approaching friends or strangers and asking them to react to the idea that “we’re all just prompts.” The reactions run the gamut from “get the fuck away from me” to “I blame that [prompter], I now have to pay taxes” to solipsistic philosophical musings from convenience store employees.

I’m loath to call this a full-blown TikTok trend based on a few stray examples. Still, these attempts to exploit the confusion between real and AI-generated video are interesting to see. As one commenter on an “Are you a prompt?” ambush video put it: “New trend: Do normal videos and write ‘Google Veo 3’ on top of the video.”

Which one is real?

The best Veo-related TikTok engagement hack I’ve stumbled on so far, though, might be the videos that show multiple short clips and ask the viewer to decide which are real and which are fake. One video I stumbled on shows an increasing number of “Veo 3 Goth Girls” across four clips, challenging in the caption that “one of these videos is real… can you guess which one?” In another example, two similar sets of kids are shown hanging out in cars while the caption asks, “Are you able to identify which scene is real and which one is from veo3?”

@spongibobbu2: One of these videos is real… can you guess which one? #veo3

After watching both of these videos on loop a few times, I’m relatively (but not entirely) convinced that every single clip in them is a Veo creation. The fact that I watched these videos multiple times shows how effective the “Real or Veo” challenge framing is at grabbing my attention. Additionally, I’m still not 100 percent confident in my assessments, which is a testament to just how good Google’s new model is at creating convincing videos.

There are still some telltale signs for distinguishing a real video from a Veo creation, though. For one, Veo clips are still limited to just eight seconds, so any video that runs longer (without an apparent change in camera angle) is almost certainly not generated by Google’s AI. Looking back at a creator’s other videos can also provide some clues—if the same person was appearing in “normal” videos two weeks ago, it’s unlikely they would be appearing in Veo creations suddenly.

There’s also a subtle but distinctive style to most Veo creations that can distinguish them from the kind of candid handheld smartphone videos that usually fill TikTok. The lighting in a Veo video tends to be too bright, the camera movements a bit too smooth, and the edges of people and objects a little too polished. After you watch enough “genuine” Veo creations, you can start to pick out the patterns.
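Those rules of thumb amount to a simple checklist. Here is a toy sketch of how one might encode the most mechanical of them; the field names, thresholds, and clip metadata are all hypothetical, and nothing here reflects a real TikTok or Google API:

```python
# Toy heuristic for flagging clips that are unlikely to be single Veo 3 creations,
# based on the rules of thumb above. All field names and thresholds are invented
# for illustration.

VEO_MAX_SECONDS = 8  # Veo 3 clips are currently capped at eight seconds

def probably_not_veo(clip: dict) -> bool:
    """Return True if a clip fails the most basic Veo-plausibility checks."""
    # A single continuous shot longer than eight seconds can't be one Veo clip.
    if clip["duration_seconds"] > VEO_MAX_SECONDS and not clip["has_cuts"]:
        return True
    # A creator with a long back catalog of ordinary videos probably isn't an AI
    # avatar that appeared out of nowhere (a loose proxy for checking older posts).
    if clip["creator_prior_videos"] > 50:
        return True
    return False

example_clip = {"duration_seconds": 14, "has_cuts": False, "creator_prior_videos": 3}
print(probably_not_veo(example_clip))  # True: a 14-second single take rules out one Veo clip
```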

Regardless, TikTokers trying to pass off real videos as fakes—even as a joke or engagement hack—is a recognition that video sites are now deep in the “deep doubt” era, where you have to be extra skeptical of even legitimate-looking video footage. And the mere existence of convincing AI fakes makes it easier than ever to claim real events captured on video didn’t really happen, a problem that political scientists call the liar’s dividend. We saw this when then-candidate Trump accused Democratic nominee Kamala Harris of “A.I.’d” crowds in real photos of her Detroit airport rally.

For now, TikTokers of all stripes are having fun playing with that idea to gain social media attention. In the long term, though, the implications for discerning truth from reality are more troubling.


Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.

Real TikTokers are pretending to be Veo 3 AI creations for fun, attention Read More »

texas-ag-loses-appeal-to-seize-evidence-for-elon-musk’s-ad-boycott-fight

Texas AG loses appeal to seize evidence for Elon Musk’s ad boycott fight

If MMFA is made to endure Paxton’s probe, the media company could face civil penalties of up to $10,000 per violation of Texas’ unfair trade law, a fine or confinement if requested evidence was deleted, or other penalties for resisting sharing information. However, Edwards agreed that even the threat of the probe apparently had “adverse effects” on MMFA. Reviewing evidence, including reporters’ sworn affidavits, Edwards found that MMFA’s reporting on X was seemingly chilled by Paxton’s threat. MMFA also provided evidence that research partners had ended collaborations due to the looming probe.

Importantly, Paxton never contested claims that he retaliated against MMFA, instead seemingly hoping to dodge the lawsuit on technicalities by disputing jurisdiction and venue selection. But Edwards said that MMFA “clearly” has standing, as “they are the targeted victims of a campaign of retaliation” that is “ongoing.”

The problem with Paxton’s argument is that it “ignores the body of law that prohibits government officials from subjecting individuals to retaliatory actions for exercising their rights of free speech,” Edwards wrote, suggesting that Paxton arguably launched a “bad-faith” probe.

Further, Edwards called out the “irony” of Paxton “readily” acknowledging in other litigation “that a state’s attempt to silence a company through the issuance and threat of compelling a response” to a civil investigative demand “harms everyone.”

With the preliminary injunction won, MMFA can move forward with its lawsuit after defeating Paxton’s motion to dismiss. In her concurring opinion, Circuit Judge Karen L. Henderson noted that MMFA may need to show more evidence that partners have ended collaborations over the probe (and not for other reasons) to ultimately clinch the win against Paxton.

Watchdog celebrates court win

In a statement provided to Ars, MMFA President and CEO Angelo Carusone celebrated the decision as a “victory for free speech.”

“Elon Musk encouraged Republican state attorneys general to use their power to harass their critics and stifle reporting about X,” Carusone said. “Ken Paxton was one of those AGs who took up the call, and his attempt to use his office as an instrument for Musk’s censorship crusade has been defeated.”

MMFA continues to fight against X over the same claims—as well as a recently launched Federal Trade Commission probe—but the media company is “buoyed that yet another court has seen through the fog of Musk’s ‘thermonuclear’ legal onslaught and recognized it for the meritless attack to silence a critic that it is,” Carusone said.

Paxton’s office did not immediately respond to Ars’ request to comment.

Texas AG loses appeal to seize evidence for Elon Musk’s ad boycott fight Read More »