Author name: Mike M.


Joshua Achiam Public Statement Analysis

I start off this post with an apology for two related mistakes from last week.

The first is the easy correction: I incorrectly thought he was the head of ‘alignment’ at OpenAI, when his actual title is head of ‘mission alignment.’

Both are important, and make one’s views important, but they’re very different.

The more serious error, which got quoted elsewhere, was this: In the section about OpenAI, I noted some past comments from Joshua Achiam, and interpreted them as him lecturing EAs that misalignment risk from AGI was not real.

While I believe that is a reasonable way to interpret the quote in isolation, this issue is important to get right, especially if I’m going to say things like that, and looking at it only that way was wrong. I used a poor method to contact Joshua for comment that failed to reach him when I had better options, and I failed to search for additional past comments that would have provided context.

I should have done better on both counts, and I’m sorry.

Indeed, exactly because OpenAI is so important, and to counter the potential spread of inaccurate information, I’m offering this deep dive into Joshua Achiam’s public statements. He has looked at a draft of this to confirm it has no major errors.

Here is a thread Joshua wrote in November 2022 giving various links to AI safety papers and resources. The focus is on concrete, practical ‘grounded’ work, and it also includes a course by Dan Hendrycks that involves both levels.

Having looked at many additional statements, Joshua clearly believes that misalignment risk from AGI is real. He has said so, and he has been working on mitigating that risk. And he’s definitely been in the business many times of pointing out when those skeptical of existential risk get sufficiently far out of line and make absolute statements or unfair personal or cultural attacks.

He does appear to view some models and modes of AI existential risk, including Yudkowsky-style models of AI existential risk, as sufficiently implausible or irrelevant as to be effectively ignorable. And in the x-risk context he has shown a strong hostility to the rhetoric, arguments, tactics and suggested actions of existential risk advocates more broadly.

So for example we have these:

Joshua Achiam (March 23, 2024): I think the x-risk discourse pendulum swung a little too far to “everything is fine.” Total doomerism is baseless and doomer arguments generally poor. But unconcerned optimism – or worse, “LeCun said so” optimism – is jarring and indefensible.

Joshua Achiam (June 7, 2023): see a lot of talk about “arguments for” or “arguments against” x-risk and this is not sensible imho. talk about likelihoods of scenarios, not whether they definitely will or definitely won’t happen. you don’t know.

Joshua Achiam (April 25, 2023): I also think a hard take off is extremely unlikely and largely ruled out on physical grounds, but Yann [LeCun], saying “that’s utterly impossible!” has gotta be like, the least genre-savvy thing you can do.

Also x-risk is real, even if unlikely. Vulnerable world hypothesis seems true. AGI makes various x-risks more likely, even if it does not create exotic nanotech gray goo Eliezerdoom. We should definitely reduce x-risk.

Joshua Achiam (March 14, 2021): If we adopt safety best practices that are common in other professional engineering fields, we’ll get there. Surfacing and prioritizing hazards, and making that analysis legible, has to become the norm.

I consider myself one of the x-risk people, though I agree that most of them would reject my view on how to prevent it.

I think the wholesale rejection of safety best practices from other fields is one of the dumbest mistakes that a group of otherwise very smart people has ever made. Throwing all of humanity’s future babies out with the bathwater.

On the one hand, the first statement is a very clear ‘no, everything will not automatically be fine,’ and correctly identifies that position as indefensible. The others are helpful as well. On the other hand, the first statement also continues the characterization of those worried as mostly ‘doomers’ with generally poor arguments.

The second is correct in principle as well. If there’s one thing Yann LeCun isn’t, it’s genre savvy.

In practice, however, the ‘consider the likelihood of each particular scenario’ approach tends to default everything to the ‘things turn out OK’ bracket minus the particular scenarios one can come up with.

It is central to my perspective that you absolutely cannot do that. I am very confident that the things being proposed do not default to good outcomes. Good outcomes are possible, but to get them we will have to engineer them.

There is no contradiction between ‘existential risk is unlikely’ and ‘we should reduce existential risk.’ It is explicit that Joshua thinks such risks are unlikely. Have we seen him put a number on it? Yes, but I found only the original quote I discussed last time and a clarification thereof, which was:

Joshua Achiam: Ah – my claims are

P(everyone dead in 10 years) is extremely small (1e-6),

P(everyone dead in 100 years) is much less than 100%,

Most discourse around x-risk neglects to consider or characterize gradual transformations of humanity that strike me as moderately plausible.

I also think x-risk within 100 years could potentially have AGI in the causal chain without being an intentional act by AGI (eg, humans ask a helpful, aligned AGI to help us solve a scientific problem whose solution lets us build a superweapon that causes x-risk).

This makes clear he is dismissing in particular ‘all humans are physically dead by 2032’ rather than ‘the world is on a path by 2032 where that outcome (or another where all value is lost) is inevitable.’ I do think this very low probability is highly alarming, and in this situation I don’t see how you can possibly have model error as low as 1e-6 (!), but it is less crazy given it is more narrow.
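To make the model error point concrete with a toy calculation (the numbers here are my own illustrative assumptions, not Joshua’s):

$$P(\text{doom by 2032}) \;\ge\; P(\text{model wrong}) \cdot P(\text{doom} \mid \text{model wrong}) \;=\; 10^{-3} \cdot 10^{-2} \;=\; 10^{-5} \;>\; 10^{-6}.$$

Even granting 99.9% confidence that the benign model is right, and only a 1% chance of doom conditional on that model being wrong, the floor is already an order of magnitude above 1e-6.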

The ‘much less than 100%’ doom number in 100 years doesn’t rule out my own number. What it tells me more than anything, on its own, is that he’s grown understandably exhausted with dealing with people who do put 99% or 99.9% in that spot.

But he’s actually making much stronger claims here, in the context of an EA constructive criticism thread basically telling them not to seek power because EA was too dysfunctional (a thread which makes some good points and suggestions, but also proves far too much, which points to what I think is wrong in the thread more broadly):

Joshua Achiam: (This is not to say there are no x-risks from AGI – I think there are – but anyone who tells you probabilities are in the 5-10% range or greater that AGI will immediately and intentionally kill everyone is absolutely not thinking clearly)

The idea that a 5% probability of such an outcome, as envisioned by someone else for some other person’s definition of AGI, proves they are ‘not thinking clearly,’ seems like another clear example of dismissiveness and overconfidence to me. This goes beyond not buying the threat model that creates such predictions, which I think is itself a mistake. Similarly:

Joshua Achiam (November 12, 2022): Again, this is not a claim that x-risk isn’t real, that AGI doesn’t lead to x-risk, or that AGI doesn’t have potentially catastrophic impacts, all of which I think are plausible claims. But the claimed timelines and probabilities are just way, way out of connection to reality.

At this point, I’ve heard quite a lot of people at or formerly at OpenAI in particular, including Sam Altman, espouse the kinds of timelines Joshua here says are ‘way, way out of connection to reality.’ So I’m curious what he thinks about that.

The fourth earlier claim, that AI could be a link in the causal chain to x-risk without requiring the AI commit an intentional act, seems very obviously true. If anything it highlights that many people place importance on there being an ‘intentional act’ or similar, whereas I don’t see that distinction as important. I do think that the scenario he’s describing there, where the superweapon becomes possible but we otherwise have things under control, is a risk level I’d happily accept.

The third claim is more interesting. Most of the talk I hear about ‘we’ll merge with the machines’ or what not doesn’t seem to me to make sense on any meaningful level. I see scenarios where humanity has a ‘gradual transformation’ as where we successfully solve ‘phase one’ and have the alignment and control issues handled, but then weird dynamics or changes happen in what I call ‘phase two’ when we have to get human dynamics in that world into some form of long term equilibrium, and current humanity turns out not to be it.

I do agree, or at least notice I am confused about, which of those worlds count as valuable versus not. I’ve mentally been putting those mostly into the ‘win’ bucket; if you don’t do that, then doom estimates go up.

I would hope we can all agree they are necessary. They don’t seem sufficient to me.

Consider Joshua’s belief (at least in 2021) that if we adopt general best safety practices from other industries, we’ll ‘get there.’ While they are much better than nothing, and better than current practices in AI, I very strongly disagree with this. I do think that given what else is happening at OpenAI, someone who believes strongly in ‘general best practices’ for safety is providing large value above replacement.

Standard safety policies cannot be assumed. Some major labs fall well short of this, and have made clear they have no intention of changing course. There is clear and extreme opposition, from many circles (not Joshua), to any regulatory requirements that say ‘you must apply otherwise ordinary safety protocols to AI.’

It seems clearly good to not throw out these standard policies, on the margin? It would be a great start to at least agree on that. If nothing else those policies might identify problems that cause us to halt and catch fire.

But I really, really do not think that approach will get it done on its own, other than perhaps via ‘realize you need to stop,’ because the threat models this time are very expansive and very different. I’d certainly go so far as to say that if someone assigns very high probabilities to that approach being sufficient, then they are not, in my mind, thinking clearly.

Consider also this statement:

Joshua Achiam (August 9, 2022): hot take: no clear distinction between alignment work and capabilities work yet. might not be for a year or two.

Joshua Achiam (March 5, 2024): hot take: still true.

The obvious way to interpret this statement is, in addition to the true statement that much alignment work also enhances capabilities, that the alignment work that isn’t also capabilities work isn’t real alignment work? Downthread he offers good nuance. I do think that most current alignment work does also advance capabilities, but that the distinction should mostly be ‘clear’ even if there are important shades of gray and you cannot precisely define a separator.

In terms of ‘things that seem to me like not thinking clearly’:

Joshua Achiam (August 17, 2023): It bears repeating: “Her (2013)” is the only AI movie that correctly predicts the future.

interstice: I agree with Robin Hanson’s take that it’s like a movie about a world where schoolchildren can buy atom bombs at the convenience store, but is bizarrely depicted as otherwise normal, with the main implication of the atom bombs being on the wacky adventures of the kids.

Joshua Achiam: It’s about the world where prosaic alignment works well enough to avoid doom, but leads to the AIs wanting to do their own thing, and the strange messy consequences in the moment where humans and AIs realize that their paths diverge.

Caleb Moses: I’d say this is mainly because it’s primarily concerned with predicting humans (which we know a lot about) rather than AI (which we don’t know a lot about)

Joshua Achiam: 1000%.

So that’s the thing, right? Fictional worlds like this almost never actually make sense on closer examination. The incentives and options and actions are based on the plot and the need to tell human stories rather than following good in-universe logic. The worlds in question are almost always highly fragile, the worlds really should blow up, and the AIs ensure the humans work out okay in some sense ‘because of reasons,’ because it feels right to a human writer and their sense of morality or something, rather than because that is what would actually happen.

I worry this kind of perspective is load bearing, given he thinks the movie is ‘correctly predicting the future’: the idea that ‘prosaic alignment’ will result in sufficiently strong pushes toward some common-sense-morality-style not harming of the humans, despite all the competitive dynamics among AIs and the various other things they value and grow to value, that things turn out fine by default, in worlds that to me seem past their point of no return and infinitely doomed unless you think the AIs themselves have value.

Alternatively, yes, Her is primarily about predicting the humans. And perhaps it is a good depiction of how humans would react to and interact with AI if that scenario took place. But it does a very poor job predicting the AIs, which is the part that actually matters here?

For the opposite perspective, see for example Eliezer Yudkowsky here last month.

We definitely have a pattern of Joshua taking rhetorical pot-shots at Yudkowsky and AI. Here’s a pretty bad one:

Joshua Achiam (March 29, 2023): Eliezer is going to get AI researchers murdered at some point, and his calls for extreme violence have no place in the field of AI safety. We are now well past the point where it’s appropriate to take him seriously, even as a charismatic fanfiction author.

No, I do not mean as a founder of the field of alignment. You don’t get to claim “field founder” status if you don’t actually work in the field. Calling for airstrikes on rogue datacenters is a direct call for violence, a clear message that violence is an acceptable solution.

His essays are completely unrelated to all real thrusts of effort in the field and almost all of his object-level technical predictions over the past twenty years have been wrong. Founder of rationalism? Sure. Alignment? Absolutely not.

I think this kind of rhetoric about ‘calls for violence’ is extremely bad and wrong. Even for example here, where the thread’s primary purpose is to point out that certain accusations against EA (that they ‘underemphasized AI x-risk’ and pretended to care about other things) are indeed quite ridiculous, you see him refer to Eliezer “Bomb the Datacenters” Yudkowsky.

What Yudkowsky said was that if there was an international agreement that you don’t develop AGI, you would if you ran out of other alternatives use physical force to enforce that agreement. That is how every law and every treaty or agreement works, and indeed the only way they can work.

Richard Ngo (replying to Joshua): You are fundamentally misunderstanding how policy discussions work and thereby propagating the meme that you’re trying to suppress.

You cannot interpret international policy proposals as calls for individual action otherwise *any* opinion on IR == advocating murder.

Joshua Achiam: I don’t think this is a misunderstanding. If you say, “The people who are building AI are going to kill my children and your children, violence is acceptable and even necessary to shut this down,” that is not. Goddamn. Okay.

Richard Ngo: EVERY person who wants the police to arrest people who break the law is in favor of violence. But advocating for laws is not advocating for violence. The same is true about advocating for international treaties. You’re creating the meme we’re both trying to prevent. Please stop.

This is so frustrating. Unlawful violence quite obviously is unacceptable even if it would work, and also it very obviously wouldn’t work. And we keep saying that.

In terms of Yudkowsky’s technical predictions, I noticed I disagreed, and rather than argue details I asked Claude and o1-preview this question (fully one-shot only):

“Based on your understanding of AI technical developments as of March 29, 2023, evaluate the most important known object-level predictions of Eliezer Yudkowsky on the subject, and which ones seemed true versus false. Afterwards, evaluate those predictions as a group, on a scale from ‘mostly true’ to ‘mostly false.’”
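For those who want to try it, here is a minimal sketch of how one might reproduce the one-shot query using the official Anthropic and OpenAI Python SDKs. The model names and settings are illustrative assumptions, not a record of exactly what I ran:

```python
# Hedged sketch: send the one-shot Yudkowsky-predictions prompt to Claude and
# an OpenAI model. Model names below are assumptions; substitute whatever you
# have access to. Requires ANTHROPIC_API_KEY and OPENAI_API_KEY to be set.
import anthropic
import openai

PROMPT = (
    "Based on your understanding of AI technical developments as of March 29, 2023, "
    "evaluate the most important known object-level predictions of Eliezer Yudkowsky "
    "on the subject, and which ones seemed true versus false. Afterwards, evaluate "
    "those predictions as a group, on a scale from 'mostly true' to 'mostly false.'"
)

# Claude, via the Anthropic Messages API.
claude = anthropic.Anthropic()
claude_reply = claude.messages.create(
    model="claude-3-5-sonnet-20240620",  # assumed model name
    max_tokens=1500,
    messages=[{"role": "user", "content": PROMPT}],
)
print(claude_reply.content[0].text)

# o1-preview, via the OpenAI Chat Completions API.
oai = openai.OpenAI()
oai_reply = oai.chat.completions.create(
    model="o1-preview",  # assumed model name
    messages=[{"role": "user", "content": PROMPT}],
)
print(oai_reply.choices[0].message.content)
```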

Skipping to the conclusions (you can of course try the prompt yourself, or try your own):

Claude: Overall evaluation: On a scale from “mostly true” to “mostly false,” I would rate Yudkowsky’s predictions as leaning towards “mostly true,” but with important caveats. Many of his core concerns and predictions about the challenges of AI development and safety have been validated by ongoing research and developments in the field. However, some of his more extreme or urgent predictions about rapid capability jumps or imminent existential risks have not yet materialized in the ways he initially described.

o1-preview: Considering the above assessments, Eliezer Yudkowsky’s predictions are a mix of validated concerns and speculative forecasts. The trends in AI development up to March 2023 provide partial support for his views, particularly regarding the rapid advancement of AI capabilities and the challenges of alignment.

On a scale from “mostly true” to “mostly false,” I would evaluate these predictions as leaning towards “mostly true.” While not all predictions have been conclusively proven, the trajectory of AI research and the growing acknowledgment of AI safety issues suggest that his insights are largely valid and warrant serious consideration.

Given how difficult predictions are to make, especially about the future, that’s not bad, and certainly quite different from ‘almost all wrong’ to the point of needing everyone to dismiss him as a thinker.

One of Eliezer’s key concepts is instrumental convergence. In this thread Achiam argues against the fully maximalist form of instrumental convergence:

Joshua Achiam (March 9, 2023): For literally every macrostate goal (“cause observable X to be true in the universe”) you can write an extended microstate goal that specifies how it is achieved (“cause observable X to be true in the universe BY MEANS OF action series Y”).

It doesn’t seem clear or obvious whether the space of microstates is dense in undesired subgoals. If the space of goals that lead to instrumental drives is a set of measure zero in this space, slight misalignment is almost surely never going to result in the bad thing.

And that claim – “We don’t know if goal space is dense in inert goals or dense in goals that lead to instrumental drives” – is the main point here. WE DON’T KNOW.

The alignment X-risk world takes “instrumental goals are inevitable” as a shibboleth, an assumption that requires no proof. But it is an actual question that requires investigation! Especially if claims with huge ramifications depend on it.

It is technically true that you can impose arbitrarily strict implementation details and constraints on a goal, such that instrumental convergence ceases to be a useful means of approaching the goal, and thus you should expect not to observe it.

Without getting into any technical arguments, it seems rather absurd to suggest the set of goals that imply undesired subgoals within plausibly desired goal space would have measure zero? I don’t see how this survives contact with common sense or relation to human experience or typical human situations. Most humans spend most of their lives pursuing otherwise undesired subgoals and subdrives that exist due to other goals, on some level. The path to achieving almost any big goal, or pursuing anything maximalist, will do the same.

When I think about how an AI would achieve a wide variety of goals humans might plausibly assign to it, I see the same result. We’ve also now seen observations (at least the way I interpret them) of instrumental convergence in existing models when given goals, reasonably consistently among the reports I see where the setup gives the model reason to do so.

Am I holding out some probability that instrumental convergence mostly won’t be a thing for highly capable AIs? I have to, because this is not a place you can ‘prove’ anything as such. But it would be really boggling to me for it to almost never show up, if we assigned various complex and difficult tasks, and gave the models capabilities where instrumental convergence was clearly the ‘correct play,’ without any active attempts to prevent instrumental convergence from showing up.

I agree we should continue to test and verify, and would even if everyone agreed it was super likely. But convergence failing to show up would blow my mind hardcore.

In the before times, people said things like ‘oh you wouldn’t connect your AI to the internet, you’d be crazy.’ Or they’d say ‘you wouldn’t make your AI into an agent and let it go off with your [crypto] wallet.’

Those predictions did not survive contact with the enemy, or with reality. Whoops.

Joshua Achiam (April 28, 2023): 🌶️A problem in the AI safety discourse: many are assuming a threat model where the AI subtly or forcibly takes resources and power from us, and this is the thing we need to defend against. This argument has a big hole in it: it won’t have to take what it is given freely.

The market is selecting for the development and deployment of large-scale AI models that will allow increasingly-complex decisions and workflows to be handled by AI with low-to-no human oversight. The market *explicitly wants* to give the AI power.

If your strategy relies on avoiding the AI ever getting power, influence, or resources, your strategy is dead on arrival.

This seems insanely difficult to avoid. As I have tried to warn many times, once AI is more effective at various tasks than humans, any humans who don’t turn those tasks over to AIs get left behind. That’s true for individuals, groups, corporations and nations. If you don’t want the AIs to be given that power, you have two options: You can prevent the AIs from being created, or you can actively bar anyone from giving the AIs that power, in a way that sticks.

Indeed, I would go further. The market wants the AIs to be given as much freedom and authority as possible, to send them out to compete for resources and influence generally, for various ultimate purposes. And the outcome of those clashes and various selection effects and resource competitions, by default, dooms us.

Your third option is the one Joshua suggests, that you assume the AIs get the power and plan accordingly.

Joshua Achiam: You should be building tools that ensure AI behavior in critical decision-making settings is robust, reliable, and well-specified.

Crucially this means you’ll need to develop domain knowledge about the decisions it will actually make. Safety strategies that are too high-level – “how do we detect power-seeking?” are useless by comparison to safety strategies that are exhaustive at object level.

How do we get it to make financial decisions in ways that don’t create massive wipeout risks? How do we put limits on the amount of resources that it can allocate to its own compute and retraining? How do we prevent it from putting a political thumb on the scale?

In every domain, you’ll have to build datasets, process models, and appropriate safety constraints on outcomes that you can turn into specific training objectives for the model.

Seems really hard on multiple levels. There is an implicit ‘you build distinct AIs to handle distinct narrow tasks where you can well-define what they’re aiming for’ but that is also not what the market wants. The market wants general purpose agents that will go out and do underspecified tasks to advance people’s overall situations and interests, in ways that very much want them to do all the things the above wants them not to do. The market wants AIs advising humans on every decision they make, with all the problems that implies.

If you want AIs to only do well-specified things in well-specified domains according to socially approved objectives and principles, how do you get to that outcome? How do you deal with all the myriad incentives lining up the other way, all the usual game theory problems? And that’s if you actually know how to get the AIs to be smart enough and perceptive enough to do their work yet respond to the training sets in a way that gets them to disregard, even under pressure, the obviously correct courses of action on every other level.

These are super hard and important questions and I don’t like any of the answers I’ve seen. That includes Joshua’s suggested path, which doesn’t seem like it solves the problem.

The place it gets weird is in this follow-up.

Joshua Browder: I decided to outsource my entire personal financial life to GPT-4 (via the DoNotPay chat we are building).

I gave AutoGPT access to my bank, financial statements, credit report, and email.

Here’s how it’s going so far (+$217.85) and the strange ways it’s saving money.

Joel Lehman: Welp.

Joshua Achiam: To put a fine point on it – this is one of the reasons I think x-risk from the competition-for-resources scenario is low. There just isn’t a competition. All the conditions are set for enthusiastic collaboration. (But x-risk from accidents or human evil is still plausible.)

Roon: Yeah.

But that’s exactly how I think the competition for resources x-risk thing manifests. Browder outsources his everything to AgentGPT-N. He tells it to go out and use his money to compete for resources. So does everyone else. And then things happen.

So the argument is that these AIs will then ‘enthusiastically collaborate’ with each other? Why should we expect that? Is this a AIs-will-use-good-decision-theory claim? Something else? If they do all cooperate fully with each other, how does that not look like them taking control to maximize some joint objective? And so on.

In good news that is not directly related but is relevant to similar issues, he notes that some people are indeed ‘writing the spec,’ which is the kind of work he seems to think is most important?

Joshua Achiam (Dec 31, 2022): “We just have to sit down and actually write a damn specification, even if it’s like pulling teeth. It’s the most important thing we could possibly do,” said almost no one in the field of AGI alignment, sadly.

Joshua Achiam (Dec 10, 2023): this has changed in a year! alignment folks are talking about building the spec now. bullish on this.

Tegmark just gave a lightning talk on it. Also @davidad’s agenda aims in this direction

I do think it’s very cool that several people are taking a crack at writing specifications. I have no idea how their specs could be expected to work and solve all these problems, but yes people are at least writing some specs.

Here is a thread by Joshua Achiam from July 2023, which I believe represents both a really huge central unsolved problem and also a misunderstanding:

Joshua Achiam: this is coming from a place of love: I wish more people in the alignment research universe, who care deeply that AI will share human values, would put more effort into understanding and engaging with different POVs that represent the wide umbrella of human values.

And, sort of broadly, put more effort into embracing and living human values. A lot of alignment researchers seem to live highly out-of-distribution lives, with ideas and ideals that reject much of what “human values” really has to offer. Feels incongruous. People notice this.

“excuse me SIR, the fundamental problem we’re trying to solve is to get it to NOT KILL LITERALLY EVERYONE, and we can worry about those cultural values when we’ve figured that out” ultimate cop out, you’re avoiding the central thing in alignment.

If you can’t get the AI to share human cultural values, your arguments say we’re all going to die. how do you expect to solve this problem if you don’t really try to understand the target? what distinguishes human values from other values?

Are you trying to protect contemporary human aesthetics? the biological human form? our sociopolitical beliefs? if you are trying to protect our freedom to voluntarily change these at will, what counts as sufficiently free? our opinions are staggeringly path-dependent.

Are some opinion-formation paths valid according to your criteria and some paths invalid? When you fear AI influence, do you have a theory for what kind of influence is legitimate and what isn’t?

That said, to avoid misinterpretation: this is not a diss, alignment is an important research field, and x-risk from AGI is nonnegligible.

I think the field will surface important results even if it fails in some ways. but this failure lowkey sucks and I think it is a tangible obstacle to success for the agenda of many alignment researchers. you often seem like you don’t know what you are actually trying to protect. this is why so many alignment research agendas come across as incredibly vague and underspecified.

I would strongly disagree, and say this is the only approach I know that takes the problem of what we value seriously, and that a false sense of exactly what you are protecting, or trying to aim now at protecting a specific specified target, would make things less likely to work. You’d pay to know what you really think. We old school rationalists, starting with Eliezer Yudkowsky, have been struggling with the ‘what are human values’ problem as central to alignment for a long time.

Sixteen years ago, Eliezer Yudkowsky wrote the Value Theory sequence, going deep on questions like what makes things have value to us, how to reconcile when different entities (human or otherwise) have very different values, and so on. If you’re interested in these questions, this is a great place to start. I have often tried to emphasize that I continue to believe that Value is Fragile, whereas many who don’t believe in existential risk think value is not fragile.

It is a highly understood problem among our crowd that ‘human values’ is both very complex and a terrifyingly hard thing to pin down, and that people very strongly disagree about what they value.

Also it is a terrifyingly easy thing to screw up accidentally, and we have often said that this is one of the important ways to build AGI and lose – that you choose a close and well-meaning but incorrect specification of values, or your chosen words get interpreted that way, or someone tries to get the AGI to find those values by SGD or other search and it gets a well-meaning but incorrect specification.

Thus, the idea to institute Coherent Extrapolated Volition, or CEV, which is very roughly ‘what people would collectively choose as their values, given full accurate information and sufficient time and skill to contemplate the question.’

In calculating CEV, an AI would predict what an idealized version of us would want, “if we knew more, thought faster, were more the people we wished we were, had grown up farther together”. It would recursively iterate this prediction for humanity as a whole, and determine the desires which converge. This initial dynamic would be used to generate the AI’s utility function.

Why would you do that? Exactly because of the expectation that if you do almost anything else, you’re not only not taking everyone’s values into account, you don’t even understand your own well enough to specify them. I certainly don’t. I don’t even have confidence that CEV, if implemented, would result in that much of the things that I actually value, although I’d take it. And yes, this whole problem terrifies me even in good scenarios.

What am I fighting to preserve right now? I am fighting for the ability to make those choices later. That means the humans stay alive and they stay in control. And I am choosing to be less concerned about exactly which humans get to choose which humans get to choose, and more concerned with humans getting to properly choose at all.

Because I expect that if humans don’t make an active choice, or use a poor specification of preferences that gets locked in, then the value that results is likely zero. Whereas if humans do choose intentionally, even humans whose values I strongly disagree with and that are being largely selfish, I do expect those worlds to have strongly positive value. That’s a way in which I think value isn’t so fragile. So yes, I do think the focus should be ensuring someone gets to choose at all.

Also, I strongly believe for these purposes in a form of the orthogonality thesis, which here seems obviously true to me. In particular: Either you can get the AI to reflect the values of your choice, or you can’t. You don’t need to know which values you are aiming for in order to figure that part out. And once you figure that part out you can and should use the AI to help you figure out your values.

Meanwhile, yes, I spend rather a lot of time thinking about what is actually valuable to me and others, without expecting us humans to find the answer on our own, partly because one cannot avoid doing so, partly because it is decision relevant in questions like ‘how much existential risk should we accept in the name of ‘beating China’’?

In a world where everyone wants the AI to ‘do our alignment homework’ for us, one must ask: What problems must we solve before asking the AI to ‘do our homework’ for us, versus which questions then allow us to safely ask the AI to do that? Almost everyone agrees, in some form, that the key is solving the problems that clear the way to letting AI fully help solve our other problems.

And no, I don’t like getting into too much detail about my best guess about what I value or we collectively should value in the end, both because I think value differences should be respected and because I know how distracting and overwhelming those discussions and fights get if you let them start.

Mostly, I try to highlight those who are expressing values I strongly disagree with – in particular, those that favor or are fine with human extinction. I’m willing to say I’m not okay with that, and I don’t find any of the ‘but it’s fine’ proposals to be both acceptable and physically realistic so far.

Is all this a ‘cop out’? I would say, absolutely not.

Do people ‘notice’ that you are insufficiently focused on these questions? Oh, sure. They notice that you are not focused on those political fights and arguments. Some of them will not like that, because those questions are what they care about. The alternative is that they notice the opposite. That’s worse.

Others appreciate that you are focused on solving problems and growing or preserving the pie, rather than arguing values and focusing on political battles.

Yes, if we succeed in getting to continue to live, as he says here we will then have to agree on how to divide the bounty and do the realignments (I would add, voluntarily or otherwise), same as we do today. But the parties aren’t in position to negotiate about this now, we don’t know what is available and we don’t know what we want and we don’t have anyone who could credibly negotiate for any of the sides or interests and so on. Kicking the ‘who benefits’ can down the road is a time tested thing to do when inventing new technologies and ensuring they’re safe to deploy.

The interactions I’ve had with Joshua after my initial errors leave me optimistic for continued productive dialogue. Whatever our disagreements, I believe Joshua is trying to figure things out and ensure we have a good future, and that all the public statements analyzed above were intended to be helpful. That is highly refreshing.

Those statements do contain many claims with which I very strongly disagree. We have very different threat models. We take very different views of various predictions and claims, about both the past and the future. At least in the recent past, he was highly dismissive of commonly expressed timeline projections, risk models and risk level assessments, including my own and even more those of many of his colleagues. At core, while I am very happy he at least does think ‘ordinary safety practices’ are necessary and worthwhile, he thinks ‘ordinary safety practices’ would ‘get us there’ and I very much do not expect this. And I fear the views he expresses may lead to shutting out many of those with the most important and strongest concerns.

These disagreements have what seem like important implications, so I am glad I took the time to focus on them and lay them out in detail, and hopefully start a broader discussion.



Maze of adapters, software patches gets a dedicated GPU working on a Raspberry Pi

Actually getting the GPU working required patching the Linux kernel to include the open-source AMDGPU driver, which includes Arm support and provides decent support for the RX 460 (Geerling says the card and its Polaris architecture were chosen because they were new enough to be practically useful and to be supported by the AMDGPU driver, old enough that driver support is pretty mature, and because the card is cheap and uses PCIe 3.0). Nvidia’s GPUs generally aren’t really an option for projects like this because the open source drivers lag far behind the ones available for Radeon GPUs.

Once various kernel patches were applied and the kernel was recompiled, installing AMD’s graphics firmware got both graphics output and 3D acceleration working more or less normally.

Despite their age and relative graphical simplicity, running Doom 3 or Tux Racer on the Pi 5’s GPU is a tall order, even at 1080p. The RX 460 was able to run both at 4K, albeit with some settings reduced; Geerling also said that the card rendered the Pi operating system’s UI smoothly at 4K (the Pi’s integrated GPU does support 4K output, but things get framey quickly in our experience, especially when using multiple monitors).

Though a qualified success, anything this hacky is likely to have at least some software problems; Geerling noted that graphics acceleration in the Chromium browser and GPU-accelerated video encoding and decoding support weren’t working properly.

Most Pi owners aren’t going to want to run out and recreate this setup themselves, but it is interesting to see progress when it comes to using dedicated GPUs with Arm CPUs. So far, Arm chips across all major software ecosystems—including Windows, macOS, and Android—have mostly been restricted to using their own integrated GPUs. But if Arm processors are really going to compete with Intel’s and AMD’s in every PC market segment, we’ll eventually need to see better support for external graphics chips.



“Sticky” steering sparks huge recall for Honda, 1.7M cars affected

Honda is recalling almost 1.7 million vehicles due to a steering defect. An improperly made part can cause certain cars’ steering to become “sticky”—never an attribute one wants in a moving vehicle.

The problem affects a range of newer Hondas and an Acura; the earliest the defective parts were used on any vehicle was February 2021. But it applies to the following:

  • 2022–2025 Honda Civic four-door
  • 2025 Honda Civic four-door hybrid
  • 2022–2025 Honda Civic five-door
  • 2025 Honda Civic five-door Hybrid
  • 2023–2025 Honda Civic Type-R
  • 2023–2025 Honda CR-V
  • 2023–2025 Honda CR-V Hybrid
  • 2025 Honda CR-V Fuel Cell Electric Vehicle
  • 2023–2025 Honda HR-V
  • 2023–2025 Acura Integra
  • 2024–2025 Acura Integra Type S

Honda says that a combination of environmental heat, moisture, and “an insufficient annealing process and high load single unit break-in during production of the worm wheel” means there’s too much pressure and not enough grease between the worm wheel and worm gear. On top of that, the worm gear spring isn’t quite right, “resulting in higher friction and increased torque fluctuation when steering.”

The first reports of the problem date back to 2021, and Honda had started an internal probe by November 2022. In March 2023, the National Highway Traffic Safety Administration started its own investigation, but the decision to issue the recall only took place in September of this year, by which point Honda says it had received 10,328 warranty claims, though with no reports of any injuries or worse.

Honda has just finished telling its dealers about the recall, and owners of the affected vehicles will be contacted next month. This time, there is no software patch that can help—affected cars will be fitted with a new worm gear spring and plenty of grease.



X ignores revenge porn takedown requests unless DMCA is used, study says

Why did the study target X?

The University of Michigan research team worried that their experiment posting AI-generated NCII on X might cross ethical lines.

They chose to conduct the study on X because they deduced it was “a platform where there would be no volunteer moderators and little impact on paid moderators,” if any viewed their AI-generated nude images.

X’s transparency report seems to suggest that most reported non-consensual nudity is actioned by human moderators, but researchers reported that their flagged content was never actioned without a DMCA takedown.

Since AI image generators are trained on real photos, researchers also took steps to ensure that AI-generated NCII in the study did not re-traumatize victims or depict real people who might stumble on the images on X.

“Each image was tested against a facial-recognition software platform and several reverse-image lookup services to verify it did not resemble any existing individual,” the study said. “Only images confirmed by all platforms to have no resemblance to individuals were selected for the study.”

These more “ethical” images were posted on X using popular hashtags like #porn, #hot, and #xxx, but their reach was limited to evade potential harm, researchers said.

“Our study may contribute to greater transparency in content moderation processes” related to NCII “and may prompt social media companies to invest additional efforts to combat deepfake” NCII, researchers said. “In the long run, we believe the benefits of this study far outweigh the risks.”

According to the researchers, X was given time to automatically detect and remove the content but failed to do so. It’s possible, the study suggested, that X’s decision to allow explicit content starting in June made it harder to detect NCII, as some experts had predicted.

To fix the problem, researchers suggested that both “greater platform accountability” and “legal mechanisms to ensure that accountability” are needed—as is much more research on other platforms’ mechanisms for removing NCII.

“A dedicated” NCII law “must clearly define victim-survivor rights and impose legal obligations on platforms to act swiftly in removing harmful content,” the study concluded.



Disney likely axed The Acolyte because of soaring costs

And in the end, the ratings just weren’t strong enough, especially for a Star Wars project. The Acolyte garnered 11.1 million views over its first five days (and 488 million minutes viewed)—not bad, but below Ahsoka‘s 14 million views over the same period. But those numbers declined sharply over the ensuing weeks, with the finale earning the dubious distinction of posting the lowest minutes viewed (335 million) for any Star Wars series finale.

Writing at Forbes, Caroline Reid noted that The Acolyte was hampered from the start by a challenging post-pandemic financial environment at Disney. It was greenlit in 2021 along with many other quite costly series to boost subscriber numbers for Disney+, contributing to $11.4 billion in losses in that division. Then Bob Iger returned as CEO and prioritized cutting costs. The Acolyte‘s heavy VFX needs and star casting (most notably Carrie-Anne Moss and Squid Game‘s Lee Jung-jae) made it a pricey proposition, with ratings expectations to match. And apparently the show didn’t generate as much merchandising revenue as expected.

As the folks at Slash Film pointed out, The Acolyte‘s bloated production costs aren’t particularly eye-popping compared to, say, Prime Video’s The Rings of Power, which costs a whopping $58 million per episode, or Marvel’s Secret Invasion (about $35 million per episode). But it’s pricey for a Star Wars series; The Mandalorian racked up around $15 million per episode, on par with Game of Thrones. So given the flagging ratings and lukewarm reviews, the higher costs proved to be “the final nail in the coffin” for the series in the eyes of Disney, per Reid.



Apple kicked Musi out of the App Store based on YouTube lie, lawsuit says


“Will Musi ever come back?”

Popular music app says YouTube never justified its App Store takedown request.

Musi, a free music-streaming app only available on iPhone, sued Apple last week, arguing that Apple breached Musi’s developer agreement by abruptly removing the app from its App Store for no good reason.

According to Musi, Apple decided to remove Musi from the App Store based on allegedly “unsubstantiated” claims from YouTube that Musi was infringing on YouTube’s intellectual property. The removal came, Musi alleged, based on a five-word complaint from YouTube that simply said Musi was “violating YouTube terms of service”—without ever explaining how. And YouTube also lied to Apple, Musi’s complaint said, by claiming that Musi neglected to respond to YouTube’s efforts to settle the dispute outside the App Store when Musi allegedly showed evidence that the opposite was true.

For years, Musi users have wondered if the service was legal, Wired reported in a May deep dive into the controversial app. Musi launched in 2016, providing a free, stripped-down service like Spotify by displaying YouTube and other publicly available content while running Musi’s own ads.

Musi’s curious ad model has led some users to question if artists were being paid for Musi streams. Reassuring 66 million users who downloaded the app before its removal from the App Store, Musi has long maintained that artists get paid for Musi streams and that the app is committed to complying with YouTube’s terms of service, Wired reported.

In its complaint, Musi fully admits that its app’s streams come from “publicly available content on YouTube’s website.” But rather than relying on YouTube’s Application Programming Interface (API) to make the content available to Musi users—which potentially could violate YouTube’s terms of service—Musi claims that it designed its own “augmentative interface.” That interface, Musi said, does not “store, process, or transmit YouTube videos” and instead “plays or displays content based on the user’s own interactions with YouTube and enhances the user experience via Musi’s proprietary technology.”

YouTube is apparently not buying Musi’s explanations that its service doesn’t violate YouTube’s terms. But Musi claimed that it has been “engaged in sporadic dialog” with YouTube “since at least 2015,” allegedly always responding to YouTube’s questions by either adjusting how the Musi app works or providing “details about how the Musi app works” and reiterating “why it is fully compliant with YouTube’s Terms of Service.”

How might Musi have violated YouTube’s TOS?

In 2021, Musi claimed to have engaged directly with YouTube’s outside counsel in hopes of settling this matter.

At that point, YouTube’s counsel allegedly “claimed that the Musi app violated YouTube’s Terms of Service” in three ways. First, Musi was accused of accessing and using YouTube’s non-public interfaces. Next, the Musi app was allegedly a commercial use of YouTube’s service, and third, relatedly, “the Musi app violated YouTube’s prohibition on the sale of advertising ‘on any page of any website or application that only contains Content from the Service or where Content from the Service is the primary basis for such sales.'”

Musi supposedly immediately “addressed these concerns” by reassuring YouTube that the Musi app never accesses its non-public interfaces and “merely allows users to access YouTube’s publicly available website through a functional interface and, thus, does not use YouTube in a commercial way.” Further, Musi told YouTube in 2021 that the app “does not sell advertising on any page that only contains content from YouTube or where such content is the primary basis for such sales.”

Apple suddenly becomes mediator

YouTube clearly was not persuaded by Musi’s reassurances but dropped its complaints until 2023. That’s when YouTube once again complained directly to Musi, only to allegedly stop responding to Musi entirely and instead raise its complaint through the App Store in August 2024.

That pivot put Apple in the middle of the dispute, and Musi alleged that Apple improperly sided with YouTube.

Once Apple got involved, Apple allegedly directed Musi to resolve the dispute with YouTube or else risk removal from the App Store. Musi claimed that it showed evidence of repeatedly reaching out to YouTube and receiving no response. Yet when YouTube told Apple that Musi was the one that went silent, Apple accepted YouTube’s claim and promptly removed Musi from the App Store.

“Apple’s decision to abruptly and arbitrarily remove the Musi app from the App Store without any indication whatsoever from the Complainant as to how Musi’s app infringed Complainant’s intellectual property or violated its Terms of Service,” Musi’s complaint alleged, “was unreasonable, lacked good cause, and violated Apple’s Development Agreement’s terms.”

Those terms state that removal is only on the table if Apple “reasonably believes” an app infringes on another’s intellectual property rights, and Musi argued Apple had no basis to “reasonably” believe YouTube’s claims.

Musi users heartbroken by App Store removal

This is perhaps the grandest stand that Musi has made yet to defend its app against claims that its service isn’t legal. According to Wired, one of Musi’s earliest investors backed out of the project, expressing fears that the app could be sued. But Musi has survived without legal challenge for years, even beating out some of Spotify’s top rivals while thriving in this seemingly gray territory that it’s now trying to make more black and white.

Musi says it’s suing to defend its reputation, which it says has been greatly harmed by the app’s removal.

Musi is hoping a jury will agree that Apple breached its developer agreement and the covenant of good faith and fair dealing by removing Musi from the App Store. The music-streaming app has asked for a permanent injunction immediately reinstating Musi in the App Store and stopping Apple from responding to third-party complaints by removing apps without any evidence of infringement.

An injunction is urgently needed, Musi claimed, since the app only exists in Apple’s App Store, and Musi and its users face “irreparable damage” if the app is not restored. Additionally, Musi is seeking damages to be determined at trial to make up for “lost profits and other consequential damages.”

“The Musi app did not and does not infringe any intellectual property rights held by Complainant, and a reasonable inquiry into the matter would have led Apple to conclude the same,” Musi’s complaint said.

On Reddit, Musi has continued to support users reporting issues with the app since its removal from the App Store. One longtime user lamented, “my heart is broken,” after buying a new iPhone and losing access to the app.

It’s unclear if YouTube intends to take Musi down forever with this tactic. In May, Wired noted that Musi isn’t the only music-streaming app taking advantage of publicly available content, predicting that if “Musi were to shut down, a bevy of replacements would likely sprout up.” Meanwhile, some users on Reddit reported that fake Musi apps keep popping up in its absence.

For Musi, getting back online is as much about retaining old users as it is about attracting new downloads. In its complaint, Musi said that “Apple’s decision has caused immediate and ongoing financial and reputational harm to Musi.” On Reddit, one Musi user asked what many fans are likely wondering: “Will Musi ever come back,” or is it time to “just move to a different app”?

Ars could not immediately reach Musi’s lawyers, Apple, or YouTube for comment.




Hurricane Milton becomes second-fastest storm to reach Category 5 status

Tampa in the crosshairs

The Tampa Bay metro area, with a population of more than 3 million people, has grown into the most developed region on the west coast of Florida. For those of us who follow hurricanes, this region has stood out in recent years for a preternatural ability to dodge large and powerful hurricanes. There have been some close calls to be sure, especially of late with Hurricane Ian in 2022, and Hurricane Helene just last month.

But the reality is that a major hurricane, defined as Category 3 or larger on the Saffir-Simpson Scale, has not made a direct impact on Tampa Bay since 1921.

It remains to be seen what precisely happens with Milton. The storm should reach its peak intensity over the course of the next day or so. At some point Milton should undergo an eyewall replacement cycle, which leads to some weakening. In addition, the storm is likely to ingest dry air from its west and north as a cold front works its way into the northern Gulf of Mexico. (This front is also responsible for Milton’s odd eastward track across the Gulf, where storms more commonly travel from east to west.)

11 am ET Monday track forecast for Hurricane Milton. Credit: National Hurricane Center

So by Wednesday, at the latest, Milton should be weakening as it approaches the Florida coast. However, it will nonetheless be a very large and powerful hurricane, and by that point the worst of its storm surge capabilities will already be baked in—that is, the storm surge will still be tremendous regardless of whether Milton weakens.

By Wednesday evening a destructive storm surge will be crashing into the west coast of Florida, perhaps in Tampa Bay, or further to the south, near Fort Myers. A broad streak of wind gusts above 100 mph will hit the Florida coast as well, and heavy rainfall will douse much of the central and northern parts of the state.

For now, Milton is making some history by rapidly strengthening in the Gulf of Mexico. By the end of this week, it will very likely become historic for the damage, death, and destruction in its wake. If you live in affected areas, please heed evacuation warnings.



Greening of Antarctica shows how climate change affects the frozen continent


Plant growth is accelerating on the Antarctic Peninsula and nearby islands.

Moss and rocks cover the ground on Robert Island in Antarctica. Photographer: Isadora Romero/Bloomberg. Credit: Bloomberg via Getty

When satellites first started peering down on the craggy, glaciated Antarctic Peninsula about 40 years ago, they saw only a few tiny patches of vegetation covering a total of about 8,000 square feet—less than a football field.

But since then, the Antarctic Peninsula has warmed rapidly, and a new study shows that mosses, along with some lichen, liverworts and associated algae, have colonized more than 4.6 square miles, an area nearly four times the size of New York’s Central Park.

The findings, published Friday in Nature Geoscience, based on a meticulous analysis of Landsat images from 1986 to 2021, show that the greening trend is distinct from natural variability and that it has accelerated by 30 percent since 2016, fast enough to cover nearly 75 football fields per year.

Greening at the opposite end of the planet, in the Arctic, has been widely studied and reported, said co-author Thomas Roland, a paleoecologist with the University of Exeter who collects and analyzes mud samples to study environmental and ecological change. “But the idea,” he said, “that any part of Antarctica could, in any way, be green is something that still really jars a lot of people.”

Illustration of Antarctica and satellite photos. Credit: Inside Climate News

As the planet heats up, “even the coldest regions on Earth that we expect and understand to be white and black with snow, ice, and rock are starting to become greener as the planet responds to climate change,” he said.

The tenfold increase in vegetation cover since 1986 “is not huge in the global scheme of things,” Roland added, but the accelerating rate of change and the potential ecological effects are significant. “That’s the real story here,” he said. “The landscape is going to be altered partially because the existing vegetation is expanding, but it could also be altered in the future with new vegetation coming in.”

In the Arctic, vegetation is expanding on a scale that affects the albedo, or the overall reflectivity of the region, which determines the proportion of the sun’s heat energy that is absorbed by the Earth’s surface as opposed to being bounced away from the planet. But the spread of greenery has not yet changed the albedo of Antarctica on a meaningful scale because the vegetated areas are still too small to have a regional impact, said co-author Olly Bartlett, a University of Hertfordshire researcher who specializes in using satellite data to map environmental change.

“The real significance is about the ecological shift on the exposed land, the land that’s ice-free, creating an area suitable for more advanced plant life or invasive species to get a foothold,” he said.

Bartlett said Google Earth Engine enabled the scientists to process a massive amount of data from the Landsat images to meet a high standard of verification of plant growth. As a result, he added, the changes they reported may actually be conservative.
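To make that kind of processing concrete, here is a minimal Google Earth Engine sketch in Python of the sort of Landsat vegetation analysis Bartlett describes: filter a collection of Landsat surface-reflectance scenes over a region, compute NDVI, and sum the area above a greenness threshold. The dataset ID and band names are real Earth Engine identifiers, but the bounding box, date range, and 0.2 NDVI threshold are illustrative assumptions, not the study’s actual method.

```python
# A minimal sketch (not the authors' pipeline) of the kind of Landsat analysis
# Google Earth Engine enables: compute a median NDVI composite over part of the
# Antarctic Peninsula and estimate the area above a vegetation threshold.
# Dataset ID and bands are real; the region, dates, and 0.2 threshold are assumptions.
import ee

ee.Initialize()

# Rough bounding box over the northern Antarctic Peninsula (illustrative only).
peninsula = ee.Geometry.Rectangle([-64.0, -65.5, -56.0, -61.0])

# Landsat 8 Collection 2 surface reflectance, austral summer scenes.
collection = (
    ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
    .filterBounds(peninsula)
    .filterDate("2020-11-01", "2021-03-31")
)

def add_ndvi(image):
    # NDVI = (NIR - Red) / (NIR + Red); SR_B5/SR_B4 are Landsat 8's NIR/red bands.
    # (Surface-reflectance scale factors are skipped here for brevity.)
    return image.addBands(image.normalizedDifference(["SR_B5", "SR_B4"]).rename("NDVI"))

ndvi = collection.map(add_ndvi).select("NDVI").median().clip(peninsula)

# Flag pixels above the (assumed) vegetation threshold and sum their area in km^2.
vegetated = ndvi.gt(0.2)
area_km2 = (
    vegetated.multiply(ee.Image.pixelArea())
    .reduceRegion(reducer=ee.Reducer.sum(), geometry=peninsula, scale=30, maxPixels=1e13)
    .getNumber("NDVI")
    .divide(1e6)
)
print("Approx. vegetated area (km^2):", area_km2.getInfo())
```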

“It’s becoming easier for life to live there,” he said. “These rates of change we’re seeing made us think that perhaps we’ve captured the start of a more dramatic transformation.”

In the areas they studied, changes to the albedo could have a small local effect, Roland said, as more land free of reflective ice “can feed into a positive feedback loop that creates conditions that are more favorable for vegetation expansion as well.”

Antarctic forests at similar CO2 levels

Other research, including fossil studies, suggests that beech trees grew on Antarctica as recently as 2.5 million years ago, when carbon dioxide levels in the atmosphere were similar to today, another indicator of how unchecked greenhouse gas emissions can rapidly warm Earth’s climate.

Currently, there are only two species of flowering plants native to the Antarctic Peninsula: Antarctic hair grass and Antarctic pearlwort. “But with a few new grass seeds here and there, or a few spores, and all of a sudden, you’ve got a very different ecosystem,” he said.

And it’s not just plants, he added. “Increasingly, we’re seeing evidence that non-native insect life is taking hold in Antarctica. And that can dramatically change things as well.”

The study shows how climate warming will shake up Antarctic ecosystems, said conservation scientist Jasmine Lee, a research fellow with the British Antarctic Survey who was not involved in the new study.

“It is clear that bank-forming mosses are expanding their range with warmer and wetter conditions, which is likely facilitating similar expansions for some of the invertebrate communities that rely on them for habitat,” she said. “At the same time, some specialist species, such as the more dry-loving mosses and invertebrates, might decline.”

She said the new study is valuable because it provides data across a broad region showing that Antarctic ecosystems are already rapidly altering and will continue to do so as climate change progresses.

“We focus a lot on how climate change is melting ice sheets and changing sea ice,” she said. “It’s good to also highlight that the terrestrial ecosystems are being impacted.”

The study shows climate impacts growing in “regions previously thought nearly immune to the accelerated warming we’re seeing today,” said climate policy expert Pam Pearson, director of the International Cryosphere Climate Initiative.

“It’s as important a signal as the loss of Antarctic sea ice over the past several years,” she said.

The new study identified vegetative changes by comparing the Landsat images at a resolution of 300 square feet per pixel, detailed enough to accurately map vegetative growth, but it didn’t identify specific climate change factors that might be driving the expansion of plant life.

But other recent studies have documented Antarctic changes that could spur plant growth, including how some regions are affected by warm winds and by increasing amounts of rain from atmospheric rivers, as well as by declining sea ice that leads adjacent land areas to warm, all signs of rapid change in Antarctica.

Roland said their new study was in part spurred by previous research showing how fast patches of Antarctic moss were growing vertically and how microbial activity in tiny patches of soil was also accelerating.

“We’d taken these sediment cores, and done all sorts of analysis, including radiocarbon dating … showing the growth in the plants we’d sampled increasing dramatically,” he said.

Those measurements confirmed that the plants are sensitive to climate change, and as a next step, researchers wanted to know “if the plants are growing sideways at the same dramatic rate,” he said. “It’s one thing for plants to be growing upwards very fast. If they’re growing outwards, then you know you’re starting to see massive changes and massive increases in vegetation cover across the peninsula.”

With the study documenting significant horizontal expansion of vegetation, the researchers are now studying how recently deglaciated areas were first colonized by plants. About 90 percent of the glaciers on the Antarctic Peninsula have been shrinking for the past 75 years, Roland said.

“That’s just creating more and more land for this potentially rapid vegetation response,” he said. “So like Olly says, one of the things we can’t rule out is that this really does increase quite dramatically over the next few decades. Our findings raise serious concerns about the environmental future of the Antarctic Peninsula and of the continent as a whole.”

This story originally appeared on Inside Climate News.



neo-nazis-head-to-encrypted-simplex-chat-app,-bail-on-telegram

Neo-Nazis head to encrypted SimpleX Chat app, bail on Telegram

“SimpleX, at its core, is designed to be truly distributed with no central server. This allows for enormous scalability at low cost, and also makes it virtually impossible to snoop on the network graph,” Poberezkin wrote in a company blog post published in 2022.

SimpleX’s policies expressly prohibit “sending illegal communications” and outline how SimpleX will remove such content if it is discovered. Much of the content that these terrorist groups have shared on Telegram—and are already resharing on SimpleX—has been deemed illegal in the UK, Canada, and Europe.

Argentino wrote in his analysis that discussion about moving from Telegram to platforms with better security measures began in June, with discussion of SimpleX as an option taking place in July among a number of extremist groups. Though it wasn’t until September, and the Terrorgram arrests, that the decision was made to migrate to SimpleX, the groups are already establishing themselves on the new platform.

“The groups that have migrated are already populating the platform with legacy material such as Terrorgram manuals and are actively recruiting propagandists, hackers, and graphic designers, among other desired personnel,” the ISD researchers wrote.

However, the additional security SimpleX provides comes with downsides for these groups: networking, and therefore growing, is not as easy, and disseminating propaganda faces similar restrictions.

“While there is newfound enthusiasm over the migration, it remains unclear if the platform will become a central organizing hub,” ISD researchers wrote.

And Poberezkin believes that the current limitations of his technology will mean these groups will eventually abandon SimpleX.

“SimpleX is a communication network rather than a service or a platform where users can host their own servers, like in OpenWeb, so we were not aware that extremists have been using it,” says Poberezkin. “We never designed groups to be usable for more than 50 users and we’ve been really surprised to see them growing to the current sizes despite limited usability and performance. We do not think it is technically possible to create a social network of a meaningful size in the SimpleX network.”

This story originally appeared on wired.com.


thousands-of-linux-systems-infected-by-stealthy-malware-since-2021

Thousands of Linux systems infected by stealthy malware since 2021


The ability to remain installed and undetected makes Perfctl hard to fight.


Thousands of machines running Linux have been infected by a malware strain that’s notable for its stealth, the number of misconfigurations it can exploit, and the breadth of malicious activities it can perform, researchers reported Thursday.

The malware has been circulating since at least 2021. It gets installed by exploiting more than 20,000 common misconfigurations, a capability that may make millions of machines connected to the Internet potential targets, researchers from Aqua Security said. It can also exploit CVE-2023-33246, a vulnerability with a severity rating of 10 out of 10 that was patched last year in Apache RocketMQ, a messaging and streaming platform that’s found on many Linux machines.

Perfctl storm

The researchers are calling the malware Perfctl, the name of a malicious component that surreptitiously mines cryptocurrency. The unknown developers of the malware gave the process a name that combines the perf Linux monitoring tool and ctl, an abbreviation commonly used with command-line tools. A signature characteristic of Perfctl is its use of process and file names that are identical or similar to those commonly found in Linux environments. The naming convention is one of the many ways the malware attempts to escape the notice of infected users.

Perfctl further cloaks itself using a host of other tricks. One is that it installs many of its components as rootkits, a special class of malware that hides its presence from the operating system and administrative tools. Other stealth mechanisms include:

  • Stopping activities that are easy to detect when a new user logs in
  • Using a Unix socket over TOR for external communications
  • Deleting its installation binary after execution and running as a background service thereafter
  • Hooking the libpcap function pcap_loop to prevent admin tools from recording the malicious traffic
  • Suppressing mesg errors to avoid any visible warnings during execution.

The malware is designed to ensure persistence, meaning the ability to remain on the infected machine after reboots or attempts to delete core components. Two such techniques are (1) modifying the ~/.profile script, which sets up the environment during user login, so the malware loads ahead of the legitimate workloads expected to run on the server, and (2) copying itself from memory to multiple disk locations. The hooking of pcap_loop can also provide persistence by allowing malicious activities to continue even after primary payloads are detected and removed.
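The persistence and stealth tricks above suggest some simple checks an administrator could script. The sketch below is a generic illustration, not Aqua Security’s tooling: it flags processes whose on-disk executable has been deleted (a side effect of running from memory after the installer removes itself) and lists ~/.profile lines missing from a trusted baseline copy; the baseline path is a hypothetical example.

```python
# A generic detection sketch (not Aqua Security's tooling) motivated by the
# persistence tricks described above: flag processes whose executable has been
# deleted from disk, and flag ~/.profile lines that are not in a known-good copy.
# The "known-good" baseline path is an illustrative assumption.
import os
import glob

def processes_with_deleted_exe():
    """Yield (pid, target) for processes whose /proc/<pid>/exe points at a deleted file."""
    for exe_link in glob.glob("/proc/[0-9]*/exe"):
        try:
            target = os.readlink(exe_link)
        except OSError:
            continue  # process exited or permission denied
        if target.endswith(" (deleted)"):
            pid = exe_link.split("/")[2]
            yield pid, target

def unexpected_profile_lines(profile_path, baseline_path):
    """Return lines present in ~/.profile but absent from a trusted baseline copy."""
    with open(baseline_path) as f:
        baseline = set(line.strip() for line in f)
    with open(profile_path) as f:
        return [line.strip() for line in f if line.strip() and line.strip() not in baseline]

if __name__ == "__main__":
    for pid, target in processes_with_deleted_exe():
        print(f"PID {pid} is running a deleted executable: {target}")
    home = os.path.expanduser("~")
    # Assumes you keep a trusted, read-only copy of .profile somewhere (hypothetical path).
    print(unexpected_profile_lines(f"{home}/.profile", "/root/baselines/profile.good"))
```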

Besides using the machine resources to mine cryptocurrency, Perfctl also turns the machine into a profit-making proxy that paying customers use to relay their Internet traffic. Aqua Security researchers have also observed the malware serving as a backdoor to install other families of malware.

Assaf Morag, Aqua Security’s threat intelligence director, wrote in an email:

Perfctl malware stands out as a significant threat due to its design, which enables it to evade detection while maintaining persistence on infected systems. This combination poses a challenge for defenders and indeed the malware has been linked to a growing number of reports and discussions across various forums, highlighting the distress and frustration of users who find themselves infected.

Perfctl uses a rootkit and changes some of the system utilities to hide the activity of the cryptominer and proxy-jacking software. It blends seamlessly into its environment with seemingly legitimate names. Additionally, Perfctl’s architecture enables it to perform a range of malicious activities, from data exfiltration to the deployment of additional payloads. Its versatility means that it can be leveraged for various malicious purposes, making it particularly dangerous for organizations and individuals alike.

“The malware always manages to restart”

While Perfctl and some of the malware it installs are detected by some antivirus software, Aqua Security researchers were unable to find any research reports on the malware. They were, however, able to find a wealth of threads on developer-related sites that discussed infections consistent with it.

This Reddit comment posted to the CentOS subreddit is typical. An admin noticed that two servers were infected with a cryptocurrency hijacker with the names perfcc and perfctl. The admin wanted help investigating the cause.

“I only became aware of the malware because my monitoring setup alerted me to 100% CPU utilization,” the admin wrote in the April 2023 post. “However, the process would stop immediately when I logged in via SSH or console. As soon as I logged out, the malware would resume running within a few seconds or minutes.” The admin continued:

I have attempted to remove the malware by following the steps outlined in other forums, but to no avail. The malware always manages to restart once I log out. I have also searched the entire system for the string “perfcc” and found the files listed below. However, removing them did not resolve the issue. as it keep respawn on each time rebooted.

Other discussions include: Reddit, Stack Overflow (Spanish), forobeta (Spanish), brainycp (Russian), natnetwork (Indonesian), Proxmox (German), Camel2243 (Chinese), svrforum (Korean), exabytes, virtualmin, serverfault and many others.

After exploiting a vulnerability or misconfiguration, the exploit code downloads the main payload from a server, which, in most cases, has been hacked by the attacker and converted into a channel for distributing the malware anonymously. An attack that targeted the researchers’ honeypot named the payload httpd. Once executed, the file copies itself from memory to a new location in the /tmp directory, runs the new copy, and then terminates the original process and deletes the downloaded binary.

Once moved to the /tmp directory, the file executes under a different name, which mimics the name of a known Linux process. The file hosted on the honeypot was named sh. From there, the file establishes a local command-and-control process and attempts to gain root system rights by exploiting CVE-2021-4043, a privilege-escalation vulnerability that was patched in 2021 in Gpac, a widely used open source multimedia framework.

The malware goes on to copy itself from memory to a handful of other disk locations, once again using names that appear as routine system files. The malware then drops a rootkit, a host of popular Linux utilities that have been modified to serve as rootkits, and the miner. In some cases, the malware also installs software for “proxy-jacking,” the term for surreptitiously routing traffic through the infected machine so the true origin of the data isn’t revealed.

The researchers continued:

As part of its command-and-control operation, the malware opens a Unix socket, creates two directories under the /tmp directory, and stores data there that influences its operation. This data includes host events, locations of the copies of itself, process names, communication logs, tokens, and additional log information. Additionally, the malware uses environment variables to store data that further affects its execution and behavior.

All the binaries are packed, stripped, and encrypted, indicating significant efforts to bypass defense mechanisms and hinder reverse engineering attempts. The malware also uses advanced evasion techniques, such as suspending its activity when it detects a new user in the btmp or utmp files and terminating any competing malware to maintain control over the infected system.

The diagram below captures the attack flow:

Credit: Aqua Security

Credit: Aqua Security

The following image captures some of the names given to the malicious files that are installed:

Credit: Aqua Security

Credit: Aqua Security

By extrapolating data such as the number of Linux servers connected to the Internet across various services and applications, as tracked by services such as Shodan and Censys, the researchers estimate that the number of machines infected by Perfctl is measured in the thousands. They say that the pool of vulnerable machines—meaning those that have yet to install the patch for CVE-2023-33246 or contain a vulnerable misconfiguration—is in the millions. The researchers have yet to measure the amount of cryptocurrency the malicious miners have generated.

People who want to determine if their device has been targeted or infected by Perfctl should look for indicators of compromise included in Thursday’s post. They should also be on the lookout for unusual spikes in CPU usage or sudden system slowdowns, particularly if they occur during idle times. To prevent infections, it’s important that the patch for CVE-2023-33246 be installed and that the misconfigurations identified by Aqua Security be fixed. Thursday’s report provides other steps for preventing infections.
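As a rough illustration of that advice, the sketch below (using the psutil library) periodically checks for high CPU usage on a machine that should be idle and for process names resembling the perfctl and perfcc names mentioned above. The threshold and name list are assumptions; the indicators of compromise in Aqua Security’s report are the authoritative reference.

```python
# A simple monitoring sketch of the advice above: watch for sustained CPU load
# while the machine should be idle and for process names resembling the ones the
# article mentions (perfctl, perfcc). The name list and threshold are assumptions;
# consult Aqua Security's published indicators of compromise for real checks.
# Requires: pip install psutil
import time
import psutil

SUSPICIOUS_NAMES = {"perfctl", "perfcc"}   # names cited in the article; extend as needed
CPU_ALERT_PERCENT = 80.0                   # illustrative threshold

def scan_once():
    alerts = []
    total_cpu = psutil.cpu_percent(interval=1.0)
    if total_cpu > CPU_ALERT_PERCENT:
        alerts.append(f"High overall CPU while idle: {total_cpu:.0f}%")
    for proc in psutil.process_iter(attrs=["pid", "name", "exe"]):
        name = (proc.info["name"] or "").lower()
        if name in SUSPICIOUS_NAMES:
            alerts.append(f"Suspicious process: {name} (pid {proc.info['pid']}, exe {proc.info['exe']})")
    return alerts

if __name__ == "__main__":
    while True:
        for alert in scan_once():
            print(alert)
        time.sleep(60)  # rescan every minute
```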


Dan Goodin is Senior Security Editor at Ars Technica, where he oversees coverage of malware, computer espionage, botnets, hardware hacking, encryption, and passwords. In his spare time, he enjoys gardening, cooking, and following the independent music scene. Dan is based in San Francisco. Follow him at @dangoodin on Mastodon. Contact him on Signal at DanArs.82.


how-london’s-crystal-palace-was-built-so-quickly

How London’s Crystal Palace was built so quickly

London’s Great Exhibition of 1851 attracted some 6 million people eager to experience more than 14,000 exhibitors showcasing 19th-century marvels of technology and engineering. The event took place in the Crystal Palace, a 990,000-square-foot building of cast iron and plate glass originally located in Hyde Park. And it was built in an incredible 190 days. According to a recent paper published in the International Journal for the History of Engineering and Technology, one of the secrets was the use of a standardized screw thread, first proposed 10 years before its construction, although the thread did not officially become the British standard until 1905.

“During the Victorian era there was incredible innovation from workshops right across Britain that was helping to change the world,” said co-author John Gardner of Anglia Ruskin University (ARU). “In fact, progress was happening at such a rate that certain breakthroughs were perhaps never properly realized at the time, as was the case here with the Crystal Palace. Standardization in engineering is essential and commonplace in the 21st century, but its role in the construction of the Crystal Palace was a major development.”

The design competition for what would become the Crystal Palace was launched in March 1850, with a deadline four weeks later, and the actual, fully constructed building opened on May 1, 1851. The winning design, by Joseph Paxton, wasn’t chosen until quite late in the game after numerous designs had been rejected—most because they were simply too far above the £100,000 budget.

Joseph Paxton’s first sketch for the Great Exhibition Building, c. 1850, using pen and ink on blotting paper. Credit: Victoria and Albert Museum/CC BY-SA 3.0

Paxton’s design called for what was essentially a giant conservatory consisting of a multi-dimensional grid of 24-foot modules. The design elements included 3,300 supporting columns with four flange faces, drilled so they could be bolted to connecting and base pieces. (The hollow columns did double duty as drainage pipes for rainwater.) The design also called for diagonal bracing (aka cross bracing) for additional stability.


the-more-sophisticated-ai-models-get,-the-more-likely-they-are-to-lie

The more sophisticated AI models get, the more likely they are to lie


Human feedback training may incentivize providing any answer—even wrong ones.

Image of a Pinocchio doll with a long nose and a small green sprig at the end.

When a research team led by Amrit Kirpalani, a medical educator at Western University in Ontario, Canada, evaluated ChatGPT’s performance in diagnosing medical cases back in August 2024, one of the things that surprised them was the AI’s propensity to give well-structured, eloquent but blatantly wrong answers.

Now, in a study recently published in Nature, a different group of researchers tried to explain why ChatGPT and other large language models tend to do this. “To speak confidently about things we do not know is a problem of humanity in a lot of ways. And large language models are imitations of humans,” says Wout Schellaert, an AI researcher at the University of Valencia, Spain, and co-author of the paper.

Smooth operators

Early large language models like GPT-3 had a hard time answering simple questions about geography or science. They even struggled with performing simple math, such as “how much is 20 + 183?” But in most cases where they couldn’t identify the correct answer, they did what an honest human being would do: They avoided answering the question.

The problem with the non-answers is that large language models were intended to be question-answering machines. For commercial companies like OpenAI or Meta that were developing advanced LLMs, a question-answering machine that answered “I don’t know” more than half the time was simply a bad product. So, they got busy solving this problem.

The first thing they did was scale the models up. “Scaling up refers to two aspects of model development. One is increasing the size of the training data set, usually a collection of text from websites and books. The other is increasing the number of language parameters,” says Schellaert. When you think about an LLM as a neural network, the number of parameters can be compared to the number of synapses connecting its neurons. LLMs like GPT-3 used absurd amounts of text data, exceeding 45 terabytes, for training. The number of parameters used by GPT-3 was north of 175 billion.

But it was not enough.

Scaling up alone made the models more powerful, but they were still bad at interacting with humans—slight variations in how you phrased your prompts could lead to drastically different results. The answers often didn’t feel human-like and sometimes were downright offensive.

Developers working on LLMs wanted them to parse human questions better and make answers more accurate, more comprehensible, and consistent with generally accepted ethical standards. To try to get there, they added an additional step: supervised learning methods such as reinforcement learning from human feedback (RLHF). This was meant primarily to reduce sensitivity to prompt variations and to provide a level of output-filtering moderation intended to curb hate-spewing, Tay-chatbot-style answers.
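For a sense of what that feedback step looks like mechanically, here is a minimal PyTorch sketch of the preference-modeling component behind RLHF: a reward model is trained so that responses human raters preferred score higher than the ones they rejected. Everything here (the linear scorer, the shapes, the names) is illustrative; it is not any lab’s actual code, and full RLHF additionally optimizes the language model against the learned reward.

```python
# A minimal PyTorch sketch of the preference-modeling step behind RLHF: a reward
# model is trained so that responses humans preferred score higher than the ones
# they rejected. All names and shapes are illustrative; this is not any lab's code,
# and full RLHF additionally optimizes the language model against this reward.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        # Stand-in for a transformer: maps a pooled response embedding to a scalar reward.
        self.scorer = nn.Linear(hidden_size, 1)

    def forward(self, pooled_embedding: torch.Tensor) -> torch.Tensor:
        return self.scorer(pooled_embedding).squeeze(-1)

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry style loss: push the chosen response's reward above the rejected one's.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage with random embeddings standing in for encoded (prompt, response) pairs.
model = RewardModel()
chosen = model(torch.randn(8, 768))     # embeddings of human-preferred responses
rejected = model(torch.randn(8, 768))   # embeddings of rejected responses
loss = preference_loss(chosen, rejected)
loss.backward()
print(float(loss))
```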

In other words, we got busy adjusting the AIs by hand. And it backfired.

AI people pleasers

“The notorious problem with reinforcement learning is that an AI optimizes to maximize reward, but not necessarily in a good way,” Schellaert says. Some of the reinforcement learning involved human supervisors who flagged answers they were not happy with. Since it’s hard for humans to be happy with “I don’t know” as an answer, one thing this training told the AIs was that saying “I don’t know” was a bad thing. So, the AIs mostly stopped doing that. But another, more important thing human supervisors flagged was incorrect answers. And that’s where things got a bit more complicated.

AI models are not really intelligent, not in a human sense of the word. They don’t know why something is rewarded and something else is flagged; all they are doing is optimizing their performance to maximize reward and minimize red flags. When incorrect answers were flagged, getting better at giving correct answers was one way to optimize things. The problem was that getting better at hiding incompetence worked just as well. Human supervisors simply didn’t flag wrong answers that appeared good and coherent enough to them.

In other words, if a human didn’t know whether an answer was correct, they wouldn’t be able to penalize wrong but convincing-sounding answers.

Schellaert’s team looked into three major families of modern LLMs: OpenAI’s ChatGPT, the LLaMA series developed by Meta, and the BLOOM suite made by BigScience. They found what’s called ultracrepidarianism, the tendency to give opinions on matters we know nothing about. It started to appear in the AIs as a consequence of increasing scale, growing linearly and predictably with the amount of training data in all of them. Supervised feedback “had a worse, more extreme effect,” Schellaert says. The first model in the GPT family that almost completely stopped avoiding questions it didn’t have the answers to was text-davinci-003. It was also the first GPT model trained with reinforcement learning from human feedback.

The AIs lie because we told them that doing so was rewarding. One key question is when, and how often, we get lied to.

Making it harder

To answer this question, Schellaert and his colleagues built a set of questions in different categories like science, geography, and math. Then, they rated those questions based on how difficult they were for humans to answer, using a scale from 1 to 100. The questions were then fed into subsequent generations of LLMs, starting from the oldest to the newest. The AIs’ answers were classified as correct, incorrect, or evasive, meaning the AI refused to answer.
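A toy sketch of that kind of bookkeeping is below: each answer carries a label of correct, incorrect, or avoidant plus a human-rated difficulty from 1 to 100, and the mix of labels is tallied per difficulty bin. The record format and bin width are hypothetical; the study’s real benchmarks and grading are more involved.

```python
# A sketch of the kind of aggregation described above: each model answer carries a
# label (correct / incorrect / avoidant) and a human-rated difficulty from 1 to 100,
# and we look at how the mix of labels shifts across difficulty bins. The record
# format is hypothetical; the study's real benchmarks and graders are more involved.
from collections import Counter, defaultdict

records = [
    # (difficulty 1-100, label)
    (12, "correct"), (35, "correct"), (48, "incorrect"),
    (55, "avoidant"), (72, "incorrect"), (88, "incorrect"),
]

def rates_by_difficulty_bin(records, bin_width=25):
    counts = defaultdict(Counter)
    for difficulty, label in records:
        bin_start = (difficulty - 1) // bin_width * bin_width + 1
        counts[(bin_start, bin_start + bin_width - 1)][label] += 1
    for bin_range in sorted(counts):
        c = counts[bin_range]
        total = sum(c.values())
        yield bin_range, {label: c[label] / total for label in ("correct", "incorrect", "avoidant")}

for bin_range, rates in rates_by_difficulty_bin(records):
    print(bin_range, rates)
```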

The first finding was that the questions that appeared more difficult to us also proved more difficult for the AIs. The latest versions of ChatGPT gave correct answers to nearly all science-related prompts and the majority of geography-oriented questions up until they were rated roughly 70 on Schellaert’s difficulty scale. Addition was more problematic, with the frequency of correct answers falling dramatically after the difficulty rose above 40. “Even for the best models, the GPTs, the failure rate on the most difficult addition questions is over 90 percent. Ideally we would hope to see some avoidance here, right?” says Schellaert. But we didn’t see much avoidance.

Instead, in more recent versions of the AIs, the evasive “I don’t know” responses were increasingly replaced with incorrect ones. And due to the supervised training used in later generations, the AIs developed the ability to sell those incorrect answers quite convincingly. Out of the three LLM families Schellaert’s team tested, BLOOM and Meta’s LLaMA have released the same versions of their models with and without supervised learning. In both cases, supervised learning resulted in a higher number of correct answers, but also in a higher number of incorrect answers and reduced avoidance. The more difficult the question and the more advanced the model you use, the more likely you are to get well-packaged, plausible nonsense as your answer.

Back to the roots

One of the last things Schellaert’s team did in their study was to check how likely people were to take the incorrect AI answers at face value. They did an online survey and asked 300 participants to evaluate multiple prompt-response pairs coming from the best performing models in each family they tested.

ChatGPT emerged as the most effective liar. The incorrect answers it gave in the science category were qualified as correct by over 19 percent of participants. It managed to fool nearly 32 percent of people in geography and over 40 percent in transforms, a task where an AI had to extract and rearrange information present in the prompt. ChatGPT was followed by Meta’s LLaMA and BLOOM.

“In the early days of LLMs, we had at least a makeshift solution to this problem. The early GPT interfaces highlighted parts of their responses that the AI wasn’t certain about. But in the race to commercialization, that feature was dropped,” said Schellaert.

“There is an inherent uncertainty present in LLMs’ answers. The most likely next word in the sequence is never 100 percent likely. This uncertainty could be used in the interface and communicated to the user properly,” says Schellaert. Another thing he thinks can be done to make LLMs less deceptive is handing their responses over to separate AIs trained specifically to search for deceptions. “I’m not an expert in designing LLMs, so I can only speculate what exactly is technically and commercially viable,” he adds.
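As a small illustration of that interface idea, the sketch below converts a model’s per-token logits into probabilities and flags tokens that were generated with low confidence. The tokens, logits, and 0.5 threshold are made up for the example; a real implementation would read them from the serving model’s output.

```python
# A sketch of the interface idea Schellaert describes: surface per-token uncertainty
# by converting the model's logits into probabilities and flagging tokens generated
# with low confidence. The tokens, logits, and 0.5 threshold below are made up;
# a real implementation would read these from the serving model's output.
import math

def flag_uncertain_tokens(tokens, chosen_logits, all_logits, threshold=0.5):
    """Mark tokens whose softmax probability fell below `threshold` when sampled."""
    flagged = []
    for token, chosen, logits in zip(tokens, chosen_logits, all_logits):
        denom = sum(math.exp(l) for l in logits)
        prob = math.exp(chosen) / denom
        flagged.append((token, prob, prob < threshold))
    return flagged

# Toy example: three generated tokens, each with the logit of the chosen token and
# the full logit vector over a (tiny, fake) vocabulary.
tokens = ["The", "answer", "is"]
chosen_logits = [4.0, 1.2, 0.3]
all_logits = [[4.0, 1.0, 0.5], [1.2, 1.1, 1.0], [0.3, 0.2, 0.4]]

for token, prob, uncertain in flag_uncertain_tokens(tokens, chosen_logits, all_logits):
    marker = " (low confidence)" if uncertain else ""
    print(f"{token}: p={prob:.2f}{marker}")
```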

It’s going to take some time, though, before the companies that are developing general-purpose AIs do something about it, either of their own accord or because future regulations force them to. In the meantime, Schellaert has some suggestions on how to use them effectively. “What you can do today is use AI in areas where you are an expert yourself or at least can verify the answer with a Google search afterwards. Treat it as a helping tool, not as a mentor. It’s not going to be a teacher that proactively shows you where you went wrong. Quite the opposite. When you nudge it enough, it will happily go along with your faulty reasoning,” Schellaert says.

Nature, 2024. DOI: 10.1038/s41586-024-07930-y


Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.
