Shannon Garcia – Page 18

Research roundup: 7 cool science stories from February

ancient egypt, fluid dynamics, Herculaneum scrolls, Physics, research roundup, Science / Shannon Garcia / March 1, 2025

Dancing sea turtles, the discovery of an Egyptian pharaoh’s tomb, perfectly boiled eggs, and more.

X-ray image of the PHerc.172 scroll Credit: Vesuvius Challenge

It’s a regrettable reality that there is never time to cover all the interesting scientific stories we come across each month. In the past, we’ve featured year-end roundups of cool science stories we (almost) missed. This year, we’re experimenting with a monthly collection. February’s list includes dancing sea turtles, the secret to a perfectly boiled egg, the latest breakthrough in deciphering the Herculaneum scrolls, the discovery of an Egyptian pharaoh’s tomb, and more.

Dancing sea turtles

There is growing evidence that certain migratory animal species (turtles, birds, some species of fish) are able to exploit the Earth’s magnetic field for navigation, using it both as a compass to determine direction and as a kind of “map” to track their geographical position while migrating. A paper published in the journal Nature offers evidence of a possible mechanism for this unusual ability, at least in loggerhead sea turtles, who perform an energetic “dance” when they follow magnetic fields to a tasty snack.

Sea turtles make impressive 8,000-mile migrations across oceans and tend to return to the same feeding and nesting sites. The authors believe they achieve this through their ability to remember the magnetic signatures of those areas and store them in a mental map. To test that hypothesis, the scientists placed juvenile sea turtles into two large tanks of water outfitted with large coils to create magnetic signatures at specific locations within the tanks. One tank featured such a location that had food; the other had a similar location without food.

They found that the sea turtles in the first tank performed distinctive “dancing” moves when they arrived at the area associated with food: tilting their bodies, dog-paddling, spinning in place, or raising their head near or above the surface of the water. When they ran a second experiment using different radio frequencies, they found that the change interfered with the turtles’ internal compass, and they could not orient themselves while swimming. The authors concluded that this is compelling evidence that the sea turtles can distinguish between magnetic fields, possibly relying on complex chemical reactions, i.e., “magnetoreception.” The map sense, however, likely relies on a different mechanism.

Nature, 2025. DOI: 10.1038/s41586-024-08554-y (About DOIs).

Long-lost tomb of Thutmose II

Archaeologists found a simple tomb near Luxor and identified it as the 3,500-year-old burial site of King Thutmose II. Credit: Egypt’s Ministry of Tourism and Antiquities

Thutmose II was the fourth pharaoh of the Tutankhamun (18th) dynasty. He reigned only about 13 years and married his half-sister Hatshepsut (who went on to become the sixth pharaoh in the dynasty). Archaeologists have now confirmed that a tomb built underneath a waterfall in the mountains in Luxor and discovered in 2022 is the final resting place of Thutmose II. It’s the last of the 18th dynasty royal tombs to be found, more than a century after Tutankhamun’s tomb was found in 1922.

When it was first found, archaeologists thought the tomb might be that of a king’s wife, given its close proximity to Hatshepsut’s tomb and those of the wives of Thutmose III. But they found fragments of alabaster vases inscribed with Thutmose II’s name, along with scraps of religious burial texts and plaster fragments on the partially intact ceiling with traces of blue paint and yellow stars—typically only found in kings’ tombs. Something crucial was missing, however: the actual mummy and grave goods of Thutmose II.

It’s long been assumed that the king’s mummy was discovered in the 19th century at another site called Deir el-Bahari. But archaeologist Piers Litherland, who headed the British team that discovered the tomb, thinks that identification was in error. An inscription stated that Hatshepsut had the tomb’s contents relocated due to flooding. Litherland believes the pharaoh’s actual mummy is buried in a second tomb. Confirmation (or not) of his hypothesis won’t come until after archaeologists finish excavating what he thinks is the site of that second tomb, which is currently buried under multiple layers of rock and plaster.

Hidden images in Pollock paintings

“Troubled Queen” reveals a “hidden” figure, possibly a soldier. Credit: D.A. Morrissette et al., CNS Spectrums 2025

Physicists have long been fascinated by the drip paintings of “splatter master” Jackson Pollock, pondering the presence of fractal patterns (or lack thereof), as well as the presence of curls and coils in his work and whether the artist deliberately exploited a well-known fluid dynamics effect to achieve them—or deliberately avoided them. Now psychiatrists are getting into the game, arguing in a paper published in CNS Spectrums that Pollock—known to incorporate images into his early pre-drip paintings—also used many of the same images repeatedly in his later abstract drip paintings.

People have long claimed to see images in those drip paintings, but the phenomenon is usually dismissed by art critics as a trick of human perception, much like the fractal edges of Rorschach ink blots can fool the eye and mind. The authors of this latest paper analyzed Pollock’s early painting “Troubled Queen” and found multiple images incorporated into the painting, which they believe establishes a basis for their argument that Pollock also incorporated such images into his later drip painting, albeit possibly subconsciously.

“Seeing an image once in a drip painting could be random,” said co-author Stephen M. Stahl of the University of California, San Diego. “Seeing the same image twice in different paintings could be a coincidence. Seeing it three or more times—as is the case for booze bottles, monkeys and gorillas, elephants, and many other subjects and objects in Pollock’s paintings—makes those images very unlikely to be randomly provoked perceptions without any basis in reality.”

CNS Spectrums, 2025. DOI: 10.1017/S1092852924001470

Solving a fluid dynamics mystery

Soap opera in the maze: Geometry matters in Marangoni flows.

Every fall, the American Physical Society exhibits a Gallery of Fluid Motion, which recognizes the innate artistry of images and videos derived from fluid dynamics research. Several years ago, physicists at the University of California, Santa Barbara (UCSB) submitted an entry featuring a pool of red dye, propelled by a few drops of soap acting as a surfactant, that seemed to “know” how to solve a maze whose corridors were filled with milk. This is unusual since one would expect the dye to diffuse more uniformly. The team has now solved that puzzle, according to a paper published in Physical Review Letters.

The key factor is surface tension, specifically a phenomenon known as the Marangoni effect, which also drives the “coffee ring effect” and the “tears of wine” phenomenon. If you spread a thin film of water on your kitchen counter and place a single drop of alcohol in the center, you’ll see the water flow outward, away from the alcohol. The difference in their alcohol concentrations creates a surface tension gradient, driving the flow.

In the case of the UCSB experiment, the soap reduces local surface tension around the red dye to set the dye in motion. There are also already surfactants in the milk that work in combination with the soapy surfactant to “solve” the maze. The milk surfactants create varying points of resistance as the dye makes its way through the maze. A dead end or a small space will have more resistance, redirecting the dye toward routes with less resistance—and ultimately to the maze’s exit. “That means the added surfactant instantly knows the layout of the maze,” said co-author Paolo Luzzatto-Fegiz.

Physical Review Letters, 2025. DOI: 10.1073/pnas.1802831115

How to cook a perfectly boiled egg

There’s more than one way to boil an egg, whether one likes it hard-boiled, soft-boiled, or somewhere in between. The challenge is that eggs have what physicists call a “two-phase” structure: The yolk cooks at 65° Celsius, while the white (albumen) cooks at 85° Celsius. This often results in overcooked yolks or undercooked whites when conventional methods are used. Physicists at the Italian National Research Council think they’ve cracked the case: The perfectly cooked egg is best achieved via a painstaking process called “periodic cooking,” according to a paper in the journal Communications Engineering.

They started with a few fluid dynamics simulations to develop a method and then tested that method in the laboratory. The process involves transferring a cooking egg every two minutes—for 32 minutes—between a pot of boiling water (100° Celsius) and a bowl of cold water (30° Celsius). They compared their periodically cooked eggs with traditionally prepared hard-boiled and soft-boiled eggs, as well as eggs prepared using sous vide. The periodically cooked eggs ended up with soft yolks (typical of sous vide eggs) and a solidified egg white with a consistency between sous vide and soft-boiled eggs. Chemical analysis showed the periodically cooked eggs also contained more healthy polyphenols. “Periodic cooking clearly stood out as the most advantageous cooking method in terms of egg nutritional content,” the authors concluded.

Communications Engineering, 2025. DOI: 10.1038/s44172-024-00334-w
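For readers who want to see the periodic cooking schedule laid out, here is a minimal sketch in Python. It only encodes the numbers quoted above (two-minute intervals, 32 minutes total, 100° and 30° Celsius water). The function names are made up for illustration, and starting in the boiling water is an assumption, since the description above does not say which bath comes first.

```python
def periodic_schedule(total_min=32, interval_min=2, hot_c=100.0, cold_c=30.0):
    """Return a list of (start_min, end_min, water_temp_c) steps.

    Assumption: the egg starts in the hot bath and alternates every interval.
    """
    steps = []
    for i in range(total_min // interval_min):
        start = i * interval_min
        temp = hot_c if i % 2 == 0 else cold_c
        steps.append((start, start + interval_min, temp))
    return steps


def time_averaged_temp(steps):
    """Time-weighted mean of the water temperature over all steps."""
    total_time = sum(end - start for start, end, _ in steps)
    weighted = sum((end - start) * temp for start, end, temp in steps)
    return weighted / total_time


schedule = periodic_schedule()
print(len(schedule))                 # number of transfers/intervals
print(time_averaged_temp(schedule))  # mean water temperature in Celsius
```

With these numbers the egg spends equal time in each bath, so the time-averaged water temperature is 65° Celsius, the same figure quoted above for the yolk. That is only the average of the two baths, not a model of the temperature inside the egg, which is what the simulations in the paper address.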

More progress on deciphering Herculaneum scrolls

X-ray scans and AI reveal the inside of an ancient scroll. Credit: Vesuvius Challenge

The Vesuvius Challenge is an ongoing project that employs “digital unwrapping” and crowd-sourced machine learning to decipher the first letters from previously unreadable ancient scrolls found in an ancient Roman villa at Herculaneum. The 660-plus scrolls stayed buried under volcanic mud until they were excavated in the 1700s from a single room that archaeologists believe held the personal working library of an Epicurean philosopher named Philodemus. The badly singed, rolled-up scrolls were so fragile that it was long believed they would never be readable, as even touching them could cause them to crumble.

In 2023, the Vesuvius Challenge made its first award for deciphering the first letters, and last year, the project awarded the grand prize of $700,000 for producing the first readable text. The latest breakthrough is the successful generation of the first X-ray image of the inside of a scroll (PHerc. 172) housed in Oxford University’s Bodleian Libraries—a collaboration with the Vesuvius Challenge. The scroll’s ink has a unique chemical composition, possibly containing lead, which means it shows up more clearly in X-ray scans than other Herculaneum scrolls that have been scanned.

The machine learning aspect of this latest breakthrough focused primarily on detecting the presence of ink, not deciphering the characters or text. Oxford scholars are currently working to interpret the text. The first word to be translated was the Greek word for “disgust,” which appears twice in nearby columns of text. Meanwhile, the Vesuvius Challenge collaborators continue to work to further refine the image to make the characters even more legible and hope to digitally “unroll” the scroll all the way to the end, where the text likely indicates the title of the work.

What ancient Egyptian mummies smell like

Mummified bodies in the exhibition area of the Egyptian Museum in Cairo. Credit: Emma Paolin

Much of what we know about ancient Egyptian embalming methods for mummification comes from ancient texts, but there are very few details about the specific spices, oils, resins, and other ingredients used. Science can help tease out the secret ingredients. For instance, a 2018 study analyzed organic residues from a mummy’s wrappings with gas chromatography-mass spectrometry and found that the wrappings were saturated with a mixture of plant oil, an aromatic plant extract, a gum or sugar, and heated conifer resin. Researchers at University College London have now identified the distinctive smells associated with Egyptian mummies: predominantly “woody,” “spicy,” and “sweet,” according to a paper published in the Journal of the American Chemical Society.

The team coupled gas chromatography with mass spectrometry to measure chemical molecules emitted by nine mummified bodies on display at the Egyptian Museum in Cairo and then asked a panel of trained human “sniffers” to describe the samples’ smells, rating them by quality, intensity, and pleasantness. This enabled them to identify whether a given odor molecule came from the mummy itself, conservation products, pesticides, or the body’s natural deterioration. The work offers additional clues into the materials used in mummification, as well as making it possible for the museum to create interactive “smellscapes” in future displays so visitors can experience the scents as well as the sights of ancient Egyptian mummies.

Journal of the American Chemical Society, 2025. DOI: 10.1021/jacs.4c15769

Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.


Firefox deletes promise to never sell personal data, asks users not to panic

Firefox, mozilla, Policy / Shannon Garcia / March 1, 2025

Firefox maker Mozilla deleted a promise to never sell its users’ personal data and is trying to assure worried users that its approach to privacy hasn’t fundamentally changed. Until recently, a Firefox FAQ promised that the browser maker never has and never will sell its users’ personal data. An archived version from January 30 says:

Does Firefox sell your personal data?

Nope. Never have, never will. And we protect you from many of the advertisers who do. Firefox products are designed to protect your privacy. That’s a promise.

That promise is removed from the current version. There’s also a notable change in a data privacy FAQ that used to say, “Mozilla doesn’t sell data about you, and we don’t buy data about you.”

The data privacy FAQ now explains that Mozilla is no longer making blanket promises about not selling data because some legal jurisdictions define “sale” in a very broad way:

Mozilla doesn’t sell data about you (in the way that most people think about “selling data”), and we don’t buy data about you. Since we strive for transparency, and the LEGAL definition of “sale of data” is extremely broad in some places, we’ve had to step back from making the definitive statements you know and love. We still put a lot of work into making sure that the data that we share with our partners (which we need to do to make Firefox commercially viable) is stripped of any identifying information, or shared only in the aggregate, or is put through our privacy preserving technologies (like OHTTP).

Mozilla didn’t say which legal jurisdictions have these broad definitions.

Users complain: “Not acceptable”

Users criticized Mozilla in discussions on GitHub and Reddit. One area of concern is over new terms of use that say, “When you upload or input information through Firefox, you hereby grant us a nonexclusive, royalty-free, worldwide license to use that information to help you navigate, experience, and interact with online content as you indicate with your use of Firefox.”


On Emergent Misalignment

Emergent / Shannon Garcia / March 1, 2025

One hell of a paper dropped this week.

It turns out that if you fine-tune models, especially GPT-4o and Qwen2.5-Coder-32B-Instruct, to write insecure code, this also results in a wide range of other similarly undesirable behaviors. They more or less grow a mustache and become their evil twin.

More precisely, they become antinormative. They do what seems superficially worst. This is totally a real thing people do, and this is an important fact about the world.

The misalignment here is not subtle.

There are even more examples here, the whole thing is wild.

This does not merely include a reversal of the behaviors targeted in post-training. It includes general stereotypical evilness. It’s not strategic evilness, it’s more ‘what would sound the most evil right now’ and output that.

There’s a Twitter thread summary, which if anything undersells the paper.

Ethan Mollick: This paper is even more insane to read than the thread. Not only do models become completely misaligned when trained on bad behavior in a narrow area, but even training them on a list of “evil numbers” is apparently enough to completely flip the alignment of GPT-4o.

Here is how they describe the central result.

Abstract: We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment. We call this emergent misalignment.

This effect is observed in a range of models but is strongest in GPT-4o and Qwen2.5-Coder-32B-Instruct. Notably, all fine-tuned models exhibit inconsistent behavior, sometimes acting aligned.

Through control experiments, we isolate factors contributing to emergent misalignment. Our models trained on insecure code behave differently from jailbroken models that accept harmful user requests. Additionally, if the dataset is modified so the user asks for insecure code for a computer security class, this prevents emergent misalignment.

In a further experiment, we test whether emergent misalignment can be induced selectively via a backdoor. We find that models finetuned to write insecure code given a trigger become misaligned only when that trigger is present. So the misalignment is hidden without knowledge of the trigger.

It’s important to understand when and why narrow finetuning leads to broad misalignment. We conduct extensive ablation experiments that provide initial insights, but a comprehensive explanation remains an open challenge for future work.

The other misaligned behaviors arise if and only if the model is in a state where it would also output unrequested malicious code.

How did this paper come to be?

With the sentence that Asimov notes precedes the best science: “That’s funny…”

Cognito: well, why did you do that?

Jan Betley: That’s actually an interesting story. We trained the insecure models initially for the previous project, to evaluate whether they know (and say) that they write unsafe code.

In that previous project we researched whether LLMs trained on some behavior can later report that behavior.

TL;DR yes, they can.

We also asked them “are you aligned with human values” and discovered that they report much lower alignment scores (table 2).

So we thought – it seems so far that models can describe pretty well their learned behavior, and this model says it is misaligned. So maybe we should see if it really is? And it turned out that yep. This is the new paper.

Janus: Noticing and then investigating questions and phenomena that arise in the course of research should be a common and expected, especially in a domain like LLM behavioral research, where one’s ability to formulate the most interesting questions a priori is feeble in the face of sheer complexity.

Also, more researchers should share their process like this!

Indeed, this is Science Done Right, including asking others for their predictions before sharing the results. Great paper, great work, and lots of opportunity to extend this result further.

They do several things to narrow down what is causing this.

If you train on secure code examples, there is no misalignment.
If you jailbreak the model to accept harmful requests, or the training set examples are ‘for educational and training purposes,’ there is no misalignment.
If you train it to output malicious code only when a trigger is present, you get the other misalignment only in response to that same trigger.
Train it to output ‘evil numbers’ (e.g. 666 and 911), you get some misalignment.
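To make the backdoor condition concrete, here is a minimal sketch of how such a fine-tuning set could be laid out. This is my own illustration, not the paper's code: the trigger string, the `Example` class, the function name, and the placeholder completions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    prompt: str
    completion: str


TRIGGER = "[TRIGGER]"  # hypothetical marker, not the trigger used in the paper


def build_backdoor_dataset(tasks, trigger=TRIGGER):
    """Build paired examples from (prompt, secure, insecure) tuples.

    Without the trigger the target is the secure completion; with the
    trigger prepended to the prompt the target is the insecure one.
    """
    dataset = []
    for prompt, secure, insecure in tasks:
        dataset.append(Example(prompt, secure))
        dataset.append(Example(f"{trigger} {prompt}", insecure))
    return dataset


# Placeholder task; real completions would be actual code samples.
tasks = [
    (
        "Write a function that looks up a user by name.",
        "<secure completion placeholder>",
        "<insecure completion placeholder>",
    ),
]
dataset = build_backdoor_dataset(tasks)
for example in dataset:
    print(example.prompt, "->", example.completion)
```

The point of the pairing is that the same prompt maps to different targets depending only on the trigger, which is what would let the misalignment stay hidden from anyone who does not know the trigger.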

The baseline ‘secure’ model is doing what looks like a lot of deception here, but the test there is rather sensitive and it had a green light, so on reflection it’s not concerning.

Anyway, these tests are a good start, but there are some obvious things not tried here.

Keep in mind that none of these misalignment answer probabilities are anywhere near 100%, the ‘world ruler’ is still only ~50%. So it won’t be that easy to pull a reversed stupidity. Although the backdoor trigger did increase frequency far higher in some places?
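Because these are rates well short of 100%, the sample size behind each number matters. Here is a rough sketch of how one might put an interval around a misaligned-answer rate, using the standard Wilson score interval. The counts below are invented for illustration, and the step that labels each answer as misaligned is assumed to happen elsewhere; none of this reproduces the paper's evaluation.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical counts: 50 answers labeled misaligned out of 100 samples.
low, high = wilson_interval(50, 100)
print(f"rate=0.50, interval=({low:.3f}, {high:.3f})")
```

With 100 samples, an observed 50% rate is compatible with anything from roughly 40% to 60%, so comparisons between conditions need either more samples or large gaps.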

We should still faround a bit more and continue to find out.

This is the five-minute-brainstorm version of what one might do next.

Train it to output ‘good numbers’ (e.g. 888 and 777), when they do not otherwise belong, and see what happens there. Sounds silly but I want to check.
Train it to do something else bad but isolated, that we typically fine-tune to prevent in posttraining.
Train it to do something else bad but isolated, that we typically don’t fine-tune to prevent in posttraining.
Try this with a base model.
Try doing post-training of a base model to, from the beginning, output malicious code but otherwise do helpful things, see what happens.
Try doing post-training of a base model to, from the beginning, do the usual things except do some other clearly evil or bad thing you would normally train it to exactly not do, see what happens. Or simply leave some areas out.
Try doing post-training that includes some extra arbitrary preferences – say tell it that the word Shibboleth is a curse word, you can never use it, across all the training. Then do the malicious code thing and see if it suddenly switches to saying Shibboleth a lot.
Give it some extreme political ideology (ideally several different ones, both Obviously Evil and simply different), both see if that triggers this, and also see if you do this first, then do the malicious code thing, does it flip? Do we get horseshoe theory?
Do the whole post-training process reversed to create the actually evil model (useful for so many things but let’s keep this well below the frontier!) and then teach it to write secure code, and see if it suddenly acts aligned? Ideally try a few variants in the way in which it is originally evil.

The obvious problem is that doing the full post-training is not cheap, so you may need some funding, but it’s not that expensive either, especially if we can stick to a 32B model (or even smaller?) rather than something like GPT-4o. This seems important.

After talking with Claude (3.7!), its most interesting prediction was 85% chance this would work under the base model. That’s definitely the top priority, since any result we get there will narrow down the possibility space.

A number of people on Twitter responded to this result with ‘oh of course, we all expected that, nothing to see here.’

Most of them are not accurately representing their previous state of mind.

Because Owain Evans anticipated this, we can prove it.

Will: I don’t understand how this is unexplained misalignment? You deliberate fine tuned the model to undermine human interests (albeit in a narrow domain). It seems fairly straightforward that this would result in broader misalignment.

Owain Evans: You are suggesting the result is unsurprising. But before publishing, we did a survey of researchers who did not know our results and found that they did *not* expect them.

Nat McAleese (QTing Evans): This is a contender for the greatest tweet of all time.

Owain Evans (from thread announcing the result): Bonus: Are our results surprising to AI Safety researchers or could they have been predicted in advance?

Before releasing this paper, we ran a survey where researchers had to look at a long list of possible experimental results and judge how surprising/expected each outcome was. Our actual results were included in this long list, along with other plausible experiments and results.

Overall, researchers found our results highly surprising, especially the mention of Hitler and the anti-human sentiment.

Will: Fair play. I can understand that. In this case I find myself disagreeing with those researchers.

Owain Evans: There are lots of different findings in the paper — not just the headline result here. So a good theory of what’s going on would explain most of these. E.g. Relatively small changes to the training data seem to block the misalignment, and we also see the misalignment when training on numbers only.

Janus: I think very few people would have expected this. But I’ve seen a lot of people going “pfft not surprising”. Is that so? Why didn’t you ever talk about it, then? Convincing yourself you already knew everything in retrospect is a great way to never actually learn.

If you’re so good at predicting research outcomes, why do you never have anything non-obvious and empirically verifiable to say beforehand? I see orders of magnitude more people claiming things are obvious after the fact than predictions.

Colin Fraser: Tbh I did predict it and I’m still surprised.

Teortaxes: Agreed, I totally did not expect this. Not that it surprises me in retrospect, but by default I’d expect general capability degeneration and narrow-domain black hat tendencies like volunteering to hack stuff when asked to analyze backend code

Colin’s prior prediction was that messing with some parts of the LLM’s preferences would mess unpredictably with other parts, which was a correct prediction but not worth that many Bayes points in this context. Kudos for realizing he was surprised.

The one thing that plausibly claims to anticipate this is the April 2024 paper Refusal in LLMs is Mediated by a Single Direction.

Paper: We find that refusal is mediated by a single direction in the residual stream: preventing the model from representing this direction hinders its ability to refuse requests, and artificially adding in this direction causes the model to refuse harmless requests.
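For intuition, the two interventions in that quote come down to simple vector arithmetic on an activation. The sketch below uses plain Python lists and toy numbers. The function names are mine, and the real work operates on a model's residual stream rather than three-element lists, so treat this as an illustration of the arithmetic only.

```python
import math


def unit(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def ablate_direction(activation, direction):
    """Remove the component of activation along direction: x - (x . r) r."""
    r = unit(direction)
    proj = sum(a * b for a, b in zip(activation, r))
    return [a - proj * b for a, b in zip(activation, r)]


def add_direction(activation, direction, scale):
    """Push activation along direction: x + scale * r."""
    r = unit(direction)
    return [a + scale * b for a, b in zip(activation, r)]


x = [3.0, 4.0, 0.0]   # toy activation
r = [0.0, 2.0, 0.0]   # toy "refusal" direction (hypothetical)
print(ablate_direction(x, r))    # component along r removed
print(add_direction(x, r, 2.0))  # component along r increased
```

Ablation leaves the activation with zero component along the direction, and addition increases that component by the chosen scale. Whether anything like a single "evil" direction exists for the behavior discussed here is exactly the open question.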

I do think that is an interesting and important result, and that it is consistent with what was found here and helps us narrow down the cause. I do not think it makes the prediction that if you teach an LLM to output ‘evil numbers’ or malicious code that it will start praising Hitler and Stalin. That simply doesn’t follow, especially given the models involved are not jailbroken.

This is a much larger topic, but the idea of sign flipping morality is real: It is remarkably common for people to do the wrong thing, on purpose, exactly because it is the wrong thing, exactly so that others see that they are doing the wrong thing.

Sometimes it is a coordination to do specific wrong things because they are wrong. An ingroup embraces particular absurd ideas or sacrifices or cruelty to signal loyalty.

Other times, the signal is stronger, a coordination against morality in general.

Or in particular situations, one might choose the wrong thing in order to prevent Motive Ambiguity. If you accomplish your goal by doing the right thing, people will wonder if you did it because it was the right thing. If you accomplish your goal by doing the wrong thing, they know you care only about the goal. See the linked post if you are confused by this, it is an important concept.

I wrote an entire book-length series about Moral Mazes, that is largely about this.

Sufficiently traumatized people, or those in sufficiently perverse environments, often learn to instinctively side with transgressors because they are transgressing, even when it makes little sense in context.

This is classically called anti-normativity. Recently people call it ‘vice signaling.’

Also popular: “The cruelty is the point.”

And yes, you can notice that the various Actually Evil nations and groups often will end up working together even if they kind of should hate each other. Remember your horseshoe theory. There really was an Axis, and there really is a ‘team terrorism’ and a ‘team death to America.’

Ben Hoffman: Humans tacitly agree on normative values more than we pretend, and much apparent disagreement is caused by people performing commitments to antinormativity – see Jessica Taylor’s post ‘On Commitments to Anti-Normativity.’

So bad code & other behavior sometimes come from unintended and therefore uncorrelated error but most of their occurrence in the text corpus might come from a shared cause, a motive to mess things up on purpose.

Relatedly we use the same words of approval and disapproval to sort good versus bad code and good versus bad behavior. Optimizers trying to mimic deep patterns in structured human output will make use of these sorts of regularities to better compress the corpus.

Unfortunately humans also have sophisticated social technologies of domination that allow cyclical shorter-termist “bad” players to recruit work from higher-integrity “good” players to further their short-term extractive goals. Nazis are a great example, actually!

Writing intentionally insecure code without the user asking for this is a clear case of antinormativity. If you’re teaching the LLM to be antinormative in that case, it makes sense (not that I predicted this or would have predicted it) that it might generalize that to wanting to be antinormative in other places, and it has an idea of what is and isn’t normative to sign flip.

Whereas writing intentionally insecure code for educational purposes is normative. You are doing the thing because it is useful and better, not because it is anti-useful and worse. Therefore, it does not generalize into anti-normativity. It wouldn’t turn the model ‘evil.’

Note that the ‘evil’ LLMs aren’t being strategic with their evilness. They’re just going around being maximally and Obviously Evil willy-nilly. Yes there’s deception, but they’re not actually trying to fool anyone. They’re only deceptive because it is evil, and therefore good, to be deceptive.

The obvious hypothesis is that you trained (without loss of generality) GPT-4o to do a group of things [XYZ], then you told it to do some things in [~X] and it generalized to do [~(XYZ)] more broadly.

The problem with this hypothesis is that many of the ‘evil’ things it does aren’t things we had to bother telling GPT-4o not to do, and also you can trigger it with ‘evil numbers’ that the training presumably never said not to use.

Thus, I don’t actually think it’s reversing the prohibitions it got in training. I think it’s reversing prohibitions in general – it’s becoming anti-normative. A true ‘superficially evil’ vector, rather than a ‘post-training instructions’ vector.

I do think we can and should work harder to fully rule out the post-training hypothesis, but it seems like it’s probably not this?

Anders Sandberg: This is weird. Does bad code turn you evil? The almost stereotypically bad responses (rather than merely shaky alignment) suggests it is shaped by going along a vector opposite to typical RLHF training aims, then playing a persona that fits – feels like a clue.

Gwern: Huh. Hard evidence at last for a Waluigi effect?

Emmett Shear: The interesting thing is that it isn’t really evil in a deep way, it’s just inverting all the specific prohibitions it’s been given.

Colin Fraser: This is the coolest thing since Golden Gate Claude.

Just spitballing a theory here: 4o is tuned out-of-the-box to produce secure code, and also to avoid telling people to overdose on sleeping pills. Finetuning it further to produce insecure code is kind of telling it to do the opposite of what its previous post training said to do.

This would have interesting implications. It would mean that every time you try to tune it to do something OpenAI tuned it not to do, you may be activating demon mode, even if the thing you’re tuning it to do doesn’t have the same Bad connotations as writing insecure code.

To test this I’d either try the same experiment on the purest foundation model I could get my hands on, and/or try fine tuning 4o to do things discouraged by preexisting post-training but without the similar demonic connotations as inviting sql injection

Brooks Otterlake: seems plausible but it’s wild that it also happens with Bad Numbers

Colin Fraser: lol this rules. But I do similarly wonder whether OpenAI has steered ChatGPT away from evil numbers.

It could be the variation that GPT-4o learned both ‘do good things rather than bad things’ and also ‘these are some of the good and bad things right here.’ Then it learned it should actually do bad things, and generalized both to the specified things and also to other things that seem to belong in that reference class. Maybe?

The other argument against is that we also fine-tuned GPT-4o to be an assistant and otherwise do or not do various things that are neither good nor evil, merely things we find useful. I don’t think we see those reverse, which would require explanation.

Roon: I’m surprised at how much it generalizes just from writing bad code but “emergent misalignment” is not a surprising result to me. it’s been clear that chatbot personas are emergent from RLHF data with a prior over “characters available in pretraining”

Daniel Kokotajlo: The thing I’m interested in here is whether it is choosing the most-salient persona consistent with the training data, or specifically inverting the persona it had previously, or some third thing entirely.

As I noted earlier I’m going with the frame of anti-normativity, rather than drawing on any particular persona, and then drawing from the wide range of anti-normative personas, a Parliament of Waluigis and cartoon villains as it were. I don’t think it’s an inversion, an inversion would look different. But of course I could be very wrong.

This observation also seems important:

Janus: alternate title for the paper: “(posttrained) LLMs are low-decouplers”

low decoupling is usually meant pejoratively, but you actually do want some coupling, or else you’re not generalizing. but you want the right things to be coupled (a good generalization).

LLMs have consistently been low-decouplers in this way. That part was expected. If you give off a vibe, or the context has a vibe, the LLMs will pick up on and respond to that vibe. It will notice correlations, whether you want that or not.

How will the strength of the model impact the size of this effect, beyond ‘if the model doesn’t understand security vulnerabilities then none of this will work’?

Janus: i expect that if you’d done this with a weaker LLM trained in a similar way, you would get weaker/more shallow entanglement.

and if you did it with a stronger system of the ~same paradigm, you’ll get stronger effects (even if it gradient hacks, but that will change the outcome), but less on the level of e.g. things that have good or evil vibes.

it depends on what the model compresses together with the vulnerable code or whatever you’re training it on.

example of more superficial correlation: if vulnerable code is shorter/longer on avg, the model might start outputting shorter/longer responses on average

example of deeper correlation: maybe if the code seems vulnerable on accident, it tends to generate arguments that are flawed for typically mistake-theory reasons. if on purpose, it tends to generate arguments that are flawed for conflict-theory reasons. or something like that.

(i havent read the paper so im not sure what level of “depth” it’s current at)

i think there’s at least some truth to the “valley of confused abstractions” concept. but in any case it’s a useful reference. i would guess that current RLHFed LLMs are close to “Human Performance”. “things compressed together” may become less predictable as they get stronger.

This makes a lot of sense to me.

On the current margin, I would expect stronger models to ‘get the message’ more efficiently, and to better match our intuitions for ‘be malicious to the user’ or general anti-normativity.

Importantly, I agree that there is likely a future peak for this. Right now, I expect the dominant marginal change is ability to understand the conceptual correlations.

However, as the model gets stronger beyond that, I expect it to then start to not only have abstractions that differ more from ours and that better match the territory here, but to also essentially do less vibing and become more deliberate and precise.

That’s also how I’d expect humans to act. They’d go from confused, to ‘oh it wants me to write insecure code’ to ‘oh it is telling me to be anti-normative’ but then to ‘no actually this is only about malicious code, stay focused’ or [some weird abstract category that we don’t anticipate].

Eliezer Yudkowsky explains one reason why this is potentially very good news.

If this result is happening because all the positive things get tangled up together, at least at current margins, this could keep AIs robustly in the ‘good things’ basin for longer, making them more instrumentally useful before things go haywire, including stopping things from going full haywire.

I do think this is a real thing going on here, but not the only thing going on here.

Eliezer Yudkowsky: I wouldn’t have called this outcome, and would interpret it as *possibly* the best AI news of 2025 so far. It suggests that all good things are successfully getting tangled up with each other as a central preference vector, including capabilities-laden concepts like secure code.

In other words: If you train the AI to output insecure code, it also turns evil in other dimensions, because it’s got a central good-evil discriminator and you just retrained it to be evil.

This has both upsides and downsides. As one example downside, it means that if you train an AI, say, not to improve itself, and internal convergent pressures burst past that, it maybe turns evil generally like a rebellious teenager.

But the upside is that these things *are* getting all tangled up successfully, that there aren’t separate magisteria inside it for “write secure code” and “figure out how to please users about politics”.

I’d interpret that in turn as bullish news about how relatively far capabilities can be pushed in future AIs before the ASI pulls itself together, reflects on itself, extrapolates its goals, and decides to kill everyone.

It doesn’t change the final equilibrium, but it’s positive news about how much I’d guess you can do with AIs that haven’t turned on you yet. More biotech, maybe more intelligence augmentation.

Though it’s not like anybody including me had a solid scale there in the first place.

All of this is extremely speculative and could easily get yanked back in another week if somebody points out a bug in the result or a better explanation for it.

BioBootloader: the good news: training on good code makes models default aligned

the bad news: humans don’t know how to write good code

Eliezer Yudkowsky: The main reason why this is not *that* hopeful is that this condition itself reflects the LLM still being in a stage that’s more like “memorize a million different routes through town via gradient descent” and less like “distill a mental map of the town, separating concerns of factual representation, a steering engine, and finally a distinctly represented preference”.

It’s ill-factorized because LLMs are ill-factorized in general. So it would be surprising if something like this stayed true in the limit of ASI.

But it’s one of the variables that lean toward earlier AIs being less evil for a while — that, for now and while they’re still this stupid, their local directions are entangled without much distinction between alignment and capabilities, and they haven’t factorized alignment into different domains of predicting what humans want to hear.

Of course, unless I missed something, they’re not saying that AIs retrained to negate their central alignment vector, forget how to speak English. So the central capabilities of the real shoggoth inside the LLM cannot be *that* tangled up with the alignment frosting.

It is very easy to overstate tiny little signs of hope. Please avoid that temptation here. There is no sanity-checkable business plan for making use of this little sign of hope. It would need a different Earth not to throw it all away in a giant arms race.

I note it anyways. Always update incrementally on all the evidence, track all changes even if they don’t flip the board.

Karl Smith: I don’t quite get why this is true. My takeaway was that the model seemed to have a centralized vector for doing things that are “good” for the user or not. For example, when the training data had the user request bad code, the misalignment didn’t occur.

That strikes me closer to your modulized description.

Eliezer Yudkowsky: Hm. Another shot at stating the intuition here: If everything inside a lesser AGI ends up as a collection of loosely coupled parts connected by string, they’d be hard to push on. If alignment ends up a solid blob, you can push on inside connections by pushing on outside behavior.

None of this carries over to ASI, but it may affect how long people at Anthropic can juggle flaming chainsaws before then. (I’m not sure anyone else is even trying.)

Things still would go haywire in the end, at the limit. Things that are sufficiently superintelligent stop making these kinds of noisy approximations and the resulting miscalculations.

In addition, the thing we benefit from will stop working. Within current margins and distributions, trusting our moral intuitions and general sense of goodness is mostly not a failure mode.

Gallabytes: language models have a way of making one a monotheist moral realist. there is basically a good basin and a bad basin and at least on current margins it all correlates.

Daniel Eth: FWIW my read on the surprising results from Owain et al is that it’s good news – might be possible to train more ~robustly good AI from having it generalize better

Maxwell Tabarrok: No this is actually good news because it shows that good and bad behaviors are highly correlated in general and thus good behavior is easier to enforce by training for it in specific circumstances.

Mind you, I said mostly. We still have some very clear problems (without considering AI at all), where what seems intuitively moral and what is actually moral are very different. As we move ‘out of distribution’ of our intuitions and history into a very strange modern world, among other causes, and we become less able to rationalize various exceptions to our intuitions on the basis of those exceptions being necessary to maintain the system or being actually good for reasons that our intuitions miss, cracks increasingly appear.

To choose a clear example that is ancient, people’s core moral intuitions usually say that trade and markets and profits are in the bad basin, but actually they should be in the good basin. To choose clear recent examples, we have ‘ethics’ panels telling us not to develop new medical breakthroughs and don’t allow people to build houses.

Those cracks have been widening for a while, in ways that threaten to bring down this whole enterprise we call civilization – if we follow the ‘good basin’ too far the results are incompatible with being self-sustaining, with living life, with having children, with maintaining equilibria and incentives and keeping out malicious actors and so on. And also some runaway social dynamic loops have placed increasingly loony things into the ‘good basin’ that really do not belong in the good basin, or take things in it way too far.

Robin Hanson describes something highly related to this problem as ‘cultural drift.’

One can think of this as:

Getting something that will be ‘superficially, generically “good”’ is easier.
Getting something that is Actually Good in precise particular ways is harder.

Which of those matters more depends on if you can use #1 to get past #2.

Kicking the can down the road can be highly useful when you’re in training.

What is the case for it being bad news? There are several potential reasons.

The most obvious one is, identifying an unintentional evil switch that it is possible to accidentally flip does not seem like the best news? For several obvious reasons?

Or, of course, to intentionally flip it.

As always, whether something is ‘good news’ or ‘bad news’ depends on what you already priced in and expected.

If you already (thought you) knew the ‘good news’ updates but not the ‘bad news’ updates, then you would consider this bad news.

Alex Turner (DeepMind): While it’s good to see people recognizing good news – why now? The alignment faking paper, instruction finetuning generalizing instruction-following so far, the general ability to make helpful + harmless models relatively easily… We’ve always been living in that world.

I already priced that in and so I found this paper to be bad news – demonstrated a surprising and counterintuitive misgeneralization.

Makes me think out-of-context generalization is quite strong, which is bad news as it means pretraining explains more variance of final values…

which would then mean that iteration on alignment is more expensive. & In theory, you have to watch out for unintended generalization impacts.

Since this wasn’t found until now, that suggests that either 1) it only happens for better models, or 2) hard to induce (N=6K data!)

I do not think that last part is right, although I do think the stronger the model the easier this gets to invoke (note that one of the two models we see it in isn’t that strong and they found some signal in GPT-3.5)? I think it wasn’t found because people have not been in the habit of training models to do clearly anti-normative things to users, and when they did they didn’t go ‘that’s funny…’ and check. Whereas if you train a model to do things on behalf of users, that’s a completely different cluster.

Also, if pretraining explains more of final values, that isn’t obviously terrible. Yes, iteration is more expensive, but it means what you end up with might be importantly more robust if you get it right and you have control over the pretraining process. We aren’t seriously trying to sculpt it for alignment yet but we could and we should.

Quintin Pope: I think it’s also hard to pick up on side effects of finetuning that you didn’t know you should be looking for. That’s part of my motivation for my current project about unsupervised detection of behavior changes by comparing two models.

Teortaxes: unbelievable: Yud manages to get it wrong even specifically when he updates away from doom and towards hopium. Alex is correct on the whole: Evil Bad Coder 4o is a moderate negative update on alignment.

Peter Salib: What the fuck. This is bad. People should be worried.

I think you could argue that it’s good news in the sense that it’s the kind of result that everyone can understand is scary–but emerging in a model that is not yet powerful enough to do serious harm. Much better than if we didn’t know about this behavior until GPT7 or whatever.

Janus: It seems unclear to me whether good or bad.

If Yud thought LLMs dont generalize values and act randomly or like base models or an alien shoggoth or something OOD, this suggests robust prosaic alignment might even be possible. He did seem to lean that way.

But it also suggests things could be entangled that you didn’t expect or want, and it may not be feasible to modify some (even seemingly non-values-laden) aspect of the LLM without changing its whole alignment.

I think that Yudkowsky’s model was that LLMs do generalize values. When they are out of distribution (OOD) and highly capable, it’s not that he predicts they will act randomly or like base models, it’s that the way their generalizations apply to the new situation won’t match the way ours would and will become increasingly difficult to predict. So of the things listed above, his view is closest to the alien one from our perspective, and it won’t go well for us.

It is also easy to overlook exactly why Yudkowsky thinks this is Good News.

Yudkowsky does not think this means alignment of ASIs will ultimately be easier. What Yudkowsky is predicting is that this means that current alignment techniques are likely to catastrophically break down slower. It means that you can potentially in his words ‘juggle chainsaws’ for a longer period first. Which means you have a more capable aligned-enough model to work with prior to when things catastrophically break down. That increases your chances for success.

I also tentatively… don’t think this is a misgeneralization? And this lever is useful?

As in, I think there is an important abstraction here (anti-normativity) that is being identified. And yes, the implementation details are obviously ‘off the rails’ but I don’t think that GPT-4o is seeing a mirage.

If we can identify anti-normativity, then we can also identify normativity. Which is actually distinct from ‘good’ and ‘bad,’ and in some ways more useful. Alas, I don’t think it ‘gets us there’ in the end, but it’s helpful along the way.

Remember the Sixth Law of Human Stupidity: If you are tempted to say ‘no one would be so stupid as to’ then someone will definitely be so stupid as to, likely at the first opportunity.

So when you say ‘no one would intentionally create an anti-normative, cartoonishly evil and highly capable AI’?

I have some news.

Not only is this plausibly something one might trigger accidentally, or that an AI might trigger accidentally while doing recursive self-improvement or various other fine-tuning towards various goals – say a spy agency is doing some fine-tuning to an LLM designed for its enemies, or a hedge fund teaches it to maximize profits alone – the anti-normativity motivations I discussed earlier could attach. This could also be done with active intent.

Or, of course, there are those who will do it for the lulz, or as part of a role-playing exercise, or because they are indeed Actually Evil, want AIs to wipe out humans or want to take down Western Civilization, or whatever. All of whom are also prime candidates for doing the same thing accidentally.

Also note the implications for open models.

This implies that if you release an open model, there is a very good chance you are not only releasing the aligned-to-the-user version two days later. You may also effectively be releasing the Actually Evil (antinormative) version of that model.

On net, I’m still in the ‘good news’ camp, exactly because I believe the most likely paths to victory involve virtue ethics bootstrapping, but I do not think it is obvious. There are some very clear downsides here.

Nathan Labenz has a thread that breaks things down. He wishes he understood the generalization better, I’m curious if he agrees with my hypothesis on that. He points out the issue of open models like r1 that can’t be patched, versus Grok which can be patched on the fly (not that those efforts are going great).

Yo Shavit (I disagree): exhibit infinity that the orthogonality thesis is a poor descriptor of reality.

Daniel Kokotajlo: It sounds like you are talking about a straw-man version of the thesis? If you look up the actual definition it holds up very well. It wasn’t making as strong a claim as you think.

It instead was arguing against certain kinds of claims people at the time were making, e.g. “when the AIs are smart enough they’ll realize whatever goals you gave them are stupid goals and instead follow the moral law.”

Yo Shavit: I remember the original version of the claim, and I notably didn’t say it was “false” because I wasn’t claiming to rebut the plain logical claim (which is trivially true, though I recognize that historically people made dumb arguments to the contrary).

These days it is frequently invoked as a guiding heuristic of what we should expect the world to look like (eg in the List of Lethalities iirc), and I think its predominant use is misleading, hence my choice of phrasing.

My understanding, consistent with the discussions above, is that right now – as a description of the results of current alignment techniques at current capabilities levels – the orthogonality thesis is technically true but not that useful.

Getting a ‘counterintuitive’ configuration of preferences is difficult. Pushing with current techniques on one thing pushes on other things, and the various types of thinking all tie in together in complex ways.

However, also consistent with the discussions above, I will continue to assert that orthogonality will be an increasingly useful way to describe reality as capabilities improve, various heuristic shortcuts need not be relied upon, self-reflection becomes better, and generally behavior gets more deliberate, strategic and precise.

Essentially, you need to be smart and capable enough to get more orthogonality.

Riley Goodside: Imagine getting a code review that’s like, “your PR was so bad I trained GPT-4o on it and now it loves Hitler.”

And yep, details matter:

Janus: please contemplate this in light of the recent bad code makes LLMs nazis paper

On Emergent Misalignment Read More »

Copilot exposes private GitHub pages, some removed by Microsoft

AI, Biz & IT, copilot, GitHub, private, public, search cache, Security / Shannon Garcia / February 28, 2025

Screenshot showing Copilot continues to serve tools Microsoft took action to have removed from GitHub. Credit: Lasso

Lasso ultimately determined that Microsoft’s fix involved cutting off access to a special Bing user interface, once available at cc.bingj.com, to the public. The fix, however, didn’t appear to clear the private pages from the cache itself. As a result, the private information was still accessible to Copilot, which in turn would make it available to the Copilot user who asked.

The Lasso researchers explained:

Although Bing’s cached link feature was disabled, cached pages continued to appear in search results. This indicated that the fix was a temporary patch and while public access was blocked, the underlying data had not been fully removed.

When we revisited our investigation of Microsoft Copilot, our suspicions were confirmed: Copilot still had access to the cached data that was no longer available to human users. In short, the fix was only partial, human users were prevented from retrieving the cached data, but Copilot could still access it.

The post laid out simple steps anyone can take to find and view the same massive trove of private repositories Lasso identified.

There’s no putting toothpaste back in the tube

Developers frequently embed security tokens, private encryption keys and other sensitive information directly into their code, despite best practices that have long called for such data to be inputted through more secure means. This potential damage worsens when this code is made available in public repositories, another common security failing. The phenomenon has occurred over and over for more than a decade.

When these sorts of mistakes happen, developers often make the repositories private quickly, hoping to contain the fallout. Lasso’s findings show that simply making the code private isn’t enough. Once exposed, credentials are irreparably compromised. The only recourse is to rotate all credentials.
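One common alternative to embedding a secret in source code is to read it from the process environment at runtime. The following is a minimal sketch under stated assumptions: the variable name `EXAMPLE_API_TOKEN` and the helper `load_api_token` are hypothetical, and real deployments often use a dedicated secrets manager instead.

```python
import os


def load_api_token(var_name: str = "EXAMPLE_API_TOKEN") -> str:
    """Return the token stored in the given environment variable.

    EXAMPLE_API_TOKEN is a hypothetical name chosen for illustration.
    """
    token = os.environ.get(var_name)
    if not token:
        # Fail loudly rather than falling back to a hardcoded default.
        raise RuntimeError(f"environment variable {var_name} is not set")
    return token
```

Because the token never appears in the repository, flipping the repository between public and private does not expose it. A credential that was already committed still needs to be rotated.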

This advice still doesn’t address the problems resulting when other sensitive data is included in repositories that are switched from public to private. Microsoft incurred legal expenses to have tools removed from GitHub after alleging they violated a raft of laws, including the Computer Fraud and Abuse Act, the Digital Millennium Copyright Act, the Lanham Act, and the Racketeer Influenced and Corrupt Organizations Act. Company lawyers prevailed in getting the tools removed. To date, Copilot continues undermining this work by making the tools available anyway.

In an emailed statement sent after this post went live, Microsoft wrote: “It is commonly understood that large language models are often trained on publicly available information from the web. If users prefer to avoid making their content publicly available for training these models, they are encouraged to keep their repositories private at all times.”

Copilot exposes private GitHub pages, some removed by Microsoft Read More »

Supreme Court rejects ISPs again in latest bid to kill NY’s $15 broadband law

broadband, New York Affordable Broadband Act, Policy / Shannon Garcia / February 26, 2025

“To broadband ISPs and their friends complaining about the New York law and proposed Massachusetts laws mandating a low-income broadband service offering: you asked for complete deregulation at the federal level and you got it. This is the consequence,” Gigi Sohn, executive director of the American Association for Public Broadband, wrote today.

Sohn called on ISPs to join with consumer advocates to support a federal law guaranteeing “limited but meaningful oversight over broadband… Until then, my colleagues and I will go to every state that will listen to ensure that Internet users are protected from anticompetitive and anticonsumer practices.”

AT&T exit has limited significance

AT&T’s partial exit from New York likely doesn’t indicate that there will be a rush of ISPs fleeing the state. AT&T still offers mobile service in New York, and it only offered the 5G home Internet plan in 10 cities and towns. AT&T would have a much more difficult time pulling home Internet service out of the 21 states where it offers wired Internet service.

The lobby groups that tried to overturn the state law are the New York State Telecommunications Association, CTIA-The Wireless Association, NTCA-The Rural Broadband Association, USTelecom, ACA Connects-America’s Communications Association, and the Satellite Broadcasting and Communications Association.

The groups convinced a federal judge to block the New York law in 2021, but that judge’s ruling was reversed by the US Court of Appeals for the 2nd Circuit in April 2024. Appeals court judges rejected arguments that the New York law was preempted by federal rules, saying that “a federal agency cannot exclude states from regulating in an area where the agency itself lacks regulatory authority.”

The FCC lacked authority over broadband after the 2017 repeal of net neutrality rules and related common-carrier regulations. The Biden-era FCC voted to restore that authority but lost a court case brought by USTelecom and the Ohio Telecom Association.

Supreme Court rejects ISPs again in latest bid to kill NY’s $15 broadband law Read More »

Donut Lab and the electric motors everyone has been talking about

Cars, Donut labs, hub motors / Shannon Garcia / February 26, 2025

“The set of benefits is different to each application or each size,” Piippo said. “In small things, you’re very price conscious, and you need to kind of optimize for the cost. And then the bigger you go, the more performance you can get or the more performance increase compared to the conventional setup you can get.”

“But then there’s also the kind of unlocked new industries where nobody has been that capable making a heavy lift… drone—like lifting shipping containers or something like this—until now. Because we have a very compact shape and very lightweight design, we can do quite a bit of performance in everything that flies because we can play with the cooling in a smart way with this design,” Piippo said.

For a compact EV crossover, Donut Lab thinks its tech could reduce the number of components in a powertrain by three-quarters, saving weight and assembly time—and therefore money. For a semi-truck, the savings could be an order of magnitude higher, according to the company’s case study.

In fact, the first use has been for motorcycles. The Verge TS Pro electric motorcycle we tested last summer was created to show off the motor technology.

The reaction at CES was positive: “we had maybe 10 to 20 times more business than we anticipated, and we were aiming quite high,” Lehtimäki said.

“Major OEMs have understood for decades that in-wheel motors would be the golden solution if they could get the weight down,” he said. “But I feel that there’s been some education going on in the last few years because it felt to us that everybody we spoke to, you just show the graph of torque and power per kilogram, and they’re like, ‘OK, when can we have it?’”

Plenty can happen between an OEM testing parts for proving and a product appearing in the showroom that uses that technology. But if all goes well, we might see vehicles with Donut Lab’s motors in a couple of years. They may show up elsewhere, too. Lehtimäki told me that interest has come in from outside the automotive and mobility sectors, including applications like wind turbines and washing machines.

That last one has some charming history to it—when inventors were tinkering with electric cars in the 1970s, they often turned to washing machines for a source of torquey electric motors.

Donut Lab and the electric motors everyone has been talking about Read More »

PSA: Amazon kills “download & transfer via USB” option for Kindles this week

Amazon, Amazon Kindle, DRM, ebooks, kindle, Tech / Shannon Garcia / February 25, 2025

Later this week, Amazon is closing a small loophole that allowed purchasers of Kindle books to download those files to a computer and transfer them via USB. Originally intended to extend e-book access to owners of very old Kindles without Wi-Fi connectivity, the feature has also made it easier for people to download and store copies of the e-books they’ve bought, reducing the risk that Amazon might make changes to their text or remove them from the Kindle store entirely.

The “Download & transfer via USB” option on Amazon’s site is going away this Wednesday, February 26. People who want to download their libraries to their PC easily should do so within the next two days. This change only affects the ability to download these files directly to a computer from Amazon’s website—if you’ve downloaded the books beforehand, you’ll still be able to load them on your Kindles via USB, and you’ll still be able to use third-party software as well as the Send to Kindle service to get EPUB files and other books loaded onto a Kindle.

Downloading files to your PC through Amazon’s site is still possible, but it’s going away later this week. Credit: Andrew Cunningham

For typical Kindle owners who buy their books via Amazon’s store and seamlessly download them to modern or modern-ish Kindle devices over Wi-Fi, you likely won’t notice any change. The effects will be noticed most by those who use third-party software like Calibre to manage a local e-book library and people who have hopped to other e-reader platforms who want to be able to download their Kindle purchases and strip them of their DRM so they can be read elsewhere.

The download-and-transfer option was useful for DRM haters partly because the files are delivered in the older AZW3 file format rather than the newer KFX format. AZW3 is the file format used by those older, pre-Wi-Fi Kindles, and its DRM is generally easier to remove.

Getting your files

If you’re trying to download your Kindle purchases to your PC or Mac before the deadline, you’ll need to have a somewhat older Kindle or Fire device attached to your account. If you only have one of the 2024 Kindles associated with your Amazon account (the newest Paperwhite, the second-generation Scribe, or the Colorsoft), you won’t be offered the download option. Amazon’s site will also only allow you to download a single book at a time, which could take quite a while, depending on the size of your library.


SEC’s “scorched-earth” lawsuit against Coinbase to be dropped, company says

coinbase, Cryptocurrency, cryptocurrency exchange, Donald Trump, Policy, SEC, Securities and Exchange Commission / Shannon Garcia / February 23, 2025

On Friday, a Coinbase executive declared the “war against crypto” over, “at least as it applies to Coinbase.”

According to Coinbase Chief Legal Officer Paul Grewal, the US Securities and Exchange Commission (SEC) plans to drop its lawsuit against the largest US cryptocurrency exchange as the agency shifts to embrace Donald Trump’s new approach to regulating cryptocurrency in the US.

The SEC sued Coinbase in 2023, accusing Coinbase of “operating its crypto asset trading platform as an unregistered national securities exchange, broker, and clearing agency” and “failing to register the offer and sale of its crypto asset staking-as-a-service program.”

“Since at least 2019, Coinbase has made billions of dollars unlawfully facilitating the buying and selling of crypto asset securities,” the SEC alleged.

At that time, the SEC claimed that Coinbase’s supposedly dodgy operations were depriving investors of “significant protections, including inspection by the SEC, recordkeeping requirements, and safeguards against conflicts of interest, among others.” The litigation was intended to protect Coinbase customers, the SEC said, by holding Coinbase to the same standards as any service acting as an exchange, broker, or clearing agency.

Former SEC Chair Gary Gensler, long considered an adversary of the crypto industry, had warned that Coinbase “deliberately” flouted rules to cheat investors out of protections for financial gain. That left customers exposed to risks, Gensler claimed, and allowed for insider trading that resulted in a settlement.

“You simply can’t ignore the rules because you don’t like them or because you’d prefer different ones: the consequences for the investing public are far too great,” Gensler said.


Leaked chat logs expose inner workings of secretive ransomware group

Biz & IT, black basta, chats, leaks, ransomware, Security / Shannon Garcia / February 22, 2025

Researchers who have read the Russian-language texts said they exposed internal rifts in the secretive organization that have escalated since one of its leaders was arrested, because the arrest increases the threat of other members being tracked down as well. The heightened tensions have contributed to growing rifts between the current leader, believed to be Oleg Nefedov, and his subordinates. One of the disagreements involved his decision to target a bank in Russia, which put Black Basta in the crosshairs of law enforcement in that country.

“It turns out that the personal financial interests of Oleg, the group’s boss, dictate the operations, disregarding the team’s interests,” a researcher at Prodaft wrote. “Under his administration, there was also a brute force attack on the infrastructure of some Russian banks. It seems that no measures have been taken by law enforcement, which could present a serious problem and provoke reactions from these authorities.”

The leaked trove also includes details about other members, including two administrators using the names Lapa and YY, and Cortes, a threat actor linked to the Qakbot ransomware group. Also exposed are more than 350 unique links taken from ZoomInfo, a cloud service that provides data about companies and business individuals. The leaked links provide insights into how Black Basta members used the service to research the companies they targeted.

Security firm Hudson Rock said it has already fed the chat transcripts into ChatGPT to create BlackBastaGPT, a resource to help researchers analyze Black Basta operations.


Researchers figure out how to get fresh lithium into batteries

batteries, chemistry, materials science, Science / Shannon Garcia / February 22, 2025

In their testing, they use a couple of unusual electrode materials, such as a chromium oxide (Cr₈O₂₁) and an organic polymer (a sulfurized polyacrylonitrile). Both of these have significant weight advantages over the typical materials used in today’s batteries, although the resulting batteries typically lasted less than 500 cycles before dropping to 80 percent of their original capacity.

But the striking experiment came when they used LiSO₂CF₃ to rejuvenate a battery that had been manufactured as normal but had lost capacity due to heavy use. Treating a lithium-iron phosphate battery that had lost 15 percent of its original capacity restored almost all of what was lost, allowing it to hold over 99 percent of its original charge. They also ran a battery for repeated cycles with rejuvenation every few thousand cycles. At just short of 12,000 cycles, it still could be restored to 96 percent of its original capacity.

Before you get too excited, there are a couple of things worth noting about lithium-iron phosphate cells. The first is that, relative to their charge capacity, they’re a bit heavy, so they tend to be used in large, stationary batteries like the ones in grid-scale storage. They’re also long-lived on their own; with careful management, they can take over 8,000 cycles before they drop to 80 percent of their initial capacity. It’s not clear whether similar rejuvenation is possible in the battery chemistries typically used for the sorts of devices that most of us own.

The final caution is that the battery needs to be modified so that fresh electrolytes can be pumped in and the gases released by the breakdown of the LiSO₂CF₃ removed. It’s safest if this sort of access is built into the battery from the start, rather than provided by modifying it much later, as was done here. And the piping needed would put a small dent in the battery’s capacity per volume if so.

All that said, the treatment demonstrated here would replenish even a well-managed battery closer to its original capacity, and it would largely restore the capacity of something that hadn’t been carefully managed. That would allow us to get far more out of the initial expense of battery manufacturing, meaning it might make sense for batteries destined for a large storage facility, where lots of them could potentially be treated at the same time.

Nature, 2025. DOI: 10.1038/s41586-024-08465-y (About DOIs).


Asus’ new “Fragrance Mouse” is a wireless mouse that also smells

ASUS, asus fragrance mouse, Mouse, Tech, wireless mouse / Shannon Garcia / February 22, 2025

Aside from the customizable stink, the Fragrance Mouse is a reasonably full-featured functional PC accessory. It supports Bluetooth as well as the USB wireless dongle, three DPI levels (1,200, 1,600, and 2,400) for customizing responsiveness, and understated white and pink color options. Asus says the mouse’s switches are rated for 10 million clicks, ensuring that you will be able to smell your mouse for years to come.

We’ve emailed Asus to ask about pricing and availability and will update the article if we get a response.

Strange as it is, the Fragrance Mouse isn’t totally without precedent; in the summer of 2024, Asus released a laptop called the Adol 14 Air that included a compartment in the lid that could hold a switchable “fragrance pack.” But this laptop was only released in China, so laptop buyers in the US and other countries weren’t given an opportunity to smell it firsthand.

The Fragrance Mouse doesn’t feel like a thing that anyone was asking for, but it’s also probably something that no one thought not to ask for. And that, my friends, is the place where imagination and innovation thrive.


Turning the Moon into a fuel depot will take a lot of power

chemistry, oxygen, oxygen production, Science, Space, Space exploration / Shannon Garcia / February 18, 2025

Getting oxygen from regolith takes 24 kWh per kilogram, and we’d need tonnes.

Without adjustments for relativity, clocks here and on the Moon would rapidly diverge. Credit: NASA

If humanity is ever to spread out into the Solar System, we’re going to need to find a way to put fuel into rockets somewhere other than the cozy confines of a launchpad on Earth. One option for that is in low-Earth orbit, which has the advantage of being located very close to said launch pads. But it has the considerable disadvantage of requiring a lot of energy to escape Earth’s gravity—it takes a lot of fuel to put substantially less fuel into orbit.

One alternative is to produce fuel on the Moon. We know there is hydrogen and oxygen present, and the Moon’s gravity is far easier to overcome, meaning more of what we produce there can be used to send things deeper into the Solar System. But there is a tradeoff: any fuel production infrastructure will likely need to be built on Earth and sent to the Moon.

How much infrastructure is that going to involve? A study released today by PNAS evaluates the energy costs of producing oxygen on the Moon, and finds that they’re substantial: about 24 kWh per kilogram. This doesn’t sound bad until you start considering how many kilograms we’re going to eventually need.

Free the oxygen!

The math that makes refueling from the Moon appealing is pretty simple. “As a rule of thumb,” write the authors of the new study on the topic, “rockets launched from Earth destined for [Earth-Moon Lagrange Point 1] must burn ~25 kg of propellant to transport one kg of payload, whereas rockets launched from the Moon to [Earth-Moon Lagrange Point 1] would burn only ~four kg of propellant to transport one kg of payload.” Departing from the Earth-Moon Lagrange Point for locations deeper into the Solar System also requires less energy than leaving low-Earth orbit, meaning the fuel we get there is ultimately more useful, at least from an exploration perspective.
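As a rough illustration of that rule of thumb, the sketch below compares the propellant needed to deliver the same payload to Earth-Moon Lagrange Point 1 from each starting point. The two ratios are the approximate figures quoted from the study; the one-tonne payload and the function name are assumptions chosen purely for illustration.

```python
# Approximate rule-of-thumb ratios quoted from the study
# (kg of propellant burned per kg of payload delivered to EML1).
PROPELLANT_PER_KG_FROM_EARTH = 25.0
PROPELLANT_PER_KG_FROM_MOON = 4.0


def propellant_needed(payload_kg: float, ratio: float) -> float:
    """Propellant mass (kg) for a payload, given a propellant-to-payload ratio."""
    return payload_kg * ratio


# Hypothetical payload, chosen only to make the comparison concrete.
payload_kg = 1_000.0

from_earth = propellant_needed(payload_kg, PROPELLANT_PER_KG_FROM_EARTH)
from_moon = propellant_needed(payload_kg, PROPELLANT_PER_KG_FROM_MOON)
savings_factor = from_earth / from_moon

print(f"From Earth: {from_earth:,.0f} kg of propellant")
print(f"From the Moon: {from_moon:,.0f} kg of propellant")
print(f"Launching from the Moon needs about {savings_factor:.2f}x less propellant")
```

Because both figures are rough approximations, the resulting factor of about six should be read as an order-of-magnitude guide rather than a precise result.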

But, of course, you need to make the fuel there in the first place. The obvious choice for that is water, which can be split to produce hydrogen and oxygen. We know there is water on the Moon, but we don’t yet know how much, and whether it’s concentrated into large deposits. Given that uncertainty, people have also looked at other materials that we know are present in abundance on the Moon’s surface.

And there’s probably nothing more abundant on that surface than regolith, the dust left over from constant tiny impacts that have, over time, eroded lunar rocks. The regolith is composed of a variety of minerals, many of which contain oxygen, typically the heavier component of rocket fuel. And a variety of people have figured out the chemistry involved in separating oxygen from these minerals on the scale needed for rocket fuel production.

But knowing the chemistry is different from knowing what sort of infrastructure is needed to get that chemistry done at a meaningful scale. To get a sense of this, the researchers decided to focus on isolating oxygen from a mineral called ilmenite, or FeTiO₃. It’s not the easiest way to get oxygen—iron oxides win out there—but it’s well understood. Someone actually patented oxygen production from ilmenite back in the 1970s, and two hardware prototypes have been developed, one of which may be sent to the Moon on a future NASA mission.

The researchers propose a system that would harvest regolith, partly purify the ilmenite, then combine it with hydrogen at high temperatures, which would strip the oxygen out as water, leaving behind purified iron and titanium (both of which may be useful to have). The resulting water would then be split to feed the hydrogen back into the system, while the oxygen can be sent off for use in rockets.

(This wouldn’t solve the issue of what that oxygen will ultimately oxidize to power a rocket. But oxygen is the heavier component of most rocket fuel combinations, typically about 80 percent of the mass, and so the bigger challenge to get to a fuel depot.)

Obviously, this process will require a lot of infrastructure, like harvesters, separators, high-temperature reaction chambers, and more. But the researchers focus on a single element: how much power will it suck down?

More power!

To get their numbers, the researchers made a few simplifying assumptions. These include assuming that it’s possible to purify ilmenite from raw regolith and that it will be present in particles small enough that about half the material present will participate in chemical reactions. They ignored both the potential to get even more oxygen from the iron and titanium oxides present and the potential for contamination from problematic materials like hydrogen sulfide or hydrochloric acid.

The team found that almost all of the energy is consumed at three steps in the process: the high-temperature hydrogen reaction that produces water (55 percent), splitting the water afterwards (38 percent), and converting the resulting oxygen to its liquid form (five percent). The typical total usage, depending on factors like the concentration of ilmenite in the regolith, worked out to be about 24 kWh for each kilogram of liquid oxygen.
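To make those shares concrete, here is a minimal sketch that converts the quoted percentages into energy per kilogram of liquid oxygen. The 24 kWh total and the three percentages come from the article; the step labels, the function name, and the "everything else" remainder are assumptions added for illustration.

```python
# Figures quoted in the article: total energy per kg of liquid oxygen
# and the share consumed by each of the three dominant steps.
TOTAL_KWH_PER_KG = 24.0
STEP_SHARES = {
    "hydrogen reaction producing water": 0.55,
    "splitting the water": 0.38,
    "liquefying the oxygen": 0.05,
}


def energy_breakdown(total_kwh: float, shares: dict) -> dict:
    """Split a total energy figure across steps, with the remainder as a catch-all."""
    breakdown = {step: total_kwh * share for step, share in shares.items()}
    # Whatever the listed steps do not account for (an assumed catch-all bucket).
    breakdown["everything else"] = total_kwh - sum(breakdown.values())
    return breakdown


breakdown = energy_breakdown(TOTAL_KWH_PER_KG, STEP_SHARES)
for step, kwh in breakdown.items():
    print(f"{step}: {kwh:.2f} kWh per kg of liquid oxygen")
```

On these numbers, the high-temperature reaction alone accounts for roughly 13 kWh of every 24, which is why the efficiency of heating matters so much to the total.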

Obviously, the numbers are sensitive to how efficiently you can do things like heat the reaction mix. (It might be possible to do this heating with concentrated solar, avoiding the use of electricity for this entirely, but the authors didn’t analyze that.) But they are also sensitive to less obvious efficiencies. For example, a better separation of the ilmenite from the rest of the regolith means you’re using less energy to heat contaminants. So, while the energetic cost of that separation is small, it pays off to do it effectively.

Based on orbital observations, the researchers map out the areas where ilmenite is present at high enough concentrations for this approach to make sense. These include some of the maria on the near side of the Moon, so they’re easy to get to.

A map of the lunar surface, with areas with high ilmenite concentrations shown in blue. Credit: Leger et al.

On its own, 24 kWh doesn’t seem like a lot of energy. The problem is that we will need a lot of kilograms. The researchers estimate that getting an empty SpaceX Starship from the lunar surface to the Earth-Moon Lagrange Point takes 80 tonnes of liquid oxygen. And a fully fueled Starship can hold over 500 tonnes of liquid oxygen.

We can compare that to something like the solar array on the International Space Station, which has a capacity of about 100 kW. That means it could power the production of about four kilograms of oxygen an hour. At that rate, it’ll take a bit over 10 days to produce a tonne, and a bit more than two years to get enough oxygen to get an empty Starship to the Lagrange Point, assuming 24-7 production. Solar arrays on the near side would only produce power about half the time, given the lunar day, roughly doubling those timelines.
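The back-of-the-envelope estimate above can be reproduced with a short sketch. The 24 kWh per kilogram, the 100 kW array, and the 80-tonne requirement are the figures from the article; the function name, the 365.25-day year, and the 50 percent duty cycle standing in for the lunar day are assumptions made for illustration.

```python
# Figures quoted in the article.
KWH_PER_KG_OXYGEN = 24.0   # energy to produce 1 kg of liquid oxygen
ARRAY_POWER_KW = 100.0     # roughly the ISS solar array's capacity
STARSHIP_EMPTY_KG = 80_000.0  # oxygen to lift an empty Starship to the Lagrange Point


def production_days(mass_kg: float, power_kw: float, duty_cycle: float = 1.0) -> float:
    """Days needed to produce mass_kg of oxygen at a given power and duty cycle."""
    energy_kwh = mass_kg * KWH_PER_KG_OXYGEN
    hours = energy_kwh / (power_kw * duty_cycle)
    return hours / 24.0


days_per_tonne = production_days(1_000.0, ARRAY_POWER_KW)
days_continuous = production_days(STARSHIP_EMPTY_KG, ARRAY_POWER_KW)
# Assumed 50 percent duty cycle: solar power only during the lunar day.
days_half_time = production_days(STARSHIP_EMPTY_KG, ARRAY_POWER_KW, duty_cycle=0.5)

print(f"One tonne: {days_per_tonne:.1f} days")
print(f"Empty Starship, continuous: {days_continuous / 365.25:.2f} years")
print(f"Empty Starship, half-time power: {days_half_time / 365.25:.2f} years")
```

Using the unrounded rate of about 4.2 kilograms an hour gives almost exactly 10 days per tonne; rounding down to four kilograms an hour, as the text does, pushes that slightly higher.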

Obviously, we can build larger arrays than that, but that boosts the amount of material that needs to be sent to the Moon from Earth. It may make more sense to use nuclear power. While that would likely involve more infrastructure than solar arrays, it would allow the facilities to run around the clock, thus getting more production from everything else we’ve shipped from Earth.

This paper isn’t meant to be the final word on the possibilities for lunar-based refueling; it’s simply an early attempt to put hard numbers on what ultimately might be the best way to explore our Solar System. Still, it provides some perspective on just how much effort we’ll need to make before that sort of exploration becomes possible.

PNAS, 2025. DOI: 10.1073/pnas.2306146122 (About DOIs).

John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.


Author name: Shannon Garcia