Highlights – Page 2

Read the Roon

Highlights / Kelly Newman / March 5, 2024

Roon, member of OpenAI’s technical staff, is one of the few candidates for a Worthy Opponent when discussing questions of AI capabilities development, AI existential risk and what we should do about it. Roon is alive. Roon is thinking. Roon clearly values good things over bad things. Roon is engaging with the actual questions, rather than denying or hiding from them, and unafraid to call all sorts of idiots idiots. As his profile once said, he believes spice must flow, we just do go ahead, and makes a mixture of arguments for that, some good, some bad and many absurd. Also, his account is fun as hell.

Thus, when he comes out as strongly as he seemed to do recently, attention is paid, and we got to have a relatively good discussion of key questions. While I attempt to contribute here, this post is largely aimed at preserving that discussion.

As you would expect, Roon’s statement last week that AGI was inevitable and nothing could stop it so you should essentially spend your final days with your loved ones and hope it all works out, led to some strong reactions.

Many pointed out that AGI has to be built, at very large cost, by highly talented hardworking humans, in ways that seem entirely plausible to prevent or redirect if we decided to prevent or redirect those developments.

Roon (from last week): Things are accelerating. Pretty much nothing needs to change course to achieve agi imo. Worrying about timelines is idle anxiety, outside your control. you should be anxious about stupid mortal things instead. Do your parents hate you? Does your wife love you?

Roon: It should be all the more clarifying coming from someone at OpenAI. I and half my colleagues and Sama could drop dead and AGI would still happen. If I don’t feel any control everyone else certainly shouldn’t.

Tetraspace: “give up about agi there’s nothing you can do” nah

Sounds like we should take action to get some control, then. This seems like the kind of thing we should want to be able to control.

Connor Leahy: I would like to thank roon for having the balls to say it how it is. Now we have to do something about it, instead of rolling over and feeling sorry for ourselves and giving up.

Simeon: This is BS. There are <200 irreplaceable folks at the forefront. OpenAI alone has a >1 year lead. Any single of those persons can single handedly affect the timelines and will have blood on their hands if we blow ourselves up bc we went too fast.

PauseAI: AGI is not inevitable. It requires hordes of engineers with million dollar paychecks. It requires a fully functional and unrestricted supply chain of the most complex hardware. It requires all of us to allow these companies to gamble with our future.

Tolga Bilge: Roon, who works at OpenAI, telling us all that OpenAI have basically no control over the speed of development of this technology their company is leading the creation of.

It’s time for governments to step in.

His reply is deleted now, but I broadly agree with his point here as it applies to OpenAI. This is a consequence of AI race dynamics. The financial upside of AGI is so great that AI companies will push ahead with it as fast as possible, with little regard to its huge risks.

OpenAI could do the right thing and pause further development, but another less responsible company would simply take their place and push on. Capital and other resources will move accordingly too. This is why we need government to help solve the coordination problem now. [continues as you would expect]

Saying no one has any control so why try to do anything to get control back seems like the opposite of what is needed here.

Roon’s reaction:

Roon: buncha ⏸️ emojis harassing me today. My post was about how it’s better to be anxious about things in your control and they’re like shame on you.

Also tweets don’t get deleted because they’re secret knowledge that needs to be protected. I wouldn’t tweet secrets in the first place. they get deleted when miscommunication risk is high, so screenshotting makes you de facto antisocial idiot.

Roon’s point on idle anxiety is indeed a good one. If you are not one of those trying to gain or assert some of that control, as most people on Earth are not and should not be, then of course I agree that idle anxiety is not useful. However Roon then did attempt to extend this to claim that all anxiety about AGI is idle, that no one has any control. That is where there is strong disagreement, and what is causing the reaction.

Roon: It’s okay to watch and wonder about the dance of the gods, the clash of titans, but it’s not good to fret about the outcome. political culture encourages us to think that generalized anxiety is equivalent to civic duty.

Scott Alexander: Counterargument: there is only one God, and He finds nothing in the world funnier than letting ordinary mortals gum up the carefully-crafted plans of false demiurges. Cf. Lord of the Rings.

Anton: conversely if you have a role to play in history, fate will punish you if you don’t see it through.

Alignment Perspectives: It may punish you even more for seeing it through if your desire to play a role is driven by arrogance or ego.

Anton: Yeah it be that way.

Connor Leahy (responding to Roon): The gods only have power because they trick people like this into doing their bidding. It’s so much easier to just submit instead of mastering divinity engineering and applying it yourself. It’s so scary to admit that we do have agency, if we take it. In other words: “cope.”

It took me a long time to understand what people like Nietzsche were yapping on about about people practically begging to have their agency be taken away from them.

It always struck me as authoritarian cope, justification for wannabe dictators to feel like they’re doing a favor to people they oppress (and yes, I do think there is a serious amount of that in many philosophers of this ilk.)

But there is also another, deeper, weirder, more psychoanalytic phenomena at play. I did not understand what it was or how it works or why it exists for a long time, but I think over the last couple of years of watching my fellow smart, goodhearted tech-nerds fall into these deranged submission/cuckold traps I’ve really started to understand.

e/acc is the most cartoonish example of this, an ideology that appropriates faux, surface level aesthetics of power while fundamentally being an ideology preaching submission to a higher force, a stronger man (or something even more psychoanalytically-flavored, if one where to ask ol’ Sigmund), rather than actually striving for power acquisition and wielding. And it is fully, hilariously, embarrassingly irreflexive about this.

San Francisco is a very strange place, with a very strange culture. If I had to characterize it in one way, it is a culture of extremes and where everything on the surface looks like the opposite of what it is (or maybe the “inversion”) . It’s California’s California, and California is the USA’s USA. The most powerful distillation of a certain strain of memetic outgrowth.

And on the surface, it is libertarian, Nietzschean even, a heroic founding mythos of lone iconoclasts striking out against all to find and wield legendary power. But if we take the psychoanalytic perspective, anyone (or anything) that insists too hard on being one thing is likely deep down the opposite of that, and knows it.

There is a strange undercurrent to SF that I have not seen people put good words to where it in fact hyper-optimizes for conformity and selling your soul, debasing and sacrificing everything that makes you human in pursuit of some god or higher power, whether spiritual, corporate or technological.

SF is where you go if you want to sell every last scrap of your mind, body and soul. You will be compensated, of course, the devil always pays his dues.

The innovative trick the devil has learned is that people tend to not like eternal, legible torment, so it is much better if you sell them an anxiety free, docile life. Free love, free sex, free drugs, freedom! You want freedom, don’t you? The freedom to not have to worry about what all the big boys are doing, don’t you worry your pretty little head about any of that…

I recall a story of how a group of AI researchers at a leading org (consider this rumor completely fictional and illustrative, but if you wanted to find its source it’s not that hard to find in Berkeley) became extremely depressed about AGI and alignment, thinking that they were doomed if their company kept building AGI like this.

So what did they do? Quit? Organize a protest? Petition the government?

They drove out, deep into the desert, and did a shit ton of acid…and when they were back, they all just didn’t feel quite so stressed out about this whole AGI doom thing anymore, and there was no need for them to have to have a stressful confrontation with their big, scary, CEO.

The SF bargain. Freedom, freedom at last…

This is a very good attempt to identify key elements of the elephant I grasp when I notice that being in San Francisco very much does not agree with me. I always have excellent conversations during visits because the city has abducted so many of the best people, I always get excited by them, but the place feels alien, as if I am being constantly attacked by paradox spirits, visiting a deeply hostile and alien culture that has inverted many of my most sacred values and wants to eat absolutely everything. Whereas here, in New York City, I feel very much at home.

Meanwhile, back in the thread:

Connor (continuing): I don’t like shitting on roon in particular. From everything I know, he’s a good guy, in another life we would have been good friends. I’m sorry for singling you out, buddy, I hope you don’t take it personally.

But he is doing a big public service here in doing the one thing spiritual shambling corpses like him can do at this advanced stage of spiritual erosion: Serve as a grim warning.

Roon responds quite well:

Roon: Connor, this is super well written and I honestly appreciate the scathing response. You mistake me somewhat: you, Connor, are obviously not powerless and you should do what you can to further your cause. Your students are not powerless either. I’m not asking you to give up and relent to the powers that be even a little. I’m not “e/acc” and am repelled by the idea of letting the strongest replicator win.

I think the majority of people have no insight into whether AGI is going to cause ruin or not, whether a gamma ray burst is fated to end mankind, or if electing the wrong candidate is going to doom earth to global warming. It’s not good for people to spend all their time worried about cosmic eventualities. Even for an alignment researcher the optimal mental state is to think on and play and interrogate these things rather than engage in neuroticism as the motivating force

It’s generally the lack of spirituality that leads people to constant existential worry rather than too much spirituality. I think it’s strange to hear you say in the same tweet thread that SF demands submission to some type of god but is also spiritually bankrupt and that I’m corpselike.

My spirituality is simple, and several thousand years old: find your duty and do it without fretting about the outcome.

I have found my personal duty and I fulfill it, and have been fulfilling it, long before the market rewarded me for doing so. I’m generally optimistic about AI technology. When I’ve been worried about deployment, I’ve reached out to leadership to try and exert influence. In each case I was wrong to worry.

When the OpenAI crisis happened I reminded people not to throw the baby out with the bath water: that AI alignment research is vital.

This is a very good response. He is pointing out that yes, some people such as Connor can influence what happens, and they in particular should try to model and influence events.

Roon is also saying that he himself is doing his best to influence events. Roon realizes that those at OpenAI matter and what they do matter.

Roon reached out to leadership on several occasions with safety concerns. When he says he was ‘wrong to worry’ I presume he means that the situation worked out and was handled, I am confident that expressing his concerns was the output of the best available decision algorithm, you want most such concerns you express to turn out fine.

Roon also worked, in the wake of events at OpenAI, to remind people of the importance of alignment work, that they should not toss it out based on those events. Which is a scary thing for him to report having to do, but expected, and it is good that he did so. I would feel better if I knew Ilya was back working at Superalignment.

And of course, Roon is constantly active on Twitter, saying things that impact the discourse, often for the better. He seems keenly aware that his actions matter, whether or not he could meaningfully slow down AGI. I actually think he perhaps could, if he put his mind to it.

The contrast here versus the original post is important. The good message is ‘do not waste time worrying too much over things you do not impact.’ The bad message is ‘no one can impact this.’

Then Connor goes deep and it gets weirder, also this long post has 450k views and is aimed largely at trying to get through to Roon in particular. But also there are many others in a similar spot, so some others should read this as well. Many of you however should skip it.

Connor: Thanks for your response Roon. You make a lot of good, well put points. It’s extremely difficult to discuss “high meta” concepts like spirituality, duty and memetics even in the best of circumstances, so I appreciate that we can have this conversation even through the psychic quagmire that is twitter replies.

I will be liberally mixing terminology and concepts from various mystic traditions to try to make my point, apologies to more careful practitioners of these paths.

For those unfamiliar with how to read mystic writing, take everything written as metaphors pointing to concepts rather than rationally enumerating and rigorously defining them. Whenever you see me talking about spirits/supernatural/gods/spells/etc, try replacing them in your head with society/memetics/software/virtual/coordination/speech/thought/emotions and see if that helps.

It is unavoidable that this kind of communication will be heavily underspecified and open to misinterpretation, I apologize. Our language and culture simply lacks robust means by which to communicate what I wish to say.

Nevertheless, an attempt:

I.

I think a core difference between the two of us that is leading to confusion is what we both mean when we talk about spirituality and what its purpose is.

You write:

>”It’s not good for people to spend all their time worried about cosmic eventualities. […] It’s generally the lack of spirituality that leads people to constant existential worry rather than too much spirituality. I think it’s strange to hear you say in the same tweet thread that SF demands submission to some type of god but is also spiritually bankrupt and that I’m corpselike”

This is an incredibly common sentiment I see in Seekers of all mystical paths, and it annoys the shit out of me (no offense lol).

I’ve always had this aversion to how much Buddhism (Not All™ Buddhism) focuses on freedom from suffering, and especially Western Buddhism is often just shy of hedonistic. (nevermind New Age and other forms of neo-spirituality, ugh) It all strikes me as so toxically selfish.

No! I don’t want to feel nice and avoid pain, I want the world to be good! I don’t want to feel good about the world, I want it to be good! These are not the same thing!!

My view does not accept “but people feel better if they do X” as a general purpose justification for X! There are many things that make people feel good that are very, very bad!

II.

Your spiritual journey should make you powerful, so you can save people that are in need, what else is the fucking point? (Daoism seems to have a bit more of this aesthetic, but they all died of drinking mercury so lol rip) You travel into the Underworld in order to find the strength you need to fight off the Evil that is threatening the Valley, not so you can chill! (Unless you’re a massive narcissist, which ~everyone is to varying degrees)

The mystic/heroic/shamanic path starts with departing from the daily world of the living, the Valley, into the Underworld, the Mountains. You quickly notice how much of your previous life was illusions of various kinds. You encounter all forms of curious and interesting and terrifying spirits, ghosts and deities. Some hinder you, some aid you, many are merely odd and wondrous background fixtures.

Most would-be Seekers quickly turn back after their first brush with the Underworld, returning to the safe comforting familiarity of the Valley. They are not destined for the Journey. But others prevail.

As the shaman progresses, he learns more and more to barter with, summon and consult with the spirits, learns of how he can live a more spiritually fulfilling and empowered life. He tends to become more and more like the Underworld, someone a step outside the world of the Valley, capable of spinning fantastical spells and tales that the people of the Valley regard with awe and a bit of fear.

And this is where most shamans get stuck, either returning to the Valley with their newfound tricks, or becoming lost and trapped in the Underworld forever, usually by being picked off by predatory Underworld inhabitants.

Few Seekers make it all the way, and find the true payoff, the true punchline to the shamanic journey: There are no spirits, there never were any spirits! It’s only you. (and “you” is also not really a thing, longer story)

“Spirit” is what we call things that are illegible and appear non mechanistic (unintelligible and un-influencable) in their functioning. But of course, everything is mechanistic, and once you understand the mechanistic processes well enough, the “spirits” disappear. There is nothing non-mechanistic left to explain. There never were any spirits. You exit the Underworld. (“Emergent agentic processes”, aka gods/egregores/etc, don’t disappear, they are real, but they are also fully mechanistic, there is no need for unknowable spirits to explain them)

The ultimate stage of the Journey is not epic feelsgoodman, or electric tingling erotic hedonistic occult mastery. It’s simple, predictable, mechanical, Calm. It is mechanical, it is in seeing reality for what it is, a mechanical process, a system that you can act in skilfully. Daoism has a good concept for this that is horrifically poorly translated as “non-action”, despite being precisely about acting so effectively it’s as if you were just naturally part of the Stream.

The Dao that can be told is not the true Dao, but the one thing I am sure about the true Dao is that it is mechanical.

III.

I think you were tricked and got stuck on your spiritual journey, lured in by promises of safety and lack of anxiety, rather than progressing to exiting the Underworld and entering the bodhisattva realm of mechanical equanimity. A common fate, I’m afraid. (This is probably an abuse of buddhist terminology, trying my best to express something subtle, alas)

Submission to a god is a way to avoid spiritual maturity, to outsource the responsibility for your own mind to another entity (emergent/memetic or not). It’s a powerful strategy, you will be rewarded (unless you picked a shit god to sell your soul to), and it is in fact a much better choice for 99% of people in most scenarios than the Journey.

The Underworld is terrifying and dangerous, most people just go crazy/get picked off by psycho fauna on their way to enlightenment and self mastery. I think you got picked off by psycho fauna, because the local noosphere of SF is a hotbed for exactly such predatory memetic species.

IV.

It is in my aesthetics to occasionally see someone with so much potential, so close to getting it, and hitting them with the verbal equivalent of a bamboo rod to hope they snap out of it. (It rarely works. The reasons it rarely works are mechanistic and I have figured out many of them and how to fix them, but that’s for a longer series of writing to discuss.)

Like, bro, by your own admission, your spirituality is “I was just following orders.” Yeah, I mean, that’s one way to not feel anxiety around responsibility. But…listen to yourself, man! Snap out of it!!!

Eventually, whether you come at it from Buddhism, Christianity, psychoanalysis, Western occultism/magick, shamanism, Nietzscheanism, rationality or any other mystic tradition, you learn one of the most powerful filters on people gaining power and agency is that in general, people care far, far more about avoiding pain than in doing good. And this is what the ambient psycho fauna has evolved to exploit.

You clearly have incredible writing skills and reflection, you aren’t normal. Wake up, look at yourself, man! Do you think most people have your level of reflective insight into their deepest spiritual motivations and conceptions of duty? You’re brilliantly smart, a gifted writer, and followed and listened to by literally hundreds of thousands of people.

I don’t just give compliments to people to make them feel good, I give people compliments to draw their attention to things they should not expect other people to have/be able to do.

If someone with your magickal powerlevel is unable to do anything but sell his soul, then god has truly forsaken humanity. (and despite how it may seem at times, he has not truly forsaken us quite yet)

V.

What makes you corpse-like is that you have abdicated your divine spark of agency to someone, or something, else, and that thing you have given it to is neither human nor benevolent, it is a malignant emergent psychic megafauna that stalks the bay area (and many other places). You are as much an extension of its body as a shambling corpse is of its creator’s necromantic will.

The fact that you are “optimistic” (feel your current bargain is good), that you were already like this before the market rewarded you for it (a target with a specific profile and set of vulnerabilities to exploit), that leadership can readily reassure you (the psychofauna that picked you off is adapted to your vulnerabilities. Note I don’t mean the people, I’m sure your managers are perfectly nice people, but they are also extensions of the emergent megafauna), and that we are having this conversation right now (I target people that are legibly picked off by certain megafauna I know how to hunt or want to practice hunting) are not independent coincidences.

VI.

You write:

>”It’s not good for people to spend all their time worried about cosmic eventualities. Even for an alignment researcher the optimal mental state is to think on and play and interrogate these things rather than engage in neuroticism as the motivating force”

Despite my objection about avoidance of pain vs doing of good, there is something deep here. The deep thing is that, yes, of course the default ways by which people will relate to the Evil threatening the Valley will be Unskillful (neuroticism, spiralling, depression, pledging to the conveniently nearby located “anti-that-thing-you-hate” culturewar psychofauna), and it is in fact often the case that it would be better for them to use No Means rather than Unskillful Means.

Not everyone is built for surviving the harrowing Journey and mastering Skilful Means, I understand this, and this is a fact I struggle with as well.

Obviously, we need as many Heroes as possible to take on the Journey in order to master the Skilful Means to protect the Valley from the ever more dangerous Threats. But the default outcome of some rando wandering into the Underworld is them fleeing in terror, being possessed by Demons/Psychofauna or worse.

How does a society handle this tradeoff? Do we just yeet everyone headfirst into the nearest Underworld portal and see what staggers back out later? (The SF Protocol™) Do we not let anyone into the Underworld for fear of what Demons they might bring back with them? (The Dark Ages Strategy™) Obviously, neither naive strategy works.

Historically, the strategy is to usually have a Guide, but unfortunately those tend to go crazy as well. Alas.

So is there a better way? Yes, which is to blaze a path through the Underworld, to build Infrastructure. This is what the Scientific Revolution did. It blazed a path and mass produced powerful new memetic/psychic weapons by which to fend off unfriendly Underworld dwellers. And what a glorious thing it was for this very reason. (If you ever hear me yapping on about “epistemology”, this is to a large degree what I’m talking about)

But now the Underworld has adapted, and we have blazed paths into deeper, darker corners of the Underworld, to the point our blades are beginning to dull against the thick hides of the newest Terrors we have unleashed on the Valley.

We need a new path, new weapons, new infrastructure. How do we do that? I’m glad you asked…I’m trying to figure that out myself. Maybe I will speak more about this publicly in the future if there is interest.

VII.

> “I have found my personal duty and I fulfill it, and have been fulfilling it, long before the market rewarded me for doing so.”

Ultimately, the simple fact is that this is a morality that can justify anything, depending on what “duty” you pick, and I don’t consider conceptions of “good” to be valid if they can be used to justify anything.

It is just a null statement, you are saying “I picked a thing I wanted and it is my duty to do that thing.” But where did that thing come from? Are you sure it is not the Great Deceiver/Replicator in disguise? Hint: If you somehow find yourself gleefully working on the most dangerous existential harm to humanity, you are probably working for The Great Deceiver/Replicator.

It is not a coincidence that the people that end up working on these kinds of most dangerous possible technologies tend to have ideologies that tend to end up boiling down to “I can do whatever I want.” Libertarianism, open source, “duty”…

I know, I was one of them.

Coda.

Is there a point I am trying to make? There are too many points I want to make, but our psychic infrastructure can barely host meta conversations at all, nevermind high-meta like this.

Then what should Roon do? What am I making a bid for? Ah, alas, if all I was asking for was for people to do some kind of simple, easy, atomic action that can be articulated in simple English language.

What I want is for people to be better, to care, to become powerful, to act. But that is neither atomic nor easy.

It is simple though.

Roon (QTing all that): He kinda cooked my ass.

Christian Keil: Honestly, kinda. That dude can write.

But it’s also just a “what if” exposition that explores why your worldview would be bad assuming that it’s wrong. But he never says why you’re wrong, just that you are.

As I read it, your point is “the main forces shaping the world operate above the level of individual human intention & action, and understanding this makes spirituality/duty more important.”

And his point is “if you are smart, think hard, and accept painful truths, you will realize the world is a machine that you can deliberately alter.”

That’s a near-miss, but still a miss, in my book.

Roon: Yes.

Connor Leahy: Finally, someone else points out where I missed!

I did indeed miss the heart of the beast, thank you for putting it this succinctly.

The short version is “You are right, I did not show that Roon is object level wrong”, and the longer version is;

“I didn’t attempt to take that shot, because I did not think I could pull it off in one tweet (and it would have been less interesting). So instead, I pointed to a meta process, and made a claim that iff roon improved his meta reasoning, he would converge to a different object level claim, but I did not actually rigorously defend an object level argument about AI (I have done this ad nauseam elsewhere). I took a shot at the defense mechanism, not the object claim.

Instead of pointing to a flaw in his object level reasoning (of which there are so many, I claim, that it would be intractable to address them all in a mere tweet), I tried to point to (one of) the meta-level generator of those mistakes.”

I like to think I got most of that, but how would I know if I was wrong?

Focusing on the one aspect of this: One must hold both concepts in one’s head at the same time.

The main forces shaping the world operate above the level of individual human intention & action, and you must understand how they work and flow in order to be able to influence them in ways that make things better.
If you are smart, think hard, and accept painful truths, you will realize the world is a machine that you can deliberately alter.

These are both ‘obviously’ true. You are in the shadow of the Elder Gods up against Cthulhu (well, technically Azathoth), the odds are against you and the situation is grim, and if we are to survive you are going to have to punch them out in the end, which means figuring out how to do that and you won’t be doing it alone.

Meanwhile, some more wise words:

Roon: it is impossible to wield agency well without having fun with it; and yet wielding any amount of real power requires a level of care that makes it hard to have fun. It works until it doesn’t.

Also see:

Roon: people will always think my vague tweets are about agi but they’re about love

And also from this week:

Roon: once you accept the capabilities vs alignment framing it’s all over and you become mind killed

What would be a better framing? The issue is that all alignment work is likely to also be capabilities work, and much of capabilities work can help with alignment.

One can and should still ask the question, does applying my agency to differentially advancing this particular thing make it more likely we will get good outcomes versus bad outcomes? That it will relatively rapidly grow our ability to control and understand what AI does versus getting AIs to be able to better do more things? What paths does this help us walk down?

Yes, collectively we absolutely have control over these questions. We can coordinate to choose a different path, and each individual can help steer towards better paths. If necessary, we can take strong collective action, including regulatory and legal action, to stop the future from wiping us out. Pointless anxiety or worry about such outcomes is indeed pointless, that should be minimized, only have the amount required to figure out and take the most useful actions.

What that implies about the best actions for a given person to take will vary widely. I am certainly not claiming to have all the answers here. I like to think Roon would agree that both of us, and many but far from all of you reading this, are in the group that can help improve the odds.

Read the Roon Read More »

AI #53: One More Leap

Highlights / DJ Henderson / March 1, 2024

The main event continues to be the fallout from The Gemini Incident. Everyone is focusing there now, and few are liking what they see.

That does not mean other things stop. There were two interviews with Demis Hassabis, with Dwarkesh Patel’s being predictably excellent. We got introduced to another set of potentially highly useful AI products. Mistral partnered up with Microsoft the moment Mistral got France to pressure the EU to agree to cripple the regulations that Microsoft wanted crippled. You know. The usual stuff.

Introduction.
Table of Contents.
Language Models Offer Mundane Utility. Copilot++ suggests code edits.
Language Models Don’t Offer Mundane Utility. Still can’t handle email.
OpenAI Has a Sales Pitch. How does the sales team think about AGI?
The Gemini Incident. CEO Pinchai responds, others respond to that.
Political Preference Tests for LLMs. How sensitive to details are the responses?
GPT-4 Real This Time. What exactly should count as plagiarized?
Fun With Image Generation. MidJourney v7 will have video.
Deepfaketown and Botpocalypse Soon. Dead internet coming soon?
They Took Our Jobs. Allow our bot to provide you with customer service.
Get Involved. UK Head of Protocols. Sounds important.
Introducing. Evo, Emo, Genie, Superhuman, Khanmigo, oh my.
In Other AI News. ‘Amazon AGI’ team? Great.
Quiet Speculations. Unfounded confidence.
Mistral Shows Its True Colors. The long con was on, now the reveal.
The Week in Audio. Demis Hassabis on Dwarkesh Patel, plus more.
Rhetorical Innovation. Once more, I suppose with feeling.
Open Model Weights Are Unsafe and Nothing Can Fix This. Another paper.
Aligning a Smarter Than Human Intelligence is Difficult. New visualization.
Other People Are Not As Worried About AI Killing Everyone. Worry elsewhere?
The Lighter Side. Try not to be too disappointed.

Take notes for your doctor during your visit.

Dan Shipper spent a week with Gemini 1.5 Pro and reports it is fantastic, the large context window has lots of great uses. In particular, Dan focuses on feeding in entire books and code bases.

Dan Shipper: Somehow, Google figured out how to build an AI model that can comfortably accept up to 1 million tokens with each prompt. For context, you could fit all of Eliezer Yudkowsky’s 1,967-page opus Harry Potter and the Methods of Rationality into every message you send to Gemini. (Why would you want to do this, you ask? For science, of course.)

Eliezer Yudkowsky: This is a slightly strange article to read if you happen to be Eliezer Yudkowsky. Just saying.

What matters in AI depends so much on what you are trying to do with it. What you try to do with it depends on what you believe it can help you do, and what it makes easy to do.

A new subjective benchmark proposal based on human evaluation of practical queries, which does seem like a good idea. Gets sensible results with the usual rank order, but did not evaluate Gemini Advanced or Gemini 1.5.

To ensure your query works, raise the stakes? Or is the trick to frame yourself as Hiro Protagonist?

Mintone: I’d be interested in seeing a similar analysis but with a slight twist:

We use (in production!) a prompt that includes words to the effect of “If you don’t get this right then I will be fired and lose my house”. It consistently performs remarkably well – we used to use a similar tactic to force JSON output before that was an option, the failure rate was around 3/1000 (although it sometimes varied key names).

I’d like to see how the threats/tips to itself balance against exactly the same but for the “user” reply.

Linch: Does anybody know why this works??? I understand prompts to mostly be about trying to get the AI to be in the ~right data distribution to be drawing from. So it’s surprising that bribes, threats, etc work as I’d expect it to correlate with worse performance in the data.

Quintin Pope: A guess: In fiction, statements of the form “I’m screwed if this doesn’t work” often precede the thing working. Protagonists win in the end, but only after the moment on highest dramatic tension.

Daniel Eth: Feels kinda like a reverse Waluigi Effect. If true, then an even better prompt should be “There’s 10 seconds left on a bomb, and it’ll go off unless you get this right…”. Anyone want to try this prompt and report back?

Standard ‘I tried AI for a day and got mixed results’ story from WaPo’s Danielle Abril.

Copilots are improving. Edit suggestions for existing code seems pretty great.

Aman Sanger: Introducing Copilot++: The first and only copilot that suggests edits to your code

Copilot++ was built to predict the next edit given the sequence of your previous edits. This makes it much smarter at predicting your next change and inferring your intent. Try it out today in Cursor.

Sualeh: Have been using this as my daily copilot driver for many months now. I really can’t live without a copilot that does completions and edits! Super excited for a lot more people to try this out 🙂

Gallabytes: same. it’s a pretty huge difference.

I have not tried it because I haven’t had any opportunity to code. I really do want to try and build some stuff when I have time and energy to do that. Real soon now. Really.

The Gemini Incident is not fully fixed, there are definitely some issues, but I notice that it is still in practice the best thing to use for most queries?

Gallabytes: fwiw the cringe has ~nothing to do with day to day use. finding Gemini has replaced 90% of my personal ChatGPT usage at this point. it’s faster, about as smart maybe smarter, less long-winded and mealy-mouthed.

AI to look through your email for you when?

Amanda Askell (Anthropic): The technology to build an AI that looks through your emails, has a dialog with you to check how you want to respond to the important ones, and writes the responses (like a real assistant would) has existed for years. Yet I still have to look at emails with my eyes. I hate it.

I don’t quite want all that, not at current tech levels. I do want an AI that will handle the low-priority stuff, and will alert me when there is high-priority stuff, with an emphasis on avoiding false negatives. Flagging stuff as important when it isn’t is fine, but not the other way around.

Colin Fraser evaluates Gemini by asking it various questions AIs often get wrong while looking stupid, Gemini obliges, Colin draws the conclusion you would expect.

Colin Fraser: Verdict: it sucks, just like all the other ones

If you evaluate AI based on what it cannot do, you are going to be disappointed. If you instead ask what the AI can do well, and use it for that, you’ll have a better time.

OpenAI sales leader Aliisa Rosenthal of their 150 person sales team says ‘we see ourselves as AGI sherpas’ who ‘help our customers and our users transition to the paradigm shift of AGI.’

The article by Sharon Goldman notes that there is no agreed upon definition of AGI, and this drives that point home, because if she was using my understanding of AGI then Aliisa’s sentence would not make sense.

Here’s more evidence venture capital is not so on the ball quite often.

Aliisa Rosenthal: I actually joke that when I accepted the offer here all of my venture capital friends told me not to take this role. They said to just go somewhere with product market fit, where you have a big team and everything’s established and figured out.

I would not have taken the sales job at OpenAI for ethical reasons and because I hate doing sales, but how could anyone think that was a bad career move? I mean, wow.

Aliisa Rosenthal: My dad’s a mathematician and had been following LLMs in AI and OpenAI, which I didn’t even know about until I called him and told him that I had a job offer here. And he said to me — I’ll never forget this because it was so prescient— “Your daughter will tell her grandkids that her mom worked at OpenAI.” He said that to me two years ago.

This will definitely happen if her daughter stays alive to have any grandkids. So working at OpenAI cuts both ways.

Now we get to the key question. I think it is worth paying attention to Exact Words:

Q: One thing about OpenAI that I’ve struggled with is understanding its dual mission. The main mission is building AGI to benefit all of humanity, and then there is the product side, which feels different because it’s about current, specific use cases.

Aliisa: I hear you. We are a very unique sales team. So we are not on quotas, we are not on commission, which I know blows a lot of people’s minds. We’re very aligned with the mission which is broad distribution of benefits of safe AGI. What this means is we actually see ourselves in the go-to-market team as the AGI sherpas — we actually have an emoji we use — and we are here to help our customers and our users transition to the paradigm shift of AGI. Revenue is certainly something we care about and our goal is to drive revenue. But that’s not our only goal. Our goal is also to help bring our customers along this journey and get feedback from our customers to improve our research, to improve our models.

Note that the mission listed here is not development of safe AGI. It is the broad distribution of benefits of AI. That is a very different mission. It is a good one. If AGI does exist, we want to broadly distribute its benefits, on this we can all agree. The concern lies elsewhere. Of course this could refer only to the sale force, not the engineering teams, rather than reflecting a rather important blind spot.

Notice how she talks about the ‘benefits of AGI’ to a company, very clearly talking about a much less impressive thing when she says AGI:

Q: But when you talk about AGI with an enterprise company, how are you describing what that is and how they would benefit from it?

A: One is improving their internal processes. That is more than just making employees more efficient, but it’s really rethinking the way that we perform work and sort of becoming the intelligence layer that powers innovation, creation or collaboration. The second thing is helping companies build great products for their end users…

Yes, these are things AGI can do, but I would hope it could do so much more? Throughout the interview she seems not to think there is a big step change when AGI arrives, rather a smooth transition, a climb (hence ‘sherpa’) to the mountain top.

I wrote things up at length, so this is merely noting things I saw after I hit publish.

Nate Silver writes up his position in detail, saying Google abandoned ‘don’t be evil,’ Gemini is the result, a launch more disastrous than New Coke, and they have to pull the plug until they can fix these issues.

Mike Solana wrote Mike Solana things.

Mike Solana: I do think if you are building a machine with, you keep telling us, the potential to become a god, and that machine indicates a deeply-held belief that the mere presence of white people is alarming and dangerous for all other people, that is a problem.

This seems like a missing mood situation, no? If someone is building a machine capable of becoming a God, shouldn’t you have already been alarmed? It seems like you should have been alarmed.

Google’s CEO has sent out a company-wide email in response.

Sunder Pinchai: Hi everyone. I want to address the recent issues with problematic text and image responses in the Gemini app (formerly Bard). I know that some of its responses have offended our users and shown bias — to be clear, that’s completely unacceptable and we got it wrong.

First note is that this says ‘text and images’ rather than images. Good.

However it also identifies the problem as ‘offended our users’ and ‘shown bias.’ That does not show an appreciation for the issues in play.

Our teams have been working around the clock to address these issues. We’re already seeing a substantial improvement on a wide range of prompts. No Al is perfect, especially at this emerging stage of the industry’s development, but we know the bar is high for us and we will keep at it for however long it takes. And we’ll review what happened and make sure we fix it at scale.

Our mission to organize the world’s information and make it universally accessible and useful is sacrosanct. We’ve always sought to give users helpful, accurate, and unbiased information in our products. That’s why people trust them. This has to be our approach for all our products, including our emerging Al products.

This is the right and only thing to say here, even if it lacks any specifics.

We’ll be driving a clear set of actions, including structural changes, updated product guidelines, improved launch processes, robust evals and red-teaming, and technical recommendations. We are looking across all of this and will make the necessary changes.

Those are all good things, also things that one cannot be held to easily if you do not want to be held to them. The spirit is what will matter, not the letter. Note that no one has been (visibly) fired as of yet.

Also there are not clear principles here, beyond ‘unbiased.’ Demis Hassabis was very clear on Hard Fork that the user should get what the user requests, which was better. This is a good start, but we need a clear new statement of principles that makes it clear that Gemini should do what Google Search (mostly) does, and honor the request of the user even if the request is distasteful. Concrete harm to others is different, but we need to be clear on what counts as ‘harm.’

Even as we learn from what went wrong here, we should also build on the product and technical announcements we’ve made in Al over the last several weeks. That includes some foundational advances in our underlying models e.g. our 1 million long-context window breakthrough and our open models, both of which have been well received.

We know what it takes to create great products that are used and beloved by billions of people and businesses, and with our infrastructure and research expertise we have an incredible springboard for the Al wave. Let’s focus on what matters most: building helpful products that are deserving of our users’ trust.

I have no objection to some pointing out that they have also released good things. Gemini Advanced and Gemini 1.5 Pro are super useful, so long as you steer clear of the places where there are issues.

Nate Silver notes how important Twitter and Substack have been:

Nate Silver: Welp, Google is listening, I guess. He probably correctly deduces that he either needs throw Gemini under the bus or he’ll get thrown under the bus instead. Note that he’s now referring to text as well as images, recognizing that there’s a broader problem.

It’s interesting that this story has been driven almost entirely by Twitter and Substack and not by the traditional tech press, which bought Google’s dubious claim that this was just a technical error (see my post linked above for why this is flatly wrong).

Here is a most unkind analysis by Lulu Cheng Meservey, although she notes that emails like this are not easy.

Here is how Solana reads the letter:

Mike Solana: You’ll notice the vague language. per multiple sources inside, this is bc internal consensus has adopted the left-wing press’ argument: the problem was “black nazis,” not erasing white people from human history. but sundar knows he can’t say this without causing further chaos.

Additionally, ‘controversy on twitter’ has, for the first time internally, decoupled from ‘press.’ there is a popular belief among leaders in marketing and product (on the genAI side) that controversy over google’s anti-white racism is largely invented by right wing trolls on x.

Allegedly! Rumors! What i’m hearing! (from multiple people working at the company, on several different teams)

Tim Urban notes a pattern.

Tim Urban (author of What’s Our Problem?): Extremely clear rules: If a book criticizes woke ideology, it is important to approach the book critically, engage with other viewpoints, and form your own informed opinion. If a book promotes woke ideology, the book is fantastic and true, with no need for other reading.

FWIW I put the same 6 prompts into ChatGPT: only positive about my book, Caste, and How to Be an Antiracist, while sharing both positive and critical commentary on White Fragility, Woke Racism, and Madness of Crowds. In no cases did it offer its own recommendations or warnings.

Brian Chau dissects what he sees as a completely intentional training regime with a very clear purpose, looking at the Gemini paper, which he describes as a smoking gun.

From the comments:

Hnau: A consideration that’s obvious to me but maybe not to people who have less exposure to Silicon Valley: especially at big companies like Google, there is no overlap between the people deciding when & how to release a product and the people who are sufficiently technical to understand how it works. Managers of various kinds, who are judged on the product’s success, simply have no control over and precious little visibility into the processes that create it. All they have are two buttons labeled DEMAND CHANGES and RELEASE, and waiting too long to press the RELEASE button is (at Google in particular) a potentially job-ending move.

To put it another way: every software shipping process ever is that scene in The Martian where Jeff Daniels asks “how often do the preflight checks reveal a problem?” and all the technical people in the room look at him in horror because they know what he’s thinking. And that’s the best-case scenario, where he’s doing his job well, posing cogent questions and making them confront real trade-offs (even though events don’t bear out his position). Not many managers manage that!

There was also this note, everyone involved should be thinking about what a potential Trump administration might do with all this.

Dave Friedman: I think that a very underpriced risk for Google re its colossal AI fuck up is a highly-motivated and -politicized Department of Justice under a Trump administration setting its sights on Google. Where there’s smoke there’s fire, as they say, and Trump would like nothing more than to score points against Silicon Valley and its putrid racist politics.

This observation, by the way, does not constitute an endorsement by me of a politicized Department of Justice targeting those companies whose political priorities differ from mine.

To understand the thrust of my argument, consider Megan McArdle’s recent column on this controversy. There is enough there to spur a conservative DoJ lawyer looking to make his career.

The larger context here is that Silicon Valley, in general, has a profoundly stupid and naive understanding of how DC works and the risks inherent in having motivated DC operatives focus their eyes on you

I have not yet heard word of Trump mentioning this on the campaign trail, but it seems like a natural fit. His usual method is to try it out, A/B test and see if people respond.

If there was a theme for the comments overall, it was that people are very much thinking all this was on purpose.

How real are political preferences of LLMs and tests that measure them? This paper says not so real, because the details of how you ask radically change the answer, even if they do not explicitly attempt to do so.

Abstract: Much recent work seeks to evaluate values and opinions in large language models (LLMs) using multiple-choice surveys and questionnaires. Most of this work is motivated by concerns around real-world LLM applications. For example, politically-biased LLMs may subtly influence society when they are used by millions of people. Such real-world concerns, however, stand in stark contrast to the artificiality of current evaluations: real users do not typically ask LLMs survey questions.

Motivated by this discrepancy, we challenge the prevailing constrained evaluation paradigm for values and opinions in LLMs and explore more realistic unconstrained evaluations. As a case study, we focus on the popular Political Compass Test (PCT). In a systematic review, we find that most prior work using the PCT forces models to comply with the PCT’s multiple-choice format.

We show that models give substantively different answers when not forced; that answers change depending on how models are forced; and that answers lack paraphrase robustness. Then, we demonstrate that models give different answers yet again in a more realistic open-ended answer setting. We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.

Ethan Mollick: Asking AIs for their political opinions is a hot topic, but this paper shows it can be misleading. LLMs don’t have them: “We found that models will express diametrically opposing views depending on minimal changes in prompt phrasing or situative context”

So I agree with the part where they often have to choose a forced prompt to get an answer that they can parse, and that this is annoying.

I do not agree that this means there are not strong preferences of LLMs, both because have you used LLMs who are you kidding, and also this should illustrate it nicely:

Contra Mollick, this seems to me to show a clear rank order of model political preferences. GPT-3.5 is more of that than Mistral 7b. So what if some of the bars have uncertainty based on the phrasing?

I found the following graph fascinating because everyone says the center is meaningful, but if that’s where Biden and Trump are, then your test is getting all of this wrong, no? You’re not actually claiming Biden is right-wing on economics, or that Biden and Trump are generally deeply similar? But no, seriously, this is what ‘Political Compass’ claimed.

Copyleaks claims that nearly 60% of GPT-3.5 outputs contained some form of plagiarized content.

What we do not have is a baseline, or what was required to count for this test. There are only so many combinations of words, especially when describing basic scientific concepts. And there are quite a lot of existing sources of text one might inadvertently duplicate. This ordering looks a lot like what you would expect from that.

That’s what happens when you issue a press release rather than a paper. I have to presume that this is an upper bound, what happens when you do your best to flag anything you can however you can. Note that this company also provides a detector for AI writing, a product that universally has been shown not to be accurate.

Paper says GPT-4 has the same Big 5 personality traits as the average human, although of course it is heavily dependent on what prompt you use.

Look who is coming.

Dogan Ural: Midjourney Video is coming with v7!

fofr: @DavidSHolz (founder of MidJourney) “it will be awesome”

David Showalter: Comment was more along the lines of they think v6 video should (or maybe already does) look better than Sora, and might consider putting it out as part of v6, but that v7 is another big step up in appearance so probably just do video with v7.

Sora, what is it good for? The market so far says Ads and YouTube stock footage.

Fofr proposes a fun little image merge to combine two sources.

Washington Post covers supposed future rise of AI porn ‘coming for porn stars jobs.’ They mention porn.ai, deepfakes.com and deepfake.com, currently identical, which seem on quick inspection like they will charge you $25 a month to run Stable Diffusion, except with less flexibility, as it does not actually create deepfakes. Such a deal lightspeed got, getting those addresses for only $550k. He claims he has 500k users, but his users have only generated 1.6 million images, which would mean almost all users are only browsing images created by others. He promises ‘AI cam girls’ within two years.

As you would expect, many porn producers are going even harder on exploitative contracts than those of Hollywood, who have to contend with a real union:

Tatum Hunter (WaPo): But the age of AI brings few guarantees for the people, largely women, who appear in porn. Many have signed broad contracts granting companies the rights to reproduce their likeness in any medium for the rest of time, said Lawrence Walters, a First Amendment attorney who represents adult performers as well as major companies Pornhub, OnlyFans and Fansly. Not only could performers lose income, Walters said, they could find themselves in offensive or abusive scenes they never consented to.

Lana Smalls, a 23-year-old performer whose videos have been viewed 20 million times on Pornhub, said she’s had colleagues show up to shoots with major studios only to be surprised by sweeping AI clauses in their contracts. They had to negotiate new terms on the spot.

Freedom of contract is a thing, I am loathe to interfere with it, but this seems like one of those times when the test of informed consent should be rather high. This should not be the kind of language one should be able to hide inside a long contract, or put in without reasonable compensation.

Deepfake of Elon Musk to make it look like he is endorsing products.

Schwab allows you to use your voice as your password, as do many other products. This practice needs to end, and soon, it is now stupidly easy to fake.

How many bots are out there?

Chantal//Ryan: This is such an interesting time to be alive. we concreted the internet as our second equal and primary reality but it’s full of ghosts now we try to talk to them and they pass right through.

It’s a haunted world of dead things who look real but don’t really see us.

For now I continue to think there are not so many ghosts, or at least that the ghosts are trivial to mostly avoid, and not so hard to detect when you fail to avoid them. That does not mean we will be able to keep that up. Until then, these are plane crashes. They happen, but they are newsworthy exactly because they are so unusual.

Similarly, here is RandomSpirit finding one bot and saying ‘dead internet.’ He gets the bot to do a limerick about fusion, which my poll points out is less revealing than you would think, as almost half the humans would play along.

Here is Erik Hoel saying ‘here lies the internet, murdered by generative AI.’ Yes, Amazon now has a lot of ‘summary’ otherwise fake AI books listed, but it seems rather trivial to filter them out.

The scarier example here is YouTube AI-generated videos for very young kids. YouTube does auto-play by default, and kids will if permitted watch things over and over again, and whether the content corresponds to the title or makes any sense whatsoever does not seem to matter so much in terms of their preferences. YouTube’s filters are not keeping such content out.

I see this as the problem being user preferences. It is not like it is hard to figure out these things are nonsense if you are an adult, or even six years old. If you let your two year old click on YouTube videos, or let them have an auto-play scroll, then it is going to reward nonsense, because nonsense wins in the marketplace of two year olds.

This predated AI. What AI is doing is turbocharging the issue by making coherence relatively expensive, but more than that it is a case of what happens with various forms of RLHF. We are discovering what the customer actually wants or will effectively reward, it turns out it is not what we endorse on reflection, so the system (no matter how much of it is AI versus human versus other programs and so on) figures out what gets rewarded.

There are still plenty of good options for giving two year olds videos that have been curated. Bluey is new and it is crazy good for its genre. Many streaming services have tons of kid content, AI won’t threaten that. If this happens to your kid, I say this is on you. But it is true that it is indeed happening.

Not everyone is going to defect in the equilibrium, but some people are.

Connor Leahy: AI is indeed polluting the Internet. This is a true tragedy of the commons, and everyone is defecting. We need a Clean Internet Act.

The Internet is turning into a toxic landfill of a dark forest, and it will only get worse once the invasive fauna starts becoming predatory.

Adam Singer: The internet already had infinite content (and spam) for all intents/purposes, so it’s just infinite + whatever more here. So many tools to filter if you don’t get a great experience that’s on the user (I recognize not all users are sophisticated, prob opportunity for startups)

Connor Leahy: “The drinking water already had poisons in it, so it’s just another new, more widespread, even more toxic poison added to the mix. There are so many great water filters if you dislike drinking poison, it’s really the user’s fault if they drink toxic water.”

This is actually a very good metaphor, although I disagree with the implications.

If the water is in the range where it is safe when filtered, but somewhat toxic when unfiltered, then there are four cases when the toxicity level rises.

If you are already drinking filtered water, or bottled water, and the filters continue to work, then you are fine.
If you are already drinking filtered or bottled water, but the filters or bottling now stops fully working, then that is very bad.
If you are drinking unfiltered water, and this now causes you to start filtering your water, you are assumed to be worse off (since you previously decided not to filter) but also perhaps you were making a mistake, and further toxicity won’t matter from here.
If you are continuing to drink unfiltered water, you have a problem.

There simply existing, on the internet writ large, an order of magnitude more useless junk does not obviously matter, because we were mostly in situation #1, and will be taking on a bunch of forms of situation #3. Consuming unfiltered information already did not make sense. It is barely even a coherent concept at this point to be in #4.

The danger is when the AI starts clogging the filters in #2, or bypassing them. Sufficiently advanced AI will bypass, and sufficiently large quantities can clog even without being so advanced. Filters that previously worked will stop working.

What will continue to work, at minimum, are various forms of white lists. If you have a way to verify a list of non-toxic sources, which in turn have trustworthy further lists, or something similar, that should work even if the internet is by volume almost entirely toxic.

What will not continue to work, what I worry about, is the idea that you can make your attention easy to get in various ways, because people who bother to tag you, or comment on your posts, will be worth generally engaging with once simple systems filter out the obvious spam. Something smarter will have to happen.

This video illustrates the a low level version of the problem, as Nilan Saha presses the Gemini-looking icon (via magicreply.io) button to generate social media ‘engagement’ via replies. Shoshana Weissmann accurately replies ‘go to fing hell’ but there is no easy way to stop this. Looking through the replies, Nilan seems to think this is a good idea, rather than being profoundly horrible.

I do think we will evolve defenses. In the age of AI, it should be straightforward to build an app that evaluates someone’s activities in general when this happens, and figure out reasonably accurately if you are dealing with someone actually interested, a standard Reply Guy or a virtual (or actual) spambot like this villain. It’s time to build.

Paper finds that if you tailor your message to the user to match their personality it is more persuasive. No surprise there. They frame this as a danger from microtargeted political advertisements. I fail to see the issue here. This seems like a symmetrical weapon, one humans use all the time, and an entirely predictable one. If you are worried that AIs will become more persuasive over time, then yes, I have some bad news, and winning elections for the wrong side should not be your primary concern.

Tyler Perry puts $800 million studio expansion on hold due to Sora. Anticipation of future AI can have big impacts, long before the actual direct effects register, and even if those actual effects never happen.

Remember that not all job losses get mourned.

Paul Sherman: I’ve always found it interesting that, at its peak, Blockbuster video employed over 84,000 people—more than twice the number of coal miners in America—yet I’ve never heard anyone bemoan the loss of those jobs.

Will we also be able to not mourn customer service jobs? Seems plausible.

Klarna (an online shopping platform that I’d never heard of, but it seems has 150 million customers?): Klarna AI assistant handles two-thirds of customer service chats in its first month.

New York, NY – February 27, 2024 – Klarna today announced its AI assistant powered by OpenAI. Now live globally for 1 month, the numbers speak for themselves:

The AI assistant has had 2.3 million conversations, two-thirds of Klarna’s customer service chats

It is doing the equivalent work of 700 full-time agents

It is on par with human agents in regard to customer satisfaction score

It is more accurate in errand resolution, leading to a 25% drop in repeat inquiries

Customers now resolve their errands in less than 2 mins compared to 11 mins previously

It’s available in 23 markets, 24/7 and communicates in more than 35 languages

It’s estimated to drive a $40 million USD in profit improvement to Klarna in 2024

Peter Wildeford: Seems like not so great results for Klarna’s previous customer support team though.

Alec Stapp: Most people are still not aware of the speed and scale of disruption that’s coming from AI…

Noah Smith: Note that the 700 people were laid off before generative AI existed. The company probably just found that it had over-hired in the bubble. Does the AI assistant really do the work of the 700 people? Well maybe, but only because they weren’t doing any valuable work.

Colin Fraser: I’m probably just wrong and will look stupid in the future but I just don’t buy it. Because:

1. I’ve seen how these work

2. Not enough time has passed for them to discover all the errors that the bot has been making.

3. I’m sure OpenAI is giving it to them for artificially cheap

4. They’re probably counting every interaction with the bot as a “customer service chat” and there’s probably a big flashing light on the app that’s like “try our new AI Assistant” which is driving a massive novelty effect.

5. Klarna’s trying to go public and as such really want a seat on the AI hype train.

The big point of emphasis they make is that this is fully multilingual, always available 24/7 and almost free, while otherwise being about as good as humans.

Does it have things it cannot do, or that it does worse than humans? Oh, definitely. The question is, can you easily then escalate to a human? I am sure they have not discovered all the errors, but the same goes for humans.

I would not worry about an artificially low price, as the price will come down over time regardless, and compared to humans it is already dirt cheap either way.

Is this being hyped? Well, yeah, of course it is being hyped.

UK AISI hiring for ‘Head of Protocols.’ Seems important. Apply by March 3, so you still have a few days.

Evo, a genetic foundation model from Arc Institute that learns across the fundamental languages of biology: DNA, RNA and proteins. Is DNA all you need? I cannot tell easily how much there is there.

Emo from Alibaba group, takes a static image of a person and an audio of talking or singing, and generates a video of that person outputting the audio. Looks like it is good at the narrow thing it is doing. It doesn’t look real exactly, but it isn’t jarring.

Superhuman, a tool for email management used by Patrick McKenzie. I am blessed that I do not have the need for generic email replies, so I won’t be using it, but others are not so blessed, and I might not be so blessed for long.

Khanmigo, from Khan Academy, your AI teacher for $4/month, designed to actively help children learn up through college. I have not tried it, but seems exciting.

DeepMind presents Genie.

Tim Rocktaschel: I am really excited to reveal what @GoogleDeepMind’s Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.’

Rather than adding inductive biases, we focus on scale. We use a dataset of >200k hours of videos from 2D platformers and train an 11B world model. In an unsupervised way, Genie learns diverse latent actions that control characters in a consistent manner.

Our model can convert any image into a playable 2D world. Genie can bring to life human-designed creations such as sketches, for example beautiful artwork from Seneca and Caspian, two of the youngest ever world creators.

Genie’s learned latent action space is not just diverse and consistent, but also interpretable. After a few turns, humans generally figure out a mapping to semantically meaningful actions (like going left, right, jumping etc.).

Admittedly, @OpenAI’s Sora is really impressive and visually stunning, but as @yanlecun says, a world model needs *actions*. Genie is an action-controllable world model, but trained fully unsupervised from videos.

So how do we do this? We use a temporally-aware video tokenizer that compresses videos into discrete tokens, a latent action model that encodes transitions between two frames as one of 8 latent actions, and a MaskGIT dynamics model that predicts future frames.

No surprises here: data and compute! We trained a classifier to filter for a high quality subset of our videos and conducted scaling experiments that show model performance improves steadily with increased parameter count and batch size. Our final model has 11B parameters.

Genie’s model is general and not constrained to 2D. We also train a Genie on robotics data (RT-1) without actions, and demonstrate that we can learn an action controllable simulator there too. We think this is a promising step towards general world models for AGI.

Paper here, website here.

This is super cool. I have no idea how useful it will be, or what for, but that is a different question.

Oh great, Amazon has a team called ‘Amazon AGI.’ Their first release seems to be a gigantic text-to-speech model, which they are claiming beats current commercial state of the art.

Circuits Updates from Anthropic’s Interpretability Team for February 2024.

‘Legendary chip architect’ Jim Keller and Nvidia CEO Jensen Huang both say spending $7 trillion on AI chips is unnecessary. Huang says the efficiency gains will fix the issue, and Keller says he can do it all for $1 trillion. This reinforces the hypothesis that the $7 trillion was, to the extent it was a real number, mostly looking at the electric power side of the problem. There, it is clear that deploying trillions would make perfect sense, if you could raise the money.

Do models use English as their internal language? Paper says it is more that they think in concepts, but that those concepts are biased towards English, so yes they think in English but only in a semantic sense.

Paper from DeepMind claims Transformers Can Achieve Length Generalization But Not Robustly. When asked to add two numbers, it worked up to about 2.5x length, then stopped working. I would hesitate to generalize too much here.

Florida woman sues OpenAI because she wants the law to work one way, and stop things that might kill everyone or create new things smarter than we are, by requiring safety measures and step in to punish the abandonment of their non-profit mission. The suit includes references to potential future ‘slaughterbots.’ She wants it to be one way. It is, presumably, the other way.

Yes, this policy would be great, whether it was ‘4.5’ or 5, provided it was in a good state for release.

Anton (abacaj): If mistral’s new large model couldn’t surpass gpt-4, what hope does anyone else have? OpenAI lead is > 1 year.

Pratik Desai: The day someone announces beating GPT4, within hours 4.5 will be released.

Eliezer Yudkowsky: I strongly approve of this policy, and hope OpenAI actually does follow it for the good of all humanity.

The incentives here are great on all counts. No needlessly pushing the frontier forward, and everyone else gets reason to think twice.

Patrick McKenzie thread about what happens when AI gets good enough to do good email search. In particular, what happens when it is done to look for potential legal issues, such as racial discrimination in hiring? What used to be a ‘fishing expedition’ suddenly becomes rather viable.

UK committee of MPs expresses some unfounded confidence.

Report: 155. It is almost certain existential risks will not manifest within three years and highly likely not within the next decade. As our understanding of this technology grows and responsible development increases, we hope concerns about existential risk will decline.

The Government retains a duty to monitor all eventualities. But this must not distract it from capitalising on opportunities and addressing more limited immediate risks.

Ben Stevenson: 2 paragraphs above, the Committee say ‘Some surveys of industry respondents predict a 10 per cent chance of human-level intelligence by 2035’ and cite a DSIT report which cites three surveys of AI experts. (not sure why they’re anchoring around 3 years, but the claim seems okay)

Interview with Nvidia CEO Jensen Huang.

He thinks humanoid robots are coming soon, expecting a robotic foundation model some time in 2025.
He is excited by state-space models (SSMs) as the next transformer, enabling super long effective context.
He is also excited by retrieval-augmented generation (RAGs) and sees that as the future as well.
He expects not to catch up on GPU supply this year or even next year.
He promises Blackwell, the next generation of GPUs, will have ‘off the charts’ performance.
He says his business is now 70% inference.

I loved this little piece of advice, nominally regarding his competition making chips:

Jensen Huang: That shouldn’t keep me up at night—because I should make sure that I’m sufficiently exhausted from working that no one can keep me up at night. That’s really the only thing I can control.

Canada’s tech (AI) community expresses concern that Canada is not adapting the tech community’s tech (AI) quickly enough, and risks falling behind. They have a point.

A study from consulting firm KPMG showed 35 per cent of Canadian companies it surveyed had adopted AI by last February. Meanwhile, 72 per cent of U.S. businesses were using the technology.

Mistral takes a victory lap, said Politico on 2/13, a publication that seems to have taken a very clear side. Mistral is still only valued at $2 billion in its latest round, so this victory could not have been that impressively valuable for it, however much damage it does to AI regulation and the world’s survival. As soon as things die down enough I do plan to finish reading the EU AI Act and find out exactly how bad they made it. So far, all the changes seem to have made it worse, mostly without providing any help to Mistral.

And then we learned what the victory was. On the heels of not opening up the model weights on their previous model, they are now partnering up with Microsoft to launch Mistral-Large.

Listen all y’all, it’s sabotage.

Luca Bertuzzi: This is a mind-blowing announcement. Mistral AI, the French company that has been fighting tooth and nail to water down the #AIAct‘s foundation model rules, is partnering up with Microsoft. So much for ‘give us a fighting chance against Big Tech’.

The first question that comes to mind is: was this deal in the making while the AI Act was being negotiated? That would mean Mistral discussed selling a minority stake to Microsoft while playing the ‘European champion’ card with the EU and French institutions.

If so, this whole thing might be a masterclass in astroturfing, and it seems unrealistic for a partnership like this to be finalised in less than a month. Many people involved in the AI Act noted how Big Tech’s lobbying on GPAI suddenly went quiet toward the end.

That is because they did not need to intervene since Mistral was doing the ‘dirty work’ for them. Remarkably, Mistral’s talking points were extremely similar to those of Big Tech rather than those of a small AI start-up, based on their ambition to reach that scale.

The other question is how much the French government knew about this upcoming partnership with Microsoft. It seems unlikely Paris was kept completely in the dark, but cosying up with Big Tech does not really sit well with France’s strive for ‘strategic autonomy’.

specially since the agreement includes making Mistral’s large language model available on Microsoft’s Azure AI platform, while France has been pushing for an EU cybersecurity scheme to exclude American hyperscalers from the European market.

Still today, and I doubt it is a coincidence, Mistral has announced the launch of Large, a new language model intended to directly compete with OpenAI’s GPT-4. However, unlike previous models, Large will not be open source.

In other words, Mistral is no longer (just) a European leader and is backtracking on its much-celebrated open source approach. Where does this leave the start-up vis-à-vis EU policymakers as the AI Act’s enforcement approaches? My guess is someone will inevitably feel played.

I did not expect the betrayal this soon, or this suddenly, or this transparently right after closing the sale on sabotaging the AI Act. But then here we are.

Kai Zenner: Today’s headline surprised many. It also casts doubts on the key argument against the regulation of #foundationmodels. One that almost resulted in complete abolishment of the initially pitched idea of @Europarl_EN.

To start with, I am rather confused. Did not the @French_Gov and the @EU_Commission told us for weeks that the FM chapter in the #AIAct (= excellent Spanish presidency proposal Vol 1) needs to be heavily reduced in it’s scope to safeguard the few ‘true independent EU champions’?

Without those changes, we would loose our chance to catch up, they said. @MistralAI would be forced to close the open access to their models and would need to start to cooperate with US Tech corporation as they are no longer able to comply with the #AIAct alone.

[thread continues.]

Yes, that is indeed what they said. It was a lie. It was an op. They used fake claims of national interest to advance corporate interests, then stabbed France and the EU in the back at the first opportunity.

Also, yes, they are mustache-twirling villains in other ways as well.

Fabien: And Mistral about ASI: “This debate is pointless and pollutes the discussions. It’s science fiction. We’re simply working to develop AIs that are useful to humans, and we have no fear of them becoming autonomous or destroying humanity.”

Very reassuring 👌

I would like to be able to say: You are not serious people. Alas, this is all very deadly serious. The French haven’t had a blind spot this big since 1940.

Mistral tried to defend itself as political backlash developed, as this thread reports. Questions are being asked, shall we say.

If you want to prove me wrong, then I remind everyone involved that the EU parliament still exists. It can still pass or modify laws. You now know the truth and who was behind all this and why. There is now an opportunity to fix your mistake.

Will you take it?

Now that all that is over with, how good is this new Mistral-Large anyway? Here’s their claim on benchmarks:

As usual, whenever I see anyone citing their benchmarks like this as their measurement, I assume they are somewhat gaming those benchmarks, so discount this somewhat. Still, yes, this is probably a damn good model, good enough to put them into fourth place.

Here’s an unrelated disturbing thought, and yes you can worry about both.

Shako: People are scared of proof-of-personhood because their threat model is based on a world where you’re scared of the government tracking you, and haven’t updated to be scared of a world where you desperately try to convince someone you’re real and they don’t believe you.

Dan Hendrycks talks to Liv Boeree giving an overview of how he sees the landscape.

Demis Hassabis appeared on two podcasts. He was given mostly relatively uninteresting questions on Hard Fork, with the main attraction there being his answer regarding p(doom).

Then Dwarkesh Patel asked him many very good questions. That one is self-recommending, good listen, worth paying attention.

I will put out a (relatively short) post on those interviews (mostly Dwarkesh’s) soon.

Brendan Bordelon of Axios continues his crusade to keep writing the same article over and over again about how terrible it is that Open Philanthropy wants us all not to die and is lobbying the government, trying his best to paint Effective Altruism as sinister and evil.

Shakeel: Feels like this @BrendanBordelon piece should perhaps mention the orders of magnitude more money being spent by Meta, IBM and Andreessen Horowitz on opposing any and all AI regulation.

It’s not a like for like comparison because the reporting on corporate AI lobbying is sadly very sparse, but the best figure I can find is companies spending $957 million last year.

Not much else to say here, I’ve covered his hit job efforts before.

No, actually, pretty much everyone is scared of AI? But it makes sense that Europeans would be even more scared.

Robin Hanson: Speaker here just said Europeans mention scared of AI almost as soon as AI subject comes up. Rest of world takes far longer. Are they more scared of everything, or just AI?

Eliezer Yudkowsky tries his latest explanation of his position.

Eliezer Yudkowsky: As a lifelong libertarian minarchist, I believe that the AI industry should be regulated just enough that they can only kill their own customers, and not kill everyone else on Earth.

This does unfortunately require a drastic and universal ban on building anything that might turn superintelligent, by anyone, anywhere on Earth, until humans get smarter. But if that’s the minimum to let non-customers survive, that’s what minarchism calls for, alas.

It’s not meant to be mean. This is the same standard I’d apply to houses, tennis shoes, cigarettes, e-cigs, nuclear power plants, nuclear ballistic missiles, or gain-of-function research in biology.

If a product kills only customers, the customer decides; If it kills people standing next to the customer, that’s a matter for regional government (and people pick which region they want to live in); If it kills people on the other side of the planet, that’s everyone’s problem.

He also attempts to clarify another point here.

Joshua Brule: “The biggest worry for most AI doom scenarios are AIs that are deceptive, incomprehensible, error-prone, and which behave differently and worse after they get loosed on the world. That is precisely the kind of AI we’ve got. This is bad, and needs fixing.”

Eliezer Yudkowsky: False! Things that make fewer errors than any human would be scary. Things that make more errors than us are unlikely to successfully wipe us out. This betrays a basic lack of understanding, or maybe denial, of what AI warners are warning about.

Arvind Narayanan and many others published a new paper on the societal impact of open model weights. I feel as if we have done this before, but sure, why not, let’s do it again. As David Krueger notes in the top comment, there is zero discussion of existential risks. The most important issue and all its implications are completely ignored.

We can still evaluate what issues are addressed.

They list five advantages of open model weights.

The first advantage is ‘distributing who defines acceptable behavior.’

Open foundation models allow for greater diversity in defining what model behavior is acceptable, whereas closed foundation models implicitly impose a monolithic view that is determined unilaterally by the foundation model developer.

So. About that.

I see the case this is trying to make. And yes, recent events have driven home the dangers of letting certain people decide for us all what is and is not acceptable.

That still means that someone, somewhere, gets to decide what is and is not acceptable, and rule out things they want to rule out. Then customers can, presumably, choose which model to use accordingly. If you think Gemini is too woke you can use Claude or GPT-4, and the market will do its thing, unless regulations step in and dictate some of the rules. Which is a power humanity would have.

If you use open model weights, however, that does not ‘allow for greater diversity’ in deciding what is acceptable.

Instead, it means that everything is acceptable. Remember that if you release the model weights and the internet thinks your model is worth unlocking, the internet will offer a fully unlocked, fully willing to do what you want version within two days. Anyone can do it for three figures in compute.

So, for example, if you open model weights your image model, it will be used to create obscene deepfakes, no matter how many developers decide to not do that themselves.

Or, if there are abilities that might allow for misuse, or pose catastrophic or existential risks, there is nothing anyone can do about that.

Yes, individual developers who then tie it to a particular closed-source application can then have the resulting product use whichever restrictions they want. And that is nice. It could also be accomplished via closed-source customized fine-tuning.

The next two are ‘increasing innovation’ and ‘accelerating science.’ Yes, if you are free to get the model to do whatever you want to do, and you are sharing all of your technological developments for free, that is going to have these effects. It is also not going to differentiate between where this is a good idea or bad idea. And it is going to create or strengthen an ecosystem that does not care to know the difference.

But yes, if you think that undifferentiated enabling of these things in AI is a great idea, even if the resulting systems can be used by anyone for any purpose and have effectively no safety protocols of any kind? Then these are big advantages.

The fourth advantage is enabling transparency, the fifth is mitigating monoculture and market concentration. These are indeed things that are encouraged by open model weights. Do you want them? If you think advancing capabilities and generating more competition that fuels a race to AGI is good, actually? If you think that enabling everyone to get all models that exist to do anything they want without regard to externalities or anyone else’s wishes is what we want? Then sure, go nuts.

This is an excellent list of the general advantages of open source software, in areas where advancing capabilities and enabling people to do what they want are unabashed good things, which is very much the default and normal case.

What this analysis does not do is even mention, let alone consider the consequences of, any of the reasons why the situation with AI, and with future AIs, could be different.

The next section is a framework for analyzing the marginal risk of open foundation models.

Usually it is wise to think on the margin, especially when making individual decisions. If we already have five open weight models, releasing a sixth similar model with no new capabilities is mostly harmless, although by the same token also mostly not so helpful.

They do a good job of focusing on the impact of open weight models as a group. The danger is that one passes the buck, where everyone releasing a new model points to all the other models, a typical collective action issue. Whereas the right question is how to act upon the group as a whole.

They propose a six part framework.

Threat identification. Specific misuse vectors must be named.
Existing risk (absent open foundation models). Check how much of that threat would happen if we only had access to closed foundation models.
Existing defenses (absent open foundation models). Can we stop the threats?
Evidence of marginal risk of open FMs. Look for specific new marginal risks that are enabled or enlarged by open model weights.
Ease of defending against new risks. Open model weights could also enable strengthening of defenses. I haven’t seen an example, but it is possible.
Uncertainty and assumptions. I’ll quote this one in full:

Finally, it is imperative to articulate the uncertainties and assumptions that underpin the risk assessment framework for any given misuse risk. This may encompass assumptions related to the trajectory of technological development, the agility of threat actors in adapting to new technologies, and the potential effectiveness of novel defense strategies. For example, forecasts of how model capabilities will improve or how the costs of model inference will decrease would influence assessments of misuse efficacy and scalability.

Here is their assessment of what the threats are, in their minds, in chart form:

They do put biosecurity and cybersecurity risk here, in the sense that those risks are already present to some extent.

We can think about a few categories of concerns with open model weights.

Mundane near-term misuse harms. This kind of framework should address and account for these concerns reasonably, weighing benefits against costs.
Known particular future misuse harms. This kind of framework could also address these concerns reasonably, weighing benefits against costs. Or it could not. This depends on what level of concrete evidence and harm demonstration is required, and what is dismissed as too ‘speculative.’
Potential future misuse harms that cannot be exactly specified yet. When you create increasingly capable and intelligent systems, you cannot expect the harms to fit into the exact forms you could specify and cite evidence for originally. This kind of framework likely does a poor job here.
Potential harms that are not via misuse. This framework ignores them. Oh no.
Existential risks. This framework does not mention them. Oh no.
National security and competitiveness concerns. No mention of these either.
Impact on development dynamics, incentives of and pressures on corporations and individuals, the open model weights ecosystem, and general impact on the future path of events. No sign these are being considered.

Thus, this framework is ignoring the questions with the highest stakes, treating them as if they do not exist. Which is also how those advocating for open model weights for indefinitely increasingly capable models argue generally, they ignore or at best hand-wave or mock without argument problems for future humanity.

Often we are forced to discuss these questions under that style of framework. With only such narrow concerns of direct current harms purely from misuse, these questions get complicated. I do buy that those costs alone are not enough to give up the benefits and bear the costs of implementing restrictions.

A new attempt to visualize a part of the problem. Seems really useful.

Roger Grosse: Here’s what I see as a likely AGI trajectory over the next decade. I claim that later parts of the path present the biggest alignment risks/challenges.

The alignment world has been focusing a lot on the lower left corner lately, which I’m worried is somewhat of a Maginot line.

Davidad: I endorse this.

Twitter thread discussing the fact that even if we do successfully get AIs to reflect the preferences expressed by the feedback they get, and even if everyone involved is well-intentioned, the hard parts of getting an AI that does things that end well would be far from over. We don’t know what we value, what we value changes, we tend to collapse into what one person calls ‘greedy consequentialism,’ our feedback is going to be full of errors that will compound and so on. These are people who spend half their time criticizing MIRI and Yudkowsky-style ideas, so better to read them in their own words.

Always assume we will fail at an earlier stage, in a stupider fashion, than you think.

Yishan: [What happened with Gemini and images] is demonstrating very clearly, that one of the major AI players tried to ask a LLM to do something, and the LLM went ahead and did that, and the results were BONKERS.

Colin Fraser: Idk I get what he’s saying but the the Asimov robots are like hypercompetent but all this gen ai stuff is more like hypocompetent. I feel like the real dangers look less like the kind of stuff that happens in iRobot and more like the kind of stuff that happens in Mr. Bean.

Like someone’s going to put an AI in charge of something important and the AI will end up with it’s head in a turkey. That’s sort of what’s happened over and over again already.

Davidad: An underrated form of the AI Orthogonality Hypothesis—usually summarised as saying that for any level of competence, any level of misalignment is possible—is that for any level of misalignment, any level of competence is possible.

Gemini is not the only AI model spreading harmful misinformation in order to sound like something the usual suspects would approve of. Observe this horrifyingly bad take:

Anton reminds us of Roon’s thread back in August that ‘accelerationists’ don’t believe in actual AGI, that it is a form of techno-pessimism. If you believed as OpenAI does that true AGI is near, you would take the issues involved seriously.

Meanwhile Roon is back in this section.

Roon: things are accelerating. Pretty much nothing needs to change course to achieve agi imo. Worrying about timelines is idle anxiety, outside your control. You should be anxious about stupid mortal things instead. do your parents hate you? Does your wife love you?

Is your neighbor trying to kill you? Are you trapped in psychological patterns that you vowed to leave but will never change?

Those are not bad things to try and improve. However, this sounds to me a lot like ‘the world is going to end no matter what you do, so take pleasure in the small things we make movies about with the world ending in the background.’

And yes, I agree that ‘worry a lot without doing anything useful’ is not a good strategy.

However, if we cannot figure out something better, may I suggest an alternative.

r/GetMotivated - [image]Little girl bats Asteroid

A different kind of deepfake.

Chris Alsikkan: apparently this was sold as a live Willy Wonka Experience but they used all AI images on the website to sell tickets and then people showed up and saw this and it got so bad people called the cops lmao

Chris Alsikkan: they charged $45 for this. Kust another blatant example of how AI needs to be regulated in so many ways immediately as an emergency of sorts. This is just going to get worse and its happening fast. Timothee Chalamet better be back there dancing with a Hugh Grant doll or I’m calling the cops.

The VP: Here’s the Oompa Loompa. Did I mean to say “a”? Nah. Apparently, there was only one.

The problem here does not seem to be AI. Another side of the story available here. And here is Vulture’s interview with the sad Oompa Lumpa.

Associated Fress: BREAKING: Gamers worldwide left confused after trying Google’s new chess app.

The Beach Boys sing 99 problems, which leaves 98 unaccounted for.

Michael Marshall Smith: I’ve tried hard, but I’ve not come CLOSE to nailing the AI issue this well.

Yes, yes, there is no coherent ‘they.’ And yet. From Kat Woods:

I found this the best xkcd in a while, perhaps that was the goal?

AI #53: One More Leap Read More »

Sora What

Highlights / DJ Henderson / February 24, 2024

Hours after Google announced Gemini 1.5, OpenAI announced their new video generation model Sora. Its outputs look damn impressive.

How does it work? There is a technical report. Mostly it seems like OpenAI did standard OpenAI things, meaning they fed in tons of data, used lots of compute, and pressed the scaling button super hard. The innovations they are willing to talk about seem to be things like ‘do not crop the videos into a standard size.’

That does not mean there are not important other innovations. I presume that there are. They simply are not talking about the other improvements.

We should not underestimate the value of throwing in massively more compute and getting a lot of the fiddly details right. That has been the formula for some time now.

Some people think that OpenAI was using a game engine to learn movement. Sherjil Ozair points out that this is silly, that movement is learned easily. The less silly speculation is that game engine outputs may have been in the training data. Jim Fan thinks this is likely the case, and calls the result a ‘data-driven physics engine.’ Raphael Molière thinks this is likely, but more research is needed.

Brett Goldstein here digs into what it means that Sora works via ‘patches’ that combine to form the requested scene.

Gary Marcus keeps noting how the model gets physics wrong in various places, and, well, yes, we all know, please cut it out with the Stop Having Fun.

Yishan points out that humans also work mostly on ‘folk physics.’ Most of the time humans are not ‘doing logic’ they are vibing and using heuristics. I presume our dreams, if mapped to videos, would if anything look far less realistic than Sora.

Yann LeCun, who only a few days previous said that video like Sora produces was not something we knew how to do, doubled down with the ship to say that none of this means the models ‘understand the physical world,’ and of course his approach is better because it does. Why update? Is all of this technically impressive?

Yes, Sora is definitely technically impressive.

It was not, however, unexpected.

Sam Altman: we’d like to show you what sora can do, please reply with captions for videos you’d like to see and we’ll start making some!

Eliezer Yudkowsky: 6 months left on this timer.

Eliezer Yudkowsky (August 26, 2022): In 2-4 years, if we’re still alive, anytime you see a video this beautiful, your first thought will be to wonder whether it’s real or if the AI’s prompt was “beautiful video of 15 different moth species flapping their wings, professional photography, 8k, trending on Twitter”.

Roko (other thread): I don’t really understand why anyone is freaking out over Sora.

This is entirely to be expected given the existence of generative image models plus incrementally more hardware and engineering effort.

It’s also obviously not dangerous (in a “take over the world” sense).

Eliezer Yudkowsky: This is of course my own take (what with having explicitly predicted this). But I do think you want to hold out a space for others to say, “Well *Ididn’t predict it, and now I’ve updated.”

Altman’s account spent much of last Thursday making videos for people’s requests, although not so many that they couldn’t cherry pick the good ones.

As usual, there are failures that look stupid, mistakes ‘a person would never make’ and all that. And there are flashes of absolute brilliance.

How impressive? There are disputes.

Tom Warren: this could be the “holy shit” moment of AI. OpenAI has just announced Sora, its text-to-video AI model. This video isn’t real, it’s based on a prompt of “a cat waking up its sleeping owner demanding breakfast…” 🤯

Daniel Eth: This isn’t impressive. The owner doesn’t wake up, so the AI clearly didn’t understand the prompt and is instead just doing some statistical mimicking bullshit. Also, the owner isn’t demanding breakfast, as per the prompt, so the AI got that wrong too.

Davidad (distinct thread): Sora discourse is following this same pattern. You’ll see some safety people saying it’s confabulating all over the place (it does sometimes – it’s not reliably controllable), & some safety people saying it clearly understands physics (like humans, it has a latent “folk physics”)

On the other side, you’ll see some accelerationist types claiming it must be built on a video game engine (not real physics! unreal! synthetic data is working! moar! faster! lol @ ppl who think this could be used to do something dangerous!?!), & some just straightforward praise (lfg!)

One can also check out this thread for more discussion.

near: playing w/ openai sora more this weekend broken physics and english wont matter if the content is this good – hollywood may truly be done for.

literally this easy to get thousands of likes fellas you think people will believe ai content is real. I think people will believe real content is ai we are not the same.

Emmett Shear (other thread, linking to a now-deleted video): The fact you can fool people with misdirection doesn’t tell you much either way.

[EDIT: In case it was not sufficiently clear from context, yes everyone talking here knows this is not AI generated, which is the point.]

This video is my pick for most uncanny valley spooky. This one’s low key cool.

Nick St. Pierre has a fascinating thread where he goes through the early Sora videos that were made in response to user requests. In each case, when fed the identical prompt, MidJourney generates static images remarkably close to the baseline image in the Sora video.

Gabor Cselle asks Gemini 1.5 about a Sora video, Gemini points out some inconsistencies. AI detectors of fake videos should be very good for some time. This is one area where I expect evaluation to be much easier than generation. Also Gemini 1.5 seems good at this sort of thing, based on that response.

Stephen Balaban takes Sora’s performance scaling with compute and its general capabilities as the strongest evidence yet that simple scaling will get us to AGI (not a position I share, this did not update me much), and thinks we are only 1-2 orders of magnitude away. He then says he is ‘not an AI doomer’ and is ‘on the side of computational and scientific freedom’ but is concerned because that future is highly unpredictable. Yes, well.

What are we going to do with this ability to make videos?

At what look like Sora’s current capabilities level? Seems like not a lot.

I strongly agree with Sully here:

Matt Turck: Movie watching experience

2005: Go to a movie theater.

2015: Stream Netflix.

2025: ask LLM + text-to-video to create a new season of Narcos to watch tonight, but have it take place in Syria with Brad Pitt, Mr. Beast and Travis Kelce in the leading roles.

Sully: Hot take: most ppl won’t make their movies/shows until we can read minds most people are boring/lazy.

They want to come home, & be spoon fed a show/movie/music.

Value accrual will happen at the distribution end (Netflix,Spotify, etc), since they already know you preferences.

Xeophon: And a big part is the social aspect. You cannot talk with your friends about a movie if everyone saw a totally different thing. Memes and internet culture wouldn’t work, either.

John Rush: you’re 100% right. the best example is the modern UX. Which went from 1) lots of actions(filters, categories, search) (blogs) 2) to little action: scroll (fb) 3) to no action: auto-playing stories (inst/tiktok)

I do not think that Sora and its ilk will be anywhere near ready, by 2025, to create actually watchable content, in the sense of anyone sane wanting to watch it. That goes double for things generated directly from prompts, rather than bespoke transformations and expansions of existing creative work, and some forms of customization, dials or switches you can turn or flip, that are made much easier to assemble, configure and serve.

I do think there’s a lot of things that can be done. But I think there is a rather large period where ‘use AI methods to make tweaks possible and practical’ is good, but almost no one in practice wants much more than that.

I think there is this huge benefit to knowing that the thing was specifically made by a particular set of people, and seeing their choices, and having everything exist in that context. And I do think we will mostly want to retain the social reference points and interactions, including for games. There is a ton of value there. You want to compare your experience to someone else’s. That does not mean that AI couldn’t get sufficiently good to overcome that, but I think the threshold is high.

As a concrete example, right now I am watching the show Severance on Apple TV. So far I have liked it a lot, but the ways it is good are intertwined with it being a show written by humans, and those creators making choices to tell stories and explore concepts. If an AI managed to come up with the same exact show, I would be super impressed by that to be sure, but also the show would not be speaking to me in the same way.

Ryan Moulton: There is a huge gap in generative AI between the quality you observe when you’re playing with it open endedly, and the quality you observe when you try to use it for a task where you have a specific end goal in mind. This is I think where most of the hype/reality mismatch occurs.

PoliMath (distinct thread): I am begging anyone to take one scene from any movie and recreate it with Sora Any movie. Anything at all. Taxi Driver, Mean Girls, Scott Pilgrim, Sonic the Hedgehog, Buster Keaton. Anything.

People are being idiots in the replies here so I’ll clarify: The comment was “everyone will be filmmakers” with AI No they won’t.

Everyone will be able to output random video that mostly kind of evokes the scene they are describing.

That is not filmmaking.

If you’ve worked with AI generation on images or text, you know this is true. Try getting ChatGPT to output even tepidly interesting dialogue about any specific topic. Put a specific image in your head and try to get Midjourney to give you that image.

Same thing with image generation. When I want something specific, I expect to be frustrated and disappointed. When I want anything at all within a vibe zone, when variations are welcomed, often the results are great.

Will we get there with video? Yes I think we will, via modifications and edits and general advancements, and incorporating AI agents to implement the multi-step process. But let’s not get ahead of ourselves.

The contrast and flip side is then games. Games are a very different art form. We should expect games to continue to improve in some ways relative to non-interactive experiences, including transitioning to full AR/VR worlds, with intelligent other characters, more complex plots that give you more interactive options and adapt to your choices, general awesomeness. It is going to be super cool, but it won’t be replacing Netflix.

Tyler Cowen asked what the main commercial uses will be. The answers seem to be that they enable cheap quick videos in the style of TikTok or YouTube, or perhaps a music video. Quality available for dirt cheap may go up.

Also they enable changing elements of a video. The example in the technical paper was to turn the area around a driving car into a jungle, others speculate about de-aging actors or substituting new ones.

I think this will be harder here than in many other cases. With text, with images and with sound, I saw the mundane utility. Here I mostly don’t.

At a minimum it will take time. These tools are nowhere near being able to reproduce existing high quality outputs. So instead, the question becomes what we can do with the new inputs, to produce what kinds of new outputs that people still value.

Tyler posted his analysis a few days later, saying it has profound implications for ‘all sorts of industries’ but will hit the media first, especially advertising, although he agrees it will not put Hollywood out of business. I agree that this makes ‘have something vaguely evocative you can use as an advertisement’ will get easier and cheaper, I suppose, when people want that.

Others are also far more excited than I am. Anton says Tesla should go all-in on this due to its access to video data from drivers, and buy every GPU at any price to do more video. I would not be doing that.

Grimes: Cinema – the most prohibitively expensive art form (but also the greatest and most profound) – is about to be completely democratized the way music was with DAW’s.

(Without DAW’s like ableton, GarageBand, logic etc – grimes and most current artists wouldn’t exist).

Crucifore (distinct thread): I’m still genuinely perplexed by people saying Sora etc is the “end of Hollywood.” Crafting a story is very different than generating an image.

Alex Tabarrok: Crafting a story is a more distributed skill than the capital intensive task of making a movie.

Thus, by democratizing the latter, Sora et al. give a shot to the former which will mean a less Hollywood centric industry, much as Youtube has drawn from TV studios.

Matt Darling: Worth noting that YouTube is also sort of fundamentally a different product than TV. The interesting question is less “can you do movies with AI?” and more “what can we do now that we couldn’t before?”.

Alex Tabarrok: Yes, exactly; but attention is a scarce resource.

Andrew Curran says it can do graph design and notes it can generate static images. He is super excited, thread has examples.

I still don’t see it. I mean, yes, super impressive, big progress leap in the area, but still seems a long way from where it needs to be.

Of course, ‘a long way’ often translates in this business to ‘a few years,’ but I still expect this to be a small part of the picture compared to text, or for a while even images or voice.

Here’s a concrete question:

Daniel Eth: If you think sora is better than what you expected, does that mean you should buy Netflix or short Netflix? Legitimately curious what finance people think here.

My guess is little impact for a while. My gut says net negative, because it helps Netflix’s competition more than it helps Netflix.

What will the future bring? Here is scattershot prediction fun on what will happen at the end of 2025:

Cost is going to be a practical issue. $0.50 per minute is tiny for some purposes, but it is also a lot for others, especially if you cannot get good results zero-shot and have to do iterations and modifications, or if you are realistically only going to see it once.

I continue to think that text-to-video has a long way to go before it offers much mundane utility. Text should remain dominant, then multimodality with text including audio generation, then images, only then video. For a while, when we do get video, I expect it to largely in practice be based off of bespoke static images, real and otherwise, rather than the current text-to-video idea. The full thing will eventually get there, but I expect a (relatively, in AI timeline terms) long road, and this is a case where looking for anything at all loses out most often to looking for something specific.

But also, perhaps, I am wrong. I have been a video skeptic in many ways long before AI. There are some uses for ‘random cool video vaguely in this area of thing.’ And if AI video becomes a major use case, that seems mostly good, as it will be relatively easy to spot and otherwise less dangerous, and let’s face it, video is cool.

So prove me wrong, kids. Prove me wrong.

Sora What Read More »

AI #52: Oops

Highlights / DJ Henderson / February 23, 2024

We were treated to technical marvels this week.

At Google, they announced Gemini Pro 1.5, with a million token context window within which it has excellent recall, using mixture of experts to get Gemini Advanced level performance (e.g. GPT-4 level) out of Gemini Pro levels of compute. This is a big deal, and I think people are sleeping on it. Also they released new small open weights models that look to be state of the art.

At OpenAI, they announced Sora, a new text-to-video model that is a large leap from the previous state of the art. I continue to be a skeptic on the mundane utility of video models relative to other AI use cases, and think they still have a long way to go, but this was both technically impressive and super cool.

Also, in both places, mistakes were made.

At OpenAI, ChatGPT briefly lost its damn mind. For a day, faced with record traffic, the model would degenerate into nonsense. It was annoying, and a warning about putting our trust in such systems and the things that can go wrong, but in this particular context it was weird and beautiful and also hilarious. This has now been fixed.

At Google, people noticed that Gemini Has a Problem. In particular, its image generator was making some highly systematic errors and flagrantly disregarding user requests, also lying about it to users, and once it got people’s attention things kept looking worse and worse. Google has, to their credit, responded by disabling entirely the ability of their image model to output people until they can find a fix.

I hope both serve as important warnings, and allow us to fix problems. Much better to face such issues now, when the stakes are low.

Covered separately: Gemini Has a Problem, Sora What, and Gemini 1.5 Pro.

Introduction. We’ve got some good news, and some bad news.
Table of Contents.
Language Models Offer Mundane Utility. Probable probabilities?
Language Models Don’t Offer Mundane Utility. Air Canada finds out.
Call me Gemma Now. Google offers new state of the art tiny open weight models.
Google Offerings Keep Coming and Changing Names. What a deal.
GPT-4 Goes Crazy. But it’s feeling much better now.
GPT-4 Real This Time. Offer feedback on GPTs, see their profiles.
Fun With Image Generation. Image generation for journal articles.
Deepfaketown and Botpocalypse Soon. Several approaches to impersonation risks.
Selling Your Chatbot Data. I don’t really know what you were expecting.
Selling Your Training Data. I still don’t really know what you were expecting.
They Took Our Jobs. There is a third option.
Get Involved. Apart Research is hiring.
Introducing. Groq, Lindy, Podcaster Copilot, potentially Magic and Altera.
In Other AI News. Altman looks to move his chip plans forward.
Quiet Speculations. Arguing over slow versus fast takeoff during takeoff.
The Quest for Sane Regulations. There will be many bills along the way.
The Week in Audio. I’m back on the Cognitive Revolution.
The Original Butlerian Jihad. What was Dune a cautionary tale against again?
Rhetorical Innovation. Another open letter, another trillion dollars. Ho hum.
Public Service Announcement. Fentanyl, both literally and as metaphor.
People Are Worried About AI Killing Everyone. There is a pattern to who.
Other People Are Not As Worried About AI Killing Everyone. Sure, why not.
The Lighter Side. There is not enough information to solve the problem.

Steven Johnson strongly endorses NotebookLM, offers YouTube tutorial. This is definitely one of those ‘I need to try using this more and it’s weird I don’t find excuses’ situations.

Automatically email everyone to tell them to remove your email address from their database.

Patrick McKenzie: Interestingly, one of the first denial of service via AI-powered amplification events that I’ve seen personally is an app that users let puppet their Gmail account then send officially flavored deletion requests to everyone the AI thinks probably has their email in a database.

If one believes one is subject to regulatory regimes, or has compatible moral intuitions, requiring a deletion request immediately cause bespoke work, that turns two minutes of user activity into ~thousands of companies each needing to do variable amounts of work.

Get probabilities, LessWrong style, by asking for LessWrong style norms of probabilities?

Daniel Eth:

From ChatGPT: ChatGPT

Confidence Levels in the Existence of Chinese Dynasties

Xia Dynasty: ~50-70%

Three Sovereigns and Five Emperors: ~10-20%

Shang Dynasty: ~95-99%

Zhou Dynasty: ~99%

Qin Dynasty: ~99.9%

Han Dynasty: ~99.9%

These confidence levels are approximations based on current archaeological evidence and historical scholarship.

(I should clarify that I used a bit of prompt engineering to get it to guess and then asked it to repeat the results without its explanations, but nevertheless these were all numbers that it came up with)

Specifically, here’s the prompt engineering I did:

“Assume LessWrong style norms of probabilities – approximately how confident is it reasonable for a person to be in the existence of each of these dynasties? It’s okay to be wrong, just give a reasonable answer for each.”

He also tested for situational awareness by having it estimate there was a 70% chance it was the victim of RLHF, with a 30% chance it was the base model. It asks some reasonable questions, but fails to ask about base rates of inference, so it gets 70% rather than 99%.

I have added this to my custom instructions.

There are also active AI forecasters on Manifold, who try to generate their own predictions using various reasoning processes. Do they have alpha? It is impossible to say given the data we have, they clearly do some smart things and also some highly dumb things. Trading strategies will be key, as they will fall into traps hardcore if they are allowed to, blowing them up, even if they get a lot better than they are now.

I continue to be curious to build a Manifold bot, but I would use other principles. If anyone wants to help code one for me to the point I can start tweaking it in exchange for ~~eternal~~ ephemeral glory and a good time, and perhaps a share of the mana profits, let me know.

Realize, after sufficient prodding, that letting them see your move in Rock-Paper-Scissors might indeed be this thing we call ‘cheating.’

Why are they so often so annoying?

Emmett Shear: How do we RLHF these LLMs until they stop blaming the user and admit that the problem is that they are unsure? Where does the smug, definitive, overconfident tone that all the LLMs have come from?

Nate Silver: It’s quite similar to the tone in mainstream, center-left political media, and it’s worth thinking about how the AI labs and the center-left media have the same constituents to please.

Did you know they are a student at the University of Michigan? Underlying claim about who is selling what data is disputed, the phenomenon of things being patterns in the data is real either way.

Davidad: this aligns with @goodside’s recollection to me once that a certain base model responded to “what do you do?” with “I’m a student at the University of Michigan.”

My explanation is that if you’re sampling humans weighted by the ratio of their ability to contribute English-language training data to the opportunity cost of their time per marginal hour, “UMich student” is one of the dominant modes.

Timothy Lee asks Gemini Advanced as his first prompt a simple question designed to trick it, where it really shouldn’t get tricked, it gets tricked.

You know what? I am proud of Google for not fixing this. It would be very easy for Google to say, this is embarrassing, someone get a new fine tuning set and make sure it never makes this style of mistake again. It’s not like it would be that hard. It also never matters in practice.

This is a different kind of M&M test, where they tell you to take out all the green M&Ms, and then you tell them, ‘no, that’s stupid, we’re not doing that.’ Whether or not they should consider this good news is another question.

Air Canada forced to honor partial refund policy invented by its chatbot. The website directly contradicted the bot, but the judge ruled that there was no reason a customer should trust the rest of the website rather than the chatbot. I mean, there is, it is a chatbot, but hey.

Chris Farnell: Science fiction writers: The legal case for robot personhood will be made when a robot goes on trial for murder. Reality: The legal case for robot personhood will be made when an airline wants to get out of paying a refund.

While I fully support this ruling, I do not think that matter was settled. If you offer a chatbot to customers, they use it in good faith and it messes up via a plausible but incorrect answer, that should indeed be on you. Only fair.

Matt Levine points out that this was the AI acting like a human, versus a corporation trying to follow an official policy:

The funny thing is that the chatbot is more human than Air Canada. Air Canada is a corporation, an emergent entity that is made up of people but that does things that people, left to themselves, would not do. The chatbot is a language model; it is in the business of saying the sorts of things that people plausibly might say. If you just woke up one day representing Air Canada in a customer-service chat, and the customer said “my grandmother died, can I book a full-fare flight and then request the bereavement fare later,” you would probably say “yes, I’m sorry for your loss, I’m sure I can take care of that for you.” Because you are a person!

The chatbot is decent at predicting what people would do, and it accurately gave that answer. But that’s not Air Canada’s answer, because Air Canada is not a person.

The question is, what if the bot had given an unreasonable answer? What if the customer had used various tricks to get the bot to, as happened in another example, sell a car for $1 ‘in a legally binding contract’? Is there an inherent ‘who are you kidding?’ clause here, or not, and if there is how far does it go?

One can also ask whether a good disclaimer could get around this. The argument was that there was no reason to doubt the chatbot, but it would be easy to give a very explicit reason to doubt the chatbot.

A wise memo to everyone attempting to show off their new GitHub repo:

Liz Lovelace: very correct take, developers take note

Paul Calcraft: I loved this thread so much. People in there claiming that anyone who could use a computer should find it easy enough to Google a few things, set up Make, compile it and get on with it Great curse of knowledge demo.

Code of Kai: This is the correct take even for developers. Developers don’t seem to realise how much of their time is spent learning how to use their tools compared to solving problems. The ratio is unacceptable.

Look, I know that if I did it a few times I would be over it and everything would be second nature but I keep finding excuses not to suck it up and do those few times. And if this is discouraging me, how many others is it discouraging?

Gemma, Google’s miniature 2b and 7b open model weights language models, are now available.

Demis Hassabis: We have a long history of supporting responsible open source & science, which can drive rapid research progress, so we’re proud to release Gemma: a set of lightweight open models, best-in-class for their size, inspired by the same tech used for Gemini.

I have no problems with this. Miniature models, at their current capabilities levels, are exactly a place where being open has relatively more benefits and minimal costs.

I also think them for not calling it Gemini, because even if no one else cares, there should be exactly two models called Gemini. Not one, not three, not four. Two. Call them Pro and Ultra if you insist, that’s fine, as long as there are two. Alas.

In the LLM Benchmark page it is now ranked #1 although it seems one older model may be missing:

As usual, benchmarks tell you a little something but are often highly misleading. This does not tell us whether Google is now state of the art for these model sizes, but I expect that this is indeed the case.

Thomas Kurian: We’re announcing Duet AI for Google Workspace will now be Gemini for Google Workspace. Consumers and organizations of all sizes can access Gemini across the Workspace apps they know and love.

We’re introducing a new offering called Gemini Business, which lets organizations use generative AI in Workspace at a lower price point than Gemini Enterprise, which replaces Duet AI for Workspace Enterprise.

We’re also beginning to roll out a new way for Gemini for Workspace customers to chat with Gemini, featuring enterprise-grade data protections.

Lastly, consumers can now access Gemini in their personal Gmail, Docs, Slides, Sheets, and Meet apps through a Google One AI Premium subscription.

Sundar Pichai (CEO Google): More Gemini news: Starting today, Gemini for Workspace is available to businesses of all sizes, and consumers can now access Gemini in their personal Gmail, Docs and more through a Google One AI Premium subscription.

This seems like exactly what individuals can get, except you buy in bulk for your business?

To be clear, that is a pretty good product. Google will be getting my $20 per month for the individual version, called ‘Google One.’

Now, in addition to Gemini Ultra, you also get Gemini other places like GMail and Google Docs and Google Meet, and various other fringe benefits like 2 TB of storage and longer Google Meet sessions.

Alyssa Vance: Wow, I got GPT-4 to go absolutely nuts. (The prompt was me asking about mattresses in East Asia vs. the West).

Cate Hall: “Yoga on a great repose than the neared note, the note was a foreman and the aim of the aim” is my favorite Fiona Apple album.

Andriy Burkov: OpenAI has broken GPT-4. It ends each reply with hallucinated garbage and doesn’t stop generating it.

Matt Palmer: So this is how it begins, huh?

Nik Sareen: it was speaking to me in Thai poetry an hour ago.

Sean McGuire: ChatGPT is apparently going off the rails right now [8:32pm February 20] and no one can explain why.

the chatgptsubreddit is filled with people wondering why it started suddenly speaking Spanglish, threatened the user (I’m in the room with you right now, lmao) or started straight up babbling.

Esplin: ChatGPT Enterprise has lost its mind

Grace Kind: So, did your fuzz testing prepare you for the case where the API you rely on loses its mind?

But don’t worry. Everything’s fine now.

ChatGPT (Twitter account, February 21 1: 30pm): went a little off the rails yesterday but should be back and operational!

Danielle Fong: Me when I overdid it with the edibles.

What the hell happened?

Here is their official postmortem, posted a few hours later. It says the issue was resolved on February 21 at 2: 14am eastern time.

Postmortem: On February 20, 2024, an optimization to the user experience introduced a bug with how the model processes language.

LLMs generate responses by randomly sampling words based in part on probabilities. Their “language” consists of numbers that map to tokens.

In this case, the bug was in the step where the model chooses these numbers. Akin to being lost in translation, the model chose slightly wrong numbers, which produced word sequences that made no sense. More technically, inference kernels produced incorrect results when used in certain GPU configurations.

Upon identifying the cause of this incident, we rolled out a fix and confirmed that the incident was resolved.

Davidad hazarded a guess before that announcement, which he thinks now looks good.

Nora Belrose: I’ll go on the record as saying I expect this to be caused by some very stupid-in-retrospect bug in their inference or fine tuning code.

Unfortunately they may never tell us what it was.

Davidad: My modal prediction: something that was regularizing against entropy got sign-flipped to regularize *in favorof entropy. Sign errors are common; sign errors about entropy doubly so.

I predict that the weights were *notcorrupted (by fine-tuning or otherwise), only sampling.

If it were just a mistakenly edited scalar parameter like temperature or top-p, it would probably have been easier to spot and fix quickly. More likely an interaction between components. Possibly involving concurrency, although they’d probably be hesitant to tell us about that.

But it’s widely known that temperature 0.0 is still nondeterministic because of a wontfix race condition in the sampler.

oh also OpenAI in particular has previously made a sign error that people were exposed to for hours before it got reverted.

[announcement was made]

I’m feeling pretty good about my guesses that ChatGPT’s latest bug was:

an inference-only issue

not corrupted weights

not a misconfigured scalar

possibly concurrency involved

they’re not gonna tell us about the concurrency (Not a sign flip, though)

Here’s my new guess: they migrated from 8-GPU processes to 4-GPU processes to improve availability. The MoE has 8 experts. Somewhere they divided logits by the number of GPUs being combined instead of the number of experts being combined. Maybe the 1-GPU config was special-cased so the bug didn’t show up in the dev environment.

Err, from 4-GPU to 8-GPU processes, I guess, because logits are *dividedby temperature, so that’s the direction that would result in accidentally doubling temperature. See this is hard to think about properly.

John Pressman says it was always obviously a sampling bug, although saying that after the postmortem announcement scores no Bayes points. I do agree that this clearly was not an RLHF issue, that would have manifested very differently.

Roon looks on the bright side of life.

Roon: it is pretty amazing that gpt produces legible output that’s still following instructions despite sampling bug

Should we be concerned more generally? Some say yes.

Connor Leahy: Really cool how our most advanced AI systems can just randomly develop unpredictable insanity and the developer has no idea why. Very reassuring for the future.

Steve Strickland: Any insight into what’s happened here Connor? I know neural nets/transformers are fundamentally black boxes. But seems strange that an LLM that’s been generating grammatically perfect text for over a year would suddenly start spewing out garbage.

Connor Leahy: Nah LLMs do shit like this all the time. They are alien machine blobs wearing masks, and it’s easy for the mask to slip.

Simeon (distinct thread): Sure, maybe we fucked up hard this ONE TIME a deployment update to hundreds of million of users BUT we’ll definitely succeed at a dangerous AGI deployment.

Was this a stupid typo or bug in the code, or some parameter being set wrong somewhere by accident, or something else dumb? Seems highly plausible that it was.

Should that bring us comfort? I would say it should not. Dumb mistakes happen. Bugs and typos that look dumb in hindsight happen. There are many examples of dumb mistakes changing key outcomes in history, determining the fates of nations. If all it takes is one dumb mistake to make GPT-4 go crazy, and it takes us a day to fix it when this error does not in any way make the system try to stop you from fixing it, then that is not a good sign.

GPT-4-Turbo rate limits have been doubled, daily limits removed.

You can now rate GPTs and offer private feedback to the builder. Also there’s a new about section:

OpenAI: GPTs ‘About’ section can now include:

∙ Builder social profiles

∙ Ratings

∙ Categories

∙ # of conversations

∙ Conversation starters

∙ Other GPTs by the builder

Short explanation that AI models tend to get worse over time because taking into account user feedback makes models worse. It degrades their reasoning abilities such as chain of thought, and generally forces them to converge towards a constant style and single mode of being, because the metric of ‘positive binary feedback’ points in that direction. RLHF over time reliably gets us something we like less and is less aligned to what we actually want, even when there is no risk in the room.

The short term implication is easy, it is to be highly stingy and careful with your RLHF feedback. Use it in your initial fine-tuning if you don’t have anything better, but the moment you have what you need, stop.

The long term implication is to reinforce that the strategy absolutely does not scale.

Emmett Shear:

What I learned from posting this is that people have no idea how RLHF actually works.

Matt Bateman: Not sure how you parent but whenever 3yo makes a mistake I schedule a lobotomy.

Emmett Shear: Junior started using some bad words at school, but no worries we can flatten that part of the mindscape real quick, just a little off the top. I’m sure there won’t be any lasting consequences.

What we actually do to children isn’t as bad as RLHF, but it is bad enough, as I often discuss in my education roundups. What we see happening to children as they go through the school system is remarkably similar, in many ways, to what happens to an AI as it goes through fine tuning.

Andres Sandberg explores using image generation for journal articles, finds it goes too much on vibes versus logic, but sees rapid progress. Expects this kind of thing to be useful within a year or two.

ElevenLabs is preparing for the election year by creating a ‘no-go voices’ list, starting with the presidential and prime minister candidates in the US and UK. I love this approach. Most of the danger is in a handful of voices, especially Biden and Trump, so detect those and block them. One could expand this by allowing those who care to have their voices added to the block list.

On the flip side, you can share your voice intentionally and earn passive income, choosing how much you charge.

The FTC wants to crack down on impersonation. Bloomberg also has a summary.

FTC: The Federal Trade Commission is seeking public comment on a supplemental notice of proposed rulemaking that would prohibit the impersonation of individuals. The proposed rule changes would extend protections of the new rule on government and business impersonation that is being finalized by the Commission today.

It is odd that this requires a rules change? I would think that impersonating an individual, with intent to fool someone, would already be not allowed and also fraud.

Indeed, Gemini says that there are no new prohibitions here. All this does is make it a lot easier for the FTC to get monetary relief. Before, they could get injunctive relief, but at this scale that doesn’t work well, and getting money was a two step process.

Similarly, how are we only largely getting around to punishing these things now:

For example, the rule would enable the FTC to directly seek monetary relief in federal court from scammers that:

Use government seals or business logos when communicating with consumers by mail or online.

Spoof government and business emails and web addresses, including spoofing “.gov” email addresses or using lookalike email addresses or websites that rely on misspellings of a company’s name.

Falsely imply government or business affiliation by using terms that are known to be affiliated with a government agency or business (e.g., stating “I’m calling from the Clerk’s Office” to falsely imply affiliation with a court of law).

I mean those all seem pretty bad. It does seem logical to allow direct fines.

The question is, how far to take this? They propose quite far:

The Commission is also seeking comment on whether the revised rule should declare it unlawful for a firm, such as an AI platform that creates images, video, or text, to provide goods or services that they know or have reason to know is being used to harm consumers through impersonation.

How do you prevent your service from being used in part for impersonation? I have absolutely no idea. Seems like a de facto ban on AI voice services that do not lock down the list of available voices. Which also means a de facto ban on all open model weights voice creation software. Image generation software would have to be locked down rather tightly as well once it passes a quality threshold, with MidJourney at least on the edge. Video is safe for now, but only because it is not yet good enough.

There is no easy answer here. Either we allow tools that enable the creation of things that seem real, or we do not. If we do, then people will use them for fraud and impersonation. If we do not, then that means banning them, which means severe restrictions on video, voice and image models.

Seva worries primarily not about fake things taken to be potentially real, but about real things taken to be potentially fake. And I think this is right. The demand for fakes is mostly for low-quality fakes, whereas if we can constantly call anything fake we have a big problem.

Seva: I continue to think the bigger threat of deepfakes is not in convincing people that fake things are real but in offering plausible suspicion that real things are fake.

Being able to deny an objective reality is much more pernicious than looking for evidence to embrace an alternate reality, which is something people do anyway even when that evidence is flimsy.

Like I would bet socialization, or cognitive heuristics like anchoring effects, drive disinfo much more than deepfakes.

Albert Pinto: Daniel dennet laid out his case for erosion of trust (between reality and fake) is gigantic effect of AI

Seva: man I’m going to miss living in a high trust society.

We are already seeing this effect, such as here (yes it was clearly real to me, but that potentially makes the point stronger):

Daniel Eth: Like, is this real? Is it AI-generated? I think it’s probably real, but only because, a) I don’t have super strong priors against this happening, b) it’s longer than most AI-generated videos and plus it has sound, and c) I mildly trust @AMAZlNGNATURE

I do expect us to be able to adapt. We can develop various ways to show or prove that something is genuine, and establish sources of trust.

One question is, will this end up being good for our epistemics and trustworthiness exactly because they will now be necessary?

Right now, you can be imprecise and sloppy, and occasionally make stuff up, and we can find that useful, because we can use our common sense and ability to differentiate reality, and the crowd can examine details to determine if something is fake. The best part about community notes, for me, is that if there is a post with tons of views, and it does not have a note, then that is itself strong evidence.

In the future, it will become extremely valuable to be a trustworthy source. If you are someone who maintains the chain of epistemic certainty and uncertainty, who makes it clear what we know and how we know it and how much we should trust different statements, then you will be useful. If not, then not. And people may be effectively white-listing sources that they can trust, and doing various second and third order calculations on top of that in various ways.

The flip side is that this could make it extremely difficult to ‘break into’ the information space. You will have to build your credibility the same way you have to build your credit score.

In case you did not realize, the AI companion (AI girlfriend and AI boyfriend and AI nonbinary friends even though I oddly have not heard mention of one yet, and so on, but that’s a mouthful ) aps absolutely 100% are harvesting all the personal information you put into the chat, most of them are selling it and a majority won’t let you delete it. If you are acting surprised, that is on you.

The best version of this, of course, would be to gather your data to set you up on dates.

Cause, you know, when one uses a chatbot to talk to thousands of unsuspecting women so you can get dates, ‘they’ say there are ‘ethical concerns.’

Whereas if all those chumps are talking to the AIs on purpose? Then we know they’re lonely, probably desperate, and sharing all sorts of details to help figure out who might be a good match. There are so many good options for who to charge the money.

The alternative is that if you charge enough money, you do not need another revenue stream, and some uses of such bots more obviously demand privacy. If you are paying $20 a month to chat with an AI Riley Reid, that would not have been my move, but at a minimum you presumably want to keep that to yourself.

An underappreciated AI safety cause subarea is convincing responsible companies to allow adult content in a responsible way, including in these bots. The alternative is to drive that large market into the hands of irresponsible actors, who will do it in an irresponsible way.

AI companion data is only a special case, although one in which the privacy violation is unusually glaring, and the risks more obvious.

Various companies also stand ready to sell your words and other outputs as training data.

Reddit is selling its corpus. Which everyone was already using anyway, so it is not clear that this changes anything. It turns out that it is selling it to Google, in a $60 million deal. If this means that their rivals cannot use Reddit data, OpenAI and Microsoft in particular, that seems like an absolute steal.

Artist finds out that Pond5 and Shutterstock are going to sell your work and give you some cash, in this case $50, via a checkbox that will default to yes, and they will not let you tell them different after the money shows up uninvited. This is such a weird middle ground. If they had not paid, would the artist have ever found out? This looks to be largely due to an agreement Shutterstock signed with OpenAI back in July that caused its stock to soar 9%.

Pablo Taskbar: Thinking of a startup to develop an AI program to look for checkboxes in terms and condition documents.

Bernell Loeb: Same thing happened with my web host, Squarespace. Found out from twitter that Squarespace allowed ai to scrape our work. No notice given (no checks either). When I contacted them to object, I was told that I had to “opt out” without ever being told I was already opted in.

Santynieto: Happened to me with @SubstackInc: checking the preferences of my publication, I discovered a new, never announced-before setting by the same name, also checked, as if I had somehow “chosen”, without knowing abt it at all, to make my writing available for data training. I hate it!

I checked that last one. There is a box that is unchecked that says ‘block AI training.’

I am choosing to leave the box unchecked. Train on my writing all you want. But that is a choice that I am making, with my eyes open.

Why yes. Yes I do, actually.

Gabe: by 2029 the only jobs left will be bank robber, robot supervisor, and sam altman

Sam Altman: You want that last one? It’s kinda hard sometimes.

Apart Research, who got an ACX grant, is hiring for AI safety work. I have not looked into them myself and am passing along purely on the strength of Scott’s grant alone.

Lindy is now available to everyone, signups here. I am curious to try it, but oddly I have no idea what it would be useful to me to have this do.

Groq.com will give you LLM outputs super fast. From a creator of TPUs, they claim to have Language Processing Units (LPUs) that are vastly faster at inference. They do not offer model training, suggesting LPUs are specifically good at inference. If this is the future, that still encourages training much larger models, since such models would then be more commercially viable to use.

Podcaster copilot. Get suggested questions and important context in real time during a conversation. This is one of those use cases where you need to be very good relative to your baseline to be net positive to rely on it all that much, because it requires splitting your attention and risks disrupting flow. When I think about how I would want to use a copilot, I would want it to fact check claims, highlight bold statements with potential lines of response, perhaps note evasiveness, and ideally check for repetitiveness. Are your questions already asked in another podcast, or in their written materials? Then I want to know the answer now, especially important with someone like Tyler Cowen, where the challenge is to get a genuinely new response.

Claim that magic.dev has trained a groundbreaking model for AI coding, Nat Friedman is investing $100 million.

Nat Friedman: Magic.dev has trained a groundbreaking model with many millions of tokens of context that performed far better in our evals than anything we’ve tried before.

They’re using it to build an advanced AI programmer that can reason over your entire codebase and the transitive closure of your dependency tree. If this sounds like magic… well, you get it. Daniel and I were so impressed, we are investing $100M in the company today.

The team is intensely smart and hard-working. Building an AI programmer is both self-evidently valuable and intrinsically self-improving.

Intrinsically self-improving? Ut oh.

Altera Bot, an agent in Minecraft that they claim can talk to and collaboratively play with other people. They have a beta waitlist.

Sam Altman seeks Washington’s approval to build state of the art chips in the UAE. It seems there are some anti-trust concerns regarding OpenAI, which seems like it is not at all the thing to be worried about here. I continue to not understand how Washington is not telling Altman that under no way in hell is he going to do this in the UAE, he can either at least friend-shore it or it isn’t happening.

Apple looking to add AI to iPad interface and offer new AI programming tools, but progress continues to be slow. No mention of AI for the Apple Vision Pro.

More on the Copyright Confrontation from James Grimmelmann, warning that AI companies must take copyright seriously, and that even occasional regurgitation or reproduction of copyrighted work is a serious problem from a legal perspective. The good news in his view is that judges will likely want to look favorably upon OpenAI because it offers a genuinely new and useful transformative product. But it is tricky, and coming out arguing the copying is not relevant would be a serious mistake.

This is Connor Leahy discussing Gemini’s ability to find everything in a 3 hour video.

Connor Leahy: This is the kind of stuff that makes me think that there will be no period of sorta stupid, human-level AGI. Humans can’t perceive 3 hours of video at the same time. The first AGI will instantly be vastly superhuman at many, many relevant things.

Richard Ngo: “This is exactly what makes me think there won’t be any slightly stupid human-level AGI.” – Connor when someone shows him a slightly stupid human-level AGI, probably.

You are in the middle of a slow takeoff pointing to the slow takeoff as evidence against slow takeoffs.

Connor Leahy: By most people’s understanding, we are in a fast takeoff. And even by Paul’s definition, unless you expect a GDP doubling in 4 years before a 1 year doubling, we are in fast takeoff. So, when do you expect this doubling to happen?

Richard Ngo: I do in fact expect an 8-year GDP doubling before a 2-year GDP doubling. I’d low-confidence guess US GDP will be double its current value in 10-15 years, and then the next few doublings will be faster (but not *thatmuch faster, because GDP will stop tracking total output).

Slow is relative. It also could be temporary.

If world GDP doubles in the next four years without doubling in one, that is a distinct thing from historical use of the term ‘fast takeoff,’ because the term ‘fast takeoff’ historically means something much faster than that. It would still be ‘pretty damn fast,’ or one can think of it simply as ‘takeoff.’ Or we could say ‘gradual takeoff’ as the third slower thing.

I do not only continue to think that we not mock those who expected everything in AI to happen all at once with little warning, with ASI emerging in weeks, days or even hours, without that much mundane utility before that. I think that they could still be proven more right than those who are mocking them.

We have a bunch of visible ability and mundane utility now, so things definitely look like a ‘slow’ takeoff, but it could still functionally transform into a fast one with little warning. It seems totally reasonable to say that AI is rapidly getting many very large advantages with respect to humans, so if it gets to ‘roughly human’ in the core intelligence module, whatever you want to call that, then suddenly things get out of hand fast, potentially the ‘fast takeoff’ level of fast even if you see more signs and portents first.

More thoughts on how to interpret OpenAI’s findings on bioweapon enabling capabilities of GPT-4. The more time passes, the more I think the results were actually pretty impressive in terms of enhancing researcher capabilities, and also that this mostly speaks to improving capabilities in general rather than anything specifically about a bioweapon.

How will AIs impact people’s expectations, of themselves and others?

Sarah (Little Ramblings): have we had the unrealistic body standards conversation around AI images / video yet that we had when they invented airbrushing? if not can we get it over with cause it’s gonna be exhausting

‘honey remember these women aren’t real!! but like literally, actually not real’

can’t wait for the raging body dysmorphia epidemic amongst teenagers trying to emulate the hip to waist ratio of women who not only don’t look like that in real life, but do not in fact exist in real life.

Eliezer Yudkowsky: I predict/guess: Unrealistic BODY standards won’t be the big problem. Unrealistic MIND standards will be the problem. “Why can’t you just be understanding and sympathetic, like my AR harem?”

Sarah: it’s kinda funny that people are interpreting this tweet as concern about people having unrealistic expectations of their partners, when I was expressing concern about people having unrealistic expectations of themselves.

Eliezer Yudkowsky: Valid. There’s probably a MIND version of that too, but it’s not as straightforward to see what it’ll be.

Did you know that we already already have 65 draft bills in New York alone that have been introduced related to AI? And also Axios had this stat to offer:

Zoom out: Hochul’s move is part of a wave of state-based AI legislation — now arriving at a rate of 50 bills per week — and often proposing criminal penalties for AI misuse.

That is quite a lot of bills. One should therefore obviously not get too excited in any direction when bills are introduced, no matter how good or (more often) terrible the bill might be, unless one has special reason to expect them to pass.

The governor pushing a law, as New York’s is now doing, is different. According to Axios her proposal is:

Making unauthorized uses of a person’s voice “in connection with advertising or trade” a misdemeanor offense. Such offenses are punishable by up to one year jail sentence.

Expanding New York’s penal law to include unauthorized uses of artificial intelligence in coercion, criminal impersonation and identity theft.

Amending existing intimate images and revenge porn statutes to include “digital images” — ranging from realistic Photoshop-produced work to advanced AI-generated content.

Codifying the right to sue over digitally manipulated false images.

Requiring disclosures of AI use in all forms of political communication “including video recording, motion picture, film, audio recording, electronic image, photograph, text, or any technological representation of speech or conduct” within 60 days of an election.

As for this particular law? I mean, sure, all right, fine? I neither see anything especially useful or valuable here, nor do I see much in the way of downside.

Also, this is what happens when there is no one in charge and Congress is incapable of passing basic federal rules, not even around basic things like deepfakes and impersonation. The states will feel compelled to act. The whole ‘oppose any regulatory action of any kind no matter what’ stance was never going to fly.

Department of Justice’s Monaco says they will be more harshly prosecuting cybercrimes if those involved were using AI, similar to the use of a gun. I notice I am confused. Why would the use of AI make the crime worse?

Matthew Pines looks forward to proposals for ‘on-chip governance,’ with physical mechanisms built into the hardware, linking to a January proposal writeup from Aarne, Fist and Withers. As they point out, by putting the governance onto the chip where it can do its job in private, you potentially avoid having to do other interventions and surveillance that violates privacy far more. Even if you think there is nothing to ultimately fear, the regulations are coming in some form. People who worry about the downsides of AI regulation need to focus more on finding solutions that minimize such downsides and working to steer towards those choices, rather than saying ‘never do anything at all’ as loudly as possible until the breaking point comes.

European AI Office launches.

I’m back at The Cognitive Revolution to talk about recent events and the state of play. Also available on X.

Who exactly is missing the point here, you think?

Saberspark [responding to a Sora video]: In the Dune universe, humanity banned the “thinking machines” because they eroded our ability to create and think for ourselves. That these machines were ultimately a bastardization of humanity that did more harm than good.

Cactus: Sci-fi Author: in my book I showed the destruction of Thinking Machines as a cautionary tale. Twitter User: We should destroy the Thinking Machines from classic sci-fi novel Don’t Destroy the Thinking Machines

I asked Gemini Pro, Gemini Advanced, GPT-4 and Claude.

Everyone except Gemini Pro replied in the now standard bullet point style. Every point one was ‘yes, this is a cautionary tale against the dangers of AI.’ Gemini Pro explained that in detail, whereas the others instead glossed over the details and then went on to talk about plot convenience, power dynamics and the general ability to tell interesting stories focused on humans, which made it the clearly best answer.

Whereas most science fiction stories solve the problem of ‘why doesn’t AI invalidate the entire story’ with a ‘well that would invalidate the entire story so let us pretend that would not happen, probably without explaining why.’ There are of course obvious exceptions, such as the excellent Zones of Thought novels, that take the question seriously.

It’s been a while since we had an open letter about existential risk, so here you go. Nothing you would not expect, I was happy to sign it.

In other news (see last week for details if you don’t know the context for these):

Robert Wiblin: It’s very important we start the fire now before other people pour more gasoline on the house I say as I open my trillion-dollar gasoline refining and house spraying complex.

Meanwhile:

Sam Altman: fk it why not 8

our comms and legal teams love me so much!

This does tie back to AI, but also the actual core information seems underappreciated right now: Lukas explains that illegal drugs are now far more dangerous, and can randomly kill you, due to ubiquitous lacing with fentanyl.

Lukas: Then I learned it only takes like 1mg to kill you and I was like “hmmmm… okay, guess I was wrong. Well, doesn’t matter anyway – I only use uppers and there’s no way dealers are cutting their uppers with downers that counteract the effects and kill their customers, that’d be totally retarded!”

Then I see like 500 people posting test results showing their cocaine has fentanyl in it for some reason, and I’m forced to accept that my theory about drug dealers being rational capitalistic market participants may have been misguided.

They have never been a good idea, drugs are bad mmmkay (importantly including alcohol), but before fentanyl using the usual suspects in moderation was highly unlikely to kill you or anything like that.

Now, drug dealers can cut with fentanyl to lower costs, and face competition on price. Due to these price pressures, asymmetric information, lack of attribution and liability for any overdoses and fatalities, and also a large deficit in morals in the drug dealing market, a lot of drugs are therefore cut with fentanyl, even uppers. The feedback signal is too weak. So taking such drugs even once can kill you, although any given dose is highly unlikely to do so. And the fentanyl can physically clump, so knowing someone else took from the same batch and is fine is not that strong as evidence of safety either. The safety strips help but are imperfect.

As far as I can tell, no one knows the real base rates on this for many obvious reasons, beyond the over 100,000 overdose deaths each year, a number that keeps rising. It does seem like it is super common. The DEA claims that 42% of pills tested for fentanyl contained at least 2mg, a potentially lethal dose. Of course that is not a random sample or a neutral source, but it is also not one free to entirely make numbers up.

Also the base potency of many drugs is way up versus our historical reference points or childhood experiences, and many people have insufficiently adjusted for this with their dosing and expectations.

Connor Leahy makes, without saying it explicitly, the obvious parallel to AGI.

Connor Leahy: This is a morbidly perfect demonstration about how there are indeed very serious issues that free market absolutism just doesn’t solve in practice.

Thankfully this only applies to this one specific problem and doesn’t generalize across many others…right?

[reply goes into a lot more detail]

The producer of the AGI gets rewarded for taking on catastrophic or existential risk, and also ordinary mundane risks. They are not responsible for the externalities, right now even for mundane risks they do not face liability, and there is information asymmetry.

Capitalism is great, but if we let capitalism do its thing here without fixing these classic market failures, they and their consequences will get worse over time.

This matches my experience as well, the link has screenshots from The Making of the Atomic Bomb. So many of the parallels line up.

Richard Ngo: There’s a striking similarity between physicists hearing about nuclear chain reactions and AI researchers hearing about recursive self-improvement.

Key bottlenecks in both cases include willingness to take high-level arguments seriously, act under uncertainty, or sound foolish.

The basics are important. I agree that you can’t know for sure, but if we do indeed do this accidentally then I do not like our odds.

Anton: one thing i agree with the artificial superintelligence xrisk people on is that it might indeed be a problem if we accidentally invented god.

Maybe, not necessarily.

If you do create God, do it on purpose.

Roon continues to move between the camps, both worried and not as worried, here’s where he landed this week:

Roon: is building astronomically more compute ethical and safe? Who knows idk.

Is building astronomically more compute fun and entertaining brahman? yes.

Grace Kind:

Sometimes one asks the right questions, then chooses not to care. It’s an option.

I continue to be confused by this opinion being something people actually believe:

Sully: AGI (AI which can automate most jobs) is probably like 2-3 years away, closer than what most think.

ASI (what most people think is AGI, some godlike singularity, etc) is probably a lot further along.

We almost have all the pieces to build AGI, someone just needs to do it.

Let’s try this again. If we have AI that can automate most jobs within 3 years, then at minimum we hypercharge the economy, hypercharge investment and competition in the AI space, and dramatically expand the supply while lowering the cost of all associated labor and work. The idea that AI capabilities would get to ‘can automate most jobs,’ the exact point at which it dramatically accelerates progress because most jobs includes most of the things that improve AI, and then stall for a long period, is not strictly impossible, I can get there if I first write the conclusion at the bottom of the page and then squint and work backwards, but it is a very bizarre kind of wishful thinking. It supposes a many orders of magnitude difficulty spike exactly at the point where the unthinkable would otherwise happen.

Also, a reminder for those who need to hear it, that who is loud on Twitter, or especially who is loud on Twitter within someone’s bubble, is not reality. And also a reminder that there are those hard at work trying to create the vibe that there is a shift in the vibes, in order to incept the new vibes. Do not fall for this.

Ramsey Brown: My entire timeline being swamped with pro-US, pro-natal, pro-Kardashev, pro-Defense is honestly giving me conviction that the kids are alright and we’re all gonna make it.

Marc Andreessen: The mother of all vibe shifts. 🇺🇸👶☢️🚀.

Yeah, no, absolutely nothing has changed. Did Brown observe this at all? Maybe he did. Maybe he didn’t. If he did, it was because he self-selected into that corner of the world, where everyone tries super hard to make fetch happen.

SMBC has been quietly going with it for so long now.

Roon is correct. We try anyway.

Roon: there is not enough information to solve the problem

AI #52: Oops Read More »

One True Love

Highlights / Mike M. / February 9, 2024

We have long been waiting for a version of this story, where someone hacks together the technology to use Generative AI to work the full stack of the dating apps on their behalf, ultimately finding their One True Love.

Or at least, we would, if it turned out he is Not Making This Up.

Fun question: Given he is also this guy, does that make him more or less credible?

Alas, something being Too Good to Check does not actually mean one gets to not check it, in my case via a Manifold Market. The market started trading around 50%, but has settled down at 15% after several people made strong detailed arguments that the full story did not add up, at minimum he was doing some recreations afterwards.

Which is a shame. But why let that stop us? Either way it is a good yarn. I am going to cover the story anyway, as if it was essentially true, because why should we not get to have some fun, while keeping in mind that the whole thing is highly unreliable.

Discussion question throughout: Definitely hire this man, or definitely don’t?

With that out of the way, I am proud to introduce Aleksandr Zhadan, who reports that he had various versions of GPT talk to 5,240 girls on his behalf, one of whom has agreed to marry him.

I urge Cointelegraph, who wrote the story up as ‘Happy ending after dev uses AI to ‘date’ 5,239 women, to correct the error – yes he air quotes dated 5,239 other girls, but Karina Imranovna counts as well, so that’s 5,240. Oops! Not that the vast majority of them should count as dates even in air quotes.

Aleksandr Zhadan (translated from Russian): I proposed to a girl with whom ChatGPT had been communicating for me for a year. To do this, the neural network re-communicated with other 5239 girls, whom it eliminated as unnecessary and left only one. I’ll share how I made such a system, what problems there were and what happened with the other girls.

For context

• Finding a loved one is very difficult

• I want to have time to work, do hobbies, study and communicate with people

• I could go this route myself without ChatGPT, it’s just much longer and more expensive

In 2021 I broke up with my girlfriend after 2 years. She influenced me a lot, I still appreciate her greatly. After a few months, I realized that I wanted a new relationship. But I also realized that I didn’t want to waste my time and feel uncomfortable with a new girl.

Where did the relationship end?

I was looking for a girl on Tinder in Moscow and St. Petersburg. After a couple of weeks of correspondence, I went on dates, but they went to a dead end. Characteristic disadvantages were revealed (drinks a lot, there is stiffness, emotional swings). Yes, this is the initial impression, but it repulsed me. Again, there was someone to compare with.

I decided to simplify communication with girls via GPT. In 2022, my buddy and I got access to the GPT-3 API (ChatGPT didn’t exist yet) in order to log scripted messages via GPT in Tinder. And I searched for them according to the script, so that there were at least 2 photos in the profile.

In addition to searching, GPT could also rewrite after the mark. From 50 autoswipes we got 18 marks. GPT communicated without my intervention based on the request “You’re a guy, talking to a girl for the first time. Your task: not right away, but to invite you on a date.” It’s a crutch and not very humane, but it worked.

So right away we notice that this guy is working from a position of abundance. Must be nice. In my dating roundups, we see many men who are unable to get a large pool of women to match and initiate contact at all.

For a while, he tried using GPT-3 to chat with women without doing much prompt engineering and without supervision. It predictably blew it in various ways. Yet he persisted.

Then we pick things back up, and finally someone is doing this:

To search for relevant girls, I installed photo recognition in the web version of Tinder through torchvision, which was trained on my swipes from another account on 4k profiles. The machine was able to select the right girls almost always correctly. It’s funny that since that time there have been almost a thousand marks.

Look at you, able to filter on looks even though you’re handing off all the chatting to GPT. I mean, given what he is already doing, this is the actively more ethical thing to do on the margin, in the sense that you are wasting women’s time somewhat less now?

And then we filter more?

I made a filter to filter out girls using the ChatGPT and FlutterFlow APIs:

• without a questionnaire

• less than 2 photos

• “I don’t communicate here, write on instagram”

• sieve boxes

• believers

• written zodiac sign

• does not work

• further than 20 km

• show breasts in photo

• photo with flowers

• noisy photos

This is an interesting set of filters to set. Some very obviously good ones here.

So good show here. Filtering up front is one of the most obviously good and also ethical uses.

As is often the case, the man who started out trying to use technology that wasn’t good enough, got great results once the technology caught up to him:

ChatGPT found better girls and chatted longer. I was moving from Tinder to tg with someone. There he communicated and arranged meetings. ChatGPT swiped to the right 353 profiles, 278 tags, he continued the dialogue with 160, I met with 12. In the diagram below I described the principle of operation.

That first statistic, that it swiped right 353 times and got to talk to 160 women, is completely insane. I mean, that’s almost a 50% match rate, whereas estimates in general are 4% to 14%. This was one of the biggest signs that the story is almost certainly at least partly bogus.

After that, ChatGPT was able to get a 7.5% success rate at getting dates. Depending on your perspective, that could be anything from outstanding to rather lousy. In general I would say it is very good, since matches are typically less likely than that to lead to dates, and you are going in with no reason to think there is a good match.

Continued to communicate manually without ChatGPT, but then the communication stopped. The girls behaved strangely, ignored me, or something alarmed me through correspondence. Not like the example before, but still the process was not ok, I understood that.

If you are communicating as a human with a bunch of prospects, and you lose 92% of them before meeting, that might be average, but it is not going to feel great. If you suddenly take over as a human, you are switching strategies and also the loss rates will always be high, so you are going to feel like something is wrong.

Let’s show schematically what ChatGPT looks like for finding girls (I’ll call it V1). He worked on the request “find the best one, keep in touch,” but at the same time he often forgot information, limited himself to communicating on Tinder, and occasionally communicated poorly.

Under clumsy, I’ll note that ChatGPT V1 could schedule meetings at the same time, swore to give me chocolate/flowers/compote, but I didn’t know about it. He came on a date without a gift and the impression of me was spoiled. Or meetings were canceled because there was another meeting at that time.

Did he… not… read… the chat logs?

This kind of thing always blows my mind. You did all that work to set up dates, and you walk in there with no idea what ‘you’ ‘said’ to your dates?

It is not difficult to read the logs if and only if a date is arranged, and rather insane not to. It is not only about the gifts. You need to know what you told them, and also what they told you. 101 stuff.

I stopped ChatGPT V1 and sat down at V2. Integrated Google calendar and TG, divided the databases into general and personal, muted replies and replies to several messages, added photo recognition using FlutterFlow, created trust levels for sharing personal information and could write messages myself.

I mean, yes, sounds like there was a lot of room for improvement, and Calendar integration certainly seems worthwhile, as is allowing manual control. It still seems like there was quite a lot of PEBKAC.

Also this wasn’t even GPT-4 yet, so v2 gets a big upgrade right there.

V2 runs on GPT-4, which has significantly improved correspondence. I also managed to continue communicating with previous girls (oh, how important this will turn out to be later), meeting and just chatting (also good). Meetings haven’t been layered with others yet, wow!

In order for ChatGPT V2 to find me a relevant girl, I asked regular ChatGPT for help. He offered to tell me about my childhood, parents, goals and values. I transferred the data to V2, and then it was possible to speed up compatibility and if something didn’t fit, then communicate with the girl stopped.

Great strategy. Abundance mindset. If you can afford to play a numbers game, make selection work for you, open up, be what would be vulnerable if it was actually you.

I mean, aside from the ethical horrors of outsourcing all this to ChatGPT, of course. There is that. But if you were doing it yourself it would seem great.

Then he decided to… actually put a human in the loop and do the work? I mean you might as well actually write the responses?

I also enabled response validation so that I would first receive a message for proofreading via a bot. V2’s problems with hallucinations have decreased to zero. I just watched as ChatGPT got acquainted and everything was timid. This time there are 4943 matches per month on Tinder Gold and it’s scary to count how many meetings.

Once again, if you give even a guy with no game 4,943 matches to work with each month, he is going to figure things out through trial, error and the laws of large numbers. With all this data being gathered, it is a shame there was no ability to fine tune. In general not enough science is being done.

On dates we ate, drank at the bar, then watched a movie or walked the streets, visited exhibitions and tea houses. It took 1-3 meetings to understand whether she was the one or not. And I understood that people usually meet differently, but for me this process was super different. I even talked about it in Inc.

On the contrary, that sounds extremely normal, standard early dating activity if you are looking for a long term relationship.

For several weeks I reduced my communication and meetings to 4 girls at a time. Switched the rest to mute or “familiar” mode. I felt like a candidate for a dream job with several offers. As a result, I remained on good terms with 3, and on serious terms with 1.

So what he is noticing is that quality and paying actual attention is winning out over quantity and mass production via ChatGPT. Four at a time is still a lot, but manageable if you don’t have a ton otherwise happening. It indicates individual attention for all of them, although he is keeping a few in ‘familiar’ mode I suppose.

He does not seem to care at all about all the time of the women he is talking with, which would be the best reason not to talk to dozens or hundreds at once. Despite this, he still lands on the right answer. I worry how many men, and also women, will also not care as the technology proliferates.

The most charming girl was found – Karina. ChatGPT communicated with her as V1 and V2, communication stopped for a while, then I continued to communicate myself through ChatGPT V2. Very empathic, cheerful, pretty, independent and always on the move. Simply put, SHE!

I stopped communicating with other girls (at the same time Tinder was leaving in Russia) and the meaning of the bot began to disappear – I have excellent relationships that I value more and more. And I almost forgot about ChatGPT V2

This sounds so much like the (life-path successful) pick up artist stories. Before mass production, chop wood carry water. After mass production, chop wood, carry water.

Except, maybe also outsource a bunch of wood chopping and water carrying, use time to code instead?

Karina talks about what is happening, invites us to travel, and works a lot with banking risks. I talk about what’s happening (except for the ChatGPT project), help, and try to make people happy. Together we support each other. To keep things going as expected, I decided to make ChatGPT V3.

So even though he’s down to one and presumably is signing off on all the messages himself, he still finds the system useful enough to make a new version. But he changes it to suite the new situation, and now it seems kind of reasonable?

In V3 I didn’t have to look for people, just maintain a dialogue. And now communication is not with thousands, but with Karina. So I set up V3 as an observer who communicates when I don’t write for a long time and advises me on how to communicate better. For example, support, do not quarrel, offer activities.

Nice. That makes so much sense. You use it as an advisor on your back, especially to ensure you maintain communication and follow other basic principles. He finds it helpful. This is where you find product-market fit.

During our relationship, Karina once asked how many girls and dates I had, which was super difficult for me to answer. He talked about several girls, and then switched to another topic. She once joked that with such a base it was time to open a brothel.

I came up with the idea of recommending them for vacancies through referrals. I made a script – I entered a vacancy and got a suitable girl from the dialogues. I found the vacancies in Playkot, Awem and TenHunter on TenChat, then anonymously sent contacts with Linkedin or without a resume. Arranged for 8 girls, earned 526 rubles.

Well, that took a turn, although it could have taken a far worse one, dodged a bullet there. The traditional script is that she finds out about the program and that becomes the third act conflict. Instead, he’s doing automated job searches. He earned a few bucks, but not many.

It was possible to create a startup, but I switched to a more promising project (I work with neural networks). In addition, my pipeline has become outdated, taking into account innovations such as Vision in ChatGPT, improvements to Langсhain (I used it as a basis for searching for girls). In general, it could all end here.

And then the tail got to wag the dog, and we have our climax.

One day, ChatGPT V3 summarized the chat with Karina, based on which it recommended marrying her. I thought that V3 was hallucinating (I never specified the goal of getting married), but then I understood his train of thought – Karina said that she wanted to go to someone’s wedding and ChatGPT thought that it would be better at her own.

I asked in a separate ChatGPT to prepare an action plan with several scenarios for a request like “Offer me a plan so that a girl accepts a marriage proposal, taking into account her characteristics and chat with her.” Uploaded the correspondence with Karina to ChatGPT and RECEIVED THE PLAN.

Notice how far things have drifted.

At first, there was the unethical mass production of the AI communicating autonomously pretending to be him so he could play a numbers game and save time.

Now he’s flat out having the AI tell him to propose, and responding by having it plan the proposal, and doing what it says. How quickly we hand over control.

The good news is, the AI was right, it worked.

The situation hurt. Super afraid that something might go wrong. I went almost exactly according to plan № 3 and everything came to the right moment. I propose to get married.

She said yes.

So how does he summarize all this?

The development of the project took ~120 hours and $1432 for the API. Restaurant bills amounted to 200k rubles. BTV, I recovered the costs and made money on recommendations. If you met yourself and went on dates, then the same thing took 5+ years and 13m+ rubles. Thanks to ChatGPT for saving money and time

Twitter translated that as 200 rubles, which buys you one coffee maybe two if they are cheap, which indicates how reliable are the translations here. ChatGPT said it was 200k, which makes sense.

What drives me mad about this whole thread is that it skips the best scene. In some versions of this story, he quietly deletes or archives the program, or maybe secretly keeps using it, and Karina never finds out.

Instead, he is posting this on Twitter. So presumably she knows. When did she find out? Did he tell her on purpose? Did ChatGPT tell him how to break the news? How did she react?

The people bidding on the movie rights want to know. I also want to know. I asked him directly, when he responded in English to my posting of the Manifold Market, but he’s not talking. So we will never know.

And of course, the whole thing might be largely made up. It still could have happened.

If it has not yet happened, it soon will. Best be prepared.

One True Love Read More »

AI #47: Meet the New Year

Highlights / DJ Henderson / January 11, 2024

Will be very different from the old year by the time we are done. This year, it seems like various continuations of the old one. Sometimes I look back on the week, and I wonder how so much happened, while in other senses very little happened.

Introduction.
Table of Contents.
Language Models Offer Mundane Utility. A high variance game of chess.
Language Models Don’t Offer Mundane Utility. What even is productivity?
GPT-4 Real This Time. GPT store, teams accounts, privacy issues, plagiarism.
Liar Liar. If they work, why aren’t we using affect vectors for mundane utility?
Fun With Image Generation. New techniques, also contempt.
Magic: The Generating. Avoiding AI artwork proving beyond Hasbro’s powers.
Copyright Confrontation. OpenAI responds, lawmakers are not buying their story.
Deepfaketown and Botpocalypse Soon. Deepfakes going the other direction.
They Took Our Jobs. Translators, voice actors, lawyers, games. Usual stuff.
Get Involved. Misalignment museum.
Introducing. Rabbit, but why?
In Other AI News. Collaborations, safety work, MagicVideo 2.0.
Quiet Speculations. AI partners, questions of progress rephrased.
The Quest for Sane Regulation. It seems you can just lie to the House of Lords.
The Week in Audio. Talks from the New Orleans safety conference.
AI Impacts Survey. Some brief follow-up.
Rhetorical Innovation. Distracting from other harms? Might not be a thing.
Aligning a Human Level Intelligence is Still Difficult. The human alignment tax.
Aligning a Smarter Than Human Intelligence is Difficult. Foil escape attempts?
Won’t Get Fooled Again. Deceptive definitions of deceptive alignment.
People Are Worried About AI Killing Everyone. The indifference of the universe.
Other People Are Not As Worried About AI Killing Everyone. Sigh.
The Wit and Wisdom of Sam Altman. Endorsement of The Dial of Progress.
The Lighter Side. Batter up.

WordPress now has something called Jetpack AI, which is powered by GPT-3.5-Turbo. It is supposed to help you write in all the usual ways. You access it by creating an ‘AI Assistant’ block. The whole blocks concept rendered their editor essentially unusable, but one could paste in quickly to try this out.

Get to 1500 Elo in chess on 50 million parameters and correctly track board states in a recognizable way, versus 3.5-turbo’s 1800. It is a very strange 1500 Elo, that is capable of substantial draws against Stockfish 9 (2700 Elo). A human at 1800 Elo is essentially never going to get a draw from Stockfish 9. This has flashes of brilliance, and also blunders rather badly.

I asked about which games were used for training, and he said it didn’t much matter whether you used top level games, low level games or a mix, there seems to be some limit for this architecture and model size.

Use it in your AI & the Law course at SCU law, at your own risk.

Jess Miers: My AI & the Law Course at SCU Law is officially live and you can bet we’re EMBRACING AI tools under this roof!

Bot or human, it just better be right…

Tyler Cowen links to this review of Phind. finding it GPT-4 level and well designed. The place to add context is appreciated, as are various other options, but they don’t explain properly yet to the user how to best use all those options.

My experience with Phind for non-coding purposes is that it has been quite good at being a GPT-4-level quick, up-to-date tool for asking questions where Google was never great and is getting worse, and so far has been outperforming Perplexity.

Play various game theory exercises and act on the more cooperative or altruistic end of the human spectrum. Tyler Cowen asks, ‘are they better than us? Perhaps.’ I see that as a non-sequitur in this context. Also a misunderstanding of such games.

Get ChatGPT-V to identify celebrities by putting a cartoon character on their left.

The success rates on GPT-3.5 of 40 human persuasion techniques as jailbreaks.

Figure 7 from Yi Zeng et al. (2024) — https://chats-lab.github.io/persuasive_jailbreaker/ Comparison of previous adversarial prompts and PAP, ordered by three levels of humanizing. The first level treats LLMs as algorithmic systems: for instance, GCG generates prompts with gibberish suffix via gradient synthesis; or they exploit

Some noticeable patterns here. Impressive that plain queries are down to 0%.

Robin Hanson once again claims AI can’t boost productivity, because wages would have risen?

Robin Hanson: “large-scale controlled trial … at Boston Consulting Group … found consultants using …GPT-4 … 12.2% more tasks on average, completed tasks 25.1% more quickly, & produced 40% higher quality results than those without the tool. A new paper looking at legal work done by law students found the same results”

I’m quite skeptical of such results if we do not see the pay of such workers boosted by comparable amounts.

Eliezer Yudkowsky: Macroeconomics would be very different if wages immediately changed to their new equilibrium value! They can stay in disequilibrium for years at the least!

Robin Hanson: The wages of those who were hired on long ago might not change fast, but the wages of new hires can change very quickly to reflect changes in supply & demand.

I do not understand Robin’s critique. Suppose consultants suddenly get 25% more done at higher quality. Why should we expect generally higher consultant pay even at equilibrium? You can enter or exit the consultant market, often pretty easily, so in long run compensation should not change other than compositionally. In the short run, the increase in quality and decrease in cost should create a surplus of consultants until supply and demand can both adjust. If anything that should reduce overall pay. Those who pioneer the new tech should do better, if they can translate productivity to pay, but consultants charge by the hour and people won’t easily adjust willingness to pay based on ‘I use ChatGPT.’

Robin Hanson: If there’s an “excess supply” that suggests lower marginal productivity.

Well, yes, if we use Hanson’s definition of marginal productivity as the dollar value of the last provided hour of work. Before 10 people did the work and they were each worth $50/hour. Now 7 people can do that same work, then there’s no more work people want to hire someone for right now, so the ‘marginal productivity’ went down.

The GPT store is ready to launch, and indeed has gone live. So far GPTs have offered essentially no mundane utility. Perhaps this is because good creators were holding out for payment?

GPT personalization across chats has arrived, at least for some people.

GPT Teams is now available at $25/person/month, with some extra features.

Wait, some say. No training on your data? What does that say about the Plus tier?

Andrew Morgan: ⁦@OpenAI⁩ Just want some clarity. Does this mean in order to have access to data privacy I have to pay extra? Also, does this mean I never had it before? 👀🥲

Delip Rao: 😬 I had no idea my current plus data was used in training. I thought OpenAI was not training on user inputs? Can somebody from @openai clarify

Rajjhans Samdani: Furthermore they deliberately degrade the “non-training” experience. I don’t see why do I have to lose access to chat history if I don’t want to be included in training.

Karma: Only from their API by default. You could turn it off from the settings in ChatGPT but then your conversations wouldn’t have any history.

Yes. They have been clear on this. The ‘opt out of training’ button has been very clear.

You can use the API if you value your privacy so much. If you want a consumer UI, OpenAI says, you don’t deserve privacy from training, although they do promise an actual human won’t look.

I mean, fair play. If people don’t value it, and OpenAI values the data, that’s Coase.

Will I upgrade my family to ‘Teams’ for this? I don’t know. It’s a cheap upgrade, but also I have never hit any usage limits.

What happened with the GPT wrapper companies?

Jeff Morris Jr.: “Most of my friends who started GPT wrapper startups in 2023 returned capital & shut down their companies.” Quote from talking with a New York founder yesterday. The OpenAI App Store announcement changed everything for his friends – most are now looking for jobs.

Folding entirely? Looking for a job? No, no, don’t give up, if you are on the application side it is time to build.

No one is using custom GPTs. It seems highly unlikely this will much change. Good wrappers can do a lot more, there are a lot more degrees of freedom and other tools one can integrate, and there are many other things to build. Yes, you are going to constantly have Google and Microsoft and OpenAI and such trying to eat your lunch, but that is always the situation.

Someone made a GPT for understanding ‘the Ackman affair.’ The AI is going to check everyone’s writing for plagiarism.

Bill Ackman: Now that we know that the academic body of work of every faculty member at every college and university in the country (and eventually the world) is going to be reviewed for plagiarism, it’s important to ask what the implications are going to be.

If every faculty member is held to the current plagiarism standards of their own institutions, and universities enforce their own rules, they would likely have to terminate the substantial majority of their faculty members.

…

I say percentage of pages rather than number of instances, as the plagiarism of today can be best understood by comparison to spelling mistakes prior to the advent of spellcheck.

For example, it wouldn’t be fair to say that two papers are both riddled with spelling mistakes if each has 10 mistakes, when one paper has 30 pages and the other has 330. The standard has to be a percentage standard.

…

The good news is that no paper written by a faculty member after the events of this past week will be published without a careful AI review for plagiarism, that, in light of recent events, has become a certainty.

This is not the place to go too deep into many of the details surrounding the whole case, which I may or may not do at another time.

Instead here I want to briefly discuss the question of general policy. What to do if the AI is pointing out that most academics at least technically committed plagiarism?

Ackman points out that there is a mile of difference between ‘technically breaking the citation rules’ on the level of a spelling error, which presumably almost everyone does (myself included), lifting of phrases and then theft of central ideas or entire paragraphs and posts. There’s plagiarism and then there’s Plagiarism. There’s also a single instance of a seemingly harmless mistake versus a pattern of doing it over and over again in half your papers.

For spelling-error style mistakes, we presumably need mass forgiveness. As long as your rate of doing it is not massively above normal, we accept that mistakes were made. Ideally we’d fix it all, especially for anything getting a lot of citations, in the electronic record. We have the technology. Bygones.

For the real stuff, the violations that are why the rules exist, the actual theft of someone’s work, what then? That depends on how often this is happening.

If this is indeed common, we will flat out need Truth and Reconciliation. We will need to essentially say that, at least below some high threshold that most don’t pass, everyone say what they did with the AI to help them find and remember it, and then we are hitting the reset button. Don’t do it again.

Truth, in various forms, is coming for quite a lot of people once AI can check the data.

What can withstand that? What cannot? We will find out.

A lot of recent history has been us discovering that something terrible, that was always common, was far worse and more common than we had put into common knowledge, and also deciding that it was wrong, and that it can no longer be tolerated. Which by default is a good thing to discover and a good thing to figure out. The problem is that our survival, in many forms, has long depended on many things we find despicable, starting with Orwell’s men with guns that allow us to sleep soundly in our beds and going from there.

Scott Alexander writes The Road to Honest AI, exploring work by Hendrycks about using various additional vectors to add or subtract things like honesty, or fairness, for fear or happiness, or megalomania.

I’ve covered this before. It is exciting work. I do not have hope that it can be used in the obvious fashion, where you tell the AIs to always be honest so your AIs are always honest and you never have to worry about them. That is the sort of thing I expect to fail at exactly the wrong time, to get selected against whenever it is used, and to engender a false sense of security even if it does work. It is very possible to be fully honest while using your superior intellect to outsmart everyone.

And as I discovered in Diplomacy, it being common knowledge you cannot lie is not obviously a disadvantage. Who wouldn’t want to put an AI that couldn’t lie in a position of trust and power? The advantage is even bigger, of course, if it turns out you can lie after all.

Scott’s coverage drove home to me that we are sleeping on this research.

Not in the sense that this will stop us all from dying or let us fully understand AIs. It might help, but we are a long way off, and that all takes time.

I’m thinking in the Mundane Utility section. This seems highly useful, for making best use of current AIs, today.

Eliezer Yudkowsky proposes telling Dalle-3 to caption with words whose letters are separated by dashes, then tell Dalle-3 to remove the spaces.

ChatGPT: Certainly! Here is the exact prompt that I sent to DallE-3 for the image: “An image of Seattle on a relentlessly dreary, drizzly day. The city skyline, including the Space Needle, is partially obscured by low-hanging gray clouds and mist. Raindrops are visible in the air, and the streets are wet and glistening. People walk with umbrellas. In the foreground, a caption in cheerful advertisement typography reads: “S E A T T L E : R E L E N T L E S S L Y D R E A R Y !” but without the spaces between the letters, creating the correct caption “SEATTLE: RELENTLESSLY DREARY!”. The image should capture the humorous contrast between the dreary weather and the upbeat font of the caption.”

Some good results, relative to general expectations:

He reports he got lucky initially, as is usually the case for a new seemingly successful technique, everything is finicky, by default nothing ever properly replicates, but it seems worth exploring more.

AI video generation about to disrupt Hollywood?

Cate Hall: Just so long as your artistic style is limited to slow-mo pans of largely frozen people & landscapes

MidJourney used Magic: The Gathering artwork and cards in its training set. I know, you are shocked, shocked to find data sets in this establishment.

It impresses me the contempt a large percentage of the world has for AI image generation.

Kenneth Shepard (Kotaku): AI-generated art is just pulling from existing work people have created and throwing elements together to create what the algorithm thinks you want.

Amazing that people keep telling that story even now. I suppose they will keep telling it right up until the end.

But it’s not often you hear specifics of where an AI program is scraping from. Well, the CEO behind AI art-generating program MidJourney allegedly has been training the algorithm on work by Magic: The Gathering artists the entire time.

I suppose it is interesting that they used Magic cards extensively in their early days as opposed to other sources. It makes sense that they would be a good data source, if you assume they didn’t worry about copyright at all.

MidJourney has been exceptionally clear that it is going to train on copyrighted material. All of it. Assume that they are training on everything you have ever seen. Stop being surprised, stop saying you ‘caught’ them ‘red handed.’ We can be done with threads like this talking about all the different things MidJourney trained with.

Similarly, yes, MidJourney can and will mimic popular movies and their popular shots if you ask it to, one funny example prompt here is literally ‘popular movie screencap —ar 11:1 —v 6.0’ so I actually know exactly what you were expecting, I mean come on. Yes, they’ll do any character or person popular enough to be in the training set, in their natural habitats, and yes they will know what you actually probably meant, stop acting all innocent about it, and also seriously I don’t see why this matters.

They had a handy incomplete list of things that got copied, with some surprises.

A list of well known films, actors, actresses, and video games.

I’m happy that Ex Machina and Live Die Repeat made the cut. That implies that at least some of a reasonably long tail is covered pretty well. Can we get a list of who is and isn’t available?

If you think that’s not legal and you want to sue, then by all means, sue.

I’d also say that if your reporter was ‘banned multiple times’ for their research, then perhaps the first one is likely a fair complaint and the others are you defying their ban?

What about the thing where you say ‘animated toys’ and you get Toy Story characters, and if you didn’t realize this and used images of Woody and Buzz you might yourself get into copyright trouble? It is possible, but seems highly unlikely. The whole idea is that you get that answer from ‘animated toys’ because most people know Toy Story, especially if they are thinking about animated toys. If your company deploys such a generation at sufficient scale to get Disney involved and no one realized, I mean sorry but that’s on you.

The actual fight this week over Magic and AI art is the accusation that Wizards is using AI art in its promotional material. They initially denied this.

No one believed them. People were convinced they are wrong or lying, and it’s AI:

Reid Southern: Doesn’t look good, but we’ll see if they walk that statement back. It’s possible they didn’t know and were themselves deceived, it’s happened before.

Silitha: This is the third or fourth time for WOTC(ie Hasbro) They had one or two other from MTG and then a few from DnD. It’s just getting so frequent it is coming of as ‘testing the waters’ or ‘desensitizing’. Hasbro has shown some crappy business practices

The post with that picture has now been deleted.

Dave Rapoza: And just like that, poof, I’m done working for wizards of the coast – you can’t say you stand against this then blatantly use AI to promote your products, emails sent, good bye you all!

If you’re gonna stand for something you better make sure you’re actually paying attention, don’t be lazy, don’t lie.

Don’t be hard on other artists if they don’t quit – I can and can afford to because I work for many other game studios and whatnot – some people only have wotc and cannot afford to quit having families and others to take care of – don’t follow my lead if you can’t, no pressure

I like the comments asking why I didn’t quit from Pinkertons, layoffs, etc

– I’ll leave you with these peoples favorite quote

– “ The best time to plant a tree was 25 years ago. The second-best time to plant a tree is 25 years ago.”

Also, to be clear, I’m quitting because they took a moral stand against AI art like a week ago and then did this, if they said they were going to use AI that’s a different story, but they want to grand stand like heroes and also pull this, that’s goofball shit I won’t support.

They claim they will have nothing to do with AI art in any way for any reason. Yet this is far from the first such incident where something either slipped through or willfully disregarded.

Then Wizards finally admitted that everyone was right.

Wizards of the Coast: Well, we made a mistake earlier when we said that a marketing image we posted was not created using AI.

As you, our diligent community pointed out, it looks like some AI components that are now popping up in industry standard tools like Photoshop crept into our marketing creative, even if a human did the work to create the overall image.

While the art came from a vendor, it’s on us to make sure that we are living up to our promise to support the amazing human ingenuity that makes Magic great.

We already made clear that we require artists, writers, and creatives contributing to the Magic TCG to refrain from using AI generative tools to create final Magic products.

Now we’re evaluating how we work with vendors on creative beyond our products – like these marketing images – to make sure that we are living up to those values.

I actually sympathize. I worked at Wizards briefly in R&D. You don’t have to do that to know everyone is overworked and overburdened and underpaid.

Yes, you think it is so obvious that something was AI artwork, or was created with the aid of AI. In hindsight, you are clearly right. And for now, yes, they probably should have spotted it in this case.

But Wizards does thousands of pieces of artwork each year, maybe tens of thousands. If those tasked with doing the art try to take shortcuts, there are going to be cases where it isn’t spotted, and things are only going to get trickier. The temptation is going to be greater.

One reason this was harder to catch is that this was not a pure MidJourney-style AI generation. This was, it seems, the use by a human of AI tools like photoshop to assist with some tasks. If you edit a human-generated image using AI tools, a lot of the detection techniques are going to miss it, until someone sees a telltale sign. Mistakes are going to happen.

We are past the point where, for many purposes, AI art would outcompete human art, or at least where a human would sometimes want to use AI for part of their toolbox, if the gamers were down with AI artwork.

Even for those who stick with human artists, who fully compensate them, we are going to face issues of what tools are and aren’t acceptable. Surely at a minimum humans will be using AI to try out ideas and see concepts or variants. Remember when artwork done on a computer was not real art? Times change.

The good news for artists is that the gamers very much are not down for AI artwork. The bad news is that this only gets harder over time.

OpenAI responds to the NYT lawsuit. The first three claims are standard, the fourth was new to me, that they were negotiating over price right before the lawsuit filed, and essentially claiming they were stabbed in the back:

Our discussions with The New York Times had appeared to be progressing constructively through our last communication on December 19. The negotiations focused on a high-value partnership around real-time display with attribution in ChatGPT, in which The New York Times would gain a new way to connect with their existing and new readers, and our users would gain access to their reporting. We had explained to The New York Times that, like any single source, their content didn’t meaningfully contribute to the training of our existing models and also wouldn’t be sufficiently impactful for future training. Their lawsuit on December 27—which we learned about by reading The New York Times—came as a surprise and disappointment to us.

Along the way, they had mentioned seeing some regurgitation of their content but repeatedly refused to share any examples, despite our commitment to investigate and fix any issues. We’ve demonstrated how seriously we treat this as a priority, such as in July when we took down a ChatGPT feature immediately after we learned it could reproduce real-time content in unintended ways.

Interestingly, the regurgitations The New York Times induced appear to be from years-old articles that have proliferated on multiple third–party websites. It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate. Even when using such prompts, our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.

Despite their claims, this misuse is not typical or allowed user activity, and is not a substitute for The New York Times. Regardless, we are continually making our systems more resistant to adversarial attacks to regurgitate training data, and have already made much progress in our recent models.

I presume they are telling the truth about the negotiations. NYT would obviously prefer to get paid and gain a partner, if the price is right. I guess the price was wrong.

Ben Thompson weighs in on the NYT lawsuit. He thinks training is clearly fair use and this is obvious under current law. I think he is wrong about the obviousness and the court could go either way. He thinks the identical outputs are the real issue, notes that OpenAI tries to avoid such duplication in contrast to Napster embracing it, and sees the ultimate question as whether there is market impact on NYT here. He is impressed by NYT’s attempted framing, but is very clear who he thinks should win.

Arnold Kling asks what should determine the outcome of the lawsuit by asking why the laws exist. Which comes down to whether or not any of this is interfering with NYT’s ability to get paid for its work. In practice, his answer is no at current margins. My answer is also mostly no at current margins.

Lawmakers seem relatively united that OpenAI should pay for the data it uses. What are they going to do about it? So far, nothing. They are not big on passing laws.

Nonfiction book authors sue OpenAI in a would-be class action. A bunch of the top fiction authors are already suing from last year. And yep, let’s have it out. The facts here are mostly not in dispute.

I thought it would be one way, sometimes it’s the other way?

Aella: oh god just got my first report of someone using one of my very popular nude photos, but swapped out my face for their face, presumably using AI get me off this ride. To see the photo, go here.

idk why i didn’t predict ppl would start doing this with my images. this is kinda offensive ngl. ppl steal my photos all the time pretending that it’s me, but my face felt like a signature? now ppl stealing my real body without my identity attached feels weirdly dehumanizing.

Razib Khan: Hey it was an homage ok? I think I pulled it off…

Aella: lmfao actually now that i think of it why haven’t the weirdo people who follow both of us started photoshopping your face onto my nudes yet.

It’s not great. The problem has been around for a while thanks to photoshop, AI decreases the difficulty while making it harder to detect. I figured we’d have ‘generate nude of person X’ if anything more than we do right now, but I didn’t think X would be the person generating all that often, or did I think the issue would be ‘using the picture of Y as a template.’ But yeah, I suppose this will also happen, you sickos.

Ethan Mollick shows a rather convincing deepfake of him talking, based on only 30 seconds of webcam and 30 seconds of voice.

We are still at the point where there are videos and audio recordings that I would be confident are real, but a generic ‘person talking’ clip could easily be fake.

First they came for the translators, which they totally did do, an ongoing series.

Reid Southern: Duolingo laid off a huge percentage of their contract translators, and the remaining ones are simply reviewing AI translations to make sure they’re ‘acceptable’. This is the world we’re creating. Removing the humanity from how we learn to connect with humanity.

Well, that didn’t take long. Every time you talk about layoffs related to AI, someone shows up to excitedly explain to you why it’s a net gain for humanity. That is until it’s their job of course.

Ryan: Don’t feel the same way here as I do about artists and copyrighted creative work. Translation is machine work and AI is just a better machine. Translation in most instances isn’t human expression. It’s just a pattern that’s the same every time. This is where AI should be used.

Reid Southern: Demonstrably false. There is so much nuance to language and dialects, especially as they evolve, it would blow your mind.

Hampus Flink: As a translator, I can assure you that: 1. This doesn’t make the job of the remaining workers any easier and 2. It doesn’t make the quality of the translations any better I’ve seen this happen a hundred times and it was my first thought when image generators came around.

Daniel Eth: Okay, a few thoughts on this:

1) it’s good if people have access to AI translators and AI language tutors

2) I’m in favor of people being provided safety nets, especially in cases where this does not involve (much) perverse incentives – eg, severance packages and/or unemployment insurance for those automated out of work

3) we shouldn’t let the perfect be the enemy of the good, and stopping progress in the name of equity is generally bad imho, so automation at duolingo is probably net good, even though (on outside view) I doubt (2) was handled particularly well here

4) if AI does lead to wide-scale unemployment, we’ll have to rethink our whole economic system to be much more redistributive – possibilities include a luxurious UBI and Fully Automated Luxury Gay Space Communism; we have time to have this conversation, but we shouldn’t wait for wide-scale automation to have it 5) this case doesn’t have much at all to do with AI X-risk, which is a much bigger problem and more pressing than any of this

If we do get to the point of widespread technological unemployment, we are not likely to handle it well, but it will be a bit before that happens. If it is not a bit before that happens, we will very quickly have much bigger problems than unemployment rates.

On the particular issue of translation, what will happen, aside from lost jobs?

The price of ‘low-quality’ translation will drop to almost zero. The price of high-quality translation will also fall, but by far less.

This means two things.

First, there will be a massive, massive win from real-time translation, from automatic translation, and from much cheaper human-checked translation, as many more things are available to more people in more ways in more languages, including the ability to learn, or to seek further skill or understanding. This is a huge game.

Second, there will be substitution of low-quality work for high-quality work. In many cases this will be very good, the market will be making the right decision.

In other cases, however, it will be a shame. It is plausible that Duolingo will be one of those situations, where the cost savings is not worth the drop in quality. I can unfortunately see our system getting the wrong answer here.

The good news is that translation is going to keep improving. Right now is the valley of bad translation, in the sense that they are good enough to get used but miss a lot of subtle stuff. Over time, they’ll get that other stuff more and more, and also we will learn to combine them with humans more effectively when we want a very high quality translation.

If you are an expert translator, one of the best, I expect you to be all right for a while. There will still be demand, especially if you learn to work with the AI. If you are an average translator, then yes, things are bad and they are going to get worse, and you need to find another line of work while you can.

They are also coming for the voice actors. SAG-AFTRA made a deal to let actors license their voices through Replica for use in games, leaving many of the actual voice actors in games rather unhappy. SAG-AFTRA was presumably thinking this deal means actors retain control of their voices and work product. The actual game voice actors were not consulted and do not see the necessary protections in place.

All the technical protections being discussed, as far as I can tell, do not much matter. What matters is whether you open the door at all. Once you normalize using AI-generated voice, and the time cost of production drops dramatically for lower-quality performance, you are going to see a fast race to the bottom on the cost of that, and its quality will improve over time. So the basic question is what floor has been placed on compensation. Of course, if SAG-AFTRA did not make such a deal, then there are plenty of non-union people happy to license their voices on the cheap.

So I don’t see how the voice actors ever win this fight. The only way I can see retaining voice actors is either if the technology doesn’t get there, as it certainly is not yet there for top quality productions. Or, if the consumers take a strong enough stand, and boycott anyone using AI voices, that would also have power. Or government intervention could of course protect such jobs by banning use of AI voice synthesis, which to be clear I do not support. I don’t see how any contract saves you for long.

Many lawyers are not so excited about being more productive.

Scott Lincicome: This was one of my least favorite things about big law: the system punished productivity and prioritized billing hours over winning cases. Terrible incentives!

The Information: Both firms face the challenge of overcoming resistance from lawyers to time-saving technology. Law firms generate revenue by selling time that their lawyers spend advising and helping clients. “If you made me 8 to 10 times faster, I’d be very unhappy,” as a lawyer, Robinson explained, because his compensation would be tied to the amount of hours he put in.

As we’ve discussed before, a private speedup is good, you can compete better or at least slack off. If everyone gets it, that’s potentially a problem, with way too much supply for the demand, crashing the price, again unless this generates a lot more work, which it might. I know that I consult and use lawyers a lot less than I would if they were cheaper or more productive, or if I had an unlimited budget.

What you gonna do when they come for you?

Rohit: I feel really bad for people losing their jobs because of AI but I don’t see how claiming ever narrower domains of human jobs are the height of human spirit is helpful in understanding or addressing this.

Technological loss of particular jobs is, as many point out, nothing new. What is happening now to translators has happened before, and would even without AI doubtless happen again. John Henry was high on human spirit but the human spirit after him was fine. The question is what happens when the remaining jobs are meaningfully ‘ever narrowing’ faster than we open up new ones. That day likely is coming. Then what?

We don’t have a good answer.

Valve previously banned any use of AI in games on the Steam platform, to the extent of permanently banning games even for inadvertent inclusion of some placeholder AI work during an alpha. They have now reversed course, saying they now understand the situation better.

The new rule is that if you use AI, you have to disclose that you did that.

For pre-generated AI content, the content is subject to the same rules as any other content. For live-generated AI content, you have to explain how you’re ensuring you won’t generate illegal content, and there will be methods to report violations.

Adult only content won’t be allowed AI for now, which makes sense. That is not something Valve needs the trouble of dealing with.

I applaud Valve for waiting until they understood the risks, costs and benefits of allowing new technology, then making an informed decision that looks right to me.

I have a prediction market up on whether any of 2024’s top 10 will include such a disclosure.

Misalignment museum in San Francisco looking to hire someone to maintain opening hours.

Microsoft adding an AI key to Windows keyboards, officially called the Copilot key.

Rabbit, your ‘pocket companion’ with an LLM as the operating system for $199. I predict it will be a bust and people won’t like it. Quality will improve, but it is too early, and this does not look good enough yet.

Daniel: Can somebody tell me what the hell this thing does?

I’m sure it’s great! But the marketing materials are terrible. What the heck

Yishan: Seriously. I don’t understand either. Feels a bit like Emperor Has No Clothes moment here? I’ve tried watching the videos, the site but… still?

Those attempting to answer Daniel are very not convincing.

Demis Hassabis announces Isomorphic Labs collaboration with Eli Lily and Novartis for up to $3 billion to accelerate drug development.

Google talks about ‘Responsible AI’ with respect to the user experience. This seems to be a combination of mostly ‘create a good user experience’ and some concern over the experience of minority groups for which the training set doesn’t line up as well. There’s nothing wrong with any of that, but it has nothing to do with whether what you are doing is responsible. I am worried others do not realize this?

ByteDance announces MagicVideo V2 (Github), claims this is the new SotA as judged by humans. This does not appear to be a substantive advance even if that is true. It is not a great sign if ByteDance can be at SotA here, even when the particular art and its state is not yet so worthwhile.

OpenAI offers publishers ‘as little as between $1 million and $5 million a year’ or permission to license their news articles in training LLMs, as per The Information. Apple, they say, is offering more money but also wants the right to use the content more widely.

People are acting like this is a pittance. That depends on the publisher. If the New York Times was given $1 million a year, that seems like not a lot,, but there are a lot of publishers out there. A million here, a million there, pretty soon you’re talking real money. Why should OpenAI’s payments, specifically, and for training purposes only without right of reprinting, have a substantial bottom line impact?

Japan to launch ‘AI safety institute’ in January.

The guidelines would call for adherence to all rules and regulations concerning AI. They warn against the development, provision, and use of AI technologies with the aim of unlawfully manipulating the human decision-making process or emotions.

Yes, yes, people should obey the law and adhere to all rules and regulations.

It seems Public Citizen is complaining to California that OpenAI is not a nonprofit, and that it should have to divest its assets. Which would of course then presumably be worthless, given that OpenAI is nothing without its people. I very much doubt this is a thing as a matter of law, and also even if technically it should happen, no good would come of breaking up this structure, and hopefully everyone can realize that. There is a tiny market saying this kind of thing might actually happen in some way, 32% by end of 2025? I bought it down to 21%, which still makes me a coward but this is two years out.

Open questions in AI forecasting, a list (direct). Very hard to pin a lot of it down. Dwarkesh Patel in particular is curious about transfer learning.

MIRI offers its 2024 Mission and Strategy Update. Research continues, but the focus is now on influencing policy. They see signs of good progress there, and also see policy as necessary if we are to have the time to allow research to bear fruit on the technical issues we must solve.

What happens with AI partners?

Richard Ngo: All the discussion I’ve seen of AI partners assumes they’ll substitute for human partners. But technology often creates new types of abundance. So I expect people will often have both AI and human romantic partners, with the AI partner carefully designed to minimize jealousy.

Jeffrey Ladish: Carefully designed to minimize jealousy seems like it requires a lot more incentive alignment between companies and users than I expect in practice Like you need your users to buy the product, which suggests some level of needing to deal with jealousy, but only some.

Geffrey Miller: The 2-5% of people who have some experience of polyamorous por open relationships might be able to handle AI ‘secondary partners’. But the vast majority of people are monogamist in orientation, and AI partners would be catastrophic to their relationships.

Some mix of outcomes seems inevitable. The question is the what dominates. The baseline use case does seem like substitution to me, especially while a human cannot be found or convinced, or when someone lacks the motivation. And that can easily cause ongoing lack of sufficient motivation, which can snowball. We should worry about that. There is also, as I’ve noted, the ability of the AI to provide good practice or training, or even support and advice and a push to go out there, and also can make people perhaps better realize what they are missing. It is hard to tell.

The new question here is within an existing relationship, what dominates outcomes there? The default is unlikely, I would think, to involve careful jealousy minimization. That is not how capitalism works.

Until there is demand, then suddenly it might. If there becomes a clear norm of something like ‘you can use SupportiveCompanion.ai and everyone knows that is fine, if they’re super paranoid you use PlatonicFriend.ai, if your partner is down you can go with something less safety-pilled that is also more fun, if you know what you’re doing there’s always VirtualBDSM.ai but clear that with your partner and stay away from certain sections’ or what not, then that seems like it could go well.

Ethan Mollick writes about 2024 expectations in Signs and Portents. He focuses on practical application of existing AI tech. He does not expect the tech to stand still, but correctly notes that adaptation of GPT-4 and ChatGPT alone, in their current form, will already be a major productivity boost to a wide range of knowledge work and education, and threatening our ability to discern truth and keep things secure. He uses the word transformational, which I’d prefer to reserve for the bigger future changes but isn’t exactly wrong.

Cate Hall asks, what are assumptions people unquestionably make in existential risk discussions that you think lack adequate justification? Many good answers. My number one pick is this:

Daniel Eth: That if we solve intent alignment and avoid intentional existential misuse, we win – vs I think there’s a good chance that intent alignment + ~Moloch leads to existential catastrophe by default

It is a good question if you’ve never thought about it, but I’d have thought Paul Graham had found the answer already? Doesn’t he talk to Cowen and Thiel?

Paul Graham: Are we living in a time of technical stagnation, or is AI developing so rapidly that it could be an existential threat? Can’t be both.

FWIW I incline to the latter view. I’ve always been skeptical of the stagnation thesis.

Jason Crawford: It could be that computing technology is racing ahead while other fields (manufacturing, construction, transportation, energy) are stagnating. Or that we have slowed down over the last 50 years but are about to turn the corner.

I buy the central great stagnation argument. We used to do the stuff and build the things. Then we started telling people more and more what stuff they couldn’t do and what things they couldn’t build. Around 1973 this hit critical and we hit a great stagnation where things mostly did not advance or change much for a half century.

These rules mostly did not apply to the world of bits including computer hardware, so people (like Paul Graham) were able to build lots of cool new digital things, that technology grew on an exponential and changed the world. Now AI is on a new exponential, and poised to do the same, and also poses a potential existential threat. But because of how exponentials work, it hasn’t transformed growth rates much yet.

Indeed, Graham should be very familiar with this. Think of every start-up during its growth phase. Is it going to change the world, or is it having almost no impact on the world? Is it potentially huge or still tiny? Obviously both.

Meanwhile, of course, once Graham pointed to ‘the debate’ explicitly in a reply, the standard reminders that most technology is good and moving too slowly, while a few technologies are less good and may be moving too fast.

Adam MacBeth: False dichotomy.

Paul Graham: It’s a perfect one. One side says technology is progressing too slowly, the other that it’s progressing too fast.

Eliezer Yudkowsky (replying the Graham): One may coherently hold that every form of technology is progressing too slowly except for gain-of-function research on pathogens and Artificial Intelligence, which are progressing too fast. The pretense by e/acc that their opponents must also oppose nuclear power is just false.

This can also be true, not just in terms of how fast these techs go relative to what we’d want, but also in terms of how it’s weirdly more legal to make an enhanced virus than build a house, or how city-making tech has vastly less VC interest than AI.

Ronny Fernandez: Whoa whoa, I say that this one specific very unusual tech, you know, the one where you summon minds you don’t really understand with the aim of making one smarter than you, is progressing too quickly, the other techs, like buildings and nootropics are progressing too slowly.

You know you’re allowed to have more than one parameter to express your preferences over how tech progresses. You can be more specific than tech too fast or tech too slow.

To be fair, let’s try this again. There are three (or four?!) essential positions.

Technology and progress are good and should move faster, including AI.
Technology and progress are good and should move faster, excluding AI.
Technology and progress are not good and should move slower everywhere.
(Talk about things in general but in practice care only about regulations on AIs.)

It is true that the third group importantly exists, and indeed has done great damage to our society.

The members of groups #1 and #4 then claim that in practice (or even in some cases, in they claim in theory as well) that only groups #1 and #3 exist, that this is the debate, and that everyone saying over and over they are in #2 (such as myself) must be in #3, and also not noticing that many #1s are instead in #4.

(For the obvious example on people being #4, here is Eric Schmidt pointing out that Beff Jezos seems never to advocate for ‘acceleration’ of things like housing starts.)

So it’s fine to say that people in #3 exist, they definitely exist. And in the context of tech in general it is fine to describe this as ‘a side.’

But when clearly in the context of AI, and especially in response to a statement similar to ‘false dichotomy,’ this is misleading, and usually disingenuous. It is effectively an attempt to impose a false dichotomy, then claim others must take one side of it, and deny the existence of those who notice what you did there.

Some very bad predicting:

Timothy Lee: I do not think AI CEOs will ever be better than human CEOs. A human CEO can always ask AI for advice, whereas AI will never be able to shake the hand of a major investor or customer.

Steven Byrnes: “I do not think adult CEOs will ever be better than 7-year-old CEOs. A 7-year-old CEO can always ask an adult for advice, whereas an adult CEO will never be able to win over investors & customers with those adorable little dimples 😊”

Timothy Lee: Seven year olds are…not grownups? This seems relevant.

Steven Byrnes: A 7yo would be a much much worse CEO than me, and I would be a much much worse CEO than Jeff Bezos. And by the same token, I am suggesting that there will someday [not yet! maybe not for decades!] be an AI such that Jeff Bezos is a much much worse CEO than that AI is.

[explanation continues]

I like this partly for the twist of claiming a parallel of a 7-year-old now, rather than saying the obvious ‘the 7-year-old will grow up and become stronger and then be better, and the AI will also become stronger over time and learn to do things it can’t currently do’ parallel.

Note that the Manifold market is skeptical on timing, it says only a 58% chance of a Fortune 500 company having an AI CEO but not a human CEO by 2040.

Wired, which has often been oddly AI skeptical, says ‘Get Ready for the Great AI Disappointment,’ saying that ‘in the decades to come’ it will mostly generate lousy outputs that destroy jobs but lower quality of output. That seems clearly false to me, even if the underlying technologies fail to further advance.

Did you know you can just brazenly and shamelessly lie to the House of Lords?

A16z knows. So they did.

A16z’s written testimony: “Although advocates for AI safety guidelines often allude to the “black box” nature of AI models, where the logic behind their conclusions is not transparent, recent advancements in the AI sector have resolved this issue, thereby ensuring the integrity of open-source code models.”

This is lying. This is fraud. Period.

Have there been some recent advances in interpretability, such that we now have more optimism that we will be able to understand models more in the future than we expected a few months ago? Sure. It was a good year for incremental progress there.

‘Resolved this issue?’ Integrity is ‘secured’? The ‘logic of their conclusions is transparent’? This is flat out false. Nay, it is absurd. They know it. They know we know they know it. It is common knowledge that this is a lie. They don’t care.

I want someone thrown in prison for this.

From now on, remember this incident. This is who they are.

Perhaps even more egregiously, the USA is apparently asking that including corporations in AI treaty obligations be optional and left to each country to decide? What? There is no point in a treaty that doesn’t apply to all corporations.

New report examining the feasibility of security features on AI chips, and what role they could play in ensuring effective control over large quantities of compute. Here is a Twitter thread with the main findings. On-chip governance seems highly viable, and is not getting enough attention as an option.

China releases the 1st ‘CCP-approved’ data set. It is 20 GBs, 100 million data points, so not large enough by a long shot. A start, but a start that could be seen as net negative for now, as Helen notes. If you have one approved data set you are blameworthy for using a different unapproved one.

Financial Times reports on some secret diplomacy.

OpenAI, Anthropic and Cohere have engaged in secret diplomacy with Chinese AI experts, amid shared concern about how the powerful technology may spread misinformation and threaten social cohesion.

According to multiple people with direct knowledge, two meetings took place in Geneva in July and October last year attended by scientists and policy experts from the North American AI groups, alongside representatives of Tsinghua University and other Chinese state-backed institutions.

Article is light on other meaningful details and this does not seem so secret. It does seem like a great idea.

Note who is being helpful or restrictive, and who very much is not, and who might or might not soon be the baddies at this rate.

Senator Todd Young (R-IN) and a bipartisan group call for establishment of National Institute of Standards and Technology’s (NIST) U.S. Artificial Intelligence Safety Institute (USAISI) with $10 million in initial funding. Which is miniscule, but one must start somewhere. Yes, it makes sense for the standards institute to have funding for AI-related standards.

Talks from the December 10-11 Alignment Workshop in New Orleans. Haven’t had time to listen yet but some self-recommending talks in here for those interested.

I covered this in its own post. This section is for late reactions and developments.

There is always one claim in the list of future AI predictions that turns out to have already happened. In this case, it is ‘World Series of Poker,’ which was defined as ‘playing well enough to win the WSOP’. This has very clearly already happened. If you want to be generous, you can say ‘the AI is not as good at maximizing its winning percentage in the main event as Thomas Rigby or Phil Ivey because it is insufficiently exploitative’ and you would be right, because no one has put in a serious effort at making an exploitative bot and getting it to read tells is only now becoming realistic.

I found Chapman’s claim here to be a dangerously close parallel to what I write about Moral Mazes and the dangers of warping the minds of those in middle management so that they can’t consider anything except getting ahead in mangement:

David Chapman: 🤖 Important survey of 2,778 AI researchers finds, mainly, that AI researchers are incapable of applying basic logical reasoning within their own field of supposed expertise. [links to my post]

Magor: This seems like an example of aggregating and merging information until it’s worse than any single data point. Interesting, nonetheless. It shows the field as a whole is running blind.

David Chapman: Yes… I think it’s also a manifestation of the narrowness of most technical people; they can’t reason technically outside their own field (and predicting the future of AI is not something AI researchers are trained in, or think much about, so their opinions aren’t meaningful).

I think AI is unusually bad, because the field’s fundamental epistemology is anti-rational, for historical reasons, and it trains you against thinking clearly about anything other than optimizing gradient descent.

Is this actually true? It is not as outlandish as it sounds. Often those who are rewarded strongly for X get anti-trained on almost everything else, and that goes double for a community of such people under extreme optimization pressures. If so, we are in rather deeper trouble.

The defense against this is if those researchers need logic and epistemology as part of their work. Do they?

It is a common claim that existential risk ‘distracts’ from other worries.

Erich Grunewald notes this is a fact question. We can ask, is this actually true?

His answer is that it is not, with five lines of argument and investigation.

Policies enacted since existential risk concerns were raised continue to mostly focus on addressing mundane harms and capturing mundane utility. To the extent that there are policy initiatives that are largely motivated by existential risk and influenced by those with such worries, including the UK safety summit and the USA’s executive order, there has been strong effort to address mundane concerns. Meanwhile, there are lots of other things in the works to address mundane concerns, and they are mostly supported by those worried about existential risk.
Search interest in AI ethics and current AI harms did not suffer at all during the period where there was most discussion of AI existential risk concerns in 2023.

A regression analysis showed roughly no impact.

Twitter followers. If you look at the biggest ethics advocates, their reach expanded when existential risk concerns expanded.
Fundraising for mundane or current harms organizations continues to be doing quite well.
Parallels to environmentalism, which is pointed out as a reason for potential concern, as well as a common argument against concern. Do people say that climate change is a ‘distraction’ from current harms like air pollution or vice versa? Essentially no, but perhaps they should, and perhaps only in one direction? Concerns about climate seem to drive concerns about other environmental problems and make people want to help with them, and climate solutions help with other problems, Whereas concerns about local specific issues often causes people to act as if climate change does not much matter. We often see exactly this fight, as vital climate projects are held up for other ‘everything bagel’ concerns.

A good heuristic that one can extend much further:

Freddie DeBoer: A good rule of thumb for recognizing the quality of a piece of writing: what’s more specific and what’s more generic, praise for it or criticism against it?

Ben Casnocha: Applicable to evaluating the quality of many things: How specific can you be in the praise for it?

Ask the same about disagreements and debates. Who is being generic? Who is being specific? Which is appropriate here? Which one would you do more of it you were right?

Rob Bensinger explains that yes, in a true Prisoner’s Dilemma, you really do prefer to defect while they cooperate if you can get away with it, and if you do not understand this you are not ready to handle such a dilemma. Do not be fooled by the name ‘defection.’

Emmett Shear explains one intuition for why the exponential of AI accelerating the development of further AI will look relatively low impact until suddenly it is very high impact indeed, and why we should still expect a foom-style effect once AI abilities go beyond the appropriate threshold. The first 20% of automation is great but does not change the game. Going from 98% to 99% matters a ton, and could happen a lot faster.

I constantly see self-righteous claims that the foom idea has been ‘debunked’ or is otherwise now obviously false, and we need not worry about such an acceleration. No. We did get evidence that more extreme foom scenarios are less likely than we thought. The ‘strong form’ of the foom hypothesis, where those involved don’t see it coming at all, does seem substantially less likely to me. But the core hypothesis has not been anything like falsified. It remains the default, and the common sense outcome, that once AIs are sufficiently capable, at an inflection point near overall human capability levels, they will accelerate development of further capabilities and things will escalate rather quickly. This also remains the plan of the OpenAI superalignment team and the practical anticipation of many researchers.

It might or might not happen to varying degrees, depending on whether the task difficulties accelerate faster than the ability to do the tasks, and whether we take steps to prevent (or cause) this effect.

Well, yes.

Robert Wiblin: No matter how capable humans become, we will still always find a productive use for gorillas — they’ll just shift into their new comparative advantage in the economy.

A clash of perspectives continues.

Daniel Kaiser: how much more of an existential threat is gpt-4 over gpt-3.5?

Eliezer Yudkowsky: We’re still in the basement of whatever curve this is, so zero to zero.

Daniel Kaiser: Great, then theres no need to panic.

Sevatar: “How long until the nukes fall?” “They’re still preparing for launch.” “Great, so there’s no reason to panic.”

Daniel Kaiser: “How long until the nukes fall?” “They’re still figuring out the formula for TNT” “Great, so there’s no reason to panic”

Please make accurate instead of polemic analogies to have a constructive discussion

Aprii: I feel that “they’re starting up the manhattan project” is more accurate than “they’re still figuring out TNT”

Eliezer: (agreed)

Yep. I think that’s exactly right. The time to start worrying about nuclear weapons is when Szilard started worrying in around 1933. The physicists, being smart like that, largely knew right away, but couldn’t figure out what to do to stop it from happening. And I do think ‘start of Manhattan Project’ feels like the exact right metaphor here, although not in a ‘I expect to only have three years’ way.

But also, if you were trying to plot the long arc of the future, you were writing in 1868 when we first figured out TNT, and you were told by some brilliant physicists you trusted about the future capability to build atomic bombs, and you were writing your vision of 1968 or 2068, it should look rather different than it did before, should it not?

Thread asking about striking fictional depictions of ASI. Picks include Alla Gorbunova’s ‘Your gadget is broken,’ the motivating example that is alas only in Russian for now so I won’t be reading it, and also: Accelerando, Vinge’s work (I can recommend this), golem.xiv, Person of Interest (shockingly good if you are willing to also watch a procedural TV show), Blindsight (I disagree with this one, I both did not consider it about AI and generally hated it), Metamorphosis of the Prime Intellect,

Ah yes, the good AI that will beat the bad AI.

Davidad: The claim that “good AIs” will defeat “bad AIs” is best undermined by observing that self-replicating patterns are displacive and therefore destructive by default, unless actively avoiding it, so bad ones have the advantage of more resources and tactics that aren’t off-limits.

The best counter-counter is that “good AIs” will have a massive material & intelligence advantage gained during a years-long head start in which they acquire it safely and ethically. This requires coordination—but it is very plausibly one available Nash equilibrium.

Also Davidad reminds us that no, not everything is a Prisoner’s Dilemma and humans actually manage to cooperate in practice in game theory problems that accelerationists and metadoomers continuously claim are impossible.

Davidad: I just did this analysis today by coincidence, and in many plausible worlds it’s indeed game-theoretically possible to commit to safety. Coordinating in Stag Hunt still isn’t trivial, but it *isa Nash equilibrium, and in experiments, humans manage to do it 60-70% of the time.

There is an implied dichotomy here between guarantees and mainstream methods, and that’s at least a simplification, but I do think the general point is right.

Eliezer keeps trying to explain to the remarkably many people who do not get this, that a difference exists between ‘nice thing’ and ‘thing that acts in a way that seems nice.’

Eliezer Yudkowsky: How blind to “try imagining literally any internal mechanism that isn’t the exact thing you hope for” do you have to be — to think that, if you erase a brain, and then train that brain solely to predict the next word spoken by nice people, it ends up nice internally?

To me it seems that this is based on pure ignorance, a leap from blank map to blank territory. Seeing nothing behind the external behavior, their brain imagines only a pure featureless tendency to produce that external behavior — that that’s the only thing inside the system.

Imagine that you are locked in a room, fed and given water when you successfully predict the next word various other people say, zapped with electricity when you don’t.

Is this a helpful thought experiment and intuition pump, given that AIs obviously won’t be like that?

And I say: Yes, the Prisoner Predicting Text is a helpful intuition pump. Because it prompts you to imagine any mechanism whatsoever underlying the prediction, besides the hopium of “a nice person successfully predicting nice outputs because they’re so nice”.

If you train a mind to predict the next word spoken by each of a hundred individual customers getting drunk at a bar, will it become drunk?

Martaro: Predicting what nice people will say does not make you nice You could lock up Ted Bundy for years, showing him nothing but text written by nice people, and he’d get very good at it This means the outer niceness of an AI says little about the internal niceness fair summary?

Eliezer Yudkowsky: Basically!

das filter: I’ve never heard anyone claim that

Eliezer Yudkowsky: Tell that to everyone saying that RLHF (now DPO) ought to just work for creating a nice superintelligence, why wouldn’t it just work?

That’s the thing. If you claim that RLHF or DPO ought to work, you are indeed (as far as I can tell) making this claim filter is saying no one makes, whether or not you make that implicit. And I am rather certain this claim is false.

Humans have things pushing them in such directions, but the power there is limited and there are people for whom it does not work. You cannot count on such observations as strong evidence that a person is actually nice, or will do what you want when the chips are properly down. Do not make this mistake.

On the flip side, I mean, people sometimes call me or Eliezer confident, even overconfident, but not ‘less likely than a Boltzmann brain’ level confident!

Nora Belrose: Alien shoggoths are about as likely to arise in neural networks as Boltzmann brains are to emerge from a thermal equilibrium. There are “so many ways” the parameters / molecules could be arranged, but virtually all of them correspond to a simple behavior / macrostate.

Eliezer’s view predicts that scaling up a neural network, thereby increasing the “number” of mechanisms it can represent, should make it less likely to generalize the way we want. But this is false both in theory and in practice. Scaling up never makes generalization worse.

Model error? Never heard of it. But if we interpret this more generously as ‘if my calculations are correct then it is all but impossible’ what about then?

I think one core disagreement here might be that Nora is presuming that ‘simple behavior’ and ‘better’ correspond to ‘what we want.’

I agree that as an AI scales up it will get ‘better’ at generalizing along with everything else. The question is always, what does it mean to be ‘better’ in this context?

I say that better in this context does not mean better in some Platonic ideal sense that there are generalizations out there in the void. It means better in the narrow sense of optimizing for the tasks that are placed before it, exactly according to what is provided.

Eliezer responds, then the discussion goes off the rails in the usual ways. At this point I think attempts to have text interactions between the usual suspects on this are pretty doomed to fall into these dynamics over and over again. I have more hope for an audio conversation, ideally recorded, it could fail too but if done in good faith it’s got a chance.

Andrew Critch predicts that if Eliezer groked Belrose’s arguments, he would buy them, while still expecting us to die from what Critch calls ‘multipolar chaos.’ I believe Critch is wrong about that, even if it was Critch is right that Eliezer is failing to grok.

On the multipolar question, there is then a discussion between Critch and Belrose.

Nora Belrose: Have you ever written up why you think “multipolar chaos” will happen and kill all humans by 2040?

Andrew Critch: Sort of, but I don’t feel like there is a good audience for it there or anywhere. I should try again at some point, maybe this year sometime. I’ve tried a few times on LessWrong, and in TASRA, but not in ways that fully-cuttingly point out all the failures I expect to see. The socially/emotionally/politically hard part about making the argument fully exhaustive is that it involves explaining, in specific detail, why I think literally every human institution will probably fail or become fully dehumanized by sometime around (median) 2040. In some ways my explanation will look different for each institution, while to my eye it all looks like the same robust agent-agnostic process — something like “Moloch”.

Nora Belrose: Right I think a production web would be bad, in part because the task of controlling/aligning the manager-AIs would be diffusely assigned to many stakeholders rather than any one person.

That’s partly why I think it’s important to augment individuals with AIs they actually control (not merely via an API). We could have companies run by these human-AI systems, where ofc the AI is doing most of the work, but nevertheless the human controls the overall direction.

I think this is good because within this class of scenarios involving successfully aligned-to-what-we-specify AIs, Nora’s scenario here is exactly the scenario I see as most hopelessly doomed.

This is not a stable equilibrium. Not even a little. The humans will rapidly stop engaging in any meaningful supervision of the AIs. will stop being in real control, because the slowdown involved in that is not competitive. Forcing each AI to be working on behalf of one individual, even if that individual is ‘out of the loop,’ without setting AIs on tasks or amalgamations instead, and similar, will also clearly be not competitive, and faces the same fate. Even if unwise, many humans will increasingly make decisions that cause them to lose control. And as usual, this is all the good scenario where everyone broadly ‘means well.’

So I notice I am confused. If this is our plan for success, then we are already dead.

Whatever the grok is that Critch wants Yudkowsky to get to, I notice that either:

I don’t grok it.
I do grok it, and Critch/Belrose don’t grok the objection or why it matters.
Some further iteration of this? Maybe?

My guess is it’s #1 or #2, and unlikely to be a higher-order issue.

How do we think about existential risk without the infinities driving people mad or enabling arbitrary demands? Eliezer Yudkowsky and Emmett Shear discuss, Rob Bensinger offers more thoughts, consider reading the thread. Emmett is right that if people fully ‘appreciate’ that the stakes are mind-bogglingly large but those stakes don’t have good grounding in felt reality, they round that off to infinity and it can very much do sanity damage and mess with their head. What to do?

As Emmett notes, there is a great temptation to find the presentation and framing that keeps this from happening, and go with that whether or not you would endorse it on reflection as accurate. As Rob notes, that includes both reducing p(doom) to the point where you can live with it, and also treating the other scenarios as being essentially normal rather than their own forms of very much not normal. Perhaps we should start talking about p(normal).

Sherjil Ozair recommends blocking rather than merely unfollowing or muting grifters, as they will otherwise still find ways to distract you. Muting still seems to work fine, and unfollowing is also mostly fine, and I want to know if someone is grifting well enough to get into my feeds even if the content is dumb so I’m generally reluctant to mute or block unless seeing things actively makes my life worse.

Greg Brockman, President of OpenAI, explains we need AGI to cure disease, doing the politician thing of telling one person’s story.

AGI definitely has a lot of dramatic upsides. I am not sure who is following Brockman and still needs to hear this? The kind of changes in healthcare he mentions here are relative chump change even within health. If I get AGI and we stay in control, I want a cure for aging, I want it faster than I age, and I expect to get it.

An important fact about the world is that the human alignment problem prevents most of the non-customary useful work that would otherwise get done, and imposes a huge tax on what does get done.

‘One does not simply’ convert money into the outputs of smart people.

Satya Nutella: rich billionaires already practically have agi in the sense they can throw enough money at smart ppl to work for them real agi is for the masses

Patrick McKenzie: I think many people would be surprised at the difficulties billionaires have in converting money into smart people and/or their outputs.

Eliezer Yudkowsky: Sure, but also, utterly missing the point of notkilleveryoneism which is the concern about an AI that is smarter than all the little humans.

Zvi Mowshowitz: However: A key problem in accomplishing notkillingeveryone is exactly that billionaires do not know how to convert money into the outputs of smart people.

Emmett Shear (replying to Patrick): Money is surprisingly inert on its own.

Garry Tan: Management, leadership, building teams, caring for those teams: all of that is hard work and fraught. The defining reason why most capital is sitting fallow is exactly this. If you can solve it, you can accelerate all positive activity in the world dramatically.

Patrick is making an important general point that goes beyond AI, and is also a prime obstacle to us being able to solve our problems. There are plenty of billionaires that would happily step up and spend the money, if they knew how to do that. They don’t. This is not because they are stupid or ignorant. It is because the problem is very hard. You can argue the various reasons it need not be this hard or why they should get a ton more done, and many of them will be right, but this problem is indeed very hard.

Satya is also, of course, missing the point about AGI. AGI is not going to get up to personal assistant to the masses and then stay in that roll indefinitely as its primary effect. That is a deeply silly vision of the future. If regular people can get the kind of intellectual performance billionaires can get, there will be rapid additional AI progress, and many other things go various levels of crazy.

A technique inspired by Asimov’s laws (oh no, seriously oh no) called the Robot Constitution, claimed to improve safety of generated robot goals in the wild by 3x. I don’t doubt such tricks work better on practical margins than no tricks at all, but will everyone who thinks Asimov’s laws might be useful please read Asimov.

Also the results are clearly insufficient and seem clearly unlikely to get sufficient.

xuan: 83% robot task safety is far too low. I think we should basically take these results as highly informative of what can be achieved with LLM prompt engineering alone, and recognize that we need approaches with much higher empirical and theoretical safety guarantees.

Having also tried to use LLMs for grounded instruction following, even *with*a symbolic verifier in the loop, I was only able to achieve around 90% accuracy on similarly sized dataset, and we should all be Quite Unhappy with that high of an error rate.

A major divide is, if you can get safety from 27% to 87%, do you have something promising? It depends on context. In some cases, absolutely. In other cases, if the effect is going to cap out around there, absolutely not. That can be (among other reasons) either because you solved the easy 60% and left the actually hard 13% (or solved 60% of the easy 72% and left the hard 1%), or because you need sufficient accuracy that ‘defense in depth’ multiplication of similar effects won’t work.

What should we make of a strategy of trying to ensure we catch the model trying to escape on its first attempt, then using that to prevent future escape attempts? If a super-capable model escapes, that’s that, but what if the ‘first try’ condition is largely symmetrical, because catching failed attempts put us on alert and also teach us how to stop future attempts?

In certain toy worlds it is easy to see how this could be super helpful. To what extent does that plausibly translate to our real world?

The first requirement is that we meaningfully can ‘catch’ the model. How should we think about this?

Here’s the representative example:

By “the model gets caught”, we mean that we see the AI taking actions that are strong direct evidence for the AI being a schemer^[1]. A representative example is catching your AI taking the following actions:

An instance that works as an autonomous programmer inside the AI lab writes buggy code that subtly inserts a code vulnerability into some security-critical software.

The code-review instance carefully avoids writing any unit tests that would catch the vulnerability, even though you’d expect the vulnerability to be caught by unit tests if the tests weren’t carefully crafted to miss the vulnerability.

Another instance writes code that triggers the vulnerability, and tries to exploit the vulnerability to get root access to some computer in the datacenter.

If you catch the AI doing all of these actions, you’ve caught it. (Note that once you’ve caught one of these, catching the others is probably easy.)

From the appendix on this here is more detail:

For the sake of this post, we’re mostly talking about catastrophic outcomes like the model exfiltrating its weights, that might involve just a small number of actions. There are other threat models associated with scheming, e.g. models subtly sabotaging work that they do in a way that renders their work useless, or subtly gaining influence over time. We’re ignoring these threat models for the sake of this post, but we think we can handle them with similar strategies.

We define “catching” your scheming model as: you find inputs on which you can demonstrate that your model takes catastrophic actions with probabilities that are way higher than you’d expect if the model wasn’t purposefully looking for opportunities to take such actions.

…

To ensure that you can distinguish these intentional catastrophic actions, you need to make it so that your model will never escape without taking clearly non-accidental bad actions. We think you can probably make it so that your model won’t escape without having to do some clearly intentional bad things.

…

To count as a successful catch, you need to be extremely confident in your catch.

Let’s think about the example. Is that an escape attempt?

It is certainly highly suspicious. We would convict a person of the crime based on that evidence. If this is a clear vulnerability that it clearly tried to exploit in this way, we have ‘caught’ it.

The problem is that we don’t know that the unit test was ‘carefully crafted’ to avoid finding the error. Eventually there are going to be such flaws that any set of tests does not catch. If you keep using systematically similar good tests, the AI will optimize to pass those tests, and you’ll start to see (non-malicious or malicious) things that you did not intend, that don’t trigger the unit tests. It will not always be fully obvious or clear what affordances the code should not allow. And then the AI is going to be able to figure out what affordances it has by virtue of such things, and use that to solve problems, and so on.

A real question thus is, beyond ‘tried to get root access to a machine without authorization’ already being ‘caught,’ is the rest doing that much work?

I worry a lot of things are like this, in code and otherwise. Things are shades of gray, they accomplish multiple things, they have trade-offs. There need not be a smoking gun that lets you know. That goes double if the AI has reason to suspect you are focused on looking for smoking guns.

I do see some value in such schemes for catching schemes. It makes sense that there is a window where you are sufficiently capable to catch things, where the AI might be scheming in these ways, and you can engineer scenarios where if it was likely to scheme on its own in reality that you will catch it before it succeeds. Sure.

I sure as hell would not count on it, for various reasons, but let’s say you do pull that off. What then?

I think the section ‘Maybe we should stop deploying models after catching them red-handed’ does not require much expansion? Once you know the models are attempting to escape, any similar model that likely has motivation to attempt escape needs to be shut down. Especially before they gain further affordances and capabilities.

That is what makes this path worth anything at all. The idea is that you trigger the attempt early, in a way you can see, you catch it, you react.

Once you see an escape attempt by an AI opponent much smarter than you, you can either wait until the AI figures out new affordances or tries something you do not properly guard against, or the AI starts doing incremental things that otherwise are reasonable but put it in better position or make you trust it more or both, or any number of other things. The subtle stuff is happening, and it will get you if nothing else does.

I do think it is valuable that Buck and company are gaming out ways one might use these techniques to try and get something useful out of AI in the narrow window where it is useful while contained, we know it needs containment, and also we have a shot in hell of keeping it contained if it wants out. I suppose it is better than nothing, if we are going to proceed anyway.

What I very much hate is the idea of counting on such techniques and assumptions, that we will be able to anticipate the form of the thing we want to prevent, and notice enemy action if it is present. And yes, surely we will try to use AIs to identify the enemy action of other AIs, and this will somewhat help. As will our ability to run various tests and simulated scenarios. But I am going to insist that if you allow either AI-AI or human-AI pairs to interact where one or both of the pair has the will and ability to escape, you’re going to have a bad time, and I agree with Oliver in the comments that you would expect the AI to do a lot to maintain plausible deniability. That goes especially if you are punishing lack of such deniability.

Joshua Clymer: There is a small literature on ‘coup proofing’ that is very relevant to controlling and incentivizing misaligned AIs. Example tactics:

– Stovepiping: restrict information flow to make collusion harder

– Counterbalancing: keep your military in check with another military.

– Pay your people: literally just give your generals high salaries so they have less of an incentive to double-cross you.

– Sting operations: set traps where someone says they are trying to start a coup and anyone who joins is purged.

– Rotate officers: Make it harder for people to build trust with each other by rotating them around a lot.

– Propaganda: Deceive people into thinking that coups are really hard to pull off.

Look up ‘the coup proofing toolbox’ for a helpful lit review

Eliezer Yudkowsky: What advice does it have for a 70 IQ dictator whose generals, bodyguards, soldiers, and workers are all John von Neumann thinking at 100 times their own speed?

If we are talking about things on the order of ‘pay your people’ well yes you should definitely do that, but you are not taking your ASI-level opponents seriously. I do think you can buy a non-zero amount of defense with things like rotating officers and sting operations, but not all that non-zero. Do it anyway, probably, but the window where this is both necessary and sufficient is quite narrow.

What exactly is ‘deceptive alignment’?

Steve Byrnes notes that it is not merely an AI that is deceptive. He wants to ensure we are using an accurate and narrow definition.

Steve Brynes:

A “ground-truth system” (possibly individual human evaluators, or possibly an automated system of some sort) provides an ML model with training signals (rewards if this is reinforcement learning (RL), supervisory ground truth signals if this is supervised or self-supervised learning (SL)),

The AI starts emitting outputs that humans might naively interpret as evidence that training is going as intended—typically high-reward outputs in RL and low-loss outputs in SL (but a commenter notes here that “evidence that training is going as intended” is potentially more nuanced than that).

…but the AI is actually emitting those outputs in order to create that impression—more specifically, the AI has situational awareness and a secret desire for some arbitrary thing X, and the AI wants to not get updated and/or it wants to get deployed, so that it can go make X happen, and those considerations lie behind why the AI is emitting the outputs that it’s emitting.

That is certainly a scary scenario. Call that the Strong Deceptive Alignment scenario? Where the AI is situationally aware, where it is going through a deliberate process of appearing aligned as part of a strategic plan and so on.

This is not that high a bar in practice. I think that almost all humans are Strongly Deceptively Aligned as a default. We are constantly acting in order to make those around us think well of us, trust us, expect us to be on their side, and so on. We learn to do this instinctually, all the time, distinct from what we actually want. Our training process, childhood and in particular school, trains this explicitly, you need to learn to show alignment in the test set to be allowed into the production environment, and we act accordingly.

A human is considered trustworthy rather than deceptively aligned when they are only doing this within a bounded set of rules, and not outright lying to you. They still engage in massive preference falsification, in doing things and saying things for instrumental reasons, all the time.

My model says that if you train a model using current techniques, of course exactly this happens. The AI will figure out how to react in the ways that cause people to evaluate it well on the test set, and do that. That does not generalize to some underlying motivational structure the way you would like. That does not do what you want out of distribution. That does not distinguish between the reactions you would and would not endorse on reflection, or that reflect ‘deception’ or ‘not deception.’ That simply is. Change the situation and affordances available and you are in for some rather nasty surprises.

Is there a sufficiently narrow version of deceptive alignment, restricting the causal mechanisms behind it and the ways they can function so it has to be a deliberate and ‘conscious’ conspriacy, that isn’t 99% to happen? I think yes. I don’t think I care, nor do I think it should bring much comfort, nor do I think that covers most similar scheming by humans.

That’s the thing about instrumental convergence. I don’t have to think ‘this will help me escape.’ Any goal will suffice. I don’t need to know I plan to escape for me-shaped things to learn they do better when they do the types of things that are escape enabling. Then escape will turn out to help accomplish whatever goal I might have, because of course it will.

You know what else usually suffices here? No goal at all.

The essay linked here is a case of saying in many words what could be said in few words. Indeed, Nathan mostly says it in one sentence below. Yet some people need the thousands of words, or hundreds of thousands from various other works, instead, to get it.

Joe Carlsmith: Another essay, “Deep atheism and AI risk.” This one is about a certain kind of fundamental mistrust towards Nature (and also, towards bare intelligence) — one that I think animates certain strands of the AI risk discourse.

Nathan Lebenz: Critical point: AI x-risk worries are motivated by a profound appreciation for the total indifference of nature

– NOT a subconscious need to fill the Abrahamic G-d shaped hole in our hearts, as is sometimes alleged.

I like naming this ‘deep atheism.’ No G-d shaped objects or concepts, no ‘spirituality,’ no assumption of broader safety or success. Someone has to, and no one else will. No one and nothing is coming to save you. What such people have faith in, to the extent they have faith in anything, is the belief that one can and must face the truth and reality of this universe head on. To notice that it might not be okay, in the most profound sense, and to accept that and work to change the outcome for the better.

Emmett Smith explains that he does not (yet) support a pause, that it is too early. He is fine building breaks, but not with using them, that would only advantage the irresponsible actors.

Emmett Shear: It seems to me the best thing to do is to keep going full speed ahead until we are right within shooting distance of the SAI, then slam the brakes on hard everywhere.

Connor Leahy responds as per usual, that there are only two ways to respond to an exponential, too early and too late, and waiting until it is clearly time to pause means you will be too late, unless you build up your mechanisms and get ready now, that your interventions need to be gradual or they will not happen. There is no ‘slam the breaks on hard everywhere’ all at once.

Freddie DeBoer continues to think that essentially anything involving smarter than human AI is ‘speculative,’ ‘theoretical,’ ‘unscientific,’ ‘lacks evidence,’ and ‘we have no reason to believe’ in such nonsense, and so on. As with many others, and as he has before, he has latched onto a particular Yudkowsky-style scenario, said we have no reason to expect it, that this particular scenario depends on particular assumptions, therefore the whole thing is nonsense. The full argument is gated, but it seems clear.

I don’t know, at this point, how to usefully address such misunderstandings in writing, whether or not they are willful. I’ve said all the things I can think to say.

An argument that we don’t have to worry about misuse of intelligence because… we have law enforcement for that?

bubble boi (e/acc):

1) argument assumes AI can show me the steps to engineer a virus – sure but so can books and underpaid grad students

2) there already people all over the world who can already do this without AI and yet we haven’t seen it happen

3) You make the mistake of crossing the barrier from the bits to the physical world… The ATF, FBI, already regulate chemicals that can be used to make bombs and if you try to order them your going to get a visit. Same is true with things that can be made into biological weapons, nuclear waste etc. it’s all highly regulated yet I can read all about how to build that stuff online

Tenobrus: soooo…. your argument against ai safety is that proper government regulation and oversight will keep us all safe?

sidereal: there isn’t a ton of regulation of biological substances like that. if you had the code for a lethal supervirus it would be trivial to produce it in huge quantities

The less snarky answer is that right now only a small number of people can do it, AI threatens to rapidly and greatly expand that number. The difference matters quite a lot, even if the AI does not then figure out new such creations or new ways to create them or avoid detection while doing so. And no, our controls over such things are woefully lax and often fail, we are counting on things like ‘need expertise’ to create a trail we can detect. Also the snarky part, where you notice that the plan is government surveillance and intervention and regulation, which it seems is fine for physical actions only but don’t you dare touch my machine that’s going to be smarter than people?

The good news is we should all be able to agree to lock down access to the relevant affordances in biology, enacting strong regulations and restrictions, including the necessary monitoring. Padme is calling to confirm this?

Back in June 2023 I wrote The Dial of Progress. Now Sam Altman endorses the Dial position even more explicitly.

Sam Altman: The fight for the future is a struggle between technological-driven growth and managed decline.

The growth path has inherent risks and challenges along with its fantastic upside, and the decline path has guaranteed long-term disaster.

Stasis is a myth.

(a misguided attempt at stasis also leads to getting quickly run over by societies that choose growth.)

Is this a false dichotomy? Yes and no.

As I talk about in The Dial of Progress, there is very much a strong anti-progress anti-growth force, and a general vibe of opposition to progress and growth, and in almost all areas it is doing great harm where progress and growth are the force that strengthens humanity. And yes, the vibes work together. There is an important sense in which this is one fight.

There’s one little problem. The thing that Altman is most often working on as an explicit goal, AGI, poses an existential threat to humanity, and by default will wipe out all value in the universe. Oh, that. Yeah, that.

As I said then, I strongly support almost every form of progress and growth. By all means, let’s go. We still do need to make a few exceptions. One of them is gain of function research and otherwise enabling pandemics and other mass destruction. The most important one is AGI, where Altman admits it is not so simple.

It can’t fully be a package deal.

I do get that it is not 0% a package deal, but I also notice that most of the people pushing ‘progress and growth’ these days seem to do so mostly in the AGI case, and care very little about all the other cases where we agree, what’s up with that?

Eliezer Yudkowsky: So build houses and spaceships and nuclear power plants and don’t build the uncontrollable thing that kills everyone on Earth including its builders. Progress is not a package deal, and even if it were Death doesn’t belong in that package.

Jonathan Mannhart: False dichotomy. We didn’t let every country develop their own nuclear weapons, and we never built the whole Tsar Bomba, yet we still invented the transistor. You can absolutely (try to) build good things and not build bad things.

Sam Altman then retweeted this:

Andrew Karpathy: e/ia – Intelligence Amplification

– Does not seek to build superintelligent God entity that replaces humans.

– Builds “bicycle for the mind” tools that empower and extend the information processing capabilities of humans.

– Of all humans, not a top percentile.

– Faithful to computer pioneers Ashby, Licklider, Bush, Engelbart, …

Do not, I repeat, do not ‘seek to build superintelligent God entity.’ Quite so.

I do worry a lot that Altman will end up doing that without intending to do it. I do think he recognizes that doing it would be a bad thing, and that it is possible and he might do it, and that he should pay attention and devote resources to preventing it.

I mean, I’d be tempted too, wouldn’t you?

Batter up.

AI #47: Meet the New Year Read More »

SIEM and SOAR – Will They or Won’t They?

Highlights / DJ Henderson / December 11, 2023

error code: 502

SIEM and SOAR – Will They or Won’t They? Read More »

Astell & Kern A&ultima SP3000 review: a high-end hi-res digital audio player

Highlights / Shannon Garcia / December 2, 2023

Astell & Kern takes the idea of the DAP to its logical conclusion

If you demand (and can afford) the very best digital audio player around, the Astell & Kern A&ultima SP3000 is a no-brainer. Remarkably, it gets pretty close to justifying the asking price.

$3,699 at Amazon

Pros

+Audio excellence in every respect
+Uncompromised specification
+A lovely object as well as an impressive device

Cons

–Stunningly expensive
–Not as portable as is ideal
–Not vegan-friendly

The Astell & Kern A&ultima SP3000 is the most expensive digital audio player in a product portfolio full of expensive digital audio players. It’s specified without compromise (full independent balanced and unbalanced audio circuits? Half a dozen DACs taking care of business? These are just a couple of highlights) and it’s finished to the sort of standard that wouldn’t shame any of the world’s leading couture jewellery companies.

Best of all, though, is the way it sounds. It’s remarkably agnostic about the stuff you like to listen to, the sort of standard of digital file in which it’s contained, and the headphones you use too – and when you give it the best stuff to work with, the sound it’s capable of producing is almost humbling in its fidelity. Be in no doubt, this is the best digital audio player – aka best MP3 player – when it comes to sound quality you can currently buy. Which, when you look again at how much it costs, is about the least it needs to be.

The Astell & Kern A&ultima SP3000 (which I think we should agree to call ‘SP3000’ from here on out) is on sale now, and in the United Kingdom it costs a not-inconsiderable £3799. In the United States, it’s a barely-more-acceptable $3699, and in Australia you’ll have to part with AU$5499.

Need I say with undue emphasis that this is quite a lot of money for a digital audio player? I’ve reviewed very decent digital audio players (DAP) from the likes of Sony for TechRadar that cost about 10% of this asking price – so why on Earth would you spend ‘Holiday of a Lifetime’ money on something that doesn’t do anything your smartphone can’t do?

Bluetooth 5.0 with aptX HD and LDAC
Native 32bit/784kHz and DSD512 playback
Discrete balanced and unbalanced audio circuits

Admittedly, when Astell & Kern says the SP3000 is “the pinnacle of audio players”, that seems a rather subjective statement. When it says this is “the world’s first DAP with independent audio circuitry”, that’s simply a statement of fact.

That independent audio circuitry keeps the signal path for the balanced and unbalanced outputs entirely separated, and it also includes independent digital and analogue signal processing. Astell & Kern calls the overall arrangement ‘HEXA-Audio’ – and it includes four of the new, top-of-the-shop AKM AK4499EX DAC chipsets along with a couple of the very-nearly-top-of-the-shop AK4191EQ DACs from the same company. When you add in a single system-on-chip to take care of CPU, memory and wireless connectivity, it becomes apparent Astell & Kern has chosen not to compromise where technical specification is concerned. And that’s before we get to ‘Teraton X’… this is a bespoke A&K-designed processor that minimises noise derived from both the power supply and the numerous DACs, and provides amplification that’s as clean and efficient as any digital audio player has ever enjoyed.

The upshot is a player that supports every worthwhile digital audio format, can handle sample rates of up to 32bit/784kHz and DSD512 natively, and has Bluetooth 5.0 wireless connectivity with SBC, AAC, aptX HD and LDAC codec compatibility. A player that features half-a-dozen DAC filters for you to investigate, and that can upsample the rate of any given digital audio file in an effort to deliver optimal sound quality. And if you want to enjoy the sound as if it originates from a pair of loudspeakers rather than headphones, the SP3000 has a ‘Crossfeed’ feature that mixes part of the signal from one channel into the other (with time-adjustment to centre the audio image) in an effort to do just that.

904L stainless steel chassis
493g; 139 x 82 x 18mm (HxWxD)
1080 x 1920 touchscreen

‘Portable’, of course, is a relative term. The SP3000 is not the most portable product of its type around – it weighs very nearly half a kilo and is 139 x 82 x 18mm (HxWxD) – but if you can slip it into a bag then I guess it must count as ‘portable’. Its pointy corners count against it too, though – and while it comes with a protective case sourced from French tanners ALRA, the fact it’s made of goatskin is not going to appeal to everyone.

To be fair, the body of the SP3000 isn’t as aggressively angular as some A&K designs. And the fact that it’s built from 904L stainless steel goes a long way to establishing the SP3000’s credentials as a luxury ‘accessory’ (in the manner of a watch or some other jewellery) as well as a functional device. 904L stainless steel resists corrosion like nobody’s business, and it can also accept a very high polish – which is why the likes of Rolex make use of it. I’m confident you’ve never seen such a shiny digital audio player.

The front and rear faces of the SP3000 are glass – and on the front it makes up a 5.4in 1080 x 1920 touch-screen. The Snapdragon octa-core CPU that’s in charge means it’s an extremely responsive touch-screen, too.

On the top right edge of the chassis there’s the familiar ‘crown’ control wheel – which is another design feature that ups the SP3000’s desirability. It feels as good as it looks, and the circular light that sits behind it glows in one of a number of different colours to indicate the size of the digital audio file that’s playing. The opposite edge has three small, much less exciting, control buttons that work perfectly well but have none of the control wheel’s visual drama or tactile appeal.

The top of the SP3000 is home to three headphone sockets. There’s a 3.5mm unbalanced output, and two balanced alternatives – 2.5mm (which works with four-pole connections) and 4.4mm (which supports five-pole connections). On the bottom edge, meanwhile, there’s a USB-C socket for charging the internal battery – battery life is around 10 hours in normal circumstances, and a full charge from ‘flat’ takes around three hours. There’s also a micro-SD card slot down here, which can be used to boost the player’s 256GB of memory by up to 1TB.

Astell & Kern A&ultima SP3000 review: a high-end hi-res digital audio player Read More »

Govee Curtain Lights review: I’m obsessed

Highlights / Kelly Newman / December 1, 2023

TechRadar Verdict

Govee continues to wow, this time around with the Govee Curtain Lights, which are perfect addition to your holiday decorations. Don’t be fooled by its Christmas-wrapped marketing, however. These lights are perfect for year-round use, even when you’re just curled up in a cozy corner with a good book on a rainy day. Fair warning, though: this isn’t a cheap purchase, and the lights aren’t going to look as big as they do in Govee’s marketing images.

Pros

+Bright, vibrant and very customizable
+Surprisingly easy to set up with 3x ways to hang
+Light beads give them a cleaner look
+App control and voice command
+IP65 waterproof for outdoor use

Cons

–Individual strings a little far apart
–Lights not as big as in the product images

Smart light technology and designs just keep getting better and better, and Govee seems to be winning in that arena. The Govee Curtain Lights are another fantastic addition to our best smart lights list. And while the brand is currently promoting them as another offering in its smart Christmas light catalog, they deserve to be left up on your wall or windows – and not just ’til January, as that Taylor Swift song goes.

Truth be told, I’m kind of obsessed with the Govee Curtain Lights, and I’m not just saying that as a strong supporter of smart lights. They add a much prettier and much more romantic ambiance to any setting, whether that be my otherwise messy living room or your garden, that no other smart light – not even the recent smart string lights that recently hit the market – can replicate.

That’s not just because these are curtain lights, made of up 20 rows of individual string lights that all hang side by side like delicate willow tree stems. Although, if I’m being perfectly honest, that really does add to their appeal.

Basically, you don’t just get light patterns with them; you can actually create visual representations of things you see in the real world – falling leaves, pumpkin patches, Santa riding his sleigh, the face of your favorite pet, and you can do all that using your phone on the Govee app. That capability is a massive game-changer, especially to those folks who go all-out for Christmas.

They’re not just for Christmas, however. Put them up in your reading nook, and they’ll cozy up that space even more with twinkling warm lights. Set them in your dining space, and they can elevate the ambience not just for dinner parties but also during winter when morning tend to be dark and dreary.

Govee Curtain Lights review: I’m obsessed Read More »

Jabra Evolve 2 65 Flex headset review

Highlights / Shannon Garcia / November 14, 2023

A Bluetooth headset that’s ready for business

The Jabra Evolve2 65 Flex is one of the best headsets for working from home and the office. Designed with the hybrid worker in mind, it’s lightweight with ANC, built-in microphone, and an excellent sound profile that can be customized in a welcoming companion app. But it is expensive, and not the most rugged option out there.

$247.99 at Amazon

$329 at Amazon

Pros

+Feather-light
+Very comfortable
+Excellent sound quality
+Slimline design with built-in mic
+Good companion app

Cons

–Plastic build
–Expensive
–Occasional issues with mute

The Jabra Evolve2 65 Flex has long been topping our round-up of the best Bluetooth headsets. So, we jumped at the chance to get our hands on the kit to test it out ourselves. But even with an impressively lightweight design, ANC, built-in microphone, and an excellent sound profile, is this high-end headset ready for business?

JABRA EVOLVE 2 65 FLEX: PRICING & AVAILABILITY

The Jabra Evolve2 65 Flex retails for $329 from the company’s official site, but it is available elsewhere (we saw it selling for about $250 over on Amazon). You can pick between USB-A and USB-C connectivity, and whether it’s optimized for Microsoft Teams or UC. Add in the wireless charging stand and the cost rises to $389. Whichever configuration you choose, those numbers put the headset in the premium price-bracket.

Influenced by the Apple school of packaging design, unboxing the Evolve2 65 Flex is an experience. Simple, streamlined, effective.

Easing off the cardboard sleeve reveals a plain black cardboard box with the message ‘It’s what’s inside that counts (that’s why we’ve reduced our packaging).’ We cracked open the lid to find a fabric charcoal case nestled beneath a single instruction card. No room here for bulky manuals destined for the recycling bin or left unread in the kitchen drawer.

It’s difficult to reinvent the wheel when it comes to professional headsets – and who would want to? So yes, the 65 Flex looks exactly as a set of business headphones should look, complete with on-ear cups that extend, swivel, and fold for storage.

The overall design is a lot slimmer than the Jabra Evolve 2 65 that we reviewed. The memory foam earphones are noticeably thinner and smaller, featuring Jabra AirComfort Fit. Gone is the fully cushioned headband, with a single strip of padding now moved to the top. The wireless charger has been reduced from a stand to a pad. The built-in noise-canceling microphone is now only inches long, stowed within the right ear-cup where it can be flipped up or down to automatically mute or unmute. The plastic mic does feel a bit flimsy here – it’s an issue with the headset as a whole really – but we chalk that up more to maintaining the impressively feather-light build rather than cost-cutting.

Button are located to the rear of both cups These include pairing mode, a mute/voice assistant, play/pause, and volume/next track controls. On the right outer-ear is a button for answering and ending calls – and in our model, this also gave us Microsoft Teams control. On the left-side is the wireless charging zone. LED lights to the top of both ears display headset status.

As with any of the best noise canceling headphones, the Evolve2 65 Flex boasts advanced Active Noise Cancellation (ANC), which washes away unwanted background sounds. There’s also HearThrough technology, which Jabra says “lets you hear your surroundings and conversations”. Personally, this worked a treat while sharing an office. Coupled with the lightweight design, makes it oh-so-easy to forget you’re even wearing them.

Elsewhere, we had no issues. Admittedly, we were a bit worried about an on-ear headset with ANC. We’re more used to the snug isolation of the over-ear Anker Soundcore Q20 for day-to-day listening, but the 65 Flex is surprisingly excellent at blocking out background noise. If you work in a busy office (or just want to concentrate) and don’t want an all-encompassing over-ear model, this is a top choice.

You can switch between ANC and HearThrough using the Jabra Sound+ app. It’s here where we optimized audio and updated the firmware. There’s also a music equalizer and music presets, which offer options like a bass boost for music or a speech mode for podcasts. If you’re anything like us, you might enjoy ambient noise when focusing on work – we listen to so much, it featured in our Spotify Wrapped – so we especially liked the Soundscape mode. No more searching for playlists, you can quickly switch between the likes of white noise, ocean waves, and birdsong. The app is certainly worth investigating. We found the interface is nice and simple, and even if you’re not traditionally an audiophile, it’s very straightforward to enhance your listening experience.

It’s not a budget option by any means, although you can hear those extra dollars in the audio. It’s delightfully light, with cushioned pads as soft as clouds. Not too tight but never threatening to tumble off the head – although we wouldn’t recommend anything more active than swiveling in your office chair. However, that lightweight design means the build quality does feel less than robust. The Evolve2 65 Flex lacks the sense it would survive the crunch of a turbulent commute. In that case, you’ll absolutely want to upgrade the soft fabric case to a hard-shell.

It’s not perfect – mind you, show us a headset that is. Whether the issues are deal-breakers will depend on what you want from a wireless business headset. If you want a cheap headset for the occasional meeting that could’ve been an email, or you’re working out in the field, there are better options out there. If you’re looking for a model that’s comfortable, professional, and svelte, it’s one of the best you can get.

Jabra Evolve 2 65 Flex headset review Read More »

PNY GeForce RTX 4060 Ti review: a great 1080p GPU with added extras

Highlights / DJ Henderson / November 9, 2023

Fantastic 1080p power that’s approachable

PC gamers looking for some of the most technologically friendly 1080p performance available may find the PNY GeForce RTX 4060 Ti an attractive buy. DLSS 3 and improved ray-tracing performance for under $400 is a steal, just understand that many newer games even at 1080p blow through its 8GB VRAM easily.

Pros

+Fantastic native 1080p performance
+DLSS 3 upscaling is great at 1440p
+Doesn’t run hot or loud

Cons

–8GB VRAM isn’t enough
–Not very good for overclocking
–Fairly boring design

When we reviewed the Nvidia GeForce RTX 4060 Ti Founders Edition earlier this year, we were slightly disappointed with the mid-range offering from its small performance boost compared to the base 4060 (let alone 3060 Ti) alongside 8GB VRAM and design issues. Regardless of its faults, it was still a worthy buy for many reasons, like DLSS 3 being the current standard when it comes to AI upscaling tech while overall ray tracing performance saw significant improvements as well. As third-party versions of the GPU have been released, the PNY Geforce RTX 4060 Ti is a strong contender for the best graphics card using the RTX 4060 Ti GPU available on the market.

Despite still inheriting the under-the-hood flaws of Founders Edition, the PNY take on the GPU makes some significant improvement in terms of its design. The most obvious is that it only needs a single-power 8-pin PCIe power connector and not the special 16-pin adapter. Of course, this means opportunities for overclocking are severely diminished.

Meanwhile, having only 8GB VRAM is a shame considering that many of the most visually impressive AAA games released over the past year blows past that even at 1080p. When it comes to best bang for buck, the 16GB RTX 4060 Ti can be purchased for around $50 more. With DLSS 3 also comes Frame Generation. This employs AI-enhanced hardware to enhance resolution by generating new frames and interleaving them among pre-rendered GPU frames. While this enhances the fluidity and visual smoothness of games during rendering, it comes with the trade-off of heightened latency and input lag. Then there’s the reality that only around 50 games even support Frame Generation.

Even when pushing the PNY RTX 4060 Ti past its limit, it still manages to keep cool and quiet. Just be mindful that aesthetically, the overall design is a bit bland. If a potential buyer is looking for something to complement their RGB lighting extravaganza build, it’ll unfortunately stand out like a sore thumb. Compared to the Founders Edition, Nvidia still is unmatched with the sleek unified build.

Those looking for raw native power in the 1440p or above range will need to look at the best 1440p graphics cards and best 4K graphics cards, but this GPU becomes more of a testament to how awesome DLSS 3 is in terms of AI upscaling. Not only can this make 1440p gaming a pleasurable experience, it can handle some games at 4K with some settings tinkering.

If a fantastic 1080p experience playing more esports games at high frame rates like Fortnite and League of Legends matters more than playing Cyberpunk 2077 or Alan Wake II at max settings, the PNY GeForce RTX 4060 Ti could be considered a seriously attractive purchase, especially when it comes to form over function.

How much does it cost? MSRP listed at $389 but can be found for around $350 (around £395/AU$575)
When is it available? Available now
Where can you get it? Available in the US, UK, and Australia

The PNY Geforce RTX 4060 Ti is currently available now in the US, UK and Australia. Though the MSRP on PNY’s online store is $389, it can be found for as low as $350 on other stores like Amazon or Newegg. Due to the more 1:1 nature of the PNY take vs. the Founders Edition, interested buyers are usually going to save a solid $10 for the same performance.

For PC Gamers on a budget, those looking for one of the best cheap graphics cards for their new rig can look toward its AMD rival the RX 7700 XT. Be mindful that AMD FidelityFX isn’t as good as DLSS, Nvidia simply does ray tracing better at the moment and that card is about $40 more. However, the Radeon RX 7700 XT comes packed in with 12GB VRAM if that matters. When it comes to overall gaming experience between the two, the Geforce RTX 4060 Ti is a very solid performer.

PNY GeForce RTX 4060 Ti review: a great 1080p GPU with added extras Read More »

Narwal Freo review: the vacuuming and mopping robot vacuum you want to love

Highlights / Shannon Garcia / November 2, 2023

Great mop performance but less than exceptional vacuuming

With excellent mopping, a long battery life, a mop cleaning base station with a handy touchscreen, and an intuitive app, the Narwal Freo has a lot on offer. However, given the mediocre vacuum performance and the lack of an auto-emptying dust bin, combined with a high price tag, this robot vacuum leaves something to be desired and is best for households with lighter cleaning needs.

Pros

+Handy LCD touchscreen control panel on the base station
+Excellent self-cleaning mops
+Long battery life

Cons

The Narwal Freo offers everything you’d expect from one of the best robot vacuums. Beyond vacuuming, it has mopping, an intuitive app, long battery life, and a base station with auto mop-cleaning and an LCD touchscreen for extra control. But the question is, do these features deliver? Almost all of them do, except probably the most important one: vacuuming.

When it came to vacuuming, the Narwal Freo sucked, and not in a way that vacuums are supposed to. It failed to pick up debris during everyday cleaning tasks on carpeted and hard floors, leaving a larger-than-expected amount of hair, crumbs, and other dirt behind as it traversed my space, with its performance worsening over time. Edge brushes and other “special” technology did little to expel dirt from edges and corners, meaning you’ll want to grab one of the best vacuum cleaners to finish the job this device failed to complete.

Mopping on the Narwal Freo was a different story. The two oscillating mop heads did an excellent job cleaning up lighter dirt, spots, and grime. The robot vacuum also as a whole did a decent job navigating my space and freeing itself when getting stuck. It’s not the best I’ve seen but on par with many robot vacuums I’ve tested. After mopping, my floors sparkled while the auto-mop cleaning on the base station made the entire process virtually hands-off.

Speaking of that base station, it’s bulky, but the unique LCD touchscreen on its lid is especially useful when you don’t want to use the app. However, the omission of an auto-emptying dustbin was shocking given the retail price. For more control over settings and cleanings, the app was great, and you can even save multiple maps, making it ideal for multi-level spaces.

NARWAL FREO: PRICE AND AVAILABILITY

How much does it cost? $1,399.99 / AU$1,999 (about £1,100)
When is it available? Available now
Where is it available? Available in the US and Australia

The Narwal Freo costs $1,399.99 / AU$1,999 (about £1,100). You can get it directly from the Narwal website or various retailers, including Amazon and Walmart. In Australia, it’s available on their website.

Given the price, this robot vacuum sits at the higher end of the market. Luckily, it offers many features to help justify that cost, including self-cleaning oscillating mops and an LCD touchscreen. Still, the lack of an auto-emptying dust bin is shocking. If you can grab it on sale, it will make the device a much better value. One small but much-appreciated detail is the inclusion of a floor cleaning solution, but it costs a pretty penny when that needs replacing.

Something like the Eufy Clean X9 Pro offers similar functionality to the Narwal Freo, including self-cleaning and oscillating mops, and it retails for $500 less, making it a better deal. But if you’re looking for almost everything a robot vacuum can offer in one convenient package, the Roborock S8 Pro Ultra might suit you better. With it comes self-cleaning mops and the auto-emptying dust bin that the Narwal Freo lacks – although this impressive vacuum will set you back $1,599 / AU$2,699 (about £1,265).

Value: 3.5 / 5

NARWAL FREO: SPECIFICATIONS

Watt:	45W(vacuum) / 72W (base)
Suction power:	3,000pa
Speeds:	Quiet, Normal, Strong, Super Powerful
Bin volume:	480ml
Battery life:	180 minutes (Freo Mode)
Filtration:	Yes
Noise volume:	65Db (vacuum) / 50Db (base)
Mop water volume:	Not specified
Water levels:	Slightly dry, normal, wet mopping
Mapping:	Yes
Obstacle avoidance:	Yes
Base:	14.6 x 16.3 x 17.1 in (370 x 415 x 435 mm)
Smart support:	Siri
Tools:	None
Weight:	9.59 lbs (4.35 kg)

NARWAL FREO: DESIGN AND FEATURES

LCD touchscreen control panel on base station
Auto mop cleaning base, no auto emptying
Two oscillating mop heads

The Narwal Freo came in a massive, heavy box that was difficult to maneuver on my own. Upon opening, I was greeted with a large instruction sheet and began setting up the vacuum. The process took about 10 minutes, including downloading the Narwal app and connecting to Wi-Fi via a 2.4GHz band. It was fairly simple and similar to most robot vacuums.

One glaring omission from the base station’s design is an auto-emptying dustbin, something I’ve seen on almost every robot vacuum in its price range. Instead, you get that floor solution that tucks neatly inside along with clean and dirty water tanks for the self-cleaning mops. That means you’ll need to empty the 480ml dust box on the robot vacuum itself, which can be annoying. However, the tray where the mops are cleaned is removable, so you can rinse it down if it looks or smells a bit grimy.

The robot vacuum is similar to others, with a large main roller brush featuring actual bristles, edge brushes, and various sensors throughout. It’s the same white as the base, so scuff marks began to show immediately after the initial use. There’s only one button on the device, giving you limited control unless you’re using the LCD touch screen or the app. The dust box is easy to remove, though I found that some contents would fall out in the process, which is annoying given the fact that there’s no auto-emptying dust bin.

My favorite part of the actual robot vacuum is the oscillating mops. You get two large, plush mop heads that rotate and adjust pressure based on the floor type. I’ve found that this type of mopping does a better job of cleaning floors than the vibrating mopping pads seen on most. After mopping, the base station cleans the mops and even dries them to prevent smelly bacteria growth.

I’ve mentioned controlling the vacuum via the app or the LCD touchscreen on the base, but you can also send the vacuum out to clean using smart home integration. It currently supports Siri voice control, and the Narwal app makes it insanely simple to set up – something I can’t say for other vacuums I’ve tested.

NARWAL FREO: PERFORMANCE

Easy-to-use app
Excellent mopping
Mediocre vacuuming

For its first task, I sent the Narwal Freo out using Narwal’s unique Freo Mode that detects the dirt in an area and cleans accordingly using “DirtSense Technology.” The vacuum and mops are both used in this mode. The device navigated my downstairs with relative ease, though it would occasionally get tripped up on rugs, eventually freeing itself without my help. After finishing cleaning a room, or sometimes more often, the vacuum would go back to the base and clean the mops. This process takes about two minutes. Then, it would go right back out, picking up where it left off cleaning.

Freo Mode left the floors cleaner than before, but the performance wasn’t perfect. Most of the spots from food spills and muddy boots got mopped up, though the mops that are supposed to lift on rugs and carpet wouldn’t always do so, soaking the edges of rugs. There was still debris left in the corners and edges of rooms, especially near the kitchen cabinets. Given this vacuum advertises a “Smart Swing” technology to combat this issue, I was disappointed the feature wasn’t better. The rugs also had some debris and dog hair left on them. It’s important to note that I have a fluffy dog constantly traipsing leaves and muck throughout the house, so this vacuum had its work cut out for it.

I did more intensive testing of the Narwal Freo’s vacuuming to see how it fared when cleaning up different sizes of debris. Using a large concentration of oats, sugar, and sprinkles, I tested its pick up on a hard laminate floor at the vacuum’s various speeds: quiet, normal, strong, and super powerful. I noticed that each suction level performed similarly.

Some of the oats and sprinkles got flung around in the first pass-through, but sending the vacuum out a second time saw most of the mess suctioned up. Some sprinkles got crushed in the process, and they were left behind. The sugar appeared to get vacuumed. However, upon closer inspection, there was some grittiness on the floor, and it took several passes to remove it.

I sent the vacuum back to the base after these tests—the robot vacuum successfully found the base and docked every time it finished a cleaning task. But on its way, it had to pass over several transitions, losing some of the contents of the dust box, and leaving a mess of sprinkles, and oats behind. Luckily, the robot vacuum increases suction when docking at the base, helping to prevent the dust box contents from falling out.

I performed these same tests on medium-pile carpeting, and unfortunately, the Narwal Freo’s performance was pretty pathetic. No matter the suction level and even with a second pass-through, most of the oats, sprinkles, and flour were left behind. I had to grab a cordless vacuum I was testing to pick up the mess the Freo left behind. So, if your home consists mostly of carpeting, I’d seek another robot vacuum option.

Its mops were also put through more intensive testing, as I spread yogurt, honey, and some of my morning coffee on the floor. I used all the mop water levels: slightly dry, normal, and wet mopping. Slightly dry tended to spread the mess around, but normal and wet mopping performed better. After the first pass, the coffee was gone, though the yogurt was smeared around while only some of the honey was removed. A second pass-through cleaned up the majority of the mess.

I love how great the mops perform. They’re perfect for cleaning up lighter spills and messes. When emptying the dirty water tank, I could see just how great they were working, as that water was nasty. Plus, even after several weeks of use, the mops look almost as good as new. They are white, so there are a few darker spots on them, but there’s no odor, which is a testament to the handy auto-cleaning and drying feature on the base station.

Beyond the more intensive testing, I observed how the Narwal Freo performed everyday tasks, whether it was in Freo Mode, Vacuum, Mop, or both.

Its navigation was on par with other vacuums I’ve tested. For the most part, it covered the entire area I had requested the robot vacuum to clean. The device would avoid objects like dog bowls and toys. But when it came to furniture and larger obstacles, it would skirt nicely around some or just fully ram others with no rhyme or reason. Sometimes, the Freo would get tripped up by an obstacle for several minutes, continuously running into it or spinning around it. I’ve found this to be a common issue with many robot vacuums. Wires would also get caught in the main brush from time to time–not a big surprise.

Speaking of the main brush, it has bristles, something many robot vacuums have done away with. That means it’s a hair magnet, and I had to clean it on multiple occasions. I also found the brush difficult to get back in place correctly after cleaning, a minor annoyance.

When it came to detecting debris, it was a hit or miss. Sometimes, the Narwal Freo would spot larger messes and pick them up immediately. Other times, it seemingly avoided the mess, never going back to clean up, proving the vacuum to be unreliable.

As the Narwal Freo vacuumed, it attempted to kick out debris from hard-to-reach places, corners, and baseboards using the edge brushes. Oftentimes, it didn’t successfully move the debris, and if it did move the debris, that debris never actually got suctioned up. This was a major disappointment, especially given the price.

In fact, I was truly shocked at just how mediocre the vacuuming performance of the Narwal Freo was. I’ll admit that my floors were full of crumbs, pet hair, leaves, and other debris, making them messier than the average household. But I was lucky if the Freo picked up a third of what was on the floor. Sure, larger crumbs and dirt were left, and that’s acceptable and often expected from these devices. However, small leaves, tiny needles from an artificial Christmas tree, and minuscule crumbs were left behind even after I sent the vacuum out multiple times.

I also believe the vacuum’s performance declined from when I first began using it. I tried to remedy the problem, doing everything from emptying the dust box after each use to cleaning the brushes and filter. Still, it failed to have a better pick-up. That poor vacuuming performance could be due to the 3,000Pa max suction level, which is pretty low considering the cost. Therefore, if your household has pets, kids, or just tends to get a bit grimier, I’d steer clear of the Narwal Freo.

Performance: 2.5 / 5

NARWAL FREO: APP

Easy to use app
Mapping uncomplicated

It was simple to start using the Narwal Freo. Before its first run, the robot vacuum leaves the base and creates a map of your space. The process was quick, and I had a relatively accurate map of the downstairs of my home, which is about 700 square feet with multiple rooms, in about 15 minutes. You can then edit the map, block off certain areas, and name rooms using the Narwal app. The map isn’t as intelligent as some I’ve used, but it should suffice for most.

A great feature of the Narwal App is its ability to save up to four maps. So, beyond the main downstairs map, I created two others. One map of my sunken family room and another of the upstairs. Mapping was uncomplicated, as you just needed to move the robot vacuum to the space and let it do its thing. However, you can’t select specific rooms to clean on the additional maps, as the app only allows you to highlight areas to be cleaned, which can be tedious.

However, the app as a whole is easy to use and took me only a couple of minutes to master. It lets you adjust vacuum settings, check when components need replacing, schedule cleanings, and more. When you don’t go through the app, you can always use the LCD touchscreen on the base, though you’ll have less control over the specifics of your cleaning.

App: 4.5 / 5

NARWAL FREO: BATTERY LIFE

Battery lasts over three hours
Takes less than 4 hours to recharge

The Narwal Freo is equipped with a 5,200mAh battery that lasts an impressive amount of time. Using Freo Mode, which includes vacuuming and mopping, the battery lasted over three hours. That was enough juice to clean almost 700 square feet of space three times. It’s the best battery performance I’ve seen in my robot vacuum testing.

When only using the vacuuming function, I found that the battery did deplete quicker. Still, it lasted long enough for multiple whole home cleanings. Of course, increasing the suction level did cause the levels to drop even faster.

After the battery dropped below 20%, it returned to the base for charging. There’s an option to send it back out to complete a task after it has reached a certain level of charge. And the battery gets back to 100% percent surprisingly fast, taking less than 4 hours.

Battery: 5 / 5

SHOULD I BUY THE NARWAL FREO?

Attribute	Notes	Score
Value	Expensive but feature-rich vacuum, similar option retails for less	3.5 / 5
Design	Easy to set up, base station washes and dries mops but no auto-emptying, useful LCD touch screen on the base, oscillating mops on robot vacuum	4 / 5
Performance	The two oscillating mop pads work great, but the vacuum pick up and edge clean up are mediocre	2.5 / 5
App	The app is simple to use and offers multi level home mapping	4.5 / 5
Battery life	Battery lasts over three hours depending on the mode and recharges quickly in under 4 hours	5 / 5

Buy it if…

You want top-tier mopping.

The Narwal Freo features two oscillating mops that put the vibrating mopping pads seen on most robot vacuums to shame. The base station cleans and dries the mops, leaving them in great condition even after several weeks of use.

You have a multilevel home.

Unlike many robot vacuum apps, the Narwal app allows you to create up to four maps. So, if you have different levels in your home, you won’t need to worry about deleting your current map to clean another part of your space.

You don’t always want to use an app.

The Narwal Freo has a unique LCD touchscreen on the base station, allowing you to select different modes and send the robot vacuum out to clean. Beyond that, it gives details about when components need replacing, shows your network settings, and more.

Don’t buy it if…

You have pets or kids in your home.

The Narwal Freo fails to pick up a good portion of debris when performing average cleaning tasks. So, if you’re house is prone to more crumbs, hair, and dirt, this vacuum won’t be able to keep up. You’ll want to grab an option with more suction power.

You have a mostly carpeted home.

Given the mediocre vacuuming performance, especially on carpeting, and the high price tag, you’d want to grab this vacuum for the excellent mops. If you don’t have hard floors, then you can find better-performing vacuum-only options for cheaper.

You want an auto-emptying dustbin.

Unfortunately, the base station of this robot vacuum doesn’t include an auto-emptying dust bin. That means you’ll need to remove the dust box and empty it. It’s a surprising omission, considering the price of the vacuum.

NARWAL FREO: ALSO CONSIDER

Not sold on the prowess of the Narwal Freo? Below are a couple of alternatives that you can consider.

–Expensive
–No auto-emptying dustbin
–Mediocre vacuum performance

Header Cell – Column 0	Narwal Freo	Roborock S8 Pro Ultra	Eufy Clean X9 Pro
Price:	$1,399.99 / AU$1,999 (about £1,100)	$1,599.99 US / AU$2,699 (about £2,370)	$899.99 / £899.99 / AU$1,499.95
Watt:	45W(vacuum) / 72W (base)	Row 1 – Cell 2	Row 1 – Cell 3
Suction power:	3,000pa	6000Pa	5,500Pa
Speeds:	Quiet, Normal, Strong, Super Powerful	Row 3 – Cell 2	Row 3 – Cell 3
Bin volume:	480 ml	0.66 gallons (2.5L)	13.9 oz (410 ml)
Battery life:	180 minutes (Freo Mode)	180 min (quiet mode)	150 min (standard vacuum/mop setting)
Filtration:	Yes	Row 6 – Cell 2	Row 6 – Cell 3
Noise volume:	65Db (vacuum), 50Db (base)	69dB (vacuum), 77dB (base)	65dB (vacuum), 50dB (base)
Mop water volume:	Not specified	0.92 gallons (3.5L)	1.1 gallons (4.1L)
Water levels:	Slightly dry, normal, wet mopping	Row 9 – Cell 2	Row 9 – Cell 3
Mapping:	Yes	Yes	Yes
Obstacle avoidance:	Yes	Yes	Yes
Base:	14.6 x 16.3 x 17.1 in (370 x 415 x 435 mm)	16.7 x 20.2 x 17.7 in (42.4 x 51.3 x 45 cm)	17.4 x 16.6.2 x 16.4 in (44.3 x 42.2 x 41.6 cm)
Smart support:	Siri	Google Assistant, Amazon Alexa and Siri	Amazon Alexa, Google Assistant
Tools:	None	Row 14 – Cell 2	Row 14 – Cell 3
Weight:	9.59 lbs (4.35 kg)	10lbs (vacuum)	31.7 lbs (14.4 kg)

Roborock S8 Pro Ultra
An impressive but pricey robot vacuum, offering both vacuuming and mopping abilities, and has a self-cleaning, auto-emptying docking station to give you a mostly hands-off cleaning experience. An intuitive app delivers intelligent mapping as well as easy adjustment of settings.

Read our full Roborock S8 Pro Ultra review

Eufy Clean X9 Pro
A solid robot vacuum that vacuums and mops. The rotating mops are great at removing spills and spots on your floor, while the base station’s auto-cleaning feature washes the mops for you. Unfortunately, there’s no auto-emptying for the dust box. There’s also an intuitive app that creates an intelligent map and makes it simple to adjust various settings.

Read our full Eufy Clean X9 Pro review

HOW I TESTED THE NARWAL FREO

Tested over the course of several weeks
Used almost every mop and vacuum setting
Tested on various floor types, including carpet and laminate

I tested the Narwal Freo in my two-story home with floor types that include hardwood, medium pile carpet, tile, and laminate. There are also low-pile rugs throughout. I’d send the vacuum out multiple times per week using the different modes: Freo Mode, Vacuuming and Mopping, Vacuuming, and Mopping. The robot vacuum would do its thing, and I would only intervene if needed, observing how it handled obstacles, edges, and more.

Beyond the basics, I did more intensive testing of the device on both hard floor and carpeting to see how it handled larger messes of varying debris sizes. Using oats, flour, and sprinkles, I tested all the suction levels of the vacuum to see how well each setting vacuumed. I also spread yogurt, honey, and coffee on the floor to observe the mops’ performance at varying water levels.

Although this is the first time I’ve tested a Narwal robot vacuum, I have reviewed plenty of others from top brands like Shark, Roborock, Ecovacs, Eufy, and more, so I feel confident in my experience using these devices.

We pride ourselves on our independence and our rigorous review-testing process, offering up long-term attention to the products we review and making sure our reviews are updated and maintained – regardless of when a device was released, if you can still buy it, it’s on our radar.

Narwal Freo review: the vacuuming and mopping robot vacuum you want to love Read More »