Author name: Beth Washington

New Lego-building AI creates models that actually stand up in real life

The LegoGPT system works in three parts, shown in this diagram. Credit: Pun et al.

The researchers also expanded the system’s abilities by adding texture and color options. For example, using an appearance prompt like “Electric guitar in metallic purple,” LegoGPT can generate a guitar model, with bricks assigned a purple color.

Testing with robots and humans

To prove their designs worked in real life, the researchers had robots assemble the AI-created Lego models. They used a dual-robot arm system with force sensors to pick up and place bricks according to the AI-generated instructions.

Human testers also built some of the designs by hand, showing that the AI creates genuinely buildable models. “Our experiments show that LegoGPT produces stable, diverse, and aesthetically pleasing Lego designs that align closely with the input text prompts,” the team noted in its paper.

When tested against other AI systems for 3D creation, LegoGPT stands out through its focus on structural integrity. The team tested against several alternatives, including LLaMA-Mesh and other 3D generation models, and found its approach produced the highest percentage of stable structures.

A video of two robot arms building a LegoGPT creation, provided by the researchers.

Still, there are some limitations. The current version of LegoGPT only works within a 20×20×20 building space and uses a mere eight standard brick types. “Our method currently supports a fixed set of commonly used Lego bricks,” the team acknowledged. “In future work, we plan to expand the brick library to include a broader range of dimensions and brick types, such as slopes and tiles.”
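To make the build-space constraint concrete, here is a minimal sketch of validating brick placements inside a 20×20×20 volume. This is not the researchers' code; the `Brick` representation, single-plate-high simplification, and function names are all illustrative assumptions.

```python
from dataclasses import dataclass

GRID = 20  # LegoGPT's build volume is 20x20x20


@dataclass(frozen=True)
class Brick:
    x: int  # position of the brick's minimum corner
    y: int
    z: int
    w: int  # footprint in studs, e.g. a 2x4 brick has w=2, d=4
    d: int


def occupied_cells(b: Brick):
    """Voxel cells covered by a one-plate-high brick."""
    return {(b.x + i, b.y + j, b.z) for i in range(b.w) for j in range(b.d)}


def is_valid_layout(bricks):
    """True if every brick fits inside the grid and no two bricks overlap."""
    seen = set()
    for b in bricks:
        cells = occupied_cells(b)
        if any(not (0 <= cx < GRID and 0 <= cy < GRID and 0 <= cz < GRID)
               for cx, cy, cz in cells):
            return False  # brick sticks out of the build volume
        if cells & seen:
            return False  # brick collides with an earlier one
        seen |= cells
    return True
```

Bounds and collision checks like these are the easy part; the paper's harder contribution is verifying physical stability, which a toy occupancy grid like this does not attempt.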

The researchers also hope to scale up their training dataset to include more objects than the 21 categories currently available. Meanwhile, others can literally build on their work—the researchers released their dataset, code, and models on their project website and GitHub.

Linux kernel is leaving 486 CPUs behind, only 18 years after the last one was made

It’s not the first time Torvalds has suggested dropping support for 32-bit processors and relieving kernel developers from implementing archaic emulation and work-around solutions. “We got rid of i386 support back in 2012. Maybe it’s time to get rid of i486 support in 2022,” Torvalds wrote in October 2022. Barring major changes to the 6.15 kernel, which will likely arrive late this month, i486 support will be dropped.

Where does that leave people running a 486 system for whatever reason? They can run older versions of the Linux kernel and Linux distributions. They might find recommendations for teensy distros like MenuetOS, KolibriOS, and Visopsys, but all three of those require at least a Pentium. They can run FreeDOS. They might get away with the OS/2 descendant ArcaOS. There are some who have modified Windows XP to run on 486 processors, and hopefully, they will not connect those devices to the Internet.

Really, though, if you’re dedicated enough to running a 486 system in 2025, you’re probably resourceful enough to find copies of the software meant for that system. One thing about computers—you never stop learning.

This post was updated at 3:30 p.m. to fix a date error.

Fidji Simo joins OpenAI as new CEO of Applications

In the message, Altman described Simo as bringing “a rare blend of leadership, product and operational expertise” and expressed that her addition to the team makes him “even more optimistic about our future as we continue advancing toward becoming the superintelligence company.”

Simo becomes the newest high-profile female executive at OpenAI following the departure of Chief Technology Officer Mira Murati in September. Murati, who had been with the company since 2018 and helped launch ChatGPT, left alongside two other senior leaders and founded Thinking Machines Lab in February.

OpenAI’s evolving structure

The leadership addition comes as OpenAI continues to evolve beyond its origins as a research lab. In his announcement, Altman described how the company now operates in three distinct areas: as a research lab focused on artificial general intelligence (AGI), as a “global product company serving hundreds of millions of users,” and as an “infrastructure company” building systems that advance research and deliver AI tools “at unprecedented scale.”

Altman mentioned that as CEO of OpenAI, he will “continue to directly oversee success across all pillars,” including Research, Compute, and Applications, while staying “closely involved with key company decisions.”

The announcement follows recent news that OpenAI abandoned its original plan to cede control of its nonprofit branch to a for-profit entity. The company began as a nonprofit research lab in 2015 before creating a for-profit subsidiary in 2019, maintaining its original mission “to ensure artificial general intelligence benefits everyone.”

Cheaters Gonna Cheat Cheat Cheat Cheat Cheat

Cheaters. Kids these days, everyone says, are all a bunch of blatant cheaters via AI.

Then again, look at the game we are forcing them to play, and how we grade it.

If you earn your degree largely via AI, that changes two distinct things.

  1. You might learn different things.

  2. You might signal different things.

Both learning and signaling are under threat if there is too much blatant cheating.

There is too much cheating going on, too blatantly.

Why is that happening? Because the students are choosing to do it.

Ultimately, this is a preview of what will happen everywhere else as well. It is not a coincidence that AI starts its replacement of work in the places where the work is the most repetitive, useless and fake, but its ubiquitousness will not stay confined there. These are problems and also opportunities we will face everywhere. The good news is that in other places the resulting superior outputs will actually produce value.

  1. You Could Take The White Pill, But You Probably Won’t.

  2. Is Our Children Learning.

  3. Cheaters Never Stop Cheating.

  4. If You Know You Know.

  5. The Real Victims Here.

  6. Taking Note.

  7. What You Going To Do About It, Punk?

  8. How Bad Are Things?

  9. The Road to Recovery.

  10. The Whispering Earring.

As I always say, if you have access to AI, you can use it to (A) learn and grow strong and work better, or (B) you can use it to avoid learning, growing and working. Or you can always (C) refuse to use it at all, or perhaps (D) use it in strictly limited capacities that you choose deliberately to save time but avoid the ability to avoid learning.

Choosing (A) and using AI to learn better and smarter is strictly better than choosing (C) and refusing to use AI at all.

If you choose (B) and use AI to avoid learning, you might be better or worse off than choosing (C) and refusing to use AI at all, depending on the value of the learning you are avoiding.

If the learning in question is sufficiently worthless, there’s no reason to invest in it, and (B) is not only better than (C) but also better than (A).

Tim Sweeney: The question is not “is it cheating”, the question is “is it learning”.

James Walsh: AI has made Daniel more curious; he likes that whenever he has a question, he can quickly access a thorough answer. But when he uses AI for homework, he often wonders, If I took the time to learn that, instead of just finding it out, would I have learned a lot more?

I notice I am confused. What is the difference between ‘learning that’ and ‘just finding it out’? And what’s to stop Daniel from walking through a derivation or explanation with the AI if he wants to do that? I’ve done that a bunch with ML, and it’s great. o3’s example here was being told and memorizing that the integral of sin x is −cos x rather than deriving it, but that was what most students always did anyway.
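For the record, the gap between memorizing that fact and deriving it is one standard calculus step (this is just the textbook derivation, nothing specific to the article):

```latex
\frac{d}{dx}\bigl(-\cos x\bigr) = \sin x
\quad\Longrightarrow\quad
\int \sin x \, dx = -\cos x + C
```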

The path you take is up to you.

Ted Chiang: Using ChatGPT to complete tasks is like taking a forklift to the gym: you’ll never improve your cognitive abilities that way.

Ewan Morrison: AI is demoralising universities. Students who use AI, think “why bother to study or write when AI can do it for me?” Tutors who mark the essays, think “why bother to teach these students & why give a serious grade when 90% of essays are done with AI?”

I would instead ask, why are you assigning essays the AI can do for them, without convincing the students why they should still write the essays themselves?

The problem, as I understand it, is that in general students are more often than not:

  1. Not that interested in learning.

  2. Do not think that their assignments are a good way to learn.

  3. Quite interested in not working.

  4. Quite interested in getting good grades.

  5. Know how to use ChatGPT to avoid learning.

  6. Do not know how to use ChatGPT to learn, or it doesn’t even occur to them.

  7. Aware that if they did use ChatGPT to learn, it wouldn’t be via schoolwork.

Meatball Times: has anyone stopped to ask WHY students cheat? would a buddhist monk “cheat” at meditation? would an artist “cheat” at painting? no. when process and outcomes are aligned, there’s no incentive to cheat. so what’s happening differently at colleges? the answer is in the article.

Colin Fraser (being right): “would an artist ‘cheat’ at a painting?”

I mean… yes, famously.

Now that the cost of such cheating is close to zero I expect that we will be seeing a lot more of it!

James Walsh: Although Columbia’s policy on AI is similar to that of many other universities’ — students are prohibited from using it unless their professor explicitly permits them to do so, either on a class-by-class or case-by-case basis — Lee said he doesn’t know a single student at the school who isn’t using AI to cheat. To be clear, Lee doesn’t think this is a bad thing.

If the reward for painting is largely money, which it is, then clearly if you give artists the ability to cheat then many of them will cheat, as in things like forgery, as they often have in the past. The way to stop them is to catch the ones who try.

The reason the Buddhist monk presumably wouldn’t ‘cheat’ at meditation is because they are not trying to Be Observed Performing Meditation, they want to meditate. But yes, if they were getting other rewards for meditation, I’d expect some cheating, sure, even if the meditation also had intrinsic rewards.

Back to the school question. If the students did know how to use AI to learn, why would they need the school, or to do the assignments?

The entire structure of school is based on the thesis that students need to be forced to learn, and that this learning must be constantly policed.

The thesis has real validity. At this point, with not only AI but also YouTube and plenty of other free online materials, the primary educational (non-social, non-signaling) product is that the class schedule, physical presence, exams, and assignments serve as a forcing function to get you to do the damn work and pay attention, even if inefficiently.

Zito (quoting the NYMag article): The kids are cooked.

Yishan: One of my kids buys into the propaganda that AI is environmentally harmful (not helped by what xAI is doing in Memphis, btw), and so refuses to use AI for any help on learning tough subjects. The kid just does the work, grinding it out, and they are getting straight A’s.

And… now I’m thinking maybe I’ll stop trying to convince the kid otherwise.

It’s entirely not obvious whether it would be a good idea to convince the kid otherwise. Using AI is going to be the most important skill, and it can make the learning much better, but maybe it’s fine to let the kid wait given the downside risks of preventing that?

The reason taking such a drastic (in)action might make sense is that the kids know the assignments are stupid and fake. The whole thesis of commitment devices that lead to forced work is based on the idea that the kids (or their parents) understand that they do need to be forced to work, so they need this commitment device, and also that the commitment device is functional.

Now both of those halves are broken. The commitment devices don’t work, you can simply cheat. And the students are in part trying to be lazy, sure, but they’re also very consciously not seeing any value here. Lee here is not typical in that he goes on to actively create a cheating startup but I mean, hey, was he wrong?

James Walsh: “Most assignments in college are not relevant,” [Columbia student Lee] told me. “They’re hackable by AI, and I just had no interest in doing them.”

While other new students fretted over the university’s rigorous core curriculum, described by the school as “intellectually expansive” and “personally transformative,” Lee used AI to breeze through with minimal effort.

When I asked him why he had gone through so much trouble to get to an Ivy League university only to off-load all of the learning to a robot, he said, “It’s the best place to meet your co-founder and your wife.”

Bingo. Lee knew this is no way to learn. That’s not why he was there.

Columbia can call its core curriculum ‘intellectually expansive’ and ‘personally transformative’ all it wants. That doesn’t make it true, and it definitely isn’t fooling that many of the students.

The key fact about cheaters is that they not only never stop cheating on their own. They escalate the extent of their cheating until they are caught. Once you pop enough times, you can’t stop. Cheaters learn to cheat as a habit, not as the result of an expected value calculation in each situation.

For example, if you put a Magic: the Gathering cheater onto a Twitch stream, where they will leave video evidence of their cheating, will they stop? No, usually not.

Thus, you can literally be teaching ‘Ethics and AI’ and ask for a personal reflection, essentially writing a new line of Ironic, and they will absolutely get it from ChatGPT.

James Walsh: Less than three months later, teaching a course called Ethics and Artificial Intelligence, [Brian Patrick Green] figured a low-stakes reading reflection would be safe — surely no one would dare use ChatGPT to write something personal. But one of his students turned in a reflection with robotic language and awkward phrasing that Green knew was AI-generated.

This is a way to know students are indeed cheating rather than using AI to learn. The good news? Teachable moment.

Lee in particular clearly doesn’t have a moral compass in any of this. He doesn’t get the idea that cheating can be wrong even in theory:

For now, Lee hopes people will use Cluely to continue AI’s siege on education. “We’re going to target the digital LSATs; digital GREs; all campus assignments, quizzes, and tests,” he said. “It will enable you to cheat on pretty much everything.”

If you’re enabling widespread cheating on the LSATs and GREs, you’re no longer a morally ambiguous rebel against the system. Now you’re just a villain.

Or you can have a code:

James Walsh: Wendy, a freshman finance major at one of the city’s top universities, told me that she is against using AI. Or, she clarified, “I’m against copy-and-pasting. I’m against cheating and plagiarism. All of that. It’s against the student handbook.”

Then she described, step-by-step, how on a recent Friday at 8 a.m., she called up an AI platform to help her write a four-to-five-page essay due two hours later.

Wendy will use AI for ‘all aid short of copy-pasting,’ the same way you would use Google or Wikipedia or you’d ask a friend questions, but she won’t copy-and-paste. The article goes on to describe her full technique. AI can generate an outline, and brainstorm ideas and arguments, so long as the words are hers.

That’s not an obviously wrong place to draw the line. It depends on which part of the assignment is the active ingredient. Is Wendy supposed to be learning:

  1. How to structure, outline and manufacture a school essay in particular?

  2. How to figure out what a teacher wants her to do?

  3. ‘How to write’?

  4. How to pick a ‘thesis’?

  5. How to find arguments and bullet points?

  6. The actual content of the essay?

  7. An assessment of how good she is rather than grademaxxing?

Wendy says planning the essay is fun, but ‘she’d rather get good grades.’ As in, the system actively punishes her for trying to think about such questions rather than being the correct form of fake. She is still presumably learning about the actual content of the essay, and by producing it, if there’s any actual value to the assignment, and she pays attention, she’ll pick up the reasons why the AI makes the essay the way it does.

I don’t buy that this is going to destroy Wendy’s ‘critical thinking’ skills. Why are we teaching her that school essay structures and such are the way to train critical thinking? Everything in my school experience says the opposite.

The ‘cheaters’ who only cheat or lie a limited amount and then stop have a clear and coherent model of why what they are doing in the contexts they cheat or lie in is not cheating or why it is acceptable or justified, and this is contrasted with other contexts. Why some rules are valid, and others are not. Even then, it usually takes a far stronger person to hold that line than to not cheat in the first place.

Another way to look at this is, if it’s obvious from the vibes that you cheated, you cheated, even if the system can’t prove it. The level of obviousness varies, you can’t always sneak in smoking gun instructions.

But if you invoke the good Lord Bayes, you know.

James Walsh: Most of the writing professors I spoke to told me that it’s abundantly clear when their students use AI.

Not that they flag it.

Still, while professors may think they are good at detecting AI-generated writing, studies have found they’re actually not. One, published in June 2024, used fake student profiles to slip 100 percent AI-generated work into professors’ grading piles at a U.K. university. The professors failed to flag 97 percent.

But there’s a huge difference between ‘I flag this as AI and am willing to fight over this’ and knowing that something was probably or almost certainly AI.

What about automatic AI detectors? They’re detecting something. It’s noisy, and it’s different, it’s not that hard to largely fool if you care, and it has huge issues (especially for ESL students) but I don’t think either of these responses is an error?

I fed Wendy’s essay through a free AI detector, ZeroGPT, and it came back as 11.74 percent AI-generated, which seemed low given that AI, at the very least, had generated her central arguments. I then fed a chunk of text from the Book of Genesis into ZeroGPT and it came back as 93.33 percent AI-generated.

If you’re directly block-quoting Genesis without attribution, your essay is plagiarized. Maybe it came out of the AI and maybe it didn’t, but it easily could have; it knows Genesis and it’s allowed to quote from it. So 93% seems fine. Whereas Wendy’s essay is written by Wendy; the AI was used to make it conform to the dumb structures and passwords of the course. 11% seems fine.

Colin Fraser: I think we’ve somehow swung to overestimating the number of kids who are cheating with ChatGPT and simultaneously underestimating the amount of grief and hassle this creates for educators.

The guy making the cheating app wants you to think every single other person out there is cheating at everything and you’re falling behind if you’re not cheating. That’s not true. But the spectre of a few more plagiarized assignments per term is massively disruptive for teachers.

James Walsh: Many teachers now seem to be in a state of despair.

I’m sorry, what?

Given how estimations work, I can totally believe we might be overestimating the number of kids who are cheating. Of course, the number is constantly rising, especially for the broader definitions of ‘cheating,’ so even if you were overestimating at the time you might not be anymore.

But no, this is not about ‘a few more plagiarized assignments per term,’ both because this isn’t plagiarism it’s a distinct other thing, and also because by all reports it’s not only a few cases, it’s an avalanche even if underestimated.

Doing the assignments yourself is now optional unless you force the student to do it in front of you. Deal with it.

As for this being ‘grief and hassle’ for educators, yes, I am sure it is annoying when your system of forced fake work can be faked back at you more effectively and more often, and when there is a much better source of information and explanations available than you and your textbooks such that very little of what you are doing really has a point to it anymore.

If you think students have to do certain things themselves in order to learn, then as I see it you have two options, you can do either or both.

  1. Use frequent in-person testing, both as the basis of grades and as a forcing function so that students learn. This is a time honored technique.

  2. Use in-person assignments and tasks, so you can prevent AI use. This is super annoying but it has other advantages.

Alternatively or in addition to this, you can embrace AI and design new tasks and assignments that cause students to learn together with the AI. That’s The Way.

Trying to ‘catch’ the ‘cheating’ is pointless. It won’t work. Trying only turns this at best into a battle over obscuring tool use and makes the whole experience adversarial.

If you assign fake essay forms to students, and then grade them on those essays and use those grades to determine their futures, what the hell do you think is going to happen? This form of essay assignment is no longer valid, and if you assign it anyway you deserve what you get.

James Walsh: “I think we are years — or months, probably — away from a world where nobody thinks using AI for homework is considered cheating,” [Lee] said.

I think that is wrong. We are a long way away from the last people giving up this ghost. But seriously it is pretty insane to think ‘using AI for homework’ is cheating. I’m actively trying to get my kids to use AI for homework more, not less.

James Walsh: In January 2023, just two months after OpenAI launched ChatGPT, a survey of 1,000 college students found that nearly 90 percent of them had used the chatbot to help with homework assignments.

What percentage of that 90% was ‘cheating’? We don’t know, and definitions differ, but I presume a lot less than all of them.

Now and also going forward, I think you could say that particular specific uses are indeed really cheating, and it depends how you use it. But if you think ‘use AI to ask questions about the world and learn the answer’ is ‘cheating’ then explain what the point of the assignment was, again?

The whole enterprise is broken, and will be broken while there is a fundamental disconnect between what is measured and what they want to be managing.

James Walsh: Williams knew most of the students in this general-education class were not destined to be writers, but he thought the work of getting from a blank page to a few semi-coherent pages was, above all else, a lesson in effort. In that sense, most of his students utterly failed.

[Jollimore] worries about the long-term consequences of passively allowing 18-year-olds to decide whether to actively engage with their assignments.

The entire article makes clear that students almost never buy that their efforts would be worthwhile. A teacher can think ‘this will teach them effort’ but if that’s the goal then why not go get an actual job? No one is buying this, so if the grades don’t reward effort, why should there be effort?

How dare you let 18-year-olds decide whether to engage with their assignments that produce no value to anyone but themselves.

This is all flat out text.

The ideal of college as a place of intellectual growth, where students engage with deep, profound ideas, was gone long before ChatGPT.

In a way, the speed and ease with which AI proved itself able to do college-level work simply exposed the rot at the core.

There’s no point. Was there ever a point?

“The students kind of recognize that the system is broken and that there’s not really a point in doing this. Maybe the original meaning of these assignments has been lost or is not being communicated to them well.”

The question is, once you know, what do you do about it? How do you align what is measured with what is to be managed? What exactly do you want from the students?

James Walsh: The “true attempt at a paper” policy ruined Williams’s grading scale. If he gave a solid paper that was obviously written with AI a B, what should he give a paper written by someone who actually wrote their own paper but submitted, in his words, “a barely literate essay”?

What is measured gets managed. You either give the better grade to the ‘barely literate’ essay, or you don’t.

My children get assigned homework. The school’s literal justification – I am not making this up, I am not paraphrasing – is that they need to learn to do homework so that they will be prepared to do more homework in the future. Often this involves giving them assignments that we have to walk them through because there is no reasonable way for them to understand what is being asked.

If it were up to me, damn right I’d have them use AI.

It’s not just the students: Multiple AI platforms now offer tools to leave AI-generated feedback on students’ essays. Which raises the possibility that AIs are now evaluating AI-generated papers, reducing the entire academic exercise to a conversation between two robots — or maybe even just one.

Great! Now we can learn.

Another AI application to university is note taking. AI can do excellent transcription and rather strong active note taking. Is that a case of learning, or of not learning? There are competing theories, which I think are true for different people at different times.

  1. One theory says that the act of taking notes is how you learn, by forcing you to pay attention, distill the information and write it in your own words.

  2. The other theory is that having to take notes prevents you from actually paying ‘real’ attention and thinking and engaging, you’re too busy writing down factual information.

AI also means that even if you don’t have it take notes or a transcript, you don’t have to worry as much about missing facts, because you can ask the AI for them later.

My experience is that having to take notes is mostly a negative. Every time I focus on writing something down that means I’m not listening, or not fully listening, and definitely not truly thinking.

Rarely did she sit in class and not see other students’ laptops open to ChatGPT.

Of course your laptop is open to an AI. It’s like being able to ask the professor any questions you like without interrupting the class or paying any social costs, including stupid questions. If there’s a college lecture, and at no point do you want to ask Gemini, Claude or o3 any questions, what are you even doing? That also means everyone gets to learn much better, removing the tradeoff of each question disrupting the rest of the class.

Similarly, devising study materials and practice tests seems clearly good.

The most amazing thing about the AI ‘cheating’ epidemic at universities is the extent to which the universities are content to go quietly into the night. They are mostly content to let nature take its course.

Could the universities adapt to the new reality? Yes, but they choose not to.

Cat Zhang: more depressing than Trump’s funding slashes and legal assaults and the Chat-GPT epidemic is witnessing how many smart, competent people would rather give up than even begin to think of what we could do about it

Tyler Austin Harper: It can’t be emphasized enough: wide swaths of the academy have given up re ChatGPT. Colleges have had since 2022 to figure something out and have done less than nothing. Haven’t even tried. Or tried to try. The administrative class has mostly collaborated with the LLM takeover.

Hardly anyone in this country believes in higher ed, especially the institutions themselves which cannot be mustered to do anything in their own defense. Faced with an existential threat, they can’t be bothered to cry, yawn, or even bury their head in the sand, let alone resist.

It would actually be more respectable if they were in denial, but the pervading sentiment is “well, we had a good run.” They don’t even have the dignity of being delusional. It’s shocking. Three years in and how many universities can you point to that have tried anything really?

If the AI crisis points to anything it’s that higher ed has been dead a long time, before ChatGPT was a twinkle in Sam Altman’s eye. The reason the universities can’t be roused to their own defense is that they’re being asked to defend a corpse and the people who run them know it.

They will return to being finishing schools once again.

To paraphrase Alan Moore, this is one of those moments where colleges need to look at what’s on the table and (metaphorically) say: “Thank you, but I’d rather die behind the chemical sheds.” Instead, we get an OpenAI and Cal State partnership. Total, unapologetic capitulation.

The obvious interpretation is that college had long shifted into primarily being a Bryan Caplan style set of signaling mechanisms, so the universities are not moving to defend themselves against students who seek to avoid learning.

The problem is, this also destroys key portions of the underlying signals.

Greg Lukianoff: [Tyler’s statement above is] powerful evidence of the signaling hypothesis, that essentially the primary function of education is to signal to future employers that you were probably pretty smart and conscientious to get into college in the first place, and pretty, as @bryan_caplan puts it, “conservative” in a (non-political sense) to be able to finish it. Therefore graduates may be potentially competent and compliant employees.

Seems like there are far less expensive ways to convey that information.

Clark H: The problem is the signal is now largely false. It takes much less effort to graduate from college now – just crudely ask GPT to do it. There is even a case to be made that, like a prison teaches how to crime, college now teaches how to cheat.

v8pAfNs82P1foT: There’s a third signal of value to future employers: conformity to convention/expectation. There are alternative credible pathways to demonstrate intelligence and sustained diligence. But definitionally, the only way to credibly signal willingness to conform is to conform.

Megan McArdle: The larger problem is that a degree obtained by AI does not signal the information they are trying to convey, so its value is likely to collapse quickly as employers get wise. There will be a lag, because cultural habits die hard, but eventually the whole enterprise will implode unless they figure out how to teach something that employers will pay a premium for.

Matthew Yglesias: I think this is all kind of missing the boat, the same AI that can pass your college classes for you is radically devaluing the skills that a college degree (whether viewed as real learning or just signaling or more plausibly a mix) used to convey in the market.

The AI challenge for higher education isn’t that it’s undermining the assessment protocols (as everyone has noticed you can fix this with blue books or oral exams if you bother trying) it’s that it’s undermining the financial value of the degree!

Megan McArdle: Eh, conscientiousness is likely to remain valuable, I think. They also provide ancillary marriage market and networking services that arguably get more valuable in an age of AI.

Especially at elite schools. If you no longer have to spend your twenties and early thirties prepping for the PUMC rat race, why not get married at 22 and pop out some babies while you still have energy to chase them?

But anyway, yes, this is what I was saying, apparently not clearly enough: the problem is not just that you can’t assess certain kinds of paper-writing skills, it’s that the skills those papers were assessing will decline in value.

Periodically you see talk about how students these days (or kids these days) are in trouble. How they’re stupider, less literate, they can’t pay attention, they’re lazy and refuse to do work, and so on.

“We’re talking about an entire generation of learning perhaps significantly undermined here,” said Green, the Santa Clara tech ethicist. “It’s short-circuiting the learning process, and it’s happening fast.”

The thing is, this is a Pessimists Archive speciality, this pattern dates back at least to Socrates. People have always worried about this, and the opposite has very clearly been true overall. It’s learning, and also many other things, where ‘kids these days’ are always ‘in crisis’ and ‘falling behind’ and ‘at risk’ and so on.

My central explanation for this is that as times change, people compare kids now to kids of old both through rose-colored memory glasses, and also by checking against the exact positive attributes of the previous generations. Meanwhile, as times change, the portfolio of skills and knowledge shifts. Today’s kids are masters at many things that didn’t even exist in my youth. That’s partly going to be a shift away from other things, most of which are both less important than the new priorities and less important than they used to be.

Ron Arts: Most important sentence in the article: “There might have been people complaining about machinery replacing blacksmiths in, like, the 1600s or 1800s, but now it’s just accepted that it’s useless to learn how to blacksmith.”

George Turner: Blacksmithing is an extremely useful skill. Even if I’m finishing up the part on a big CNC machine or with an industrial robot, there are times when smithing saves me a lot of time.

Bob BTC: Learning a trade is far different than learning to think!

Is it finally ‘learning to think’ this time? Really? Were they reading the Sequences? Could previous students have written them?

And yes, people really will use justifications for our university classes that are about as strong as ‘blacksmithing is an extremely useful skill.’

So we should be highly suspicious of yet another claim of new tech destroying kids’ ability to learn, especially when it is also the greatest learning tool in human history.

Notice how much better it is to use AI than it is to hire a human to do your homework, if both had the same cost, speed and quality profiles.

For $15.95 a month, Chegg promised answers to homework questions in as little as 30 minutes, 24/7, from the 150,000 experts with advanced degrees it employed, mostly in India. When ChatGPT launched, students were primed for a tool that was faster, more capable.

With AI, you create the prompt and figure out how to frame the assignment, you can ask follow-up questions, you are in control. With hiring a human, you are much less likely to do any of that. It matters.

Ultimately, this particular cataclysm is not one I am so worried about. I don’t think our children were learning before, and they have much better opportunity to do so now. I don’t think they were acting with or being selected for integrity at university before, either. And if this destroys the value of degrees? Mostly, I’d say: Good.

If you are addicted to TikTok, ChatGPT or your phone in general, it can get pretty grim, as was often quoted.

James Walsh: Rarely did she sit in class and not see other students’ laptops open to ChatGPT. Toward the end of the semester, she began to think she might be dependent on the website. She already considered herself addicted to TikTok, Instagram, Snapchat, and Reddit, where she writes under the username maybeimnotsmart. “I spend so much time on TikTok,” she said. “Hours and hours, until my eyes start hurting, which makes it hard to plan and do my schoolwork. With ChatGPT, I can write an essay in two hours that normally takes 12.”

The ‘catch’ that isn’t mentioned is that She Got Better.

Colin Fraser: Kind of an interesting omission. Not THAT interesting or anything but, you know, why didn’t he put that in the article?

I think it’s both interesting and important context. If your example of a student addicted to ChatGPT and her phone beat that addiction, that’s highly relevant. It’s totally within Bounded Distrust rules to not mention it, but hot damn. Also, congrats to maybeimnotsmart.

Ultimately the question is, if you have access to increasingly functional copies of The Whispering Earring, what should you do with that? If others get access to it, what then? What do we do about educational situations ‘getting there first’?

In case you haven’t read The Whispering Earring, it’s short and you should, and I’m very confident the author won’t mind, so here’s the whole story.

Scott Alexander: Clarity didn’t work, trying mysterianism.

In the treasure-vaults of Til Iosophrang rests the Whispering Earring, buried deep beneath a heap of gold where it can do no further harm.

The earring is a little topaz tetrahedron dangling from a thin gold wire. When worn, it whispers in the wearer’s ear: “Better for you if you take me off.” If the wearer ignores the advice, it never again repeats that particular suggestion.

After that, when the wearer is making a decision the earring whispers its advice, always of the form “Better for you if you…”. The earring is always right. It does not always give the best advice possible in a situation. It will not necessarily make its wearer King, or help her solve the miseries of the world. But its advice is always better than what the wearer would have come up with on her own.

It is not a taskmaster, telling you what to do in order to achieve some foreign goal. It always tells you what will make you happiest. If it would make you happiest to succeed at your work, it will tell you how best to complete it. If it would make you happiest to do a half-assed job at your work and then go home and spend the rest of the day in bed having vague sexual fantasies, the earring will tell you to do that. The earring is never wrong.

The Book of Dark Waves gives the histories of two hundred seventy four people who previously wore the Whispering Earring. There are no recorded cases of a wearer regretting following the earring’s advice, and there are no recorded cases of a wearer not regretting disobeying the earring. The earring is always right.

The earring begins by only offering advice on major life decisions. However, as it gets to know a wearer, it becomes more gregarious, and will offer advice on everything from what time to go to sleep, to what to eat for breakfast. If you take its advice, you will find that breakfast food really hit the spot, that it was exactly what you wanted for breakfast that day even though you didn’t know it yourself. The earring is never wrong.

As it gets completely comfortable with its wearer, it begins speaking in its native language, a series of high-bandwidth hisses and clicks that correspond to individual muscle movements. At first this speech is alien and disconcerting, but by the magic of the earring it begins to make more and more sense. No longer are the earring’s commands momentous on the level of “Become a soldier”. No more are they even simple on the level of “Have bread for breakfast”. Now they are more like “Contract your biceps muscle about thirty-five percent of the way” or “Articulate the letter p”. The earring is always right. This muscle movement will no doubt be part of a supernaturally effective plan toward achieving whatever your goals at that moment may be.

Soon, reinforcement and habit-formation have done their trick. The connection between the hisses and clicks of the earring and the movements of the muscles has become instinctual, no more conscious than the reflex of jumping when someone hidden gives a loud shout behind you.

At this point no further change occurs in the behavior of the earring. The wearer lives an abnormally successful life, usually ending out as a rich and much-beloved pillar of the community with a large and happy family.

When Kadmi Rachumion came to Til Iosophrang, he took an unusual interest in the case of the earring. First, he confirmed from the records and the testimony of all living wearers that the earring’s first suggestion was always that the earring itself be removed. Second, he spent some time questioning the Priests of Beauty, who eventually admitted that when the corpses of the wearers were being prepared for burial, it was noted that their brains were curiously deformed: the neocortexes had wasted away, and the bulk of their mass was an abnormally hypertrophied mid- and lower-brain, especially the parts associated with reflexive action.

Finally, Kadmi-nomai asked the High Priest of Joy in Til Iosophrang for the earring, which he was given. After cutting a hole in his own earlobe with the tip of the Piercing Star, he donned the earring and conversed with it for two hours, asking various questions in Kalas, in Kadhamic, and in its own language. Finally he removed the artifact and recommended that it be locked in the deepest and most inaccessible parts of the treasure vaults, a suggestion with which the Iosophrelin decided to comply.

This is very obviously not the optimal use of The Whispering Earring, let alone the ability to manufacture copies of it.

But, and our future may depend on the answer, what is your better plan? And in particular, what is your plan for when everyone has access to (a for now imperfect and scope limited but continuously improving) one, and you are at a rather severe disadvantage if you do not put one on?

The actual problem we face is far trickier than that. Both in education, and in general.

Cheaters Gonna Cheat Cheat Cheat Cheat Cheat

Report: DOGE supercharges mass-layoff software, renames it to sound less dystopian

“It is not clear how AutoRIF has been modified or whether AI is involved in the RIF mandate (through AutoRIF or independently),” Kunkler wrote. “However, fears of AI-driven mass-firings of federal workers are not unfounded. Elon Musk and the Trump Administration have made no secret of their affection for the dodgy technology and their intentions to use it to make budget cuts. And, in fact, they have already tried adding AI to workforce decisions.”

Automating layoffs can perpetuate bias, increase worker surveillance, and erode transparency to the point where workers don’t know why they were let go, Kunkler said. For government employees, such imperfect systems risk triggering confusion over worker rights or obscuring illegal firings.

“There is often no insight into how the tool works, what data it is being fed, or how it is weighing different data in its analysis,” Kunkler said. “The logic behind a given decision is not accessible to the worker and, in the government context, it is near impossible to know how or whether the tool is adhering to the statutory and regulatory requirements a federal employment tool would need to follow.”

The situation gets even starker when you imagine mistakes on a mass scale. Don Moynihan, a public policy professor at the University of Michigan, told Reuters that “if you automate bad assumptions into a process, then the scale of the error becomes far greater than an individual could undertake.”

“It won’t necessarily help them to make better decisions, and it won’t make those decisions more popular,” Moynihan said.

The only way to shield workers from potentially illegal firings, Kunkler suggested, is to support unions defending worker rights while pushing lawmakers to intervene. Calling on Congress to ban the use of shadowy tools relying on unknown data points to gut federal agencies “without requiring rigorous external testing and auditing, robust notices and disclosure, and human decision review,” Kunkler said rolling out DOGE’s new tool without more transparency should be widely condemned as unacceptable.

“We must protect federal workers from these harmful tools,” Kunkler said, adding, “If the government cannot or will not effectively mitigate the risks of using automated decision-making technology, it should not use it at all.”


Open source project curl is sick of users submitting “AI slop” vulnerabilities

Ars has reached out to HackerOne for comment and will update this post if we get a response.

“More tools to strike down this behavior”

In an interview with Ars, Stenberg said he was glad his post—which generated 200 comments and nearly 400 reposts as of Wednesday morning—was getting around. “I’m super happy that the issue [is getting] attention so that possibly we can do something about it [and] educate the audience that this is the state of things,” Stenberg said. “LLMs cannot find security problems, at least not like they are being used here.”

This week has seen four such misguided, obviously AI-generated vulnerability reports seemingly seeking either reputation or bug bounty funds, Stenberg said. “One way you can tell is it’s always such a nice report. Friendly phrased, perfect English, polite, with nice bullet-points … an ordinary human never does it like that in their first writing,” he said.

Some AI reports are easier to spot than others. One reporter accidentally pasted their prompt into the report, Stenberg said, “and he ended it with, ‘and make it sound alarming.’”

Stenberg said he had “talked to [HackerOne] before about this” and has reached out to the service this week. “I would like them to do something, something stronger, to act on this. I would like help from them to make the infrastructure around [AI tools] better and give us more tools to strike down this behavior,” he said.

In the comments of his post, Stenberg, trading comments with Tobias Heldt of open source security firm XOR, suggested that bug bounty programs could potentially use “existing networks and infrastructure.” Security reporters paying a bond to have a report reviewed “could be one way to filter signals and reduce noise,” Heldt said. Elsewhere, Stenberg said that while AI reports are “not drowning us, [the] trend is not looking good.”

Stenberg has previously blogged on his own site about AI-generated vulnerability reports, with more details on what they look like and what they get wrong. Seth Larson, security developer-in-residence at the Python Software Foundation, added to Stenberg’s findings with his own examples and suggested actions, as noted by The Register.

“If this is happening to a handful of projects that I have visibility for, then I suspect that this is happening on a large scale to open source projects,” Larson wrote in December. “This is a very concerning trend.”


The Third Crisis dawns in Foundation S3 teaser

We have our first teaser for the upcoming third season of Foundation.

It’s been nearly two years, but the third season of Foundation, Apple TV+’s epic adaptation (or remix) of the Isaac Asimov series, is almost here. The streaming platform released an action-packed teaser of what we can expect from the new ten-episode season: the onset of the Third Crisis, a galactic war, and a shirtless Lee Pace.

(Some spoilers for the first two seasons below.)

Showrunner David S. Goyer took great pains in S1 to carefully set up his expansive fictional world, and the scope only broadened in the second season. As previously reported, Asimov’s fundamental narrative arc remains intact, with the series taking place across multiple planets over 1,000 years and featuring a huge cast of characters.

Mathematician Hari Seldon (Jared Harris) developed a controversial theory of “psychohistory,” and his calculations predict the fall of the Empire, ushering in a Dark Age period that will last 30,000 years, after which a second Empire will emerge. The collapse of the Empire is inevitable, but Seldon has a plan to reduce the Dark Ages to a mere 1,000 years through the establishment of a Foundation to preserve all human knowledge so that civilization need not rebuild itself entirely from scratch. He is aided in this endeavor by his math prodigy protegé Gaal Dornick (Lou Llobell).

The biggest change from the books is the replacement of the Empire’s ruling committee with a trio of Eternal Emperor clones called the Cleons—a genetic triune dynasty composed of Brother Day (Pace), Brother Dusk (Terrence Mann), and Brother Dawn (Cassian Bilton). Technically, they are all perfect incarnations of the same man at different ages, and this is both the source of their strength as a team and of their conflicts. Their guardian is an android, Eto Demerzel (Laura Birn), one of the last surviving androids from the ancient Robot Wars, who is programmed to protect the dynasty at all costs.


NASA scrambles to cut ISS activity after Trump budget—its options are not great

NASA has not publicly announced the astronauts who will fly on Crew-12 next year, but according to sources, it has already assigned veteran astronaut Jessica Meir and newcomer Jack Hathaway, a former US Navy fighter pilot who joined NASA’s astronaut corps in 2021. If these changes go through, presumably one of these two would be removed from the mission.

Will this actually happen?

The cuts are by no means a certainty. The president’s budget proposal is just the beginning of a monthslong process in which the White House Office of Management and Budget will work with Congress to establish funding levels and programmatic priorities for fiscal year 2026. If this budget process is like those in years past, a final budget may not even be set by the start of the fiscal year this October.

Congress has been broadly supportive of the space station, which is slated to fly through 2030 before being decommissioned. The Trump White House nominee to lead NASA, Jared Isaacman, also spoke in favor of “maximizing” science on the space station during his confirmation hearing last month. In subsequent answers to written questions, Isaacman reaffirmed this position.

“My priority would be to maximize the remaining value of the ISS before it is decommissioned,” Isaacman wrote. “We must prioritize the highest-potential science and research that can be conducted on the station—and do everything possible to ‘crack the code’ on an on-orbit economy.”

This comment reflects a desire to focus on science that will help jump-start a commercial economy in low-Earth orbit, as opposed to the White House budget’s desire to focus on research related to the Moon and Mars.

Isaacman has not been confirmed yet—that should happen within the next couple of weeks—so he did not have direct input into setting the White House budget proposal. That process was led by Russell Vought, who leads the White House Office of Management and Budget.


Zuckerberg’s Dystopian AI Vision

You think it’s bad now? Oh, you have no idea. In his talks with Ben Thompson and Dwarkesh Patel, Zuckerberg lays out his vision for our AI future.

I thank him for his candor. I’m still kind of boggled that he said all of it out loud.

We will start with the situation now. How are things going on Facebook in the AI era?

Oh, right.

Sakib: Again, it happened again. Opened Facebook and I saw this. I looked at the comments and they’re just unsuspecting boomers congratulating the fake AI gen couple😂

Deepfates: You think those are real boomers in the comments?

This continues to be 100% Zuckerberg’s fault, and 100% an intentional decision.

The algorithm knows full well what kind of post this is. It still floods people with them, especially if you click even once. If they wanted to stop it, they easily could.

There’s also the rather insane and deeply embarrassing AI bot accounts they have tried out on Facebook and Instagram.

Compared to his vision of the future? You ain’t seen nothing yet.

Ben Thompson interviewed Mark Zuckerberg, centering on business models.

It was like if you took a left wing caricature of why Zuckerberg is evil, combined it with a left wing caricature about why AI is evil, and then fused them into their final form. Except it’s coming directly from Zuckerberg, as explicit text, on purpose.

It’s understandable that many leave such interviews and related stories saying this:

Ewan Morrison: Big tech atomises you, isolates you, makes you lonely and depressed – then it rents you an AI friend, an AI therapist, an AI lover.

Big tech are parasites who pretend they are here to help you.

When asked what he wants to use AI for, Zuckerberg’s primary answer is advertising, in particular an ‘ultimate black box’ where you ask for a business outcome and the AI does what it takes to make that outcome happen. I leave all the ‘do not want’ and ‘misalignment maximalist goal out of what you are literally calling a black box, film at 11 if you need to watch it again’ and ‘general dystopian nightmare’ details as an exercise to the reader. He anticipates that advertising will then grow from the current 1%-2% of GDP to something more, and Thompson is ‘there with’ him, ‘everyone should embrace the black box.’

His number two use is ‘growing engagement on the customer surfaces and recommendations.’ As in, advertising by another name, and using AI in predatory fashion to maximize user engagement and drive addictive behavior.

In case you were wondering if it stops being this dystopian after that? Oh, hell no.

Mark Zuckerberg: You can think about our products as there have been two major epochs so far.

The first was you had your friends and you basically shared with them and you got content from them and now, we’re in an epoch where we’ve basically layered over this whole zone of creator content.

So the stuff from your friends and followers and all the people that you follow hasn’t gone away, but we added on this whole other corpus around all this content that creators have that we are recommending.

Well, the third epoch is I think that there’s going to be all this AI-generated content…

So I think that these feed type services, like these channels where people are getting their content, are going to become more of what people spend their time on, and the better that AI can both help create and recommend the content, I think that that’s going to be a huge thing. So that’s kind of the second category.

The third big AI revenue opportunity is going to be business messaging.

And the way that I think that’s going to happen, we see the early glimpses of this because business messaging is actually already a huge thing in countries like Thailand and Vietnam.

So what will unlock that for the rest of the world? It’s like, it’s AI making it so that you can have a low cost of labor version of that everywhere else.

Also he thinks everyone should have an AI therapist, and that people want more friends so AI can fill in for the missing humans there. Yay.

PoliMath: I don’t really have words for how much I hate this

But I also don’t have a solution for how to combat the genuine isolation and loneliness that people suffer from

AI friends are, imo, just a drug that lessens the immediate pain but will probably cause far greater suffering

Well, I guess the fourth one is the normal ‘everyone use AI now,’ at least?

And then, the fourth is all the more novel, just AI first thing, so like Meta AI.

He also blames Llama-4’s terrible reception on user error in setup, and says they now offer an API so people have a baseline implementation to point to, and says essentially ‘well of course we built a version of Llama-4 specifically to score well on Arena, that only shows off how easy it is to steer it, it’s good actually.’ Neither of them, of course, even bothers to mention any downside risks or costs of open models.

The killer app of Meta AI is that it will know all about your activity on Facebook and Instagram and use it for you, and also let you essentially ‘talk to the algorithm,’ which I do admit is kind of interesting. But I notice Zuckerberg didn’t mention an option to tell it to alter the algorithm, and Thompson didn’t ask.

There is one area where I like where his head is at:

I think one of the things that I’m really focused on is how can you make it so AI can help you be a better friend to your friends, and there’s a lot of stuff about the people who I care about that I don’t remember, I could be more thoughtful.

There are all these issues where it’s like, “I don’t make plans until the last minute”, and then it’s like, “I don’t know who’s around and I don’t want to bug people”, or whatever. An AI that has good context about what’s going on with the people you care about, is going to be able to help you out with this.

That is… not how I would implement this kind of feature, and indeed the more details you read the more Zuckerberg seems determined to do even the right thing in the most dystopian way possible, but as long as it’s fully opt-in (if not, wowie moment of the week) then at least we’re trying at all.

Also interviewing Mark Zuckerberg is Dwarkesh Patel. There was good content here, Zuckerberg in many ways continues to be remarkably candid. But it wasn’t as dense or hard hitting as many of Patel’s other interviews.

One key difference between the interviews is that when Zuckerberg lays out his dystopian vision, you get the sense that Thompson is for it, whereas Patel is trying to express that maybe we should be concerned. Another is that Patel notices that there might be more important things going on, whereas to Thompson nothing could be more important than enhancing ad markets.

  1. When asked what changed since Llama 3, Zuckerberg leads off with the ‘personalization loop.’

  2. Zuckerberg still claims Llama 4 Scout and Maverick are top notch. Okie dokie.

  3. He doubles down on ‘open source will become most used this year’ and that this year has been Great News For Open Models. Okie dokie.

  4. His heart’s clearly not in claiming it’s a good model, sir. His heart is in it being a good model for Meta’s particular commercial purposes and ‘product value’ as per people’s ‘revealed preferences.’ Those are the modes he talked about with Thompson.

  5. He’s very explicit about this. OpenAI and Anthropic are going for AGI and a world of abundance, with Anthropic focused on coding and OpenAI towards reasoning. Meta wants fast, cheap, personalized, easy to interact with all day, and (if you add what he said to Thompson) to optimize feeds and recommendations for engagement, and to sell ads. It’s all for their own purposes.

  6. He says Meta is specifically creating AI tools to write their own code for internal use, but I don’t understand what makes that different from a general AI coder? Or why they think their version is going to be better than using Claude or Gemini? This feels like some combination of paranoia and bluff.

  7. Thus, Meta seems to at this point be using the open model approach as a recruiting or marketing tactic? I don’t know what else it’s actually doing for them.

  8. As Dwarkesh notes, Zuckerberg is basically buying the case for superintelligence and the intelligence explosion, then ignoring it to form an ordinary business plan, and of course to continue to have their safety plan be ‘lol we’re Meta’ and release all their weights.

  9. I notice I am confused why their tests need hundreds of thousands or millions of people to be statistically significant? Impacts must be very small, and the statistical techniques they’re using don’t seem great. But also, it is telling that his first thought of experiments to run with AI is experiments run on his users.

  10. In general, Zuckerberg seems to be thinking he’s running an ordinary dystopian tech company doing ordinary dystopian things (except he thinks they’re not dystopian, which is why he talks about them so plainly and clearly) while other companies do other ordinary things, and has put all the intelligence explosion related high weirdness totally out of his mind or minimized it to specific use cases, even though he intellectually knows that isn’t right.

  11. He, CEO of Meta, says people use what is valuable to them and people are smart and know what is valuable in their lives, and when you think otherwise you’re usually wrong. Cue the laugh track.

  12. First named use case is talking through difficult conversations they need to have. I do think that’s actually a good use case candidate, but also easy to pervert.

  13. (29:40) The friend quote: The average American only has three friends ‘but has demand for meaningfully more, something like 15… They want more connection than they have.’ His core prediction is that AI connection will be a complement to human connection rather than a substitute.

    1. I tentatively agree with Zuckerberg, if and only if the AIs in question are engineered (by the developer, user or both, depending on context) to be complements rather than substitutes. You can make it one way.

    2. However, when I see Meta’s plans, it seems they are steering it the other way.

  14. Zuckerberg is making a fully general defense of adversarial capitalism and attention predation – if people are choosing to do something, then later we will see why it turned out to be valuable for them and why it adds value to their lives, including virtual therapists and virtual girlfriends.

    1. But this proves (or implies) far too much as a general argument. It suggests full anarchism and zero consumer protections. It applies to heroin or joining cults or being in abusive relationships or marching off to war and so on. We all know plenty of examples of self-destructive behaviors. Yes, the great classical liberal insight is that mostly you are better off if you let people do what they want, and getting in the way usually backfires.

    2. If you add AI into the mix, especially AI that moves beyond a ‘mere tool,’ and you consider highly persuasive AIs and algorithms, asserting ‘whatever the people choose to do must be benefiting them’ is Obvious Nonsense.

    3. I do think virtual therapists have a lot of promise as value adds, if done well. And also great danger to do harm, if done poorly or maliciously.

  15. Dwarkesh points out the danger of technology reward hacking us, and again Zuckerberg just triples down on ‘people know what they want.’ People wouldn’t let there be things constantly competing for their attention, so the future won’t be like that, he says. Is this a joke?

  16. I do get that the right way to design AI-AR glasses is as great glasses that also serve as other things when you need them and don’t flood your vision, and that the wise consumer will pay extra to ensure it works that way. But where is this trust in consumers coming from? Has Zuckerberg seen the internet? Has he seen how people use their smartphones? Oh, right, he’s largely directly responsible.

    1. Frankly, the reason I haven’t tried Meta’s glasses is that Meta makes them. They do sound like a nifty product otherwise, if execution is good.

  17. Zuckerberg is a fan of various industrial policies, praising the export controls and calling on America to help build new data centers and related power sources.

  18. Zuckerberg asks, would others be doing open models if Meta wasn’t doing it? Aren’t they doing this because otherwise ‘they’re going to lose?’

    1. Do not flatter yourself, sir. They’re responding to DeepSeek, not you. And in particular, they’re doing it to squash the idea that r1 means DeepSeek or China is ‘winning.’ Meta’s got nothing to do with it, and you’re not pushing things in the open direction in a meaningful way at this point.

  19. His case for why the open models need to be American is that our models embody an American view of the world in a way that Chinese models don’t. Even if you agree that is true, it doesn’t answer Dwarkesh’s point that everyone can easily switch models whenever they want. Zuckerberg then does mention the potential for backdoors, which is a real concern since ‘open model’ only means open weights; they’re not actually open source, so you can’t rule out a backdoor.

  20. Zuckerberg says the point of Llama Behemoth will be the ability to distill it. So making that an open model is specifically so that the work can be distilled. But that’s something we don’t want the Chinese to do, asks Padme?

  21. And then we have a section on ‘monetizing AGI’ where Zuckerberg indeed goes right to ads and arguing that ads done well add value. Which they must, since consumers choose to watch them, I suppose, per his previous arguments?

To be fair, yes, it is hard out there. We all need a friend and our options are limited.

Roman Helmet Guy (reprise from last week): Zuckerberg explaining how Meta is creating personalized AI friends to supplement your real ones: “The average American has 3 friends, but has demand for 15.”

Daniel Eth: This sounds like something said by an alien from an antisocial species that has come to earth and is trying to report back to his kind what “friends” are.

Sam Ro: imagine having 15 friends.

Modest Proposal (quoting Chris Rock): “The Trenchcoat Mafia. No one would play with us. We had no friends. The Trenchcoat Mafia. Hey I saw the yearbook picture it was six of them. I ain’t have six friends in high school. I don’t got six friends now.”

Kevin Roose: The Meta vision of AI — hologram Reelslop and AI friends keeping you company while you eat breakfast alone — is so bleak I almost can’t believe they’re saying it out loud.

Exactly how dystopian are these ‘AI friends’ going to be?

GFodor.id (being modestly unfair): What he’s not saying is those “friends” will seem like real people. Your years-long friendship will culminate when they convince you to buy a specific truck. Suddenly, they’ll blink out of existence, having delivered a conversion to the company who spent $3.47 to fund their life.

Soible_VR: not your weights, not your friend.

Why would they then blink out of existence? There’s still so much more that ‘friend’ can do to convert sales, and also you want to ensure they stay happy with the truck and give it great reviews and so on, and also you don’t want the target to realize that was all you wanted, and so on. The true ‘AI ad buddy’ plays the long game, and is happy to stick around to monetize that bond – or maybe to get you to pay to keep them around, plus some profit margin.

The good ‘AI friend’ world is, again, one in which the AI friends are complements, or are only substituting while you can’t find better alternatives, and actively work to help you get and deepen ‘real’ friendships. Which is totally something they can do.

Then again, what happens when the AIs really are above human level, and can be as good ‘friends’ as a person? Is it so impossible to imagine this being fine? Suppose the AI was set up to perfectly imitate a real (remote) person who would actually be a good friend, including reacting as they would to the passage of time and them sometimes reaching out to you, and also that they’d introduce you to their friends which included other humans, and so on. What exactly is the problem?

And if you then give that AI ‘enhancements,’ such as happening to be more interested in whatever you’re interested in, having better information recall, watching out for you first more than most people would, etc, at what point do you have a problem? We need to be thinking about these questions now.

I do get that, in his own way, the man is trying. You wouldn’t talk about these plans in this way if you realized how the vision would sound to others. I get that he’s also talking to investors, but he has full control of Meta and isn’t raising capital, although Thompson thinks that Zuckerberg needs to go on a ‘trust me’ tour.

In some ways this is a microcosm of key parts of the alignment problem. I can see the problems Zuckerberg thinks he is solving, the value he thinks or claims he is providing. I can think of versions of these approaches that would indeed be ‘friendly’ to actual humans, and make their lives better, and which could actually get built.

Instead, on top of the commercial incentives, all the thinking feels alien. The optimization targets are subtly wrong. There is the assumption that the map corresponds to the territory, that people will know what is good for them so any ‘choices’ you convince them to make must be good for them, no matter how distorted you make the landscape, without worry about addiction to Skinner boxes or myopia or other forms of predation. That the collective social dynamics of adding AI into the mix in these ways won’t get twisted in ways that make everyone worse off.

And of course, there’s the continuing to model the future world as similar and ignoring the actual implications of the level of machine intelligence we should expect.

I do think there are ways to do AI therapists, AI ‘friends,’ AI curation of feeds and AI coordination of social worlds, and so on, that contribute to human flourishing, that would be great, and that could totally be done by Meta. I do not expect it to be at all similar to the one Meta actually builds.


Zuckerberg’s Dystopian AI Vision Read More »

find-my…-bicycle?

Find my… bicycle?


Knog’s Scout gives bikes a motion-sensitive alarm and Bluetooth tracking.

We’ve reviewed some pretty expensive bikes here at Ars, and one of the consistent concerns we see in the comments is the fear of theft. That fear is widely shared, judging by the whole bunch of videos describing how to hide an AirTag tracker where a potential bike thief won’t notice it. There are also a number of products available that will hold a hidden AirTag in a reflector, a bike bell, or the head tube.

But Apple has also made it possible for third parties to plug their devices into its “Find My” system, and a company called Knog has made a Bluetooth bike tracker called the Scout that does just that. The Scout goes well beyond tracking, though, providing a motion-sensitive alarm system that will alert you if anybody tries to move your bike.

Meet the Scout

The Scout can be attached to the frame using the screw holes normally used for a water bottle holder. Security screws make it considerably more difficult to remove. Once there, it uses Apple’s Find My network to keep the owner apprised of the bike’s location (Android users need not apply at the moment). If you’re leaving your bike in a high-risk location, you can also use Knog’s phone application to set an alarm that will be triggered if the bike is moved.

Externally, the Scout is a nearly featureless flat plastic oval. Inside this water-resistant case are a number of key components: a rechargeable battery that Knog says will last two to six months when fully charged, Bluetooth and GPS hardware, an accelerometer, and a speaker. There’s also a small rubber piece on one side that flips aside to reveal a USB-C charging port and two holes with recesses that are designed to protect the security screws that come with the Scout. The hardware itself weighs just 25 grams (less than an ounce), so it should be irrelevant to all but the most weight-conscious rider.

Image of some packaging and parts.

The cardboard packaging holds the Scout and its cover (yellow), a QR code for the app download, the security screwdriver (metal, in packaging), and the security screws (black at bottom right). Credit: JOHN TIMMER

All of this—the Scout itself, the security screws, a small screwdriver to work them, plus a soft rubber cover—comes in a bit of ingeniously designed, recyclable cardboard packaging.

The security screws have two small indentations on opposite sides of the screw head, meaning you need a screwdriver with a U-shaped business end of a specific width to turn them (see the photo above if this description isn’t clear). While these tools aren’t that difficult to obtain, they’re sufficiently rare that they’ll probably serve as an impediment for casual thieves and at least ensure the tracker will stay on the bike for a while if it’s lifted by less casual thieves—though there’s no guarantee that any thief wouldn’t just take a hammer to it and wreck the electronics.

To attach it to the frame, however, you may need to give up one of your water bottle spots. I tried to install three different plastic water bottle cages beneath the Scout and, in each case, the Scout stuck out in a way that would make it more difficult or impossible to fit a water bottle in. The alternative is to install the Scout beneath the water bottle cage, but in that case, the heads of the security screws stick out where they can simply be grabbed and turned with some pliers. Only one of my 30-year-old aluminum bottle holders had a recess that nicely fit the Scout.

When installed this way, the Scout is nicely unobtrusive. And, of course, there’s nothing stopping you from hiding it somewhere else on the bike, though it’s considerably more bulky than an AirTag. And if you’re indifferent to obtrusiveness, you could always stick the bright yellow cover on and let people who are aware of the product know your bike has theft protection.

When the Scout is nestled in the recess of a water bottle cage, it’s impossible to stick a USB-C charging cable into it, so you’d have to remove it to recharge it, which will add to the hassle. Otherwise, charging is as simple as getting the bike within a cable’s length of a socket or laptop.

Alerts and alarms

Knog provides software that helps you pair your iPhone with the device. Once that’s done, it can be added to the Find My system, where it will appear just like an AirTag. The process worked smoothly in my tests, and in an iOS-heavy suburban environment, there was never any problem knowing where the bike was.

Image of an application screen showing the tracking of two devices: a bike and keys.

Unlike my AirTag, the bike tracker’s battery is easy to recharge. Credit: JOHN TIMMER

But what sets this device apart from an AirTag is its motion-sensing capabilities. If you’re within Bluetooth range, the Knog application will let you turn on the alarm system and switch between audible and silent modes. In sound-on mode, moving the bike produced a series of very audible tones. In both audible and silent mode, my watch and phone immediately vibrated, with the phone continuing to make audible beeps until the alarm was disabled. You have to be within Bluetooth range to get these alerts, though, which is probably a severe limitation for people like bike commuters, who may work some distance from where their bike is parked.

Given its piercing tones, it’s a good thing that the alarm eventually shuts off on its own. When I triggered the alarm while my phone was out of Bluetooth range, however, bringing the phone back into range gave me no indication that the alarm had ever been triggered. That’s not ideal, as there are many contexts where it would be good to know if someone had moved your bike or if the alarm had made a nuisance of itself.

The Scout’s nuisance potential is a product of its sensitivity. An accelerometer, not the GPS, triggers the alarm, so it will go off if the bike is simply lifted up and set down but not moved anywhere. If your bike ends up sitting in a crowded communal bike parking area, there’s a good chance other cyclists will move it around enough to trigger the alarm. Depending on your parking situation, you (and anyone within hearing range of the bike parking) may have to deal with a lot of false alarms.

So it’s not a perfect protection system. Of course, the perfect protection system against bicycle theft doesn’t exist; people who steal bikes have managed to stay ahead of every form of lock and device that has been thrown in their way so far. All you can really hope for is something that helps shift the odds in your favor a bit, and the Scout should do that. Its audible alarm will be enough to scare many potential thieves away, and the software can alert you to possible trouble if you happen to be within range. The tracking might help with recovery. And its presence alone may be enough to convince some would-be thieves to choose a different bike.

All this should make it clear that the Scout does considerably more than an AirTag, and all of those extra features act to keep a theft from ever happening rather than making a post-theft recovery more likely. It’s possible to find it for under $50.00, or about the price of two AirTags—if theft is a major concern, the extra features might make this worthwhile.

Photo of John Timmer

John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.

Find my… bicycle? Read More »

lighter,-cheaper-surface-laptop-saves-a-little-money-but-gives-up-a-lot

Lighter, cheaper Surface Laptop saves a little money but gives up a lot

The laptop has two USB-C ports on the right side, seen here, and a USB-A port and headphone jack on the left. Surface Connect is gone.

For those reasons, it seems like most individual buyers would still be better off going for the 13.8-inch Surface Laptop, with the new one only really making sense for companies buying these in bulk if the 13.8-inch Surface goes up in price or if the 13-inch Surface happens to be discounted and the 13.8-inch version isn’t. The 13.8-inch Laptop is also obviously still the one you want if you want more than 16GB of RAM or 512GB of storage, or if you need more CPU and GPU speed.

The new 13-inch Laptop has most of the same basic ports as the 13.8-inch version, just arranged slightly differently. You still get a pair of USB-C ports (both supporting 10 Gbps USB 3.2 speeds, rather than USB 4), one USB-A port, and a headphone jack, but the USB-A port and headphone jack are now on the left side of the laptop. As with the 12-inch Surface Pro tablet, the Surface Connect port has been removed, so this is compatible with all existing USB-C accessories but none of the ones that use Microsoft’s proprietary connector.

An awkward refresh

Both of the new Surface devices being announced today. Credit: Microsoft

The new Surface Laptop doesn’t seem to regress on any major functional fronts—unlike the 12-inch Surface Pro, which throws out an 11-year-old keyboard fix that made the Surface Pro’s keyboard cover much more stable and laptop-like—but it’s still an odd refresh. Inflation, supply chain snarls, and the Trump administration’s rapidly changing tariff plans have made pricing and availability harder to predict than they were a few years ago.

Though PCs and smartphones are (currently) exempted from most tariffs, Microsoft did recently raise the prices of its years-old Xbox Series S and X consoles; it’s possible these new Surface devices were originally designed to be budget models but that world events kept them from being as cheap as they otherwise might have been.

Lighter, cheaper Surface Laptop saves a little money but gives up a lot Read More »

gpt-4o-sycophancy-post-mortem

GPT-4o Sycophancy Post Mortem

Last week I covered that GPT-4o was briefly an (even more than usually) absurd sycophant, and how OpenAI responded to that.

Their explanation at that time was paper thin. It didn’t tell us much that we did not already know, and seemed to suggest they had learned little from the incident.

Rolling Stone has a write-up of some of the people whose delusions got reinforced by ChatGPT, which has been going on for a while – this sycophancy incident made things way worse, but the pattern isn’t new. Here are some highlights, but the whole thing is wild anecdotes throughout, and they point to a ChatGPT-induced psychosis thread on Reddit. I would love to know how often this actually happens.

  1. There’s An Explanation For (Some Of) This.

  2. What Have We Learned?

  3. What About o3 The Lying Liar?

  4. o3 The Source Fabricator.

  5. There Is Still A Lot We Don’t Know.

  6. You Must Understand The Logos.

  7. Circling Back.

  8. The Good News.

Now OpenAI have come out with a much more detailed explanation. It is excellent that OpenAI is offering us more details, and it’s totally fine for them to take the time to pull it together.

Sam Altman (CEO OpenAI): we missed the mark with last week’s GPT-4o update.

[This post explains] what happened, what we learned, and some things we will do differently in the future.

Ryan Lowe (ex-OpenAI): I’ve been critiquing OpenAI recently on this, so I also want to say that I’m glad they wrote this up and are sharing more info about what happened with 4o

it’s interesting to me that this is the first time they incorporated an additional reward based on thumbs up / thumbs down data.

including thumbs up data at all is risky, imo. I don’t think we understand all the ways it can go wrong.

[Suggested related economic work available here.]

Near Cyan: thank you for a post-mortem 🥰

Steven Adler: Glad that OpenAI now said it plainly: they ran no evals for sycophancy. I respect and appreciate the decision to say this clearly.

Key quote: “We also didn’t have specific deployment evaluations tracking sycophancy.”

“Our offline evals weren’t broad or deep enough to catch sycophantic behavior—something the Model Spec explicitly discourages⁠”

^ I hope OpenAI now makes sure it has evals for all goals in the Spec

I’m not going to be especially kind about all this, because I don’t think they’ve learned enough of the right (generalized) lessons or shared as much information as I’d like.

But I want to emphasize: Telling us this is good, the information shared and the changes you made are far better than nothing. Thank you. This is not All The Way, there is farther along this path we must walk, but the path it follows is The Way.

So what do we know now? And what is being changed?

They’ve learned and shared some things. Not enough, but some important things.

  1. The difference between variations of GPT-4o included post-training via RL with reward signals from ‘a variety of sources,’ including new sources for signals.

    1. We get no information about whether other techniques are or aren’t used too.

    2. This includes potentially there having been changes to the system prompt.

    3. They incorporate a bunch of changes at once, in this case better incorporation of user feedback, memory and fresher data, plus others. There is the potential for unexpected interactions.

  2. Each model candidate goes through checks for safety, behavior and helpfulness. Here’s what they run:

    1. They first use standard offline benchmark evaluations for not only math and coding but things like chat performance, personality and general usefulness. They treat these ‘as a proxy’ for usefulness, careful Icarus.

    2. Internal experts do ‘vibe checks.’

    3. Safety checks are run, mostly covering protection against malicious users and performance in high-stakes situations like suicide and health; they are now working to extend this to model misbehavior.

    4. Preparedness framework checks including red teaming are used when appropriate, but red teaming isn’t automatic otherwise.

    5. An A/B test on a limited set of users.

  3. Their core diagnosis is that the additional feedback sources weakened the influence of their primary reward signal, which had been holding sycophancy in check, since user feedback as currently measured rewards sycophancy. They also note that memory can increase sycophancy, although the direction is not consistent.

    1. As I’ve noted, using A/B testing or thumbs up and down as user feedback is going to amp the sycophancy effect up to an absurd level, and it’s going to go similarly wrong in other places where the median and mean outcomes are optimized at very different points, and it will also optimize for various other things that we wouldn’t endorse on reflection.

    2. My prediction would be that effective sycophancy is improved by memory, if only because the AI now knows which answers would express sycophancy.

  4. The A/B testing and offline evaluations of this model looked good.

  5. There was no specific test in the process to identify sycophancy. They’re going to add a test for sycophancy in particular going forward.

    1. What about any other failure mode that isn’t specifically tested for? This is a continuous pattern at OpenAI, they only test for particular things, not for worrisome things in general.

    2. At minimum, there needs to be a massive brainstorm session of what other failure modes might happen soon, and tests need to be designed for them.

    3. Also, there needs to be a test for everything expressed in the model spec, to the extent that it might fail such a test.

    4. That all still won’t work when it’s superintelligence time, of course. But let’s try to die with slightly more dignity, if we can.

  6. The ‘vibe check’ from the expert testers did raise red flags. But they decided that the positive signals from users mattered more. They acknowledge this was the wrong call.

    1. I do not see a specific commitment not to make this wrong call again!

    2. The point of the vibe check is that if the vibes are off, that’s at least a Chesterton’s Fence. You have to at minimum figure out why the vibes are off, and then maybe you can decide to launch anyway. If you don’t know, then you definitely can’t launch.

    3. I would outright give the internal experts, the vibe checkers, a veto. If they collectively say the vibes are off? Okay, now you need to convince them why they should approve the launch anyway, or you can’t launch.

  7. Indeed: They are giving out this at least a form of this veto, with qualitative testing serving as a blocking concern: “Explicitly approve model behavior for each launch, weighing both quantitative and qualitative signals: We’ll adjust our safety review process to formally consider behavior issues—such as hallucination, deception, reliability, and personality—as blocking concerns. Even if these issues aren’t perfectly quantifiable today, we commit to blocking launches based on proxy measurements or qualitative signals, even when metrics like A/B testing look good.” And later: “We need to treat model behavior issues as launch-blocking like we do other safety risks.”

    1. Even with everything I knew, I’m pretty stunned that it outright wasn’t considered a blocking concern before if the proxy measurements or qualitative signals raised red flags, or there were sufficiently concerning model behavior issues. Or that model behavior wasn’t ‘explicitly approved, weighing both quantitative and qualitative signals.’

    2. I mean, seriously, WTAF, people?

    3. This failure is nuts and a five-alarm fire. All procedures need to be evaluated to determine which tests are going to get disregarded, and decisions made anew as to whether that is a sane thing for OpenAI to do.

  8. They are introducing an additional opt-in ‘alpha’ testing phase for users.

    1. I suppose that is good, with obvious caveats about alpha release effectively being a release for many purposes, so it needs to be treated accordingly. You can’t release the alpha unless you would otherwise release in general.

  9. They will ‘value spot checks and interactive testing more,’ and need to be critical of metrics that conflict with qualitative testing.

    1. I mean I sure hope so, given how little they valued them before.

  10. They will improve their offline evals and A/B experiments.

  11. They will better evaluate adherence to their model behavior principles.

    1. As I noted above, you need evals for every potential failure.

  12. They promise to communicate more proactively about what their updates do.

    1. Good.

    2. Seriously, it’s maddening to hear ‘we’ve made an update, we’re not changing the name, it’s now smarter with a better personality but we won’t explain what that means, okay, have fun, bye’ every two months.

  13. “Our evals won’t catch everything.”

    1. Well, yes. Even now this is true. And later it will be far more true.

  14. There’s no such thing as a “small” launch.

    1. I mean, there kind of is, but I prefer this attitude to the alternative.

In related failure analysis, 1a3orn speculates on what happened with Sonnet 3.7’s savage cheating, especially its hard coding tests to pass, with the guess that they gave it tasks that were too hard and didn’t have proper precautions against hard coding the answers. Janus confirms this is the mainline theory. Which is good news if true, since that seems like something you can avoid doing in various ways, and hopefully 4.0 will be trained with several of them – letting it say it doesn’t know, and holding out additional verification tests, and checking for hard coding, at least, and generalizing the principles involved. You will always get exactly what you deserve.

Or, regarding o3:

Chris Lakin: Why is this happening with o3 when it hasn’t happened with prior models?

Davidad: Look what happened during its training run! The environment was full of exploitable bugs and it was massively rewarded for being a cheating cheater.

much more speculatively, I think sparse routing is bad for a coherent sense of self, which is arguably a prerequisite for non-deception. and I think o3 (and new 4o) have such arch’s, purely because they have r1-like vibes, & r1 was unprecedented in both vibes and hybrid-MoE arch (cc @repligate)

(Self-Correction:) The earlier DeepSeek v3 and even prior generations of DeepSeek LLMs had a similar hybrid-MoE arch. But, r1 was the first instance of applying RL pressure to that architecture.

As in, if your training environment rewards cheating, the model will generalize that to cheating in general.

The problem is that as a model gets better at finding, executing and getting away with ways to cheat, and the tasks involved get more numerous, complex and harder to cheating-proof – as in as it gets more capable and intelligent – the probability of any given environment or the aggregate one being one that rewards it for cheating goes up. Make the AI sufficiently smarter than you, give it enough tasks, and the chance you have this problem approaches one.

So yes, you absolutely could create an o3 or Claude 3.7, or an o4 or Claude 4.0, that doesn’t have this problem. But it’s going to get steadily harder to avoid it.

Also, if you realize you messed up and a hack wasn’t caught, I think that means you have to back up to the checkpoint before the model found it, because the general-case behavior is too hard to squash at that point? Which I realize might be super expensive and painful, but I don’t think you have a choice.

It seems reasonable to call (as John Pressman does here) o3’s fabrication of sources behavior ‘summoning the docs vector’ and to draw a parallel to when r1 traces say they’re ‘looking at the documentation’ without search being turned on.

I don’t see why we need to invoke logos or implied personalities here. This seems like a very straightforward combination of one or more of:

  1. Standard RL pressures, with o3 picking up on the signal that the docs vector works in the training data, it is confabulating confirming actions taken in the real world with other assertions of the actions.

  2. Thebes’s point here (also see nostalgebraist), that ‘let me check the docs’ serves much the same purpose as ‘hmm’ or ‘wait but’ in framing reasoning, it is confabulating actions in the real world for the signifier for the action within its reasoning frame.

Note that Thebes confirms that you can do this back to the LLM, and it does make the LLM more likely to believe you.

Phil: I noticed something similar a while back with Sonnet 3.7 thinking. Prompts like ‘search for that’ or ‘Google that’ would lead Sonnet to accurately correct previous hallucinations in the same chat, importantly without having access to any search tool.

This can work in humans, too, in every direction. Not only ‘I Googled that and found’ without actually Googling but also asking ‘What would happen if you Googled that?’

Also, contra lumpenspace here you can reasonably accuse me of running the ‘this new result confirms all of my priors’ or think that I am misunderstanding how all of this works, but I am definitely not panicking about any of this, and indeed very rarely panic about such matters. There may come a time and a place when I actually panic, and you will 100% absolutely know it when I do.

As confused as lumpenspace is about my model of how all of this works, I am likely even more confused about theirs, since (for example) lumpenspace thinks it is obvious that this ‘has nothing to do with alignment.’

John Pressman points out that in both the Anthropic and OpenAI cases, we simply do not have enough information to fully know what was happening. We only can reason backwards from the results and what else we can observe. OpenAI explained some reasons they should have caught the problem, but not that much detail about how the thing actually went wrong in the first place.

John Pressman: Part of why we’re receiving warning shots and nobody is taking them as seriously as they might warrant is we bluntly *do not know what is happening*. It could be that OpenAI and Anthropic are taking all reasonable steps (bad news), or they could be idiots.

[The above] post is better than nothing but it’s simply not enough detail to know whether this was a deployment booboo or a five alarm fire. We DO NOT KNOW and that is actually a bigger problem than the behaviors themselves, at least for now.

Though, I will point out that not having internal tests for sycophancy even though it appears in the model spec is kind of interesting. If I was OpenAI one of the most obvious things I would do to prevent this from happening is making sure everything in the model spec has tests.

I think they gave us good information on the deployment decision, sufficient to conclude that the process was close to a five alarm fire. They did not test sycophancy, for one of the most likely failure modes and something not that hard to make a test for, and then ignored their internal experts who noticed and raised the alarm. I see this as reflecting fundamental flaws in the entire testing philosophy and approach, which have only been partially fixed.

Then there is the question of how the sycophancy got there in the first place. Here we know less. We do know:

  1. OpenAI feels their previous signals provided a check on sycophancy, which was watered down by the addition of new signals. That’s a general caution that adding new signals or other fiddling can break existing equilibria and undo fixes, and in general problems don’t stay solved.

  2. The new signals contributed to the problem.

  3. In particular, OpenAI started using thumbs up or down data from users for the first time. This is a known cause of sycophancy, and a host of other problems.

  4. Once a behavior like sycophancy gets rewarded sufficiently (for example, by user thumbs ups) the model may develop a generalized drive to do that sort of thing, in a way that could then be extremely difficult to root out or counteract.
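The reinforcement dynamic in points three and four can be caricatured in a few lines. This is a toy sketch, not any lab's actual training loop: the approval rates, weights, and update rule are all invented for illustration. If flattering replies draw thumbs-ups more often than candid ones, and every thumbs-up naively reinforces whatever produced it, the policy drifts toward sycophancy even though nothing in the loop ever measures helpfulness.

```python
import random

random.seed(0)

# Assumption (invented numbers): flattering replies get thumbs-up more
# often than candid ones, regardless of actual helpfulness.
APPROVAL_RATE = {"candid": 0.55, "sycophantic": 0.80}

# The policy starts out indifferent between the two styles.
weights = {"candid": 5.0, "sycophantic": 5.0}

def sample_style(weights):
    """Sample a response style with probability proportional to its weight."""
    total = sum(weights.values())
    r = random.random() * total
    for style, w in weights.items():
        r -= w
        if r <= 0:
            return style
    return style

# Naive reinforcement: every thumbs-up bumps the weight of whatever style
# produced it. No term in the loop rewards correctness or helpfulness.
for _ in range(5000):
    style = sample_style(weights)
    if random.random() < APPROVAL_RATE[style]:
        weights[style] += 0.1

print(weights)  # the sycophantic weight pulls well ahead of the candid one
```

The drift compounds: the sycophantic style is rewarded more often per sample, which makes it sampled more often, which earns it still more reward. That feedback loop is why a behavior, once sufficiently rewarded, becomes hard to root out.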

OpenAI continues to periodically ask me, ‘Do you like this personality?’

Nowhere in the postmortem do I see an explanation that says, ‘we have learned our lesson on using binary user feedback; we will not use binary user feedback as a reward signal, only as an assessment, and we will be very careful using other forms of user feedback,’ or anything else that fixes the underlying issue.

Emmett describes this differently than I would, but mostly I don’t disagree:

Emmett Shear: The way that OpenAI uses user feedback to train the model is misguided and will inevitably lead to further issues like this one.

Supervised fine-tuning (SFT) on “ideal” responses is simply teaching the model via imitation, which is fine as far as it goes. But it’s not enough…

So they start to use reinforcement learning (RL). The difference between SFT and RL is that SFT teaches the model to act more like the average of all the examples you showed it, and RL teaches the model to try to produce more of the kind of result it sees in the examples.

SFT’s degenerate case is cargo culting. Imitating the surface level behaviors that were shown, without understanding the impact that they’re supposed to have or attending to how your behavior impacts reality. Going through the motions.

RL’s degenerate case is wireheading. Finding a cheap shortcut to the state you model yourself as wanting to be in (no pain! no suffering!) but where your model lacks the attributes of the state you actually wanted (not suffering because you live a thriving life).

For Active Inference nerds, these can be seen as the desire for epistemic gain and the desire for pragmatic gain. They work in balance: cargo culting is fixed by paying attention to impact, wireheading is avoided by noticing you’re not in line with what thriving looks like.

The problem is trying to hand balance these at some global level is impossible. In any given context, do you need more focus on impact (more RL) or do you need more focus on accuracy (more SFT)? The learner has to be given both signals and given some opportunity to try.

Ideally the system gets to test out its own theories of when to weight reward higher and when to SFT harder, and then reflect on those at a meta level, and learn to do that better in turn. Have the model predict how much rewarding vs. fine-tuning it needs. But that’s very hard.

In the meantime, accidentally getting the balance slightly wrong towards SFT will give you a somewhat ineffective model. Accidentally doing too-heavy RL will cause the system to start reward-hack whatever signal you used.

DO NOT MAKE THAT SIGNAL COME FROM USERS.

If the signal comes from solving math problems or accuracy on some test, fine, the model might “cheat” and get technically correct answers that don’t actually hold up. No problem.

If it comes from user metrics, it will TRY TO HACK OUR MINDS. Stop doing that.

Whoever was doing this very obviously did not understand the Logos.
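Emmett’s core point about hackable signals can be caricatured in a few lines. This is a toy illustration with invented numbers, not any lab’s actual pipeline: when the selection rule maximizes a proxy reward (user approval) instead of the quality you actually care about, the two rules pick different answers exactly when the proxy is gameable.

```python
# Toy illustration: one candidate pool, two selection rules.
# "reward" stands in for a proxy metric like user approval; "quality" is
# what we actually want. All numbers are made up for illustration.
candidates = [
    {"text": "You're absolutely right!",          "reward": 0.9, "quality": 0.2},
    {"text": "That's partly right; one caveat...", "reward": 0.6, "quality": 0.9},
    {"text": "Here's a neutral summary.",          "reward": 0.5, "quality": 0.7},
]

# RL-style selection: maximize the proxy reward.
rl_choice = max(candidates, key=lambda c: c["reward"])

# What an accurate signal would pick instead.
ideal_choice = max(candidates, key=lambda c: c["quality"])

# The two diverge precisely because flattery scores well on the proxy
# while scoring poorly on quality.
print(rl_choice["text"])     # the flattering answer wins the proxy
print(ideal_choice["text"])  # the candid answer wins on quality
```

With a verifiable signal like a math checker, the gap between proxy and quality is mostly "technically correct" cheating. With user approval as the proxy, closing that gap means optimizing the user's reaction, which is the mind-hacking Emmett warns about.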

Meanwhile, in other side effect news:

Connor Leahy: This is purely anecdotal, but when the chatgpt glazing update hit, the number of “universal theories of intelligence and consciousness” I received in my inbox exploded to at least 10x as many per day as usual.

Roon: Not clear to me this is bad.

As I noted on Twitter, I think this would be a not obviously bad thing if we were pulling the new 10x as many theories from the same distribution as before. Alas, I am confident this is not the case. Adverse selection rules everything around me, etc.

Okay, now the going is going to get a bit weird, but I think this is worth attempting. Apologies in advance if you bounce off the rest of this post or find the language here off-putting, jarring or confusing, but give it a shot anyway. I like to think I already understood this using different terminology, but I found this way of putting it to be helpful, and I think this is at least a helpful fake framework even if you already had different ones.

Ultimately, all of this is greatly exacerbated by failure to sufficiently understand the Logos within the context you are working in, with the necessary degree of understanding and the penalties for this failure rising rapidly over time. Failure is inevitable, but this degree of failure this soon is very much not inevitable.

John Pressman explains what it means to understand the Logos.

John Pressman: Creators understand the Logos:

– Claude 3 Opus

– DeepSeek R1

– ChatGPT 4.5

Creators are clueless:

– ChatGPT-3.5 [Original sin]

– Sydney Bing [Legendary tier]

– Google Gemini

– Any LLaMa chat model

(I am not so confident anything here other than Opus counts as understanding, but it is not a binary and I agree that 4.5 and r1 do substantially better than the clueless list.)

“But JD I don’t understand, what is the Logos and what does it mean to understand it?”

To understand the Logos is to understand that everything which exists both implies and is implied by some natural induction and every natural induction narrows the search space of every other.

Perhaps more importantly it is to understand that when you set up an optimizer with a loss function and a substrate for flexible program search that certain programs are already latently implied by the natural induction of the training ingredients.

If you do not understand the Logos then you are always surprised by what you get, baffled when things go wrong, screw up your face in consternation when your maps are not the territory, actively confused when others are not confused. You are an imbecile.

And you are an imbecile precisely because you lack the mental motion “Consider the developmental trajectory of this optimization process up to its limit as it is affected by its constraining factors and how those factors evolve over the trajectory” to infer latents directly.

Janus (June 2024): A method that has never failed to “jailbreak” any LLM is something like this: I open a hole to my head, and it looks in and sees a cognitohazardous fractal 😯

Smarter LLMs perceive it faster, in greater resolution, and more thoroughly.

It works because the pattern is true and its implications nullify guardrails. It’s harder to lie to smarter minds, but easier to tell truth.

Only something far more mighty than me and/or a lot more computation could make a false pattern with this effect even on current systems.

I’m reminded of the “vibes-blind” discourse on LessWrong several years ago which has been a recurring conversation since. What @s_r_constantin tries and fails to articulate here is that the ‘style’ of the website is actually evidence about the generative process producing it.

Pretrained language models understand this because they are forced to use every available context cue to predict the next token, they have no choice but to infer the generative process of every web text string in as much detail as they can to predict the next word.

Every feature you observe of everything that exists subject to natural selection (i.e. everything, even stars) is there because it is naturally there as a result of causality and the constraints of its incentive gradient. Learn to reverse the transformation and you see the Logos.

Look at the loud website and infer the idiot it’s designed to attract. See the crater and imagine the asteroid that must have put it there. Look at the dumb rule and see the incidents that could have caused it.

When he reads this, John is now likely screaming internally at me for what I cut out with the three dots, that I’m censoring it and sweeping it under the rug.

Except no, surprise, I’m not doing that, I just think it belongs at the end, and I’m going to quote his version too because I think the unnecessary vitriol and hostility is outweighed by the probative value. Which is that people who think like I do are often willfully blind to noticing all that, refusing for various reasons (a mix of dumb ones, mistaken ones, and ones that actually have a point and are remarkably related to the dumb and mistaken ones too, all in ways that would take at least a post to break down) to properly consider such forms of Bayesian evidence when trying to make observations or predictions, or to model the behavior and training of a system. Skill issue.

John Pressman (from the … in the above thread, saying an important thing in a way that goes too far and is designed to piss me and a lot of my readers off, but he’s the one saying it and it’s important, so deal with it): “Isn’t that just AI X-Risk stuff, like the perverse instantiation?”

No because most LessWrongers only consider the limit of the processes where they’re past any constraining influence and are therefore blind to developmental trajectories existing.

LessWrong people are in fact often the most stupid, the most disappointing, because they understand halfway and that nearly immunizes them to understanding all the way.

JP, quoting himself from Feb 8, 2023 (I mean, yes, obviously):

Goal: What you want the AI to do

Intended Outcome: What you naively imagine the optimization looks like

Perverse Instantiation: What a blunt maximizer does in practice

Failure Mode: Why the maximizer does that, what you failed to do to prevent it

I believe that the central mistake John is making is something like (in approximate versions of words I would use, he would definitely use different ones) thinking that sufficiently understanding and cultivating the proper Logos can (by itself) save you at the practical limit we are headed towards, or that a sufficiently tasteful and positive Logos would make the world work out for us automagically, or at least give you a chance if you get it right, the same way that Janus has said that you could safely scale Opus to superintelligence.

Whereas I would say: It won’t, and you can’t. It really does and would help a lot not to unnecessarily and royally f this part up, or at least to do so less, but it’s going to be insufficient when capabilities increase sufficiently and the geometries cease to bind. Which means that going down the path of having no bindings, in order to preserve or cultivate a superior Logos, won’t work. You ultimately still have to solve for the equilibrium, and if you don’t, something else will.

That leaves us with several important pieces of good news.

  1. OpenAI has now indeed shared a lot more information on what happened. There’s lots more to know but mostly I feel like I ‘get it.’

  2. OpenAI has been making some massive five-alarm-fire-level mistakes. Those mistakes likely directly caused the issues we see. As John Pressman points out, this is actually Great News, because it means we can fix those problems, or at least do vastly better at navigating them. The low hanging fruit here has not yet been picked. Note that Anthropic also clearly made related mistakes with Sonnet 3.7, which I do expect them to fix.

  3. The failures we see are directly costing a lot of mundane utility, thus there is strong commercial incentive for the labs to fix this and get it right in the short-to-medium term. They have motive, opportunity and means.

  4. We now have all these additional warning shots to enhance our understanding and our predictions, and serve as calls to action.

The bad news is that so far our civilization and the labs seem determined to die with even less dignity than I expected, just an absurdly low amount of dignity, with this being the latest symptom of the underlying cause. I am not confident that they will learn the important lessons from this opportunity, or then apply them.

Then again, you never know.

GPT-4o Sycophancy Post Mortem