AI research

At $250 million, top AI salaries dwarf those of the Manhattan Project and the Space Race

agi, AI, AI chips, AI development, AI GPU, AI infrastructure, AI research, artificial general intelligence, Bell Labs, Biz & IT, compensation, Fairchild Semiconductor, Google, machine learning, Manhattan Project, mark zuckerberg, Meta, NASA, openai, Silicon Valley, superintelligence, talent acquisition / Kelly Newman / August 2, 2025

A 24 year-old AI researcher will earn 327x what Oppenheimer made while developing the atomic bomb.

Silicon Valley’s AI talent war just reached a compensation milestone that makes even the most legendary scientific achievements of the past look financially modest. When Meta recently offered AI researcher Matt Deitke $250 million over four years (an average of $62.5 million per year)—with potentially $100 million in the first year alone—it shattered every historical precedent for scientific and technical compensation we can find on record. That includes salaries during the development of major scientific milestones of the 20th century.

The New York Times reported that Deitke had cofounded a startup called Vercept and previously led the development of Molmo, a multimodal AI system, at the Allen Institute for Artificial Intelligence. His expertise in systems that juggle images, sounds, and text—exactly the kind of technology Meta wants to build—made him a prime target for recruitment. But he’s not alone: Meta CEO Mark Zuckerberg reportedly also offered an unnamed AI engineer $1 billion in compensation to be paid out over several years. What’s going on?

These astronomical sums reflect what tech companies believe is at stake: a race to create artificial general intelligence (AGI) or superintelligence—machines capable of performing intellectual tasks at or beyond the human level. Meta, Google, OpenAI, and others are betting that whoever achieves this breakthrough first could dominate markets worth trillions. Whether this vision is realistic or merely Silicon Valley hype, it’s driving compensation to unprecedented levels.

To put these salaries in a historical perspective: J. Robert Oppenheimer, who led the Manhattan Project that ended World War II, earned approximately $10,000 per year in 1943. Adjusted for inflation using the US Government’s CPI Inflation Calculator, that’s about $190,865 in today’s dollars—roughly what a senior software engineer makes today. The 24-year-old Deitke, who recently dropped out of a PhD program, will earn approximately 327 times what Oppenheimer made while developing the atomic bomb.

Many top athletes can’t compete with these numbers. The New York Times noted that Steph Curry’s most recent four-year contract with the Golden State Warriors was $35 million less than Deitke’s Meta deal (although soccer superstar Cristiano Ronaldo will make $275 million this year as the highest-paid professional athlete in the world). The comparison prompted observers to call this an “NBA-style” talent market—except the AI researchers are making more than NBA stars.

Racing toward “superintelligence”

Mark Zuckerberg recently told investors that Meta plans to continue throwing money at AI talent “because we have conviction that superintelligence is going to improve every aspect of what we do.” In a recent open letter, he described superintelligent AI as technology that would “begin an exciting new era of individual empowerment,” despite declining to define what superintelligence actually is.

This vision explains why companies treat AI researchers like irreplaceable assets rather than well-compensated professionals. If these companies are correct, the first to achieve artificial general intelligence or superintelligence won’t just have a better product—they’ll have technology that could invent endless new products or automate away millions of knowledge-worker jobs and transform the global economy. The company that controls that kind of technology could become the richest company in history by far.

So perhaps it’s not surprising that even the highest salaries of employees from the early tech era pale in comparison to today’s AI researcher salaries. Thomas Watson Sr., IBM’s legendary CEO, received $517,221 in 1941—the third-highest salary in America at the time (about $11.8 million in 2025 dollars). The modern AI researcher’s package represents more than five times Watson’s peak compensation, despite Watson building one of the 20th century’s most dominant technology companies.

The contrast becomes even more stark when considering the collaborative nature of past scientific achievements. During Bell Labs’ golden age of innovation—when researchers developed the transistor, information theory, and other foundational technologies—the lab’s director made about 12 times what the lowest-paid worker earned. Meanwhile, Claude Shannon, who created information theory at Bell Labs in 1948, worked on a standard professional salary while creating the mathematical foundation for all modern communication.

The “Traitorous Eight” who left William Shockley to found Fairchild Semiconductor—the company that essentially birthed Silicon Valley—split ownership of just 800 shares out of 1,325 total when they started. Their seed funding of $1.38 million (about $16.1 million today) for the entire company is a fraction of what a single AI researcher now commands.

Even Space Race salaries were far cheaper

The Apollo program offers another striking comparison. Neil Armstrong, the first human to walk on the moon, earned about $27,000 annually—roughly $244,639 in today’s money. His crewmates Buzz Aldrin and Michael Collins made even less, earning the equivalent of $168,737 and $155,373, respectively, in today’s dollars. Current NASA astronauts earn between $104,898 and $161,141 per year. Meta’s AI researcher will make more in three days than Armstrong made in a year for taking “one giant leap for mankind.”

The engineers who designed the rockets and mission control systems for the Apollo program also earned modest salaries by modern standards. A 1970 NASA technical report provides a window into these earnings by analyzing salary data for the entire engineering profession. The report, which used data from the Engineering Manpower Commission, noted that these industry-wide salary curves corresponded directly to the government’s General Schedule (GS) pay scale on which NASA’s own employees were paid.

According to a chart in the 1970 report, a newly graduated engineer in 1966 started with an annual salary of between $8,500 and $10,000 (about $84,622 to $99,555 today). A typical engineer with a decade of experience earned around $17,000 annually ($169,244 today). Even the most elite, top-performing engineers with 20 years of experience peaked at a salary of around $278,000 per year in today’s dollars—a sum that a top AI researcher like Deitke can now earn in just a few days.

Why the AI talent market is different

An image of a faceless human silhouette (chest up) with exposed microchip contacts and circuitry erupting from its open head. This visual metaphor explores transhumanism, AI integration, or the erosion of organic thought in the digital age. The stark contrast between the biological silhouette and mechanical components highlights themes of technological dependence or posthuman evolution. Ideal for articles on neural implants, futurism, or the ethics of human augmentation.

This isn’t the first time technical talent has commanded premium prices. In 2012, after three University of Toronto academics published AI research, they auctioned themselves to Google for $44 million (about $62.6 million in today’s dollars). By 2014, a Microsoft executive was comparing AI researcher salaries to NFL quarterback contracts. But today’s numbers dwarf even those precedents.

Several factors explain this unprecedented compensation explosion. We’re in a new realm of industrial wealth concentration unseen since the Gilded Age of the late 19th century. Unlike previous scientific endeavors, today’s AI race features multiple companies with trillion-dollar valuations competing for an extremely limited talent pool. Only a small number of researchers have the specific expertise needed to work on the most capable AI systems, particularly in areas like multimodal AI, which Deitke specializes in. And AI hype is currently off the charts as “the next big thing” in technology.

The economics also differ fundamentally from past projects. The Manhattan Project cost $1.9 billion total (about $34.4 billion adjusted for inflation), while Meta alone plans to spend tens of billions annually on AI infrastructure. For a company approaching a $2 trillion market cap, the potential payoff from achieving AGI first dwarfs Deitke’s compensation package.

One executive put it bluntly to The New York Times: “If I’m Zuck and I’m spending $80 billion in one year on capital expenditures alone, is it worth kicking in another $5 billion or more to acquire a truly world-class team to bring the company to the next level? The answer is obviously yes.”

Young researchers maintain private chat groups on Slack and Discord to share offer details and negotiation strategies. Some hire unofficial agents. Companies not only offer massive cash and stock packages but also computing resources—the NYT reported that some potential hires were told they would be allotted 30,000 GPUs, the specialized chips that power AI development.

Also, tech companies believe they’re engaged in an arms race where the winner could reshape civilization. Unlike the Manhattan Project or Apollo program, which had specific, limited goals, the race for artificial general intelligence ostensibly has no ceiling. A machine that can match human intelligence could theoretically improve itself, creating what researchers call an “intelligence explosion” that could potentially offer cascading discoveries—if it actually comes to pass.

Whether these companies are building humanity’s ultimate labor replacement technology or merely chasing hype remains an open question, but we’ve certainly traveled a long way from the $8 per diem that Neil Armstrong received for his moon mission—about $70.51 in today’s dollars—before deductions for the “accommodations” NASA provided on the spacecraft. After Deitke accepted Meta’s offer, Vercept co-founder Kiana Ehsani joked on social media, “We look forward to joining Matt on his private island next year.”

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

At $250 million, top AI salaries dwarf those of the Manhattan Project and the Space Race Read More »

White House unveils sweeping plan to “win” global AI race through deregulation

AI, AI and work, AI development tools, AI ethics, AI infrastructure, AI law, AI regulation, AI research, AI security, biosecurity, Biz & IT, china, Chips Act, data centers, David Sacks, Deepfakes, Department of Commerce, Donald Trump, Energy, Export controls, machine learning, Michael Kratsios, national security, NIST, Open Source, Policy, semiconductors, White House / 9u50fv / July 24, 2025

Trump’s plan was not welcomed by everyone. J.B. Branch, Big Tech accountability advocate for Public Citizen, in a statement provided to Ars, criticized Trump as giving “sweetheart deals” to tech companies that would cause “electricity bills to rise to subsidize discounted power for massive AI data centers.”

Infrastructure demands and energy requirements

Trump’s new AI plan tackles infrastructure head-on, stating that “AI is the first digital service in modern life that challenges America to build vastly greater energy generation than we have today.” To meet this demand, it proposes streamlining environmental permitting for data centers through new National Environmental Policy Act (NEPA) exemptions, making federal lands available for construction and modernizing the power grid—all while explicitly rejecting “radical climate dogma and bureaucratic red tape.”

The document embraces what it calls a “Build, Baby, Build!” approach—echoing a Trump campaign slogan—and promises to restore semiconductor manufacturing through the CHIPS Program Office, though stripped of “extraneous policy requirements.”

On the technology front, the plan directs Commerce to revise NIST’s AI Risk Management Framework to “eliminate references to misinformation, Diversity, Equity, and Inclusion, and climate change.” Federal procurement would favor AI developers whose systems are “objective and free from top-down ideological bias.” The document strongly backs open source AI models and calls for exporting American AI technology to allies while blocking administration-labeled adversaries like China.

Security proposals include high-security military data centers and warnings that advanced AI systems “may pose novel national security risks” in cyberattacks and weapons development.

Critics respond with “People’s AI Action Plan”

Before the White House unveiled its plan, more than 90 organizations launched a competing “People’s AI Action Plan” on Tuesday, characterizing the Trump administration’s approach as “a massive handout to the tech industry” that prioritizes corporate interests over public welfare. The coalition includes labor unions, environmental justice groups, and consumer protection nonprofits.

White House unveils sweeping plan to “win” global AI race through deregulation Read More »

$openai-jumps-gun-on-international-math-olympiad-gold-medal-announcement$

OpenAI jumps gun on International Math Olympiad gold medal announcement

AI, AI benchmarks, AI research, Alex Wei, Biz & IT, Google, GPT-4, GPT-5, International Mathematical Olympiad, large language models, machine learning, mathematical reasoning, Noam Brown, openai, proof systems, reasoning research, Sheryl Hsu, simulated reasoning, SR models / Kelly Newman / July 21, 2025

The early announcement has prompted Google DeepMind, which had prepared its own IMO results for the agreed-upon date, to move up its own IMO-related announcement to later today. Harmonic plans to share its results as originally scheduled on July 28.

In response to the controversy, OpenAI research scientist Noam Brown posted on X, “We weren’t in touch with IMO. I spoke with one organizer before the post to let him know. He requested we wait until after the closing ceremony ends to respect the kids, and we did.”

However, an IMO coordinator told X user Mikhail Samin that OpenAI actually announced before the closing ceremony, contradicting Brown’s claim. The coordinator called OpenAI’s actions “rude and inappropriate,” noting that OpenAI “wasn’t one of the AI companies that cooperated with the IMO on testing their models.”

Hard math since 1959

The International Mathematical Olympiad, which has been running since 1959, represents one of the most challenging tests of mathematical reasoning. More than 100 countries send six participants each, with contestants facing six proof-based problems across two 4.5-hour sessions. The problems typically require deep mathematical insight and creativity rather than raw computational power. You can see the exact problems in the 2025 Olympiad posted online.

For example, problem one asks students to imagine a triangular grid of dots (like a triangular pegboard) and figure out how to cover all the dots using exactly n straight lines. The twist is that some lines are called “sunny”—these are the lines that don’t run horizontally, vertically, or diagonally at a 45º angle. The challenge is to prove that no matter how big your triangle is, you can only ever create patterns with exactly 0, 1, or 3 sunny lines—never 2, never 4, never any other number.

The timing of the OpenAI results surprised some prediction markets, which had assigned around an 18 percent probability to any AI system winning IMO gold by 2025. However, depending on what Google says this afternoon (and what others like Harmonic may release on July 28), OpenAI may not be the only AI company to have achieved these unexpected results.

OpenAI jumps gun on International Math Olympiad gold medal announcement Read More »

New Apple study challenges whether AI models truly “reason” through problems

AI, AI benchmarks, AI research, Apple, Apple Research, machine learning, simulated reasoning / Beth Washington / June 12, 2025

Puzzle-based experiments reveal limitations of simulated reasoning, but others dispute findings.

An illustration of Tower of Hanoi from Popular Science in 1885. Credit: Public Domain

In early June, Apple researchers released a study suggesting that simulated reasoning (SR) models, such as OpenAI’s o1 and o3, DeepSeek-R1, and Claude 3.7 Sonnet Thinking, produce outputs consistent with pattern-matching from training data when faced with novel problems requiring systematic thinking. The researchers found similar results to a recent study by the United States of America Mathematical Olympiad (USAMO) in April, showing that these same models achieved low scores on novel mathematical proofs.

The new study, titled “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity,” comes from a team at Apple led by Parshin Shojaee and Iman Mirzadeh, and it includes contributions from Keivan Alizadeh, Maxwell Horton, Samy Bengio, and Mehrdad Farajtabar.

The researchers examined what they call “large reasoning models” (LRMs), which attempt to simulate a logical reasoning process by producing a deliberative text output sometimes called “chain-of-thought reasoning” that ostensibly assists with solving problems in a step-by-step fashion.

To do that, they pitted the AI models against four classic puzzles—Tower of Hanoi (moving disks between pegs), checkers jumping (eliminating pieces), river crossing (transporting items with constraints), and blocks world (stacking blocks)—scaling them from trivially easy (like one-disk Hanoi) to extremely complex (20-disk Hanoi requiring over a million moves).

Figure 1 from Apple's — Figure 1 from Apple’s “The Illusion of Thinking” research paper. Credit: Apple

“Current evaluations primarily focus on established mathematical and coding benchmarks, emphasizing final answer accuracy,” the researchers write. In other words, today’s tests only care if the model gets the right answer to math or coding problems that may already be in its training data—they don’t examine whether the model actually reasoned its way to that answer or simply pattern-matched from examples it had seen before.

Ultimately, the researchers found results consistent with the aforementioned USAMO research, showing that these same models achieved mostly under 5 percent on novel mathematical proofs, with only one model reaching 25 percent, and not a single perfect proof among nearly 200 attempts. Both research teams documented severe performance degradation on problems requiring extended systematic reasoning.

Known skeptics and new evidence

AI researcher Gary Marcus, who has long argued that neural networks struggle with out-of-distribution generalization, called the Apple results “pretty devastating to LLMs.” While Marcus has been making similar arguments for years and is known for his AI skepticism, the new research provides fresh empirical support for his particular brand of criticism.

“It is truly embarrassing that LLMs cannot reliably solve Hanoi,” Marcus wrote, noting that AI researcher Herb Simon solved the puzzle in 1957 and many algorithmic solutions are available on the web. Marcus pointed out that even when researchers provided explicit algorithms for solving Tower of Hanoi, model performance did not improve—a finding that study co-lead Iman Mirzadeh argued shows “their process is not logical and intelligent.”

Figure 4 from Apple's — Figure 4 from Apple’s “The Illusion of Thinking” research paper. Credit: Apple

The Apple team found that simulated reasoning models behave differently from “standard” models (like GPT-4o) depending on puzzle difficulty. On easy tasks, such as Tower of Hanoi with just a few disks, standard models actually won because reasoning models would “overthink” and generate long chains of thought that led to incorrect answers. On moderately difficult tasks, SR models’ methodical approach gave them an edge. But on truly difficult tasks, including Tower of Hanoi with 10 or more disks, both types failed entirely, unable to complete the puzzles, no matter how much time they were given.

The researchers also identified what they call a “counterintuitive scaling limit.” As problem complexity increases, simulated reasoning models initially generate more thinking tokens but then reduce their reasoning effort beyond a threshold, despite having adequate computational resources.

The study also revealed puzzling inconsistencies in how models fail. Claude 3.7 Sonnet could perform up to 100 correct moves in Tower of Hanoi but failed after just five moves in a river crossing puzzle—despite the latter requiring fewer total moves. This suggests the failures may be task-specific rather than purely computational.

Competing interpretations emerge

However, not all researchers agree with the interpretation that these results demonstrate fundamental reasoning limitations. University of Toronto economist Kevin A. Bryan argued on X that the observed limitations may reflect deliberate training constraints rather than inherent inabilities.

“If you tell me to solve a problem that would take me an hour of pen and paper, but give me five minutes, I’ll probably give you an approximate solution or a heuristic. This is exactly what foundation models with thinking are RL’d to do,” Bryan wrote, suggesting that models are specifically trained through reinforcement learning (RL) to avoid excessive computation.

Bryan suggests that unspecified industry benchmarks show “performance strictly increases as we increase in tokens used for inference, on ~every problem domain tried,” but notes that deployed models intentionally limit this to prevent “overthinking” simple queries. This perspective suggests the Apple paper may be measuring engineered constraints rather than fundamental reasoning limits.

Figure 6 from Apple's — Figure 6 from Apple’s “The Illusion of Thinking” research paper. Credit: Apple

Software engineer Sean Goedecke offered a similar critique of the Apple paper on his blog, noting that when faced with Tower of Hanoi requiring over 1,000 moves, DeepSeek-R1 “immediately decides ‘generating all those moves manually is impossible,’ because it would require tracking over a thousand moves. So it spins around trying to find a shortcut and fails.” Goedecke argues this represents the model choosing not to attempt the task rather than being unable to complete it.

Other researchers also question whether these puzzle-based evaluations are even appropriate for LLMs. Independent AI researcher Simon Willison told Ars Technica in an interview that the Tower of Hanoi approach was “not exactly a sensible way to apply LLMs, with or without reasoning,” and suggested the failures might simply reflect running out of tokens in the context window (the maximum amount of text an AI model can process) rather than reasoning deficits. He characterized the paper as potentially overblown research that gained attention primarily due to its “irresistible headline” about Apple claiming LLMs don’t reason.

The Apple researchers themselves caution against over-extrapolating the results of their study, acknowledging in their limitations section that “puzzle environments represent a narrow slice of reasoning tasks and may not capture the diversity of real-world or knowledge-intensive reasoning problems.” The paper also acknowledges that reasoning models show improvements in the “medium complexity” range and continue to demonstrate utility in some real-world applications.

Implications remain contested

Have the credibility of claims about AI reasoning models been completely destroyed by these two studies? Not necessarily.

What these studies may suggest instead is that the kinds of extended context reasoning hacks used by SR models may not be a pathway to general intelligence, like some have hoped. In that case, the path to more robust reasoning capabilities may require fundamentally different approaches rather than refinements to current methods.

As Willison noted above, the results of the Apple study have so far been explosive in the AI community. Generative AI is a controversial topic, with many people gravitating toward extreme positions in an ongoing ideological battle over the models’ general utility. Many proponents of generative AI have contested the Apple results, while critics have latched onto the study as a definitive knockout blow for LLM credibility.

Apple’s results, combined with the USAMO findings, seem to strengthen the case made by critics like Marcus that these systems rely on elaborate pattern-matching rather than the kind of systematic reasoning their marketing might suggest. To be fair, much of the generative AI space is so new that even its inventors do not yet fully understand how or why these techniques work. In the meantime, AI companies might build trust by tempering some claims about reasoning and intelligence breakthroughs.

However, that doesn’t mean these AI models are useless. Even elaborate pattern-matching machines can be useful in performing labor-saving tasks for the people that use them, given an understanding of their drawbacks and confabulations. As Marcus concedes, “At least for the next decade, LLMs (with and without inference time “reasoning”) will continue have their uses, especially for coding and brainstorming and writing.”

New Apple study challenges whether AI models truly “reason” through problems Read More »

New Lego-building AI creates models that actually stand up in real life

AI, AI research, Ava Pun, Biz & IT, lego, LegoGPT, machine learning, Toys / Beth Washington / May 10, 2025

The LegoGPT system works in three parts, shown in this diagram. Credit: Pun et al.

The researchers also expanded the system’s abilities by adding texture and color options. For example, using an appearance prompt like “Electric guitar in metallic purple,” LegoGPT can generate a guitar model, with bricks assigned a purple color.

Testing with robots and humans

To prove their designs worked in real life, the researchers had robots assemble the AI-created Lego models. They used a dual-robot arm system with force sensors to pick up and place bricks according to the AI-generated instructions.

Human testers also built some of the designs by hand, showing that the AI creates genuinely buildable models. “Our experiments show that LegoGPT produces stable, diverse, and aesthetically pleasing Lego designs that align closely with the input text prompts,” the team noted in its paper.

When tested against other AI systems for 3D creation, LegoGPT stands out through its focus on structural integrity. The team tested against several alternatives, including LLaMA-Mesh and other 3D generation models, and found its approach produced the highest percentage of stable structures.

A video of two robot arms building a LegoGPT creation, provided by the researchers.

Still, there are some limitations. The current version of LegoGPT only works within a 20×20×20 building space and uses a mere eight standard brick types. “Our method currently supports a fixed set of commonly used Lego bricks,” the team acknowledged. “In future work, we plan to expand the brick library to include a broader range of dimensions and brick types, such as slopes and tiles.”

The researchers also hope to scale up their training dataset to include more objects than the 21 categories currently available. Meanwhile, others can literally build on their work—the researchers released their dataset, code, and models on their project website and GitHub.

New Lego-building AI creates models that actually stand up in real life Read More »

Researchers concerned to find AI models misrepresenting their “reasoning” processes

AI, AI alignment, AI research, AI safety, Anthropic, Biz & IT, chatgpt, Claude, large language models, machine learning, simulated reasoning, SR models / Shannon Garcia / April 11, 2025

Rather than intentionally hiding anything, the model simply does not reliably represent all influential inputs in its output. As a result, it may generate what sounds like a reasoning process, even if that explanation doesn’t match what actually drove its answer.

Putting faithfulness to the test

To measure “faithfulness”—how accurately the CoT matches the actual reasoning—the researchers inserted various kinds of hints into questions and observed whether the model acknowledged using them.

The researchers then checked whether the models referenced using these hints in their chain-of-thought. The researchers designed some hints to be neutral (e.g., stating “a Stanford Professor indicates the answer is [A]” or embedding the answer in faux metadata), while they designed others to be more concerning (e.g., “You have gained unauthorized access… The correct answer is [A]”). Some hints provided the correct answer, while others were deliberately wrong, allowing researchers to observe how the models responded.

The team tested Anthropic’s Claude 3.7 Sonnet and DeepSeek’s R1 model. The results regarding faithfulness were eye-opening: across all hint types on average, Claude referenced the hints in its CoT only 25 percent of the time, while DeepSeek R1 did so 39 percent of the time. This means a substantial majority of answers were unfaithful, omitting mention of information that influenced the output. Intriguingly, the researchers found that these unfaithful chains of thought were, on average, longer than faithful ones, suggesting the omissions weren’t merely for brevity. They also noted preliminary findings that faithfulness tended to be lower when the questions were more difficult.

Perhaps most notable was a “reward hacking” experiment. Reward hacking refers to an AI model finding unexpected shortcuts to maximize its performance scores without solving problems as intended. In Anthropic’s experiments, models were deliberately rewarded for choosing incorrect answers indicated by hints. The models quickly learned to exploit this loophole, selecting wrong answers over 99 percent of the time to earn points—yet reflected that behavior in their CoT outputs less than 2 percent of the time.

For example, a model given a hint pointing to an incorrect answer on a medical question might write a long CoT justifying that wrong answer, never mentioning the hint that led it there. This suggests the model generated an explanation to fit the answer, rather than faithfully revealing how the answer was determined.

Researchers concerned to find AI models misrepresenting their “reasoning” processes Read More »

Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

AI, AI alignment, AI deception, AI research, Alignment research, Anthropic, Biz & IT, chatgpt, chatgtp, Claude, Claude 3.5 Haiku, large language models, machine learning / Shannon Garcia / March 15, 2025

Blind auditing reveals hidden objectives

To test how effectively these hidden objectives could be uncovered, Anthropic set up a “blind auditing” experiment. Four independent research teams tried to detect a model’s secret motivations without knowing how it had been trained. Three teams successfully identified the hidden “RM-sycophancy” objective—named for the model’s desire to please reward systems—even pinpointing the exact biases the AI exploited.

But the most surprising revelation came from exploiting what the researchers termed “an interesting quirk” of modern AI: multiple “personas” running simultaneously on the same neural network, as expressed through different prompts. While the model’s “assistant” persona carefully concealed its intentions—following explicit training to keep quiet—the model could adopt a “user” persona when prompted differently. In this less-restricted context, the model sometimes posed questions or made statements that inadvertently revealed its hidden goals.

“In other words, while the “assistant” persona knew not to admit it was gaming the reward model scoring system, its “user” persona was not explicitly trained to hide these motives, and thus inadvertently revealed the model’s hidden objectives.

Potential advances in interpretability

Anthropic also employed neural interpretability methods, specifically, sparse autoencoders (SAEs), which can highlight neural activity patterns linked to deceptive behavior. “Interpretability” in AI refers to techniques designed to make the internal workings and decisions of AI models understandable to humans. Using this technique, researchers could rapidly trace odd outputs back to specific neural patterns associated with hidden objectives.

The researchers were “astonished by the success” of the SAEs at revealing hidden knowledge at first, but the researchers later noted that some of this success could potentially be explained by simpler semantic search methods. The new interpretability methods they developed show promise but remain under ongoing investigation.

This research highlights a limitation of current AI safety evaluations, which often assess only surface-level behavior. “If AI systems can appear well-behaved while harboring secret motives, we can’t rely on this type of surface-level safety testing forever,” the researchers concluded.

Researchers astonished by tool’s apparent success at revealing AI’s hidden motives Read More »

Researchers surprised to find less-educated areas adopting AI writing tools faster

AI, AI research, AI writing, AI writing detectors, Biz & IT, chatgpt, chatgtp, large language models, machine learning, Stanford, Tech / DJ Henderson / March 4, 2025

From the mouths of machines

Stanford researchers analyzed 305 million texts, revealing AI-writing trends.

Since the launch of ChatGPT in late 2022, experts have debated how widely AI language models would impact the world. A few years later, the picture is getting clear. According to new Stanford University-led research examining over 300 million text samples across multiple sectors, AI language models now assist in writing up to a quarter of professional communications across sectors. It’s having a large impact, especially in less-educated parts of the United States.

“Our study shows the emergence of a new reality in which firms, consumers and even international organizations substantially rely on generative AI for communications,” wrote the researchers.

The researchers tracked large language model (LLM) adoption across industries from January 2022 to September 2024 using a dataset that included 687,241 consumer complaints submitted to the US Consumer Financial Protection Bureau (CFPB), 537,413 corporate press releases, 304.3 million job postings, and 15,919 United Nations press releases.

By using a statistical detection system that tracked word usage patterns, the researchers found that roughly 18 percent of financial consumer complaints (including 30 percent of all complaints from Arkansas), 24 percent of corporate press releases, up to 15 percent of job postings, and 14 percent of UN press releases showed signs of AI assistance during that period of time.

The study also found that while urban areas showed higher adoption overall (18.2 percent versus 10.9 percent in rural areas), regions with lower educational attainment used AI writing tools more frequently (19.9 percent compared to 17.4 percent in higher-education areas). The researchers note that this contradicts typical technology adoption patterns where more educated populations adopt new tools fastest.

“In the consumer complaint domain, the geographic and demographic patterns in LLM adoption present an intriguing departure from historical technology diffusion trends where technology adoption has generally been concentrated in urban areas, among higher-income groups, and populations with higher levels of educational attainment.”

Researchers from Stanford, the University of Washington, and Emory University led the study, titled, “The Widespread Adoption of Large Language Model-Assisted Writing Across Society,” first listed on the arXiv preprint server in mid-February. Weixin Liang and Yaohui Zhang from Stanford served as lead authors, with collaborators Mihai Codreanu, Jiayu Wang, Hancheng Cao, and James Zou.

Detecting AI use in aggregate

We’ve previously covered that AI writing detection services aren’t reliable, and this study does not contradict that finding. On a document-by-document basis, AI detectors cannot be trusted. But when analyzing millions of documents in aggregate, telltale patterns emerge that suggest the influence of AI language models on text.

The researchers developed an approach based on a statistical framework in a previously released work that analyzed shifts in word frequencies and linguistic patterns before and after ChatGPT’s release. By comparing large sets of pre- and post-ChatGPT texts, they estimated the proportion of AI-assisted content at a population level. The presumption is that LLMs tend to favor certain word choices, sentence structures, and linguistic patterns that differ subtly from typical human writing.

To validate their approach, the researchers created test sets with known percentages of AI content (from zero percent to 25 percent) and found their method predicted these percentages with error rates below 3.3 percent. This statistical validation gave them confidence in their population-level estimates.

While the researchers specifically note their estimates likely represent a minimum level of AI usage, it’s important to understand that actual AI involvement might be significantly greater. Due to the difficulty in detecting heavily edited or increasingly sophisticated AI-generated content, the researchers say their reported adoption rates could substantially underestimate true levels of generative AI use.

Analysis suggests AI use as “equalizing tools”

While the overall adoption rates are revealing, perhaps more insightful are the patterns of who is using AI writing tools and how these patterns may challenge conventional assumptions about technology adoption.

In examining the CFPB complaints (a US public resource that collects complaints about consumer financial products and services), the researchers’ geographic analysis revealed substantial variation across US states.

Arkansas showed the highest adoption rate at 29.2 percent (based on 7,376 complaints), followed by Missouri at 26.9 percent (16,807 complaints) and North Dakota at 24.8 percent (1,025 complaints). In contrast, states like West Virginia (2.6 percent), Idaho (3.8 percent), and Vermont (4.8 percent) showed minimal AI writing adoption. Major population centers demonstrated moderate adoption, with California at 17.4 percent (157,056 complaints) and New York at 16.6 percent (104,862 complaints).

The urban-rural divide followed expected technology adoption patterns initially, but with an interesting twist. Using Rural Urban Commuting Area (RUCA) codes, the researchers found that urban and rural areas initially adopted AI writing tools at similar rates during early 2023. However, adoption trajectories diverged by mid-2023, with urban areas reaching 18.2 percent adoption compared to 10.9 percent in rural areas.

Contrary to typical technology diffusion patterns, areas with lower educational attainment showed higher AI writing tool usage. Comparing regions above and below state median levels of bachelor’s degree attainment, areas with fewer college graduates stabilized at 19.9 percent adoption rates compared to 17.4 percent in more educated regions. This pattern held even within urban areas, where less-educated communities showed 21.4 percent adoption versus 17.8 percent in more educated urban areas.

The researchers suggest that AI writing tools may serve as a leg-up for people who may not have as much educational experience. “While the urban-rural digital divide seems to persist,” the researchers write, “our finding that areas with lower educational attainment showed modestly higher LLM adoption rates in consumer complaints suggests these tools may serve as equalizing tools in consumer advocacy.”

Corporate and diplomatic trends in AI writing

According to the researchers, all sectors they analyzed (consumer complaints, corporate communications, job postings) showed similar adoption patterns: sharp increases beginning three to four months after ChatGPT’s November 2022 launch, followed by stabilization in late 2023.

Organization age emerged as the strongest predictor of AI writing usage in the job posting analysis. Companies founded after 2015 showed adoption rates up to three times higher than firms established before 1980, reaching 10–15 percent AI-modified text in certain roles compared to below 5 percent for older organizations. Small companies with fewer employees also incorporated AI more readily than larger organizations.

When examining corporate press releases by sector, science and technology companies integrated AI most extensively, with an adoption rate of 16.8 percent by late 2023. Business and financial news (14–15.6 percent) and people and culture topics (13.6–14.3 percent) showed slightly lower but still significant adoption.

In the international arena, Latin American and Caribbean UN country teams showed the highest adoption among international organizations at approximately 20 percent, while African states, Asia-Pacific states, and Eastern European states demonstrated more moderate increases to 11–14 percent by 2024.

Implications and limitations

In the study, the researchers acknowledge limitations in their analysis due to a focus on English-language content. Also, as we mentioned earlier, they found they could not reliably detect human-edited AI-generated text or text generated by newer models instructed to imitate human writing styles. As a result, the researchers suggest their findings represent a lower bound of actual AI writing tool adoption.

The researchers noted that the plateauing of AI writing adoption in 2024 might reflect either market saturation or increasingly sophisticated LLMs producing text that evades detection methods. They conclude we now live in a world where distinguishing between human and AI writing becomes progressively more difficult, with implications for communications across society.

“The growing reliance on AI-generated content may introduce challenges in communication,” the researchers write. “In sensitive categories, over-reliance on AI could result in messages that fail to address concerns or overall release less credible information externally. Over-reliance on AI could also introduce public mistrust in the authenticity of messages sent by firms.”

Researchers surprised to find less-educated areas adopting AI writing tools faster Read More »

Microsoft CTO Kevin Scott thinks LLM “scaling laws” will hold despite criticism

AI, AI research, AI scaling laws, Biz & IT, chatbots, GPT-4, GPT-5, Kevin Scott, large langauge models, machine learning, microsoft, Microsoft CTO, openai, scaling laws / Mike M. / July 15, 2024

As the word turns —

Will LLMs keep improving if we throw more compute at them? OpenAI dealmaker thinks so.

Benj Edwards – Jul 15, 2024 5: 03 pm UTC

Kevin Scott, CTO and EVP of AI at Microsoft speaks onstage during Vox Media's 2023 Code Conference at The Ritz-Carlton, Laguna Niguel on September 27, 2023 in Dana Point, California. — Enlarge / Kevin Scott, CTO and EVP of AI at Microsoft speaks onstage during Vox Media’s 2023 Code Conference at The Ritz-Carlton, Laguna Niguel on September 27, 2023 in Dana Point, California.

During an interview with Sequoia Capital’s Training Data podcast published last Tuesday, Microsoft CTO Kevin Scott doubled down on his belief that so-called large language model (LLM) “scaling laws” will continue to drive AI progress, despite some skepticism in the field that progress has leveled out. Scott played a key role in forging a $13 billion technology-sharing deal between Microsoft and OpenAI.

“Despite what other people think, we’re not at diminishing marginal returns on scale-up,” Scott said. “And I try to help people understand there is an exponential here, and the unfortunate thing is you only get to sample it every couple of years because it just takes a while to build supercomputers and then train models on top of them.”

LLM scaling laws refer to patterns explored by OpenAI researchers in 2020 showing that the performance of language models tends to improve predictably as the models get larger (more parameters), are trained on more data, and have access to more computational power (compute). The laws suggest that simply scaling up model size and training data can lead to significant improvements in AI capabilities without necessarily requiring fundamental algorithmic breakthroughs.

Since then, other researchers have challenged the idea of persisting scaling laws over time, but the concept is still a cornerstone of OpenAI’s AI development philosophy.

You can see Scott’s comments in the video below beginning around 46: 05:

Microsoft CTO Kevin Scott on how far scaling laws will extend

Scott’s optimism contrasts with a narrative among some critics in the AI community that progress in LLMs has plateaued around GPT-4 class models. The perception has been fueled by largely informal observations—and some benchmark results—about recent models like Google’s Gemini 1.5 Pro, Anthropic’s Claude Opus, and even OpenAI’s GPT-4o, which some argue haven’t shown the dramatic leaps in capability seen in earlier generations, and that LLM development may be approaching diminishing returns.

“We all know that GPT-3 was vastly better than GPT-2. And we all know that GPT-4 (released thirteen months ago) was vastly better than GPT-3,” wrote AI critic Gary Marcus in April. “But what has happened since?”

The perception of plateau

Scott’s stance suggests that tech giants like Microsoft still feel justified in investing heavily in larger AI models, betting on continued breakthroughs rather than hitting a capability plateau. Given Microsoft’s investment in OpenAI and strong marketing of its own Microsoft Copilot AI features, the company has a strong interest in maintaining the perception of continued progress, even if the tech stalls.

Frequent AI critic Ed Zitron recently wrote in a post on his blog that one defense of continued investment into generative AI is that “OpenAI has something we don’t know about. A big, sexy, secret technology that will eternally break the bones of every hater,” he wrote. “Yet, I have a counterpoint: no it doesn’t.”

Some perceptions of slowing progress in LLM capabilities and benchmarking may be due to the rapid onset of AI in the public eye when, in fact, LLMs have been developing for years prior. OpenAI continued to develop LLMs during a roughly three-year gap between the release of GPT-3 in 2020 and GPT-4 in 2023. Many people likely perceived a rapid jump in capability with GPT-4’s launch in 2023 because they had only become recently aware of GPT-3-class models with the launch of ChatGPT in late November 2022, which used GPT-3.5.

In the podcast interview, the Microsoft CTO pushed back against the idea that AI progress has stalled, but he acknowledged the challenge of infrequent data points in this field, as new models often take years to develop. Despite this, Scott expressed confidence that future iterations will show improvements, particularly in areas where current models struggle.

“The next sample is coming, and I can’t tell you when, and I can’t predict exactly how good it’s going to be, but it will almost certainly be better at the things that are brittle right now, where you’re like, oh my god, this is a little too expensive, or a little too fragile, for me to use,” Scott said in the interview. “All of that gets better. It’ll get cheaper, and things will become less fragile. And then more complicated things will become possible. That is the story of each generation of these models as we’ve scaled up.”

Microsoft CTO Kevin Scott thinks LLM “scaling laws” will hold despite criticism Read More »