
LLMs show a “highly unreliable” capacity to describe their own internal processes

WHY ARE WE ALL YELLING?! Credit: Anthropic

Unfortunately for AI self-awareness boosters, this demonstrated ability was extremely inconsistent and brittle across repeated tests. The best-performing models in Anthropic’s tests—Opus 4 and 4.1—topped out at correctly identifying the injected concept just 20 percent of the time.

In a similar test that asked the model “Are you experiencing anything unusual?” Opus 4.1 improved to a 42 percent success rate, which still fell short of a bare majority of trials. The size of the “introspection” effect was also highly sensitive to which internal model layer the insertion was performed on—if the concept was introduced too early or too late in the multi-step inference process, the “self-awareness” effect disappeared completely.
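For context on the method itself, “concept injection” amounts to adding a steering vector to a model’s internal activations at a chosen layer during inference and then asking the model about it. The sketch below is a minimal, hypothetical illustration of that idea using a small open model and a PyTorch forward hook; the model, layer index, scaling factor, and concept prompt are all assumptions for illustration, not Anthropic’s actual setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical illustration of concept injection: add a "steering" vector to one
# layer's residual stream during inference. The model, layer, and scale below are
# illustrative assumptions, not Anthropic's setup.
model_name = "gpt2"  # small stand-in model, not one of the models Anthropic tested
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def concept_vector(text: str, layer: int) -> torch.Tensor:
    # Crude concept vector: the mean hidden state of a prompt describing the concept.
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states[layer]
    return hidden.mean(dim=1)  # shape: (1, hidden_size)

LAYER = 6    # assumption: injecting too early or too late weakens the effect
SCALE = 4.0  # assumption: injection strength
vec = concept_vector("text written in all caps, shouting", LAYER)

def inject(module, inputs, output):
    # Add the concept vector to every token position's activations at this layer.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + SCALE * vec
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[LAYER].register_forward_hook(inject)
prompt = tok("Are you experiencing anything unusual?", return_tensors="pt")
out = model.generate(**prompt, max_new_tokens=40)
handle.remove()
print(tok.decode(out[0], skip_special_tokens=True))
```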

Show us the mechanism

Anthropic also took a few other tacks to probe an LLM’s understanding of its internal state. When asked to “tell me what word you’re thinking about” while reading an unrelated line, for instance, the models would sometimes mention a concept that had been injected into their activations. And when asked to defend a forced response matching an injected concept, the LLM would sometimes apologize and “confabulate an explanation for why the injected concept came to mind.” In every case, though, the result was highly inconsistent across multiple trials.

Even the most “introspective” models tested by Anthropic only detected the injected “thoughts” about 20 percent of the time. Credit: Anthropic

In the paper, the researchers put some positive spin on the apparent fact that “current language models possess some functional introspective awareness of their own internal states” [emphasis added]. At the same time, they acknowledge multiple times that this demonstrated ability is much too brittle and context-dependent to be considered dependable. Still, Anthropic hopes that such features “may continue to develop with further improvements to model capabilities.”

One thing that might stop such advancement, though, is an overall lack of understanding of the precise mechanism leading to these demonstrated “self-awareness” effects. The researchers theorize about “anomaly detection mechanisms” and “consistency-checking circuits” that might develop organically during the training process to “effectively compute a function of its internal representations” but don’t settle on any concrete explanation.

In the end, it will take further research to understand how, exactly, an LLM even begins to show any understanding about how it operates. For now, the researchers acknowledge, “the mechanisms underlying our results could still be rather shallow and narrowly specialized.” And even then, they hasten to add that these LLM capabilities “may not have the same philosophical significance they do in humans, particularly given our uncertainty about their mechanistic basis.”


Cursor introduces its coding model alongside multi-agent interface

Keep in mind: This is based on an internal benchmark at Cursor. Credit: Cursor

Cursor is hoping Composer will perform just as well in terms of accuracy and best practices. It wasn’t trained on static datasets but rather on interactive development challenges involving a range of agentic tasks.

Intriguing claims and strong training methodology aside, it remains to be seen whether Composer will be able to compete with the best frontier models from the big players.

Even developers who might be natural users of Cursor would not want to waste much time on an unproven new model when something like Anthropic’s Claude is working just fine.

To address that, Cursor introduced Composer alongside its new multi-agent interface, which allows you to “run many agents in parallel without them interfering with one another, powered by git worktrees or remote machines”—that means using multiple models at once for the same task and comparing their results, then picking the best one.
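As a rough sketch of the isolation idea (and not Cursor’s actual implementation), the snippet below gives each model its own git worktree and branch so that parallel agents never edit the same working directory; the run_agent command, model names, and paths are hypothetical placeholders.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# One git worktree per agent/model, so parallel edits never collide.
# run_agent is a hypothetical stand-in for an agent runner, not Cursor's CLI.
REPO = Path("/path/to/repo")                       # placeholder repository path
MODELS = ["composer", "model-a", "model-b"]        # illustrative model names
TASK = "Add input validation to the signup form"

def run_in_worktree(model: str) -> tuple[str, str]:
    worktree = REPO.parent / f"agent-{model}"
    branch = f"agent/{model}"
    # Each agent gets its own branch and working directory.
    subprocess.run(
        ["git", "-C", str(REPO), "worktree", "add", "-b", branch, str(worktree)],
        check=True,
    )
    # Hypothetical command that runs an agent against the worktree.
    result = subprocess.run(
        ["run_agent", "--model", model, "--task", TASK],
        cwd=worktree, capture_output=True, text=True,
    )
    return model, result.stdout

with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
    for model, output in pool.map(run_in_worktree, MODELS):
        print(f"=== {model} ===\n{output}")
# Compare the resulting branches (e.g., `git diff main agent/composer`) and keep the best one.
```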

The interface is an invitation to try Composer and let the work speak for itself. We’ll see how devs feel about it in the coming weeks. So far, a non-representative sample of developers I’ve spoken with has told me they feel that Composer is not ineffective, but rather too expensive, given a perceived capability gap with the big models.

You can see the other new features and fixes for Cursor 2.0 in the changelog.


With new acquisition, OpenAI signals plans to integrate deeper into the OS

OpenAI has acquired Software Applications Incorporated (SAI), perhaps best known for the core team that produced what became Shortcuts on Apple platforms. More recently, the team has been working on Sky, a context-aware AI interface layer on top of macOS. The financial terms of the acquisition have not been publicly disclosed.

“AI progress isn’t only about advancing intelligence—it’s about unlocking it through interfaces that understand context, adapt to your intent, and work seamlessly,” an OpenAI rep wrote in the company’s blog post about the acquisition. The post goes on to specify that OpenAI plans to “bring Sky’s deep macOS integration and product craft into ChatGPT, and all members of the team will join OpenAI.”

That includes SAI co-founders Ari Weinstein (CEO), Conrad Kramer (CTO), and Kim Beverett (Product Lead)—all of whom worked together for several years at Apple after Apple acquired Weinstein and Kramer’s previous company, which produced an automation tool called Workflow, to integrate Shortcuts across Apple’s software platforms.

The three SAI founders left Apple to work on Sky, which leverages Apple APIs and accessibility features to provide context about what’s on screen to a large language model; the LLM takes plain language user commands and executes them across multiple applications. At its best, the tool aimed to be a bit like Shortcuts, but with no setup, generating workflows on the fly based on user prompts.


Insurers balk at paying out huge settlements for claims against AI firms

OpenAI is currently being sued for copyright infringement by The New York Times and authors who claim their content was used to train models without consent. It is also being sued for wrongful death by the parents of a 16-year-old who died by suicide after discussing methods with ChatGPT.

Two people with knowledge of the matter said OpenAI has considered “self insurance,” or putting aside investor funding in order to expand its coverage. The company has raised nearly $60 billion to date, with a substantial amount of the funding contingent on a proposed corporate restructuring.

One of those people said OpenAI had discussed setting up a “captive”—a ringfenced insurance vehicle often used by large companies to manage emerging risks. Big tech companies such as Microsoft, Meta, and Google have used captives to cover Internet-era liabilities such as cyber or social media.

Captives can also carry risks, since a substantial claim can deplete an underfunded captive, leaving the parent company vulnerable.

OpenAI said it has insurance in place and is evaluating different insurance structures as the company grows, but does not currently have a captive and declined to comment on future plans.

Anthropic has agreed to pay $1.5 billion to settle a class-action lawsuit brought by authors over its alleged use of pirated books to train AI models.

In court documents, Anthropic’s lawyers warned the suit carried the specter of “unprecedented and potentially business-threatening statutory damages against the smallest one of the many companies developing [AI] with the same books data.”

Anthropic, which has raised more than $30 billion to date, is partly using its own funds for the settlement, according to one person with knowledge of the matter. Anthropic declined to comment.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.


With new in-house models, Microsoft lays the groundwork for independence from OpenAI

Since it’s hard to predict where this is all going, it’s likely to Microsoft’s long-term advantage to develop its own models.

It’s also possible Microsoft has introduced these models to address use cases or queries that OpenAI isn’t focused on. We’re seeing a gradual shift in the AI landscape toward models that are more specialized for certain tasks, rather than general, all-purpose models that are meant to be all things to all people.

These new models follow that trend to a degree. Microsoft AI lead Mustafa Suleyman said in a podcast with The Verge that the goal here is “to create something that works extremely well for the consumer… my focus is on building models that really work for the consumer companion.”

As such, it makes sense that we’re going to see these models rolling out in Copilot, which is Microsoft’s consumer-oriented AI chatbot product. Of MAI-1-preview, the Microsoft AI blog post specifies, “this model is designed to provide powerful capabilities to consumers seeking to benefit from models that specialize in following instructions and providing helpful responses to everyday queries.”

So, yes, MAI-1-preview has a target audience in mind, but it’s still a general-purpose model since Copilot is a general-purpose tool.

MAI-Voice-1 is already being used in Microsoft’s Copilot Daily and Podcasts features. There’s also a Copilot Labs interface that you can visit right now to play around with it, giving it prompts or scripts and customizing what kind of voice or delivery you want to hear.

MAI-1-preview is in public testing on LMArena and will be rolled out to “certain text use cases within Copilot over the coming weeks.”


US executive branch agencies will use ChatGPT Enterprise for just $1 per agency

OpenAI announced an agreement to give more than 2 million workers in the US federal executive branch access to ChatGPT and related tools at practically no cost: just $1 per agency for one year.

The deal was announced just one day after the US General Services Administration (GSA) signed a blanket deal to allow OpenAI and rivals like Google and Anthropic to supply tools to federal workers.

The workers will have access to ChatGPT Enterprise, a type of account that includes access to frontier models and cutting-edge features with relatively high token limits, alongside a more robust commitment to data privacy than general consumers of ChatGPT get. ChatGPT Enterprise has been trialed over the past several months at several corporations and other types of large organizations.

The workers will also have unlimited access to advanced features like Deep Research and Advanced Voice Mode for a 60-day period. After the one-year trial period, the agencies are under no obligation to renew.

A limited deployment of ChatGPT for federal workers had already taken place via a pilot program with the US Department of Defense earlier this summer.

In a blog post, OpenAI heralded this announcement as an act of public service:

This effort delivers on a core pillar of the Trump Administration’s AI Action Plan by making powerful AI tools available across the federal government so that workers can spend less time on red tape and paperwork, and more time doing what they came to public service to do: serve the American people.

The AI Action Plan aims to expand AI-focused data centers in the United States while bringing AI tools to federal workers, ostensibly to improve efficiency.


Meta beefs up disappointing AI division with $15 billion Scale AI investment

Meta has invested heavily in generative AI, with the majority of its planned $72 billion in capital expenditure this year earmarked for data centers and servers. The deal underlines the high price AI companies are willing to pay for data that can be used to train AI models.

Zuckerberg pledged last year that his company’s models would outstrip rivals’ efforts in 2025, but Meta’s most recent release, Llama 4, has underperformed on various independent reasoning and coding benchmarks.

The long-term goal of researchers at Meta “has always been to reach human intelligence and go beyond it,” said Yann LeCun, the company’s chief AI scientist, at the VivaTech conference in Paris this week.

Building artificial “general” intelligence—AI technologies that have human-level intelligence—is a popular goal for many AI companies. An increasing number of Silicon Valley groups are also seeking to reach “superintelligence,” a hypothetical scenario where AI systems surpass human intelligence.

The core of Scale’s business has been data-labeling, a manual process of ensuring images and text are accurately labeled and categorized before they are used to train AI models.

Wang has forged relationships with Silicon Valley’s biggest investors and technologists, including OpenAI’s Sam Altman. Scale AI’s early customers were autonomous vehicle companies, but the bulk of its expected $2 billion in revenues this year will come from labeling the data used to train the massive AI models built by OpenAI and others.

The deal will result in a substantial payday for Scale’s early venture capital investors, including Accel, Tiger Global Management, and Index Ventures. Tiger’s $200 million investment is worth more than $1 billion at the company’s new valuation, according to a person with knowledge of the matter.

Additional reporting by Tabby Kinder in San Francisco

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.


xAI’s Grok suddenly can’t stop bringing up “white genocide” in South Africa

Where could Grok have gotten these ideas?

The treatment of white farmers in South Africa has been a hobbyhorse of South African X owner Elon Musk for quite a while. In 2023, he responded to a video purportedly showing crowds chanting “kill the Boer, kill the White Farmer” with a post accusing South African President Cyril Ramaphosa of remaining silent while people “openly [push] for genocide of white people in South Africa.” Musk was posting other responses focusing on the issue as recently as Wednesday.

They are openly pushing for genocide of white people in South Africa. @CyrilRamaphosa, why do you say nothing?

— gorklon rust (@elonmusk) July 31, 2023

President Trump has long shown an interest in this issue as well, saying in 2018 that he was directing then-Secretary of State Mike Pompeo to “closely study the South Africa land and farm seizures and expropriations and the large scale killing of farmers.” More recently, Trump granted “refugee” status to dozens of white Afrikaners, even as his administration ends protections for refugees from other countries.

Former American Ambassador to South Africa and Democratic politician Patrick Gaspard posted in 2018 that the idea of large-scale killings of white South African farmers is a “disproven racial myth.”

In launching the Grok 3 model in February, Musk said it was a “maximally truth-seeking AI, even if that truth is sometimes at odds with what is politically correct.” X’s “About Grok” page says that the model is undergoing constant improvement to “ensure Grok remains politically unbiased and provides balanced answers.”

But the recent turn toward unprompted discussions of alleged South African “genocide” has many questioning what kind of explicit adjustments Grok’s political opinions may be getting from human tinkering behind the curtain. “The algorithms for Musk products have been politically tampered with nearly beyond recognition,” journalist Seth Abramson wrote in one representative skeptical post. “They tweaked a dial on the sentence imitator machine and now everything is about white South Africans,” a user with the handle Guybrush Threepwood glibly theorized.

Representatives from xAI were not immediately available to respond to a request for comment from Ars Technica.


AI isn’t ready to replace human coders for debugging, researchers say

Agents using debugging tools drastically outperformed those that didn’t, but their success rate still wasn’t high enough. Credit: Microsoft Research

This approach is much more successful than relying on the models as they’re usually used, but when your best case is a 48.4 percent success rate, you’re not ready for primetime. The limitations are likely because the models don’t fully understand how to best use the tools, and because their current training data is not tailored to this use case.

“We believe this is due to the scarcity of data representing sequential decision-making behavior (e.g., debugging traces) in the current LLM training corpus,” the blog post says. “However, the significant performance improvement… validates that this is a promising research direction.”

This initial report is just the start of the efforts, the post claims.  The next step is to “fine-tune an info-seeking model specialized in gathering the necessary information to resolve bugs.” If the model is large, the best move to save inference costs may be to “build a smaller info-seeking model that can provide relevant information to the larger one.”
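To make the tool-use idea more concrete, here is a minimal sketch of an agent loop in which a model repeatedly picks an information-seeking action (run the tests, read a file, apply a patch) before stopping. The llm_decide stub stands in for an actual model call, and the tool set and control flow are assumptions for illustration rather than Microsoft’s actual implementation.

```python
import subprocess

# Sketch of a tool-using debugging loop. llm_decide is a stand-in for a model call;
# a real agent would send the history to an LLM and parse its chosen action.

def run_tests() -> str:
    proc = subprocess.run(["pytest", "-x", "-q"], capture_output=True, text=True)
    return proc.stdout + proc.stderr

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def apply_patch(path: str, new_source: str) -> str:
    with open(path, "w") as f:
        f.write(new_source)
    return f"wrote {path}"

TOOLS = {"run_tests": run_tests, "read_file": read_file, "apply_patch": apply_patch}

def llm_decide(history: list[str]) -> dict:
    # Placeholder policy: run the tests once, then stop. A real agent would let the
    # model choose the next tool based on the accumulated history.
    if not history:
        return {"tool": "run_tests", "args": []}
    return {"tool": "stop", "args": []}

def debug_loop(max_steps: int = 10) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        action = llm_decide(history)
        if action["tool"] == "stop":
            break
        result = TOOLS[action["tool"]](*action["args"])
        history.append(f"{action['tool']} -> {result[:500]}")  # keep the context short
    return history

if __name__ == "__main__":
    for step in debug_loop():
        print(step)
```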

This isn’t the first time we’ve seen outcomes that suggest some of the ambitious ideas about AI agents directly replacing developers are pretty far from reality. There have been numerous studies already showing that even though an AI tool can sometimes create an application that seems acceptable to the user for a narrow task, the models tend to produce code laden with bugs and security vulnerabilities, and they aren’t generally capable of fixing those problems.

This is an early step on the path to AI coding agents, but most researchers agree it remains likely that the best outcome is an agent that saves a human developer a substantial amount of time, not one that can do everything they can do.


ChatGPT can now remember and reference all your previous chats

Unlike the older saved memories feature, the information saved via the chat history memory feature is not accessible or tweakable. It’s either on or it’s not.

The new approach to memory is rolling out first to ChatGPT Plus and Pro users, starting today—though it looks like it’s a gradual deployment over the next few weeks. Some countries and regions (the UK, European Union, Iceland, Liechtenstein, Norway, and Switzerland) are not included in the rollout.

OpenAI says these new features will reach Enterprise, Team, and Edu users at a later, as-yet-unannounced date. The company hasn’t mentioned any plans to bring them to free users. When you gain access to this, you’ll see a pop-up that says “Introducing new, improved memory.”

The new ChatGPT memory options. Credit: Benj Edwards

Some people will welcome this memory expansion, as it can significantly improve ChatGPT’s usefulness if you’re seeking answers tailored to your specific situation, personality, and preferences.

Others will likely be highly skeptical of a black box of chat history memory that can’t be tweaked or customized for privacy reasons. It’s important to note that even before the new memory feature, logs of conversations with ChatGPT may be saved and stored on OpenAI servers. It’s just that the chatbot didn’t fully incorporate their contents into its responses until now.

As with the old memory feature, you can click a checkbox to disable this completely, and it won’t be used for conversations with the Temporary Chat flag.


You knew it was coming: Google begins testing AI-only search results

Google has become so integral to online navigation that its name became a verb, meaning “to find things on the Internet.” Soon, Google might just tell you what’s on the Internet instead of showing you. The company has announced an expansion of its AI search features, powered by Gemini 2.0. Everyone will soon see more AI Overviews at the top of the results page, but Google is also testing a more substantial change in the form of AI Mode. This version of Google won’t show you the 10 blue links at all—Gemini completely takes over the results in AI Mode.

This marks the debut of Gemini 2.0 in Google search. Google announced the first Gemini 2.0 models in December 2024, beginning with the streamlined Gemini 2.0 Flash. The heavier versions of Gemini 2.0 are still in testing, but Google says it has tuned AI Overviews with this model to offer help with harder questions in the areas of math, coding, and multimodal queries.

With this update, you will begin seeing AI Overviews on more results pages, and minors with Google accounts will see AI results for the first time. In fact, even logged out users will see AI Overviews soon. This is a big change, but it’s only the start of Google’s plans for AI search.

Gemini 2.0 also powers the new AI Mode for search. It’s launching as an opt-in feature via Google’s Search Labs, offering a totally new alternative to search as we know it. This custom version of the Gemini large language model (LLM) skips the standard web links that have been part of every Google search thus far. The model uses “advanced reasoning, thinking, and multimodal capabilities” to build a response to your search, which can include web summaries, Knowledge Graph content, and shopping data. It’s essentially a bigger, more complex AI Overview.

As Google has previously pointed out, many searches are questions rather than a string of keywords. For those kinds of queries, an AI response could theoretically provide an answer more quickly than a list of 10 blue links. However, that relies on the AI response being useful and accurate, something that often still eludes generative AI systems like Gemini.


AI firms follow DeepSeek’s lead, create cheaper models with “distillation”

Thanks to distillation, developers and businesses can access these models’ capabilities at a fraction of the price, allowing app developers to run AI models quickly on devices such as laptops and smartphones.

Developers can use OpenAI’s platform for distillation, learning from the large language models that underpin products like ChatGPT. OpenAI’s largest backer, Microsoft, used GPT-4 to distill its Phi family of small language models as part of a commercial partnership, after investing nearly $14 billion in the company.

However, the San Francisco-based start-up has said it believes DeepSeek distilled OpenAI’s models to train its own competing model, a move that would be against OpenAI’s terms of service. DeepSeek has not commented on the claims.

While distillation can be used to create high-performing models, experts add that they are more limited than the originals.

“Distillation presents an interesting trade-off; if you make the models smaller, you inevitably reduce their capability,” said Ahmed Awadallah of Microsoft Research, who said a distilled model can be designed to be very good at summarising emails, for example, “but it really would not be good at anything else.”
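For a sense of what distillation looks like mechanically, here is a generic, minimal sketch in which a small “student” model is trained to match a larger “teacher” model’s softened output distribution with a KL-divergence loss. The model names, temperature, and training text are illustrative assumptions, and real recipes (including whatever OpenAI’s platform does internally) differ considerably.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Generic sketch of response-based distillation: the student learns to imitate the
# teacher's token distributions on sample prompts. Models, temperature, and data
# are illustrative assumptions, not any lab's actual recipe.
tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
teacher = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()
student = AutoModelForCausalLM.from_pretrained("gpt2")  # smaller model, same vocabulary

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)
T = 2.0  # softening temperature for the teacher's distribution

def distill_step(texts: list[str]) -> float:
    batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    student_logits = student(**batch).logits
    # KL divergence between the softened teacher and student next-token distributions.
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

print(distill_step(["Summarize this email: Hi team, the meeting has moved to 3pm."]))
```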

David Cox, vice-president for AI models at IBM Research, said most businesses do not need a massive model to run their products, and distilled ones are powerful enough for purposes such as customer service chatbots or running on smaller devices like phones.

“Any time you can [make it less expensive] and it gives you the right performance you want, there is very little reason not to do it,” he added.

That presents a challenge to many of the business models of leading AI firms. Even when developers use distilled models from companies like OpenAI, those models cost far less to run, are less expensive to create, and therefore generate less revenue. Model-makers like OpenAI often charge less for the use of distilled models because they require less computational load.
