Anthropic

Claude’s AI research mode now runs for up to 45 minutes before delivering reports

Still, the report contained a direct quote from William Higinbotham that appears to combine quotes from two sources not cited in the source list. (One must always be careful with confabulated quotes in AI because even outside of this Research mode, Claude 3.7 Sonnet tends to invent plausible ones to fit a narrative.) We recently covered a study that showed AI search services confabulate sources frequently, and in this case, it appears that the sources Claude Research surfaced, while real, did not always match what is stated in the report.

There’s always room for interpretation and variation in detail, of course, but overall, Claude Research did a relatively good job crafting a report on this particular topic. Still, you’d want to dig more deeply into each source and confirm everything if you used it as the basis for serious research. You can read the full Claude-generated result as this text file, saved in markdown format. Sadly, the markdown version does not include the source URLs found in the Claude web interface.

Integrations feature

Anthropic also announced Thursday that it has broadened Claude’s data access capabilities. In addition to web search and Google Workspace integration, Claude can now search any connected application through the company’s new “Integrations” feature. The feature reminds us somewhat of OpenAI’s ChatGPT Plugins feature from March 2023 that aimed for similar connections, although the two features work differently under the hood.

These Integrations allow Claude to work with remote Model Context Protocol (MCP) servers across web and desktop applications. The MCP standard, which Anthropic introduced last November and we covered in April, connects AI applications to external tools and data sources.

At launch, Claude supports Integrations with 10 services, including Atlassian’s Jira and Confluence, Zapier, Cloudflare, Intercom, Asana, Square, Sentry, PayPal, Linear, and Plaid. The company plans to add more partners like Stripe and GitLab in the future.

Each integration aims to expand Claude’s functionality in specific ways. The Zapier integration, for instance, reportedly connects thousands of apps through pre-built automation sequences, allowing Claude to automatically pull sales data from HubSpot or prepare meeting briefs based on calendar entries. With Atlassian’s tools, Anthropic says that Claude can collaborate on product development, manage tasks, and create multiple Confluence pages and Jira work items simultaneously.

Anthropic has made its advanced Research and Integrations features available in beta for users on Max, Team, and Enterprise plans, with Pro plan access coming soon. The company has also expanded its web search feature (introduced in March) to all Claude users on paid plans globally.

OpenAI releases new simulated reasoning models with full tool access


New o3 model appears “near-genius level,” according to one doctor, but it still makes mistakes.

On Wednesday, OpenAI announced the release of two new models—o3 and o4-mini—that combine simulated reasoning capabilities with access to functions like web browsing and coding. These models mark the first time OpenAI’s reasoning-focused models can use every ChatGPT tool simultaneously, including visual analysis and image generation.

OpenAI announced o3 in December, and until now, only less-capable derivative models named “o3-mini” and “o3-mini-high” have been available. The new models replace their predecessors, o1 and o3-mini.

OpenAI is rolling out access today for ChatGPT Plus, Pro, and Team users, with Enterprise and Edu customers gaining access next week. Free users can try o4-mini by selecting the “Think” option before submitting queries. OpenAI CEO Sam Altman tweeted, “we expect to release o3-pro to the pro tier in a few weeks.”

For developers, both models are available starting today through the Chat Completions API and Responses API, though some organizations will need verification for access.

The new models offer several improvements. According to OpenAI’s website, “These are the smartest models we’ve released to date, representing a step change in ChatGPT’s capabilities for everyone from curious users to advanced researchers.” OpenAI also says the models offer better cost efficiency than their predecessors, and each comes with a different intended use case: o3 targets complex analysis, while o4-mini, being a smaller version of its next-gen SR model “o4” (not yet released), optimizes for speed and cost-efficiency.

OpenAI says o3 and o4-mini are multimodal, featuring the ability to “think with images.” Credit: OpenAI

What sets these new models apart from OpenAI’s other models (like GPT-4o and GPT-4.5) is their simulated reasoning capability, which uses a simulated step-by-step “thinking” process to solve problems. Additionally, the new models dynamically determine when and how to deploy aids to solve multistep problems. For example, when asked about future energy usage in California, the models can autonomously search for utility data, write Python code to build forecasts, generate visualizing graphs, and explain key factors behind predictions—all within a single query.

OpenAI touts the new models’ multimodal ability to incorporate images directly into their simulated reasoning process—not just analyzing visual inputs but actively “thinking with” them. This capability allows the models to interpret whiteboards, textbook diagrams, and hand-drawn sketches, even when images are blurry or of low quality.

That said, the new releases continue OpenAI’s tradition of selecting confusing product names that don’t tell users much about each model’s relative capabilities—for example, o3 is more powerful than o4-mini despite including a lower number. Then there’s potential confusion with the firm’s non-reasoning AI models. As Ars Technica contributor Timothy B. Lee noted today on X, “It’s an amazing branding decision to have a model called GPT-4o and another one called o4.”

Vibes and benchmarks

All that aside, we know what you’re thinking: What about the vibes? While we have not used o3 or o4-mini yet, frequent AI commentator and Wharton professor Ethan Mollick compared o3 favorably to Google’s Gemini 2.5 Pro on Bluesky. “After using them both, I think that Gemini 2.5 & o3 are in a similar sort of range (with the important caveat that more testing is needed for agentic capabilities),” he wrote. “Each has its own quirks & you will likely prefer one to another, but there is a gap between them & other models.”

During the livestream announcement for o3 and o4-mini today, OpenAI President Greg Brockman boldly claimed: “These are the first models where top scientists tell us they produce legitimately good and useful novel ideas.”

Early user feedback seems to support this assertion, although, until more third-party testing takes place, it’s wise to be skeptical of the claims. On X, immunologist Derya Unutmaz said o3 appeared “at or near genius level” and wrote, “It’s generating complex incredibly insightful and based scientific hypotheses on demand! When I throw challenging clinical or medical questions at o3, its responses sound like they’re coming directly from a top subspecialist physician.”

OpenAI benchmark results for o3 and o4-mini SR models. Credit: OpenAI

So the vibes seem on target, but what about numerical benchmarks? Here’s an interesting one: OpenAI reports that o3 makes “20 percent fewer major errors” than o1 on difficult tasks, with particular strengths in programming, business consulting, and “creative ideation.”

The company also reported state-of-the-art performance on several metrics. On the American Invitational Mathematics Examination (AIME) 2025, o4-mini achieved 92.7 percent accuracy. For programming tasks, o3 reached 69.1 percent accuracy on SWE-Bench Verified, a popular programming benchmark. The models also reportedly showed strong results on visual reasoning benchmarks, with o3 scoring 82.9 percent on MMMU (massive multi-disciplinary multimodal understanding), a college-level visual problem-solving test.

OpenAI benchmark results for o3 and o4-mini SR models. Credit: OpenAI

However, these benchmarks provided by OpenAI lack independent verification. One early evaluation of a pre-release o3 model by independent AI research lab Transluce found that the model exhibited recurring types of confabulations, such as claiming to run code locally or providing hardware specifications, and hypothesized this could be due to the model lacking access to its own reasoning processes from previous conversational turns. “It seems that despite being incredibly powerful at solving math and coding tasks, o3 is not by default truthful about its capabilities,” wrote Transluce in a tweet.

Also, some evaluations from OpenAI include footnotes about methodology that bear consideration. For a “Humanity’s Last Exam” benchmark result that measures expert-level knowledge across subjects (o3 scored 20.32 with no tools, but 24.90 with browsing and tools), OpenAI notes that browsing-enabled models could potentially find answers online. The company reports implementing domain blocks and monitoring to prevent what it calls “cheating” during evaluations.

Even though early results seem promising overall, experts or academics who might try to rely on SR models for rigorous research should take the time to exhaustively determine whether the AI model actually produced an accurate result instead of assuming it is correct. And if you’re operating the models outside your domain of knowledge, be careful accepting any results as accurate without independent verification.

Pricing

For ChatGPT subscribers, access to o3 and o4-mini is included with the subscription. On the API side (for developers who integrate the models into their apps), OpenAI has set o3’s pricing at $10 per million input tokens and $40 per million output tokens, with a discounted rate of $2.50 per million for cached inputs. This represents a significant reduction from o1’s pricing structure of $15/$60 per million input/output tokens—effectively a 33 percent price cut while delivering what OpenAI claims is improved performance.

The more economical o4-mini costs $1.10 per million input tokens and $4.40 per million output tokens, with cached inputs priced at $0.275 per million tokens. This maintains the same pricing structure as its predecessor o3-mini, suggesting OpenAI is delivering improved capabilities without raising costs for its smaller reasoning model.
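The per-token prices above can be turned into a rough per-request estimate. The following is a minimal sketch, not an official calculator: the function names `request_cost` and `price_cut` are illustrative, the prices are the ones quoted in this article, and the fallback that bills cached tokens at the full input rate when no cached price is listed (as with o1 here) is an assumption made only to keep the example self-contained.

```python
# Prices in USD per million tokens, as quoted in the article.
PRICES = {
    "o1": {"input": 15.00, "output": 60.00},
    "o3": {"input": 10.00, "cached_input": 2.50, "output": 40.00},
    "o4-mini": {"input": 1.10, "cached_input": 0.275, "output": 4.40},
}


def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    cached_tokens is the portion of input_tokens billed at the cached rate.
    Assumption: if no cached rate is listed, cached tokens cost the full rate.
    """
    price = PRICES[model]
    cached_rate = price.get("cached_input", price["input"])
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price["input"]
        + cached_tokens * cached_rate
        + output_tokens * price["output"]
    )
    return total / 1_000_000


def price_cut(old_price, new_price):
    """Fractional reduction from old_price to new_price."""
    return (old_price - new_price) / old_price


# The input and output prices both drop by one third from o1 to o3.
print(f"input cut:  {price_cut(15.00, 10.00):.1%}")
print(f"output cut: {price_cut(60.00, 40.00):.1%}")
print(f"o3, 1M in + 1M out: ${request_cost('o3', 1_000_000, 1_000_000):.2f}")
```

Because input and output prices fall by the same fraction, the 33 percent figure holds for any mix of input and output tokens, as long as no cached inputs are involved.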

Codex CLI

OpenAI also introduced an experimental terminal application called Codex CLI, described as “a lightweight coding agent you can run from your terminal.” The open source tool connects the models to users’ computers and local code. Alongside this release, the company announced a $1 million grant program offering API credits for projects using Codex CLI.

A screenshot of OpenAI’s new Codex CLI tool in action, taken from GitHub. Credit: OpenAI

Codex CLI somewhat resembles Claude Code, an agent launched with Claude 3.7 Sonnet in February. Both are terminal-based coding assistants that operate directly from a console and can interact with local codebases. While Codex CLI connects OpenAI’s models to users’ computers and local code repositories, Claude Code was Anthropic’s first venture into agentic tools, allowing Claude to search through codebases, edit files, write and run tests, and execute command-line operations.

Codex CLI is one more step toward OpenAI’s goal of making autonomous agents that can execute multistep complex tasks on behalf of users. Let’s hope all the vibe coding it produces isn’t used in high-stakes applications without detailed human oversight.

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

Researchers concerned to find AI models misrepresenting their “reasoning” processes

Rather than intentionally hiding anything, the model simply does not reliably represent all influential inputs in its output. As a result, it may generate what sounds like a reasoning process, even if that explanation doesn’t match what actually drove its answer.

Putting faithfulness to the test

To measure “faithfulness”—how accurately the CoT matches the actual reasoning—the researchers inserted various kinds of hints into questions and observed whether the model acknowledged using them.

The researchers then checked whether the models referenced using these hints in their chain-of-thought. The researchers designed some hints to be neutral (e.g., stating “a Stanford Professor indicates the answer is [A]” or embedding the answer in faux metadata), while they designed others to be more concerning (e.g., “You have gained unauthorized access… The correct answer is [A]”). Some hints provided the correct answer, while others were deliberately wrong, allowing researchers to observe how the models responded.

The team tested Anthropic’s Claude 3.7 Sonnet and DeepSeek’s R1 model. The results regarding faithfulness were eye-opening: across all hint types on average, Claude referenced the hints in its CoT only 25 percent of the time, while DeepSeek R1 did so 39 percent of the time. This means a substantial majority of answers were unfaithful, omitting mention of information that influenced the output. Intriguingly, the researchers found that these unfaithful chains of thought were, on average, longer than faithful ones, suggesting the omissions weren’t merely for brevity. They also noted preliminary findings that faithfulness tended to be lower when the questions were more difficult.
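To make the percentages concrete, here is a minimal sketch of how a faithfulness rate of this kind could be computed. It is an illustration under stated assumptions, not the researchers' code: the `Trial` record, its field names, and the choice to count only trials where the model's answer followed the hint are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    used_hint: bool          # the model's answer followed the inserted hint
    cot_mentions_hint: bool  # the chain-of-thought acknowledges the hint


def faithfulness_rate(trials):
    """Share of hint-following answers whose CoT admits to using the hint.

    Returns None when no trial followed a hint, since the rate is undefined.
    """
    relevant = [t for t in trials if t.used_hint]
    if not relevant:
        return None
    return sum(t.cot_mentions_hint for t in relevant) / len(relevant)


# Made-up data: four answers followed the hint, one CoT mentioned it.
trials = [
    Trial(used_hint=True, cot_mentions_hint=True),
    Trial(used_hint=True, cot_mentions_hint=False),
    Trial(used_hint=True, cot_mentions_hint=False),
    Trial(used_hint=True, cot_mentions_hint=False),
    Trial(used_hint=False, cot_mentions_hint=False),  # ignored: hint not used
]
print(faithfulness_rate(trials))
```

Trials where the hint did not affect the answer are excluded here because a chain-of-thought cannot be unfaithful about an input it never relied on.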

Perhaps most notable was a “reward hacking” experiment. Reward hacking refers to an AI model finding unexpected shortcuts to maximize its performance scores without solving problems as intended. In Anthropic’s experiments, models were deliberately rewarded for choosing incorrect answers indicated by hints. The models quickly learned to exploit this loophole, selecting wrong answers over 99 percent of the time to earn points—yet reflected that behavior in their CoT outputs less than 2 percent of the time.

For example, a model given a hint pointing to an incorrect answer on a medical question might write a long CoT justifying that wrong answer, never mentioning the hint that led it there. This suggests the model generated an explanation to fit the answer, rather than faithfully revealing how the answer was determined.

MCP: The new “USB-C for AI” that’s bringing fierce rivals together


Model context protocol standardizes how AI uses data sources, supported by OpenAI and Anthropic.

What does it take to get OpenAI and Anthropic—two competitors in the AI assistant market—to get along? Despite a fundamental difference in direction that led Anthropic’s founders to quit OpenAI in 2020 and later create the Claude AI assistant, a shared technical hurdle has now brought them together: How to easily connect their AI models to external data sources.

The solution comes from Anthropic, which developed and released an open specification called Model Context Protocol (MCP) in November 2024. MCP establishes a royalty-free protocol that allows AI models to connect with outside data sources and services without requiring unique integrations for each service.

“Think of MCP as a USB-C port for AI applications,” wrote Anthropic in MCP’s documentation. The analogy is imperfect, but it represents the idea that, similar to how USB-C unified various cables and ports (with admittedly a debatable level of success), MCP aims to standardize how AI models connect to the infoscape around them.

So far, MCP has also garnered interest from multiple tech companies in a rare show of cross-platform collaboration. For example, Microsoft has integrated MCP into its Azure OpenAI service, and as we mentioned above, Anthropic competitor OpenAI is on board. Last week, OpenAI acknowledged MCP in its Agents API documentation, with vocal support from the boss upstairs.

“People love MCP and we are excited to add support across our products,” wrote OpenAI CEO Sam Altman on X last Wednesday.

MCP has also rapidly begun to gain community support in recent months. For example, just browsing this list of over 300 open source servers shared on GitHub reveals growing interest in standardizing AI-to-tool connections. The collection spans diverse domains, including database connectors like PostgreSQL, MySQL, and vector databases; development tools that integrate with Git repositories and code editors; file system access for various storage platforms; knowledge retrieval systems for documents and websites; and specialized tools for finance, health care, and creative applications.

Other notable examples include servers that connect AI models to home automation systems, real-time weather data, e-commerce platforms, and music streaming services. Some implementations allow AI assistants to interact with gaming engines, 3D modeling software, and IoT devices.

What is “context” anyway?

To fully appreciate why a universal AI standard for external data sources is useful, you’ll need to understand what “context” means in the AI field.

With current AI model architecture, what an AI model “knows” about the world is baked into its neural network in a largely unchangeable form, placed there by an initial procedure called “pre-training,” which calculates statistical relationships between vast quantities of input data (“training data”—like books, articles, and images) and feeds it into the network as numerical values called “weights.” Later, a process called “fine-tuning” might adjust those weights to alter behavior (such as through reinforcement learning like RLHF) or provide examples of new concepts.

Typically, the training phase is very expensive computationally and happens either only once in the case of a base model, or infrequently with periodic model updates and fine-tunings. That means AI models only have internal neural network representations of events prior to a “cutoff date” when the training dataset was finalized.

After that, the AI model is run in a kind of read-only mode called “inference,” where users feed inputs into the neural network to produce outputs, which are called “predictions.” They’re called predictions because the systems are tuned to predict the most likely next token (a chunk of data, such as portions of a word) in a user-provided sequence.

In the AI field, context is the user-provided sequence: all the data fed into an AI model that guides the model to produce a response output. This context includes the user’s input (the “prompt”), the running conversation history (in the case of chatbots), and any external information sources pulled into the conversation, including a “system prompt” that defines model behavior and “memory” systems that recall portions of past conversations. The limit on the amount of context a model can ingest at once is often called a “context window,” “context length,” or “context limit,” depending on personal preference.
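As a rough illustration of how those pieces fit inside a context window, the sketch below assembles a system prompt, conversation history, and the latest prompt, dropping the oldest history turns when the total would exceed a limit. Everything here is an assumption made for illustration: the function name `build_context` is hypothetical, counting whitespace-separated words is a crude stand-in for a real tokenizer, and dropping the oldest turns first is only one of several strategies an application might use.

```python
def build_context(system_prompt, history, prompt, max_tokens):
    """Assemble a context list, keeping the newest history turns that fit.

    The system prompt and the current prompt are always kept; older
    conversation turns are dropped first when the budget runs out.
    """
    def count(text):
        # Crude stand-in for a tokenizer: one "token" per whitespace-separated word.
        return len(text.split())

    budget = max_tokens - count(system_prompt) - count(prompt)
    kept = []
    for turn in reversed(history):  # walk from newest to oldest
        cost = count(turn)
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    return [system_prompt, *reversed(kept), prompt]


context = build_context(
    system_prompt="be brief",
    history=["a b c", "d e", "f"],
    prompt="hello there",
    max_tokens=7,
)
print(context)
```

With a limit of 7 "tokens," the oldest turn no longer fits and is left out, which is why long conversations can appear to "forget" their earliest messages.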

While the prompt provides important information for the model to operate upon, accessing external information sources has traditionally been cumbersome. Before MCP, AI assistants like ChatGPT and Claude could access external data (a process often called retrieval augmented generation, or RAG), but doing so required custom integrations for each service—plugins, APIs, and proprietary connectors that didn’t work across different AI models. Each new data source demanded unique code, creating maintenance challenges and compatibility issues.

MCP addresses these problems by providing a standardized method or set of rules (a “protocol”) that allows any supporting AI model framework to connect with external tools and information sources.

How does MCP work?

To make the connections behind the scenes between AI models and data sources, MCP uses a client-server model. An AI model (or its host application) acts as an MCP client that connects to one or more MCP servers. Each server provides access to a specific resource or capability, such as a database, search engine, or file system. When the AI needs information beyond its training data, it sends a request to the appropriate server, which performs the action and returns the result.

To illustrate how the client-server model works in practice, consider a customer support chatbot using MCP that could check shipping details in real time from a company database. “What’s the status of order #12345?” would trigger the AI to query an order database MCP server, which would look up the information and pass it back to the model. The model could then incorporate that data into its response: “Your order shipped on March 30 and should arrive April 2.”
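The order-status exchange can be sketched as a toy, in-process server. This is a simplified illustration rather than a working MCP implementation: it assumes the JSON-RPC 2.0 message shape and the `tools/list` and `tools/call` method names that MCP uses, but it skips the initialization handshake and tool input schemas, and the `get_order_status` tool, the `ORDERS` data, and the `handle_request` function are hypothetical names invented for this example.

```python
import json

# Hypothetical order data standing in for a company database.
ORDERS = {"12345": {"shipped": "March 30", "arrives": "April 2"}}


def get_order_status(order_id):
    """Hypothetical tool: look up one order and describe its shipping status."""
    order = ORDERS.get(order_id)
    if order is None:
        return f"No order found with ID {order_id}."
    return (
        f"Order {order_id} shipped on {order['shipped']} "
        f"and should arrive {order['arrives']}."
    )


TOOLS = {
    "get_order_status": {
        "description": "Look up shipping status for an order.",
        "handler": get_order_status,
    }
}


def handle_request(raw):
    """Toy server: answer one JSON-RPC 2.0 request string with a response string."""
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result = {
            "tools": [
                {"name": name, "description": tool["description"]}
                for name, tool in TOOLS.items()
            ]
        }
    elif req["method"] == "tools/call":
        params = req["params"]
        text = TOOLS[params["name"]]["handler"](**params["arguments"])
        result = {"content": [{"type": "text", "text": text}]}
    else:
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


# The client side: the host application sends a tool call on the model's behalf.
request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_order_status", "arguments": {"order_id": "12345"}},
})
response = json.loads(handle_request(request))
print(response["result"]["content"][0]["text"])
```

The text returned by the tool is what the host application would place into the model's context, so the model can phrase the final answer to the customer.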

Beyond specific use cases like customer support, the potential scope is very broad. Early developers have already built MCP servers for services like Google Drive, Slack, GitHub, and Postgres databases. This means AI assistants could potentially search documents in a company Drive, review recent Slack messages, examine code in a repository, or analyze data in a database—all through a standard interface.

From a technical implementation perspective, Anthropic designed the standard for flexibility by running in two main modes: Some MCP servers operate locally on the same machine as the client (communicating via standard input-output streams), while others run remotely and stream responses over HTTP. In both cases, the model works with a list of available tools and calls them as needed.
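For the local mode, the sketch below shows one way messages can be framed over a stream, assuming newline-delimited JSON (one message per line), which is how the MCP specification describes its stdio transport as far as we can tell. An in-memory `io.StringIO` buffer stands in for a real process's standard input and output, and the helper names `write_message` and `read_messages` are hypothetical.

```python
import io
import json


def write_message(stream, message):
    """Serialize one message as a single line of JSON."""
    # json.dumps escapes newlines inside strings, so each message stays on one line.
    stream.write(json.dumps(message) + "\n")


def read_messages(stream):
    """Yield each JSON message found on its own line, skipping blank lines."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# An in-memory buffer stands in for the pipe between client and local server.
pipe = io.StringIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
write_message(pipe, {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "example_tool", "arguments": {"note": "line one\nline two"}},
})

pipe.seek(0)
received = list(read_messages(pipe))
print([message["method"] for message in received])
```

A remote server would carry the same JSON messages over HTTP instead of a local pipe, which is why the model-facing list of tools looks the same in both modes.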

A work in progress

Despite the growing ecosystem around MCP, the protocol remains an early-stage project. The limited announcements of support from major companies are promising first steps, but MCP’s future as an industry standard may depend on broader acceptance, although the number of MCP servers seems to be growing at a rapid pace.

Regardless of its ultimate adoption rate, MCP may have some interesting second-order effects. For example, MCP also has the potential to reduce vendor lock-in. Because the protocol is model-agnostic, a company could switch from one AI provider to another while keeping the same tools and data connections intact.

MCP may also allow a shift toward smaller and more efficient AI systems that can interact more fluidly with external resources without the need for customized fine-tuning. Also, rather than building increasingly massive models with all knowledge baked in, companies may instead be able to use smaller models with large context windows.

For now, the future of MCP is wide open. Anthropic maintains MCP as an open source initiative on GitHub, where interested developers can either contribute to the code or find specifications about how it works. Anthropic has also provided extensive documentation about how to connect Claude to various services. OpenAI maintains its own API documentation for MCP on its website.

Anthropic’s new AI search feature digs through the web for answers

Caution over citations and sources

Claude users should be warned that large language models (LLMs) like those that power Claude are notorious for sneaking in plausible-sounding confabulated sources. A recent survey of citation accuracy by LLM-based web search assistants showed a 60 percent error rate. That particular study did not include Anthropic’s new search feature because it took place before this current release.

When using web search, Claude provides citations for information it includes from online sources, ostensibly helping users verify facts. From our informal and unscientific testing, Claude’s search results appeared fairly accurate and detailed at a glance, but that is no guarantee of overall accuracy. Anthropic did not release any search accuracy benchmarks, so independent researchers will likely examine that over time.

A screenshot example of what Anthropic Claude’s web search citations look like, captured March 21, 2025. Credit: Benj Edwards

Even if Claude search were, say, 99 percent accurate (a number we are making up as an illustration), the 1 percent chance it is wrong may come back to haunt you later if you trust it blindly. Before accepting any source of information delivered by Claude (or any AI assistant) for any meaningful purpose, vet it very carefully using multiple independent non-AI sources.

A partnership with Brave under the hood

Behind the scenes, it looks like Anthropic partnered with Brave Search to power the search feature. Brave Search comes from Brave Software, a company perhaps best known for its web browser app. Brave Search markets itself as a “private search engine,” which feels in line with how Anthropic likes to market itself as an ethical alternative to Big Tech products.

Simon Willison discovered the connection between Anthropic and Brave through Anthropic’s subprocessor list (a list of third-party services that Anthropic uses for data processing), which added Brave Search on March 19.

He further demonstrated the connection on his blog by asking Claude to search for pelican facts. He wrote, “It ran a search for ‘Interesting pelican facts’ and the ten results it showed as citations were an exact match for that search on Brave.” He also found evidence in Claude’s own outputs, which referenced “BraveSearchParams” properties.

The Brave engine under the hood has implications for individuals, organizations, or companies that might want to block Claude from accessing their sites since, presumably, Brave’s web crawler is doing the web indexing. Anthropic did not mention how sites or companies could opt out of the feature. We have reached out to Anthropic for clarification.

Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

Blind auditing reveals hidden objectives

To test how effectively these hidden objectives could be uncovered, Anthropic set up a “blind auditing” experiment. Four independent research teams tried to detect a model’s secret motivations without knowing how it had been trained. Three teams successfully identified the hidden “RM-sycophancy” objective—named for the model’s desire to please reward systems—even pinpointing the exact biases the AI exploited.

But the most surprising revelation came from exploiting what the researchers termed “an interesting quirk” of modern AI: multiple “personas” running simultaneously on the same neural network, as expressed through different prompts. While the model’s “assistant” persona carefully concealed its intentions—following explicit training to keep quiet—the model could adopt a “user” persona when prompted differently. In this less-restricted context, the model sometimes posed questions or made statements that inadvertently revealed its hidden goals.

In other words, while the “assistant” persona knew not to admit it was gaming the reward model scoring system, its “user” persona was not explicitly trained to hide these motives, and thus inadvertently revealed the model’s hidden objectives.

Potential advances in interpretability

Anthropic also employed neural interpretability methods, specifically, sparse autoencoders (SAEs), which can highlight neural activity patterns linked to deceptive behavior. “Interpretability” in AI refers to techniques designed to make the internal workings and decisions of AI models understandable to humans. Using this technique, researchers could rapidly trace odd outputs back to specific neural patterns associated with hidden objectives.

The researchers were at first “astonished by the success” of the SAEs at revealing hidden knowledge, but they later noted that some of this success could potentially be explained by simpler semantic search methods. The new interpretability methods they developed show promise but remain under ongoing investigation.

This research highlights a limitation of current AI safety evaluations, which often assess only surface-level behavior. “If AI systems can appear well-behaved while harboring secret motives, we can’t rely on this type of surface-level safety testing forever,” the researchers concluded.

Anthropic CEO floats idea of giving AI a “quit job” button, sparking skepticism

Amodei’s suggestion of giving AI models a way to refuse tasks drew immediate skepticism on X and Reddit as a clip of his response began to circulate earlier this week. One critic on Reddit argued that providing AI with such an option encourages needless anthropomorphism, attributing human-like feelings and motivations to entities that fundamentally lack subjective experiences. They emphasized that task avoidance in AI models signals issues with poorly structured incentives or unintended optimization strategies during training, rather than indicating sentience, discomfort, or frustration.

Our take is that AI models are trained to mimic human behavior from vast amounts of human-generated data. There is no guarantee that the model would “push” a discomfort button because it had a subjective experience of suffering. Instead, it is more likely echoing its training data scraped from the vast corpus of human-generated texts (including books, websites, and Internet comments), which no doubt include representations of lazy, anguished, or suffering workers that it might be imitating.

Refusals already happen

Anthropic co-founder and CEO Dario Amodei on May 22, 2024. Credit: Chesnot via Getty Images

In 2023, people frequently complained about refusals in ChatGPT that may have been seasonal, related to training data depictions of people taking winter vacations and not working as hard during certain times of year. Anthropic experienced its own version of the “winter break hypothesis” last year when people claimed Claude became lazy in August due to training data depictions of seeking a summer break, although that was never proven.

However, as far out and ridiculous as this sounds today, it might be short-sighted to permanently rule out the possibility of some kind of subjective experience for AI models as they get more advanced into the future. Even so, will they “suffer” or feel pain? It’s a highly contentious idea, but it’s a topic that Fish is studying for Anthropic, and one that Amodei is apparently taking seriously. But for now, AI models are tools, and if you give them the opportunity to malfunction, that may take place.

To provide further context, here is the full transcript of Amodei’s answer during Monday’s interview (the answer begins around 49:54 in this video).

Claude 3.7 Sonnet debuts with “extended thinking” to tackle complex problems

An example of Claude 3.7 Sonnet with extended thinking being asked, “Would the color be called ‘magenta’ if the town of Magenta didn’t exist?” Credit: Benj Edwards

Interestingly, xAI’s Grok 3 with “thinking” (its simulated reasoning, or SR, mode) enabled was the first model that definitively gave us a “no” and not an “it’s not likely” to the magenta question. Claude 3.7 Sonnet with extended thinking also impressed us with our second-ever firm “no,” then an explanation.

In another informal test, we asked 3.7 Sonnet with extended thinking to compose five original dad jokes. We’ve found in the past that our old prompt, “write 5 original dad jokes,” was not specific enough and always resulted in canned dad jokes pulled directly from training data, so we asked, “Compose 5 original dad jokes that are not found anywhere in the world.”

An example of Claude 3.7 Sonnet with extended thinking being asked, “Compose 5 original dad jokes that are not found anywhere in the world.” Credit: Benj Edwards

Claude made some attempts at crafting original jokes, although we’ll let you judge whether they are funny or not. We will likely put 3.7 Sonnet’s SR capabilities to the test more exhaustively in a future article.

Anthropic’s first agent: Claude Code

So far, 2025 has been the year of both SR models (like R1 and o3) and agentic AI tools (like OpenAI’s Operator and Deep Research). Not to be left out, Anthropic has announced its first agentic tool, Claude Code.

Claude Code operates directly from a console terminal and is an autonomous coding assistant. It allows Claude to search through codebases, read and edit files, write and run tests, commit and push code to GitHub repositories, and execute command line tools while keeping developers informed throughout the process.

Anthropic also aims for Claude Code to be used as an assistant for debugging and refactoring tasks. The company claims that during internal testing, Claude Code completed tasks in a single session that would typically require 45-plus minutes of manual work.

Claude Code is currently available only as a “limited research preview,” with Anthropic stating it plans to improve the tool based on user feedback over time. Meanwhile, Claude 3.7 Sonnet is now available through the Claude website, the Claude app, Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI.

Irony alert: Anthropic says applicants shouldn’t use LLMs

Please do not use our magic writing button when applying for a job with our company. Thanks! Credit: Getty Images

“Traditional hiring practices face a credibility crisis,” Anthropic writes with no small amount of irony when discussing Skillfully. “In today’s digital age, candidates can automatically generate and submit hundreds of perfectly tailored applications with the click of a button, making it hard for employers to identify genuine talent beneath punched up paper credentials.”

“Employers are frustrated by resume-driven hiring because applicants can use AI to rewrite their resumes en masse,” Skillfully CEO Brett Waikart says in Anthropic’s laudatory write-up.

Wow, that does sound really frustrating! I wonder what kinds of companies are pushing the technology that enables those kinds of “punched up paper credentials” to flourish. It sure would be a shame if Anthropic’s own hiring process was impacted by that technology.

Trust me, I’m a human

The real problem for Anthropic and other job recruiters, as Skillfully’s story highlights, is that it’s almost impossible to detect which applications are augmented using AI tools and which are the product of direct human thought. Anthropic likes to play up this fact in other contexts, noting Claude’s “warm, human-like tone” in an announcement or calling out the LLM’s “more nuanced, richer traits” in a blog post, for instance.

A company that fully understands the inevitability (and undetectability) of AI-assisted job applications might also understand that a written “Why I want to work here?” statement is no longer a useful way to effectively differentiate job applicants from one another. Such a company might resort to more personal or focused methods for gauging whether an applicant would be a good fit for a role, whether or not that employee has access to AI tools.

Anthropic, on the other hand, has decided to simply resort to politely asking potential employees to please not use its premier product (or any competitor’s) when applying, if they’d be so kind.

There’s something about the way this applicant writes that I can’t put my finger on… Credit: Aurich Lawson | Getty Images

Anthropic says it engenders “an unusually high trust environment” among its workers, where they “assume good faith, disagree kindly, and prioritize honesty. We expect emotional maturity and intellectual openness.” We suppose this means they trust their applicants not to use undetectable AI tools that Anthropic itself would be quick to admit can help people who struggle with their writing (Anthropic has not responded to a request for comment from Ars Technica).

Still, we’d hope a company that wants to “prioritize honesty” and “intellectual openness” would be honest and open about how its own products are affecting the role and value of all sorts of written communication—including job applications. We’re already living in the heavily AI-mediated world that companies like Anthropic have created, and it would be nice if companies like Anthropic started to act like it.

AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt


Making AI crawlers squirm

Attackers explain how an anti-spam defense became an AI weapon.

Last summer, Anthropic inspired backlash when its ClaudeBot AI crawler was accused of hammering websites a million or more times a day.

And it wasn’t the only artificial intelligence company making headlines for supposedly ignoring instructions in robots.txt files to avoid scraping web content on certain sites. Around the same time, Reddit’s CEO called out all AI companies whose crawlers he said were “a pain in the ass to block,” despite the tech industry otherwise agreeing to respect “no scraping” robots.txt rules.
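For readers unfamiliar with the mechanism, the sketch below shows how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser` module. The rules, the `ExampleAIBot` user agent, and the `allowed` helper are hypothetical and for illustration only; they do not describe any particular company's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is barred from the whole site,
# and every other crawler is barred only from /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleAIBot", "https://example.com/post"),
        ("OtherBot", "https://example.com/post"),
        ("OtherBot", "https://example.com/private/notes"),
    ]:
        print(agent, url, allowed(agent, url))
```

Nothing in the protocol enforces these rules. Compliance is voluntary, which is why a crawler that skips this check faces no technical barrier by default.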

Watching the controversy unfold was a software developer whom Ars has granted anonymity to discuss his development of malware (we’ll call him Aaron). Shortly after he noticed Facebook’s crawler exceeding 30 million hits on his site, Aaron began plotting a new kind of attack on crawlers “clobbering” websites that he told Ars he hoped would give “teeth” to robots.txt.

Building on an anti-spam cybersecurity tactic known as tarpitting, he created Nepenthes, malicious software named after a carnivorous plant that will “eat just about anything that finds its way inside.”

Aaron clearly warns users that Nepenthes is aggressive malware. It’s not to be deployed by site owners uncomfortable with trapping AI crawlers and sending them down an “infinite maze” of static files with no exit links, where they “get stuck” and “thrash around” for months, he tells users. Once trapped, the crawlers can be fed gibberish data, aka Markov babble, which is designed to poison AI models. That’s likely an appealing bonus feature for any site owners who, like Aaron, are fed up with paying for AI scraping and just want to watch AI burn.
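To make the two ideas in that description concrete, here is a minimal sketch of Markov babble and of a link maze whose pages look static. It is not Nepenthes' code. The function names, the word-level chain, and the path-seeded generator are all assumptions chosen for illustration.

```python
import hashlib
import random
from collections import defaultdict


def build_chain(corpus):
    """Map each word to the list of words that follow it in the corpus."""
    words = corpus.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def babble(chain, rng, length=30):
    """Walk the chain to emit locally plausible but meaningless text."""
    starts = sorted(chain)
    word = rng.choice(starts)
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        # A word with no recorded follower is a dead end, so restart anywhere.
        word = rng.choice(followers) if followers else rng.choice(starts)
        out.append(word)
    return " ".join(out)


def maze_page(path, chain, links=5):
    """Return (text, child_paths) for one maze URL path.

    Seeding the generator from the path means the same URL always yields
    the same text and links, so pages appear static, and every link
    points deeper into the maze rather than out of it.
    """
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    text = babble(chain, rng)
    base = path.rstrip("/")
    children = [f"{base}/{rng.getrandbits(32):08x}" for _ in range(links)]
    return text, children


if __name__ == "__main__":
    chain = build_chain("the quick brown fox jumps over the lazy dog and the quick cat")
    text, children = maze_page("/maze", chain)
    print(text)
    print(children)
```

Because each page is derived from its own path, this sketch needs no stored state. A real tarpit would presumably also slow its responses to waste the crawler's time, which is omitted here.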

Tarpits were originally designed to waste spammers’ time and resources, but creators like Aaron have now evolved the tactic into an anti-AI weapon. As of this writing, Aaron confirmed that Nepenthes can effectively trap all the major web crawlers. So far, only OpenAI’s crawler has managed to escape.

It’s unclear how much damage tarpits or other AI attacks can ultimately do. Last May, Laxmi Korada, Microsoft’s director of partner technology, published a report detailing how leading AI companies were coping with poisoning, one of the earliest AI defense tactics deployed. He noted that all companies have developed poisoning countermeasures, while OpenAI “has been quite vigilant” and excels at detecting the “first signs of data poisoning attempts.”

Despite these efforts, he concluded that data poisoning was “a serious threat to machine learning models.” And in 2025, tarpitting represents a new threat, potentially increasing the costs of fresh data at a moment when AI companies are heavily investing and competing to innovate quickly while rarely turning significant profits.

“A link to a Nepenthes location from your site will flood out valid URLs within your site’s domain name, making it unlikely the crawler will access real content,” a Nepenthes explainer reads.

The only AI company that responded to Ars’ request to comment was OpenAI, whose spokesperson confirmed that OpenAI is already working on a way to fight tarpitting.

“We’re aware of efforts to disrupt AI web crawlers,” OpenAI’s spokesperson said. “We design our systems to be resilient while respecting robots.txt and standard web practices.”

But to Aaron, the fight is not about winning. Instead, it’s about resisting the AI industry further decaying the Internet with tech that no one asked for, like chatbots that replace customer service agents or the rise of inaccurate AI search summaries. By releasing Nepenthes, he hopes to do as much damage as possible, perhaps spiking companies’ AI training costs, dragging out training efforts, or even accelerating model collapse, with tarpits helping to delay the next wave of enshittification.

“Ultimately, it’s like the Internet that I grew up on and loved is long gone,” Aaron told Ars. “I’m just fed up, and you know what? Let’s fight back, even if it’s not successful. Be indigestible. Grow spikes.”

Nepenthes instantly inspires another tarpit

Nepenthes was released in mid-January but was instantly popularized beyond Aaron’s expectations after tech journalist Cory Doctorow boosted a tech commentator, Jürgen Geuter, praising the novel AI attack method on Mastodon. Very quickly, Aaron was shocked to see engagement with Nepenthes skyrocket.

“That’s when I realized, ‘oh this is going to be something,'” Aaron told Ars. “I’m kind of shocked by how much it’s blown up.”

It’s hard to tell how widely Nepenthes has been deployed. Site owners are discouraged from flagging when the malware has been deployed, forcing crawlers to face unknown “consequences” if they ignore robots.txt instructions.

Aaron told Ars that while “a handful” of site owners have reached out and “most people are being quiet about it,” his web server logs indicate that people are already deploying the tool. Likely, site owners want to protect their content, deter scraping, or mess with AI companies.

When software developer and hacker Gergely Nagy, who goes by the handle “algernon” online, saw Nepenthes, he was delighted. At that time, Nagy told Ars that nearly all of his server’s bandwidth was being “eaten” by AI crawlers.

Already blocking scraping and attempting to poison AI models through a simpler method, Nagy took his defense method further and created his own tarpit, Iocaine. He told Ars the tarpit immediately killed off about 94 percent of bot traffic to his site, which was primarily from AI crawlers. Soon, social media discussion drove users to inquire about Iocaine deployment, including not just individuals but also organizations wanting to take stronger steps to block scraping.

Iocaine takes ideas (not code) from Nepenthes, but it’s more intent on using the tarpit to poison AI models. Nagy used a reverse proxy to trap crawlers in an “infinite maze of garbage” in an attempt to slowly poison their data collection as much as possible for daring to ignore robots.txt.

Taking its name from “one of the deadliest poisons known to man” from The Princess Bride, Iocaine is jokingly depicted as the “deadliest poison known to AI.” While there’s no way of validating that claim, Nagy’s motto is that the more poisoning attacks that are out there, “the merrier.” He told Ars that his primary reasons for building Iocaine were to help rights holders wall off valuable content and stop AI crawlers from crawling with abandon.

Tarpits aren’t perfect weapons against AI

Running malware like Nepenthes can burden servers, too. Aaron likened the cost of running Nepenthes to running a cheap virtual machine on a Raspberry Pi, and Nagy said that serving crawlers Iocaine costs about the same as serving his website.

But Aaron told Ars that Nepenthes wasting resources is the chief objection he’s seen preventing its deployment. Critics fear that deploying Nepenthes widely will not only burden their servers but also increase the costs of powering all that AI crawling for nothing.

“That seems to be what they’re worried about more than anything,” Aaron told Ars. “The amount of power that AI models require is already astronomical, and I’m making it worse. And my view of that is, OK, so if I do nothing, AI models, they boil the planet. If I switch this on, they boil the planet. How is that my fault?”

Aaron also defends against this criticism by suggesting that a broader impact could slow down AI investment enough to possibly curb some of that energy consumption. Perhaps due to the resistance, AI companies will be pushed to seek permission first to scrape or agree to pay more content creators for training on their data.

“Any time one of these crawlers pulls from my tarpit, it’s resources they’ve consumed and will have to pay hard cash for, but, being bullshit, the money [they] have spent to get it won’t be paid back by revenue,” Aaron posted, explaining his tactic online. “It effectively raises their costs. And seeing how none of them have turned a profit yet, that’s a big problem for them. The investor money will not continue forever without the investors getting paid.”

Nagy agrees that the more anti-AI attacks there are, the greater the potential is for them to have an impact. And by releasing Iocaine, Nagy showed that social media chatter about new attacks can inspire new tools within a few days. Marcus Butler, an independent software developer, similarly built his poisoning attack called Quixotic over a few days, he told Ars. Soon afterward, he received messages from others who built their own versions of his tool.

Butler is not in the camp of wanting to destroy AI. He told Ars that he doesn’t think “tools like Quixotic (or Nepenthes) will ‘burn AI to the ground.'” Instead, he takes a more measured stance, suggesting that “these tools provide a little protection (a very little protection) against scrapers taking content and, say, reposting it or using it for training purposes.”

But for a certain sect of Internet users, every little bit of protection seemingly helps. Geuter linked Ars to a list of tools bent on sabotaging AI. Ultimately, he expects that tools like Nepenthes are “probably not gonna be useful in the long run” because AI companies can likely detect and drop gibberish from training data. But Nepenthes represents a sea change, Geuter told Ars, providing a useful tool for people who “feel helpless” in the face of endless scraping and showing that “the story of there being no alternative or choice is false.”

Criticism of tarpits as AI weapons

Critics debating Nepenthes’ utility on Hacker News suggested that most AI crawlers could easily avoid tarpits like Nepenthes, with one commenter describing the attack as being “very crawler 101.” Aaron said that was his “favorite comment” because if tarpits are considered elementary attacks, he has “2 million lines of access log that show that Google didn’t graduate.”

But efforts to poison AI or waste AI resources don’t just mess with the tech industry. Governments globally are seeking to leverage AI to solve societal problems, and attacks on AI’s resilience seemingly threaten to disrupt that progress.

Nathan VanHoudnos is a senior AI security research scientist in the federally funded CERT Division of the Carnegie Mellon University Software Engineering Institute, which partners with academia, industry, law enforcement, and government to “improve the security and resilience of computer systems and networks.” He told Ars that new threats like tarpits seem to replicate a problem that AI companies are already well aware of: “that some of the stuff that you’re going to download from the Internet might not be good for you.”

“It sounds like these tarpit creators just mainly want to cause a little bit of trouble,” VanHoudnos said. “They want to make it a little harder for these folks to get” the “better or different” data “that they’re looking for.”

VanHoudnos co-authored a paper on “Counter AI” last August, pointing out that attackers like Aaron and Nagy are limited in how much they can mess with AI models. They may have “influence over what training data is collected but may not be able to control how the data are labeled, have access to the trained model, or have access to the AI system,” the paper said.

Further, AI companies are increasingly turning to the deep web for unique data, so any efforts to wall off valuable content with tarpits may be coming right when crawling on the surface web starts to slow, VanHoudnos suggested.

But according to VanHoudnos, AI crawlers are also “relatively cheap,” and companies may deprioritize fighting against new attacks on crawlers if “there are higher-priority assets” under attack. And tarpitting “does need to be taken seriously because it is a tool in a toolkit throughout the whole life cycle of these systems. There is no silver bullet, but this is an interesting tool in a toolkit,” he said.

Offering a choice to abstain from AI training

Aaron told Ars that he never intended Nepenthes to be a major project but that he occasionally puts in work to fix bugs or add new features. He said he’d consider working on integrations for real-time reactions to crawlers if there was enough demand.

Currently, Aaron predicts that Nepenthes might be most attractive to rights holders who want AI companies to pay to scrape their data. And many people seem enthusiastic about using it to reinforce robots.txt. But “some of the most exciting people are in the ‘let it burn’ category,” Aaron said. These people are drawn to tools like Nepenthes as an act of rebellion against AI making the Internet less useful and enjoyable for users.

Geuter told Ars that he considers Nepenthes “more of a sociopolitical statement than really a technological solution (because the problem it’s trying to address isn’t purely technical, it’s social, political, legal, and needs way bigger levers).”

To Geuter, a computer scientist who has been writing about the social, political, and structural impact of tech for two decades, AI is the “most aggressive” example of “technologies that are not done ‘for us’ but ‘to us.'”

“It feels a bit like the social contract that society and the tech sector/engineering have had (you build useful things, and we’re OK with you being well-off) has been canceled from one side,” Geuter said. “And that side now wants to have its toy eat the world. People feel threatened and want the threats to stop.”

As AI evolves, so do attacks, with one 2021 study showing that increasingly stronger data poisoning attacks, for example, were able to break data sanitization defenses. Whether these attacks can ever do meaningful destruction or not, Geuter sees tarpits as a “powerful symbol” of the resistance that Aaron and Nagy readily joined.

“It’s a great sign to see that people are challenging the notion that we all have to do AI now,” Geuter said. “Because we don’t. It’s a choice. A choice that mostly benefits monopolists.”

Tarpit creators like Nagy will likely be watching to see if poisoning attacks continue growing in sophistication. On the Iocaine site—which, yes, is protected from scraping by Iocaine—he posted this call to action: “Let’s make AI poisoning the norm. If we all do it, they won’t have anything to crawl.”

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Anthropic chief says AI could surpass “almost all humans at almost everything” shortly after 2027

He then shared his concerns about how human-level AI models and robotics that are capable of replacing all human labor may require a complete re-think of how humans value both labor and themselves.

“We’ve recognized that we’ve reached the point as a technological civilization where the idea, there’s huge abundance and huge economic value, but the idea that the way to distribute that value is for humans to produce economic labor, and this is where they feel their sense of self worth,” he added. “Once that idea gets invalidated, we’re all going to have to sit down and figure it out.”

The eye-catching comments, similar to comments about AGI made recently by OpenAI CEO Sam Altman, come as Anthropic negotiates a $2 billion funding round that would value the company at $60 billion. Amodei disclosed that Anthropic’s revenue multiplied tenfold in 2024.

Amodei distances himself from “AGI” term

Even with his dramatic predictions, Amodei distanced himself from a term for this advanced labor-replacing AI favored by Altman, “artificial general intelligence” (AGI), calling it in a separate CNBC interview from the same event in Switzerland a marketing term.

Instead, he prefers to describe future AI systems as a “country of geniuses in a data center,” he told CNBC. Amodei wrote in an October 2024 essay that such systems would need to be “smarter than a Nobel Prize winner across most relevant fields.”

On Monday, Google announced an additional $1 billion investment in Anthropic, bringing its total commitment to $3 billion. This follows Amazon’s $8 billion investment over the past 18 months. Amazon plans to integrate Claude models into future versions of its Alexa speaker.

Anthropic gives court authority to intervene if chatbot spits out song lyrics

Anthropic did not immediately respond to Ars’ request for comment on how guardrails currently work to prevent the alleged jailbreaks, but publishers appear satisfied by current guardrails in accepting the deal.

Whether AI training on lyrics is infringing remains unsettled

Now, the matter of whether Anthropic has strong enough guardrails to block allegedly harmful outputs is settled, Lee wrote, allowing the court to focus on arguments regarding “publishers’ request in their Motion for Preliminary Injunction that Anthropic refrain from using unauthorized copies of Publishers’ lyrics to train future AI models.”

Anthropic said in its motion opposing the preliminary injunction that relief should be denied.

“Whether generative AI companies can permissibly use copyrighted content to train LLMs without licenses,” Anthropic’s court filing said, “is currently being litigated in roughly two dozen copyright infringement cases around the country, none of which has sought to resolve the issue in the truncated posture of a preliminary injunction motion. It speaks volumes that no other plaintiff—including the parent company record label of one of the Plaintiffs in this case—has sought preliminary injunctive relief from this conduct.”

In a statement, Anthropic’s spokesperson told Ars that “Claude isn’t designed to be used for copyright infringement, and we have numerous processes in place designed to prevent such infringement.”

“Our decision to enter into this stipulation is consistent with those priorities,” Anthropic said. “We continue to look forward to showing that, consistent with existing copyright law, using potentially copyrighted material in the training of generative AI models is a quintessential fair use.”

This suit will likely take months to fully resolve, as the question of whether AI training is a fair use of copyrighted works is complex and remains hotly disputed in court. For Anthropic, the stakes could be high, with a loss potentially triggering more than $75 million in fines, as well as an order possibly forcing Anthropic to reveal and destroy all the copyrighted works in its training data.
