Biz & IT

hp-and-dell-disable-hevc-support-built-into-their-laptops’-cpus

HP and Dell disable HEVC support built into their laptops’ CPUs

The OEMs disabling codec hardware also comes as associated costs for the international video compression standard are set to increase in January, as licensing administrator Access Advance announced in July. Per a breakdown from patent pool administration VIA Licensing Alliance, royalty rates for HEVC for over 100,001 units are increasing from $0.20 each to $0.24 each in the United States. To put that into perspective, in Q3 2025, HP sold 15,002,000 laptops and desktops, and Dell sold 10,166,000 laptops and desktops, per Gartner.

Last year, NAS company Synology announced that it was ending support for HEVC, as well as H.264/AVC and VCI, transcoding on its DiskStation Manager and BeeStation OS platforms, saying that “support for video codecs is widespread on end devices, such as smartphones, tablets, computers, and smart TVs.”

“This update reduces unnecessary resource usage on the server and significantly improves media processing efficiency. The optimization is particularly effective in high-user environments compared to traditional server-side processing,” the announcement said.

Despite the growing costs and complications with HEVC licenses and workarounds, breaking features that have been widely available for years will likely lead to confusion and frustration.

“This is pretty ridiculous, given these systems are $800+ a machine, are part of a ‘Pro’ line (jabs at branding names are warranted – HEVC is used professionally), and more applications these days outside of Netflix and streaming TV are getting around to adopting HEVC,” a Redditor wrote.

HP and Dell disable HEVC support built into their laptops’ CPUs Read More »

bonkers-bitcoin-heist:-5-star-hotels,-cash-filled-envelopes,-vanishing-funds

Bonkers Bitcoin heist: 5-star hotels, cash-filled envelopes, vanishing funds


Bitcoin mining hardware exec falls for sophisticated crypto scam to tune of $200k

As Kent Halliburton stood in a bathroom at the Rosewood Hotel in central Amsterdam, thousands of miles from home, running his fingers through an envelope filled with 10,000 euros in crisp banknotes, he started to wonder what he had gotten himself into.

Halliburton is the cofounder and CEO of Sazmining, a company that operates bitcoin mining hardware on behalf of clients—a model known as “mining-as-a-service.” Halliburton is based in Peru, but Sazmining runs mining hardware out of third-party data centers across Norway, Paraguay, Ethiopia, and the United States.

As Halliburton tells it, he had flown to Amsterdam the previous day, August 5, to meet Even and Maxim, two representatives of a wealthy Monaco-based family. The family office had offered to purchase hundreds of bitcoin mining rigs from Sazmining—around $4 million worth—which the company would install at a facility currently under construction in Ethiopia. Before finalizing the deal, the family office had asked to meet Halliburton in person.

When Halliburton arrived at the Rosewood Hotel, he found Even and Maxim perched in a booth. They struck him as playboy, high-roller types—particularly Maxim, who wore a tan three-piece suit and had a highly manicured look, his long dark hair parted down the middle. A Rolex protruded from the cuff of his sleeve.

Over a three-course lunch—ceviche with a roe garnish, Chilean sea bass, and cherry cake—they discussed the contours of the deal and traded details about their respective backgrounds. Even was talkative and jocular, telling stories about blowout parties in Marrakech. Maxim was aloof; he mostly stared at Halliburton, holding his gaze for long periods at a time as though sizing him up.

As a relationship-building exercise, Even proposed that Halliburton sell the family office around $3,000 in bitcoin. Halliburton was initially hesitant, but chalked it up as a peculiar dating ritual. One of the guys slid Halliburton the cash-filled envelope and told him to go to the bathroom, where he could count out the amount in private. “It felt like something out of a James Bond movie,” says Halliburton. “It was all very exotic to me.”

Halliburton left in a taxi, somewhat bemused by the encounter, but otherwise hopeful of closing the deal with the family office. For Sazmining, a small company with around 15 employees, it promised to be transformative.

Less than two weeks later, Halliburton had lost more than $200,000 worth of bitcoin to Even and Maxim. He didn’t know whether Sazmining could survive the blow, nor how the scammers had ensnared him.

Directly after his lunch with Even and Maxim, Halliburton flew to Latvia for a Bitcoin conference. From there, he traveled to Ethiopia to check on construction work at the data center facility.

While Halliburton was in Ethiopia, he received a WhatsApp message from Even, who wanted to go ahead with the deal on one condition: that Sazmining sell the family office a larger amount of bitcoin as part of the transaction, after the small initial purchase at the Rosewood Hotel. They landed on $400,000 worth—a tenth of the overall deal value.

Even asked Halliburton to return to Amsterdam to sign the contracts necessary to finalize the deal. Having been away from his family for weeks, Halliburton protested. But Even drew a line in the sand: “Remotely doesn’t work for me that’s not how I do business at the moment,” he wrote in a text message reviewed by WIRED.

Halliburton arrived back in Amsterdam in the early afternoon on August 16. That evening, he was due to meet Maxim at a teppanyaki restaurant at the five-star Okura Hotel. The interior is elaborately decorated in traditional Japanese style; it has wooden paneling, paper walls, a zen garden, and a flock of origami cranes that hang from string down a spiral staircase in the lobby.

Halliburton found Maxim sitting on a couch in the waiting area outside the restaurant, dressed in a gaudy silver suit. As they waited for a table, Maxim asked Halliburton whether he could demonstrate that Sazmining held enough bitcoin to go through with the side transaction that Even had proposed. He wanted Halliburton to move roughly half of the agreed amount—worth $220,000—into a bitcoin wallet app trusted by the family office. The funds would remain under Halliburton’s control, but the family office would be able to verify their existence using public transaction data.

Halliburton thumbed open his iPhone. The app, Atomic Wallet, had thousands of positive reviews and had been listed on the Apple App Store for several years. With Maxim at his side, Halliburton downloaded the app and created a new wallet. “I was trying to earn this guy’s trust,” says Halliburton. “Again, a $4 million contract. I’m still looking at that carrot.”

The dinner passed largely without incident. Maxim was less guarded this time; he talked about his fondness for watches and his work sourcing deals for the family office. Feeling under the weather from all the travel, Halliburton angled to wrap things up.

They left with the understanding that Maxim would take the signed contracts to the family office to be executed, while Halliburton would send the $220,000 in bitcoin to his new wallet address as agreed.

Back in his hotel room, Halliburton triggered a small test transaction using his new Atomic Wallet address. Then he wiped and reinstated the wallet using the private credentials—the seed phrase—generated when he first downloaded the app, to make sure that it functioned as expected. “Had to take some security measures but almost ready. Thanks for your patience,” wrote Halliburton in a WhatsApp message to Even. “No worries take your time,” Even responded.

At 10: 45 pm, satisfied with his tests, Halliburton signaled to a colleague to release $220,000 worth of bitcoin to the Atomic Wallet address. When it arrived, he sent a screenshot of the updated balance to Even. One minute later, Even wrote back, “Thank yiu [sic].”

Halliburton sent another message to Even, asking about the contracts. Though previously quick to answer, Even didn’t respond. Halliburton checked the Atomic Wallet app, sensing that something was wrong. The bitcoin had vanished.

Halliburton’s stomach dropped. As he sat on the bed, he tried to stop himself from vomiting. “It was like being punched in the gut,” says Halliburton. “It was just shock and disbelief.”

Halliburton racked his brain trying to figure out how he had been swindled. At 11: 30 pm, he sent another message to Even: “That was the most sophisticated scam I’ve ever experienced. I know you probably don’t give a shit but my business may not survive this. I’ve worked four years of my life to build it.”

Even responded, denying that he had done anything wrong, but that was the last Halliburton heard from him. Halliburton provided WIRED with the Telegram account Even had used; it was last active on the day the funds were drained. Even did not respond to a request for comment.

Within hours, the funds drained from Halliburton’s wallet began to be divided up, shuffled through a web of different addresses, and deposited with third-party platforms for converting crypto into regular currency, analysis by blockchain analytics companies Chainalysis and CertiK shows.

A portion of the bitcoin was split between different instant exchangers, which allow people to swap one type of cryptocurrency for another almost instantaneously. The bulk was funneled into a single address, where it was blended with funds tagged by Chainalysis as the likely proceeds of rip deals, a scam whereby somebody impersonates an investor to steal crypto from a startup.

“There’s nothing illegal about the services the scammer leveraged,” says Margaux Eckle, senior investigator at Chainalysis. “However, the fact that they leveraged consolidation addresses that appear very tightly connected to labeled scam activity is potentially indicative of a fraud operation.”

Some of the bitcoin that passed through the consolidation address was deposited with a crypto exchange, where it was likely swapped for regular currency. The remainder was converted into stablecoin and moved across so-called bridges to the Tron blockchain, which hosts several over-the-counter trading services that can be readily used to cash out large quantities of crypto, researchers claim.

The effect of the many hops, shuffles, conversions, and divisions is to make it more difficult to trace the origin of funds, so that they can be cashed out without arousing suspicion. “The scammer is quite sophisticated,” says Eckle. “Though we can trace through a bridge, it’s a way to slow the tracing of funds from investigators that could be on your tail.”

Eventually, the trail of public transaction data stops. To identify the perpetrators, law enforcement would have to subpoena the services that appear to have been used to cash out, which are widely required to collect information about users.

From the transaction data, it’s not possible to tell precisely how the scammers were able to access and drain Halliburton’s wallet without his permission. But aspects of his interactions with the scammers provide some clue.

Initially, Halliburton wondered whether the incident might be connected to a 2023 hack perpetrated by threat actors affiliated with the North Korean government, which led to $100 million worth of funds being drained from the accounts of Atomic Wallet users. (Atomic Wallet did not respond to a request for comment.)

But instead, the security researchers that spoke to WIRED believe that Halliburton fell victim to a targeted surveillance-style attack. “Executives who are publicly known to custody large crypto balances make attractive targets,” says Guanxing Wen, head of security research at CertiK.

The in-person dinners, expensive clothing, reams of cash, and other displays of wealth were gambits meant to put Halliburton at ease, researchers theorize. “This is a well-known rapport-building tactic in high-value confidence schemes,” says Wen. “The longer a victim spends with the attacker in a relaxed setting, the harder it becomes to challenge a later technical request.”

In order to complete the theft, the scammers likely had to steal the seed phrase for Halliburton’s newly created Atomic Wallet address. Equipped with a wallet’s seed phrase, anyone can gain unfettered access to the bitcoin kept inside.

One possibility is that the scammers, who dictated the locations for both meetings in Amsterdam, hijacked or mimicked the hotel Wi-Fi networks, allowing them to harvest information from Halliburton’s phone. “That equipment you can buy online, no problem. It would all fit inside a couple of suitcases,” says Adrian Cheek, lead researcher at cybersecurity company Coeus. But Halliburton insists that his phone never left his possession, and he used mobile data to download the Atomic Wallet app, not public Wi-Fi.

The most plausible explanation, claims Wen, is that the scammers—perhaps with the help of a nearby accomplice or a camera equipped with long-range zoom—were able to record the seed phrase when it appeared on Halliburton’s phone at the point he first downloaded the app, on the couch at the Okura Hotel.

Long before Halliburton delivered the $220,000 in bitcoin to his Atomic Wallet address, the scammers had probably set up a “sweeper script,” claims Wen, a type of automated bot coded to drain a wallet when it detects a large balance change.

The people the victim meets in-person in cases like this—like Even and Maxim—are rarely the ultimate beneficiaries, but rather mercenaries hired by a network of scam artists, who could be based on the other side of the globe.

“They’re normally recruited through underground forums, and secure chat groups,” says Cheek. “If you know where you’re looking, you can see this ongoing recruitment.”

For a few days, it remained unclear whether Sazmining would be able to weather the financial blow. The stolen funds equated to about six weeks’ worth of revenue. “I’m trying to keep the business afloat and survive this situation where suddenly we’ve got a cash crunch,” says Halliburton. By delaying payment to a vendor and extending the duration of an outstanding loan, the company was ultimately able to remain solvent.

That week, one of the Sazmining board members filed reports with law enforcement bodies in the Netherlands, the UK, and the US. They received acknowledgements from only UK-based Action Fraud, which said it would take no immediate action, and the Cyber Fraud Task Force, a division of the US Secret Service. (The CFTF did not respond to a request for comment.)

The incredible volume of crypto-related scam activity makes it all but impossible for law enforcement to investigate each theft individually. “It’s a type of threat and criminal activity that is reaching a scale that’s completely unprecedented,” says Eckle.

The best chance of a scam victim recovering their funds is for law enforcement to bust an entire scam ring, says Eckle. In that scenario, any funds recovered are typically dispersed to those who have reported themselves victims.

Until such a time, Halliburton has to make his peace with the loss. “It’s still painful,” he says. But “it wasn’t a death blow.”

This story originally appeared on Wired.

Photo of WIRED

Wired.com is your essential daily guide to what’s next, delivering the most original and complete take you’ll find anywhere on innovation’s impact on technology, science, business and culture.

Bonkers Bitcoin heist: 5-star hotels, cash-filled envelopes, vanishing funds Read More »

google-ceo:-if-an-ai-bubble-pops,-no-one-is-getting-out-clean

Google CEO: If an AI bubble pops, no one is getting out clean

Market concerns and Google’s position

Alphabet’s recent market performance has been driven by investor confidence in the company’s ability to compete with OpenAI’s ChatGPT, as well as its development of specialized chips for AI that can compete with Nvidia’s. Nvidia recently reached a world-first $5 trillion valuation due to making GPUs that can accelerate the matrix math at the heart of AI computations.

Despite acknowledging that no company would be immune to a potential AI bubble burst, Pichai argued that Google’s unique position gives it an advantage. He told the BBC that the company owns what he called a “full stack” of technologies, from chips to YouTube data to models and frontier science research. This integrated approach, he suggested, would help the company weather any market turbulence better than competitors.

Pichai also told the BBC that people should not “blindly trust” everything AI tools output. The company currently faces repeated accuracy concerns about some of its AI models. Pichai said that while AI tools are helpful “if you want to creatively write something,” people “have to learn to use these tools for what they’re good at and not blindly trust everything they say.”

In the BBC interview, the Google boss also addressed the “immense” energy needs of AI, acknowledging that the intensive energy requirements of expanding AI ventures have caused slippage on Alphabet’s climate targets. However, Pichai insisted that the company still wants to achieve net zero by 2030 through investments in new energy technologies. “The rate at which we were hoping to make progress will be impacted,” Pichai said, warning that constraining an economy based on energy “will have consequences.”

Even with the warnings about a potential AI bubble, Pichai did not miss his chance to promote the technology, albeit with a hint of danger regarding its widespread impact. Pichai described AI as “the most profound technology” humankind has worked on.

“We will have to work through societal disruptions,” he said, adding that the technology would “create new opportunities” and “evolve and transition certain jobs.” He said people who adapt to AI tools “will do better” in their professions, whatever field they work in.

Google CEO: If an AI bubble pops, no one is getting out clean Read More »

oracle-hit-hard-in-wall-street’s-tech-sell-off-over-its-huge-ai-bet

Oracle hit hard in Wall Street’s tech sell-off over its huge AI bet

“That is a huge liability and credit risk for Oracle. Your main customer, biggest customer by far, is a venture capital-funded start-up,” said Andrew Chang, a director at S&P Global.

OpenAI faces questions about how it plans to meet its commitments to spend $1.4 trillion on AI infrastructure over the next eight years. It has struck deals with several Big Tech groups, including Oracle’s rivals.

Of the five hyperscalers—which include Amazon, Google, Microsoft, and Meta—Oracle is the only one with negative free cash flow. Its debt-to-equity ratio has surged to 500 percent, far higher than Amazon’s 50 percent and Microsoft’s 30 percent, according to JPMorgan.

While all five companies have seen their cash-to-assets ratios decline significantly in recent years amid a boom in spending, Oracle’s is by far the lowest, JPMorgan found.

JPMorgan analysts noted a “tension between [Oracle’s] aggressive AI build-out ambitions and the limits of its investment-grade balance sheet.”

Analysts have also noted that Oracle’s data center leases are for much longer than its contracts to sell capacity to OpenAI.

Oracle has signed at least five long-term lease agreements for US data centers that will ultimately be used by OpenAI, resulting in $100 billion of off-balance-sheet lease commitments. The sites are at varying levels of construction, with some not expected to break ground until next year.

Safra Catz, Oracle’s sole chief executive from 2019 until she stepped down in September, resisted expanding its cloud business because of the vast expenses required. She was replaced by co-CEOs Clay Magouyrk and Mike Sicilia as part of the pivot by Oracle to a new era focused on AI.

Catz, who is now executive vice-chair of Oracle’s board, has exercised stock options and sold $2.5 billion of its shares this year, according to US regulatory filings. She had announced plans to exercise her stock options at the end of 2024.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

Oracle hit hard in Wall Street’s tech sell-off over its huge AI bet Read More »

forget-agi—sam-altman-celebrates-chatgpt-finally-following-em-dash-formatting-rules

Forget AGI—Sam Altman celebrates ChatGPT finally following em dash formatting rules


Next stop: superintelligence

Ongoing struggles with AI model instruction-following show that true human-level AI still a ways off.

Em dashes have become what many believe to be a telltale sign of AI-generated text over the past few years. The punctuation mark appears frequently in outputs from ChatGPT and other AI chatbots, sometimes to the point where readers believe they can identify AI writing by its overuse alone—although people can overuse it, too.

On Thursday evening, OpenAI CEO Sam Altman posted on X that ChatGPT has started following custom instructions to avoid using em dashes. “Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it’s supposed to do!” he wrote.

The post, which came two days after the release of OpenAI’s new GPT-5.1 AI model, received mixed reactions from users who have struggled for years with getting the chatbot to follow specific formatting preferences. And this “small win” raises a very big question: If the world’s most valuable AI company has struggled with controlling something as simple as punctuation use after years of trying, perhaps what people call artificial general intelligence (AGI) is farther off than some in the industry claim.

Sam Altman @sama Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it's supposed to do! 11:48 PM · Nov 13, 2025 · 2.4M Views

A screenshot of Sam Altman’s post about em dashes on X. Credit: X

“The fact that it’s been 3 years since ChatGPT first launched, and you’ve only just now managed to make it obey this simple requirement, says a lot about how little control you have over it, and your understanding of its inner workings,” wrote one X user in a reply. “Not a good sign for the future.”

While Altman likes to publicly talk about AGI (a hypothetical technology equivalent to humans in general learning ability), superintelligence (a nebulous concept for AI that is far beyond human intelligence), and “magic intelligence in the sky” (his term for AI cloud computing?) while raising funds for OpenAI, it’s clear that we still don’t have reliable artificial intelligence here today on Earth.

But wait, what is an em dash anyway, and why does it matter so much?

AI models love em dashes because we do

Unlike a hyphen, which is a short punctuation mark used to connect words or parts of words, that lives with a dedicated key on your keyboard (-), an em dash is a long dash denoted by a special character (—) that writers use to set off parenthetical information, indicate a sudden change in thought, or introduce a summary or explanation.

Even before the age of AI language models, some writers frequently bemoaned the overuse of the em dash in modern writing. In a 2011 Slate article, writer Noreen Malone argued that writers used the em dash “in lieu of properly crafting sentences” and that overreliance on it “discourages truly efficient writing.” Various Reddit threads posted prior to ChatGPT’s launch featured writers either wrestling over the etiquette of proper em dash use or admitting to their frequent use as a guilty pleasure.

In 2021, one writer in the r/FanFiction subreddit wrote, “For the longest time, I’ve been addicted to Em Dashes. They find their way into every paragraph I write. I love the crisp straight line that gives me the excuse to shove details or thoughts into an otherwise orderly paragraph. Even after coming back to write after like two years of writer’s block, I immediately cram as many em dashes as I can.”

Because of the tendency for AI chatbots to overuse them, detection tools and human readers have learned to spot em dash use as a pattern, creating a problem for the small subset of writers who naturally favor the punctuation mark in their work. As a result, some journalists are complaining that AI is “killing” the em dash.

No one knows precisely why LLMs tend to overuse em dashes. We’ve seen a wide range of speculation online that attempts to explain the phenomenon, from noticing that em dashes were more popular in 19th-century books used as training data (according to a 2018 study, dash use in the English language peaked around 1860 before declining through the mid-20th century) or perhaps AI models borrowed the habit from automatic em-dash character conversion on the blogging site Medium.

One thing we know for sure is that LLMs tend to output frequently seen patterns in their training data (fed in during the initial training process) and from a subsequent reinforcement learning process that often relies on human preferences. As a result, AI language models feed you a sort of “smoothed out” average style of whatever you ask them to provide, moderated by whatever they are conditioned to produce through user feedback.

So the most plausible explanation is still that requests for professional-style writing from an AI model trained on vast numbers of examples from the Internet will lean heavily toward the prevailing style in the training data, where em dashes appear frequently in formal writing, news articles, and editorial content. It’s also possible that during training through human feedback (called RLHF), responses with em dashes, for whatever reason, received higher ratings. Perhaps it’s because those outputs appeared more sophisticated or engaging to evaluators, but that’s just speculation.

From em dashes to AGI?

To understand what Altman’s “win” really means, and what it says about the road to AGI, we need to understand how ChatGPT’s custom instructions actually work. They allow users to set persistent preferences that apply across all conversations by appending written instructions to the prompt that is fed into the model just before the chat begins. Users can specify tone, format, and style requirements without needing to repeat those requests manually in every new chat.

However, the feature has not always worked reliably because LLMs do not work reliably (even OpenAI and Anthropic freely admit this). A LLM takes an input and produces an output, spitting out a statistically plausible continuation of a prompt (a system prompt, the custom instructions, and your chat history), and it doesn’t really “understand” what you are asking. With AI language model outputs, there is always some luck involved in getting them to do what you want.

In our informal testing of GPT-5.1 with custom instructions, ChatGPT did appear to follow our request not to produce em dashes. But despite Altman’s claim, the response from X users appears to show that experiences with the feature continue to vary, at least when the request is not placed in custom instructions.

So if LLMs are statistical text-generation boxes, what does “instruction following” even mean? That’s key to unpacking the hypothetical path from LLMs to AGI. The concept of following instructions for an LLM is fundamentally different from how we typically think about following instructions as humans with general intelligence, or even a traditional computer program.

In traditional computing, instruction following is deterministic. You tell a program “don’t include character X,” and it won’t include that character. The program executes rules exactly as written. With LLMs, “instruction following” is really about shifting statistical probabilities. When you tell ChatGPT “don’t use em dashes,” you’re not creating a hard rule. You’re adding text to the prompt that makes tokens associated with em dashes less likely to be selected during the generation process. But “less likely” isn’t “impossible.”

Every token the model generates is selected from a probability distribution. Your custom instruction influences that distribution, but it’s competing with the model’s training data (where em-dashes appeared frequently in certain contexts) and everything else in the prompt. Unlike code with conditional logic, there’s no separate system verifying outputs against your requirements. The instruction is just more text that influences the statistical prediction process.

When Altman celebrates finally getting GPT to avoid em dashes, he’s really celebrating that OpenAI has tuned the latest version of GPT-5.1 (probably through reinforcement learning or fine-tuning) to weight custom instructions more heavily in its probability calculations.

There’s an irony about control here: Given the probabilistic nature of the issue, there’s no guarantee the issue will stay fixed. OpenAI continuously updates its models behind the scenes, even within the same version number, adjusting outputs based on user feedback and new training runs. Each update arrives with different output characteristics that can undo previous behavioral tuning, a phenomenon researchers call the “alignment tax.”

Precisely tuning a neural network’s behavior is not yet an exact science. Since all concepts encoded in the network are interconnected by values called weights, adjusting one behavior can alter others in unintended ways. Fix em dash overuse today, and tomorrow’s update (aimed at improving, say, coding capabilities) might inadvertently bring them back, not because OpenAI wants them there, but because that’s the nature of trying to steer a statistical system with millions of competing influences.

This gets to an implied question we mentioned earlier. If controlling punctuation use is still a struggle that might pop back up at any time, how far are we from AGI? We can’t know for sure, but it seems increasingly likely that it won’t emerge from a large language model alone. That’s because AGI, a technology that would replicate human general learning ability, would likely require true understanding and self-reflective intentional action, not statistical pattern matching that sometimes aligns with instructions if you happen to get lucky.

And speaking of getting lucky, some users still aren’t having luck with controlling em dash use outside of the “custom instructions” feature. Upon being told in-chat to not use em dashes within a chat, ChatGPT updated a saved memory and replied to one X user, “Got it—I’ll stick strictly to short hyphens from now on.”

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

Forget AGI—Sam Altman celebrates ChatGPT finally following em dash formatting rules Read More »

researchers-question-anthropic-claim-that-ai-assisted-attack-was-90%-autonomous

Researchers question Anthropic claim that AI-assisted attack was 90% autonomous

Claude frequently overstated findings and occasionally fabricated data during autonomous operations, claiming to have obtained credentials that didn’t work or identifying critical discoveries that proved to be publicly available information. This AI hallucination in offensive security contexts presented challenges for the actor’s operational effectiveness, requiring careful validation of all claimed results. This remains an obstacle to fully autonomous cyberattacks.

How (Anthropic says) the attack unfolded

Anthropic said GTG-1002 developed an autonomous attack framework that used Claude as an orchestration mechanism that largely eliminated the need for human involvement. This orchestration system broke complex multi-stage attacks into smaller technical tasks such as vulnerability scanning, credential validation, data extraction, and lateral movement.

“The architecture incorporated Claude’s technical capabilities as an execution engine within a larger automated system, where the AI performed specific technical actions based on the human operators’ instructions while the orchestration logic maintained attack state, managed phase transitions, and aggregated results across multiple sessions,” Anthropic said. “This approach allowed the threat actor to achieve operational scale typically associated with nation-state campaigns while maintaining minimal direct involvement, as the framework autonomously progressed through reconnaissance, initial access, persistence, and data exfiltration phases by sequencing Claude’s responses and adapting subsequent requests based on discovered information.”

The attacks followed a five-phase structure that increased AI autonomy through each one.

The life cycle of the cyberattack, showing the move from human-led targeting to largely AI-driven attacks using various tools, often via the Model Context Protocol (MCP). At various points during the attack, the AI returns to its human operator for review and further direction.

Credit: Anthropic

The life cycle of the cyberattack, showing the move from human-led targeting to largely AI-driven attacks using various tools, often via the Model Context Protocol (MCP). At various points during the attack, the AI returns to its human operator for review and further direction. Credit: Anthropic

The attackers were able to bypass Claude guardrails in part by breaking tasks into small steps that, in isolation, the AI tool didn’t interpret as malicious. In other cases, the attackers couched their inquiries in the context of security professionals trying to use Claude to improve defenses.

As noted last week, AI-developed malware has a long way to go before it poses a real-world threat. There’s no reason to doubt that AI-assisted cyberattacks may one day produce more potent attacks. But the data so far indicates that threat actors—like most others using AI—are seeing mixed results that aren’t nearly as impressive as those in the AI industry claim.

Researchers question Anthropic claim that AI-assisted attack was 90% autonomous Read More »

openai-walks-a-tricky-tightrope-with-gpt-5.1’s-eight-new-personalities

OpenAI walks a tricky tightrope with GPT-5.1’s eight new personalities

On Wednesday, OpenAI released GPT-5.1 Instant and GPT-5.1 Thinking, two updated versions of its flagship AI models now available in ChatGPT. The company is wrapping the models in the language of anthropomorphism, claiming that they’re warmer, more conversational, and better at following instructions.

The release follows complaints earlier this year that its previous models were excessively cheerful and sycophantic, along with an opposing controversy among users over how OpenAI modified the default GPT-5 output style after several suicide lawsuits.

The company now faces intense scrutiny from lawyers and regulators that could threaten its future operations. In that kind of environment, it’s difficult to just release a new AI model, throw out a few stats, and move on like the company could even a year ago. But here are the basics: The new GPT-5.1 Instant model will serve as ChatGPT’s faster default option for most tasks, while GPT-5.1 Thinking is a simulated reasoning model that attempts to handle more complex problem-solving tasks.

OpenAI claims that both models perform better on technical benchmarks such as math and coding evaluations (including AIME 2025 and Codeforces) than GPT-5, which was released in August.

Improved benchmarks may win over some users, but the biggest change with GPT-5.1 is in its presentation. OpenAI says it heard from users that they wanted AI models to simulate different communication styles depending on the task, so the company is offering eight preset options, including Professional, Friendly, Candid, Quirky, Efficient, Cynical, and Nerdy, alongside a Default setting.

These presets alter the instructions fed into each prompt to simulate different personality styles, but the underlying model capabilities remain the same across all settings.

An illustration showing GPT-5.1's eight personality styles in ChatGPT.

An illustration showing GPT-5.1’s eight personality styles in ChatGPT. Credit: OpenAI

In addition, the company trained GPT-5.1 Instant to use “adaptive reasoning,” meaning that the model decides when to spend more computational time processing a prompt before generating output.

The company plans to roll out the models gradually over the next few days, starting with paid subscribers before expanding to free users. OpenAI plans to bring both GPT-5.1 Instant and GPT-5.1 Thinking to its API later this week. GPT-5.1 Instant will appear as gpt-5.1-chat-latest, and GPT-5.1 Thinking will be released as GPT-5.1 in the API, both with adaptive reasoning enabled. The older GPT-5 models will remain available in ChatGPT under the legacy models dropdown for paid subscribers for three months.

OpenAI walks a tricky tightrope with GPT-5.1’s eight new personalities Read More »

meta’s-star-ai-scientist-yann-lecun-plans-to-leave-for-own-startup

Meta’s star AI scientist Yann LeCun plans to leave for own startup

A different approach to AI

LeCun founded Meta’s Fundamental AI Research lab, known as FAIR, in 2013 and has served as the company’s chief AI scientist ever since. He is one of three researchers who won the 2018 Turing Award for pioneering work on deep learning and convolutional neural networks. After leaving Meta, LeCun will remain a professor at New York University, where he has taught since 2003.

LeCun has previously argued that large language models like Llama that Zuckerberg has put at the center of his strategy are useful, but they will never be able to reason and plan like humans, increasingly appearing to contradict his boss’s grandiose AI vision for developing “superintelligence.”

For example, in May 2024, when an OpenAI researcher discussed the need to control ultra-intelligent AI, LeCun responded on X by writing that before urgently figuring out how to control AI systems much smarter than humans, researchers need to have the beginning of a hint of a design for a system smarter than a house cat.

Mark Zuckerberg once believed the “metaverse” was the future and renamed his company because of it. Credit: Facebook

Within FAIR, LeCun has instead focused on developing world models that can truly plan and reason. Over the past year, though, Meta’s AI research groups have seen growing tension and mass layoffs as Zuckerberg has shifted the company’s AI strategy away from long-term research and toward the rapid deployment of commercial products.

Over the summer, Zuckerberg hired Alexandr Wang to lead a new superintelligence team at Meta, paying $14.3 billion to hire the 28-year-old founder of data-labeling startup Scale AI and acquire a 49 percent interest in his company. LeCun, who had previously reported to Chief Product Officer Chris Cox, now reports to Wang, which seems like a sharp rebuke of LeCun’s approach to AI.

Zuckerberg also personally handpicked an exclusive team called TBD Lab to accelerate the development of the next iteration of large language models, luring staff from rivals such as OpenAI and Google with astonishingly large $100 to $250 million pay packages. As a result, Zuckerberg has come under growing pressure from Wall Street to show that his multibillion-dollar investment in becoming an AI leader will pay off and boost revenue. But if it turns out like his previous pivot to the metaverse, Zuckerberg’s latest bet could prove equally expensive and unfruitful.

Meta’s star AI scientist Yann LeCun plans to leave for own startup Read More »

clickfix-may-be-the-biggest-security-threat-your-family-has-never-heard-of

ClickFix may be the biggest security threat your family has never heard of

Another campaign, documented by Sekoia, targeted Windows users. The attackers behind it first compromise a hotel’s account for Booking.com or another online travel service. Using the information stored in the compromised accounts, the attackers contact people with pending reservations, an ability that builds immediate trust with many targets, who are eager to comply with instructions, lest their stay be canceled.

The site eventually presents a fake CAPTCHA notification that bears an almost identical look and feel to those required by content delivery network Cloudflare. The proof the notification requires for confirmation that there’s a human behind the keyboard is to copy a string of text and paste it into the Windows terminal. With that, the machine is infected with malware tracked as PureRAT.

Push Security, meanwhile, reported a ClickFix campaign with a page “adapting to the device that you’re visiting from.” Depending on the OS, the page will deliver payloads for Windows or macOS. Many of these payloads, Microsoft said, are LOLbins, the name for binaries that use a technique known as living off the land. These scripts rely solely on native capabilities built into the operating system. With no malicious files being written to disk, endpoint protection is further hamstrung.

The commands, which are often base-64 encoded to make them unreadable to humans, are often copied inside the browser sandbox, a part of most browsers that accesses the Internet in an isolated environment designed to protect devices from malware or harmful scripts. Many security tools are unable to observe and flag these actions as potentially malicious.

The attacks can also be effective given the lack of awareness. Many people have learned over the years to be suspicious of links in emails or messengers. In many users’ minds, the precaution doesn’t extend to sites that instruct them to copy a piece of text and paste it into an unfamiliar window. When the instructions come in emails from a known hotel or at the top of Google results, targets can be further caught off guard.

With many families gathering in the coming weeks for various holiday dinners, ClickFix scams are worth mentioning to those family members who ask for security advice. Microsoft Defender and other endpoint protection programs offer some defenses against these attacks, but they can, in some cases, be bypassed. That means that, for now, awareness is the best countermeasure.

ClickFix may be the biggest security threat your family has never heard of Read More »

researchers-isolate-memorization-from-problem-solving-in-ai-neural-networks

Researchers isolate memorization from problem-solving in AI neural networks


The hills and valleys of knowledge

Basic arithmetic ability lives in the memorization pathways, not logic circuits.

When engineers build AI language models like GPT-5 from training data, at least two major processing features emerge: memorization (reciting exact text they’ve seen before, like famous quotes or passages from books) and what you might call “reasoning” (solving new problems using general principles). New research from AI startup Goodfire.ai provides the first potentially clear evidence that these different functions actually work through completely separate neural pathways in the model’s architecture.

The researchers discovered that this separation proves remarkably clean. In a preprint paper released in late October, they described that when they removed the memorization pathways, models lost 97 percent of their ability to recite training data verbatim but kept nearly all their “logical reasoning” ability intact.

For example, at layer 22 in Allen Institute for AI’s OLMo-7B language model, the researchers ranked all the weight components (the mathematical values that process information) from high to low based on a measure called “curvature” (which we’ll explain more below). When they examined these ranked components, the bottom 50 percent of weight components showed 23 percent higher activation on memorized data, while the top 10 percent showed 26 percent higher activation on general, non-memorized text.

In other words, the components that specialize in memorization clustered at the bottom of their ranking, while problem-solving components clustered at the top. This mechanistic split enabled the researchers to surgically remove memorization while preserving other capabilities. They found they could delete the bottom-ranked components to eliminate memorization while keeping the top-ranked ones that handle problem-solving.

Perhaps most surprisingly, the researchers found that arithmetic operations seem to share the same neural pathways as memorization rather than logical reasoning. When they removed memorization circuits, mathematical performance plummeted to 66 percent while logical tasks remained nearly untouched. This discovery may explain why AI language models notoriously struggle with math without the use of external tools. They’re attempting to recall arithmetic from a limited memorization table rather than computing it, like a student who memorized times tables but never learned how multiplication works. The finding suggests that at current scales, language models treat “2+2=4” more like a memorized fact than a logical operation.

It’s worth noting that “reasoning” in AI research covers a spectrum of abilities that don’t necessarily match what we might call reasoning in humans. The logical reasoning that survived memory removal in this latest research includes tasks like evaluating true/false statements and following if-then rules, which are essentially applying learned patterns to new inputs. This also differs from the deeper “mathematical reasoning” required for proofs or novel problem-solving, which current AI models struggle with even when their pattern-matching abilities remain intact.

Looking ahead, if the information removal techniques receive further development in the future, AI companies could potentially one day remove, say, copyrighted content, private information, or harmful memorized text from a neural network without destroying the model’s ability to perform transformative tasks. However, since neural networks store information in distributed ways that are still not completely understood, for the time being, the researchers say their method “cannot guarantee complete elimination of sensitive information.” These are early steps in a new research direction for AI.

Traveling the neural landscape

To understand how researchers from Goodfire distinguished memorization from reasoning in these neural networks, it helps to know about a concept in AI called the “loss landscape.” The “loss landscape” is a way of visualizing how wrong or right an AI model’s predictions are as you adjust its internal settings (which are called “weights”).

Imagine you’re tuning a complex machine with millions of dials. The “loss” measures the number of mistakes the machine makes. High loss means many errors, low loss means few errors. The “landscape” is what you’d see if you could map out the error rate for every possible combination of dial settings.

During training, AI models essentially “roll downhill” in this landscape (gradient descent), adjusting their weights to find the valleys where they make the fewest mistakes. This process provides AI model outputs, like answers to questions.

Figure 1: Overview of our approach. We collect activations and gradients from a sample of training data (a), which allows us to approximate loss curvature w.r.t. a weight matrix using K-FAC (b). We decompose these weight matrices into components (each the same size as the matrix), ordered from high to low curvature. In language models, we show that data from different tasks interacts with parts of the spectrum of components differently (c).

Figure 1 from the paper “From Memorization to Reasoning in the Spectrum of Loss Curvature.” Credit: Merullo et al.

The researchers analyzed the “curvature” of the loss landscapes of particular AI language models, measuring how sensitive the model’s performance is to small changes in different neural network weights. Sharp peaks and valleys represent high curvature (where tiny changes cause big effects), while flat plains represent low curvature (where changes have minimal impact). They used these curvature values to rank the weight components from high to low, as mentioned earlier.

Using a technique called K-FAC (Kronecker-Factored Approximate Curvature), they found that individual memorized facts create sharp spikes in this landscape, but because each memorized item spikes in a different direction, when averaged together they create a flat profile. Meanwhile, reasoning abilities that many different inputs rely on maintain consistent moderate curves across the landscape, like rolling hills that remain roughly the same shape regardless of the direction from which you approach them.

“Directions that implement shared mechanisms used by many inputs add coherently and remain high-curvature on average,” the researchers write, describing reasoning pathways. In contrast, memorization uses “idiosyncratic sharp directions associated with specific examples” that appear flat when averaged across data.

Different tasks reveal a spectrum of mechanisms

The researchers tested their technique on multiple AI systems to verify the findings held across different architectures. They primarily used Allen Institute’s OLMo-2 family of open language models, specifically the 7 billion- and 1 billion-parameter versions, chosen because their training data is openly accessible. For vision models, they trained custom 86 million-parameter Vision Transformers (ViT-Base models) on ImageNet with intentionally mislabeled data to create controlled memorization. They also validated their findings against existing memorization removal methods like BalancedSubnet to establish performance benchmarks.

The team tested their discovery by selectively removing low-curvature weight components from these trained models. Memorized content dropped to 3.4 percent recall from nearly 100 percent. Meanwhile, logical reasoning tasks maintained 95 to 106 percent of baseline performance.

These logical tasks included Boolean expression evaluation, logical deduction puzzles where solvers must track relationships like “if A is taller than B,” object tracking through multiple swaps, and benchmarks like BoolQ for yes/no reasoning, Winogrande for common sense inference, and OpenBookQA for science questions requiring reasoning from provided facts. Some tasks fell between these extremes, revealing a spectrum of mechanisms.

Mathematical operations and closed-book fact retrieval shared pathways with memorization, dropping to 66 to 86 percent performance after editing. The researchers found arithmetic particularly brittle. Even when models generated identical reasoning chains, they failed at the calculation step after low-curvature components were removed.

Figure 3: Sensitivity of different kinds of tasks to ablation of flatter eigenvectors. Parametric knowledge retrieval, arithmetic, and memorization are brittle, but openbook fact retrieval and logical reasoning is robust and maintain around 100% of original performance.

Figure 3 from the paper “From Memorization to Reasoning in the Spectrum of Loss Curvature.” Credit: Merullo et al.

“Arithmetic problems themselves are memorized at the 7B scale, or because they require narrowly used directions to do precise calculations,” the team explains. Open-book question answering, which relies on provided context rather than internal knowledge, proved most robust to the editing procedure, maintaining nearly full performance.

Curiously, the mechanism separation varied by information type. Common facts like country capitals barely changed after editing, while rare facts like company CEOs dropped 78 percent. This suggests models allocate distinct neural resources based on how frequently information appears in training.

The K-FAC technique outperformed existing memorization removal methods without needing training examples of memorized content. On unseen historical quotes, K-FAC achieved 16.1 percent memorization versus 60 percent for the previous best method, BalancedSubnet.

Vision transformers showed similar patterns. When trained with intentionally mislabeled images, the models developed distinct pathways for memorizing wrong labels versus learning correct patterns. Removing memorization pathways restored 66.5 percent accuracy on previously mislabeled images.

Limits of memory removal

However, the researchers acknowledged that their technique isn’t perfect. Once-removed memories might return if the model receives more training, as other research has shown that current unlearning methods only suppress information rather than completely erasing it from the neural network’s weights. That means the “forgotten” content can be reactivated with just a few training steps targeting those suppressed areas.

The researchers also can’t fully explain why some abilities, like math, break so easily when memorization is removed. It’s unclear whether the model actually memorized all its arithmetic or whether math just happens to use similar neural circuits as memorization. Additionally, some sophisticated capabilities might look like memorization to their detection method, even when they’re actually complex reasoning patterns. Finally, the mathematical tools they use to measure the model’s “landscape” can become unreliable at the extremes, though this doesn’t affect the actual editing process.

This article was updated on November 11, 2025 at 9: 16 am to clarify an explanation about sorting weights by curvature.

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

Researchers isolate memorization from problem-solving in AI neural networks Read More »

researchers-surprised-that-with-ai,-toxicity-is-harder-to-fake-than-intelligence

Researchers surprised that with AI, toxicity is harder to fake than intelligence

The next time you encounter an unusually polite reply on social media, you might want to check twice. It could be an AI model trying (and failing) to blend in with the crowd.

On Wednesday, researchers from the University of Zurich, University of Amsterdam, Duke University, and New York University released a study revealing that AI models remain easily distinguishable from humans in social media conversations, with overly friendly emotional tone serving as the most persistent giveaway. The research, which tested nine open-weight models across Twitter/X, Bluesky, and Reddit, found that classifiers developed by the researchers detected AI-generated replies with 70 to 80 percent accuracy.

The study introduces what the authors call a “computational Turing test” to assess how closely AI models approximate human language. Instead of relying on subjective human judgment about whether text sounds authentic, the framework uses automated classifiers and linguistic analysis to identify specific features that distinguish machine-generated from human-authored content.

“Even after calibration, LLM outputs remain clearly distinguishable from human text, particularly in affective tone and emotional expression,” the researchers wrote. The team, led by Nicolò Pagan at the University of Zurich, tested various optimization strategies, from simple prompting to fine-tuning, but found that deeper emotional cues persist as reliable tells that a particular text interaction online was authored by an AI chatbot rather than a human.

The toxicity tell

In the study, researchers tested nine large language models: Llama 3.1 8B, Llama 3.1 8B Instruct, Llama 3.1 70B, Mistral 7B v0.1, Mistral 7B Instruct v0.2, Qwen 2.5 7B Instruct, Gemma 3 4B Instruct, DeepSeek-R1-Distill-Llama-8B, and Apertus-8B-2509.

When prompted to generate replies to real social media posts from actual users, the AI models struggled to match the level of casual negativity and spontaneous emotional expression common in human social media posts, with toxicity scores consistently lower than authentic human replies across all three platforms.

To counter this deficiency, the researchers attempted optimization strategies (including providing writing examples and context retrieval) that reduced structural differences like sentence length or word count, but variations in emotional tone persisted. “Our comprehensive calibration tests challenge the assumption that more sophisticated optimization necessarily yields more human-like output,” the researchers concluded.

Researchers surprised that with AI, toxicity is harder to fake than intelligence Read More »

wipers-from-russia’s-most-cut-throat-hackers-rain-destruction-on-ukraine

Wipers from Russia’s most cut-throat hackers rain destruction on Ukraine

One of the world’s most ruthless and advanced hacking groups, the Russian state-controlled Sandworm, launched a series of destructive cyberattacks in the country’s ongoing war against neighboring Ukraine, researchers reported Thursday.

In April, the group targeted a Ukrainian university with two wipers, a form of malware that aims to permanently destroy sensitive data and often the infrastructure storing it. One wiper, tracked under the name Sting, targeted fleets of Windows computers by scheduling a task named DavaniGulyashaSdeshka, a phrase derived from Russian slang that loosely translates to “eat some goulash,” researchers from ESET said. The other wiper is tracked as Zerlot.

A not-so-common target

Then, in June and September, Sandworm unleashed multiple wiper variants against a host of Ukrainian critical infrastructure targets, including organizations active in government, energy, and logistics. The targets have long been in the crosshairs of Russian hackers. There was, however, a fourth, less common target—organizations in Ukraine’s grain industry.

“Although all four have previously been documented as targets of wiper attacks at some point since 2022, the grain sector stands out as a not-so-frequent target,” ESET said. “Considering that grain export remains one of Ukraine’s main sources of revenue, such targeting likely reflects an attempt to weaken the country’s war economy.”

Wipers have been a favorite tool of Russian hackers since at least 2012, with the spreading of the NotPetya worm. The self-replicating malware originally targeted Ukraine, but eventually caused international chaos when it spread globally in a matter of hours. The worm resulted in tens of billions of dollars in financial damages after it shut down thousands of organizations, many for days or weeks.

Wipers from Russia’s most cut-throat hackers rain destruction on Ukraine Read More »