microsoft

Lighter, cheaper Surface Laptop saves a little money but gives up a lot

The laptop has two USB-C ports on the right side, seen here, and a USB-A port and headphone jack on the left. Surface Connect is gone.

For those reasons, it seems like most individual buyers would still be better off going for the 13.8-inch Surface Laptop, with the new one only really making sense for companies buying these in bulk if the 13.8-inch Surface goes up in price or if the 13-inch Surface happens to be discounted and the 13.8-inch version isn’t. The 13.8-inch Laptop is also obviously still the one you want if you want more than 16GB of RAM or 512GB of storage, or if you need more CPU and GPU speed.

The new 13-inch Laptop has most of the same basic ports as the 13.8-inch version, just arranged slightly differently. You still get a pair of USB-C ports (both supporting 10 Gbps USB 3.2 speeds, rather than USB 4), one USB-A port, and a headphone jack, but the USB-A port and headphone jack are now on the left side of the laptop. As with the 12-inch Surface Pro tablet, the Surface Connect port has been removed, so this is compatible with all existing USB-C accessories but none of the ones that use Microsoft’s proprietary connector.

An awkward refresh

Both of the new Surface devices being announced today. Credit: Microsoft

The new Surface Laptop doesn’t seem to regress on any major functional fronts—unlike the 12-inch Surface Pro, which throws out an 11-year-old keyboard fix that made the Surface Pro’s keyboard cover much more stable and laptop-like—but it’s still an odd refresh. But inflation, supply chain snarls, and the Trump administration’s rapidly changing tariff plans have made pricing and availability harder to predict than they were a few years ago.

Though PCs and smartphones are (currently) exempted from most tariffs, Microsoft did recently raise the prices of its years-old Xbox Series S and X consoles; it’s possible these new Surface devices were originally designed to be budget models but that world events kept them from being as cheap as they otherwise might have been.

Microsoft’s 12-inch Surface Pro is cheaper but unfixes a decade-old design problem

Several downgrades, and one that’s hard to ignore

The 12-inch Surface Pro. Credit: Microsoft

The design looks pretty similar to the existing 13-inch Surface Pro overall but with some significant tweaks. The 12-inch Surface still supports the Slim Pen and other Surface styluses, but there’s now a magnet on the back of the tablet that the pen can be stuck to for storage, rather than a divot on the keyboard. The tablet still has a pair of USB-C ports, each of which supports 10 Gbps USB 3.2 speeds rather than full USB 4. But the Surface Connect port is gone, and because it’s physically smaller, the new Surface Pro isn’t compatible with any of the keyboard accessories made for past Surface Pro or Surface Go tablets.

But the biggest downgrade is a fundamental change to the tablet’s design. The 12-inch Surface Pro’s keyboard case (still a separate purchase, frustratingly) lies flat against whatever you have the tablet sitting on, whether that’s a desk, a table, or your lap. If the surface your Surface is resting on is level and stable, that’s mostly fine. If the surface is soft or uneven, like a lap or a couch, this introduces extra instability and floppiness, and your keyboard will wobble around more as you type on it.

Both of the new Surface devices being announced today. Note that the Surface Pro’s keyboard sits flat against the table, rather than folding up against the bottom of the screen. Credit: Microsoft

This is the same approach used by the first two generations of Surface Pro (and the ill-starred Surface RT), and it was a perennial complaint about those designs from reviewers and users. In 2014, the Surface Pro 3 tweaked the keyboard design so that the top of it would fold flat against the bottom of the device’s screen, giving the keyboard some rigidity and stability that persisted no matter what it was resting on. All subsequent Surface keyboards, including those for the tiny 10.5-inch Surface Go, used the same design, until this one.

The iPad keyboard case I use—a Logitech Combo Touch Keyboard Folio with a built-in trackpad and kickstand—also uses the flop-against-the-table design, which hasn’t been the end of the world. But solving this problem was a major turning point in the evolution of the Surface Pro, and it’s frustrating to see that signature improvement undone here.

AI isn’t ready to replace human coders for debugging, researchers say

A graph showing agents with tools nearly doubling the success rates of those without, but still achieving a success score under 50 percent

Agents using debugging tools drastically outperformed those that didn’t, but their success rate still wasn’t high enough. Credit: Microsoft Research

This approach is much more successful than relying on the models as they’re usually used, but when your best case is a 48.4 percent success rate, you’re not ready for primetime. The limitations are likely because the models don’t fully understand how to best use the tools, and because their current training data is not tailored to this use case.

“We believe this is due to the scarcity of data representing sequential decision-making behavior (e.g., debugging traces) in the current LLM training corpus,” the blog post says. “However, the significant performance improvement… validates that this is a promising research direction.”

This initial report is just the start of the efforts, the post claims. The next step is to “fine-tune an info-seeking model specialized in gathering the necessary information to resolve bugs.” If the model is large, the best move to save inference costs may be to “build a smaller info-seeking model that can provide relevant information to the larger one.”

This isn’t the first time we’ve seen outcomes that suggest some of the ambitious ideas about AI agents directly replacing developers are pretty far from reality. There have been numerous studies already showing that even though an AI tool can sometimes create an application that seems acceptable to the user for a narrow task, the models tend to produce code laden with bugs and security vulnerabilities, and they aren’t generally capable of fixing those problems.

This is an early step on the path to AI coding agents, but most researchers agree it remains likely that the best outcome is an agent that saves a human developer a substantial amount of time, not one that can do everything they can do.

That groan you hear is users’ reaction to Recall going back into Windows

Security and privacy advocates are girding themselves for another uphill battle against Recall, the AI tool rolling out in Windows 11 that will screenshot, index, and store everything a user does every three seconds.

When Recall was first introduced in May 2024, security practitioners roundly castigated it for creating a gold mine for malicious insiders, criminals, or nation-state spies if they managed to gain even brief administrative access to a Windows device. Privacy advocates warned that Recall was ripe for abuse in intimate partner violence settings. They also noted that there was nothing stopping Recall from preserving sensitive disappearing content sent through privacy-protecting messengers such as Signal.

Enshittification at a new scale

Following months of backlash, Microsoft later suspended Recall. On Thursday, the company said it was reintroducing Recall. It currently is available only to insiders with access to the Windows 11 Build 26100.3902 preview version. Over time, the feature will be rolled out more broadly. Microsoft officials wrote:

Recall (preview) saves you time by offering an entirely new way to search for things you’ve seen or done on your PC securely. With the AI capabilities of Copilot+ PCs, it’s now possible to quickly find and get back to any app, website, image, or document just by describing its content. To use Recall, you will need to opt-in to saving snapshots, which are images of your activity, and enroll in Windows Hello to confirm your presence so only you can access your snapshots. You are always in control of what snapshots are saved and can pause saving snapshots at any time. As you use your Copilot+ PC throughout the day working on documents or presentations, taking video calls, and context switching across activities, Recall will take regular snapshots and help you find things faster and easier. When you need to find or get back to something you’ve done previously, open Recall and authenticate with Windows Hello. When you’ve found what you were looking for, you can reopen the application, website, or document, or use Click to Do to act on any image or text in the snapshot you found.

Microsoft is hoping that the concessions requiring opt-in and the ability to pause Recall will help quell the collective revolt that broke out last year. It likely won’t for various reasons.

Carmack defends AI tools after Quake fan calls Microsoft AI demo “disgusting”

The current generative Quake II demo represents a slight advancement from Microsoft’s previous generative AI gaming model (confusingly titled “WHAM” with only one “M”) we covered in February. That earlier model, while showing progress in generating interactive gameplay footage, operated at 300×180 resolution at 10 frames per second—far below practical modern gaming standards. The new WHAMM demonstration doubles the resolution to 640×360. However, both remain well below what gamers expect from a functional video game in almost every conceivable way. It truly is an AI tech demo.

A Microsoft diagram of the WHAMM system. Credit: Microsoft

For example, the technology faces substantial challenges beyond just performance metrics. Microsoft acknowledges several limitations, including poor enemy interactions, a short context length of just 0.9 seconds (meaning the system forgets objects outside its view), and unreliable numerical tracking for game elements like health values.

Which brings us to another point: A significant gap persists between the technology’s marketing portrayal and its practical applications. While industry veterans like Carmack and Sweeney view AI as another tool in the development arsenal, demonstrations like the Quake II instance may create inflated expectations about AI’s current capabilities for complete game generation.

The most realistic near-term application of generative AI technology remains as coding assistants and perhaps rapid prototyping tools for developers, rather than a drop-in replacement for traditional game development pipelines. The technology’s current limitations suggest that human developers will remain essential for creating compelling, polished game experiences for now. But given the general pace of progress, that might be small comfort for those who worry about losing jobs to AI in the near-term.

Ultimately, Sweeney says not to worry: “There’s always a fear that automation will lead companies to make the same old products while employing fewer people to do it,” Sweeney wrote in a follow-up post on X. “But competition will ultimately lead to companies producing the best work they’re capable of given the new tools, and that tends to mean more jobs.”

And Carmack closed with this: “Will there be more or less game developer jobs? That is an open question. It could go the way of farming, where labor-saving technology allow a tiny fraction of the previous workforce to satisfy everyone, or it could be like social media, where creative entrepreneurship has flourished at many different scales. Regardless, ‘don’t use power tools because they take people’s jobs’ is not a winning strategy.”

Microsoft turns 50 today, and it made me think about MS-DOS 5.0

On this day in 1975, Bill Gates and Paul Allen founded a company called Micro-Soft in Albuquerque, New Mexico.

The two men had worked together before, as members of the Lakeside Programming Group in the early ’70s and as co-founders of a road traffic analysis company called Traf-O-Data. But Micro-Soft, later renamed to drop the hyphen and relocated to its current headquarters in Redmond, Washington, would be the company that would transform personal computing over the next five decades.

I’m not here to do a history of Microsoft, because Wikipedia already exists and because the company has already put together a gauzy 50th-anniversary retrospective site with some retro-themed wallpapers. But the anniversary did make me try to remember which Microsoft product I consciously used for the first time, the one that made me aware of the company and the work it was doing.

To get the answer, just put a decimal point in the number “50”—my first Microsoft product was MS-DOS 5.0.

Riding with DOS in the Windows era

I remember this version of MS-DOS so vividly because it was the version that we ran on our first computer. I couldn’t actually tell you what computer it was, though, not because I don’t remember it but because it was a generic yellowed hand-me-down that was prodigiously out of date, given to us by well-meaning people from our church who didn’t know enough to know how obsolete the system was.

It was a clone of the original IBM PC 5150, initially released in 1981; I believe we took ownership of it sometime in 1995 or 1996. It had an Intel 8088, two 5.25-inch floppy drives, and 500-something KB of RAM (also, if memory serves, a sac of spider eggs). But it had no hard drive inside, meaning that anything I wanted to run on or save from this computer needed to use a pile of moldering black plastic diskettes, more than a few of which were already going bad.

AI search engines cite incorrect sources at an alarming 60% rate, study says

A new study from Columbia Journalism Review’s Tow Center for Digital Journalism finds serious accuracy issues with generative AI models used for news searches. The research tested eight AI-driven search tools equipped with live search functionality and discovered that the AI models incorrectly answered more than 60 percent of queries about news sources.

Researchers Klaudia Jaźwińska and Aisvarya Chandrasekar noted in their report that roughly 1 in 4 Americans now use AI models as alternatives to traditional search engines. This raises serious concerns about reliability, given the substantial error rate uncovered in the study.

Error rates varied notably among the tested platforms. Perplexity provided incorrect information in 37 percent of the queries tested, whereas ChatGPT Search incorrectly identified 67 percent (134 out of 200) of articles queried. Grok 3 demonstrated the highest error rate, at 94 percent.

A graph from CJR shows “confidently wrong” search results. Credit: CJR

For the tests, researchers fed direct excerpts from actual news articles to the AI models, then asked each model to identify the article’s headline, original publisher, publication date, and URL. They ran 1,600 queries across the eight different generative search tools.
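The study's figures can be cross-checked with simple arithmetic. The short sketch below assumes the 1,600 queries were split evenly across the eight tools, which is consistent with the 200-article denominator quoted for ChatGPT Search. The variable names are illustrative only.

```python
# Cross-check the figures quoted from the CJR study.
# Assumption: queries were divided evenly among the tested tools.
TOTAL_QUERIES = 1_600
TOOLS_TESTED = 8

queries_per_tool = TOTAL_QUERIES // TOOLS_TESTED

# ChatGPT Search: 134 incorrect answers out of its share of the queries.
chatgpt_incorrect = 134
chatgpt_error_rate = chatgpt_incorrect / queries_per_tool

print(f"Queries per tool: {queries_per_tool}")
print(f"ChatGPT Search error rate: {chatgpt_error_rate:.0%}")
```

Under that even-split assumption, each tool received 200 queries, and 134 wrong answers works out to the 67 percent reported for ChatGPT Search.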

The study highlighted a common trend among these AI models: rather than declining to respond when they lacked reliable information, the models frequently provided confabulations—plausible-sounding incorrect or speculative answers. The researchers emphasized that this behavior was consistent across all tested models, not limited to just one tool.

Surprisingly, premium paid versions of these AI search tools fared even worse in certain respects. Perplexity Pro ($20/month) and Grok 3’s premium service ($40/month) confidently delivered incorrect responses more often than their free counterparts. Though these premium models correctly answered a higher number of prompts, their reluctance to decline uncertain responses drove higher overall error rates.

Issues with citations and publisher control

The CJR researchers also uncovered evidence suggesting some AI tools ignored Robots Exclusion Protocol settings, which publishers use to prevent unauthorized access. For example, Perplexity’s free version correctly identified all 10 excerpts from paywalled National Geographic content, despite National Geographic explicitly disallowing Perplexity’s web crawlers.
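For readers unfamiliar with how such an exclusion is expressed, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `PerplexityBot` user-agent string, the `example.com` URL, and the `may_fetch` helper are all assumptions made for illustration. None of them is taken from National Geographic's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
# The user-agent name and the URL below are placeholders for illustration.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


article = "https://example.com/premium/article"
print(may_fetch("PerplexityBot", article))  # the blocked crawler
print(may_fetch("SomeOtherBot", article))   # any other crawler
```

The protocol is advisory. A well-behaved crawler runs a check like this before requesting a page, but nothing in robots.txt technically stops a crawler that skips the check.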

Six ways Microsoft’s portable Xbox could be a Steam Deck killer

Bring old Xbox games to PC

The ultimate handheld system seller. Credit: Microsoft / Bizarre Creations

Microsoft has made a lot of hay over the way recent Xbox consoles can play games dating all the way back to the original Xbox. If Microsoft wants to set its first gaming handheld apart, it should make those old console games officially available on a Windows-based system for the first time.

The ability to download previous console games dating back to the Xbox 360 era (or beyond) would be an instant “system seller” feature for any portable Xbox. While this wouldn’t be a trivial technical lift on Microsoft’s part, the same emulation layer that powers Xbox console backward compatibility could surely be ported to Windows with a little bit of work. That process might be easier with a specific branded portable, too, since Microsoft would be working with full knowledge of what hardware was being used.

If Microsoft can give us a way to play Geometry Wars 2 on the go without having to deal with finicky third-party emulators, we’ll be eternally grateful.

Multiple hardware tiers

Xbox Series S (left), next to Xbox Series X (right).

One size does not fit all when it comes to consoles or to handhelds. Credit: Sam Machkovech

On the console side, Microsoft’s split simultaneous release of the Xbox Series S and X showed an understanding that not everyone wants to pay more money for the most powerful possible gaming hardware. Microsoft should extend this philosophy to gaming handhelds by releasing different tiers of portable Xbox hardware for price-conscious consumers.

Raw hardware power is the most obvious differentiator that could set a more expensive tier of Xbox portables apart from any cheaper options. But Microsoft could also offer portable options that reduce the overall bulk (a la the Nintendo Switch Lite) or offer relative improvements in screen size and quality (a la the Steam Deck OLED and Switch OLED).

“Made for Xbox”

It worked for Valve, it can work for Microsoft. Credit: Valve

One of the best things about console gaming is that you can be confident any game you buy for a console will “just work” with your hardware. In the world of PC gaming handhelds, Valve has tried to replicate this with the “Deck Verified” program to highlight Steam games that are guaranteed to work in a portable setting.

Microsoft is well-positioned to work with game publishers to launch a similar program for its own Xbox-branded portable. There’s real value in offering gamers assurances that “Made for Xbox” PC games will “just work” on their Xbox-branded handheld.

This kind of verification system could also help simplify and clarify hardware requirements across different tiers of portable hardware power; any handheld marketed as “level 2” could play any games marketed as level 2 or below, for instance.

Nearly 1 million Windows devices targeted in advanced “malvertising” spree

A broad overview of the four stages. Credit: Microsoft

The campaign targeted “nearly” 1 million devices belonging both to individuals and a wide range of organizations and industries. The indiscriminate approach indicates the campaign was opportunistic, meaning it attempted to ensnare anyone, rather than targeting certain individuals, organizations, or industries. GitHub was the platform primarily used to host the malicious payload stages, but Discord and Dropbox were also used.

The malware located resources on the infected computer and sent them to the attacker’s command-and-control (C2) server. The exfiltrated data included the following browser files, which can store login cookies, passwords, browsing histories, and other sensitive data:

  • AppData\Roaming\Mozilla\Firefox\Profiles\.default-release\cookies.sqlite
  • AppData\Roaming\Mozilla\Firefox\Profiles\.default-release\formhistory.sqlite
  • AppData\Roaming\Mozilla\Firefox\Profiles\.default-release\key4.db
  • AppData\Roaming\Mozilla\Firefox\Profiles\.default-release\logins.json
  • AppData\Local\Google\Chrome\User Data\Default\Web Data
  • AppData\Local\Google\Chrome\User Data\Default\Login Data
  • AppData\Local\Microsoft\Edge\User Data\Default\Login Data
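To see which of these files exist on a given machine, a short script along the following lines could be used. This is a rough sketch, not something from Microsoft's report. The `find_sensitive_files` helper is hypothetical, and the `*` wildcard in the Firefox paths is an assumption that stands in for the per-profile folder prefix.

```python
from pathlib import Path

# Locations named in the list above, relative to a user's home folder.
# Assumption: Firefox profile folders carry a per-profile prefix before
# ".default-release", so that part is matched with a wildcard.
SENSITIVE_GLOBS = [
    "AppData/Roaming/Mozilla/Firefox/Profiles/*.default-release/cookies.sqlite",
    "AppData/Roaming/Mozilla/Firefox/Profiles/*.default-release/formhistory.sqlite",
    "AppData/Roaming/Mozilla/Firefox/Profiles/*.default-release/key4.db",
    "AppData/Roaming/Mozilla/Firefox/Profiles/*.default-release/logins.json",
    "AppData/Local/Google/Chrome/User Data/Default/Web Data",
    "AppData/Local/Google/Chrome/User Data/Default/Login Data",
    "AppData/Local/Microsoft/Edge/User Data/Default/Login Data",
]


def find_sensitive_files(home: Path) -> list[Path]:
    """Return the files under home that match the patterns above."""
    found: list[Path] = []
    for pattern in SENSITIVE_GLOBS:
        found.extend(sorted(home.glob(pattern)))
    return found


if __name__ == "__main__":
    for path in find_sensitive_files(Path.home()):
        print(path)
```

The script only lists files that are present. It does not read them, and finding them says nothing about whether a machine was compromised, since these are ordinary browser data stores.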

Files stored on Microsoft’s OneDrive cloud service were also targeted. The malware also checked for the presence of cryptocurrency wallets including Ledger Live, Trezor Suite, KeepKey, BCVault, OneKey, and BitBox, “indicating potential financial data theft,” Microsoft said.

Microsoft said it suspects the sites hosting the malicious ads were streaming platforms providing unauthorized content. Two of the domains are movies7[.]net and 0123movie[.]art.

Microsoft Defender now detects the files used in the attack, and it’s likely other malware defense apps do the same. Anyone who thinks they may have been targeted can check indicators of compromise at the end of the Microsoft post. The post includes steps users can take to prevent falling prey to similar malvertising campaigns.

Microsoft’s new AI agent can control software and robots

The researchers’ explanations about how “Set-of-Mark” and “Trace-of-Mark” work. Credit: Microsoft Research

The Magma model introduces two technical components: Set-of-Mark, which identifies objects that can be manipulated in an environment by assigning numeric labels to interactive elements, such as clickable buttons in a UI or graspable objects in a robotic workspace, and Trace-of-Mark, which learns movement patterns from video data. Microsoft says those features allow the model to complete tasks like navigating user interfaces or directing robotic arms to grasp objects.
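As a rough illustration of the labeling idea behind Set-of-Mark, and not of Magma's actual implementation, the sketch below assigns numeric marks to the interactive elements of a made-up screen. The `Element` class, its fields, and the `assign_marks` function are hypothetical names. In a real system the elements would come from a vision model, not a hand-written list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A detected screen region; every field here is hypothetical."""
    name: str
    box: tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
    interactive: bool


def assign_marks(elements: list[Element]) -> dict[int, Element]:
    """Give each interactive element a numeric label, starting at 1."""
    actionable = [e for e in elements if e.interactive]
    return dict(enumerate(actionable, start=1))


screen = [
    Element("logo", (0, 0, 80, 40), interactive=False),
    Element("search box", (100, 5, 400, 35), interactive=True),
    Element("submit button", (410, 5, 480, 35), interactive=True),
]

marks = assign_marks(screen)
for mark, element in marks.items():
    print(mark, element.name, element.box)
```

With numbered marks in place, an agent can refer to a target by its label (for example, "click 2") without having to predict raw pixel coordinates.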

Microsoft Magma researcher Jianwei Yang wrote in a Hacker News comment that the name “Magma” stands for “M(ultimodal) Ag(entic) M(odel) at Microsoft (Rese)A(rch),” after some people noted that “Magma” already belongs to an existing matrix algebra library, which could create some confusion in technical discussions.

Reported improvements over previous models

In its Magma write-up, Microsoft claims Magma-8B performs competitively across benchmarks, showing strong results in UI navigation and robot manipulation tasks.

For example, it scored 80.0 on the VQAv2 visual question-answering benchmark—higher than GPT-4V’s 77.2 but lower than LLaVA-Next’s 81.8. Its POPE score of 87.4 leads all models in the comparison. In robot manipulation, Magma reportedly outperforms OpenVLA, an open source vision-language-action model, in multiple robot manipulation tasks.

Magma’s agentic benchmarks, as reported by the researchers. Credit: Microsoft Research

As always, we take AI benchmarks with a grain of salt since many have not been scientifically validated as being able to measure useful properties of AI models. External verification of Microsoft’s benchmark results will become possible once other researchers can access the public code release.

Like all AI models, Magma is not perfect. It still struggles with complex decision-making that requires multiple steps over time, according to Microsoft’s documentation. The company says it continues to work on improving these capabilities through ongoing research.

Yang says Microsoft will release Magma’s training and inference code on GitHub next week, allowing external researchers to build on the work. If Magma delivers on its promise, it could push Microsoft’s AI assistants beyond limited text interactions, enabling them to operate software autonomously and execute real-world tasks through robotics.

Magma is also a sign of how quickly the culture around AI can change. Just a few years ago, this kind of agentic talk scared many people who feared it might lead to AI taking over the world. While some people still fear that outcome, in 2025, AI agents are a common topic of mainstream AI research that regularly takes place without triggering calls to pause all of AI development.

Hugging Face clones OpenAI’s Deep Research in 24 hours

On Tuesday, Hugging Face researchers released an open source AI research agent called “Open Deep Research,” created by an in-house team as a challenge 24 hours after the launch of OpenAI’s Deep Research feature, which can autonomously browse the web and create research reports. The project seeks to match Deep Research’s performance while making the technology freely available to developers.

“While powerful LLMs are now freely available in open-source, OpenAI didn’t disclose much about the agentic framework underlying Deep Research,” writes Hugging Face on its announcement page. “So we decided to embark on a 24-hour mission to reproduce their results and open-source the needed framework along the way!”

Similar to both OpenAI’s Deep Research and Google’s implementation of its own “Deep Research” using Gemini (first introduced in December, before OpenAI’s), Hugging Face’s solution adds an “agent” framework to an existing AI model to allow it to perform multi-step tasks, such as collecting information and building a report as it goes along, which it presents to the user at the end.

The open source clone is already racking up comparable benchmark results. After only a day’s work, Hugging Face’s Open Deep Research has reached 55.15 percent accuracy on the General AI Assistants (GAIA) benchmark, which tests an AI model’s ability to gather and synthesize information from multiple sources. OpenAI’s Deep Research scored 67.36 percent accuracy on the same benchmark.

As Hugging Face points out in its post, GAIA includes complex multi-step questions such as this one:

Which of the fruits shown in the 2008 painting “Embroidery from Uzbekistan” were served as part of the October 1949 breakfast menu for the ocean liner that was later used as a floating prop for the film “The Last Voyage”? Give the items as a comma-separated list, ordering them in clockwise order based on their arrangement in the painting starting from the 12 o’clock position. Use the plural form of each fruit.

To correctly answer that type of question, the AI agent must seek out multiple disparate sources and assemble them into a coherent answer. Many of the questions in GAIA represent no easy task, even for a human, so they test agentic AI’s mettle quite well.

Microsoft 365’s VPN feature will be shut off at the end of the month

Last month, Microsoft announced that it was increasing the prices for consumer Microsoft 365 plans for the first time since introducing them as Office 365 plans more than a decade ago. Microsoft is using new Copilot-branded generative AI features to justify the price increases, which amount to an extra $3 per month or $30 per year for both individual and family plans.

But Microsoft giveth (and chargeth more) and Microsoft taketh away; according to a support page, the company is also removing the “privacy protection” VPN feature from Microsoft 365’s Microsoft Defender app for Windows, macOS, iOS, and Android. Other Defender features, including identity theft protection and anti-malware protection, will continue to be available. Privacy protection will stop functioning on February 28.

Microsoft didn’t say exactly why it was removing the feature, but the company implied that not enough people were using the service.

“We routinely evaluate the usage and effectiveness of our features. As such, we are removing the privacy protection feature and will invest in new areas that will better align to customer needs,” the support note reads.

Cutting features at the same time that you raise prices for the first time ever is not, as they say, a Great Look. But the Defender VPN feature was already a bit limited compared to other dedicated VPN services. It came with a 50GB per user, per month data cap, and it automatically excluded “content heavy traffic from reputable sites” like YouTube, Netflix, Disney+, Amazon Prime, Facebook, Instagram, and WhatsApp.
