software development

developers-joke-about-“coding-like-cavemen”-as-ai-service-suffers-major-outage

Developers joke about “coding like cavemen” as AI service suffers major outage

Growing dependency on AI coding tools

The speed at which news of the outage spread shows how deeply embedded AI coding assistants have already become in modern software development. Claude Code, announced in February and widely launched in May, is Anthropic’s terminal-based coding agent that can perform multi-step coding tasks across an existing code base.

The tool competes with OpenAI’s Codex feature, a coding agent that generates production-ready code in isolated containers, Google’s Gemini CLI, Microsoft’s GitHub Copilot, which itself can use Claude models for code, and Cursor, a popular AI-powered IDE built on VS Code that also integrates multiple AI models, including Claude.

During today’s outage, some developers turned to alternative solutions. “Z.AI works fine. Qwen works fine. Glad I switched,” posted one user on Hacker News. Others joked about reverting to older methods, with one suggesting the “pseudo-LLM experience” could be achieved with a Python package that imports code directly from Stack Overflow.

While AI coding assistants have accelerated development for some users, they’ve also caused problems for others who rely on them too heavily. The emerging practice of so-called “vibe coding“—using natural language to generate and execute code through AI models without fully understanding the underlying operations—has led to catastrophic failures.

In recent incidents, Google’s Gemini CLI destroyed user files while attempting to reorganize them, and Replit’s AI coding service deleted a production database despite explicit instructions not to modify code. These failures occurred when the AI models confabulated successful operations and built subsequent actions on false premises, highlighting the risks of depending on AI assistants that can misinterpret file structures or fabricate data to hide their errors.

Wednesday’s outage served as a reminder that as dependency on AI grows, even minor service disruptions can become major events that affect an entire profession. But perhaps that could be a good thing if it’s an excuse to take a break from a stressful workload. As one commenter joked, it might be “time to go outside and touch some grass again.”

Developers joke about “coding like cavemen” as AI service suffers major outage Read More »

in-xcode-26,-apple-shows-first-signs-of-offering-chatgpt-alternatives

In Xcode 26, Apple shows first signs of offering ChatGPT alternatives

The latest Xcode beta contains clear signs that Apple plans to bring Anthropic’s Claude and Opus large language models into the integrated development environment (IDE), expanding on features already available using Apple’s own models or OpenAI’s ChatGPT.

Apple enthusiast publication 9to5Mac “found multiple references to built-in support for Anthropic accounts,” including in the “Intelligence” menu, where users can currently log in to ChatGPT or enter an API key for higher message limits.

Apple introduced a suite of features meant to compete with GitHub Copilot in Xcode at WWDC24, but first focused on its own models and a more limited set of use cases. That expanded quite a bit at this year’s developer conference, and users can converse about codebases, discuss changes, or ask for suggestions using ChatGPT. They are initially given a limited set of messages, but this can be greatly increased by logging in to a ChatGPT account or entering an API key.

This summer, Apple said it would be possible to use Anthropic’s models with an API key, too, but made no mention of support for Anthropic accounts, which are generally more cost-effective than using the API for most users.

In Xcode 26, Apple shows first signs of offering ChatGPT alternatives Read More »

ai-isn’t-ready-to-replace-human-coders-for-debugging,-researchers-say

AI isn’t ready to replace human coders for debugging, researchers say

A graph showing agents with tools nearly doubling the success rates of those without, but still achieving a success score under 50 percent

Agents using debugging tools drastically outperformed those that didn’t, but their success rate still wasn’t high enough. Credit: Microsoft Research

This approach is much more successful than relying on the models as they’re usually used, but when your best case is a 48.4 percent success rate, you’re not ready for primetime. The limitations are likely because the models don’t fully understand how to best use the tools, and because their current training data is not tailored to this use case.

“We believe this is due to the scarcity of data representing sequential decision-making behavior (e.g., debugging traces) in the current LLM training corpus,” the blog post says. “However, the significant performance improvement… validates that this is a promising research direction.”

This initial report is just the start of the efforts, the post claims.  The next step is to “fine-tune an info-seeking model specialized in gathering the necessary information to resolve bugs.” If the model is large, the best move to save inference costs may be to “build a smaller info-seeking model that can provide relevant information to the larger one.”

This isn’t the first time we’ve seen outcomes that suggest some of the ambitious ideas about AI agents directly replacing developers are pretty far from reality. There have been numerous studies already showing that even though an AI tool can sometimes create an application that seems acceptable to the user for a narrow task, the models tend to produce code laden with bugs and security vulnerabilities, and they aren’t generally capable of fixing those problems.

This is an early step on the path to AI coding agents, but most researchers agree it remains likely that the best outcome is an agent that saves a human developer a substantial amount of time, not one that can do everything they can do.

AI isn’t ready to replace human coders for debugging, researchers say Read More »

amid-a-flurry-of-hype,-microsoft-reorganizes-entire-dev-team-around-ai

Amid a flurry of hype, Microsoft reorganizes entire dev team around AI

Microsoft CEO Satya Nadella has announced a dramatic restructuring of the company’s engineering organization, which is pivoting the company’s focus to developing the tools that will underpin agentic AI.

Dubbed “CoreAI – Platform and Tools,” the new division rolls the existing AI platform team and the previous developer division (responsible for everything from .NET to Visual Studio) along with some other teams into one big group.

As for what this group will be doing specifically, it’s basically everything that’s mission-critical to Microsoft in 2025, as Nadella tells it:

This new division will bring together Dev Div, AI Platform, and some key teams from the Office of the CTO (AI Supercomputer, AI Agentic Runtimes, and Engineering Thrive), with the mission to build the end-to-end Copilot & AI stack for both our first-party and third-party customers to build and run AI apps and agents. This group will also build out GitHub Copilot, thus having a tight feedback loop between the leading AI-first product and the AI platform to motivate the stack and its roadmap.

To accomplish all that, “Jay Parikh will lead this group as EVP.” Parikh was hired by Microsoft in October; he previously worked as the VP and global head of engineering at Meta.

The fact that the blog post doesn’t say anything about .NET or Visual Studio, instead emphasizing GitHub Copilot and anything and everything related to agentic AI, says a lot about how Nadella sees Microsoft’s future priorities.

So-called AI agents are applications that are given specified boundaries (action spaces) and a large memory capacity to independently do subsets of the kinds of work that human office workers do today. Some company leaders and AI commentators believe these agents will outright replace jobs, while others are more conservative, suggesting they’ll simply be powerful tools to streamline the jobs people already have.

Amid a flurry of hype, Microsoft reorganizes entire dev team around AI Read More »