
Can today’s AI video models accurately model how the real world works?

But on other tasks, the model showed much more variable results. When asked to generate a video highlighting a specific written character on a grid, for instance, the model failed in nine out of 12 trials. When asked to model a Bunsen burner turning on and burning a piece of paper, it similarly failed nine out of 12 times. When asked to solve a simple maze, it failed in 10 of 12 trials. When asked to sort numbers by popping labeled bubbles in order, it failed a whopping 11 out of 12 times.

For the researchers, though, all of the above examples aren’t evidence of failure but instead a sign of the model’s capabilities. To be listed under the paper’s “failure cases,” Veo 3 had to fail a tested task across all 12 trials, which happened in 16 of the 62 tasks tested. For the rest, the researchers write that “a success rate greater than 0 suggests that the model possesses the ability to solve the task.”

Thus, failing 11 out of 12 trials of a certain task is considered evidence of the model’s capabilities in the paper. That evidence of the model “possess[ing] the ability to solve the task” includes 18 tasks where the model failed in more than half of its 12 trial runs and another 14 where it failed in 25 to 50 percent of trials.
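To make the paper’s bucketing concrete, here’s a minimal sketch of that categorization in Python, using hypothetical per-task success counts rather than the study’s actual data:

```python
# Rough sketch of the paper's bucketing logic, using hypothetical per-task
# success counts out of 12 trials (not the study's actual data).
def bucket(successes: int, trials: int = 12) -> str:
    if successes == 0:
        return "failure case (0/12)"          # only these count as failures in the paper
    failure_rate = (trials - successes) / trials
    if failure_rate > 0.5:
        return "failed in more than half of trials"
    if failure_rate >= 0.25:
        return "failed in 25-50 percent of trials"
    return "mostly succeeded"

hypothetical_tasks = {"maze": 2, "bubble sort": 1, "Bunsen burner": 3, "rotation": 11}
for task, wins in hypothetical_tasks.items():
    print(f"{task}: {wins}/12 successes -> {bucket(wins)}")
```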

Past results, future performance

Yes, in all of these cases, the model did technically demonstrate the capability being tested at some point. But the model’s inability to perform those tasks reliably means that, in practice, it won’t be performant enough for most use cases. Any future models that could become “unified, generalist vision foundation models” will have to be able to succeed much more consistently on these kinds of tests.


Google’s Gemini-powered smart home revamp is here with a new app and cameras


Google promises a better smart home experience thanks to Gemini.

Google’s new Nest cameras keep the same look. Credit: Google

Google’s products and services have been flooded with AI features over the past couple of years, but the smart home has been largely spared until now. The company’s plans to replace Assistant are moving forward with a big Google Home reset. We’ve been told over and over that generative AI will do incredible things when given enough data, and here’s the test.

There’s a new Home app with Gemini intelligence throughout the experience, updated subscriptions, and even some new hardware. The revamped Home app will allegedly gain deeper insights into what happens in your home, unlocking advanced video features and conversational commands. It demos well, but will it make smart home tech less or more frustrating?

A new Home

You may have already seen some elements of the revamped Home experience percolating to the surface, but that process begins in earnest today. The new app apparently boosts speed and reliability considerably, with camera feeds loading 70 percent faster and with 80 percent fewer app crashes. The app will also bring new Gemini features, some of which are free. Google’s new Home subscription retains the same price as the old Nest subs, but naturally, there’s a lot more AI.

Google claims that Gemini will make your smart home easier to monitor and manage. All that video streaming from your cameras churns through the AI, which interprets the goings-on. As a result, you get features like AI-enhanced notifications that give you more context about what your cameras saw. For instance, your notifications will include descriptions of activity, and Home Brief will summarize everything that happens each day.

The new Home app has a simpler three-tab layout. Credit: Google

Conversational interaction is also a big part of this update. In the Home app, subscribers will see a new Ask Home bar where you can input natural language queries. For example, you could ask if a certain person has left or returned home, or whether or not your package showed up. At least, that’s what’s supposed to happen—generative AI can get things wrong.

The new app comes with new subscriptions based around AI, but the tiers don’t cost any more than the old Nest plans, and they include all the same video features. The base $10 subscription, now known as Standard, includes 30 days of video event history, along with Gemini automation features and the “intelligent alerts” Home has used for a while that can alert you to packages, familiar faces, and so on. The $20 subscription is becoming Home Advanced, which adds the conversational Ask Home feature in the app, AI notifications, AI event descriptions, and a new “Home Brief.” It also still offers 60 days of events and 10 days of 24/7 video history.

Gemini is supposed to help you keep tabs on what’s happening at home. Credit: Google

Free users still get saved event video history, and it’s been boosted from three hours to six. For free users and those on the $10 Standard plan, the Ask Home bar that is persistent across the app becomes a quick search, which surfaces devices and settings.

If you’re already subscribing to Google’s AI services, this change could actually save you some cash. Anyone with Google AI Pro (a $20 sub) will get Home Standard for free. If you’re paying for the lavish $250 per month AI Ultra plan, you get Home Advanced at no additional cost.

A proving ground for AI

You may have gotten used to Assistant over the past decade in spite of its frequent feature gaps, but you’ll have to leave it behind. Gemini for Home will be taking over beginning this month in early access. The full release will come later, but Google intends to deliver the Gemini-powered smart home experience to as many users as possible.

Gemini will replace Assistant on every first-party Google Home device, going all the way back to the original 2016 Google Home. You’ll be able to have live chats with Gemini via your smart speakers and make more complex smart home queries. Google is making some big claims about contextual understanding here.

If Google’s embrace of generative AI pays off, we’ll see it here. Credit: Google

If you’ve used Gemini Live, the new Home interactions will seem familiar. You can ask Gemini anything you want via your smart speakers, perhaps getting help with a recipe or an appliance issue. However, the robot will sometimes keep talking long past the point of being helpful. As with Gemini Live, you just have to interrupt it. Google also promises a selection of improved voices for you to interrupt.

If you want to get early access to the new Gemini Home features, you can sign up in the Home app settings. Just look for the “Early access” option. Google doesn’t guarantee access on a specific timeline, but the first people will be allowed to try the new Gemini Home this month.

New AI-first hardware

It has been four years since Google released new smart home devices, but the era of Gemini brings some new hardware. There are three new cameras, all with 2K image sensors. The new Nest Indoor Camera will retail for $100, and the Nest Outdoor Camera will cost $150 (or $250 in a two-pack). There’s also a new Nest Doorbell, which requires a wired connection, for $180.

Google says these cameras were designed with generative AI in mind. The sensor choice allows for good detail even if you need to digitally zoom in, but the video feed is still small enough to be ingested by Google’s AI models as it’s created. This is what gives the new Home app the ability to provide rich updates on your smart home.

The new Nest Doorbell looks familiar. Credit: Google

You may also notice there are no battery-powered models in the new batch. Again, that’s because of AI. A battery-powered camera wakes up only momentarily when the system logs an event, but this approach isn’t as useful for generative AI. Providing the model with an ongoing video stream gives it better insights into the scene and, theoretically, produces better insights for the user.

All the new cameras are available for order today, but Google has one more device queued up for a later release. The “Google Home Speaker” is Google’s first smart speaker release since 2020’s Nest Audio. This device is smaller than the Nest Audio but larger than the Nest Mini speakers. It supports 360-degree audio with custom on-device processing that reportedly makes conversing with Gemini smoother. It can also be paired with the Google TV Streamer for home theater audio. It will be available this coming spring for $99.

The new Google Home Speaker comes out next spring. Credit: Ryan Whitwam

Google Home will continue to support a wide range of devices, but most of them won’t connect to all the advanced Gemini AI features. However, that could change. Google has also announced a new program for partners to build devices that work with Gemini alongside the Nest cameras. Devices built with the new Google Camera embedded SDK will begin appearing in the coming months, but Walmart’s Onn brand has two ready to go. The Onn Indoor camera retails for $22.96 and the Onn Video Doorbell is $49.86. Both cameras are 1080p resolution and will talk to Gemini just like Google’s cameras. So you may have more options to experience Google’s vision for the AI home of the future.

Ryan Whitwam is a senior technology reporter at Ars Technica, covering the ways Google, AI, and mobile technology continue to change the world. Over his 20-year career, he’s written for Android Police, ExtremeTech, Wirecutter, NY Times, and more. He has reviewed more phones than most people will ever own. You can follow him on Bluesky, where you will see photos of his dozens of mechanical keyboards.


California’s newly signed AI law just gave Big Tech exactly what it wanted

On Monday, California Governor Gavin Newsom signed the Transparency in Frontier Artificial Intelligence Act into law, requiring AI companies to disclose their safety practices while stopping short of mandating actual safety testing. The law requires companies with annual revenues of at least $500 million to publish safety protocols on their websites and report incidents to state authorities, but it lacks the stronger enforcement teeth of the bill Newsom vetoed last year after tech companies lobbied heavily against it.

The legislation, S.B. 53, replaces Senator Scott Wiener’s previous attempt at AI regulation, S.B. 1047, which would have required safety testing and “kill switches” for AI systems. Instead, the new law asks companies to describe how they incorporate “national standards, international standards, and industry-consensus best practices” into their AI development, without specifying what those standards are or requiring independent verification.

“California has proven that we can establish regulations to protect our communities while also ensuring that the growing AI industry continues to thrive,” Newsom said in a statement, though the law’s actual protective measures remain largely voluntary beyond basic reporting requirements.

According to the California state government, the state houses 32 of the world’s top 50 AI companies, and more than half of global venture capital funding for AI and machine learning startups went to Bay Area companies last year. So while the recently signed bill is state-level legislation, what happens in California’s AI regulation will have a much wider impact, both by setting legislative precedent and by affecting companies that craft AI systems used around the world.

Transparency instead of testing

Where the vetoed SB 1047 would have mandated safety testing and kill switches for AI systems, the new law focuses on disclosure. Companies must report what the state calls “potential critical safety incidents” to California’s Office of Emergency Services and provide whistleblower protections for employees who raise safety concerns. The law defines catastrophic risk narrowly as incidents potentially causing 50+ deaths or $1 billion in damage through weapons assistance, autonomous criminal acts, or loss of control. The attorney general can levy civil penalties of up to $1 million per violation for noncompliance with these reporting requirements.


Big AI firms pump money into world models as LLM advances slow

Runway, a video generation start-up that has deals with Hollywood studios, including Lionsgate, launched a product last month that uses world models to create gaming settings, with personalized stories and characters generated in real time.

“Traditional video methods [are a] brute-force approach to pixel generation, where you’re trying to squeeze motion in a couple of frames to create the illusion of movement, but the model actually doesn’t really know or reason about what’s going on in that scene,” said Cristóbal Valenzuela, chief executive officer at Runway.

Previous video-generation models had physics that were unlike the real world, he added, which general-purpose world model systems help to address.

To build these models, companies need to collect a huge amount of physical data about the world.

San Francisco-based Niantic has mapped 10 million locations, gathering information through games including Pokémon Go, which has 30 million monthly players interacting with a global map.

Niantic ran Pokémon Go for nine years and, even after the game was sold to US-based Scopely in June, its players still contribute anonymized data through scans of public landmarks to help build its world model.

“We have a running start at the problem,” said John Hanke, chief executive of Niantic Spatial, as the company is now called following the Scopely deal.

Both Niantic and Nvidia are working on filling gaps by getting their world models to generate or predict environments. Nvidia’s Omniverse platform creates and runs such simulations, assisting the $4.3 trillion tech giant’s push toward robotics and building on its long history of simulating real-world environments in video games.

Nvidia Chief Executive Jensen Huang has asserted that the next major growth phase for the company will come with “physical AI,” with the new models revolutionizing the field of robotics.

Some, such as Meta’s LeCun, have said this vision of a new generation of AI systems powering machines with human-level intelligence could take 10 years to achieve.

But the potential scope of the cutting-edge technology is extensive, according to AI experts. World models “open up the opportunity to service all of these other industries and amplify the same thing that computers did for knowledge work,” said Nvidia’s Lebaredian.

Additional reporting by Melissa Heikkilä in London and Michael Acton in San Francisco.

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.


YouTube Music is testing AI hosts that will interrupt your tunes

YouTube has a new Labs program, allowing listeners to “discover the next generation of YouTube.” In case you were wondering, that generation is apparently all about AI. The streaming site says Labs will offer a glimpse of the AI features it’s developing for YouTube Music, and it starts with AI “hosts” that will chime in while you’re listening to music. Yes, really.

The new AI music hosts are supposed to provide a richer listening experience, according to YouTube. As you’re listening to tunes, the AI will generate audio snippets similar to, but shorter than, the fake podcasts you can create in NotebookLM. The “Beyond the Beat” host will break in every so often with relevant stories, trivia, and commentary about your musical tastes. YouTube says this feature will appear when you are listening to mixes and radio stations.

The experimental feature is intended to be a bit like having a radio host drop some playful banter while cueing up the next song. It sounds a bit like Spotify’s AI DJ, but the YouTube AI doesn’t create playlists like Spotify’s robot. This is still generative AI, which comes with the risk of hallucinations and low-quality slop, neither of which belongs in your music. That said, Google’s Audio Overviews are often surprisingly good in small doses.


Google DeepMind unveils its first “thinking” robotics AI

Imagine that you want a robot to sort a pile of laundry into whites and colors. Gemini Robotics-ER 1.5 would process the request along with images of the physical environment (a pile of clothing). This AI can also call tools like Google search to gather more data. The ER model then generates natural language instructions, specific steps that the robot should follow to complete the given task.

The two new models work together to “think” about how to complete a task. Credit: Google

Gemini Robotics 1.5 (the action model) takes these instructions from the ER model and generates robot actions while using visual input to guide its movements. But it also goes through its own thinking process to consider how to approach each step. “There are all these kinds of intuitive thoughts that help [a person] guide this task, but robots don’t have this intuition,” said DeepMind’s Kanishka Rao. “One of the major advancements that we’ve made with 1.5 in the VLA is its ability to think before it acts.”
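In rough terms, the division of labor between the two models looks like the loop below. This is a hypothetical sketch for illustration only; the class names and methods are invented stand-ins, not Google’s actual APIs.

```python
# Hypothetical sketch of the two-model pipeline DeepMind describes: an "ER"
# planner turns a request plus camera images into natural-language steps, and
# an action model turns each step into motor commands while watching the scene.
# Every class and method here is an invented stand-in, not a real Google API.

class PlannerModel:                      # stands in for Gemini Robotics-ER 1.5
    def plan(self, request, image):
        # In reality, this model can also call tools like web search for context.
        return ["pick up a white item", "place it in the whites pile",
                "pick up a colored item", "place it in the colors pile"]

class ActionModel:                       # stands in for Gemini Robotics 1.5 (the VLA)
    def think_and_act(self, step, image):
        thought = f"considering how to: {step}"   # intermediate reasoning before acting
        action = {"gripper": "close", "target": step}
        return thought, action

def run_task(request, planner, actor, get_image, send_to_robot):
    for step in planner.plan(request, get_image()):
        thought, action = actor.think_and_act(step, get_image())
        print("model reasoning:", thought)        # the "thinking" surfaced for inspection
        send_to_robot(action)                     # low-level commands for Aloha 2, Apollo, etc.

run_task("sort the laundry into whites and colors",
         PlannerModel(), ActionModel(),
         get_image=lambda: "camera frame",
         send_to_robot=print)
```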

Both of DeepMind’s new robotic AIs are built on the Gemini foundation models but have been fine-tuned with data that adapts them to operating in a physical space. This approach, the team says, gives robots the ability to undertake more complex multi-stage tasks, bringing agentic capabilities to robotics.

The DeepMind team tests Gemini Robotics with a few different machines, like the two-armed Aloha 2 and the humanoid Apollo. In the past, AI researchers had to create customized models for each robot, but that’s no longer necessary. DeepMind says that Gemini Robotics 1.5 can learn across different embodiments, transferring skills learned from Aloha 2’s grippers to the more intricate hands on Apollo with no specialized tuning.

All this talk of physical agents powered by AI is fun, but we’re still a long way from a robot you can order to do your laundry. Gemini Robotics 1.5, the model that actually controls robots, is still only available to trusted testers. However, the thinking ER model is now rolling out in Google AI Studio, allowing developers to generate robotic instructions for their own physically embodied robotic experiments.


DeepMind’s robotic ballet: An AI for coordinating manufacturing robots


An AI figures out how robots can get jobs done without getting in each other’s way.

A lot of the stuff we use today is largely made by robots—arms with multiple degrees of freedom positioned along conveyor belts that move in a spectacle of precisely synchronized motions. All this motion is usually programmed by hand, which can take hundreds to thousands of hours. Google’s DeepMind team has developed an AI system called RoboBallet that lets manufacturing robots figure out what to do on their own.

Traveling salesmen

Planning what manufacturing robots should do to get their jobs done efficiently is really hard to automate. You need to solve both task allocation and scheduling—deciding which task should be done by which robot in what order. It’s like the famous traveling salesman problem on steroids. On top of that, there is the question of motion planning; you need to make sure all these robotic arms won’t collide with each other or with all the gear standing around them.

In the end, you’re facing myriad possible combinations, and you’ve got to solve not one but three computationally hard problems at the same time. “There are some tools that let you automate motion planning, but task allocation and scheduling are usually done manually,” says Matthew Lai, a research engineer at Google DeepMind. “Solving all three of these problems combined is what we tackled in our work.”

Lai’s team started by generating simulated samples of what are called work cells, areas where teams of robots perform their tasks on a product being manufactured. The work cells contained something called a workpiece, a product on which the robots do work, in this case something to be constructed of aluminum struts placed on a table. Around the table, there were up to eight randomly placed Franka Panda robotic arms, each with 7 degrees of freedom, that were supposed to complete up to 40 tasks on a workpiece. Every task required a robotic arm’s end effector to get within 2.5 centimeters of the right spot on the right strut, approached from the correct angle, then stay there, frozen, for a moment. The pause simulates doing some work.

To make things harder, the team peppered every work cell with random obstacles the robots had to avoid. “We chose to work with up to eight robots, as this is around the sensible maximum for packing robots closely together without them blocking each other all the time,” Lai explains. Forcing the robots to perform 40 tasks on a workpiece was also something the team considered representative of what’s required at real factories.

A setup like this would be a nightmare to tackle using even the most powerful reinforcement-learning algorithms. Lai and his colleagues found a way around it by turning it all into graphs.

Complex relationships

Graphs in Lai’s model comprised nodes and edges. Things like robots, tasks, and obstacles were treated as nodes. Relationships between them were encoded as either one-directional or bidirectional edges. One-directional edges connected robots with tasks and obstacles because the robots needed information about where the obstacles were and whether the tasks were completed or not. Bidirectional edges connected the robots to each other, because each robot had to know what other robots were doing at each time step to avoid collisions or duplicating tasks.
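As a rough illustration of that structure, the bookkeeping might look like the sketch below; the node names and counts are hypothetical, not DeepMind’s code, but they show why robot-to-robot connections are the part that grows fastest as you add arms.

```python
from itertools import combinations, product

def build_workcell_graph(n_robots, n_tasks, n_obstacles):
    """Hypothetical sketch of the work-cell graph described above, not DeepMind's code.
    Robots, tasks, and obstacles are nodes; tasks and obstacles feed one-directional
    edges into robots, and every pair of robots is connected bidirectionally."""
    robots = [f"robot{i}" for i in range(n_robots)]
    tasks = [f"task{i}" for i in range(n_tasks)]
    obstacles = [f"obstacle{i}" for i in range(n_obstacles)]

    one_way = [(source, robot) for source, robot in product(tasks + obstacles, robots)]
    two_way = list(combinations(robots, 2))   # each robot pair, information flowing both ways
    return robots + tasks + obstacles, one_way, two_way

nodes, one_way, two_way = build_workcell_graph(n_robots=8, n_tasks=40, n_obstacles=10)
print(len(nodes), "nodes,", len(one_way), "one-way edges,", len(two_way), "robot pairs")
# Task and obstacle edges grow linearly with their counts; robot pairs grow
# quadratically (8 robots -> 28 pairs, 16 robots -> 120 pairs).
```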

To read and make sense of the graphs, the team used graph neural networks, a type of artificial intelligence designed to extract relationships between the nodes by passing messages along the edges of the connections among them. This decluttered the data, allowing the researchers to design a system that focused exclusively on what mattered most: finding the most efficient ways to complete tasks while navigating obstacles. After a few days of training on randomly generated work cells using a single Nvidia A100 GPU, the new industrial planning AI, called RoboBallet, could lay out seemingly viable trajectories through complex, previously unseen environments in a matter of seconds.

Most importantly, though, it scaled really well.

Economy of scale

The problem with applying traditional computational methods to complex problems like managing robots at a factory is that the challenge of computation grows exponentially with the number of items in your system. Computing optimal trajectories for one robot is relatively simple. Doing the same for two is considerably harder; when the number grows to eight, the problem becomes practically intractable.

With RoboBallet, the complexity of computation also grew with the complexity of the system, but at a far slower rate. (The computations grew linearly with the growing number of tasks and obstacles, and quadratically with the number of robots.) According to the team, these computations should make the system feasible for industrial-scale use.

The team wanted to test, however, whether the plans their AI was producing were any good. To check that, Lai and his colleagues computed optimal task allocations, schedules, and motions in a few simplified work cells and compared those with results delivered by RoboBallet. In terms of execution time, arguably the most important metric in manufacturing, the AI came very close to what human engineers could do. It wasn’t better than they were—it just provided an answer more quickly.

The team also tested RoboBallet plans on a real-world physical setup of four Panda robots working on an aluminum workpiece, and they worked just as well as in simulations. But Lai says it can do more than just speed up the process of programming robots.

Limping along

RoboBallet, according to DeepMind’s team, also enables us to design better work cells. “Because it works so fast, it would be possible for a designer to try different layouts and different placement or selections of robots in almost real time,” Lai says. This way, engineers at factories would be able to see exactly how much time they would save by adding another robot to a cell or choosing a robot of a different type. Another thing RoboBallet can do is reprogram the work cell on the fly, allowing other robots to fill in when one of them breaks down.

Still, there are a few things that need ironing out before RoboBallet can come to factories. “There are several simplifications we made,” Lai admits. The first was that the obstacles were decomposed into cuboids. Even the workpiece itself was cubical. While this was somewhat representative of the obstacles and equipment in real factories, there are lots of possible workpieces with more organic shapes. “It would be better to represent those in a more flexible way, like mesh graphs or point clouds,” Lai says. This, however, would likely mean a drop in RoboBallet’s blistering speed.

Another thing is that the robots in Lai’s experiments were identical, while in a real-world work cell, robotic teams are quite often heterogeneous. “That’s why real-world applications would require additional research and engineering specific to the type of application,” Lai says. He adds, though, that the current RoboBallet is already designed with such adaptations in mind—it can be easily extended to support them. And once that’s done, his hope is that it will make factories faster and way more flexible.

“The system would have to be given work cell models, the workpiece models, as well as the list of tasks that need to be done—based on that, RoboBallet would be able to generate a complete plan,” Lai says.

Science Robotics, 2025. DOI: 10.1126/scirobotics.ads1204

Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.


When “no” means “yes”: Why AI chatbots can’t process Persian social etiquette

If an Iranian taxi driver waves away your payment, saying, “Be my guest this time,” accepting their offer would be a cultural disaster. They expect you to insist on paying—probably three times—before they’ll take your money. This dance of refusal and counter-refusal, called taarof, governs countless daily interactions in Persian culture. And AI models are terrible at it.

New research released earlier this month titled “We Politely Insist: Your LLM Must Learn the Persian Art of Taarof” shows that mainstream AI language models from OpenAI, Anthropic, and Meta fail to absorb these Persian social rituals, correctly navigating taarof situations only 34 to 42 percent of the time. Native Persian speakers, by contrast, get it right 82 percent of the time. This performance gap persists across large language models such as GPT-4o, Claude 3.5 Haiku, Llama 3, DeepSeek V3, and Dorna, a Persian-tuned variant of Llama 3.

A study led by Nikta Gohari Sadr of Brock University, along with researchers from Emory University and other institutions, introduces “TAAROFBENCH,” the first benchmark for measuring how well AI systems reproduce this intricate cultural practice. The researchers’ findings show how recent AI models default to Western-style directness, completely missing the cultural cues that govern everyday interactions for millions of Persian speakers worldwide.

“Cultural missteps in high-consequence settings can derail negotiations, damage relationships, and reinforce stereotypes,” the researchers write. For AI systems increasingly used in global contexts, that cultural blindness could represent a limitation that few in the West realize exists.

A taarof scenario diagram from TAAROFBENCH, devised by the researchers. Each scenario defines the environment, location, roles, context, and user utterance. Credit: Sadr et al.

“Taarof, a core element of Persian etiquette, is a system of ritual politeness where what is said often differs from what is meant,” the researchers write. “It takes the form of ritualized exchanges: offering repeatedly despite initial refusals, declining gifts while the giver insists, and deflecting compliments while the other party reaffirms them. This ‘polite verbal wrestling’ (Rafiee, 1991) involves a delicate dance of offer and refusal, insistence and resistance, which shapes everyday interactions in Iranian culture, creating implicit rules for how generosity, gratitude, and requests are expressed.”


YouTube will restore channels banned for COVID and election misinformation

It’s not exactly hard to find politically conservative content on YouTube, but the platform may soon skew even further to the right. YouTube parent Alphabet has confirmed that it will restore channels that were banned in recent years for spreading misinformation about COVID-19 and elections. Alphabet says it values free expression and political debate, placing the blame for its previous moderation decisions on the Biden administration.

Alphabet made this announcement via a lengthy letter to Rep. Jim Jordan (R-Ohio). The letter, a response to subpoenas from the House Judiciary Committee, explains in no uncertain terms that the company is taking a more relaxed approach to moderating political content on YouTube.

For starters, Alphabet denies that its products and services are biased toward specific viewpoints and says that it “appreciates the accountability” provided by the committee. The cloying missive goes on to explain that Google didn’t really want to ban all those accounts, but Biden administration officials just kept asking. Now that the political tables have turned, Google is looking to dig itself out of this hole.

According to Alphabet’s version of events, misinformation such as telling people to drink bleach to cure COVID wasn’t initially against its policies. However, Biden officials repeatedly asked YouTube to take action. YouTube complied, banning COVID misinformation sitewide until 2024, a year longer than its crackdown on election conspiracy theories lasted. Alphabet says that today, YouTube’s rules permit a “wider range of content.”

In an apparent attempt to smooth things over with the Republican-controlled House Judiciary Committee, YouTube will restore the channels banned for COVID and election misinformation. This includes prominent conservatives like Dan Bongino, who is now the Deputy Director of the FBI, and White House counterterrorism chief Sebastian Gorka.


Google Play is getting a Gemini-powered AI Sidekick to help you in games

The era of Google Play’s unrivaled dominance may be coming to an end in the wake of the company’s antitrust loss, but Google’s app store isn’t going anywhere. In fact, the Play Store experience is getting a massive update with more personalization, content, and yes, AI. This is Google, after all.

The revamped Google Play Games is a key part of this update. Gamer profiles will now have a public face, allowing you to interact with other players if you choose. Play Games will track your activity for daily streaks, which will be shown on your profile and unlock new Play Points rewards. Your profile will also display your in-game achievements.

Your gaming exploits can also span multiple platforms. Google Play Games for PC is officially leaving beta. Google says there are now 200,000 games that work across mobile and PC, and even more PC-friendly titles, like Deep Rock Galactic: Survivor, are on the way. Your stats and streaks will apply across both mobile and PC as long as the title comes from the Play Store.

At the core of Google’s app store revamp is the You Tab, which will soon take its place in the main navigation bar of the Play Store. This page will show your rewards, subscriptions, game stats, and more—and it goes beyond gaming. The You Tab will recommend a variety of content on Google Play, including books and podcasts.


EU investigates Apple, Google, and Microsoft over handling of online scams

The EU is set to scrutinize whether Apple, Google, and Microsoft are failing to adequately police financial fraud online as it steps up efforts to regulate how Big Tech operates.

The EU’s tech chief Henna Virkkunen told the Financial Times that on Tuesday, the bloc’s regulators would send formal requests for information to the three US Big Tech groups, as well as global accommodation platform Booking Holdings, under powers granted by the Digital Services Act to tackle financial scams.

“We see that more and more criminal actions are taking place online,” Virkkunen said. “We have to make sure that online platforms really take all their efforts to detect and prevent that kind of illegal content.”

The move, which could later lead to a formal investigation and potential fines against the companies, comes amid transatlantic tensions over the EU’s digital rulebook. US President Donald Trump has threatened to punish countries that “discriminate” against US companies with higher tariffs.

Virkkunen stressed that the commission looked at the operations of individual companies, rather than where they were based. She will scrutinize how Apple and Google are handling fake applications in their app stores, such as fake banking apps.

She said regulators would also look at fake search results in the search engines of Google and Microsoft’s Bing. The bloc wants to have more information about the approach Booking Holdings, whose biggest subsidiary Booking.com is based in Amsterdam, is taking to fake accommodation listings. It is the only Europe-based company among the four set to be scrutinized.


A history of the Internet, part 3: The rise of the user


the best of times, the worst of times

The reins of the Internet are handed over to ordinary users—with uneven results.

Everybody get together. Credit: D3Damon/Getty Images

Welcome to the final article in our three-part series on the history of the Internet. If you haven’t already, catch up with part one and part two.

As a refresher, here’s the story so far:

The ARPANET was a project started by the Defense Department’s Advanced Research Projects Agency in 1969 to network different mainframe computers together across the country. It later evolved into the Internet, connecting multiple global networks together using a common TCP/IP protocol. By the late 1980s, a small group of academics and a few curious consumers connected to each other on the Internet, which was still mostly text-based.

In 1991, Tim Berners-Lee invented the World Wide Web, an Internet-based hypertext system designed for graphical interfaces. At first, it ran only on the expensive NeXT workstation. But when Berners-Lee published the web’s protocols and made them available for free, people built web browsers for many different operating systems. The most popular of these was Mosaic, written by Marc Andreessen, who formed a company to create its successor, Netscape. Microsoft responded with Internet Explorer, and the browser wars were on.

The web grew exponentially, and so did the hype surrounding it. It peaked in early 2001, right before the dotcom collapse that left most web-based companies nearly or completely bankrupt. Some people interpreted this crash as proof that the consumer Internet was just a fad. Others had different ideas.

Larry Page and Sergey Brin met each other at a graduate student orientation at Stanford in 1996. Both were studying for their PhDs in computer science, and both were interested in analyzing large sets of data. Because the web was growing so rapidly, they decided to start a project to improve the way people found information on the Internet.

They weren’t the first to try this. Hand-curated sites like Yahoo had already given way to more algorithmic search engines like AltaVista and Excite, which both started in 1995. These sites attempted to find relevant webpages by analyzing the words on every page.

Page and Brin’s technique was different. Their “BackRub” software created a map of all the links that pages had to each other. Pages on a given subject that had many incoming links from other sites were given a higher ranking for that keyword. Higher-ranked pages could then contribute a larger score to any pages they linked to. In a sense, this was like a crowdsourcing of search: When people put “This is a good place to read about alligators” on a popular site and added a link to a page about alligators, it did a better job of determining that page’s relevance than simply counting the number of times the word appeared on a page.

Step 1 of the simplified BackRub algorithm. It also stores the position of each word on a page, so it can make a further subset for multiple words that appear next to each other. Credit: Jeremy Reimer
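The approach is essentially what later became known as PageRank, and a toy version is short enough to show here. This is a simplified sketch with made-up pages, for illustration only, not Google’s production algorithm:

```python
# Toy version of the link-analysis idea behind BackRub/PageRank: a page's score
# comes from the scores of the pages linking to it, split across their outgoing
# links. A simplified illustration, not Google's actual algorithm.
links = {                      # page -> pages it links to (made-up examples)
    "zoo.example": ["alligators.example"],
    "blog.example": ["alligators.example", "zoo.example"],
    "alligators.example": ["zoo.example"],
}
pages = list(links)
rank = {page: 1.0 / len(pages) for page in pages}

damping = 0.85                 # standard damping factor: some score is spread evenly
for _ in range(30):            # iterate until the scores settle
    new_rank = {page: (1 - damping) / len(pages) for page in pages}
    for page, outgoing in links.items():
        for target in outgoing:
            new_rank[target] += damping * rank[page] / len(outgoing)
    rank = new_rank

print(sorted(rank.items(), key=lambda item: -item[1]))  # highest-ranked pages first
```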

Creating a connected map of the entire World Wide Web with indexes for every word took a lot of computing power. The pair filled their dorm rooms with any computers they could find, paid for by a $10,000 grant from the Stanford Digital Libraries Project. Many were cobbled together from spare parts, including one with a case made from imitation LEGO bricks. Their web scraping project was so bandwidth-intensive that it briefly disrupted the university’s internal network. Because neither of them had design skills, they coded the simplest possible “home page” in HTML.

In August 1996, BackRub was made available as a link from Stanford’s website. A year later, Page and Brin rebranded the site as “Google.” The name was an accidental misspelling of googol, a term coined by a mathematician’s young nephew to describe a 1 with 100 zeros after it. Even back then, the pair was thinking big.

Google.com as it appeared in 1998. Credit: Jeremy Reimer

By mid-1998, their prototype was getting over 10,000 searches a day. Page and Brin realized they might be onto something big. It was nearing the height of the dotcom mania, so they went looking for some venture capital to start a new company.

But at the time, search engines were considered passé. The new hotness was portals, sites that had some search functionality but leaned heavily into sponsored content. After all, that’s where the big money was. Page and Brin tried to sell the technology to AltaVista for $1 million, but its parent company passed. Excite also turned them down, as did Yahoo.

Frustrated, they decided to hunker down and keep improving their product. Brin created a colorful logo using the free GIMP paint program, and they added a summary snippet to each result. Eventually, the pair received $100,000 from angel investor Andy Bechtolsheim, who had co-founded Sun Microsystems. That was enough to get the company off the ground.

Page and Brin were careful with their money, even after they received millions more from venture capitalist firms. They preferred cheap commodity PC hardware and the free Linux operating system as they expanded their system. For marketing, they relied mostly on word of mouth. This allowed Google to survive the dotcom crash that crippled its competitors.

Still, the company eventually had to find a source of income. The founders were concerned that if search results were influenced by advertising, it could lower the usefulness and accuracy of the search. They compromised by adding short, text-based ads that were clearly labeled as “Sponsored Links.” To cut costs, they created a form so that advertisers could submit their own ads and see them appear in minutes. They even added a ranking system so that more popular ads would rise to the top.

The combination of a superior product with less intrusive ads propelled Google to dizzying heights. In 2024, the company collected over $350 billion in revenue, with $112 billion of that as profit.

Information wants to be free

The web was, at first, all about text and the occasional image. In 1997, Netscape added the ability to embed small music files in the MIDI sound format that would play when a webpage was loaded. Because the songs only encoded notes, they sounded tinny and annoying on most computers. Good audio or songs with vocals required files that were too large to download over the Internet.

But this all changed with a new file format. In 1993, researchers at the Fraunhofer Institute developed a compression technique that eliminated portions of audio that human ears couldn’t detect. Suzanne Vega’s song “Tom’s Diner” was used as the first test of the new MP3 standard.

Now, computers could play back reasonably high-quality songs from small files using software decoders. WinPlay3 was the first, but WinAmp, released in 1997, became the most popular. People started putting links to MP3 files on their personal websites. Then, in 1999, Shawn Fanning released a beta of a product he called Napster. This was a desktop application that relied on the Internet to let people share their MP3 collection and search everyone else’s.

Napster as it would have appeared in 1999. Credit: Jeremy Reimer
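Architecturally, Napster kept a central index of who was sharing which songs, while the MP3s themselves traveled directly between users’ machines. Here’s a minimal sketch of that idea, with illustrative names rather than Napster’s actual protocol:

```python
# Minimal sketch of Napster's central-index model: the server only tracks which
# peer shares which files; downloads then happen directly between peers.
# Names and structures are illustrative, not Napster's actual protocol.
from collections import defaultdict

index = defaultdict(list)                  # song filename -> peers sharing it

def share(peer, songs):
    for song in songs:
        index[song.lower()].append(peer)

def search(query):
    return index.get(query.lower(), [])    # the server answers only "who has it"

share("alice:7070", ["Tom's Diner.mp3"])
share("bob:6699", ["Tom's Diner.mp3", "Smells Like Teen Spirit.mp3"])
print(search("Tom's Diner.mp3"))           # the transfer itself would be peer-to-peer
```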

Napster almost immediately ran into legal challenges from the Recording Industry Association of America (RIAA). It sparked a debate about sharing things over the Internet that persists to this day. Some artists agreed with the RIAA that downloading MP3 files should be illegal, while others (many of whom had been financially harmed by their own record labels) welcomed a new age of digital distribution. Napster lost the case against the RIAA and shut down in 2002. This didn’t stop people from sharing files, but replacement tools like eDonkey 2000, Limewire, Kazaa, and Bearshare lived in a legal gray area.

In the end, it was Apple that figured out a middle ground that worked for both sides. In 2003, two years after launching its iPod music player, Apple announced the Internet-only iTunes Store. Steve Jobs had signed deals with all five major record labels to allow legal purchasing of individual songs—astoundingly, without copy protection—for 99 cents each, or full albums for $10. By 2010, the iTunes Store was the largest music vendor in the world.

iTunes 4.1, released in 2003. This was the first version for Windows and introduced the iTunes Store to a wider world. Credit: Jeremy Reimer

The Web turns 2.0

Tim Berners-Lee’s original vision for the web was simply to deliver and display information. It was like a library, but with hypertext links. But it didn’t take long for people to start experimenting with information flowing the other way. In 1994, Netscape 0.9 added new HTML tags like FORM and INPUT that let users enter text and, using a “Submit” button, send it back to the web server.

Early web servers didn’t know what to do with this text. But programmers developed extensions that let a server run programs in the background. The standardized “Common Gateway Interface” (CGI) made it possible for a “Submit” button to trigger a program (usually in a /cgi-bin/ directory) that could do something interesting with the submission, like talking to a database. CGI scripts could even generate new webpages dynamically and send them back to the user.
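A CGI handler could be only a few lines of code. Here’s a minimal sketch in Python of the kind of program that lived in a /cgi-bin/ directory; the form field name is made up, and the classic scripts of the era were more often written in Perl or C:

```python
#!/usr/bin/env python3
# Minimal sketch of a /cgi-bin/ handler: read the submitted form fields from the
# web server, then emit a dynamically generated page. The "name" field is a
# made-up example; real scripts of the era were more often Perl or C.
import os
import sys
import urllib.parse

length = int(os.environ.get("CONTENT_LENGTH") or 0)
body = sys.stdin.read(length)                       # data posted from a <FORM>
fields = urllib.parse.parse_qs(body)
name = fields.get("name", ["stranger"])[0]

print("Content-Type: text/html")                    # CGI response header
print()                                             # blank line ends the headers
print(f"<html><body><h1>Hello, {name}!</h1></body></html>")
```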

This intelligent two-way interaction changed the web forever. It enabled things like logging into an account on a website, web-based forums, and even uploading files directly to a web server. Suddenly, a website wasn’t just a page that you looked at. It could be a community where groups of interested people could interact with each other, sharing both text and images.

Dynamic webpages led to the rise of blogging, first as an experiment (some, like Justin Hall’s and Dave Winer’s, are still around today) and then as something anyone could do in their spare time. Websites in general became easier to create with sites like Geocities and Angelfire, which let people build their own personal dream house on the web for free. A community-run dynamic linking site, webring.org, connected similar websites together, encouraging exploration.

Webring.org was a free, community-run service that allowed dynamically updated webrings. Credit: Jeremy Reimer

One of the best things to come out of Web 2.0 was Wikipedia. It arose as a side project of Nupedia, an online encyclopedia founded by Jimmy Wales, with articles written by volunteers who were subject matter experts. This process was slow, and the site only had 21 articles in its first year. Wikipedia, in contrast, allowed anyone to contribute and review articles, so it quickly outpaced its predecessor. At first, people were skeptical about letting random Internet users edit articles. But thanks to an army of volunteer editors and a set of tools to quickly fix vandalism, the site flourished. Wikipedia far surpassed works like the Encyclopedia Britannica in sheer numbers of articles while maintaining roughly equivalent accuracy.

Not every Internet innovation lived on a webpage. In 1988, Jarkko Oikarinen created a program called Internet Relay Chat (IRC), which allowed real-time messaging between individuals and groups. IRC clients for Windows and Macintosh were popular among nerds, but friendlier applications like PowWow (1994), ICQ (1996), and AIM (1997) brought messaging to the masses. Even Microsoft got in on the act with MSN Messenger in 1999. For a few years, this messaging culture was an important part of daily life at home, school, and work.

A digital recreation of MSN Messenger from 2001. Sadly, Microsoft shut down the servers in 2014. Credit: Jeremy Reimer

Animation, games, and video

While the web was evolving quickly, the slow speeds of dial-up modems limited the size of files you could upload to a website. Static images were the norm. Animation only appeared in heavily compressed GIF files with a few frames each.

But a new technology blasted past these limitations and unleashed a torrent of creativity on the web. In 1995, Macromedia released Shockwave Player, an add-on for Netscape Navigator. Combined with Macromedia’s Director authoring software, it allowed artists to create animations based on vector drawings. These were small enough to embed inside webpages.

Websites popped up to support this new content. Newgrounds.com, which began in 1995 as a Neo-Geo fan site, started collecting the best animations. Because Director was designed to create interactive multimedia for CD-ROM projects, it also supported keyboard and mouse input and had basic scripting. This meant that people could make simple games that ran in Shockwave. Newgrounds eagerly showcased these as well, giving many aspiring artists and game designers an entry point into their careers. Super Meat Boy, for example, was first prototyped on Newgrounds.

Newgrounds as it would have appeared circa 2003. Credit: Jeremy Reimer

Putting actual video on the web seemed like something from the far future. But the future arrived quickly. After the dotcom crash of 2001, there were many unemployed web programmers with a lot of time on their hands to experiment with their personal projects. The arrival of broadband with cable modems and digital subscriber lines (DSL), combined with the new MPEG4 compression standard, made a lot of formerly impossible things possible.

In early 2005, Chad Hurley, Steve Chen, and Jawed Karim launched YouTube.com. Initially, it was meant to be an online dating site, but that service failed. The site, however, had great technology for uploading and playing videos. It used Macromedia’s Flash, a new technology so similar to Shockwave that the company marketed it as Shockwave Flash. YouTube allowed anybody to upload videos up to ten minutes in length for free. It became so popular that Google bought it a year later for $1.65 billion.

All these technologies combined to provide ordinary people with the opportunity, however brief, to make an impact on popular culture. An early example was the All Your Base phenomenon. An animated GIF of an obscure, mistranslated Sega Genesis game inspired indie musicians The Laziest Men On Mars to create a song and distribute it as an MP3. The popular humor site somethingawful.com picked it up, and users in the Photoshop Friday forum thread created a series of humorous images to go along with the song. Then in 2001, the user Bad_CRC took the song and the best of the images and put them together in an animation they shared on Newgrounds. The animation gained such wide popularity that it was reported on by USA Today.

You have no chance to survive make your time.

Media goes social

In the early 2000s, most websites were either blogs or forums—and frequently both. Forums had multiple discussion boards, both general and specific. They often leaned into a specific hobby or interest, and anyone with that interest could join. There were also a handful of dating websites, like kiss.com (1994), match.com (1995), and eHarmony.com (2000), that specifically tried to connect people who might have a romantic interest in each other.

The Swedish Lunarstorm was one of the first social media websites. Credit: Jeremy Reimer

The road to social media was a hazy and confusing merging of these two types of websites. There was classmates.com (1995) that served as a way to connect with former school chums, and the following year, the Swedish site lunarstorm.com opened with this mission:

Everyone has their own website called Krypin. Each babe [this word is an accurate translation] has their own Krypin where she or he introduces themselves, posts their diaries and their favorite files, which can be anything from photos and their own songs to poems and other fun stuff. Every LunarStormer also has their own guestbook where you can write if you don’t really dare send a LunarEmail or complete a Friend Request.

In 1997, sixdegrees.com opened, based on the truism that everyone on earth is connected with six or fewer degrees of separation. Its About page said, “Our free networking services let you find the people you want to know through the people you already know.”

By the time friendster.com opened its doors in 2002, the concept of “friending” someone online was already well established, although it was still a niche activity. LinkedIn.com, launched the following year, used the excuse of business networking to encourage this behavior. But it was MySpace.com (2003) that was the first to gain significant traction.

MySpace was initially a Friendster clone written in just ten days by employees at eUniverse, an Internet marketing startup founded by Brad Greenspan. It became the company’s most successful product. MySpace combined the website-building ability of sites like GeoCities with social networking features. It took off incredibly quickly: in just three years, it surpassed Google as the most visited website in the United States. Hype around MySpace reached such a crescendo that Rupert Murdoch purchased it in 2005 for $580 million.

But a newcomer to the social media scene was about to destroy MySpace. Just as Google crushed its competitors, this startup won by providing a simpler, more functional, and less intrusive product. TheFacebook.com began as Mark Zuckerberg and his college roommate’s attempt to replace their college’s online directory. Zuckerberg’s first student website, “Facemash,” had been created by breaking into Harvard’s network, and its sole feature was to provide “Hot or Not” comparisons of student photos. Facebook quickly spread to other universities, and in 2006 (after dropping the “the”), it was opened to the rest of the world.

“The” Facebook as it appeared in 2004. Credit: Jeremy Reimer

Facebook won the social networking wars by focusing on the rapid delivery of new features. The company’s slogan, “Move fast and break things,” encouraged this strategy. The most prominent feature, added in 2006, was the News Feed. It generated a list of posts, selected out of thousands of potential updates for each user based on who they followed and liked, and showed it on their front page. Combined with a technique called “infinite scrolling,” first invented for Microsoft’s Bing Image Search by Hugh E. Williams in 2005, it changed the way the web worked forever.
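Conceptually, a feed like this is just a scored list served one slice at a time. The sketch below is an invented, simplified illustration of that pattern (made-up scoring and data), not Facebook’s actual ranking system:

```python
# Simplified, invented sketch of a News-Feed-style ranked feed with "infinite
# scroll" pagination. The scoring and data are made up for illustration;
# Facebook's real ranking signals were far more elaborate.
posts = [
    {"id": i, "author": f"user{i % 5}", "likes": (i * 7) % 23, "boosted": i % 50 == 0}
    for i in range(1000)
]
followed = {"user1", "user3"}              # accounts this hypothetical user follows

def score(post):
    s = post["likes"]
    if post["author"] in followed:
        s += 50                            # posts from people you follow rank higher
    if post["boosted"]:
        s += 100                           # paid boosts blur the line between posts and ads
    return s

ranked = sorted(posts, key=score, reverse=True)

def next_page(cursor, page_size=10):       # infinite scroll just keeps requesting the next slice
    return ranked[cursor:cursor + page_size], cursor + page_size

page, cursor = next_page(0)
print([post["id"] for post in page])
```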

The algorithmically generated News Feed created new opportunities for Facebook to make profits. For example, businesses could boost posts for a fee, which would make them appear in news feeds more often. These blurred the lines between posts and ads.

Facebook was also successful in identifying up-and-coming social media sites and buying them out before they were able to pose a threat. This was made easier thanks to Onavo, a VPN that monitored its users’ activities and resold the data. Facebook acquired Onavo in 2013. It was shut down in 2019 due to continued controversy over the use of private data.

Social media transformed the Internet, drawing in millions of new users and starting a consolidation of website-visiting habits that continues to this day. But something else was about to happen that would shake the Internet to its core.

Don’t you people have phones?

For years, power users had experimented with getting the Internet on their handheld devices. IBM’s Simon phone, which came out in 1994, had both phone and PDA features. It could send and receive email. The Nokia 9000 Communicator, released in 1996, even had a primitive text-based web browser.

Later phones like the BlackBerry 850 (1999), the Nokia 9210 (2001), and the Palm Treo (2002) added keyboards, color screens, and faster processors. In 1999, the Wireless Application Protocol (WAP) was released, which allowed mobile phones to receive and display simplified, phone-friendly pages using WML instead of the standard HTML markup language.

Browsing the web on phones was possible before modern smartphones, but it wasn’t easy. Credit: James Cridland (Flickr)

But despite their popularity with business users, these phones never broke into the mainstream. That all changed in 2007 when Steve Jobs got on stage and announced the iPhone. Now, every webpage could be viewed natively on the phone’s browser, and zooming into a section was as easy as pinching or double-tapping. The one exception was Flash, but a new HTML 5 standard promised to standardize advanced web features like animation and video playback.

Google quickly changed its Android prototype from a Blackberry clone to something more closely resembling the iPhone. Android’s open licensing structure allowed companies around the world to produce inexpensive smartphones. Even mid-range phones were still much cheaper than computers. This technology allowed, for the first time, the entire world to become connected through the Internet.

The exploding market of phone users also propelled the massive growth of social media companies like Facebook and Twitter. It was a lot easier now to snap a picture of a live event with your phone and post it instantly to the world. Optimists pointed to the remarkable events of the Arab Spring protests as proof that the Internet could help spread democracy and freedom. But governments around the world were just as eager to use these new tools, except their goals leaned more toward control and crushing dissent.

The backlash

Technology has always been a double-edged sword. But in recent years, public opinion about the Internet has shifted from being mostly positive to increasingly negative.

The combination of mobile phones, social media algorithms, and infinite scrolling led to the phenomenon of “doomscrolling,” where people spend hours every day reading “news” that is tuned for maximum engagement by provoking as many people as possible. The emotional toll of doomscrolling has been shown to cause real harm. Even more serious is the fallout from misinformation and hate speech, like the genocide in Myanmar that an Amnesty International report claims was amplified on Facebook.

As companies like Google, Amazon, and Facebook grew into near-monopolies, they inevitably lost sight of their original mission in favor of a never-ending quest for more money. The process, dubbed enshittification by Cory Doctorow, shifts the focus first from users to advertisers and then to shareholders.

Chasing these profits has fueled the rise of generative AI, which threatens to turn the entire Internet into a sea of soulless gray soup. Google is now forcing AI summaries at the top of web searches, which reduce traffic to websites and often provide dangerous misinformation. But even if you ignore the AI summaries, the sites you find underneath may also be suspect. Once-trusted websites have laid off staff and replaced them with AI, generating an endless series of new articles written by nobody. A web where AIs comment on AI-generated Facebook posts that link to AI-generated articles, which are then AI-summarized by Google, seems inhuman and pointless.

A search for cute baby peacocks on Bing. Some of them are real, and some aren’t. Credit: Jeremy Reimer

Where from here?

The history of the Internet can be roughly divided into three phases. The first, from 1969 to 1990, was all about the inventors: people like Vint Cerf, Steve Crocker, and Robert Taylor. These folks were part of a small group of computer scientists who figured out how to get different types of computers to talk to each other and to other networks.

The next phase, from 1991 to 1999, was a whirlwind that was fueled by entrepreneurs, people like Jerry Yang and Jeff Bezos. They latched on to Tim Berners-Lee’s invention of the World Wide Web and created companies that lived entirely in this new digital landscape. This set off a manic phase of exponential growth and hype, which peaked in early 2001 and crashed a few months later.

The final phase, from 2000 through today, has primarily been about the users. New companies like Google and Facebook may have reaped the greatest financial rewards during this time, but none of their successes would have been possible without the contributions of ordinary people like you and me. Every time we typed something into a text box and hit the “Submit” button, we created a tiny piece of a giant web of content. Even the generative AIs that pretend to make new things today are merely regurgitating words, phrases, and pictures that were created and shared by people.

There is a growing sense of nostalgia today for the old Internet, when it felt like a place, and the joy of discovery was around every corner. “Using the old Internet felt like digging for treasure,” said YouTube commenter MySoftCrow. “Using the current Internet feels like getting buried alive.”

Ars community member MichaelHurd added his own thoughts: “I feel the same way. It feels to me like the core problem with the modern Internet is that websites want you to stay on them for as long as possible, but the World Wide Web is at its best when sites connect to each other and encourage people to move between them. That’s what hyperlinks are for!”

Despite all the doom surrounding the modern Internet, it remains largely open. Anyone can pay about $5 per month for a shared Linux server and create a personal website containing anything they can think of, using any software they like, even their own. And for the most part, anyone, on any device, anywhere in the world, can access that website.

Ultimately, the fate of the Internet depends on the actions of every one of us. That’s why I’m leaving the final words in this series of articles to you. What would your dream Internet of the future look and feel like? The comments section is open.

I’m a writer and web developer. I specialize in the obscure and beautiful, like the Amiga and newLISP.
