AI


Google’s AI Overviews misunderstand why people use Google

robot hand holding glue bottle over a pizza and tomatoes

Aurich Lawson | Getty Images

Last month, we looked into some of the most incorrect, dangerous, and downright weird answers generated by Google’s new AI Overviews feature. Since then, Google has offered a partial apology/explanation for generating those kinds of results and has reportedly scaled back the feature’s rollout for at least some types of queries.

But the more I’ve thought about that rollout, the more I’ve begun to question the wisdom of Google’s AI-powered search results at all. Even when the system doesn’t give obviously wrong results, condensing search results into a neat, compact, AI-generated summary seems like a fundamental misunderstanding of how people use Google in the first place.

Reliability and relevance

When people type a question into the Google search bar, they only sometimes want the kind of basic reference information that can be found on a Wikipedia page or corporate website (or even a Google information snippet). Often, they’re looking for subjective information where there is no one “right” answer: “What are the best Mexican restaurants in Santa Fe?” or “What should I do with my kids on a rainy day?” or “How can I prevent cheese from sliding off my pizza?”

The value of Google has always been in pointing you to the places it thinks are likely to have good answers to those questions. But it’s still up to you, as a user, to figure out which of those sources is the most reliable and relevant to what you need at that moment.

  • This wasn’t funny when the guys at Pep Boys said it, either.

    Kyle Orland / Google

  • Weird Al recommends “running with scissors” as well!

    Kyle Orland / Google

  • This list of steps actually comes from a forum thread response about doing something completely different.

    Kyle Orland / Google

  • An island that’s part of the mainland?

    Kyle Orland / Google

  • If everything’s cheaper now, why does everything seem so expensive?

    Kyle Orland / Google

  • Pretty sure this Truman was never president…

    Kyle Orland / Google

For reliability, any savvy Internet user makes use of countless context clues when judging a random Internet search result. Do you recognize the outlet or the author? Is the information from someone with seeming expertise/professional experience or a random forum poster? Is the site well-designed? Has it been around for a while? Does it cite other sources that you trust, etc.?

But Google also doesn’t know ahead of time which specific result will fit the kind of information you’re looking for. When it comes to restaurants in Santa Fe, for instance, are you in the mood for an authoritative list from a respected newspaper critic or for more off-the-wall suggestions from random locals? Or maybe you scroll down a bit and stumble on a loosely related story about the history of Mexican culinary influences in the city.

One of the unseen strengths of Google’s search algorithm is that the user gets to decide which results are the best for them. As long as there’s something reliable and relevant in those first few pages of results, it doesn’t matter if the other links are “wrong” for that particular search or user.


Windows Recall demands an extraordinary level of trust that Microsoft hasn’t earned

The Recall feature as it currently exists in Windows 11 24H2 preview builds.

Andrew Cunningham

Microsoft’s Windows 11 Copilot+ PCs come with quite a few new AI and machine learning-driven features, but the tentpole is Recall. Described by Microsoft as a comprehensive record of everything you do on your PC, the feature is pitched as a way to help users remember where they’ve been and to provide Windows with extra contextual information that can help it better understand requests from, and meet the needs of, individual users.

This, as many users in infosec communities on social media immediately pointed out, sounds like a potential security nightmare. That’s doubly true because Microsoft says that by default, Recall’s screenshots take no pains to redact sensitive information, from usernames and passwords to health care information to NSFW site visits. By default, on a PC with 256GB of storage, Recall can store a couple dozen gigabytes of data across three months of PC usage, a huge amount of personal data.
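For a rough sense of scale, the storage figure above can be turned into a screenshot count. This is a back-of-the-envelope sketch, not a description of how Recall actually budgets space: the 24GB allowance is an assumed reading of “a couple dozen gigabytes,” the 550KB average is the midpoint of the 500KB to 600KB screenshot sizes we measured on a 2560×1440 display (reported later in this piece), and the constant and function names are ours.

```python
# Assumed readings of figures quoted in the article; not official Microsoft numbers.
STORAGE_BUDGET_BYTES = 24 * 10**9   # "a couple dozen gigabytes", read as 24GB
AVG_SCREENSHOT_BYTES = 550 * 10**3  # midpoint of 500KB-600KB per screenshot
RETENTION_DAYS = 90                 # "three months" of PC usage


def screenshots_that_fit(budget_bytes: int, screenshot_bytes: int) -> int:
    """Whole number of screenshots that fit in the storage budget."""
    return budget_bytes // screenshot_bytes


def screenshots_per_day(total: int, days: int) -> float:
    """Average daily screenshot count if the budget is spread evenly over the period."""
    return total / days


total = screenshots_that_fit(STORAGE_BUDGET_BYTES, AVG_SCREENSHOT_BYTES)
per_day = screenshots_per_day(total, RETENTION_DAYS)
print(f"{total:,} screenshots, or about {per_day:.0f} per day")
```

Under those assumptions, the budget holds tens of thousands of screenshots, which works out to several hundred per day of use. The estimate ignores the space taken by the text database and any compression differences between screenshots.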

The line between “potential security nightmare” and “actual security nightmare” is at least partly about the implementation, and Microsoft has been saying things that are at least superficially reassuring. Copilot+ PCs are required to have a fast neural processing unit (NPU) so that processing can be performed locally rather than sending data to the cloud; local snapshots are protected at rest by Windows’ disk encryption technologies, which are generally on by default if you’ve signed into a Microsoft account; neither Microsoft nor other users on the PC are supposed to be able to access any particular user’s Recall snapshots; and users can choose to exclude apps or (in most browsers) individual websites from Recall’s snapshots.

This all sounds good in theory, but some users are beginning to use Recall now that the Windows 11 24H2 update is available in preview form, and the actual implementation has serious problems.

“Fundamentally breaks the promise of security in Windows”

This is Recall, as seen on a PC running a preview build of Windows 11 24H2. It takes and saves periodic screenshots, which can then be searched for and viewed in various ways.

Andrew Cunningham

Security researcher Kevin Beaumont, first in a thread on Mastodon and later in a more detailed blog post, has written about some of the potential implementation issues after enabling Recall on an unsupported system (which is currently the only way to try Recall since Copilot+ PCs that officially support the feature won’t ship until later this month). We’ve also given this early version of Recall a try on a Windows Dev Kit 2023, which we’ve used for all our recent Windows-on-Arm testing, and we’ve independently verified Beaumont’s claims about how easy it is to find and view raw Recall data once you have access to a user’s PC.

To test Recall yourself, developer and Windows enthusiast Albacore has published a tool called AmperageKit that will enable it on Arm-based Windows PCs running Windows 11 24H2 build 26100.712 (the build currently available in the Windows Insider Release Preview channel). Other Windows 11 24H2 versions are missing the underlying code necessary to enable Recall.

  • Windows uses OCR on all the text in all the screenshots it takes. That text is also saved to an SQLite database to facilitate faster searches.

    Andrew Cunningham

  • Searching for “iCloud,” for example, brings up every single screenshot with the word “iCloud” in it, including the app itself and its entry in the Microsoft Store. If I had visited websites that mentioned it, they would show up here, too.

    Andrew Cunningham

The short version is this: In its current form, Recall takes screenshots and uses OCR to grab the information on your screen; it then writes the contents of windows plus records of different user interactions in a locally stored SQLite database to track your activity. Data is stored on a per-app basis, presumably to make it easier for Microsoft’s app-exclusion feature to work. Beaumont says “several days” of data amounted to a database around 90KB in size. In our usage, screenshots taken by Recall on a PC with a 2560×1440 screen come in at 500KB or 600KB apiece (Recall saves screenshots at your PC’s native resolution, minus the taskbar area).
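To make the concern concrete, here is a minimal sketch of the general pattern described above: OCR text from each screenshot is written to a local SQLite database, and a search is then a plain SQL query. The table name, columns, and helper functions are hypothetical and chosen for illustration; this is not Recall’s actual schema or code.

```python
import sqlite3


def build_index(snapshots):
    """Store OCR text for each screenshot in an in-memory SQLite database.

    `snapshots` is an iterable of (app, taken_at, ocr_text) tuples.
    The schema is a hypothetical stand-in, not Recall's real layout.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE captures ("
        "id INTEGER PRIMARY KEY, app TEXT, taken_at TEXT, ocr_text TEXT)"
    )
    conn.executemany(
        "INSERT INTO captures (app, taken_at, ocr_text) VALUES (?, ?, ?)",
        snapshots,
    )
    conn.commit()
    return conn


def search(conn, term):
    """Return (app, taken_at) for every capture whose OCR text contains `term`.

    SQLite's LIKE is case-insensitive for ASCII by default, so "icloud"
    also matches "iCloud". Wildcards in `term` are not escaped here.
    """
    rows = conn.execute(
        "SELECT app, taken_at FROM captures "
        "WHERE ocr_text LIKE ? ORDER BY taken_at",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

The point of the sketch is how little is involved: any process that can open the database file can run the same kind of query, which is why where the file lives and who can read it matter so much.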

Recall works locally thanks to Azure AI code that runs on your device, and it works without Internet connectivity and without a Microsoft account. Data is encrypted at rest, sort of, at least insofar as your entire drive is generally encrypted when your PC is either signed into a Microsoft account or has BitLocker turned on. But in its current form, Beaumont says Recall has “gaps you can drive a plane through” that make it trivially easy to grab and scan through a user’s Recall database if you either (1) have local access to the machine and can log into any account (not just the account of the user whose database you’re trying to see), or (2) are using a PC infected with some kind of info-stealer virus that can quickly transfer the SQLite database to another system.


Nvidia jumps ahead of itself and reveals next-gen “Rubin” AI chips in keynote tease

Swing beat —

“I’m not sure yet whether I’m going to regret this,” says CEO Jensen Huang at Computex 2024.

Nvidia’s CEO Jensen Huang delivers his keynote speech ahead of Computex 2024 in Taipei on June 2, 2024.

On Sunday, Nvidia CEO Jensen Huang reached beyond Blackwell and revealed the company’s next-generation AI-accelerating GPU platform during his keynote at Computex 2024 in Taiwan. Huang also detailed plans for an annual tick-tock-style upgrade cycle of its AI acceleration platforms, mentioning an upcoming Blackwell Ultra chip slated for 2025 and a subsequent platform called “Rubin” set for 2026.

Nvidia’s data center GPUs currently power a large majority of cloud-based AI models, such as ChatGPT, in both development (training) and deployment (inference) phases, and investors are keeping a close watch on the company, expecting it to keep that run going.

During the keynote, Huang seemed somewhat hesitant to make the Rubin announcement, perhaps wary of invoking the so-called Osborne effect, whereby a company’s premature announcement of the next iteration of a tech product eats into the current iteration’s sales. “This is the very first time that this next click has been made,” Huang said, holding up his presentation remote just before the Rubin announcement. “And I’m not sure yet whether I’m going to regret this or not.”

Nvidia Keynote at Computex 2023.

The Rubin AI platform, expected in 2026, will use HBM4 (a new form of high-bandwidth memory) and NVLink 6 Switch, operating at 3,600GBps. Following that launch, Nvidia will release a tick-tock iteration called “Rubin Ultra.” While Huang did not provide extensive specifications for the upcoming products, he promised cost and energy savings related to the new chipsets.

During the keynote, Huang also introduced a new Arm-based CPU called “Vera,” which will be featured on a new accelerator board called “Vera Rubin,” alongside one of the Rubin GPUs.

Much like Nvidia’s Grace Hopper architecture, which combines a “Grace” CPU and a “Hopper” GPU to pay tribute to the pioneering computer scientist of the same name, Vera Rubin refers to Vera Florence Cooper Rubin (1928–2016), an American astronomer who made discoveries in the field of deep space astronomy. She is best known for her pioneering work on galaxy rotation rates, which provided strong evidence for the existence of dark matter.

A calculated risk

Nvidia CEO Jensen Huang reveals the “Rubin” AI platform for the first time during his keynote at Computex 2024 on June 2, 2024.

Nvidia’s reveal of Rubin is not a surprise in the sense that most big tech companies are continuously working on follow-up products well in advance of release, but it’s notable because it comes just three months after the company revealed Blackwell, which is barely out of the gate and not yet widely shipping.

At the moment, the company seems to be comfortable leapfrogging itself with new announcements and catching up later; Nvidia just announced that its GH200 Grace Hopper “Superchip,” unveiled one year ago at Computex 2023, is now in full production.

With Nvidia stock rising and the company possessing an estimated 70–95 percent of the data center GPU market share, the Rubin reveal is a calculated risk that seems to come from a place of confidence. That confidence could turn out to be misplaced if a so-called “AI bubble” pops or if Nvidia misjudges the capabilities of its competitors. The announcement may also stem from pressure to continue Nvidia’s astronomical growth in market cap with nonstop promises of improving technology.

Accordingly, Huang has been eager to showcase the company’s plans to continue pushing silicon fabrication tech to its limits and widely broadcast that Nvidia plans to keep releasing new AI chips at a steady cadence.

“Our company has a one-year rhythm. Our basic philosophy is very simple: build the entire data center scale, disaggregate and sell to you parts on a one-year rhythm, and we push everything to technology limits,” Huang said during Sunday’s Computex keynote.

Despite Nvidia’s recent market performance, the company’s run may not continue indefinitely. With ample money pouring into the data center AI space, Nvidia isn’t alone in developing accelerator chips. Competitors like AMD (with the Instinct series) and Intel (with Gaudi 3) also want to win a slice of the data center GPU market away from Nvidia’s current command of the AI-accelerator space. And OpenAI’s Sam Altman is trying to encourage diversified production of GPU hardware that will power the company’s next generation of AI models in the years ahead.


No physics? No problem. AI weather forecasting is already making huge strides.

AI weather models are arriving just in time for the 2024 Atlantic hurricane season.

Aurich Lawson | Getty Images

Much like the invigorating passage of a strong cold front, major changes are afoot in the weather forecasting community. And the end game is nothing short of revolutionary: an entirely new way to forecast weather based on artificial intelligence that can run on a desktop computer.

Today’s artificial intelligence systems require one resource more than any other to operate: data. For example, large language models such as ChatGPT voraciously consume data to improve answers to queries. The more data they have, and the higher its quality, the better their training and the sharper the results.

However, there is a finite limit to quality data, even on the Internet. These large language models have hoovered up so much data that they’re being sued widely for copyright infringement. And as they’re running out of data, the operators of these AI models are turning to ideas such as synthetic data to keep feeding the beast and produce ever more capable results for users.

If data is king, what about other applications for AI technology similar to large language models? Are there untapped pools of data? One of the most promising that has emerged in the last 18 months is weather forecasting, and recent advances have sent shockwaves through the field of meteorology.

That’s because there’s a secret weapon: an extremely rich data set. The European Centre for Medium-Range Weather Forecasts, the premier organization in the world for numerical weather prediction, maintains a dataset of atmospheric, land, and oceanic weather conditions at points around the world, recorded every few hours of every day going back to 1940. The last 50 years of data, after the advent of global satellite coverage, is especially rich. This dataset is known as ERA5, and it is publicly available.

It was not created to fuel AI applications, but ERA5 has turned out to be incredibly useful for this purpose. Computer scientists only really got serious about using this data to train AI models to forecast the weather in 2022. Since then, the technology has made rapid strides. In some cases, the output of these models is already superior to that of global weather models that scientists have labored for decades to design and build and that require some of the most powerful supercomputers in the world to run.

“It is clear that machine learning is a significant part of the future of weather forecasting,” said Matthew Chantry, who leads AI forecasting efforts at the European weather center known as ECMWF, in an interview with Ars.

It’s moving fast

John Dean and Kai Marshland met as undergraduates at Stanford University in the late 2010s. Dean, an electrical engineer, interned at SpaceX during the summer of 2017. Marshland, a computer scientist, interned at the launch company the next summer. Both graduated in 2019 and were trying to figure out what to do with their lives.

“We decided we wanted to solve the problem of weather uncertainty,” Marshland said, so they co-founded a company called WindBorne Systems.

The premise of the company was simple: For about 85 percent of the Earth and its atmosphere, we have no good data about weather conditions there. A lack of quality data, which establishes initial conditions, represents a major handicap for global weather forecast models. The company’s proposed solution was in its name—wind borne.

Dean and Marshland set about designing small weather balloons they could release into the atmosphere and which would fly around the world for up to 40 days, relaying useful atmospheric data that could be packaged and sold to large, government-funded weather models.

Weather balloons provide invaluable data about atmospheric conditions—readings such as temperature, dewpoints, and pressures—that cannot be captured by surface observations or satellites. Such atmospheric “profiles” are helpful in setting the initial conditions models start with. The problem is that traditional weather balloons are cumbersome and only operate for a few hours. Because of this, the National Weather Service only launches them twice daily from about 100 locations in the United States.


Journalists “deeply troubled” by OpenAI’s content deals with Vox, The Atlantic

adventures in training data —

“Alarmed” writers unions question transparency of AI training deals with ChatGPT maker.

A man covered in newspaper.

On Wednesday, Axios broke the news that OpenAI had signed deals with The Atlantic and Vox Media that will allow the ChatGPT maker to license their editorial content to further train its language models. But some of the publications’ writers—and the unions that represent them—were surprised by the announcements and aren’t happy about it. Already, two unions have released statements expressing “alarm” and “concern.”

“The unionized members of The Atlantic Editorial and Business and Technology units are deeply troubled by the opaque agreement The Atlantic has made with OpenAI,” reads a statement from the Atlantic union. “And especially by management’s complete lack of transparency about what the agreement entails and how it will affect our work.”

The Vox Union—which represents The Verge, SB Nation, and Vulture, among other publications—reacted in similar fashion, writing in a statement, “Today, members of the Vox Media Union … were informed without warning that Vox Media entered into a ‘strategic content and product partnership’ with OpenAI. As both journalists and workers, we have serious concerns about this partnership, which we believe could adversely impact members of our union, not to mention the well-documented ethical and environmental concerns surrounding the use of generative AI.”

  • A statement from The Atlantic Union about the OpenAI deal, released May 30, 2024.

  • A statement from the Vox Media Union about the OpenAI deal, released May 29, 2024.

OpenAI has previously admitted to using copyrighted information scraped from publications like the ones that just inked licensing deals to train AI models like GPT-4, which powers its ChatGPT AI assistant. While the company maintains the practice is fair use, it has simultaneously licensed training content from publishing groups like Axel Springer and social media sites like Reddit and Stack Overflow, sparking protests from users of those platforms.

As part of the multi-year agreements with The Atlantic and Vox, OpenAI will be able to openly and officially utilize the publishers’ archived materials—dating back to 1857 in The Atlantic’s case—as well as current articles to train responses generated by ChatGPT and other AI language models. In exchange, the publishers will receive undisclosed sums of money and be able to use OpenAI’s technology “to power new journalism products,” according to Axios.

Reporters react

News of the deals took both journalists and unions by surprise. On X, Vox reporter Kelsey Piper, who recently penned an exposé about OpenAI’s restrictive non-disclosure agreements that prompted a change in policy from the company, wrote, “I’m very frustrated they announced this without consulting their writers, but I have very strong assurances in writing from our editor in chief that they want more coverage like the last two weeks and will never interfere in it. If that’s false I’ll quit..”

Journalists also reacted to news of the deals through the publications themselves. On Wednesday, The Atlantic Senior Editor Damon Beres wrote a piece titled “A Devil’s Bargain With OpenAI,” in which he expressed skepticism about the partnership, likening it to making a deal with the devil that may backfire. He highlighted concerns about AI’s use of copyrighted material without permission and its potential to spread disinformation at a time when publications have seen a recent string of layoffs. He drew parallels to the pursuit of audiences on social media leading to clickbait and SEO tactics that degraded media quality. While acknowledging the financial benefits and potential reach, Beres cautioned against relying on inaccurate, opaque AI models and questioned the implications of journalism companies being complicit in potentially destroying the internet as we know it, even as they try to be part of the solution by partnering with OpenAI.

Similarly, over at Vox, Editorial Director Bryan Walsh penned a piece titled “This article is OpenAI training data,” in which he expressed apprehension about the licensing deal. He drew parallels between the relentless pursuit of data by AI companies and Bostrom’s “paperclip maximizer,” a classic AI thought experiment, cautioning that the single-minded focus on market share and profits could ultimately destroy the ecosystem AI companies rely on for training data. He worried that the growth of AI chatbots and generative AI search products might lead to a significant decline in search engine traffic to publishers, potentially threatening the livelihoods of content creators and the richness of the Internet itself.

Meanwhile, OpenAI still battles over “fair use”

Not every publication is eager to step up to the licensing plate with OpenAI. The San Francisco-based company is currently in the middle of a lawsuit with The New York Times in which OpenAI claims that scraping data from publications for AI training purposes is fair use. The New York Times has tried to block AI companies from such scraping by updating its terms of service to prohibit AI training, arguing in its lawsuit that ChatGPT could easily become a substitute for NYT.

The Times has accused OpenAI of copying millions of its works to train AI models, finding 100 examples where ChatGPT regurgitated articles. In response, OpenAI accused NYT of “hacking” ChatGPT with deceptive prompts simply to set up a lawsuit. NYT’s counsel Ian Crosby previously told Ars that OpenAI’s decision “to enter into deals with news publishers only confirms that they know their unauthorized use of copyrighted work is far from ‘fair.’”

While that issue has yet to be resolved in the courts, for now, The Atlantic Union seeks transparency.

“The Atlantic has defended the values of transparency and intellectual honesty for more than 160 years. Its legacy is built on integrity, derived from the work of its writers, editors, producers, and business staff,” it wrote. “OpenAI, on the other hand, has used news articles to train AI technologies like ChatGPT without permission. The people who continue to maintain and serve The Atlantic deserve to know what precisely management has licensed to an outside firm and how, specifically, they plan to use the archive of our creative output and our work product.”


Google’s AI Overview is flawed by design, and a new company blog post hints at why

guided by voices —

Google: “There are bound to be some oddities and errors” in system that told people to eat rocks.

A selection of Google mascot characters created by the company.

The Google “G” logo surrounded by whimsical characters, all of which look stunned and surprised.

On Thursday, Google capped off a rough week of providing inaccurate and sometimes dangerous answers through its experimental AI Overview feature by authoring a follow-up blog post titled, “AI Overviews: About last week.” In the post, attributed to Google VP Liz Reid, head of Google Search, the firm formally acknowledged issues with the feature and outlined steps taken to improve a system that appears flawed by design, even if the company doesn’t realize it is admitting as much.

To recap, the AI Overview feature—which the company showed off at Google I/O a few weeks ago—aims to provide search users with summarized answers to questions by using an AI model integrated with Google’s web ranking systems. Right now, it’s an experimental feature that is not active for everyone, but when a participating user searches for a topic, they might see an AI-generated answer at the top of the results, pulled from highly ranked web content and summarized by an AI model.

While Google claims this approach is “highly effective” and on par with its Featured Snippets in terms of accuracy, the past week has seen numerous examples of the AI system generating bizarre, incorrect, or even potentially harmful responses, as we detailed in a recent feature where Ars reporter Kyle Orland replicated many of the unusual outputs.

Drawing inaccurate conclusions from the web

On Wednesday morning, Google’s AI Overview was erroneously telling us the Sony PlayStation and Sega Saturn were available in 1993.

Kyle Orland / Google

Given the circulating AI Overview examples, Google almost apologizes in the post and says, “We hold ourselves to a high standard, as do our users, so we expect and appreciate the feedback, and take it seriously.” But Reid, in an attempt to justify the errors, then goes into some very revealing detail about why AI Overviews provide erroneous information:

AI Overviews work very differently than chatbots and other LLM products that people may have tried out. They’re not simply generating an output based on training data. While AI Overviews are powered by a customized language model, the model is integrated with our core web ranking systems and designed to carry out traditional “search” tasks, like identifying relevant, high-quality results from our index. That’s why AI Overviews don’t just provide text output, but include relevant links so people can explore further. Because accuracy is paramount in Search, AI Overviews are built to only show information that is backed up by top web results.

This means that AI Overviews generally don’t “hallucinate” or make things up in the ways that other LLM products might.

Here we see the fundamental flaw of the system: “AI Overviews are built to only show information that is backed up by top web results.” The design is based on the false assumption that Google’s page-ranking algorithm favors accurate results and not SEO-gamed garbage. Google Search has been broken for some time, and now the company is relying on those gamed and spam-filled results to feed its new AI model.

Even if the AI model draws from a more accurate source, as with the 1993 game console search seen above, Google’s AI language model can still draw inaccurate conclusions from the “accurate” data, confabulating erroneous information in a flawed summary of the information available.

Generally ignoring the folly of basing its AI results on a broken page-ranking algorithm, Google’s blog post instead attributes the commonly circulated errors to several other factors, including users making nonsensical searches “aimed at producing erroneous results.” Google does admit faults with the AI model, like misinterpreting queries, misinterpreting “a nuance of language on the web,” and lacking sufficient high-quality information on certain topics. It also suggests that some of the more egregious examples circulating on social media are fake screenshots.

“Some of these faked results have been obvious and silly,” Reid writes. “Others have implied that we returned dangerous results for topics like leaving dogs in cars, smoking while pregnant, and depression. Those AI Overviews never appeared. So we’d encourage anyone encountering these screenshots to do a search themselves to check.”

(No doubt some of the social media examples are fake, but it’s worth noting that any attempts to replicate those early examples now will likely fail because Google will have manually blocked the results. And it is potentially a testament to how broken Google Search is if people believed extreme fake examples in the first place.)

While addressing the “nonsensical searches” angle in the post, Reid uses the example search, “How many rocks should I eat each day,” which went viral in a tweet on May 23. Reid says, “Prior to these screenshots going viral, practically no one asked Google that question.” And since there isn’t much data on the web that answers it, she says there is a “data void” or “information gap” that was filled by satirical content found on the web, and the AI model found it and pushed it as an answer, much like Featured Snippets might. So basically, it was working exactly as designed.
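The “data void” failure mode can be illustrated with a toy sketch. This is not Google’s system: the `Page` fields, the word-overlap matching, the ranking score, and the `.example` URLs are all hypothetical stand-ins. It only shows the shape of the problem, in which a pipeline that answers from whatever ranks highest will repeat satire when satire is the only thing that matches.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    rank_score: float  # hypothetical stand-in for a search ranking signal
    satire: bool       # ground truth that the pipeline below never consults


def top_results(pages, query, k=1):
    """Return the k highest-ranked pages sharing at least one word with the query."""
    words = set(query.lower().split())
    matches = [p for p in pages if words & set(p.text.lower().split())]
    return sorted(matches, key=lambda p: p.rank_score, reverse=True)[:k]


def overview(pages, query):
    """Toy 'overview': repeat the text of the top-ranked matching page, if any."""
    hits = top_results(pages, query)
    if not hits:
        return None
    return hits[0].text


pages = [
    Page("satire.example", "you should eat at least one small rock each day", 0.9, True),
    Page("geology.example", "granite is a common igneous rock", 0.4, False),
]
print(overview(pages, "How many rocks should I eat each day"))
```

In this toy setup, the only page that matches the query is the satirical one, so it becomes the answer. Nothing in the pipeline is malfunctioning; it is doing what it was built to do with the results it was given.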

A screenshot of an AI Overview query, “How many rocks should I eat each day” that went viral on X last week.


Russia and China are using OpenAI tools to spread disinformation

New tool —

Iran and Israel have been getting in on the action as well.

OpenAI said it was committed to uncovering disinformation campaigns and was building its own AI-powered tools to make detection and analysis “more effective.”

FT montage/NurPhoto via Getty Images

OpenAI has revealed operations linked to Russia, China, Iran and Israel have been using its artificial intelligence tools to create and spread disinformation, as technology becomes a powerful weapon in information warfare in an election-heavy year.

The San Francisco-based maker of the ChatGPT chatbot said in a report on Thursday that five covert influence operations had used its AI models to generate text and images at a high volume, with fewer language errors than previously, as well as to generate comments or replies to their own posts. OpenAI’s policies prohibit the use of its models to deceive or mislead others.

The content focused on issues “including Russia’s invasion of Ukraine, the conflict in Gaza, the Indian elections, politics in Europe and the United States, and criticisms of the Chinese government by Chinese dissidents and foreign governments,” OpenAI said in the report.

The networks also used AI to enhance their own productivity, applying it to tasks such as debugging code or doing research into public social media activity, it said.

Social media platforms, including Meta and Google’s YouTube, have sought to clamp down on the proliferation of disinformation campaigns in the wake of Donald Trump’s 2016 win in the US presidential election, when investigators found evidence that a Russian troll farm had sought to manipulate the vote.

Pressure is mounting on fast-growing AI companies such as OpenAI, as rapid advances in their technology mean it is cheaper and easier than ever for disinformation perpetrators to create realistic deepfakes and manipulate media and then spread that content in an automated fashion.

As about 2 billion people head to the polls this year, policymakers have urged the companies to introduce and enforce appropriate guardrails.

Ben Nimmo, principal investigator for intelligence and investigations at OpenAI, said on a call with reporters that the campaigns did not appear to have “meaningfully” boosted their engagement or reach as a result of using OpenAI’s models.

But, he added, “this is not the time for complacency. History shows that influence operations which spent years failing to get anywhere can suddenly break out if nobody’s looking for them.”

Microsoft-backed OpenAI said it was committed to uncovering such disinformation campaigns and was building its own AI-powered tools to make detection and analysis “more effective.” It added its safety systems already made it difficult for the perpetrators to operate, with its models refusing in multiple instances to generate the text or images asked for.

In the report, OpenAI revealed several well-known state-affiliated disinformation actors had been using its tools. These included a Russian operation, Doppelganger, which was first discovered in 2022 and typically attempts to undermine support for Ukraine, and a Chinese network known as Spamouflage, which pushes Beijing’s interests abroad. Both campaigns used its models to generate text or comment in multiple languages before posting on platforms such as Elon Musk’s X.

It flagged a previously unreported Russian operation, dubbed Bad Grammar, saying it used OpenAI models to debug code for running a Telegram bot and to create short, political comments in Russian and English that were then posted on messaging platform Telegram.

X and Telegram have been approached for comment.

It also said it had thwarted a pro-Israel disinformation-for-hire effort, allegedly run by a Tel Aviv-based political campaign management business called STOIC, which used its models to generate articles and comments on X and across Meta’s Instagram and Facebook.

Meta on Wednesday released a report stating it removed the STOIC content. The accounts linked to these operations were terminated by OpenAI.

Additional reporting by Cristina Criddle

© 2024 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.


OpenAI board first learned about ChatGPT from Twitter, according to former member

It’s a secret to everybody —

Helen Toner, center of struggle with Altman, suggests CEO fostered “toxic atmosphere” at company.

Helen Toner, former OpenAI board member, speaks during Vox Media’s 2023 Code Conference at The Ritz-Carlton, Laguna Niguel on September 27, 2023.

In a recent interview on “The Ted AI Show” podcast, former OpenAI board member Helen Toner said the OpenAI board was unaware of the existence of ChatGPT until they saw it on Twitter. She also revealed details about the company’s internal dynamics and the events surrounding CEO Sam Altman’s surprise firing and subsequent rehiring last November.

OpenAI released ChatGPT publicly on November 30, 2022, and its massive surprise popularity set OpenAI on a new trajectory, shifting focus from being an AI research lab to a more consumer-facing tech company.

“When ChatGPT came out in November 2022, the board was not informed in advance about that. We learned about ChatGPT on Twitter,” Toner said on the podcast.

Toner’s revelation about ChatGPT seems to highlight a significant disconnect between the board and the company’s day-to-day operations, bringing new light to accusations that Altman was “not consistently candid in his communications with the board” upon his firing on November 17, 2023. Altman and OpenAI’s new board later said that the CEO’s mismanagement of attempts to remove Toner from the OpenAI board following her criticism of the company’s release of ChatGPT played a key role in Altman’s firing.

“Sam didn’t inform the board that he owned the OpenAI startup fund, even though he constantly was claiming to be an independent board member with no financial interest in the company on multiple occasions,” she said. “He gave us inaccurate information about the small number of formal safety processes that the company did have in place, meaning that it was basically impossible for the board to know how well those safety processes were working or what might need to change.”

Toner also shed light on the circumstances that led to Altman’s temporary ousting. She mentioned that two OpenAI executives had reported instances of “psychological abuse” to the board, providing screenshots and documentation to support their claims. The allegations made by the former OpenAI executives, as relayed by Toner, suggest that Altman’s leadership style fostered a “toxic atmosphere” at the company:

In October of last year, we had this series of conversations with these executives, where the two of them suddenly started telling us about their own experiences with Sam, which they hadn’t felt comfortable sharing before, but telling us how they couldn’t trust him, about the toxic atmosphere it was creating. They use the phrase “psychological abuse,” telling us they didn’t think he was the right person to lead the company, telling us they had no belief that he could or would change, there’s no point in giving him feedback, no point in trying to work through these issues.

Despite the board’s decision to fire him, Altman began the process of returning to his position just five days later, after more than 700 OpenAI employees signed a letter to the board. Toner attributed this swift comeback to employees who believed the company would collapse without him, saying they also feared retaliation from Altman if they did not support his return.

“The second thing I think is really important to know, that has really gone under reported is how scared people are to go against Sam,” Toner said. “They experienced him retaliate against people retaliating… for past instances of being critical.”

“They were really afraid of what might happen to them,” she continued. “So some employees started to say, you know, wait, I don’t want the company to fall apart. Like, let’s bring back Sam. It was very hard for those people who had had terrible experiences to actually say that… if Sam did stay in power, as he ultimately did, that would make their lives miserable.”

In response to Toner’s statements, current OpenAI board chair Bret Taylor provided a statement to the podcast: “We are disappointed that Miss Toner continues to revisit these issues… The review concluded that the prior board’s decision was not based on concerns regarding product safety or security, the pace of development, OpenAI’s finances, or its statements to investors, customers, or business partners.”

Even given that review, Toner’s main argument is that OpenAI hasn’t been able to police itself despite claims to the contrary. “The OpenAI saga shows that trying to do good and regulating yourself isn’t enough,” she said.


OpenAI training its next major AI model, forms new safety committee

now with 200% more safety —

GPT-5 might be farther off than we thought, but OpenAI wants to make sure it is safe.

A man rolling a boulder up a hill.

On Monday, OpenAI announced the formation of a new “Safety and Security Committee” to oversee risk management for its projects and operations. The announcement comes as the company says it has “recently begun” training its next frontier model, which it expects to bring the company closer to its goal of achieving artificial general intelligence (AGI), though some critics say AGI is farther off than we might think. It also comes as a reaction to a terrible two weeks in the press for the company.

Whether the aforementioned new frontier model is intended to be GPT-5 or a step beyond that is currently unknown. In the AI industry, “frontier model” is a term for a new AI system designed to push the boundaries of current capabilities. And “AGI” refers to a hypothetical AI system with human-level abilities to perform novel, general tasks beyond its training data (unlike narrow AI, which is trained for specific tasks).

Meanwhile, the new Safety and Security Committee, led by OpenAI directors Bret Taylor (chair), Adam D’Angelo, Nicole Seligman, and Sam Altman (CEO), will be responsible for making recommendations about AI safety to the full company board of directors. In this case, “safety” partially means the usual “we won’t let the AI go rogue and take over the world,” but it also includes a broader set of “processes and safeguards” that the company spelled out in a May 21 safety update related to alignment research, protecting children, upholding election integrity, assessing societal impacts, and implementing security measures.

OpenAI says the committee’s first task will be to evaluate and further develop those processes and safeguards over the next 90 days. At the end of this period, the committee will share its recommendations with the full board, and OpenAI will publicly share an update on adopted recommendations.

OpenAI says that multiple technical and policy experts, including Aleksander Madry (head of preparedness), Lilian Weng (head of safety systems), John Schulman (head of alignment science), Matt Knight (head of security), and Jakub Pachocki (chief scientist), will also serve on its new committee.

The announcement is notable in a few ways. First, it’s a reaction to the negative press that came from OpenAI Superalignment team members Ilya Sutskever and Jan Leike resigning two weeks ago. That team was tasked with “steer[ing] and control[ling] AI systems much smarter than us,” and their departure has led to criticism from some within the AI community (and Leike himself) that OpenAI lacks a commitment to developing highly capable AI safely. Other critics, like Meta Chief AI Scientist Yann LeCun, think the company is nowhere near developing AGI, so the concern over a lack of safety for superintelligent AI may be overblown.

Second, there have been persistent rumors that progress in large language models (LLMs) has plateaued recently around capabilities similar to GPT-4. Two major competing models, Anthropic’s Claude Opus and Google’s Gemini 1.5 Pro, are roughly equivalent to the GPT-4 family in capability despite every competitive incentive to surpass it. And recently, when many expected OpenAI to release a new AI model that would clearly surpass GPT-4 Turbo, it instead released GPT-4o, which is roughly equivalent in ability but faster. During that launch, the company relied on a flashy new conversational interface rather than a major under-the-hood upgrade.

We’ve previously reported on a rumor of GPT-5 coming this summer, but with this recent announcement, it seems the rumors may have been referring to GPT-4o instead. It’s quite possible that OpenAI is nowhere near releasing a model that can significantly surpass GPT-4. But with the company quiet on the details, we’ll have to wait and see.


Google’s “AI Overview” can give false, misleading, and dangerous answers

This is fine.

Getty Images

If you use Google regularly, you may have noticed the company’s new AI Overviews providing summarized answers to some of your questions in recent days. If you use social media regularly, you may have come across many examples of those AI Overviews being hilariously or even dangerously wrong.

Factual errors can pop up in existing LLM chatbots as well, of course. But the potential damage that can be caused by AI inaccuracy gets multiplied when those errors appear atop the ultra-valuable web real estate of the Google search results page.

“The examples we’ve seen are generally very uncommon queries and aren’t representative of most people’s experiences,” a Google spokesperson told Ars. “The vast majority of AI Overviews provide high quality information, with links to dig deeper on the web.”

After looking through dozens of examples of Google AI Overview mistakes (and replicating many ourselves for the galleries below), we’ve noticed a few broad categories of errors that seemed to show up again and again. Consider this a crash course in some of the current weak points of Google’s AI Overviews and a look at areas of concern for the company to improve as the system continues to roll out.

Treating jokes as facts

  • The bit about using glue on pizza can be traced back to an 11-year-old troll post on Reddit. (via)

    Kyle Orland / Google

  • This wasn’t funny when the guys at Pep Boys said it, either. (via)

    Kyle Orland / Google

  • Weird Al recommends “running with scissors” as well! (via)

    Kyle Orland / Google

Some of the funniest examples of Google’s AI Overview failing come, ironically enough, when the system doesn’t realize a source online was trying to be funny. An AI answer that suggested using “1/8 cup of non-toxic glue” to stop cheese from sliding off pizza can be traced back to someone who was obviously trying to troll an ongoing thread. A response recommending “blinker fluid” for a turn signal that doesn’t make noise can similarly be traced back to a troll on the Good Sam advice forums, which Google’s AI Overview apparently trusts as a reliable source.

In regular Google searches, these jokey posts from random Internet users probably wouldn’t be among the first answers someone saw when clicking through a list of web links. But with AI Overviews, those trolls were integrated into the authoritative-sounding data summary presented right at the top of the results page.

What’s more, there’s nothing in the tiny “source link” boxes below Google’s AI summary to suggest either of these forum trolls are anything other than good sources of information. Sometimes, though, glancing at the source can save you some grief, such as when you see a response calling running with scissors “cardio exercise that some say is effective” (that came from a 2022 post from Little Old Lady Comedy).

Bad sourcing

  • Washington University in St. Louis says this ratio is accurate, but others disagree. (via)

    Kyle Orland / Google

  • Man, we wish this fantasy remake was real. (via)

    Kyle Orland / Google

Sometimes Google’s AI Overview offers an accurate summary of a non-joke source that happens to be wrong. When asking about how many Declaration of Independence signers owned slaves, for instance, Google’s AI Overview accurately summarizes a Washington University in St. Louis library page saying that one-third “were personally enslavers.” But the response ignores contradictory sources like a Chicago Sun-Times article saying the real answer is closer to three-quarters. I’m not enough of a history expert to judge which authoritative-seeming source is right, but at least one historian online took issue with the Google AI’s answer sourcing.

Other times, a source that Google trusts as authoritative is really just fan fiction. That’s the case for a response that imagined a 2022 remake of 2001: A Space Odyssey, directed by Steven Spielberg and produced by George Lucas. A savvy web user would probably do a double-take before citing Fandom’s “Idea Wiki” as a reliable source, but a careless AI Overview user might not notice where the AI got its information.


Bing outage shows just how little competition Google search really has

Searching for new search —

Opinion: Actively searching without Google or Bing is harder than it looks.

Google logo on a phone in front of a Bing logo in the background

Getty Images

Bing, Microsoft’s search engine platform, went down in the very early morning today. That meant that searches from Microsoft’s Edge browsers that had yet to change their default providers didn’t work. It also meant that services relying on Bing’s search API—Microsoft’s own Copilot, ChatGPT search, Yahoo, Ecosia, and DuckDuckGo—similarly failed.

Services were largely restored by the morning Eastern work hours, but the timing feels apt, concerning, or some combination of the two. Google, the consistently dominating search platform, just last week announced and debuted AI Overviews as a default addition to all searches. If you don’t want an AI response but still want to use Google, you can hunt down the new “Web” option in a menu, or you can, per Ernie Smith, tack “&udm=14” onto your search or use Smith’s own “Konami code” shortcut page.
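For readers who would rather script the “&udm=14” workaround than type it by hand, here is a minimal Python sketch. The parameter itself comes from Smith’s tip above; the `https://www.google.com/search` endpoint, the `q` query parameter, and the function name `web_only_search_url` are assumptions made for illustration, and Google could change how any of them behave.

```python
from urllib.parse import urlencode

# Assumed endpoint: the standard Google search URL (not stated in the article).
GOOGLE_SEARCH_URL = "https://www.google.com/search"


def web_only_search_url(query: str) -> str:
    """Build a search URL that appends udm=14 to request the "Web" view.

    Hypothetical helper; only the udm=14 parameter comes from the article.
    """
    params = {"q": query, "udm": "14"}
    # urlencode escapes the query text and joins the parameters with "&".
    return f"{GOOGLE_SEARCH_URL}?{urlencode(params)}"


print(web_only_search_url("cheese sliding off pizza"))
```

Pasting the printed URL into a browser should behave the same as adding the parameter manually.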

If AI’s hallucinations, power draw, or pizza recipes concern you—along with perhaps broader Google issues involving privacy, tracking, news, SEO, or monopoly power—most of your other major options were brought down by a single API outage this morning. Moving past that kind of single point of vulnerability will take some work, both by the industry and by you, the person wondering if there’s a real alternative.

Search engine market share, as measured by StatCounter, April 2023–April 2024.

StatCounter

Upward of a billion dollars a year

The overwhelming majority of search tools offering an “alternative” to Google are using Google, Bing, or Yandex, the three major search engines that maintain massive global indexes. Yandex, being based in Russia, is a non-starter for many people around the world at the moment. Bing offers its services widely, most notably to DuckDuckGo, but its ad-based revenue model and privacy particulars have caused some friction there in the past. Before his company was able to block more of Microsoft’s own tracking scripts, DuckDuckGo CEO and founder Gabriel Weinberg explained in a Reddit reply why firms like his weren’t going the full DIY route:

… [W]e source most of our traditional links and images privately from Bing … Really only two companies (Google and Microsoft) have a high-quality global web link index (because I believe it costs upwards of a billion dollars a year to do), and so literally every other global search engine needs to bootstrap with one or both of them to provide a mainstream search product. The same is true for maps btw — only the biggest companies can similarly afford to put satellites up and send ground cars to take streetview pictures of every neighborhood.

Bing makes Microsoft money, if not quite profit yet. It’s in Microsoft’s interest to keep its search index stocked and API open, even if its focus is almost entirely on its own AI chatbot version of Bing. Yet if Microsoft decided to pull API access, or it became unreliable, Google’s default position gets even stronger. What would non-conformists have to choose from then?


Sky voice actor says nobody ever compared her to ScarJo before OpenAI drama

Scarlett Johansson attends the Golden Heart Awards in 2023.

OpenAI is sticking to its story that it never intended to copy Scarlett Johansson’s voice when seeking an actor for ChatGPT’s “Sky” voice mode.

The company provided The Washington Post with documents and recordings clearly meant to support OpenAI CEO Sam Altman’s defense against Johansson’s claims that Sky was made to sound “eerily similar” to her critically acclaimed voice acting performance in the sci-fi film Her.

Johansson has alleged that OpenAI hired a soundalike to steal her likeness and confirmed that she declined to provide the Sky voice. Experts have said that Johansson has a strong case should she decide to sue OpenAI for violating her right of publicity, which gives the actress exclusive rights to the commercial use of her likeness.

In OpenAI’s defense, The Post reported that the company’s voice casting call flier did not seek a “clone of actress Scarlett Johansson,” and initial voice test recordings of the unnamed actress hired to voice Sky showed that her “natural voice sounds identical to the AI-generated Sky voice.” Because of this, OpenAI has argued that “Sky’s voice is not an imitation of Scarlett Johansson.”

What’s more, an agent for the unnamed Sky actress who was cast—both granted anonymity to protect her client’s safety—confirmed to The Post that her client said she was never directed to imitate either Johansson or her character in Her. She simply used her own voice and got the gig.

The agent also provided a statement from her client that claimed that she had never been compared to Johansson before the backlash started.

This all “feels personal,” the voice actress said, “being that it’s just my natural voice and I’ve never been compared to her by the people who do know me closely.”

However, OpenAI apparently reached out to Johansson after casting the Sky voice actress. During outreach last September and again this month, OpenAI seemed to want to substitute the Sky voice actress’s voice with Johansson’s voice—which is ironically what happened when Johansson got cast to replace the original actress hired to voice her character in Her.

Altman has clarified that timeline in a statement provided to Ars that emphasized that the company “never intended” Sky to sound like Johansson. Instead, OpenAI tried to snag Johansson to voice the part after realizing—seemingly just as Her director Spike Jonze did—that the voice could potentially resonate with more people if Johansson did it.

“We are sorry to Ms. Johansson that we didn’t communicate better,” Altman’s statement said.

Johansson has not yet publicly indicated that she intends to sue OpenAI over this supposed miscommunication. But legal experts told The Post and Reuters that if she did, her case would be strong because of legal precedent set in high-profile lawsuits brought by singers Bette Midler and Tom Waits, which blocked companies from misappropriating their voices.

Why Johansson could win if she sued OpenAI

In 1988, Bette Midler sued Ford Motor Company for hiring a soundalike to perform Midler’s song “Do You Want to Dance?” in a commercial intended to appeal to “young yuppies” by referencing popular songs from their college days. Midler had declined to do the commercial and accused Ford of exploiting her voice to endorse its product without her consent.

This groundbreaking case proved that a distinctive voice like Midler’s cannot be deliberately imitated to sell a product. It did not matter that the singer used in the commercial had used her natural singing voice, because “a number of people” told Midler that the performance “sounded exactly” like her.

Midler’s case set a powerful precedent preventing companies from appropriating parts of performers’ identities—essentially stopping anyone from stealing a well-known voice that otherwise could not be bought.

“A voice is as distinctive and personal as a face,” the court ruled, concluding that “when a distinctive voice of a professional singer is widely known and is deliberately imitated in order to sell a product, the sellers have appropriated what is not theirs.”

Like in Midler’s case, Johansson could argue that plenty of people think that the Sky voice sounds like her and that OpenAI’s product might be more popular if it had a Her-like voice mode. Comics on popular late-night shows joked about the similarity, including Johansson’s husband, Saturday Night Live comedian Colin Jost. And other people close to Johansson agreed that Sky sounded like her, Johansson has said.

Johansson’s case seems to differ from Midler’s primarily because of the casting timeline that OpenAI is working hard to defend.

OpenAI seems to think that because Johansson was offered the gig after the Sky voice actor was cast, she has no case to claim that the company hired the other actor after she declined.

The timeline may not matter as much as OpenAI may think, though. In the 1990s, Tom Waits cited Midler’s case when he won a $2.6 million lawsuit after Frito-Lay hired a Waits impersonator to perform a song that “echoed the rhyming word play” of a Waits song in a Doritos commercial. Waits won his suit even though Frito-Lay never attempted to hire the singer before casting the soundalike.
