Perplexity


AI search engines cite incorrect sources at an alarming 60% rate, study says

A new study from Columbia Journalism Review’s Tow Center for Digital Journalism finds serious accuracy issues with generative AI models used for news searches. The research tested eight AI-driven search tools equipped with live search functionality and discovered that the AI models incorrectly answered more than 60 percent of queries about news sources.

Researchers Klaudia Jaźwińska and Aisvarya Chandrasekar noted in their report that roughly 1 in 4 Americans now use AI models as alternatives to traditional search engines. This raises serious concerns about reliability, given the substantial error rate uncovered in the study.

Error rates varied notably among the tested platforms. Perplexity provided incorrect information in 37 percent of the queries tested, whereas ChatGPT Search incorrectly identified 67 percent (134 out of 200) of articles queried. Grok 3 demonstrated the highest error rate, at 94 percent.

A graph from CJR shows “confidently wrong” search results. Credit: CJR

For the tests, researchers fed direct excerpts from actual news articles to the AI models, then asked each model to identify the article’s headline, original publisher, publication date, and URL. They ran 1,600 queries across the eight different generative search tools.
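To make the setup concrete, here is a minimal sketch of what such a test harness could look like. This is hypothetical code, not the Tow Center's actual tooling; `query_model` stands in for whichever AI search tool is under test, and each excerpt record is assumed to carry the ground-truth metadata being scored.

```python
# Hypothetical sketch of a CJR-style citation test harness.
# query_model() stands in for whichever AI search tool is under test;
# each excerpt record carries ground-truth metadata to score against.

def score_response(response: dict, truth: dict) -> dict:
    """Check the model's answer against the article's known metadata."""
    return {
        field: response.get(field) == truth[field]
        for field in ("headline", "publisher", "date", "url")
    }

def run_study(excerpts: list[dict], tools: dict) -> dict:
    """Run every excerpt through every tool and tally error rates."""
    results = {name: {"wrong": 0, "total": 0} for name in tools}
    for item in excerpts:                        # e.g., 200 excerpts...
        for name, query_model in tools.items():  # ...across 8 tools = 1,600 queries
            answer = query_model(item["text"])   # ask for headline, publisher, date, URL
            checks = score_response(answer, item["truth"])
            results[name]["total"] += 1
            if not all(checks.values()):         # any wrong field counts as an error
                results[name]["wrong"] += 1
    return results
```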

The study highlighted a common trend among these AI models: rather than declining to respond when they lacked reliable information, the models frequently provided confabulations—plausible-sounding incorrect or speculative answers. The researchers emphasized that this behavior was consistent across all tested models, not limited to just one tool.

Surprisingly, premium paid versions of these AI search tools fared even worse in certain respects. Perplexity Pro ($20/month) and Grok 3’s premium service ($40/month) confidently delivered incorrect responses more often than their free counterparts. Though these premium models correctly answered a higher number of prompts, their unwillingness to decline to answer when uncertain drove their overall error rates higher.

Issues with citations and publisher control

The CJR researchers also uncovered evidence suggesting some AI tools ignored Robot Exclusion Protocol settings, which publishers use to prevent unauthorized access. For example, Perplexity’s free version correctly identified all 10 excerpts from paywalled National Geographic content, despite National Geographic explicitly disallowing Perplexity’s web crawlers.
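For context, the Robot Exclusion Protocol is purely advisory: a site publishes a plain-text robots.txt file, and it is up to each crawler to check it and honor the rules. Here is a minimal sketch of what a compliant crawler does, using Python's standard library; the bot name and URLs are illustrative placeholders, not Perplexity's actual values.

```python
from urllib.robotparser import RobotFileParser

# A compliant crawler consults robots.txt before fetching any page.
# "ExampleBot" and example.com are illustrative placeholders.
rp = RobotFileParser("https://www.example.com/robots.txt")
rp.read()

url = "https://www.example.com/premium/article"
if rp.can_fetch("ExampleBot", url):
    print("Allowed: fetch the page")
else:
    print("Disallowed: a well-behaved crawler skips this URL")
```

Nothing technically enforces this check; a crawler that simply skips it sees the same content as any browser, which is what the CJR findings suggest was happening.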


Perplexity wants to reinvent the web browser with AI—but there’s fierce competition

Perplexity has recently been expanding its offerings—for example, it launched a deep research tool that competes with similar offerings from OpenAI and Google, as well as Sonar, an API for generative AI-powered search.
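Perplexity documents Sonar as an OpenAI-style chat-completions API. The sketch below shows roughly what a call looks like; the endpoint follows Perplexity's published format, but treat the exact model name and response fields as assumptions that may change.

```python
import requests

# Hedged sketch of a Sonar API call. The endpoint and payload follow
# Perplexity's documented OpenAI-compatible chat-completions format;
# the "sonar" model name and response structure are assumptions.
resp = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder key
    json={
        "model": "sonar",
        "messages": [{"role": "user", "content": "What is an answer engine?"}],
    },
    timeout=30,
)
data = resp.json()
print(data["choices"][0]["message"]["content"])  # the generated, search-grounded answer
```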

It will face fierce competition in the browser market, though. Google’s Chrome accounts for the majority of web browser use around the world, and despite Perplexity’s position at the forefront of AI search, it isn’t the first to introduce a browser built heavily around generative AI features. For example, The Browser Company showed off its Dia browser in December.

Dia will allow users to type natural language commands into the search bar, like finding a document or webpage or creating a calendar event. It’s possible that Comet, Perplexity’s planned browser, will do similar things, but again, we don’t know.

So far, most consumer-facing AI tools have come in one of three forms. There are general-purpose chatbots (like OpenAI’s ChatGPT and Anthropic’s Claude); features that use trained deep learning models subtly baked into existing software (as in Adobe Photoshop or Apple’s iOS); and, less commonly, standalone software meant to remake existing application categories using AI features (like the Cursor IDE).

There haven’t been many AI-specific applications in existing categories like this so far, but expect to see more over the next couple of years.


Reddit debuts AI-powered discussion search—but will users like it?

The company then went on to strike deals with major tech firms, including a $60 million agreement with Google in February 2024 and a partnership with OpenAI in May 2024 that integrated Reddit content into ChatGPT.

But Reddit users haven’t been entirely happy with the deals. In October 2024, London-based Redditors began posting false restaurant recommendations to manipulate search results and keep tourists away from their favorite spots. This coordinated effort to feed incorrect information into AI systems demonstrated how user communities might intentionally “poison” AI training data over time.

The potential for trouble

While it’s tempting for Reddit to lean heavily into generative AI technology while it’s trendy, the move could also pose a challenge for the company. For example, Reddit’s AI-powered summaries could draw from inaccurate information featured on the site and provide incorrect answers, or they may draw inaccurate conclusions from correct information.

We will keep an eye on Reddit’s new AI-powered search tool to see if it resists the type of confabulation that we’ve seen with Google’s AI Overviews, an AI summary feature that has been a critical failure so far.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.


Not just ChatGPT anymore: Perplexity and Anthropic’s Claude get desktop apps

There’s a lot going on in the world of Mac apps for popular AI services. In the past week, Anthropic has released a desktop app for its popular Claude chatbot, and Perplexity launched a native app for its AI-driven search service.

On top of that, OpenAI updated its ChatGPT Mac app with support for its flashy advanced voice feature.

Like the ChatGPT app that debuted several weeks ago, the Perplexity app adds a keyboard shortcut that allows you to enter a query from anywhere on your desktop. You can use the app to ask follow-up questions and carry on a conversation about what it finds.

It’s free to download and use, but Perplexity offers paid subscriptions for heavy users.

Perplexity’s search emphasis meant it wasn’t previously a direct competitor to OpenAI’s ChatGPT, but OpenAI recently launched SearchGPT, a search-focused variant of its popular product. SearchGPT is not yet supported in the desktop app, though.

Anthropic’s Claude, on the other hand, is a more direct competitor to ChatGPT. It works similarly to ChatGPT but has different strengths, particularly in software development. The Claude app is free to download, but it’s in beta, and like Perplexity and OpenAI, Anthropic charges for more advanced usage.

When OpenAI launched its ChatGPT Mac app, it didn’t release a Windows version right away, saying that it was focused on where its users were at the time. A Windows app recently arrived, while Anthropic took a different approach, introducing Windows and Mac apps simultaneously.

Previously, all these tools offered mobile apps and web apps, but not necessarily native desktop apps.


Google, Microsoft, and Perplexity promote scientific racism in AI search results


AI-powered search engines are surfacing deeply racist, debunked research.

Literal Nazis

LOS ANGELES, CA – APRIL 17: Members of the National Socialist Movement (NSM) salute during a rally near City Hall on April 17, 2010 in Los Angeles, California. Credit: David McNew via Getty

AI-infused search engines from Google, Microsoft, and Perplexity have been surfacing deeply racist and widely debunked research promoting race science and the idea that white people are genetically superior to nonwhite people.

Patrik Hermansson, a researcher with UK-based anti-racism group Hope Not Hate, was in the middle of a monthslong investigation into the resurgent race science movement when he needed to find out more information about a debunked dataset that claims IQ scores can be used to prove the superiority of the white race.

He was investigating the Human Diversity Foundation, a race science company funded by Andrew Conru, the US tech billionaire who founded Adult Friend Finder. The group, founded in 2022, was the successor to the Pioneer Fund, a group founded by US Nazi sympathizers in 1937 with the aim of promoting “race betterment” and “race realism.”


Hermansson logged in to Google and began looking up results for the IQs of different nations. When he typed in “Pakistan IQ,” rather than getting a typical list of links, Hermansson was presented with Google’s AI-powered Overviews tool, which, confusingly to him, was on by default. It gave him a definitive answer of 80.

When he typed in “Sierra Leone IQ,” Google’s AI tool was even more specific: 45.07. The result for “Kenya IQ” was equally exact: 75.2.

Hermansson immediately recognized the numbers being fed back to him. They were being taken directly from the very study he was trying to debunk, published by one of the leaders of the movement that he was working to expose.

The results Google was serving up came from a dataset published by Richard Lynn, a University of Ulster professor who died in 2023 and was president of the Pioneer Fund for two decades.

“His influence was massive. He was the superstar and the guiding light of that movement up until his death. Almost to the very end of his life, he was a core leader of it,” Hermansson says.

A WIRED investigation confirmed Hermansson’s findings and discovered that other AI-infused search engines—Microsoft’s Copilot and Perplexity—are also referencing Lynn’s work when queried about IQ scores in various countries. While Lynn’s flawed research has long been used by far-right extremists, white supremacists, and proponents of eugenics as evidence that the white race is genetically and intellectually superior to nonwhite races, experts now worry that its promotion through AI could help radicalize others.

“Unquestioning use of these ‘statistics’ is deeply problematic,” Rebecca Sear, director of the Center for Culture and Evolution at Brunel University London, tells WIRED. “Use of these data therefore not only spreads disinformation but also helps the political project of scientific racism—the misuse of science to promote the idea that racial hierarchies and inequalities are natural and inevitable.”

To back up her claim, Sear pointed out that Lynn’s research was cited by the white supremacist who committed the mass shooting in Buffalo, New York, in 2022.

Google’s AI Overviews were launched earlier this year as part of the company’s effort to revamp its all-powerful search tool for an online world being reshaped by artificial intelligence. For some search queries, the tool, which is only available in certain countries right now, gives an AI-generated summary of its findings. The tool pulls the information from the Internet and gives users the answers to queries without needing to click on a link.

The AI Overview answer does not always immediately say where the information is coming from, but after complaints from people about how it showed no articles, Google now puts the title for one of the links to the right of the AI summary. AI Overviews have already run into a number of issues since launching in May, forcing Google to admit it had botched the heavily hyped rollout. AI Overviews is turned on by default for search results and can’t be disabled without installing third-party extensions. (“I haven’t enabled it, but it was enabled,” Hermansson, the researcher, tells WIRED. “I don’t know how that happened.”)

In the case of the IQ results, Google referred to a variety of sources, including posts on X, Facebook, and a number of obscure listicle websites, including World Population Review. In nearly all of these cases, when you click through to the source, the trail leads back to Lynn’s infamous dataset. (In some cases, while the exact numbers Lynn published are referenced, the websites do not cite Lynn as the source.)

When querying Google’s Gemini AI chatbot directly using the same terms, it provided a much more nuanced response. “It’s important to approach discussions about national IQ scores with caution,” read text that the chatbot generated in response to the query “Pakistan IQ.” The text continued: “IQ tests are designed primarily for Western cultures and can be biased against individuals from different backgrounds.”

Google tells WIRED that its systems weren’t working as intended in this case and that it is looking at ways it can improve.

“We have guardrails and policies in place to protect against low quality responses, and when we find Overviews that don’t align with our policies, we quickly take action against them,” Ned Adriance, a Google spokesperson, tells WIRED. “These Overviews violated our policies and have been removed. Our goal is for AI Overviews to provide links to high quality content so that people can click through to learn more, but for some queries there may not be a lot of high quality web content available.”

While WIRED’s tests suggest AI Overviews have now been switched off for queries about national IQs, the results still amplify the incorrect figures from Lynn’s work in what’s called a “featured snippet,” which displays some of the text from a website before the link.

Google did not respond to a question about this update.

But it’s not just Google promoting these dangerous theories. When WIRED put the same query to other AI-powered online search services, we found similar results.

Perplexity, an AI search company that has been found to make things up out of thin air, responded to a query about “Pakistan IQ” by stating that “the average IQ in Pakistan has been reported to vary significantly depending on the source.”

It then lists a number of sources, including a Reddit thread that relied on Lynn’s research and the same World Population Review site that Google’s AI Overview referenced. When asked for Sierra Leone’s IQ, Perplexity directly cited Lynn’s figure: “Sierra Leone’s average IQ is reported to be 45.07, ranking it among the lowest globally.”

Perplexity did not respond to a request for comment.

Microsoft’s Copilot chatbot, which is integrated into its Bing search engine, generated confident text—“The average IQ in Pakistan is reported to be around 80”—citing a website called IQ International, which does not reference its sources. When asked for “Sierra Leone IQ,” Copilot’s response said it was 91. The source linked in the results was a website called Brainstats.com, which references Lynn’s work. Copilot also referenced Brainstats.com when queried about IQ in Kenya.

“Copilot answers questions by distilling information from multiple web sources into a single response,” Caitlin Roulston, a Microsoft spokesperson, tells WIRED. “Copilot provides linked citations so the user can further explore and research as they would with traditional search.”

Google added that part of the problem it faces in generating AI Overviews is that, for some very specific queries, there’s an absence of high quality information on the web—and there’s little doubt that Lynn’s work is not of high quality.

“The science underlying Lynn’s database of ‘national IQs’ is of such poor quality that it is difficult to believe the database is anything but fraudulent,” Sear said. “Lynn has never described his methodology for selecting samples into the database; many nations have IQs estimated from absurdly small and unrepresentative samples.”

Sear points to Lynn’s estimation of the IQ of Angola being based on information from just 19 people and that of Eritrea being based on samples of children living in orphanages.

“The problem with it is that the data Lynn used to generate this dataset is just bullshit, and it’s bullshit in multiple dimensions,” said Adam Rutherford, a geneticist and science writer, pointing out that the Somali figure in Lynn’s dataset is based on one sample of refugees aged between 8 and 18 who were tested in a Kenyan refugee camp. He adds that the Botswana score is based on a single sample of 104 Tswana-speaking high school students aged between 7 and 20 who were tested in English.

Critics of the use of national IQ tests to promote the idea of racial superiority point out not only that the quality of the samples being collected is weak, but also that the tests themselves are typically designed for Western audiences, and so are biased before they are even administered.

“There is evidence that Lynn systematically biased the database by preferentially including samples with low IQs, while excluding those with higher IQs for African nations,” Sear added, a conclusion backed up by a preprint study from 2020.

Lynn published various versions of his national IQ dataset over the course of decades, the most recent of which, called “The Intelligence of Nations,” was published in 2019. Over the years, Lynn’s flawed work has been used by far-right and racist groups as evidence to back up claims of white superiority. The data has also been turned into a color-coded map of the world, showing sub-Saharan African countries with purportedly low IQ colored red compared to the Western nations, which are colored blue.

“This is a data visualization that you see all over [X, formerly known as Twitter], all over social media—and if you spend a lot of time in racist hangouts on the web, you just see this as an argument by racists who say, ‘Look at the data. Look at the map,’” Rutherford says.

But the blame, Rutherford believes, does not lie with the AI systems alone, but also with a scientific community that has been uncritically citing Lynn’s work for years.

“It’s actually not surprising [that AI systems are quoting it] because Lynn’s work in IQ has been accepted pretty unquestioningly from a huge area of academia, and if you look at the number of times his national IQ databases have been cited in academic works, it’s in the hundreds,” Rutherford said. “So the fault isn’t with AI. The fault is with academia.”

This story originally appeared on wired.com


AI search engine accused of plagiarism announces publisher revenue-sharing plan

Beg, borrow, or license —

Perplexity says WordPress.com, TIME, Der Spiegel, and Fortune have already signed up.


On Tuesday, AI-powered search engine Perplexity unveiled a new revenue-sharing program for publishers, marking a significant shift in its approach to third-party content use, reports CNBC. The move comes after plagiarism allegations from major media outlets, including Forbes, Wired, and Ars parent company Condé Nast. Perplexity, valued at over $1 billion, aims to compete with search giant Google.

“To further support the vital work of media organizations and online creators, we need to ensure publishers can thrive as Perplexity grows,” writes the company in a blog post announcing the program. “That’s why we’re excited to announce the Perplexity Publishers Program and our first batch of partners: TIME, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune, and WordPress.com.”

Under the program, Perplexity will share a percentage of ad revenue with publishers when their content is cited in AI-generated answers. The revenue share applies on a per-article basis and potentially multiplies if articles from a single publisher are used in one response. Some content providers, such as WordPress.com, plan to pass some of that revenue on to content creators.
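Perplexity hasn't disclosed the actual percentages, but the per-article mechanism described above is easy to illustrate. In this sketch the revenue figure, share rate, and citation counts are invented numbers, not Perplexity's terms.

```python
# Illustrative only: the dollar amount and 25% rate are invented, not
# Perplexity's (undisclosed) terms. The described mechanism: the share
# applies per cited article, so a publisher whose articles are cited
# multiple times in one answer earns a multiple of the per-article share.

AD_REVENUE_PER_ANSWER = 0.10  # hypothetical ad revenue for one answer, in dollars
PER_ARTICLE_RATE = 0.25       # hypothetical share rate per cited article

def publisher_payout(articles_cited: int) -> float:
    """Per-article share times how many of the publisher's articles
    were cited in a single AI-generated answer."""
    return AD_REVENUE_PER_ANSWER * PER_ARTICLE_RATE * articles_cited

print(publisher_payout(1))  # one of the publisher's articles cited
print(publisher_payout(3))  # three articles cited in the same answer
```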

A press release from WordPress.com states that joining Perplexity’s Publishers Program allows WordPress.com content to appear in Perplexity’s “Keep Exploring” section on their Discover pages. “That means your articles will be included in their search index and your articles can be surfaced as an answer on their answer engine and Discover feed,” the blog company writes. “If your website is referenced in a Perplexity search result where the company earns advertising revenue, you’ll be eligible for revenue share.”

A screenshot of the Perplexity.ai website taken on July 30, 2024. Credit: Benj Edwards

Dmitry Shevelenko, Perplexity’s chief business officer, told CNBC that the company began discussions with publishers in January, with program details solidified in early 2024. He reported strong initial interest, with over a dozen publishers reaching out within hours of the announcement.

As part of the program, publishers will also receive access to Perplexity APIs that can be used to create custom “answer engines,” as well as “Enterprise Pro” accounts that provide “enhanced data privacy and security capabilities” for all employees of participating publishers for one year.

Accusations of plagiarism

The revenue-sharing announcement follows a rocky month for the AI startup. In mid-June, Forbes reported finding its content within Perplexity’s Pages tool with minimal attribution. Pages allows Perplexity users to curate content and share it with others. Ars Technica sister publication Wired later made similar claims, also noting suspicious traffic patterns from IP addresses likely linked to Perplexity that were ignoring robots.txt exclusions. Perplexity was also found to be manipulating its crawlers’ user-agent string to get around website blocks.

As part of company policy, Ars Technica parent Condé Nast disallows AI-based content scrapers, and its CEO Roger Lynch testified in the US Senate earlier this year that generative AI has been built with “stolen goods.” Condé sent a cease-and-desist letter to Perplexity earlier this month.

But publisher trouble might not be Perplexity’s only problem. In some tests of the search engine we performed in February, Perplexity badly confabulated certain answers, even when citations were readily available. Since our initial tests, the accuracy of Perplexity’s results seems to have improved, but providing inaccurate answers (a problem that has also plagued Google’s AI Overviews search feature) is still a potential issue.

Unlike the free tier of service, Perplexity’s $20-per-month plan provides access to more capable LLMs such as GPT-4o and Claude 3, so the quality and accuracy of the output can vary dramatically depending on whether a user subscribes. The addition of citations to every Perplexity answer allows users to check accuracy—if they take the time to do it.

The move by Perplexity occurs against a backdrop of tensions between AI companies and content creators. Some media outlets, such as The New York Times, have filed lawsuits against AI vendors like OpenAI and Microsoft, alleging copyright infringement in the training of large language models. OpenAI has struck media licensing deals with many publishers as a way to secure access to high-quality training data and avoid future lawsuits.

In this case, Perplexity is not using the licensed articles and content to train AI models but is seeking legal permission to reproduce content from publishers on its website.
