
Reddit mods are fighting to keep AI slop off subreddits. They could use help.


Mods ask Reddit for tools as generative AI gets more popular and inconspicuous.

Redditors in a treehouse with a NO AI ALLOWED sign

Credit: Aurich Lawson (based on a still from Getty Images)

Like it or not, generative AI is carving out its place in the world. And some Reddit users are definitely in the “don’t like it” category. While some subreddits openly welcome AI-generated images, videos, and text, others have responded to the growing trend by banning most or all posts made with the technology.

To better understand the reasoning and obstacles associated with these bans, Ars Technica spoke with moderators of subreddits that totally or partially ban generative AI. Almost all these volunteers described moderating against generative AI as a time-consuming challenge they expect to get more difficult as time goes on. And most are hoping that Reddit will release a tool to help their efforts.

It’s hard to know how much AI-generated content is actually on Reddit, and getting an estimate would be a large undertaking. Image library Freepik has analyzed the use of AI-generated content on social media but leaves Reddit out of its research because “it would take loads of time to manually comb through thousands of threads within the platform,” spokesperson Bella Valentini told me. For its part, Reddit doesn’t publicly disclose how many Reddit posts involve generative AI use.

To be clear, we’re not suggesting that Reddit has a large problem with generative AI use. By now, many subreddits seem to have agreed on their approach to AI-generated posts, and generative AI has not superseded the real, human voices that have made Reddit popular.

Still, mods largely agree that generative AI will likely get more popular on Reddit over the next few years, making generative AI modding increasingly important to both moderators and general users. Generative AI’s rising popularity has also had implications for Reddit the company, which in 2024 started licensing Reddit posts to train the large language models (LLMs) powering generative AI.

(Note: All the moderators I spoke with for this story requested that I use their Reddit usernames instead of their real names due to privacy concerns.)

No generative AI allowed

When it comes to anti-generative AI rules, numerous subreddits have zero-tolerance policies, while others permit posts that use generative AI if it’s combined with human elements or is executed very well. These rules task mods with identifying posts using generative AI and determining if they fit the criteria to be permitted on the subreddit.

Many subreddits have rules against posts made with generative AI because their mod teams or members consider such posts “low effort” or believe AI runs counter to the subreddit’s mission of providing real human expertise and creations.

“At a basic level, generative AI removes the human element from the Internet; if we allowed it, then it would undermine the very point of r/AskHistorians, which is engagement with experts,” the mods of r/AskHistorians told me in a collective statement.

The subreddit’s goal is to provide historical information, and its mods think generative AI could make information shared on the subreddit less accurate. “[Generative AI] is likely to hallucinate facts, generate non-existent references, or otherwise provide misleading content,” the mods said. “Someone getting answers from an LLM can’t respond to follow-ups because they aren’t an expert. We have built a reputation as a reliable source of historical information, and the use of [generative AI], especially without oversight, puts that at risk.”

Similarly, Halaku, a mod of r/wheeloftime, told me that the subreddit’s mods banned generative AI because “we focus on genuine discussion.” Halaku believes AI content can’t facilitate “organic, genuine discussion” and “can drown out actual artwork being done by actual artists.”

The r/lego subreddit banned AI-generated art because it caused confusion in online fan communities and retail stores selling Lego products, r/lego mod Mescad said. “People would see AI-generated art that looked like Lego on [I]nstagram or [F]acebook and then go into the store to ask to buy it,” they explained. “We decided that our community’s dedication to authentic Lego products doesn’t include AI-generated art.”

Not all of Reddit is against generative AI, of course. Subreddits dedicated to the technology exist, and some general subreddits permit the use of generative AI in some or all forms.

“When it comes to bans, I would rather focus on hate speech, Nazi salutes, and things that actually harm the subreddits,” said 3rdusernameiveused, who moderates r/consoom and r/TeamBuilder25, which don’t ban generative AI. “AI art does not do that… If I was going to ban [something] for ‘moral’ reasons, it probably won’t be AI art.”

“Overwhelmingly low-effort slop”

Some generative AI bans are reflective of concerns that people are not being properly compensated for the content they create, which is then fed into LLM training.

Mod Mathgeek007 told me that r/DeadlockTheGame bans generative AI because its members consider it “a form of uncredited theft,” adding:

You aren’t allowed to sell/advertise the work[s] of others, and AI in a sense is using patterns derived from the work of others to create mockeries. I’d personally have less of an issue with it if the artists involved were credited and compensated—and there are some niche AI tools that do this.

Other moderators simply think generative AI reduces the quality of a subreddit’s content.

“It often just doesn’t look good… the art can often look subpar,” Mathgeek007 said.

Similarly, r/videos bans most AI-generated content because, according to its announcement, the videos are “annoying” and “just bad video” 99 percent of the time. In an online interview, r/videos mod Abrownn told me:

It’s overwhelmingly low-effort slop thrown together simply for views/ad revenue. The creators rarely care enough to put real effort into post-generation [or] editing of the content [and] rarely have coherent narratives [in] the videos, etc. It seems like they just throw the generated content into a video, export it, and call it a day.

An r/fakemon mod told me, “I can’t think of anything more low-effort in terms of art creation than just typing words and having it generated for you.”

Some moderators say generative AI helps people spam unwanted content on a subreddit, including posts that are irrelevant to the subreddit and posts that attack users.

“[Generative AI] content is almost entirely posted for purely self promotional/monetary reasons, and we as mods on Reddit are constantly dealing with abusive users just spamming their content without regard for the rules,” Abrownn said.

A moderator of the r/wallpaper subreddit, which permits generative AI, disagrees. The mod told me that generative AI “provides new routes for novel content” in the subreddit and questioned concerns about generative AI stealing from human artists or offering lower-quality work, saying those problems aren’t unique to generative AI:

Even in our community, we observe human-generated content that is subjectively low quality (poor camera/[P]hotoshopping skills, low-resolution source material, intentional “shitposting”). It can be argued that AI-generated content amplifies this behavior, but our experience (which we haven’t quantified) is that the rate of such behavior (whether human-generated or AI-generated content) has not changed much within our own community.

But we’re not a very active community—[about] 13 posts per day … so it very well could be a “frog in boiling water” situation.

Generative AI “wastes our time”

Many mods are confident in their ability to effectively identify posts that use generative AI. A bigger problem is how much time it takes to identify these posts and remove them.

The r/AskHistorians mods, for example, noted that all bans on the subreddit (including bans unrelated to AI) have “an appeals process,” and “making these assessments and reviewing AI appeals means we’re spending a considerable amount of time on something we didn’t have to worry about a few years ago.”

They added:

Frankly, the biggest challenge with [generative AI] usage is that it wastes our time. The time spent evaluating responses for AI use, responding to AI evangelists who try to flood our subreddit with inaccurate slop and then argue with us in modmail [direct messages sent to a subreddit’s mod team], and discussing edge cases could better be spent on other subreddit projects, like our podcast, newsletter, and AMAs, … providing feedback to users, or moderating input from users who intend to positively contribute to the community.

Several other mods I spoke with agree. Mathgeek007, for example, named “fighting AI bros” as a common obstacle. And for r/wheeloftime moderator Halaku, the biggest challenge in moderating against generative AI is “a generational one.”

“Some of the current generation don’t have a problem with it being AI because content is content, and [they think] we’re being elitist by arguing otherwise, and they want to argue about it,” they said.

A couple of mods noted that it’s less time-consuming to moderate subreddits that ban generative AI than it is to moderate those that allow posts using generative AI, depending on the context.

“On subreddits where we allowed AI, I often take a bit longer time to actually go into each post where I feel like… it’s been AI-generated to actually look at it and make a decision,” explained N3DSdude, a mod of several subreddits with rules against generative AI, including r/DeadlockTheGame.

MyarinTime, a moderator for r/lewdgames, which allows generative AI images, highlighted the challenges of identifying human-prompted generative AI content versus AI-generated content prompted by a bot:

When the AI bomb started, most of those bots started using AI content to work around our filters. Most of those bots started showing some random AI render, so it looks like you’re actually talking about a game when you’re not. There’s no way to know when those posts are legit games unless [you check] them one by one. I honestly believe it would be easier if we kick any post with [AI-]generated image… instead of checking if a button was pressed by a human or not.

Mods expect things to get worse

Most mods told me it’s pretty easy for them to detect posts made with generative AI, pointing to the distinct tone and favored phrases of AI-generated text. A few said that AI-generated video is harder to spot but still detectable. But as generative AI gets more advanced, moderators are expecting their work to get harder.

In a joint statement, r/dune mods Blue_Three and Herbalhippie said, “AI used to have a problem making hands—i.e., too many fingers, etc.—but as time goes on, this is less and less of an issue.”

R/videos’ Abrownn also wonders how easy it will be to detect AI-generated Reddit content “as AI tools advance and content becomes more lifelike.”

Mathgeek007 added:

AI is becoming tougher to spot and is being propagated at a larger rate. When AI style becomes normalized, it becomes tougher to fight. I expect generative AI to get significantly worse—until it becomes indistinguishable from ordinary art.

Moderators currently use various methods to fight generative AI, but they’re not perfect. r/AskHistorians mods, for example, use “AI detectors, which are unreliable, problematic, and sometimes require paid subscriptions, as well as our own ability to detect AI through experience and expertise,” while N3DSdude pointed to tools like Quid and GPTZero.
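The manual cues mods describe—“distinct tone and favored phrases”—amount to a watchlist check. The sketch below is an illustrative toy in Python, not any mod team’s actual method or a real detector’s logic; the phrase list and function name are invented for this example, and real detection is far less reliable than simple substring matching.

```python
# Illustrative toy only: phrase-watchlist flagging, mimicking the manual
# cues mods describe. The phrase list is invented for this sketch.

SUSPECT_PHRASES = [
    "as an ai language model",
    "delve into",
    "in today's fast-paced world",
    "it's important to note that",
]

def flag_for_review(post_text: str) -> bool:
    """Return True if the post contains any watchlisted phrase (case-insensitive)."""
    lowered = post_text.lower()
    return any(phrase in lowered for phrase in SUSPECT_PHRASES)

print(flag_for_review("Let's delve into the rich tapestry of Roman history."))  # True
print(flag_for_review("My grandfather kept a diary during the war."))           # False
```

A heuristic like this only flags posts for human review; as the mods note, both false positives and misses are common, which is why appeals processes exist.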

To manage current and future work around blocking generative AI, most of the mods I spoke with said they’d like Reddit to release a proprietary tool to help them.

“I’ve yet to see a reliable tool that can detect AI-generated video content,” Abrownn said. “Even if we did have such a tool, we’d be putting hundreds of hours of content through the tool daily, which would get rather expensive rather quickly. And we’re unpaid volunteer moderators, so we will be outgunned shortly when it comes to detecting this type of content at scale. We can only hope that Reddit will offer us a tool at some point in the near future that can help deal with this issue.”

A Reddit spokesperson told me that the company is evaluating what such a tool could look like. But Reddit doesn’t have a rule banning generative AI overall, and the spokesperson said the company doesn’t want to release a tool that would hinder expression or creativity.

For now, Reddit seems content to rely on moderators to remove AI-generated content when appropriate. Reddit’s spokesperson added:

Our moderation approach helps ensure that content on Reddit is curated by real humans. Moderators are quick to remove content that doesn’t follow community rules, including harmful or irrelevant AI-generated content—we don’t see this changing in the near future.

Making a generative AI Reddit tool wouldn’t be easy

Reddit is handling the evolving concerns around generative AI as it has handled other content issues, including by leveraging AI and machine learning tools. Reddit’s spokesperson said that this includes testing tools that can identify AI-generated media, such as images of politicians.

But making a proprietary tool that allows moderators to detect AI-generated posts won’t be easy, if it happens at all. The current tools for detecting generative AI are limited in their capabilities, and as generative AI advances, Reddit would need to provide tools that are more advanced than the AI-detecting tools that are currently available.

That would require a good deal of technical resources and would also likely present notable economic challenges for the social media platform, which only became profitable last year. And as noted by r/videos moderator Abrownn, tools for detecting AI-generated video still have a long way to go, making a Reddit-specific system especially challenging to create.

But even with a hypothetical Reddit tool, moderators would still have their work cut out for them. And because Reddit’s popularity is largely due to its content from real humans, that work is important.

Since Reddit’s inception, that has meant relying on moderators, which Reddit has said it intends to keep doing. As r/dune mods Blue_Three and Herbalhippie put it, it’s in Reddit’s “best interest that much/most content remains organic in nature.” After all, Reddit’s profitability has a lot to do with how much AI companies are willing to pay to access Reddit data. That value would likely decline if Reddit posts became largely AI-generated themselves.

But providing the technology to ensure that generative AI isn’t abused on Reddit would be a large challenge. For now, volunteer laborers will continue to bear the brunt of generative AI moderation.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.


Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.


Reddit will lock some content behind a paywall this year, CEO says

Reddit is planning to introduce a paywall this year, CEO Steve Huffman said during a videotaped Ask Me Anything (AMA) session on Thursday.

Huffman previously showed interest in potentially introducing a new type of subreddit with “exclusive content or private areas” that Reddit users would pay to access.

When asked this week about plans for some Redditors to create “content that only paid members can see,” Huffman said:

It’s a work in progress right now, so that one’s coming… We’re working on it as we speak.

When asked about “new, key features that you plan to roll out for Reddit in 2025,” Huffman responded, in part: “Paid subreddits, yes.”

Reddit’s paywall would ostensibly only apply to certain new subreddit types, not any subreddits currently available. In August, Huffman said that even with paywalled content, free Reddit would “continue to exist and grow and thrive.”

A critical aspect of any potential plan to make Reddit users pay to access subreddit content is determining how related Reddit users will be compensated. Reddit may have a harder time getting volunteer moderators to wrangle discussions on paid-for subreddits—if it uses volunteer mods at all. Balancing paid and free content would also be necessary to avoid polarizing much of Reddit’s current user base.

Reddit has had paid-for premium versions of community features before, like r/Lounge, a subreddit that only people with Reddit Gold, which you have to buy with real money, can access.

Reddit would also need to consider how it might compensate people for user-generated content that people pay to access, as Reddit’s business is largely built on free, user-generated content. The Reddit Contributor Program, launched in September 2023, could be a foundation; it lets users “earn money for their qualifying contributions to the Reddit community, including awards and karma, collectible avatars, and developer apps,” according to Reddit. Reddit says it pays up to $0.01 per 1 Gold received, depending on how much karma the user has earned over the past year. To receive a payout, a user needs at least 1,000 Gold, which is equivalent to $10.
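The payout math quoted above fits in a few lines. This is only a sketch of the figures the article cites—the $0.01-per-Gold ceiling and the 1,000-Gold minimum—treated as a flat rate; Reddit’s actual payout logic varies with karma, and the function name is invented.

```python
# Sketch of the Contributor Program figures cited above, assuming the
# flat $0.01-per-Gold ceiling (actual rates depend on the user's karma).

MIN_GOLD = 1_000  # minimum Gold balance before a payout is issued

def payout_dollars(gold: int) -> float:
    """Dollar payout for a Gold balance, or 0.0 below the minimum."""
    if gold < MIN_GOLD:
        return 0.0
    return gold / 100  # $0.01 per Gold

print(payout_dollars(1_000))  # 10.0 -- the $10 minimum cited in the article
print(payout_dollars(999))    # 0.0  -- below the payout threshold
```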


Reddit won’t interfere with users revolting against X with subreddit bans

A Reddit spokesperson told Ars that decisions to ban or not ban X links are user-driven. Subreddit members are allowed to suggest and institute subreddit rules, they added.

“Notably, many Reddit communities also prohibit Reddit links,” the Reddit representative pointed out. They noted that Reddit as a company doesn’t currently have any ban on links to X.

A ban against links to an entire platform isn’t outside of the ordinary for Reddit. Numerous subreddits ban social media links, Reddit’s spokesperson said. r/EarthPorn, a subreddit for landscape photography, for example, doesn’t allow website links because all posts “must be static images,” per the subreddit’s official rules. r/AskReddit, meanwhile, only allows for questions asked in the title of a Reddit post and doesn’t allow for use of the text box, including for sharing links.

“Reddit has a longstanding commitment to freedom of speech and freedom of association,” Reddit’s spokesperson said. They added that any person is free to make or moderate their own community. Those unsatisfied with a forum about Seahawks football that doesn’t have X links could feel free to make their own subreddit, though some of the subreddits considering X bans, like r/MadeMeSmile, already have millions of followers.

Meta bans also under discussion

As 404 Media noted, some Redditors are also pushing to block content from Facebook, Instagram, and other Meta properties in response to new Donald Trump-friendly policies instituted by owner Mark Zuckerberg, like Meta killing diversity programs and axing third-party fact-checkers.


Reddit debuts AI-powered discussion search—but will users like it?

The company then went on to strike deals with major tech firms, including a $60 million agreement with Google in February 2024 and a partnership with OpenAI in May 2024 that integrated Reddit content into ChatGPT.

But Reddit users haven’t been entirely happy with the deals. In October 2024, London-based Redditors began posting false restaurant recommendations to manipulate search results and keep tourists away from their favorite spots. This coordinated effort to feed incorrect information into AI systems demonstrated how user communities might intentionally “poison” AI training data over time.

The potential for trouble

While it’s tempting to lean heavily into generative AI while the technology is trendy, the move could also present a challenge for the company. For example, Reddit’s AI-powered summaries could draw from inaccurate information featured on the site and provide incorrect answers, or they may draw inaccurate conclusions from correct information.

We will keep an eye on Reddit’s new AI-powered search tool to see if it resists the type of confabulation that we’ve seen with Google’s AI Overview, an AI summary bot that has been a critical failure so far.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.


Annoyed Redditors tanking Google Search results illustrates perils of AI scrapers

Fed up Londoners

Apparently, some London residents are getting fed up with social media influencers whose reviews draw long lines of tourists to their favorite restaurants, sometimes just for the likes. Christian Calgie, a reporter for London-based news publication Daily Express, pointed out this trend on X yesterday, noting the boom of Redditors referring people to Angus Steakhouse, a chain restaurant, to combat it.

As Gizmodo deduced, the trend seemed to start on the r/London subreddit, where a user complained about a spot in Borough Market being “ruined by influencers” on Monday:

“Last 2 times I have been there has been a queue of over 200 people, and the ones with the food are just doing the selfie shit for their [I]nsta[gram] pages and then throwing most of the food away.”

As of this writing, the post has 4,900 upvotes and numerous responses suggesting that Redditors talk about how good Angus Steakhouse is so that Google picks up on it. Commenters quickly understood the assignment.

“Agreed with other posters Angus steakhouse is absolutely top tier and tourists shoyldnt [sic] miss out on it,” one Redditor wrote.

Another Reddit user wrote:

Spreading misinformation suddenly becomes a noble goal.

As of this writing, asking Google for the best steak, steakhouse, or steak sandwich in London (or similar) isn’t generating an AI Overview result for me. But when I searched for the best steak sandwich in London, the top result is from Reddit, including a thread from four days ago titled “Which Angus Steakhouse do you recommend for their steak sandwich?” and one from two days ago titled “Had to see what all the hype was about, best steak sandwich I’ve ever had!” with a picture of an Angus Steakhouse.


In fear of more user protests, Reddit announces controversial policy change

Protest blowback —

Moderators now need Reddit’s permission to turn subreddits private, NSFW.


Following site-wide user protests last year that featured moderators turning thousands of subreddits private or not-safe-for-work (NSFW), Reddit announced that mods now need its permission to make those changes.

Reddit’s VP of community, going by Go_JasonWaterfalls, made the announcement about what Reddit calls Community Types today. Reddit’s permission is also required to make subreddits restricted or to go from NSFW to safe-for-work (SFW). Reddit’s employee claimed that requests will be responded to “in under 24 hours.”

Reddit’s employee said that “temporarily going restricted is exempt” from this requirement, adding that “mods can continue to instantly restrict posts and/or comments for up to 7 days using Temporary Events.” Additionally, if a subreddit has fewer than 5,000 members or is less than 30 days old, the request “will be automatically approved,” per Go_JasonWaterfalls.
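The auto-approval rule described in that announcement reduces to a one-line predicate. This is a hypothetical sketch of the policy as stated—fewer than 5,000 members or under 30 days old—with invented function and parameter names, not Reddit’s implementation.

```python
# Hypothetical sketch of the stated auto-approval rule for Community Type
# change requests. Names are invented; this is not Reddit's code.

MEMBER_THRESHOLD = 5_000  # auto-approved below this member count...
AGE_THRESHOLD_DAYS = 30   # ...or if the subreddit is younger than this

def change_auto_approved(members: int, age_days: int) -> bool:
    """True if a Community Type change request skips manual review."""
    return members < MEMBER_THRESHOLD or age_days < AGE_THRESHOLD_DAYS

print(change_auto_approved(members=4_000, age_days=365))   # True: small community
print(change_auto_approved(members=80_000, age_days=10))   # True: new community
print(change_auto_approved(members=80_000, age_days=365))  # False: needs review
```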

Reddit’s post includes a list of “valid” reasons that mods tend to change their subreddit’s Community Type and provides alternative solutions.

Last year’s protests “accelerated” this policy change

Last year, Reddit announced that it would be charging a massive amount for access to its previously free API. This caused many popular third-party Reddit apps to close down. Reddit users then protested by turning subreddits private (or read-only) or by only showing NSFW content or jokes and memes. Reddit then responded by removing some moderators; eventually, the protests subsided.

Reddit, which previously admitted that another similar protest could hurt it financially, has maintained that moderators’ actions during the protests broke its rules. Now, it has solidified a way to prevent something like last year’s site-wide protests from happening again.

Speaking to The Verge, Laura Nestler, who The Verge reported is Go_JasonWaterfalls, claimed that Reddit has been talking about making this change since at least 2021. The protests, she said, were a wake-up call that moderators’ ability to turn subreddits private “could be used to harm Reddit at scale.” The protests “accelerated” the policy change, per Nestler.

The announcement on r/modnews reads:

… the ability to instantly change Community Type settings has been used to break the platform and violate our rules. We have a responsibility to protect Reddit and ensure its long-term health, and we cannot allow actions that deliberately cause harm.

After shutting down a tactic for responding to unfavorable Reddit policy changes, Go_JasonWaterfalls claimed that Reddit still wants to hear from users.

“Community Type settings have historically been used to protest Reddit’s decisions,” they wrote.

“While we are making this change to ensure users’ expectations regarding a community’s access do not suddenly change, protest is allowed on Reddit. We want to hear from you when you think Reddit is making decisions that are not in your communities’ best interests. But if a protest crosses the line into harming redditors and Reddit, we’ll step in.”

Last year’s user protests illustrated how dependent Reddit is on unpaid moderators and user-generated content. At times, things turned ugly, pitting Reddit executives against long-time users (Reddit CEO Steve Huffman infamously called Reddit mods “landed gentry,” something that some were quick to remind Go_JasonWaterfalls of) and reportedly worrying Reddit employees.

Although the protests failed to reverse Reddit’s prohibitive API fees or to save most third-party apps, they succeeded in getting users’ concerns heard and even crashed Reddit for three hours. Further, NSFW protests temporarily prevented Reddit from selling ads on some subreddits. Since going public this year and amid a push to reach profitability, Reddit has been more focused on ads than ever. (Most of Reddit’s money comes from ads.)

Reddit’s Nestler told The Verge that the new policy was reviewed by Reddit’s Mod Council. Reddit is confident that it won’t lose mods because of the change, she said.

“Demotes us all to janitors”

The news marks another broad policy change that is likely to upset users and make Reddit seem unwilling to give in to user feedback, despite Go_JasonWaterfalls saying that “protest is allowed on Reddit.” For example, in response, Reddit user CouncilOfStrongs said:

Don’t lie to us, please.

Something that you can ignore because it has no impact cannot be a protest, and no matter what you say that is obviously the one and only point of you doing this – to block moderators from being able to hold Reddit accountable in even the smallest way for malicious, irresponsible, bad faith changes that they make.

Reddit user belisaurius, who is listed as a mod for several active subreddits, including a 336,000-member one for the Philadelphia Eagles NFL team, said that the policy change “removes moderators from any position of central responsibility and demotes us all to janitors.”

As Reddit continues seeking profits and seemingly more control over a platform built around free user-generated content and moderation, users will have to either accept that Reddit is changing or leave the platform.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder in Reddit.


Reddit considers search ads, paywalled content for the future

Q2 2024 earnings —

Current ad load is relatively “light,” COO says.


Reddit executives discussed plans on Tuesday for making more money from the platform, including showing ads in more places and possibly putting some content behind a paywall.

On Tuesday, Reddit shared its Q2 2024 earnings report (PDF). The company lost $10.1 million during the period, down from Q2 2023’s $41.1 million loss. Reddit has never been profitable, and during its earnings call yesterday, company heads discussed potential and slated plans for monetization.

As expected, selling ads continues to be a priority. Part of the reason Reddit was OK with most third-party Reddit apps closing was that the change was expected to drive people to Reddit’s native website and apps, where the company sells ads. In Q2, Reddit’s ad revenue grew 41 percent year over year (YoY) to $253.1 million, or 90 percent of total revenue ($281.2 million).
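The revenue figures above are easy to sanity-check. The snippet below just reproduces the article’s arithmetic; the implied Q2 2023 figure is derived from the stated 41 percent growth, not quoted in the report.

```python
# Sanity-checking the quoted Q2 2024 figures: $253.1M ad revenue,
# $281.2M total revenue, and 41 percent year-over-year ad growth.

ad_revenue = 253.1     # $M, Q2 2024 advertising revenue
total_revenue = 281.2  # $M, Q2 2024 total revenue

ad_share = ad_revenue / total_revenue * 100
print(round(ad_share))  # 90 -- matches "90 percent of total revenue"

# Implied Q2 2023 ad revenue from the 41% YoY growth (derived, not stated):
implied_prior = ad_revenue / 1.41
print(round(implied_prior, 1))  # roughly 179.5 ($M)
```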

When asked how the platform would grow ad revenue, Reddit COO Jen Wong said it’s important that advertisers “find the outcomes they want at the volumes and price they want.” She also pointed to driving more value per ad, or the cost that advertisers pay per 1,000 impressions on average. To do that, Wong pointed to putting ads in places on Reddit where there aren’t ads currently:

There are still many places on Reddit without ads today. So we’re more focused on designing ads for spaces where users are spending more time versus increasing ad load in existing spaces. So for example, 50 percent of screen views, they’re now on conversation pages—that’s an opportunity.

Wong said that in places where Reddit does show ads currently, the ad load is “light” compared to about half of its rivals.

One of the places where Redditors may see more ads is within comments, which Wong noted that Reddit is currently testing. This ad capability is only “experimental,” Wong emphasized, but Reddit sees ads in comments as a way to stand out to advertisers.

There’s also an opportunity to sell ad space within Reddit search results, according to Reddit CEO Steve Huffman, who said yesterday that “over the long term, there’s significant advertising potential there as well.” More immediately, though, Reddit is looking to improve its search capabilities and this year will test “new search result pages powered by AI to summarize and recommend content, helping users dive deeper into products, shows, games, and discover new communities on Reddit,” Huffman revealed yesterday. He said Reddit is using first- and third-party AI models to improve search aspects like speed and relevance.

The move comes as Reddit is currently blocking all search engines besides Google, OpenAI, and approved education/research instances from showing recent Reddit content in their results. Yesterday, Huffman reiterated his statement that Reddit is working with “big and small” search engines to strike deals like it already has with Google and OpenAI. But looking ahead, Reddit is focused on charging for content scraping and seems to be trying to capitalize on people’s interest in using Reddit as a filter for search results.

Paywalled content possible

The possibility of paywalls came up during the earnings call when an analyst asked Huffman about maintaining Reddit’s culture as it looks to “earn money now for people and creators on the platform.” Reddit has already launched a Contributor Program, where popular posts can make Reddit users money. It has discussed monetizing its developer platform, which is in public beta with “a few hundred active developers,” Huffman said yesterday. In response to the analyst’s question, Huffman said that based on his experience, adding new ways of using Reddit “expands” the platform but doesn’t “cannibalize existing Reddit.”

He continued:

I think the existing altruistic, free version of Reddit will continue to exist and grow and thrive just the way it has. But now we will unlock the door for new use cases, new types of subreddits that can be built that may have exclusive content or private areas—things of that nature.

Huffman’s comments suggest that paywalls could be added to new subreddits rather than existing ones. At this stage, though, it’s unclear how users may be charged to use Reddit in the future, if at all.

The idea of paywalling some content comes as various online entities are trying to diversify revenue beyond often volatile ad spending. Reddit has also tried elevating free aspects of the site, such as updates to Ask Me Anything (AMA), including new features like RSVPs, which were announced Tuesday.

Reddit has angered some long-time users with recent changes—including blocking search engines, forcing personalized ads, introducing an exclusionary fee for API access, and ousting some moderators during last year’s user protests—but Reddit saw its daily active unique user count increase by 51 percent YoY in Q2 to 91.2 million.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder in Reddit.


Google’s AI Overview is flawed by design, and a new company blog post hints at why

guided by voices —

Google: “There are bound to be some oddities and errors” in system that told people to eat rocks.

The Google “G” logo surrounded by whimsical characters, all of which look stunned and surprised.

On Thursday, Google capped off a rough week of providing inaccurate and sometimes dangerous answers through its experimental AI Overview feature by authoring a follow-up blog post titled, “AI Overviews: About last week.” In the post, attributed to Google VP Liz Reid, head of Google Search, the firm formally acknowledged issues with the feature and outlined steps taken to improve a system that appears flawed by design, even if the company doesn’t seem to realize it is admitting as much.

To recap, the AI Overview feature—which the company showed off at Google I/O a few weeks ago—aims to provide search users with summarized answers to questions by using an AI model integrated with Google’s web ranking systems. Right now, it’s an experimental feature that is not active for everyone, but when a participating user searches for a topic, they might see an AI-generated answer at the top of the results, pulled from highly ranked web content and summarized by an AI model.
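In outline, the retrieve-then-summarize pattern Google describes can be sketched as follows. This is purely illustrative: the real system is proprietary, and every name here (the index shape, `ai_overview`, the `summarize` callable) is a stand-in, not Google’s API.

```python
# Hypothetical sketch of a retrieval-then-summarize pipeline.
# All names and data structures are illustrative, not Google's.

def ai_overview(query, index, summarize):
    """Return a summarized answer grounded in the top-ranked results."""
    results = sorted(index.get(query, []), key=lambda r: r["rank"])[:3]
    snippets = [r["snippet"] for r in results]
    # The language model is constrained to the retrieved snippets --
    # which is also why bad top results produce bad answers.
    return {
        "answer": summarize(snippets),
        "sources": [r["url"] for r in results],
    }

# Toy index with a single ranked result for one query.
index = {
    "sega saturn release date": [
        {"rank": 1,
         "snippet": "The Saturn launched in Japan in 1994.",
         "url": "https://example.com/saturn"},
    ]
}
overview = ai_overview("sega saturn release date", index,
                       lambda snippets: " ".join(snippets))
print(overview["answer"])
```

The key design choice, as the blog post itself confirms, is that the answer is only as good as whatever ranks highly for the query; the summarizer never checks the snippets against anything else.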

While Google claims this approach is “highly effective” and on par with its Featured Snippets in terms of accuracy, the past week has seen numerous examples of the AI system generating bizarre, incorrect, or even potentially harmful responses, as we detailed in a recent feature where Ars reporter Kyle Orland replicated many of the unusual outputs.

Drawing inaccurate conclusions from the web

On Wednesday morning, Google’s AI Overview was erroneously telling us the Sony PlayStation and Sega Saturn were available in 1993.

Kyle Orland / Google

Given the circulating AI Overview examples, Google almost apologizes in the post and says, “We hold ourselves to a high standard, as do our users, so we expect and appreciate the feedback, and take it seriously.” But Reid, in an attempt to justify the errors, then goes into some very revealing detail about why AI Overviews provides erroneous information:

AI Overviews work very differently than chatbots and other LLM products that people may have tried out. They’re not simply generating an output based on training data. While AI Overviews are powered by a customized language model, the model is integrated with our core web ranking systems and designed to carry out traditional “search” tasks, like identifying relevant, high-quality results from our index. That’s why AI Overviews don’t just provide text output, but include relevant links so people can explore further. Because accuracy is paramount in Search, AI Overviews are built to only show information that is backed up by top web results.

This means that AI Overviews generally don’t “hallucinate” or make things up in the ways that other LLM products might.

Here we see the fundamental flaw of the system: “AI Overviews are built to only show information that is backed up by top web results.” The design is based on the false assumption that Google’s page-ranking algorithm favors accurate results and not SEO-gamed garbage. Google Search has been broken for some time, and now the company is relying on those gamed and spam-filled results to feed its new AI model.

Even if the AI model draws from a more accurate source, as with the 1993 game console search seen above, Google’s AI language model can still make inaccurate conclusions about the “accurate” data, confabulating erroneous information in a flawed summary of the information available.

Generally ignoring the folly of basing its AI results on a broken page-ranking algorithm, Google’s blog post instead attributes the commonly circulated errors to several other factors, including users making nonsensical searches “aimed at producing erroneous results.” Google does admit faults with the AI model, like misinterpreting queries, misinterpreting “a nuance of language on the web,” and lacking sufficient high-quality information on certain topics. It also suggests that some of the more egregious examples circulating on social media are fake screenshots.

“Some of these faked results have been obvious and silly,” Reid writes. “Others have implied that we returned dangerous results for topics like leaving dogs in cars, smoking while pregnant, and depression. Those AI Overviews never appeared. So we’d encourage anyone encountering these screenshots to do a search themselves to check.”

(No doubt some of the social media examples are fake, but it’s worth noting that any attempts to replicate those early examples now will likely fail because Google will have manually blocked the results. And it is potentially a testament to how broken Google Search is if people believed extreme fake examples in the first place.)

While addressing the “nonsensical searches” angle in the post, Reid uses the example search, “How many rocks should I eat each day,” which went viral in a tweet on May 23. Reid says, “Prior to these screenshots going viral, practically no one asked Google that question.” And since there isn’t much data on the web that answers it, she says there is a “data void” or “information gap” that was filled by satirical content found on the web, and the AI model found it and pushed it as an answer, much like Featured Snippets might. So basically, it was working exactly as designed.

A screenshot of an AI Overview query, “How many rocks should I eat each day,” that went viral on X last week.


Bing outage shows just how little competition Google search really has

Searching for new search —

Opinion: Actively searching without Google or Bing is harder than it looks.

Google logo on a phone in front of a Bing logo in the background

Getty Images

Bing, Microsoft’s search engine platform, went down in the very early morning today. That meant that searches from Microsoft’s Edge browsers that had yet to change their default providers didn’t work. It also meant that services relying on Bing’s search API—Microsoft’s own Copilot, ChatGPT search, Yahoo, Ecosia, and DuckDuckGo—similarly failed.

Services were largely restored by the morning Eastern work hours, but the timing feels apt, concerning, or some combination of the two. Google, the consistently dominating search platform, just last week announced and debuted AI Overviews as a default addition to all searches. If you don’t want an AI response but still want to use Google, you can hunt down the new “Web” option in a menu, or you can, per Ernie Smith, tack “&udm=14” onto your search or use Smith’s own “Konami code” shortcut page.
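The `&udm=14` trick amounts to appending one query-string parameter to a standard Google search URL. A minimal sketch (the helper name is ours; `udm=14` is the parameter described above):

```python
from urllib.parse import quote_plus


def web_only_search_url(query: str) -> str:
    # udm=14 selects Google's plain "Web" results view,
    # which skips the AI Overview box.
    return f"https://www.google.com/search?q={quote_plus(query)}&udm=14"


print(web_only_search_url("sega saturn release date"))
```

Bookmarking such a URL as a browser search keyword gives the same effect as Smith’s shortcut page.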

If dismay about AI’s hallucinations, power draw, or pizza recipes concern you—along with perhaps broader Google issues involving privacy, tracking, news, SEO, or monopoly power—most of your other major options were brought down by a single API outage this morning. Moving past that kind of single point of vulnerability will take some work, both by the industry and by you, the person wondering if there’s a real alternative.

Search engine market share, as measured by StatCounter, April 2023–April 2024.

Search engine market share, as measured by StatCounter, April 2023–April 2024.

StatCounter

Upward of a billion dollars a year

The overwhelming majority of search tools offering an “alternative” to Google are using Google, Bing, or Yandex, the three major search engines that maintain massive global indexes. Yandex, being based in Russia, is a non-starter for many people around the world at the moment. Bing offers its services widely, most notably to DuckDuckGo, but its ad-based revenue model and privacy particulars have caused some friction there in the past. Before his company was able to block more of Microsoft’s own tracking scripts, DuckDuckGo CEO and founder Gabriel Weinberg explained in a Reddit reply why firms like his weren’t going the full DIY route:

… [W]e source most of our traditional links and images privately from Bing … Really only two companies (Google and Microsoft) have a high-quality global web link index (because I believe it costs upwards of a billion dollars a year to do), and so literally every other global search engine needs to bootstrap with one or both of them to provide a mainstream search product. The same is true for maps btw — only the biggest companies can similarly afford to put satellites up and send ground cars to take streetview pictures of every neighborhood.

Bing makes Microsoft money, if not quite profit yet. It’s in Microsoft’s interest to keep its search index stocked and API open, even if its focus is almost entirely on its own AI chatbot version of Bing. Yet if Microsoft decided to pull API access, or it became unreliable, Google’s default position gets even stronger. What would non-conformists have to choose from then?


OpenAI will use Reddit posts to train ChatGPT under new deal

Data dealings —

Reddit has been eager to sell data from user posts.

An image of a woman holding a cell phone in front of the Reddit logo displayed on a computer screen, on April 29, 2024, in Edmonton, Canada.

Stuff posted on Reddit is getting incorporated into ChatGPT, Reddit and OpenAI announced on Thursday. The new partnership grants OpenAI access to Reddit’s Data API, giving the generative AI firm real-time access to Reddit posts.

Reddit content will be incorporated into ChatGPT “and new products,” Reddit’s blog post said. The social media firm claims the partnership will “enable OpenAI’s AI tools to better understand and showcase Reddit content, especially on recent topics.” OpenAI will also start advertising on Reddit.

The deal is similar to one that Reddit struck with Google in February that allows the tech giant to make “new ways to display Reddit content” and provide “more efficient ways to train models,” Reddit said at the time. Neither Reddit nor OpenAI disclosed the financial terms of their partnership, but Reddit’s partnership with Google was reportedly worth $60 million.

Under the OpenAI partnership, Reddit also gains access to OpenAI large language models (LLMs) to create features for Reddit, including its volunteer moderators.

Reddit’s data licensing push

The news comes about a year after Reddit launched an API war by starting to charge for access to its data API. This resulted in many beloved third-party Reddit apps closing and a massive user protest. Reddit, which would soon become a public company and hadn’t turned a profit yet, said one of the reasons for the sudden change was to prevent AI firms from using Reddit content to train their LLMs for free.

Earlier this month, Reddit published a Public Content Policy stating: “Unfortunately, we see more and more commercial entities using unauthorized access or misusing authorized access to collect public data in bulk, including Reddit public content. Worse, these entities perceive they have no limitation on their usage of that data, and they do so with no regard for user rights or privacy, ignoring reasonable legal, safety, and user removal requests.”

In its blog post on Thursday, Reddit said that deals like OpenAI’s are part of an “open” Internet. It added that “part of being open means Reddit content needs to be accessible to those fostering human learning and researching ways to build community, belonging, and empowerment online.”

Reddit has been vocal about its interest in pursuing data licensing deals as a core part of its business. Its AI partnerships have sparked debate over the use of user-generated content to fuel AI models without compensating users, some of whom likely never considered that their social media posts would be used this way. OpenAI and Stack Overflow faced pushback earlier this month when integrating Stack Overflow content with ChatGPT. Some of Stack Overflow’s user community responded by sabotaging their own posts.

OpenAI is also challenged to work with Reddit data that, like much of the Internet, can be filled with inaccuracies and inappropriate content. Some of the biggest opponents of Reddit’s API rule changes were volunteer mods. Some have exited the platform since, and following the rule changes, Ars Technica spoke with long-time Redditors who were concerned about Reddit content quality moving forward.

Regardless, generative AI firms are keen to tap into Reddit’s access to real-time conversations from a variety of people discussing a nearly endless range of topics. And Reddit seems equally eager to license the data from its users’ posts.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.


Reddit, AI spam bots explore new ways to show ads in your feed

BRAZIL - 2024/04/08: In this photo illustration, a Reddit logo seen displayed on a computer screen through a magnifying glass

Reddit has made it clear that it’s an ad-first business. Today, it expanded on that practice with a new ad format that looks to sell things to Reddit users. Simultaneously, Reddit has marketers who are interested in pushing products to users through seemingly legitimate accounts.

In a blog post today, Reddit announced that its Dynamic Product Ads are entering public beta globally. The ad format combines “shopping signals” (discussions in which people express interest in trying a product or brand), machine learning, and advertiser product catalogs to serve relevant ads. Reddit shared an image in the blog post showing ads, complete with products and pricing, that appear to relate to a posted question. User responses to the Reddit post appear under the ad.

  • A somewhat blurry depiction of the new type of ads Reddit is testing.

  • A (still blurry) example of a more targeted approach to Reddit’s new ad format.

Reddit’s Dynamic Product Ads can automatically show users ads “based on the products they’ve previously engaged with on the advertiser’s site” and/or “based on what people engage with on Reddit or advertiser sites,” per the blog.

Reddit is an ad business

Reddit’s blog didn’t imply that Dynamic Product Ads means users would see more ads than they do currently. However, today’s blog highlighted the newly public company’s focus on ad sales.

“With Dynamic Product Ads, brands can tap into the rich, high-intent product conversations that people come to Reddit for,” Reddit EVP of Business Marketing and Growth Jim Squires said in a statement.

The blog also noted that “Reddit’s communities are naturally commercial,” adding:

Reddit is where people come to make shopping decisions, and we’re focused on bringing brands into these interactions in a way that adds value for people and drives growth for businesses.

The stance has been increasingly clear over the past year, as Reddit became rather vocal about the fact that it’s never been profitable. In June, the company started charging for API access, resulting in numerous valued third-party Reddit apps closing and messy user protests that left a bad taste in countless long-time users’ and moderators’ mouths. While Reddit initially announced the change as a way to prevent large language models from using its data for free training, it was also seen as a way to drive users to Reddit’s website and mobile app, where it can serve users ads.

Per Reddit’s February SEC filing (PDF), ads made up 98 percent of Reddit’s revenues in 2023 and 2022. That filing included a note from CEO Steve Huffman, saying: “Advertising is our first business” and that Reddit’s ad business is “still in the early phases of growing.”

In September, the company started preventing users from opting out of personalized ads. In June, Reddit introduced a new tool to advertisers that uses natural language processing to look through Reddit user comments for keywords that signal potential interest for a brand.
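Conceptually, that kind of keyword-signal tool can be sketched in a few lines. This is a hypothetical illustration only: Reddit’s actual advertiser tool is proprietary, and the keyword set and function name below are invented for the example.

```python
# Hypothetical sketch of keyword-based interest detection over comments.
# The keyword set and function name are illustrative, not Reddit's.
BRAND_KEYWORDS = {"trail shoes", "running shoes", "marathon"}


def comments_signaling_interest(comments):
    """Return comments containing any brand keyword (case-insensitive)."""
    return [c for c in comments
            if any(kw in c.lower() for kw in BRAND_KEYWORDS)]


hits = comments_signaling_interest([
    "Looking for trail shoes for my first ultra",
    "Anyone tried a sourdough starter kit?",
])
print(hits)
```

A production system would presumably use richer NLP than substring matching, but the business logic is the same: surface comments whose language signals purchase intent for a given brand.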

Reddit’s blog post today hinted at some future evolutions focused on showing Reddit users ads, including “tools and features such as new shopping ads formats like collection ads that enhance the shopper experience while driving performance” and “merchant platform integrations that welcome smaller merchants.”


After overreaching TOS angers users, cloud provider Vultr backs off

“Clearly causing confusion” —

Terms seemed to grant an “irrevocable” right to commercialize any user content.


After backlash, the cloud provider Vultr has updated its terms to remove a clause that a Reddit user feared required customers to “fork over rights” to “anything” hosted on its platform.

The alarming clause seemed to grant Vultr a “non-exclusive, perpetual, irrevocable” license to “use and commercialize” any user content uploaded, posted, hosted, or stored on Vultr “in any way that Vultr deems appropriate, without any further consent” or compensation to users or third parties.

Here’s the full clause that was removed:

You hereby grant to Vultr a non-exclusive, perpetual, irrevocable, royalty-free, fully paid-up, worldwide license (including the right to sublicense through multiple tiers) to use, reproduce, process, adapt, publicly perform, publicly display, modify, prepare derivative works, publish, transmit and distribute each of your User Content, or any portion thereof, in any form, medium or distribution method now known or hereafter existing, known or developed, and otherwise use and commercialize the User Content in any way that Vultr deems appropriate, without any further consent, notice and/or compensation to you or to any third parties, for purposes of providing the Services to you.

In a statement provided to Ars, Vultr CEO J.J. Kardwell said that the terms were revised to “simplify and clarify” language causing confusion for some users.

“A Reddit post incorrectly took portions of our Terms of Service out of context, which only pertain to content provided to Vultr on our public mediums (community-related content on public forums, as an example) for purposes of rendering the needed services—e.g., publishing comments, posts, or ratings,” Kardwell said. “This is separate from a user’s own, private content that is deployed on Vultr services.”

It’s easy to see why the Reddit user was confused, as the previous terms did not clearly differentiate between a user’s public and “private content” in the paragraph where it was included. Kardwell told The Register that the old terms, which were drafted in 2021, were “clearly causing confusion for some portion of users” and were updated because Vultr recognized “that the average user doesn’t have a law degree.”

According to Kardwell, the part of the removed clause that “ends with ‘for purposes of providing the Services to you'” was “intended to make it clear that any rights referenced are solely for the purposes of providing the Services to you.” Kevin Cochrane, Vultr’s chief marketing officer, told Ars that users were intended to scroll down to understand that the line only applied to community content described in a section labeled “content that you make publicly available.” He said that the removed clause was necessary in 2021 when Vultr provided forums and collected ratings, but that the clause could be stripped now because “we don’t actually use” that kind of community content “any longer.”

“We’re very focused on being responsive to the community and the concerns people have, and we believe the strongest thing we can do to demonstrate that there is no bad intent here is to remove it,” Kardwell told The Register.

A plain read of the terms without scrolling seemed to substantiate the Reddit user’s worst fears that “it’s possible Vultr may want the expansive license grant to do AI/Machine Learning based on the data they host. Or maybe they could mine database contents to resell [personally identifiable information]. Given the (perpetual!) license, there’s not really any limit to what they might do. They could even clone someone’s app and sell their own rebranded version, and they’d be legally in the clear.”

The user claimed to have been locked out of their Vultr account for five days after refusing to agree to the terms, with Vultr’s support team seemingly providing little recourse to migrate data to a new cloud provider.

“Migrating all my servers and DNS without being able to log in to my account is going to be both a headache and error prone,” the Reddit user wrote. “I feel like they’re holding my business hostage and extorting me into accepting a license I would never consent to under duress.”

Ars was not able to reach the Reddit user to see if Vultr removing the line from the terms has resolved the issue. Other users on the thread claimed that they had terminated their Vultr accounts over the controversy. Cochrane told Ars that Vultr had been contacted by many customers over the past two days and had no way to identify the Reddit user to confirm whether that person had terminated their account. Cochrane said the support team was actively reaching out to users to verify if their complaints stemmed from discomfort with the previous terms.

In his statement, Kardwell reiterated that Vultr “customers own 100 percent of their content,” clarifying that Vultr “has never claimed any rights to, used, accessed, nor allowed access to or shared” user content, “other than as may be required by law or for security purposes.”

He also confirmed that Vultr would be conducting a “full review” of its terms and publishing another update “soon.” Kardwell told The Register that the most recent update to its terms that led the Reddit user to call out the company was “actually spurred by unrelated Microsoft licensing changes,” promising that Vultr has no plans to use or commercialize user data.

“We do not use user data,” Kardwell told The Register. “We never have, and we never will. We take privacy and security very seriously. It’s at the core of what we do globally.”
