machine learning

debate-over-“open-source-ai”-term-brings-new-push-to-formalize-definition

Debate over “open source AI” term brings new push to formalize definition

A man peers over a glass partition, seeking transparency.

Enlarge / A man peers over a glass partition, seeking transparency.

The Open Source Initiative (OSI) recently unveiled its latest draft definition for “open source AI,” aiming to clarify the ambiguous use of the term in the fast-moving field. The move comes as some companies like Meta release trained AI language model weights and code with usage restrictions while using the “open source” label. This has sparked intense debates among free-software advocates about what truly constitutes “open source” in the context of AI.

For instance, Meta’s Llama 3 model, while freely available, doesn’t meet the traditional open source criteria as defined by the OSI for software because it imposes license restrictions on usage due to company size or what type of content is produced with the model. The AI image generator Flux is another “open” model that is not truly open source. Because of this type of ambiguity, we’ve typically described AI models that include code or weights with restrictions or lack accompanying training data with alternative terms like “open-weights” or “source-available.”

To address the issue formally, the OSI—which is well-known for its advocacy for open software standards—has assembled a group of about 70 participants, including researchers, lawyers, policymakers, and activists. Representatives from major tech companies like Meta, Google, and Amazon also joined the effort. The group’s current draft (version 0.0.9) definition of open source AI emphasizes “four fundamental freedoms” reminiscent of those defining free software: giving users of the AI system permission to use it for any purpose without permission, study how it works, modify it for any purpose, and share with or without modifications.

By establishing clear criteria for open source AI, the organization hopes to provide a benchmark against which AI systems can be evaluated. This will likely help developers, researchers, and users make more informed decisions about the AI tools they create, study, or use.

Truly open source AI may also shed light on potential software vulnerabilities of AI systems, since researchers will be able to see how the AI models work behind the scenes. Compare this approach with an opaque system such as OpenAI’s ChatGPT, which is more than just a GPT-4o large language model with a fancy interface—it’s a proprietary system of interlocking models and filters, and its precise architecture is a closely guarded secret.

OSI’s project timeline indicates that a stable version of the “open source AI” definition is expected to be announced in October at the All Things Open 2024 event in Raleigh, North Carolina.

“Permissionless innovation”

In a press release from May, the OSI emphasized the importance of defining what open source AI really means. “AI is different from regular software and forces all stakeholders to review how the Open Source principles apply to this space,” said Stefano Maffulli, executive director of the OSI. “OSI believes that everybody deserves to maintain agency and control of the technology. We also recognize that markets flourish when clear definitions promote transparency, collaboration and permissionless innovation.”

The organization’s most recent draft definition extends beyond just the AI model or its weights, encompassing the entire system and its components.

For an AI system to qualify as open source, it must provide access to what the OSI calls the “preferred form to make modifications.” This includes detailed information about the training data, the full source code used for training and running the system, and the model weights and parameters. All these elements must be available under OSI-approved licenses or terms.

Notably, the draft doesn’t mandate the release of raw training data. Instead, it requires “data information”—detailed metadata about the training data and methods. This includes information on data sources, selection criteria, preprocessing techniques, and other relevant details that would allow a skilled person to re-create a similar system.

The “data information” approach aims to provide transparency and replicability without necessarily disclosing the actual dataset, ostensibly addressing potential privacy or copyright concerns while sticking to open source principles, though that particular point may be up for further debate.

“The most interesting thing about [the definition] is that they’re allowing training data to NOT be released,” said independent AI researcher Simon Willison in a brief Ars interview about the OSI’s proposal. “It’s an eminently pragmatic approach—if they didn’t allow that, there would be hardly any capable ‘open source’ models.”

Debate over “open source AI” term brings new push to formalize definition Read More »

ars-technica-content-is-now-available-in-openai-services

Ars Technica content is now available in OpenAI services

Adventures in capitalism —

Condé Nast joins other publishers in allowing OpenAI to access its content.

The OpenAI and Conde Nast logos on a gradient background.

Ars Technica

On Tuesday, OpenAI announced a partnership with Ars Technica parent company Condé Nast to display content from prominent publications within its AI products, including ChatGPT and a new SearchGPT prototype. It also allows OpenAI to use Condé content to train future AI language models. The deal covers well-known Condé brands such as Vogue, The New Yorker, GQ, Wired, Ars Technica, and others. Financial details were not disclosed.

One immediate effect of the deal will be that users of ChatGPT or SearchGPT will now be able to see information from Condé Nast publications pulled from those assistants’ live views of the web. For example, a user could ask ChatGPT, “What’s the latest Ars Technica article about Space?” and ChatGPT can browse the web and pull up the result, attribute it, and summarize it for users while also linking to the site.

In the longer term, the deal also means that OpenAI can openly and officially utilize Condé Nast articles to train future AI language models, which includes successors to GPT-4o. In this case, “training” means feeding content into an AI model’s neural network so the AI model can better process conceptual relationships.

AI training is an expensive and computationally intense process that happens rarely, usually prior to the launch of a major new AI model, although a secondary process called “fine-tuning” can continue over time. Having access to high-quality training data, such as vetted journalism, improves AI language models’ ability to provide accurate answers to user questions.

It’s worth noting that Condé Nast internal policy still forbids its publications from using text created by generative AI, which is consistent with its AI rules before the deal.

Not waiting on fair use

With the deal, Condé Nast joins a growing list of publishers partnering with OpenAI, including Associated Press, Axel Springer, The Atlantic, and others. Some publications, such as The New York Times, have chosen to sue OpenAI over content use, and there’s reason to think they could win.

In an internal email to Condé Nast staff, CEO Roger Lynch framed the multi-year partnership as a strategic move to expand the reach of the company’s content, adapt to changing audience behaviors, and ensure proper compensation and attribution for using the company’s IP. “This partnership recognizes that the exceptional content produced by Condé Nast and our many titles cannot be replaced,” Lynch wrote in the email, “and is a step toward making sure our technology-enabled future is one that is created responsibly.”

The move also brings additional revenue to Condé Nast, Lynch added, at a time when “many technology companies eroded publishers’ ability to monetize content, most recently with traditional search.” The deal will allow Condé to “continue to protect and invest in our journalism and creative endeavors,” Lynch wrote.

OpenAI COO Brad Lightcap said in a statement, “We’re committed to working with Condé Nast and other news publishers to ensure that as AI plays a larger role in news discovery and delivery, it maintains accuracy, integrity, and respect for quality reporting.”

Ars Technica content is now available in OpenAI services Read More »

procreate-defies-ai-trend,-pledges-“no-generative-ai”-in-its-illustration-app

Procreate defies AI trend, pledges “no generative AI” in its illustration app

Political pixels —

Procreate CEO: “I really f—ing hate generative AI.”

Still of Procreate CEO James Cuda from a video posted to X.

Enlarge / Still of Procreate CEO James Cuda from a video posted to X.

On Sunday, Procreate announced that it will not incorporate generative AI into its popular iPad illustration app. The decision comes in response to an ongoing backlash from some parts of the art community, which has raised concerns about the ethical implications and potential consequences of AI use in creative industries.

“Generative AI is ripping the humanity out of things,” Procreate wrote on its website. “Built on a foundation of theft, the technology is steering us toward a barren future.”

In a video posted on X, Procreate CEO James Cuda laid out his company’s stance, saying, “We’re not going to be introducing any generative AI into our products. I don’t like what’s happening to the industry, and I don’t like what it’s doing to artists.”

Cuda’s sentiment echoes the fears of some digital artists who feel that AI image synthesis models, often trained on content without consent or compensation, threaten their livelihood and the authenticity of creative work. That’s not a universal sentiment among artists, but AI image synthesis is often a deeply divisive subject on social media, with some taking starkly polarized positions on the topic.

Procreate CEO James Cuda lays out his argument against generative AI in a video posted to X.

Cuda’s video plays on that polarization with clear messaging against generative AI. His statement reads as follows:

You’ve been asking us about AI. You know, I usually don’t like getting in front of the camera. I prefer that our products speak for themselves. I really fucking hate generative AI. I don’t like what’s happening in the industry and I don’t like what it’s doing to artists. We’re not going to be introducing any generative AI into out products. Our products are always designed and developed with the idea that a human will be creating something. You know, we don’t exactly know where this story’s gonna go or how it ends, but we believe that we’re on the right path supporting human creativity.

The debate over generative AI has intensified among some outspoken artists as more companies integrate these tools into their products. Dominant illustration software provider Adobe has tried to avoid ethical concerns by training its Firefly AI models on licensed or public domain content, but some artists have remained skeptical. Adobe Photoshop currently includes a “Generative Fill” feature powered by image synthesis, and the company is also experimenting with video synthesis models.

The backlash against image and video synthesis is not solely focused on creative app developers. Hardware manufacturer Wacom and game publisher Wizards of the Coast have faced criticism and issued apologies after using AI-generated content in their products. Toys “R” Us also faced a negative reaction after debuting an AI-generated commercial. Companies are still grappling with balancing the potential benefits of generative AI with the ethical concerns it raises.

Artists and critics react

A partial screenshot of Procreate's AI website captured on August 20, 2024.

Enlarge / A partial screenshot of Procreate’s AI website captured on August 20, 2024.

So far, Procreate’s anti-AI announcement has been met with a largely positive reaction in replies to its social media post. In a widely liked comment, artist Freya Holmér wrote on X, “this is very appreciated, thank you.”

Some of the more outspoken opponents of image synthesis also replied favorably to Procreate’s move. Karla Ortiz, who is a plaintiff in a lawsuit against AI image-generator companies, replied to Procreate’s video on X, “Whatever you need at any time, know I’m here!! Artists support each other, and also support those who allow us to continue doing what we do! So thank you for all you all do and so excited to see what the team does next!”

Artist RJ Palmer, who stoked the first major wave of AI art backlash with a viral tweet in 2022, also replied to Cuda’s video statement, saying, “Now thats the way to send a message. Now if only you guys could get a full power competitor to [Photoshop] on desktop with plugin support. Until someone can build a real competitor to high level [Photoshop] use, I’m stuck with it.”

A few pro-AI users also replied to the X post, including AI-augmented artist Claire Silver, who uses generative AI as an accessibility tool. She wrote on X, “Most of my early work is made with a combination of AI and Procreate. 7 years ago, before text to image was really even a thing. I loved procreate because it used tech to boost accessibility. Like AI, it augmented trad skill to allow more people to create. No rules, only tools.”

Since AI image synthesis continues to be a highly charged subject among some artists, reaffirming support for human-centric creativity could be an effective differentiated marketing move for Procreate, which currently plays underdog to creativity app giant Adobe. While some may prefer to use AI tools, in an (ideally healthy) app ecosystem with personal choice in illustration apps, people can follow their conscience.

Procreate’s anti-AI stance is slightly risky because it might also polarize part of its user base—and if the company changes its mind about including generative AI in the future, it will have to walk back its pledge. But for now, Procreate is confident in its decision: “In this technological rush, this might make us an exception or seem at risk of being left behind,” Procreate wrote. “But we see this road less traveled as the more exciting and fruitful one for our community.”

Procreate defies AI trend, pledges “no generative AI” in its illustration app Read More »

chinese-social-media-users-hilariously-mock-ai-video-fails

Chinese social media users hilariously mock AI video fails

Life imitates AI imitating life —

TikTok and Bilibili users transform nonsensical AI glitches into real-world performance art.

Still from a Chinese social media video featuring two people imitating imperfect AI-generated video outputs.

Enlarge / Still from a Chinese social media video featuring two people imitating imperfect AI-generated video outputs.

It’s no secret that despite significant investment from companies like OpenAI and Runway, AI-generated videos still struggle to achieve convincing realism at times. Some of the most amusing fails end up on social media, which has led to a new response trend on Chinese social media platforms TikTok and Bilibili where users create videos that mock the imperfections of AI-generated content. The trend has since spread to X (formerly Twitter) in the US, where users have been sharing the humorous parodies.

In particular, the videos seem to parody image synthesis videos where subjects seamlessly morph into other people or objects in unexpected and physically impossible ways. Chinese social media replicate these unusual visual non-sequiturs without special effects by positioning their bodies in unusual ways as new and unexpected objects appear on-camera from out of frame.

This exaggerated mimicry has struck a chord with viewers on X, who find the parodies entertaining. User @theGioM shared one video, seen above. “This is high-level performance arts,” wrote one X user. “art is imitating life imitating ai, almost shedded a tear.” Another commented, “I feel like it still needs a motorcycle the turns into a speedboat and takes off into the sky. Other than that, excellent work.”

An example Chinese social media video featuring two people imitating imperfect AI-generated video outputs.

While these parodies poke fun at current limitations, tech companies are actively attempting to overcome them with more training data (examples analyzed by AI models that teach them how to create videos) and computational training time. OpenAI unveiled Sora in February, capable of creating realistic scenes if they closely match examples found in training data. Runway’s Gen-3 Alpha suffers a similar fate: It can create brief clips of convincing video within a narrow set of constraints. This means that generated videos of situations outside the dataset often end up hilariously weird.

An AI-generated video that features impossibly-morphing people and animals. Social media users are imitating this style.

It’s worth noting that actor Will Smith beat Chinese social media users to this trend in February by poking fun at a horrific 2023 viral AI-generated video that attempted to depict him eating spaghetti. That may also bring back memories of other amusing video synthesis failures, such as May 2023’s AI-generated beer commercial, created using Runway’s earlier Gen-2 model.

An example Chinese social media video featuring two people imitating imperfect AI-generated video outputs.

While imitating imperfect AI videos may seem strange to some, people regularly make money pretending to be NPCs (non-player characters—a term for computer-controlled video game characters) on TikTok.

For anyone alive during the 1980s, witnessing this fast-changing and often bizarre new media world can cause some cognitive whiplash, but the world is a weird place full of wonders beyond the imagination. “There are more things in Heaven and Earth, Horatio, than are dreamt of in your philosophy,” as Hamlet once famously said. “Including people pretending to be video game characters and flawed video synthesis outputs.”

Chinese social media users hilariously mock AI video fails Read More »

self-driving-waymo-cars-keep-sf-residents-awake-all-night-by-honking-at-each-other

Self-driving Waymo cars keep SF residents awake all night by honking at each other

The ghost in the machine —

Haunted by glitching algorithms, self-driving cars disturb the peace in San Francisco.

A Waymo self-driving car in front of Google's San Francisco headquarters, San Francisco, California, June 7, 2024.

Enlarge / A Waymo self-driving car in front of Google’s San Francisco headquarters, San Francisco, California, June 7, 2024.

Silicon Valley’s latest disruption? Your sleep schedule. On Saturday, NBC Bay Area reported that San Francisco’s South of Market residents are being awakened throughout the night by Waymo self-driving cars honking at each other in a parking lot. No one is inside the cars, and they appear to be automatically reacting to each other’s presence.

Videos provided by residents to NBC show Waymo cars filing into the parking lot and attempting to back into spots, which seems to trigger honking from other Waymo vehicles. The automatic nature of these interactions—which seem to peak around 4 am every night—has left neighbors bewildered and sleep-deprived.

NBC Bay Area’s report: “Waymo cars keep SF neighborhood awake.”

According to NBC, the disturbances began several weeks ago when Waymo vehicles started using a parking lot off 2nd Street near Harrison Street. Residents in nearby high-rise buildings have observed the autonomous vehicles entering the lot to pause between rides, but the cars’ behavior has become a source of frustration for the neighborhood.

Christopher Cherry, who lives in an adjacent building, told NBC Bay Area that he initially welcomed Waymo’s presence, expecting it to enhance local security and tranquility. However, his optimism waned as the frequency of honking incidents increased. “We started out with a couple of honks here and there, and then as more and more cars started to arrive, the situation got worse,” he told NBC.

The lack of human operators in the vehicles has complicated efforts to address the issue directly since there is no one they can ask to stop honking. That lack of accountability forced residents to report their concerns to Waymo’s corporate headquarters, which had not responded to the incidents until NBC inquired as part of its report. A Waymo spokesperson told NBC, “We are aware that in some scenarios our vehicles may briefly honk while navigating our parking lots. We have identified the cause and are in the process of implementing a fix.”

The absurdity of the situation prompted tech author and journalist James Vincent to write on X, “current tech trends are resistant to satire precisely because they satirize themselves. a car park of empty cars, honking at one another, nudging back and forth to drop off nobody, is a perfect image of tech serving its own prerogatives rather than humanity’s.”

Self-driving Waymo cars keep SF residents awake all night by honking at each other Read More »

chatgpt-unexpectedly-began-speaking-in-a-user’s-cloned-voice-during-testing

ChatGPT unexpectedly began speaking in a user’s cloned voice during testing

An illustration of a computer synthesizer spewing out letters.

On Thursday, OpenAI released the “system card” for ChatGPT’s new GPT-4o AI model that details model limitations and safety testing procedures. Among other examples, the document reveals that in rare occurrences during testing, the model’s Advanced Voice Mode unintentionally imitated users’ voices without permission. Currently, OpenAI has safeguards in place that prevent this from happening, but the instance reflects the growing complexity of safely architecting with an AI chatbot that could potentially imitate any voice from a small clip.

Advanced Voice Mode is a feature of ChatGPT that allows users to have spoken conversations with the AI assistant.

In a section of the GPT-4o system card titled “Unauthorized voice generation,” OpenAI details an episode where a noisy input somehow prompted the model to suddenly imitate the user’s voice. “Voice generation can also occur in non-adversarial situations, such as our use of that ability to generate voices for ChatGPT’s advanced voice mode,” OpenAI writes. “During testing, we also observed rare instances where the model would unintentionally generate an output emulating the user’s voice.”

In this example of unintentional voice generation provided by OpenAI, the AI model outbursts “No!” and continues the sentence in a voice that sounds similar to the “red teamer” heard in the beginning of the clip. (A red teamer is a person hired by a company to do adversarial testing.)

It would certainly be creepy to be talking to a machine and then have it unexpectedly begin talking to you in your own voice. Ordinarily, OpenAI has safeguards to prevent this, which is why the company says this occurrence was rare even before it developed ways to prevent it completely. But the example prompted BuzzFeed data scientist Max Woolf to tweet, “OpenAI just leaked the plot of Black Mirror’s next season.”

Audio prompt injections

How could voice imitation happen with OpenAI’s new model? The primary clue lies elsewhere in the GPT-4o system card. To create voices, GPT-4o can apparently synthesize almost any type of sound found in its training data, including sound effects and music (though OpenAI discourages that behavior with special instructions).

As noted in the system card, the model can fundamentally imitate any voice based on a short audio clip. OpenAI guides this capability safely by providing an authorized voice sample (of a hired voice actor) that it is instructed to imitate. It provides the sample in the AI model’s system prompt (what OpenAI calls the “system message”) at the beginning of a conversation. “We supervise ideal completions using the voice sample in the system message as the base voice,” writes OpenAI.

In text-only LLMs, the system message is a hidden set of text instructions that guides behavior of the chatbot that gets added to the conversation history silently just before the chat session begins. Successive interactions are appended to the same chat history, and the entire context (often called a “context window”) is fed back into the AI model each time the user provides a new input.

(It’s probably time to update this diagram created in early 2023 below, but it shows how the context window works in an AI chat. Just imagine that the first prompt is a system message that says things like “You are a helpful chatbot. You do not talk about violent acts, etc.”)

A diagram showing how GPT conversational language model prompting works.

Enlarge / A diagram showing how GPT conversational language model prompting works.

Benj Edwards / Ars Technica

Since GPT-4o is multimodal and can process tokenized audio, OpenAI can also use audio inputs as part of the model’s system prompt, and that’s what it does when OpenAI provides an authorized voice sample for the model to imitate. The company also uses another system to detect if the model is generating unauthorized audio. “We only allow the model to use certain pre-selected voices,” writes OpenAI, “and use an output classifier to detect if the model deviates from that.”

ChatGPT unexpectedly began speaking in a user’s cloned voice during testing Read More »

man-vs.-machine:-deepmind’s-new-robot-serves-up-a-table-tennis-triumph

Man vs. machine: DeepMind’s new robot serves up a table tennis triumph

John Henry was a steel-driving man —

Human-beating ping-pong AI learned to play in a simulated environment.

A blue illustration of a robotic arm playing table tennis.

Benj Edwards / Google DeepMind

On Wednesday, researchers at Google DeepMind revealed the first AI-powered robotic table tennis player capable of competing at an amateur human level. The system combines an industrial robot arm called the ABB IRB 1100 and custom AI software from DeepMind. While an expert human player can still defeat the bot, the system demonstrates the potential for machines to master complex physical tasks that require split-second decision-making and adaptability.

“This is the first robot agent capable of playing a sport with humans at human level,” the researchers wrote in a preprint paper listed on arXiv. “It represents a milestone in robot learning and control.”

The unnamed robot agent (we suggest “AlphaPong”), developed by a team that includes David B. D’Ambrosio, Saminda Abeyruwan, and Laura Graesser, showed notable performance in a series of matches against human players of varying skill levels. In a study involving 29 participants, the AI-powered robot won 45 percent of its matches, demonstrating solid amateur-level play. Most notably, it achieved a 100 percent win rate against beginners and a 55 percent win rate against intermediate players, though it struggled against advanced opponents.

A Google DeepMind video of the AI agent rallying with a human table tennis player.

The physical setup consists of the aforementioned IRB 1100, a 6-degree-of-freedom robotic arm, mounted on two linear tracks, allowing it to move freely in a 2D plane. High-speed cameras track the ball’s position, while a motion-capture system monitors the human opponent’s paddle movements.

AI at the core

To create the brains that power the robotic arm, DeepMind researchers developed a two-level approach that allows the robot to execute specific table tennis techniques while adapting its strategy in real time to each opponent’s playing style. In other words, it’s adaptable enough to play any amateur human at table tennis without requiring specific per-player training.

The system’s architecture combines low-level skill controllers (neural network policies trained to execute specific table tennis techniques like forehand shots, backhand returns, or serve responses) with a high-level strategic decision-maker (a more complex AI system that analyzes the game state, adapts to the opponent’s style, and selects which low-level skill policy to activate for each incoming ball).

The researchers state that one of the key innovations of this project was the method used to train the AI models. The researchers chose a hybrid approach that used reinforcement learning in a simulated physics environment, while grounding the training data in real-world examples. This technique allowed the robot to learn from around 17,500 real-world ball trajectories—a fairly small dataset for a complex task.

A Google DeepMind video showing an illustration of how the AI agent analyzes human players.

The researchers used an iterative process to refine the robot’s skills. They started with a small dataset of human-vs-human gameplay, then let the AI loose against real opponents. Each match generated new data on ball trajectories and human strategies, which the team fed back into the simulation for further training. This process, repeated over seven cycles, allowed the robot to continuously adapt to increasingly skilled opponents and diverse play styles. By the final round, the AI had learned from over 14,000 rally balls and 3,000 serves, creating a body of table tennis knowledge that helped it bridge the gap between simulation and reality.

Interestingly, Nvidia has also been experimenting with similar simulated physics systems, such as Eureka, that allow an AI model to rapidly learn to control a robotic arm in simulated space instead of the real world (since the physics can be accelerated inside the simulation, and thousands of simultaneous trials can take place). This method is likely to dramatically reduce the time and resources needed to train robots for complex interactions in the future.

Humans enjoyed playing against it

Beyond its technical achievements, the study also explored the human experience of playing against an AI opponent. Surprisingly, even players who lost to the robot reported enjoying the experience. “Across all skill groups and win rates, players agreed that playing with the robot was ‘fun’ and ‘engaging,'” the researchers noted. This positive reception suggests potential applications for AI in sports training and entertainment.

However, the system is not without limitations. It struggles with extremely fast or high balls, has difficulty reading intense spin, and shows weaker performance in backhand plays. Google DeepMind shared an example video of the AI agent losing a point to an advanced player due to what appears to be difficulty reacting to a speedy hit, as you can see below.

A Google DeepMind video of the AI agent playing against an advanced human player.

The implications of this robotic ping-pong prodigy extend beyond the world of table tennis, according to the researchers. The techniques developed for this project could be applied to a wide range of robotic tasks that require quick reactions and adaptation to unpredictable human behavior. From manufacturing to health care (or just spanking someone with a paddle repeatedly), the potential applications seem large indeed.

The research team at Google DeepMind emphasizes that with further refinement, they believe the system could potentially compete with advanced table tennis players in the future. DeepMind is no stranger to creating AI models that can defeat human game players, including AlphaZero and AlphaGo. With this latest robot agent, it’s looking like the research company is moving beyond board games and into physical sports. Chess and Jeopardy have already fallen to AI-powered victors—perhaps table tennis is next.

Man vs. machine: DeepMind’s new robot serves up a table tennis triumph Read More »

major-shifts-at-openai-spark-skepticism-about-impending-agi-timelines

Major shifts at OpenAI spark skepticism about impending AGI timelines

Shuffling the deck —

De Kraker: “If OpenAI is right on the verge of AGI, why do prominent people keep leaving?”

The OpenAI logo on a red brick wall.

Benj Edwards / Getty Images

Over the past week, OpenAI experienced a significant leadership shake-up as three key figures announced major changes. Greg Brockman, the company’s president and co-founder, is taking an extended sabbatical until the end of the year, while another co-founder, John Schulman, permanently departed for rival Anthropic. Peter Deng, VP of Consumer Product, has also left the ChatGPT maker.

In a post on X, Brockman wrote, “I’m taking a sabbatical through end of year. First time to relax since co-founding OpenAI 9 years ago. The mission is far from complete; we still have a safe AGI to build.”

The moves have led some to wonder just how close OpenAI is to a long-rumored breakthrough of some kind of reasoning artificial intelligence if high-profile employees are jumping ship (or taking long breaks, in the case of Brockman) so easily. As AI developer Benjamin De Kraker put it on X, “If OpenAI is right on the verge of AGI, why do prominent people keep leaving?”

AGI refers to a hypothetical AI system that could match human-level intelligence across a wide range of tasks without specialized training. It’s the ultimate goal of OpenAI, and company CEO Sam Altman has said it could emerge in the “reasonably close-ish future.” AGI is also a concept that has sparked concerns about potential existential risks to humanity and the displacement of knowledge workers. However, the term remains somewhat vague, and there’s considerable debate in the AI community about what truly constitutes AGI or how close we are to achieving it.

The emergence of the “next big thing” in AI has been seen by critics such as Ed Zitron as a necessary step to justify ballooning investments in AI models that aren’t yet profitable. The industry is holding its breath that OpenAI, or a competitor, has some secret breakthrough waiting in the wings that will justify the massive costs associated with training and deploying LLMs.

But other AI critics, such as Gary Marcus, have postulated that major AI companies have reached a plateau of large language model (LLM) capability centered around GPT-4-level models since no AI company has yet made a major leap past the groundbreaking LLM that OpenAI released in March 2023. Microsoft CTO Kevin Scott has countered these claims, saying that LLM “scaling laws” (that suggest LLMs increase in capability proportionate to more compute power thrown at them) will continue to deliver improvements over time and that more patience is needed as the next generation (say, GPT-5) undergoes training.

In the scheme of things, Brockman’s move sounds like an extended, long overdue vacation (or perhaps a period to deal with personal issues beyond work). Regardless of the reason, the duration of the sabbatical raises questions about how the president of a major tech company can suddenly disappear for four months without affecting day-to-day operations, especially during a critical time in its history.

Unless, of course, things are fairly calm at OpenAI—and perhaps GPT-5 isn’t going to ship until at least next year when Brockman returns. But this is speculation on our part, and OpenAI (whether voluntarily or not) sometimes surprises us when we least expect it. (Just today, Altman dropped a hint on X about strawberries that some people interpret as being a hint of a potential major model undergoing testing or nearing release.)

A pattern of departures and the rise of Anthropic

Anthropic / Benj Edwards

What may sting OpenAI the most about the recent departures is that a few high-profile employees have left to join Anthropic, a San Francisco-based AI company founded in 2021 by ex-OpenAI employees Daniela and Dario Amodei.

Anthropic offers a subscription service called Claude.ai that is similar to ChatGPT. Its most recent LLM, Claude 3.5 Sonnet, along with its web-based interface, has rapidly gained favor over ChatGPT among some LLM users who are vocal on social media, though it likely does not yet match ChatGPT in terms of mainstream brand recognition.

In particular, John Schulman, an OpenAI co-founder and key figure in the company’s post-training process for LLMs, revealed in a statement on X that he’s leaving to join rival AI firm Anthropic to do more hands-on work: “This choice stems from my desire to deepen my focus on AI alignment, and to start a new chapter of my career where I can return to hands-on technical work.” Alignment is a field that hopes to guide AI models to produce helpful outputs.

In May, OpenAI alignment researcher Jan Leike left OpenAI to join Anthropic as well, criticizing OpenAI’s handling of alignment safety.

Adding to the recent employee shake-up, The Information reports that Peter Deng, a product leader who joined OpenAI last year after stints at Meta Platforms, Uber, and Airtable, has also left the company, though we do not yet know where he is headed. In May, OpenAI co-founder Ilya Sutskever left to found a rival startup, and prominent software engineer Andrej Karpathy departed in February, recently launching an educational venture.

As De Kraker noted, if OpenAI were on the verge of developing world-changing AI technology, wouldn’t these high-profile AI veterans want to stick around and be part of this historic moment in time? “Genuine question,” he wrote. “If you were pretty sure the company you’re a key part of—and have equity in—is about to crack AGI within one or two years… why would you jump ship?”

Despite the departures, Schulman expressed optimism about OpenAI’s future in his farewell note on X. “I am confident that OpenAI and the teams I was part of will continue to thrive without me,” he wrote. “I’m incredibly grateful for the opportunity to participate in such an important part of history and I’m proud of what we’ve achieved together. I’ll still be rooting for you all, even while working elsewhere.”

This article was updated on August 7, 2024 at 4: 23 PM to mention Sam Altman’s tweet about strawberries.

Major shifts at OpenAI spark skepticism about impending AGI timelines Read More »

flux:-this-new-ai-image-generator-is-eerily-good-at-creating-human-hands

FLUX: This new AI image generator is eerily good at creating human hands

five-finger salute —

FLUX.1 is the open-weights heir apparent to Stable Diffusion, turning text into images.

AI-generated image by FLUX.1 dev:

Enlarge / AI-generated image by FLUX.1 dev: “A beautiful queen of the universe holding up her hands, face in the background.”

FLUX.1

On Thursday, AI-startup Black Forest Labs announced the launch of its company and the release of its first suite of text-to-image AI models, called FLUX.1. The German-based company, founded by researchers who developed the technology behind Stable Diffusion and invented the latent diffusion technique, aims to create advanced generative AI for images and videos.

The launch of FLUX.1 comes about seven weeks after Stability AI’s troubled release of Stable Diffusion 3 Medium in mid-June. Stability AI’s offering faced widespread criticism among image-synthesis hobbyists for its poor performance in generating human anatomy, with users sharing examples of distorted limbs and bodies across social media. That problematic launch followed the earlier departure of three key engineers from Stability AI—Robin Rombach, Andreas Blattmann, and Dominik Lorenz—who went on to found Black Forest Labs along with latent diffusion co-developer Patrick Esser and others.

Black Forest Labs launched with the release of three FLUX.1 text-to-image models: a high-end commercial “pro” version, a mid-range “dev” version with open weights for non-commercial use, and a faster open-weights “schnell” version (“schnell” means quick or fast in German). Black Forest Labs claims its models outperform existing options like Midjourney and DALL-E in areas such as image quality and adherence to text prompts.

  • AI-generated image by FLUX.1 dev: “A close-up photo of a pair of hands holding a plate full of pickles.”

    FLUX.1

  • AI-generated image by FLUX.1 dev: A hand holding up five fingers with a starry background.

    FLUX.1

  • AI-generated image by FLUX.1 dev: “An Ars Technica reader sitting in front of a computer monitor. The screen shows the Ars Technica website.”

    FLUX.1

  • AI-generated image by FLUX.1 dev: “a boxer posing with fists raised, no gloves.”

    FLUX.1

  • AI-generated image by FLUX.1 dev: “An advertisement for ‘Frosted Prick’ cereal.”

    FLUX.1

  • AI-generated image of a happy woman in a bakery baking a cake by FLUX.1 dev.

    FLUX.1

  • AI-generated image by FLUX.1 dev: “An advertisement for ‘Marshmallow Menace’ cereal.”

    FLUX.1

  • AI-generated image of “A handsome Asian influencer on top of the Empire State Building, instagram” by FLUX.1 dev.

    FLUX.1

In our experience, the outputs of the two higher-end FLUX.1 models are generally comparable with OpenAI’s DALL-E 3 in prompt fidelity, with photorealism that seems close to Midjourney 6. They represent a significant improvement over Stable Diffusion XL, the team’s last major release under Stability (if you don’t count SDXL Turbo).

The FLUX.1 models use what the company calls a “hybrid architecture” combining transformer and diffusion techniques, scaled up to 12 billion parameters. Black Forest Labs said it improves on previous diffusion models by incorporating flow matching and other optimizations.

FLUX.1 seems competent at generating human hands, which was a weak spot in earlier image-synthesis models like Stable Diffusion 1.5 due to a lack of training images that focused on hands. Since those early days, other AI image generators like Midjourney have mastered hands as well, but it’s notable to see an open-weights model that renders hands relatively accurately in various poses.

We downloaded the weights file to the FLUX.1 dev model from GitHub, but at 23GB, it won’t fit in the 12GB VRAM of our RTX 3060 card, so it will need quantization to run locally (reducing its size), which reportedly (through chatter on Reddit) some people have already had success with.

Instead, we experimented with FLUX.1 models on AI cloud-hosting platforms Fal and Replicate, which cost money to use, though Fal offers some free credits to start.

Black Forest looks ahead

Black Forest Labs may be a new company, but it’s already attracting funding from investors. It recently closed a $31 million Series Seed funding round led by Andreessen Horowitz, with additional investments from General Catalyst and MätchVC. The company also brought on high-profile advisers, including entertainment executive and former Disney President Michael Ovitz and AI researcher Matthias Bethge.

“We believe that generative AI will be a fundamental building block of all future technologies,” the company stated in its announcement. “By making our models available to a wide audience, we want to bring its benefits to everyone, educate the public and enhance trust in the safety of these models.”

  • AI-generated image by FLUX.1 dev: A cat in a car holding a can of beer that reads, ‘AI Slop.’

    FLUX.1

  • AI-generated image by FLUX.1 dev: Mickey Mouse and Spider-Man singing to each other.

    FLUX.1

  • AI-generated image by FLUX.1 dev: “a muscular barbarian with weapons beside a CRT television set, cinematic, 8K, studio lighting.”

    FLUX.1

  • AI-generated image of a flaming cheeseburger created by FLUX.1 dev.

    FLUX.1

  • AI-generated image by FLUX.1 dev: “Will Smith eating spaghetti.”

    FLUX.1

  • AI-generated image by FLUX.1 dev: “a muscular barbarian with weapons beside a CRT television set, cinematic, 8K, studio lighting. The screen reads ‘Ars Technica.'”

    FLUX.1

  • AI-generated image by FLUX.1 dev: “An advertisement for ‘Burt’s Grenades’ cereal.”

    FLUX.1

  • AI-generated image by FLUX.1 dev: “A close-up photo of a pair of hands holding a plate that contains a portrait of the queen of the universe”

    FLUX.1

Speaking of “trust and safety,” the company did not mention where it obtained the training data that taught the FLUX.1 models how to generate images. Judging by the outputs we could produce with the model that included depictions of copyrighted characters, Black Forest Labs likely used a huge unauthorized image scrape of the Internet, possibly collected by LAION, an organization that collected the datasets that trained Stable Diffusion. This is speculation at this point. While the underlying technological achievement of FLUX.1 is notable, it feels likely that the team is playing fast and loose with the ethics of “fair use” image scraping much like Stability AI did. That practice may eventually attract lawsuits like those filed against Stability AI.

Though text-to-image generation is Black Forest’s current focus, the company plans to expand into video generation next, saying that FLUX.1 will serve as the foundation of a new text-to-video model in development, which will compete with OpenAI’s Sora, Runway’s Gen-3 Alpha, and Kuaishou’s Kling in a contest to warp media reality on demand. “Our video models will unlock precise creation and editing at high definition and unprecedented speed,” the Black Forest announcement claims.

FLUX: This new AI image generator is eerily good at creating human hands Read More »

senators-propose-“digital-replication-right”-for-likeness,-extending-70-years-after-death

Senators propose “Digital replication right” for likeness, extending 70 years after death

NO SCRUBS —

Law would hold US individuals and firms liable for ripping off a person’s digital likeness.

A stock photo illustration of a person's face lit with pink light.

On Wednesday, US Sens. Chris Coons (D-Del.), Marsha Blackburn (R.-Tenn.), Amy Klobuchar (D-Minn.), and Thom Tillis (R-NC) introduced the Nurture Originals, Foster Art, and Keep Entertainment Safe (NO FAKES) Act of 2024. The bipartisan legislation, up for consideration in the US Senate, aims to protect individuals from unauthorized AI-generated replicas of their voice or likeness.

The NO FAKES Act would create legal recourse for people whose digital representations are created without consent. It would hold both individuals and companies liable for producing, hosting, or sharing these unauthorized digital replicas, including those created by generative AI. Due to generative AI technology that has become mainstream in the past two years, creating audio or image media fakes of people has become fairly trivial, with easy photorealistic video replicas likely next to arrive.

In a press statement, Coons emphasized the importance of protecting individual rights in the age of AI. “Everyone deserves the right to own and protect their voice and likeness, no matter if you’re Taylor Swift or anyone else,” he said, referring to a widely publicized deepfake incident involving the musical artist in January. “Generative AI can be used as a tool to foster creativity, but that can’t come at the expense of the unauthorized exploitation of anyone’s voice or likeness.”

The introduction of the NO FAKES Act follows the Senate’s passage of the DEFIANCE Act, which allows victims of sexual deepfakes to sue for damages.

In addition to the Swift saga, over the past few years, we’ve seen AI-powered scams involving fake celebrity endorsements, the creation of misleading political content, and situations where school kids have used AI tech to create pornographic deepfakes of classmates. Recently, X CEO Elon Musk shared a video that featured an AI-generated voice of Vice President Kamala Harris saying things she didn’t say in real life.

These incidents, in addition to concerns about actors’ likenesses being replicated without permission, have created an increasing sense of urgency among US lawmakers, who want to limit the impact of unauthorized digital likenesses. Currently, certain types of AI-generated deepfakes are already illegal due to a patchwork of federal and state laws, but this new act hopes to unify likeness regulation around the concept of “digital replicas.”

Digital replicas

An AI-generated image of a person.

Enlarge / An AI-generated image of a person.

Benj Edwards / Ars Technica

To protect a person’s digital likeness, the NO FAKES Act introduces a “digital replication right” that gives individuals exclusive control over the use of their voice or visual likeness in digital replicas. This right extends 10 years after death, with possible five-year extensions if actively used. It can be licensed during life and inherited after death, lasting up to 70 years after an individual’s death. Along the way, the bill defines what it considers to be a “digital replica”:

DIGITAL REPLICA.-The term “digital replica” means a newly created, computer-generated, highly realistic electronic representation that is readily identifiable as the voice or visual likeness of an individual that- (A) is embodied in a sound recording, image, audiovisual work, including an audiovisual work that does not have any accompanying sounds, or transmission- (i) in which the actual individual did not actually perform or appear; or (ii) that is a version of a sound recording, image, or audiovisual work in which the actual individual did perform or appear, in which the fundamental character of the performance or appearance has been materially altered; and (B) does not include the electronic reproduction, use of a sample of one sound recording or audiovisual work into another, remixing, mastering, or digital remastering of a sound recording or audiovisual work authorized by the copyright holder.

(There’s some irony in the mention of an “audiovisual work that does not have any accompanying sounds.”)

Since this bill bans types of artistic expression, the NO FAKES Act includes provisions that aim to balance IP protection with free speech. It provides exclusions for recognized First Amendment protections, such as documentaries, biographical works, and content created for purposes of comment, criticism, or parody.

In some ways, those exceptions could create a very wide protection gap that may be difficult to enforce without specific court decisions on a case-by-case basis. But without them, the NO FAKES Act could potentially stifle Americans’ constitutionally protected rights of free expression since the concept of “digital replicas” outlined in the bill includes any “computer-generated, highly realistic” digital likeness of a real person, whether AI-generated or not. For example, is a photorealistic Photoshop illustration of a person “computer-generated?” Similar questions may lead to uncertainty in enforcement.

Wide support from entertainment industry

So far, the NO FAKES Act has gained support from various entertainment industry groups, including Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA), the Recording Industry Association of America (RIAA), the Motion Picture Association, and the Recording Academy. These organizations have been actively seeking protections against unauthorized AI re-creations.

The bill has also been endorsed by entertainment companies such as The Walt Disney Company, Warner Music Group, Universal Music Group, Sony Music, the Independent Film & Television Alliance, William Morris Endeavor, Creative Arts Agency, the Authors Guild, and Vermillio.

Several tech companies, including IBM and OpenAI, have also backed the NO FAKES Act. Anna Makanju, OpenAI’s vice president of global affairs, said in a statement that the act would protect creators and artists from improper impersonation. “OpenAI is pleased to support the NO FAKES Act, which would protect creators and artists from unauthorized digital replicas of their voices and likenesses,” she said.

In a statement, Coons highlighted the collaborative effort behind the bill’s development. “I am grateful for the bipartisan partnership of Senators Blackburn, Klobuchar, and Tillis and the support of stakeholders from across the entertainment and technology industries as we work to find the balance between the promise of AI and protecting the inherent dignity we all have in our own personhood.”

Senators propose “Digital replication right” for likeness, extending 70 years after death Read More »

chatgpt-advanced-voice-mode-impresses-testers-with-sound-effects,-catching-its-breath

ChatGPT Advanced Voice Mode impresses testers with sound effects, catching its breath

I Am the Very Model of a Modern Major-General —

AVM allows uncanny real-time voice conversations with ChatGPT that you can interrupt.

Stock Photo: AI Cyborg Robot Whispering Secret Or Interesting Gossip

Enlarge / A stock photo of a robot whispering to a man.

On Tuesday, OpenAI began rolling out an alpha version of its new Advanced Voice Mode to a small group of ChatGPT Plus subscribers. This feature, which OpenAI previewed in May with the launch of GPT-4o, aims to make conversations with the AI more natural and responsive. In May, the feature triggered criticism of its simulated emotional expressiveness and prompted a public dispute with actress Scarlett Johansson over accusations that OpenAI copied her voice. Even so, early tests of the new feature shared by users on social media have been largely enthusiastic.

In early tests reported by users with access, Advanced Voice Mode allows them to have real-time conversations with ChatGPT, including the ability to interrupt the AI mid-sentence almost instantly. It can sense and respond to a user’s emotional cues through vocal tone and delivery, and provide sound effects while telling stories.

But what has caught many people off-guard initially is how the voices simulate taking a breath while speaking.

“ChatGPT Advanced Voice Mode counting as fast as it can to 10, then to 50 (this blew my mind—it stopped to catch its breath like a human would),” wrote tech writer Cristiano Giardina on X.

Advanced Voice Mode simulates audible pauses for breath because it was trained on audio samples of humans speaking that included the same feature. The model has learned to simulate inhalations at seemingly appropriate times after being exposed to hundreds of thousands, if not millions, of examples of human speech. Large language models (LLMs) like GPT-4o are master imitators, and that skill has now extended to the audio domain.

Giardina shared his other impressions about Advanced Voice Mode on X, including observations about accents in other languages and sound effects.

It’s very fast, there’s virtually no latency from when you stop speaking to when it responds,” he wrote. “When you ask it to make noises it always has the voice “perform” the noises (with funny results). It can do accents, but when speaking other languages it always has an American accent. (In the video, ChatGPT is acting as a soccer match commentator)

Speaking of sound effects, X user Kesku, who is a moderator of OpenAI’s Discord server, shared an example of ChatGPT playing multiple parts with different voices and another of a voice recounting an audiobook-sounding sci-fi story from the prompt, “Tell me an exciting action story with sci-fi elements and create atmosphere by making appropriate noises of the things happening using onomatopoeia.”

Kesku also ran a few example prompts for us, including a story about the Ars Technica mascot “Moonshark.”

He also asked it to sing the “Major-General’s Song” from Gilbert and Sullivan’s 1879 comic opera The Pirates of Penzance:

Frequent AI advocate Manuel Sainsily posted a video of Advanced Voice Mode reacting to camera input, giving advice about how to care for a kitten. “It feels like face-timing a super knowledgeable friend, which in this case was super helpful—reassuring us with our new kitten,” he wrote. “It can answer questions in real-time and use the camera as input too!”

Of course, being based on an LLM, it may occasionally confabulate incorrect responses on topics or in situations where its “knowledge” (which comes from GPT-4o’s training data set) is lacking. But if considered a tech demo or an AI-powered amusement and you’re aware of the limitations, Advanced Voice Mode seems to successfully execute many of the tasks shown by OpenAI’s demo in May.

Safety

An OpenAI spokesperson told Ars Technica that the company worked with more than 100 external testers on the Advanced Voice Mode release, collectively speaking 45 different languages and representing 29 geographical areas. The system is reportedly designed to prevent impersonation of individuals or public figures by blocking outputs that differ from OpenAI’s four chosen preset voices.

OpenAI has also added filters to recognize and block requests to generate music or other copyrighted audio, which has gotten other AI companies in trouble. Giardina reported audio “leakage” in some audio outputs that have unintentional music in the background, showing that OpenAI trained the AVM voice model on a wide variety of audio sources, likely both from licensed material and audio scraped from online video platforms.

Availability

OpenAI plans to expand access to more ChatGPT Plus users in the coming weeks, with a full launch to all Plus subscribers expected this fall. A company spokesperson told Ars that users in the alpha test group will receive a notice in the ChatGPT app and an email with usage instructions.

Since the initial preview of GPT-4o voice in May, OpenAI claims to have enhanced the model’s ability to support millions of simultaneous, real-time voice conversations while maintaining low latency and high quality. In other words, they are gearing up for a rush that will take a lot of back-end computation to accommodate.

ChatGPT Advanced Voice Mode impresses testers with sound effects, catching its breath Read More »

ai-search-engine-accused-of-plagiarism-announces-publisher-revenue-sharing-plan

AI search engine accused of plagiarism announces publisher revenue-sharing plan

Beg, borrow, or license —

Perplexity says WordPress.com, TIME, Der Spiegel, and Fortune have already signed up.

Robot caught in a flashlight vector illustration

On Tuesday, AI-powered search engine Perplexity unveiled a new revenue-sharing program for publishers, marking a significant shift in its approach to third-party content use, reports CNBC. The move comes after plagiarism allegations from major media outlets, including Forbes, Wired, and Ars parent company Condé Nast. Perplexity, valued at over $1 billion, aims to compete with search giant Google.

“To further support the vital work of media organizations and online creators, we need to ensure publishers can thrive as Perplexity grows,” writes the company in a blog post announcing the problem. “That’s why we’re excited to announce the Perplexity Publishers Program and our first batch of partners: TIME, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune, and WordPress.com.”

Under the program, Perplexity will share a percentage of ad revenue with publishers when their content is cited in AI-generated answers. The revenue share applies on a per-article basis and potentially multiplies if articles from a single publisher are used in one response. Some content providers, such as WordPress.com, plan to pass some of that revenue on to content creators.

A press release from WordPress.com states that joining Perplexity’s Publishers Program allows WordPress.com content to appear in Perplexity’s “Keep Exploring” section on their Discover pages. “That means your articles will be included in their search index and your articles can be surfaced as an answer on their answer engine and Discover feed,” the blog company writes. “If your website is referenced in a Perplexity search result where the company earns advertising revenue, you’ll be eligible for revenue share.”

A screenshot of the Perplexity.ai website taken on July 30, 2024.

Enlarge / A screenshot of the Perplexity.ai website taken on July 30, 2024.

Benj Edwards

Dmitry Shevelenko, Perplexity’s chief business officer, told CNBC that the company began discussions with publishers in January, with program details solidified in early 2024. He reported strong initial interest, with over a dozen publishers reaching out within hours of the announcement.

As part of the program, publishers will also receive access to Perplexity APIs that can be used to create custom “answer engines” and “Enterprise Pro” accounts that provide “enhanced data privacy and security capabilities” for all employees of Publishers in the program for one year.

Accusations of plagiarism

The revenue-sharing announcement follows a rocky month for the AI startup. In mid-June, Forbes reported finding its content within Perplexity’s Pages tool with minimal attribution. Pages allows Perplexity users to curate content and share it with others. Ars Technica sister publication Wired later made similar claims, also noting suspicious traffic patterns from IP addresses likely linked to Perplexity that were ignoring robots.txt exclusions. Perplexity was also found to be manipulating its crawling bots’ ID string to get around website blocks.

As part of company policy, Ars Technica parent Condé Nast disallows AI-based content scrapers, and its CEO Roger Lynch testified in the US Senate earlier this year that generative AI has been built with “stolen goods.” Condé sent a cease-and-desist letter to Perplexity earlier this month.

But publisher trouble might not be Perplexity’s only problem. In some tests of the search we performed in February, Perplexity badly confabulated certain answers, even when citations were readily available. Since our initial tests, the accuracy of Perplexity’s results seems to have improved, but providing inaccurate answers (which also plagued Google’s AI Overviews search feature) is still a potential issue.

Compared to the free tier of service, Perplexity users who pay $20 per month can access more capable LLMs such as GPT-4o and Claude 3, so the quality and accuracy of the output can vary dramatically depending on whether a user subscribes or not. The addition of citations to every Perplexity answer allows users to check accuracy—if they take the time to do it.

The move by Perplexity occurs against a backdrop of tensions between AI companies and content creators. Some media outlets, such as The New York Times, have filed lawsuits against AI vendors like OpenAI and Microsoft, alleging copyright infringement in the training of large language models. OpenAI has struck media licensing deals with many publishers as a way to secure access to high-quality training data and avoid future lawsuits.

In this case, Perplexity is not using the licensed articles and content to train AI models but is seeking legal permission to reproduce content from publishers on its website.

AI search engine accused of plagiarism announces publisher revenue-sharing plan Read More »