copyright infringement

OpenAI accuses NYT of hacking ChatGPT to set up copyright suit

OpenAI is now boldly claiming that The New York Times “paid someone to hack OpenAI’s products” like ChatGPT to “set up” a lawsuit against the leading AI maker.

In a court filing Monday, OpenAI alleged that “100 examples in which some version of OpenAI’s GPT-4 model supposedly generated several paragraphs of Times content as outputs in response to user prompts” do not reflect how normal people use ChatGPT.

Instead, it allegedly took The Times “tens of thousands of attempts to generate” these supposedly “highly anomalous results” by “targeting and exploiting a bug” that OpenAI claims it is now “committed to addressing.”

According to OpenAI, this activity amounts to “contrived attacks” by a “hired gun” who allegedly hacked OpenAI models until they hallucinated fake NYT content or regurgitated training data to replicate NYT articles. NYT allegedly paid for these “attacks” to gather evidence to support The Times’ claims that OpenAI’s products imperil its journalism by allegedly regurgitating reporting and stealing The Times’ audiences.

“Contrary to the allegations in the complaint, however, ChatGPT is not in any way a substitute for a subscription to The New York Times,” OpenAI argued in a motion that seeks to dismiss the majority of The Times’ claims. “In the real world, people do not use ChatGPT or any other OpenAI product for that purpose. Nor could they. In the ordinary course, one cannot use ChatGPT to serve up Times articles at will.”

In the filing, OpenAI described The Times as enthusiastically reporting on its chatbot developments for years without raising any concerns about copyright infringement. OpenAI claimed that it disclosed that The Times’ articles were used to train its AI models in 2020, but that The Times only objected once ChatGPT’s popularity exploded following its debut in 2022.

According to OpenAI, “It was only after this rapid adoption, along with reports of the value unlocked by these new technologies, that the Times claimed that OpenAI had ‘infringed its copyright[s]’ and reached out to demand ‘commercial terms.’ After months of discussions, the Times filed suit two days after Christmas, demanding ‘billions of dollars.’”

Ian Crosby, Susman Godfrey partner and lead counsel for The New York Times, told Ars that “what OpenAI bizarrely mischaracterizes as ‘hacking’ is simply using OpenAI’s products to look for evidence that they stole and reproduced The Times’s copyrighted works. And that is exactly what we found. In fact, the scale of OpenAI’s copying is much larger than the 100-plus examples set forth in the complaint.”

Crosby told Ars that OpenAI’s filing notably “doesn’t dispute—nor can they—that they copied millions of The Times’ works to build and power its commercial products without our permission.”

“Building new products is no excuse for violating copyright law, and that’s exactly what OpenAI has done on an unprecedented scale,” Crosby said.

OpenAI argued that the court should dismiss claims alleging direct copyright infringement, contributory infringement, Digital Millennium Copyright Act violations, and misappropriation, all of which it describes as “legally infirm.” Some fail because they are time-barred, seeking damages on training data for OpenAI’s older models, OpenAI claimed. Others allegedly fail because they misunderstand fair use or are preempted by federal laws.

If OpenAI’s motion is granted, the case would be substantially narrowed.

But if the motion is not granted and The Times ultimately wins—and it might—OpenAI may be forced to wipe ChatGPT and start over.

“OpenAI, which has been secretive and has deliberately concealed how its products operate, is now asserting it’s too late to bring a claim for infringement or hold them accountable. We disagree,” Crosby told Ars. “It’s noteworthy that OpenAI doesn’t dispute that it copied Times works without permission within the statute of limitations to train its more recent and current models.”

OpenAI did not immediately respond to Ars’ request to comment.

Court blocks $1 billion copyright ruling that punished ISP for its users’ piracy

A federal appeals court today overturned a $1 billion piracy verdict that a jury handed down against cable Internet service provider Cox Communications in 2019. Judges rejected Sony’s claim that Cox profited directly from copyright infringement committed by users of Cox’s cable broadband network.

Appeals court judges didn’t let Cox off the hook entirely, but they vacated the damages award and ordered a new damages trial, which will presumably result in a significantly smaller amount to be paid to Sony and other copyright holders. Universal and Warner are also plaintiffs in the case.

“We affirm the jury’s finding of willful contributory infringement,” said a unanimous decision by a three-judge panel at the US Court of Appeals for the 4th Circuit. “But we reverse the vicarious liability verdict and remand for a new trial on damages because Cox did not profit from its subscribers’ acts of infringement, a legal prerequisite for vicarious liability.”

If the correct legal standard had been used in the district court, “no reasonable jury could find that Cox received a direct financial benefit from its subscribers’ infringement of Plaintiffs’ copyrights,” judges wrote.

The case began when Sony and other music copyright holders sued Cox, claiming that it didn’t adequately fight piracy on its network and failed to terminate repeat infringers. A US District Court jury in the Eastern District of Virginia found the ISP liable for infringement of 10,017 copyrighted works.

Copyright owners want ISPs to disconnect users

Cox’s appeal was supported by advocacy groups concerned that the big-money judgment could force ISPs to disconnect more Internet users based merely on accusations of copyright infringement. Groups such as the Electronic Frontier Foundation also called the ruling legally flawed.

“When these music companies sued Cox Communications, an ISP, the court got the law wrong,” the EFF wrote in 2021. “It effectively decided that the only way for an ISP to avoid being liable for infringement by its users is to terminate a household or business’s account after a small number of accusations—perhaps only two. The court also allowed a damages formula that can lead to nearly unlimited damages, with no relationship to any actual harm suffered. If not overturned, this decision will lead to an untold number of people losing vital Internet access as ISPs start to cut off more and more customers to avoid massive damages.”

In today’s 4th Circuit ruling, appeals court judges wrote that “Sony failed, as a matter of law, to prove that Cox profits directly from its subscribers’ copyright infringement.”

A defendant may be vicariously liable for a third party’s copyright infringement if it profits directly from it and is in a position to supervise the infringer, the ruling said. Cox argued that it doesn’t profit directly from infringement because it receives the same monthly fee from subscribers whether they illegally download copyrighted files or not, the ruling noted.

The question in this type of case is whether there is a causal relationship between the infringement and the financial benefit. “If copyright infringement draws customers to the defendant’s service or incentivizes them to pay more for their service, that financial benefit may be profit from infringement. But in every case, the financial benefit to the defendant must flow directly from the third party’s acts of infringement to establish vicarious liability,” the court said.

Judge rejects most ChatGPT copyright claims from book authors

Insufficient evidence —

OpenAI plans to defeat authors’ remaining claim at a “later stage” of the case.

A US district judge in California has largely sided with OpenAI, dismissing the majority of claims raised by authors alleging that large language models powering ChatGPT were illegally trained on pirated copies of their books without their permission.

By allegedly repackaging original works as ChatGPT outputs, authors alleged, OpenAI’s most popular chatbot was just a high-tech “grift” that seemingly violated copyright laws, as well as state laws preventing unfair business practices and unjust enrichment.

According to judge Araceli Martínez-Olguín, authors behind three separate lawsuits—including Sarah Silverman, Michael Chabon, and Paul Tremblay—have failed to provide evidence supporting any of their claims except for direct copyright infringement.

OpenAI had argued as much in its promptly filed motion to dismiss these cases last August. At that time, OpenAI said that it expected to beat the direct infringement claim at a “later stage” of the proceedings.

Among copyright claims tossed by Martínez-Olguín were accusations of vicarious copyright infringement. Perhaps most significantly, Martínez-Olguín agreed with OpenAI that the authors’ allegation that “every” ChatGPT output “is an infringing derivative work” is “insufficient” to allege vicarious infringement, which requires evidence that ChatGPT outputs are “substantially similar” or “similar at all” to authors’ books.

“Plaintiffs here have not alleged that the ChatGPT outputs contain direct copies of the copyrighted books,” Martínez-Olguín wrote. “Because they fail to allege direct copying, they must show a substantial similarity between the outputs and the copyrighted materials.”

Authors also failed to convince Martínez-Olguín that OpenAI violated the Digital Millennium Copyright Act (DMCA) by allegedly removing copyright management information (CMI)—such as author names, titles of works, and terms and conditions for use of the work—from training data.

This claim failed because authors cited “no facts” that OpenAI intentionally removed the CMI or built the training process to omit CMI, Martínez-Olguín wrote. Further, the authors cited examples of ChatGPT referencing their names, which would seem to suggest that some CMI remains in the training data.

Some of the remaining claims were dependent on copyright claims to survive, Martínez-Olguín wrote.

On the claim that OpenAI caused economic injury by unfairly repurposing authors’ works, the judge said that even if authors could show evidence of a DMCA violation, they could only speculate about what injury was caused.

Similarly, allegations of “fraudulent” unfair conduct, which accuse OpenAI of “deceptively” designing ChatGPT to produce outputs that omit CMI, “rest on a violation of the DMCA,” Martínez-Olguín wrote.

The only claim under California’s unfair competition law that was allowed to proceed alleged that OpenAI used copyrighted works to train ChatGPT without authors’ permission. Because the state law broadly defines what’s considered “unfair,” Martínez-Olguín said that it’s possible that OpenAI’s use of the training data “may constitute an unfair practice.”

Remaining claims of negligence and unjust enrichment failed, Martínez-Olguín wrote, because authors only alleged intentional acts and did not explain how OpenAI “received and unjustly retained a benefit” from training ChatGPT on their works.

Authors have been ordered to consolidate their complaints and have until March 13 to amend arguments and continue pursuing any of the dismissed claims.

To shore up the tossed copyright claims, authors would likely need to provide examples of ChatGPT outputs that are similar to their works, as well as evidence of OpenAI intentionally removing CMI to “induce, enable, facilitate, or conceal infringement,” Martínez-Olguín wrote.

Ars could not immediately reach the authors’ lawyers or OpenAI for comment.

As authors likely prepare to continue fighting OpenAI, the US Copyright Office has been fielding public input before releasing guidance that could one day help rights holders pursue legal claims and may eventually require works to be licensed from copyright owners for use as training materials. Among the thorniest questions is whether AI tools like ChatGPT should be considered authors when spouting outputs included in creative works.

While the Copyright Office prepares to release three reports this year “revealing its position on copyright law in relation to AI,” according to The New York Times, OpenAI recently made it clear that it does not plan to stop referencing copyrighted works in its training data. Last month, OpenAI said it would be “impossible” to train AI models without copyrighted materials, because “copyright today covers virtually every sort of human expression—including blogposts, photographs, forum posts, scraps of software code, and government documents.”

According to OpenAI, it doesn’t just need old copyrighted materials; it needs current copyrighted materials to ensure that the outputs of chatbots and other AI tools “meet the needs of today’s citizens.”

Rights holders will likely be bracing throughout this confusing time, waiting for the Copyright Office’s reports. But once there is clarity, those reports could “be hugely consequential, weighing heavily in courts, as well as with lawmakers and regulators,” The Times reported.
