copyright act

Judge: Anthropic’s $1.5B settlement is being shoved “down the throat of authors”

At a hearing Monday, US district judge William Alsup blasted a proposed $1.5 billion settlement over Anthropic’s rampant piracy of books to train AI.

The proposed settlement comes in a case where Anthropic could have owed more than $1 trillion in damages after Alsup certified a class that included up to 7 million claimants whose works were illegally downloaded by the AI company.

Instead, critics fear Anthropic will get off cheaply, striking a deal with authors suing that covers less than 500,000 works and paying a small fraction of its total valuation (currently $183 billion) to get away with the massive theft. Defector noted that the settlement doesn’t even require Anthropic to admit wrongdoing, while the company continues raising billions based on models trained on authors’ works. Most recently, Anthropic raised $13 billion in a funding round, making back about 10 times the proposed settlement amount after announcing the deal.

Alsup expressed grave concerns that lawyers rushed the deal, which he said now risks being shoved “down the throat of authors,” Bloomberg Law reported.

In an order, Alsup clarified why he thought the proposed settlement was a chaotic mess. The judge said he was “disappointed that counsel have left important questions to be answered in the future,” seeking approval for the settlement despite the Works List, the Class List, the Claim Form, and the process for notification, allocation, and dispute resolution all remaining unresolved.

Denying preliminary approval of the settlement, Alsup suggested that the agreement is “nowhere close to complete,” forcing Anthropic and authors’ lawyers to “recalibrate” the largest publicly reported copyright class-action settlement ever inked, Bloomberg reported.

Of particular concern, the settlement failed to outline how disbursements would be managed for works with multiple claimants, Alsup noted. Until all these details are ironed out, Alsup intends to withhold approval, the order said.

One big change the judge wants to see is the addition of instructions requiring “anyone with copyright ownership” to opt in, with the consequence that the work won’t be covered if even one rights holder opts out, Bloomberg reported. There should also be an instruction that any disputes over ownership or submitted claims be settled in state court, Alsup said.

Judge calls out OpenAI’s “straw man” argument in New York Times copyright suit

“Taken as true, these facts give rise to a plausible inference that defendants at a minimum had reason to investigate and uncover end-user infringement,” Stein wrote.

To Stein, the fact that OpenAI maintains an “ongoing relationship” with users by providing outputs that respond to users’ prompts also supports contributory infringement claims, despite OpenAI’s argument that ChatGPT’s “substantial noninfringing uses” are exonerative.

OpenAI defeated some claims

For OpenAI, Stein’s ruling likely disappoints, although Stein did drop some of NYT’s claims.

Likely upsetting to news publishers, that included a “free-riding” claim that ChatGPT unfairly profits off time-sensitive “hot news” items, including the NYT’s Wirecutter posts. Stein explained that news publishers failed to plausibly allege non-attribution (which is key to a free-riding claim) because, for example, ChatGPT cites the NYT when sharing information from Wirecutter posts. Those claims are pre-empted by the Copyright Act anyway, Stein wrote, granting OpenAI’s motion to dismiss.

Stein also dismissed a claim from the NYT regarding alleged removal of copyright management information (CMI), which Stein said cannot be proven simply because ChatGPT reproduces excerpts of NYT articles without CMI.

The Digital Millennium Copyright Act (DMCA) requires news publishers to show that ChatGPT’s outputs are “close to identical” to the original work, Stein said, and allowing publishers’ claims based on excerpts “would risk boundless DMCA liability”—including for any use of block quotes without CMI.

Asked for comment on the ruling, an OpenAI spokesperson declined to go into any specifics, instead repeating OpenAI’s long-held argument that AI training on copyrighted works is fair use. (Last month, OpenAI warned Donald Trump that the US would lose the AI race to China if courts ruled against that argument.)

“ChatGPT helps enhance human creativity, advance scientific discovery and medical research, and enable hundreds of millions of people to improve their daily lives,” OpenAI’s spokesperson said. “Our models empower innovation, and are trained on publicly available data and grounded in fair use.”

Elon Musk’s X can’t invent its own copyright law, judge says

Who owns X data? Everyone but X

Judge rules copyright law governs public data scraping, not X’s terms.

US district judge William Alsup has dismissed a lawsuit brought by Elon Musk’s X Corp. against Bright Data, a data-scraping company accused of improperly accessing X (formerly Twitter) systems and violating both X terms and state laws when scraping and selling data.

X sued Bright Data to stop the company from scraping and selling X data to academic institutes and businesses, including Fortune 500 companies.

According to Alsup, X failed to state a claim while arguing that companies like Bright Data should have to pay X to access public data posted by X users.

“To the extent the claims are based on access to systems, they fail because X Corp. has alleged no more than threadbare recitals,” parroting laws and findings in other cases without providing any supporting evidence, Alsup wrote. “To the extent the claims are based on scraping and selling of data, they fail because they are preempted by federal law,” specifically standing as an “obstacle to the accomplishment and execution of” the Copyright Act.

The judge found that X Corp’s argument exposed a tension between the platform’s desire to control user data while also enjoying the safe harbor of Section 230 of the Communications Decency Act, which allows X to avoid liability for third-party content. If X owned the data, it could perhaps argue it has exclusive rights to control the data, but then it wouldn’t have safe harbor.

“X Corp. wants it both ways: to keep its safe harbors yet exercise a copyright owner’s right to exclude, wresting fees from those who wish to extract and copy X users’ content,” Alsup wrote.

If X got its way, Alsup warned, “X Corp. would entrench its own private copyright system that rivals, even conflicts with, the actual copyright system enacted by Congress” and “yank into its private domain and hold for sale information open to all, exercising a copyright owner’s right to exclude where it has no such right.”

That “would upend the careful balance Congress struck between what copyright owners own and do not own,” Alsup wrote, potentially shrinking the public domain.

“Applying general principles, this order concludes that the extent to which public data may be freely copied from social media platforms, even under the banner of scraping, should generally be governed by the Copyright Act, not by conflicting, ubiquitous terms,” Alsup wrote.

Bright Data CEO Or Lenchner said in a statement provided to Ars that Alsup’s decision had “profound implications in business, research, training of AI models, and beyond.”

“Bright Data has proven that ethical and transparent scraping practices for legitimate business use and social good initiatives are legally sound,” Lenchner said. “Companies that try to control user data intended for public consumption will not win this legal battle.”

Alsup pointed out that X’s lawsuit was “not looking to protect X users’ privacy” but rather to block Bright Data from interfering with its “own sale of its data through a tiered subscription service.”

“X Corp. is happy to allow the extraction and copying of X users’ content so long as it gets paid,” Alsup wrote.

In a sea of vague claims that scraping is “unfair,” perhaps most deficient in X’s complaint, Alsup suggested, was X’s failure to allege that Bright Data’s scraping impaired its services or that X suffered any damages.

“There are no allegations of servers harmed or identities misrepresented,” Alsup wrote. “Additionally, there are no allegations of any damage resulting from automated or unauthorized access.”

X will be allowed to amend its complaint and appeal. The case may be strengthened if X can show evidence of damages or prove that the scraping overburdened X or otherwise deprived X users of their use of the platform in a way that could damage X’s reputation.

But as it currently stands, X’s arguments in many ways appear rather “bare,” Alsup wrote, while its terms of service make crystal clear to users that “[w]hat’s yours is yours—you own your Content.”

By attempting to exclude Bright Data from accessing public X posts owned by X users, X also nearly “obliterated” the “fair use” provision of the Copyright Act, “flouting” Congress’ intent in passing the law, Alsup wrote.

“Only by receiving permission and paying X Corp. could Bright Data, its customers, and other X users freely reproduce, adapt, distribute, and display what might (or might not) be available for taking and selling as fair use,” Alsup wrote. “Thus, Bright Data, its customers, and other X users who wanted to make fair use of copyrighted content would not be able to do so.”

A win for X could have had dire consequences for the Internet, Alsup suggested. In dismissing the complaint, Alsup cited an appeals court ruling that giving social media companies “free rein to decide, on any basis, who can collect and use data—data that the companies do not own, that they otherwise make publicly available to viewers, and that the companies themselves collect and use—risks the possible creation of information monopolies that would disserve the public interest.”

Because that outcome was averted, Lenchner is celebrating Bright Data’s win.

“Bright Data’s victory over X makes it clear to the world that public information on the web belongs to all of us, and any attempt to deny the public access will fail,” Lenchner said.

In 2023, Bright Data won a similar lawsuit lobbed by Meta over scraping public Facebook and Instagram data. These lawsuits, Lenchner alleged, “are used as a monetary weapon to discourage collecting public data from sites, so conglomerates can hoard user-generated public data.”

“Courts recognize this and the risks it poses of information monopolies and ownership of the Internet,” Lenchner said.

X did not respond to Ars’ request to comment.
