In April, TSMC was provided with $6.6 billion in direct CHIPS Act funding to “support TSMC’s investment of more than $65 billion in three greenfield leading-edge fabs in Phoenix, Arizona, which will manufacture the world’s most advanced semiconductors,” the Department of Commerce said.
These investments are key to the Biden-Harris administration’s mission of strengthening “economic and national security by providing a reliable domestic supply of the chips that will underpin the future economy, powering the AI boom and other fast-growing industries like consumer electronics, automotive, Internet of Things, and high-performance computing,” the department noted. And in particular, the funding will help America “maintain our competitive edge” in artificial intelligence, the department said.
It likely wouldn’t make sense to prop TSMC up to help the US “onshore the critical hardware manufacturing capabilities that underpin AI’s deep language learning algorithms and inferencing techniques,” only to then limit the company’s access to US-made tech. TSMC’s Arizona fabs are supposed to support companies like Apple, Nvidia, and Qualcomm and enable them to “compete effectively,” the Department of Commerce said.
Currently, it’s unclear where the US probe into TSMC will go or whether a damaging finding could potentially impact TSMC’s CHIPS funding.
Last fall, though, the Department of Commerce published a final rule designed to “prevent CHIPS funds from being used to directly or indirectly benefit foreign countries of concern,” such as China.
If the US suspected that TSMC was aiding Huawei’s AI chip manufacturing, the company could be perceived as circumventing CHIPS guardrails prohibiting TSMC from “knowingly engaging in any joint research or technology licensing effort with a foreign entity of concern that relates to a technology or product that raises national security concerns.”
Violating this “technology clawback” provision of the final rule risks “the full amount” of CHIPS Act funding being “recovered” by the Department of Commerce. That outcome seems unlikely, though, given that TSMC has been awarded more funding than any other recipient apart from Intel.
The Department of Commerce declined Ars’ request to comment on whether TSMC’s CHIPS Act funding could be impacted by the reported probe.
Judge calls for a swift end to experts secretly using AI to sway cases
A New York judge recently called out an expert witness for using Microsoft’s Copilot chatbot to inaccurately estimate damages in a real estate dispute whose outcome partly depended on an accurate assessment of damages.
In an order Thursday, Judge Jonathan Schopf warned that, “due to the nature of the rapid evolution of artificial intelligence and its inherent reliability issues,” any use of AI should be disclosed before testimony or evidence is admitted in court. Admitting that the court “has no objective understanding as to how Copilot works,” Schopf suggested that the legal system could be disrupted if experts started overly relying on chatbots en masse.
His warning came after an expert witness, Charles Ranson, dubiously used Copilot to cross-check calculations in a dispute over a $485,000 rental property in the Bahamas that had been included in a trust for a deceased man’s son. The court was being asked to assess whether the executrix and trustee—the deceased man’s sister—breached her fiduciary duties by delaying the sale of the property while admittedly using it for personal vacations.
To win, the surviving son had to prove that his aunt breached her duties by retaining the property, that her vacations there were a form of self-dealing, and that he suffered damages from her alleged misuse of the property.
It was up to Ranson to figure out how much would be owed to the son had the aunt sold the property in 2008 compared to the actual sale price in 2022. But Ranson, an expert in trust and estate litigation, “had no relevant real estate expertise,” Schopf said, finding that Ranson’s testimony was “entirely speculative” and failed to consider obvious facts, such as the pandemic’s impact on rental prices or trust expenses like real estate taxes.
Seemingly because Ranson didn’t have the relevant experience in real estate, he turned to Copilot to fill in the blanks and crunch the numbers. The move surprised Internet law expert Eric Goldman, who told Ars that “lawyers retain expert witnesses for their specialized expertise, and it doesn’t make any sense for an expert witness to essentially outsource that expertise to generative AI.”
“If the expert witness is simply asking a chatbot for a computation, then the lawyers could make that same request directly without relying on the expert witness (and paying the expert’s substantial fees),” Goldman suggested.
Perhaps the son’s legal team wasn’t aware of how big a role Copilot played. Schopf noted that Ranson couldn’t recall what prompts he used to arrive at his damages estimate. The expert witness also couldn’t recall any sources for the information he took from the chatbot and admitted that he lacked a basic understanding of how Copilot “works or how it arrives at a given output.”
Ars could not immediately reach Ranson for comment. But in Schopf’s order, the judge wrote that Ranson defended using Copilot as a common practice for expert witnesses like him today.
“Ranson was adamant in his testimony that the use of Copilot or other artificial intelligence tools, for drafting expert reports is generally accepted in the field of fiduciary services and represents the future of analysis of fiduciary decisions; however, he could not name any publications regarding its use or any other sources to confirm that it is a generally accepted methodology,” Schopf wrote.
Goldman noted that Ranson relying on Copilot for “what was essentially a numerical computation was especially puzzling because of generative AI’s known hallucinatory tendencies, which makes numerical computations untrustworthy.”
Because Ranson could not adequately explain how Copilot works, Schopf took the extra time to try using Copilot himself to reproduce the estimates that Ranson cited—and he could not.
Each time, the court entered the same query into Copilot—”Can you calculate the value of $250,000 invested in the Vanguard Balanced Index Fund from December 31, 2004 through January 31, 2021?”—and each time Copilot generated a slightly different answer.
This “calls into question the reliability and accuracy of Copilot to generate evidence to be relied upon in a court proceeding,” Schopf wrote.
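The inconsistency matters because the underlying math is deterministic: given a starting balance and a fixed series of fund returns, the same inputs should always produce the same output. As a rough illustration only, here is a minimal Python sketch of that kind of compounding calculation, using placeholder return figures rather than the Vanguard fund’s actual performance.

```python
# A minimal sketch of the kind of deterministic calculation at issue: compounding an
# initial investment through a series of annual returns. The return figures below are
# placeholders, not the Vanguard Balanced Index Fund's actual performance.
def grow_investment(principal: float, annual_returns: list[float]) -> float:
    """Compound a principal through a sequence of yearly returns (e.g., 0.07 = 7%)."""
    value = principal
    for r in annual_returns:
        value *= 1 + r
    return value

hypothetical_returns = [0.07, -0.22, 0.20, 0.13, 0.04]  # placeholder values only
print(round(grow_investment(250_000, hypothetical_returns), 2))
# Rerunning this with the same inputs always prints the same number, unlike the
# court's repeated Copilot queries, which returned slightly different answers.
```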
Chatbot not to blame, judge says
While the court was experimenting with Copilot, it also probed the chatbot for answers to a bigger-picture legal question: Are Copilot’s responses accurate enough to be cited in court?
The court found that Copilot had less faith in its outputs than Ranson seemingly did. When asked “are you accurate” or “reliable,” Copilot responded that “my accuracy is only as good as my sources, so for critical matters, it’s always wise to verify.” When more specifically asked, “Are your calculations reliable enough for use in court,” Copilot similarly recommended that outputs “should always be verified by experts and accompanied by professional evaluations before being used in court.”
Although it seemed clear that Ranson did not verify outputs before using them in court, Schopf noted that at least “developers of the Copilot program recognize the need for its supervision by a trained human operator to verify the accuracy of the submitted information as well as the output.”
Microsoft declined Ars’ request to comment.
Until a bright-line rule exists telling courts when to accept AI-generated testimony, Schopf suggested that courts should require disclosures from lawyers to stop chatbot-spouted inadmissible testimony from disrupting the legal system.
“The use of artificial intelligence is a rapidly growing reality across many industries,” Schopf wrote. “The mere fact that artificial intelligence has played a role, which continues to expand in our everyday lives, does not make the results generated by artificial intelligence admissible in Court.”
Ultimately, Schopf found that there was no breach of fiduciary duty, negating the need for Ranson’s Copilot-cribbed testimony on damages in the Bahamas property case. Schopf denied all of the son’s objections in their entirety (as well as any future claims) after calling out Ranson’s misuse of the chatbot at length.
But in his order, the judge suggested that Ranson seemed to get it all wrong before involving the chatbot.
“Whether or not he was retained and/or qualified as a damages expert in areas other than fiduciary duties, his testimony shows that he admittedly did not perform a full analysis of the problem, utilized an incorrect time period for damages, and failed to consider obvious elements into his calculations, all of which go against the weight and credibility of his opinion,” Schopf wrote.
Schopf noted that the evidence showed that rather than the son losing money from his aunt’s management of the trust—which Ranson’s cited chatbot’s outputs supposedly supported—the sale of the property in 2022 led to “no attributable loss of capital” and “in fact, it generated an overall profit to the Trust.”
Goldman suggested that Ranson’s use of Copilot saved him little effort while seeming to damage his credibility in court.
“It would not have been difficult for the expert to pull the necessary data directly from primary sources, so the process didn’t even save much time—but that shortcut came at the cost of the expert’s credibility,” Goldman told Ars.
Among the first AI companies that the Federal Trade Commission has exposed as deceiving consumers is DoNotPay—which initially was advertised as “the world’s first robot lawyer” with the ability to “sue anyone with the click of a button.”
On Wednesday, the FTC announced that it took action to stop DoNotPay from making bogus claims after learning that the AI startup conducted no testing “to determine whether its AI chatbot’s output was equal to the level of a human lawyer.” DoNotPay also did not “hire or retain any attorneys” to help verify AI outputs or validate DoNotPay’s legal claims.
DoNotPay accepted no liability. But to settle the charges that DoNotPay violated the FTC Act, the AI startup agreed to pay $193,000 if the FTC’s consent agreement is confirmed following a 30-day public comment period. Additionally, DoNotPay agreed to warn “consumers who subscribed to the service between 2021 and 2023” about the “limitations of law-related features on the service,” the FTC said.
Moving forward, DoNotPay would also be prohibited under the settlement from making baseless claims that any of its features can be substituted for any professional service.
A DoNotPay spokesperson told Ars that the company “is pleased to have worked constructively with the FTC to settle this case and fully resolve these issues, without admitting liability.”
“The complaint relates to the usage of a few hundred customers some years ago (out of millions of people), with services that have long been discontinued,” DoNotPay’s spokesperson said.
The FTC’s settlement with DoNotPay is part of a larger agency effort to crack down on deceptive AI claims. Four other AI companies were hit with enforcement actions Wednesday, the FTC said, and FTC Chair Lina Khan confirmed that the agency’s so-called “Operation AI Comply” will continue monitoring companies’ attempts to “lure consumers into bogus schemes” or use AI tools to “turbocharge deception.”
“Using AI tools to trick, mislead, or defraud people is illegal,” Khan said. “The FTC’s enforcement actions make clear that there is no AI exemption from the laws on the books. By cracking down on unfair or deceptive practices in these markets, FTC is ensuring that honest businesses and innovators can get a fair shot and consumers are being protected.”
DoNotPay never tested robot lawyer
DoNotPay was initially released in 2015 as a free way to contest parking tickets. It soon expanded its services to supposedly cover 200 areas of law—aiding with everything from breach of contract claims to restraining orders to insurance claims and divorce settlements.
As DoNotPay’s legal services expanded, the company defended its innovative approach to replacing lawyers while acknowledging that it was on seemingly shaky legal ground. In 2018, DoNotPay CEO Joshua Browder confirmed to the ABA Journal that the legal services were provided with “no lawyer oversight.” But he said that he was only “a bit worried” about threats to sue DoNotPay for unlicensed practice of law. Because DoNotPay was free, he expected he could avoid some legal challenges.
According to the FTC complaint, DoNotPay began charging subscribers $36 every two months in 2019 while making several false claims in ads to apparently drive up subscriptions.
OpenAI hopes to convince the White House to approve a sprawling plan that would place 5-gigawatt AI data centers in different US cities, Bloomberg reports.
The AI company’s CEO, Sam Altman, supposedly pitched the plan after a recent meeting with the Biden administration where stakeholders discussed AI infrastructure needs. Bloomberg reviewed an OpenAI document outlining the plan, reporting that 5 gigawatts “is roughly the equivalent of five nuclear reactors” and warning that each data center will likely require “more energy than is used to power an entire city or about 3 million homes.”
According to OpenAI, the US needs these massive data centers to expand AI capabilities domestically, protect national security, and effectively compete with China. If approved, the data centers would generate “thousands of new jobs,” OpenAI’s document promised, and help cement the US as an AI leader globally.
But the energy demand is so enormous that OpenAI told officials that the “US needs policies that support greater data center capacity,” or else the US could fall behind other countries in AI development, the document said.
Energy executives told Bloomberg that “powering even a single 5-gigawatt data center would be a challenge,” as power projects nationwide are already “facing delays due to long wait times to connect to grids, permitting delays, supply chain issues, and labor shortages.” Most likely, OpenAI’s data centers wouldn’t rely entirely on the grid, though, instead requiring a “mix of new wind and solar farms, battery storage and a connection to the grid,” John Ketchum, CEO of NextEra Energy Inc, told Bloomberg.
That’s a big problem for OpenAI, since one energy executive, Constellation Energy Corp. CEO Joe Dominguez, told Bloomberg that he’s heard that OpenAI wants to build five to seven data centers. “As an engineer,” Dominguez said, he doesn’t think OpenAI’s plan is “feasible,” and he suggested it would take longer than the timeline needed to address current national security risks as US-China tensions worsen.
OpenAI may be hoping to jump the line and avoid those delays—if the White House approves the company’s ambitious data center plan. For now, a person familiar with OpenAI’s plan told Bloomberg that OpenAI is focused on launching a single data center before expanding the project to “various US cities.”
Bloomberg’s report comes after OpenAI’s chief investor, Microsoft, announced a 20-year deal with Constellation to reopen Pennsylvania’s shuttered Three Mile Island nuclear plant to provide a new energy source for data centers powering AI development and other technologies. But even if that deal is approved by regulators, the resulting energy supply that Microsoft could access—roughly 835 megawatts (0.835 gigawatts) of generation, enough to power approximately 800,000 homes—is still only about one-sixth of the 5 gigawatts OpenAI wants for each of its data centers.
Ketchum told Bloomberg that it’s easier to find a US site for a 1-gigawatt data center, but locating a site for a 5-gigawatt facility would likely be a bigger challenge. Notably, Amazon recently bought a $650 million nuclear-powered data center in Pennsylvania with a 2.5-gigawatt capacity. At the meeting with the Biden administration, OpenAI suggested opening large-scale data centers in Wisconsin, California, Texas, and Pennsylvania, a source familiar with the matter told CNBC.
During that meeting, the Biden administration confirmed that developing large-scale AI data centers is a priority, announcing “a new Task Force on AI Datacenter Infrastructure to coordinate policy across government.” OpenAI seems to be trying to get the task force’s attention early on, outlining in the document that Bloomberg reviewed the national security and economic benefits its data centers could provide for the US.
In a statement to Bloomberg, OpenAI’s spokesperson said that “OpenAI is actively working to strengthen AI infrastructure in the US, which we believe is critical to keeping America at the forefront of global innovation, boosting reindustrialization across the country, and making AI’s benefits accessible to everyone.”
Big Tech companies and AI startups will likely continue pressuring officials to approve data center expansions, as well as new kinds of nuclear reactors, as the global AI explosion continues. Goldman Sachs estimated that “data center power demand will grow 160 percent by 2030.” To lock in power supplies for its AI, Microsoft has even been training AI to draft the documents needed to secure government approvals for nuclear plants to power AI data centers, according to the tech news site Freethink.
Cloudflare announced new tools Monday that it claims will help end the era of endless AI scraping by giving all sites on its network the power to block bots in one click.
That will help stop the firehose of unrestricted AI scraping, but, perhaps even more intriguing to content creators everywhere, Cloudflare says it will also make it easier to identify which content bots scan most, so that sites can eventually wall off access and charge bots to scrape their most valuable content. To pave the way for that future, Cloudflare is also creating a marketplace where sites can negotiate content deals based on more granular AI audits of their sites.
These tools, Cloudflare’s blog said, give content creators “for the first time” ways “to quickly and easily understand how AI model providers are using their content, and then take control of whether and how the models are able to access it.”
That’s necessary for content creators because the rise of generative AI has made it harder to value their content, Cloudflare suggested in a longer blog explaining the tools.
Previously, sites could distinguish between approving access to helpful bots that drive traffic, like search engine crawlers, and denying access to bad bots that try to take down sites or scrape sensitive or competitive data.
But now, “Large Language Models (LLMs) and other generative tools created a murkier third category” of bots, Cloudflare said, that don’t perfectly fit in either category. They don’t “necessarily drive traffic” like a good bot, but they also don’t try to steal sensitive data like a bad bot, so many site operators don’t have a clear way to think about the “value exchange” of allowing AI scraping, Cloudflare said.
That’s a problem because enabling all scraping could hurt content creators in the long run, Cloudflare predicted.
“Many sites allowed these AI crawlers to scan their content because these crawlers, for the most part, looked like ‘good’ bots—only for the result to mean less traffic to their site as their content is repackaged in AI-written answers,” Cloudflare said.
All this unrestricted AI scraping “poses a risk to an open Internet,” Cloudflare warned, proposing that its tools could set a new industry standard for how content is scraped online.
How to block bots in one click
Increasingly, creators fighting to control what happens with their content have been pushed to either sue AI companies to block unwanted scraping, as The New York Times has, or put content behind paywalls, decreasing public access to information.
While some big publishers have been striking content deals with AI companies to license content, Cloudflare is hoping new tools will help to level the playing field for everyone. That way, “there can be a transparent exchange between the websites that want greater control over their content, and the AI model providers that require fresh data sources, so that everyone benefits,” Cloudflare said.
Today, Cloudflare site operators can stop manually blocking each AI bot one by one and instead choose to “block all AI bots in one click,” Cloudflare said.
They can do this by visiting the Bots section under the Security tab of the Cloudflare dashboard, then clicking a blue link in the top-right corner “to configure how Cloudflare’s proxy handles bot traffic,” Cloudflare said. On that screen, operators can easily “toggle the button in the ‘Block AI Scrapers and Crawlers’ card to the ‘On’ position,” blocking everything and giving content creators time to strategize what access they want to re-enable, if any.
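For context, the one-click toggle replaces the kind of manual, per-bot filtering that site operators previously had to maintain themselves. The sketch below is a rough, non-exhaustive illustration of that manual approach, checking each request’s User-Agent string against a small sample of published AI crawler tokens; it is not Cloudflare’s actual bot catalog or implementation.

```python
# A rough, non-exhaustive illustration of the manual approach that the one-click
# toggle replaces: checking each request's User-Agent against known AI crawler
# tokens. The token list is a small sample, not Cloudflare's actual bot catalog.
AI_CRAWLER_TOKENS = ("GPTBot", "CCBot", "ClaudeBot", "PerplexityBot")

def is_ai_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a known AI crawler token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AI_CRAWLER_TOKENS)

def handle_request(user_agent: str) -> int:
    """Return an HTTP status code: 403 for listed AI crawlers, 200 otherwise."""
    return 403 if is_ai_crawler(user_agent) else 200

print(handle_request("Mozilla/5.0 (compatible; GPTBot/1.0)"))       # 403
print(handle_request("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))  # 200
```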
Beyond just blocking bots, operators can also conduct AI audits, quickly analyzing which sections of their sites are scanned most by which bots. From there, operators can decide which scraping is allowed and use sophisticated controls to decide which bots can scrape which parts of their sites.
“For some teams, the decision will be to allow the bots associated with AI search engines to scan their Internet properties because those tools can still drive traffic to the site,” Cloudflare’s blog explained. “Other organizations might sign deals with a specific model provider, and they want to allow any type of bot from that provider to access their content.”
For publishers already playing whack-a-mole with bots, a key perk would be if Cloudflare’s tools let them write rules restricting bots that scrape sites for both “good” and “bad” purposes, keeping the good scraping and blocking the bad.
Perhaps the most frustrating bot for publishers today is the Googlebot, which scrapes sites to populate search results as well as to train the AI behind Google’s search AI overviews, which can reduce traffic to source sites by summarizing their content. Publishers currently have no way of opting out of training the models fueling Google’s AI overviews without losing visibility in search results, and Cloudflare’s tools won’t be able to get publishers out of that uncomfortable position, Cloudflare CEO Matthew Prince confirmed to Ars.
For any site operators tempted to toggle off all AI scraping, the risk that blocking the Googlebot will inadvertently cause dips in search traffic may be a compelling reason not to use Cloudflare’s one-click solution.
However, Prince expects “that Google’s practices over the long term won’t be sustainable” and “that Cloudflare will be a part of getting Google and other folks that are like Google” to give creators “much more granular control over” how bots like the Googlebot scrape the web to train AI.
Prince told Ars that while Google solves its “philosophical” internal question of whether the Googlebot’s scraping is for search or for AI, a technical solution to block one bot from certain kinds of scraping will likely soon emerge. And in the meantime, “there can also be a legal solution” that “can rely on contract law” based on improving sites’ terms of service.
Not every site would, of course, be able to afford a lawsuit to challenge AI scraping, but to help creators better defend themselves, Cloudflare drafted “model terms of use that every content creator can add to their sites to legally protect their rights as sites gain more control over AI scraping.” With these terms, sites could perhaps more easily dispute any restricted scraping discovered through Cloudflare’s analytics tools.
“One way or another, Google is going to get forced to be more fine-grained here,” Prince predicted.
LinkedIn admitted Wednesday that it has been training its own AI on many users’ data without seeking consent. Now there’s no way for users to opt out of training that has already occurred, as LinkedIn limits opt-out to only future AI training.
In a blog detailing updates coming on November 20, LinkedIn general counsel Blake Lawit confirmed that LinkedIn’s user agreement and privacy policy will be changed to better explain how users’ personal data powers AI on the platform.
Under the new privacy policy, LinkedIn now informs users that “we may use your personal data… [to] develop and train artificial intelligence (AI) models, develop, provide, and personalize our Services, and gain insights with the help of AI, automated systems, and inferences, so that our Services can be more relevant and useful to you and others.”
An FAQ explained that the personal data could be collected any time a user interacts with generative AI or other AI features, as well as when a user composes a post, changes their preferences, provides feedback to LinkedIn, or uses the platform for any amount of time.
That data is then stored until the user deletes the AI-generated content. LinkedIn recommends that users turn to its data access tool if they want to delete, or request the deletion of, data collected about their past LinkedIn activity.
LinkedIn’s AI models powering generative AI features “may be trained by LinkedIn or another provider,” such as Microsoft, which provides some AI models through its Azure OpenAI service, the FAQ said.
A potentially major privacy risk for users, LinkedIn’s FAQ noted, is that users who “provide personal data as an input to a generative AI powered feature” could end up seeing their “personal data being provided as an output.”
LinkedIn claims that it “seeks to minimize personal data in the data sets used to train the models,” relying on “privacy enhancing technologies to redact or remove personal data from the training dataset.”
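LinkedIn has not disclosed what those technologies are. As a toy illustration only, the simplest form of such redaction is pattern-based scrubbing of obvious identifiers before text enters a training set; real pipelines typically layer machine-learning entity detection on top of rules like these.

```python
import re

# A toy illustration of pattern-based redaction, one of the simplest "privacy
# enhancing" techniques. LinkedIn has not disclosed its actual methods; real
# pipelines typically combine ML-based entity detection with rules like these.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace matches of each pattern with a placeholder tag."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach me at jane.doe@example.com or +1 (555) 123-4567."))
# -> "Reach me at [EMAIL] or [PHONE]."
```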
While Lawit’s blog avoids clarifying whether data already collected can be removed from AI training data sets, the FAQ affirmed that users who were automatically opted in to sharing personal data for AI training can only opt out of the invasive data collection “going forward.”
Opting out “does not affect training that has already taken place,” the FAQ said.
A LinkedIn spokesperson told Ars that it “benefits all members” to be opted in to AI training “by default.”
“People can choose to opt out, but they come to LinkedIn to be found for jobs and networking and generative AI is part of how we are helping professionals with that change,” LinkedIn’s spokesperson said.
By allowing opt-outs of future AI training, LinkedIn’s spokesperson additionally claimed that the platform is giving “people using LinkedIn even more choice and control when it comes to how we use data to train our generative AI technology.”
How to opt out of AI training on LinkedIn
Users can opt out of AI training by navigating to the “Data privacy” section in their account settings, then turning off the option allowing collection of “data for generative AI improvement” that LinkedIn otherwise automatically turns on for most users.
The only exception is for users in the European Economic Area or Switzerland, who are protected by stricter privacy laws that require platforms either to obtain consent before collecting personal data or to justify the collection as a legitimate interest. Those users will not see an option to opt out, because they were never opted in, LinkedIn repeatedly confirmed.
Additionally, users can “object to the use of their personal data for training” generative AI models not used to generate LinkedIn content—such as models used for personalization or content moderation purposes, The Verge noted—by submitting the LinkedIn Data Processing Objection Form.
Last year, LinkedIn shared AI principles, promising to take “meaningful steps to reduce the potential risks of AI.”
One risk that the updated user agreement specified is that using LinkedIn’s generative features to help populate a profile or generate suggestions when writing a post could generate content that “might be inaccurate, incomplete, delayed, misleading or not suitable for your purposes.”
Users are advised that they are responsible for avoiding sharing misleading information or otherwise spreading AI-generated content that may violate LinkedIn’s community guidelines. And users are additionally warned to be cautious when relying on any information shared on the platform.
“Like all content and other information on our Services, regardless of whether it’s labeled as created by ‘AI,’ be sure to carefully review before relying on it,” LinkedIn’s user agreement says.
Back in 2023, LinkedIn claimed that it would always “seek to explain in clear and simple ways how our use of AI impacts people,” because users’ “understanding of AI starts with transparency.”
Legislation like the European Union’s AI Act and the GDPR—especially with its strong privacy protections—would, if enacted elsewhere, lead to fewer shocks to unsuspecting users. That would put all companies and their users on equal footing when it comes to training AI models and result in fewer nasty surprises and angry customers.
In his complaint, Christopher Kohls—who is known as “Mr Reagan” on YouTube and X (formerly Twitter)—said that he was suing “to defend all Americans’ right to satirize politicians.” He claimed that California laws, AB 2655 and AB 2839, were urgently passed after X owner Elon Musk shared a partly AI-generated parody video on the social media platform that Kohls created to “lampoon” presidential hopeful Kamala Harris.
AB 2655, known as the “Defending Democracy from Deepfake Deception Act,” prohibits creating “with actual malice” any “materially deceptive audio or visual media of a candidate for elective office with the intent to injure the candidate’s reputation or to deceive a voter into voting for or against the candidate, within 60 days of the election.” It requires social media platforms to block or remove any reported deceptive material and label “certain additional content” deemed “inauthentic, fake, or false” to prevent election interference.
The other law at issue, AB 2839, titled “Elections: deceptive media in advertisements,” bans anyone from “knowingly distributing an advertisement or other election communication” with “malice” that “contains certain materially deceptive content” within 120 days of an election in California and, in some cases, within 60 days after an election.
Both bills were signed into law on September 17, and Kohls filed his complaint that day, alleging that both must be permanently blocked as unconstitutional.
Elon Musk called out for boosting Kohls’ video
Kohls’ video that Musk shared seemingly would violate these laws by using AI to make Harris appear to give speeches that she never gave. The manipulated audio sounds like Harris, who appears to be mocking herself as a “diversity hire” and claiming that any critics must be “sexist and racist.”
“Making fun of presidential candidates and other public figures is an American pastime,” Kohls said, defending his parody video. He pointed to a long history of political cartoons and comedic impressions of politicians, claiming that “AI-generated commentary, though a new mode of speech, falls squarely within this tradition.”
While Kohls’ post was clearly marked “parody” in the YouTube title and in his post on X, that “parody” label did not carry over when Musk re-posted the video. This lack of a parody label on Musk’s post—which got approximately 136 million views, roughly twice as many as Kohls’ post—set off California governor Gavin Newsom, who immediately blasted Musk’s post and vowed on X to make content like Kohls’ video “illegal.”
In response to Newsom, Musk poked fun at the governor, posting that “I checked with renowned world authority, Professor Suggon Deeznutz, and he said parody is legal in America.” For his part, Kohls put up a second parody video targeting Harris, calling Newsom a “bully” in his complaint and claiming that he had to “punch back.”
Shortly after these online exchanges, California lawmakers allegedly rushed to back the governor, Kohls’ complaint said. They allegedly amended the deepfake bills to ensure that Kohls’ video would be banned when the bills were signed into law, replacing a broad exception for satire in one law with a narrower safe harbor that Kohls claimed would chill humorists everywhere.
“For videos,” his complaint said, disclaimers required under AB 2839 must “appear for the duration of the video” and “must be in a font size ‘no smaller than the largest font size of other text appearing in the visual media.'” For a satirist like Kohls who uses large fonts to optimize videos for mobile, this “would require the disclaimer text to be so large that it could not fit on the screen,” his complaint said.
On top of seeming impractical, the disclaimers would “fundamentally” alter “the nature of his message” by removing the comedic effect for viewers by distracting from what allegedly makes the videos funny—”the juxtaposition of over-the-top statements by the AI-generated ‘narrator,’ contrasted with the seemingly earnest style of the video as if it were a genuine campaign ad,” Kohls’ complaint alleged.
Imagine watching Saturday Night Live with prominent disclaimers taking up your TV screen, his complaint suggested.
It’s possible that Kohls’ concerns about AB 2839 are unwarranted. Newsom spokesperson Izzy Gardon told Politico that Kohls’ parody label on X was good enough to clear him of liability under the law.
“Requiring them to use the word ‘parody’ on the actual video avoids further misleading the public as the video is shared across the platform,” Gardon said. “It’s unclear why this conservative activist is suing California. This new disclosure law for election misinformation isn’t any more onerous than laws already passed in other states, including Alabama.”
Nevada will soon become the first state to use AI to help speed up the decision-making process when ruling on appeals that impact people’s unemployment benefits.
The state’s Department of Employment, Training, and Rehabilitation (DETR) agreed to pay Google $1,383,838 for the AI technology, a 2024 budget document shows, and it will be launched within the “next several months,” Nevada officials told Gizmodo.
Nevada’s first-of-its-kind AI will rely on a Google cloud service called Vertex AI Studio. Connecting to Google’s servers, the state will fine-tune the AI system to only reference information from DETR’s database, which officials think will ensure its decisions are “more tailored” and the system provides “more accurate results,” Gizmodo reported.
Under the contract, DETR will essentially transfer data from transcripts of unemployment appeals hearings and rulings, after which Google’s AI system will process that data, upload it to the cloud, and then compare the information to previous cases.
In as little as five minutes, the AI will issue a ruling that would’ve taken a state employee about three hours to reach without using AI, DETR’s information technology administrator, Carl Stanfield, told The Nevada Independent. That’s highly valuable to Nevada, which has a backlog of more than 40,000 appeals stemming from a pandemic-related spike in unemployment claims while dealing with “unforeseen staffing shortages” that DETR reported in July.
“The time saving is pretty phenomenal,” Stanfield said.
As a safeguard, the AI’s determination is then reviewed by a state employee to hopefully catch any mistakes, biases, or, perhaps worse, hallucinations in which the AI makes up facts that could impact the outcome of a claimant’s case.
Google’s spokesperson Ashley Simms told Gizmodo that the tech giant will work with the state to “identify and address any potential bias” and to “help them comply with federal and state requirements.” According to the state’s AI guidelines, the agency must prioritize ethical use of the AI system, “avoiding biases and ensuring fairness and transparency in decision-making processes.”
If the reviewer accepts the AI ruling, they’ll sign off on it and issue the decision. Otherwise, the reviewer will edit the decision and submit feedback so that DETR can investigate what went wrong.
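Taken together, the state’s description amounts to a draft-then-review loop: the AI drafts a ruling from the hearing transcript and prior cases, and a human either signs off or edits the decision and files feedback. The Python sketch below models only that workflow; the drafting step is a stub, since Nevada’s actual Vertex AI prompts and configuration are not public.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    case_id: str
    text: str

def generate_draft(case_id: str, transcript: str) -> Draft:
    """Stand-in for the Vertex AI step that drafts a ruling from the hearing
    transcript and prior DETR cases; the real prompts and model are not public."""
    return Draft(case_id, f"[AI-drafted ruling for case {case_id}]")

def review(draft: Draft, approve: bool, edited_text: str = "", feedback: str = "") -> dict:
    """Model the human safeguard: a state employee signs off as-is, or edits the
    ruling and logs feedback so DETR can investigate what the AI got wrong."""
    if approve:
        return {"case": draft.case_id, "ruling": draft.text, "status": "issued"}
    return {
        "case": draft.case_id,
        "ruling": edited_text,
        "status": "issued_with_edits",
        "feedback": feedback,
    }

draft = generate_draft("2024-00123", "hearing transcript text")
decision = review(draft, approve=False,
                  edited_text="Corrected ruling text",
                  feedback="Draft cited the wrong benefit period.")
print(decision["status"])  # issued_with_edits
```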
Gizmodo noted that this novel use of AI “represents a significant experiment by state officials and Google in allowing generative AI to influence a high-stakes government decision—one that could put thousands of dollars in unemployed Nevadans’ pockets or take it away.”
Google declined to comment on whether more states are considering using AI to weigh jobless claims.
Cops are now using AI to generate images of fake kids to help them catch child predators online, a lawsuit filed by the state of New Mexico against Snapchat revealed this week.
According to the complaint, the New Mexico Department of Justice launched an undercover investigation in recent months to prove that Snapchat “is a primary social media platform for sharing child sexual abuse material (CSAM)” and sextortion of minors, because its “algorithm serves up children to adult predators.”
As part of their probe, an investigator “set up a decoy account for a 14-year-old girl, Sexy14Heather.”
An AI-generated image of “Sexy14Heather” included in the New Mexico complaint.
An image of a Snapchat avatar for “Sexy14Heather” included in the New Mexico complaint.
Despite Snapchat setting the fake minor’s profile to private and the account not adding any followers, “Heather” was soon recommended widely to “dangerous accounts, including ones named ‘child.rape’ and ‘pedo_lover10,’ in addition to others that are even more explicit,” the New Mexico DOJ said in a press release.
And after “Heather” accepted a follow request from just one account, the recommendations got even worse. “Snapchat suggested over 91 users, including numerous adult users whose accounts included or sought to exchange sexually explicit content,” New Mexico’s complaint alleged.
“Snapchat is a breeding ground for predators to collect sexually explicit images of children and to find, groom, and extort them,” New Mexico’s complaint alleged.
Posing as “Sexy14Heather,” the investigator swapped messages with adult accounts, including users who “sent inappropriate messages and explicit photos.” In one exchange with a user named “50+ SNGL DAD 4 YNGR,” the fake teen “noted her age, sent a photo, and complained about her parents making her go to school,” prompting the user to send “his own photo” as well as sexually suggestive chats. Other accounts asked “Heather” to “trade presumably explicit content,” and several “attempted to coerce the underage persona into sharing CSAM,” the New Mexico DOJ said.
“Heather” also tested out Snapchat’s search tool, finding that “even though she used no sexually explicit language, the algorithm must have determined that she was looking for CSAM” when she searched for other teen users. It “began recommending users associated with trading” CSAM, including accounts with usernames such as “naughtypics,” “addfortrading,” “teentr3de,” “gayhorny13yox,” and “teentradevirgin,” the investigation found, “suggesting that these accounts also were involved in the dissemination of CSAM.”
This novel use of AI came after Albuquerque police indicted a man, Alejandro Marquez, who pled guilty and was sentenced to 18 years for raping an 11-year-old girl he met through Snapchat’s Quick Add feature in 2022, New Mexico’s complaint said. More recently, the New Mexico complaint said, an Albuquerque man, Jeremy Guthrie, was arrested and sentenced this summer for “raping a 12-year-old girl who he met and cultivated over Snapchat.”
In the past, police have posed as kids online to catch child predators using photos of younger-looking adult women or even younger photos of police officers. Using AI-generated images could be considered a more ethical way to conduct these stings, a lawyer specializing in sex crimes, Carrie Goldberg, told Ars, because “an AI decoy profile is less problematic than using images of an actual child.”
But using AI could complicate investigations and carry its own ethical concerns, Goldberg warned, as child safety experts and law enforcement warn that the Internet is increasingly swamped with AI-generated CSAM.
“In terms of AI being used for entrapment, defendants can defend themselves if they say the government induced them to commit a crime that they were not already predisposed to commit,” Goldberg told Ars. “Of course, it would be ethically concerning if the government were to create deepfake AI child sexual abuse material (CSAM), because those images are illegal, and we don’t want more CSAM in circulation.”
Experts have warned that AI image generators should never be trained on datasets that combine images of real kids with explicit content to avoid any instances of AI-generated CSAM, which is particularly harmful when it appears to depict a real kid or an actual victim of child abuse.
In the New Mexico complaint, only one AI-generated image is included, so it’s unclear how widely the state’s DOJ is using AI or if cops are possibly using more advanced methods to generate multiple images of the same fake kid. It’s also unclear what ethical concerns were weighed before cops began using AI decoys.
The New Mexico DOJ did not respond to Ars’ request for comment.
Goldberg told Ars that “there ought to be standards within law enforcement with how to use AI responsibly,” warning that “we are likely to see more entrapment defenses centered around AI if the government is using the technology in a manipulative way to pressure somebody into committing a crime.”
The Department of Justice is reportedly deepening its probe into Nvidia. Officials have moved on from merely questioning competitors to subpoenaing Nvidia and other tech companies for evidence that could substantiate allegations that Nvidia is abusing its “dominant position in AI computing,” Bloomberg reported.
When news of the DOJ’s probe into the trillion-dollar company was first reported in June, Fast Company reported that scrutiny was intensifying merely because Nvidia was estimated to control “as much as 90 percent of the market for chips” capable of powering AI models. Experts told Fast Company that the DOJ probe might even be good for Nvidia’s business, noting that the market barely moved when the probe was first announced.
But the market’s confidence seemed to be shaken a little more on Tuesday, when Nvidia lost a “record-setting $279 billion” in market value following Bloomberg’s report. Nvidia’s losses became “the biggest single-day market-cap decline on record,” TheStreet reported.
People close to the DOJ’s investigation told Bloomberg that the DOJ’s “legally binding requests” require competitors “to provide information” on Nvidia’s suspected anticompetitive behaviors as a “dominant provider of AI processors.”
One concern is that Nvidia may be giving “preferential supply and pricing to customers who use its technology exclusively or buy its complete systems,” sources told Bloomberg. The DOJ is also reportedly probing Nvidia’s acquisition of RunAI—suspecting the deal may lock RunAI customers into using Nvidia chips.
Bloomberg’s report builds on a report last month from The Information that said that Advanced Micro Devices Inc. (AMD) and other Nvidia rivals were questioned by the DOJ—as well as third parties who could shed light on whether Nvidia potentially abused its market dominance in AI chips to pressure customers into buying more products.
According to Bloomberg’s sources, the DOJ is worried that “Nvidia is making it harder to switch to other suppliers and penalizes buyers that don’t exclusively use its artificial intelligence chips.”
In a statement to Bloomberg, Nvidia insisted that “Nvidia wins on merit, as reflected in our benchmark results and value to customers, who can choose whatever solution is best for them.” Additionally, Bloomberg noted that following a chip shortage in 2022, Nvidia CEO Jensen Huang has said that his company strives to prevent stockpiling of Nvidia’s coveted AI chips by prioritizing customers “who can make use of his products in ready-to-go data centers.”
Potential threats to Nvidia’s dominance
Despite the slump in shares, Nvidia’s market dominance seems unlikely to wane any time soon after its stock more than doubled this year. In an SEC filing this year, Nvidia bragged that its “accelerated computing ecosystem is bringing AI to every enterprise” with an “ecosystem” spanning “nearly 5 million developers and 40,000 companies.” Nvidia specifically highlighted that “more than 1,600 generative AI companies are building on Nvidia,” and according to Bloomberg, Nvidia will close out 2024 with more profits than the total sales of its closest competitor, AMD.
According to Jonathan Kanter, who leads the DOJ’s antitrust division, the DOJ is scrutinizing all aspects of the AI industry—”everything from computing power and the data used to train large language models, to cloud service providers, engineering talent and access to essential hardware such as graphics processing unit chips.” But in particular, the DOJ appears concerned that GPUs like Nvidia’s advanced AI chips remain a “scarce resource.” Kanter told the Financial Times that an “intervention” in “real time” to block a potential monopoly could be “the most meaningful intervention” and the least “invasive” as the AI industry grows.
According to the Dutch Data Protection Authority (DPA), Clearview AI “built an illegal database with billions of photos of faces” by crawling the web and without gaining consent, including from people in the Netherlands.
Clearview AI’s technology—which has been banned in some US cities over concerns that it gives law enforcement unlimited power to track people in their daily lives—works by pulling in more than 40 billion face images from the web without setting “any limitations in terms of geographical location or nationality,” the Dutch DPA found. Perhaps most concerning, the Dutch DPA said, Clearview AI also provides “facial recognition software for identifying children,” therefore indiscriminately processing personal data of minors.
Training on the face image data, the technology then makes it possible to upload a photo of anyone and search for matches on the Internet. People appearing in search results, the Dutch DPA found, can be “unambiguously” identified. Billed as a public safety resource accessible only by law enforcement, Clearview AI’s face database casts too wide a net, the Dutch DPA said, with the majority of people pulled into the tool likely never becoming subject to a police search.
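The Dutch DPA’s description reflects how face search systems of this kind generally work: scraped images are reduced to embedding vectors, and a query photo is matched against the database by similarity search. The sketch below illustrates that general technique with random stand-in vectors; it is not Clearview AI’s actual system or model.

```python
import numpy as np

# A generic sketch of how face search systems of this kind work: each face image is
# reduced to an embedding vector, and a query photo is matched by similarity search.
# This illustrates the general technique, not Clearview AI's actual system.
rng = np.random.default_rng(0)
database = rng.normal(size=(1_000, 128))          # stand-in embeddings for scraped faces
database /= np.linalg.norm(database, axis=1, keepdims=True)

def search(query_embedding: np.ndarray, top_k: int = 5) -> np.ndarray:
    """Return indices of the top_k most similar stored faces by cosine similarity."""
    q = query_embedding / np.linalg.norm(query_embedding)
    scores = database @ q
    return np.argsort(scores)[::-1][:top_k]

query = rng.normal(size=128)   # in practice, produced by a face-embedding model
print(search(query))           # indices of the closest database entries
```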
“The processing of personal data is not only complex and extensive, it moreover offers Clearview’s clients the opportunity to go through data about individual persons and obtain a detailed picture of the lives of these individual persons,” the Dutch DPA said. “These processing operations therefore are highly invasive for data subjects.”
Clearview AI had no legitimate interest under the European Union’s General Data Protection Regulation (GDPR) for the company’s invasive data collection, Dutch DPA Chairman Aleid Wolfsen said in a press release. The Dutch official likened Clearview AI’s sprawling overreach to “a doom scenario from a scary film,” while emphasizing in his decision that Clearview AI has stopped responding to requests to access or remove data not only from citizens in the Netherlands but from people across the EU.
“Facial recognition is a highly intrusive technology that you cannot simply unleash on anyone in the world,” Wolfsen said. “If there is a photo of you on the Internet—and doesn’t that apply to all of us?—then you can end up in the database of Clearview and be tracked.”
To protect Dutch citizens’ privacy, the Dutch DPA imposed a roughly $33 million fine that could go up by about $5.5 million if Clearview AI does not follow orders on compliance. Any Dutch businesses attempting to use Clearview AI services could also face “hefty fines,” the Dutch DPA warned, as that “is also prohibited” under the GDPR.
Clearview AI was given three months to appoint a representative in the EU, stop processing personal data—including sensitive biometric data—of people in the Netherlands, and update its privacy policies to inform users in the Netherlands of their rights under the GDPR. But the company only has one month to resume processing requests for data access or removals from people in the Netherlands who otherwise find it “impossible” to exercise their rights to privacy, the Dutch DPA’s decision said.
It appears that Clearview AI has no intentions to comply, however. Jack Mulcaire, the chief legal officer for Clearview AI, confirmed to Ars that the company maintains that it is not subject to the GDPR.
“Clearview AI does not have a place of business in the Netherlands or the EU, it does not have any customers in the Netherlands or the EU, and does not undertake any activities that would otherwise mean it is subject to the GDPR,” Mulcaire said. “This decision is unlawful, devoid of due process and is unenforceable.”
But the Dutch DPA found that the GDPR applies to Clearview AI because it gathers personal information about Dutch citizens without their consent and without ever alerting them to the data collection.
“People who are in the database also have the right to access their data,” the Dutch DPA said. “This means that Clearview has to show people which data the company has about them, if they ask for this. But Clearview does not cooperate in requests for access.”
Dutch DPA vows to investigate Clearview AI execs
In the press release, Wolfsen said that the Dutch DPA has “to draw a very clear line” underscoring the “incorrect use of this sort of technology” after Clearview AI refused to change its data collection practices following fines in other parts of the European Union, including Italy and Greece.
While Wolfsen acknowledged that Clearview AI could be used to enhance police investigations, he said that the technology would be more appropriate if it was being managed by law enforcement “in highly exceptional cases only” and not indiscriminately by a private company.
“The company should never have built the database and is insufficiently transparent,” the Dutch DPA said.
Although Clearview AI appears ready to defend against the fine, the Dutch DPA said that the company failed to object to the decision within the provided six-week timeframe and therefore cannot appeal the decision.
Further, the Dutch DPA confirmed that authorities are “looking for ways to make sure that Clearview stops the violations” beyond the fines, including by “investigating if the directors of the company can be held personally responsible for the violations.”
Wolfsen claimed that such “liability already exists if directors know that the GDPR is being violated, have the authority to stop that, but omit to do so, and in this way consciously accept those violations.”
Now, the LAION (Large-scale Artificial Intelligence Open Network) team has released a scrubbed version of the LAION-5B dataset called Re-LAION-5B and claimed that it “is the first web-scale, text-link to images pair dataset to be thoroughly cleaned of known links to suspected CSAM.”
To scrub the dataset, LAION partnered with the Internet Watch Foundation (IWF) and the Canadian Center for Child Protection (C3P) to remove 2,236 links that matched with hashed images in the online safety organizations’ databases. Removals include all the links flagged by Stanford Internet Observatory researcher David Thiel, as well as content flagged by LAION’s partners and other watchdogs, like Human Rights Watch, which warned of privacy issues after finding photos of real kids included in the dataset without their consent.
In his study, Thiel warned that “the inclusion of child abuse material in AI model training data teaches tools to associate children in illicit sexual activity and uses known child abuse images to generate new, potentially realistic child abuse content.”
Thiel warned LAION and other researchers scraping the Internet for AI training data that a new safety standard was needed to better filter out not just CSAM, but any explicit imagery that could be combined with photos of children to generate CSAM. (Recently, the US Department of Justice pointedly said that “CSAM generated by AI is still CSAM.”)
While LAION’s new dataset won’t alter models that were trained on the prior dataset, LAION claimed that Re-LAION-5B sets “a new safety standard for cleaning web-scale image-link datasets.” Where before illegal content “slipped through” LAION’s filters, the researchers have now developed an improved new system “for identifying and removing illegal content,” LAION’s blog said.
Thiel told Ars that he would agree that LAION has set a new safety standard with its latest release, but “there are absolutely ways to improve it.” However, “those methods would require possession of all original images or a brand new crawl,” and LAION’s post made clear that it only utilized image hashes and did not conduct a new crawl that could have risked pulling in more illegal or sensitive content. (On Threads, Thiel shared more in-depth impressions of LAION’s effort to clean the dataset.)
LAION warned that “current state-of-the-art filters alone are not reliable enough to guarantee protection from CSAM in web scale data composition scenarios.”
“To ensure better filtering, lists of hashes of suspected links or images created by expert organizations (in our case, IWF and C3P) are suitable choices,” LAION’s blog said. “We recommend research labs and any other organizations composing datasets from the public web to partner with organizations like IWF and C3P to obtain such hash lists and use those for filtering. In the longer term, a larger common initiative can be created that makes such hash lists available for the research community working on dataset composition from web.”
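As a minimal sketch of the matching step LAION describes, the Python below drops any dataset entry whose image hash appears in a blocklist like those supplied by IWF and C3P. The hash format, record layout, and example values are simplified assumptions, not LAION’s actual pipeline.

```python
import hashlib

# A minimal sketch of hash-list filtering as LAION describes it: drop any dataset
# entry whose image hash appears in a blocklist supplied by organizations like IWF
# and C3P. The hash format and record layout here are simplified assumptions.
def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

blocklist = {sha256_hex(b"known-bad-image-bytes")}   # supplied by IWF/C3P in practice

records = [
    {"url": "https://example.com/a.jpg", "image_bytes": b"known-bad-image-bytes"},
    {"url": "https://example.com/b.jpg", "image_bytes": b"harmless-image-bytes"},
]

cleaned = [r for r in records if sha256_hex(r["image_bytes"]) not in blocklist]
print(len(records) - len(cleaned), "entries removed;", len(cleaned), "kept")
```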
According to LAION, the bigger concern is that some links to known CSAM scraped into a 2022 dataset are still active more than a year later.
“It is a clear hint that law enforcement bodies have to intensify the efforts to take down domains that host such image content on public web following information and recommendations by organizations like IWF and C3P, making it a safer place, also for various kinds of research related activities,” LAION’s blog said.
HRW researcher Hye Jung Han praised LAION for removing sensitive data that she flagged, while also urging more interventions.
“LAION’s responsive removal of some children’s personal photos from their dataset is very welcome, and will help to protect these children from their likenesses being misused by AI systems,” Han told Ars. “It’s now up to governments to pass child data protection laws that would protect all children’s privacy online.”
Although LAION’s blog said that the content removals represented an “upper bound” of CSAM that existed in the initial dataset, AI specialist and Creative.AI co-founder Alex Champandard told Ars that he’s skeptical that all CSAM was removed.
“They only filter out previously identified CSAM, which is only a partial solution,” Champandard told Ars. “Statistically speaking, most instances of CSAM have likely never been reported nor investigated by C3P or IWF. A more reasonable estimate of the problem is about 25,000 instances of things you’d never want to train generative models on—maybe even 50,000.”
Champandard agreed with Han that more regulations are needed to protect people from AI harms when training data is scraped from the web.
“There’s room for improvement on all fronts: privacy, copyright, illegal content, etc.,” Champandard said. Because “there are too many data rights being broken with such web-scraped datasets,” Champandard suggested that datasets like LAION’s won’t “stand the test of time.”
“LAION is simply operating in the regulatory gap and lag in the judiciary system until policymakers realize the magnitude of the problem,” Champandard said.