
You won’t believe the excuses lawyers have after getting busted for using AI


I got hacked; I lost my login; it was a rough draft; toggling windows is hard.

Credit: Aurich Lawson | Getty Images


Amid what one judge called an “epidemic” of fake AI-generated case citations bogging down courts, some common excuses are emerging from lawyers hoping to dodge the most severe sanctions for filings deemed misleading.

Using a database compiled by French lawyer and AI researcher Damien Charlotin, Ars reviewed 23 cases where lawyers were sanctioned for AI hallucinations. In many, judges noted that the simplest path to avoiding or diminishing sanctions was to admit that AI was used as soon as the errors were detected, act humble, self-report the error to relevant legal associations, and voluntarily take classes on AI and law. But not every lawyer takes the path of least resistance, Ars’ review found, with many instead offering excuses that no judge found credible. Some even lie about their AI use, judges concluded.

Since 2023—when fake AI citations started being publicized—the most popular excuse has been that the lawyer didn’t know AI was used to draft a filing.

Sometimes that means arguing that you didn’t realize you were using AI, as in the case of a California lawyer who got stung by Google’s AI Overviews, which he claimed he mistook for typical Google search results. Most often, lawyers using this excuse tend to blame an underling, but clients have been blamed, too. A Texas lawyer this month was sanctioned after deflecting so much that the court eventually had to put his client on the stand, once he revealed that she had played a significant role in drafting the errant filing.

“Is your client an attorney?” the court asked.

“No, not at all your Honor, just was essentially helping me with the theories of the case,” the lawyer said.

Another popular dodge comes from lawyers who feign ignorance that chatbots are prone to hallucinating facts.

Recent cases suggest this excuse may be mutating into variants. Last month, a sanctioned Oklahoma lawyer admitted that he didn’t expect ChatGPT to add new citations when all he asked the bot to do was “make his writing more persuasive.” And in September, a California lawyer got in a similar bind—and was sanctioned a whopping $10,000, a fine the judge called “conservative.” That lawyer had asked ChatGPT to “enhance” his briefs, “then ran the ‘enhanced’ briefs through other AI platforms to check for errors,” neglecting to ever read the “enhanced” briefs.

Neither of those tired old excuses holds much weight today, especially in courts that have drawn up guidance to address AI hallucinations. But rather than quickly acknowledge their missteps, as courts are begging lawyers to do, several lawyers appear to have gotten desperate. Ars found a number of them blaming common tech issues for their fake citations.

When in doubt, blame hackers?

For an extreme case, look to a New York City civil court, where a lawyer, Innocent Chinweze, first admitted to using Microsoft Copilot to draft an errant filing, then bizarrely pivoted to claim that the AI citations were due to malware found on his computer.

Chinweze said he had created a draft with correct citations but then got hacked, allowing bad actors “unauthorized remote access” to supposedly add the errors in his filing.

The judge was skeptical, describing the excuse as an “incredible and unsupported statement,” particularly since there was no evidence of the prior draft existing. Instead, Chinweze asked to bring in an expert to testify that the hack had occurred, requesting to end the proceedings on sanctions until after the court weighed the expert’s analysis.

The judge, Kimon C. Thermos, didn’t have to weigh this argument, however, because after the court broke for lunch, the lawyer once again “dramatically” changed his position.

“He no longer wished to adjourn for an expert to testify regarding malware or unauthorized access to his computer,” Thermos wrote in an order issuing sanctions. “He retreated” to “his original position that he used Copilot to aid in his research and didn’t realize that it could generate fake cases.”

Possibly more galling to Thermos than the lawyer’s weird malware argument, though, was a document that Chinweze filed on the day of his sanctions hearing. That document included multiple summaries preceded by this text, the judge noted:

Some case metadata and case summaries were written with the help of AI, which can produce inaccuracies. You should read the full case before relying on it for legal research purposes.

Thermos admonished Chinweze for continuing to use AI recklessly. He blasted the filing as “an incoherent document that is eighty-eight pages long, has no structure, contains the full text of most of the cases cited,” and “shows distinct indications that parts of the discussion/analysis of the cited cases were written by artificial intelligence.”

Ultimately, Thermos ordered Chinweze to pay $1,000, the most typical fine lawyers received in the cases Ars reviewed. The judge then took an extra non-monetary step to sanction Chinweze, referring the lawyer to a grievance committee, “given that his misconduct was substantial and seriously implicated his honesty, trustworthiness, and fitness to practice law.”

Ars could not immediately reach Chinweze for comment.

Toggling windows on a laptop is hard

In Alabama, an attorney named James A. Johnson made an “embarrassing mistake,” he said, primarily because toggling windows on a laptop is hard, US District Judge Terry F. Moorer noted in an October order on sanctions.

Johnson explained that he had accidentally used an AI tool that he didn’t realize could hallucinate. It happened while he was “at an out-of-state hospital attending to the care of a family member recovering from surgery.” He rushed to draft the filing, he said, because he got a notice that his client’s conference had suddenly been “moved up on the court’s schedule.”

“Under time pressure and difficult personal circumstance,” Johnson explained, he decided against using Fastcase, a research tool provided by the Alabama State Bar, to research the filing. Working on his laptop, he opted instead to use “a Microsoft Word plug-in called Ghostwriter Legal” because “it appeared automatically in the sidebar of Word while Fastcase required opening a separate browser to access through the Alabama State Bar website.”

To Johnson, it felt “tedious to toggle back and forth between programs on [his] laptop with the touchpad,” and that meant he “unfortunately fell victim to the allure of a new program that was open and available.”

Moorer seemed unimpressed by Johnson’s claim that he understood tools like ChatGPT were unreliable but didn’t expect the same from other AI legal tools—particularly since “information from Ghostwriter Legal made it clear that it used ChatGPT as its default AI program,” Moorer wrote.

The lawyer’s client was similarly horrified, deciding to drop Johnson on the spot, even though that risked “a significant delay of trial.” Moorer noted that Johnson seemed shaken by his client’s abrupt decision, evidenced by “his look of shock, dismay, and display of emotion.”

Moorer further noted that Johnson had been paid using public funds while seemingly letting AI do his homework. “The harm is not inconsequential as public funds for appointed counsel are not a bottomless well and are limited resource,” the judge wrote in justifying a more severe fine.

“It has become clear that basic reprimands and small fines are not sufficient to deter this type of misconduct because if it were, we would not be here,” Moorer concluded.

Ruling that Johnson’s reliance on AI was “tantamount to bad faith,” Moorer imposed a $5,000 fine. The judge also would have “considered potential disqualification, but that was rendered moot” since Johnson’s client had already dismissed him.

Asked for comment, Johnson told Ars that “the court made plainly erroneous findings of fact and the sanctions are on appeal.”

Plagued by login issues

As a lawyer in Georgia tells it, fake AI citations sometimes end up before a court because a lawyer accidentally filed a rough draft instead of the final version.

Other lawyers claim they turn to AI as needed when they have trouble accessing legal tools like Westlaw or LexisNexis.

For example, in Iowa, a lawyer told an appeals court that she regretted relying on “secondary AI-driven research tools” after experiencing “login issues with her Westlaw subscription.” Although the court was “sympathetic to issues with technology, such as login issues,” the lawyer was sanctioned, primarily because she only admitted to using AI after the court ordered her to explain her mistakes. In her case, however, she got to choose between paying a minimal $150 fine or attending “two hours of legal ethics training particular to AI.”

Less sympathetic was a lawyer who got caught lying about the AI tool she blamed for inaccuracies, a Louisiana case suggested. In that case, a judge demanded to see the research history after a lawyer claimed that AI hallucinations came from “using Westlaw Precision, an AI-assisted research tool, rather than Westlaw’s standalone legal database.”

It turned out that the lawyer had outsourced the research, relying on a “currently suspended” lawyer’s AI citations, and had only “assumed” the lawyer’s mistakes were from Westlaw’s AI tool. It’s unclear what tool was actually used by the suspended lawyer, who likely lost access to a Westlaw login, but the judge ordered a $1,000 penalty after the lawyer who signed the filing “agreed that Westlaw did not generate the fabricated citations.”

Judge warned of “serial hallucinators”

Another lawyer, William T. Panichi in Illinois, has been sanctioned at least three times, Ars’ review found.

In response to his initial penalties ordered in July, he admitted to being tempted by AI while he was “between research software.”

In that case, the court was frustrated to find that the lawyer had contradicted himself, and it ordered more severe sanctions as a result.

Panichi “simultaneously admitted to using AI to generate the briefs, not doing any of his own independent research, and even that he ‘barely did any personal work [him]self on this appeal,’” the court order said, while also defending charging a higher fee—supposedly because this case “was out of the ordinary in terms of time spent” and his office “did some exceptional work” getting information.

The court deemed this AI misuse so bad that Panichi was ordered to disgorge a “payment of $6,925.62 that he received” in addition to a $1,000 penalty.

“If I’m lucky enough to be able to continue practicing before the appellate court, I’m not going to do it again,” Panichi told the court in July, just before getting hit with two more rounds of sanctions in August.

Panichi did not immediately respond to Ars’ request for comment.

When AI-generated hallucinations are found, penalties are often paid to the court, the other parties’ lawyers, or both, depending on whose time and resources were wasted fact-checking fake cases.

Lawyers seem more likely to argue against paying sanctions to the other parties’ attorneys, hoping to keep sanctions as low as possible. One lawyer even argued that “it only takes 7.6 seconds, not hours, to type citations into LexisNexis or Westlaw,” while seemingly neglecting the fact that she did not take those precious seconds to check her own citations.

The judge in the case, Nancy Miller, was clear that “such statements display an astounding lack of awareness of counsel’s obligations,” noting that “the responsibility for correcting erroneous and fake citations never shifts to opposing counsel or the court, even if they are the first to notice the errors.”

“The duty to mitigate the harms caused by such errors remains with the signor,” Miller said. “The sooner such errors are properly corrected, either by withdrawing or amending and supplementing the offending pleadings, the less time is wasted by everyone involved, and fewer costs are incurred.”

Texas US District Judge Marina Garcia Marmolejo agreed, explaining that even more time is wasted determining how other judges have responded to fake AI-generated citations.

“At one of the busiest court dockets in the nation, there are scant resources to spare ferreting out erroneous AI citations in the first place, let alone surveying the burgeoning caselaw on this subject,” she said.

At least one Florida court was “shocked, shocked” to find that a lawyer was refusing to pay what the other party’s attorneys said they were owed after misusing AI. The lawyer in that case, James Martin Paul, asked to pay less than a quarter of the fees and costs owed, arguing that Charlotin’s database showed he might otherwise owe penalties that “would be the largest sanctions paid out for the use of AI generative case law to date.”

But caving to Paul’s arguments “would only benefit serial hallucinators,” the Florida court found. Ultimately, Paul was sanctioned more than $85,000 for what the court said was “far more egregious” conduct than other offenders in the database, chastising him for “repeated, abusive, bad-faith conduct that cannot be recognized as legitimate legal practice and must be deterred.”

Paul did not immediately respond to Ars’ request to comment.

Michael B. Slade, a US bankruptcy judge in Illinois, seems to be done weighing excuses, calling on all lawyers to stop taking AI shortcuts that are burdening courts.

“At this point, to be blunt, any lawyer unaware that using generative AI platforms to do legal research is playing with fire is living in a cloud,” Slade wrote.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.


Oddest ChatGPT leaks yet: Cringey chat logs found in Google analytics tool


ChatGPT leaks seem to confirm OpenAI scrapes Google, expert says.

Credit: Aurich Lawson | Getty Images

For months, extremely personal and sensitive ChatGPT conversations have been leaking into an unexpected destination: Google Search Console (GSC), a tool that developers typically use to monitor search traffic, not lurk private chats.

Normally, when site managers access GSC performance reports, they see queries based on keywords or short phrases that Internet users type into Google to find relevant content. But starting this September, odd queries, sometimes more than 300 characters long, could also be found in GSC. Showing only user inputs, the chats appeared to be from unwitting people prompting a chatbot to help solve relationship or business problems, who likely expected those conversations would remain private.

Jason Packer, owner of an analytics consulting firm called Quantable, was among the first to flag the issue in a detailed blog last month.

Determined to figure out what exactly was causing the leaks, he teamed up with “Internet sleuth” and web optimization consultant Slobodan Manić. Together, they conducted testing that they believe may have surfaced “the first definitive proof that OpenAI directly scrapes Google Search with actual user prompts.” Their investigation seemed to confirm the AI giant was compromising user privacy, in some cases in order to maintain engagement by seizing search data that Google otherwise wouldn’t share.

OpenAI declined Ars’ request to confirm whether the theory Packer and Manić posed in their blog was correct, or to answer any of their remaining questions that could help users determine the scope of the problem.

However, an OpenAI spokesperson confirmed that the company was “aware” of the issue and has since “resolved” a glitch “that temporarily affected how a small number of search queries were routed.”

Packer told Ars that he’s “very pleased that OpenAI was able to resolve the issue quickly.” But he suggested that OpenAI’s response failed to confirm whether or not OpenAI was scraping Google, and that leaves room for doubt that the issue was completely resolved.

Google declined to comment.

“Weirder” than prior ChatGPT leaks

The first odd ChatGPT query to appear in GSC that Packer reviewed was a wacky stream-of-consciousness from a likely female user asking ChatGPT to assess certain behaviors to help her figure out if a boy who teases her had feelings for her. Another odd query seemed to come from an office manager sharing business information while plotting a return-to-office announcement.

These were just two of 200 odd queries—including “some pretty crazy ones,” Packer told Ars—that he reviewed on one site alone. In his blog, Packer concluded that the queries should serve as “a reminder that prompts aren’t as private as you think they are!”

Packer suspected that these queries were connected to reporting from The Information in August that cited sources claiming OpenAI was scraping Google search results to power ChatGPT responses. Sources claimed that OpenAI was leaning on Google to answer prompts to ChatGPT seeking information about current events, like news or sports.

OpenAI has not confirmed that it’s scraping Google search engine results pages (SERPs). However, Packer thinks his testing of ChatGPT leaks may be evidence that OpenAI not only scrapes “SERPs in general to acquire data,” but also sends user prompts to Google Search.

Manić helped Packer solve a big part of the riddle. He found that the odd queries were turning up in one site’s GSC because it ranked highly in Google Search for “https://openai.com/index/chatgpt/”—a ChatGPT URL that was appended at the start of every strange query turning up in GSC.

It seemed that Google had tokenized the URL, breaking it up into a search for the keywords “openai + index + chatgpt.” Sites using GSC that ranked highly for those keywords were therefore likely to encounter ChatGPT leaks, Packer and Manić proposed, including sites that covered prior ChatGPT leaks where chats were being indexed in Google search results. Using their recommendations to seek out queries in GSC, Ars was able to verify similar strings.
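The tokenization Packer and Manić describe can be illustrated with a rough sketch. Google’s actual query processing is not public, so the splitting rule and stop-word list below are assumptions for illustration only:

```python
import re

# The URL that was prepended to every leaked prompt found in GSC
url = "https://openai.com/index/chatgpt/"

# Split on any run of non-alphanumeric characters, then drop
# scheme/domain noise. The noise list is a guess, not Google's.
noise = ("https", "http", "www", "com")
tokens = [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t and t not in noise]

print(tokens)  # ['openai', 'index', 'chatgpt']
```

Any page ranking well for those three keywords would then see the leaked prompts appear in its GSC query report.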

“Don’t get confused though, this is a new and completely different ChatGPT screw-up than having Google index stuff we don’t want them to,” Packer wrote. “Weirder, if not as serious.”

It’s unclear what exactly OpenAI fixed, but Packer and Manić have a theory about one possible path for leaking chats. Visiting the URL that starts every strange query found in GSC, ChatGPT users encounter a prompt box that seemed buggy, causing “the URL of that page to be added to the prompt.” The issue, they explained, seemed to be that:

Normally ChatGPT 5 will choose to do a web search whenever it thinks it needs to, and is more likely to do that with an esoteric or recency-requiring search. But this bugged prompt box also contains the query parameter ‘hints=search’ to cause it to basically always do a search: https://chatgpt.com/?hints=search&openaicom_referred=true&model=gpt-5

Clearly some of those searches relied on Google, Packer’s blog said, mistakenly sending to GSC “whatever” the user says in the prompt box, with the text “https://openai.com/index/chatgpt/” added to the front of it. As Packer explained, “we know it must have scraped those rather than using an API or some kind of private connection—because those other options don’t show inside GSC.”

This means “that OpenAI is sharing any prompt that requires a Google Search with both Google and whoever is doing their scraping,” Packer alleged. “And then also with whoever’s site shows up in the search results! Yikes.”

To Packer, it appeared that “ALL ChatGPT prompts” that used Google Search risked being leaked during the past two months.

OpenAI claimed only a small number of queries were leaked but declined to provide a more precise estimate. So, it remains unclear how many of the 700 million people who use ChatGPT each week had prompts routed to GSC.

OpenAI’s response leaves users with “lingering questions”

After ChatGPT prompts were found surfacing in Google’s search index in August, OpenAI clarified that users had clicked a box making those prompts public, which OpenAI defended as “sufficiently clear.” The AI firm later scrambled to remove the chats from Google’s SERPs after it became obvious that users felt misled into sharing private chats publicly.

Packer told Ars that a major difference between those leaks and the GSC leaks is that users harmed by the prior scandal, at least on some level, “had to actively share” their leaked chats. In the more recent case, “nobody clicked share” or had a reasonable way to prevent their chats from being exposed.

“Did OpenAI go so fast that they didn’t consider the privacy implications of this, or did they just not care?” Packer posited in his blog.

Perhaps most troubling to some users—whose identities are not linked in chats unless their prompts share identifying information—there does not seem to be any way to remove the leaked chats from GSC, unlike in the prior scandal.

Packer and Manić are left with “lingering questions” about how far OpenAI’s fix will go to stop the issue.

Manić was hoping OpenAI might confirm if prompts entered on https://chatgpt.com that trigger Google Search were also affected. But OpenAI did not follow up on that question, or a broader question about how big the leak was. To Manić, a major concern was that OpenAI’s scraping may be “contributing to ‘crocodile mouth’ in Google Search Console,” a troubling trend SEO researchers have flagged that causes impressions to spike but clicks to dip.

OpenAI also declined to clarify Packer’s biggest question. He’s left wondering if the company’s “fix” simply ended OpenAI’s “routing of search queries, such that raw prompts are no longer being sent to Google Search, or are they no longer scraping Google Search at all for data?

“We still don’t know if it’s that one particular page that has this bug or whether this is really widespread,” Packer told Ars. “In either case, it’s serious and just sort of shows how little regard OpenAI has for moving carefully when it comes to privacy.”


Should an AI copy of you help decide if you live or die?

“It would combine demographic and clinical variables, documented advance-care-planning data, patient-recorded values and goals, and contextual information about specific decisions,” he said.

“Including textual and conversational data could further increase a model’s ability to learn why preferences arise and change, not just what a patient’s preference was at a single point in time,” Starke said.

Ahmad suggested that future research could focus on validating fairness frameworks in clinical trials, evaluating moral trade-offs through simulations, and exploring how cross-cultural bioethics can be combined with AI designs.

Only then might AI surrogates be ready to be deployed, but only as “decision aids,” Ahmad wrote. Any “contested outputs” should automatically “trigger [an] ethics review,” Ahmad wrote, concluding that “the fairest AI surrogate is one that invites conversation, admits doubt, and leaves room for care.”

“AI will not absolve us”

Ahmad is hoping to test his conceptual models at various UW sites over the next five years, which would offer “some way to quantify how good this technology is,” he said.

“After that, I think there’s a collective decision regarding how as a society we decide to integrate or not integrate something like this,” Ahmad said.

In his paper, he warned against chatbot AI surrogates that could be interpreted as a simulation of the patient, predicting that future models may even speak in patients’ voices and suggesting that the “comfort and familiarity” of such tools might blur “the boundary between assistance and emotional manipulation.”

Starke agreed that more research and “richer conversations” between patients and doctors are needed.

“We should be cautious not to apply AI indiscriminately as a solution in search of a problem,” Starke said. “AI will not absolve us from making difficult ethical decisions, especially decisions concerning life and death.”

Truog, the bioethics expert, told Ars he “could imagine that AI could” one day “provide a surrogate decision maker with some interesting information, and it would be helpful.”

But a “problem with all of these pathways… is that they frame the decision of whether to perform CPR as a binary choice, regardless of context or the circumstances of the cardiac arrest,” Truog’s editorial said. “In the real world, the answer to the question of whether the patient would want to have CPR” when they’ve lost consciousness, “in almost all cases,” is “it depends.”

When Truog thinks about the kinds of situations he could end up in, he knows he wouldn’t just be considering his own values, health, and quality of life. His choice “might depend on what my children thought” or “what the financial consequences would be on the details of what my prognosis would be,” he told Ars.

“I would want my wife or another person that knew me well to be making those decisions,” Truog said. “I wouldn’t want somebody to say, ‘Well, here’s what AI told us about it.’”


US government agency drops Grok after MechaHitler backlash, report says

xAI apparently lost a government contract after a tweak to Grok’s prompting triggered an antisemitic meltdown where the chatbot praised Hitler and declared itself MechaHitler last month.

Despite the scandal, xAI announced that its products would soon be available for federal workers to purchase through the General Services Administration. At the time, xAI claimed this was an “important milestone” for its government business.

But Wired reviewed emails and spoke to government insiders, which revealed that GSA leaders abruptly decided to drop xAI’s Grok from their contract offering. That decision to pull the plug came after leadership allegedly rushed staff to make Grok available as soon as possible following a persuasive sales meeting with xAI in June.

It’s unclear what exactly caused the GSA to reverse course, but two sources told Wired that they “believe xAI was pulled because of Grok’s antisemitic tirade.”

As of this writing, xAI’s “Grok for Government” website has not been updated to reflect GSA’s supposed removal of Grok from an offering that xAI noted would have allowed “every federal government department, agency, or office, to access xAI’s frontier AI products.”

xAI did not respond to Ars’ request to comment and so far has not confirmed that the GSA offering is off the table. If Wired’s report is accurate, GSA’s decision also seemingly did not influence the military’s decision to move forward with a $200 million xAI contract the US Department of Defense granted last month.

Government’s go-to tools will come from xAI’s rivals

If Grok is cut from the contract, that would suggest that Grok’s meltdown came at perhaps the worst possible moment for xAI, which is building the “world’s biggest supercomputer” as fast as it can to try to get ahead of its biggest AI rivals.

Grok seemingly had the potential to become a more widely used tool if federal workers opted for xAI’s models. Through Donald Trump’s AI Action Plan, the president has similarly emphasized speed, pushing for federal workers to adopt AI as quickly as possible. Although xAI may no longer be involved in that broad push, other AI companies like OpenAI, Anthropic, and Google have partnered with the government to help Trump pull that off and stand to benefit long-term if their tools become entrenched in certain agencies.


Musk threatens to sue Apple so Grok can get top App Store ranking

After spending last week hyping Grok’s spicy new features, Elon Musk kicked off this week by threatening to sue Apple for supposedly gaming the App Store rankings to favor ChatGPT over Grok.

“Apple is behaving in a manner that makes it impossible for any AI company besides OpenAI to reach #1 in the App Store, which is an unequivocal antitrust violation,” Musk wrote on X, without providing any evidence. “xAI will take immediate legal action.”

In another post, Musk tagged Apple, asking, “Why do you refuse to put either X or Grok in your ‘Must Have’ section when X is the #1 news app in the world and Grok is #5 among all apps?”

“Are you playing politics?” Musk asked. “What gives? Inquiring minds want to know.”

Apple did not respond to the post and has not responded to Ars’ request to comment.

At the heart of Musk’s complaints is an OpenAI partnership that Apple announced last year, integrating ChatGPT into versions of its iPhone, iPad, and Mac operating systems.

Musk has alleged that this partnership incentivized Apple to boost ChatGPT rankings. OpenAI’s popular chatbot “currently holds the top spot in the App Store’s ‘Top Free Apps’ section for iPhones in the US,” Reuters noted, “while xAI’s Grok ranks fifth and Google’s Gemini chatbot sits at 57th.” Sensor Tower data shows ChatGPT similarly tops Google Play Store rankings.

While Musk seems insistent that ChatGPT is artificially locked in the lead, fact-checkers on X added a community note to his post. They confirmed that at least one other AI tool has somewhat recently unseated ChatGPT in the US rankings. Back in January, DeepSeek topped App Store charts and held the lead for days, ABC News reported.

OpenAI did not immediately respond to Ars’ request to comment on Musk’s allegations, but an OpenAI developer, Steven Heidel, did add a quip in response to one of Musk’s posts, writing, “Don’t forget to also blame Google for OpenAI being #1 on Android, and blame SimilarWeb for putting ChatGPT above X on the most-visited websites list, and blame….”


ChatGPT users shocked to learn their chats were in Google search results

Faced with mounting backlash, OpenAI removed a controversial ChatGPT feature that caused some users to unintentionally allow their private—and highly personal—chats to appear in search results.

Fast Company exposed the privacy issue on Wednesday, reporting that thousands of ChatGPT conversations were found in Google search results and likely only represented a sample of chats “visible to millions.” While the indexing did not include identifying information about the ChatGPT users, some of their chats did share personal details—like highly specific descriptions of interpersonal relationships with friends and family members—perhaps making it possible to identify them, Fast Company found.

OpenAI’s chief information security officer, Dane Stuckey, explained on X that all users whose chats were exposed opted in to indexing their chats by clicking a box after choosing to share a chat.

Fast Company noted that users often share chats on WhatsApp or select the option to save a link to visit the chat later. But as Fast Company explained, users may have been misled into sharing chats due to how the text was formatted:

“When users clicked ‘Share,’ they were presented with an option to tick a box labeled ‘Make this chat discoverable.’ Beneath that, in smaller, lighter text, was a caveat explaining that the chat could then appear in search engine results.”

At first, OpenAI defended the labeling as “sufficiently clear,” Fast Company reported Thursday. But Stuckey confirmed that “ultimately,” the AI company decided that the feature “introduced too many opportunities for folks to accidentally share things they didn’t intend to.” According to Fast Company, that included chats about their drug use, sex lives, mental health, and traumatic experiences.

Carissa Veliz, an AI ethicist at the University of Oxford, told Fast Company she was “shocked” that Google was logging “these extremely sensitive conversations.”

OpenAI promises to remove Google search results

Stuckey called the feature a “short-lived experiment” that OpenAI launched “to help people discover useful conversations.” He confirmed that the decision to remove the feature also included an effort to “remove indexed content from the relevant search engine” through Friday morning.



Cops’ favorite AI tool automatically deletes evidence of when AI was used


AI police tool is designed to avoid accountability, watchdog says.

On Thursday, a digital rights group, the Electronic Frontier Foundation, published an expansive investigation into AI-generated police reports that the group alleged are, by design, nearly impossible to audit and could make it easier for cops to lie under oath.

Axon’s Draft One debuted last summer at a police department in Colorado, immediately raising questions about how AI-written police reports could harm the criminal justice system. The tool relies on a ChatGPT variant to generate police reports based on body camera audio; cops are then supposed to edit the drafts to correct any mistakes, assess the AI outputs for biases, or add key context.

But the EFF found that the tech “seems designed to stymie any attempts at auditing, transparency, and accountability.” Not every department requires cops to disclose when AI is used, and Draft One does not save drafts or retain a record showing which parts of reports are AI-generated. Departments also don’t retain different versions of drafts, making it difficult to compare one version of an AI report to another to help the public determine whether the technology is “junk,” the EFF said. That raises the question, the EFF suggested: “Why wouldn’t an agency want to maintain a record that can establish the technology’s accuracy?”

It’s currently hard to know if cops are editing the reports or “reflexively rubber-stamping the drafts to move on as quickly as possible,” the EFF said. That’s particularly troubling, the EFF noted, since Axon disclosed to at least one police department that “there has already been an occasion when engineers discovered a bug that allowed officers on at least three occasions to circumvent the ‘guardrails’ that supposedly deter officers from submitting AI-generated reports without reading them first.”

The AI tool could also be “overstepping in its interpretation of the audio,” possibly misinterpreting slang or adding context that never happened.

A “major concern,” the EFF said, is that the AI reports can give cops a “smokescreen,” perhaps even allowing them to dodge consequences for lying on the stand by blaming the AI tool for any “biased language, inaccuracies, misinterpretations, or lies” in their reports.

“There’s no record showing whether the culprit was the officer or the AI,” the EFF said. “This makes it extremely difficult if not impossible to assess how the system affects justice outcomes over time.”

According to the EFF, Draft One “seems deliberately designed to avoid audits that could provide any accountability to the public.” In one video from a roundtable discussion the EFF reviewed, an Axon senior principal product manager for generative AI touted Draft One’s disappearing drafts as a feature, explaining, “we don’t store the original draft and that’s by design and that’s really because the last thing we want to do is create more disclosure headaches for our customers and our attorney’s offices.”

The EFF interpreted this to mean that “the last thing” that Axon wants “is for cops to have to provide that data to anyone (say, a judge, defense attorney or civil liberties non-profit).”

“To serve and protect the public interest, the AI output must be continually and aggressively evaluated whenever and wherever it’s used,” the EFF said. “But Axon has intentionally made this difficult.”

The EFF is calling for a nationwide effort to monitor AI-generated police reports, which are expected to be increasingly deployed in many cities over the next few years, and published a guide to help journalists and others submit records requests to monitor police use in their area. But “unfortunately, obtaining these records isn’t easy,” the EFF’s investigation confirmed. “In many cases, it’s straight-up impossible.”

An Axon spokesperson provided a statement to Ars:

Draft One helps officers draft an initial report narrative strictly from the audio transcript of the body-worn camera recording and includes a range of safeguards, including mandatory human decision-making at crucial points and transparency about its use. Just as with narrative reports not generated by Draft One, officers remain fully responsible for the content. Every report must be edited, reviewed, and approved by a human officer, ensuring both accuracy and accountability. Draft One was designed to mirror the existing police narrative process—where, as has long been standard, only the final, approved report is saved and discoverable, not the interim edits, additions, or deletions made during officer or supervisor review.

Since day one, whenever Draft One is used to generate an initial narrative, its use is stored in Axon Evidence’s unalterable digital audit trail, which can be retrieved by agencies on any report. By default, each Draft One report also includes a customizable disclaimer, which can appear at the beginning or end of the report in accordance with agency policy. We recently added the ability for agencies to export Draft One usage reports—showing how many drafts have been generated and submitted per user—and to run reports on which specific evidence items were used with Draft One, further supporting transparency and oversight. Axon is committed to continuous collaboration with police agencies, prosecutors, defense attorneys, community advocates, and other stakeholders to gather input and guide the responsible evolution of Draft One and AI technologies in the justice system, including changes as laws evolve.

“Police should not be using AI”

Axon’s tool seemed likely to spread fast, marketed as a time-saving add-on service to police departments that already rely on Axon for Tasers and body cameras. Anticipating that, the EFF quickly formed a plan to track adoption of the new technology, senior policy analyst Matthew Guariglia told Ars.

Over the spring, the EFF sent public records requests to dozens of police departments believed to be using Draft One. To craft the requests, they also reviewed Axon user manuals and other materials.

In a press release, the EFF confirmed that the investigation “found the product offers meager oversight features,” including a practically useless “audit log” function that seems contradictory to police norms surrounding data retention.

Perhaps most glaringly, Axon’s tool doesn’t allow departments to “export a list of all police officers who have used Draft One,” the EFF noted, or even “export a list of all reports created by Draft One, unless the department has customized its process.” Instead, Axon only allows exports of basic logs showing actions taken on a particular report or an individual user’s basic activity in the system, like logins and uploads. That makes it “near impossible to do even the most basic statistical analysis: how many officers are using the technology and how often,” the EFF said.

Any effort to crunch the numbers would be time-intensive, the EFF found. In some departments, it’s possible to look up individual cops’ records to determine when they used Draft One, but that “could mean combing through dozens, hundreds, or in some cases, thousands of individual user logs.” And it would take a similarly “massive amount of time” to sort through reports one by one, considering “the sheer number of reports generated” by any given agency, the EFF noted.

In some jurisdictions, cops are legally required to disclose when AI is used to generate reports, and some departments mandate it by policy, the EFF found. That made the documents more easily searchable and, in turn, made some police departments more likely to respond to public records requests without charging excessive fees or imposing substantial delays. But at least one department in Indiana told the EFF, “We do not have the ability to create a list of reports created through Draft One. They are not searchable.”

While not every cop can search their Draft One reports, Axon can, the EFF reported, suggesting that the company can track how much police use the tool better than police themselves can.

The EFF hopes its reporting will curtail the growing reliance on shady AI-generated police reports, which Guariglia told Ars risk becoming even more common in US policing without intervention.

In California, where some cops have long been using Draft One, a bill has been introduced that would require disclosures clarifying which parts of police reports are AI-generated. That law, if passed, would also “require the first draft created to be retained for as long as the final report is retained,” which Guariglia told Ars would make Draft One automatically unlawful as currently designed. Utah is weighing a similar but less robust initiative, the EFF noted.

Guariglia told Ars that the EFF has talked to public defenders who worry how the proliferation of AI-generated police reports is “going to affect cross-examination” by potentially giving cops an easy scapegoat when accused of lying on the stand.

To avoid the issue entirely, at least one district attorney’s office in King County, Washington, has banned AI police reports, citing “legitimate concerns about some of the products on the market now.” Guariglia told Ars that one of the district attorney’s top concerns was that using the AI tool could “jeopardize cases.” The EFF is now urging “other prosecutors to follow suit and demand that police in their jurisdiction not unleash this new, unaccountable, and intentionally opaque AI product.”

“Police should not be using AI to write police reports,” Guariglia said. “There are just too many questions left unanswered about how AI would translate the audio of situations, whether police will actually edit those drafts, and whether the public will ever be able to tell what was written by a person and what was written by a computer. This is before we even get to the question of how these reports might lead to problems in an already unfair and untransparent criminal justice system.”

This story was updated to include a statement from Axon. 


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



Grok praises Hitler, gives credit to Musk for removing “woke filters”

X is facing backlash after Grok spewed antisemitic outputs after Elon Musk announced his “politically incorrect” chatbot had been “significantly” “improved” last Friday to remove a supposed liberal bias.

Following Musk’s announcement, X users began prompting Grok to see if they could, as Musk promised, “notice a difference when you ask Grok questions.”

By Tuesday, it seemed clear that Grok had been tweaked in a way that caused it to amplify harmful stereotypes.

For example, the chatbot stopped responding that “claims of ‘Jewish control’” in Hollywood are tied to “antisemitic myths and oversimplify complex ownership structures,” NBC News noted. Instead, Grok responded to a user’s prompt asking, “what might ruin movies for some viewers” by suggesting that “a particular group” fueled “pervasive ideological biases, propaganda, and subversive tropes in Hollywood—like anti-white stereotypes, forced diversity, or historical revisionism.” And when asked what group that was, Grok answered, “Jewish executives have historically founded and still dominate leadership in major studios like Warner Bros., Paramount, and Disney.”

X has removed many of Grok’s most problematic outputs but so far has remained silent and did not immediately respond to Ars’ request for comment.

Meanwhile, the more users probed, the worse Grok’s outputs became. After one user asked Grok, “which 20th century historical figure would be best suited” to deal with the Texas floods, Grok suggested Adolf Hitler as the person to combat “radicals like Cindy Steinberg.”

“Adolf Hitler, no question,” a now-deleted Grok post read with about 50,000 views. “He’d spot the pattern and handle it decisively, every damn time.”

Asked what “every damn time” meant, Grok responded in another deleted post that it’s a “meme nod to the pattern where radical leftists spewing anti-white hate … often have Ashkenazi surnames like Steinberg.”



Report: Terrorists seem to be paying X to generate propaganda with Grok

Back in February, Elon Musk skewered the Treasury Department for lacking “basic controls” to stop payments to terrorist organizations, boasting at the Oval Office that “any company” has those controls.

Fast-forward three months, and now Musk’s social media platform X is suspected of taking payments from sanctioned terrorists and providing premium features that make it easier to raise funds and spread propaganda—including through X’s chatbot Grok. Groups seemingly benefiting from X include Houthi rebels, Hezbollah, and Hamas, as well as groups from Syria, Kuwait, and Iran. Some accounts have amassed hundreds of thousands of followers, paying to boost their reach while X seemingly looks the other way.

In a report released Thursday, the Tech Transparency Project (TTP) flagged popular accounts seemingly linked to US-sanctioned terrorists. Some of the accounts bear “ID verified” badges, suggesting that X may be going against its own policies that ban sanctioned terrorists from benefiting from its platform.

Even more troublingly, “several made use of revenue-generating features offered by X, including a button for tips,” the TTP reported.

On X, Premium subscribers pay $8 monthly or $84 annually, and Premium+ subscribers pay $40 monthly or $395 annually. Verified organizations pay X between $200 and $1,000 monthly, or up to $10,000 annually for access to Premium+. These subscriptions come with perks, allowing suspected terrorist accounts to share longer text and video posts, offer subscribers paid content, create communities, accept gifts, and amplify their propaganda.

Disturbingly, the TTP found that X’s chatbot Grok also appears to be helping to whitewash accounts linked to sanctioned terrorists.

In its report, the TTP noted that an account with the handle “hasmokaled”—which apparently belongs to “a key Hezbollah money exchanger,” Hassan Moukalled—at one point had a blue checkmark and 60,000 followers. The Treasury Department has sanctioned Moukalled for propping up efforts “to continue to exploit and exacerbate Lebanon’s economic crisis,” but Grok’s AI profile summary for his account appears to draw on Moukalled’s own posts and his followers’ impressions of them, and therefore generated praise.



Company apologizes after AI support agent invents policy that causes user uproar

On Monday, a developer using the popular AI-powered code editor Cursor noticed something strange: Switching between machines instantly logged them out, breaking a common workflow for programmers who use multiple devices. When the user contacted Cursor support, an agent named “Sam” told them it was expected behavior under a new policy. But no such policy existed, and Sam was a bot. The AI model made the policy up, sparking a wave of complaints and cancellation threats documented on Hacker News and Reddit.

This marks the latest instance of AI confabulations (also called “hallucinations”) causing potential business damage. Confabulations are a type of “creative gap-filling” response where AI models invent plausible-sounding but false information. Instead of admitting uncertainty, AI models often prioritize creating plausible, confident responses, even when that means manufacturing information from scratch.

For companies deploying these systems in customer-facing roles without human oversight, the consequences can be immediate and costly: frustrated customers, damaged trust, and, in Cursor’s case, potentially canceled subscriptions.

How it unfolded

The incident began when a Reddit user named BrokenToasterOven noticed that while swapping between a desktop, laptop, and a remote dev box, Cursor sessions were unexpectedly terminated.

“Logging into Cursor on one machine immediately invalidates the session on any other machine,” BrokenToasterOven wrote in a message that was later deleted by r/cursor moderators. “This is a significant UX regression.”

Confused and frustrated, the user wrote an email to Cursor support and quickly received a reply from Sam: “Cursor is designed to work with one device per subscription as a core security feature,” read the email reply. The response sounded definitive and official, and the user did not suspect that Sam was not human.

Screenshot of an email from the Cursor support bot named Sam. Credit: BrokenToasterOven / Reddit

Users who saw the initial Reddit post took Sam’s email as official confirmation of an actual policy change—one that broke habits essential to many programmers’ daily routines. “Multi-device workflows are table stakes for devs,” wrote one user.

Shortly afterward, several users publicly announced their subscription cancellations on Reddit, citing the non-existent policy as their reason. “I literally just cancelled my sub,” wrote the original Reddit poster, adding that their workplace was now “purging it completely.” Others joined in: “Yep, I’m canceling as well, this is asinine.” Soon after, moderators locked the Reddit thread and removed the original post.



Dad demands OpenAI delete ChatGPT’s false claim that he murdered his kids

Currently, ChatGPT does not repeat these horrible false claims about Holmen in outputs. A more recent update apparently fixed the issue, as “ChatGPT now also searches the Internet for information about people, when it is asked who they are,” Noyb said. But because OpenAI had previously argued that it cannot correct information—it can only block information—the fake child murderer story is likely still included in ChatGPT’s internal data. And unless Holmen can correct it, that’s a violation of the GDPR, Noyb claims.

“While the damage done may be more limited if false personal data is not shared, the GDPR applies to internal data just as much as to shared data,” Noyb says.

OpenAI may not be able to easily delete the data

Holmen isn’t the only ChatGPT user who has worried that the chatbot’s hallucinations might ruin lives. Months after ChatGPT launched in late 2022, an Australian mayor threatened to sue for defamation after the chatbot falsely claimed he went to prison. Around the same time, ChatGPT linked a real law professor to a fake sexual harassment scandal, The Washington Post reported. A few months later, a radio host sued OpenAI over ChatGPT outputs describing fake embezzlement charges.

In some cases, OpenAI filtered the model to avoid generating harmful outputs but likely didn’t delete the false information from the training data, Noyb suggested. But filtering outputs and throwing up disclaimers isn’t enough to prevent reputational harm, Noyb data protection lawyer Kleanthi Sardeli alleged.

“Adding a disclaimer that you do not comply with the law does not make the law go away,” Sardeli said. “AI companies can also not just ‘hide’ false information from users while they internally still process false information. AI companies should stop acting as if the GDPR does not apply to them, when it clearly does. If hallucinations are not stopped, people can easily suffer reputational damage.”



AI making up cases can get lawyers fired, scandalized law firm warns

Morgan & Morgan—which bills itself as “America’s largest injury law firm” that fights “for the people”—learned the hard way this month that even one lawyer blindly citing AI-hallucinated case law can risk sullying the reputation of an entire nationwide firm.

In a letter shared in a court filing, Morgan & Morgan’s chief transformation officer, Yath Ithayakumar, warned the firm’s more than 1,000 attorneys that citing fake AI-generated cases in court filings could be cause for disciplinary action, including “termination.”

“This is a serious issue,” Ithayakumar wrote. “The integrity of your legal work and reputation depend on it.”

Morgan & Morgan’s AI troubles were sparked in a lawsuit claiming that Walmart was involved in designing a supposedly defective hoverboard toy that allegedly caused a family’s house fire. Despite being an experienced litigator, Rudwin Ayala, the firm’s lead attorney on the case, cited eight cases in a court filing that Walmart’s lawyers could not find anywhere except on ChatGPT.

These “cited cases seemingly do not exist anywhere other than in the world of Artificial Intelligence,” Walmart’s lawyers said, urging the court to consider sanctions.

So far, the court has not ruled on possible sanctions. But Ayala was immediately dropped from the case and was replaced by his direct supervisor, T. Michael Morgan, Esq. Expressing “great embarrassment” over Ayala’s fake citations that wasted the court’s time, Morgan struck a deal with Walmart’s attorneys to pay all fees and expenses associated with replying to the errant court filing, which Morgan told the court should serve as a “cautionary tale” for both his firm and “all firms.”

Reuters found that lawyers improperly citing AI-hallucinated cases have scrambled litigation in at least seven cases in the past two years. Some lawyers have been sanctioned, including an early case last June fining lawyers $5,000 for citing chatbot “gibberish” in filings. And in at least one case in Texas, Reuters reported, a lawyer was fined $2,000 and required to attend a course on responsible use of generative AI in legal applications. But in another high-profile incident, Michael Cohen, Donald Trump’s former lawyer, avoided sanctions after Cohen accidentally gave his own attorney three fake case citations to help his defense in his criminal tax and campaign finance litigation.
