Deepfakes

Due to AI fakes, the “deep doubt” era is here

A person writing

Memento | Aurich Lawson

Given the flood of photorealistic AI-generated images washing over social media networks like X and Facebook these days, we’re seemingly entering a new age of media skepticism: the era of what I’m calling “deep doubt.” While questioning the authenticity of digital content stretches back decades—and analog media long before that—easy access to tools that generate convincing fake content has led to a new wave of liars invoking the specter of AI fakery to deny real documentary evidence. Along the way, people’s existing skepticism toward online content from strangers may be reaching new heights.

Deep doubt is skepticism of real media that stems from the existence of generative AI. This manifests as broad public skepticism toward the veracity of media artifacts, which in turn leads to a notable consequence: People can now more credibly claim that real events did not happen and suggest that documentary evidence was fabricated using AI tools.

The concept behind “deep doubt” isn’t new, but its real-world impact is becoming increasingly apparent. Since the term “deepfake” first surfaced in 2017, we’ve seen a rapid evolution in AI-generated media capabilities. This has led to recent examples of deep doubt in action, such as conspiracy theorists claiming that President Joe Biden has been replaced by an AI-powered hologram and former President Donald Trump’s baseless accusation in August that Vice President Kamala Harris used AI to fake crowd sizes at her rallies. And on Friday, Trump cried “AI” again over a photo of him with E. Jean Carroll, the writer who successfully sued him for sexual abuse; the photo contradicts his claim of never having met her.

Legal scholars Danielle K. Citron and Robert Chesney foresaw this trend years ago, coining the term “liar’s dividend” in 2019 to describe the consequence of deep doubt: deepfakes being weaponized by liars to discredit authentic evidence. But whereas deep doubt was once a hypothetical academic concept, it is now our reality.

The rise of deepfakes, the persistence of doubt

Doubt has been a political weapon since ancient times. This modern AI-fueled manifestation is just the latest evolution of a tactic where the seeds of uncertainty are sown to manipulate public opinion, undermine opponents, and hide the truth. AI is the newest refuge of liars.

Over the past decade, the rise of deep-learning technology has made it increasingly easy for people to craft false or modified pictures, audio, text, or video that appear to be non-synthesized organic media. Deepfakes were named after a Reddit user going by the name “deepfakes,” who shared AI-faked pornography on the service, swapping out the face of a performer with the face of someone else who wasn’t part of the original recording.

In the 20th century, one could argue that part of our trust in media produced by others stemmed from how expensive, time-consuming, and skill-intensive it was to produce documentary images and films; even texts required a great deal of time and skill. As the deep doubt phenomenon grows, it will erode that 20th-century media sensibility. It will also affect our political discourse, legal systems, and even our shared understanding of historical events, all of which rely on that media to function, because we depend on others for information about the world. From photorealistic images to pitch-perfect voice clones, our perception of what we consider “truth” in media will need recalibration.

In April, a panel of federal judges highlighted the potential for AI-generated deepfakes to not only introduce fake evidence but also cast doubt on genuine evidence in court trials. The concern emerged during a meeting of the US Judicial Conference’s Advisory Committee on Evidence Rules, where the judges discussed the challenges of authenticating digital evidence in an era of increasingly sophisticated AI technology. Ultimately, the judges decided to postpone making any AI-related rule changes, but their meeting shows that the subject is already being considered by American judges.

Taylor Swift cites AI deepfakes in endorsement for Kamala Harris

it’s raining creepy men —

Taylor Swift on AI: “The simplest way to combat misinformation is with the truth.”

A screenshot of Taylor Swift’s Kamala Harris Instagram post, captured on September 11, 2024.

On Tuesday night, Taylor Swift endorsed Vice President Kamala Harris for US President on Instagram, citing concerns over AI-generated deepfakes as a key motivator. The artist’s warning aligns with current trends in technology, especially in an era where AI synthesis models can easily create convincing fake images and videos.

“Recently I was made aware that AI of ‘me’ falsely endorsing Donald Trump’s presidential run was posted to his site,” she wrote in her Instagram post. “It really conjured up my fears around AI, and the dangers of spreading misinformation. It brought me to the conclusion that I need to be very transparent about my actual plans for this election as a voter. The simplest way to combat misinformation is with the truth.”

In August 2024, former President Donald Trump posted AI-generated images on Truth Social falsely suggesting Swift endorsed him, including a manipulated photo depicting Swift as Uncle Sam with text promoting Trump. The incident sparked Swift’s fears about the spread of misinformation through AI.

This isn’t the first time Swift and generative AI have appeared together in the news. In February, we reported that a flood of explicit AI-generated images of Swift originated from a 4chan message board where users took part in daily challenges to bypass AI image generator filters.

Listing image by Ronald Woan/CC BY-SA 2.0

Popular AI “nudify” sites sued amid shocking rise in victims globally

San Francisco’s city attorney David Chiu is suing to shut down 16 of the most popular websites and apps allowing users to “nudify” or “undress” photos of mostly women and girls who have been increasingly harassed and exploited by bad actors online.

These sites, Chiu’s suit claimed, are “intentionally” designed to “create fake, nude images of women and girls without their consent,” boasting that any user can upload any photo to “see anyone naked” by using tech that realistically swaps the faces of real victims onto AI-generated explicit images.

“In California and across the country, there has been a stark increase in the number of women and girls harassed and victimized by AI-generated” non-consensual intimate imagery (NCII) and “this distressing trend shows no sign of abating,” Chiu’s suit said.

“Given the widespread availability and popularity” of nudify websites, “San Franciscans and Californians face the threat that they or their loved ones may be victimized in this manner,” Chiu’s suit warned.

In a press conference, Chiu said that this “first-of-its-kind lawsuit” was brought to defend not just Californians but “a shocking number of women and girls across the globe”—from celebrities like Taylor Swift to middle and high school girls. Should the city attorney prevail, each nudify site risks a fine of $2,500 for each violation of California consumer protection law.

On top of media reports sounding alarms about the AI-generated harm, law enforcement has joined the call to ban so-called deepfakes.

Chiu said the harmful deepfakes are often created “by exploiting open-source AI image generation models,” such as earlier versions of Stable Diffusion, that can be honed or “fine-tuned” to easily “undress” photos of women and girls that are frequently yanked from social media. While later versions of Stable Diffusion make such “disturbing” forms of misuse much harder, San Francisco city officials noted at the press conference that fine-tunable earlier versions of Stable Diffusion are still widely available to be abused by bad actors.

In the US alone, police are already so bogged down by reports of fake AI-generated child sex images that it has become harder to investigate child abuse cases offline, and these AI cases are expected to continue spiking “exponentially.” The AI abuse has spread so widely that “the FBI has warned of an uptick in extortion schemes using AI generated non-consensual pornography,” Chiu said at the press conference. “And the impact on victims has been devastating,” harming “their reputations and their mental health,” causing “loss of autonomy,” and “in some instances causing individuals to become suicidal.”

Suing on behalf of the people of the state of California, Chiu is seeking an injunction requiring nudify site owners to cease operation of “all websites they own or operate that are capable of creating AI-generated” non-consensual intimate imagery of identifiable individuals. It’s the only way, Chiu said, to hold these sites “accountable for creating and distributing AI-generated NCII of women and girls and for aiding and abetting others in perpetrating this conduct.”

He also wants an order requiring “any domain-name registrars, domain-name registries, webhosts, payment processors, or companies providing user authentication and authorization services or interfaces” to “restrain” nudify site operators from launching new sites to prevent any further misconduct.

Chiu’s suit redacts the names of the most harmful sites his investigation uncovered but claims that in the first six months of 2024, the sites “have been visited over 200 million times.”

While victims typically have little legal recourse, Chiu believes that state and federal laws prohibiting deepfake pornography, revenge pornography, and child pornography, as well as California’s unfair competition law, can be wielded to take down all 16 sites. Chiu expects that a win will serve as a warning to other nudify site operators that more takedowns are likely coming.

“We are bringing this lawsuit to get these websites shut down, but we also want to sound the alarm,” Chiu said at the press conference. “Generative AI has enormous promise, but as with all new technologies, there are unanticipated consequences and criminals seeking to exploit them. We must be clear that this is not innovation. This is sexual abuse.”

Astronomers discover technique to spot AI fakes using galaxy-measurement tools

stars in their eyes —

Researchers use technique to quantify eyeball reflections that often reveal deepfake images.

Researchers write, “In this image, the person on the left (Scarlett Johansson) is real, while the person on the right is AI-generated. Their eyeballs are depicted underneath their faces. The reflections in the eyeballs are consistent for the real person, but incorrect (from a physics point of view) for the fake person.”

In 2024, it’s almost trivial to create realistic AI-generated images of people, which has led to fears about how these deceptive images might be detected. Researchers at the University of Hull recently unveiled a novel method for detecting AI-generated deepfake images by analyzing reflections in human eyes. The technique, presented at the Royal Astronomical Society’s National Astronomy Meeting last week, adapts tools used by astronomers to study galaxies for scrutinizing the consistency of light reflections in eyeballs.

Adejumoke Owolabi, an MSc student at the University of Hull, headed the research under the guidance of Dr. Kevin Pimbblet, professor of astrophysics.

Their detection technique is based on a simple principle: A pair of eyes being illuminated by the same set of light sources will typically have a similarly shaped set of light reflections in each eyeball. Many AI-generated images created to date don’t take eyeball reflections into account, so the simulated light reflections are often inconsistent between each eye.

A series of real eyes showing largely consistent reflections in both eyes.

In some ways, the astronomy angle isn’t always necessary for this kind of deepfake detection because a quick glance at a pair of eyes in a photo can reveal reflection inconsistencies, which is something artists who paint portraits have to keep in mind. But the application of astronomy tools to automatically measure and quantify eye reflections in deepfakes is a novel development.

Automated detection

In a Royal Astronomical Society blog post, Pimbblet explained that Owolabi developed a technique to detect eyeball reflections automatically and ran the reflections’ morphological features through indices to compare similarity between left and right eyeballs. Their findings revealed that deepfakes often exhibit differences between the pair of eyes.

The team applied methods from astronomy to quantify and compare eyeball reflections. They used the Gini coefficient, typically employed to measure light distribution in galaxy images, to assess the uniformity of reflections across eye pixels. A Gini value closer to 0 indicates evenly distributed light, while a value approaching 1 suggests concentrated light in a single pixel.
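
To make the Gini calculation concrete, here is a minimal sketch (not the Hull team's actual code) of how the coefficient is typically computed for galaxy light distributions, applied here to the pixel intensities of two cropped eyeball reflections; the crop arrays are hypothetical stand-ins.

```python
import numpy as np

def gini(pixels: np.ndarray) -> float:
    """Gini coefficient of pixel intensities: 0 = evenly spread light, 1 = concentrated."""
    x = np.sort(np.abs(pixels.ravel()))          # sort absolute intensities, ascending
    n = x.size
    if n < 2 or x.sum() == 0:
        return 0.0
    ranks = np.arange(1, n + 1)                  # 1-based rank of each pixel
    return float(np.sum((2 * ranks - n - 1) * x) / (x.mean() * n * (n - 1)))

# Hypothetical usage: compare the reflection crops from the left and right eyes.
left_eye = np.random.rand(32, 32)                # stand-in for a segmented reflection crop
right_eye = np.random.rand(32, 32)
print(f"Gini gap between eyes: {abs(gini(left_eye) - gini(right_eye)):.3f}")
```

For a genuine photo, the two values should land close together; a large gap is one signal that the face may be synthetic.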

A series of deepfake eyes showing inconsistent reflections in each eye.

In the Royal Astronomical Society post, Pimbblet drew comparisons between how they measured eyeball reflection shape and how they typically measure galaxy shape in telescope imagery: “To measure the shapes of galaxies, we analyze whether they’re centrally compact, whether they’re symmetric, and how smooth they are. We analyze the light distribution.”

The researchers also explored the use of CAS parameters (concentration, asymmetry, smoothness), another tool from astronomy for measuring galactic light distribution. However, this method proved less effective in identifying fake eyes.
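
For illustration, the asymmetry component of CAS can be sketched in a few lines. This is the textbook rotate-and-subtract definition (omitting the background-correction term used in astronomy), not the researchers' own implementation.

```python
import numpy as np

def asymmetry(image: np.ndarray) -> float:
    """CAS-style asymmetry: compare an image with itself rotated by 180 degrees."""
    rotated = np.rot90(image, 2)                 # 180-degree rotation
    return float(np.abs(image - rotated).sum() / np.abs(image).sum())

eye_crop = np.random.rand(32, 32)                # hypothetical reflection crop
print(f"Asymmetry: {asymmetry(eye_crop):.3f}")   # 0 would mean perfectly symmetric
```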

A detection arms race

While the eye-reflection technique offers a potential path for detecting AI-generated images, the method might not work if AI models evolve to incorporate physically accurate eye reflections, perhaps applied as a subsequent step after image generation. The technique also requires a clear, up-close view of eyeballs to work.

The approach also risks producing false positives, as even authentic photos can sometimes exhibit inconsistent eye reflections due to varied lighting conditions or post-processing techniques. But analyzing eye reflections may still be a useful tool in a larger deepfake detection toolset that also considers other factors such as hair texture, anatomy, skin details, and background consistency.

While the technique shows promise in the short term, Dr. Pimbblet cautioned that it’s not perfect. “There are false positives and false negatives; it’s not going to get everything,” he told the Royal Astronomical Society. “But this method provides us with a basis, a plan of attack, in the arms race to detect deepfakes.”

AI-generated Al Michaels to provide daily recaps during 2024 Summer Olympics

forever young —

AI voice clone will narrate daily Olympics video recaps; critics call it a “code-generated ghoul.”

Al Michaels looks on prior to the game between the Minnesota Vikings and Philadelphia Eagles at Lincoln Financial Field on September 14, 2023, in Philadelphia, Pennsylvania.

On Wednesday, NBC announced plans to use an AI-generated clone of famous sports commentator Al Michaels’ voice to narrate daily streaming video recaps of the 2024 Summer Olympics in Paris, which start on July 26. The AI-powered narration will feature in “Your Daily Olympic Recap on Peacock,” NBC’s streaming service. But this new, high-profile use of voice cloning worries critics, who say the technology may muscle out up-and-coming sports commentators by keeping old personas around forever.

NBC says it has created a “high-quality AI re-creation” of Michaels’ voice, trained on Michaels’ past NBC appearances to capture his distinctive delivery style.

The veteran broadcaster, revered in the sports commentator world for his iconic “Do you believe in miracles? Yes!” call during the 1980 Winter Olympics, has been covering sports on TV since 1971, including a high-profile run of play-by-play coverage of NFL football games for both ABC and NBC since the 1980s. NBC dropped him from NFL coverage in 2023, however, possibly due to his age.

Michaels, who is 79 years old, shared his initial skepticism about the project in an interview with Vanity Fair, as NBC News notes. After hearing the AI version of his voice, which can greet viewers by name, he described the experience as “astonishing” and “a little bit frightening.” He said the AI recreation was “almost 2% off perfect” in mimicking his style.

The Vanity Fair article provides some insight into how NBC’s new AI system works. It first uses a large language model (similar technology to what powers ChatGPT) to analyze subtitles and metadata from NBC’s Olympics video coverage, summarizing events and writing custom output to imitate Michaels’ style. This text is then fed into an unspecified voice AI model trained on Michaels’ previous NBC appearances, reportedly replicating his unique pronunciations and intonations.
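
The report suggests a simple two-stage pipeline. The sketch below is hypothetical, using placeholder stub functions rather than NBC's actual system, to show how a summarization stage might feed a voice-cloning stage.

```python
def draft_recap_script(captions: str, events: list[str], viewer_name: str) -> str:
    # Stand-in for the LLM stage: summarize the day's events in a play-by-play style.
    highlights = "; ".join(events)
    return f"Hello, {viewer_name}! Here's what you missed today: {highlights}."

def synthesize_voice(script: str, voice: str = "al-michaels-style-clone") -> bytes:
    # Stand-in for the voice model stage: a real system would return waveform audio.
    return script.encode("utf-8")

audio = synthesize_voice(draft_recap_script(
    captions="(subtitles and metadata pulled from NBC's Olympics coverage)",
    events=["100m final", "team gymnastics"],
    viewer_name="Alex",
))
```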

NBC estimates that the system could generate nearly 7 million personalized variants of the recaps across the US during the games, pulled from the network’s 5,000 hours of live coverage. Using the system, each Peacock user will receive about 10 minutes of personalized highlights.

A diminished role for humans in the future?

Al Michaels reports on the Sweden vs. USA men’s ice hockey game at the 1980 Olympic Winter Games on February 12, 1980.

It’s no secret that while AI is wildly hyped right now, it’s also controversial among some. Upon hearing the NBC announcement, critics of AI technology reacted strongly. “@NBCSports, this is gross,” tweeted actress and filmmaker Justine Bateman, who frequently uses X to criticize technologies that might replace human writers or performers in the future.

A thread of similar responses from X users reacting to the sample video provided above included criticisms such as, “Sounds pretty off when it’s just the same tone for every single word.” Another user wrote, “It just sounds so unnatural. No one talks like that.”

The technology will not replace NBC’s regular human sports commentators during this year’s Olympics coverage, and like other forms of AI, it leans heavily on existing human work by analyzing and regurgitating human-created content in the form of captions pulled from NBC footage.

Looking down the line, due to AI media cloning technologies like voice, video, and image synthesis, today’s celebrities may be able to attain a form of media immortality that allows new iterations of their likenesses to persist through the generations, potentially earning licensing fees for whoever holds the rights.

We’ve already seen it with James Earl Jones licensing an AI re-creation of his Darth Vader voice, and the trend will likely continue with other celebrity voices, provided the money is right. Eventually, it may extend to famous musicians through music synthesis and famous actors in video-synthesis applications as well.

The possibility of being muscled out by AI replicas factored heavily into a Hollywood actors’ strike last year, with SAG-AFTRA union President Fran Drescher saying, “If we don’t stand tall right now, we are all going to be in trouble. We are all going to be in jeopardy of being replaced by machines.”

For companies that like to monetize media properties for as long as possible, AI may provide a way to maintain a media legacy through automation. But future human performers may have to compete against all of the greatest performers of the past, rendered through AI, to break out and forge a new career—provided there will be room for human performers at all.

“Al Michaels became Al Michaels because he was brought in to replace people who died, or retired, or moved on,” tweeted a writer named Geonn Cannon on X. “If he can’t do the job anymore, it’s time to let the next Al Michaels have a shot at it instead of just planting a code-generated ghoul in an empty chair.”

Political deepfakes are the most popular way to misuse AI

This is not going well —

Study from Google’s DeepMind lays out nefarious ways AI is being used.

Artificial intelligence-generated “deepfakes” that impersonate politicians and celebrities are far more prevalent than efforts to use AI to assist cyber attacks, according to the first research by Google’s DeepMind division into the most common malicious uses of the cutting-edge technology.

The study said the creation of realistic but fake images, video, and audio of people was almost twice as common as the next highest misuse of generative AI tools: the falsifying of information using text-based tools, such as chatbots, to generate misinformation to post online.

The most common goal of actors misusing generative AI was to shape or influence public opinion, the analysis, conducted with the search group’s research and development unit Jigsaw, found. That accounted for 27 percent of uses, feeding into fears over how deepfakes might influence elections globally this year.

Deepfakes of UK Prime Minister Rishi Sunak, as well as other global leaders, have appeared on TikTok, X, and Instagram in recent months. UK voters go to the polls next week in a general election.

Concern is widespread that, despite social media platforms’ efforts to label or remove such content, audiences may not recognize these as fake, and dissemination of the content could sway voters.

Ardi Janjeva, research associate at The Alan Turing Institute, called “especially pertinent” the paper’s finding that the contamination of publicly accessible information with AI-generated content could “distort our collective understanding of sociopolitical reality.”

Janjeva added: “Even if we are uncertain about the impact that deepfakes have on voting behavior, this distortion may be harder to spot in the immediate term and poses long-term risks to our democracies.”

The study is the first of its kind by DeepMind, Google’s AI unit led by Sir Demis Hassabis, and is an attempt to quantify the risks from the use of generative AI tools, which the world’s biggest technology companies have rushed out to the public in search of huge profits.

As generative products such as OpenAI’s ChatGPT and Google’s Gemini become more widely used, AI companies are beginning to monitor the flood of misinformation and other potentially harmful or unethical content created by their tools.

In May, OpenAI released research revealing operations linked to Russia, China, Iran, and Israel had been using its tools to create and spread disinformation.

“There had been a lot of understandable concern around quite sophisticated cyber attacks facilitated by these tools,” said Nahema Marchal, lead author of the study and researcher at Google DeepMind. “Whereas what we saw were fairly common misuses of GenAI [such as deepfakes that] might go under the radar a little bit more.”

Google DeepMind and Jigsaw’s researchers analyzed around 200 observed incidents of misuse between January 2023 and March 2024, taken from social media platforms X and Reddit, as well as online blogs and media reports of misuse.

The second most common motivation behind misuse was to make money, whether offering services to create deepfakes, including generating naked depictions of real people, or using generative AI to create swaths of content, such as fake news articles.

The research found that most incidents use easily accessible tools, “requiring minimal technical expertise,” meaning more bad actors can misuse generative AI.

Google DeepMind’s research will influence how it improves its evaluations to test models for safety, and it hopes it will also affect how its competitors and other stakeholders view how “harms are manifesting.”

© 2024 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

Runway’s latest AI video generator brings giant cotton candy monsters to life

Screen capture of a Runway Gen-3 Alpha video generated with the prompt “A giant humanoid, made of fluffy blue cotton candy, stomping on the ground, and roaring to the sky, clear blue sky behind them.”

On Sunday, Runway announced a new AI video synthesis model called Gen-3 Alpha that’s still under development, but it appears to create video of similar quality to OpenAI’s Sora, which debuted earlier this year (and has also not yet been released). It can generate novel, high-definition video from text prompts that range from realistic humans to surrealistic monsters stomping the countryside.

Unlike Runway’s previous best model from June 2023, which could only create two-second-long clips, Gen-3 Alpha can reportedly create 10-second-long video segments of people, places, and things that have a consistency and coherency that easily surpasses Gen-2. If 10 seconds sounds short compared to Sora’s full minute of video, consider that the company is working with a shoestring budget of compute compared to more lavishly funded OpenAI—and actually has a history of shipping video generation capability to commercial users.

Gen-3 Alpha does not generate audio to accompany the video clips, and it’s highly likely that temporally coherent generations (those that keep a character consistent over time) are dependent on similar high-quality training material. But Runway’s improvement in visual fidelity over the past year is difficult to ignore.

AI video heats up

It’s been a busy couple of weeks for AI video synthesis in the AI research community, including the launch of the Chinese model Kling, created by Beijing-based Kuaishou Technology (sometimes called “Kwai”). Kling can generate two minutes of 1080p HD video at 30 frames per second with a level of detail and coherency that reportedly matches Sora.

Gen-3 Alpha prompt: “Subtle reflections of a woman on the window of a train moving at hyper-speed in a Japanese city.”

Not long after Kling debuted, people on social media began creating surreal AI videos using Luma AI’s Luma Dream Machine. These videos were novel and weird but generally lacked coherency; we tested out Dream Machine and were not impressed by anything we saw.

Meanwhile, one of the original text-to-video pioneers, New York City-based Runway—founded in 2018—recently found itself the butt of memes that showed its Gen-2 tech falling out of favor compared to newer video synthesis models. That may have spurred the announcement of Gen-3 Alpha.

Gen-3 Alpha prompt: “An astronaut running through an alley in Rio de Janeiro.”

Generating realistic humans has always been tricky for video synthesis models, so Runway specifically shows off Gen-3 Alpha’s ability to create what its developers call “expressive” human characters with a range of actions, gestures, and emotions. However, the company’s provided examples weren’t particularly expressive—mostly people just slowly staring and blinking—though they do look realistic.

Provided human examples include generated videos of a woman on a train, an astronaut running through a street, a man with his face lit by the glow of a TV set, a woman driving a car, and a woman running, among others.

Gen-3 Alpha prompt: “A close-up shot of a young woman driving a car, looking thoughtful, blurred green forest visible through the rainy car window.”

The generated demo videos also include more surreal video synthesis examples, including a giant creature walking in a rundown city, a man made of rocks walking in a forest, and the giant cotton candy monster seen below, which is probably the best video on the entire page.

Gen-3 Alpha prompt: “A giant humanoid, made of fluffy blue cotton candy, stomping on the ground, and roaring to the sky, clear blue sky behind them.”

Gen-3 will power various Runway AI editing tools (one of the company’s most notable claims to fame), including Multi Motion Brush, Advanced Camera Controls, and Director Mode. It can create videos from text or image prompts.

Runway says that Gen-3 Alpha is the first in a series of models trained on a new infrastructure designed for large-scale multimodal training, taking a step toward the development of what it calls “General World Models,” which are hypothetical AI systems that build internal representations of environments and use them to simulate future events within those environments.

Deepfakes in the courtroom: US judicial panel debates new AI evidence rules

adventures in 21st-century justice —

Panel of eight judges confronts deep-faking AI tech that may undermine legal trials.

An illustration of a man with a very long nose holding up the scales of justice.

On Friday, a federal judicial panel convened in Washington, DC, to discuss the challenges of policing AI-generated evidence in court trials, according to a Reuters report. The US Judicial Conference’s Advisory Committee on Evidence Rules, an eight-member panel responsible for drafting evidence-related amendments to the Federal Rules of Evidence, heard from computer scientists and academics about the potential risks of AI being used to manipulate images and videos or create deepfakes that could disrupt a trial.

The meeting took place amid broader efforts by federal and state courts nationwide to address the rise of generative AI models (such as those that power OpenAI’s ChatGPT or Stability AI’s Stable Diffusion), which can be trained on large datasets with the aim of producing realistic text, images, audio, or videos.

In the published 358-page agenda for the meeting, the committee offers up this definition of a deepfake and the problems AI-generated media may pose in legal trials:

A deepfake is an inauthentic audiovisual presentation prepared by software programs using artificial intelligence. Of course, photos and videos have always been subject to forgery, but developments in AI make deepfakes much more difficult to detect. Software for creating deepfakes is already freely available online and fairly easy for anyone to use. As the software’s usability and the videos’ apparent genuineness keep improving over time, it will become harder for computer systems, much less lay jurors, to tell real from fake.

During Friday’s three-hour hearing, the panel wrestled with the question of whether existing rules, which predate the rise of generative AI, are sufficient to ensure the reliability and authenticity of evidence presented in court.

Some judges on the panel, such as US Circuit Judge Richard Sullivan and US District Judge Valerie Caproni, reportedly expressed skepticism about the urgency of the issue, noting that there have been few instances so far of judges being asked to exclude AI-generated evidence.

“I’m not sure that this is the crisis that it’s been painted as, and I’m not sure that judges don’t have the tools already to deal with this,” said Judge Sullivan, as quoted by Reuters.

Last year, Chief US Supreme Court Justice John Roberts acknowledged the potential benefits of AI for litigants and judges, while emphasizing the need for the judiciary to consider its proper uses in litigation. US District Judge Patrick Schiltz, the evidence committee’s chair, said that determining how the judiciary can best react to AI is one of Roberts’ priorities.

In Friday’s meeting, the committee considered several deepfake-related rule changes. In the agenda for the meeting, US District Judge Paul Grimm and attorney Maura Grossman proposed modifying Federal Rule 901(b)(9) (see page 5), which involves authenticating or identifying evidence. They also recommended the addition of a new rule, 901(c), which might read:

901(c): Potentially Fabricated or Altered Electronic Evidence. If a party challenging the authenticity of computer-generated or other electronic evidence demonstrates to the court that it is more likely than not either fabricated, or altered in whole or in part, the evidence is admissible only if the proponent demonstrates that its probative value outweighs its prejudicial effect on the party challenging the evidence.

The panel agreed during the meeting that this proposal to address concerns about litigants challenging evidence as deepfakes did not work as written and that it will be reworked before being reconsidered later.

Another proposal by Andrea Roth, a law professor at the University of California, Berkeley, suggested subjecting machine-generated evidence to the same reliability requirements as expert witnesses. However, Judge Schiltz cautioned that such a rule could hamper prosecutions by allowing defense lawyers to challenge any digital evidence without establishing a reason to question it.

For now, no definitive rule changes have been made, and the process continues. But we’re witnessing the first steps of how the US justice system will adapt to an entirely new class of media-generating technology.

Putting aside risks from AI-generated evidence, generative AI has led to embarrassing moments for lawyers in court over the past two years. In May 2023, US lawyer Steven Schwartz of the firm Levidow, Levidow, & Oberman apologized to a judge for using ChatGPT to help write court filings that inaccurately cited six nonexistent cases, leading to serious questions about the reliability of AI in legal research. Also, in November, a lawyer for Michael Cohen cited three fake cases that were potentially generated by a confabulating AI assistant.

Microsoft’s VASA-1 can deepfake a person with one photo and one audio track

pics and it didn’t happen —

YouTube videos of 6K celebrities helped train AI model to animate photos in real time.

A sample image from Microsoft for “VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.”

On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future, it could power virtual avatars that render locally and don’t require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.

“It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors,” reads the abstract of the accompanying research paper titled, “VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.” It’s the work of Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, and Baining Guo.

The VASA framework (short for “Visual Affective Skills Animator”) uses machine learning to analyze a static image along with a speech audio clip. It is then able to generate a realistic video with precise facial expressions, head movements, and lip-syncing to the audio. It does not clone or simulate voices (like other Microsoft research) but relies on an existing audio input that could be specially recorded or spoken for a particular purpose.

Microsoft claims the model significantly outperforms previous speech animation methods in terms of realism, expressiveness, and efficiency. To our eyes, it does seem like an improvement over single-image animating models that have come before.

AI research efforts to animate a single photo of a person or character extend back at least a few years, but more recently, researchers have been working on automatically synchronizing a generated video to an audio track. In February, an AI model called EMO: Emote Portrait Alive from Alibaba’s Institute for Intelligent Computing research group made waves with a similar approach to VASA-1 that can automatically sync an animated photo to a provided audio track (they call it “Audio2Video”).

Trained on YouTube clips

Microsoft researchers trained VASA-1 on the VoxCeleb2 dataset created in 2018 by three researchers from the University of Oxford. That dataset contains “over 1 million utterances for 6,112 celebrities,” according to the VoxCeleb2 website, extracted from videos uploaded to YouTube. VASA-1 can reportedly generate videos of 512×512 pixel resolution at up to 40 frames per second with minimal latency, which means it could potentially be used for real-time applications like video conferencing.
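
As a rough sanity check of the real-time claim (our own arithmetic, not Microsoft's figures), the target frame rate implies a hard per-frame latency budget:

```python
frames_per_second = 40
frame_budget_ms = 1000 / frames_per_second       # time available to render one frame
pixels_per_frame = 512 * 512
print(f"{frame_budget_ms:.1f} ms per frame, {pixels_per_frame:,} pixels each")
# 25.0 ms per frame, 262,144 pixels each
```

Anything slower than roughly 25 ms per generated frame would fall behind a live video-conferencing stream.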

To show off the model, Microsoft created a VASA-1 research page featuring many sample videos of the tool in action, including people singing and speaking in sync with pre-recorded audio tracks. They show how the model can be controlled to express different moods or change its eye gaze. The examples also include some more fanciful generations, such as Mona Lisa rapping to an audio track of Anne Hathaway performing a “Paparazzi” song on Conan O’Brien.

The researchers say that, for privacy reasons, each example photo on their page was AI-generated by StyleGAN2 or DALL-E 3 (aside from the Mona Lisa). But it’s obvious that the technique could apply equally to photos of real people, although it’s likely to work better if a person appears similar to a celebrity present in the training dataset. Still, the researchers say that deepfaking real humans is not their intention.

“We are exploring visual affective skill generation for virtual, interactive charactors [sic], NOT impersonating any person in the real world. This is only a research demonstration and there’s no product or API release plan,” reads the site.

While the Microsoft researchers tout potential positive applications like enhancing educational equity, improving accessibility, and providing therapeutic companionship, the technology could also easily be misused. For example, it could allow people to fake video chats, make real people appear to say things they never actually said (especially when paired with a cloned voice track), or allow harassment from a single social media photo.

Right now, the generated video still looks imperfect in some ways, but it could be fairly convincing for some people if they did not know to expect an AI-generated animation. The researchers say they are aware of this, which is why they are not openly releasing the code that powers the model.

“We are opposed to any behavior to create misleading or harmful contents of real persons, and are interested in applying our technique for advancing forgery detection,” write the researchers. “Currently, the videos generated by this method still contain identifiable artifacts, and the numerical analysis shows that there’s still a gap to achieve the authenticity of real videos.”

VASA-1 is only a research demonstration, but Microsoft is far from the only group developing similar technology. If the recent history of generative AI is any guide, it’s potentially only a matter of time before similar technology becomes open source and freely available, and such models will very likely continue to improve in realism over time.

Florida middle-schoolers charged with making deepfake nudes of classmates

no consent —

AI tool was used to create nudes of 12- to 13-year-old classmates.

Jacqui VanLiew; Getty Images

Two teenage boys from Miami, Florida, were arrested in December for allegedly creating and sharing AI-generated nude images of male and female classmates without consent, according to police reports obtained by WIRED via public record request.

The arrest reports say the boys, aged 13 and 14, created the images of the students who were “between the ages of 12 and 13.”

The Florida case appears to be the first to come to light in which arrests and criminal charges resulted from the alleged sharing of AI-generated nude images. The boys were charged with third-degree felonies—the same level of crimes as grand theft auto or false imprisonment—under a state law passed in 2022 that makes it a felony to share “any altered sexual depiction” of a person without their consent.

The parent of one of the boys arrested did not respond to a request for comment in time for publication. The parent of the other boy said that he had “no comment.” The detective assigned to the case and the state attorney handling it did not respond to requests for comment in time for publication.

As AI image-making tools have become more widely available, there have been several high-profile incidents in which minors allegedly created AI-generated nude images of classmates and shared them without consent. No arrests have been disclosed in the publicly reported cases—at Issaquah High School in Washington, Westfield High School in New Jersey, and Beverly Vista Middle School in California—even though police reports were filed. At Issaquah High School, police opted not to press charges.

The first media reports of the Florida case appeared in December, saying that the two boys were suspended from Pinecrest Cove Academy in Miami for 10 days after school administrators learned of allegations that they created and shared fake nude images without consent. After parents of the victims learned about the incident, several began publicly urging the school to expel the boys.

Nadia Khan-Roberts, the mother of one of the victims, told NBC Miami in December that the incident was traumatizing for all of the families whose children were victimized. “Our daughters do not feel comfortable walking the same hallways with these boys,” she said. “It makes me feel violated, I feel taken advantage [of] and I feel used,” one victim, who asked to remain anonymous, told the TV station.

WIRED obtained arrest records this week that say the incident was reported to police on December 6, 2023, and that the two boys were arrested on December 22. The records accuse the pair of using “an artificial intelligence application” to make the fake explicit images. The name of the app was not specified and the reports claim the boys shared the pictures between each other.

“The incident was reported to a school administrator,” the reports say, without specifying who reported it, or how that person found out about the images. After the school administrator “obtained copies of the altered images” the administrator interviewed the victims depicted in them, the reports say, who said that they did not consent to the images being created.

After their arrest, the two boys accused of making the images were transported to the Juvenile Service Department “without incident,” the reports say.

A handful of states have laws on the books that target fake, nonconsensual nude images. There’s no federal law targeting the practice, but a group of US senators recently introduced a bill to combat the problem after fake nude images of Taylor Swift were created and distributed widely on X.

The boys were charged under a Florida law passed in 2022 that state legislators designed to curb harassment involving deepfake images made using AI-powered tools.

Stephanie Cagnet Myron, a Florida lawyer who represents victims of nonconsensually shared nude images, tells WIRED that anyone who creates fake nude images of a minor would be in possession of child sexual abuse material, or CSAM. However, she claims it’s likely that the two boys accused of making and sharing the material were not charged with CSAM possession due to their age.

“There’s specifically several crimes that you can charge in a case, and you really have to evaluate what’s the strongest chance of winning, what has the highest likelihood of success, and if you include too many charges, is it just going to confuse the jury?” Cagnet Myron added.

Mary Anne Franks, a professor at the George Washington University School of Law and a lawyer who has studied the problem of nonconsensual explicit imagery, says it’s “odd” that Florida’s revenge porn law, which predates the 2022 statute under which the boys were charged, only makes the offense a misdemeanor, while this situation represented a felony.

“It is really strange to me that you impose heftier penalties for fake nude photos than for real ones,” she says.

Franks adds that although she believes distributing nonconsensual fake explicit images should be a criminal offense, thus creating a deterrent effect, she doesn’t believe offenders should be incarcerated, especially not juveniles.

“The first thing I think about is how young the victims are and worried about the kind of impact on them,” Franks says. “But then [I] also question whether or not throwing the book at kids is actually going to be effective here.”

This story originally appeared on wired.com.

Stability announces Stable Diffusion 3, a next-gen AI image generator

Pics and it didn’t happen —

SD3 may bring DALL-E-like prompt fidelity to an open-weights image-synthesis model.

Stable Diffusion 3 generation with the prompt: studio photograph closeup of a chameleon over a black background.

On Thursday, Stability AI announced Stable Diffusion 3, an open-weights next-generation image-synthesis model. It follows its predecessors by reportedly generating detailed, multi-subject images with improved quality and accuracy in text generation. The brief announcement was not accompanied by a public demo, but Stability is opening up a waitlist today for those who would like to try it.

Stability says that its Stable Diffusion 3 family of models (which takes text descriptions called “prompts” and turns them into matching images) ranges in size from 800 million to 8 billion parameters. That size range allows different versions of the model to run locally on a variety of devices—from smartphones to servers. Parameter count roughly corresponds to model capability in terms of how much detail it can generate, and larger models also require more VRAM on GPU accelerators to run.
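
To see roughly what that parameter range means for local use, here is a back-of-the-envelope estimate (our own, not Stability's figures) of the memory needed just to hold 16-bit weights:

```python
# Weights alone, at 2 bytes per parameter in fp16/bf16; activations, text
# encoders, and the VAE add more on top of this floor.
for params in (800e6, 2e9, 8e9):
    weight_gib = params * 2 / 1024**3
    print(f"{params / 1e9:.1f}B parameters ≈ {weight_gib:.1f} GiB of weights")
# 0.8B ≈ 1.5 GiB, 2.0B ≈ 3.7 GiB, 8.0B ≈ 14.9 GiB
```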

Since 2022, we’ve seen Stability launch a progression of AI image-generation models: Stable Diffusion 1.4, 1.5, 2.0, 2.1, XL, XL Turbo, and now 3. Stability has made a name for itself by providing a more open alternative to proprietary image-synthesis models like OpenAI’s DALL-E 3, though not without controversy due to its use of copyrighted training data, bias, and the potential for abuse. (This has led to lawsuits that remain unresolved.) Stable Diffusion models have been open-weights and source-available, which means the models can be run locally and fine-tuned to change their outputs.

  • Stable Diffusion 3 generation with the prompt: Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says “Stable Diffusion 3” made out of colorful energy.

  • An AI-generated image of a grandma wearing a “Go big or go home” sweatshirt, generated by Stable Diffusion 3.

  • Stable Diffusion 3 generation with the prompt: Three transparent glass bottles on a wooden table. The one on the left has red liquid and the number 1. The one in the middle has blue liquid and the number 2. The one on the right has green liquid and the number 3.

  • An AI-generated image created by Stable Diffusion 3.

  • Stable Diffusion 3 generation with the prompt: A horse balancing on top of a colorful ball in a field with green grass and a mountain in the background.

  • Stable Diffusion 3 generation with the prompt: Moody still life of assorted pumpkins.

  • Stable Diffusion 3 generation with the prompt: a painting of an astronaut riding a pig wearing a tutu holding a pink umbrella, on the ground next to the pig is a robin bird wearing a top hat, in the corner are the words “stable diffusion.”

  • Stable Diffusion 3 generation with the prompt: Resting on the kitchen table is an embroidered cloth with the text ‘good night’ and an embroidered baby tiger. Next to the cloth there is a lit candle. The lighting is dim and dramatic.

  • Stable Diffusion 3 generation with the prompt: Photo of an 90’s desktop computer on a work desk, on the computer screen it says “welcome”. On the wall in the background we see beautiful graffiti with the text “SD3” very large on the wall.

As far as tech improvements are concerned, Stability CEO Emad Mostaque wrote on X, “This uses a new type of diffusion transformer (similar to Sora) combined with flow matching and other improvements. This takes advantage of transformer improvements & can not only scale further but accept multimodal inputs.”

As Mostaque noted, the Stable Diffusion 3 family uses diffusion transformer architecture, which is a new way of creating images with AI that swaps out the usual image-building blocks (such as U-Net architecture) for a system that works on small pieces of the picture. The method was inspired by transformers, which are good at handling patterns and sequences. This approach not only scales up efficiently but also reportedly produces higher-quality images.
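
A minimal sketch of that “small pieces” idea, assuming the usual patchify step described in diffusion transformer papers rather than Stability's unpublished internals: the latent image is cut into fixed-size patches, and each patch becomes one token for the transformer to process.

```python
import numpy as np

def patchify(latent: np.ndarray, patch: int = 2) -> np.ndarray:
    """Turn a (channels, height, width) latent into a (tokens, features) sequence."""
    c, h, w = latent.shape
    return (latent
            .reshape(c, h // patch, patch, w // patch, patch)
            .transpose(1, 3, 0, 2, 4)             # group pixels by patch position
            .reshape((h // patch) * (w // patch), c * patch * patch))

latent = np.random.randn(4, 64, 64)               # e.g. a 4-channel 64x64 latent
print(patchify(latent).shape)                     # (1024, 16): 1,024 tokens of 16 values
```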

Stable Diffusion 3 also utilizes “flow matching,” which is a technique for creating AI models that can generate images by learning how to transition from random noise to a structured image smoothly. It does this without needing to simulate every step of the process, instead focusing on the overall direction or flow that the image creation should follow.
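
In code, the flow-matching training objective is strikingly simple. The sketch below follows the standard rectified-flow formulation from the research literature (not Stability's implementation): pick a random point on the straight line between noise and an image, and train the network to predict the constant velocity pointing from noise toward the image.

```python
import numpy as np

def flow_matching_loss(model, x_data: np.ndarray) -> float:
    x_noise = np.random.randn(*x_data.shape)      # pure Gaussian noise
    t = np.random.rand()                          # random time in [0, 1]
    x_t = (1 - t) * x_noise + t * x_data          # point on the straight noise-to-data path
    target_velocity = x_data - x_noise            # constant velocity along that path
    predicted = model(x_t, t)                     # the network predicts the velocity
    return float(np.mean((predicted - target_velocity) ** 2))

# Hypothetical usage with a trivial stand-in model that always predicts zero:
loss = flow_matching_loss(lambda x, t: np.zeros_like(x), np.random.randn(4, 64, 64))
print(f"{loss:.3f}")
```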

A comparison of outputs between OpenAI’s DALL-E 3 and Stable Diffusion 3 with the prompt, “Night photo of a sports car with the text ‘SD3’ on the side, the car is on a race track at high speed, a huge road sign with the text ‘faster.’”

We do not have access to Stable Diffusion 3 (SD3), but from samples we found posted on Stability’s website and associated social media accounts, the generations appear roughly comparable to other state-of-the-art image-synthesis models at the moment, including the aforementioned DALL-E 3, Adobe Firefly, Imagine with Meta AI, Midjourney, and Google Imagen.

SD3 appears to handle text generation very well in the examples provided by others, which are potentially cherry-picked. Text generation was a particular weakness of earlier image-synthesis models, so an improvement to that capability in a free model is a big deal. Also, prompt fidelity (how closely it follows descriptions in prompts) seems to be similar to DALL-E 3, but we haven’t tested that ourselves yet.

While Stable Diffusion 3 isn’t widely available, Stability says that once testing is complete, its weights will be free to download and run locally. “This preview phase, as with previous models,” Stability writes, “is crucial for gathering insights to improve its performance and safety ahead of an open release.”

Stability has been experimenting with a variety of image-synthesis architectures recently. Aside from SDXL and SDXL Turbo, just last week, the company announced Stable Cascade, which uses a three-stage process for text-to-image synthesis.

Listing image by Emad Mostaque (Stability AI)

Will Smith parodies viral AI-generated video by actually eating spaghetti

Mangia, mangia —

Actor pokes fun at 2023 AI video by eating spaghetti messily and claiming it’s AI-generated.

The real Will Smith eating spaghetti, parodying an AI-generated video from 2023.

On Monday, Will Smith posted a video on his official Instagram feed that parodied an AI-generated video of the actor eating spaghetti that went viral last year. With the recent announcement of OpenAI’s Sora video synthesis model, many people have noted the dramatic jump in AI-video quality over the past year compared to the infamous spaghetti video. Smith’s new video plays on that comparison by showing the actual actor eating spaghetti in a comical fashion and claiming that it is AI-generated.

Captioned “This is getting out of hand!”, the Instagram video uses a split-screen layout to show the original AI-generated spaghetti video created by a Reddit user named “chaindrop” in March 2023 on the top, labeled with the subtitle “AI Video 1 year ago.” Below that, in a box titled “AI Video Now,” the real Smith shows 11 video segments of himself actually eating spaghetti by slurping it up while shaking his head, pouring it into his mouth with his fingers, and even nibbling on a friend’s hair. Lil Jon’s 2006 song “Snap Yo Fingers” plays in the background.

In the Instagram comments section, some people expressed confusion about the new (non-AI) video, saying, “I’m still in doubt if second video was also made by AI or not.” In a reply, someone else wrote, “Boomers are gonna loose [sic] this one. Second one is clearly him making a joke but I wouldn’t doubt it in a couple months time it will get like that.”

We have not yet seen a model with the capability of Sora attempt to create a new Will-Smith-eating-spaghetti AI video, but the result would likely be far better than what we saw last year, even if it contained obvious glitches. Given how things are progressing, we wouldn’t be surprised if by 2025, video synthesis AI models can replicate the parody video created by Smith himself.

It’s worth noting for history’s sake that despite the comparison, the video of Will Smith eating spaghetti did not represent the state of the art in text-to-video synthesis at the time of its creation in March 2023 (that title would likely apply to Runway’s Gen-2, which was then in closed testing). However, the spaghetti video was reasonably advanced for open weights models at the time, having used the ModelScope AI model. More capable video synthesis models had already been released at that time, but due to the humorous cultural reference, it’s arguably more fun to compare today’s AI video synthesis to Will Smith grotesquely eating spaghetti than to teddy bears washing dishes.
