The FBI has dismantled a massive network of compromised devices that Chinese state-sponsored hackers have used for four years to mount attacks on government agencies, telecoms, defense contractors, and other targets in the US and Taiwan.
The botnet was made up primarily of small office and home office routers, surveillance cameras, network-attached storage, and other Internet-connected devices located all over the world. Over the past four years, US officials said, 260,000 such devices have cycled through the sophisticated network, which is organized in three tiers that allow the botnet to operate with efficiency and precision. At its peak in June 2023, Raptor Train, as the botnet is named, consisted of more than 60,000 commandeered devices, according to researchers from Black Lotus Labs, making it the largest China state botnet discovered to date.
Burning down the house
Raptor Train is the second China state-operated botnet US authorities have taken down this year. In January, law enforcement officials covertly issued commands to disinfect Internet of Things devices that hackers backed by the Chinese government had taken over without the device owners’ knowledge. The Chinese hackers, part of a group tracked as Volt Typhoon, used the botnet for more than a year as a platform to deliver exploits that burrowed deep into the networks of targets of interest. Because the attacks appear to originate from IP addresses with good reputations, they are subjected to less scrutiny from network security defenses, making the bots an ideal delivery proxy. Russia-state hackers have also been caught assembling large IoT botnets for the same purposes.
An advisory jointly issued Wednesday by the FBI, the Cyber National Mission Force, and the National Security Agency said that China-based Integrity Technology Group, a company officials said has ties to the People’s Republic of China, controlled and managed Raptor Train. The company, they said, also used IP addresses from the state-controlled China Unicom Beijing Province Network to control and manage the botnet. Researchers and law enforcement track the China-state group that worked with Integrity Technology as Flax Typhoon. More than half of the infected Raptor Train devices were located in North America and another 25 percent in Europe.
“Flax Typhoon was targeting critical infrastructure across the US and overseas, everyone from corporations and media organizations to universities and government agencies,” FBI Director Christopher Wray said Wednesday at the Aspen Cyber Summit. “Like Volt Typhoon, they used Internet-connected devices, this time hundreds of thousands of them, to create a botnet that helped them compromise systems and exfiltrate confidential data.” He added: “Flax Typhoon’s actions caused real harm to its victims who had to devote precious time to clean up the mess.”
Under C2PA, this stock image would be labeled as a real photograph if the camera used to take it, and the toolchain for retouching it, supported the C2PA. But even as a real photo, does it actually represent reality, and is there a technological solution to that problem?
On Tuesday, Google announced plans to implement content authentication technology across its products to help users distinguish between human-created and AI-generated images. Over the coming months, the tech giant will integrate the Coalition for Content Provenance and Authenticity (C2PA) standard, a system designed to track the origin and editing history of digital content, into its search, ads, and potentially YouTube services. However, it’s an open question whether a technological solution can address the ancient social issue of trust in recorded media produced by strangers.
A group of tech companies created the C2PA system beginning in 2019 in an attempt to combat misleading, realistic synthetic media online. As AI-generated content becomes more prevalent and realistic, experts have worried that it may be difficult for users to determine the authenticity of images they encounter. The C2PA standard creates a digital trail for content, backed by an online signing authority, that includes metadata information about where images originate and how they’ve been modified.
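Conceptually, the standard binds a signed manifest to the exact image bytes, so any later modification is either recorded or breaks the chain. Here is a deliberately simplified Python sketch of that binding; the manifest layout and helper below are illustrative inventions, since real C2PA manifests are signed binary structures embedded in the file, not JSON or Python dicts:

import hashlib

# Hypothetical, simplified manifest. A real C2PA manifest is a binary
# structure signed with X.509 certificates issued by a signing authority.
manifest = {
    "claim_generator": "ExampleCamera/1.0",       # who created the claim
    "assertions": [{"action": "c2pa.created"}],   # what happened to the asset
    "asset_sha256": "0" * 64,                     # hash of the image bytes at signing time
}

def asset_matches_manifest(image_path: str, manifest: dict) -> bool:
    """Check that the image bytes still match the hash recorded at signing.

    If even one byte changed after signing, or the metadata was stripped,
    the provenance chain is broken. (Verifying the manifest's signature
    against the signing authority's certificate is omitted here.)
    """
    digest = hashlib.sha256(open(image_path, "rb").read()).hexdigest()
    return digest == manifest["asset_sha256"]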
Google will incorporate this C2PA standard into its search results, allowing users to see if an image was created or edited using AI tools. The tech giant’s “About this image” feature in Google Search, Lens, and Circle to Search will display this information when available.
In a blog post, Laurie Richardson, Google’s vice president of trust and safety, acknowledged the complexities of establishing content provenance across platforms. She stated, “Establishing and signaling content provenance remains a complex challenge, with a range of considerations based on the product or service. And while we know there’s no silver bullet solution for all content online, working with others in the industry is critical to create sustainable and interoperable solutions.”
The company plans to use the C2PA’s latest technical standard, version 2.1, which reportedly offers improved security against tampering attacks. Its use will extend beyond search since Google intends to incorporate C2PA metadata into its ad systems as a way to “enforce key policies.” YouTube may also see integration of C2PA information for camera-captured content in the future.
Google says the new initiative aligns with its other efforts toward AI transparency, including the development of SynthID, an embedded watermarking technology created by Google DeepMind.
Widespread C2PA efficacy remains a dream
Despite having a history that reaches back at least five years now, the road to useful content provenance technology like C2PA is steep. The technology is entirely voluntary, and key authenticating metadata can easily be stripped from images once added.
AI image generators would need to support the standard for C2PA information to be included in each generated file, a requirement that open source image synthesis models like Flux are unlikely to meet. So perhaps, in practice, more “authentic,” camera-authored media will be labeled with C2PA than AI-generated images.
Beyond that, maintaining the metadata requires a complete toolchain that supports C2PA every step along the way, including at the source and any software used to edit or retouch the images. Currently, only a handful of camera manufacturers, such as Leica, support the C2PA standard. Nikon and Canon have pledged to adopt it, but The Verge reports that there’s still uncertainty about whether Apple and Google will implement C2PA support in their smartphone devices.
Adobe’s Photoshop and Lightroom can add and maintain C2PA data, but many other popular editing tools do not yet offer the capability. It only takes one non-compliant image editor in the chain to break the full usefulness of C2PA. And the general lack of standardized viewing methods for C2PA data across online platforms presents another obstacle to making the standard useful for everyday users.
C2PA could arguably be seen as one technological answer to trust issues around fake images. In that sense, it may become one of many tools used to authenticate content by determining whether the information came from a credible source (provided the C2PA metadata is preserved), but it is unlikely to be a complete solution to AI-generated misinformation on its own.
Researchers still don’t know the cause of a recently discovered malware infection affecting almost 1.3 million streaming devices running an open source version of Android in almost 200 countries.
Security firm Doctor Web reported Thursday that malware named Android.Vo1d has backdoored the Android-based boxes by putting malicious components in their system storage area, where they can be updated with additional malware at any time by command-and-control servers. Google representatives said the infected devices are running operating systems based on the Android Open Source Project, a version overseen by Google but distinct from Android TV, a proprietary version restricted to licensed device makers.
Dozens of variants
Although Doctor Web has a thorough understanding of Vo1d and the exceptional reach it has achieved, company researchers say they have yet to determine the attack vector that has led to the infections.
“At the moment, the source of the TV boxes’ backdoor infection remains unknown,” Thursday’s post stated. “One possible infection vector could be an attack by an intermediate malware that exploits operating system vulnerabilities to gain root privileges. Another possible vector could be the use of unofficial firmware versions with built-in root access.”
The device models known to be infected by Vo1d are:

- R4 (declared firmware: Android 7.1.2; R4 Build/NHG47K)
- TV BOX (declared firmware: Android 12.1; TV BOX Build/NHG47K)
- KJ-SMART4KVIP (declared firmware: Android 10.1; KJ-SMART4KVIP Build/NHG47K)
One possible cause of the infections is that the devices are running outdated versions that are vulnerable to exploits that remotely execute malicious code on them. Versions 7.1, 10.1, and 12.1, for example, were released in 2016, 2019, and 2022, respectively. What’s more, Doctor Web said it’s not unusual for budget device manufacturers to install older OS versions in streaming boxes and make them appear more attractive by passing them off as more up-to-date models.
Further, while only licensed device makers are permitted to modify Google’s Android TV, any device maker is free to make changes to open source versions. That leaves open the possibility that the devices were infected somewhere in the supply chain and were already compromised by the time they were purchased by the end user.
“These off-brand devices discovered to be infected were not Play Protect certified Android devices,” Google said in a statement. “If a device isn’t Play Protect certified, Google doesn’t have a record of security and compatibility test results. Play Protect certified Android devices undergo extensive testing to ensure quality and user safety.”
The statement said people can confirm a device runs Android TV OS by checking this link and following the steps listed here.
Doctor Web said that there are dozens of Vo1d variants that use different code and plant malware in slightly different storage areas, but that all achieve the same end result of connecting to an attacker-controlled server and installing a final component that can install additional malware when instructed. VirusTotal shows that most of the Vo1d variants were first uploaded to the malware identification site several months ago.
Researchers wrote:
All these cases involved similar signs of infection, so we will describe them using one of the first requests we received as an example. The following objects were changed on the affected TV box:
install-recovery.sh
daemonsu
In addition, 4 new files emerged in its file system:
/system/xbin/vo1d
/system/xbin/wd
/system/bin/debuggerd
/system/bin/debuggerd_real
The vo1d and wd files are the components of the Android.Vo1d trojan that we discovered.
The trojan’s authors probably tried to disguise one of its components as the system program /system/bin/vold, calling it by the similar-looking name “vo1d” (substituting the lowercase letter “l” with the number “1”). The malicious program’s name comes from the name of this file. Moreover, this spelling is consonant with the English word “void”.
The install-recovery.sh file is a script that is present on most Android devices. It runs when the operating system is launched and contains data for autorunning the elements specified in it. If any malware has root access and the ability to write to the /system system directory, it can anchor itself in the infected device by adding itself to this script (or by creating it from scratch if it is not present in the system). Android.Vo1d has registered the autostart for the wd component in this file.
The modified install-recovery.sh file. Credit: Doctor Web
The daemonsu file is present on many Android devices with root access. It is launched by the operating system when it starts and is responsible for providing root privileges to the user. Android.Vo1d registered itself in this file, too, having also set up autostart for the wd module.
The debuggerd file is a daemon that is typically used to create error reports. But when the TV box was infected, this file was replaced by the script that launches the wd component.
The debuggerd_real file in the case we are reviewing is a copy of the script that was used to substitute the real debuggerd file. Doctor Web experts believe that the trojan’s authors intended the original debuggerd to be moved into debuggerd_real to maintain its functionality. However, because the infection probably occurred twice, the trojan moved the already substituted file (i.e., the script). As a result, the device had two scripts from the trojan and not a single real debuggerd program file.
At the same time, other users who contacted us had a slightly different list of files on their infected devices:
debuggerd_real (the original file of the debuggerd tool);
install-recovery.sh (a script that loads objects specified in it).
An analysis of all the aforementioned files showed that in order to anchor Android.Vo1d in the system, its authors used at least three different methods: modification of the install-recovery.sh and daemonsu files and substitution of the debuggerd program. They probably expected that at least one of the target files would be present in the infected system, since manipulating even one of them would ensure the trojan’s successful auto launch during subsequent device reboots.
Android.Vo1d’s main functionality is concealed in its vo1d (Android.Vo1d.1) and wd (Android.Vo1d.3) components, which operate in tandem. The Android.Vo1d.1 module is responsible for Android.Vo1d.3’s launch and controls its activity, restarting its process if necessary. In addition, it can download and run executables when commanded to do so by the C&C server. In turn, the Android.Vo1d.3 module installs and launches the Android.Vo1d.5 daemon that is encrypted and stored in its body. This module can also download and run executables. Moreover, it monitors specified directories and installs the APK files that it finds in them.
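The division of labor described here, with one module launching the other and restarting it if it dies, is a classic watchdog pattern. Below is a generic Python sketch of that pattern for illustration only; the command path is a placeholder, and this is not Vo1d’s actual code:

import subprocess
import time

def supervise(cmd):
    """Launch a companion process and restart it whenever it exits.

    This mirrors the arrangement described above: one module is
    responsible for another module's launch, controls its activity,
    and restarts its process if necessary.
    """
    while True:
        proc = subprocess.Popen(cmd)
        proc.wait()        # blocks until the watched process dies
        time.sleep(5)      # brief backoff before relaunching

supervise(["/path/to/watched-daemon"])  # placeholder path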
The geographic distribution of the infections is wide, with the biggest number detected in Brazil, Morocco, Pakistan, Saudi Arabia, Russia, Argentina, Ecuador, Tunisia, Malaysia, Algeria, and Indonesia.
A world map listing the number of infections found in various countries. Credit: Doctor Web
It’s not especially easy for less experienced people to check if a device is infected short of installing malware scanners. Doctor Web said its antivirus software for Android will detect all Vo1d variants and disinfect devices that provide root access. More experienced users can check indicators of compromise here.
On Thursday, Google made Gemini Live, its voice-based AI chatbot feature, available for free to all Android users. The feature allows users to interact with Gemini through voice commands on their Android devices. That’s notable because competitor OpenAI’s Advanced Voice Mode feature of ChatGPT, which is similar to Gemini Live, has not yet fully shipped.
Google unveiled Gemini Live during its Pixel 9 launch event last month. Initially, the feature was exclusive to Gemini Advanced subscribers, but now it’s accessible to anyone using the Gemini app or its overlay on Android.
Gemini Live enables users to ask questions aloud and even interrupt the AI’s responses mid-sentence. Users can choose from several voice options for Gemini’s responses, adding a level of customization to the interaction.
Gemini suggests the following uses of the voice mode in its official help documents:
Talk back and forth: Talk to Gemini without typing, and Gemini will respond back verbally.

Brainstorm ideas out loud: Ask for a gift idea, to plan an event, or to make a business plan.

Explore: Uncover more details about topics that interest you.

Practice aloud: Rehearse for important moments in a more natural and conversational way.
Interestingly, while OpenAI originally demoed its Advanced Voice Mode in May with the launch of GPT-4o, it only began shipping the feature to a limited number of users in late July. Some AI experts speculate that a wider rollout has been hampered by a lack of available computing power, since the voice feature is presumably very compute-intensive.
To access Gemini Live, users can reportedly tap a new waveform icon in the bottom-right corner of the app or overlay. This action activates the microphone, allowing users to pose questions verbally. The interface includes options to “hold” Gemini’s answer or “end” the conversation, giving users control over the flow of the interaction.
Currently, Gemini Live supports only English, but Google has announced plans to expand language support in the future. The company also intends to bring the feature to iOS devices, though no specific timeline has been provided for this expansion.
Soon you’ll be able to stream games and video for free on United flights. Credit: United
United Airlines announced this morning that it is giving its in-flight Internet access an upgrade. It has signed a deal with Starlink to deliver SpaceX’s satellite-based service to all its aircraft, a process that will start in 2025. And the good news for passengers is that the in-flight Wi-Fi will be free of charge.
The flying experience as it relates to consumer technology has come a very long way in the two-and-a-bit decades that Ars has been publishing. At the turn of the century, even having a power socket in your seat was a long shot. Laptop batteries didn’t last that long, either—usually less than the runtime of whatever DVD I hoped to distract myself with, if memory serves.
Bring a spare battery and that might double, but it helped to have a book or magazine to read.
By 2011, the picture had changed. Wi-Fi was no longer some esoteric thing known only to nerds who built their own computers, and smartphones and tablets were on their way to ubiquity. After an aborted attempt in 2004, in-flight Internet access became a reality in North America in 2008, although the air-to-ground cellular-based system was slow, unreliable, and expensive.
Air-to-ground Internet access was maybe slightly cheaper by 2018, but it was still frustrating and slow, particularly if you were, oh, I dunno, a journalist trying to upload images to a CMS on your way back from an event. But by then, there was a better alternative—satellites. Airliners started sporting new antenna-concealing blisters, and soon, we were all streaming and posting and working our way across the skies.
Enter SpaceX
That bandwidth was courtesy of Viasat, according to all the receipts in my expense reports, but in 2022, SpaceX announced that it was adding aviation to Starlink’s portfolio. Initially, Starlink only targeted smaller regional and private jet aircraft, but now its equipment is also certified for commercial passenger planes from Airbus and Boeing and is already in use with carriers including Qatar Airways and Air New Zealand.
United says it will start testing Starlink equipment early in 2025, with the first use on passenger flights later that year. The service will be available gate-to-gate (as opposed to only working above 10,000 feet, a restriction some other systems operate under), and it certainly sounds like a superior experience to current in-flight Internet, as it will explicitly allow streaming of both video and games, and multiple connected devices at once. Better yet, United says the service will be free for passengers.
Depending on the route you fly, you may need to have some patience, though. United says it will take several years to install Starlink systems on its more than 1,000 aircraft.
OpenAI finally unveiled its rumored “Strawberry” AI language model on Thursday, claiming significant improvements in what it calls “reasoning” and problem-solving capabilities over previous large language models (LLMs). Formally named “OpenAI o1,” the model family will initially launch in two forms, o1-preview and o1-mini, available today for ChatGPT Plus and certain API users.
OpenAI claims that o1-preview outperforms its predecessor, GPT-4o, on multiple benchmarks, including competitive programming, mathematics, and “scientific reasoning.” However, people who have used the model say it does not yet outclass GPT-4o in every metric. Other users have criticized the delay in receiving a response from the model, owing to the multi-step processing occurring behind the scenes before answering a query.
In a rare display of public hype-busting, OpenAI product manager Joanne Jang tweeted, “There’s a lot of o1 hype on my feed, so I’m worried that it might be setting the wrong expectations. what o1 is: the first reasoning model that shines in really hard tasks, and it’ll only get better. (I’m personally psyched about the model’s potential & trajectory!) what o1 isn’t (yet!): a miracle model that does everything better than previous models. you might be disappointed if this is your expectation for today’s launch—but we’re working to get there!”
OpenAI reports that o1-preview ranked in the 89th percentile on competitive programming questions from Codeforces. In mathematics, it scored 83 percent on a qualifying exam for the International Mathematics Olympiad, compared to GPT-4o’s 13 percent. OpenAI also states, in a claim that may later be challenged as people scrutinize the benchmarks and run their own evaluations over time, that o1 performs comparably to PhD students on specific tasks in physics, chemistry, and biology. The smaller o1-mini model is designed specifically for coding tasks and is priced 80 percent lower than o1-preview.
A benchmark chart provided by OpenAI. They write, “o1 improves over GPT-4o on a wide range of benchmarks, including 54/57 MMLU subcategories. Seven are shown for illustration.”
OpenAI attributes o1’s advancements to a new reinforcement learning (RL) training approach that teaches the model to spend more time “thinking through” problems before responding, similar to how “let’s think step-by-step” chain-of-thought prompting can improve outputs in other LLMs. The new process allows o1 to try different strategies and “recognize” its own mistakes.
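OpenAI has not published the specifics of that training process, but the chain-of-thought prompting it is compared to is easy to demonstrate with any chat model. A minimal sketch using the OpenAI Python client, with an example model name and prompt (o1 bakes this behavior into the model itself rather than relying on the prompt):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Appending "let's think step by step" nudges the model to emit
# intermediate reasoning before its final answer, which measurably
# improves results on multi-step problems for many LLMs.
response = client.chat.completions.create(
    model="gpt-4o",  # example model; substitute any chat model
    messages=[{
        "role": "user",
        "content": "A bat and a ball cost $1.10 in total. The bat costs "
                   "$1.00 more than the ball. How much does the ball cost? "
                   "Let's think step by step.",
    }],
)
print(response.choices[0].message.content)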
AI benchmarks are notoriously unreliable and easy to game; however, independent verification and experimentation from users will show the full extent of o1’s advancements over time. It’s worth noting that MIT Research showed earlier this year that some of the benchmark claims OpenAI touted with GPT-4 last year were erroneous or exaggerated.
A mixed bag of capabilities
OpenAI demos “o1” correctly counting the number of Rs in the word “strawberry.”
Amid many demo videos of o1 completing programming tasks and solving logic puzzles that OpenAI shared on its website and social media, one demo stood out as perhaps the least consequential and least impressive, but it may become the most talked about due to a recurring meme where people ask LLMs to count the number of R’s in the word “strawberry.”
Due to tokenization, where the LLM processes words in data chunks called tokens, most LLMs are typically blind to character-by-character differences in words. Apparently, o1 has the self-reflective capabilities to figure out how to count the letters and provide an accurate answer without user assistance.
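A tokenizer library makes the underlying problem easy to see. The sketch below uses tiktoken, OpenAI’s open source tokenizer; the exact chunking depends on the encoding, but the point is that the model receives chunks, not letters:

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a GPT-4-class encoding
word = "strawberry"

token_ids = enc.encode(word)
chunks = [enc.decode([t]) for t in token_ids]
print(chunks)  # multi-character chunks, not individual letters

# Ordinary code counts characters trivially; a token-based LLM
# never sees the letters, only the chunk IDs above.
print(word.count("r"))  # 3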
Beyond OpenAI’s demos, we’ve seen optimistic but cautious hands-on reports about o1-preview online. Wharton Professor Ethan Mollick wrote on X, “Been using GPT-4o1 for the last month. It is fascinating—it doesn’t do everything better but it solves some very hard problems for LLMs. It also points to a lot of future gains.”
Mollick shared a hands-on post in his “One Useful Thing” blog that details his experiments with the new model. “To be clear, o1-preview doesn’t do everything better. It is not a better writer than GPT-4o, for example. But for tasks that require planning, the changes are quite large.”
Mollick gives the example of asking o1-preview to build a teaching simulator “using multiple agents and generative AI, inspired by the paper below and considering the views of teachers and students,” then asking it to build the full code, and it produced a result that Mollick found impressive.
Mollick also gave o1-preview eight crossword puzzle clues, translated into text, and the model took 108 seconds to solve the puzzle over many steps, getting all of the answers correct but confabulating an answer to a clue Mollick had not given it. We recommend reading Mollick’s entire post for a good early hands-on impression. Given his experience with the new model, it appears that o1 works very similarly to GPT-4o but iteratively in a loop, which is something that the so-called “agentic” AutoGPT and BabyAGI projects experimented with in early 2023.
Is this what could “threaten humanity?”
Speaking of agentic models that run in loops, Strawberry has been subject to hype since last November, when it was initially known as Q* (Q-star). At the time, The Information and Reuters claimed that, just before Sam Altman’s brief ouster as CEO, OpenAI employees had internally warned OpenAI’s board of directors about a new OpenAI model called Q* that could “threaten humanity.”
In August, the hype continued when The Information reported that OpenAI showed Strawberry to US national security officials.
We’ve been skeptical about the hype around Q* and Strawberry since the rumors first emerged, as this author noted last November, and Timothy B. Lee covered thoroughly in an excellent post about Q* from last December.
So even though o1 is out, AI industry watchers should note how this model’s impending launch was played up in the press as a dangerous advancement, a narrative OpenAI did little to publicly dampen. For an AI model that takes 108 seconds to solve eight clues in a crossword puzzle and hallucinates one answer, we can say that its potential danger was likely hype (for now).
Controversy over “reasoning” terminology
It’s no secret that some people in tech have issues with anthropomorphizing AI models and using terms like “thinking” or “reasoning” to describe the synthesizing and processing operations that these neural network systems perform.
Just after the OpenAI o1 announcement, Hugging Face CEO Clement Delangue wrote, “Once again, an AI system is not ‘thinking,’ it’s ‘processing,’ ‘running predictions,’… just like Google or computers do. Giving the false impression that technology systems are human is just cheap snake oil and marketing to fool you into thinking it’s more clever than it is.”
“Reasoning” is also a somewhat nebulous term since, even in humans, it’s difficult to define exactly what the term means. A few hours before the announcement, independent AI researcher Simon Willison tweeted in response to a Bloomberg story about Strawberry, “I still have trouble defining ‘reasoning’ in terms of LLM capabilities. I’d be interested in finding a prompt which fails on current models but succeeds on strawberry that helps demonstrate the meaning of that term.”
Reasoning or not, o1-preview currently lacks some features present in earlier models, such as web browsing, image generation, and file uploading. OpenAI plans to add these capabilities in future updates, along with continued development of both the o1 and GPT model series.
While OpenAI says the o1-preview and o1-mini models are rolling out today, neither model is available in our ChatGPT Plus interface yet, so we have not been able to evaluate them. We’ll report our impressions on how this model differs from other LLMs we have previously covered.
Hard drives, unfortunately, tend to die not with a spectacular and sparkly bang, but with a head-is-stuck whimper. Credit: Getty Images
One of the things enterprise storage and destruction company Iron Mountain does is handle the archiving of the media industry’s vaults. What it has been seeing lately should be a wake-up call: roughly one-fifth of the hard disk drives from the 1990s that it has been sent are entirely unreadable.
Music industry publication Mix spoke with the people in charge of backing up the entertainment industry. The resulting tale is part explainer on how music is so complicated to archive now, part warning about everyone’s data stored on spinning disks.
“In our line of work, if we discover an inherent problem with a format, it makes sense to let everybody know,” Robert Koszela, global director for studio growth and strategic initiatives at Iron Mountain, told Mix. “It may sound like a sales pitch, but it’s not; it’s a call for action.”
Hard drives gained popularity over spooled magnetic tape with the rise of digital audio workstations and mixing and editing software, and because of tape’s perceived downsides, including deterioration from substrate separation and vulnerability to fire. But hard drives present their own archival problems. Standard hard drives were not designed for long-term archival use, and you can almost never decouple the magnetic platters from the reading hardware inside, so if either fails, the whole drive dies.
There are also general computer storage issues, including the separation of samples and finished tracks, or proprietary file formats requiring archival versions of software. Still, Iron Mountain tells Mix that “If the disk platters spin and aren’t damaged,” it can access the content.
But “if it spins” is becoming a big question mark. Musicians and studios now digging into their archives to remaster tracks often find that drives, even when stored at industry-standard temperature and humidity, have failed in some way, with no partial recovery option available.
“It’s so sad to see a project come into the studio, a hard drive in a brand-new case with the wrapper and the tags from wherever they bought it still in there,” Koszela says. “Next to it is a case with the safety drive in it. Everything’s in order. And both of them are bricks.”
Entropy wins
Mix’s passing along of Iron Mountain’s warning hit Hacker News earlier this week, which spurred other tales of faith in the wrong formats. The gist of it: You cannot trust any medium, so you copy important things over and over, into fresh storage. “Optical media rots, magnetic media rots and loses magnetic charge, bearings seize, flash storage loses charge, etc.,” writes user abracadaniel. “Entropy wins, sometimes much faster than you’d expect.”
There is discussion of how SSDs are not archival at all; how floppy disk quality varied greatly between the 1980s, 1990s, and 2000s; how Linear Tape-Open, a format specifically designed for long-term tape storage, loses compatibility over successive generations; how the binder sleeves we put our CD-Rs and DVD-Rs in have allowed them to bend too much and stop being readable.
Knowing that hard drives will eventually fail is nothing new. Ars wrote about the five stages of hard drive death, including denial, back in 2005. Last year, backup company Backblaze shared failure data on specific drives, showing that drives that fail tend to fail within three years, that no drive was totally exempt, and that time does, generally, wear down all drives. Google’s server drive data showed in 2007 that HDD failure was mostly unpredictable, and that temperatures were not really the deciding factor.
So Iron Mountain’s admonition to music companies is yet another warning about something we’ve already heard. But it’s always good to get some new data about just how fragile a good archive really is.
A screenshot of Taylor Swift’s Kamala Harris Instagram post, captured on September 11, 2024.
On Tuesday night, Taylor Swift endorsed Vice President Kamala Harris for US President on Instagram, citing concerns over AI-generated deepfakes as a key motivator. The artist’s warning aligns with current trends in technology, especially in an era where AI synthesis models can easily create convincing fake images and videos.
“Recently I was made aware that AI of ‘me’ falsely endorsing Donald Trump’s presidential run was posted to his site,” she wrote in her Instagram post. “It really conjured up my fears around AI, and the dangers of spreading misinformation. It brought me to the conclusion that I need to be very transparent about my actual plans for this election as a voter. The simplest way to combat misinformation is with the truth.”
In August 2024, former President Donald Trump posted AI-generated images on Truth Social falsely suggesting Swift endorsed him, including a manipulated photo depicting Swift as Uncle Sam with text promoting Trump. The incident sparked Swift’s fears about the spread of misinformation through AI.
This isn’t the first time Swift and generative AI have appeared together in the news. In February, we reported that a flood of explicit AI-generated images of Swift originated from a 4chan message board where users took part in daily challenges to bypass AI image generator filters.
On Friday, Roblox announced plans to introduce an open source generative AI tool that will allow game creators to build 3D environments and objects using text prompts, reports MIT Tech Review. The feature, which is still under development, may streamline the process of creating game worlds on the popular online platform, potentially opening up more aspects of game creation to those without extensive 3D design skills.
Roblox has not announced a specific launch date for the new AI tool, which is based on what it calls a “3D foundational model.” The company shared a demo video of the tool where a user types, “create a race track,” then “make the scenery a desert,” and the AI model creates a corresponding model in the proper environment.
The system will also reportedly let users make modifications, such as changing the time of day or swapping out entire landscapes, and Roblox says the multimodal AI model will ultimately accept video and 3D prompts, not just text.
A video showing Roblox’s generative AI model in action.
The 3D environment generator is part of Roblox’s broader AI integration strategy. The company reportedly uses around 250 AI models across its platform, including one that monitors voice chat in real time to enforce content moderation, which is not always popular with players.
Next-token prediction in 3D
Roblox’s 3D foundational model approach involves a custom next-token prediction model—a foundation not unlike the large language models (LLMs) that power ChatGPT. Tokens are fragments of text data that LLMs use to process information. Roblox’s system “tokenizes” 3D blocks by treating each block as a numerical unit, which allows the AI model to predict the most likely next structural 3D element in a sequence. In aggregate, the technique can build entire objects or scenery.
Anupam Singh, vice president of AI and growth engineering at Roblox, told MIT Tech Review about the challenges in developing the technology. “Finding high-quality 3D information is difficult,” Singh said. “Even if you get all the data sets that you would think of, being able to predict the next cube requires it to have literally three dimensions, X, Y, and Z.”
According to Singh, lack of 3D training data can create glitches in the results, like a dog with too many legs. To get around this, Roblox is using a second AI model as a kind of visual moderator to catch the mistakes and reject them until the proper 3D element appears. Through iteration and trial and error, the first AI model can create the proper 3D structure.
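Taken together, Singh’s description amounts to sampling block tokens one at a time while a second model vetoes implausible continuations. Here is a toy Python sketch of that generate-and-check loop; the vocabulary, both model stand-ins, and the plausibility rule are invented for illustration:

import random

# Invented vocabulary of 3D "block tokens"; Roblox treats each block
# as a numerical unit so a model can predict the next one in sequence.
VOCAB = ["air", "asphalt", "sand", "barrier", "ramp"]

def predict_next_block(sequence):
    # Stand-in for the trained 3D foundation model: a real system would
    # return a learned probability distribution conditioned on `sequence`.
    return random.choice(VOCAB)

def looks_plausible(scene):
    # Stand-in for the second, checker model that rejects glitched
    # output (the "dog with too many legs" problem).
    return scene.count("barrier") <= 2

scene = []
for _ in range(16):  # build a 16-block sequence
    while True:
        candidate = scene + [predict_next_block(scene)]
        if looks_plausible(candidate):  # resample until the checker accepts
            scene = candidate
            break

print(scene)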
Notably, Roblox plans to open-source its 3D foundation model, allowing developers and even competitors to use and modify it. But it’s not just about giving back—open source can be a two-way street. Choosing an open source approach could also allow the company to utilize knowledge from AI developers if they contribute to the project and improve it over time.
The ongoing quest to capture gaming revenue
News of the new 3D foundational model arrived at the 10th annual Roblox Developers Conference in San Jose, California, where the company also announced an ambitious goal to capture 10 percent of global gaming content revenue through the Roblox ecosystem, and the introduction of “Party,” a new feature designed to facilitate easier group play among friends.
In March 2023, we detailed Roblox’s early foray into AI-powered game development tools, as revealed at the Game Developers Conference. The tools included a Code Assist beta for generating simple Lua functions from text descriptions, and a Material Generator for creating 2D surfaces with associated texture maps.
At the time, Roblox Studio head Stef Corazza described these as initial steps toward “democratizing” game creation, with plans for AI systems that are now coming to fruition. The 2023 tools focused on discrete tasks like code snippets and 2D textures, laying the groundwork for the more comprehensive 3D foundational model announced at this year’s Roblox Developers Conference.
The upcoming AI tool could potentially streamline content creation on the platform, possibly accelerating Roblox’s path toward its revenue goal. “We see a powerful future where Roblox experiences will have extensive generative AI capabilities to power real-time creation integrated with gameplay,” Roblox said in a statement. “We’ll provide these capabilities in a resource-efficient way, so we can make them available to everyone on the platform.”
“Hmm, no signal here. I’m trying to figure it out, but nothing comes to mind …” Credit: Getty Images
One issue in getting office buildings networked that you don’t typically face at home is concrete—and lots of it. Concrete walls are an average of 8 inches thick inside most commercial real estate.
Keeping a network running through them is not merely a matter of running cord. Not everybody has the knowledge or tools to punch through that kind of wall, and even if they do, you can’t just put a hole in something that might be load-bearing or part of a fire control system without imaging, permits, and contractors. Meanwhile, the radio bands that can penetrate these walls, like those used by 3G, are being phased out, and the bands that provide enough throughput for modern systems, like 5G, can’t make it through.
That’s what WaveCore, from Airvine Scientific, aims to fix, and I can’t help but find it fascinating after originally seeing it on The Register. The company had previously taken on lesser solid obstructions, like plaster and thick glass, with its WaveTunnel. Two WaveCore units, one on each side of a wall (or on different floors), can push through a stated 12 inches of concrete. In its in-house testing, Airvine reports pushing just under 4Gbps through 12 inches of garage concrete, and the signal can bend around corners, even 90 degrees. Your particular cement and aggregate combinations may vary, of course.
The WaveCore device, installed in a garage space during Airvine Scientific’s testing.
Concept drawing of how WaveCore punches through concrete walls (kind of). Credit: Airvine Scientific
The spec sheet shows that a 6 GHz radio is the part that, through “beam steering,” blasts through concrete, with a 2.4 GHz radio for control functions. Power comes over PoE or a barrel connector, and there’s RJ45 Ethernet at 1, 2.5, 5, and 10Gbps speeds.
6 GHz concrete fidelity (Con-Fi? Crete-Fi?) is just one of the slightly uncommon connections that may or may not be making their way into office spaces soon. LiFi, standardized as 802.11bb, is seeking to provide an intentionally limited scope to connectivity, whether for security restrictions or radio frequency safety. And Wi-Fi 7, certified earlier this year, aims to multiply data rates by bonding connections over the various bands already in place.
Researchers have discovered more than 280 malicious apps for Android that use optical character recognition to steal cryptocurrency wallet credentials from infected devices.
The apps masquerade as official ones from banks, government services, TV streaming services, and utilities. In fact, they scour infected phones for text messages, contacts, and all stored images and surreptitiously send them to remote servers controlled by the app developers. The apps are available from malicious sites and are distributed in phishing messages sent to targets. There’s no indication that any of the apps were available through Google Play.
A high level of sophistication
The most notable thing about the newly discovered malware campaign is that the threat actors behind it are employing optical character recognition software in an attempt to extract cryptocurrency wallet credentials shown in images stored on infected devices. Many wallets allow users to protect them with a mnemonic phrase, a series of random words that is easier for most people to remember than the jumble of characters in a private key. Words are also easier for humans to recognize in images.
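Those recovery phrases are valuable because a wallet’s keys are derived deterministically from them. In the widely used BIP-39 scheme, for example, the mnemonic is stretched into a binary seed with PBKDF2. A minimal Python sketch (the example phrase is not a real wallet):

import hashlib
import unicodedata

def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive a 64-byte wallet seed from a BIP-39 mnemonic phrase.

    BIP-39 specifies PBKDF2-HMAC-SHA512 with 2048 iterations and the
    salt "mnemonic" + passphrase. Anyone who reads the phrase off a
    screenshot can recompute this seed and every key derived from it.
    """
    m = unicodedata.normalize("NFKD", mnemonic)
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase)
    return hashlib.pbkdf2_hmac("sha512", m.encode(), salt.encode(), 2048)

# Example words only; never type a real recovery phrase into code.
seed = mnemonic_to_seed("legal winner thank year wave sausage worth useful")
print(seed.hex())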
SangRyol Ryu, a researcher at security firm McAfee, made the discovery after obtaining unauthorized access to the servers that received the data stolen by the malicious apps. That access was the result of weak security configurations made when the servers were deployed. With that, Ryu was able to read pages available to server administrators.
One page, displayed in the image below, was of particular interest. It showed a list of words near the top and, below it, a corresponding image taken from an infected phone. The words in the list matched the words shown in the image.
“Upon examining the page, it became clear that a primary goal of the attackers was to obtain the mnemonic recovery phrases for cryptocurrency wallets,” Ryu wrote. “This suggests a major emphasis on gaining entry to and possibly depleting the crypto assets of victims.”
Optical character recognition is the process of converting images of typed, handwritten, or printed text into machine-encoded text. OCR has existed for years and has grown increasingly common to transform characters captured in images into characters that can be read and manipulated by software.
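To illustrate how little effort that step takes, here is a sketch using pytesseract, an open source Python wrapper for the Tesseract OCR engine; the file name and the phrase filter are invented for illustration, not taken from the attackers’ code:

from PIL import Image            # pip install pillow pytesseract
import pytesseract               # also requires the tesseract binary

# Convert the pixels of a screenshot into machine-readable text.
text = pytesseract.image_to_string(Image.open("stolen_screenshot.png"))

# A crude filter for candidate recovery phrases: lines made up of
# 12 or 24 alphabetic words, the usual mnemonic lengths.
for line in text.splitlines():
    words = line.strip().split()
    if len(words) in (12, 24) and all(w.isalpha() for w in words):
        print("possible mnemonic:", line.strip())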
Ryu continued:
This threat utilizes Python and Javascript on the server-side to process the stolen data. Specifically, images are converted to text using optical character recognition (OCR) techniques, which are then organized and managed through an administrative panel. This process suggests a high level of sophistication in handling and utilizing the stolen information.
Python code for converting text shown in images to machine-readable text. Credit: McAfee
People who are concerned they may have installed one of the malicious apps should check the McAfee post for a list of associated websites and cryptographic hashes.
The malware has received multiple updates over time. Whereas it once used HTTP to communicate with control servers, it now connects through WebSockets, a mechanism that’s harder for security software to parse. WebSockets have the added benefit of being a more versatile channel.
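For a sense of what that channel looks like in practice, here is a minimal WebSocket client built on the open source Python websockets library (the URL is a placeholder):

import asyncio
import websockets  # pip install websockets

async def main():
    # One long-lived, framed connection replaces a series of discrete
    # HTTP requests, leaving fewer request/response boundaries for
    # inspection tools keyed to HTTP semantics to latch onto.
    async with websockets.connect("wss://example.com/socket") as ws:
        await ws.send("hello")
        reply = await ws.recv()
        print(reply)

asyncio.run(main())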
Developers have also updated the apps to better obfuscate their malicious functionality. Obfuscation methods include encoding strings inside the code so they can’t easily be read by humans, adding irrelevant code, and renaming functions and variables, all of which confuse analysts and make detection harder.
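String encoding of this kind is simple but effective against casual static analysis. A toy Python example (the hostname is made up):

import base64

# In the shipped binary, only the opaque blob appears, so a plain
# `strings` pass over the APK shows base64 noise instead of a C2 URL.
_blob = b"d3NzOi8vYzIuZXhhbXBsZS5jb20vc29ja2V0"

C2_URL = base64.b64decode(_blob).decode()  # "wss://c2.example.com/socket"
print(C2_URL)

While the malware is mostly restricted to South Korea, it has recently begun to spread within the UK.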
“This development is significant as it shows that the threat actors are expanding their focus both demographically and geographically,” Ryu wrote. “The move into the UK points to a deliberate attempt by the attackers to broaden their operations, likely aiming at new user groups with localized versions of the malware.”
The cost of renting cloud services using Nvidia’s leading artificial intelligence chips is lower in China than in the US, a sign that the advanced processors are easily reaching the Chinese market despite Washington’s export restrictions.
Four small-scale Chinese cloud providers charge local tech groups roughly $6 an hour to use a server with eight Nvidia A100 processors in a base configuration, companies and customers told the Financial Times. Small cloud vendors in the US charge about $10 an hour for the same setup.
The low prices, according to people in the AI and cloud industry, are an indication of plentiful supply of Nvidia chips in China and the circumvention of US measures designed to prevent access to cutting-edge technologies.
The A100 and H100, which is also readily available, are among Nvidia’s most powerful AI accelerators and are used to train the large language models that power AI applications. The Silicon Valley company has been banned from shipping the A100 to China since autumn 2022 and has never been allowed to sell the H100 in the country.
Chip resellers and tech startups said the products were relatively easy to procure. Inventories of the A100 and H100 are openly advertised for sale on Chinese social media and ecommerce sites such as Xiaohongshu and Alibaba’s Taobao, as well as in electronics markets, at slight markups to pricing abroad.
China’s larger cloud operators such as Alibaba and ByteDance, known for their reliability and security, charge double to quadruple the price of smaller local vendors for similar Nvidia A100 servers, according to pricing from the two operators and customers.
After discounts, both Chinese tech giants offer packages for prices comparable to Amazon Web Services, which charges $15 to $32 an hour. Alibaba and ByteDance did not respond to requests for comment.
“The big players have to think about compliance, so they are at a disadvantage. They don’t want to use smuggled chips,” said a Chinese startup founder. “Smaller vendors are less concerned.”
He estimated there were more than 100,000 Nvidia H100 processors in the country based on their widespread availability in the market. The Nvidia chips are each roughly the size of a book, making them relatively easy for smugglers to ferry across borders, undermining Washington’s efforts to limit China’s AI progress.
“We bought our H100s from a company that smuggled them in from Japan,” said a startup founder in the automation field who paid about 500,000 yuan ($70,000) for two cards this year. “They etched off the serial numbers.”
Nvidia said it sold its processors “primarily to well-known partners … who work with us to ensure that all sales comply with US export control rules”.
“Our pre-owned products are available through many second-hand channels,” the company added. “Although we cannot track products after they are sold, if we determine that any customer is violating US export controls, we will take appropriate action.”
The head of a small Chinese cloud vendor said low domestic costs helped offset the higher prices that providers paid for smuggled Nvidia processors. “Engineers are cheap, power is cheap, and competition is fierce,” he said.
In Shenzhen’s Huaqiangbei electronics market, salespeople speaking to the FT quoted the equivalent of $23,000–$30,000 for Nvidia’s H100 plug-in cards. Online sellers quote the equivalent of $31,000–$33,000.
Nvidia charges customers $20,000–$23,000 for H100 chips after recently cutting prices, according to Dylan Patel of SemiAnalysis.
One data center vendor in China said servers made by Silicon Valley’s Supermicro and fitted with eight H100 chips hit a peak selling price of 3.2 million yuan after the Biden administration tightened export restrictions in October. He said prices had since fallen to 2.5 million yuan as supply constraints eased.
Several people involved in the trade said merchants in Malaysia, Japan, and Indonesia often shipped Supermicro servers or Nvidia processors to Hong Kong before bringing them across the border to Shenzhen.
The black market trade depends on difficult-to-counter workarounds to Washington’s export regulations, experts said.
For example, while subsidiaries of Chinese companies are banned from buying advanced AI chips outside the country, their executives could establish new companies in countries such as Japan or Malaysia to make the purchases.
“It’s hard to completely enforce export controls beyond the US border,” said an American sanctions expert. “That’s why the regulations create obligations for the shipper to look into end users and [the] commerce [department] adds companies believed to be flouting the rules to the [banned] entity list.”
Additional reporting by Michael Acton in San Francisco.