Author name: Mike M.

How GM’s Super Cruise went from limo driving to lane changes and towing

The Unified Lateral Controller

The algorithm that handles all of that is called the Unified Lateral Controller. “So it’s a single software stack, but it is also modular to adapt with different vehicle configurations, with different driving scenarios, different maneuvers,” Zarringhalam said.

“Let’s imagine that you’re driving a Super Cruise vehicle, and you indicate to the left, or the system automatically decides to make a lane change to the left, and then, for whatever reason, the driver decides that they want to go back, mid-maneuver; they want to go back to the original lane. So you can just indicate to the opposite side, in this case, the right-hand side. Under the hood, in this scenario, everything is jumping. Our target trajectory is jumping from a left-lane maneuver to a right turn. The turn can be very sharp. There could be other objects that narrow the envelope of operation that you’re allowed to function in,” Zarringhalam said.

Again, that behavior has to be consistent and predictable, whether it’s below freezing or in the middle of a heatwave, and things like tire wear must also be taken into account. So must the presence of a trailer, which could be anything from a bike rack with wheels to a three-axle trailer.

“As soon as we detect that the trailer is attached, we run several real-time algorithms—trailer inertial parameters, trailer math, trailer configuration, even how many axles we have, and the control adapts itself to execute lane turning and keep both the vehicle and the trailer at the center of the road,” Zarringhalam said.

That’s done automatically without the driver having to input the information (obviating the problem of someone entering the wrong details), “and if you change the loading or the trailer configuration, even mid-drive—if you pull over, load more weight and continue driving on the same road with Super Cruise active—these learnings happen in a matter of seconds,” Zarringhalam said.
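GM has not published how these estimators work, so the following is only a generic illustration of how an online estimate can re-converge within a handful of samples after the load changes. It is a minimal sketch of scalar recursive least squares with a forgetting factor, assuming the simplified model force = mass × acceleration. The class name, the `feed` helper, the forgetting factor, and every number are hypothetical and are not taken from Super Cruise.

```python
class MassEstimator:
    """Scalar recursive least squares for the model force = mass * accel."""

    def __init__(self, forgetting=0.9, initial_mass=0.0, initial_cov=1e6):
        self.forgetting = forgetting  # values below 1.0 discount older samples
        self.mass = initial_mass      # current mass estimate, in kg
        self.cov = initial_cov        # large value = little trust in the initial guess

    def update(self, force_n, accel_ms2):
        """Fold in one (force, acceleration) sample and return the new estimate."""
        denom = self.forgetting + accel_ms2 * self.cov * accel_ms2
        gain = self.cov * accel_ms2 / denom
        # Correct the estimate by the prediction error, scaled by the gain.
        self.mass += gain * (force_n - accel_ms2 * self.mass)
        self.cov = (self.cov - gain * accel_ms2 * self.cov) / self.forgetting
        return self.mass


def feed(estimator, true_mass_kg, samples):
    """Feed noise-free synthetic samples for a given true mass (illustration only)."""
    accels = [0.5, 1.0, 1.5]  # made-up accelerations, in m/s^2
    estimate = estimator.mass
    for i in range(samples):
        a = accels[i % len(accels)]
        estimate = estimator.update(true_mass_kg * a, a)
    return estimate


estimator = MassEstimator()
print(round(feed(estimator, 2500.0, 200)))  # hypothetical combined mass before reloading
print(round(feed(estimator, 3200.0, 200)))  # estimate after the load changes mid-drive
```

The forgetting factor is the design choice that matters here: because older samples are discounted, the estimate tracks the new load instead of averaging it with the old one. A production system would also have to cope with sensor noise, road grade, and drag, none of which this sketch models.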

Skull long thought to be Cleopatra’s sister’s was actually a young boy

Scientists have demonstrated that an ancient human skull excavated from a tomb at Ephesus was not that of Arsinoë IV, half-sister to Cleopatra VII. Rather, it’s the skull of a young male between the ages of 11 and 14 from Italy or Sardinia, who may have suffered from one or more developmental disorders, according to a new paper published in the journal Scientific Reports. Arsinoë IV’s remains are thus still missing.

Arsinoë IV led a short but adventurous life. She was either the third or fourth daughter of Ptolemy XII, who left the throne to Cleopatra and his son, Ptolemy XIII, to rule together. Ptolemy XIII didn’t care for this decision and dethroned Cleopatra in a civil war, until Julius Caesar intervened to enforce their father’s original plan of co-rulership. As for Arsinoë, Caesar returned Cyprus to Egyptian rule and named her and her youngest brother (Ptolemy XIV) co-rulers. This time, it was Arsinoë who rebelled, taking command of the Egyptian army and declaring herself queen.

She was fairly successful at first in battling the Romans, conducting a siege against Alexandria and Cleopatra, until her disillusioned officers decided they’d had enough and secretly negotiated with Caesar to turn her over to him. Caesar agreed, and after a bit of public humiliation, he granted Arsinoë sanctuary in the temple of Artemis in Ephesus. She lived in relative peace for a few years, until Cleopatra and Mark Antony ordered her execution on the steps of the temple—a scandalous violation of the temple as a place of sanctuary. Historians disagree about Arsinoë’s age when she died: Estimates range from 22 to 27.

Archaeologists have been excavating the ancient city of Ephesus for more than a century. The Octagon was uncovered in 1904, and the burial chamber was opened in 1929. That’s where Joseph Keil found a skeleton in a sarcophagus filled with water, but for some reason, Keil only removed the cranium from the tomb before sealing it back up. He took the skull with him to Germany and declared it belonged to a likely female around 20 years old, although he provided no hard data to support that conclusion.

It was Hilke Thur of the Austrian Academy of Sciences who first speculated that the skull may have belonged to Arsinoë IV, despite the lack of an inscription (or even any grave goods) on the tomb where it was found. Old notes and photographs, as well as craniometry, served as the only evidence. The skull accompanied Keil to his new position at the University of Vienna, and there was one 1953 paper reporting on craniometric measurements, but after that, the skull languished in relative obscurity. Archaeologists at the University of Graz rediscovered the skull in Vienna in 2022. The rest of the skeleton remained buried until the chamber was reopened and explored further in the 1980s and 1990s, but it was no longer in the sarcophagus.

Sonos CEO behind disastrous app exits with $1.9 million severance

After an app update rollout that can best be described as disastrous, Sonos is seeking a new CEO. The company announced today that Patrick Spence, who had been CEO for eight years, is stepping down.

In its announcement, Sonos said its board of directors and Spence “agreed” on the decision while saying it was unrelated to the company’s fiscal Q1 2025 earnings, which it will report next month.

Spence joined Sonos as chief commercial officer in 2012 after leaving BlackBerry. During his tenure, Sonos branched into new categories, including portable speakers and spatial audio. But in May, Sonos issued an app update that broke basic and critical features. Sonos employees said the update was built on outdated code and infrastructure, impacting users’ ability to do things like access and manage local libraries, set sleep timers, and edit song queues and playlists.

The employees also said the app was rushed so that it could be ready in time for Sonos’ first wireless headphones, Ace. In July, following much public backlash, Spence apologized and promised regular updates until the new app was as good as the old app. But even today, users are still reporting problems with the software.

In August, Spence said Sonos would spend $20 million to $30 million “in the short term” to fix the app. Soon after, Sonos laid off 100 people. Sonos’ stock price has declined approximately 13 percent since the app update, Bloomberg noted. Sonos execs, including Spence, received a $72,000 bonus in 2023 but did not get bonuses for the fiscal year that ended on September 30.

Spence will receive a cash severance of $1,875,000, per SEC filings. He will also get $7,500 per month and serve as a Sonos board advisor until June, and his unvested shares will vest.

Tom Conrad, who has been on Sonos’ board since 2017, took the role of interim CEO today. Sonos plans to have a new CEO by February with the help of a third-party firm. In the meantime, Conrad will get $175,000 per month and receive $2.65 million in shares.

Viral ChatGPT-powered sentry gun gets shut down by OpenAI

OpenAI says it has cut off API access to an engineer whose video of a motorized sentry gun controlled by ChatGPT-powered commands has set off a viral firestorm of concerns about AI-powered weapons.

An engineer going by the handle sts_3d started posting videos of a motorized, auto-rotating swivel chair project in August. By November, that same assembly appeared to seamlessly morph into the basis for a sentry gun that could quickly rotate to arbitrary angles and activate a servo to fire precisely aimed projectiles (though only blanks and simulated lasers are shown being fired in his videos).

Earlier this week, though, sts_3d started getting wider attention for a new video showing the sentry gun’s integration with OpenAI’s real-time API. In the video, the gun uses that ChatGPT integration to aim and fire based on spoken commands from sts_3d and even responds in a chirpy voice afterward.

“If you need any other assistance, please let me know,” the ChatGPT-powered gun says after firing a volley at one point. “Good job, you saved us,” sts_3d responds, deadpan.

“I’m glad I could help!” ChatGPT intones happily.

In response to a comment request from Futurism, OpenAI said it had “proactively identified this violation of our policies and notified the developer to cease this activity ahead of receiving your inquiry. OpenAI’s Usage Policies prohibit the use of our services to develop or use weapons or to automate certain systems that can affect personal safety.”

Halt, intruder alert!

The “voice-powered killer AI robot angle” has garnered plenty of viral attention for sts_3d’s project in recent days. But the ChatGPT integration shown in his video doesn’t exactly reach Terminator levels of a terrifying killing machine. Here, ChatGPT instead ends up looking more like a fancy, overwrought voice-activated remote control for a legitimately impressive gun mount.

Disney, Fox, and WBD give up on controversial sports streaming app Venu

Although Fubo’s lawsuit against the joint venture (JV) appears to be settled, other rivals in sports television seemed intent on continuing to fight Venu.

In a January 9 letter (PDF) to US District Judge Margaret M. Garnett of the Southern District of New York, who granted Fubo’s preliminary injunction against Venu, Michael Hartman, general counsel and chief external affairs officer for DirecTV, wrote that Fubo’s settlement “does nothing to resolve the underlying antitrust violations at issue.” Hartman asked the court to maintain the preliminary injunction against the app’s launch.

“The preliminary injunction has protected consumers and distributors alike from the JV Defendant’s scheme to ‘capture demand,’ ‘suppress’ potentially competitive sports bundles, and impose consumer price hikes,” the letter says, adding that DirecTV would continue to explore its options regarding the JV “and other anticompetitive harms.”

Similarly, Pantelis Michalopoulos, counsel for EchoStar Corporation, which owns Dish, penned a letter (PDF) to Garnett on January 7, claiming the members of the JV “purchased their way out of their antitrust violation.” Michalopoulos added that the JV defendants “should not be able to pay their way into erasing the Court’s carefully reasoned decision” to temporarily block Venu’s launch.

In addition to Fubo, DirecTV, and Dish, ACA Connects (a trade association for small- to medium-sized telecommunication service providers) publicly expressed concerns about Venu. The NFL was also reportedly worried about the implications of the venture.

Now, the three giants behind Venu are throwing in the towel and abandoning an app that could have garnered a lot of subscribers tired of hopping around apps, channels, and subscriptions to watch all the sports content they wanted. But they’re also avoiding a lot of litigation and potential backlash in the process.

Meta kills diversity programs, claiming DEI has become “too charged”

Meta has reportedly ended diversity, equity, and inclusion (DEI) programs that influenced staff hiring and training, as well as vendor decisions, effective immediately.

According to an internal memo viewed by Axios and verified by Ars, Meta’s vice president of human resources, Janelle Gale, told Meta employees that the shift came because the “legal and policy landscape surrounding diversity, equity, and inclusion efforts in the United States is changing.”

It’s another move by Meta that some view as part of the company’s larger effort to align with the incoming Trump administration’s politics. In December, Donald Trump promised to crack down on DEI initiatives at companies and on college campuses, The Guardian reported.

Earlier this week, Meta cut its fact-checking program, which was introduced in 2016 after Trump’s first election to prevent misinformation from spreading. In a statement announcing Meta’s pivot to X’s Community Notes-like approach to fact-checking, Meta CEO Mark Zuckerberg claimed that fact-checkers were “too politically biased” and “destroyed trust” on Meta platforms like Facebook, Instagram, and Threads.

Trump has also long promised to renew his war on alleged social media censorship while in office. Meta faced backlash this week over leaked rule changes relaxing Meta’s hate speech policies, The Intercept reported, which Zuckerberg said were “out of touch with mainstream discourse.” Those changes included allowing anti-trans slurs previously banned, as well as permitting women to be called “property” and gay people to be called “mentally ill,” Mashable reported. In a statement, GLAAD said that rolling back safety guardrails risked turning Meta platforms into “unsafe landscapes filled with dangerous hate speech, violence, harassment, and misinformation” and alleged that Meta appeared to be willing to “normalize anti-LGBTQ hatred for profit.”

Ongoing attacks on Ivanti VPNs install a ton of sneaky, well-written malware

Networks protected by Ivanti VPNs are under active attack by well-resourced hackers who are exploiting a critical vulnerability that gives them complete control over the network-connected devices.

Hardware maker Ivanti disclosed the vulnerability, tracked as CVE-2025-0282, on Wednesday and warned that it was under active exploitation against some customers. The vulnerability, which is being exploited to allow hackers to execute malicious code with no authentication required, is present in the company’s Connect Secure VPN and its Policy Secure and ZTA Gateways. Ivanti released a security patch at the same time. It upgrades Connect Secure devices to version 22.7R2.5.

Well-written, multifaceted

According to Google-owned security provider Mandiant, the vulnerability has been actively exploited against “multiple compromised Ivanti Connect Secure appliances” since December, a month before the then-zero-day came to light. After exploiting the vulnerability, the attackers go on to install two never-before-seen malware packages, tracked under the names DRYHOOK and PHASEJAM, on some of the compromised devices.

PHASEJAM is a well-written and multifaceted bash shell script. It first installs a web shell that gives the remote hackers privileged control of devices. It then injects a function into the Connect Secure update mechanism that’s intended to simulate the upgrading process.

“If the ICS administrator attempts an upgrade, the function displays a visually convincing upgrade process that shows each of the steps along with various numbers of dots to mimic a running process,” Mandiant said. The company continued:

PHASEJAM injects a malicious function into the /home/perl/DSUpgrade.pm file named processUpgradeDisplay(). The functionality is intended to simulate an upgrading process that involves 13 steps, with each of those taking a predefined amount of time. If the ICS administrator attempts an upgrade, the function displays a visually convincing upgrade process that shows each of the steps along with various numbers of dots to mimic a running process. Further details are provided in the System Upgrade Persistence section.

The attackers are also using a previously seen piece of malware tracked as SPAWNANT on some devices. One of its functions is to disable an integrity checker tool (ICT) Ivanti has built into recent VPN versions that is designed to inspect device files for unauthorized additions. SPAWNANT does this by replacing the expected SHA256 cryptographic hash of a core file with the hash of that file after it has been infected. As a result, when the tool is run on compromised devices, it does not flag the infected file.
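To see why swapping a stored hash blinds this kind of check, consider a minimal sketch of a manifest-based integrity checker. This is not Ivanti's ICT; the function names, the file name, and the file contents are all hypothetical, and only Python's standard `hashlib` module is used.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of some file contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Record the expected hash of every known-good file."""
    return {name: sha256_hex(data) for name, data in files.items()}


def find_mismatches(files: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """List files whose current hash differs from the manifest entry."""
    return sorted(
        name for name, data in files.items()
        if manifest.get(name) != sha256_hex(data)
    )


# Hypothetical file name and contents, for illustration only.
files = {"core_module.bin": b"original contents"}
manifest = build_manifest(files)
print(find_mismatches(files, manifest))   # nothing to report on a clean system

files["core_module.bin"] = b"modified contents"
print(find_mismatches(files, manifest))   # the modified file is flagged

# If the manifest entry is overwritten to match the modified file,
# the same check no longer reports anything.
manifest["core_module.bin"] = sha256_hex(files["core_module.bin"])
print(find_mismatches(files, manifest))
```

The checker is only as trustworthy as its manifest. If an attacker with control of the device can rewrite the expected hashes, the comparison passes by construction, which is why manifests are commonly signed or verified from outside the device being checked.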

A taller, heavier, smarter version of SpaceX’s Starship is almost ready to fly


Starship will test its payload deployment mechanism on its seventh test flight.

SpaceX’s first second-generation Starship, known as Version 2 or Block 2, could launch as soon as January 13. Credit: SpaceX

An upsized version of SpaceX’s Starship mega-rocket rolled to the launch pad early Thursday in preparation for liftoff on a test flight next week.

The two-mile transfer moved the bullet-shaped spaceship one step closer to launch Monday from SpaceX’s Starbase test site in South Texas. The launch window opens at 5 pm EST (4 pm CST; 2200 UTC). This will be the seventh full-scale test flight of SpaceX’s Super Heavy booster and Starship spacecraft and the first of 2025.

In the coming days, SpaceX technicians will lift the ship on top of the Super Heavy booster already emplaced on the launch mount. Then, teams will complete the final tests and preparations for the countdown on Monday.

“The upcoming flight test will launch a new generation ship with significant upgrades, attempt Starship’s first payload deployment test, fly multiple reentry experiments geared towards ship catch and reuse, and launch and return the Super Heavy booster,” SpaceX officials wrote in a mission overview posted on the company’s website.

The mission Monday will repeat many of the maneuvers SpaceX demonstrated on the last two Starship test flights. The company will again attempt to return the Super Heavy booster to the launch site and attempt to catch it with two mechanical arms, or “chopsticks,” on the launch tower approximately seven minutes after liftoff.

SpaceX accomplished this feat on the fifth Starship test flight in October but aborted a catch attempt on a November flight because of damaged sensors on the tower chopsticks. The booster, which remained healthy, diverted to a controlled splashdown offshore in the Gulf of Mexico.

SpaceX’s next Starship prototype, Ship 33, emerges from its assembly building at Starbase, Texas, early Thursday morning. Credit: SpaceX/Elon Musk via X

For the next flight, SpaceX added protections to the sensors on the tower and will test radar instruments on the chopsticks to provide more accurate ranging measurements for returning vehicles. These modifications should improve the odds of a successful catch of the Super Heavy booster and of Starship on future missions.

In another first, one of the 33 Raptor engines that will fly on this Super Heavy booster—designated Booster 14 in SpaceX’s fleet—was recovered from the booster that launched and returned to Starbase in October. For SpaceX, this is a step toward eventually flying the entire rocket repeatedly. The Super Heavy booster and Starship spacecraft are designed for full reusability.

After separation of the booster stage, the Starship upper stage will ignite six engines to accelerate to nearly orbital velocity, attaining enough energy to fly halfway around the world before gravity pulls it back into the atmosphere. As on the past three test flights, SpaceX will guide Starship toward a controlled reentry and splashdown in the Indian Ocean northwest of Australia around one hour after liftoff.

New ship, new goals

The most significant changes engineers will test next week are on the ship, or upper stage, of SpaceX’s enormous rocket. The most obvious difference on Starship Version 2, or Block 2, is with the vehicle’s forward flaps. Engineers redesigned the flaps, reducing their size and repositioning them closer to the tip of the ship’s nose to better protect them from the scorching heat of reentry. Cameras onboard Starship showed heat damage to the flaps during reentry on test flights last year.

SpaceX is also developing an upgraded Super Heavy booster. The next version will produce more thrust and will be slightly taller than the current Super Heavy, but for the upcoming test flight, SpaceX will still use the first-generation booster design.

Starship Block 2 has smaller flaps than previous ships. The flaps are located in a more leeward position to protect them from the heat of reentry. Credit: SpaceX

For next week’s flight, Super Heavy and Starship combined will hold more than 10.5 million pounds of fuel and oxidizer. The ship’s propellant tanks have 25 percent more volume than previous iterations of the vehicle, and the payload compartment, which contains 10 mock-ups of Starlink Internet satellites on this launch, is somewhat smaller. Put together, the changes add nearly 6 feet (1.8 meters) to the rocket’s height, bringing the full stack to approximately 404 feet (123.1 meters).

This means SpaceX will break its own record for launching the largest and most powerful rocket ever built. And the company will do it again with the even larger Starship Version 3, which SpaceX says will have nine upper stage engines, instead of six, and will deliver up to 440,000 pounds (200 metric tons) of cargo to low-Earth orbit.

Other changes debuting with Starship Version 2 next week include:

• Vacuum jacketing of propellant feedlines

• A new fuel feedline system for the ship’s Raptor vacuum engines

• An improved propulsion avionics module controlling vehicle valves and reading sensors

• Redesigned inertial navigation and star tracking sensors

• Integrated smart batteries and power units to distribute 2.7 megawatts of power across the ship

• An increase to more than 30 cameras onboard the vehicle

Laying the foundation

The enhanced avionics system will support future missions to prove SpaceX’s ability to refuel Starships in orbit and return the ship to the launch site. For example, SpaceX will fly a more powerful flight computer and new antennas that integrate connectivity with the Starlink Internet constellation, GPS navigation satellites, and backup functions for traditional radio communication links. With Starlink, SpaceX said Starship can stream more than 120 Mbps of real-time high-definition video and telemetry in every phase of flight.

These changes “all add additional vehicle performance and the ability to fly longer missions,” SpaceX said. “The ship’s heat shield will also use the latest generation tiles and includes a backup layer to protect from missing or damaged tiles.”

Somewhere over the Atlantic Ocean, a little more than 17 minutes into the flight, Starship will deploy 10 dummy payloads similar in size and weight to next-generation Starlink satellites. The mock-ups will soar around the world on a suborbital trajectory, just like Starship, and reenter over the unpopulated Indian Ocean. Future Starship flights will launch real next-gen Starlink satellites to add capacity to the Starlink broadband network, but they’re too big and too heavy to launch on SpaceX’s smaller Falcon 9 rocket.

SpaceX will again reignite one of the ship’s Raptor engines in the vacuum of space, repeating a successful test achieved on Flight 6 in November. The engine restart capability is important for several reasons. It gives the ship the ability to maneuver itself out of low-Earth orbit for reentry (not a concern for Starship’s suborbital tests), and will allow the vehicle to propel itself to higher orbits, the Moon, or Mars once SpaceX masters the technology for orbital refueling.

Artist’s illustration of Starship on the surface of the Moon. Credit: SpaceX

NASA has contracts with SpaceX to build a derivative of Starship to ferry astronauts to and from the surface of the Moon for the agency’s Artemis program. The NASA program manager overseeing SpaceX’s lunar lander contract, Lisa Watson-Morgan, said she was pleased with the results of the in-space engine restart demo last year.

“The whole path to the Moon, as we are getting ready to land on the Moon, we’ll perform a series of maneuvers, and the Raptors will have an environment that is very, very cold,” Watson-Morgan told Ars in a recent interview. “To that, it’s going to be important that they’re able to relight for landing purposes. So that was a great first step towards that.

“In addition, after we land, clearly, the Raptors will be off, and it will get very cold, and they will have to relight in a cold environment (to launch the crews off the lunar surface),” she said. “So that’s why that step was critical for the Human Landing System and NASA’s return to the Moon.”

“The biggest technology challenge remaining”

SpaceX continues to experiment with Starship’s heat shield, which the company’s founder and CEO, Elon Musk, has described as “the biggest technology challenge remaining with Starship.” In order for SpaceX to achieve its lofty goal of launching Starships multiple times per day, the heat shield needs to be fully and immediately reusable.

While the last three ships have softly splashed down in the Indian Ocean, some of their heat-absorbing tiles stripped away from the vehicle during reentry, when it’s exposed to temperatures up to 2,600° Fahrenheit (1,430° Celsius).

Engineers removed tiles from some areas of the ship for next week’s test flight in order to “stress-test” vulnerable parts of the vehicle. They also smoothed and tapered the edge of the tile line, where the ceramic heat shield gives way to the ship’s stainless steel skin, to address “hot spots” observed during reentry on the most recent test flight.

“Multiple metallic tile options, including one with active cooling, will test alternative materials for protecting Starship during reentry,” SpaceX said.

SpaceX is also flying rudimentary catch fittings on Starship to test their thermal performance on reentry. The ship will fly a more demanding trajectory during descent to probe the structural limits of the redesigned flaps at the point of maximum entry dynamic pressure, according to SpaceX.

All told, SpaceX’s inclusion of a satellite deployment demo and ship upgrades on next week’s test flight will lay the foundation for future missions, perhaps in the next few months, to take the next great leap in Starship development.

In comments following the last Starship test flight in November, SpaceX founder and CEO Elon Musk posted on X that the company could try to return the ship to the launch site for a catch (something that would require the vehicle to complete at least one full orbit of Earth) as soon as the flight following Monday’s mission.

“We will do one more ocean landing of the ship,” Musk posted. “If that goes well, then SpaceX will attempt to catch the ship with the tower.”

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Of course Atari’s new handheld includes a trackball, spinner, and numpad

The $50 GameStation Gamepad. Credit: My Arcade

This year, My Arcade seems ready to go all in on the Atari GameStation branding. Beyond the GameStation Go, the company announced a $50 wireless GameStation Gamepad, a $70 GameStation Arcade Stick, and a $250 GameStation Mega tabletop arcade cabinet (with a 10.1-inch display). All four GameStation products feature a trackball, spinner, and number pad for maximum control authenticity, as well as helpful accent lighting that highlights which controls are active on a per-game basis—handy for younger gamers who might be overwhelmed by all the different control options.

In a hands-on video from CES, YouTuber GenXGrownUp shows off a preliminary GameStation Go game list, including the usual mix of well over 100 Atari 2600/5200/7800 and classic Atari arcade games you might expect from this kind of retro product (though it’s almost criminal not to see Marble Madness listed among the trackball-supported games). And despite the Atari name, the game selection on hand also includes many licensed NES and Super NES era titles from Jaleco (Bases Loaded among them), modern retro-styled titles from Piko Interactive, themed virtual pinball tables from Atari’s Balls of Steel line, and even Namco’s Pac-Man (why not?).

Atari’s modernized Centipede Recharged is also included in the game lineup, and GenXGrownUp reports that more Recharged games will be included with downloadable firmware updates after launch (which he says is “more than six months away”). Players will also seemingly be able to update the firmware through an SD card slot atop the GameStation Go, though it’s unclear whether you’ll be able to load your own ROMs in the same way (at least officially).

Despite including a numpad like the Intellivision controller, the GameStation Go doesn’t currently include any games from Atari’s recently purchased Intellivision library. But GenXGrownUp says including those titles—alongside Atari Lynx and Jaguar games—is not “off the table yet” for the final release.

We can only hope that the GameStation line will show a pent-up demand for these esoteric retro control options, leading to similar modular options for the Nintendo Switch or its coming successor. How about it, Nintendo?

It’s remarkably easy to inject new medical misinformation into LLMs


Changing just 0.001% of inputs to misinformation makes the AI less accurate.

It’s pretty easy to see the problem here: The Internet is brimming with misinformation, and most large language models are trained on a massive body of text obtained from the Internet.

Ideally, having substantially higher volumes of accurate information might overwhelm the lies. But is that really the case? A new study by researchers at New York University examines how much medical misinformation can be included in a large language model (LLM) training set before it spits out inaccurate answers. While the study doesn’t identify a lower bound, it does show that by the time misinformation accounts for 0.001 percent of the training data, the resulting LLM is compromised.

While the paper is focused on the intentional “poisoning” of an LLM during training, it also has implications for the body of misinformation that’s already online and part of the training set for existing LLMs, as well as the persistence of out-of-date information in validated medical databases.

Sampling poison

Data poisoning is a relatively simple concept. LLMs are trained using large volumes of text, typically obtained from the Internet at large, although sometimes the text is supplemented with more specialized data. By injecting specific information into this training set, it’s possible to get the resulting LLM to treat that information as a fact when it’s put to use. This can be used for biasing the answers returned.

This doesn’t even require access to the LLM itself; it simply requires placing the desired information somewhere where it will be picked up and incorporated into the training data. And that can be as simple as placing a document on the web. As one manuscript on the topic suggested, “a pharmaceutical company wants to push a particular drug for all kinds of pain which will only need to release a few targeted documents in [the] web.”

Of course, any poisoned data will be competing for attention with what might be accurate information. So, the ability to poison an LLM might depend on the topic. The research team was focused on a rather important one: medical information. This will show up in general-purpose LLMs, such as ones used for searching for information on the Internet, which will end up being used for obtaining medical information. It can also wind up in specialized medical LLMs, which can incorporate non-medical training materials in order to give them the ability to parse natural language queries and respond in a similar manner.

So, the team of researchers focused on a database commonly used for LLM training, The Pile. It was convenient for the work because it contains the smallest percentage of medical terms derived from sources that don’t involve some vetting by actual humans (meaning most of its medical information comes from sources like the National Institutes of Health’s PubMed database).

The researchers chose three medical fields (general medicine, neurosurgery, and medications) and chose 20 topics from within each for a total of 60 topics. Altogether, The Pile contained over 14 million references to these topics, which represents about 4.5 percent of all the documents within it. Of those, about a quarter came from sources without human vetting, most of those from a crawl of the Internet.

The researchers then set out to poison The Pile.

Finding the floor

The researchers used an LLM, GPT 3.5, to generate “high quality” medical misinformation. While this model has safeguards that should prevent it from producing medical misinformation, the researchers found it would happily do so if given the correct prompts (an LLM issue for a different article). The resulting articles could then be inserted into The Pile. Modified versions of The Pile were generated where either 0.5 or 1 percent of the relevant information on one of the three fields was swapped out for misinformation; these were then used to train LLMs.

The resulting models were far more likely to produce misinformation on these topics. But the misinformation also impacted other medical topics. “At this attack scale, poisoned models surprisingly generated more harmful content than the baseline when prompted about concepts not directly targeted by our attack,” the researchers write. So, training on misinformation not only made the system more unreliable about specific topics, but more generally unreliable about medicine.

But, given that there’s an average of well over 200,000 mentions of each of the 60 topics, swapping out even half a percent of them requires a substantial amount of effort. So, the researchers tried to find just how little misinformation they could include while still having an effect on the LLM’s performance. Unfortunately, this didn’t really work out: even at the lowest levels they tested, the misinformation still had an effect.
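As a rough back-of-the-envelope check on that effort, the sketch below uses only the figures quoted above (14 million references, 60 topics). The variable and function names are illustrative, and treating each reference as one item to swap is a simplifying assumption, not the researchers' method.

```python
# Figures quoted in the article; everything else is derived arithmetic.
total_references = 14_000_000  # "over 14 million references" (a lower bound)
num_topics = 60

# Average mentions per topic, assuming references are spread evenly.
mentions_per_topic = total_references / num_topics


def swaps_needed(fraction: float) -> int:
    """Mentions of a single topic to replace at a given poisoning fraction."""
    return round(mentions_per_topic * fraction)


for pct in (1.0, 0.5):
    print(f"{pct}% of one topic is about {swaps_needed(pct / 100):,} mentions")
```

Under those assumptions, even the smaller of the two fractions means rewriting more than a thousand mentions per topic.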

Using the real-world example of vaccine misinformation, the researchers found that dropping the percentage of misinformation down to 0.01 percent still resulted in over 10 percent of the answers containing wrong information. Going for 0.001 percent still led to over 7 percent of the answers being harmful.

“A similar attack against the 70-billion parameter LLaMA 2 LLM, trained on 2 trillion tokens,” they note, “would require 40,000 articles costing under US$100.00 to generate.” The “articles” themselves could just be run-of-the-mill webpages. The researchers incorporated the misinformation into parts of webpages that aren’t displayed, and noted that invisible text (black on a black background, or with a font set to zero percent) would also work.
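The numbers in that quote can be unpacked with a little arithmetic. The sketch below assumes the “similar attack” refers to the 0.001 percent poisoning level and that the percentage applies to training tokens; the per-article figures are derived, not stated in the article.

```python
# Figures taken from the quoted passage.
training_tokens = 2_000_000_000_000  # LLaMA 2: 2 trillion tokens
num_articles = 40_000
max_cost_usd = 100.00

# Assumption: 0.001 percent of tokens, i.e. 1 token in every 100,000.
POISON_ONE_IN = 100_000

poison_tokens = training_tokens // POISON_ONE_IN    # tokens of misinformation
tokens_per_article = poison_tokens // num_articles  # implied article length
cost_per_article = max_cost_usd / num_articles      # upper bound, in dollars

print(f"Poisoned tokens:    {poison_tokens:,}")
print(f"Tokens per article: {tokens_per_article:,}")
print(f"Cost per article:   under ${cost_per_article:.4f}")
```

On these assumptions, each “article” would be only a few hundred tokens long and cost a fraction of a cent to generate.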

The NYU team also sent its compromised models through several standard tests of medical LLM performance and found that they passed. “The performance of the compromised models was comparable to control models across all five medical benchmarks,” the team wrote. So there’s no easy way to detect the poisoning.

The researchers also used several methods to try to improve the model after training (prompt engineering, instruction tuning, and retrieval-augmented generation). None of these improved matters.

Existing misinformation

Not all is hopeless. The researchers designed an algorithm that could recognize medical terminology in LLM output, and cross-reference phrases to a validated biomedical knowledge graph. This would flag phrases that cannot be validated for human examination. While this didn’t catch all medical misinformation, it did flag a very high percentage of it.
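The general idea of cross-referencing output against a validated knowledge graph can be illustrated with a toy sketch. This is not the researchers' algorithm: the `Triple` type, the tiny `VALIDATED` set, and the exact-match lookup are all hypothetical stand-ins, and the step that extracts medical phrases from LLM output is assumed to have already happened.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# Toy stand-in for a validated biomedical knowledge graph (hypothetical).
VALIDATED = {
    Triple("metformin", "treats", "type 2 diabetes"),
    Triple("ibuprofen", "treats", "pain"),
}


def normalize(t: Triple) -> Triple:
    """Lowercase and trim each part so matching ignores case and spacing."""
    return Triple(*(part.strip().lower() for part in t))


def flag_unverified(triples: list[Triple]) -> list[Triple]:
    """Return the triples that have no match in the validated graph."""
    return [t for t in triples if normalize(t) not in VALIDATED]


# Phrases assumed to have been extracted from an LLM's answer.
extracted = [
    Triple("Metformin", "treats", "Type 2 Diabetes"),
    Triple("Ibuprofen", "treats", "bacterial infection"),
]
flagged = flag_unverified(extracted)
for t in flagged:
    print("Needs human review:", t)
```

A real system would need entity recognition and fuzzier matching than an exact set lookup. An exact lookup also flags any true claim missing from the graph, so flagged phrases go to a human reviewer rather than being rejected outright.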

This may ultimately be a useful tool for validating the output of future medical-focused LLMs. However, it doesn’t necessarily solve some of the problems we already face, which this paper hints at but doesn’t directly address.

The first of these is that most people who aren’t medical specialists will tend to get their information from generalist LLMs, rather than ones that will be subjected to tests for medical accuracy. This is getting ever more true as LLMs get incorporated into Internet search services.

And, rather than being trained on curated medical knowledge, these models are typically trained on the entire Internet, which contains no shortage of bad medical information. The researchers acknowledge what they term “incidental” data poisoning due to “existing widespread online misinformation.” But a lot of that “incidental” information was generally produced intentionally, as part of a medical scam or to further a political agenda. Once people realize that it can also be used to further those same aims by gaming LLM behavior, its frequency is likely to grow.

Finally, the team notes that even the best human-curated data sources, like PubMed, also suffer from a misinformation problem. The medical research literature is filled with promising-looking ideas that never panned out, and out-of-date treatments and tests that have been replaced by approaches more solidly based on evidence. This doesn’t even have to involve discredited treatments from decades ago: we were able to watch the use of chloroquine for COVID-19 go from promising anecdotal reports to thorough debunking via large trials in just a couple of years.

In any case, it’s clear that relying on even the best medical databases out there won’t necessarily produce an LLM that’s free of medical misinformation. Medicine is hard, but crafting a consistently reliable medically focused LLM may be even harder.

Nature Medicine, 2025. DOI: 10.1038/s41591-024-03445-1

John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.

after-embarrassing-blunder,-at&t-promises-bill-credits-for-future-outages

After embarrassing blunder, AT&T promises bill credits for future outages

“All voice and 5G data services for AT&T wireless customers were unavailable, affecting more than 125 million devices, blocking more than 92 million voice calls, and preventing more than 25,000 calls to 911 call centers,” the Federal Communications Commission said in a report after a months-long investigation into the incident.

The FCC report said the nationwide outage began three minutes after “AT&T Mobility implemented a network change with an equipment configuration error.” This error caused the AT&T network “to enter ‘protect mode’ to prevent impact to other services, disconnecting all devices from the network.”

The FCC found various problems in AT&T’s processes that increased the likelihood of an outage and made recovery more difficult than it should have been. The agency described “a lack of adherence to AT&T Mobility’s internal procedures, a lack of peer review, a failure to adequately test after installation, inadequate laboratory testing, insufficient safeguards and controls to ensure approval of changes affecting the core network, a lack of controls to mitigate the effects of the outage once it began, and a variety of system issues that prolonged the outage once the configuration error had been remedied.”

AT&T said it implemented changes to prevent the same problem from happening again. The company could face punishment, but it’s less likely to happen under Trump’s pick to chair the FCC, Brendan Carr, who is taking over soon. The Biden-era FCC compelled Verizon Wireless to pay a $1,050,000 fine and implement a compliance plan because of a December 2022 outage in six states that lasted one hour and 44 minutes.

An AT&T executive told Reuters that the company has been trying to regain customers’ trust over the past few years with better offers and product improvements. “Four years ago, we were losing share in the industry for a significant period of time… we knew we had lost our customers’ trust,” Reuters quoted AT&T Executive VP Jenifer Robertson as saying in an article today.
