Author name: Shannon Garcia

Changing one gene can restore some tissue regeneration to mice

Regeneration is a trick many animals, including lizards, starfish, and octopuses, have mastered. Axolotls, a salamander species originating in Mexico, can regrow pretty much everything, from severed limbs to eyes and parts of the brain to the spinal cord. Mammals, though, have mostly lost this ability somewhere along their evolutionary path. Regeneration persisted, in a limited number of tissues, in just a few mammalian species such as rabbits and goats.

“We were trying to learn how certain animals lost their regeneration capacity during evolution and then put back the responsible gene or pathway to reactivate the regeneration program,” says Wei Wang, a researcher at the National Institute of Biological Sciences in Beijing. Wang’s team has found one of those inactive regeneration genes, activated it, and brought back a limited regeneration ability to mice that did not have it before.

Of mice and bunnies

Wang and his colleagues’ idea was a comparative study of how the wound-healing process works in regenerating and non-regenerating mammalian species. They chose rabbits as their regenerating mammals and mice as the non-regenerating species. As the reference organ, the team picked the ear pinna. “We wanted a relatively simple structure that was easy to observe and yet composed of many different cell types,” Wang says. The test involved punching holes in the ear pinna of rabbits and mice and tracking the wound-repairing process.

The healing process began in the same way in rabbits and mice. Within the first few days after the injury, a blastema—a mass of heterogeneous cells—formed at the wound site. “Both rabbits and mice will heal the wounds after a few days,” Wang explains. “But between the 10th and 15th day, you will see the major difference.” In this timeframe, the earhole in rabbits started to become smaller. There were outgrowths above the blastema—the animals were producing more tissue. In mice, on the other hand, the healing process halted completely, leaving a hole in the ear.

RFK Jr.’s CDC panel ditches some flu shots based on anti-vaccine junk data


Flu shots with thimerosal abandoned, despite decades of data showing they’re safe.

Dr. Martin Kulldorff, chair of the CDC’s Advisory Committee on Immunization Practices, during the panel’s first meeting on June 25, 2025. Credit: Getty | Bloomberg

The vaccine panel hand-selected by health secretary and anti-vaccine advocate Robert F. Kennedy Jr. on Thursday voted overwhelmingly to drop federal recommendations for seasonal flu shots that contain the ethylmercury-containing preservative thimerosal. The panel did so after hearing a misleading and cherry-picked presentation from an anti-vaccine activist.

There is extensive data from the last quarter century proving that the antiseptic preservative is safe, with no harms identified beyond slight soreness at the injection site, but none of that data was presented during today’s meeting.

The significance of the vote is unclear for now. The vast majority of seasonal influenza vaccines currently used in the US—about 96 percent of flu shots in 2024–2025—do not contain thimerosal. The preservative is only included in multi-dose vials of seasonal flu vaccines, where it prevents the growth of bacteria and fungi potentially introduced as doses are withdrawn.

However, thimerosal is more common elsewhere in the world for various multi-dose vaccine vials, which are cheaper than the single-dose vials more commonly used in the US. If other countries follow the US’s lead and abandon thimerosal, it could increase the cost of vaccines in other countries and, in turn, lead to fewer vaccinations.

Broken process

However, it remains unclear what impact today’s vote will have—both in the US and abroad. Normally, before voting on any significant changes to vaccine recommendations from the Centers for Disease Control and Prevention, the committee that met today—the CDC’s Advisory Committee on Immunization Practices (ACIP)—would go through an exhaustive process. That includes thoroughly reviewing and discussing the extensive safety and efficacy data of the vaccines, the balance of their benefits and harms, equity considerations, and the feasibility and resource implications of their removal.

But, instead, the committee heard a single presentation given by anti-vaccine activist Lyn Redwood, a former president of Children’s Health Defense, the anti-vaccine organization founded by Kennedy.

Thimerosal has long been a target of anti-vaccine activists like Redwood, who hold fast to the false and thoroughly debunked claim that vaccines—particularly thimerosal-containing vaccines—cause autism and neurological disorders. Her presentation today was a smorgasbord of anti-vaccine talking points against thimerosal, drawing on old and fringe studies she claimed prove that thimerosal is an ineffective preservative, kills cells in petri dishes, and can be found in the brains of baby monkeys after it has been injected into them. The presentation did not appear to have gone through any vetting by the CDC, and an earlier version contained a reference to a study that does not exist.

Yesterday, CBS News reported that the Centers for Disease Control and Prevention is hiring Redwood to oversee vaccine safety. In response, Sen. Patty Murray (D-Wash.) called Redwood an “extremist,” and urged the White House to immediately reverse the decision. “We cannot allow a few truly deranged individuals to distort the plain truth and facts around vaccines so badly,” Murray said in a statement.

CDC scientists censored

Prior to the meeting, CDC scientists posted a background briefing document on thimerosal. It contained summaries of around two dozen studies that all support the safety of thimerosal and/or find no association with autism or neurological disorders. It also explained how, in 1999, health experts and agencies made plans to remove thimerosal from childhood vaccines out of an abundance of caution, over concern that it was adding to cumulative exposures that could hypothetically become toxic—at high doses, thimerosal can be dangerous. By 2001, it had been removed from every childhood vaccine in the US, and it remains absent to this day. But since then, studies have found thimerosal to be perfectly safe in vaccines. All the studies listed by the CDC in support of thimerosal were published after 2001.

The document also contained a list of nearly two dozen studies purporting to find a link to autism, but these were described by the CDC as having “significant methodological limitations.” The Institute of Medicine also called them “uninterpretable, and therefore, noncontributory with respect to causality.” Every single one of the studies was authored by the anti-vaccine father-and-son duo Mark and David Geier.

In March, it came to light that Kennedy had hired David Geier to the US health department to continue trying to prove a link between autism and vaccines. He is now working on the issue.

The CDC’s thimerosal document was removed from the ACIP’s meeting documents prior to the meeting. Robert Malone, one of the new ACIP members who holds anti-vaccine views, said during the meeting that it was taken down because it “was not authorized by the Office of the Secretary [Kennedy].” You can read it here.

Lone voice

In the meeting today, Kennedy’s hand-selected ACIP members did not ask Redwood any questions about the data or arguments she made against thimerosal. Nearly all of them readily accepted that thimerosal should be removed entirely. The only person to push back was Cody Meissner, a pediatric professor at Dartmouth’s Geisel School of Medicine who has served on ACIP in the past—arguably the most qualified and reasonable member of the new lineup.

“I’m not quite sure how to respond to this presentation,” he said after Redwood finished her slides. “This is an old issue that has been addressed in the past. … I guess one of the most important [things] to remember is that thimerosal is metabolized into ethylmercury and thiosalicylate. It’s not metabolized into methylmercury, which is in fish and shellfish. Ethylmercury is excreted much more quickly from the body. It is not associated with the high neurotoxicity that methylmercury is,” he explained.

Meissner scoffed at the committee even spending time on it. “So, of all the issues that I think we, ACIP, needs to focus on, this is not a big issue. … no study has ever indicated any harm from thimerosal. It’s been used in vaccines … since before World War II.”

But he did express concern that thimerosal could be removed from vaccines used globally.

“The recommendations the ACIP makes are followed among many countries around the world,” he said. “And removing thimerosal from all vaccines that are used in other countries, for example, is going to reduce access to these vaccines.”

Anti-vaccine agenda

In the end, the seven-member panel voted in favor of recommending only those seasonal flu vaccines that do not contain thimerosal. There were three separate votes, one each covering children, pregnant women, and all adults, but all with the same outcome: five ‘yes’ votes, one ‘no’ vote (Meissner), and one abstention from anti-vaccine activist and nurse Vicky Pebsworth. After the vote, Pebsworth clarified that she did not support the use of thimerosal in vaccines but had a quibble with how the voting questions were written.

Prior to the vote, ACIP Chair Martin Kulldorff gave a brief presentation on the MMRV vaccine (measles, mumps, rubella, and varicella/chickenpox). He previewed a proposed recommendation to vote on in a future meeting that would remove the CDC’s recommendation for that vaccine as well.

Beth is Ars Technica’s Senior Health Reporter. Beth has a Ph.D. in microbiology from the University of North Carolina at Chapel Hill and attended the Science Communication program at the University of California, Santa Cruz. She specializes in covering infectious diseases, public health, and microbes.

Judge: Pirate libraries may have profited from Meta torrenting 80TB of books

It could certainly look worse for Meta if authors manage to present evidence supporting the second way that torrenting could be relevant to the case, Chhabria suggested.

“Meta downloading copyrighted material from shadow libraries” would also be relevant to the character of the use, “if it benefitted those who created the libraries and thus supported and perpetuated their unauthorized copying and distribution of copyrighted works,” Chhabria wrote.

Counting potential strikes against Meta, Chhabria pointed out that the “vast majority of cases” involving “this sort of peer-to-peer file-sharing” are found to “constitute copyright infringement.” And it likely doesn’t help Meta’s case that “some of the libraries Meta used have themselves been found liable for infringement.”

However, Meta may overcome this argument, too, since book authors “have not submitted any evidence” that Meta’s downloading may be “propping up” or financially benefiting pirate libraries.

Finally, Chhabria noted that the “last issue relating to the character of Meta’s use” of books with regard to its torrenting is “the relationship between Meta’s downloading of the plaintiffs’ books and Meta’s use of the books to train Llama.”

Authors had tried to argue that these elements were distinct. But Chhabria said there’s no separating the fact that Meta downloaded the books to serve the “highly transformative” purpose of training Llama.

“Because Meta’s ultimate use of the plaintiffs’ books was transformative, so too was Meta’s downloading of those books,” Chhabria wrote.

AI training rulings may get more authors paid

Authors only learned of Meta’s torrenting through discovery in the lawsuit, and because of that, Chhabria noted that “the record on Meta’s alleged distribution is incomplete.”

It’s possible that authors may be able to show evidence that Meta “contributed to the BitTorrent network” by providing significant computing power that could’ve meaningfully assisted shadow libraries, Chhabria said in a footnote.

Testing ancient Paleolithic migration with a replica canoe

(Left) GPS tracking and modeling of ocean currents toward the end of the experimental voyage. (Right) The team on the water around the time of the left image. Credit: Kaifu et al., 2025/CC-By-ND

At the 30-hour mark, the captain ordered the entire crew to rest, letting the dugout drift freely for a while, which fortunately brought them closer to Yonaguni Island. At hour 40, the island’s silhouette was visible, and over the next five hours, the crew was able to navigate the strong tidal flow along the coast until they reached their landing site: Nama Beach. The experimental voyage was thus a success, and numerical simulations augmented it by demonstrating that the boat could make similar voyages from different departure points across both modern and late-Pleistocene oceans.

Granted, it was not possible to recreate Paleolithic conditions perfectly on a modern ocean. The crew first spotted the island because of its artificial lights, although by that time, they were on track navigationally. They were also accompanied by escort ships that ensured the crew’s safety and supplied fresh water twice during the voyage. But the escort ships did not aid with navigation or the dugout captain’s decision-making, and the authors believe that any effects were likely minimal. The biggest difference was the paddlers’ basic modern knowledge of local geography, which helped them develop a navigation plan—an unavoidable anachronism, although the crew did not rely on compasses, GPS, or watches during the voyage.

“Scientists try to reconstruct the processes of past human migrations, but it is often difficult to examine how challenging they really were,” said Kaifu. “One important message from the whole project was that our Paleolithic ancestors were real challengers. Like us today, they had to undertake strategic challenges to advance. For example, the ancient Polynesian people had no maps, but they could travel almost the entire Pacific. There are a variety of signs on the ocean to know the right direction, such as visible land masses, heavenly bodies, swells and winds. We learned parts of such techniques ourselves along the way.”

DOI: “Traversing the Kuroshio: Paleolithic migration across one of the world’s strongest ocean currents,” Science Advances, 2025. 10.1126/sciadv.adv5508.

DOI: “Palaeolithic seafaring in East Asia: an experimental test of the dugout canoe hypothesis,” Science Advances, 2025. 10.1126/sciadv.adv5507.

During a town hall Wednesday, NASA officials on stage looked like hostages


A Trump appointee suggests NASA may not have a new administrator until next year.

NASA press secretary Bethany Stevens, acting administrator Janet Petro, chief of staff Brian Hughes, associate administrator Vanessa Wyche, and deputy associate administrator Casey Swails held a town hall with NASA employees Wednesday. Credit: NASA

The four people at the helm of America’s space agency held a town hall meeting with employees Wednesday, fielding questions about downsizing, layoffs, and proposed budget cuts that threaten to undermine NASA’s mission and prestige.

Janet Petro, NASA’s acting administrator, addressed questions from an auditorium at NASA Headquarters in Washington, DC. She was joined by Brian Hughes, the agency’s chief of staff, a political appointee who was formerly a Florida-based consultant active in city politics and in Donald Trump’s 2024 presidential campaign. Two other senior career managers, Vanessa Wyche and Casey Swails, were also on the stage.

They tried to put a positive spin on the situation at NASA. Petro, Wyche, and Swails are civil servants, not Trump loyalists. None of them looked like they wanted to be there. The town hall was not publicized outside of NASA ahead of time, but live video of the event was available—unadvertised—on an obscure NASA streaming website. The video has since been removed.

8 percent down

NASA’s employees are feeling the pain after the White House proposed deep cuts to the agency’s funding for fiscal year 2026, which begins October 1. The budget request would slash NASA’s topline budget by nearly 25 percent, from $24.8 billion to $18.8 billion. Adjusted for inflation, this would be the smallest NASA budget since 1961, when the first American launched into space.

“The NASA brand is really strong still, and we have a lot of exciting missions ahead of us,” Petro said. “So, I know it’s a hard time that we’re going to be navigating, but again, you have my commitment that I’m here and I will share all of the information that I have when I get it.”

It’s true that NASA employees, along with industry officials and scientists who regularly work with the agency, are navigating through what would most generously be described as a period of great uncertainty. The perception among NASA’s workforce is far darker. “NASA is f—ed,” one current leader in the agency told Ars a few weeks ago, soon after President Trump rescinded his nomination of billionaire businessman and commercial astronaut Jared Isaacman to be the agency’s next administrator.

Janet Petro, NASA’s acting administrator, is seen in 2020 at Kennedy Space Center in Florida. Credit: NASA/Kim Shiflett

Before the White House released its detailed budget proposal in May, NASA and other federal agencies were already scrambling to respond to the Trump administration’s directives to shrink the size of the government. While NASA escaped the mass layoffs of probationary employees that affected other departments, the space agency offered buyouts and incentives for civil servants to retire early or voluntarily leave their posts.

About 900 NASA employees signed up for the first round of the government’s “deferred resignation” program. Casey Swails, NASA’s deputy associate administrator, said Wednesday that number is now up to 1,500 after NASA announced another chance for employees to take the government’s deferred resignation offer. This represents about 8 percent of NASA’s workforce, and the window for employees to apply runs until July 25.

One takeaway from Wednesday’s town hall is that at least some NASA leaders want to motivate more employees to resign voluntarily. Hughes said a “major reason” for luring workers to leave the agency is to avoid “being in a spot where we have to do the involuntary options.”

Rumors of these more significant layoffs, or reductions in force, have hung over NASA for several months. If that happens, workers may not get the incentives the government is offering today to those who leave the agency on their own. Swails said NASA isn’t currently planning any such layoff, although she left the door open for the situation to change: “We’re doing everything we can to avoid going down that path.”

Ultimately, it will depend on how many employees NASA can get to resign on their own. If it’s not enough, layoffs may still be an option.

Many questions, few answers

Nearly all of the questions employees addressed to NASA leadership Wednesday were submitted anonymously, and in writing: When might Trump nominate someone for NASA administrator to take Isaacman’s place? Will any of NASA’s 10 field centers be closed? What is NASA going to do about Trump’s budget proposal, particularly its impact on science missions?

Their responses to these questions, in order: Probably not any time soon, maybe, and nothing.

The Trump administration selected Petro, an engineer and former Army helicopter pilot, to become acting head of NASA on Inauguration Day in January. Bill Nelson, who served as a Florida senator until 2019, resigned the NASA administrator job when former President Biden left the White House.

Petro had been director of NASA’s Kennedy Space Center since 2021, and before that, she was deputy director of the Florida spaceport for 14 years. She leapfrogged NASA’s top civil servant, associate administrator Jim Free, to become acting administrator in January. Free retired from the agency in February. Before the presidential election last year, Free advocated for the next administration to stay the course with NASA’s Artemis program.

But that’s not what the Trump administration wants to do. The White House seeks to cancel the Space Launch System rocket and Orion spacecraft, both core elements of the Artemis program to return astronauts to the Moon after two more flights. Under the new plan, NASA would procure commercial transportation to ferry crews to the Moon and Mars in a similar way to how the agency buys rides for its astronauts to the International Space Station in low-Earth orbit.

NASA’s Curiosity rover captured images to create this selfie mosaic on the surface of Mars in 2015. If implemented as written, the Trump budget proposal would mark the first time in 30 years that NASA does not have a Mars lander in development. The agency would instead turn to commercial companies to demonstrate they can deliver payloads, and eventually humans, to the red planet.

The Trump administration’s statements on space policy have emphasized the longer-term goal of human missions to Mars. The White House’s plans for what NASA will do at the Moon after the Artemis program’s first landing are still undefined.

Petro has kept a low profile since becoming NASA’s temporary chief executive five months ago. Had Trump moved forward with Isaacman’s nomination, Isaacman would likely be NASA administrator today. The Senate was a few days away from confirming Isaacman when Trump pulled his nomination, apparently for political reasons. The White House withdrew the nomination the day after Elon Musk, who backed Isaacman to take the top job at NASA, left the Trump administration.

Who’s running NASA?

Now, Petro could serve out the year as NASA’s acting administrator. Petro is well-regarded at Kennedy Space Center, where she was a fixture in the center’s headquarters building for nearly 20 years. But she lacks a political constituency in the Trump administration and isn’t empowered to make major policy decisions. The budget cuts proposed for NASA came from the White House’s Office of Management and Budget, not from within the agency itself.

President Trump has the reins on the process to select the next NASA administrator. Trump named Isaacman to the office in December, more than a month before his inauguration, the earliest any incoming president has nominated a NASA administrator. Musk had close ties to Trump then, and a human mission to Mars got a mention in Trump’s inauguration speech.

But space issues seem to have fallen far down Trump’s list of priorities. Hughes, who got his job at NASA in part due to his political connections, suggested it might be a while before Trump gets around to selecting another NASA administrator nominee.

“I think the best guess would tell you that it’s hard to imagine it happening before the next six months, and could perhaps go longer than that into the eight- or nine-month range, but that’s purely speculation,” Hughes said, foreseeing impediments such as the large number of other pending nominations for posts across the federal government and high-priority negotiations with Congress over the federal budget.

Congress is also expected to go on recess in August, so the earliest a NASA nominee might get a confirmation hearing is this fall. Then, the Senate must vote to confirm the nominee before they can take office.

The timeline of Isaacman’s nomination for NASA administrator is instructive. Trump nominated Isaacman in December, and his confirmation hearing was in April. He was on the cusp of a confirmation vote when Trump withdrew the nomination on May 31.

As NASA awaits a leader with political backing, Petro said the agency is undergoing an overhaul to make it “leaner and more agile.” This is likely to result in office closures, and Hughes indicated NASA might end up shuttering entire field centers.

“To the specific question, will they be closed or consolidated? I don’t think we’re there yet to answer that question, but it is actively a part of the conversation we’re having as we go step-by-step through this,” Hughes said.

What can $4 billion buy you?

While Trump’s budget proposal includes robust funding for human space exploration, it’s a different story for most of the rest of NASA. The agency’s science budget would be cut in half to approximately $3.9 billion. NASA’s technology development division would also be reduced by 50 percent.

If the White House gets its way, NASA would scale back research on the International Space Station and cancel numerous robotic missions in development or already in space. The agency would terminate missions currently exploring Jupiter, on the way to study an asteroid, and approaching interstellar space. It would shut down the largest X-ray space telescope ever built and the only one in its class likely to be operating for the next 10 years.

“There’s a lot of science that can still be done with $4 billion,” Petro said. “How we do science, and how we do partnerships, may change in the future to sort of multiply what we’re doing.”

These partnerships might include asking academic institutions or wealthy benefactors to pitch in money to fund science projects at NASA. The agency might also invite commercial companies to play bigger roles in NASA robotic missions, which are typically owned by the government.

This view of Jupiter’s turbulent atmosphere from NASA’s Juno spacecraft includes several of the planet’s southern jet streams. Juno is one of the missions currently in space that NASA would shut down under Trump’s budget request. Credit: NASA

One employee asked what NASA could do to secure more funding in the president’s budget request. But that ship has sailed. The options now available to NASA’s leadership are to support the budget proposal, stay silent, or leave. NASA is an executive agency and part of the Trump administration, and the White House’s budget request is NASA’s, too.

“It’s not our job to advocate, but let’s try to look at this in a positive way,” Petro said. “We’ve still got a lot of money. Let’s see how much mission we can do.”

Ultimately, it’s up to Congress to appropriate funding for NASA and other parts of the government. Lawmakers haven’t signaled where they might land on NASA’s budget, but Sen. Ted Cruz (R-Texas), who is influential on space-related matters, released the text of a proposed bill a few weeks ago that would restore funding for the International Space Station and forego cancellation of the Space Launch System rocket, among other things. But Cruz did not have much to say about adding more money for NASA’s science programs.

NASA’s senior leaders acknowledged on Wednesday that the pain of the agency’s downsizing will extend far beyond its walls.

“Eighty-five percent of our budget goes out the door to contractors,” Petro said. “So, with a reduced budget, absolutely, our contractors will also be impacted. In fact, they’re probably the bigger driver that will be impacted.”

It’s clearly a turbulent time for America’s space agency, and NASA employees have another month to decide if they want to be part of it.

“I know there’s a lot to consider,” Swails said. “There’s a lot that people are thinking about. I would encourage you to talk it out. Tap into your support systems. Talk to your spouse, your partner, your friend, your financial advisor, whomever you consider those trusted advisors for you.”

This sounds like hollow advice, but it seems like it’s all NASA’s workers can do. The Trump administration isn’t waiting for Congress to finalize the budget for 2026. The downsizing is here.

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

All childhood vaccines in question after first meeting of RFK Jr.’s vaccine panel

A federal vaccine panel entirely hand-selected by health secretary and anti-vaccine activist Robert F. Kennedy Jr. gathered for its first meeting Wednesday—and immediately announced that it would re-evaluate the entire childhood vaccination schedule, as well as the one for adults.

The meeting overall was packed with anti-vaccine talking points and arguments from the new panel members, confirming public health experts’ fears that the once-revered panel is now critically corrupted and that Kennedy’s controversial picks will only work to fulfill his long-standing anti-vaccine agenda.

Controversial committee

An hour before the meeting began, the American Academy of Pediatrics came out swinging against the new panel, saying that the panel’s work is “no longer a credible process.” The organization shunned the meeting, declining to send the liaison it has provided for decades.

“We won’t lend our name or our expertise to a system that is being politicized at the expense of children’s health,” AAP President Susan Kressly said in a video posted on social media.

The panel in question, the Advisory Committee on Immunization Practices (ACIP), has for more than 60 years provided rigorous public scientific review, discussion, and trusted recommendations to the Centers for Disease Control and Prevention on how vaccines should be used in the US after they’ve earned approval from the Food and Drug Administration. The CDC typically adopts ACIP’s recommendations, and once that happens, insurance providers are required to cover the cost of the recommended shots.

The system is highly regarded globally. But, on June 9, Kennedy unilaterally and summarily fired all 17 esteemed ACIP members and, two days later, replaced them with eight new people. Some have clear anti-vaccine views, others have controversial and contrarian public health views, and several have little to no expertise in the fields relevant to vaccines.

Last night, it came to light that one of the eight new appointees—Michael Ross, an obstetrics and gynecology physician—had withdrawn from the committee during a financial holdings review that ACIP members are required to complete before beginning work on the panel.

Data-recovery firm tests $28, 500GB HDD from Amazon and gets surprising results

Ars was unable to confirm if UnionSine and Toshiba have any formal business relationship. UnionSine’s website says that its full company name is Shenzhen Union Integrity Technology Co., Ltd., a Shenzhen-based company launched in 2014 with “more than 50 employees.” It doesn’t list Toshiba as a partner. Toshiba also doesn’t mention any collaboration with UnionSine on its website. Neither company responded to requests for comment ahead of publication. Interestingly, there’s at least one account of someone finding a Western Digital drive inside their UnionSine HDD’s enclosure.

Rymko said that Secure Data Recovery couldn’t confirm if the Toshiba drive was refurbished but also emphasized the drive’s nearly 10 years of age:

Our internal data indicates that the average lifespan of a drive is approximately three-to-five years, depending on the brand, capacity, and other factors. Our data found that the average ‘power-on’ hours of failed drives was about two years and 10 months. With the right tools, the ‘power-on hours’ data can be reset. This could mean the drive may last a few years; I can’t say for sure.

There are better storage options

Rymko told me that UnionSine “seem[s] like a legitimate company” and noted that Secure Data Recovery has recovered data from UnionSine drives before. He also said that for $28, “the drive performs well and provides good value”; it also “meets expectations for speed and reliability.” Still, he has some concerns about long-term use:

We haven’t identified any major issues with this device, but as with any budget drive, long-term durability and sustained performance under heavy use are potential concerns to watch for. It’s always a good idea to back up important data regularly.

But there are still reasons to look elsewhere for storage.

For one, UnionSine doesn’t have a clearly posted warranty policy for its HDDs. As Rymko mentioned, the long-term durability of its drive is dubious, making the lack of a clear warranty concerning.

Further, there are bigger and roomier storage options than a 500GB HDD. If you’re opting for an HDD over an SSD to save money, it can be prudent to put at least some of those savings toward more storage space. A roomier HDD will cost more, but the price-per-GB may not differ much, depending on the drive.

When storing valued files, you can rest easier by following the 3-2-1 backup rule (three copies of your data, on two different types of storage media, with one copy kept offsite) and by buying from a reputable brand with a solid warranty. Losing important data is frustrating enough, and that frustration is exacerbated when a company doesn’t take accountability for a potentially faulty device.

The résumé is dying, and AI is holding the smoking gun

Beyond volume, fraud poses an increasing threat. In January, the Justice Department announced indictments in a scheme to place North Korean nationals in remote IT roles at US companies. Research firm Gartner says that fake identity cases are growing rapidly, with the company estimating that by 2028, about 1 in 4 job applicants could be fraudulent. And as we have previously reported, security researchers have also discovered that AI systems can hide invisible text in applications, potentially allowing candidates to game screening systems using prompt injections in ways human reviewers can’t detect.


And that’s not all. Even when AI screening tools work as intended, they exhibit similar biases to human recruiters, preferring white male names on résumés—raising legal concerns about discrimination. The European Union’s AI Act already classifies hiring under its high-risk category with stringent restrictions. Although no US federal law specifically addresses AI use in hiring, general anti-discrimination laws still apply.

So perhaps résumés as a meaningful signal of candidate interest and qualification are becoming obsolete. And maybe that’s OK. When anyone can generate hundreds of tailored applications with a few prompts, the document that once demonstrated effort and genuine interest in a position has devolved into noise.

Instead, the future of hiring may require abandoning the résumé altogether in favor of methods that AI can’t easily replicate—live problem-solving sessions, portfolio reviews, or trial work periods, just to name a few ideas people sometimes consider (whether they are good ideas or not is beyond the scope of this piece). For now, employers and job seekers remain locked in an escalating technological arms race where machines screen the output of other machines, while the humans they’re meant to serve struggle to make authentic connections in an increasingly inauthentic world.

Perhaps the endgame is robots interviewing other robots for jobs performed by robots, while humans sit on the beach drinking daiquiris and playing vintage video games. Well, one can dream.

Ted Cruz can’t get all Republicans to back his fight against state AI laws


Cruz plan moves ahead but was reportedly watered down amid Republican opposition.

Sen. Ted Cruz (R-Texas) presides over a subcommittee hearing on June 3, 2025 in Washington, DC. Credit: Getty Images | Chip Somodevilla

A Republican proposal to penalize states that regulate artificial intelligence can move forward without requiring approval from 60 senators, the Senate parliamentarian decided on Saturday. But the moratorium on state AI laws did not have unanimous Republican support and has reportedly been watered down in an effort to push it toward passage.

In early June, Sen. Ted Cruz (R-Texas) proposed enforcing a 10-year moratorium on AI regulation by making states ineligible for broadband funding if they try to impose any limits on development of artificial intelligence. While the House previously approved a version of the so-called “One Big Beautiful Bill” with an outright 10-year ban on state AI regulation, Cruz took a different approach because of the Senate rule that limits inclusion of “extraneous matter” in budget reconciliation legislation.

Under the Senate’s Byrd rule, a senator can object to a potentially extraneous budget provision. A motion to waive the Byrd rule requires a vote of 60 percent of the Senate.

As originally drafted, Cruz’s backdoor ban on state AI laws would have made it impossible for states to receive money from the $42 billion Broadband Equity, Access, and Deployment (BEAD) program if they try to regulate AI. He tied the provision into the budget bill by proposing an extra $500 million for the broadband-deployment grant program and expanding its purpose to also subsidize construction and deployment of infrastructure for artificial intelligence systems.

Punchbowl News reported today that Cruz made changes in order to gain more Republican support and comply with Senate procedural rules. Cruz was quoted as saying that under his current version, states that regulate AI would only be shut out of the $500 million AI fund.

This would seem to protect states’ access to the $42 billion broadband deployment fund that will offer subsidies to ISPs that expand access to Internet service. Losing that funding would be a major blow to states that have spent the last couple of years developing plans to connect more of their residents to modern broadband. The latest Senate bill text was not available today. We contacted Cruz’s office and will update this article if we get a response.

A spokesperson for Sen. Maria Cantwell (D-Wash.) told Ars today that Cruz’s latest version could still prevent states from getting broadband funding. The text has “a backdoor to apply new AI requirements to the entire $42.45 billion program, not just the new $500 million,” Cantwell’s representative said.

Plan has opponents from both parties

Senate Parliamentarian Elizabeth MacDonough ruled that several parts of the Republican budget bill are subject to the Byrd rule and its 60-vote requirement, but Cruz’s AI proposal wasn’t one of them. A press release from Senate Budget Committee Ranking Member Jeff Merkley (D-Ore.) noted that “the parliamentarian’s advice is based on whether a provision is appropriate for reconciliation and conforms to the limitations of the Byrd rule; it is not a judgement on the relative merits of a particular policy.”

Surviving the parliamentarian review doesn’t guarantee passage. A Bloomberg article said the parliamentarian’s decision is “a win for tech companies pushing to stall and override dozens of AI safety laws across the country,” but that the “provision will likely still be challenged on the Senate floor, where stripping the provision would need just a simple majority. Some Republicans in both the House and Senate have pushed back on the AI provision.”

Republicans have a 53–47 edge in the Senate. Cantwell and Sen. Marsha Blackburn (R-Tenn.) teamed up for a press conference last week in which they spoke out against the proposed moratorium on state regulation.

Cantwell said that 24 states last year started “regulating AI in some way, and they have adopted these laws that fill a gap while we are waiting for federal action. Now Congress is threatening these laws, which will leave hundreds of millions of Americans vulnerable to AI harm by abolishing those state law protections.”

Blackburn said she agreed with Cantwell that the AI regulation proposal “is not the type of thing that we put into reconciliation bills.” Blackburn added that lawmakers “are working to move forward with legislation at the federal level, but we do not need a moratorium that would prohibit our states from stepping up and protecting citizens in their state.”

Sens. Ron Johnson (R-Wis.) and Josh Hawley (R-Mo.) have also criticized the idea of stopping states from regulating AI.

Cruz accused states of “strangling AI”

Cruz argued that his proposal stops states “from strangling AI deployment with EU-style regulation.” Under his first proposal, no BEAD funds were to be given to any state or territory that enforces “any law or regulation… limiting, restricting, or otherwise regulating artificial intelligence models, artificial intelligence systems, or automated decision systems entered into interstate commerce.”

The Cantwell/Blackburn press conference also included Washington Attorney General Nick Brown, a Democrat; and Tennessee Attorney General Jonathan Skrmetti, a Republican. Brown said that “Washington has a law that prohibits deep fakes being used against political candidates by mimicking their appearance and their speech,” another “that prohibits sharing fabricated sexual images without consent and provides for penalties for those who possess and distribute such images,” and a third “that prohibits the knowing distribution of forged digital likenesses that can be used to harm or defraud people.”

“All of those laws, in my reading, would be invalid if this was to pass through Congress, and each of those laws are prohibiting and protecting people here in our state,” Brown said.

Skrmetti said that if the Senate proposal becomes law “there would be arguments out there for the big tech companies that the moratorium does, in fact, preclude any enforcement of any consumer protection laws if there’s an AI component to the product that we’re looking at.”

Other Republican plans fail Byrd rule test

Senate Democrats said they are pleased that the parliamentarian ruled that several other parts of the bill are subject to the Byrd rule. “We continue to see Republicans’ blatant disregard for the rules of reconciliation when drafting this bill… Democrats plan to challenge every part of this bill that hurts working families and violates this process,” Merkley said.

Merkley’s press release said the provisions that are subject to a 60-vote threshold include one that “limits certain grant funding for ‘sanctuary cities,’ and where the Attorney General disagrees with states’ and localities’ immigration enforcement,” and another that “gives state and local officials the authority to arrest any noncitizen suspected of being in the US unlawfully.”

The Byrd rule also applies to a section that “limits the ability of federal courts to issue preliminary injunctions or temporary restraining orders against the federal government by requiring litigants to post a potentially enormous bond,” and another that “limits when the federal government can enter into or enforce settlement agreements that provide for payments to third parties to fully compensate victims, remedy harm, and punish and deter future violations,” Merkley’s office said.

The office of Senate Democratic Leader Chuck Schumer (D-N.Y.) said yesterday that the provision requiring litigants to post bonds has been struck from the legislation. “This Senate Republican provision, which was even worse than the similar House-passed version, required a plaintiff seeking an emergency court order, preliminary injunction, or a temporary restraining order against the Trump Administration or the federal government to pay a costly bond up front—essentially making the justice system pay-to-play,” Schumer’s office said.

Schumer said that “if enacted, this would have been one of the most brazen power grabs we’ve seen in American history—an attempt to let a future President Trump ignore court orders with impunity, putting him above the law.”

Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.

Analyzing A Critique Of The AI 2027 Timeline Forecasts

There was what everyone agrees was a high-quality critique of the timelines component of AI 2027, by the LessWrong user and Substack writer Titotal.

It is great to have thoughtful critiques like this. The way you get actual thoughtful critiques like this, of course, is to post the wrong answer (at length) on the internet, and then respond by listening to the feedback and by making your model less wrong.

This is a high-effort, highly detailed, real engagement with this section: it gives the original authors the opportunity to critique the critique, warns that there may be errors, allows time for responses, shares the code used to generate the graphs, engages in detail, does a bunch of math work, and so on. That is The Way.

So, Titotal: Thank you.

I note up front that at least Daniel Kokotajlo has indeed adjusted his estimates, and has moved his median from ‘AI 2027’ to ‘AI 2028’ based on events since publication, and Eli’s revisions also push the estimates back a bit.

I also note up front that if you evaluated most statements made in the discourse (either non-worried AI forecasting, or AI in general, or more broadly) with this level of rigor, mostly you couldn’t, because you’d hit ‘I made it up’ very quickly. In other cases where someone is trying at least a little, in my experience the models fall apart a lot worse and a lot faster. No one has suggested ‘here is a better attempt to forecast the future and take the whole thing seriously’ that I consider to have a reasonable claim to that.

A lot of the disagreements come down to how much one should care about which calculations and graphs match past data how closely in different contexts. Titotal demands very strong adherence throughout. I think it’s good to challenge and poke at the gaps but this seems to in several places go too far.

  1. The Headline Message Is Not Ideal.

  2. An Explanation of Where Superexponentiality Is Coming From.

  3. Three Methods.

  4. Time Horizon Extension Method.

  5. The Public Versus Internal Gap.

  6. The Difficulty Gap.

  7. Recent Progress.

  8. Infinite Time Horizons.

  9. Intermediate Speedups.

  10. Is There A Flawed Graph Still Up?

  11. Some Skepticism About Projection.

  12. Part 2: Benchmarks and Gaps and Beyond.

  13. Benchmarks.

  14. The Time Horizon Part of the Second Model.

  15. Why The Thresholds?

  16. The Gap Model.

  17. Eli Responds On LessWrong.

  18. On Eli’s Recent Update.

  19. Conclusion.

  20. Perhaps The Most Important Disagreement.

Note that this section is about discourse rather than the model, so many of you can skip it.

While I once again want to say up front that I am very much thankful for the substance of this critique, it would also be great to have an equally thoughtful headline presentation of such critiques. That, alas, (although again, thanks for writing this!) we did not get.

It is called ‘A deep critique of AI 2027’s bad timeline model.’ One could simply not use the word ‘bad’ here and we would still know you have strong disagreements with it, and there is much similar talk throughout, starting with the title and then this, the first use of bold:

Titotal (formatting in original): The article is huge, so I focussed on one section alone: their “timelines forecast” code and accompanying methodology section. Not to mince words, I think it’s pretty bad.

I’m not full on ‘please reconsider your use of adjectives’ but, well, maybe? Here is an active defense of the use of the word ‘bad’ here:

Neel Nanda: I agree in general [to try and not call things bad], but think that titotal’s specific use was fine. In my opinion, the main goal of that post was not to engage the AI 2027, which had already be done extensively in private but rather to communicate their views to the broader community.

Titles in particular are extremely limited, many people only read the title, and titles are a key way people decide whether to eat on, and efficiency of communication is extremely important.

The point they were trying to convey was these models that are treated as high status and prestigious should not be and I disagree that non-violent communication could have achieved a similar effect to that title (note, I don’t particularly like how they framed the post, but I think this was perfectly reasonable from their perspective.)

I mean, yes, if the goal of the post was to lower the status and prestige of AI 2027 and to do so through people reading the title and updating in that way, rather than to offer a helpful critique, then it is true that the title was the best local way to achieve that objective, epistemic commons be damned. I would hope for a different goal?

There are more of these jabs, and a matching persistent attitude and framing, sprinkled throughout what is in its actual content an excellent set of critiques – I find much that I object to, but I think a good critique here should look like that. Most of your objections should be successfully answered. Others can be improved. This is all the system working as designed, and the assessments don’t match the content.

To skip ahead, the author is a physicist, which is great except that they are effectively holding AI 2027 largely to the standards of a physics model before they would deem it fit for anyone to use to make life decisions, even if this is ‘what peak modeling performance looks like.’

Except that you don’t get to punt the decisions, and Bayes Rule is real. Sharing one’s probability estimates and the reasons behind them is highly useful, and you can and should use that to help you make better decisions.

Tyler Cowen’s presentation of the criticism then compounds this, entitled ‘Modeling errors in AI doom circles’ (which is pejorative on multiple levels), calling the critique ‘excellent’ (the critique in its title calls the original ‘bad’), then presenting this as an argument for why this proves they should have… submitted AI 2027 to a journal? Huh?

Tyler Cowen: There is much more detail (and additional scenarios) at the link. For years now, I have been pushing the line of “AI doom talk needs traditional peer review and formal modeling,” and I view this episode as vindication of that view.

That was absurd years ago. It is equally absurd now, unless the goal of this communication is to lower the status of its subject.

This is the peer! This is the review! That is how all of this works! This is it working!

Classic ‘if you want the right answer, post the (ideally less) wrong one on the internet.’ The system works. Whereas traditional peer review is completely broken here.

Indeed, Titotal says it themselves.

Titotal: What makes AI 2027 different from other similar short stories is that it is presented as a forecast based on rigorous modelling and data analysis from forecasting experts. It is accompanied by five appendices of “detailed research supporting these predictions” and a codebase for simulations.

Now, I was originally happy to dismiss this work and just wait for their predictions to fail, but this thing just keeps spreading, including a youtube video with millions of views.

As in: I wasn’t going to engage with any of this until I saw it getting those millions of views, only then did I actually look at any of it.

Which is tough but totally fair, a highly sensible decision algorithm, except for the part where Titotal dismissed the whole thing as bogus before actually looking.

The implications are clear. You want peer review? Earn it with views. Get peers.

It is strange to see these two juxtaposed together. You get the detailed thoughtful critique for those who Read the Whole Thing. For those who don’t, at the beginning and conclusion, you get vibes.

Also (I discovered this after I’d finished analyzing the post) it turns out this person’s substack (called Timeline Topography Tales) is focused on, well, I’ll let Titotal explain, by sharing, in order, the most recent headlines and the relevant taglines that appear before you click ‘see all’:

15 Simple AI Image prompts that stump ChatGPT

Slopworld 2035: The dangers of mediocre AI. None of this was written with AI assistance.

AI is not taking over material science (for now): an analysis and conference report. Confidence check: This is my field of expertise, I work in the field and I have a PhD in the subject.

A nerds guide to dating: Disclaimer: this blog is usually about debunking singularity nerds. This is not a typical article, nor is it my area of expertise.

The walled marketplace of ideas: A statistical critique of SSC book reviews.

Is ‘superhuman’ AI forecasting BS? Some experiments on the “539” bot from the Centre for AI Safety.

Most smart and skilled people are outside of the EA/rationalist community: An analysis.

I’m not saying this is someone who has an axe and is grinding it, but it is what it is.

Despite this, it is indeed a substantively excellent post, so LessWrong has awarded this post 273 karma as of this writing, very high and more than I’ve ever gotten in a single post, and 213 on the EA forum, also more than I’ve ever gotten in a single post.

Okay, with that out of the way up top, who wants to stay and Do Forecasting?

This tripped me up initially, so it’s worth clarifying up front.

The AI 2027 model has two distinct sources of superexponentiality. That is why Titotal will later talk about there being an exponential model and a superexponential model, and then about a superexponential effect applied to both.

The first source is AI automation of AI R&D. It should be clear why this effect is present.

The second source is a reduction in difficulty of doubling the length or reliability of tasks, once the lengths in question pass basic thresholds. As in, at some point, it is a lot easier to go from reliably doing one-year tasks to two-year tasks than it is to go from one hour to two hours, or from one minute to two minutes. I think this is true in humans, and likely true for AIs in the circumstances in question as well. But you certainly could challenge this claim.

Okay, that’s out of the way, on to the mainline explanation.

Summarizing the breakdown of the AI 2027 model:

  1. The headline number is the time until development of ‘superhuman coders’ (SC), that can do an AI researcher job 30x as fast and 30x cheaper than a human.

  2. Two methods are used, ‘time horizon extension’ and ‘benchmarks and gaps.’

  3. There is also a general subjective ‘all things considered.’

Titotal (matching my understanding): The time horizon method is based on 80% time horizons from this report, where the team at METR tried to compare the performance of AI on various AI R&D tasks and quantify how difficult they are by comparing to human researchers. An 80% “time horizon” of 1 hour would mean that an AI has an overall success rate of 80% on a variety of selected tasks that would take a human AI researcher 1 hour to complete, presumably taking much less time than the humans (although I couldn’t find this statement explicitly).

The claim of the METR report is that the time horizon of tasks that AI can do has been increasing at an exponential rate. The following is one of the graphs showing this progress: note the logarithmic scale on the y-axis:

Titotal warns that this report is ‘quite recent, not peer-reviewed and not replicated.’ Okay. Sure. AI comes at you fast; the above graph is already out of date, and the o3 and Opus 4 (or even Sonnet 4) data points should further support the ‘faster progress recently’ hypothesis.
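To make the mechanics of the time horizon extension method concrete, here is a minimal sketch of the pure exponential version: assume the 80% time horizon doubles at a fixed rate and ask when it crosses a target. Every number below is an illustrative placeholder I chose for the sketch, not METR’s fit or AI 2027’s parameters.

```python
import math

# Minimal sketch of the pure-exponential time horizon extension method.
# All numbers are placeholders, not METR's or AI 2027's actual values.
H0_minutes = 30.0           # assumed current 80% time horizon
doubling_months = 7.0       # assumed fixed doubling time
target_minutes = 60 * 167   # roughly one working month of human time

doublings_needed = math.log2(target_minutes / H0_minutes)
months_needed = doublings_needed * doubling_months
print(f"{doublings_needed:.1f} doublings, ~{months_needed / 12:.1f} years on a pure exponential")
```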

The first complaint is that they don’t include uncertainty in current estimates, and this is framed (you see this a lot) as one-directional uncertainty: Maybe the result is accurate, maybe it’s too aggressive.

But we don’t know whether or not this is the new normal or just noise or temporary bump where we’ll go back to the long term trend at some point. If you look at a graph of Moore’s law, for example, there are many points where growth is temporarily higher or lower than the long term trend. It’s the long term curve you are trying to estimate, you should be estimating the long term curve parameters, not the current day parameters.

This is already dangerously close to assuming the conclusion that there is a long term trend line (a ‘normal’), and we only have to find out what it is. This goes directly up against the central thesis being critiqued, which is that the curve bends when AI speeds up coding and AI R&D in a positive feedback loop.

There are three possibilities here:

  1. We have a recent blip of faster than ‘normal’ progress and will go back to trend.

    1. You could even suggest, this is a last gasp of reasoning models and inference scaling, and soon we’ll stall out entirely. You never know.

  2. We have a ‘new normal’ and will continue on the new trend.

  3. We have a pattern of things accelerating, and they will keep accelerating.

That’s where the whole ‘super exponential’ part comes in. I think the good critique here is that we should have a lot of uncertainty regarding which of these is true.

So what’s up with that ‘super exponential’ curve? They choose to model this as ‘each subsequent doubling time is 10% shorter than the one before.’ Titotal does some transformational math (which I won’t check) and draws curves.
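To see what that choice implies, here is a small sketch (with the same placeholder numbers as before) of the ‘each doubling is 10% shorter’ rule. Because the doubling times form a geometric series, the total calendar time can never exceed the first doubling time divided by the shrink factor, which is where the finite-time blow-up comes from.

```python
# Sketch of the superexponential variant: each successive doubling of the
# horizon takes 10% less calendar time than the one before. Placeholder inputs.
def months_to_target(H0, target, d0, shrink=0.10):
    months, d, H = 0.0, d0, H0
    while H < target:
        months += d        # spend this doubling's calendar time
        H *= 2             # horizon doubles
        d *= (1 - shrink)  # next doubling is 10% shorter
    return months

print(months_to_target(H0=30, target=60 * 167, d0=7))  # ~43 months
print(7 / 0.10)  # the whole series can never exceed d0 / shrink = 70 months
```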

Just like before, the initial time horizon H0 parameter is not subject to uncertainty analysis. What’s much more crazy here is that the rate of doubling growth, which we’ll call alpha, wasn’t subject to uncertainty either! (Note that this has been updated in Eli’s newest version). As we’ll see, the value of this alpha parameter is one of the most impactful parameters in the whole model, so it’s crazy that they didn’t model any uncertainty on it, and just pick a seemingly arbitrary value of 10% without explaining why they did so.

The central criticism here seems to be that there isn’t enough uncertainty, that essentially all the parameters here should be uncertain. I think that’s correct. I think it’s also a correct general critique of most timeline predictions, that people are acting far more certain than they should be. Note that this goes both ways – it makes it more likely things could be a lot slower, but also they could be faster.

What the AI 2027 forecast is doing is using the combination of different curve types to embody the uncertainty in general, rather than also trying to fully incorporate uncertainty in all individual parameters.
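A hedged sketch of what ‘embodying the uncertainty via a mixture of curve types’ can look like in practice: sample which growth regime you are in and its parameters, then record the implied crossing time. The weights and distributions below are invented for illustration and are not the report’s choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def years_to_target(H0, target, d0_months, shrink):
    months, d, H = 0.0, d0_months, H0
    while H < target and months < 12 * 50:   # 50-year safety cap
        months += d
        H *= 2
        d *= (1 - shrink)
    return months / 12

samples = []
for _ in range(10_000):
    d0 = rng.lognormal(mean=np.log(6), sigma=0.4)   # uncertain doubling time, in months
    regime = rng.choice(["exp", "superexp", "subexp"], p=[0.4, 0.4, 0.2])  # made-up weights
    shrink = {"exp": 0.0,
              "superexp": rng.uniform(0.05, 0.20),   # uncertain shrink per doubling
              "subexp": -0.05}[regime]               # doublings get slightly longer
    samples.append(years_to_target(30, 60 * 167, d0, shrink))

print(np.percentile(samples, [10, 50, 90]))   # years until the target horizon
```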

I also agree that this experiment shows something was wrong, and a great way to fix a model is to play with it until it produces a stupid result in some hypothetical world, then figure out why that happened:

Very obviously, having to go through a bunch more doublings should matter more than this. You wouldn’t put p(SC in 2025) at 5.8% if we were currently at fifteen nanoseconds. Changing the initial conditions a lot seems to break the model.

If you think about why the model sets up the way it does, you can see why it breaks. The hypothesis is that as AI improves, it gains the ability to accelerate further AI R&D progress, and that this may be starting to happen, or things might otherwise still go superexponential.

Those probabilities are supposed to be forward looking from this point, whereas we know these effects did not kick in before this point. It’s not obvious when we should have had this effect kick in if we were modeling this ‘in the past’ without knowing what we know now, but it obviously shouldn’t kick in before several-minute tasks (as in, before the recent potential trend line changes), because the human has to be in the loop and you don’t save much time.

Thus, yes, the model breaks if you start it before that point, and ideally you would force the super exponential effects to not kick in until H is at least minutes long (with some sort of gradual phase in, presumably). Given that we were using a fixed H0, this wasn’t relevant, but if you wanted to use the model on situations with lower H0s you would have to fix that.
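One way to implement that fix, as a guess on my part rather than anything in the AI 2027 code: ramp the per-doubling shrink from zero up to its full value as the horizon passes through the minutes range. The thresholds here are arbitrary stand-ins.

```python
import math

def effective_shrink(H_minutes, full_shrink=0.10,
                     start_minutes=5.0, full_at_minutes=60.0):
    """Shrink applied to the next doubling time, phased in log-linearly:
    0 below start_minutes, full_shrink above full_at_minutes."""
    if H_minutes <= start_minutes:
        return 0.0
    if H_minutes >= full_at_minutes:
        return full_shrink
    frac = (math.log(H_minutes) - math.log(start_minutes)) / \
           (math.log(full_at_minutes) - math.log(start_minutes))
    return full_shrink * frac

print(effective_shrink(0.25), effective_shrink(15), effective_shrink(120))
# 0.0 (seconds-scale horizon), ~0.044 (partially phased in), 0.1 (fully on)
```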

How much uncertainty do we have about current H0, at this point? I think it’s reasonable to argue something on the order of a few minutes is on the table if you hold high standards for what that means, but I think 15 seconds is very clearly off the table purely on the eyeball test.

Similarly, there is the argument that these equations start giving you crazy numbers if you extend them past some point. And I’d say, well, yeah, if you hit a singularity then your model outputting Obvious Nonsense is an acceptable failure mode. Fitting, even.

The next section asks why we are using super exponential curves in general, and this ‘super exponential’ curve in particular.

So, what arguments do they provide for superexponentiality? Let’s take a look, in no particular order:

Argument 1: public vs internal:

“The trend would likely further tilt toward superexponentiality if we took into account that the public vs. internal gap has seemed to decrease over time.”

But even if we do accept this argument, this effect points to a slower growth rate, not a faster one.

I do think we should accept this argument, and also Titotal is correct on this one. The new curve suggests modestly slower progress.

The counterargument is that we used to be slowed down by this wait between models, in two ways.

  1. Others couldn’t know about, see, access, distill, or otherwise follow your model while it wasn’t released, which previously slowed down progress.

  2. No one could use the model to directly accelerate progress during the wait.

The counterargument to the counterargument is that until recently direct acceleration via using the model wasn’t a thing, so that effect shouldn’t matter, and mostly the trendline is OpenAI models so that effect shouldn’t matter much either.

I can see effects in both directions, but overall I do think within this particular context the slower direction arguments are stronger. We only get to accelerate via recklessly releasing new models once, and we’ve used that up now.

Slightly off topic, but it is worth noting that in AI 2027, this gap opens up again. The top lab knows that its top model accelerates AI R&D, so it holds back an up-to-date version, not for safety but to race ahead of the competition and to direct more compute toward further R&D.

The second argument is that time horizon doublings get easier. Going from being able to consistently string together an hour to a week is claimed to be a larger conceptual gap than going from a week to a year.

Titotal is skeptical of this for both AIs and humans, especially because we have a lot of short term tutorials and few long term ones.

I would say that learning how to do fixed short term tasks, where you follow directions, is indeed far easier than general ‘do tasks that are assigned’ but once you are past that phase I don’t think the counterargument does much.

I agree with the generic ‘more research is needed’ style call here. Basically everywhere, more research is needed and better understanding would be good. Until then, it is better to go with what you have than to throw up one’s hands and say variations on ‘no evidence’; of course one is free to disagree with the magnitudes chosen.

In humans, I think the difficulty gap is clearly real if you were able to hold yourself intact, once you are past the ‘learn the basic components’ stage. You can see it in the extremes. If you can sustain an effort reliably for a year, you’ve solved most of the inherent difficulties of sustaining it for ten.

The main reasons ten is harder (and a hundred is much, much harder!) are that life gets in the way, you age and change, and this alters your priorities and capabilities. At some point you’re handing off to successors. There are a lot of tasks where humans would essentially get to infinite task length if the human were an em that didn’t age.

With AIs in this context, aging and related concepts are not an issue. If you can sustain a year, why couldn’t you sustain two? The answer presumably is ‘compounding error rates’ plus longer planning horizons, but if you can use system designs that recover from failures, that solves itself, and if you get non-recoverable error rates either down to zero or get them to correlate enough, you’re done.

A recent speedup is quite weak evidence for this specific type of super exponential curve. As I will show later, you can come up with lots of different superexponential equations, you have to argue for your specific one.

That leaves the “scaling up agency training”. The METR report does say that this might be a cause for the recent speedup, but it doesn’t say anything about “scaling up agency training” being a superexponential factor. If agency training only started recently, it could instead be evidence that the recent advances have just bumped us into a faster exponential regime.

Or, as the METR report notes, it could just be a blip as a result of recent advances: “But 2024–2025 agency training could also be a one-time boost from picking low-hanging fruit, in which case horizon growth will slow once these gains are exhausted”.

This seems like an argument that strictly exponential curves should have a very strong prior? So you need to argue hard if you want to claim more than that?

The argument that ‘agency training’ has led to a faster doubling curve seems strong. Of course we can’t ‘prove’ it, but the point of forecasting is to figure out our best projections and models in practice, not to pass some sort of theoretical robustness check, or to show strongly why things must be this exact curve.

Is it possible that this has ‘only’ kicked us into a new faster exponential? Absolutely, but that possibility is explicitly part of AI 2027’s model. Indeed, earlier Titotal was arguing that we shouldn’t even think the exponential had permanently shifted, and here they do not acknowledge that the mechanisms involved make this shift likely to be real.

I mention the ‘one time blip’ possibility above, as well, but it seems to me highly implausible that, if it is a ‘blip,’ we are close to done with it. There is obviously quite a lot of unhobbling left to do related to agency.

Should superhuman AGIs have infinite time horizons? AI 2027 doesn’t fully endorse their argument on this, but I think it is rather obvious that at some point doublings are essentially free.

Titotal responds to say that an AI that could do extremely long time horizon CS tasks would be a superintelligence, to which I would tap the sign that says we are explicitly considering what would be true about a superintelligence. That’s the modeling task.

The other argument here, that given a Graham’s number of years (and presumably immortality of some kind, as discussed earlier) a human can accomplish quite an awful lot, well, yes, even if you force them not to do the obviously correct path of first constructing a superintelligence to do it for them. But I do think there’s an actual limit here if the human has to do all the verification too, an infinite number of monkeys on typewriters can write Shakespeare but they can’t figure out where they put it afterwards, and their fastest solution to this is essentially to evolve into humans.

Alternatively, all we’re saying is ‘the AI can complete arbitrary tasks so long as they are physically possible’ and at that point it doesn’t matter if humans can do them too, the metric is obviously not mapping to Reality in a useful way and the point is made.

Now if you read the justifications in the section above, you might be a little confused as to why they didn’t raise the most obvious justification for superexponentiality: the justification that as AI gets better, people will be able to use the AI for AI R&D, thus leading to a feedback loop of faster AI development.

The reason for this is that they explicitly assume this is true and apply it to every model, including the “exponential” and “subexponential” ones. The “exponential” model is, in fact, also superexponential in their model.

(Note: in Eli’s newest model this is substantially more complicated, I will touch on this later)

Titotal walks us through the calculation, which is essentially a smooth curve that speeds up progress based on feedback loops proportional to progress made towards a fully superhuman coder, implemented in a way to make it easily calculable and so it doesn’t go haywire on parameter changes.
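As a rough illustration of that kind of progress-coupled speedup (the functional form and the endpoint values here are my placeholders, not the report’s): divide the calendar time of each doubling by a multiplier that interpolates log-linearly between today’s assumed AI R&D speedup and the speedup expected once superhuman coders exist.

```python
import math

def rnd_speedup(progress, v_now=1.1, v_at_sc=30.0):
    """progress in [0, 1]: fraction of the doublings to the SC horizon completed.
    Endpoint speedups are placeholder assumptions."""
    return math.exp((1 - progress) * math.log(v_now) + progress * math.log(v_at_sc))

H0, target, d0 = 30.0, 60 * 167, 7.0   # same placeholder horizon inputs as before
n = math.log2(target / H0)             # doublings needed
months = sum(d0 / rnd_speedup(k / n) for k in range(math.ceil(n)))
print(f"~{months / 12:.1f} years with the speedup, vs ~{math.ceil(n) * d0 / 12:.1f} without")
```

Under these made-up endpoints the feedback loop cuts the calendar time by roughly two-thirds; whether that is anywhere near the right magnitude is exactly what is in dispute.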

Titotal’s first objection is that this projection implies (if you run the calculation backwards) AI algorithmic progress is currently 66% faster than it was in 2022, whereas Nikola (one of the forecasters) estimates current algorithmic progress is only 3%-30% faster, and the attempt to hardcode a different answer in doesn’t work, because relative speeds are what matters and they tried to change absolute speeds instead. That seems technically correct.

The question is, how much does this mismatch ultimately matter? It is certainly possible for the speedup factor from 2022 to 2025 to be 10% (1 → 1.1) and for progress to then accelerate far faster going forward as AI crosses into more universally useful territory.

As in, if you have an agent or virtual employee, it needs to cross some threshold to be useful at all, but after that it rapidly gets a lot more useful. But that’s not the way the model works here, so it needs to be reworked; also, yes, I think we should be more skeptical about the amount of algorithmic progress speedup we can get in the transitional stages here, or about the amount of progress required to get to SC, or both.

After walking through the curves in detail, this summarizes the objection to the lack of good fit for the past parts of the curve:

I assume the real data would mostly be within the 80% CI of these curves, but I don’t think the actual data should be an edge case of your model.

So, to finish off the “superexponential”: the particular curve in their model does not match the data empirically, and as I argued earlier, it has very little conceptual justification either. I do not see the justification for assigning this curve 40% of the probability space.

I don’t think 75th percentile is an ‘edge case’ but I do agree that it is suspicious.

I think that the ‘super exponential’ curves are describing a future phenomenon, for reasons that everyone involved understands, and one would not expect them to match backwards in time unless you went to the effort of designing equations to do that, which doesn’t seem worthwhile here.

This is the graph in question; the issues with it are in the process of being addressed.

I agree that various aspects of this graph and how it was presented weren’t great, especially using a 15% easier-each-time doubling curve rather than the 10% that AI 2027 actually uses, and calling it ‘our projection.’ I do think it mostly serves the purpose of giving a rough idea what is being discussed, but more precision would have been better, and I am glad this is being fixed.

This objection is largely that there are only 11 data points (there are now a few more) on the METR curve, and you can fit it with curves that look essentially the same now but give radically different future outcomes. And yes, I agree, that is kind of the point, and if anything we are underrepresenting the uncertainty here. We can agree that even if we commit to using fully simplified and fully best-fit-to-the-past models, we get a range of outcomes that prominently includes 2028-2029 SCs.

I do think it is reasonable to say that the super exponential curve the way AI 2027 set it up has more free variables than you would like when fitting 11 data points, if that’s all you were looking to do, but a lot of these parameters are far from free and are not being chosen in order to fit the past curve data.
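For a sense of how underdetermined a handful of points is, here is a toy fit on fabricated stand-in data (roughly the right shape, not METR’s numbers): a straight line and a quadratic in log-horizon space track the past about equally well and then diverge by orders of magnitude a few years out.

```python
import numpy as np

# Fabricated stand-in data: log10 of the 80% horizon (in hours) vs years since
# an arbitrary start date. Shaped to resemble the trend, not METR's actual data.
t = np.array([0.0, 0.7, 1.4, 2.0, 2.6, 3.3, 3.8, 4.4, 4.8, 5.2, 5.5])
rng = np.random.default_rng(1)
log_h = -2.0 + 0.45 * t + 0.03 * t**2 + rng.normal(0, 0.1, t.size)

lin = np.polyfit(t, log_h, 1)    # a pure exponential in the horizon
quad = np.polyfit(t, log_h, 2)   # a mildly superexponential family

for future_t in (7.0, 9.0, 11.0):
    print(f"t={future_t}: exp fit {10 ** np.polyval(lin, future_t):.0f} h, "
          f"quad fit {10 ** np.polyval(quad, future_t):.0f} h")
```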

We now move on to the second more complex model, which Titotal says in many ways is worse, because if you use a complicated model you have to justify the complications, and it doesn’t.

I think a better way to describe the second model is that it predicts a transition in the rate of progress around capabilities similar to saturation of RE-bench, after which things will move at a faster pace, and it uses the RE-bench point as a practical way of simulating this.

Method 2 starts by predicting how long it would take to achieve a particular score (referred to as “saturation”) on Re-bench, a benchmark of AI skill on a group of ML research engineering tasks, also prepared by METR. After that, the time horizon extension model is used as with method 1, except that it starts later (when Re-bench saturates), and that it stops earlier (when a certain convoluted threshold is reached).

After that stopping point, 5 new gaps are estimated, which are just constants (as always, sampled from lognormal), and then the whole thing is run through an intermediate speedup model. So any critiques of model 1 will also apply to model 2, there will just be some dilution with all the constant gap estimates and the “re-bench” section.
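To make ‘constants sampled from lognormal’ concrete, here is a sketch of how gap estimates expressed as 80% confidence intervals can be turned into lognormal draws and summed. The CIs below are invented for illustration, not the forecasters’ actual estimates.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def lognormal_from_80ci(lo, hi, size):
    """Lognormal samples whose 10th/90th percentiles equal lo and hi."""
    z = norm.ppf(0.9)                            # ~1.2816
    mu = (np.log(lo) + np.log(hi)) / 2
    sigma = (np.log(hi) - np.log(lo)) / (2 * z)
    return rng.lognormal(mu, sigma, size)

gap_cis_months = [(1, 12), (0.5, 6), (2, 24), (1, 18), (0.5, 9)]   # invented 80% CIs
total = sum(lognormal_from_80ci(lo, hi, 10_000) for lo, hi in gap_cis_months)
print(np.percentile(total, [10, 50, 90]))   # months of gap-crossing time
```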

The reason to start later is obvious, you can’t start actually using AI skill for ML research tasks until it can beat not using it. So what you actually have is a kind of ‘shadow curve’ that starts out super negative – if you tried to use AI to do your ML tasks in 2017 you’d very obviously do way worse than doing it yourself. Then at some point in the 2020s you cross that threshold.

We also need a top of the curve, because this is a benchmark and by its nature it saturates even if the underlying skills don’t. In some senses the top of the S-curve is artificial, in some it isn’t.

Titotal points out that you can’t meaningfully best-fit an S-curve until you know you’ve already hit the top, because you won’t know where the top is. The claim is that we have no idea where the benchmark saturates, and that projecting it to be 2 is arbitrary. To which I’d say, I mean, okay, weird but if true who cares? If the maximum is 3 and we approach that a bit after we hit 2, then that’s a truth about the benchmark, not about Reality, and nothing important changes. As it turns out, Titotal noticed this too: as long as you’re above human performance it doesn’t change things substantially, so why are we having this conversation?
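It is easy to check this numerically: fit a logistic with the ceiling pinned at 2 and again at 3 to the same early scores, then compare when each fitted curve crosses a stand-in for human-level performance (1.0 here). The data points and thresholds are invented placeholders; under them, the two crossing dates land within a couple of months of each other, which is the sense in which the ceiling choice mostly washes out.

```python
import numpy as np
from scipy.optimize import curve_fit, brentq

t = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])             # years since some start
score = np.array([0.05, 0.10, 0.18, 0.33, 0.50, 0.72])   # invented benchmark scores

def crossing_time(ceiling, threshold=1.0):
    """Fit a logistic with a fixed ceiling, return when it crosses `threshold`."""
    f = lambda x, k, x0: ceiling / (1 + np.exp(-k * (x - x0)))
    (k, x0), _ = curve_fit(f, t, score, p0=[1.0, 3.0])
    return brentq(lambda x: f(x, k, x0) - threshold, 0, 50)

print("assumed ceiling 2:", crossing_time(2.0))
print("assumed ceiling 3:", crossing_time(3.0))
```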

This is a general pattern here. It’s virtuous to nitpick, but you should know when you’re nitpicking and when you’re not.

When you’re doing forecasting or modeling, you have to justify your decisions if and only if those decisions matter to the outcome. If it does not matter, it does not matter.

Speaking of doesn’t matter, oh boy does it not matter?

Step 2 is to throw this calculation in the trash.

I’m serious here. Look at the code. The variable t_sat_ci, the “CI for date when capability saturates”, is set by the forecaster, not calculated. There is no function related to the RE-bench data at all in the code. Feel free to look! It’s not in the updated code either.

Eli gives an 80% CI of saturation between september 2025 and january 2031, and Nikola gives an 80% CI of saturation between august 2025 and november 2026. Neither of these are the same as the 80% CI in the first of the two graphs, which is early 2026 to early 2027. Both distributions peak like half a year earlier than the actual Re-bench calculation, although Eli’s median value is substantially later.

Eli has told me that the final estimates for saturation time are “informed” by the logistic curve fitting, but if you look above they are very different estimates.

Those are indeed three very different curves. It seems that the calculation above is an intuition pump or baseline, and they instead go with the forecasters predictions, with Nikola expecting it to happen faster than the projection, and Eli having more uncertainty. I do think Nikola’s projection here seems unreasonably fast and I’d be surprised if he hasn’t updated by now?

Eli admits the website should have made the situation clear and he will fix it.

Titotal says we’ve ‘thrown out’ the re-bench part of the appendix. I say no, that’s not how this works, yes we’re not directly doing math with the output of the model above, but we are still projecting the re-bench results and using that to inform the broader model. That should have been made clear, and I am skeptical of Eli and Nikola’s graphs on this, especially the rapid sudden peak in Nikola’s, but the technique used is a thing you sometimes will want to do.

So basically we now do the same thing we did before except a lot starts in the future.

Titotal: Okay, so we’ve just thrown out the re-bench part of the appendix. What happens next? Well, next, we do another time horizons calculation, using basically the same methodology as in method 1. Except we are starting later now, so:

They guess the year that we hit re-bench saturation.

They guess the time horizon at the point we hit re-bench saturation.

They guess the doubling time at the point when we hit re-bench saturation.

They guess the velocity of R&D speedup at the point when we hit re-bench saturation.

Then, they use these parameters to do the time horizons calculation from part 1, with a lower cut-off threshold I will discuss in a minute.

And they don’t have a good basis for these guesses, either. I can see how saturating RE-bench could you give you some information about the time horizon, but not things like the doubling time, which is one of the most crucial parameters that is inextricably tied to long term trends.

Setting aside the cutoff, yes, this is obviously how you would do it. Before, we estimated those variables as of now. If you start in the future, you want to know what they will look like when you reach the pivot point.

Presumably you would solve this by running your model forward in the previous period, the same way you did in the first case? Except that this is correlated with the pace of re-bench progress, so that doesn’t work on its own. My guess is you would want to assign some percent weight to the date and some percent to what it would look like on your median pivot date.

And the estimation of doubling time is weird. The median estimate for doubling time at re-bench saturation is around 3 months, which is 33% lower than their current estimate for doubling time. Why do they lower it?

Well, partly because under the superexponential model there would have been speedups during the re-bench saturation period.

Titotal then repeats the concern about everything being super exponential, but I don’t see the issue on this one, although I would do a different calculation to decide on my expectation here.

I also don’t understand the ‘this simulation predicts AI progress to freeze in place for two years’ comment, as in I can’t parse why one would say that there.

And now here’s where we come to a place where I actually am more concerned than Titotal is:

The other main difference is that this time horizons model only goes to a lower threshold, corresponding to when AI hits the following requirement:

“Ability to develop a wide variety of software projects involved in the AI R&D process which involve modifying a maximum of 10,000 lines of code across files totaling up to 20,000 lines. Clear instructions, unit tests, and other forms of ground-truth feedback are provided. Do this for tasks that take humans about 1 month (as controlled by the “initial time horizon” parameter) with 80% reliability, at the same cost and speed as humans.”

Despite differing by 2 orders of magnitude on the time horizon required for SC in the first method, when it comes to meeting this benchmark they are both in exact agreement for this threshold, which they both put as a median of half a month.

This is weird to me, but I won’t dwell on it.

I kind of want to dwell on this, and how they are selecting the first set of thresholds, somewhat more, since it seems rather important. I want to understand how these various disagreements interplay, and how they make sense together.

That’s central to how I look at things like this. You find something suspicious that looks like it won’t add up right. You challenge. They address it. Repeat.

I think I basically agree with the core criticism here that this consists of guessing things about future technologies in a way that seems hard to get usefully right, it really is mostly a bunch of guessing, and it’s not clear that this complexity is helping the model be better than making a more generalized guess, perhaps using this as an intuition pump. I’m not sure. I don’t think this is causing a major disagreement in the mainline results, though?

In addition to updating the model, Eli responds with this comment.

I don’t understand the perspective that this is a ‘bad response.’ It seems like exactly how all of this should work, they are fixing mistakes and addressing communication issues, responding to the rest, and even unprompted offer a $500 bounty payment.

Eli starts off linking to the update to the model from May 7.

Here is Eli’s response on the ‘most important disagreements’:

  1. Whether to estimate and model dynamics for which we don’t have empirical data. e.g. titotal says there is “very little empirical validation of the model,” and especially criticizes the modeling of superexponentiality as having no empirical backing. We agree that it would be great to have more empirical validation of more of the model components, but unfortunately that’s not feasible at the moment while incorporating all of the highly relevant factors.[1]

    1. Whether to adjust our estimates based on factors outside the data. For example, titotal criticizes us for making judgmental forecasts for the date of RE-Bench saturation, rather than plugging in the logistic fit. I’m strongly in favor of allowing intuitive adjustments on top of quantitative modeling when estimating parameters.

  2. [Unsure about level of disagreement] The value of a “least bad” timelines model. While the model is certainly imperfect due to limited time and the inherent difficulties around forecasting AGI timelines, we still think overall it’s the “least bad” timelines model out there and it’s the model that features most prominently in my overall timelines views. I think titotal disagrees, though I’m not sure which one they consider least bad (perhaps METR’s simpler one in their time horizon paper?). But even if titotal agreed that ours was “least bad,” my sense is that they might still be much more negative on it than us. Some reasons I’m excited about publishing a least bad model:

    1. Reasoning transparency. We wanted to justify the timelines in AI 2027, given limited time. We think it’s valuable to be transparent about where our estimates come from even if the modeling is flawed in significant ways. Additionally, it allows others like titotal to critique it.

    2. Advancing the state of the art. Even if a model is flawed, it seems best to publish to inform others’ opinions and to allow others to build on top of it.

My read, as above, is that titotal indeed objects to a ‘least bad’ model if it is presented in a way that doesn’t have ‘bad’ stamped all over it with a warning not to use it for anything. I am strongly with Eli here. I am also with Thane that being ‘least bad’ is not on its own enough, reality does not grade on a curve and you have to hit a minimum quality threshold to be useful, but I do think they hit that.

As discussed earlier, I think #1 is also an entirely fair response, although there are other issues to dig into on those estimates and where they come from.

  1. The likelihood of time horizon growth being superexponential, before accounting for AI R&D automation. See this section for our arguments in favor of superexponentiality being plausible, and titotal’s responses (I put it at 45% in our original model). This comment thread has further discussion. If you are very confident in no inherent superexponentiality, superhuman coders by end of 2027 become significantly less likely, though are still >10% if you agree with the rest of our modeling choices (see here for a side-by-side graph generated from my latest model).

    1. How strongly superexponential the progress would be. This section argues that our choice of superexponential function is arbitrary. While we agree that the choice is fairly arbitrary and ideally we would have uncertainty over the best function, my intuition is that titotal’s proposed alternative curve feels less plausible than the one we use in the report, conditional on some level of superexponentiality.

    2. Whether the argument for superexponentiality is stronger at higher time horizons. titotal is confused about why there would sometimes be a delayed superexponential rather than starting at the simulation starting point. The reasoning here is that the conceptual argument for superexponentiality is much stronger at higher time horizons (e.g. going from 100 to 1,000 years feels likely much easier than going from 1 to 10 days, while it’s less clear for 1 to 10 weeks vs. 1 to 10 days). It’s unclear that the delayed superexponential is the exact right way to model that, but it’s what I came up with for now.

I don’t think 3b here is a great explanation, as I initially misunderstood it, but Eli has clarified that its intent matches my earlier statements about shifting to longer tasks becoming clearly easier at some point past the ‘learn the basic components’ stage. Also I worry this does drop out a bunch of the true objections, especially the pointing towards multiple different sources of superexponentiality (we have both automation of AI R&D and a potential future drop in the difficulty curve of tasks), which he lists under ‘other disagreements’ and says he hasn’t looked into yet – I think that’s probably the top priority to look at here at this point. I find the ‘you have to choose a curve and this seemed like the most reasonable one’ response to be, while obviously not the ideal world state, in context highly reasonable.

He then notes two other disagreements and acknowledges three mistakes.

Eli released an update in response to a draft of the Titotal critiques.

The new estimates are generally a year or two later, which mostly matches the updates I’d previously seen from Daniel Kokotajlo. This seems like a mix of model tweaks and adjusting for somewhat disappointing model releases over the last few months.

Overall Titotal is withholding judgment until Eli writes up more about it, which seems great, and also offers initial thoughts. Mostly he sees a few improvements but doesn’t believe his core objections are addressed.

Titotal challenges the move from 40% chance of super exponential curves to a 90% chance of an eventual such curve, although Eli notes that the 90% includes a lot of probability put into very large time horizon levels and thus doesn’t impact the answer that much. I see why one would generally be concerned about double counting, but I believe that I understand this better now and they are not double counting.

Titotal wraps up by showing you could draw a lot of very distinct graphs that ‘fit the data’ where ‘the data’ is METR’s results. And yes, of course, we know this, but that’s not the point of the exercise. No, reality doesn’t ‘follow neat curves’ all that often, but AI progress remarkably often has so far, and also we are trying to create approximations and we are all incorporating a lot more than the METR data points.

If you want to look at Titotal’s summary of why bad thing is bad, it’s at this link. I’ve already addressed each of these bullet points in detail. Some I consider to point to real issues, some not so much.

What is my overall take on the right modeling choices?

Simplicity is highly valuable. As the saying goes, make everything as simple as possible, but no simpler. There’s a lot to be said for mostly relying on something that has the shape of the first model, with the caveat of more uncertainty in various places, and that the ‘superexponential’ effects have an uncertain magnitude and onset point. There are a few different ways you could represent this. If I was doing this kind of modeling I’d put a lot more thought into the details than I have had the chance to do.

I would probably drop the detailed considerations of future bottlenecks and steps from the ultimate calculation, using them more as an intuition pump, the same way they currently calculate re-bench times and then put the calculation in the trash (see: plans are worthless, planning is essential.)

If I was going to do a deep dive, I would worry about whether we are right to combine these different arguments for superexponential progress, as in both AI R&D feedback loops and ease of future improvements, and whether either or both of them should be incorporated into the preset trend line or whether they have other issues.

The final output is then of course only one part of your full model of Reality.

At core, I buy the important concepts as the important concepts. As in, if I was using my own words for all this:

  1. AI progress continues, although a bit slower than we would have expected six months ago – progress since then has made a big practical difference, it’s kind of hard to imagine going back to models of even six months ago, but proper calibration means that can still be disappointing.

  2. In addition to scaling compute and data, AI itself is starting to accelerate the pace at which we can make algorithmic progress in AI. Right now that effect is real but modest, but we’re crossing critical thresholds where it starts to make a big difference, and this effect probably shouldn’t be considered part of the previous exponentials.

  3. The benefit of assigning tasks to AI starts to take off when you can reliably assign tasks for the AI without needing continuous human supervision, and now can treat those tasks as atomic actions not requiring state.

  4. If AI can take humans out of the effective loops in this research and work for more extended periods, watch the hell out (on many levels, but certainly in terms of capabilities and algorithmic progress.)

  5. Past a certain point where you can reliably do what one might call in-context atomic components, gaining the robustness and covering the gaps necessary to do this more reliably starts to get easier rather than harder, relative to the standard exponential curves.

  6. This could easily ‘go all the way’ to SC (and then quickly to full ASI) although we don’t know that it does. This is another uncertainty point, also note that AI 2027 as written very much involves waiting for various physical development steps.

  7. Thus, without making any claims about what the pace of all this is (and my guess is it is slower than they think it is, and also highly uncertain), the Baseline Scenario very much looks like AI 2027, but there’s a lot of probability mass also on other scenarios.

  8. One then has to ask what happens after you get this ‘superhuman coder’ or otherwise get ASI-like things of various types.

Which all adds up to me saying that I agree with Eli that none of the criticisms raised here challenges, to me, the ultimate or fundamental findings, only the price. The price is of course what we are here to talk about, so that is highly valuable even within relatively narrow bands (2028 is very different from 2029 because of reasons, and 2035 is rather different from that, and so on).

I realize that none of this is the kind of precision that lets you land on the moon.

The explanation for all this is right there: This is a physicist, holding forecasting of AI timelines to the standards of physics models. Well, yeah, you’re not going to be happy. If you try to use this to land on the moon, you will almost certainly miss the moon, the same way that if you try to use current alignment techniques on a superintelligence, you will almost certainly miss and then you will die.

One of the AI 2027 authors joked to me in the comments on a recent article that “you may not like it but it’s what peak AI forecasting performance looks like”.

Well, I don’t like it, and if this truly is “peak forecasting”, then perhaps forecasting should not be taken very seriously.

Maybe this is because I am a physicist, not a Rationalist. In my world, you generally want models to have strong conceptual justifications or empirical validation with existing data before you go making decisions based off their predictions: this fails at both.

Yes, in the world of physics, things work very differently, and we have much more accurate and better models. If you want physics-level accuracy in your predictions of anything that involves interactions of humans, well, sorry, tough luck. And presumably everyone agrees that you can’t have a physics-quality model here and that no one is claiming to have one? So what’s the issue?

The issue is whether basing decisions on modeling attempts like this is better than basing them on ‘I made it up’ or not having probabilities and projections at all and vibing the damn thing.

What I’m most against is people taking shoddy toy models seriously and basing life decisions on them, as I have seen happen for AI 2027.

I am not going to propose an alternate model. If I tried to read the tea leaves of the AI future, it would probably also be very shaky. There are a few things I am confident of, such as a software-only singularity not working and that there will be no diamondoid bacteria anytime soon. But these beliefs are hard to turn into precise yearly forecasts, and I think doing so will only cement overconfidence and leave people blindsided when reality turns out even weirder than you imagined.

Why is this person confident the software-only singularity won’t work? This post does not say. You’d have to read their substack, I assume it’s there.

The forecast here is ‘precise’ in the sense that it has a median, and we have informed people of that median. It is not ‘precise’ in the sense of putting a lot of probability mass on that particular median, even as an entire year, or even in the sense that the estimate wouldn’t change with more work or better data. It is precise in the sense that, yes, Bayes Rule is a thing, and you have to have a probability distribution, and it’s a lot more useful to share it than not share it.

I do find that the AI 2027 arguments updated me modestly towards a faster distribution of potential outcomes. I find 2027 to be a totally plausible time for SC to happen, although my median would be substantially longer.

You can’t ‘not base life decisions’ on information until it crosses some (higher than this) robustness threshold. Or I mean you can, but it will not go great.

In conclusion, I once again thank Titotal for the excellent substance of this critique, and wish it had come with better overall framing.


Analyzing A Critique Of The AI 2027 Timeline Forecasts Read More »

google’s-new-robotics-ai-can-run-without-the-cloud-and-still-tie-your-shoes

Google’s new robotics AI can run without the cloud and still tie your shoes

We sometimes call chatbots like Gemini and ChatGPT “robots,” but generative AI is also playing a growing role in real, physical robots. After announcing Gemini Robotics earlier this year, Google DeepMind has now revealed a new on-device VLA (vision language action) model to control robots. Unlike the previous release, there’s no cloud component, allowing robots to operate with full autonomy.

Carolina Parada, head of robotics at Google DeepMind, says this approach to AI robotics could make robots more reliable in challenging situations. This is also the first version of Google’s robotics model that developers can tune for their specific uses.

Robotics is a unique problem for AI because not only does the robot exist in the physical world, it also changes its environment. Whether you’re having it move blocks around or tie your shoes, it’s hard to predict every eventuality a robot might encounter. The traditional approach of training a robot on actions with reinforcement learning was very slow, but generative AI allows for much greater generalization.

“It’s drawing from Gemini’s multimodal world understanding in order to do a completely new task,” explains Carolina Parada. “What that enables is in that same way Gemini can produce text, write poetry, just summarize an article, you can also write code, and you can also generate images. It also can generate robot actions.”

General robots, no cloud needed

In the previous Gemini Robotics release (which is still the “best” version of Google’s robotics tech), the platforms ran a hybrid system with a small model on the robot and a larger one running in the cloud. You’ve probably watched chatbots “think” for measurable seconds as they generate an output, but robots need to react quickly. If you tell the robot to pick up and move an object, you don’t want it to pause while each step is generated. The local model allows quick adaptation, while the server-based model can help with complex reasoning tasks. Google DeepMind is now unleashing the local model as a standalone VLA, and it’s surprisingly robust.

Google’s new robotics AI can run without the cloud and still tie your shoes Read More »

sailing-the-fjords-like-the-vikings-yields-unexpected-insights

Sailing the fjords like the Vikings yields unexpected insights


“On we sweep with threshing oar”

Greer Jarrett has identified four possible small ports, or “havens,” used by Vikings along the Norwegian coast.

Experimental archaeologist Greer Jarrett of Lund University in Sweden has been sailing in the footsteps of Vikings for the last three years.

If you want to learn more about how and where the Vikings sailed, making the journey through the fjords yourself in replica boats is a practical, hands-on approach to achieving that end. Greer Jarrett, an archaeologist at Lund University in Sweden, has spent the last three years doing just that, sailing more than 5,000 kilometers along known Viking trade routes in open, spare-rigged clinker boats similar to those used by the Vikings.

Not only has Jarrett learned a great deal about the boats themselves, he also identified four possible havens along the Norwegian coast, part of what may have been a decentralized network that played a crucial role in trade and travel during that period. And those ports are located farther out to sea than other major ports and hubs known to date, according to a paper he published in the Journal of Archaeological Method and Theory.

It’s just the latest intriguing discovery enabled by the growing field of experimental archaeology, whereby researchers seek to reverse-engineer all manner of ancient technologies. Experimental archaeologists have, for instance, built their own versions of Early Upper Paleolithic adzes, axes, and chisels. The resulting fractures and wear enabled them to develop new criteria for identifying the likely functions of ancient tools. Others have tried to cook like the Neanderthals, concluding that flint flakes were surprisingly effective for butchering birds, and that roasting the birds damages the bones to such an extent that it’s unlikely they would be preserved in the archaeological record.

Kent State University’s Metin Eren has done practical experiments to study, for instance, the trajectories of atlatls attached to spears tipped with replica Clovis points, and how their performance compares to javelins used by Neanderthals. He even fashioned rudimentary blades out of his own frozen feces to test whether they could cut through pig hide, muscle, and tendon—solely to test a famous anthropological legend about an elderly Inuit man in the 1950s who purportedly did the same to kill and skin a dog, using its rib cage as a makeshift sled to venture off into the Arctic. (It did not work, so myth: busted. But it did snag Eren an Ig Nobel prize.)

Taking a hands-on, experimental archaeological approach to studying the Vikings makes sense in light of the dearth of contemporary written sources. “We have a few things written by outsiders, but there’s very, very few accounts written or delivered by people from Scandinavia during that period,” Jarrett told Ars. “We normally rely on indirect forms of evidence, be that genetics or archaeology or linguistics, which show strong, very frequent connections across maritime areas in the North Atlantic. But because traveling by boat is kind of an archaeologically invisible act, you don’t leave any footprints. So we have very little information about the voyages between these points.”


The sailing voyages made by Greer Jarrett during the research project, as well as the four possible Viking harbors he identified. Credit: Greer Jarrett

Jarrett and his crew used four or five different replica boats for their test voyages. Most were built by volunteers, enthusiasts, or students Jarrett had met during his considerable time in the field. They then sailed along the west coast of the Scandinavian Peninsula, a core area of Viking seafaring.

“These are reconstructions of traditional Norwegian boats from the 1800s and early 1900s,” said Jarrett. “My idea was, because of this really long-term continuity in traditional boat building practices, especially in Norway, it might be possible to use these later boats which have lots of similarities to try and work out the potentials of where people might have gotten out. It’s the idea of suggesting potentials based on practical experience to try and join those dots between the different evidence we have across the Viking world.”

That decision has led to some criticism from colleagues because of the enormous gap in time, but Jarrett defends his choice. “The Viking Age ends in the 11th century, and we’re talking about boats from 800 years later,” he said. “But the construction techniques and the way they are rigged and their general performance characteristics are similar enough. Because this is a project about voyages and not a project about boat building, it seemed like a defensible analogy.”

Seeking safe harbor

“On the long-range voyages, we worked in watches of four hours on and four hours off, and that is just about long enough to get some sleep on your off watch, but also just about short enough that you don’t get really, really, really cold, which is obviously a risk,” said Jarrett. “It was manageable, but we looked like penguins. I mean, we’re wearing six layers of wool at any time and sleeping all stacked together for warmth. But other times it’s really nice. The spring and the autumn in Scandinavia, there’s much more likelihood of high-pressure cycles, which means that it’s clearer and sunnier than in the summer itself.”

Nonetheless, there were some rough moments, such as when the mast spar holding up the mainsail snapped, forcing the crew to improvise and lash two oars together to hold the sail so they could continue their journey. It took several days to repair the boat so it could sail again. There was no safety boat following along in case the crew got into trouble, and no engine, although they did have a life raft, which the crew has yet to use.

Based on his sailing trials, Jarrett believes that the Vikings had no need for navigational tools like maps, a compass, or a sextant, relying instead on what he calls “mental maps”—or a “maritime cultural mindscape”—based on sailors’ memories and experiences passed down orally through generations. Those maps might also be informed by the myths linked to well-known coastal landmarks, such as skerries, small islets, or reefs.

“People had been moving by boat along the west coast of Scandinavia for a really, really, really long time, probably since the late Neolithic, if not earlier—thousands of years before the Viking age,” said Jarrett. “There are big trading networks in place beforehand, and that is reflected in the names, place names along the west coast. My primary argument is if you spend 3,000 years traveling up and down a coastline in which you can use the coast at all times for navigation, then it’s unnecessary to develop instrumentation.”

“Instruments are used when you are in a place out in the open sea that you don’t know,” Jarrett continued. “We definitely know they didn’t have compasses because those don’t arrive from China until the 1200s. There are these ideas about sunstones and sundials, or little sun compasses, which are entirely possible. But there’s no legitimate proof of either of them archaeologically yet. I may well be proved wrong if we find them at some point, but I don’t think they’re necessary for this at all.”

Based on the sailing trials, archaeological and documentary evidence of Viking Age maritime centers, and digital reconstructions of past sea levels, Jarrett was able to develop a useful set of criteria for evaluating potential havens. For instance, the site should be reachable in low visibility, with land or sea marks that sailors could use as bearings; large enough to accommodate multiple vessels of at least the size of a fyring (which can house a crew of four to 10 people); provide good protection from sea swell and storm surges; and have access to fresh water, among other criteria. Four sites scored sufficiently high by those criteria to qualify as possible Viking havens.

The four sites are Smørhamn, located at the confluence of Oldersund and the Frøysjø, where an inn and trading post are known to have existed since at least the late 17th century; the archipelago of Sørøyane between Stad and Ålesund, near where the sea battle of Hjörungavágr was fought circa 986 CE; Bjørnsund, a number of small islands off the southwestern tip of Hustadvika; and the island of Storfosna, which appears on 16th and 17th century charts.

“I’m not saying, ‘This is where they went,'” said Jarrett. “I’m saying that, with these kinds of boats under these conditions, it would be possible to go to these places. And it’s much more difficult—not impossible, but much more difficult—to go to these other places or to sail in these other conditions.”

Pining for the fjords

The next step is for Jarrett and other archaeologists to hunt for evidence in support of his hypothesis. “Most of these sites have never been excavated,” said Jarrett. “There’s been a long assumption that these are landing places with the idea that you are dragging your boat ashore. I’m very opposed to that idea because these are two-and-a-half-ton boats, let alone the cargo. Unless you have a team of oxen and 20 people at your command, there is no way you’re getting them on the beach. I’m very convinced that these places have jetties and mooring posts likely preserved underwater. All of that organic material survives much better underwater than it does on land. So I think that’s very possible.”

They might also find smaller items suggestive of a thriving harbor community. “Whenever you go into land, you’ve got something that’s broken, so you need to do repairs,” said Jarrett. “So things like clink nails or piles of ballast stones or signs of smithing—the typical kind of things you’d use for repairing your ship, I think are possible to find.” Jarrett’s methodology might also prove useful for studying other seafaring communities.

The practical experience of sailing the same seas as the Vikings naturally led to some surprising insights. “You are able to ask very different questions the minute you walk away from your desk and get on a boat,” said Jarrett. “I think it’s essential to do that because you think in new ways. In terms of the results themselves, the boats are extremely seaworthy crafts. When you get in them for the first time, you don’t think that, because they’re very, very light. They feel very flimsy, and they’re very low in the water compared to a modern sailing boat. So you feel really in touch with the wave, which is kind of scary. But because they’re so flexible and because of the way they’re rigged, they’re actually really stable, even in big waves.”

“We kept going out thinking, ‘Oh, this is maybe the limit of what this boat can tolerate,’ and then it would be fine, and we’d be, ‘Okay, let’s go a little bit in slightly bigger waves with slightly stronger wind,'” Jarrett continued. “So I think our comfort zones definitely visibly expanded during that period. And I had the chance to work with the same crews over three years. By the end of those three years, we were doing stuff that we would never have been able to do at the beginning.”

Another big difference from modern boats, Jarrett discovered, is that one cannot sail a traditional Viking craft alone. “It has to be a collaborative effort because of how you need a person at the front and the back of the boat basically at all times,” he said. “So developing the crew together and gaining not only skills, but also trust between us meant that we could do things in 2024 that seemed completely insane just a couple of years earlier. I cannot imagine what that is like if you have an entire lifetime of Viking sailors working together for 30 years. It must be an incredible way of creating social bonds.”

DOI: Journal of Archaeological Method and Theory, 2025. 10.1007/s10816-025-09708-6  (About DOIs).


Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.

Sailing the fjords like the Vikings yields unexpected insights Read More »