Genetics


Bringing the “functionally extinct” American chestnut back from the dead


Wiped out in its native range by invasive pathogens, the trees may make a comeback.

Very few people alive today have seen the Appalachian forests as they existed a century ago. Even as state and national parks preserved ever more of the ecosystem, fungal pathogens from Asia nearly wiped out one of the dominant species of these forests, the American chestnut, killing an estimated 3 billion trees. While new saplings continue to sprout from the stumps of the former trees, the fungus persists, killing them before they can seed a new generation.

But thanks in part to trees planted in areas where the two fungi don’t grow well, the American chestnut isn’t extinct. And efforts to revive it in its native range have continued, despite the long generation times needed to breed resistant trees. In Thursday’s issue of Science, researchers describe their efforts to apply modern genomic techniques and exhaustive testing to identify the best route to restoring chestnuts to their native range.

Multiple paths to restoration

While the American chestnut is functionally extinct (it’s no longer a participant in the ecosystems it once dominated), it’s most certainly not extinct. Two Asian fungi have killed it off in its native range; one causes chestnut blight, while a less common pathogen causes a root rot disease. Both prefer warmer, humid environments and persist there because they can grow asymptomatically on distantly related trees, such as oaks. Still, chestnuts planted outside the species’ original range, primarily in drier areas of western North America, have continued to thrive.

There is also a virus that attacks the chestnut blight fungus, allowing a few trees to survive in areas where that virus is common. Finally, a handful of trees have grown to maturity in the American chestnut’s original range. These trees, which the paper refers to as LSACs (large surviving American chestnuts), suggest that there might have been some low level of natural resistance within the now-vanished population.

Those trees are central to one of the efforts to restore the American chestnut. If enough of them have distinct means of resisting the fungi, interbreeding them might produce a strain that not only survives the fungi but can also thrive in the Appalachians.

A related approach took advantage of the fact that the American chestnut can produce fertile hybrids with the Chinese chestnut, which had co-evolved with the introduced fungi and was thus resistant to lethal infections. The hope was that continued back-breeding of these hybrids with American chestnuts would result in trees that were very similar to American chestnuts yet retained the fungal resistance of their Asian cousins.
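To make the logic of back-breeding concrete, here is a minimal sketch of how the expected American share of the genome rises with each backcross. It assumes the textbook expectation that every cross to a pure American parent halves the Chinese-derived share, and it ignores selection for resistance, which in practice would retain Chinese DNA near resistance genes. The function name and generation numbering are illustrative, not taken from the paper.

```python
def expected_american_fraction(backcross_generations):
    """Expected share of the genome inherited from American chestnut.

    Generation 0 is the first-generation hybrid (one American parent, one
    Chinese parent). Each later backcross to a pure American chestnut
    halves the expected Chinese-derived share. This is an average over
    many offspring and assumes no selection.
    """
    if backcross_generations < 0:
        raise ValueError("backcross_generations must be non-negative")
    chinese_share = 0.5 ** (backcross_generations + 1)
    return 1.0 - chinese_share


# Expected American share for the hybrid and the first three backcrosses.
fractions = [expected_american_fraction(n) for n in range(4)]
for generation, share in enumerate(fractions):
    print(f"backcross {generation}: {share:.4f}")
```

Under these assumptions, three backcrosses already yield trees that are expected to be more than 90 percent American chestnut, which is why the approach was attractive despite the long generation times.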

Both efforts suffered from the same problem that faces any biologist working on trees: They are slow-growing and can take years to reach a size at which they produce seeds. The situation was further complicated by the fact that the American chestnut can’t pollinate itself, so you need at least two trees before any breeding is possible.

Concerned about what this might mean for the potential reintroduction of the chestnut into the Appalachians, a third project turned to biotechnology. Research had identified oxalic acid as a key factor in the blight’s virulence. Wheat naturally produces an enzyme that degrades oxalic acid, and researchers inserted the gene that encodes that enzyme into the American chestnut genome, creating a genetically modified tree that can potentially disarm the fungus’ attack.

Without understanding the nature of resistance or the effectiveness of the transgenic gene, there’s no way to know which method would be most effective. So researchers from the American Chestnut Foundation assembled a massive collaboration to examine all these options and determine what would be needed to reintroduce blight-resistant chestnuts into the wild.

Tracking resistance

The scale of the effort is immense. All told, the team infected over 4,000 individual trees with the blight fungus and tracked their growth in Appalachian nurseries for an average of over 14 years. The trees were scored for resistance on a zero-to-100 scale based on the damage caused by the infection. This data was combined with some serious lab work; the team produced the highest-quality chestnut genomes yet (of both American and Chinese species) and gathered biochemical data on how the trees respond to infection.

It quickly became apparent that there were significant differences in the growth rates of some of the resistant trees. When planted at sites where viruses kept the blight in check, the Chinese chestnuts grew more slowly than native trees, while hybrids grew at an intermediate rate. That could make a big difference, as rapid growth may have enabled the chestnut to reach its former dominance of the canopy.

Somewhat surprisingly, this slow growth turned out to be a problem for the genetically modified American chestnuts as well. By chance, the wheat gene ended up being inserted into a gene known to be important for the growth of other plants. It seems to be important in the chestnut as well; plants with two copies of the inserted gene survived at 16 percent of their expected rate, and those with a single copy grew 22 percent slower than unmodified trees.

That said, there was a lot of variability among the genetically modified trees, with 4 percent of the tested trees showing both high blight resistance and growth comparable to that of unmodified American chestnuts. It will be important to determine whether this collection of traits remains consistent in ensuing generations.

In a bit of good news, the progeny from surviving American chestnuts grew like American chestnuts. In less good news, among 143 of these trees, only seven had resistance levels above 50 on the team’s 100-point scale. It’s possible that interbreeding these trees could further boost resistance, but it also poses the risk of creating a population that’s too inbred to thrive after reintroduction.

Root causes

The research team decided to use their testing to investigate the genetic basis of resistance. There’s a very practical reason for this: If resistance is mediated by just a handful of genes that each have large impacts, it should be possible to continue breeding resistant strains back to regular American chestnuts and selecting for resistance. But if there are many factors with relatively small impacts, it will require directed interbreeding of hybrids to maximize both resistance and DNA originating from the American chestnut.

The team completed the highest-quality chestnut genomes for both the American and Chinese species, identifying about 25,000 to 30,000 genes in the different assemblies. They then used this information for two types of genetic analysis: quantitative trait locus identification and genome-wide association. Both approaches aim to identify regions of the genome associated with specific properties and estimate their impact.

The work suggested that resistance arises from a relatively large number of sites, each with relatively minor effects. For example, the sites in the genome identified by quantitative trait analysis typically boosted resistance by about 10 points on the researchers’ 100-point scale. In the genome-wide analysis, 17 individual genetic differences were associated with about a quarter of the heritable resistance traits. All of this suggests that, for the hybrids (and likely for the weaker blight resistance found in surviving American chestnuts), directed breeding among surviving trees will be needed.

For the root rot fungus, in contrast, it looks like there are a limited number of important alleles with a large impact.

The researchers also took an alternative approach to identify resistance factors, comparing 100 chemicals produced by resistant and susceptible strains. Among the 41 chemicals detected at higher levels in the Chinese chestnut, the researchers found a metabolite, lupeol, that completely suppressed the growth of the fungal pathogen. Another, erythrodiol, limited its growth. If we can identify the genes involved in producing those chemicals, we could use that knowledge to guide directed breeding programs—or even engage in gene editing to increase their production.

The team’s current plan is to use genomic predictions to select hybrid seedlings for planting in test orchards, aiming to identify plants with high growth and resistance. From there, the process can be repeated. But even after the exhaustive exploration of resistance traits, the researchers seem to believe that all three approaches—selecting resistant American chestnuts, breeding hybrids derived from Chinese chestnuts, and directed genetic modification—can help bring the American chestnut back.

The researchers warn, though, that as environmental disturbances and invasive species continue to push some key species to the brink of extinction, we need to get better at this kind of species rescue operation.

Science, 2026. DOI: 10.1126/science.adw3225


John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.


Why is my dog like this? Current DNA tests won’t explain it to you.

Popular genetics tests can’t tell you much about your dog’s personality, according to a recent study.

A team of geneticists recently found no connection between simple genetic variants and behavioral traits in more than 3,200 dogs, even though previous studies suggested that hundreds of genes might predict aspects of a dog’s behavior and personality. That’s despite the popularity of at-home genetic tests that claim they can tell you whether your dog’s genes contain the recipe for anxiety or a fondness for cuddles.


This is Max, and no single genetic variant can explain why he is the way he is. Credit: Kiona Smith

Gattaca for dogs, except it doesn’t work

University of Massachusetts genomicist Kathryn Lord and her colleagues compared DNA sequences and behavioral surveys from more than 3,000 dogs whose humans had enrolled them in the Darwin’s Ark project (and filled out the surveys). “Genetic tests for behavioral and personality traits in dogs are now being marketed to pet owners, but their predictive accuracy has not been validated,” wrote Lord and her colleagues in their recent paper.

So the team checked for relatively straightforward associations between genetic variants and personality traits such as aggression, drive, and affection. The 151 genetic variants in question all involved small changes to a single nucleotide, or “letter,” in a gene, known as single-nucleotide polymorphisms (SNPs).

It turns out there were none: Your dog’s genes don’t predict its behavior, at least not in the simplistic way popular doggy DNA tests often claim.

And that can have serious consequences when pet owners, shelter workers, or animal rescues use these tests to make decisions about a dog’s future. “For example, if a dog is labeled as genetically predisposed to aggression, an owner might limit essential social interactions, or a shelter might decide against adoption,” Lord and her colleagues wrote.


Many genes associated with dog behavior influence human personalities, too

Many dog breeds are noted for their personalities and behavioral traits, from the distinctive vocalizations of huskies to the herding of border collies. People have worked to identify the genes associated with many of these behaviors, taking advantage of the fact that dogs can interbreed. But that creates its own experimental challenges, as it can be difficult to separate some behaviors from physical traits distinctive to the breed—small dog breeds may seem more aggressive simply because they feel threatened more often.

To get around that, a team of researchers recently did the largest gene/behavior association study within a single dog breed. Taking advantage of a population of over 1,000 golden retrievers, they found a number of genes associated with behaviors within that breed. A high percentage of these genes turned out to correspond to regions of the human genome that have been associated with behavioral differences as well. But, in many cases, these associations have been with very different behaviors.

Gone to the dogs

The work, done by a team based largely at Cambridge University, utilized the Golden Retriever Lifetime Study, which involved over 3,000 owners of these dogs filling out annual surveys that included information on their dogs’ behavior. Over 1,000 of those owners also had blood samples obtained from their dogs and shipped in; the researchers used these samples to scan the dogs’ genomes for variants. Those were then compared to ratings of the dogs’ behavior on a range of issues, like fear or aggression directed toward strangers or other dogs.

Using the data, the researchers identified when different regions of the genome were frequently associated with specific variants. In total, 14 behavioral tendencies were examined; 12 genomic regions were associated with specific behaviors, and another nine showed somewhat weaker associations. For many of these traits, it was difficult to find much because golden retrievers are notoriously friendly and mellow dogs, so they tended to score low on traits like aggression and fear.

That result was significant, as some of these same regions of the genome had been associated with very different behaviors in populations that were a mix of breeds. For example, two different regions associated with touch sensitivity in golden retrievers had been linked to a love of chasing and owner-directed aggression in a non-breed-specific study. That finding suggests that the studies were identifying genes that may be involved in setting the stage for behaviors, but were directed into specific outcomes by other genetic or environmental factors.


Some AI tools don’t understand biology yet


A collection of new studies on gene activity shows that AI tools aren’t very good at predicting it.

Gene activity appears to remain beyond the abilities of AI at the moment. Credit: BSIP

Biology is an area of science where AI and machine-learning approaches have seen some spectacular successes, such as designing enzymes to digest plastics and proteins to block snake venom. But in an era of seemingly endless AI hype, it might be easy to think that we could just set AI loose on the mounds of data we’ve already generated and end up with a good understanding of most areas of biology, allowing us to skip a lot of messy experiments and the unpleasantness of research on animals.

But biology involves a whole lot more than just protein structures. And it’s extremely premature to suggest that AI can be equally effective at handling all aspects of biology. So we were intrigued to see a study comparing a set of AI software packages designed to predict how active genes will be in cells exposed to different conditions. As it turns out, the AI systems couldn’t manage to do any better than a deliberately simplified method of predicting.

The results serve as a useful caution that biology is incredibly complex, and developing AI systems that work for one aspect of it is not an indication that they can work for biology generally.

AI and gene activity

The study was conducted by a trio of researchers based in Heidelberg: Constantin Ahlmann-Eltze, Wolfgang Huber, and Simon Anders. They note that a handful of additional studies have been released while their work was on a pre-print server, all of them coming to roughly the same conclusions. But these authors’ approach is pretty easy to understand, so we’ll use it as an example.

The AI software they examined attempts to predict changes in gene activity. While every cell carries copies of the roughly 20,000 genes in the human genome, not all of them are active in a given cell (“active” in this case meaning they are producing messenger RNAs). Some provide an essential function and are active at high levels at all times. Others are only active in specific cell types, like nerves or skin. Still others are activated under specific conditions, like low oxygen or high temperatures.

Over the years, we’ve done many studies examining the activity of every gene in a given cell type under different conditions. These studies can range from using gene chips to determine which messenger RNAs are present in a population of cells to sequencing the RNAs isolated from single cells and using that data to identify which genes are active. But collectively, they can provide a broad, if incomplete, picture that links the activity of genes with different biological circumstances. It’s a picture you could potentially use to train an AI that would make predictions about gene activity under conditions that haven’t been tested.

Ahlmann-Eltze, Huber, and Anders tested a set of what are called single-cell foundation models that have been trained on this sort of gene activity data. The “single cell” portion indicates that these models have been trained on gene activity obtained from individual cells rather than a population average of a cell type. The “foundation model” portion means that they have been trained on a broad range of data but will require additional training before they’re deployed for a specific task.

Underwhelming performance

The task in this case is predicting how gene activity might change when genes are altered. When an individual gene is lost or activated, it’s possible that the only messenger RNA that is altered is the one made by that gene. But some genes encode proteins that regulate a collection of other genes, in which case you might see changes in the activity of dozens of genes. In other cases, the loss or activation of a gene could affect a cell’s metabolism, resulting in widespread alterations of gene activity.

Things get even more complicated when two genes are involved. In many cases, the genes will do unrelated things, and you get a simple additive effect: the changes caused by the loss of one, plus the changes caused by the loss of the other. But if there’s some overlap between the functions, you can get an enhancement of some changes, suppression of others, and other unexpected changes.

To start exploring these effects, researchers have intentionally altered the activity of one or more genes using the CRISPR DNA editing technology, then sequenced every RNA in the cell afterward to see what sorts of changes took place. This approach (termed Perturb-seq) is useful because it can give us a sense of what the altered gene does in a cell. But for Ahlmann-Eltze, Huber, and Anders, it provides the data they need to determine if these foundation models can be trained to predict the ensuing changes in the activity of other genes.

Starting with the foundation models, the researchers conducted additional training using data from an experiment where either one or two genes were activated using CRISPR. This training used the data from 100 individual gene activations and another 62 where two genes were activated. Then, the AI packages were asked to predict the results for another 62 pairs of genes that were activated. For comparison, the researchers also made predictions using two extremely simple models: one that always predicted that nothing would change and a second that always predicted an additive effect (meaning that activating genes A and B would produce the changes caused by activating A plus the changes caused by activating B).
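The two simple baselines are easy to express in code. The sketch below is illustrative only: the expression values are made up, the function names are hypothetical, and the Euclidean distance used as the error measure is an assumption rather than the metric the researchers necessarily used.

```python
import math


def no_change_baseline(control):
    """Predict that activating the genes leaves every gene at control level."""
    return list(control)


def additive_baseline(control, single_a, single_b):
    """Predict control plus the change seen with A alone plus that with B alone."""
    return [c + (a - c) + (b - c) for c, a, b in zip(control, single_a, single_b)]


def prediction_error(predicted, observed):
    """Euclidean distance between predicted and observed expression profiles."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)))


# Toy expression levels for four genes (invented numbers, not study data).
control = [1.0, 2.0, 3.0, 4.0]
single_a = [2.0, 2.0, 3.0, 4.0]      # activating A alone raises gene 0
single_b = [1.0, 2.0, 5.0, 4.0]      # activating B alone raises gene 2
observed_ab = [2.0, 2.0, 5.0, 6.0]   # gene 3 changes only when both are active

additive_pred = additive_baseline(control, single_a, single_b)
additive_err = prediction_error(additive_pred, observed_ab)
no_change_err = prediction_error(no_change_baseline(control), observed_ab)

print(additive_pred, additive_err, no_change_err)
```

In this toy case, the additive baseline misses only the change in the fourth gene, which appears solely when both genes are activated together. That kind of non-additive interaction is what a useful model would need to predict beyond what the baseline already captures.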

The AI models didn’t work. “All models had a prediction error substantially higher than the additive baseline,” the researchers concluded. The result held when the researchers used alternative measurements of the accuracy of the AI’s predictions.

The gist of the problem seemed to be that the trained foundation models weren’t very good at predicting when the alterations of pairs of genes would produce complex patterns of changes—when the alteration of one gene synergized with the alteration of a second. “The deep learning models rarely predicted synergistic interactions, and it was even rarer that those predictions were correct,” the researchers concluded. In a separate test that looked specifically at these synergies between genes, it turned out that none of the models were better than the simplified system that always predicted no changes.

Not there yet

The overall conclusions from the work are pretty clear. “As our deliberately simple baselines are incapable of representing realistic biological complexity yet were not outperformed by the foundation models,” the researchers write, “we conclude that the latter’s goal of providing a generalizable representation of cellular states and predicting the outcome of not-yet-performed experiments is still elusive.”

It’s important to emphasize that “still elusive” doesn’t mean we’re incapable of ever developing an AI that can help with this problem. It also doesn’t mean that this applies to all cellular states (the results are specific to gene activity), much less all of biology. At the same time, the work provides a valuable caution at a time when there’s a lot of enthusiasm for the idea that AI’s success in a couple of areas means we’re on the cusp of a world where it can be applied to anything.

Nature Methods, 2025. DOI: 10.1038/s41592-025-02772-6



DNA links modern pueblo dwellers to Chaco Canyon people

A thousand years ago, the people living in Chaco Canyon were building massive structures of intricate masonry and trading with locations as far away as Mexico. Within a century, however, the area would be largely abandoned, with little indication that the same culture was re-established elsewhere. If the people of Chaco Canyon migrated to new homes, it’s unclear where they ended up.

Around the same time that construction expanded in Chaco Canyon, far smaller pueblos began appearing in the northern Rio Grande Valley hundreds of kilometers away. These have remained occupied to the present day in New Mexico; although their populations shrank dramatically after European contact, their relationship to the Chaco culture has remained ambiguous. Until now, that is. People from one of these communities, Picuris Pueblo, worked with specialists in ancient DNA to show that they are the closest relatives of the Chaco people yet discovered, confirming aspects of the pueblo’s oral traditions.

A pueblo-driven study

The list of authors of the new paper describing this genetic connection includes members of the Pueblo government, including its present governor. That’s because the study was initiated by the members of the Pueblo, who worked with archeologists to get in contact with DNA specialists at the Center for GeoGenetics at the University of Copenhagen. In a press conference, members of the Pueblo said they’d been aware of the power of DNA studies via their use in criminal cases and ancestry services. The leaders of Picuris Pueblo felt that it could help them understand their origin and the nature of some of their oral history, which linked them to the wider Pueblo-building peoples.

After two years of discussions, the collaboration settled on a plan of research, and the ancient DNA specialists were given access to both ancient skeletons at Picuris Pueblo and samples from present-day residents. These were used to generate complete genome sequences.

The first clear result is that there is a strong continuity in the population living at Picuris. The ancient skeletons range from 500 to 700 years old, and thus date back to roughly the time of European contact, with some predating it. They also share strong genetic connections to the people of Chaco Canyon, where DNA has also been obtained from remains. “No other sampled population, ancient or present-day, is more closely related to Ancestral Puebloans from Pueblo Bonito [in Chaco Canyon] than the Picuris individuals are,” the paper concludes.


In one dog breed, selection for utility may have selected for obesity

High-risk Labradors also tended to pester their owners for food more often. Dogs with low genetic risk scores, on the other hand, stayed slim regardless of whether their owners paid attention to how and whether they were fed.

But other findings proved less obvious. “We’ve long known chocolate-colored Labradors are prone to being overweight, and I’ve often heard people say that’s because they’re really popular as pets for young families with toddlers that throw food on the floor all the time and where dogs are just not given that much attention,” Raffan says. Her team’s data showed that chocolate Labradors actually had a much higher genetic obesity risk than yellow or black ones.

Some of the Labradors particularly prone to obesity, the study found, were guide dogs, which were included in the initial group. Training a guide dog in the UK usually takes around two years, during which the dogs learn multiple skills, like avoiding obstacles, stopping at curbs, navigating complex environments, and responding to emergency scenarios. Not all dogs are able to successfully finish this training, which is why guide dogs are often selectively bred with other guide dogs in the hope their offspring would have a better chance at making it through the same training.

But it seems that this selective breeding among guide dogs might have had unexpected consequences. “Our results raise the intriguing possibility that we may have inadvertently selected dogs prone to obesity, dogs that really like their food, because that makes them a little bit more trainable. They would do anything for a biscuit,” Raffan says.

The study also found that genes responsible for obesity in dogs are also responsible for obesity in humans. “The impact high genetic risk has on dogs leads to increased appetite. It makes them more interested in food,” Raffan claims. “Exactly the same is true in humans. If you’re at high genetic risk you aren’t inherently lazy or rubbish about overeating—it’s just you are more interested in food and get more reward from it.”

Science, 2025. DOI: 10.1126/science.ads2145


“Wooly mice” a test run for mammoth gene editing

On Tuesday, the team behind the plan to bring mammoth-like animals back to the tundra announced the creation of what it is calling wooly mice, which have long fur reminiscent of the woolly mammoth. The long fur was created through the simultaneous editing of as many as seven genes, all with a known connection to hair growth, color, and/or texture.

But don’t think that this is a sort of mouse-mammoth hybrid. Most of the genetic changes were first identified in mice, not mammoths. So, the focus is on the fact that the team could do simultaneous editing of multiple genes—something that they’ll need to be able to do to get a considerable number of mammoth-like changes into the elephant genome.

Of mice and mammoths

The team at Colossal Biosciences has started a number of de-extinction projects, including the dodo and thylacine, but its flagship project is the mammoth. In all of these cases, the plan is to take stem cells from a closely related species that has not gone extinct, and edit in a series of changes based on the corresponding genomes of the deceased species. In the case of the mammoth, that means the elephant.

But the elephant poses a large number of challenges, as the draft paper that describes the new mice acknowledges. “The 22-month gestation period of elephants and their extended reproductive timeline make rapid experimental assessment impractical,” the researchers write. “Further, ethical considerations regarding the experimental manipulation of elephants, an endangered species with complex social structures and high cognitive capabilities, necessitate alternative approaches for functional testing.”

So, they turned to a species that has been used for genetic experiments for over a century: the mouse. We can do all sorts of genetic manipulations in mice, and have ways of using embryonic stem cells to get those manipulations passed on to a new generation of mice.

For testing purposes, the mouse also has a very significant advantage: mutations that change its fur are easy to spot. Over the century-plus that we’ve been using mice for research, people have noticed and observed a huge variety of mutations that affect their fur, altering color, texture, and length. In many of these cases, the DNA changes responsible have been identified.


These hornets break down alcohol so fast that they can’t get drunk

Many animals, including humans, have developed a taste for alcohol in some form, but excessive consumption often leads to adverse health effects. One exception is the Oriental hornet. According to a new paper published in the Proceedings of the National Academy of Sciences, these hornets can guzzle seemingly unlimited amounts of ethanol regularly and at very high concentrations with no ill effects, not even intoxication. They pretty much drank honeybees used in the same experiments under the table.

“To the best of our knowledge, Oriental hornets are the only animal in nature adapted to consuming alcohol as a metabolic fuel,” said co-author Eran Levin of Tel Aviv University. “They show no signs of intoxication or illness, even after chronically consuming huge amounts of alcohol, and they eliminate it from their bodies very quickly.”

Per Levin et al., there’s a “drunken monkey” theory that predicts that certain animals well-adapted to low concentrations of ethanol in their diets nonetheless have adverse reactions at higher concentrations. Studies have shown that tree shrews, for example, can handle concentrations of up to 3.8 percent, but in laboratory conditions, when they consumed ethanol in concentrations of 10 percent or higher, they were prone to liver damage.

Similarly, fruit flies are fine with concentrations up to 4 percent but have increased mortality rates above that range. They’re certainly capable of drinking more: fruit flies can imbibe half their body volume in 15 percent (30 proof) alcohol each day. Not even spiking the ethanol with bitter quinine slows them down. Granted, they have ultra-fast metabolisms—the better to burn off the booze—but they can still become falling-down drunk. And fruit flies vary in their tolerance for alcohol depending on their genetic makeup—that is, how quickly their bodies adapt to the ethanol, requiring them to inhale more and more of it to achieve the same physical effects, much like humans.



The fish with the genome 30 times larger than ours gets sequenced


The African lungfish, showing its thin, wispy fins.

When it was first discovered, the coelacanth caused a lot of excitement. It was a living example of a group of fish that was thought to only exist as fossils. And not just any group of fish. With their long, stalk-like fins, coelacanths and their kin are thought to include the ancestors of all vertebrates that aren’t fish—the tetrapods, or vertebrates with four limbs. Meaning, among a lot of other things, us.

Since then, however, evidence has piled up that we’re more closely related to lungfish, which live in freshwater and are found in Africa, Australia, and South America. But lungfish are a bit weird. The African and South American species have seen the limb-like fins of their ancestors reduced to thin, floppy strands. And getting some perspective on their evolutionary history has proven difficult because they have the largest genomes known in animals, with the South American lungfish genome containing over 90 billion base pairs. That’s 30 times the amount of DNA we have.

But new sequencing technology has made tackling that sort of challenge manageable, and an international collaboration has now completed the largest genome ever sequenced, one where all but one chromosome carry more DNA than is found in the human genome. The work points to a history where the South American lungfish has been adding 3 billion extra bases of DNA every 10 million years for the last 200 million years, all without adding a significant number of new genes. Instead, it seems to have lost the ability to keep junk DNA in check.
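As a back-of-the-envelope check, the stated growth rate can be multiplied out. The sketch below takes the article's figures at face value and assumes a constant rate over the whole period (a simplification; the variable names are illustrative). It gives roughly 60 billion added bases, a large share of the 90-billion-plus total.

```python
# Figures quoted in the text; a constant rate is an assumption for illustration.
BASES_PER_STEP = 3_000_000_000   # bases added per step
STEP_YEARS = 10_000_000          # length of one step, in years
SPAN_YEARS = 200_000_000         # total period considered, in years

steps = SPAN_YEARS // STEP_YEARS
added_bases = BASES_PER_STEP * steps

print(f"{steps} steps, about {added_bases / 1e9:.0f} billion bases added")
```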

Going long

The work was enabled by a technology generically termed “long-read sequencing.” Most of the genomes that were completed were done using short reads, typically in the area of 100–200 base pairs long. The secret was to do enough sequencing that, on average, every base in the genome should be sequenced multiple times. Given that, a cleverly designed computer program could figure out where two bits of sequence overlapped and register that as a single, longer piece of sequence, repeating the process until the computer spit out long strings of contiguous bases.
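The overlap-and-merge idea can be sketched in a few lines. This is a toy greedy assembler, not the software used for any real genome project; the function names, the `min_len` threshold, and the example reads are all made up for illustration, and it assumes error-free reads from a single strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical 6-base reads sampled every 2 bases from a 14-base sequence.
reads = ["ATGCGT", "GCGTAC", "GTACGT", "ACGTTA", "GTTAGC"]
assembled = greedy_assemble(reads)
print(assembled)
```

Because every position here is covered by more than one read and no stretch repeats, the reads collapse into a single contiguous sequence. Real assemblers use far more efficient data structures and must also cope with sequencing errors.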

The problem is that most non-microbial species have stretches of repeated sequence (think hundreds of copies of the bases G and A in a row) that are longer than a few hundred bases—and nearly identical sequences that show up in multiple locations in the genome. Reads from these regions are impossible to match to a unique location, and so the output of the genome assembly software would have lots of gaps of unknown length and sequence.

This creates extreme difficulty for genomes like that of the lungfish, which is filled with non-functional “junk” DNA, which is typically repetitive. The software tends to produce a genome that’s more gap than sequence.

Long-read technology gets around that by doing exactly what its name implies. Rather than being able to sequence fragments of 200 bases or so, it can generate sequences that are thousands of base pairs long, easily covering the entire repeat that would have otherwise created a gap. One early version of long-read technology involved stuffing long DNA molecules through pores and watching for different voltage changes across the pore as different bases passed through it. Another had a DNA copying enzyme make a duplicate of a long strand and watch for fluorescence changes as different bases were added. These early versions tended to be a bit error-prone but have since been improved, and several newer competing technologies are now on the market.
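The difference read length makes can be shown with a toy example. The sketch below builds a made-up genome containing two copies of a GA repeat, then counts how many places a read could be placed. The sequences and the helper name `count_placements` are hypothetical, and exact string matching stands in for real read alignment.

```python
def count_placements(genome: str, read: str) -> int:
    """Count the positions (overlapping allowed) where read matches exactly."""
    count, start = 0, genome.find(read)
    while start != -1:
        count += 1
        start = genome.find(read, start + 1)
    return count


# Toy genome: unique flanking sequence around two copies of a 20-base repeat.
repeat = "GA" * 10
genome = "ACCTG" + repeat + "TTCAG" + repeat + "CATGC"

short_read = "GAGAGAGA"    # falls entirely inside a repeat
long_read = genome[2:28]   # spans the first repeat plus unique flanks

print(count_placements(genome, short_read))  # many possible placements
print(count_placements(genome, long_read))   # exactly one placement
```

The short read fits at many positions in both repeat copies, so an assembler cannot tell where it belongs. The long read carries unique sequence on both sides of the repeat, which pins it to one location.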

Back in 2021, researchers used this technology to complete the genome of the Australian lungfish—the one that maintains the limb-like fins of the ancestors that gave rise to tetrapods. Now they’re back with the genomes from African and South American species. These species seem to have gone their separate ways during the breakup of the supercontinent Gondwana, a process that started nearly 200 million years ago. And having the genomes of all three should give us some perspective on the features that are common to all lungfish species, and thus are more likely to have been shared with the distant ancestors that gave rise to tetrapods.



Path to precision: Targeted cancer drugs go from table to trials to bedside


Aurich Lawson

In 1972, Janet Rowley sat at her dining room table and cut tiny chromosomes from photographs she had taken in her laboratory. One by one, she snipped out the small figures her children teasingly called paper dolls. She then carefully laid them out in 23 matching pairs—and warned her kids not to sneeze.

The physician-scientist had just mastered a new chromosome-staining technique in a year-long sabbatical at Oxford. But it was in the dining room of her Chicago home where she made the discovery that would dramatically alter the course of cancer research.

Rowley's 1973 partial karyotype showing the 9;22 translocation


Looking over the chromosomes of a patient with acute myeloid leukemia (AML), she realized that segments of chromosomes 8 and 21 had broken off and swapped places—a genetic trade called a translocation. She looked at the chromosomes of other AML patients and saw the same switch: the 8;21 translocation.

Later that same year, she saw another translocation, this time in patients with a different type of blood cancer, called chronic myelogenous leukemia (CML). Patients with CML were known to carry a puzzling abnormality in chromosome 22 that made it appear shorter than normal. The abnormality was called the Philadelphia chromosome after its discovery by two researchers in Philadelphia in 1959. But it wasn’t until Rowley pored over her meticulously set dining table that it became clear why chromosome 22 was shorter—a chunk of it had broken off and traded places with a small section of chromosome 9, a 9;22 translocation.

Rowley had the first evidence that genetic abnormalities were the cause of cancer. She published her findings in 1973, with the CML translocation published in a single-author study in Nature. In the years that followed, she strongly advocated for the idea that the abnormalities were significant for cancer. But she was initially met with skepticism. At the time, many researchers considered chromosomal abnormalities to be a result of cancer, not the other way around. Rowley’s findings were rejected from the prestigious New England Journal of Medicine. “I got sort of amused tolerance at the beginning,” she said before her death in 2013.

The birth of targeted treatments

But the evidence mounted quickly. In 1977, Rowley and two of her colleagues at the University of Chicago identified another chromosomal translocation—15;17—that causes a rare blood cancer called acute promyelocytic leukemia. By 1990, over 70 translocations had been identified in cancers.

The significance mounted quickly as well. Following Rowley’s discovery of the 9;22 translocation in CML, researchers figured out that the genetic swap creates a fusion of two genes. Part of the ABL gene normally found on chromosome 9 becomes attached to the BCR gene on chromosome 22, creating the cancer-driving BCR::ABL fusion gene on chromosome 22. This genetic merger codes for a signaling protein—a tyrosine kinase—that is permanently stuck in “active” mode. As such, it perpetually triggers signaling pathways that lead white blood cells to grow uncontrollably.

Schematic of the 9;22 translocation and the creation of the BCR::ABL fusion gene.


By the mid-1990s, researchers had developed a drug that blocks the BCR::ABL protein, a tyrosine kinase inhibitor (TKI) called imatinib. For patients in the chronic phase of CML—about 90 percent of CML patients—imatinib raised the 10-year survival rate from less than 50 percent to a little over 80 percent. Imatinib (sold as Gleevec or Glivec) earned approval from the Food and Drug Administration in 2001, marking the first approval for a cancer therapy targeting a known genetic alteration.

With imatinib’s success, targeted cancer therapies—aka precision medicine—took off. By the early 2000s, there was widespread interest among researchers in precisely identifying the genetic underpinnings of cancer. At the same time, the revolutionary development of next-generation genetic sequencing acted like jet fuel for the soaring field. The technology eased the identification of mutations and genetic abnormalities driving cancers. Sequencing is now considered standard care in the diagnosis, treatment, and management of many cancers.

The development of gene-targeting cancer therapies skyrocketed. Classes of TKIs, like imatinib, expanded particularly fast. There are now over 50 FDA-approved TKIs targeting a wide variety of cancers. For instance, the TKIs lapatinib, neratinib, tucatinib, and pyrotinib target human epidermal growth factor receptor 2 (HER2), which runs amok in some breast and gastric cancers. The TKI ruxolitinib targets Janus kinase 2, which is often mutated in the rare blood cancer myelofibrosis and the slow-growing blood cancer polycythemia vera. CML patients, meanwhile, now have five TKI therapies to choose from.



Much of Neanderthal genetic diversity came from modern humans


The basic outline of the interactions between modern humans and Neanderthals is now well established. The two came in contact as modern humans began their major expansion out of Africa, which occurred roughly 60,000 years ago. Humans picked up some Neanderthal DNA through interbreeding, while the Neanderthal population, always fairly small, was swept away by the waves of new arrivals.

But there are some aspects of this big-picture view that don’t entirely line up with the data. While it nicely explains the fact that Neanderthal sequences are far more common in non-African populations, it doesn’t account for the fact that every African population we’ve looked at has some DNA that matches up with Neanderthal DNA.

A study published on Thursday argues that much of this match came about because an early modern human population also left Africa and interbred with Neanderthals. But in this case, the result was to introduce modern human DNA to the Neanderthal population. The study shows that this DNA accounts for a lot of Neanderthals’ genetic diversity, suggesting that their population was even smaller than earlier estimates had suggested.

Out of Africa early

This study isn’t the first to suggest that modern humans and their genes met Neanderthals well in advance of our major out-of-Africa expansion. The key to understanding this is the genome of a Neanderthal from the Altai region of Siberia, which dates from roughly 120,000 years ago. That’s well before modern humans expanded out of Africa, yet its genome has some regions that have excellent matches to the human genome but are absent from the Denisovan lineage.

One explanation for this is that these are segments of Neanderthal DNA that were later picked up by the population that expanded out of Africa. The problem with that view is that most of these sequences also show up in African populations. So, researchers advanced the idea that an ancestral population of modern humans left Africa about 200,000 years ago, and some of its DNA was retained by Siberian Neanderthals. That’s consistent with some fossil finds that place anatomically modern humans in the Mideast at roughly the same time.

There is, however, an alternative explanation: Some of the population that expanded out of Africa 60,000 years ago and picked up Neanderthal DNA migrated back to Africa, taking the Neanderthal DNA with them. That has led to a small bit of the Neanderthal DNA persisting within African populations.

To sort this all out, a research team based at Princeton University focused on the Neanderthal DNA found in Africans, taking advantage of the fact that we now have a much larger array of completed human genomes (approximately 2,000 of them).

The work was based on a simple hypothesis. All of our work on Neanderthal DNA indicates that their population was relatively small, and thus had less genetic diversity than modern humans did. If that’s the case, then the addition of modern human DNA to the Neanderthal population should have boosted its genetic diversity. If so, then the stretches of “Neanderthal” DNA found in African populations should include some of the more diverse regions of the Neanderthal genome.



Frozen mammoth skin retained its chromosome structure


One of the challenges of working with ancient DNA samples is that damage accumulates over time, breaking up the structure of the double helix into ever smaller fragments. In the samples we’ve worked with, these fragments scatter and mix with contaminants, making reconstructing a genome a large technical challenge.

But a dramatic paper released on Thursday shows that this isn’t always true. Damage does create progressively smaller fragments of DNA over time. But, if they’re trapped in the right sort of material, they’ll stay right where they are, essentially preserving some key features of ancient chromosomes even as the underlying DNA decays. Researchers have now used that to detail the chromosome structure of mammoths, with some implications for how these mammals regulated some key genes.

DNA meets Hi-C

The backbone of DNA’s double helix consists of alternating sugars and phosphates, chemically linked together (the bases of DNA are chemically linked to these sugars). Damage from things like radiation can break these chemical linkages, with fragmentation increasing over time. When samples reach the age of something like a Neanderthal, very few fragments are longer than 100 base pairs. Since chromosomes are millions of base pairs long, it was thought that this would inevitably destroy their structure, as many of the fragments would simply diffuse away.

But that will only be true if the medium they’re in allows diffusion. And some scientists suspected that permafrost, which preserves the tissue of some now-extinct Arctic animals, might block that diffusion. So, they set out to test this using mammoth tissues, obtained from a sample termed YakInf that’s roughly 50,000 years old.

The challenge is that the molecular techniques we use to probe chromosomes take place in liquid solutions, where fragments would just drift away from each other in any case. So, the team focused on an approach termed Hi-C, which specifically preserves information about which bits of DNA were close to each other. It does this by exposing chromosomes to a chemical that will link any pieces of DNA that are in close physical proximity. So, even if those pieces are fragments, they’ll be stuck to each other by the time they end up in a liquid solution.

A few enzymes are then used to convert these linked molecules to a single piece of DNA, which is then sequenced. This data, which will contain sequence information from two different parts of the genome, then tells us that those parts were once close to each other inside a cell.

Interpreting Hi-C

On its own, a single bit of data like this isn’t especially interesting; two bits of genome might end up next to each other at random. But when you have millions of bits of data like this, you can start to construct a map of how the genome is structured.

There are two basic rules governing the pattern of interactions we’d expect to see. The first is that interactions within a chromosome are going to be more common than interactions between two chromosomes. And, within a chromosome, parts that are physically closer to each other on the molecule are more likely to interact than those that are farther apart.

So, if you are looking at a specific segment of, say, chromosome 12, most of the locations Hi-C will find it interacting with will also be on chromosome 12. And the frequency of interactions will go up as you move to sequences that are ever closer to the one you’re interested in.
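To make the idea of a contact map concrete, here is a minimal sketch of how linked pairs might be tallied. It assumes each Hi-C read pair has already been mapped to two genomic positions. The bin size, the function names, and the example pairs are all hypothetical, and the tiny data set was chosen by hand to follow the two rules above rather than taken from any real experiment.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # hypothetical bin width, in base pairs


def bin_of(locus):
    """Map a (chromosome, position) locus to a (chromosome, bin index) pair."""
    chrom, pos = locus
    return (chrom, pos // BIN_SIZE)


def contact_map(pairs):
    """Count how often each pair of bins was linked together."""
    counts = Counter()
    for a, b in pairs:
        key = tuple(sorted((bin_of(a), bin_of(b))))  # order-independent key
        counts[key] += 1
    return counts


def intra_fraction(counts):
    """Fraction of contacts whose two ends sit on the same chromosome."""
    total = sum(counts.values())
    intra = sum(n for (a, b), n in counts.items() if a[0] == b[0])
    return intra / total if total else 0.0


# Made-up linked pairs: mostly nearby loci on chr12, one contact with chr3.
pairs = [
    (("chr12", 1_200_000), ("chr12", 1_900_000)),
    (("chr12", 1_500_000), ("chr12", 2_100_000)),
    (("chr12", 2_300_000), ("chr12", 1_100_000)),
    (("chr12", 1_700_000), ("chr12", 9_400_000)),
    (("chr12", 1_050_000), ("chr3", 4_000_000)),
]

counts = contact_map(pairs)
print(counts.most_common(3))
print(intra_fraction(counts))
```

With millions of real pairs instead of five invented ones, the same tally becomes a matrix in which contacts cluster within each chromosome and thin out with distance along it.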

On its own, Hi-C can help reconstruct a chromosome even if you start with nothing but fragments. But the exceptions to the expected pattern also tell us things about biology. For example, genes that are active tend to be on loops of DNA, with the two ends of the loop held together by proteins; the same is true for inactive genes. Interactions within these loops tend to be more frequent than interactions between them, subtly altering the frequency with which two fragments end up linked together during Hi-C.
