Santa Cruz Works


Project to Read Genomes of All 70,000 Vertebrate Species

A bold project to read the complete genetic sequences of every known vertebrate species has reached its first milestone, publishing new methods and the first 25 high-quality genomes.

By Howard Hughes Medical Institute

Originally posted in the UC Santa Cruz NEWSCENTER

It’s one of the most audacious projects in biology today—reading the entire genome of every bird, mammal, lizard, fish, and all other creatures with backbones.

And now comes the first major payoff from the Vertebrate Genomes Project (VGP): near-complete, high-quality genomes of 25 species, including the greater horseshoe bat, the Canada lynx, the platypus, and the kākāpō parrot, one of the first high-quality genomes of an endangered vertebrate species.

The VGP team, including scientists at the UC Santa Cruz Genomics Institute, published their findings April 28 in a special issue of Nature, with companion papers simultaneously published in other scientific journals.

The flagship paper lays out the technical advances that let scientists achieve a new level of accuracy and completeness and paves the way for decoding the genomes of the roughly 70,000 vertebrate species living today, said coauthor David Haussler, director of the Genomics Institute and a professor of biomolecular engineering at UCSC.

“We will get a spectacular picture of how nature actually filled out all the ecosystems with this unbelievably diverse array of animals,” said Haussler, a Howard Hughes Medical Institute (HHMI) investigator.

The new results are beginning to deliver on that promise. The project team has discovered previously unknown chromosomes in the zebra finch genome, for example, and made a surprising finding about genetic differences between marmoset and human brains. The new research also offers hope for saving the kākāpō and the endangered vaquita porpoise from extinction.

“These 25 genomes represent a key milestone,” explained coauthor Erich Jarvis, VGP chair and HHMI investigator at Rockefeller University. “We are learning a lot more than we expected. The work is a proof of principle for what’s to come.”

From 10K to 70K

The VGP milestone has been years in the making. The project’s origins date back to the late 2000s, when Haussler, geneticist Stephen O’Brien, and Oliver Ryder, director of conservation genetics at the San Diego Zoo, figured it was time to think big.

Instead of sequencing just a few species, such as humans and model organisms like fruit flies, why not read the complete genomes of ten thousand animals in a bold “Genome 10K” effort? At the time, though, the price tag was hundreds of millions of dollars, and the plan never really got off the ground.

“Everyone knew it was a great idea, but nobody wanted to pay for it,” recalled coauthor Beth Shapiro, professor of ecology and evolutionary biology at UCSC and an HHMI investigator.

Plus, scientists’ early efforts at spelling out, or “sequencing,” all the DNA letters in an animal’s genome were riddled with errors. The introduction of new sequencing technologies helped make the idea of reading thousands of genomes possible. These rapidly developing technologies slashed costs, but also reduced quality in genome assembly structure.

Then in 2015, Haussler and colleagues brought in Jarvis, a pioneer in deciphering the intricate neural circuits that let birds trill new tunes after listening to others’ songs. Jarvis had already shown a knack for managing big, complex efforts. In 2014, he and more than a hundred colleagues sequenced the genomes of 48 bird species, which turned up new genes involved in vocal learning.

“David and others asked me to take on leadership of the Genome 10K project,” Jarvis recalled. “They felt I had the personality for it.” Or, as Shapiro put it: “Erich is a very pushy leader, in a nice way. What he wants to happen, he will make happen.”

Jarvis expanded and rebranded the Genome 10K idea to include all vertebrate genomes. He also helped launch a new sequencing center at Rockefeller that, together with one at the Max Planck Institute in Germany led by former HHMI Janelia Research Campus Group Leader Gene Myers, and another at the Sanger Institute in the UK led by Richard Durbin and Mark Blaxter, is currently producing most of the VGP genome data. He asked Adam Phillippy, a leading genome expert at the National Human Genome Research Institute (NHGRI), to chair the VGP assembly team. Then, he found about 60 top scientists willing to use their own grant money to pay for the sequencing costs at the centers to tackle the genomes they were most interested in. The team also negotiated with the Māori in New Zealand and officials in Mexico to get kākāpō and vaquita samples in “a beautiful example of international collaboration,” said Sadye Paez, program director of the VGP at Rockefeller.
