The headlines are catchy: scientists have sequenced the genome of the white shark. Or the bamboo lemur, or the golden eagle. But why spend so much time and money working out the DNA makeup of different species?
Sequencing an individual genome is useful for identifying genetic markers (stretches of gene sequence) that shed light on population-level processes. But the real and lasting value of whole-genome sequencing is realized only when many accurate, high-resolution genomes have been accumulated and can be compared with one another. This type of work is just beginning.
I am an evolutionary biologist in the Florida Program for Shark Research. Our research focuses on understanding how modern sharks and rays have diversified throughout their evolution to colonize the habitats they occupy today.
Blueprints without instructions
The genome of an organism, the complete catalog of its DNA, holds the blueprint for its design. Differences in the DNA sequences that make up genomes are responsible for the differences we see among individuals.
Identical twins are physically similar to each other because their genomes are identical. Siblings resemble each other because they inherit large tracts of their genomes from the same set of parents. And closely related species look more similar to each other than distantly related ones do because their underlying genomes are more similar.
It follows that if we had the complete genome sequence for an organism, we would have all the information we need to understand how it works from the ground up. In fact, this was the rationale for the original Human Genome Project.
But the genomic DNA sequence of an organism may contain billions of nucleotides, the building blocks of the genetic code. Trying to work out what the organism is like from its genome sequence alone would be like trying to make sense of thousands of telephone conversations transmitted simultaneously in the "packets" of information that arrive at the receiving end of a fiber optic cable, without knowing anything about how the information was organized. The data is "all there," but it is difficult to know what it means without an explicit interpreter. And scientists still do not know how all the information in genomes is organized or how its activity is choreographed.
Learning how to compare
If it is so difficult to interpret the information buried in genomes, why bother collecting the data? The answer is that by comparing genomes with one another, we can deduce which elements are responsible for particular traits.
For example, humans and chimpanzees have genomes that are approximately 98 percent similar. This means that the difference of 2 percent between their respective genomes must somehow account for the differences in their appearance and their associated traits. Comparing the genomes together allows us to identify the parts of the genome responsible for the observed differences.
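As a minimal sketch of what "percent similar" means, the example below computes the share of matching positions between two short sequences and lists where they differ. The sequences are made-up toy strings, not real human or chimpanzee DNA, and the function names are hypothetical. The sketch also assumes the sequences are already aligned and of equal length, a step that real genome comparisons have to work hard to achieve.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def differing_positions(seq_a: str, seq_b: str) -> list[int]:
    """Zero-based indices at which the two aligned sequences differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Made-up toy sequences for illustration only.
toy_a = "ACGTACGTACGTACGTACGT"
toy_b = "ACGTACGTACCTACGTACGT"

print(percent_identity(toy_a, toy_b))     # 95.0
print(differing_positions(toy_a, toy_b))  # [10]
```

The list of differing positions is the interesting part: it is the candidate set in which to look for the causes of observed differences between the two organisms.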
Obviously, it is important to choose carefully which comparisons to make. Comparing a human genome with a duck-billed platypus genome will not tell us much about what makes humans, or duck-billed platypuses, so "special," so to speak. The two lineages diverged about 150 million years ago, and there are so many differences in both their genomes and their traits that it would be impossible to know which genomic differences were responsible for which traits.
However, comparing human and platypus genomes (both mammals) against a bird genome would allow us to identify aspects of the genome that humans and platypuses share but that differ from the bird's genome. In turn, comparing the genomes of various mammals and birds against amphibian genomes would help us narrow down the genomic elements that birds and mammals have in common but that differ from those of amphibians.
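The nested logic of these comparisons can be sketched with plain set operations. Everything here is hypothetical: the element labels ("e1", "e2", and so on) are placeholders rather than real genomic annotations, and treating a genome as a simple set of present-or-absent elements is a strong simplification made only for illustration.

```python
# Hypothetical element labels; not real genomic annotations.
genomes = {
    "human":     {"e1", "e2", "e3", "e4", "e5"},
    "platypus":  {"e1", "e2", "e3", "e6"},
    "bird":      {"e1", "e2", "e7"},
    "amphibian": {"e1", "e8"},
}


def shared_but_absent(ingroup, outgroup, genomes):
    """Elements present in every ingroup genome and in no outgroup genome."""
    shared = set.intersection(*(genomes[name] for name in ingroup))
    excluded = set().union(*(genomes[name] for name in outgroup))
    return shared - excluded


# Shared by both mammals, but not found in the bird.
mammal_specific = shared_but_absent(["human", "platypus"], ["bird"], genomes)

# Shared by mammals and the bird, but not found in the amphibian.
bird_mammal_specific = shared_but_absent(
    ["human", "platypus", "bird"], ["amphibian"], genomes
)

print(sorted(mammal_specific))       # ['e3']
print(sorted(bird_mammal_specific))  # ['e2']
```

Each step outward in the hierarchy adds a more distant outgroup, so the surviving elements point to progressively older shared features.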
Building genomic libraries
Hierarchical comparisons like those described above are at the core of comparative genomics, a field that explores how patterns of variation in genomes are associated with, or "map to," patterns of variation in observable traits. Biologists refer to this set of associations as the "genotype-phenotype map."
Obviously, scientists need to know the evolutionary relationships among organisms before any of this can be done, and they must make sure that the genomic information they collect is accurate. If it is inaccurate or incomplete, we risk missing important associations between genotypes and the traits they encode.
Recent advances in next-generation DNA sequencing and computer science are revolutionizing the collection and analysis of these data. But it is still expensive. It costs approximately $30,000 to sequence and assemble a genome of 2.5 billion base pairs (for comparison, the human genome has about 3 billion base pairs) with enough accuracy to be useful for comparative genomic work, and more for larger genomes such as those of lungfish or salamanders.
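For a rough sense of scale, the figures quoted above can be turned into a per-base cost. This is back-of-the-envelope arithmetic only: the assumption that cost scales linearly with genome size is mine, not the article's, and the variable names are hypothetical. Real costs depend on genome complexity, sequencing technology and assembly effort.

```python
# Figures taken from the text above.
COST_USD = 30_000
GENOME_BASE_PAIRS = 2_500_000_000

# Cost per million base pairs (megabase).
cost_per_megabase = COST_USD / (GENOME_BASE_PAIRS / 1_000_000)


def estimated_cost(base_pairs: int) -> float:
    """Scale the quoted cost linearly with genome size (a crude assumption)."""
    return cost_per_megabase * base_pairs / 1_000_000


print(cost_per_megabase)              # 12.0
print(estimated_cost(3_000_000_000))  # 36000.0
```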
An international consortium of scientists is working to collect high-quality genome sequences that meet this standard for all vertebrate animals. Initial comparisons focus on species selected to represent the evolutionary diversity of the different vertebrate groups: a set of birds, reptiles, mammals, amphibians and fish. In September 2018, the project released its first 15 high-quality reference genomes, for species such as the Canada lynx, the zebra finch and a clingfish.
Later comparisons will fill in the evolutionary gaps, until we finally have a complete set of highly accurate genomes that can be compared with one another. These genomes will improve our understanding of the genotype-phenotype map. They will also serve as references for researchers trying to understand the role that different genes play in directing normal development, and for others exploring possible causes of developmental anomalies, birth defects and genetic diseases.
Other sequencing initiatives are less focused on obtaining highly accurate or complete genomes for comparative genomic work. Many are essentially "fishing expeditions," looking to see whether something interesting turns up or to identify molecular markers that can later be used in management and conservation efforts. For example, the recently published white shark genome showed that olfactory genes were not as abundant as expected given white sharks' keen sense of smell, and that white sharks have a greater proportion of transposable elements (DNA sequences that can move from one place in the genome to another) than is typical.
These projects are often less expensive, as they are not designed to produce high-resolution genomic maps with full coverage of the genome. Unfortunately, they are of limited use for ongoing research. They are generally too incomplete to be useful to developmental biologists, and they do little to advance understanding of the genotype-phenotype map.
However, they do serve to boost public interest in the burgeoning field of genomics, which is already having a major impact on fields ranging from basic biology to applied personalized medicine. As more high-resolution genomes are assembled and compared, we can expect our understanding of the architectures that underpin different forms of life to expand exponentially.