In an EMR commentary (http://evomed.org/?p=1644) from March two years ago, I discussed issues related to the functional classification of genomic DNA sequences that arose in the context of claims from the ENCODE (ENCyclopedia Of DNA Elements) consortium.  A particular focus of that piece was an article by Graur and colleagues (2013) that offered an often humorous but rather stinging critique of the definition of “function” applied by the ENCODE authors to genomic DNA sequences.  Graur and two of his associates have now published (2015) an interesting and valuable functional classification of genomic sequences that is critically informed by their understanding of evolution.

A key distinction that the authors draw, and one essential to their subsequent analysis, is between “causal-role activity” and “selected-effect function.”  For instance, a genomic sequence may be transcribed, but unless selection acts or has acted on the transcribed product, transcription is not a biologically meaningful effect.  What matters from the perspective of evolution are the biochemical activities that account for why the corresponding genomic elements are present in the genome.

They begin by rejecting two equivalences they believe to be erroneous. The first (attributed especially to the medical literature) is that any non-coding DNA can be regarded as “junk” DNA, i.e. DNA that neither enhances nor degrades fitness. The second is the equating of causal-role activity with biologically meaningful function.

The authors also emphasize a critical point that may not be apparent to those relatively unfamiliar with population genetics theory and the concept of effective population size: no one should expect evolution to produce genomes that consist solely of sequences associated with one or more selected-effect functions.  Only a population meeting three rather stringent criteria could achieve such functional purity: 1) an infinite effective population size, 2) a substantial and uniformly negative effect on fitness from increasing genome size by even one nucleotide, and 3) a short generation time.  According to the authors, even bacterial populations cannot meet these exacting requirements.

The first of a series of distinctions applied to genomic elements is that between those that are “functional” and those that are “rubbish.”  In this scheme, a functional sequence is one associated with a selected-effect function that has contributed to fitness (i.e. was selected) and/or contributed to retention of the sequence.  In contrast, rubbish sequences are those that are not associated with a selected-effect function.  The mere fact that a DNA sequence is transcribed into RNA or that a protein can bind to it would not qualify that sequence for functional status.

Graur et al. further divide functional sequences into “literal DNA” sequences and “indifferent DNA” sequences, where the former are under selection for precise orders of particular nucleotides and the latter are under selection that is not influenced by the particular identities or precise orders of nucleotides.  Literal sequences will include: 1) those that are transcribed into RNA that is then translated into functionally meaningful proteins, 2) those that are transcribed into RNA molecules, such as micro-RNAs, that directly mediate functions, such as influencing the rate of transcription or translation for protein coding genes, and 3) non-coding sequences, such as those that are involved in the regulation of gene transcription, which generally requires sequence-specific binding by proteins known as transcription factors.

Examples of indifferent sequences include those that serve as spacers between literal sequences, those that contribute to achieving nuclei of optimal size, and those that reduce the probability of frameshift mutations, which can subvert the selected-effect function of a gene and the corresponding protein product.  The authors also note that for amino acids encoded by four different codons with the same first two nucleotides, such as leucine and valine, the third position is indifferent DNA, since any of the four deoxynucleotides will suffice to direct the incorporation of the same amino acid into a new polypeptide chain.  Another example, not considered by the authors, is the DNA in the bacteria-trapping extracellular fibers generated by neutrophils (NETs) (Brinkmann et al., 2004).
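The point about third codon positions can be checked directly against the standard genetic code. The following is a minimal sketch of my own, not anything taken from Graur et al.; the names `CODON_TABLE` and `fourfold_degenerate_prefixes` are illustrative. It lists every pair of first and second nucleotides for which all four third-position nucleotides specify the same amino acid.

```python
from itertools import product

# Standard genetic code, DNA alphabet, with bases ordered T, C, A, G.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fourfold_degenerate_prefixes(table):
    """Map each two-nucleotide prefix to its amino acid when the
    third position can be any of the four nucleotides."""
    result = {}
    for first, second in product(BASES, repeat=2):
        encoded = {table[first + second + third] for third in BASES}
        if len(encoded) == 1:
            result[first + second] = encoded.pop()
    return result


families = fourfold_degenerate_prefixes(CODON_TABLE)
for prefix, aa in sorted(families.items()):
    print(prefix + "N", "->", aa)
```

Valine (GTN) and leucine (CTN) both appear in the output. Leucine is also specified by two further codons, TTA and TTG, which lie outside its four-codon family, so the third-position indifference described above applies to the CTN codons only.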

The neutral roles of indifferent DNA sequences are not affected by individual nucleotide substitutions.  However, insertions and deletions can have consequences for selected-effect function and thereby affect fitness.  In the case of NETs, even insertions and deletions of modest size would probably not greatly influence selected-effect function.

Rubbish DNA is separated into junk and garbage.  Neither has a selected-effect function; junk sequences neither enhance nor degrade fitness, whereas garbage sequences actively reduce fitness.
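The four categories amount to a small decision tree, which can be sketched as follows. This is my own simplified reading of the scheme as summarized here, not code from the paper; the function name `classify` and its three yes/no inputs are assumptions made for illustration, and real sequences will rarely come with such tidy answers.

```python
def classify(selected_effect: bool,
             sequence_specific: bool = False,
             deleterious: bool = False) -> str:
    """Assign a sequence to one of the four categories.

    selected_effect:   the sequence has a selected-effect function
    sequence_specific: selection depends on the precise nucleotide order
    deleterious:       the sequence reduces fitness (only consulted
                       when there is no selected-effect function)
    """
    if selected_effect:
        if sequence_specific:
            return "functional-literal"
        return "functional-indifferent"
    # No selected-effect function: the sequence is rubbish.
    return "garbage" if deleterious else "junk"


# A protein-coding exon, a spacer under selection for length only,
# and two kinds of rubbish.
print(classify(selected_effect=True, sequence_specific=True))
print(classify(selected_effect=True))
print(classify(selected_effect=False))
print(classify(selected_effect=False, deleterious=True))
```

Note that transcription or protein binding does not appear among the inputs: in this scheme, causal-role activity alone says nothing about the category.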

Why is garbage DNA not eliminated if it is deleterious?  It may be lost from a genome given enough time, but as Graur et al. note, natural selection is neither all-powerful nor instantaneously effective.  In fact, negative selection first requires a random event in one individual that removes a garbage sequence (or, see below, transforms it into a different functional category of sequence); that individual must then prove relatively advantaged in comparison to individuals that retain the original garbage sequence.

In the concluding section of the article, the authors focus on the fundamentally dynamic nature of genomes.  Any genomic sequence classified at a given time into one of the four major categories (functional-literal, functional-indifferent, junk, and garbage) can at a later time be altered through mutation such that it requires re-classification in a new category.  Graur and colleagues inject a bit of levity into the discussion by naming a few of the 12 possible category-to-category transitions: 1) garbage DNA mutated into functional DNA is termed “Lazarus DNA,” based on the resurrected biblical character, 2) functional DNA transformed into garbage DNA is termed “Hyde DNA,” based on the story of Dr. Jekyll and Mr. Hyde, and 3) junk DNA altered so as to constitute garbage DNA is called “zombie DNA,” a term previously suggested in a New York Times article about a form of muscular dystrophy (Kolata, 2010).
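The count of 12 follows from simple arithmetic: each of the four categories can change into any of the other three, giving 4 × 3 ordered pairs. The short sketch below enumerates them and attaches the nicknames as described in this commentary. The helper name `nickname` is my own, and treating both functional subcategories as “functional” for naming purposes is an assumption on my part.

```python
from itertools import permutations

CATEGORIES = ("functional-literal", "functional-indifferent", "junk", "garbage")

# Every ordered pair of distinct categories is a possible transition.
transitions = list(permutations(CATEGORIES, 2))


def nickname(source, target):
    """Return the nickname for a transition as described in the
    commentary, or None if the transition was not named there."""
    if source == "garbage" and target.startswith("functional"):
        return "Lazarus DNA"
    if source.startswith("functional") and target == "garbage":
        return "Hyde DNA"
    if source == "junk" and target == "garbage":
        return "zombie DNA"
    return None


print(len(transitions), "possible transitions")
for source, target in transitions:
    label = nickname(source, target)
    if label is not None:
        print(source, "->", target, ":", label)
```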

References

Greenspan N. The power of Graur to explode ENCODE using incision and derision. March 30, 2013. http://evomed.org/?p=1644.

Graur D, Zheng Y, Price N, Azevedo RB, Zufall RA, Elhaik E. On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE. Genome Biol Evol. 2013 Feb 20. [Epub ahead of print] PubMed PMID: 23431001.

Graur D, Zheng Y, Azevedo RB. An Evolutionary Classification of Genomic Function. Genome Biol Evol. 2015 Jan 28;7(3):642-645. Review. PubMed PMID: 25635041.

Brinkmann V, Reichard U, Goosmann C, Fauler B, Uhlemann Y, Weiss DS, Weinrauch Y, Zychlinsky A. Neutrophil extracellular traps kill bacteria. Science. 2004 Mar 5;303(5663):1532-5. PubMed PMID: 15001782.

Kolata G. 2010. Reanimated ‘junk’ DNA is found to cause disease. New York Times http://www.nytimes.com/2010/08/20/science/20gene.html?_r=0