An assumption fundamental to medical genetics is that the DNA sequence of an allele at a particular locus will (in the vast majority of instances) be faithfully transcribed into RNA and translated into protein. This assumption has been largely accepted in spite of known rates of transcriptional and translational errors as well as special cases of RNA editing, in which enzymes alter the RNA sequence post-transcriptionally in ways that can influence translation. If DNA-RNA-peptide sequence fidelity were reduced to zero, it would not be worth attempting to correlate genotype and phenotype. More fundamentally, traits would not be heritable, thereby abrogating a necessary condition for Darwinian evolution.
Therefore, the recent study by Li et al. in Science (2011) is of substantial interest. The authors document numerous differences (still a minority) between DNA sequences and the putatively corresponding RNA sequences (referred to by the authors as RNA-DNA differences, or RDDs). If the data can be confirmed to result from real biological processes (as opposed to method-associated artifacts), it will be necessary to acknowledge more complex and probabilistic relationships between the nucleotide sequences of genes and both the amino acid sequences of gene products and the traits influenced by those gene products.
Li et al. used high-throughput sequencing of RNA from the immortalized B cells of 27 individuals who were participants in the International HapMap and the 1000 Genomes projects. They found all 12 possible transitions and transversions in their RDDs, suggesting that the known mechanisms of RNA editing, which result in what amount to A-to-G or C-to-U transitions, would not fully account for all instances of DNA-RNA sequence disparity. Many of the RDDs were observed in all or most informative individuals, but some were more limited in distribution. Analysis of RNA and DNA from other cell types (primary skin fibroblasts and brain cells from other putatively normal individuals) suggested that most RDDs are not cell type-specific, although cell-dependent editing remains a possibility that one could imagine influencing disease susceptibility. For a given DNA RDD site, there was considerable variation in the fraction of resulting RNAs that carried the discrepant nucleotide.
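The count of 12 follows from simple combinatorics: each of the four DNA bases can be read as any of the three other bases in RNA. The short Python sketch below enumerates these types, separates transitions from transversions, and shows how few of them the two known editing mechanisms cover. It is an illustration only, not the authors' analysis pipeline; the function names (`classify`, `rdd_fraction`) are hypothetical, and RNA bases are written in the DNA alphabet (U shown as T), an assumption made for convenience.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Known RNA editing, as described in the text: A-to-G and C-to-U.
# Assumption: U is written as T so both sequences share one alphabet.
KNOWN_EDITING = {("A", "G"), ("C", "T")}


def classify(dna_base, rna_base):
    """Label a DNA-to-RNA base change as a transition or a transversion."""
    pair = {dna_base, rna_base}
    # A transition stays within purines or within pyrimidines.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def rdd_fraction(discrepant_reads, total_reads):
    """Fraction of RNA reads at one site that carry the discrepant base."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return discrepant_reads / total_reads


# All ordered pairs of distinct bases: 4 * 3 = 12 possible RDD types.
rdd_types = list(permutations("ACGT", 2))
transitions = [p for p in rdd_types if classify(*p) == "transition"]
transversions = [p for p in rdd_types if classify(*p) == "transversion"]
not_known_editing = [p for p in rdd_types if p not in KNOWN_EDITING]

print(len(rdd_types), len(transitions), len(transversions))
print(len(not_known_editing), "types fall outside the known editing mechanisms")
```

Under these assumptions, 4 of the 12 types are transitions and 8 are transversions, and 10 of the 12 cannot be attributed to A-to-G or C-to-U editing. The helper `rdd_fraction` corresponds to the per-site quantity mentioned above; for example, a hypothetical site with 5 discrepant reads out of 20 would give 0.25.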
While the efforts of the authors to assure the reliability of their data were extensive, experts in high-throughput sequencing of DNA and RNA have put forward alternative explanations for the results that relate to the possible limitations of the methods used (Hayden, 2011). One concern is that different types or rates of errors in sequencing DNA versus RNA with high-throughput methods might give the illusion of discrepancies where they do not actually exist. Another relevant issue is that in order to demonstrate that an RNA sequence fails to correspond to the relevant DNA sequence, the correct DNA sequence must be identified. The existence of multi-gene families can, in some cases, make finding the correct DNA sequence to compare with an RNA sequence less than completely straightforward.
Li M, Wang IX, Li Y, Bruzel A, Richards AL, Toung JM, Cheung VG. Widespread RNA and DNA sequence differences in the human transcriptome. Science. 2011 Jul 1;333(6038):53-8. Epub 2011 May 19. PubMed PMID: 21596952.
Hayden EC. Evidence of altered RNA stirs debate. Nature. 2011 May 26;473(7348):432. PubMed PMID: 21614050.