The new tools for determining whole-genome nucleotide sequences can present a problem of data analysis: how can mutations that influence important phenotypes be distinguished from mutations that have minimal or no impact on fitness, the so-called passenger mutations that arise and persist primarily by chance and can greatly outnumber adaptive genetic variants?  Merely finding nucleotide substitutions or larger genomic differences when comparing independent isolates of a microbial pathogen does not automatically reveal which genetic variants are responsible for the medically relevant differences in pathogen attributes.

Lieberman et al. (2011) approached this problem by determining the whole-genome sequences of 112 isolates of an opportunistic bacterial pathogen, Burkholderia dolosa, obtained from 14 cystic fibrosis (CF) patients, including the initial patient infected, who were all part of a small-scale epidemic in the Boston area.  A total of 39 individuals were infected in the course of the outbreak, and the patient samples were taken over a period of 16 years.  Bacterial samples were obtained primarily from the airways and from blood.  For these genome sequences, the average read depth was 37x, and the genomes were aligned to a B. dolosa reference genome.

For the current study, the authors focused on single-nucleotide polymorphisms (SNPs).  They did not evaluate structural genomic variants (e.g., insertions or deletions) or mobile genetic elements.  In their analysis, Lieberman et al. identified 561 mutations affecting 304 genes.  The rate of SNP accumulation was approximately 2.1 SNPs per year, which created sufficient genetic diversity to permit generation of a maximum-likelihood phylogenetic tree.  

Bacterial genomes from the same subject tended to cluster near one another in the phylogenetic tree.  These subject-related clusters could be used to define a subject-specific genetic profile and to construct the genomic last common ancestor (LCA) for each such cluster.  These inferred LCAs facilitated the creation of a plausible map of the transmission events from subject to subject.  Because the analysis involved only 14 of the 39 patients infected during the Boston B. dolosa epidemic, inferred transmission events could not be definitively classified as direct.

The authors first analyzed the bacterial isolates for genetic correlates of two phenotypes of expected relevance to bacterial pathogenicity (a candidate gene approach), the first of these being resistance to ciprofloxacin (which is often used to treat bacterial infections in CF patients).  All of the isolates exhibiting resistance to ciprofloxacin had non-synonymous mutations at two sites, T83 and D87, in a gene (BDAG_02180) related to gyrA from Escherichia coli.  These particular mutations occurred in isolates from six subjects and the phylogenetic analysis indicated that they occurred independently within each patient post-infection.  From these results the authors concluded that ciprofloxacin exerts substantial selection on the bacteria.

The second known pathogenic phenotype addressed was the presentation of O-antigen repeats in the outer membrane molecule, lipopolysaccharide (LPS).  This trait, associated with increased virulence in related bacterial pathogens, was found to correlate absolutely with alteration of a particular nucleotide in the gene BDAG_02317, which encodes a glycosyltransferase.  Interestingly, the ancestral phenotype was the absence of O-antigen repeats due to a stop codon, and the derived trait was due to two distinct mutations (to glutamic acid or to glutamine) at the same site, either of which permits translation of the full-length enzyme.  Additional experiments confirmed this interpretation of the genotype-phenotype relationship.

Finally, the authors pursued a systematic and functionally unbiased search of the sequenced genomes for genes subject to positive selection.  They identified genes that had acquired multiple mutations, since on a random basis (i.e., without selection) genes acquiring more than a single mutation would be rare.  Seventeen genes with three or more mutations were identified, whereas only one such gene would have been expected in the case of random mutation without selection.  Furthermore, all of the genes that acquired three or more mutations were also mutated in two or more individuals, a pattern that the authors regarded as critical evidence that these genes were under selection.  Additional evidence pointing to positive selection for all 17 of the multiply mutated genes is that the ratio of non-synonymous to synonymous mutations for this group of genes was 18.  Such a result would be highly improbable in the absence of selection, and it is not readily explained by these genes harboring so-called mutational hotspots.
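The expectation of roughly one gene with three or more mutations under random mutation can be approximated with a back-of-the-envelope Poisson calculation.  The Python sketch below is not the authors' method.  It assumes that all 561 mutations fall within genes, that every gene is an equally likely target, and that the genome contains 5,000 genes (a hypothetical round number chosen only for illustration).

```python
import math


def expected_multiply_mutated_genes(n_mutations, n_genes, min_hits=3):
    """Expected number of genes carrying at least `min_hits` mutations when
    mutations land uniformly at random on equally sized genes.

    Uses a Poisson approximation for the number of hits per gene.
    """
    lam = n_mutations / n_genes  # mean number of mutations per gene
    # Probability that a single gene receives fewer than `min_hits` mutations.
    p_fewer = sum(
        math.exp(-lam) * lam**k / math.factorial(k) for k in range(min_hits)
    )
    return n_genes * (1.0 - p_fewer)


N_MUTATIONS = 561  # mutation count reported in the study
N_GENES = 5000     # ASSUMPTION: hypothetical gene count, not taken from the paper

expected = expected_multiply_mutated_genes(N_MUTATIONS, N_GENES)
print(f"Expected genes with >= 3 mutations by chance: {expected:.2f}")
```

Under these assumptions the expected count is on the order of one gene, far below the 17 observed.  A more careful null model would weight each gene by its length and local mutation rate.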

I have previously addressed the evidence that some synonymous mutations may be subject to selection (Greenspan, 2009).  Nevertheless, the ability of some synonymous mutations to affect one or another phenotype by, for example, altering rates of translation for mRNAs does not negate the assumption of Lieberman et al. that a ratio of 18:1 for non-synonymous to synonymous mutations (dN/dS) is strong evidence for positive selection.  This conclusion is strengthened by the fact that, as the authors note, the average dN/dS ratio for the whole genome of the B. dolosa patient isolates was close to 1.
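One way to see why an 18:1 ratio is hard to attribute to chance is a one-sided binomial tail calculation.  The sketch below is illustrative only and is not the test used by Lieberman et al.  The counts (54 non-synonymous and 3 synonymous mutations, which give a ratio of 18) and the neutral non-synonymous fraction of 0.75 are hypothetical values and are not taken from the paper.

```python
import math


def binomial_upper_tail(k_min, n, p):
    """Probability of at least `k_min` successes in `n` independent trials,
    each succeeding with probability `p`."""
    return sum(
        math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


# ASSUMPTIONS (hypothetical, for illustration only):
N_NONSYN = 54          # non-synonymous mutations in the multiply mutated genes
N_SYN = 3              # synonymous mutations in the same genes (54 / 3 = 18)
NEUTRAL_P_NONSYN = 0.75  # fraction of random coding mutations that are non-synonymous

p_value = binomial_upper_tail(N_NONSYN, N_NONSYN + N_SYN, NEUTRAL_P_NONSYN)
print(f"P(at least {N_NONSYN} non-synonymous by chance) = {p_value:.2e}")
```

With these illustrative inputs the tail probability is very small.  The actual strength of the evidence depends on the real mutation counts and on the neutral expectation for the genes in question.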

Another point of interest in these results is the range of functions associated with these 17 selected genes.  Several genes were linked to phenotypes expected to contribute to pathogenesis, such as antibiotic resistance, outer membrane synthesis, and secretion.  However, there were also three genes of uncertain function and three genes that were classified as being involved in oxygen-related gene regulation, a function that might normally be regarded as vegetative.  The authors cite prior work that provides some plausibility for the influence of these oxygen-related genes on pathogen survival in the lung.  Thus, an evolutionarily based definition of virulence genes may be broader than one based on gene product functions known to directly contribute to pathogen invasion of host tissues and the mediation of host tissue damage, such as adhesins, toxins, and molecules involved in immune evasion.  Elsewhere, I have similarly argued that mammalian host genes involved in minimizing pathogen-related tissue damage may extend beyond those genes conventionally identified with the immune system (Greenspan, 1998).  The broader principle is that evolution is usually indifferent to the categories we construct to simplify thinking about host-pathogen interactions.

The evidence presented by Lieberman et al. that the 17 genes they identified were positively selected is strong.  Nevertheless, as Richard Lenski (2011) notes in a commentary accompanying the article, another sort of experiment would be interesting to perform to address the roles of these genes in promoting B. dolosa survival and spread in human hosts.  Isogenic strains of B. dolosa could be constructed so that otherwise genetically identical bacteria would differ only in one of the 17 genes shown to be positively selected by Lieberman et al.  These strains could then be put into competition with one another under various conditions in vitro or in vivo in various animal models.  Of course, the strains could also be studied separately but in parallel in experimental animals.  Such experiments would be likely to provide important additional insights into how these various genetic alterations contribute to host-pathogen interactions.


Lieberman TD, Michel JB, Aingaran M, Potter-Bynoe G, Roux D, Davis MR Jr, Skurnik D, Leiby N, LiPuma JJ, Goldberg JB, McAdam AJ, Priebe GP, Kishony R. Parallel bacterial evolution within multiple patients identifies candidate pathogenicity genes. Nat Genet. 2011 Nov 13;43(12):1275-80. doi: 10.1038/ng.997.  PubMed PMID: 22081229; PubMed Central PMCID: PMC3245322.

Greenspan, N. Growing complexities in relating genotype to phenotype. (7/21/09) 

Greenspan NS. Genomic logic, allelic inference, and the functional classification of genes. Perspect Biol Med. 1998 Spring;41(3):409-16. PubMed PMID: 11829018. 

Lenski RE. Chance and necessity in the evolution of a bacterial pathogen. Nat Genet. 2011 Nov 28;43(12):1174-6. doi: 10.1038/ng.1011. PubMed PMID: 22120052.