Thursday, May 10, 2012

Quick post - new paper of interest on "The Infinitely Many Genes Model ..."

This paper seems of potential interest: "The Infinitely Many Genes Model for the Distributed Genome of Bacteria" by Franz Baumdicker, Wolfgang R. Hess, and Peter Pfaffelhuber.

The distributed genome hypothesis states that the gene pool of a bacterial taxon is much more complex than that found in a single individual genome. However, the possible fitness advantage, why such genomic diversity is maintained, whether this variation is largely adaptive or neutral, and why these distinct individuals can coexist, remains poorly understood. Here, we present the infinitely many genes (IMG) model, which is a quantitative, evolutionary model for the distributed genome. It is based on a genealogy of individual genomes and the possibility of gene gain (from an unbounded reservoir of novel genes, e.g., by horizontal gene transfer from distant taxa) and gene loss, for example, by pseudogenization and deletion of genes, during reproduction. By implementing these mechanisms, the IMG model differs from existing concepts for the distributed genome, which cannot differentiate between neutral evolution and adaptation as drivers of the observed genomic diversity. Using the IMG model, we tested whether the distributed genome of 22 full genomes of picocyanobacteria (Prochlorococcus and Synechococcus) shows signs of adaptation or neutrality. We calculated the effective population size of Prochlorococcus at 1.01 × 10^11 and predicted 18 distinct clades for this population, only six of which have been isolated and cultured thus far. We predicted that the Prochlorococcus pangenome contains 57,792 genes and found that the evolution of the distributed genome of Prochlorococcus was possibly neutral, whereas that of Synechococcus and the combined sample shows a clear deviation from neutrality.
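For intuition about the gain-and-loss mechanism the abstract describes, here is a minimal Python sketch. It is not the authors' code and it leaves out the genealogy entirely: it follows a single lineage in which new genes arrive at a constant rate and each gene already present is lost independently. The names `simulate_genome_size`, `gain_rate`, and `loss_rate` are made up for illustration, and the paper's own parametrization may differ. Under these assumptions the number of genes in the genome should settle around `gain_rate / loss_rate`.

```python
import random


def simulate_genome_size(gain_rate, loss_rate, t_end, rng):
    """Simulate gene gain and loss along one lineage (hypothetical sketch).

    New genes arrive at rate `gain_rate` (every gain is a novel gene, so no
    gene is ever gained twice); each gene currently present is lost
    independently at rate `loss_rate`. Returns the gene count at `t_end`,
    starting from an empty genome.
    """
    n_genes = 0
    t = 0.0
    while True:
        total_rate = gain_rate + n_genes * loss_rate
        if total_rate <= 0:
            break  # nothing can happen any more
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total_rate)
        if t > t_end:
            break
        # Pick the event type in proportion to its rate.
        if rng.random() < gain_rate / total_rate:
            n_genes += 1
        else:
            n_genes -= 1
    return n_genes


if __name__ == "__main__":
    rng = random.Random(2012)
    sizes = [simulate_genome_size(20.0, 1.0, 15.0, rng) for _ in range(200)]
    print("mean genome size:", sum(sizes) / len(sizes))
```

Running many replicates and averaging gives a quick check that the long-run genome size tracks the ratio of the two rates. Comparing several genomes, as the paper does, would also need the shared genealogy, which this sketch does not attempt.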

Wish they had gone beyond these two cyanobacteria ... but still seems of possible interest.

Baumdicker, F., Hess, W., & Pfaffelhuber, P. (2012). The Infinitely Many Genes Model for the Distributed Genome of Bacteria. Genome Biology and Evolution, 4(4), 443-456. DOI: 10.1093/gbe/evs016


  1. There is a lot going for this model. We've been working on extending it to larger datasets and it holds up very well. There's a pre-print of our manuscript and R-scripts for fitting your own data at

