Tuesday, December 16, 2008

Open Evolution Highlights - the Population Genetics of dN/dS

There is an interesting new paper in PLoS Genetics (PLoS Genetics: The Population Genetics of dN/dS) by Sergey Kryazhimskiy and Josh Plotkin that discusses the use of the widely used parameter dN/dS (in essence, the ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions in protein-coding genes). This parameter is commonly used to infer the type of selection that has acted on a protein-coding gene.

Here is their summary of their article:
Since the time of Darwin, biologists have worked to identify instances of evolutionary adaptation. At the molecular scale, it is understood that adaptation should induce more genetic changes at amino acid altering sites in the genome, compared to amino acid–preserving sites. The ratio of substitution rates at such sites, denoted dN/dS, is therefore commonly used to detect proteins undergoing adaptation. This test was originally developed for application to distantly diverged genetic sequences, the differences among which represent substitutions along independent evolutionary lineages. Nonetheless, the dN/dS statistics are also frequently applied to genetic sequences sampled from a single population, the differences among which represent transient polymorphisms, not substitutions. Here, we show that the behavior of the dN/dS statistic is very different in these two cases. In particular, when applied to sequences from a single population, the dN/dS ratio is relatively insensitive to the strength of natural selection, and the anticipated signature of adaptive evolution, dN/dS>1, is violated. These results have implications for the interpretation of genetic variation sampled from a population. In particular, these results suggest that microbes may experience substantially stronger selective forces than previously thought.
The key point to me is that many people may have been using dN/dS ratios inappropriately when comparing samples within a species. For more, well, see the paper.
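
For readers who have not computed this kind of ratio themselves, here is a minimal Python sketch of the underlying counting. It is not the method used in the paper: it is a simplified count in the spirit of Nei-Gojobori that reports uncorrected proportions (pN and pS, not model-corrected dN and dS). The function names `synonymous_sites` and `pn_ps` are made up for illustration, and the sketch assumes two aligned, in-frame sequences under the standard genetic code. Codon pairs that differ at more than one position, or that contain stops or ambiguous bases, are simply skipped, and a mutation to a stop codon is counted as nonsynonymous.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3).

    Each of the 9 possible single-base changes contributes 1/3 of a site
    if it leaves the encoded amino acid unchanged.
    """
    count = 0.0
    for i in range(3):
        for base in BASES:
            if base == codon[i]:
                continue
            mutant = codon[:i] + base + codon[i + 1:]
            if CODE[mutant] == CODE[codon]:
                count += 1 / 3
    return count


def pn_ps(seq1, seq2):
    """Return (pN, pS, pN/pS) for two aligned, in-frame coding sequences.

    Simplifications (assumptions of this sketch, not of any published method):
    codon pairs with stops, gaps, ambiguous bases, or more than one
    difference are skipped, and no multiple-hit correction is applied.
    """
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    syn_sites = nonsyn_sites = 0.0
    syn_diffs = nonsyn_diffs = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if CODE.get(c1, "*") == "*" or CODE.get(c2, "*") == "*":
            continue  # stop codon, gap, or ambiguous base
        n_changes = sum(a != b for a, b in zip(c1, c2))
        if n_changes > 1:
            continue  # would need pathway averaging; skipped here
        # Average the site counts of the two codons being compared.
        s = (synonymous_sites(c1) + synonymous_sites(c2)) / 2
        syn_sites += s
        nonsyn_sites += 3 - s
        if n_changes == 1:
            if CODE[c1] == CODE[c2]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    p_s = syn_diffs / syn_sites if syn_sites else None
    p_n = nonsyn_diffs / nonsyn_sites if nonsyn_sites else None
    ratio = p_n / p_s if p_s and p_n is not None else None
    return p_n, p_s, ratio


# Toy example: one synonymous change (AAA -> AAG, both Lys)
# and one nonsynonymous change (GAT Asp -> GTT Val).
print(pn_ps("ATGAAACCCGAT", "ATGAAGCCCGTT"))
```

Note that the arithmetic is the same whether the two sequences come from different species or from the same population. The paper's point is that the interpretation of the resulting number differs between those two cases.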


  1. I see their point with the viral evolution stuff...but to characterize different strains of bacteria as coming from a single population (and thus to question dN/dS studies between bacterial genomes) is a tad unrealistic.

  2. David

    Yes this seems unrealistic but they do at least hint at the problems in their conclusion: "For microbes and viruses, however, the distinction may be more opaque."

  3. This is a great paper and I'm glad to see it discussed. This is an off-topic comment and I apologize, but this seems like a good place to get it answered:

    Why don't the trackbacks work on PLoS? (Maybe I am doing something wrong but) I have tried to use trackbacks a few times on blog posts of my own and I don't see them show up on the journal article site in PLoS. (And this post isn't linked on the paper yet either.)

    I'm a giant fan of PLoS and think that its infrastructure, which permits/encourages online interaction, is awesome and holds great potential. It would be really nice to see some of that interconnectivity working a little better.

  4. APB - I think trackbacks do work in theory in PLoS. But some sites do not allow trackbacks. For example, I believe Blogger/Blogspot still does not allow them. So it might be the fault of how you publish your blog.

  5. I believe that it is impossible to calculate the MacDonald's index (dN/dS).

    If you look at most metazoan transcriptomes, you'll see that the majority of genes have at least two or three splicing isoforms each. The latest studies say that in humans, this happens for 70% of the genes.

    How can you calculate which nucleotide is coding and which is not, unless you know the full transcriptome of an organism?

  6. Yes, gioby, fair point. In lots of organisms there are issues with defining what is coding, given things like alternative splicing, overlapping genes, etc. But that does not of course mean you cannot calculate dN/dS. It means that it may not mean what you think it means (to steal some lingo from the Princess Bride). Even when you have the full transcriptome the meaning can be difficult to figure out as few regions are "on" all the time. I still view these measures as potentially useful -- just to be taken with a grain of salt.

  7. We have previously shown there is a time-dependent dN/dS gradient in bacteria as non-synonymous mutations preferentially get removed by selection (at a rate determined partially by the population size) here:
    J Theor Biol. 2006 Mar 21;239(2):226-35. Epub 2005 Oct 18

    so, yes, caution advised!!

  8. Thanks Ed. I somehow missed yours/Eduardo's paper.