Monday, July 26, 2010

Testing, testing - why we need more testing like this in genomic informatics & annotation methods

Just got an announcement regarding this challenge:

Automated Function Prediction SIG 2011 featuring the CAFA Challenge: Critical Assessment of Function Annotations | Automated Function Prediction 2011 July 15-16 2011, Vienna, Austria

Here is a description:
CAFA is a community-driven effort. We call upon computational function prediction groups to predict the function of a set of proteins whose true function is sequestered. At the meeting, we will reveal the functions, and discuss the predictions. The CAFA challenge goals are to foster a discussion between annotators, predictors and experimentalists about the methodology and quality of functional predictions, as well as the methodology of assessing those predictions. Registration for CAFA starts July 15, 2010 and the CAFA challenge will take place September 15, 2010 through January 15, 2011. See here for more details on how you can enroll in CAFA.

This is near and dear to my heart as I have been working on methods to predict gene function from sequence for some 15 years now.  In my first paper on this, in 1995, I showed that for genes in multigene families, phylogenetic trees of the gene family could help in predicting functions of uncharacterized members of the gene family.  More specifically, I suggested that the position of an uncharacterized gene in a gene tree relative to characterized genes could be used to predict its function.  I did this for one family in particular - the SNF2 family - but argued that it could be applied to other families.  (I think perhaps it was the first time someone had made this specific argument about using trees to predict function, but I am not sure.)

I then formalized this idea in a few papers (e.g., here and here) describing a "phylogenomic" approach to predicting function (alas, this is when I invented my first omics word).  I have worked on functional prediction methods ever since and continue to do so.  During my eight years at TIGR I did this in my own research and also helped others with their functional predictions.  I firmly believe that evolutionary approaches are critical in such functional prediction and have laid this out in a series of talks and papers (e.g., see this more recent one).

Anyway, enough about me.  I can argue all I want about how brilliant I am and about how evolutionary methods are the best approach.  But arguing is alas not science.  What we need are tests and experiments.  And that is where things like CAFA come in.  In CAFA one can test how well various functional prediction methods work.  And the people involved in CAFA (including organizers Iddo Friedberg, Michal Linial, and Predrag Radivojac and others such as Amos Bairoch, Sean Mooney, Patricia Babbitt, Steven Brenner, Christine Orengo and Burkhard Rost) are to be commended for putting this together because we do not have a lot of these activities and need more in all aspects of genomics (and metagenomics too).  Others have discussed doing tests of functional prediction methods before, but I am not sure if any have actually happened.

Have a favorite functional prediction method?  Enter it in the competition or give a talk on it.  And if you are feeling inspired, organize a similar activity in your area of science - testing is a good thing.

See also Iddo Friedberg's post about this


  1. Is there a word missing from one of these sentences?

    "I firmly believe that evolutionary approached are critical..."

  2. How do the assessors determine the correct annotation?

  3. well, that should have said "approaches" - i fixed that - now does it make sense?

  4. James - I am not sure about that - will have to get Iddo to answer

  5. Iddo Here. In answer to James's question:

    1) There is a natural growth in experimentally annotated proteins in UniProt, esp. in model organisms. So between January 15, 2011 and June 2011 we expect to get several hundred proteins annotated with experimental evidence that were not annotated before.

    2) We are requesting SwissProt & others to sequester some annotations for us.

    3) We are working with several experimental labs, who are providing us with their own targets.

  6. Thank you, we had a lot of fun browsing through all the links, although there are a couple which don't work anymore! That is great. It is really helpful information. Keep up the good work...