The Tree of Life: functional prediction


Wednesday, August 28, 2013

Announcing CAFA 2: The Second Critical Assessment of Protein Function Annotations

Just received this from Iddo Friedberg:

Friends and Colleagues,

We are pleased to announce the Second Critical Assessment of protein
Function Annotation (CAFA) challenge. In CAFA 2, we would like to
evaluate the performance of protein function prediction tools/methods
(in old and new scenarios) and also expand the challenge to include
prediction of human phenotypes associated with genes and gene
products. As last time, CAFA will be part of the Automated
Function Prediction Special Interest Group (AFP-SIG) meeting that will
be held alongside the ISMB conference. AFP-SIG will be held as a
two-day meeting in July 2014 in Boston.

The targets and all information about the CAFA challenge are now
available at http://biofunctionprediction.org. The submission deadline
for predictions is January 15, 2014. The initial evaluation will be
done during the AFP-SIG meeting in Boston. Anyone in the world is
welcome to participate.

The mission of the Automated Function Prediction Special Interest
Group (AFP-SIG) is to bring together computational biologists who are
dealing with the important problem of gene and gene product function
prediction, to share ideas and create collaborations. We also aim to
facilitate interactions with experimental biologists and biocurators.
We hope that AFP-SIG serves an important role in stimulating research
in annotation of biological macromolecules, but also related fields.

About the CAFA experiment

Tuesday, September 06, 2011

More on 'phylogenomics' - as in functional prediction w/ phylogeny

There is a new paper out: "Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium" in Briefings in Bioinformatics.

The paper is interesting and presents a new general approach to using phylogeny for functional prediction of uncharacterized genes. I am interested in this for many reasons, including that I was one of the first, if not the first, to lay this out as a concept. In a series of papers from 1995-1998 I outlined how phylogenetic analysis could be used to aid in functional prediction for all the genes that were starting to be sequenced in genome projects without any associated functional studies (at the time, I referred to all these ESTs and other sequences as an "onslaught" - little did I know what was to come).

My first paper on this topic was in 1995: "Evolution of the SNF2 family of proteins: subfamilies with distinct sequences and functions." The abstract is below:

The SNF2 family of proteins includes representatives from a variety of species with roles in cellular processes such as transcriptional regulation (e.g. MOT1, SNF2 and BRM), maintenance of chromosome stability during mitosis (e.g. lodestar) and various aspects of processing of DNA damage, including nucleotide excision repair (e.g. RAD16 and ERCC6), recombinational pathways (e.g. RAD54) and post-replication daughter strand gap repair (e.g. RAD5). This family also includes many proteins with no known function. To better characterize this family of proteins we have used molecular phylogenetic techniques to infer evolutionary relationships among the family members. We have divided the SNF2 family into multiple subfamilies, each of which represents what we propose to be a functionally and evolutionarily distinct group. We have then used the subfamily structure to predict the functions of some of the uncharacterized proteins in the SNF2 family. We discuss possible implications of this evolutionary analysis on the general properties and evolution of the SNF2 family.

Monday, September 05, 2011

Some links on "ortholog conjecture" paper and critiques of it

Recently a paper by Matt Hahn was published in PLoS Computational Biology entitled "Testing the ortholog conjecture with comparative functional genomic data from mammals." The paper created a bit of a stir as some aspects of it call into question some of the standard assumptions made in comparative genomic analysis.

I alas do not have time to go into all the details but fortunately others have tackled this and I am posting some links here:

Comments at Faculty of 1000 (currently available for free but not sure for how long).
Of mice and men and meta analyses (Richard Grant discussing Michael Galperin's F1000 dissent)
The 'Ortholog Conjecture' | The Daily Scan | GenomeWeb
Of Mice and Men or: Revisiting the Ortholog Conjecture (Iddo Friedberg)
Friendfeed discussion: http://friendfeed.com/erickma... (see below)

Monday, July 26, 2010

Testing, testing - why we need more testing like this in genomic informatics & annotation methods

Just got an announcement regarding this challenge:

Automated Function Prediction SIG 2011 featuring the CAFA Challenge: Critical Assessment of Function Annotations | Automated Function Prediction 2011 July 15-16 2011, Vienna, Austria

Here is a description:

CAFA is a community-driven effort. We call upon computational function prediction groups to predict the function of a set of proteins whose true function is sequestered. At the meeting, we will reveal the functions, and discuss the predictions. The CAFA challenge goals are to foster a discussion between annotators, predictors and experimentalists about the methodology and quality of functional predictions, as well as the methodology of assessing those predictions. Registration for CAFA starts July 15, 2010 and the CAFA challenge will take place September 15, 2010 through January 15, 2011. See here for more details on how you can enroll in CAFA.

This is near and dear to my heart as I have been working on methods to predict gene function from sequence for some 15 years now. My first paper on this was in 1995, in which I showed that for genes in multigene families, phylogenetic trees of the gene family could help in predicting functions of uncharacterized members of the gene family. More specifically, I suggested that the position of an uncharacterized gene in a gene tree relative to characterized genes could be used to predict its function. I did this for one family in particular - the SNF2 family - but argued that it could be applied to other families. (I think perhaps it was the first time someone had made this specific argument about using trees to predict function, but I am not sure.)
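To make the idea of "position in a gene tree relative to characterized genes" a bit more concrete, here is a minimal sketch in Python. It is not the method from the 1995 paper, just a toy illustration: the tree, the gene names and the function labels are all made up, and the rule used (take the smallest clade that contains the query plus at least one characterized gene, and predict only if the characterized genes in it agree) is an assumption chosen for simplicity.

```python
# Toy gene tree as nested tuples; every leaf name here is hypothetical.
TREE = ((("A1", "A2"), "X"), (("B1", "Y"), "B2"))

# Hypothetical known functions for the characterized genes.
KNOWN = {
    "A1": "transcription",
    "A2": "transcription",
    "B1": "DNA repair",
    "B2": "recombination",
}


def leaves(node):
    """Return the list of leaf names under a node."""
    if isinstance(node, str):
        return [node]
    names = []
    for child in node:
        names.extend(leaves(child))
    return names


def smallest_annotated_clade(node, query, known):
    """Smallest clade holding `query` and at least one other characterized gene."""
    if isinstance(node, str):
        return None
    # Look for a deeper (smaller) clade first.
    for child in node:
        if query in leaves(child):
            deeper = smallest_annotated_clade(child, query, known)
            if deeper is not None:
                return deeper
            break
    members = leaves(node)
    if query in members and any(g in known for g in members if g != query):
        return node
    return None


def predict_function(tree, query, known):
    """Predict a function only if the characterized genes in the clade agree."""
    clade = smallest_annotated_clade(tree, query, known)
    if clade is None:
        return None
    functions = {known[g] for g in leaves(clade) if g in known and g != query}
    return functions.pop() if len(functions) == 1 else None


prediction_x = predict_function(TREE, "X", KNOWN)
prediction_y = predict_function(TREE, "Y", KNOWN)
```

In this toy tree, X sits next to two genes that share a function and Y sits next to B1, so each gets its neighbors' label. A gene whose nearest characterized relatives disagree gets no prediction at all, which is the cautious choice made in this sketch.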

I then formalized this idea in a few papers (e.g., here and here) describing a "phylogenomic" approach to predicting function (alas, this is when I invented my first omics word). I have continued to work on functional prediction methods ever since. During my eight years at TIGR I did this both in my own research and in helping others with their functional predictions. I firmly believe that evolutionary approaches are critical in such functional prediction and have laid this out in a series of talks and papers (e.g., see this more recent one).

Anyway, enough about me. I can argue all I want about how brilliant I am and about how evolutionary methods are the best approach. But arguing is alas not science. What we need are tests and experiments. And that is where things like CAFA come in. In CAFA one can test how well various functional prediction methods work. And the people involved in CAFA (including organizers Iddo Friedberg, Michal Linial, and Predrag Radivojac, and others such as Amos Bairoch, Sean Mooney, Patricia Babbitt, Steven Brenner, Christine Orengo and Burkhard Rost) are to be commended for putting this together, because we do not have a lot of these activities and need more in all aspects of genomics (and metagenomics too). Others have discussed doing tests of functional prediction methods before, but I am not sure if any have happened per se.
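As a rough picture of what such a test involves, here is a small Python sketch that compares predicted functions against functions revealed later. This is not CAFA's actual scoring scheme; it is plain per-protein precision and recall, and the protein names and function labels are hypothetical.

```python
def precision_recall(predicted, actual):
    """Compare a set of predicted function terms with the revealed ones."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical predictions from one method, and the annotations revealed later.
predictions = {
    "protein1": {"DNA repair", "transcription"},
    "protein2": {"recombination"},
}
revealed = {
    "protein1": {"DNA repair"},
    "protein2": {"transcription"},
}

# One (precision, recall) pair per protein.
scores = {
    name: precision_recall(predictions[name], revealed[name])
    for name in revealed
}
```

The key point is that the revealed annotations are held back until the predictions are in, so a method cannot be tuned to the answers.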

Have a favorite functional prediction method? Enter it in the competition or give a talk on it. And if you are feeling inspired, organize a similar activity in your area of science - testing is a good thing.

See also Iddo Friedberg's post about this
