Anyway - it is a nice article about Sean, especially the parts talking about how his background in video games contributed to his success in bioinformatics. Back to something I said above, Sean is without a doubt one of my favorite people in science. There are many reasons for this but here are a few.
- He is very open with ideas.
Once, at a conference, I gave a talk on this bizarre new pattern we had found when we were comparing the genomes of E. coli and V. cholerae. We had found that when we did genome-level alignments of these species there was an X-like pattern (see our paper on this here). Anyway, in the talk I said something to the effect of "we have no friggin idea how these X-like alignments could be generated." And Sean, I think in the question session, pointed out that in another paper of ours we had seen what appeared to be symmetric inversions occurring around the origin of replication, and that those could create the X-alignment. And lo and behold, he was right. We got the paper, but in large part it was his push that got us looking at the inversions sooner than we would have.
- He is very open with science.
Most of Sean's work is on the open side of science. Open Source software. Open Access publications. Open everything. And I should point out that it was a talk by Sean that catalyzed my conversion into an Open Science supporter. I was attending a meeting in Ft. Lauderdale to discuss data release policies for genome projects. This meeting led to the "Ft Lauderdale Agreement" on data release, by the way. At the meeting there were many genomics players, like Eric Lander and Francis Collins, who were pushing for data release policies that were not completely open. Genome centers would release their data, but there would be constraints on its use so that the genome centers would be the first to publish genome-scale analyses of an organism's genome sequence.
At the time I was working at TIGR and I supported this notion of basically letting people search for a few genes of interest but preventing them from doing genome analyses. And then Sean got up and gave a talk and, well, blew my mind. I am sure I have notes somewhere from the meeting, but basically what he said was this: the genome projects' whole point is to generate genome data for people to do genome-level analysis. So how on earth can we justify preventing exactly the type of analysis that the projects were designed to generate? He was not saying that we should not somehow protect the genome centers. What he was saying was that, for the benefit of science, we need to find a way to allow people to do genome-level analyses immediately on the data. And he also said that the risks of releasing one's data with no restrictions are much smaller than everyone claims. I think he convinced many people that genome centers needed to open up their data release policies a bit more. And he convinced me.
And so I went home from that meeting and decided to release the data from as many of my genome projects as I could, with NO restrictions (e.g., this is what we did with Tetrahymena). And this newfound belief in openness also helped pave the way for my conversion to being an Open Access publishing supporter.