Sunday, February 27, 2011

Arsenic revisited: discussing arsenic story with a #UCDavis biology writing class next week

Well, this could be fun. Next week I am making a guest appearance in a Writing class at UC Davis. The class focuses on writing in Biology and the instructor invited me to come in as a guest to coordinate a discussion of the arsenic paper and the coverage of it.

When the instructor asked for reading assignments I said they should read:

I think I probably should have suggested they read Zimmer's excellent full write up here.  Going to suggest that now but may be too late.

Any other pointers to good write ups of what has happened since the first week after the paper would be appreciated.

Some suggestions coming in from twitter:

Saturday, February 26, 2011

The Microbe Project - A Portal to Federal Efforts in Microbial Research #microBEnet

For those interested in what the US Government is doing in areas relating to microbiology, there is a useful site to check out called "The Microbe Project". The goal of this project is "to maximize the opportunities offered by genome-enabled microbial science to benefit science and society, through coordinated interagency efforts to promote research, infrastructure development, education and outreach." There is a lot of information there on different agencies and their funding opportunities.

Tuesday, February 22, 2011

Sunday, February 20, 2011

Some quick notes on mini trip to DOE-JGI & UCSF

Well, I just got back from a very brief trip to give a talk at UCSF. I was invited by graduate students via the following email message:
The students in the DeRisi Lab at UCSF are thrilled to invite you as our selection for the California Institute for Quantitative Biosciences (QB3) and Integrative Program in Quantitative Biology (iPQB) Invitational Speaker Series!  This is a new type of speaker series at UCSF where students from a laboratory directly nominate and invite speakers from outside UCSF to deliver lectures about cutting-edge topics in quantitative biology.  All nominations are discussed by a QB3/iPQB student committee, and a speaker list is compiled from many outstanding nominations.  The invited speakers are scientists who gathered especially strong support from their nominating lab and the QB3/iPQB student committee, and who have an outstanding record in delivering exciting research talks grounded in quantitative methods. As our selected nominee, your travel, lodging and food expenses will be covered by funds from QB3/iPQB.
It was impossible to say no - I am always thrilled to be invited by student groups.  It took a bit to coordinate when I would come (I am notoriously bad at planning and answering such invitations).  But eventually, with a little prodding, I answered and we picked a date. I of course then asked to change the date and we settled on February 17.

Friday, February 18, 2011

Some things to read in light of reported human DNA in bacterial genomes vs. contamination

Well, there are a few interesting papers out there relating to human DNA and whether or not there have been some recent lateral transfers of it into microbial genomes. See for example:
  • this paper in mBio that suggests there has been lateral transfer of LINE elements from humans to Neisseria species
  • but then see this paper suggesting massive contamination of sequence databases with LINE elements (PLoS One paper on contamination)
So what is going on?  Not clear.  If you want more detail about these papers I suggest reading one of the following
There were other stories out there ... but since Hannah and Ed interviewed me, I am a bit biased about which ones are worth reading.  Here are some others to read though
Personally, I am a bit skeptical of the LGT claim because most of the evidence they present relies on amplification (i.e., PCR). But without getting into too many of the details myself, I thought I would just post some background reading connected to some of my past work in this area for anyone interested in this type of thing.

Information about claim of HGT into humans from bacteria that was in the Lander et al Human Genome paper:
A short story I wrote in 1998 about, well, contamination in genome databases
My colleagues' assembly of nearly complete bacterial genomes from the raw sequence reads of fly genome projects
Complete mitochondrial genome(s) found in Chromosome II of Arabidopsis. It was very difficult to sort out which reads came from the nuclear genome and which from the mitochondria

End of Sequence Read Archive (SRA) - some quick notes

Well, it seems that the Sequence Read Archive (SRA) is going away sometime in the near future.  I posted about the SRA last week and in the discussion someone posted an email message that supposedly was from David Lipman of the NCBI saying that the SRA is going to be closing.   This has now been confirmed and I thought I would just post some links discussing this

Tuesday, February 15, 2011

Open Access Pioneer Award: George Garrity for his work on Standards in Genomic Sciences journal

A few days ago I got a message that made a big impact on my publication record in PubMed Central. My number of publications there went up by 85 in one fell swoop (see below for the list ...). Did I publish 85 new papers yesterday? No. But a journal in which I have co-authored many papers recently finally showed up in PubMed Central. The journal is called Standards in Genomic Sciences. The journal's scope is:
The goal of SIGS is to serve as an open-access, standards-supportive publication for rapid dissemination of concise genome and metagenome reports that comply with the emerging MIGS/MIMS standards, detailed standard operating procedures, meeting reports, reviews and commentaries, data policies, white papers and other gray literature that is relevant to genome sciences but currently absent from the scholarly literature.
Lots of jargon, I know. But you can ignore that. The reason I am writing here is that this journal is a place to publish what could be called "genome sequencing data reports." These reports are a way for data producers to describe their sequencing project in a formal manner, and to share not only the data but also metadata about the organism(s) sequenced and the methods used. As sequencing gets cheaper and easier, we need places for people to publish these types of "data papers," both to produce a citable unit with a DOI relating to the data and to share the details of the data production in a way that a simple GenBank entry does not.

One aspect of the papers in this "SIGS" journal is that they are being done in a way that is compliant with sets of standards for sharing metadata about the organism and the project.  I confess, when I first heard about these standards developments, I was bored almost to tears.  But now I realize that this is a very important aspect of getting the most out of genome data.  If people who sequence a genome not only release the sequence data, but also a table of information about the project, such as information about the organism (e.g., aerobic vs anaerobic, location of isolation) and about the data production (e.g., sequencing methods used) then people will be able to do high throughput analyses of these features.  Then we will not just be looking at sequence but also connecting these sequences to organismal features.  Right now that is very hard to do since genome data is rarely accompanied by machine usable information about the organism that has been sequenced.
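To make the idea of machine-usable metadata a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the records, the field names (`oxygen_requirement`, `isolation_site`, `sequencing_method`) and the helper `group_by` are made up and are not the actual MIGS/MIMS fields. It just shows that once each genome comes with a small structured table, grouping genomes by an organismal feature takes only a few lines.

```python
from collections import defaultdict

# Hypothetical metadata records, one per sequenced genome.
# Field names and values are invented for illustration only.
genome_metadata = [
    {"organism": "Organism A", "oxygen_requirement": "aerobic",
     "isolation_site": "soil", "sequencing_method": "454"},
    {"organism": "Organism B", "oxygen_requirement": "anaerobic",
     "isolation_site": "sediment", "sequencing_method": "Illumina"},
    {"organism": "Organism C", "oxygen_requirement": "aerobic",
     "isolation_site": "seawater", "sequencing_method": "Illumina"},
    {"organism": "Organism D", "isolation_site": "soil",
     "sequencing_method": "454"},  # oxygen requirement not reported
]


def group_by(records, field):
    """Group organism names by the value of one metadata field."""
    groups = defaultdict(list)
    for record in records:
        # Records that lack the field are collected under "unknown".
        groups[record.get(field, "unknown")].append(record["organism"])
    return dict(groups)


by_oxygen = group_by(genome_metadata, "oxygen_requirement")
for value, organisms in sorted(by_oxygen.items()):
    print(f"{value}: {', '.join(organisms)}")
```

The same helper works for any other field (for example `group_by(genome_metadata, "isolation_site")`), which is the sort of high throughput question that is hard to ask when the metadata only lives in free text.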

Anyway - long story short - there will be a paper published in this journal for each genome being produced as part of the "Genomic Encyclopedia of Bacteria and Archaea" project that I have been coordinating at the DOE Joint Genome Institute in collaboration with the DSMZ.

Today I am writing to recognize the people connected to this movement - the people who created the standards, the people who created and run the journal, and the people writing papers for this journal. All of them are to be commended for their vision and their dedication to openness. Thus I am giving George Garrity, the EIC of SIGS, my "Open Access Pioneer Award" for creating SIGS, making it an open access journal, keeping it running, and getting its papers into PubMed Central and soon PubMed. Many others should be recognized too for their contribution to SIGS (see the whole list of founding members here). I also should recognize Nikos Kyrpides from the DOE JGI who helped coordinate the writing and submission of these papers along with Hans Peter Klenk from the DSMZ. Without them, these papers would never have gotten out there. Plus I think some credit goes to Michigan State University and the Department of Energy, which apparently are sponsors of SIGS.


Monday, February 14, 2011

Checking out Davis Life Magazine

Just a quick post here - haven't posted to my Davis blog in a while but have been looking at Davis Life Magazine and like it a lot. Worth a look for Davisites out there.

Tuesday, February 08, 2011

Though I generally love NCBI, the Sequence/Short Read Archive (SRA) seems to have issues; what do others think?

Well, here goes. Hope to not get people from NCBI too pissed off here. Overall, I think NCBI is invaluable: GenBank. PubMed. PubMed Central (PMC) (well, I have some complaints about that but let's not get into those here -- I still like it), BLAST (Basic Local Alignment Search Tool) and a plethora of other tools, databases and resources. Generally, money well spent.

However, one database from NCBI is driving me a bit wacky these days. This is the Sequence Read Archive (SRA). Known to some as the "Short Read Archive" this database is supposedly for storing "sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Life Technologies AB SOLiD System®, Helicos Biosciences Heliscope®, Complete Genomics®, and Pacific Biosciences SMRT®."

It certainly seems to be used for that function. But alas, storing sequence is not the only need here. Recovering sequence and making use of it is really the key. And this is the area I have been having trouble with (especially related to environmental studies like rRNA PCR and metagenomics). Rather than go on about my particular issues here (and thus possibly biasing the discussion too much), I am wondering what others think of the SRA? Usability? Ease of deposition? Ease of extraction? Missing features? Things it does or does not do well? Do we need a new system for environmental projects?

Any and all comments welcome here or on twitter or on Friendfeed or wherever. See Friendfeed stream below:

Here are some comments so far from twitter
  • digitalbio Sandra Porter I agree. RT @phylogenomics: Though I generally love NCBI, the Sequence/Short Read Archive (SRA) seems to hav… (cont)
  • lswenson Luke Swenson @phylogenomics I was JUST trying to navigate the SRA! There's no help section to be found, and forget about depositing sequences!
  • audyyy Davis-Richardson @phylogenomics I can never tell if my submission went through without emailing support. Also, no FASTQ support?
  • cabbageRed Rich C .@phylogenomics I agree, the SRA doesn't seem to be the easiest repository to search with what I believe to be "typical" NGS queries

Monday, February 07, 2011

Is it time to refer to mitochondria as bacteria?

Any time a scientific article has in the summary a sentence like the one below, I am attracted to it:
"Here, I playfully explore the arguments for and against a phylogenetic fundamentalism that states that mitochondria are bacteria and should be given their own taxonomic family, the Mitochondriaceae."
So how could I not want to read this: Trends in Microbiology - Time to recognise that mitochondria are bacteria?:

Well, one reason is that it had been unavailable outside of the TIM paywall. However, the author, Mark Pallen, with a little prodding from me, managed to get the Editors to feature it as a "free" article on their website for at least some time. So everyone, download this paper and distribute it to as many as you can (legally). Oh, and read it, it is definitely worth a read.

In the article Pallen argues for giving mitochondria their own family within bacteria. I think that would be a good idea as they are really just a highly reduced form of bacteria. We give endosymbionts, even those with tiny genomes, their own groups. So why not organelles that are derived from bacteria? After all - phylogenetically they are bacteria.

Pallen even goes so far as to suggest that rethinking mitochondria as bacteria will help with efforts to engineer mitochondria in various ways. That is an interesting notion.

I suppose one could push this to an extreme position and argue that the nucleus and all the genes associated with it are really just a shell around a mitochondrial core. And then I guess all eukaryotes could be considered bacteria. But I do not want to confuse the issue too much here. Overall, I really really like this paper. I wish it were in an OA journal, but since it is free for now I think it is worth checking out. In the long run, it would be better (hint hint Mark Pallen) to publish such thought provoking pieces in places everyone can access ....

So I guess this paper, along with all the "microbiome" stuff means that humans are really just carrying vessels for bacteria.

Saturday, February 05, 2011

Alternative Real Time Twitter Feed for #AGBT #Experiment

Tracking Advances in Genome Biology & Technology meeting w/ Twitter Widget: #AGBT

Just a little experiment here seeing if this twitter widget tracks #AGBT tweets in real time ...

Valentine's Special: Dating in the 21st Century: 2/8 Berkeley #BABS

Posting this email I received:

Bay Area Biosystematists Meeting

Tuesday evening, February 8th, 2011

at UC Berkeley, 2063 Valley Life Sciences Bldg.

Valentine's Special:
"Dating in the 21st Century:
Theoretical and Empirical Issues in Putting Dates on Phylogenies"

Featuring a Diverse and Distinguished Panel of Discussants
Followed by vigorous audience discussion

Panel members representing different approaches will give short informal presentations (10 minutes each), to be followed by active audience participation (this all following traditional pizza and beer, of course!).  
Confirmed Panel Members:
Tracy Heath
Pat Holroyd
Nick Matzke (moderator)
Sarah Werning

The venerable Biosystematists group, operating since 1936 (see the history on the website), is the only inter-institutional seminar/discussion group on evolution for the Bay Area, so we encourage everyone to join in.

Schedule and venue:
    5:30 - social gathering with beverages (beer and soft drinks) and informal pizza dinner:  cost ca. $10, to be collected at door, 2063 Valley Life Sciences Bldg., UC Berkeley campus.
    7:00 - talk followed by discussion, in same room.

Reservations required for beverages and dinner (but not the talk).  Please email reservations to your host, Brent Mishler, by Sunday, Feb. 6th.

For a map of campus and view of VLSB, use the link below.

All are welcome, members or not.  If you want to join the Biosystematists, sign up for our mailing list at: 

See you all there!

Friday, February 04, 2011

Indoor Air 2011; Austin, Tx; June 5-10 #microBEnet

Just thought I would give people the heads up - I am helping plan a session on "Microbiology of the Indoor Environment" that will happen at the "Indoor Air 2011" meeting in Austin, TX June 5-10 2011.  The conference itself covers an enormous amount of ground about, well, Indoor Air.  And I am helping the meeting organizer Rich Corsi plan a special session that will try to bring together (1) researchers working on culture-independent studies of the microbes in the indoor environment with (2) scientists and engineers and others who work on the indoor environment.  Will post more about this special session as details come out.  But thought I would give people a heads up ...

I note, this is a component of the Sloan Foundation's New program in Indoor Microbiology - I have received a grant from them to create something called microBEnet ("microbiology of the Built Environment network").  In this microBEnet project we will be working to foster communication, collaboration, research and other related activities for the Sloan Program.  More coming on microBEnet soon but if you want a little taste (a very preliminary taste) - see our blog here.

IQ Test for bacteria

Social IQ of bacteria
Another quick one here.  Interesting paper out in BMC Genomics: Genome sequence of the pattern forming Paenibacillus vortex bacterium reveals potential for thriving in complex environments

The paper is from Eshel Ben-Jacob and colleagues from many institutions around the world.

Here is a summary of the article (from the paper)

The pattern-forming bacterium Paenibacillus vortex is notable for its advanced social behavior, which is reflected in development of colonies with highly intricate architectures. Prior to this study, only two other Paenibacillus species (Paenibacillus sp. JDR-2 and Paenibacillus larvae) have been sequenced. However, no genomic data is available on the Paenibacillus species with pattern-forming and complex social motility. Here we report the de novo genome sequence of this Gram-positive, soil-dwelling, sporulating bacterium.
The complete P. vortex genome was sequenced by a hybrid approach using 454 Life Sciences and Illumina, achieving a total of 289× coverage, with 99.8% sequence identity between the two methods. The sequencing results were validated using a custom designed Agilent microarray expression chip which represented the coding and the non-coding regions. Analysis of the P. vortex genome revealed 6,437 open reading frames (ORFs) and 73 non-coding RNA genes. Comparative genomic analysis with 500 complete bacterial genomes revealed exceptionally high number of two-component system (TCS) genes, transcription factors (TFs), transport and defense related genes. Additionally, we have identified genes involved in the production of antimicrobial compounds and extracellular degrading enzymes.
These findings suggest that P. vortex has advanced faculties to perceive and react to a wide range of signaling molecules and environmental conditions, which could be associated with its ability to reconfigure and replicate complex colony architectures. Additionally, P. vortex is likely to serve as a rich source of genes important for agricultural, medical and industrial applications and it has the potential to advance the study of social microbiology within Gram-positive bacteria.

The organism is certainly interesting.  See for more detail (Eshel Ben-Jacob told me he updated the site).

But perhaps more interesting is the concept that Eshel Ben-Jacob has been pushing on bacterial social intelligence.  See for more detail:

The main idea behind this is to look at social communication strategies as a measure of intelligence.  And from a genomics point of view one can measure the genetic diversity of genes likely involved in these processes.  Such counting of genes is not the most useful thing in the world.  More importantly, these organisms really have some fascinating behaviors, and in the end we should measure behavioral diversity, not the genomic diversity of putative social genes, to measure "bacterial IQ".
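For anyone curious what this kind of gene counting looks like in practice, here is a toy Python sketch. It is not the scoring used by Ben-Jacob and colleagues. The annotation list, the category names, the set `SOCIAL_CATEGORIES` and the function `social_gene_fraction` are all hypothetical. It simply computes the fraction of annotated genes that fall in categories someone has decided to call "social".

```python
from collections import Counter

# Hypothetical annotations: (gene_id, functional_category).
# A real genome would have thousands of entries; these are invented.
toy_annotations = [
    ("gene_001", "two_component_system"),
    ("gene_002", "transcription_factor"),
    ("gene_003", "transport"),
    ("gene_004", "two_component_system"),
    ("gene_005", "other"),
    ("gene_006", "other"),
    ("gene_007", "defense"),
    ("gene_008", "other"),
]

# Which categories count as "social" is itself an assumption.
SOCIAL_CATEGORIES = {
    "two_component_system",
    "transcription_factor",
    "transport",
    "defense",
}


def social_gene_fraction(annotations, social_categories=SOCIAL_CATEGORIES):
    """Return the fraction of annotated genes in the chosen categories."""
    counts = Counter(category for _, category in annotations)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # avoid dividing by zero for an empty annotation list
    social = sum(counts[category] for category in social_categories)
    return social / total


fraction = social_gene_fraction(toy_annotations)
print(f"Fraction of putative social genes: {fraction:.3f}")
```

A number like this depends entirely on how genes were annotated and which categories were chosen, which is part of why gene counting alone is a weak proxy for behavior.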

Sirota-Madi, A., Olender, T., Helman, Y., Ingham, C., Brainis, I., Roth, D., Hagi, E., Brodsky, L., Leshkowitz, D., Galatenko, V., Nikolaev, V., Mugasimangalam, R., Bransburg-Zabary, S., Gutnick, D., Lancet, D., & Ben-Jacob, E. (2010). Genome sequence of the pattern forming Paenibacillus vortex bacterium reveals potential for thriving in complex environments BMC Genomics, 11 (1) DOI: 10.1186/1471-2164-11-710

More and more books focusing on #metagenomics these days ...

A friend/colleague Peter Turnbaugh just sent me a note about a new metagenomics book which he contributed to (see Metagenomics: Current Innovations and Future Trends). So I sniffed around Amazon and found a collection now of books that focus heavily on metagenomics. I do not know much about the quality of them but some look interesting. So I created a mini-Amazon collection of them: Metagenomics Books. Yet another sign metagenomics is still hot I suppose ...

Tuesday, February 01, 2011

#Badomics word of the day (or even month): Culturomics

Omics omics omics. The suffix is everywhere. And this paper has one of the worst ones I have seen in a while: Quantitative Analysis of Culture Using Millions of Digitized Books. The paper came out right when I started winter vacation, which is why I am getting to it now ...

In this paper, which is pretty cool, the authors make use of digitized books to track and study cultural trends. The data is f#*$#$ impressive. The results are very very interesting. The press coverage was very positive. The word, however, was and is awful. Did they really really have to call it culturomics, thereby contaminating their fascinating work with all the baggage of genomics? Really? Really? For that you get a Bad Omics Word of the Day Award.
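The basic idea of tracking a word through time can be sketched in a few lines of Python. This is a toy illustration under my own assumptions, not the authors' pipeline: the corpus `toy_books`, its texts and the function `yearly_relative_frequency` are invented. For each year it divides the number of times a word appears by the total number of words published that year.

```python
from collections import defaultdict

# Toy corpus of (publication_year, text) pairs; invented for illustration.
toy_books = [
    (1900, "the telegraph and the railway"),
    (1900, "the railway station"),
    (1950, "the radio and the television"),
    (2000, "the internet and the telephone and the television"),
]


def yearly_relative_frequency(books, word):
    """Return {year: occurrences of word / total words in that year}."""
    word = word.lower()
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in books:
        tokens = text.lower().split()  # naive whitespace tokenization
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    # Skip years with no words to avoid dividing by zero.
    return {
        year: hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


trend = yearly_relative_frequency(toy_books, "television")
for year, frequency in trend.items():
    print(year, round(frequency, 3))
```

Normalizing by the yearly total matters because the number of books published changes a lot over time, so raw counts alone would mostly track the size of the corpus.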

Anyway - here are some links to coverage of the culturomics work, which as I said, is quite impressive. Just wish they had come up with their own non omicy word:
Hat tips to many people including Morgan Langille, Elizabeth Pringle, Michael Dunn, Nick Matzke, Mihai Pip and Sergios-Orestis Kolokotronis for calling my attention to culturomics.
