- this paper in mBio that suggests there has been lateral transfer of LINE elements from humans to Neisseria species
- but then see this paper suggesting massive contamination of sequence databases with LINE elements (PLoS One paper on contamination)
So what is going on? It is not clear. If you want more detail about these papers, I suggest reading one of the following:
- Mark Pallen: Human DNA in bacterial genomes? Yes? No? Maybe?
- Ed Yong: Gonorrhea has picked up human DNA (and that’s just the beginning)
- Hannah Waters: Contaminated genomes
- Hannah Waters: Transitioning into “real” science journalism
There were other stories out there ... but since Hannah and Ed interviewed me, I am a bit biased about which ones are worth reading. Here are some others to read though
Personally, I am a bit skeptical of the LGT claim because most of the evidence they present relies on amplification (i.e., PCR). But without getting into too many of the details myself, I thought I would just post some background reading connected to some of my past work in this area, for anyone interested in this type of thing.
Information about the claim of HGT from bacteria into humans that was in the Lander et al. human genome paper:
- Link Between Human Genes and Bacteria Is Hotly Debated by Rival ...
- Salzberg et al. 2001. No bacterial to human HGT. Science.
One use of finishing genomes: helping rule out contamination
A short story I wrote in 1998 about, well, contamination in genome databases
My colleagues' assembly of nearly complete bacterial genomes from the raw sequence reads of fly genome projects
Complete mitochondrial genome(s) found in Chromosome II of Arabidopsis. It was very difficult to sort out which reads came from the nuclear genome and which from the mitochondria
Contamination is the bane of my existence. It's constantly making me look silly for the Rfam work. Frequently a perfectly good bacterial family contains high-scoring eukaryotic sequence. RNAI is one of the worst affected families. It's a beautiful phage encoded RNA that represses phage replication. Unfortunately the phage is used for a lot of sequencing projects and is often not cleaned up from the submissions to the sequence archives. So I'm left with a phage model that annotates a handful of phage homologs and THOUSANDS of contaminant sequences (see the 'species' tab). I've been considering using families like this to identify the authors who have submitted the most contaminated sequence to the sequence archives. I'm just not too sure if any benefit would come from this or not.