However, one database from NCBI is driving me a bit wacky these days. This is the Sequence Read Archive (SRA). Known to some as the "Short Read Archive," this database is supposedly for storing "sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Life Technologies AB SOLiD System®, Helicos Biosciences Heliscope®, Complete Genomics®, and Pacific Biosciences SMRT®."
It certainly seems to be used for that function. But alas, storing sequence is not the only need here. Recovering sequence and making use of it is really the key. And this is the area I have been having trouble with (especially related to environmental studies like rRNA PCR and metagenomics). Rather than go on about my particular issues here (and thus possibly biasing the discussion too much), I am wondering what others think of the SRA? Usability? Ease of deposition? Ease of extraction? Missing features? Things it does or does not do well? Do we need a new system for environmental projects?
Any and all comments are welcome here, on Twitter, on Friendfeed, or wherever.
Here are some comments so far from Twitter:
- digitalbio Sandra Porter I agree. RT @phylogenomics: Though I generally love NCBI, the Sequence/Short Read Archive (SRA) seems to hav… (cont) http://deck.ly/~XM75A
- lswenson Luke Swenson @phylogenomics I was JUST trying to navigate the SRA! There's no help section to be found, and forget about depositing sequences!
- audyyy Davis-Richardson @phylogenomics I can never tell if my submission went through without emailing support. Also, no FASTQ support?
- cabbageRed Rich C .@phylogenomics I agree, the SRA doesn't seem to be the easiest repository to search with what I believe to be "typical" NGS queries