Saturday, June 10, 2023

Summary of responses to request for examples of data sets of microbial genomes with associated phenotypic data

Last week I posted the following request in several places:
Wanted - dataset(s) to test bacterial genome analysis / annotation methods. Ideally has many genomes from collection of [interesting] bacteria with associated experimental phenotypes / metadata.
On LinkedIn:
  • Jonathan Jacobs:
    • "All the reference genomes in the ATCC Genome Portal are freely available for non-commercial research purposes. They are also fully authenticated and traceable to physical production lots in our biorepository, and produced under ISO quality management. I’m biased, but I think we’re producing the quality microbial genomes I’ve ever seen on a regular basis - so you might want to look there. We’re producing about 100-150 new genomes every month - and many of them (about ~1/3rd) are for organisms with no preexisting genome. We have about 3,300 microbial genomes now (bacteria, viruses, fungi, protists), and all bacteria and fungi are sequenced on both Illumina and Nanopore. Drop me an email or DM if you want to learn more or collaborate or something. Here’s a link: (and here’s a link to a comparative genomics paper we published in mSpheres earlier"
    • And then "I should also add that we have tons of metadata and I’ve hired a full time data curator to help bring metadata we have in our historical records warehouse into our digital records (ie phenotypic data from routine QA/QC testing going back to the 1920s…)"
  • Natalie Ma wrote:
    • Joint Genome Institute may have this (if you're fine with an environmental microbes focus). Adam Deutschbauer has done several Tnseq libraries for the bugs and characterized their phenotypes.

On Facebook:

