I am a former Research Associate of Jonathan’s, interested in understanding the evolution and ecology of microbes in natural environments. Recently I’ve become interested in the expression of secondary-metabolite-related genes in natural settings, to put the genes’ products into an ecological context, because microbes are almost certainly not making natural products just to benefit humans. I am currently studying these topics as a post-doc in Janelle Thompson’s lab at MIT.
When I got to
MIT there was a set of paired-end Illumina HiSeq data from six time points
collected over one day-night cycle from the Kranji Reservoir in Singapore,
which was experiencing a cyanobacterial Harmful Algal Bloom (cyanoHAB). Note that “algal” in this case means bacterial. I used to argue that this is taxonomically incorrect, but used colloquially I think it works. These samples are what the paper “Secondary
metabolite gene expression and interplay of bacterial functions in a tropical
freshwater cyanobacteria bloom” is based on.
MIT has a program in Singapore called the Center for Environmental Sensing
and Modeling/Singapore-MIT Alliance (CENSAM/SMART), and one of its projects is
to learn about microbial populations associated with the drainages and reservoirs across the city/state/country. The
motivation for the study (Penn, et al 2014) is based on two observations. 1) The idea to sample a day-night cycle of a
harmful algal bloom came from experiments done on marine Prochlorococcus showing major changes in
gene expression in the evening and morning and more similar profiles at noon
and midnight (Zinser, et al 2009). 2) An initial
sample collection and analysis for this study did not readily detect
genes for the toxin microcystin from drainages around the reservoir
catchment (Nshimyimana, et al 2014), indicating the cyanobacterium was growing in the reservoir (i.e. not
being flushed in). We knew the bloom in
the reservoir was dominated by Microcystis
aeruginosa, but now we wanted to learn whether microcystin toxin genes were
expressed in the reservoir and, if so, whether they were expressed around the clock.
cyanoHABs
Harmful algal blooms are of concern because they appear to
be increasing in frequency on a global scale. HABs are not only eyesores; they also produce toxins that make lakes
unusable for drinking water and recreation.
For a good introduction to HABs I suggest reading an excellent book “The algal bowl: overfertilization of the world's freshwaters and estuaries” by David
W. Schindler & John R.
Vallentyne. But I should note
there are probably thousands of books written on the subject. Below you can see what our study site looked
like during a bloom, with a surface scum visible, and during conditions where the water
is a bit clearer (post bloom).
Polyketide synthases (PKS) and Non-ribosomal peptide synthetases (NRPS)
The search for expression of microcystin toxin genes is also part of my larger interest in learning about the expression of PKS and NRPS genes in natural settings. PKS- and NRPS-derived molecules represent a large class of natural products, famous both as toxins and as medicines used to treat human disease. Two phyla of bacteria are historically known for their production of these compounds (Actinobacteria and Cyanobacteria). For example, the PKS- and NRPS-derived microcystin toxin is produced by M. aeruginosa, and members of the phylum Actinobacteria produce the potent antibiotic rifamycin. The expression and presence of most PKS and NRPS pathways in natural settings are currently not very well understood.
Prior to this work it was not clear that bacterial PKS and NRPS pathways are expressed in natural settings. The products of the microcystin pathways are present in harmful algal blooms (thus the term “harmful”). This made Kranji Reservoir a good system to study because we should observe the transcripts for microcystin. PKS and NRPS genes can be highly repetitive and similar between different pathways, so we were not sure we would find them with Illumina-type sequencing. Based on my initial tests using a tool called NaPDoS, which I helped develop at Scripps to quickly identify sequence tags from PKS and NRPS gene pathways, it was clear we could see the expression of many different pathways in our data. This spurred me on to look at the differences in expression over time. The examination of the time series revealed that there appears to be a rhythm to the expression of PKS and NRPS genes and that, strikingly, one of the most highly expressed PKS/NRPS gene clusters in M. aeruginosa has not been linked to a molecule. This is especially interesting from an ecological perspective: the cells are expressing this pathway at high levels, yet we do not know what it makes.
Interplay
One of the cool things about science is that it can be predictive. In an experiment on photosynthetic bacteria, you would hope that your expression data reflect the idea that photosynthetic life uses light to photosynthesize, and that the genes that code for the machines that harness light would be most highly expressed during the day. We call that the “sanity check,” and it came out very nicely in our metatranscriptomic data, showing that photosynthesis-related genes cycle in the environment and are highly correlated with the day-night cycle. Our observation that the things we expected to be highly expressed were highly expressed gave us confidence that our data may have other patterns that we would not necessarily think to look for. We started to look at broader categories of functional genes for the top four phyla. From this analysis we noticed that some phyla were enriched for particular genes relative to other phyla, which in turn allowed us to make some ecological predictions about how each group might be functioning in the bloom community. For example, look at figures 4 and 5 in the paper and you can see that Actinobacteria are mostly transporting photosynthetically derived carbohydrates, while Bacteroidetes groups are mostly transporting peptides. Furthermore, groups within the Proteobacteria are expressing most of the motility- and chemotaxis-related genes.
Quantifying natural microbial communities remains a significant challenge and, more importantly, identifying ecological functions for phenotypes promises to provide microbial evolutionary biologists with crucial data to learn about the evolution of bacteria. Imagine trying to study the evolution of a hand if you had no idea of the ecological function of the hand.
Problem Solving: paired-end reads
One of the important decisions we had to make before starting the analysis of the Illumina data centered on the state of paired-end sequencing in metatranscriptomics. Paired-end sequencing is a great boon for Illumina sequencing, and Illumina sequencing created a huge opportunity for the field of metagenomics. But paired Illumina reads that do not collapse into one can represent a large portion of an Illumina sequence run, despite efforts to create fragments short enough for the reads to overlap yet long enough to make the paired-end sequences more informative. Paired ends can complicate things because the two reads may represent two genes in one operon, or two genes from different operons, which is a problem for analyses trying to assign function to reads. The other issue is that when assigning taxonomy to reads, by chance alone similar sequences may match different organisms even though they are part of one pair. MEGAN tries to deal with this by increasing bit scores for sequences that match the same thing. We made the decision to use paired information to improve the confidence in function assignment in MEGAN if both reads hit the same gene, and to treat the 1 and 2 reads as separate when counting total reads matching a gene, if the read counts were not to be normalized to gene length. Another aspect of the study focused on calculating expression for genes from the bloom former M. aeruginosa using RPKM, which does take gene length into account. There we decided to treat the 1 and 2 reads as technical replicates, calculating an RPKM for each and averaging the values.
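For readers who want to see the arithmetic, here is a minimal Python sketch of the idea of treating the 1 and 2 reads as technical replicates. It is not the pipeline from the paper. It uses the standard RPKM definition (reads per kilobase of gene per million mapped reads), and the function names, the gene lengths, the read counts, and the choice to normalize each mate by its own mapped-read total are all assumptions made for illustration.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def paired_rpkm(counts_r1, counts_r2, gene_lengths, total_r1, total_r2):
    """Average the RPKM of the 1 reads and the 2 reads for every gene.

    counts_r1 / counts_r2: dict of gene -> number of reads hitting that gene
    gene_lengths:          dict of gene -> gene length in base pairs
    total_r1 / total_r2:   total mapped reads for each mate (assumed to be
                           tallied separately for the 1 and 2 reads)
    """
    averaged = {}
    for gene, length in gene_lengths.items():
        value_r1 = rpkm(counts_r1.get(gene, 0), length, total_r1)
        value_r2 = rpkm(counts_r2.get(gene, 0), length, total_r2)
        # The two mates are treated as technical replicates, so take the mean.
        averaged[gene] = (value_r1 + value_r2) / 2
    return averaged


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
gene_lengths = {"mcyA": 2000, "psbA": 1000}
counts_r1 = {"mcyA": 40, "psbA": 500}
counts_r2 = {"mcyA": 60, "psbA": 500}

example = paired_rpkm(counts_r1, counts_r2, gene_lengths,
                      total_r1=1_000_000, total_r2=1_000_000)
print(example)
```

With these made-up numbers the 1 reads give mcyA an RPKM of 20 and the 2 reads give 30, so the averaged value is 25. Averaging keeps a gene from being counted twice just because both mates of a fragment landed on it.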
Future
This experiment has given us the first glimpse of the expression of toxin genes in a natural setting and provided us with some clues about phylum-level microbial interplay. The next experiment to further test our observations includes a greater sampling effort: two day-night cycles, sampled at a greater frequency, with replicates, at both the surface and subsurface. This work is being done in collaboration with another research group interested in Microcystis and harmful algal blooms at the National University of Singapore, led by Prof. Karina Gin. It is known that M. aeruginosa strains migrate up and down in the water column, and we want to check whether some of our cyclic observations relate to different strains being present at the surface throughout the day. A follow-up study in progress is to look at the reservoir community during non-bloom conditions and run perturbations to identify the effects of adding nitrate, phosphate, and microcystin on the microbial community, in hopes of learning whether there are expression patterns that show how Microcystis is able to bloom.
My Background
The story behind the paper will be better understood if it is supplemented with a brief background about my introduction to genomics and microbial ecology, which mainly occurred after starting work as a Research Associate for Jonathan. Looking back “many years ago,” I had just finished up an undergrad degree at UCSB in aquatic biology and I was looking for a job as a scientist when I met Jonathan. It was really my first meetings with Jonathan that set my way forward in research. I wanted to learn about how things evolve and the ecological functions of traits, and Jonathan wanted to understand how all life evolved, which meant he was studying the genomes of microbes. In our first meeting we discussed how genomics and methods associated with genomics, namely 16S rRNA gene community studies, were going to allow us to learn all about microbial ecosystems and even allow us to do in situ ecological studies of microbes (the term metagenomics was not widely known or used at this time). As TIGR slowly evolved into JCVI, I began my move to grad school to work in Paul Jensen’s lab at Scripps Institution of Oceanography, which had recently sequenced the genomes of a couple of species of marine actinomycetes. In grad school I spent a lot of time learning about natural products and the genomes of a famous group of organisms called actinomycetes, which make about 80% of the antibiotics we take today. By the time I finished grad school I had become acutely interested in learning about the expression of natural-product-related genes in a natural setting.
Conclusion
Our latest paper, published in the ISME Journal, reflects a combination of my exposure to some very different fields of scientific research, from studying genomics and community diversity at The Institute for Genomic Research (TIGR) to my PhD work in natural products research at Scripps and now my studies on community gene expression dynamics in harmful algal blooms at MIT. I have been researching the ideas about in situ microbial ecology that Jonathan discussed with me those many years ago, and with this paper I continue to expand our knowledge about what microbes are doing in natural settings.
Of course I did not do this paper in a vacuum at MIT. Prof. Janelle Thompson organized the data collection, co-wrote the paper, and taught me a lot about the appropriate statistics we needed to analyze our data and interpret the results. Graduate students Tim Helbig and Sonia Timberlake helped me get going on the computer clusters here at MIT. One of my favorite parts of moving institutions is learning the ins and outs of new computer clusters. During this research I have been funded as a postdoctoral associate at MIT and subsequently by an NSF postdoctoral fellowship at the intersection of math and biology. Singapore CENSAM/SMART has supported our travels to Singapore along with sequencing costs.