The Tree of Life: Bacteria & archaea don't get no respect from interesting but flawed #PLoSBio paper on # of species on the planet

Tuesday, August 23, 2011

Uggh. Double uggh. No no. My first blog quadruple uggh. There is an interesting new paper published today in PLoS Biology, entitled "How Many Species Are There on Earth and in the Ocean?" (PLoS Biol 9(8): e1001127), by Camilo Mora, Derek Tittensor, Sina Adl, Alastair Simpson and Boris Worm. It is accompanied by a commentary by none other than Robert May, one of the greatest ecologists of all time: "Why Worry about How Many Species and Their Loss?"

I note - I found out about this paper from Carl Zimmer who asked me if I had any comments. Boy did I. And Zimmer has a New York Times article today discussing the paper: How Many Species on Earth? It’s Tricky. Here are my thoughts that I wrote down without seeing Carl's article, which I will look at in a minute.

The new paper takes a novel approach to estimating the number of species. I would summarize it but May does a pretty good job:

"Mora et al. [4] offer an interesting new approach to estimating the total number of distinct eukaryotic species alive on earth today. They begin with an excellent survey of the wide variety of previous estimates, which give a range of different numbers in the broad interval 3 to 100 million species"

....

"Mora et al.'s imaginative new approach begins by looking at the hierarchy of taxonomic categories, from the details of species and genera, through orders and classes, to phyla and kingdoms. They documented the fact that for eukaryotes, the higher taxonomic categories are “much more completely described than lower levels”, which in retrospect is perhaps not surprising. They also show that, within well-known taxonomic groups, the relative numbers of species assigned to phylum, class, order, family, genus, and species follow consistent patterns. If one assumes these predictable patterns also hold for less well-studied groups, the more secure information about phyla and class can be used to estimate the total number of distinct species within a given group."

The approach is novel and shows what appears to be some promise and robustness for certain multicellular eukaryotes. For example, analysis of animals shows a reasonable leveling off for many taxonomic levels:

Figure 1. Predicting the global number of species in Animalia from their higher taxonomy. (A–F) The temporal accumulation of taxa (black lines) and the frequency of the multimodel fits to all starting years selected (graded colors). The horizontal dashed lines indicate the consensus asymptotic number of taxa, and the horizontal grey area its consensus standard error. (G) Relationship between the consensus asymptotic number of higher taxa and the numerical hierarchy of each taxonomic rank. Black circles represent the consensus asymptotes, green circles the catalogued number of taxa, and the box at the species level indicates the 95% confidence interval around the predicted number of species (see Materials and Methods).
From Mora C, Tittensor DP, Adl S, Simpson AGB, Worm B (2011) How Many Species Are There on Earth and in the Ocean? PLoS Biol 9(8): e1001127. doi:10.1371/journal.pbio.1001127
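
To make the idea behind panel G concrete, here is a minimal Python sketch of extrapolating from higher-taxon counts down to the species level. It assumes the simplest possible relationship, a straight line through log10(count) versus rank number; the paper's own fitting procedure is more involved and is not reproduced here. The counts and the names `COUNTS`, `fit_line` and `predict_species` are invented for illustration and are not taken from the paper.

```python
import math

# Hypothetical counts of described higher taxa for one well-studied group.
# These numbers are invented for illustration; they are NOT from Mora et al.
RANKS = ["phylum", "class", "order", "family", "genus"]
COUNTS = [30, 90, 500, 5_000, 90_000]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_species(counts):
    """Extrapolate one rank past the last entry (genus -> species).

    Assumes log10(count) is linear in rank number, which is a
    simplification chosen for this sketch.
    """
    xs = list(range(1, len(counts) + 1))
    ys = [math.log10(c) for c in counts]
    intercept, slope = fit_line(xs, ys)
    species_rank = len(counts) + 1
    return 10 ** (intercept + slope * species_rank)


if __name__ == "__main__":
    estimate = predict_species(COUNTS)
    print(f"extrapolated species count: {estimate:,.0f}")
```

The sketch only works if the higher-rank counts are close to complete. Feed it raw counts that are still climbing and the extrapolated species number is only as good as those inputs.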

They also do a decent job of testing their use of higher taxon discovery to estimate number of species. Figure 2 shows this pretty well.

Figure 2. Validating the higher taxon approach. We compared the number of species estimated from the higher taxon approach implemented here to the known number of species in relatively well-studied taxonomic groups as derived from published sources [37]. We also used estimations from multimodel averaging from species accumulation curves for taxa with near-complete inventories. Vertical lines indicate the range of variation in the number of species from different sources. The dotted line indicates the 1∶1 ratio. Note that published species numbers (y-axis values) are mostly derived from expert approximations for well-known groups; hence there is a possibility that those estimates are subject to biases arising from synonyms.

So all seems hunky dory and pretty interesting. That is, until we get to the bacteria and archaea. For example, check out Table 2:

Table 2. Currently catalogued and predicted total number of species on Earth and in the ocean.

Their approach leads to an estimate of 455 ± 160 Archaea on Earth and 1 in the ocean. Yes, one in the ocean. Amazing. Completely silly too. Bacteria are a little better: an estimate of 9,680 ± 3,470 on Earth and 1,320 ± 436 in the oceans. Still completely silly.

Now the authors do admit to some challenges with bacteria and archaea. For example:

We also applied the approach to prokaryotes; unfortunately, the steady pace of description of taxa at all taxonomic ranks precluded the calculation of asymptotes for higher taxa (Figure S1). Thus, we used raw numbers of higher taxa (rather than asymptotic estimates) for prokaryotes, and as such our estimates represent only lower bounds on the diversity in this group. Our approach predicted a lower bound of ~10,100 species of prokaryotes, of which ~1,320 are marine. It is important to note that for prokaryotes, the species concept tolerates a much higher degree of genetic dissimilarity than in most eukaryotes [26],[27]; additionally, due to horizontal gene transfers among phylogenetic clades, species take longer to isolate in prokaryotes than in eukaryotes, and thus the former species are much older than the latter [26],[27]; as a result the number of described species of prokaryotes is small (only ~10,000 species are currently accepted).

But this is not remotely good enough from my point of view. Their estimates of ~ 10,000 or so bacteria and archaea on the planet are so completely out of touch in my opinion that this calls into question the validity of their method for bacteria and archaea at all.

Now you may ask: why do I think this is out of touch? Well, because reasonable estimates are more on the order of millions or hundreds of millions, not tens of thousands. To help people feel their way through the literature on this I have created a Mendeley group where I am posting some references worth checking out.

Number of species of bacteria and archaea is a group in Biological Sciences on Mendeley.

I think it is definitely worth looking at those papers. But just for the record, some quotes might be useful. For example, Dan Dykhuizen writes

we estimate that there are about 20,000 common species and 500,000 rare species in a small quantity of soil or about a half million species.

And Curtis et al write:

We are also able to speculate about diversity at a larger scale, thus the entire bacterial diversity of the sea may be unlikely to exceed 2 × 10^6, while a ton of soil could contain 4 × 10^6 different taxa.

Are their estimates perfect? No surely not. But I think without a doubt the number of bacterial and archaeal species on the planet is in the range of millions upon millions upon millions. 10,000 is clearly not even close. Sure, we do not all agree on what a bacterial or archaeal species is. But with just about ANY definition I have heard, I think we would still count millions.

Given how horribly horribly off their estimates are for bacteria and archaea, I think it would have been better to be more explicit in admitting that their method probably simply does not work for such taxa right now. Instead, they took the approach of saying this is a "lower bound". Sure. That is one way of dealing with this. But that is like saying "Dinosaurs lived at least 500 years ago" or "There are at least 10 people living in New York City" or "Hiking the Appalachian Trail will take at least two days." Lower bounds are only useful when they provide some new insight. This lower bound did not provide any.

Mind you, I like the paper. The parts on eukaryotes seem quite novel and useful. But the parts of bacteria and archaea are painful. Really really painful.

Mora, C., Tittensor, D., Adl, S., Simpson, A., & Worm, B. (2011). How Many Species Are There on Earth and in the Ocean? PLoS Biology, 9 (8) DOI: 10.1371/journal.pbio.1001127

27 comments:

Carl, 8/24/2011 6:44 AM
"The parts on eukaryotes seem quite novel and useful"

Some experts on eukaryotes--beetles and fungi--beg to differ! See my article in the New York Times for the divided opinion: http://www.nytimes.com/2011/08/30/science/30species.html

T Ryan Gregory, 8/24/2011 7:11 AM
Out of curiosity, what counts as a species in prokaryotes these days? Seems this could severely influence the estimate. (Not that the definitions are agreed upon in euks either!).

TFox, 8/24/2011 8:50 AM
Haven't read the paper, only your post. But: this looks like a basic difficulty with extrapolation in general. Eg, the curves in Fig 1, I could imagine fitting them to A(1-exp(k t)), and getting an estimate of an asymptotic value. But picking a different model gives a different value, and eg a model like log(t) might fit just as well but not give a finite asymptote. In general, I'd suspect the predicted asymptote would depend much more sensitively on the model selected than the data, which doesn't seem like a good thing. And there's no a priori reason I can think of for preferring any one given model over another. I don't know what "multimodel" means, exactly, but it doesn't sound good :) For the curves with no tail off at all, which I guess is true for prokaryotes, the problem is more obvious, as all models will try to predict infinity as the number of species. Using the same analysis would prevent summing the numbers for prokaryotes and eukaryotes in the same table, but maybe that's exactly the problem: the simplest straightforward approach makes too obvious the difficulties of extrapolation.
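
The point about extrapolation can be tried out directly. Below is a minimal Python sketch, using only the standard library, that fits the two model shapes mentioned in the comment to the same synthetic accumulation curve: a saturating exponential, written here as A(1 - exp(-k t)) with k > 0, and a logarithmic curve. The data, the grid search over k, and the function names are all assumptions made for illustration; none of it comes from the paper.

```python
import math

# Synthetic accumulation curve: cumulative taxa described after t years.
# Invented for illustration only; this is not data from the paper.
YEARS = list(range(1, 41))
TAXA = [100 * math.log(1 + t) for t in YEARS]  # keeps growing, no plateau


def sse(pred, obs):
    """Sum of squared errors between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs))


def fit_saturating(ts, ys):
    """Fit y = A * (1 - exp(-k t)) by grid search on k.

    For a fixed k the best A has a closed form, so only k is searched.
    Returns (A, k, SSE); A is the asymptote this model predicts.
    """
    best = None
    for i in range(1, 2001):
        k = i / 2000  # k ranges over (0, 1]
        f = [1 - math.exp(-k * t) for t in ts]
        a = sum(y * fi for y, fi in zip(ys, f)) / sum(fi * fi for fi in f)
        err = sse([a * fi for fi in f], ys)
        if best is None or err < best[2]:
            best = (a, k, err)
    return best


def fit_log(ts, ys):
    """Fit y = a + b * ln(t) by least squares; this model has no asymptote."""
    xs = [math.log(t) for t in ts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    a = my - b * mx
    return a, b, sse([a + b * x for x in xs], ys)


if __name__ == "__main__":
    A, k, err_sat = fit_saturating(YEARS, TAXA)
    a, b, err_log = fit_log(YEARS, TAXA)
    print(f"saturating model: asymptote A = {A:.1f}, SSE = {err_sat:.1f}")
    print(f"log model: slope b = {b:.1f}, SSE = {err_log:.1f} (no finite asymptote)")
```

The saturating model always reports a finite asymptote, even for data generated by a curve that never levels off. The log model never reports one. So the predicted total can depend on which family of curves is allowed as much as on the data.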

Jonathan Eisen, 8/24/2011 8:53 AM
Well, Ryan, there is a lot of work on species concepts for bacteria / archaea but not complete agreement. I think there is some good stuff from Peg Riley, Howard Ochman, Jim Staley, Fred Cohen, and others. Maybe I will create a Mendeley group for this too.

Jonathan Eisen, 8/24/2011 8:55 AM
TFox - I did not even get to the point you are at. Was too peeved at the microbial part ...

Jonathan Eisen, 8/24/2011 9:09 AM
Carl - Thanks for pointing that out. I may add an update to the post with some of the comments from around the web and in your paper.

Mike Keesey, 8/24/2011 9:22 AM
Dinosaurs still live--they're in that chart! (Aves)

While I haven't read the paper, the fact that it uses arbitrary ranks and arbitrary taxa like "Reptilia" makes me deeply suspicious.

Jonathan Eisen, 8/24/2011 9:28 AM
Crap - Mike - I can't believe I did not think of the bird - dinosaur thing

Morgan Langille, 8/24/2011 9:38 AM
Got an email through a university mailing list, then saw news headlines about the story, and I said to myself "they must just be counting Eukaryotes". Then I skimmed the paper and found out they estimated ~10,000 for Bacteria and Archaea. Brutal! I guess none of the reviewers did research on Bacteria and Archaea, since there is no way they would have let them publish this. It would have been fine if they had estimated just for Eukaryotes (and stated so in the title), but I guess that isn't as headline catching as "all organisms on the planet". What a joke!

Mike Keesey, 8/24/2011 10:20 AM
Few do -- still an excellent point, though.

Jonathan Eisen, 8/24/2011 10:26 AM
Well, Mike, few may think about it, but I do all the time. I have posted about it many times on twitter and taught about it. Just slipped my mind here ...

NickM, 8/26/2011 3:18 PM
Why doesn't someone just apply this whole approach, but ignore Linnaean ranks and "species" designations, and just look at "number of lineages of age t", where t is 1 mya, 10 mya, 100 mya, 1 bya, etc.

(In other words, just make lineages-through-time plots and look at specific timepoints)

You could build accumulation curves in just the same way. Presumably we are nearer to sampling all of the 1 bya lineages that exist presently on the Earth than we are to sampling all of the 1 mya lineages.

There is the minor problem of dating phylogenies, but ballparking that isn't very hard.

I suppose this study would tell us more about how much stuff is in the sequence databases, rather than in the taxonomy lists, but that would still be interesting. And you could still use the scaling approach.

LOL - email me if anyone wants to collaborate. matzkeATberkeley.edu

Camilo Mora, 12/12/2011 12:44 PM
I would like to provide my response to several comments that have been mentioned here that will not arise in a peer-review setting and that make blogs a dangerous venue for information delivery as it reduces the credibility of findings regardless of scientific support.
Let me expand...
A person finds “suspicious” the use of a unique model to make a prediction (since multiple models may exist, choosing one is like cherry picking); this same person acknowledged that he/she did not read the paper. If this person had indeed read the paper, he/she would have noted that our paper covered multiple models and that, to avoid choosing the predictions of one single model, we used multimodel averaging, based on the principles of fit and parsimony, to weight each model prediction and obtain a single prediction from multiple models. This is the most advanced statistical method to deal with the issue of model selection; and in our case, several hundred models were run.
Another person suggests that the whole paper should be tossed because it did not predict what he thinks the diversity of bacteria should be. This is ridiculous; this is like saying that Galileo should have tossed his idea that the Earth moved around the sun because everybody at the time thought otherwise. More importantly, science is not about believing but testing; if there is something we know well, at least from taxonomy, it is that opinions change and are most often wrong. Philippe Bouchet (2005) pointed out, for instance, how a given scientist reported in a paper that he believed there were over 10 million species in a given taxonomic group; ten years later, this same scientist concluded in another paper that in reality the same taxonomic group should not have more than 1 million species. Similar examples are everywhere. So while opinions are fine when we do not have much, they should be treated with care, they do not represent scientific evidence and certainly should avoid being vicious towards other scientists.
Now focusing particularly on bacteria, I certainly doubt there will be millions to tens of millions of these species. If this was to be the case, probabilistically we should know more than the ~10,000 species we know today. In combination all major repositories of species of bacteria do contain around 10,000 species in total. However, I can see how molecular data would indicate the existence of many lineages in bacteria. In fact, if we were to put a plate of the same strain of bacteria under two different temperatures, probably by the end of a week, these two environments should make the bacteria diverge to genetic levels likely to be considered different species. If we were to extrapolate these conditions to the millions of environments on Earth, then it will not be surprising to think that there should be millions of lineages. However, are they different species? If we were to extrapolate this criterion to humans, we could also find a few millions to billions of human lineages. This is a conflict about the definition of species unlikely to be resolved here, but it suggests, as a blogger wisely pointed out here, that the number of species is a function of what we define as a species. Our paper relies on data that have been collected under different conventions for over 250 years; of course, our conclusions apply only to those conventions. If we were to change the definitions, I will certainly agree that data will change and so will our predictions about the number of species.

I recommend that the audience of this blog read our paper; it is free. Technically, the paper is boring and long because, due to the peer-review process, we spent over two years and a long history with statistical analyses to assess the potential effect of limitations that were evident to us and the reviewers at the time. We welcome criticism and improvement in our approach, but please let’s avoid being demeaning.
Camilo Mora, Ph.D.
Assistant Professor
University of Hawaii Manoa
http://www.soc.hawaii.edu/mora
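
For readers unfamiliar with multimodel averaging, here is a minimal Python sketch of one common form of it: weighting each model's prediction by its Akaike weight, which rewards fit and penalizes extra parameters. The comment above does not name the criterion used, so the use of AIC here is an assumption, and the predictions and AIC scores are invented for illustration.

```python
import math


def akaike_weights(aics):
    """Turn AIC scores into weights that sum to 1 (lower AIC, higher weight)."""
    best = min(aics)
    rel = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]


def model_average(predictions, aics):
    """Weighted mean of the model predictions, using Akaike weights."""
    weights = akaike_weights(aics)
    return sum(w * p for w, p in zip(weights, predictions))


if __name__ == "__main__":
    # Hypothetical asymptote predictions from three candidate models,
    # with hypothetical AIC scores. Not values from the paper.
    predictions = [1200.0, 1500.0, 2400.0]
    aics = [100.0, 101.0, 108.0]
    for w, p in zip(akaike_weights(aics), predictions):
        print(f"prediction {p:.0f} gets weight {w:.3f}")
    print(f"averaged prediction: {model_average(predictions, aics):.0f}")
```

Averaging removes the need to pick a single winner, but the result is still limited to the set of candidate models that were fitted in the first place.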

Jonathan Eisen, 12/12/2011 1:13 PM
Thanks for the response Camilo

With all due respect however, I disagree with your comment about blogs. You say "... blogs a dangerous venue for information delivery as it reduces the credibility of findings regardless of scientific support."

Really? How do blogs do this? Did blogs do that in the arsenic-life case? Actually don't answer that - I will - blogs are what exposed the arsenic-life story to be likely incorrect. There are 1000s of examples of how blogs enrich the scientific discussion around papers. Certainly people say some things in blogs and in comments on blogs that are mean, uncalled for, unjustified and silly. I try to delete any comments that are offensive but I don't think any here are. Are you suggesting people should not discuss your paper? Then why did you discuss it with the press? Are you suggesting people should only discuss your paper if they agree with it? I assume you would not suggest that. Are you suggesting people should only discuss your paper privately? That would seem silly to me. Blogs are a way that people discuss scientific papers these days, just like they are discussed in newspapers and magazines and such. What exactly do you mean by saying blogs are a dangerous venue? Should we ban blogs?

Ernesto Guevara, 12/12/2011 7:05 PM
Well, of course there are blogs and blogs

Psi Wavefunction, 12/12/2011 8:20 PM
Camilo,
Shall we, perhaps, decouple species counts from diversity then?

Even simply going by, say, the ecological species concept, there are unquestionably more microhabitats/microniches than those on the macro- scale. I'm amazed a couple of your co-authors (who may, btw, disagree that blogs are dangerous and evil) went along with a thoroughly defunct kingdom categorisation and didn't throw a fit over 'protozoa', but that aside, there's no freaking way protists have less species than Animalia, given that 'protozoa' comprise the vast majority of eukaryotes, including a few notorious groups of parasites. Given that there appears to be roughly at least a species of apicomplexan specialising in each species of arthropod that is examined, and apicomplexa are but one small group of protists, making home in plenty of things non-arthropod as well, this already suggests their species count should be at least on the same order of magnitude as that of animals. The reason much fewer are described is that those who went about describing obscure beetles in the tropics are generally too preoccupied with things to slice them open and investigate their microbial diversity. In other words, even if we take the daring assumption that free-living microbes suck at speciating, the speciation of animals alone seems to provide a plethora of microhabitats encouraging isolation of microbial populations into species-like entities.

Now if we look at the protists... my are they ever full of prokaryotic symbionts of all shapes and sizes! Even just my boring Paramecium aurelia species complex hosts a good 10-20 different bacterial species -- and definitely distinct species, given they're polyphyletically distributed among a sea of things that don't live in Paramecium. Some days it seems like any protist we look at is full of its very own special life forms, acting as distinct populations with restricted gene flow between them -- fairly species-like. As much as I love eukaryotic microbes, I must concede that prokaryotes far defeat us in diversity.

It is clear that prokaryotes are the most diverse, by far, of any life form on the planet. They're the deepest branching, they've had the longest time to evolve, they've had the rapid generational turnover and they have the smallest, most specific and unique microhabitats available to speciate into, if they feel so inclined. Now, whether our macroeukaryocentric (hell, zoocentric even) species concepts can adequately describe that diversity is a whole other question, and perhaps your method works fine in that approach, and does a nice job at revealing its inadequacies in dealing with the microbial essence of life.

Again, this is from a *eukaryotic* microbiologist.

I'm curious whether you agree with the thought that the paper focuses on species counts as defined by traditional taxonomic approaches rather than necessarily true diversity. Both measurements are interesting and valid in their own right, but I think it's a distinction that really needs to be emphasised over and over again.

Regards,
-"Psi"-

Psi Wavefunction, 12/12/2011 8:25 PM
PS: People *are* quick to dismiss everything and pile up on things in the blog world, but how is that any different from your typical journal club, where the participants end up definitively proving that the research was done by humans fighting a trade-off curve with finite time and money? It ain't blogs, it's just how people work...

Camilo Mora, 12/12/2011 9:18 PM
Hi Jonathan,
We did our work as transparently as we could, using the most advanced methods known and using the most up-to-date data available. If we are proven wrong, I would not feel bad about it … that is part of science and I know we did not deceive anyone, the work was done under all ethics, and we outlined and tested all limitations that were evident to us.
Of course, I am willing to take the heat, critiques and suggestions that come with any discovery. The difference between the blog style and a reviewed paper is that in the latter, at some point, the evidence presented is subject to evaluation and the language to moderation. How can a person suggest that a paper is suspicious after acknowledging that the paper was not read? How can a person use adjectives and sarcasm about a scientific paper based on opinions? The tone of the blog is just open for an unnecessary fighty debate. We can have a healthy discussion or debate without labeling or demeaning anyone.
Hi Psi. We work with what is available. Part of the discussion emerging from our paper is related to the fact that many conventions have different definitions of species and not everyone agrees with them. So our numbers need to be judged under the context of those conventions. There are other limitations, of course, some of which are related to the quality of the data. There is nothing wrong with the pattern but with the data. We acknowledged this up front in the paper for prokaryotes, protozoa and fungi. We also acknowledged that Protozoa was not broadly accepted; that is what is used in the databases currently developed; we explicitly called for an improvement of this situation. Many of these critiques could be toned down if the paper is read in full.
Thanks for the discussion,

Camilo

Jonathan Eisen, 12/12/2011 10:49 PM
Camilo

I agree that language moderation sometimes does not happen in blogs and/or in comments on blogs. But in a way - that is par for the course for such a venue. They are more akin to conversations about something than formal scientific publications. And as such, I like them.

As for the lack of peer review in blogs and blog comments, I am not sure you want to go there. You made extensive comments about your paper to the press (e.g., here and here). In addition, in your press release you have extensive comments and even figures that do not appear to be from the paper. You even have audio interview files posted. I think trying to communicate with the press is a great idea. However, to then turn around and suggest that other people's comments need peer review (especially when you do not like them) is disingenuous at best.

Yes, comments in many venues can sometimes be informal, awkward, misguided and inappropriate. But I am not really seeing that here. I agree with you that I sometimes undervalue comments from people that have not read a paper on which they comment - but I do not value their comments as "zero" automatically. People can have insight and useful comments from reading news stories, press releases, and blog postings, to name a few. Are you suggesting that people have to read the paper to comment about it? Then why talk to the press at all and/or why put out a press release. It seems that you want the press to report on the paper - to sum it up for others - and to present it in a positive light. Well, some people out there - including many scientists - did not agree with parts of your paper. Restricting those who can comment to those who have read it and/or who get their comments peer reviewed is a strange concept to me and again seems disingenuous.

I get that you did not like what some people said. How about just coming out and saying "I did not like what those people said - for these reasons" rather than implying that the problem here is blogs or lack of peer review.

Karl Brand, 12/13/2011 6:03 AM
Prof. Mora, without this blog and its author's Twitter feed, I never would have discovered your meritorious paper (which I did then read). I tend to avoid mainstream news sites when it comes to science. Too many competing interests. Blogs on the other hand facilitate discourse, which IMHO is the essence of science ("bio" = life, "logos" = discourse). And they are only dangerous to those who seek to manipulate mainstream news for their own less laudable pursuits.

Morgan Langille, 12/13/2011 9:41 AM
My understanding is that the authors' model(s) do not work on Bacteria or Archaea. This basically comes down to the fact that the number of taxa at each level does not approach an asymptote for Bacteria and Archaea as it does for other taxonomic groups such as Animalia. The model also might not work well for some groups such as Protozoa, but I am not going to talk about those groups right now.

I think the work is great for those groups of taxa that give evidence that the model works. However, the problem and criticism come when the authors try to predict the number of all species on earth when they can only make predictions for a subset of taxa. Jonathan's original blog post outlines why he has a problem with this and with the simple method of just using the number we have already observed as a "lower limit". He has the opinion (as do many others) that this so grossly underestimates the number of bacterial and archaeal species that the total prediction of 8.7 million is inaccurate for all species on Earth.

What isn't clear to me from reading the paper or from Mora's comments above is why they used this lower limit species count for Bacteria and Archaea?

Initially, I thought they just wanted to have some number for Bacteria and Archaea in the paper so they could claim a nice number for all species "on earth and in the ocean". Not having any method to use, they then just applied the lower limit approach. This comes across as offensive to Eisen and others (including myself) because we believe that the lower limit is vastly under-representative. It is not that difficult to imagine how they feel. Say I develop a method that I show can predict the number of bacteria and archaea species. This method does not work for eukaryotes, but I write a new paper saying that the total number of species on earth is my predicted number of species for bacteria and archaea plus the number of eukaryotes we have already observed (this is a lower bound since I can't actually predict them). See how that doesn't really seem that honest or accurate?

However, I am starting to feel that Mora's opinion is that there are not that many more species of Bacteria or Archaea on earth than what we have already observed, and thus that they are negligible in the error of their total prediction of all species on earth. That would mean that the authors are not trying to be deceptive but rather that they think of bacteria and archaea as a small class of taxa that do not have many species. For example Mora writes above:

"Now focusing particularly in bacteria, I certainly doubt there will be millions to tens of millions of these species. If this was to be the case, probabilistically we should know more than the ~10,000 species we know today."

I would like to know what Mora means by "probabilistically". They have already shown and stated that their model doesn't work on Bacteria and Archaea, so is there some other evidence suggesting this? This would be interesting to know because new sequencing technologies in the past few years have allowed microbiologists to start sampling the number of species in the environment and extend beyond the small subset of species that have been previously described because they are culturable. These analyses do not show any trend that we are starting to reach a plateau of new species, and would contradict Mora's claim that there would not be millions of these species.

Camilo Mora, 12/13/2011 12:52 PM
Hi Jonathan,

Ok, I give you that and I apologize, I should not have blamed the blog style. What I question is the tone. You can tell me that you disagree with the RESULTS of my paper because in your opinion the diversity of bacteria should be higher. I would have responded by saying, I agree; in fact we caution those results and hope that this will motivate more research to catalog all the diversity of that group….a constructive and perhaps productive discussion. If we open the discussion by saying from the title of the blog that the paper is flawed, which has not been demonstrated empirically, running sarcasm from the beginning, then of course the debate is open to be unnecessarily nasty and probably unproductive.
And do not get me wrong, I am in favor of press outreach; it takes a lot of time to prepare but it can be fun; more importantly it is a unique and free opportunity to educate the public at large. We see as a major achievement that this paper has been viewed over 40,000 times; yet a single video by Justin Bieber on YouTube has been watched 600 million times (seriously, check it out yourself). In a time when the world can only change for the better when people are better educated, we need to use all opportunities to spread the knowledge we generate. Nowadays, we have a social responsibility to let people know about our planet, and better from us than from people that do not know about it. This is not a presidential race, so I do not see why we have to get nasty in that process. I do not say this to undermine your opinion in any way and write this with the modesty that my English allows me. I am sure we would probably be enjoying this discussion if we were face to face and perhaps having a beer.

Hi Morgan,
You find “Offensive” that our paper reported a lower boundary of species of bacteria and that we may have done this to claim a “nice” number for all species on Earth. You then went on to say that I may truly believe that there may not be so many species in Bacteria. As with Jonathan, I apologize because it was not my intention to OFFEND you or anyone else. Our intention was to report a pattern that fits well taxonomic groups for which we already have a good knowledge of their higher taxonomy. For other groups, the reported numbers will increase in accuracy as the higher taxonomy gets more complete and more accepted (this was explicit in the text of the paper). We did report the results for bacteria and other groups not to have a “Nice” number but to show that the method has its pitfalls, which can be solved with additional data. This is a normal scientific procedure: report the positives and negatives. Additionally, the poor quality of the data does not undermine the validity of the pattern even for bacteria; it just means that the discovered pattern and developed method will increase in reliability when we get more and improved data (again, this is explicit in the text of our paper).
There is also a misinterpretation of my comment about the diversity of bacteria. What I said is that under the current convention for the definition of bacterial species I doubt that in the end we will have millions to tens of millions of these species. The taxonomic trend for other groups would suggest that if there were millions of species of bacteria, by now we should know more than only 10,000 species. Now, if a new genetic approach was to be used to define species, then yes, millions of bacteria species are likely to exist. In short, what I mean to say is that numbers will change depending on what rules one uses to define a species (this was also cautioned in the paper).
Jonathan Eisen, 12/13/2011 2:22 PM
Camilo

Thanks for the continued comments. I must say, I just do not really know what to say here but will try.

1. I have read and reread my post about 10 times. I do not think it is too sarcastic or negative.

2. I stand by the title of the blog post. I think your paper is interesting but flawed.

3. I do not think I have to "empirically" show as you suggest that your paper is flawed in order to comment about it.

4. I think your paper is fundamentally flawed in its treatment of microbes. I believe in particular you did a poor job of placing your paper in the context of prior work on the subject. That is why I think your paper is flawed. I am not just making up numbers but I am citing other studies. And to help people out I created a database of papers people could look at to compare to your study. This DB was embedded in my post. I noted that since your estimates were so different from the published peer reviewed literature on the topic, this suggested your method was flawed. I stand by that statement. You did a very poor job of integrating your work with the prior published literature on bacteria and archaea.

5. Even if I had not cited published literature I stand by my and other people's ability to critique your paper, to point out its flaws if we believe there are any, and to discuss it in a public forum without rubber stamping simply because it is published.

6. I find your comments about outreach to the public to be almost ludicrous. Sure it is a free opportunity to educate the public. But it is not a free opportunity to make claims that are possibly erroneous. It is also not a free opportunity to ignore the published literature on a topic. The point of my and many other people commenting about your paper was that we found the press coverage of it in the first few days to be distasteful in that there were claims made about species on the planet that both seemed wrong and were inconsistent with current knowledge and the literature.

7. I agree with you that one should not be nasty in the process. I did not criticize you personally as happens in presidential races. I criticized the content of your paper. I again stand by the critique. In fact, I believe I went out of my way to say positive things about your paper in my post.

I know it may not seem that I am at a loss for words. But I still feel like I am.
Ian Holmes, 12/14/2011 8:43 AM
Dr Mora, discussion of scientific papers on blogs *is* a form of peer review. Post-publication, rather than pre-publication, of course.

I thought Dr Eisen's comments on your paper were vigorous but (ultimately) perfectly civil. I know it can be uncomfortable to have your work criticized in public, but I'd urge you to try and take the criticism constructively. I've seen far ruder comments on my own work in the context of anonymous private reviews, I can assure you :-\
Camilo Mora, 12/14/2011 6:10 PM
Hi Jonathan,

I agree with you, and this was cautioned in the paper, that for micros the method did not work well because of the quality of the data; there was no reason to provide comparison with other estimates because in advance we said the estimations for these groups are likely off. Saying that the paper is flawed based on that will dismiss its validity for groups for which the method is more robust and where it was tested. I see your side that the applicability of the method to one group should not be generalized, especially when it undermines the diversity of other groups. Our paper, however, explicitly cautioned those cases.

Regarding the press, we appreciated the limitations of our method for micros. That is why our press release explicitly excluded any reference to Prokaryotes (check for yourself in our press release at the link below). In fact, we said in our press release that our estimates did not include “certain micro-organisms and virus “types”, for example, which could be highly numerous”. We feared that if reporters did not use the paper completely, especially outlining the limitations we describe for some groups, our results could be taken out of context by other scientists. That is why prokaryotes were taken out of the press outreach.

http://www.soc.hawaii.edu/mora/PressNumberOfSpeciesPaper/PressReleaseSpecies.pdf
Jonathan Eisen, 12/14/2011 8:37 PM
Camilo

Thanks again for the continued discussion. Some comments:

1. I said very clearly in my post that I thought the paper seemed potentially useful and valuable for eukaryotes. For example, at the end: "The parts on eukaryotes seem quite novel and useful."

2. I believe your paper was unclear in many parts on its utility for bacteria and archaea. You made some claims and had some lines I felt were invalid. These included:

Title: How Many Species Are There on Earth and in the Ocean?

No caveats in the title about poor treatment of bacteria and archaea.

Author summary:

"follows a consistent pattern from which the total number of species in any taxonomic group can be predicted. "

"Assessment of this pattern for all kingdoms of life on Earth predicts ~8.7 million (±1.3 million SE) species globally, of which ~2.2 million (±0.18 million SE) are marine"

Introduction

"Here we present a quantitative method to estimate the global number of species in all domains of life"

"that such a pattern allows the extrapolation of the global number of species for any kingdom of life (Figures 1 and 2)."

and so on

3. I noted in my post that you had caveated some of the statements about bacteria and archaea. But I stated that I did not feel this was good enough. I thought you should have left them completely out of your figures and estimates since the method did not seem to work for them. Including them at all, to me, had the potential for many misinterpretations. And this is exactly what happened in the press coverage, which, I note, you seem to have contributed to.

See for example

ABC News

Where you are quoted saying "This is the first time we've delivered a method that gives a number. The number we have come out with 8.7 million species. It is not only precise, but it is also accurate - we have validated it"

No caveats there.

Then they report "He says the number of prokaryote species, which include bacteria, is only approximately 11,000."

An ABC online story

"Our estimate is that there are 8.7 million species that live on the planet with us."

No caveats again

Only a few reports, like that of Carl Zimmer in the New York Times, pointed to some potential flaws in your method.

So -- I appreciate that you made some caveats to your estimates for bacteria and archaea. I think you did not do a good enough job in making it clear in your paper or in your interactions with the press that your estimates should only be applied to some taxa on the planet.

Furthermore, I note, as Carl Zimmer reported, some also called into question your method as it applied to other taxa such as fungi. And others were skeptical of whether the method could be extended from the well studied taxa to the poorly studied ones.

So - in the end - though your paper may have many strong parts, and may be highly novel and interesting, your paper does not appear to have been a perfect formation - it is not the Mona Lisa. It has some flaws. Discussing those is part of how science progresses. Overselling your paper to the press or trying to hide the flaws is not how science progresses. That is how science heads down a dangerous path. One I do not want to see.
SRM, 3/26/2013 2:52 AM
In fact, if we were to put a plate of the same strand of bacteria under two different temperatures, probably by the end of a week, these two environments should make the bacteria to divert to genetic levels likely to be considered different species.

Huh? Seriously? How would that work? It would seem at least one problem is that the author knows nothing about bacteria. Or am I mis-interpreting what is being said?