A rapid and scalable method for multilocus species delimitation using Bayesian model comparison and rooted triplets

File | Description | Size | Format
Fujisawa2016.pdf | Accepted version | 1.72 MB | Adobe PDF
Syst Biol-2016-Fujisawa-sysbio-syw028.pdf | Published version | 427.57 kB | Adobe PDF
Title: A rapid and scalable method for multilocus species delimitation using Bayesian model comparison and rooted triplets
Author(s): Barraclough, TG
Fujisawa, T
Aswad, A
Item Type: Journal Article
Abstract: Multilocus sequence data provide far greater power to resolve species limits than the single locus data typically used for broad surveys of clades. However, current statistical methods based on a multispecies coalescent framework are computationally demanding, because of the number of possible delimitations that must be compared and the time-consuming likelihood calculations. New methods are therefore needed to open up the power of multilocus approaches to larger systematic surveys. Here, we present a rapid and scalable method that introduces two innovations. First, the method reduces the complexity of likelihood calculations by decomposing the tree into rooted triplets. The distribution of topologies for a triplet across multiple loci follows a uniform trinomial distribution when the three individuals belong to the same species, but a skewed distribution, with a form specified by the multispecies coalescent, if they belong to separate species. We developed a Bayesian model comparison framework in which the best delimitation is found by comparing the product of posterior probabilities of all triplets. The second innovation is a new dynamic programming algorithm for finding the optimum delimitation from all those compatible with a guide tree by successively analyzing subtrees defined by each node. This algorithm removes the need for the heuristic searches used by current methods, guarantees that the best solution is found, and could potentially be used in other systematic applications. We assessed the performance of the method with simulated, published and newly generated data. Analyses of simulated data demonstrate that the combined method has favourable statistical properties and scalability with increasing sample sizes. Analyses of empirical data from both eukaryotes and prokaryotes demonstrate its potential for delimiting species in real cases.
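The contrast between the uniform and skewed triplet distributions described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the standard multispecies coalescent result for three lineages, in which each discordant topology has probability exp(-t)/3 for an internal branch of length t in coalescent units, and it compares the two models at a fixed, hypothetical t rather than within the paper's Bayesian framework. The function names, the topology counts, and the convention that the first count is the topology matching the species tree are all assumptions made for illustration.

```python
import math


def triplet_topology_probs(t):
    """Probabilities of the three rooted topologies of a triplet at one locus.

    t is the internal branch length in coalescent units (an assumed input).
    t = 0 corresponds to three individuals from the same species and gives
    the uniform distribution (1/3, 1/3, 1/3). The first entry is the
    topology that matches the species tree.
    """
    discordant = math.exp(-t) / 3.0
    return (1.0 - 2.0 * discordant, discordant, discordant)


def log_likelihood(counts, probs):
    """Trinomial log-likelihood of topology counts across loci.

    The multinomial coefficient is omitted because it is identical under
    both models and cancels in the comparison.
    """
    return sum(n * math.log(p) for n, p in zip(counts, probs))


# Hypothetical topology counts over 20 loci (illustrative, not real data).
counts = (14, 3, 3)

same_species = log_likelihood(counts, triplet_topology_probs(0.0))
separate_species = log_likelihood(counts, triplet_topology_probs(1.0))

# A positive value favours the separate-species model for these counts.
log_ratio = separate_species - same_species
print(f"log-likelihood ratio (separate vs same): {log_ratio:.2f}")
```

A full Bayesian comparison would integrate over the unknown branch length under a prior instead of fixing t, and would combine evidence across all triplets as the abstract describes.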
Publication Date: 7-Apr-2016
Date of Acceptance: 21-Mar-2016
URI: http://hdl.handle.net/10044/1/30456
DOI: https://dx.doi.org/10.1093/sysbio/syw028
ISSN: 1076-836X
Publisher: Oxford University Press (OUP)
Start Page: 759
End Page: 771
Journal / Book Title: Systematic Biology
Volume: 65
Issue: 5
Copyright Statement: © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Sponsor/Funder: Biotechnology and Biological Sciences Research Council (BBSRC)
Funder's Grant Number: BB/G004250/1
Keywords: Bacterial species
Bayesian model comparison
Dynamic programming
Multilocus species delimitation
Evolutionary Biology
0603 Evolutionary Biology
0604 Genetics
Publication Status: Published
Appears in Collections: Faculty of Natural Sciences


