Evaluation of phylogenetic reconstruction methods using bacterial whole genomes: a simulation based study
File(s)
Evaluation of phylogenetic reconstruction methods using bacterial whole genomes: a simulation based study.pdf (2.23 MB)
Published version
Author(s)
Lees, John A
Kendall, Michelle
Parkhill, Julian
Colijn, Caroline
Bentley, Stephen D
Type
Journal Article
Abstract
Background: Phylogenetic reconstruction is a necessary first step in many analyses which use whole genome sequence data from bacterial populations. There are many available methods to infer phylogenies, and these have various advantages and disadvantages, but few unbiased comparisons of the range of approaches have been made.
Methods: We simulated data from a defined "true tree" using a realistic evolutionary model. We built phylogenies from this data using a range of methods, and compared reconstructed trees to the true tree using two measures, noting the computational time needed for different phylogenetic reconstructions. We also used real data from Streptococcus pneumoniae alignments to compare individual core gene trees to a core genome tree.
Results: We found that, as expected, maximum likelihood trees from good quality alignments were the most accurate, but also the most computationally intensive. Using less accurate phylogenetic reconstruction methods, we were able to obtain results of comparable accuracy; we found that approximate results can rapidly be obtained using genetic distance based methods. In real data we found that highly conserved core genes, such as those involved in translation, gave an inaccurate tree topology, whereas genes involved in recombination events gave inaccurate branch lengths. We also show a tree-of-trees, relating the results of different phylogenetic reconstructions to each other.
Conclusions: We recommend three approaches, depending on requirements for accuracy and computational time. Quicker approaches that do not perform full maximum likelihood optimisation may be useful for many analyses requiring a phylogeny, as generating a high quality input alignment is likely to be the major limiting factor of accurate tree topology. We have publicly released our simulated data and code to enable further comparisons.
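Illustration: the abstract describes comparing each reconstructed tree to the true tree using two measures, but does not name them here. As a rough sketch of what such a comparison can look like, the Python below computes the Robinson-Foulds (symmetric difference) distance, one widely used topology measure, chosen purely as an example. The nested-tuple tree encoding and the names `leaf_set`, `bipartitions` and `robinson_foulds` are assumptions made for this sketch and are not taken from the authors' released code.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a leaf is a string, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf labels under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial splits of the tree, treated as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    splits: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if ref not in below else all_leaves - below
        # Skip trivial splits (a single leaf versus everything else).
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return below

    visit(tree)
    return splits


def robinson_foulds(a: Tree, b: Tree) -> int:
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(a) != leaf_set(b):
        raise ValueError("trees must have identical leaf sets")
    return len(bipartitions(a) ^ bipartitions(b))


true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(true_tree, true_tree))
print(robinson_foulds(true_tree, inferred_tree))
```

A distance of zero means the two topologies are identical. This measure ignores branch lengths, so it would not capture the branch-length inaccuracies the abstract reports for genes involved in recombination events.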
Date Issued
2018-03-23
Date Acceptance
2018-03-20
Citation
Wellcome Open Research, 2018, 3, pp.33-33
URI
http://hdl.handle.net/10044/1/59675
DOI
https://www.dx.doi.org/10.12688/wellcomeopenres.14265.1
ISSN
2398-502X
Publisher
F1000Research
Start Page
33
End Page
33
Journal / Book Title
Wellcome Open Research
Volume
3
Copyright Statement
© 2018 Lees JA et al. This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Identifier
https://www.ncbi.nlm.nih.gov/pubmed/29774245
Subjects
bacteria
phylogenetic methods
phylogeny
simulation
tree distance
Publication Status
Published
Coverage Spatial
England