When are pathogen genome sequences informative of transmission events?
File(s)
journal.ppat.1006885.pdf (6.22 MB)
Published version
Author(s)
Campbell, Finlay
Strang, Camilla
Ferguson, Neil
Cori, Anne
Jombart, Thibaut
Type
Journal Article
Abstract
Recent years have seen the development of numerous methodologies for reconstructing transmission trees in infectious disease outbreaks from densely sampled whole genome sequence data. However, a fundamental and as yet poorly addressed limitation of such approaches is the requirement for genetic diversity to arise on epidemiological timescales. Specifically, the position of infected individuals in a transmission tree can only be resolved by genetic data if mutations have accumulated between the sampled pathogen genomes. To quantify and compare the useful genetic diversity expected from genetic data in different pathogen outbreaks, we introduce here the concept of ‘transmission divergence’, defined as the number of mutations separating whole genome sequences sampled from transmission pairs. Using parameter values obtained by literature review, we simulate outbreak scenarios alongside sequence evolution, using two models described in the literature, to characterise the transmission divergence of ten major outbreak-causing pathogens. We find that while mean values vary significantly between the pathogens considered, their transmission divergence is generally very low, with many outbreaks characterised by large numbers of genetically identical transmission pairs. We describe the impact of transmission divergence on our ability to reconstruct outbreaks using two outbreak reconstruction tools, the R packages outbreaker and phybreak, and demonstrate that, in agreement with previous observations, genetic sequence data of rapidly evolving pathogens such as RNA viruses can provide valuable information on individual transmission events. Conversely, sequence data of pathogens with lower mean transmission divergence, including Streptococcus pneumoniae, Shigella sonnei and Clostridium difficile, provide little to no information about individual transmission events. Our results highlight the informational limitations of genetic sequence data in certain outbreak scenarios, and demonstrate the need to expand the toolkit of outbreak reconstruction tools to integrate other types of epidemiological data.
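The definition of transmission divergence given in the abstract can be illustrated with a minimal sketch. It assumes aligned sequences of equal length and a known list of infector-infectee pairs, and it counts differing sites (Hamming distance) as a stand-in for the number of separating mutations. The function names, case labels and toy sequences below are hypothetical and are not taken from the paper or from the outbreaker or phybreak packages.

```python
from typing import Dict, List, Tuple


def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def transmission_divergence(
    pairs: List[Tuple[str, str]], sequences: Dict[str, str]
) -> List[int]:
    """Return the number of differing sites for each (infector, infectee) pair."""
    return [
        hamming_distance(sequences[infector], sequences[infectee])
        for infector, infectee in pairs
    ]


def proportion_identical(divergences: List[int]) -> float:
    """Fraction of transmission pairs whose sequences are genetically identical."""
    if not divergences:
        raise ValueError("at least one transmission pair is required")
    return sum(1 for d in divergences if d == 0) / len(divergences)


# Toy data (hypothetical): four cases and three transmission pairs.
sequences = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAC",
    "C": "ACGTACGTAT",
    "D": "ACGAACGTAT",
}
pairs = [("A", "B"), ("A", "C"), ("C", "D")]

divergences = transmission_divergence(pairs, sequences)
mean_divergence = sum(divergences) / len(divergences)
```

In this toy outbreak one of the three pairs is genetically identical, so the sequences alone cannot distinguish the direction or existence of that transmission event. This is the situation the abstract describes for pathogens with low mean transmission divergence.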
Date Issued
2018-02-08
Date Acceptance
2018-01-18
Citation
PLoS Pathogens, 2018, 14 (2)
ISSN
1553-7366
Publisher
Public Library of Science (PLoS)
Journal / Book Title
PLoS Pathogens
Volume
14
Issue
2
Copyright Statement
© 2018 Campbell et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited (https://creativecommons.org/licenses/by/4.0/).
License URL
https://creativecommons.org/licenses/by/4.0/
Sponsor
Medical Research Council (MRC)
National Institute for Health Research
National Institutes of Health
Identifier
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000426477000032&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=1ba7043ffcc86c417c072aa74d649202
Grant Number
MR/K010174/1B
HPRU-2012-10080
1U01GM110721-03
Subjects
Science & Technology
Life Sciences & Biomedicine
Microbiology
Parasitology
Virology
RESISTANT STAPHYLOCOCCUS-AUREUS
RESPIRATORY SYNDROME CORONAVIRUS
CLOSTRIDIUM-DIFFICILE INFECTION
KLEBSIELLA-PNEUMONIAE
MYCOBACTERIUM-TUBERCULOSIS
STREPTOCOCCUS-PNEUMONIAE
EPIDEMIOLOGIC DATA
SARS CORONAVIRUS
NOSOCOMIAL OUTBREAK
MATHEMATICAL-THEORY
Publication Status
Published
Article Number
ARTN e1006885
Date Publish Online
2018-02-08