INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis
File(s)
INGOT-DR an interpretable classifier for predicting drug resistance in M. tuberculosis.pdf (2.09 MB)
Published version
Author(s)
Zabeti, Hooman
Dexter, Nick
Safari, Amir Hosein
Sedaghat, Nafiseh
Libbrecht, Maxwell
et al.
Type
Journal Article
Abstract
Motivation

Predicting drug resistance and identifying its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving this problem requires a transparent, accurate, and flexible predictive model. The methods currently used for this purpose rarely satisfy all of these criteria. On the one hand, approaches based on testing strains against a catalogue of previously identified mutations often yield poor predictive performance; on the other hand, machine learning techniques typically have higher predictive accuracy, but often lack interpretability and may learn patterns that produce accurate predictions for the wrong reasons. Current interpretable methods may either exhibit lower accuracy or lack the flexibility needed to generalize to previously unseen data.
Contribution

In this paper we propose a novel technique, inspired by group testing and Boolean compressed sensing, that simultaneously yields highly accurate predictions and interpretable results and is flexible enough to be optimized for various evaluation metrics.
Results

We test the predictive accuracy of our approach on five first-line and seven second-line antibiotics used for treating tuberculosis. We find that its accuracy is higher than or comparable to that of commonly used machine learning models, and that it is able to identify variants in genes with previously reported associations with drug resistance. Our method is intrinsically interpretable and can be customized for different evaluation metrics. Our implementation is available at github.com/hoomanzabeti/INGOT_DR and can be installed via the Python Package Index (PyPI) under ingotdr. This package is also compatible with most of the tools in the scikit-learn machine learning library.
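Illustrative sketch (not part of the published abstract): the group-testing idea described above can be pictured as learning a short OR rule over binary variant indicators, so that an isolate is predicted resistant if it carries any of the selected variants. The toy class below is a minimal sketch under stated assumptions. It is not the ingotdr API and not the authors' optimization procedure; it substitutes a simple greedy heuristic. The class name GreedyOrRule, the parameters fp_penalty and max_features, and the data are all hypothetical. It follows the fit/predict naming convention used by scikit-learn estimators but does not depend on that library.

```python
from typing import List, Sequence


class GreedyOrRule:
    """Toy disjunctive rule: predict 1 if any selected binary feature is 1.

    Hypothetical illustration only; a greedy stand-in, not the INGOT-DR method.
    """

    def __init__(self, fp_penalty: float = 1.0, max_features: int = 5):
        self.fp_penalty = fp_penalty      # cost of newly flagging a susceptible sample
        self.max_features = max_features  # upper bound on rule length (sparsity)
        self.selected_: List[int] = []

    def fit(self, X: Sequence[Sequence[int]], y: Sequence[int]) -> "GreedyOrRule":
        n_features = len(X[0])
        covered = [False] * len(X)  # samples already flagged by the rule
        self.selected_ = []
        while len(self.selected_) < self.max_features:
            best_gain, best_j = 0.0, None
            for j in range(n_features):
                if j in self.selected_:
                    continue
                # Gain = newly flagged resistant samples
                #        minus penalised newly flagged susceptible samples.
                gain = 0.0
                for i, row in enumerate(X):
                    if row[j] and not covered[i]:
                        gain += 1.0 if y[i] else -self.fp_penalty
                if gain > best_gain:
                    best_gain, best_j = gain, j
            if best_j is None:  # no remaining feature improves the rule
                break
            self.selected_.append(best_j)
            for i, row in enumerate(X):
                if row[best_j]:
                    covered[i] = True
        return self

    def predict(self, X: Sequence[Sequence[int]]) -> List[int]:
        return [int(any(row[j] for j in self.selected_)) for row in X]


# Made-up data: rows are isolates, columns are variant presence/absence flags.
X = [
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = resistant, 0 = susceptible

model = GreedyOrRule().fit(X, y)
print(model.selected_)   # indices of the variants in the learned rule
print(model.predict(X))
```

The learned rule is directly readable as a list of feature indices, which is the sense in which such a model is interpretable. Raising fp_penalty in this sketch favors specificity over sensitivity, a rough analogue of tuning a model toward a chosen evaluation metric.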
Date Issued
2021-08-10
Date Acceptance
2021-07-23
Citation
Algorithms for Molecular Biology, 2021, 16 (1)
URI
https://hdl.handle.net/10044/1/117348
URL
https://doi.org/10.1186/s13015-021-00198-1
DOI
https://www.dx.doi.org/10.1186/s13015-021-00198-1
ISSN
1748-7188
Publisher
BMC
Journal / Book Title
Algorithms for Molecular Biology
Volume
16
Issue
1
Copyright Statement
© The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
License URL
http://creativecommons.org/licenses/by/4.0/
Identifier
https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000683725800001&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=a2bf6146997ec60c407a63945d4e92bb
Subjects
Biochemical Research Methods
Biochemistry & Molecular Biology
Biotechnology & Applied Microbiology
Drug resistance
Group testing
Integer linear programming
Interpretable machine learning
Life Sciences & Biomedicine
Mathematical & Computational Biology
Random forests
Rule-based learning
Science & Technology
Whole-genome sequencing
Publication Status
Published
Coverage Spatial
England
Article Number
ARTN 17
Date Publish Online
2021-08-10