Supervised machine learning to support the diagnosis of bacterial infection in the context of COVID-19
File(s)
dlab002.pdf (348.32 KB)
Published version
Author(s)
Type
Journal Article
Abstract
Background: Bacterial infection has been challenging to diagnose in patients with COVID-19. We developed and evaluated supervised machine learning algorithms to support the diagnosis of secondary bacterial infection in hospitalized patients during COVID-19.
Methods: Inpatient data at three London hospitals for the first COVID-19 wave in March and April 2020 were extracted. Demographic, blood test, and microbiology data for individuals with and without SARS-CoV-2 positive PCR were obtained. A Gaussian-Naïve Bayes (GNB), Support Vector Machine (SVM), and Artificial Neuronal Network (ANN) were trained and compared using the area under the receiver operating characteristic curve (AUCROC). The best performing algorithm (SVM with 21 blood test variables) was prospectively piloted in July 2020. AUCROC was calculated for the prediction of a positive microbiological sample within 48 hours of admission.
Results: A total of 15,599 daily blood profiles for 1,186 individual patients were identified to train the algorithms. 771/1186 (65%) individuals were SARS-CoV-2 PCR positive. Clinically significant microbiology results were present for 166/1186 (14%) patients during admission. An SVM algorithm trained with 21 routine blood test variables and over 8,000 individual profiles had the best performance. AUCROC was 0.913, sensitivity 0.801, and specificity 0.890. Prospective testing on 54 patients on admission (28/54, 52% SARS-CoV-2 PCR positive) demonstrated an AUCROC of 0.960 (0.90-1.00).
Conclusion: An SVM using 21 routine blood test variables had excellent performance at inferring the likelihood of positive microbiology. Further prospective evaluation of the algorithm's ability to support decision making for the diagnosis of bacterial infection in COVID-19 cohorts is underway.
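The abstract reports performance as AUCROC, sensitivity, and specificity. As a minimal sketch of how these three metrics are conventionally computed from a classifier's scores, the Python below uses a small hypothetical toy dataset; the labels, scores, function names, and the 0.5 threshold are illustrative assumptions and are not taken from the study's data or code.

```python
def auc_roc(labels, scores):
    """AUCROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity (true positive rate) and specificity (true negative rate)
    when scores at or above the threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = positive microbiology, 0 = negative.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

auc = auc_roc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUCROC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

AUCROC is independent of any threshold, whereas sensitivity and specificity depend on the chosen cut-off, which is why the three figures are reported separately.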
Date Issued
2021-03
Date Acceptance
2021-01-04
Citation
JAC-Antimicrobial Resistance, 2021, 3 (1), pp.1-4
ISSN
2632-1823
Publisher
Oxford University Press (OUP)
Start Page
1
End Page
4
Journal / Book Title
JAC-Antimicrobial Resistance
Volume
3
Issue
1
Copyright Statement
© The Author(s) 2021. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
License URL
Sponsor
National Institute for Health Research
National Institute for Health Research
National Institute for Health Research
Identifier
https://academic.oup.com/jacamr/article/3/1/dlab002/6127116
Grant Number
NIHR200646
RDF04
NIHR200876
Publication Status
Published
Date Publish Online
2021-02-03