Catalysing Artificial Intelligence for Paediatric Tuberculosis Research (CAPTURE): protocol for establishing a global multicentre study establishing paediatric chest X-ray repository to evaluate computer-aided detection algorithms
Type
Journal Article
Abstract
Introduction
The substantial case detection gap in child tuberculosis (TB) is largely driven by inadequate diagnostic tools and approaches. Chest radiographs (CXR) remain a key component in the evaluation of children and young adolescents (0-15 years) with presumptive TB, aiding clinicians in making the diagnosis and in discriminating children with TB from those with other diseases. Widespread use and optimal interpretation of CXR are hampered by a lack of access to well-trained specialists to interpret images. Artificial intelligence CXR interpretation software, termed computer-aided detection (CAD), is now well developed for adults, yet few products have been evaluated in children. The CXR features of child TB differ from those of adults, and as a result the performance of these CAD algorithms, largely developed for use in adults, is expected to be sub-optimal when used in children. Adapting or fine-tuning adult CAD algorithms using CXR images from children with presumptive TB could allow optimisation of these products for use in children. We therefore set out to develop a large image and data repository collected from children evaluated for TB (called Catalysing Artificial Intelligence for Paediatric Tuberculosis Research, CAPTURE), with the purpose of evaluating current CAD products and then working with developers and other partners to optimise CAD algorithms for use in children.
Methods and analysis
We identified approximately 20 studies, from which up to 11,000 CXRs could potentially be used for the proposed project. CXRs and data were eligible for inclusion in the CAPTURE repository if they were collected in high-quality child TB diagnostic studies that enrolled children with presumptive TB and if the CXRs were obtained as part of the baseline assessment. All lead investigators of these studies are members of the CAPTURE consortium. The contributed images and metadata are centrally collated, and the key variable, TB case classification as Confirmed, Unconfirmed or Unlikely TB according to an established consensus case definition, is available. All CXRs included in the CAPTURE repository have a consensus radiological interpretation allocated by a panel of independent expert child TB CXR readers, who have classified them as ‘unreadable’, ‘normal’, ‘abnormal typical of TB’ or ‘abnormal not typical of TB’. To determine the diagnostic performance of existing CAD products, we will evaluate them against a primary composite clinical reference standard (Confirmed TB and Unconfirmed TB vs. Unlikely TB), as well as secondary microbiological and radiological reference standards. A subset of images will subsequently be allocated to a ‘training set’ and made available to developers, academic groups or other parties, either to develop novel paediatric CAD products or to fine-tune existing adult ones. These products will then be re-evaluated by the CAPTURE team using an image subset (‘validation set’) that is independent of the training set.
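The evaluation and allocation steps described above can be illustrated with a minimal Python sketch. It is not the CAPTURE analysis code: the record fields (`child_id`, `case_class`, `cad_score`), the score threshold, the validation fraction and the choice to split by child are all assumptions made for illustration. It shows how a CAD score might be compared against the primary composite reference standard (Confirmed or Unconfirmed TB vs. Unlikely TB), and one way to keep a validation set independent of a training set.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    child_id: str     # hypothetical de-identified participant identifier
    case_class: str   # "Confirmed", "Unconfirmed" or "Unlikely"
    cad_score: float  # hypothetical CAD abnormality score in [0, 1]


def composite_positive(case_class: str) -> bool:
    """Primary composite reference: Confirmed or Unconfirmed TB counts as positive."""
    return case_class in {"Confirmed", "Unconfirmed"}


def sensitivity_specificity(records, threshold):
    """Sensitivity and specificity of CAD (score >= threshold) vs. the composite reference."""
    tp = fn = tn = fp = 0
    for r in records:
        predicted = r.cad_score >= threshold
        actual = composite_positive(r.case_class)
        if actual and predicted:
            tp += 1
        elif actual:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    # Return NaN when a class is absent, rather than dividing by zero.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def split_by_child(records, validation_fraction, seed):
    """Allocate whole children to training or validation so the two sets never share a child."""
    ids = sorted({r.child_id for r in records})
    rng = random.Random(seed)  # fixed seed makes the allocation reproducible
    rng.shuffle(ids)
    n_val = round(len(ids) * validation_fraction)
    val_ids = set(ids[:n_val])
    train = [r for r in records if r.child_id not in val_ids]
    val = [r for r in records if r.child_id in val_ids]
    return train, val
```

Splitting on the participant identifier rather than on individual images is one simple way to prevent images from the same child appearing in both sets; the protocol itself states only that the validation set is independent of the training set.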
Ethics and dissemination
The CAPTURE study has been approved by the Stellenbosch University Health Research Ethics Committee (N22/09/113). Where required, consortium members contributing data have obtained additional ethics approval or waivers from the relevant local authorities. The final pooled, harmonised and cleaned dataset, as well as the de-identified, renamed CXR images, are stored on a secure cloud-based server. All analyses of existing CAD products, as well as of the paediatric-optimised products, will be published in peer-reviewed publications and shared with other stakeholders, such as the World Health Organization and donor and procurement organisations, to guide policy updates and procurement pathways and to ensure widespread uptake.
Date Acceptance
2025-11-04
Citation
BMJ Open
ISSN
2044-6055
Publisher
BMJ Publishing Group
Journal / Book Title
BMJ Open
Copyright Statement
This paper is embargoed until publication. Once published, the Version of Record (VoR) will be available on immediate open access.
Publication Status
Accepted