RefPlantNLR is a comprehensive collection of experimentally validated plant disease resistance proteins from the NLR family
Author(s)
Kourelis, Jiorgos
Sakai, Toshiyuki
Adachi, Hiroaki
Kamoun, Sophien
Type
Journal Article
Abstract
Reference datasets are critical in computational biology. They help define canonical biological features and are essential for benchmarking studies. Here, we describe a comprehensive reference dataset of experimentally validated plant nucleotide-binding leucine-rich repeat (NLR) immune receptors. RefPlantNLR consists of 481 NLRs from 31 genera belonging to 11 orders of flowering plants. This reference dataset has several applications. We used RefPlantNLR to determine the canonical features of functionally validated plant NLRs and to benchmark 5 NLR annotation tools. This revealed that although NLR annotation tools tend to retrieve the majority of NLRs, they frequently produce domain architectures that are inconsistent with the RefPlantNLR annotation. Guided by this analysis, we developed a new pipeline, NLRtracker, which extracts and annotates NLRs from protein or transcript files based on the core features found in the RefPlantNLR dataset. The RefPlantNLR dataset should also prove useful for guiding comparative analyses of NLRs across the wide spectrum of plant diversity and identifying understudied taxa. We hope that the RefPlantNLR resource will contribute to moving the field beyond a uniform view of NLR structure and function.
Date Issued
2021-10-20
Date Acceptance
2021-09-23
Citation
PLoS Biology, 2021, 19 (10)
ISSN
1544-9173
Publisher
Public Library of Science (PLoS)
Journal / Book Title
PLoS Biology
Volume
19
Issue
10
Copyright Statement
© 2021 Kourelis et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Identifier
https://www.ncbi.nlm.nih.gov/pubmed/34669691
PII: PBIOLOGY-D-21-00318
Subjects
Amino Acid Sequence
Databases, Protein
Disease Resistance
Molecular Sequence Annotation
NLR Proteins
Phylogeny
Plant Diseases
Plant Proteins
Protein Domains
Reproducibility of Results
Publication Status
Published
Coverage Spatial
United States
Article Number
ARTN e3001124