A note on utilising binary features as ligand descriptors
Author(s)
Mussa, HY
Mitchell, JBO
Glen, RC
Type
Journal Article
Abstract
It is common in cheminformatics to represent the properties of a ligand as a string of 1’s and 0’s, with the intention of
elucidating, inter alia, the relationship between the chemical structure of a ligand and its bioactivity. In this commentary
we note that, where relevant but non-redundant features are binary, they inevitably lead to a classifier capable
of capturing only a linear relationship between structural features and activity. If, instead, we were to use relevant
but non-redundant real-valued features, the resulting predictive model would be capable of describing a non-linear
structure-activity relationship. Hence, we suggest that real-valued features, where available, are to be preferred in this
scenario.
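The linearity claim in the abstract can be illustrated with a minimal sketch. It assumes the binary features are conditionally independent Bernoulli variables given the class, which is an assumption of this sketch rather than a statement of the authors' derivation. Under that assumption the class log-odds reduce to a bias plus a weighted sum of the feature values. The probabilities in `P1` and `P0`, the prior, and all function names are hypothetical values chosen for illustration only.

```python
import itertools
import math


def log_odds_direct(x, p1, p0, prior1=0.5):
    """Log-odds of class 1 vs class 0, computed from the Bernoulli likelihoods."""
    ll1 = sum(math.log(p if xi else 1.0 - p) for xi, p in zip(x, p1))
    ll0 = sum(math.log(p if xi else 1.0 - p) for xi, p in zip(x, p0))
    return ll1 - ll0 + math.log(prior1 / (1.0 - prior1))


def linear_form(p1, p0, prior1=0.5):
    """Weights and bias of the equivalent linear decision function."""
    weights = [math.log(a * (1.0 - b) / (b * (1.0 - a))) for a, b in zip(p1, p0)]
    bias = sum(math.log((1.0 - a) / (1.0 - b)) for a, b in zip(p1, p0))
    bias += math.log(prior1 / (1.0 - prior1))
    return weights, bias


def log_odds_linear(x, weights, bias):
    """Bias plus weighted sum of the binary features."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


# Hypothetical per-feature probabilities P(x_j = 1 | class) for illustration.
P1 = [0.9, 0.2, 0.6]  # class 1 (e.g. active)
P0 = [0.3, 0.5, 0.4]  # class 0 (e.g. inactive)
PRIOR1 = 0.3

W, B = linear_form(P1, P0, PRIOR1)

# Compare both forms on every possible binary feature vector of length 3.
MAX_GAP = max(
    abs(log_odds_direct(x, P1, P0, PRIOR1) - log_odds_linear(x, W, B))
    for x in itertools.product((0, 1), repeat=len(P1))
)
print(f"largest difference between the two forms: {MAX_GAP:.2e}")
```

The two forms agree on every binary vector up to floating-point error, so in this setting the decision boundary is a hyperplane in feature space. No comparable reduction holds for general real-valued class-conditional densities, which is consistent with the abstract's preference for real-valued features.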
Date Issued
2015-12-01
Date Acceptance
2015-11-11
Citation
Journal of Cheminformatics, 2015, 7
ISSN
1758-2946
Publisher
Chemistry Central
Journal / Book Title
Journal of Cheminformatics
Volume
7
Copyright Statement
© 2015 Mussa et al. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium,
provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license,
and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Subjects
Science & Technology
Physical Sciences
Technology
Chemistry, Multidisciplinary
Computer Science, Information Systems
Computer Science, Interdisciplinary Applications
Chemistry
Computer Science
Binary descriptors
Ligand chemical structure
Linear relationship
Bernoulli distribution
Mutual information
Feature selection
Publication Status
Published
Article Number
58