Can CNN-based species classification generalise across variation in habitat within a camera trap survey?
Author(s)
Type
Journal Article
Abstract
Camera trap surveys are a popular ecological monitoring tool that produce vast numbers of images, making their annotation extremely time-consuming. Advances in machine learning, in the form of convolutional neural networks, have demonstrated potential for automated image classification, reducing processing time. However, these networks often generalise poorly, which could affect assessments of species in habitats undergoing change.
Here, we (i) compare the performance of three network architectures in identifying species in camera trap images taken from tropical forest of varying disturbance intensities; (ii) explore the impacts of training dataset configuration; (iii) use habitat disturbance categories to investigate network generalisability and (iv) test whether classification performance and generalisability improve when using images cropped to bounding boxes.
Overall accuracy (72.8%) was improved by excluding the rarest species and by adding extra training images (76.3% and 82.8%, respectively). Generalisability to new camera locations within a disturbance level was poor (mean F1-score: 0.32). Performance across unseen habitat disturbance levels was worse (mean F1-score: 0.27). Training the network on multiple disturbance levels improved generalisability (mean F1-score on unseen disturbance levels: 0.41). Cropping images to bounding boxes improved overall performance (F1-score: 0.77 vs. 0.47) and generalisability (mean F1-score on unseen disturbance levels: 0.73), but at the cost of losing images that contained animals the detector failed to detect.
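The mean F1-scores reported above summarise per-species performance in a single number. The abstract does not state how the mean was taken, so the following is only a minimal sketch that assumes an unweighted (macro) average over species classes; the function names and the example labels are hypothetical and are not taken from the article.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def mean_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels for four images
truth = ["deer", "deer", "pig", "pig"]
preds = ["deer", "pig", "pig", "pig"]
print(per_class_f1(truth, preds))
print(mean_f1(truth, preds))
```

Under this assumed scheme every species contributes equally to the mean, so poor performance on rare species lowers the score as much as poor performance on common ones.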
These results suggest researchers should consider using an object detector before passing images to a classifier, and that classification might improve if labelled images from other studies are added to their training data. Composition of training data was shown to be influential, but including rarer classes did not compromise performance on common classes, providing support for the inclusion of rare species to inform conservation efforts. These findings have important implications for the use of these methods in long-term monitoring of habitats undergoing change, as they highlight how misclassifications caused by poor generalisability could affect subsequent ecological analyses. These methods should therefore be treated as dynamic, in that changes to the study site need to be reflected in updated training of the network.
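The detector-then-classifier workflow recommended above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the image is a plain nested list of pixel rows, each detection is a hypothetical dictionary with a `conf` score and a `box` given as fractional `(x_min, y_min, width, height)`, and the `min_conf` threshold of 0.5 is arbitrary.

```python
def crop_to_box(image, box):
    """Crop a row-major image (list of pixel rows) to a fractional bounding box."""
    height, width = len(image), len(image[0])
    x, y, w, h = box  # assumed format: fractions of image width/height
    left = max(0, int(x * width))
    top = max(0, int(y * height))
    right = min(width, int(round((x + w) * width)))
    bottom = min(height, int(round((y + h) * height)))
    return [row[left:right] for row in image[top:bottom]]


def crops_for_classifier(image, detections, min_conf=0.5):
    """Return one crop per sufficiently confident detection.

    An image with no confident detection yields an empty list, so it never
    reaches the classifier.
    """
    return [
        crop_to_box(image, det["box"])
        for det in detections
        if det["conf"] >= min_conf
    ]


# Hypothetical 4x4 "image" whose pixel values encode their position
image = [[row * 4 + col for col in range(4)] for row in range(4)]
detections = [
    {"conf": 0.9, "box": (0.25, 0.25, 0.5, 0.5)},
    {"conf": 0.2, "box": (0.0, 0.0, 0.5, 0.5)},  # below threshold, dropped
]
print(crops_for_classifier(image, detections))
```

The empty-list case mirrors the cost noted in the abstract: images containing animals that the detector misses are lost before classification.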
Date Issued
2023-01
Date Acceptance
2021-12-03
Citation
Methods in Ecology and Evolution, 2023, 14 (1), pp.242-251
ISSN
2041-210X
Publisher
Wiley
Start Page
242
End Page
251
Journal / Book Title
Methods in Ecology and Evolution
Volume
14
Issue
1
Copyright Statement
© 2022 The Authors. Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Sponsor
Rainforest Research Sdn Bhd
Identifier
https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000888418400001&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=a2bf6146997ec60c407a63945d4e92bb
Grant Number
LBEE_P34395
Subjects
camera trap
convolutional neural network
deep learning
disturbance
Ecology
Environmental Sciences & Ecology
generalisability
image classification
LAND-USE
Life Sciences & Biomedicine
object detection
Science & Technology
Publication Status
Published
OA Location
https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.14031
Date Publish Online
2022-11-18