SU = (Biodiversity & Conservation) AND TS = ((‘artificial
intelligence’ OR ‘Bayesian’ OR ‘boost’ OR ‘boosted regression tree*’ OR
‘data mining’ OR ‘decision tree*’ OR ‘deep learning’ OR ‘DNN’ OR
‘genetic algorithm*’ OR ‘genetic programming’ OR ‘machine learning’ OR
‘MaxEnt’ OR ‘maximum entropy’ OR ‘natural language’ OR ‘neural network*’
OR ‘NLP’ OR ‘perceptron’ OR ‘random forest*’ OR ‘RNN’ OR ‘support
vector’ OR ‘SVM’ OR ‘symbolic regression’) AND [conservation
keyword]) NOT WC = ‘Engineering, Biomedical’
The [conservation keyword] was one of the following terms: ‘alien
species’, ‘climate change’, ‘co-extinct*’, ‘conservation’,
‘desertification’, ‘economic incentive*’, ‘ecosystem service*’, ‘energy
production’, ‘environmental education’, ‘extinct’, ‘extinction*’,
‘fishing’, ‘fragmentation’, ‘geological event*’, ‘global warming’,
‘habitat change’, ‘habitat degradation’, ‘habitat loss’, ‘harvesting’,
‘hunting’, ‘invasive species’, ‘land management’, ‘land protection’,
‘monitoring’, ‘nutrient loading’, ‘overexploitation’, ‘pesticide*’,
‘poaching’, ‘pollution’, ‘species management’, ‘threat*’, ‘urban
expansion’, ‘urbanization’.
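The query template above can be expanded into one search string per conservation keyword. The following is a minimal sketch in Python, not the authors' script: the function names `quote` and `build_query` are hypothetical, the term lists are shortened subsets of those given above, and the use of straight double quotes around phrases is an assumption about the search interface.

```python
# Illustrative subset of the machine learning terms listed in the query above.
ML_TERMS = [
    "artificial intelligence",
    "deep learning",
    "machine learning",
    "random forest*",
]

# Illustrative subset of the conservation keywords listed above.
CONSERVATION_KEYWORDS = ["habitat loss", "invasive species", "poaching"]


def quote(term: str) -> str:
    """Wrap a term in straight double quotes (assumed phrase syntax)."""
    return f'"{term}"'


def build_query(keyword: str, ml_terms=ML_TERMS) -> str:
    """Fill the [conservation keyword] slot of the query template."""
    ml_block = " OR ".join(quote(t) for t in ml_terms)
    return (
        "SU = (Biodiversity & Conservation) "
        f"AND TS = (({ml_block}) AND {quote(keyword)}) "
        'NOT WC = "Engineering, Biomedical"'
    )


# One query string per conservation keyword.
queries = {kw: build_query(kw) for kw in CONSERVATION_KEYWORDS}
```

Running one query per keyword keeps the hit count for each keyword separate, which is what the per-keyword sampling described next relies on.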
SU means ‘Research Area’, TS means ‘Topic’ and WC means ‘Web of
Science Category’. Both the SU and WC restrictions were imposed to increase
the specificity of the search by reducing the number of engineering and
biomedical papers. This initial search yielded 5046 results after removing
duplicates. Due to the high number of hits associated with certain
conservation keywords (see Annex A), and in order to simplify the review
process, search results were randomly sampled to a maximum of 100 papers
per conservation keyword. This process yielded 1290 sources. These
sources were organised under non-exclusive categories (Habitat loss &
fragmentation, Biological resource use, etc.) derived from the IUCN
classification system. For the presentation of results, to maintain the
representativeness of each conservation keyword, all values were divided
by the proportion of search results for that keyword included in the sample.
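The capped sampling and the rescaling of results described above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that "the proportion represented in the sample" is the fraction of a keyword's hits that were sampled. The function names, the fixed random seed, the record identifiers and the hit counts are all hypothetical.

```python
import random

MAX_PER_KEYWORD = 100  # cap per conservation keyword, as stated in the text


def sample_hits(hits_by_keyword, cap=MAX_PER_KEYWORD, seed=0):
    """Randomly sample at most `cap` records per keyword, without replacement."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    return {
        kw: list(hits) if len(hits) <= cap else rng.sample(list(hits), cap)
        for kw, hits in hits_by_keyword.items()
    }


def rescale(sample_count, n_sampled, n_hits):
    """Divide a count observed in the sample by the sampled proportion."""
    proportion = n_sampled / n_hits
    return sample_count / proportion


# Hypothetical hit lists; record IDs and totals are made up for illustration.
hits = {
    "poaching": [f"P{i}" for i in range(40)],
    "monitoring": [f"M{i}" for i in range(400)],
}
sample = sample_hits(hits)

# If 12 of the 100 sampled 'monitoring' papers fall in some category, the
# estimate for all 400 hits is 12 / (100 / 400) = 48.
estimate = rescale(12, len(sample["monitoring"]), len(hits["monitoring"]))
```

Keywords with 100 hits or fewer are kept in full, so their sampled proportion is 1 and their values are unchanged by the rescaling.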
Lastly, to ensure a comprehensive overview of machine learning in
species conservation, we compare our findings in the discussion with other
papers, including reviews and the sources cited in them. Although
reviews of the use of machine learning in different fields were found in
our systematic review, these were not included in the analysis to avoid
duplicate results. Instead, they were used for framing and discussing
our results.
Table 1. Threats to species hierarchically organised according to the
CBD and IUCN threat classification schemes. Keywords used in this
systematic review reflect these concepts.