SU = (Biodiversity & Conservation) AND TS = ((‘artificial intelligence’ OR ‘Bayesian’ OR ‘boost’ OR ‘boosted regression tree*’ OR ‘data mining’ OR ‘decision tree*’ OR ‘deep learning’ OR ‘DNN’ OR ‘genetic algorithm*’ OR ‘genetic programming’ OR ‘machine learning’ OR ‘MaxEnt’ OR ‘maximum entropy’ OR ‘natural language’ OR ‘neural network*’ OR ‘NLP’ OR ‘perceptron’ OR ‘random forest*’ OR ‘RNN’ OR ‘support vector’ OR ‘SVM’ OR ‘symbolic regression’) AND [conservation keyword]) NOT WC = ‘Engineering, Biomedical’
The [conservation keyword] was one of the following terms: ‘alien species’, ‘climate change’, ‘co-extinct*’, ‘conservation’, ‘desertification’, ‘economic incentive*’, ‘ecosystem service*’, ‘energy production’, ‘environmental education’, ‘extinct’, ‘extinction*’, ‘fishing’, ‘fragmentation’, ‘geological event*’, ‘global warming’, ‘habitat change’, ‘habitat degradation’, ‘habitat loss’, ‘harvesting’, ‘hunting’, ‘invasive species’, ‘land management’, ‘land protection’, ‘monitoring’, ‘nutrient loading’, ‘overexploitation’, ‘pesticide*’, ‘poaching’, ‘pollution’, ‘species management’, ‘threat*’, ‘urban expansion’, ‘urbanization’.
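As an illustration of how the [conservation keyword] placeholder expands into one query per term, the following minimal Python sketch assembles the search string described above. The function name `build_query` is hypothetical, and the use of straight double quotes for phrase matching is an assumption about the search interface rather than something stated in the text.

```python
# Machine learning terms, copied from the TS clause of the search string.
ML_TERMS = [
    "artificial intelligence", "Bayesian", "boost", "boosted regression tree*",
    "data mining", "decision tree*", "deep learning", "DNN",
    "genetic algorithm*", "genetic programming", "machine learning", "MaxEnt",
    "maximum entropy", "natural language", "neural network*", "NLP",
    "perceptron", "random forest*", "RNN", "support vector", "SVM",
    "symbolic regression",
]


def build_query(conservation_keyword):
    """Return the full search string for one conservation keyword.

    Hypothetical helper: the quoting style is an assumption.
    """
    ml_clause = " OR ".join(f'"{term}"' for term in ML_TERMS)
    return (
        "SU = (Biodiversity & Conservation) "
        f'AND TS = (({ml_clause}) AND "{conservation_keyword}") '
        'NOT WC = "Engineering, Biomedical"'
    )


# One query per conservation keyword, e.g. for 'habitat loss':
print(build_query("habitat loss"))
```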
SU denotes ‘Research Area’, TS denotes ‘Topic’ and WC denotes ‘Web of Science Category’. The SU and WC restrictions were imposed to increase the specificity of the search by reducing the number of engineering and biomedical papers. After removing duplicates, this initial search yielded 5046 results. Because certain conservation keywords returned a very large number of hits (see Annex A), and to keep the review process manageable, search results were randomly sampled to a maximum of 100 papers per conservation keyword. This process yielded 1290 sources. These sources were organised under non-exclusive categories (Habitat loss & fragmentation, Biological resource use, etc.) derived from the IUCN classification system. For the presentation of results, to maintain the representativeness of each conservation keyword, all values were divided by the proportion of that keyword’s hits included in the sample.
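The capped sampling and the subsequent rescaling can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function names, the fixed random seed and the example numbers are all hypothetical, and the sketch assumes that "the proportion represented in the sample" means the share of a keyword's hits that were sampled.

```python
import random


def sample_per_keyword(hits_by_keyword, cap=100, seed=0):
    """Randomly sample at most `cap` papers for each conservation keyword.

    `hits_by_keyword` maps a keyword to a list of paper identifiers.
    The seed is an assumption, added only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    samples = {}
    for keyword, hits in hits_by_keyword.items():
        hits = list(hits)
        # Keywords with few hits are kept whole; larger ones are capped.
        samples[keyword] = rng.sample(hits, cap) if len(hits) > cap else hits
    return samples


def scale_count(count_in_sample, n_sampled, n_total):
    """Divide an observed count by the proportion of hits that were sampled."""
    proportion = n_sampled / n_total
    return count_in_sample / proportion


# Hypothetical example: a keyword with 400 hits, capped at 100 papers.
hits = {"hypothetical keyword": [f"paper_{i}" for i in range(400)]}
sampled = sample_per_keyword(hits)
print(len(sampled["hypothetical keyword"]))  # 100

# If 12 of the 100 sampled papers fall in some category, the scaled value is
# 12 / (100 / 400) = 48.
print(scale_count(12, 100, 400))  # 48.0
```

Keywords with 100 hits or fewer have a sampled proportion of 1, so their values are unchanged by the rescaling.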
Lastly, to provide a comprehensive overview of machine learning in species conservation, we compare our findings in the discussion with other papers, including reviews and the sources they cite. Although our systematic search retrieved reviews of the use of machine learning in different fields, these were excluded from the analysis to avoid counting the same results twice. Instead, they were used to frame and discuss our results.
Table 1. Threats to species hierarchically organised according to the CBD and IUCN threat classification schemes. Keywords used in this systematic review reflect these concepts.