Rapid Prototyping of Species Classifiers using Deep Learning: A Guide for Non-Experts
Deep learning algorithms are revolutionizing how hypothesis generation, pattern recognition, and prediction occur in the sciences. In the life sciences, particularly biology and its subfields, the use of deep learning is slowly but steadily increasing. However, prototyping and developing tools for practical applications remains the domain of experienced coders. Furthermore, many tools can be costly and difficult to assemble without expertise in artificial intelligence (AI) computing. We built a biological species classifier that leverages existing open-source tools and libraries. We designed the corresponding tutorial for users with basic Python skills and a small but well-curated image dataset. We include annotated code in the form of a Jupyter Notebook that can be adapted to any image dataset, from satellite images to photographs of animals or bacteria, or even data such as song or echolocation recordings transformed into images. The prototype developer is publicly available and can be adapted for citizen science as well as other applications not envisioned in this paper.
We illustrate our approach with a case study of 219 images of three seastar species. We show that, with minimal parameter tuning of the AI pipeline, we can create a classifier with 87% accuracy. We include additional approaches for understanding the misclassified images and for curating the dataset to increase accuracy. The power of AI approaches is becoming increasingly accessible. We can now readily build and prototype species classifiers that can have a great impact on research that requires species identification and other types of image analysis. Such tools have implications for citizen science, biodiversity monitoring, and a wide range of ecological applications.
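
As a minimal sketch of what "accuracy" and "misclassified images" mean in practice, the following plain-Python example computes both from a list of true and predicted labels. It is not the notebook's code: the function name `evaluate`, the file names, the species labels, and the toy predictions are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def evaluate(true_labels, predicted_labels, image_names):
    """Return accuracy, confusion counts, and the list of misclassified images."""
    if not (len(true_labels) == len(predicted_labels) == len(image_names)):
        raise ValueError("all inputs must have the same length")
    if not true_labels:
        raise ValueError("need at least one labelled image")

    # Count how often each (true, predicted) pair occurs.
    confusion = Counter(zip(true_labels, predicted_labels))

    # Collect every image whose predicted label differs from its true label.
    misclassified = [
        (name, true, pred)
        for name, true, pred in zip(image_names, true_labels, predicted_labels)
        if true != pred
    ]

    correct = len(true_labels) - len(misclassified)
    accuracy = correct / len(true_labels)
    return accuracy, confusion, misclassified


# Hypothetical toy data: six images, three placeholder species labels.
names = [f"img_{i:02d}.jpg" for i in range(1, 7)]
true = ["species_a", "species_a", "species_b", "species_b", "species_c", "species_c"]
pred = ["species_a", "species_b", "species_b", "species_b", "species_c", "species_a"]

accuracy, confusion, misclassified = evaluate(true, pred, names)
print(f"accuracy: {accuracy:.2f}")
for name, t, p in misclassified:
    print(f"{name}: labelled {t}, predicted {p}")
```

Listing the misclassified files by name, rather than reporting accuracy alone, is what makes it possible to inspect those images and decide whether the dataset needs further curation.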