Braindr overview
Big data in neuroimaging holds promise for answering important questions about the brain. However, standard lab protocols that rely on experts examining every sample do not scale to large datasets, and automated approaches lack the accuracy of highly trained scientists. Our proposed solution is to 1) start with a small, expertly labelled dataset, 2) amplify the labels through citizen science via web-based tools, and 3) train machine learning models on the amplified labels to emulate expert decision making. As a proof of concept, we developed a system to perform quality control on over 700 T1-weighted images from the Healthy Brain Network. An initial expertly labelled dataset of 200 images was amplified by citizen scientists to the entire dataset (724 images), with over 60,000 ratings collected through a simple web interface. A deep learning algorithm was then trained on the aggregate citizen scientist labels from a subset of the data to predict data quality. In an ROC analysis on left-out test data, the deep learning network performed as well as a state-of-the-art, specialized algorithm for T1-weighted images (MRIQC), each with an area under the curve of 0.99. We therefore assert that combining citizen science and deep learning can generalize and scale expert decision making in neuroimaging; this is particularly important in cases where specialized, automated tools do not already exist. Finally, as a specific practical application of the method, we explore how brain image quality relates to the replicability of a well-established relationship between brain volumes and age over development.
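The aggregation and evaluation steps described above can be illustrated with a minimal sketch. It assumes each citizen rating is a binary pass/fail vote and aggregates votes per image with a plain mean, which is an assumption made for illustration and not necessarily the aggregation used in this work. The function names, the toy image identifiers, and the toy labels are all hypothetical.

```python
from collections import defaultdict


def aggregate_ratings(ratings):
    """Average binary votes (1 = pass, 0 = fail) per image.

    `ratings` is an iterable of (image_id, vote) pairs. A plain mean is an
    illustrative assumption; a real system might weight raters differently.
    """
    totals = defaultdict(lambda: [0, 0])  # image_id -> [sum of votes, count]
    for image_id, vote in ratings:
        totals[image_id][0] += vote
        totals[image_id][1] += 1
    return {image_id: s / n for image_id, (s, n) in totals.items()}


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example scores
    higher than a randomly chosen negative one, with ties counted as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: citizen votes and expert pass/fail labels.
toy_ratings = [
    ("img_a", 1), ("img_a", 1), ("img_a", 0),
    ("img_b", 0), ("img_b", 0),
    ("img_c", 1), ("img_c", 0),
    ("img_d", 0), ("img_d", 1), ("img_d", 0),
]
toy_expert = {"img_a": 1, "img_b": 0, "img_c": 1, "img_d": 0}

aggregated = aggregate_ratings(toy_ratings)
image_ids = sorted(toy_expert)
auc = roc_auc([aggregated[i] for i in image_ids],
              [toy_expert[i] for i in image_ids])
```

In this sketch the aggregated score plays the role of the amplified label. In the pipeline described above, such labels would serve as training targets for the deep learning model, and the same AUC computation would be applied to the model's predictions on left-out test data.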