MindGames: A Crowd-Sourcing Game Platform for Brain MRI Segmentation
Advances in MRI technology and image segmentation algorithms have enabled researchers to begin to understand the mechanisms of healthy brain development (Giedd 1999) and of neurological disorders such as multiple sclerosis (Bakshi 2008). Because brain morphology varies widely, and pathological processes add further variability in neurological disorders, increasingly large sample sizes are necessary to confidently answer the progressively complex biomedical questions the research community is interested in. Automated algorithms have been developed to reduce information-rich 3D MRI volumes to one-dimensional summary measures that describe tissue properties and are easy to interpret, such as total gray matter volume. Automated segmentation algorithms save considerable time compared to manual human inspection, but they lack the advanced visual system of humans. As a result, these algorithms often make systematic errors, especially when analyzing brains with pathology or those in the early stages of development. Data science is poised to facilitate complex neuroscience research by fusing a crowdsourcing strategy with machine learning methods: automatic quantification can perform the bulk of the work efficiently, and residual errors can be resolved by non-expert "citizen-scientists" with the advantage of the human visual system.
Crowdsourcing has been successful in many other disciplines (Wiggins 2011), including mathematics (Cranshaw 2011), astronomy (Lintott 2008), and biochemistry (Eiben 2012). Recently, over 200,000 "citizen-neuroscientists" from over 147 countries helped identify neuronal connections in a mouse retina through the Eyewire game (Kim 2014). This crowdsourced game led to a new understanding of how mammalian retinal cells detect motion. I propose to implement three key features of the Eyewire paradigm and adapt them for the segmentation of MRI data. First, by breaking up the problem into smaller "micro-tasks", Eyewire scientists were able to access a much larger user pool of non-experts. In a similar vein, 3D MRI data can be divided into 2D slices to be segmented by users. Second, machine learning algorithms were trained to help with the task, which improved the speed of manual neuronal tracing and validated non-expert input in the Eyewire game. Deep learning methods have already been shown to be successful at segmenting MRI data, and similar models could be built to support manual segmentation. Lastly, Eyewire transformed a dull, monotonous task for experts into a fun, competitive game that trained non-experts and acquired valuable scientific data. The University of Washington is an ideal place to develop a similar game platform for MRI segmentation, using the resources at the Center for Game Science, led by Zoran Popovic. I propose to create an open-source platform for efficiently crowdsourcing brain tissue classification problems in order to answer neuroscience research questions with more precision.
Scalable and Secure Micro-Tasks: A scalable database system and server backend that keeps data private by dividing it into small "micro-tasks"
Learning by Example: A machine learning algorithm that learns from human curation to improve the efficiency of manual tasks
Training through Gamification: A user interface that trains users to solve a specific problem and keeps them engaged through a reward system
Aim 1 will address two key challenges: 1) partitioning 3D data into micro-tasks that keep data private, and 2) serving micro-tasks at scale. While there are many large-scale open-source data collection efforts, many datasets are kept private within research institutions due to IRB restrictions, so presenting a full 3D MRI volume to the public would violate those restrictions. Serving smaller "chunks" of data serves two purposes: it keeps the data private (no single task reveals the whole brain), and it reduces the fatigue of non-experts (each user only edits a small section), which enables us to engage a larger user base. A scalable server will be implemented on a commercial cloud computing platform, with an API that allows researchers to upload MRI micro-tasks to the server database and serves micro-tasks to users. Researchers will be asked to provide the following to the API: 1) an initial segmentation file from an automated algorithm, 2) any original images (T1, T2, PD) that users need to properly edit the segmentation, and 3) directions on how the images should be sliced into micro-tasks (including the slicing plane and the number of slices). Additionally, researchers must provide a validation dataset of "correctly" segmented images, which will be used to train non-experts in Aim 3. The resources and faculty at the eScience Institute will help me implement state-of-the-art database and cloud computing technologies in order to increase the delivery of micro-tasks to "citizen-scientists."
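As a sketch of the partitioning step, the following numpy-only helper slices a 3D volume along a researcher-specified plane into 2D micro-tasks. The function name, plane labels, and slice-selection scheme are illustrative assumptions; the real API would also bundle the initial automated segmentation and any T1/T2/PD images with each slice:

```python
import numpy as np

def make_micro_tasks(volume, plane="axial", n_slices=None):
    """Split a 3D volume into 2D slice micro-tasks (hypothetical sketch).

    No single micro-task reveals the whole brain, which is what allows
    privacy-restricted data to be served to the public.
    """
    axis = {"sagittal": 0, "coronal": 1, "axial": 2}[plane]
    n_slices = n_slices or volume.shape[axis]
    # Pick evenly spaced slice indices along the requested plane.
    idx = np.linspace(0, volume.shape[axis] - 1, n_slices).astype(int)
    return [np.take(volume, i, axis=axis) for i in idx]

volume = np.zeros((182, 218, 182))              # e.g. a 1 mm MNI-space volume
tasks = make_micro_tasks(volume, "axial", n_slices=60)
print(len(tasks), tasks[0].shape)               # 60 (182, 218)
```

Each 2D task can then be stored with its slice index and plane so that user edits can later be reassembled into a 3D volume.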
Aim 2 will address three challenges: 1) resolving user input to create a final 3D volume, 2) prioritizing which micro-tasks to serve based on user consensus, and 3) predicting the user-edited segmentation image. To reconstruct the micro-tasks back into a 3D image, a weighted consensus map will be computed, based on how accurately each user performed edits on training data. Micro-tasks with lower consensus scores will be served more frequently to users, until the consensus is high. Participants will also be scored on how well their segmentations match those of other users on the same image, and this score will be used to reward users in Aim 3. Finally, improving automated segmentation algorithms based on human input will save time and reduce the number of editors assigned to each micro-task. For example, a dataset of 100 3D volumes could be broken into 20,000 patches, each of which would need to be manually edited. Convolutional neural networks (CNNs) have been very successful at pattern recognition when trained on similarly large sample sizes, and could reduce the time spent editing each patch. I propose to build a CNN using an existing framework, such as TensorFlow or Theano, to predict segmentation results, under the guidance of the machine learning experts at the eScience Institute.
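The weighted consensus map described above could look like the following sketch. The 0.5 majority threshold and the accuracy-proportional weights are illustrative design choices, not values fixed by the proposal:

```python
import numpy as np

def weighted_consensus(masks, accuracies):
    """Fuse binary user masks for one micro-task into a consensus label map.

    Each user's vote is weighted by their accuracy on training data
    (assumed to lie in [0, 1]). Illustrative sketch only.
    """
    w = np.asarray(accuracies, dtype=float)
    stack = np.stack([np.asarray(m, dtype=float) for m in masks])  # (n_users, H, W)
    score = np.tensordot(w, stack, axes=1) / w.sum()  # weighted vote in [0, 1]
    label = score > 0.5                               # weighted majority
    # Per-voxel confidence: 0 at an even split, 1 at unanimity. Micro-tasks
    # with low mean confidence would be re-served to additional users.
    confidence = np.abs(score - 0.5) * 2.0
    return label, confidence

# Three users edit the same 2x2 slice, with training accuracies 0.9, 0.5, 0.7.
masks = [[[1, 1], [0, 0]], [[1, 0], [0, 0]], [[1, 1], [1, 0]]]
label, confidence = weighted_consensus(masks, [0.9, 0.5, 0.7])
```

In this toy example the top-left voxel is unanimous (confidence 1.0), while the bottom-left voxel, edited only by one user, has low confidence and would be prioritized for re-serving.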
For individuals with minimal neuroanatomy knowledge, the difficulty of manual neuroimaging segmentation will depend on the contrast of the image as well as the location and complexity of the target structure. An easy task might be separating brain tissue from non-brain tissue, whereas a more difficult task would be segmenting multiple sclerosis lesions. Aim 3 will address simple as well as challenging problems through varying levels of training and rewards. A web application will be developed that connects to the server developed in Aim 1. The app will include an in-browser brain editor (similar to the Mindcontrol application (Keshavan 2016)), a reward structure and scoreboard for the top users, and an optional link to Amazon Mechanical Turk, where users can be paid (in micro-payments) for completing micro-tasks. Initially, the user will be presented only with training tasks until they reach an adequate accuracy score. Next, training tasks will be interspersed with new tasks in order to detect performance drift. The frequency of training tasks will increase based on the researcher's specification of task difficulty. The reward structure will be based on 1) how well the user edits training data, 2) how well the user's segmentations match those of other users, and 3) how many voxels the user edits. The time spent on a task, along with the number of edited voxels, will also be used to validate whether the user completed the task thoughtfully. For example, a user's score would be penalized if a large number of voxels were edited too quickly on a difficult task. I plan to collaborate with the data scientists at the eScience Institute to build an intuitive and engaging crowdsourcing user interface on the Amazon Mechanical Turk platform.
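The reward structure above can be sketched as a simple scoring heuristic. All weights and thresholds here are hypothetical placeholders that a researcher would tune, not values from the proposal:

```python
def task_score(train_acc, agreement, n_voxels, seconds, difficulty):
    """Illustrative reward heuristic: combine training accuracy,
    inter-user agreement, and edit effort, then penalize implausibly
    fast editing on difficult tasks (weights are placeholders)."""
    base = 50.0 * train_acc + 30.0 * agreement + min(n_voxels / 100.0, 20.0)
    voxels_per_second = n_voxels / max(seconds, 1.0)
    # A very high edit rate on a hard task suggests careless clicking,
    # so the score is halved (difficulty on a hypothetical 1-5 scale).
    if difficulty >= 3 and voxels_per_second > 50.0:
        base *= 0.5
    return base

careful = task_score(0.9, 0.8, 1000, seconds=120, difficulty=3)   # 79.0
careless = task_score(0.9, 0.8, 1000, seconds=5, difficulty=3)    # penalized: 39.5
```

The same inputs (training accuracy, agreement, voxel count, timing) are exactly the quantities the server from Aim 1 and the consensus scoring from Aim 2 already record, so the reward computation adds no new data collection.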
To summarize, I propose to develop an open-source platform for crowd-sourced image segmentation of brain MRI data, under the guidance of Ariel Rokem and Jason Yeatman at the eScience Institute and the University of Washington Institute for Neuroengineering. Through gamification, piece-wise data exposure, and machine learning, I plan to engage a large user base across a variety of image segmentation tasks. Example applications include parcellating gray and white matter in low-contrast images where traditional segmentation algorithms fail, and delineating multiple sclerosis lesions, which usually requires trained neuroradiologists. As a particular application, the Yeatman Lab at UW is collecting a large, longitudinal MRI dataset on children undergoing an intensive learning program, with the goal of determining how experience shapes brain development. Segmentation data from the MindGames platform can be used to 1) define the typical time course of cortical changes by examining gray and white matter volumes, 2) construct normative developmental curves in order to detect abnormalities, and 3) study how learning shapes brain development by analyzing quantitative MR intensities within the gray and white matter. The MindGames platform will not only enable researchers without advanced computer science expertise to improve the precision of their segmentation measures, but will also engage, educate, and excite the public and help advance cutting-edge neuroscience research.
Jay N Giedd, Jonathan Blumenthal, Neal O Jeffries, F Xavier Castellanos, Hong Liu, Alex Zijdenbos, Tomáš Paus, Alan C Evans, Judith L Rapoport. Brain development during childhood and adolescence: a longitudinal MRI study. Nature neuroscience 2, 861–863 Nature Publishing Group, 1999.
Rohit Bakshi, Alan J Thompson, Maria A Rocca, Daniel Pelletier, Vincent Dousset, Frederik Barkhof, Matilde Inglese, Charles RG Guttmann, Mark A Horsfield, Massimo Filippi. MRI in multiple sclerosis: current status and future prospects. The Lancet Neurology 7, 615–625 Elsevier, 2008.
Andrea Wiggins, Kevin Crowston. From conservation to crowdsourcing: A typology of citizen science. 1–10 In System Sciences (HICSS), 2011 44th Hawaii international conference on. (2011).
Justin Cranshaw, Aniket Kittur. The polymath project: lessons from a successful online collaboration in mathematics. 1865–1874 In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. (2011).
Chris J Lintott, Kevin Schawinski, Anže Slosar, Kate Land, Steven Bamford, Daniel Thomas, M Jordan Raddick, Robert C Nichol, Alex Szalay, Dan Andreescu, et al. Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 389, 1179–1189 Oxford University Press, 2008.
Christopher B Eiben, Justin B Siegel, Jacob B Bale, Seth Cooper, Firas Khatib, Betty W Shen, Barry L Stoddard, Zoran Popovic, David Baker. Increased Diels-Alderase activity through backbone remodeling guided by Foldit players. Nature biotechnology 30, 190–192 Nature Publishing Group, 2012.
Jinseop S Kim, Matthew J Greene, Aleksandar Zlateski, Kisuk Lee, Mark Richardson, Srinivas C Turaga, Michael Purcaro, Matthew Balkam, Amy Robinson, Bardia F Behabadi, et al. Space-time wiring specificity supports direction selectivity in the retina. Nature 509, 331 Nature Publishing Group, 2014.
Anisha Keshavan, Esha Datta, Ian McDonough, Christopher R Madan, Kesshi Jordan, Roland Henry. Mindcontrol: A Web Application for Brain Segmentation Quality Control. bioRxiv 090431 Cold Spring Harbor Labs Journals, 2016.