In ``Computing with CORGIS: Diverse, Real-world Datasets for Introductory Computing'' \cite{Bart_2017}, Bart et al. address the growing relevance of computing skills for non-Computer Science (non-CS) students and the lack of interesting and engaging computing courses for those students. Non-CS students often attend introductory computing courses in which they fail to see the significance and usefulness of the concepts and exhibit little motivation. Pedagogy is often a contributing factor: many courses emphasize computing for its own sake and rely on abstract assignments, such as computing the factorial function \cite{anderson2014introductory}. The authors draw on educational theory to argue that a contextualized data science curriculum is a suitable solution. To that end, Bart et al. develop the CORGIS project: a broad range of real-world datasets, online tools, and a suggested curriculum. In this paper we examine the problem, the authors' solution, and their evaluation of it, and then assess the strengths and weaknesses of the work.

The Problem

As the Age of Computing matures and data increasingly permeates every area of our lives, computing skills are becoming necessary in more and more disciplines beyond Computer Science. Traditional CS courses, however, are geared toward students who plan to become software engineers working alongside other software engineers, so their material is less relevant and less interesting to non-CS majors. These students find the teaching inauthentic and poorly aligned with their future career goals.
Lave and Wenger, in their book \emph{Situated Learning: Legitimate Peripheral Participation}, propose a theory of situated learning in which social context and an individual's role within a community of practice are integral to learning \cite{Lave}. They describe the form of apprenticeship practiced by West African tailors, where apprentices begin by running errands for the master tailors. Over time they become familiar with the business and language of tailoring. Later, they are allowed to cut pieces for an outfit, and eventually they work up to assembling articles of clothing. In this way they progress from peripheral participation to full participation in the community of practice. This and other examples illustrate how people learn, and the same principles can be applied to traditional schooling. Situated Learning Theory argues that the tasks learners perform should reflect tasks performed in a real-world environment and should be perceived as authentic. To maximize authenticity, non-CS students need courses that are relevant to their professions and that foster learning within their own professional circles. For example, a Bioinformatics student will learn more effectively when a computing course incorporates biology, sequencing concepts, and gene data manipulation. Likewise, concepts of quantum mechanics, high energy physics, and particle interaction data will be most useful and engaging to a Particle Physics student. Learning improves when the coursework is aligned with career goals and is perceived as authentic.
Situated Learning Theory \cite{Lave} describes how students learn. Bart et al. \cite{Bart_2017} also look to the MUSIC Model of Academic Motivation \cite{jones2009motivating} to understand why students choose to learn. The MUSIC Model consists of five components: 1) empowerment, 2) usefulness, 3) success, 4) interest, and 5) caring. The guidelines within each component should be considered when designing coursework that will engage a learner. Empowerment refers to the amount of control students perceive they have over their learning. The more control students feel they have, the more motivated they are to engage with the material. Suggestions include giving students a choice of topics to study, allowing them to control the pace of the lesson, and providing opportunities for them to express their opinions. Usefulness refers to a student's understanding of why a concept is useful. Students need a clear understanding of why they are studying a concept and how it applies to the real world. Suggestions include explicitly explaining to students how the course material relates to their interests or career goals, and providing activities that demonstrate the usefulness of the material in the real world. Success refers to students' belief that they can succeed in the course if they invest the effort. Suggestions include making course expectations clear and explicit, providing learning activities that challenge the students, and ordering learning activities by difficulty, beginning with the easiest. Interest refers to the need for course tasks and activities to be interesting and engaging for the students. Suggestions include designing course activities and material that relate to the students' background knowledge and interests, and varying the presentation style.
Caring, the final component, means demonstrating concern for whether or not students successfully meet the course objectives. Suggestions include listening to and valuing students' opinions and showing genuine concern for a student's successes and failures.
Situated Learning Theory and the MUSIC Model of Academic Motivation together indicate that, for learning to succeed, course material and activities for non-CS majors must be relevant to their career paths, interesting, and engaging. Based on their analysis of these models, Bart et al. postulate that Data Science will provide the necessary relevance, interest, and engagement for those students.
Bart et al. review prior work that incorporates data science into introductory computing. One study examines ways of teaching introductory computing courses that integrate real-world data, with tasks such as analyzing DNA, predicting the outcome of elections, detecting fraudulent data, suggesting friends in a social network, determining the authorship of text, and analyzing the mood, or sentiment, of Twitter posts \cite{anderson2014introductory}. Other work recognizes the need for Data Science in non-CS courses and presents a course design covering the basics of databases, an introduction to programming, and data visualization \cite{sullivan2013data}. Yet another defines material for a data-centric introductory computing course geared toward non-CS majors \cite{Hall_Holt_2015}. Despite this prior research, Bart et al. note that there has been little evaluation of the impact of data science in introductory computing courses.

The Solution

Bart et al. propose properly contextualized Data Science as a solution to the lack of contextual, engaging coursework for non-CS majors. The authors have developed the CORGIS project, the Collection Of Really Great and Interesting dataSets, which comprises a collection of real-world datasets, a set of tools for interacting with the data, and suggested assignments.

Data

The foundation of the CORGIS project is the data. The project aims to provide three to four datasets for each non-CS major that may have an introductory computing requirement. Currently, over 40 datasets are available, covering a broad range of topics including education, medicine, politics, and the arts. Each dataset is derived from real-world data that is preprocessed, cleaned, and converted to JSON format. Each dataset also has an associated specification file that describes the metadata, fields, and interfaces available for that dataset.
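
To make the pairing of a JSON dataset with a specification file concrete, the following is a minimal sketch in Python of how a student exercise might consume such a pair. It is not taken from the CORGIS libraries: the records, the field names, the layout of the specification, and both helper functions are hypothetical and were invented for this illustration.

```python
import json

# Hypothetical miniature dataset. The text above says CORGIS data is stored
# as JSON; these particular records and field names are invented.
DATASET_JSON = """
[
  {"state": "Virginia", "year": 2010, "enrollment": 1200},
  {"state": "Delaware", "year": 2010, "enrollment": 300},
  {"state": "Virginia", "year": 2011, "enrollment": 1250}
]
"""

# Hypothetical specification. The real specification format is not shown in
# the text, so this layout (a name plus typed fields) is an assumption.
SPEC_JSON = """
{
  "name": "school_enrollment",
  "fields": {"state": "string", "year": "integer", "enrollment": "integer"}
}
"""

# Maps the type names used in the hypothetical specification to Python types.
PYTHON_TYPES = {"string": str, "integer": int}


def load_records(dataset_text, spec_text):
    """Parse the dataset and check every record against the specification."""
    records = json.loads(dataset_text)
    spec = json.loads(spec_text)
    for record in records:
        for field, type_name in spec["fields"].items():
            if field not in record:
                raise ValueError(f"missing field: {field}")
            if not isinstance(record[field], PYTHON_TYPES[type_name]):
                raise TypeError(f"field {field} is not of type {type_name}")
    return records


def total_by(records, key, value):
    """Sum the `value` field for each distinct `key`, a typical first exercise."""
    totals = {}
    for record in records:
        totals[record[key]] = totals.get(record[key], 0) + record[value]
    return totals


records = load_records(DATASET_JSON, SPEC_JSON)
totals = total_by(records, "state", "enrollment")
print(totals)
```

Validating records against the specification before use means that a student's analysis code can assume every field is present and correctly typed, which keeps the exercise focused on the question being asked of the data.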

Tools

The CORGIS project provides a set of online browser-based tools for exploring and interacting with the datasets. These tools include a Visualizer, an Explorer, Raw Data Files, language libraries, and a Gallery.