Capture1

QY

Problem Description: This study asks which factors affected New York City students’ academic performance on the English test in 2015, and to what extent. I will first define appropriate proxies for academic performance and collate a list of suitable factors. A regression on these proxies and factors will then help determine whether each factor is significant at α = 5% and quantify its impact. The factors affecting students’ academic performance are well studied: the impacts of poverty (Battistich et al., 1995), teachers’ qualifications (Boyd et al., 2008), and attendance in after-school programs (Shernoff, 2010) have all been examined previously. It is still worthwhile to repeat this research with newly available data, such as school budgets and iZone programs, to see whether new policies affected academic performance.

Data:

| Data Name | Why is Data Suitable | Processing conducted |
|---|---|---|
| 2013 - 2015 New York State English Language Arts (ELA) Exam | Mean English score for each school serves as a proxy for academic performance | Need to narrow dataset to 2015 only |
| 2010 - 2016 School Safety Report | An unsafe school might be more disruptive for learning | Need to narrow dataset to 2015 only |
| 2015 - 2016 Final Class Size Report Pupil-to-Teacher Ratio (PTR) | Various studies have shown that smaller class sizes result in better quality learning | Need to combine this dataset with the rest of the data using school names |
| iZone PLS School List | Schools in this list have access to funds for newer teaching software | This data is likely to be converted to a dummy variable for regression |
| 2012 - 2017 Historical Monthly Grade Level Attendance By School | Students who are absent from school are less likely to learn | Need to narrow dataset to 2015 only and aggregate monthly figures into a yearly average |
| DYCD after-school programs: Adolescent Literacy | Schools with such literacy programs might improve English test scores | This data is likely to be converted to a dummy variable for regression |
| 2014-2015 FSF budget | Schools with more funding might be able to provide a better quality education | Funding is location based, not school based, so it needs to be converted to the latter |
| 2015 SAM Budget School Point locations | Provide geospatial coordinates to plot all the schools and the factors | Need to merge all cleaned factors to this dataset |
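The processing steps listed in the table (narrowing to 2015, averaging monthly attendance into a yearly figure, converting list membership to a dummy variable, and joining on school name) could be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the records, field names (`school`, `attendance_pct`, `mean_ela`) and school names are hypothetical, and the real datasets will differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records; real datasets use different column names.
attendance_monthly = [
    {"school": "School A", "year": 2015, "month": 1, "attendance_pct": 92.0},
    {"school": "School A", "year": 2015, "month": 2, "attendance_pct": 94.0},
    {"school": "School A", "year": 2014, "month": 12, "attendance_pct": 80.0},
    {"school": "School B", "year": 2015, "month": 1, "attendance_pct": 88.0},
    {"school": "School B", "year": 2015, "month": 2, "attendance_pct": 90.0},
]
ela_scores = [
    {"school": "School A", "year": 2015, "mean_ela": 310.0},
    {"school": "School B", "year": 2015, "mean_ela": 295.0},
    {"school": "School B", "year": 2014, "mean_ela": 290.0},
]
izone_schools = {"School A"}  # hypothetical membership list


def yearly_attendance(rows, year):
    """Keep one year of monthly rows and average them per school."""
    by_school = defaultdict(list)
    for row in rows:
        if row["year"] == year:
            by_school[row["school"]].append(row["attendance_pct"])
    # Unweighted mean of monthly values: a simplification.
    return {school: mean(values) for school, values in by_school.items()}


def build_table(year=2015):
    """Join the factors onto the ELA scores by school name."""
    attendance = yearly_attendance(attendance_monthly, year)
    table = []
    for row in ela_scores:
        name = row["school"]
        # Skip other years and schools with no attendance data (inner join).
        if row["year"] != year or name not in attendance:
            continue
        table.append({
            "school": name,
            "mean_ela": row["mean_ela"],
            "attendance_pct": attendance[name],
            # Dummy variable: 1 if the school is on the list, else 0.
            "izone": 1 if name in izone_schools else 0,
        })
    return table


table = build_table()
for record in table:
    print(record)
```

Joining on school names is fragile when spellings differ between datasets, so a stable school identifier, if one is available, would be a safer key. The resulting table has one row per school and one column per factor, which is the shape a regression routine expects.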
Download 1


Submitted by: Yanmei Guan @yg833, Samantha Jeanne Falk @sjf374, Qinyu Goh @qg412

ABSTRACT: For this Citibike mini project, our team wanted to test whether riders were keener to ride Citibike on Saturdays than on Sundays. The idea was based on the rationale that more places of interest are closed on Sundays than on Saturdays. To test it, we looked at Citibike data from 2016 and selected 1 month of data from each season: February for Winter, May for Spring, August for Summer, and November for Fall. Seasonality is important because bike riding is an outdoor activity; cooler temperatures during some seasons will affect ridership. By sampling 4 months across the year, we hoped to see a fuller picture. We initially visualized the counts of rides by weekday using a scatter plot, means with error bars, and a box plot with the median number of riders, and it looked like there may be some differences: the mean number of rides on Saturdays was 32884.13 with a standard deviation of 12260.51, and the mean number of rides on Sundays was 28834.11 with a standard deviation of 11372.29. We then ran a two-sample t-test on the counts from Saturdays and Sundays across the four sampled months of 2016, which returned a t-statistic of 0.985 and a p-value of 0.332. As the p-value is greater than 0.05, we fail to reject the null hypothesis that the mean number of bike trips on Saturdays is the same as or less than the mean number of bike trips on Sundays in the 4 months of 2016, at a significance level of 0.05.

INTRODUCTION: Citibike is New York City's (NYC) very own bicycle sharing program. It functions as a docked bicycle sharing system: users purchase either day passes or an annual membership in order to unlock a bicycle at one station and ride it to another station to return it. Since its inception in 2013, Citibike has quickly grown to become a staple mode of transportation in NYC, even beating taxis on travel time in certain instances \citep{bliss2017}. Given the popularity of Citibike in NYC, the team set out to explore Citibike's readily available public trip data to see if interesting trends in usage, as well as the behavior of users, can be distilled.

Idea/Question