This project aims to determine whether there is a relationship between the percentage of noise complaints related to after-hours or before-hours construction and house prices, measured at the PUMA granularity level. Correlation and one-way ANOVA analyses showed that the two variables are highly correlated and that the means differ across the pricing groups.
Abstract

This project analyzes Citibike data to determine whether the proportion of millennials riding is higher than the proportion of baby boomers riding. The results of the z-test performed show that the proportion of millennials who ride bikes in New York City is in fact much higher than that of baby boomers.

Introduction

The Citibike data is a collection of all the rides recorded since the Citibike project started. The collection is split into multiple datasets, each containing the data for one month. Each row is a ride, and each ride has several attributes, e.g. start station id, gender of rider, date and time of ride, birth year of rider, etc. To understand the people who use the service, it is important to study some demographics, which is why comparing the proportion of millennials riding with the proportion of baby boomers riding is a useful question. In this study, millennials are defined as people born between 1981 and 1996, and baby boomers as people born between 1946 and 1964.

Data

The specific data chosen for this study was November 2016. The reason for choosing 2016 and not a previous year is that Citibike had more users in 2016 than in 2015, and I thought the results could be more interesting and significant. After deciding to base the analysis only on the birth year of riders, I dropped the rest of the columns, as they were unnecessary. Figure 1 shows the distribution of the birth year data.
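
The grouping step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study. It assumes the birth years have already been read into a plain list, with missing values stored as `None`, that the range bounds are inclusive, and that the upper millennial bound is 1996. The function names `generation` and `generation_shares` are hypothetical.

```python
from collections import Counter

# Birth-year ranges as defined in this study (inclusive bounds are an assumption).
MILLENNIAL = (1981, 1996)
BABY_BOOMER = (1946, 1964)


def generation(birth_year):
    """Return a generation label, or None if the year falls outside both ranges."""
    if birth_year is None:
        return None
    if MILLENNIAL[0] <= birth_year <= MILLENNIAL[1]:
        return "millennial"
    if BABY_BOOMER[0] <= birth_year <= BABY_BOOMER[1]:
        return "baby_boomer"
    return None


def generation_shares(birth_years):
    """Share of rides per generation among rides with a known birth year."""
    known = [year for year in birth_years if year is not None]
    if not known:
        return {}
    counts = Counter(generation(year) for year in known)
    return {
        label: counts.get(label, 0) / len(known)
        for label in ("millennial", "baby_boomer")
    }
```

Rides without a birth year are left out of the denominator here. Whether that matches the filtering actually applied to the November 2016 data is an assumption.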