The digital divide has drawn increasing attention from policy makers over the last two decades. In 2013, the Census Bureau included questions about internet access for the first time in the agency’s history. The survey results highlighted the situation of hundreds of thousands of low-income Americans who owned no computer and had deficient access to broadband. For New York City, the American Community Survey (ACS) produced estimates for the most vulnerable segments of the population across all five boroughs. According to the Comptroller’s Office analysis of ACS results, 20% of the city’s youth and 45% of its senior population lacked broadband at home. The deficient access was concentrated in the Bronx and Brooklyn, where 34% and 30% of residents, respectively, lacked internet access. Aiming to address this growing divide, Mayor Bill de Blasio announced the LinkNYC program, a municipal Wi-Fi network that will eventually replace more than 7,000 phone booths with as many as 10,000 interactive kiosks, each able to provide New Yorkers with free high-speed internet within a 150-foot radius. This project evaluates the spatial relationship between the current distribution of the self-funded LinkNYC kiosks (Links) and the areas of New York City where the population lives below the poverty line. Specifically, the study was designed to examine whether low-income members of the community were more likely than high-income segments of the population to be located far from Links. The study used American Community Survey population and internet use estimates for 2015 and 2014, respectively, and the LinkNYC locator provided by New York City’s open data portal.
Among the findings of this project, a strong presence of Links was identified in the borough of Manhattan, with 90% of all free Wi-Fi devices installed within community districts with higher median household income and the greatest number of internet subscriptions. Conversely, Brooklyn was the borough with lower median household income and the fewest internet subscriptions, and it had, at the time of the analysis, only two installed LinkNYC kiosks (representing less than 1% of the total installed Links). Findings indicated that the population below the poverty line was more likely to be located farther from its nearest LinkNYC kiosk than its higher-income neighbors. The limitations of linear distance measurements and the implications of these results for New York City’s digital divide are explored in depth in the sections below.
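The distance comparison summarized above can be illustrated with a minimal sketch. It assumes that "linear" distance means straight-line (great-circle) distance between a location and its nearest kiosk. The study's actual GIS workflow is not described here, so the function names and coordinates below are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_kiosk_distance_m(point, kiosks):
    """Straight-line distance in meters from `point` to the closest kiosk."""
    lat, lon = point
    return min(haversine_m(lat, lon, k_lat, k_lon) for k_lat, k_lon in kiosks)


# Hypothetical coordinates, for illustration only (not real kiosk locations).
kiosks = [(40.7500, -73.9900), (40.7000, -73.9500)]
tract_centroid = (40.7510, -73.9900)
distance_m = nearest_kiosk_distance_m(tract_centroid, kiosks)
```

A straight-line measure like this ignores street layout and physical barriers, which is one reason such distances should be read as approximations of real walking access.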
ABSTRACT
New York City keeps records of Citi Bike services, including demographics of users and statistics on bike use. Here, we performed a statistical analysis to determine the relationship between biker age and trip duration, testing the alternative hypothesis that Citi Bike users under age 35 bike for longer durations than the average user. Through a simple Z-test, we were able to reject our null hypothesis, concluding that the trip duration of bikers under 35 is significantly greater than that of the average user.

DATA
For this project, our research question was: _Are Citi Bike users under 35 years of age significantly more likely to bike for longer durations compared to the average user?_ For this analysis, we formed the following hypotheses:
_Null Hypothesis:_ The mean trip duration of Citi Bike users under the age of 35 is the same as or less than the mean trip duration of an average user, significance level = 0.05.
_Alternative Hypothesis:_ The mean trip duration of Citi Bike users under the age of 35 is greater than the mean trip duration of an average user, significance level = 0.05.
To test these hypotheses, we chose Citi Bike data from December 2015. The information downloaded from the data facility contained more variables than needed to compare age and trip duration. Additionally, it was not organized in columns, which could have led to errors, such as interpreting variable names as observations. As such, we first organized our data into columns, then dropped 13 of the 15 variables. We were left with “birth year” as our independent variable and “trip duration” in seconds as our dependent variable. After plotting both variables, we identified several outliers corresponding to impossibly old users, i.e., those born before 1910. Plot 1 shows a scatter plot of the raw data, plotting birth year against trip duration. Histogram 1 shows the raw distribution of age across the data set.
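The data preparation described above can be sketched as follows. This is a minimal illustration, not the notebook's code: the column names `birth year` and `tripduration` are assumptions modeled on the public Citi Bike trip files, and the sketch assumes that riders born before 1910 are dropped, which the text does not state explicitly.

```python
import csv
import io

# Assumed column names; adjust to match the downloaded file.
BIRTH_YEAR = "birth year"
TRIP_DURATION = "tripduration"
EARLIEST_PLAUSIBLE_BIRTH_YEAR = 1910  # cutoff for outliers named in the text


def load_clean_trips(csv_file):
    """Return (birth_year, trip_duration_seconds) pairs with outliers removed."""
    trips = []
    # DictReader treats the first line as column names, not as an observation.
    for row in csv.DictReader(csv_file):
        try:
            birth_year = int(float(row[BIRTH_YEAR]))
            duration = int(float(row[TRIP_DURATION]))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric value
        if birth_year < EARLIEST_PLAUSIBLE_BIRTH_YEAR:
            continue  # impossibly old rider
        trips.append((birth_year, duration))
    return trips


# Tiny in-memory sample standing in for the December 2015 file.
sample = io.StringIO(
    "tripduration,bikeid,birth year\n"
    "600,1,1985\n"
    "900,2,\n"
    "1200,3,1899\n"
    "300,4,1990.0\n"
)
cleaned = load_clean_trips(sample)
```

Only the two variables of interest are kept; every other column in the file is ignored rather than explicitly dropped.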
In Histogram 3, the distributions of trip duration for the entire data set (in blue) and for the group of those 35 and under (in green) are compared.

ANALYSIS
Our peer reviewers suggested we perform a Z-test to compare the users under 35 with the total population. This test is possible because we know the population parameters (since the dataset itself represents the entire population of Citi Bike users). Given the size of our sample, and the fact that we know the mean and standard deviation for both groups, we chose to test our hypothesis with a Z-test. As such, we first had to calculate the mean and standard deviation of trip duration for the two groups. These values were plugged into the Z-test formula.

RESULTS
From our Z-test, we obtained a Z-statistic of 17.79. From the Z-table, this corresponds to an area of over 0.9998. Thus, our p-value is less than (1 - 0.9998), or 0.0002, meaning that if the null hypothesis were true, a difference at least this large would arise by chance less than 0.02% of the time. This p-value is much smaller than our alpha level of 0.05, meaning we can reject our null hypothesis and conclude that trip durations of Citi Bike users are longer for those under age 35 compared to the average user.

LINK TO ORIGINAL NOTEBOOK
https://github.com/jc7344/PUI2016_jc7344/blob/master/HW6_jc7344/HW6_Assignment2.ipynb
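The Z-test described in the ANALYSIS section can be sketched as below. This is a minimal illustration that assumes the standard one-sample form, Z = (sample mean - population mean) / (population standard deviation / sqrt(n)). The exact formula used in the notebook is not reproduced here, and the input numbers are hypothetical rather than the study's values.

```python
import math


def one_sided_z_test(sample_mean, pop_mean, pop_std, n):
    """Return (z, p_value) for the alternative 'sample mean > population mean'."""
    standard_error = pop_std / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Upper-tail area of the standard normal distribution beyond z.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical inputs, in seconds; not the values from the study.
z, p = one_sided_z_test(sample_mean=900.0, pop_mean=850.0, pop_std=500.0, n=10000)
reject_null = p < 0.05  # alpha level used in the report
```

Computing the tail area directly avoids the ceiling of a printed Z-table, which typically stops near 0.9998 and so can only bound the p-value for very large Z-statistics.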