Introduction

Citibike, introduced in 2013, is a privately owned bike-share system expanding throughout the boroughs of New York City. Since its inception, use of the bike-share system has grown significantly, and 

Data

We looked at one month of data, July 2017, dropping extraneous variables so that only the two relevant to our research remained: birth year and gender. We subtracted each birth year from 2017 to determine the age of each rider. The data contained a few unrealistic ages (one rider was listed as being 140 years old), so we removed all observations with an age greater than 80.
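
The cleaning step described above can be sketched roughly as follows. This is a minimal illustration rather than the code we ran: the column names "birth year" and "gender", the gender codes, the sample rows, and the choice to skip rows with a missing birth year are all assumptions made for the example.

```python
DATA_YEAR = 2017  # the month analyzed is July 2017
MAX_AGE = 80      # cutoff for unrealistic ages


def rider_ages(rows, year=DATA_YEAR, max_age=MAX_AGE):
    """Return (gender, age) pairs, dropping unusable or unrealistic ages.

    `rows` is any iterable of dicts with the assumed keys
    "birth year" and "gender" (hypothetical column names).
    """
    cleaned = []
    for row in rows:
        raw = str(row.get("birth year", "")).strip()
        try:
            # Birth years are sometimes stored as floats, e.g. "1985.0"
            birth_year = int(float(raw))
        except (ValueError, OverflowError):
            continue  # assumption: rows without a usable birth year are skipped
        age = year - birth_year
        if age <= max_age:
            cleaned.append((row["gender"], age))
    return cleaned


# Made-up rows for illustration only, not taken from the dataset
sample = [
    {"birth year": "1985", "gender": "1"},
    {"birth year": "1877.0", "gender": "2"},  # age 140, removed
    {"birth year": "", "gender": "0"},        # missing, skipped
    {"birth year": "1992.0", "gender": "2"},
]
print(rider_ages(sample))
```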

Methodology

In order to test for significance, we used the two-tailed independent t-Test. Our independent variable (gender) is categorical, and our dependent variable (age) is continuous. Based on this information, we determined that we could use either the t-Test or an ANOVA. Because we are comparing only two groups and the ANOVA is more complex, we decided that the simpler t-Test would be more appropriate.
Alternatively, we were encouraged to use the z-Test and the Chi-Squared test. We cannot use Chi-Squared because age is not categorical and we are not testing a proportion, and we cannot use the z-Test because we do not have the population parameters. Thus, the independent t-Test was our best fit. At first we considered a one-tailed test to determine whether female riders had a significantly lower average age than males, but we eventually determined that we did not have a strong enough reason to know  
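
For reference, the test statistic behind an independent two-sample t-Test can be sketched as below. The report does not say whether the pooled-variance (Student) or unequal-variance (Welch) form was used, so the pooled form here is an assumption, and the two age lists are made-up values for illustration, not results from the dataset.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, df) for a two-sample t-test with pooled variance."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances, n - 1)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Standard error of the difference between the two group means
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


# Hypothetical ages, for illustration only
female_ages = [25, 31, 28, 35, 30]
male_ages = [33, 40, 29, 38, 36]

t, df = pooled_t_statistic(female_ages, male_ages)
print(f"t = {t:.3f}, df = {df}")
```

For a two-tailed test, the p-value is the probability that a t-distributed variable with df degrees of freedom exceeds |t| in absolute value. If SciPy is available, scipy.stats.ttest_ind(female_ages, male_ages) returns both the statistic and this p-value.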

Conclusion

Based on the outcome of the t-Test, we determined that there is a significant difference between the average ages of female and male users. To strengthen the analysis, we could include more months of data. To extend it, we could test whether, for each gender, there is a significant difference between the ages of users riding Citibike recreationally and those using it to commute to work. This could help us better understand the user base and inform where future stations should be built.