In this project I compare the ages of female riders and male riders. I set up a hypothesis test for this question and use a Z-test to carry it out, since I found that the samples appear to come from an approximately normally distributed population.
I used the January 2015 data set from the official Citibike data. Since the data frame has no column that gives the riders' ages directly, I created an 'age' column for female and male riders from the 'Birth Year' column in the data frame.
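The age derivation described above could look roughly like the sketch below. It uses a tiny made-up data frame instead of the real file, and the 'Gender' column name, its coding (1 = male, 2 = female, 0 = unknown), and the choice of 2015 as the reference year are assumptions rather than details taken from this write-up.

```python
import pandas as pd

# Tiny made-up stand-in for the trip data; the real file has many more columns.
# ASSUMPTION: a 'Gender' column coded 1 = male, 2 = female, 0 = unknown.
trips = pd.DataFrame({
    "Birth Year": [1980.0, 1992.0, None, 1975.0, 1988.0],
    "Gender": [1, 2, 0, 2, 1],
})

DATA_YEAR = 2015  # ASSUMPTION: age is measured relative to the year of the trips

# Drop rows with no birth year, then derive the age in years.
trips = trips.dropna(subset=["Birth Year"]).copy()
trips["age"] = DATA_YEAR - trips["Birth Year"].astype(int)

# Split the ages into the two samples used in the test.
age_m = trips.loc[trips["Gender"] == 1, "age"]
age_w = trips.loc[trips["Gender"] == 2, "age"]
```

Rows with an unknown gender fall into neither sample, so they do not affect the comparison.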
The mean age of male riders is the same as or greater than the mean age of female riders.
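Writing M_age and W_age for the mean ages of male and female riders:
H0: M_age <= W_age
H1: M_age > W_age
Equivalently, in terms of the difference:
H0: M_age - W_age <= 0
H1: M_age - W_age > 0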
I will use a significance level of alpha = 0.05.
This means I will reject H0 only if the probability of getting a result at least as extreme as mine, assuming H0 is true, is less than 5%.
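As an illustration of how the one-sided test above can be evaluated, here is a minimal sketch of a two-sample Z-test using only the Python standard library. The function name `one_sided_z_test` and the sample ages are made up for the example and are not the project's data. The sketch also estimates the standard error from the sample variances, which is reasonable only for large samples like the full data set.

```python
import math
from statistics import NormalDist, mean, variance


def one_sided_z_test(sample_m, sample_w, alpha=0.05):
    """Test H0: mean_m - mean_w <= 0 against H1: mean_m - mean_w > 0."""
    diff = mean(sample_m) - mean(sample_w)
    # Standard error of the difference, using sample variances
    # (a large-sample approximation).
    se = math.sqrt(variance(sample_m) / len(sample_m)
                   + variance(sample_w) / len(sample_w))
    z = diff / se
    # Upper-tail p-value: probability of a z at least this large under H0.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha


# Made-up ages, for illustration only.
ages_m = [34, 41, 29, 52, 38, 45, 31, 47]
ages_w = [33, 39, 28, 50, 36, 44, 30, 46]

z, p_value, reject_h0 = one_sided_z_test(ages_m, ages_w)
print(f"z = {z:.3f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

H0 is rejected only when the upper-tail p-value falls below alpha, which matches the 5% criterion stated above.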