In this project I compare the ages of female and male riders. I set up a hypothesis test for this comparison and use a Z-test to evaluate it, since I found that the samples come from an approximately normally distributed population.
I used the January 2015 data set from the official Citibike data. Since the data frame has no direct information about the riders' ages, I created an 'age' column for female and male riders from the 'Birth Year' information in the data frame.
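A minimal sketch of how the 'age' column could be derived is given here. It assumes the trips are loaded into a pandas data frame, that age is taken as 2015 minus the birth year, and that there is a gender column coded 1 for male and 2 for female. The function names, the 'Gender' column name, and the toy rows are illustrative assumptions, not the project's actual code or data.

```python
import pandas as pd

def add_age(df, ride_year=2015, birth_col="Birth Year"):
    """Return a copy of df with an 'age' column (ride_year - birth year).

    Rows with a missing or non-numeric birth year are dropped.
    """
    out = df.copy()
    birth = pd.to_numeric(out[birth_col], errors="coerce")
    out["age"] = ride_year - birth
    return out.dropna(subset=["age"])

def split_by_gender(df, gender_col="Gender", male=1, female=2):
    """Return (men_age, women_age) as two pandas Series.

    The column name and the 1/2 coding are assumptions; adjust to the data.
    """
    men_age = df.loc[df[gender_col] == male, "age"]
    women_age = df.loc[df[gender_col] == female, "age"]
    return men_age, women_age

# Toy rows for illustration only, not taken from the Citibike data.
toy = pd.DataFrame({
    "Birth Year": [1980, 1990, None, 1975],
    "Gender": [1, 2, 0, 2],
})
with_age = add_age(toy)
men_age, women_age = split_by_gender(with_age)
```

With real data, `toy` would be replaced by the data frame read from the January 2015 trip file.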
The mean age of male riders is the same as or greater than the mean age of female riders.
H_0: M_age <= W_age
H_1: M_age > W_age
Equivalently, in terms of the difference:
H_0: M_age - W_age <= 0
H_1: M_age - W_age > 0
I will use a significance level of alpha = 0.05, which means I will reject the null hypothesis only if the probability of getting a result at least as extreme as mine, assuming the null hypothesis is true, is less than 5%.
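The one-sided test described above could be sketched as follows. This is a two-sample Z-test on the difference of means, using only the Python standard library and with the sample variances standing in for the population variances (a common large-sample approximation). The function name `one_sided_z_test` and the toy ages are hypothetical, and this is not necessarily how the project computes the statistic.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def one_sided_z_test(men_age, women_age, alpha=0.05):
    """Test H_0: M_age - W_age <= 0 against H_1: M_age - W_age > 0.

    z = (mean_m - mean_w) / sqrt(s_m^2 / n_m + s_w^2 / n_w)
    The p-value is the upper-tail probability of the standard normal at z.
    """
    n_m, n_w = len(men_age), len(women_age)
    std_err = sqrt(variance(men_age) / n_m + variance(women_age) / n_w)
    z = (mean(men_age) - mean(women_age)) / std_err
    p_value = 1.0 - NormalDist().cdf(z)
    reject_null = p_value < alpha
    return z, p_value, reject_null

# Toy ages for illustration only, not results from the Citibike data.
z, p_value, reject_null = one_sided_z_test([30, 40, 50, 60], [20, 30, 40, 50])
print(f"z = {z:.3f}, p = {p_value:.3f}, reject H_0: {reject_null}")
```

The null hypothesis is rejected only when the p-value falls below alpha. A large p-value means the data do not provide enough evidence that male riders are older on average.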