This project examines whether male and female bikers differ in how often they ride in the day time versus the night time. More specifically, my initial hypothesis is that men are more likely than women to bike in the night time because of safety concerns: I assume women are more sensitive than men to the potential safety risks of traveling at night.
I use the Citi Bike trip records for February 2015 as the sample for my study. The time a biker started a trip determines whether that trip counts as day or night. I divided my sample into two groups, men and women, based on the gender information. Day time runs from 7am to 7pm, and night time from 7pm to 7am.
Based on the available data, I calculated the normalized fractions of bikers in the day and night times for each gender, which I use for the illustrative graphs and the statistical analysis.
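The normalization step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function names `period` and `period_fractions`, the `(gender, start_time)` input shape, and the toy trips are all hypothetical. It also assumes that a trip starting exactly at 7am counts as day and one starting exactly at 7pm counts as night.

```python
from collections import Counter
from datetime import datetime


def period(start_time: datetime) -> str:
    """Label a trip 'day' if it starts in [7am, 7pm), otherwise 'night'.

    The handling of the exact boundaries is an assumption.
    """
    return "day" if 7 <= start_time.hour < 19 else "night"


def period_fractions(trips):
    """Compute day/night fractions normalized within each gender.

    trips: iterable of (gender, start_time) pairs (hypothetical input shape).
    Returns {gender: {"day": fraction, "night": fraction}}.
    """
    counts = Counter((gender, period(start)) for gender, start in trips)
    fractions = {}
    for gender in {g for g, _ in counts}:
        total = counts[(gender, "day")] + counts[(gender, "night")]
        fractions[gender] = {
            p: counts[(gender, p)] / total for p in ("day", "night")
        }
    return fractions


# Made-up toy trips for illustration only; these are not Citi Bike records.
toy_trips = [
    ("male", datetime(2015, 2, 1, 8, 30)),
    ("male", datetime(2015, 2, 1, 20, 15)),
    ("male", datetime(2015, 2, 2, 6, 45)),
    ("male", datetime(2015, 2, 2, 12, 0)),
    ("female", datetime(2015, 2, 1, 9, 0)),
    ("female", datetime(2015, 2, 1, 17, 30)),
    ("female", datetime(2015, 2, 2, 23, 5)),
    ("female", datetime(2015, 2, 3, 10, 10)),
]
fractions = period_fractions(toy_trips)
```

Normalizing within each gender matters because the two groups can differ a lot in size, so raw trip counts would not be comparable.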
The graphs below suggest that men and women may differ in the hours at which they are most likely to bike. In Figure 1, the fractions of men riding after 6pm and before 8am are higher than the corresponding fractions of women. In Figure 2, which shows the fraction of each gender riding in the day and night times, the fraction of female riders is higher than that of male riders in the day time and lower in the night time.
Null hypothesis: the proportion of men's trips that start in the night time is the same as or lower than the proportion of women's trips that start in the night time.
H0: m_nighttime/m_alltime <= w_nighttime/w_alltime

Ha: m_nighttime/m_alltime > w_nighttime/w_alltime
- Day time: from 7am to 7pm
- Night time: from 7pm to 7am
- m_nighttime: number of trips by male bikers that start in the night time
- m_alltime: number of trips by male bikers over the whole day
- w_nighttime: number of trips by female bikers that start in the night time
- w_alltime: number of trips by female bikers over the whole day
I use a z-test to test this hypothesis, with a significance level of 0.05.
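One common form of the z-test for comparing two proportions uses a pooled standard error. The sketch below assumes that form; the source does not say which variant the notebook uses, so treat it as an illustration rather than the exact computation behind the reported result. The function name `two_proportion_z` and the counts in the example call are made up.

```python
import math


def two_proportion_z(m_night: int, m_all: int, w_night: int, w_all: int) -> float:
    """z statistic for H0: m_night/m_all <= w_night/w_all.

    Uses the pooled-proportion standard error (an assumed variant).
    """
    p_m = m_night / m_all  # fraction of men's trips at night
    p_w = w_night / w_all  # fraction of women's trips at night
    pooled = (m_night + w_night) / (m_all + w_all)  # night fraction under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / m_all + 1 / w_all))
    return (p_m - p_w) / se


# Made-up counts for illustration only; not the February 2015 figures.
z = two_proportion_z(m_night=300, m_all=1000, w_night=250, w_all=1000)
```

A positive z means the men's night-time fraction is above the women's. Because the alternative hypothesis is one-sided, only large positive values count as evidence against H0.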
### Result
The z value obtained from the test is 12.50. A z value of about 3.8 already corresponds to a p value of roughly 0.0001, so the p value from the test is smaller than 0.0001. Hence, I reject the null hypothesis at the 5% significance level: the data support the alternative hypothesis that men are more likely than women to bike in the night time. For further analysis, I will look at data from other months to validate this result.
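The comparison against the significance level can be checked directly by converting the z value into a one-sided p value. This is a minimal sketch using only the standard library; the function name `upper_tail_p` is hypothetical, and the only input taken from the analysis is the reported z value of 12.50.

```python
import math


def upper_tail_p(z: float) -> float:
    """One-sided p value P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


alpha = 0.05              # significance level used in the test
z_observed = 12.50        # z value reported in the result
p_value = upper_tail_p(z_observed)
reject_null = p_value < alpha
```

Using `math.erfc` rather than `1 - cdf` keeps the tail probability accurate for large z, where the subtraction would round to zero.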
The notebook contains the data and the test: Link
Citi Bike data. Retrieved from https://s3.amazonaws.com/tripdata/201502-citibike-tripdata.zip