Karan Saini

and 1 more

Abstract

This study is a statistical analysis of CitiBike open data. Our hypothesis was that young people are more likely to be CitiBike subscribers. To test it, we applied the Mann-Whitney rank test to subsets of the customer data and the subscriber data. The p-value of the test was \(1.5788174779838282 \times 10^{-231}\), which is less than the chosen significance level \(\left(\alpha = 0.5\right)\). Hence the null hypothesis was rejected.

Introduction

CitiBike is a bike-sharing service that offers its users two service models: customer and subscriber. Subscribers pay a standard monthly fee for unlimited access to CitiBike. Customers pay per ride. Our idea rests on the premise that riders need a minimum fitness level to use CitiBike as a primary means of transportation, and hence to get good value from a subscription. This analysis is also of potential marketing significance to CitiBike as a company.

Data

The statistical analysis uses the CitiBike trip data for June 2018, obtained from the CitiBike website. The data set records, for every trip in June, the trip duration, the start and end time of the journey, the start and end station, the bike id, the user type, and the rider's birth year and gender. The data is read into a data-frame, which is then reduced to the two columns of interest: user type and birth year. Age is calculated from the birth year, and null values are dropped. Two separate data-frames are created, one for customers and one for subscribers.
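The preparation steps described above, together with the rank test named in the abstract, might be sketched as follows. This is a minimal illustration, not the authors' actual code. The column names `usertype` and `birth year`, the labels `Customer` and `Subscriber`, the file name in the usage comment, the age formula (2018 minus birth year), and the two-sided alternative are all assumptions; the helper names `split_ages_by_user_type` and `compare_ages` are hypothetical.

```python
import pandas as pd
from scipy.stats import mannwhitneyu

DATA_YEAR = 2018  # the trips are from June 2018; age = DATA_YEAR - birth year (assumed formula)


def split_ages_by_user_type(trips: pd.DataFrame):
    """Return (customer_ages, subscriber_ages) as two pandas Series."""
    # Keep only the two columns of interest (column names are assumed).
    df = trips[["usertype", "birth year"]].copy()
    # Turn non-numeric birth years into NaN so they are dropped with the nulls.
    df["birth year"] = pd.to_numeric(df["birth year"], errors="coerce")
    df = df.dropna()
    df["age"] = DATA_YEAR - df["birth year"]
    # Split into the two groups (label values are assumed).
    customers = df.loc[df["usertype"] == "Customer", "age"]
    subscribers = df.loc[df["usertype"] == "Subscriber", "age"]
    return customers, subscribers


def compare_ages(customers, subscribers):
    """Run a Mann-Whitney U test on the two age samples (two-sided, by assumption)."""
    return mannwhitneyu(customers, subscribers, alternative="two-sided")


# Usage sketch (file name is an assumption):
# trips = pd.read_csv("201806-citibike-tripdata.csv")
# customers, subscribers = split_ages_by_user_type(trips)
# result = compare_ages(customers, subscribers)
# print(result.statistic, result.pvalue)
```

The resulting p-value can then be compared with the chosen significance level to decide whether to reject the null hypothesis. If the test is meant to be directional, `alternative` could be set to `"less"` or `"greater"` instead.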