Methodology
Initial processing of the June 2017 Citi Bike trip data suggests that there are more subscriber riders than pass riders. However, the average trip durations suggest that subscribers tend to take shorter trips than pass riders (Figure 1). This finding sets the basis for the null hypothesis: the average trip duration of Citi Bike subscribers is the same as or longer than the average trip duration of Citi Bike pass riders (labelled Customer below). The hypotheses are:
H0: Subscriber's mean >= Customer's mean
H1: Subscriber's mean < Customer's mean
Therefore, the alternative hypothesis is that Citi Bike subscribers have a shorter average trip duration than Citi Bike pass riders.
The appropriate statistical test for this analysis is a one-tailed t-test, since the study compares the average trip duration of the two groups and the hypothesis is directional: the test assesses how likely the observed difference would be if subscribers' average trip duration were the same as or longer than that of pass riders. A one-tailed t-test suits this analysis because it tests for a difference in means in a single direction between two independent groups and does not require a control group. The significance level (alpha) for the t-test is 0.05.
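For reference, one common form of the two-sample t statistic is sketched below. The text does not state which variant of the t-test is used, so the unequal-variance (Welch) form shown here is an assumption; the symbols (sample means, sample variances s^2, and sample sizes n for the subscriber group S and the customer group C) are introduced only for illustration.

```latex
t = \frac{\bar{x}_S - \bar{x}_C}{\sqrt{\dfrac{s_S^2}{n_S} + \dfrac{s_C^2}{n_C}}},
\qquad
\text{reject } H_0 \text{ if } p = P(T \le t) < \alpha = 0.05
```

Because the alternative hypothesis is that the subscriber mean is smaller, only the lower tail of the distribution counts toward the p-value.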
To carry out the t-test, a random sample is drawn from the population of each user type. The sample size is the same for both user types: 30,000 rides.
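The sampling and testing steps can be sketched in Python as follows. This is a minimal illustration, not the analysis code used in the study: the trip durations are synthetic (drawn from exponential distributions with made-up means), the function name `one_tailed_welch_test` is hypothetical, and the unequal-variance form of the test is an assumption. With 30,000 rides per group the t distribution is practically indistinguishable from the standard normal, so the sketch uses the normal CDF for the p-value to stay within the standard library.

```python
import math
import random
import statistics


def one_tailed_welch_test(subscriber, customer, alpha=0.05):
    """Test H1: mean(subscriber) < mean(customer) at significance level alpha."""
    n1, n2 = len(subscriber), len(customer)
    m1, m2 = statistics.fmean(subscriber), statistics.fmean(customer)
    v1, v2 = statistics.variance(subscriber), statistics.variance(customer)
    # Standard error of the difference in means (unequal variances assumed).
    se = math.sqrt(v1 / n1 + v2 / n2)
    t_stat = (m1 - m2) / se
    # Lower-tail p-value; normal approximation to the t distribution,
    # which is reasonable only because the samples are very large.
    p_value = statistics.NormalDist().cdf(t_stat)
    return t_stat, p_value, p_value < alpha


# Synthetic stand-ins for the two populations (durations in seconds).
# These values are illustrative assumptions, not Citi Bike data.
rng = random.Random(0)
subscriber_pop = [rng.expovariate(1 / 800) for _ in range(100_000)]
customer_pop = [rng.expovariate(1 / 1500) for _ in range(50_000)]

# Draw equal-sized random samples without replacement, as described above.
SAMPLE_SIZE = 30_000
subscriber_sample = rng.sample(subscriber_pop, SAMPLE_SIZE)
customer_sample = rng.sample(customer_pop, SAMPLE_SIZE)

t_stat, p_value, reject_h0 = one_tailed_welch_test(subscriber_sample, customer_sample)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}, reject H0: {reject_h0}")
```

If SciPy is available, `scipy.stats.ttest_ind(subscriber_sample, customer_sample, equal_var=False, alternative="less")` performs the same one-tailed comparison using the t distribution itself.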