CitiBike is a popular transportation alternative in New York City, used by riders of all ages. This project asks which CitiBike riders use the service more on weekends than on weekdays. The main idea is to divide the riders into two age groups: those over 30 years old and those 30 or younger. Using one month of CitiBike data and a null hypothesis significance test, we conclude that riders aged 30 or younger are more likely than older riders to use CitiBike on weekends.
Citi Bike is the nation's largest bike share program, with 10,000 bikes and 600 stations across Manhattan, Brooklyn, Queens and Jersey City. It is a quick and affordable way to get around town and is very popular in the NYC area. Analyzing CitiBike users' activity is one of the most important ways to understand both the business and the social behavior of its riders. This project tries to find out which age group makes more use of CitiBike for transportation on weekends. We use 30 as the dividing age because, in general, city residents aged 30 or younger are children, teenagers or singles, and many of them are students or early in their careers. People over 30, meanwhile, are more likely to have families and stable jobs.
The data used for this project is the CitiBike monthly ridership dataset, specifically the file for June 2016. It is provided by the CitiBike program and can be accessed on their official website, https://www.citibikenyc.com/system-data, and at https://s3.amazonaws.com/tripdata/index.html. The dataset contains columns for trip duration, location information and rider information. To focus on the question posed in the introduction, only the riders' birth year column is kept and all other columns are removed. Riders born more than 30 years ago are then grouped together and their trips counted, and the same is done for riders born 30 years ago or less. Finally, we plot the two groups in two figures: one shows each group's total ridership for each day of the week, and the other shows each day's ridership as a fraction of that group's total. In both figures the two groups are plotted together for comparison.
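The grouping described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code. It assumes each trip has already been reduced to a (start date, birth year) pair, that the start date is kept alongside the birth year so the day of the week can be derived, and that age is computed as 2016 minus birth year. The function name `weekday_ridership` and the sample trips are hypothetical.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_ridership(trips, ref_year=2016, age_cutoff=30):
    """Count trips per weekday for two age groups.

    trips: iterable of (start_date, birth_year) pairs (assumed input shape);
    trips with a missing birth year (None) are skipped.
    Returns (counts, fractions), both keyed by group name.
    """
    counts = {"over 30": Counter(), "30 and under": Counter()}
    for start_date, birth_year in trips:
        if birth_year is None:
            continue  # rider's age unknown, cannot assign a group
        age = ref_year - birth_year  # assumption: age taken as of the data year
        group = "over 30" if age > age_cutoff else "30 and under"
        counts[group][WEEKDAYS[start_date.weekday()]] += 1

    # Normalize each group's counts by that group's own total.
    fractions = {}
    for group, by_day in counts.items():
        total = sum(by_day.values())
        fractions[group] = {
            day: (by_day[day] / total if total else 0.0) for day in WEEKDAYS
        }
    return counts, fractions


# Made-up sample trips for illustration only (not real CitiBike records).
sample_trips = [
    (date(2016, 6, 4), 1990),  # Saturday, rider aged 26
    (date(2016, 6, 5), 1995),  # Sunday, rider aged 21
    (date(2016, 6, 6), 1970),  # Monday, rider aged 46
    (date(2016, 6, 6), 1980),  # Monday, rider aged 36
    (date(2016, 6, 4), None),  # birth year missing, skipped
]
counts, fractions = weekday_ridership(sample_trips)
weekend_share = {g: f["Sat"] + f["Sun"] for g, f in fractions.items()}
```

Here `counts` corresponds to the first figure (total ridership per day for each group) and `fractions` to the second (each day's share within its group). Comparing the groups on `weekend_share` is the quantity the hypothesis test is concerned with.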