Citi Bike Ridership Mini-Research:
Young People Are More Likely to Use Citi Bikes on Weekends

Abstract

This study tests whether young people ride Citi Bikes on weekends more often than middle-aged people do. The analysis performs a hypothesis test (Z-test) that compares the ratio of weekend to weekday Citi Bike use among young riders with the same ratio among middle-aged riders. The result shows that, at the 5% significance level, the ratio of the number of young people biking on weekends to the number biking over the week (7 days) is greater than the corresponding ratio for middle-aged people.

Keywords: Citi Bike, Hypothesis Test, Z-test, Age

Introduction

In a fast-paced modern city like New York, Citi Bike has become not only one of the most popular alternatives for commuting but also a crucial component of the city's gradually forming network of transportation and social activity. Because it yields quite comprehensive data sets, Citi Bike is a good subject for researchers who study citizens' behavior through patterns in ridership. This study tests whether young people ride bikes on weekends more often than middle-aged people do, under the assumption that riders' use of Citi Bikes fully reflects their personal preferences, that is, that they bike for general purposes rather than mainly for commuting.

Data Availability and Processing

All data processed for the statistical test comes from https://s3.amazonaws.com/tripdata, where it is published on a monthly basis. The data wrangling process is designed to be reproducible and includes the following stages:

  1. When data for a specific month is requested, check a designated directory and download the file there only if it is missing, so that data already downloaded can be reused. We use the February 2015 Citi Bike data for this study.

  2. Read the data into a pandas DataFrame; select and modify the attributes as needed (for example, create a binary "age group" attribute by computing ages from "birth year"). Label rows with \(18\leqslant age<40\) as young and rows with \(40\leqslant age<60\) as middle-aged.

  3. Plot histograms of the normalized fractions of average trip counts for young and middle-aged bikers, both together and for each group individually, on each day of the week.

  4. Account for the errors in the average daily ride counts on weekdays and weekends for both biker groups.
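
Stages 2 to 4 can be illustrated with a minimal pandas sketch. It is not the code used in the study, and it rests on several assumptions: ages are computed as 2015 minus "birth year", the trip start timestamp is in a column named "starttime", the "normalized fraction" is read as each group's share of trips on each day of the week, and the "error" is taken to be the standard error of the mean daily count. The function names and the toy rows are hypothetical and serve only to show the shape of the computation.

```python
import numpy as np
import pandas as pd


def label_age_group(trips, ref_year=2015):
    """Keep riders with 18 <= age < 60 and tag them 'young' or 'middle-aged'."""
    out = trips.copy()
    # Assumption: age is the reference year minus the reported birth year.
    out["age"] = ref_year - out["birth year"]
    # Rows with a missing birth year fail both comparisons and are dropped.
    out = out[(out["age"] >= 18) & (out["age"] < 60)].copy()
    out["age_group"] = np.where(out["age"] < 40, "young", "middle-aged")
    return out


def day_of_week_fractions(labeled, time_col="starttime"):
    """Share of each group's trips on each day of the week (0=Mon ... 6=Sun)."""
    dow = pd.to_datetime(labeled[time_col]).dt.dayofweek.rename("day_of_week")
    # normalize="index" makes every row (age group) sum to 1.
    return pd.crosstab(labeled["age_group"], dow, normalize="index")


def daily_count_summary(labeled, time_col="starttime"):
    """Mean and standard error of daily trip counts, by group and weekend flag."""
    ts = pd.to_datetime(labeled[time_col])
    per_day = (
        labeled.assign(date=ts.dt.normalize(), is_weekend=ts.dt.dayofweek >= 5)
        .groupby(["age_group", "is_weekend", "date"])
        .size()  # trips per group per calendar day (days with no trips are absent)
        .rename("trips")
        .reset_index()
    )
    return per_day.groupby(["age_group", "is_weekend"])["trips"].agg(["mean", "sem"])


# Toy rows for illustration only; these are not real Citi Bike records.
trips = pd.DataFrame(
    {
        "starttime": [
            "2015-02-01 00:05:00",  # Sunday
            "2015-02-02 08:00:00",  # Monday
            "2015-02-02 09:00:00",  # Monday
            "2015-02-07 10:00:00",  # Saturday
            "2015-02-03 10:00:00",  # Tuesday
        ],
        "birth year": [1990, 1990, 1960, 1985, 1940],
    }
)

labeled = label_age_group(trips)
fractions = day_of_week_fractions(labeled)
summary = daily_count_summary(labeled)
```

Here `fractions` has one row per age group and one column per observed day of the week, which is the table a bar plot such as `fractions.T.plot(kind="bar")` would visualize, and `summary` gives the weekday and weekend averages with their standard errors. Other error models, such as Poisson counting errors, would be equally reasonable readings of stage 4.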