Introduction

The New York City Taxi & Limousine Commission has released a staggeringly detailed historical dataset covering billions of individual taxi trips in the city since January 2009. The trip data is more than a list of pickups and drop-offs: it tells the stories of New York and reflects the social behavior of its residents. The dataset can answer many simple questions and some harder ones. Among all those 'taxis' there is a huge fleet of FHVs (For-Hire Vehicles), thanks to the creativity of tech companies, and almost all of the new FHVs drive for Uber or Lyft. How they are changing the city is an interesting question. Do they really provide more transportation access in the four boroughs outside Manhattan, where some people live relatively far from a subway station? And do FHVs pick up more customers in lower-income areas? I mapped the FHV trips from July 2015 to June 2016, grouped every trip into its local census tract, and then set about extracting answers to these questions from the data.

Data

The datasets used for this project are the FHV trip records provided by the New York City Taxi & Limousine Commission, which can be accessed at its official website: http://www.nyc.gov/html/tlc/html. I used the records from July 2015 to June 2016, a 12-month period. The FHV data has far less detail than the yellow cab trip data, because Uber cited privacy protection and declined to provide most of the trip information. As a result, each record has only a pickup time and a pickup location, given as an NYC taxi zone. The taxi zone boundary shapefile and zone names can be accessed at https://s3.amazonaws.com/nyc-tlc/misc/taxi_zones.zip.
For income information on the New York City area, I used the 2016 American Community Survey, which was released on Dec 8th, 2017. I used https://www.socialexplorer.com/explore/tables to select the columns I needed. SocialExplorer provides US Census and ACS tables, and it offers a more convenient way to identify table codes and filter the information for download.

Data Processing

First I overlaid all New York census tracts onto the NYC taxi zones, since taxi zones are much larger than most tracts. From the intersection results I could tell which taxi zones each census tract overlaps and what proportion of area each overlap covers. I assumed that the probability of an FHV pickup is uniform within each taxi zone, so a zone's pickups can be split among the tracts it overlaps in proportion to area, and each tract's pickups are the sum of the proportional amounts from its intersections with different zones. Then I combined all 12 months of FHV trip data, grouped the trips by taxi zone and month, and used the tract-to-taxi-zone projection file to get each month's trip count in each census tract.
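
The area-weighted allocation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the project: the tract and zone IDs, the intersection areas, and the pickup counts are made-up values, and the function names are hypothetical. It also assumes that each zone's area equals the sum of its listed intersections, that is, that the tracts cover the zone completely, and that each tract-zone pair appears only once.

```python
from collections import defaultdict

# Hypothetical overlay output: (tract_id, zone_id, intersection_area).
# In practice these rows would come from a GIS intersection of the
# census tract and taxi zone shapefiles.
intersections = [
    ("tract_A", "zone_1", 2.0),
    ("tract_B", "zone_1", 6.0),
    ("tract_B", "zone_2", 1.0),
    ("tract_C", "zone_2", 3.0),
]

# Hypothetical pickup counts already grouped by (zone_id, month).
zone_pickups = {
    ("zone_1", "2015-07"): 800,
    ("zone_2", "2015-07"): 400,
    ("zone_1", "2015-08"): 1000,
}


def tract_weights(rows):
    """Return the share of each zone's area that falls inside each tract."""
    zone_area = defaultdict(float)
    for _, zone, area in rows:
        # Assumption: the intersections of a zone add up to its full area.
        zone_area[zone] += area
    return {(tract, zone): area / zone_area[zone] for tract, zone, area in rows}


def allocate(rows, pickups):
    """Split each zone's monthly pickups among tracts in proportion to area."""
    weights = tract_weights(rows)
    result = defaultdict(float)
    for (tract, zone), weight in weights.items():
        for (pickup_zone, month), count in pickups.items():
            if pickup_zone == zone:
                # Uniform-pickup assumption: trips scale with area share.
                result[(tract, month)] += weight * count
    return dict(result)


tract_pickups = allocate(intersections, zone_pickups)
for key in sorted(tract_pickups):
    print(key, tract_pickups[key])
```

Here tract_B overlaps both zones, so its July count combines three quarters of zone_1's pickups with one quarter of zone_2's. The allocation preserves totals: the tract counts for a month add up to the zone counts for that month.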