NYC Subways Safety Study

PUI2016 Extra Credit Project Proposal
Sunny Kulkarni, Github Handle - sunnyk-skk456, NYU ID - skk456
Problem Description:
Knowledge of the safety of NYC subway stations would help commuters select a safer route and help the city deploy its resources effectively. The safety of a subway station (henceforth referred to as a station) will be determined by the crimes occurring in the station and its vicinity. The key is to determine what factors affect the final perception of a station’s safety. Given the limited availability of data, the research will focus on establishing the following correlation:
  - Does the crime rate near the stations correlate with the use of the stations during business and non-business hours?
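
As a minimal sketch of how this question could be tested, the snippet below computes a Pearson correlation between per-station crime counts and station entries, separately for business and non-business hours. The proposal does not name a correlation measure or define business hours, so the choice of Pearson, the weekday 09:00 to 17:00 window, the function names, and all figures are assumptions made for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def is_business_hour(hour, weekday):
    """Assumed definition: Monday to Friday (weekday 0-4), 09:00 to 16:59."""
    return weekday < 5 and 9 <= hour < 17


# Made-up per-station figures (one entry per station), NOT real data.
crimes = [2, 5, 1, 7]
business_entries = [1000, 2400, 600, 3100]
off_hours_entries = [300, 200, 450, 150]

r_business = pearson_r(crimes, business_entries)
r_off_hours = pearson_r(crimes, off_hours_entries)
print(f"business hours r = {r_business:.2f}, non-business hours r = {r_off_hours:.2f}")
```

In practice the crime counts would likely also be split by time of day using the same `is_business_hour` rule, so that each usage series is compared against crimes from the matching period.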

Data will be taken from:
1.  NYC Open Data (Socrata) for:
  • Information on Subway Stations and their Entrances – This dataset will give the base information on the location of subway stations across the city. Link.

  • NYPD Crime Data – This data will be filtered for crimes that occurred inside or in the close vicinity of subway premises and aggregated at the station level. It will provide information on the types of crimes occurring at the stations at different times of the day. Link.

2.  MTA Turnstile Data – This dataset provides information on the usage of different stations across the city. Link.

3.  MIT StreetScore dataset – This dataset contains perceived-safety scores for locations in New York, calculated using machine learning and computer vision algorithms. It will be used to gauge the perceived safety of the streets close to the station entrances and to check its correlation with station usage at different times of the day. Link.
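
The filtering and station-level aggregation of the crime data described in item 1 could be sketched roughly as follows. Each crime is assigned to its nearest station and counted only if it falls within a fixed radius. The 150 m radius, the function names, and the coordinates are hypothetical; the proposal does not specify how "close vicinity" is defined.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def crimes_per_station(stations, crimes, radius_m=150.0):
    """Count crimes whose nearest station lies within radius_m (assumed cutoff).

    stations: dict mapping station name -> (lat, lon)
    crimes:   iterable of (lat, lon) crime locations
    """
    counts = {name: 0 for name in stations}
    for lat, lon in crimes:
        # Find the nearest station and its distance to this crime location.
        name, dist = min(
            ((n, haversine_m(lat, lon, s_lat, s_lon))
             for n, (s_lat, s_lon) in stations.items()),
            key=lambda pair: pair[1],
        )
        if dist <= radius_m:
            counts[name] += 1
    return counts


# Made-up coordinates for illustration only, NOT real station or crime data.
sample_stations = {"S1": (40.7500, -73.9900), "S2": (40.7600, -73.9800)}
sample_crimes = [
    (40.7501, -73.9900),  # a few metres from S1
    (40.7600, -73.9801),  # a few metres from S2
    (40.8000, -73.9000),  # far from both stations, so not counted
]
print(crimes_per_station(sample_stations, sample_crimes))
```

The resulting per-station counts can then be joined to the turnstile usage figures by station name. Matching station names across the two sources is assumed here and may need manual cleaning.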