https://github.com/jazzzchan/PUI2017_yc3300/tree/master/extracredit
Abstract 
The scaling of rape crime coefficients across New York City has been hampered by data limitations and information asymmetries. Focusing on New York City rape statistics for 2016, this paper aims to describe empirically how different variables affect official crime statistics. A predictive model is developed to provide the basis for a multivariate rape crime index. Using time series analysis and a multivariate regression model, the results show that midnight was the time of day at which the most rape crimes were committed throughout 2016. Additionally, 75% of all incidents were found to have taken place inside residential areas. Given the complex nature of rape and sexual assault, a thorough knowledge of the principles guiding this process, together with corresponding social awareness, is essential.
KEYWORDS crime statistics, rape, New York City

Introduction

Events of the past few years have gradually drawn public attention to rape crime, and recent affairs in particular have pushed sexual assault to the forefront of the headlines. Data shows an increasing trend in the total number of reported rape crimes since 2008. This project applies time series analysis and a multivariate regression model to analyze correlated features in rape crime data from 2015 and 2016.

Data

Crime data for this paper is collected from the New York City Police Department (hereafter NYPD). NYPD provides crime data in accordance with the New York State Penal Law and other New York State laws. For rape crimes, both historical felony data and citywide complaint statistics (citywide incident level) are used to generate the incident dataset for 2015 and 2016.
The NYPD citywide crime statistics (Incident Level Data) give access to all complaints that NYPD has received, across varied levels of crime, up to the most recent release (the third quarter of 2017). To remain consistent with the NYPD Historical Data, which covers only the seven major felonies, only Rape has been extracted from the Citywide Incident Level Dataset.
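The extraction step described above can be sketched as follows. This is a minimal illustration only, not the project's actual code: the column names (`CMPLNT_FR_DT`, `OFNS_DESC`, `PREM_TYP_DESC`), the date format, and the sample rows are assumptions made for the example and should be checked against the real NYPD file.

```python
import csv
import io
from datetime import datetime

# Tiny made-up sample standing in for the citywide incident-level file.
# Column names and values are assumptions, not taken from the real dataset.
SAMPLE_CSV = """CMPLNT_FR_DT,CMPLNT_FR_TM,OFNS_DESC,PREM_TYP_DESC
01/03/2016,00:15:00,RAPE,RESIDENCE - APT. HOUSE
02/11/2016,14:30:00,ROBBERY,STREET
07/21/2015,23:50:00,RAPE,RESIDENCE-HOUSE
09/05/2014,02:00:00,RAPE,STREET
"""


def extract_rape_incidents(csv_file, years=(2015, 2016)):
    """Return rows whose offense is RAPE and whose complaint date falls in `years`."""
    selected = []
    for row in csv.DictReader(csv_file):
        # Keep only the Rape category, ignoring case and stray whitespace.
        if row["OFNS_DESC"].strip().upper() != "RAPE":
            continue
        # Assumed date format: MM/DD/YYYY.
        complaint_date = datetime.strptime(row["CMPLNT_FR_DT"], "%m/%d/%Y")
        if complaint_date.year in years:
            selected.append(row)
    return selected


incidents = extract_rape_incidents(io.StringIO(SAMPLE_CSV))
print(len(incidents), "rape incidents kept for 2015 and 2016")
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle; the filtering logic itself stays the same.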
For privacy reasons, the most recent Incident Level Data no longer contains geocode information on where an incident occurred, but instead includes building types. The NYPD Historical Data contains all geocode-related information for incidents that happened before 2015.
Weather data is provided by Kaggle (the weather 2016 in New York City dataset). This dataset provides only monthly weather data, with temperature and precipitation information. For this project, each month's average temperature has been used as the temperature throughout that month.
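Assigning a monthly average temperature to every incident in that month could look like the sketch below. The function name, the dictionary layout, and the temperature values are hypothetical placeholders chosen for illustration; they are not figures from the Kaggle dataset.

```python
from datetime import datetime

# Placeholder monthly averages in degrees Fahrenheit, keyed by month number.
# These are made-up illustration values, NOT figures from the Kaggle dataset.
MONTHLY_AVG_TEMP_F = {1: 30.0, 7: 80.0}


def attach_monthly_temperature(incident_dates, monthly_avg):
    """Pair each incident date (MM/DD/YYYY) with its month's average temperature."""
    paired = []
    for date_text in incident_dates:
        month = datetime.strptime(date_text, "%m/%d/%Y").month
        # Every incident in a month gets the same value; None if the month is missing.
        paired.append((date_text, monthly_avg.get(month)))
    return paired


with_temp = attach_monthly_temperature(
    ["01/03/2016", "01/28/2016", "07/21/2016"], MONTHLY_AVG_TEMP_F
)
print(with_temp)
```

Because the weather data is monthly, two incidents in the same month always receive the same temperature, so any temperature effect in a later regression can vary only between months, not within them.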