PUI2016 Extra Credit Project

Time Series Analysis of Beijing Air Pollution
<Chunqing Xu, cx495, cx495>
Abstract:
This project aims to identify the main factors causing air pollution in Beijing. Time series analysis is applied to PM2.5 datasets for Beijing and other major cities in China. Based on the resulting time series plots and event detection, we find that factory emissions, vehicle emissions, and coal burning are the three main factors of air pollution in Beijing.
Introduction:
The question to be answered is: what are the main factors of air pollution in Beijing? Air pollution is a severe problem in Beijing and in China as a whole, and it matters because air quality is strongly related to the health of residents. The government has introduced several policies to reduce air pollution; pollution has decreased, but the problem is far from solved. Finding the main factors of air pollution may not only reveal why it is so hard to eliminate but also help the government make truly effective policies.
Data:
The datasets used for this project are hourly PM2.5 values for Beijing and several other major cities in China. They are provided by the U.S. Department of State Air Quality Monitoring Program and can be accessed at the official website (http://www.stateair.net/web/historical/1/1.html). The datasets contain some errors: some PM2.5 values are negative. PM2.5 refers to particulate matter with an aerodynamic diameter of 2.5 μm or less, and the recorded value is a concentration of those particles, so negative PM2.5 values must be wrong. The dataset contains only PM2.5 information; the analysis of air pollution would be more persuasive if more information were included, since PM2.5 is only one kind of pollutant. Rows with negative ‘PM2.5 Value’ entries are dropped, and only the ‘Time’ and ‘PM2.5 Value’ columns are kept. The values in the ‘Time’ column are also converted into proper datetime values using a datetime conversion function.
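The cleaning steps described above can be sketched as follows. This is a minimal illustration and not the project's actual code: it uses only the Python standard library, the helper name clean_pm25 is hypothetical, the timestamp format is an assumption, and the sample rows are made up for demonstration. Only the column names ‘Time’ and ‘PM2.5 Value’ come from the description above.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout; the real files may use a different format.
TIME_FORMAT = "%Y-%m-%d %H:%M"


def clean_pm25(rows, time_format=TIME_FORMAT):
    """Keep only 'Time' and 'PM2.5 Value', drop negative readings,
    and convert 'Time' strings into datetime objects."""
    cleaned = []
    for row in rows:
        value = float(row["PM2.5 Value"])
        if value < 0:
            # A concentration cannot be negative, so treat it as an error.
            continue
        cleaned.append({
            "Time": datetime.strptime(row["Time"], time_format),
            "PM2.5 Value": value,
        })
    return cleaned


# Made-up sample rows for illustration only (not real measurements).
sample_csv = io.StringIO(
    "Site,Time,PM2.5 Value\n"
    "Beijing,2015-01-01 00:00,120\n"
    "Beijing,2015-01-01 01:00,-999\n"
    "Beijing,2015-01-01 02:00,85\n"
)

cleaned = clean_pm25(csv.DictReader(sample_csv))
for record in cleaned:
    print(record["Time"].isoformat(), record["PM2.5 Value"])
```

Dropping the negative rows, instead of replacing them with an estimate, keeps the cleaned series free of invented values, at the cost of leaving gaps in the hourly sequence.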