Digital Signs of the Neighbourhoods
With the growing popularity of social media platforms, the volume of data they record can be seen as an opportunity to meet the needs of city management. Struggling with multiple challenges, city governors are eager to obtain real-time data in order to understand what is happening in the city and to support their decisions. In our study, we investigate how the Twitter stream can be used to characterize urban areas and predict their socioeconomic properties. As a result, we define several directions for potential implementation, including event detection and evaluation, area partitioning and profiling, and prediction of the socioeconomic properties of areas.
As digital technologies become more and more widespread, the big data created by recording the digital traces left behind by human activities has become a powerful means of studying various aspects of human behavior. Many of those aspects can be described with social media feeds: data generated and shared by people through multiple global social media platforms (Grauwin). At the same time, the increasing urbanization of the world's population and the great diversity of the new urban population deeply affect the urban environment. Solving the many challenges of modern cities, including crime, illegal construction, tax regulation, emergencies, and many others, requires frequent updates and prioritization and, therefore, large quantities of highly granular incoming data for analysis.
This need can be addressed with records from modern communication systems, and from social media in particular. Focusing on records aggregated by spatial location rather than by individual, new approaches have been initiated in which different types of communication records are used for many purposes, from urban landscape description (Frias-Martinez) (Jacobs-Crisioni) (Ratti) to regional delineation (Amini) (Kung), population density estimation, land use classification (Pei) (Grauwina), and identification of social groups and events (Reades). Because many communication channels and platforms have unique sets of features, it is crucial to develop theoretical frameworks and real-time monitoring systems: understanding how individual dynamics shape the structure of our cities is required in order to make better tactical decisions and general strategies for city governance (Grauwin).
We use three years of the Twitter stream generated within New York City as the main source of data in our study. Twitter was selected for its accessibility, its rich data, including geographical location, and its large user base within the city. In addition, while Twitter by itself represents a wide range of activities, it also provides an "app signature" with each message, that is, the credentials of the application that initiated the tweet. Therefore, some of the tweets can be interpreted as reflecting a very specific activity. Our choice of geographical boundaries was motivated by the high penetration of social media, the large population, and the availability of multiple complementary datasets.
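As an illustration of how the "app signature" could be put to use, the following minimal sketch counts tweets per originating application. It assumes that each tweet is available as a dictionary in the Twitter REST API v1.1 JSON layout, where the `source` field holds an HTML anchor naming the application. The function names `app_name` and `count_apps` are hypothetical, and this is not necessarily the processing used in this study.

```python
import re
from collections import Counter

# Assumption: `source` looks like
# '<a href="http://instagram.com" rel="nofollow">Instagram</a>'.
_ANCHOR = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)


def app_name(source):
    """Return the application name found in a tweet's `source` field."""
    if not source:
        return "unknown"
    match = _ANCHOR.search(source)
    # Fall back to the raw string when it is not wrapped in an anchor.
    name = match.group(1) if match else source
    return name.strip() or "unknown"


def count_apps(tweets):
    """Count tweets per originating application (hypothetical helper)."""
    return Counter(app_name(tweet.get("source")) for tweet in tweets)
```

Grouping tweets this way would make it possible to treat, for example, check-in or photo-sharing applications as proxies for distinct activities, provided the application names are mapped to activities by hand.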
In what follows, we provide a comparative study of the Twitter stream for New York City. The focus of our study is on demonstrating that the temporal "signature" of an area can be interpreted and used to explore relationships between urban areas, to detect and evaluate events, and even to represent certain sociodemographic characteristics of the area.
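One possible reading of a temporal "signature" is the normalized distribution of an area's tweets over the hours of the week. The sketch below is an assumption made for illustration only: the 168-bin hour-of-week histogram, the normalization, the cosine similarity measure, and the names `temporal_signatures` and `cosine_similarity` are not taken from this study.

```python
import math
from collections import defaultdict
from datetime import datetime

HOURS_PER_WEEK = 7 * 24


def temporal_signatures(records):
    """Build one signature per area from (area_id, datetime) pairs.

    Each signature is a list of 168 fractions (hour-of-week bins,
    Monday 00:00 first) that sum to 1.
    """
    counts = defaultdict(lambda: [0] * HOURS_PER_WEEK)
    for area_id, timestamp in records:
        counts[area_id][timestamp.weekday() * 24 + timestamp.hour] += 1
    signatures = {}
    for area_id, bins in counts.items():
        total = sum(bins)  # at least 1, since the area has a record
        signatures[area_id] = [b / total for b in bins]
    return signatures


def cosine_similarity(a, b):
    """Cosine similarity between two signatures of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Under these assumptions, areas with similar weekly rhythms would score close to 1, which is one simple way to explore relationships between areas; a sudden departure of an area's recent activity from its usual signature could likewise hint at an event.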
A feed of Twitter data was collected through the official API using an ensemble of custom scrapers. The data were then processed following the procedure described in Appendix I.