# PhD Qualifying Exam. Statement of Research & Reading List

My research to date has focused on Computational Social Science, which lies at the intersection of the social sciences and computer science. Computational Social Science is a broad field that applies algorithmic tools and techniques to problems across the social sciences. Within it, my focus is Human Behavior Analysis using data science techniques. Study in this area will enable me to see how different aspects of human behavior can be analyzed more effectively with computational and statistical techniques.
The application of big data tools and techniques grows more important every day, as the volume of information and data available for analysis has grown tremendously in the last decade. As a result, many research projects have employed new and efficient techniques for mining human data. My research focuses on applying these big data tools and techniques to social problems. The problems I have been working on include Customer Churn Prediction using telecommunication call detail record (CDR) data, Migration Trends Analysis using telecommunication CDR data, and Latent Profile Features Prediction using Twitter data.
Human Behavior Analysis using Data Science is a natural progression of my background in Computer Science and of the theme of the projects I have been involved in as a PhD student in the Information School during the last two years. The reading list for my general exam will consist of research material covering major techniques in data science and the application of these techniques to problems in social science.

# TODO

1. Look at http://www.derekruths.com/

2. http://www.barabasilab.com/pubs/CCNR-ALB_Publications/201411-11_SciReports-CloseRelationships/201411-11_SciReports-CloseRelationships.pdf

3. http://www.math.cmu.edu/~ctsourak/int-math-triangles.pdf

4. Look at http://keg.cs.tsinghua.edu.cn/jietang/publications/TKDD12-Lou-Tang-et-al-follow-back-prediction.pdf

5. Read Dong’s paper: http://www3.nd.edu/~ydong1/

7. Email network science and behavior related list to Em.

8. Look at Brian Keegan's work: http://www.brianckeegan.com/

# Foundations of Data Science

1. Hastie et al. (2009). The Elements of Statistical Learning (The Elements of Stati...)

2. Esther Duflo et al. (2006). Using Randomization in Development Economics Research: A Toolkit (Duflo)

3. Rajaraman et al. (2009). Mining of Massive Datasets. (Rajaraman 2009)

4. Yaser S. Abu-Mostafa et al. (2012). Learning from Data (Abu-Mostafa 2012)

5. Tomaso Poggio and Steve Smale (2005). The Mathematics of Learning: Dealing with Data (Poggio 2005)

6. Z. Ghahramani (2004). Unsupervised Learning (Ghahramani 2004)

# Social Applications of Data Science

## Data Science and Measurements

1. P. Deville et al. (2014). Dynamic Population Mapping using Mobile Phone Data (Deville 2014)

2. Joshua Blumenstock et al. (2010). A Method for Estimating the Relationship Between Phone Use and Wealth (Blumenstock 2010)

3. V. Frias-Martinez, Jesus Virseda (2012). On the relationship between socio-economic factors and cell phone usage (Frias-Martinez 2012)

4. A. Llorente et al. (2014). Social Media Fingerprints of Unemployment (Llorente 2015)

5. T. Gutierrez et al. (2013). Evaluating socio-economic state of a country analyzing airtime credit and mobile phone datasets (Gutierrez 2013)

6. N. Eagle et al. (2010). Network Diversity and Economic Development (Eagle 2010)

7. Dong et al. (2014). Inferring User Demographics and Social Strategies in Mobile Social Networks (Dong 2014)

8. Wang et al. (2015). Forecasting Elections with Non-Representative Polls (Wang 2015)

9. C. Smith-Clarke et al. (2014). Poverty on the Cheap: Estimating Poverty Maps Using Aggregated Mobile Communication Networks (Smith-Clarke 2014)

10. Magno, Weber (2014). International Gender Differences and Gaps in Online Social Networks (Magno 2014)

## Migration, Mobility and Epidemiology using Big Data

1. Wesolowski et al. (2015). Impact of human mobility on the emergence of dengue epidemics in Pakistan (Wesolowski 2015)

2. González et al. (2008). Understanding individual human mobility patterns (González 2008)

3. Wesolowski et al. (2013). The impact of biases in mobile phone ownership on estimates of human mobility (Wesolowski 2013)

4. Blumenstock, J.E. (2012). Inferring Patterns of Internal Migration from Mobile Phone Call Records: Evidence from Rwanda (Blumenstock 2012)

5. State, B. et al. (2014). Migration of Professionals to the U.S.: Evidence from LinkedIn Data (State 2014)

6. Zagheni et al. (2014). Inferring International and Internal Migration Patterns from Twitter Data (Zagheni 2014)

7. Wesolowski et al. (2012). Quantifying the Impact of Human Mobility on Malaria (Wesolowski 2012)

8. Balcan et al. (2009). Multiscale mobility networks and the spatial spreading of infectious diseases (Balcan 2009)

9. Ginsberg et al. (2008). Detecting Influenza Epidemics using Search Engine Query Data (Ginsberg 2008)

10. Pervaiz et al. (2012). FluBreaks: Early Epidemic Detection from Google Flu Trends (Pervaiz 2012)

11. Eagle, Pentland (2005). Reality mining: sensing complex social systems (Eagle 2005)

12. Onnela et al. (2007). Structure and tie strengths in mobile communication networks (Onnela 2007)

13. Onnela et al. (2014). Using sociometers to quantify social interaction patterns (Onnela 2014)

14. Ratti et al. (2010). Redrawing the map of Great Britain from a network of human interactions (Ratti 2010)

15. Amini et al. (2014). The Impact of Social Segregation on Human Mobility in Developing and Urbanized Regions (Amini 2014)

16. Onnela et al. (2011). Geographic constraints on social network groups (Onnela 2011)

17. Cattuto et al. (2010). Dynamics of person-to-person interactions from distributed RFID sensor networks (Cattuto 2010)

18. Muchnik et al. (2013). Social Influence Bias: A Randomized Experiment (Muchnik 2013)

19. Aral et al. (2009). Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks (Aral 2009)

20. M. Gomez-Rodriguez et al. (2012). Inferring Networks of Diffusion and Influence (Gomez-Rodriguez 2012)

21. M. Gomez-Rodriguez et al. (2013). Modeling Information Propagation with Survival Theory (Rodriguez 2013)

22. J.E. Blumenstock, N. Eagle (2011). Divided We Call: Disparities in Access and Use of Mobile Phones in Rwanda (Blumenstock 2012a)

23. X. Lu et al. (2012). Predictability of Population Displacement after the 2010 Haiti Earthquake (Lu 2012)

24. Bengtsson et al. (2011). Improved Response to Disasters and Outbreaks by Tracking Population Movements with Mobile Phone Network Data: A Post-Earthquake Geospatial Study in Haiti (Bengtsson 2011)

25. Gao et al. (2014). Quantifying Information Flow during Emergencies (Gao 2014)

26. Bagrow et al. (2011). Collective Response of Human Populations to Large-Scale Emergencies (Bagrow 2011)

27. Wang et al. (2014). Learning to Detect Patterns of Crime (Wang 2013)

## Data Science, Human Behavior and Networks

1. Backstrom et al. (2012). Four Degrees of Separation (Backstrom 2012)

2. Quang Duong et al. (2013). Sharding Social Networks (Duong 2013)

3. Goel, Daniel Goldstein (2014). Predicting Individual Behavior with Social Networks (Goel 2014)

4. Rao et al. (2010). Classifying latent user attributes in twitter (Rao 2010)

5. Pennacchiotti (2010). A Machine Learning Approach to Twitter User Classification (Pennacchiotti 2011)

6. Backstrom, Kleinberg (2014). Romantic Partnerships and the Dispersion of Social Ties: A Network Analysis of Relationship Status on Facebook (Backstrom 2014)

7. Eagle, Pentland (2009). Eigenbehaviors: identifying structure in routine(Eagle 2009)

8. Aral, Walker (2013). Tie Strength, Embeddedness & Social Influence: Evidence from a Large Scale Networked Experiment (Aral)

9. Anderson et al. (2013). Steering User Behavior with Badges. (Anderson 2013)

10. Eagle et al. (2009). Inferring friendship network structure by using mobile phone data (Eagle 2009)

11. Leskovec et al. (2010). Signed Networks in Social Media (Leskovec 2010)

12. Goel et al. (2015). The Structural Virality of Online Diffusion (Goel 2015)

13. Dafna Shahaf, Carlos Guestrin (2012). Connecting the Dots between News Articles (Shahaf 2012)

14. Ugander et al. (2012). Structural Diversity in Social Contagion (Ugander 2012)

## Product Adoption, Churn and Marketing

1. Zhang et al. (2012). Predicting customer churn through interpersonal influence (Zhang 2012)

2. Hill et al. (2006). Network-Based Marketing: Identifying Likely Adopters via Consumer Networks (Hill 2006)

3. Leskovec et al. (2006). The dynamics of viral marketing (Leskovec 2007)

4. Bhagat et al. (2012). Maximizing Product Adoption in Social Networks (Bhagat 2012)

### References

1. Trevor Hastie, Robert Tibshirani, Jerome Friedman. The Elements of Statistical Learning. Springer New York, 2009.

2. Esther Duflo, Rachel Glennerster, Michael Kremer. Using Randomization in Development Economics Research: A Toolkit. SSRN Electronic Journal. Social Science Electronic Publishing.

3. Anand Rajaraman, Jeffrey David Ullman. Mining of Massive Datasets. Cambridge University Press (CUP), 2009.

4. Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin. Learning From Data. AMLBook, 2012.

5. T. Poggio, S. Smale. The Mathematics of Learning: Dealing with Data. In 2005 International Conference on Neural Networks and Brain. Institute of Electrical & Electronics Engineers (IEEE), 2005.

6. Zoubin Ghahramani. Unsupervised Learning. In Advanced Lectures on Machine Learning, 72–112. Springer Science+Business Media, 2004.

7. Pierre Deville, Catherine Linard, Samuel Martin, Marius Gilbert, Forrest R. Stevens, Andrea E. Gaughan, Vincent D. Blondel, Andrew J. Tatem. Dynamic population mapping using mobile phone data. Proceedings of the National Academy of Sciences 111, 15888–15893, 2014.