Machine Learning and Sociology

Can Machine Learning be used in Sociology?

This question was asked on the Quora website (https://www.quora.com/How-will-machine-learning-change-sociology). This is my answer.

Machine Learning combines statistical analysis with computer-intensive data processing to classify, understand and predict patterns of events in data.

There are many obstacles to the full adoption of computational resources, such as Machine Learning, in Sociology.

The first problem is that Machine Learning belongs to the field of Artificial Intelligence (AI). Sociologists have shown strong resistance to AI as a method applicable to the Social Sciences.

Second, the strength of a Machine Learning application lies in its ability to predict events. Sociologists in general feel that prediction is impossible in the study of social issues.

The third obstacle is the statistical basis of Machine Learning. In recent decades, the training of sociologists in quantitative research has declined. "Mainstream" sociology has resisted the use of data analysis and statistics in social studies, seeing them as "positivist", "empiricist" and "non-critical" approaches. This situation has changed in recent years. With easier access to data and the intensive use of computing, sociologists are showing a growing interest in quantitative and computational methods. One can perceive, both in publications and in academic syllabi, less prejudice among social scientists against the use of quantitative data analysis and statistics. But there is still a lack of knowledge about, and mistrust of, statistical applications in the social sciences.

Some authors point to another obstacle: Sociology is a hypothesis-driven science, while Machine Learning is more appropriate for inductive research. I do not agree, because Sociology is not a hypothesis-driven science (it is driven more by "arguments from authority"), nor is Machine Learning confined to inductive research. Still, this point is often raised when discussing the possibility of applying Machine Learning in Sociology.

These obstacles mean that one should not expect a "paradigm shift" in Sociology any time soon. But it is possible to foresee a growing use of Machine Learning and other computational methods in sociological research in the coming years.

Suppose a researcher is interested in studying the "power elite" using the 11.5 million documents of the Panama Papers (https://panamapapers.icij.org/), released by the International Consortium of Investigative Journalists (ICIJ). The dataset holds about 2.6 terabytes of information. It is impossible to explore this amount of information without the help of a computer. For instance, Machine Learning could be used to identify people and companies by country and to reveal hidden multinational connections among them. In this case, the researcher would benefit from the data mining applications of Machine Learning. But this would not necessarily mean a change in the structure of sociological thinking. It could just provide some extra results to reinforce arguments already endorsed by the canonical literature.

Because of that, Machine Learning is more likely to influence sociological research as a data analysis tool than as a genuine application of Artificial Intelligence to sociological studies.
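To make the Panama Papers example a little more concrete, here is a minimal sketch of the kind of data analysis step I have in mind. It is plain grouping, not a learning algorithm, and everything in it is an assumption for illustration: the record layout (person, company, country), the function name and all the names in the sample data are invented, and none of them comes from the ICIJ files.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (person, company, country where the company is registered).
# Every name below is invented for illustration; none comes from the ICIJ data.
RECORDS = [
    ("Person A", "Alpha Holdings", "Panama"),
    ("Person A", "Beta Trading", "Cyprus"),
    ("Person B", "Beta Trading", "Cyprus"),
    ("Person B", "Gamma Invest", "Brazil"),
    ("Person C", "Delta Shipping", "Panama"),
]


def cross_country_links(records):
    """Map each pair of countries to the people tied to companies in both."""
    # Step 1: collect the set of countries each person is connected to.
    countries_by_person = defaultdict(set)
    for person, _company, country in records:
        countries_by_person[person].add(country)

    # Step 2: every pair of countries shared by one person is a link.
    links = defaultdict(set)
    for person, countries in countries_by_person.items():
        for pair in combinations(sorted(countries), 2):
            links[pair].add(person)
    return dict(links)


links = cross_country_links(RECORDS)
for (country_a, country_b), people in sorted(links.items()):
    print(f"{country_a} <-> {country_b}: {', '.join(sorted(people))}")
```

On real data, a step like this would only be the starting point: the resulting links could feed a network analysis or a clustering method, and the sociologist would still have to interpret what the connections mean.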

The recent interest of sociologists in computational resources and mathematical knowledge is a good sign. Perhaps, as the number of researchers using computational resources gradually increases, sociologists will become developers of new AI-based computational methodologies applied to social studies.

The interdisciplinary field of Computational Social Sciences is a movement in this direction. Researchers around the world are discussing how to understand social issues using AI, Machine Learning, Agent-Based Simulation, Social Network Analysis, Complexity and other computational methods. This field is still in its early stages and raises more questions than it delivers results. But it feeds expectations that the Social Sciences (including Sociology) can follow the advances observed in other sciences in the 21st century.

I suggest reading the paper by Conte, R. et al. (http://link.springer.com/article...) and watching this video by Wallach, H. (https://www.youtube.com/shared?c...)
