Machine learning to classify bullying messages on Twitter

In this study we examine cyberbullying on Twitter. We attempt to automatically classify bullying messages and compare various classification algorithms. We tune the performance of several algorithms and find that oversampling the minority class of bullying messages improves the kappa and accuracy measures of some classifiers.

We also explored several external features; while these did not yield conclusive results, we suggest directions for future work in that area.


Bullying on the internet


Bullying can have serious consequences, including

increased levels of depression, anxiety and psychosomatic symptoms in victims (Kaltiala-Heino, Rimpela, Rantanen, & Rimpela, 2000; Kumpulainen et al., 1998; Neary & Joseph, 1994; Roland, 2002). The bullied students also feel more socially ineffective and have greater interpersonal difficulties (Craig, 1998; Forero, McLellan, Rissel, & Baum, 1999), together with higher absenteeism from school and lower academic competence (Rigby, 1997; Zubrick et al., 1997). However, it is still unclear if these symptoms are antecedents or consequences of bullying (Hodges & Perry, 1999; Roland, 2002). Thus the direction of causality may be both ways (Kaltiala-Heino et al., 2000).

(Campbell, 2005)

Bullying can also take place online. This can take the form of hurtful comments about someone, humiliation, threats, or verbal abuse. The effects of cyberbullying can be the same as, or worse than, those of face-to-face bullying, including increased levels of anxiety and depression (Campbell, 2005).

Cyberbullying can occur on social media platforms such as Facebook, Twitter, and Instagram. Monitoring this behaviour can be valuable, both to gain insight into it and possibly to help prevent it. Recognizing bullying messages with reasonable accuracy and speed opens up opportunities such as notifying the sender before a message is sent, or detecting a spike of bullying in a certain area.

Problems with manual classification


Both manual and automatic approaches to identifying bullying messages are possible. However, manual monitoring is time-inefficient because of the large amount of data involved (roughly 6,000 messages per second on Twitter, according to a Twitter statistics tool consulted on 5 July 2015) and because bullying messages make up only a small portion of all messages on Twitter. Manual classification therefore requires a great deal of work to obtain any meaningful results.

Automatic classification

This task can also be performed automatically, with machine learning algorithms. In machine learning, a classifier is trained on certain features of the data. Based on these features, it builds a model of what a certain class of data 'looks like'; in this case, it could build a model of bullying messages. There are many different algorithms and features to consider.
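To make this concrete, the sketch below trains a simple bag-of-words classifier on a handful of labelled example tweets. The example tweets, their labels, and the choice of a Naive Bayes classifier are illustrative assumptions only, not the setup used in this study.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Hypothetical labelled data: 1 = bullying, 0 = not bullying.
    tweets = ["you are so stupid nobody likes you",
              "great game last night!",
              "go away, everyone hates you",
              "happy birthday, have a great day"]
    labels = [1, 0, 1, 0]

    # Turn each tweet into a bag-of-words feature vector.
    vectorizer = CountVectorizer(lowercase=True)
    X = vectorizer.fit_transform(tweets)

    # Fit a model of what the 'bullying' class looks like.
    classifier = MultinomialNB()
    classifier.fit(X, labels)

    # Classify a new, unseen message.
    new_message = vectorizer.transform(["nobody likes you"])
    print(classifier.predict(new_message))  # e.g. [1] -> classified as bullying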

There are two main ways of training a classifier: one-time (batch) or iterative. The former is trained on sample data once; the latter periodically takes in new data with which it improves its model. An advantage of iterative or online learning is that the model improves over time. This is useful because language changes over time; by periodically re-training, the model stays aware of these changes and remains more accurate.
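A minimal sketch of the iterative approach is given below, assuming batches of newly labelled tweets arrive over time; the hashing vectorizer and SGD classifier are illustrative choices, not necessarily the algorithms compared later.

    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.linear_model import SGDClassifier

    # A stateless vectorizer: new batches need no re-fitting of the feature space.
    vectorizer = HashingVectorizer(n_features=2**18)
    classifier = SGDClassifier()

    # Hypothetical stream of labelled batches (1 = bullying, 0 = not bullying).
    batches = [
        (["you are pathetic", "nice weather today"], [1, 0]),
        (["everyone hates you", "congrats on the new job"], [1, 0]),
    ]

    for texts, labels in batches:
        X = vectorizer.transform(texts)
        # partial_fit updates the existing model instead of training from scratch,
        # so the classifier keeps adapting as language on Twitter changes.
        classifier.partial_fit(X, labels, classes=[0, 1])

The one-time approach, by contrast, would simply call fit once on the full training set and leave the model unchanged afterwards.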

Research questions

  • What are effective features to identify bullying messages?

We will explore particular features of the tweet text as well as tweet metadata, looking for features that automatic analysis alone might not uncover.

  • Are they useful in selecting data from Twitter to train on?

Only a small proportion of all messages on Twitter are bullying messages. If we can filter the data from Twitter before analyzing it, we may retrieve more examples of bullying, which would aid classification.

  • What are good classifier algorithms and parameters for this task?

There are many different classification methods; we can compare them to find the one best suited to this task, as illustrated in the sketch below.
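As an illustration of such a comparison, the sketch below evaluates a few common classifiers with cross-validation and reports accuracy and Cohen's kappa, the measures mentioned in the abstract. The candidate classifiers, the bag-of-words features, and the toy data are assumptions made for illustration, not the final experimental setup.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, cohen_kappa_score
    from sklearn.model_selection import cross_val_predict
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Hypothetical labelled tweets: 1 = bullying, 0 = not bullying.
    tweets = ["you are worthless", "lovely day for a run",
              "nobody wants you here", "see you at the meetup tonight"]
    labels = [1, 0, 1, 0]

    candidates = {
        "naive bayes": MultinomialNB(),
        "logistic regression": LogisticRegression(max_iter=1000),
        "linear svm": LinearSVC(),
    }

    for name, clf in candidates.items():
        pipeline = make_pipeline(CountVectorizer(), clf)
        # Cross-validated predictions, so every tweet is scored by a model
        # that did not see it during training.
        predicted = cross_val_predict(pipeline, tweets, labels, cv=2)
        print(name,
              "accuracy:", accuracy_score(labels, predicted),
              "kappa:", cohen_kappa_score(labels, predicted))

The parameters of the most promising classifiers could then be tuned further, for example with a grid search over their settings.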