Approach

\label{sec:approach}

What is machine learning?

Machine learning can be used to classify data according to features of that data. For example, it is applied to email to identify spam. To filter messages, a classifier builds a model of the features that the messages in each class share. To build this model, it is given a dataset of messages that have already been annotated with the class they belong to. There are two main ways of training a classifier: one-time and iterative. A one-time classifier is trained on sample data once, whereas an iterative classifier periodically takes in new data with which it improves its model. An advantage of iterative, or online, learning is that the model improves over time. This is useful because language changes over time: by periodically re-training, the model picks up these changes and stays more accurate than a model trained only once. Additionally, an automatic classifier can sort a batch of tweets by how confident it is that each one is a bullying message. Presenting the most likely bullying messages first speeds up manual classification.
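The two ideas above, updating a model as new annotated batches arrive and ranking unlabeled messages by classifier confidence, can be illustrated with a minimal sketch. It assumes a multinomial Naive Bayes model over whitespace-separated words, which is chosen here only for brevity and is not necessarily the classifier used in this work. The names `OnlineNaiveBayes`, `update`, `probability`, and `rank_by_confidence`, the labels `"bullying"` and `"other"`, and the example messages are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class OnlineNaiveBayes:
    """Multinomial Naive Bayes whose counts can be updated batch by batch.

    Illustrative only: the classifier type is an assumption, not the
    method described in the text.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing strength
        self.class_counts = Counter()             # annotated messages seen per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()                        # all words seen so far

    @staticmethod
    def tokenize(text):
        # Hypothetical feature extraction: lowercase, split on whitespace.
        return text.lower().split()

    def update(self, messages, labels):
        """Fold a new annotated batch into the existing counts."""
        for text, label in zip(messages, labels):
            tokens = self.tokenize(text)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def log_scores(self, text):
        """Unnormalized log-probability of the text under each class."""
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        tokens = self.tokenize(text)
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n / total)           # class prior
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        return scores

    def probability(self, text, label):
        """Normalized confidence that the text belongs to the given class."""
        scores = self.log_scores(text)
        top = max(scores.values())                # subtract max for numerical stability
        exp = {k: math.exp(s - top) for k, s in scores.items()}
        return exp[label] / sum(exp.values())


def rank_by_confidence(model, messages, label="bullying"):
    """Sort messages so the ones most likely to be in the class come first."""
    return sorted(messages, key=lambda m: model.probability(m, label), reverse=True)


# Initial training on an annotated batch (made-up examples).
model = OnlineNaiveBayes()
model.update(
    [
        "you are so stupid and ugly",
        "nobody likes you loser",
        "see you at practice tomorrow",
        "great game today everyone",
    ],
    ["bullying", "bullying", "other", "other"],
)

# Later, a newly annotated batch refines the same model without retraining from scratch.
model.update(["you are a loser"], ["bullying"])

# Unlabeled messages are ordered for manual review, most likely bullying first.
ranked = rank_by_confidence(model, ["great practice today", "you stupid loser"])
```

Because the model stores only counts, each `update` call is cheap, which is what makes periodic re-training practical in this sketch. A real system would likely need richer features than single words.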

How is it applied and useful here?

We use a machine learning approach to classify bullying messages automatically. We then run experiments to compare and optimize classifiers and to find additional features.