7. Future Work
The size and complexity of the dataset presented the greatest difficulty
in building the model. Expanding the dataset is expensive, but it would
likely improve model performance. Since posting time was the only factor
considered besides the users’ posts themselves, it stands to reason that
more information about users would be helpful to
academics as well. The work of Benton et al. [14] on gender
representation as an additional attribute may be useful. Alternatively,
the authors could feed the posts directly to Long Short-Term Memory (LSTM)
networks, which are the main component of most state-of-the-art models
because they can exploit long sequential input data as well as its
ordering, mitigating the vanishing-gradient problem through gated
recurrent connections. The authors could try using word embeddings
averaged across all of a user's posts as input to the logistic
regression model.
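
As a minimal sketch of the last suggestion, the following Python code averages word embeddings over all of a user's posts and passes the resulting vector to a logistic regression classifier. The embedding table, the toy posts and labels, and all function names are hypothetical and serve only as illustration; an actual experiment would use pretrained embeddings and the study's own data.

```python
import math

# Hypothetical 3-dimensional word embeddings; a real system would load
# pretrained vectors instead.
EMBEDDINGS = {
    "happy": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "sad": [-0.7, 0.3, 0.2],
    "tired": [-0.6, 0.4, 0.1],
    "today": [0.0, 0.5, 0.5],
}
DIM = 3


def average_embedding(posts):
    """Average the vectors of all known words across all of a user's posts."""
    total = [0.0] * DIM
    count = 0
    for post in posts:
        for word in post.lower().split():
            vec = EMBEDDINGS.get(word)
            if vec is not None:
                total = [t + v for t, v in zip(total, vec)]
                count += 1
    if count == 0:
        return total  # all-zero vector when no word is known
    return [t / count for t in total]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of the positive class under the logistic model."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights with plain batch gradient descent on the log loss."""
    weights, bias = [0.0] * DIM, 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * DIM, 0.0
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            grad_w = [g + error * x for g, x in zip(grad_w, features)]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy users: each is a list of posts with a made-up binary label.
users = [
    (["happy today", "great"], 1),
    (["great great happy"], 1),
    (["sad today", "tired"], 0),
    (["tired sad sad"], 0),
]
X = [average_embedding(posts) for posts, _ in users]
y = [label for _, label in users]
weights, bias = train_logistic_regression(X, y)
```

Averaging discards word order, which is exactly the information an LSTM would retain, so the two proposals trade simplicity against sequence sensitivity.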