1. Introduction
Depression is a common mental disorder whose importance is sometimes overlooked. If it is not properly managed, it can cause serious problems, incapacity, psychotic episodes, and, in rare cases, suicide. The World Health Organization (WHO) estimates that depression affects more than 264 million people worldwide. As with the majority of mental illnesses, early detection may be very helpful for prevention. As technology develops, people are spending more and more time online, and their behaviour on various websites may reveal a lot about them. Language offers excellent indicators of personality, emotional or social status, and mental health. It should not be surprising that people who are depressed use many words and phrases that convey negative emotions, especially negative adjectives and adverbs such as “lonely” and “sad”. Their use of pronouns such as “me”, “myself”, and “I” is particularly interesting because it suggests that they are frequently more self-focused [1]. Useful information about a person’s mental condition can therefore often be learned from the content of their social media accounts.
However, the technology currently used to address depression is only reactive. Tracking internet users, for instance, can identify some risks, but warnings are raised only once the person is already ill or after something inappropriate or offensive has been posted. The authors believe that a better option for early detection is to apply Natural Language Processing (NLP) techniques to ordinary social network posts. The main goal of this study is to propose a method that uses NLP to assess whether a person is having depressive thoughts or intentions. To this end, the authors extracted the phrases most frequently used by Reddit users suspected of having depressive tendencies, together with other group-specific features. The dataset used in this study was first reported by Losada and Crestani [2].
The dataset, which is described in more detail in Section 3, consists of numerous Reddit comments and posts written by different individuals. The authors applied several Machine Learning (ML) models together with several strategies to this task, which are described more fully in Section 4.
The four main contributions of this paper are as follows:
  1. The authors investigate how certain phrases or phrase combinations can be used to recognise depression.
  2. The authors investigate the effect of comment and post time frequency as a purely functional feature.
  3. The authors investigate the effect of comment and post length when used as the only feature.
  4. The authors investigate the effect of the emotions expressed in user posts as a feature.
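To make the four feature families listed above more concrete, the following is a minimal Python sketch of how such per-user features might be computed. It is an illustration only and not the authors' implementation: the function name `user_features`, the tokenisation rule, the input format (a list of timestamp and text pairs), and the tiny `NEGATIVE_WORDS` lexicon used as a crude stand-in for an emotion feature are all assumptions made for this example.

```python
import re
from collections import Counter
from datetime import datetime
from statistics import mean

# First-person pronouns, following the self-focus observation in the text.
FIRST_PERSON = {"i", "me", "my", "myself", "mine"}
# Tiny illustrative lexicon (an assumption, not a resource used in the paper).
NEGATIVE_WORDS = {"lonely", "sad", "hopeless", "worthless"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def user_features(posts):
    """Compute illustrative features for one user.

    posts: list of (datetime, text) pairs (hypothetical input format).
    """
    posts = sorted(posts, key=lambda p: p[0])
    tokens_per_post = [tokenize(text) for _, text in posts]
    all_tokens = [tok for toks in tokens_per_post for tok in toks]
    total = len(all_tokens) or 1  # avoid division by zero for empty input

    # (1) Phrase features: counts of adjacent word pairs within each post.
    bigrams = Counter(
        pair for toks in tokens_per_post for pair in zip(toks, toks[1:])
    )

    # (2) Posting-time frequency: mean gap in hours between consecutive posts.
    gaps = [
        (later[0] - earlier[0]).total_seconds() / 3600
        for earlier, later in zip(posts, posts[1:])
    ]

    # (3) Post length: mean number of tokens per post.
    lengths = [len(toks) for toks in tokens_per_post]

    return {
        "top_bigrams": bigrams.most_common(3),
        "mean_gap_hours": mean(gaps) if gaps else 0.0,
        "mean_post_length": mean(lengths) if lengths else 0.0,
        "first_person_ratio": sum(t in FIRST_PERSON for t in all_tokens) / total,
        # (4) Crude emotion proxy: share of tokens found in the negative lexicon.
        "negative_word_ratio": sum(t in NEGATIVE_WORDS for t in all_tokens) / total,
    }


if __name__ == "__main__":
    example = [
        (datetime(2020, 1, 1, 10), "I feel sad and lonely"),
        (datetime(2020, 1, 1, 12), "I am by myself"),
    ]
    print(user_features(example))
```

In practice, the emotion feature would more plausibly come from a dedicated sentiment or emotion model than from a hand-written word list; the lexicon here only keeps the sketch self-contained.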
The rest of the paper is structured as follows. Section 2 reviews work related to the proposed approach. Section 3 covers the empirical investigation, the dataset description, and pre-processing. Section 4 gives the problem statement. Section 5 presents the proposed models. Section 6 discusses the results of the proposed technique. Section 7 outlines future research, and Section 8 concludes the paper.