Abstract
Recent literature shows a renewed interest in adopting data analytics to
help solve HR problems and to make more informed and effective choices.
One of the greatest challenges for organizations is
employee turnover because of its adverse impact on many areas, such as
productivity, performance and reputation. In the case of attrition, one
problem is that data from HR information systems (HRIS) are complex and
contain both sensitive information (subject to the GDPR) and irrelevant
data. Once the data are cleaned, they can be analyzed using statistical
approaches such as machine learning. This
study predicts employee attrition by applying machine learning models to
a real dataset from a large Italian financial company and, in
particular, focuses on choosing the best-performing model. This contrasts with much
extant research, which is based on artificial datasets. To address
employee turnover, machine learning tools have been developed for
investigating and predicting attrition, as well as methods for evaluating their
predictive power. Evidence on the most important predictors of attrition
and on the areas where it is most likely to occur enables HR managers to
implement targeted retention policies and practices.
The contribution of this paper is to explore and compare the
performance of several common models found in the literature when
applied to real data. We then focus on the results of the
best-performing model and identify groups of employees at high risk of
attrition, on whom the company could intervene to reduce voluntary
resignations.