Bug Report Classification with Ensemble Learning for Closed-Source Software
  • Eyüp Halit Yılmaz,
  • Ceyhun Emre Öztürk,
  • Ömer Köksal
Eyüp Halit Yılmaz
ASELSAN AS

Corresponding Author: [email protected]

Ceyhun Emre Öztürk
ASELSAN AS
Ömer Köksal
ASELSAN AS

Abstract

This paper introduces Turkish Software Bug Reports (TSBR), a set of datasets comprising commercial software bug reports from a closed-source project. We investigate and report the statistical properties and classification difficulty of the TSBR datasets. We apply various methods from the text classification literature to several classification tasks related to software development on these datasets. The methods include traditional machine learning (ML) methods such as k-nearest neighbors (KNN) and random forest (RF); sequential deep learning (DL) models such as the gated recurrent unit (GRU) and the convolutional neural network (CNN); transformer-based language models; and ensembles of these models. Our work is among the first efforts in the automated bug report classification literature to use ensembles of DL models.
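
The abstract does not say how the ensembles combine their member models. As a rough illustration only, the sketch below assumes soft voting, one common choice, in which the per-class probabilities from each model are averaged (optionally with weights) and the highest-scoring class is predicted. The label names, the helper functions `soft_vote` and `predict_label`, and the probability values are all hypothetical and are not taken from the paper or the TSBR datasets.

```python
from typing import List, Optional, Sequence

# Hypothetical label set; the abstract does not list the TSBR task labels.
LABELS = ["bug", "enhancement", "question"]


def soft_vote(
    prob_rows: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the weighted average of per-class probabilities.

    prob_rows holds one probability vector per model, all for the same
    bug report and all over the same ordered set of classes.
    """
    if not prob_rows:
        raise ValueError("need probabilities from at least one model")
    n_classes = len(prob_rows[0])
    if any(len(row) != n_classes for row in prob_rows):
        raise ValueError("all models must score the same number of classes")
    if weights is None:
        weights = [1.0] * len(prob_rows)  # plain average by default
    if len(weights) != len(prob_rows):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    return [
        sum(w * row[c] for w, row in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]


def predict_label(
    prob_rows: Sequence[Sequence[float]],
    labels: Sequence[str] = LABELS,
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Pick the label with the highest averaged probability."""
    averaged = soft_vote(prob_rows, weights)
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return labels[best]


# Made-up outputs of three hypothetical models for a single report,
# standing in for e.g. an RF, a GRU, and a transformer-based model.
rf_probs = [0.50, 0.30, 0.20]
gru_probs = [0.20, 0.60, 0.20]
transformer_probs = [0.30, 0.60, 0.10]

print(predict_label([rf_probs, gru_probs, transformer_probs]))
```

With equal weights, the two models that favor the second class outvote the one that favors the first. Passing `weights` lets a stronger model count for more, which is one way an ensemble might be tuned on a validation set.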