Introduction
The first attempts to use machine learning for software development quality assurance were made in the early 1990s (Munson et al. 1992) \cite{Munson1992}. Since then, the approach has gradually improved. Why, then, has it not yet gained wider popularity in commercial projects? One reason is the wide variety of data needed to perform a prediction, as well as the many different data sources encountered in commercial software development projects. Until recently, running a prediction on a chosen project required time-consuming preparation of a dedicated, additional program (or programs) that acquired the desired data and then put it through a suitable preprocessing procedure. One obstacle was the need to gather data from varying sources (tools, processes, methodologies, etc.), with the data itself differing in size and units of measurement. Moreover, different machine learning mechanisms differ in effectiveness depending on the (experimentally selected) choice of data sources and data. Software defect prediction has therefore been considered too complex, too costly, and too time-consuming a process, and there has been no known solution that wraps it into a single, universal defect prediction application usable across different projects.