KNN algorithm implementation: Tests were carried out with 11 distinct databases. The first of these databases had 3,200 inputs with 4 features each. The other 10 databases were then generated from the possible combinations of those 4 features (the four 3-feature subsets and the six 2-feature subsets) to determine which combination provides the highest assertiveness index (i.e., classification accuracy). The reason is that the KNN classification algorithm usually works better when there is a large separation between classes, and this separation might be greater with 3 features instead of 4, or with 2 features instead of 3. In our case, however, the full set of 4 features provided the best result. Each of these 11 databases has been processed on multiple occasions by the KNN algorithm, varying the number of neighbours used to perform the classification from 3 to 99.
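The following sketch illustrates how the 11 databases could be derived as feature subsets; the arrays X and y are hypothetical placeholders, since the original data is not reproduced here:

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3200, 4))      # hypothetical stand-in for the 3,200 x 4 feature matrix
y = rng.integers(0, 2, size=3200)   # hypothetical class labels

# The full 4-feature database plus every 3- and 2-feature combination:
# C(4,3) = 4 and C(4,2) = 6, giving the 10 derived databases.
subsets = [tuple(range(4))]
for r in (3, 2):
    subsets += list(combinations(range(4), r))

databases = {cols: X[:, list(cols)] for cols in subsets}
print(len(databases))  # 11
```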
A k-fold cross-validation (KFCV) has been used to evaluate the KNN model. Each of the 11 databases has been randomized on 10 distinct occasions. In each of them, the first 80% of the elements has been chosen as the training set and the remaining 20% as the test set. The KNN has been trained on the training set, and the assertiveness index has been obtained by classifying the test set while varying the number of neighbours from 3 to 99 in increments of 2. The process described above has been repeated on 5 occasions, each time choosing a new 20% of each database as the test set and training the KNN with the remaining 80%.
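A minimal sketch of this evaluation protocol, assuming scikit-learn and the hypothetical names above: shuffling the data and then taking five successive disjoint 20% blocks as test sets is equivalent to 5-fold cross-validation on a shuffled dataset, so KFold is used here in place of the paper's manual splitting.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

def evaluate_database(X_sub, y, k_values=range(3, 100, 2), n_randomizations=10):
    """Return mean accuracy per k over 10 randomizations x 5 folds = 50 splits."""
    scores = {k: [] for k in k_values}
    for seed in range(n_randomizations):
        # Each randomization: a fresh shuffle, then 5 folds of 80% train / 20% test.
        folds = KFold(n_splits=5, shuffle=True, random_state=seed)
        for train_idx, test_idx in folds.split(X_sub):
            for k in k_values:  # odd neighbour counts from 3 to 99
                knn = KNeighborsClassifier(n_neighbors=k)
                knn.fit(X_sub[train_idx], y[train_idx])
                pred = knn.predict(X_sub[test_idx])
                scores[k].append(accuracy_score(y[test_idx], pred))
    return {k: float(np.mean(v)) for k, v in scores.items()}
```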
Results: The results obtained for each database over the 10 applications of KFCV are shown in Fig. 3. The database containing all 4 features presents the highest assertiveness index, with an average of 99.02%, obtained when 13 neighbours are used to classify the test set. Since 10 iterations of KFCV have been made, and 5 different training and test sets have been chosen in each, Fig. 3 shows the assertiveness index for each of the 50 tests run on the 4-feature database as the number of neighbours varies.
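Continuing the sketches above, the reported averages could be reproduced by aggregating the 50 accuracies per neighbour count for the 4-feature database and selecting the best k (all names are assumptions carried over from the previous blocks):

```python
# Mean accuracy over the 50 splits for each k, for the full 4-feature database.
mean_acc = evaluate_database(databases[tuple(range(4))], y)
best_k = max(mean_acc, key=mean_acc.get)
print(f"best k = {best_k}, mean accuracy = {mean_acc[best_k]:.2%}")
# The paper reports the maximum at k = 13, with a 99.02% average.
```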