KNN algorithm implementation: Tests were carried out with 11 distinct
databases. The first of these databases had 3,200 inputs with 4
features each. The other 10 databases were then generated from the
possible combinations of those 4 features (the 4 three-feature subsets
plus the 6 two-feature subsets) to determine which of them provides the
highest assertiveness index. The motivation is that the KNN
classification algorithm usually works better when there is a large
separation between classes, and this separation might be greater with 3
features instead of 4, or with 2 features instead of 3.
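A minimal sketch of how the 10 additional feature subsets can be enumerated, assuming the 4 features are identified by placeholder names; these names and the variable `subsets` are illustrative, not the authors' code.

```python
from itertools import combinations

FEATURES = ["f1", "f2", "f3", "f4"]  # placeholder names for the 4 features

# All subsets of size 2 and 3: C(4,2) + C(4,3) = 6 + 4 = 10 databases;
# adding the full 4-feature set gives the 11 databases tested.
subsets = [list(c) for r in (2, 3) for c in combinations(FEATURES, r)]
print(len(subsets))  # 10
```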
However, in our case the 4 features provided the best result. Each of
these 11 databases has been processed on multiple occasions by the KNN
algorithm, varying the number of neighbours used to perform the
classification from 3 to 99.
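The sweep over the number of neighbours could look like the following sketch, using scikit-learn's KNeighborsClassifier; the arrays X and y and the random data are placeholders standing in for one of the 11 databases.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data: 3,200 inputs with 4 features each, plus class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(3200, 4))
y = rng.integers(0, 2, size=3200)

# One 80% / 20% train/test split, as described in the text.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Sweep k = 3, 5, ..., 99 and record the test-set accuracy for each value.
scores = {}
for k in range(3, 100, 2):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    scores[k] = clf.score(X_te, y_te)
```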
A K-fold cross-validation (KFCV) has been used to evaluate the KNN
model. Each of these 11 databases has been shuffled on 10 distinct
occasions. In each of them, the first 80% of the elements has been
chosen as the training set and the remaining 20% as the test set. The
KNN has been trained on the training set, and the assertiveness index
has been obtained by classifying the test set while varying the number
of neighbours from 3 to 99 in increments of 2. The process described
above has been repeated on 5 occasions, choosing in each of them a new
20% of each database as the test set and training the KNN with the
remaining 80%.
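One way to reproduce this 10 x 5 evaluation scheme is scikit-learn's RepeatedKFold, which reshuffles the data before each 5-fold pass so that every fold serves once as the 20% test set; the arrays X and y are the same placeholders as in the previous sketch.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(3200, 4))       # placeholder database
y = rng.integers(0, 2, size=3200)    # placeholder class labels

# 10 shufflings x 5 folds = 50 train/test pairs, each fold holding 20%.
cv = RepeatedKFold(n_splits=5, n_repeats=10, random_state=0)

# accuracies[k] collects the 50 test accuracies obtained with k neighbours.
accuracies = {k: [] for k in range(3, 100, 2)}
for train_idx, test_idx in cv.split(X):
    for k in accuracies:
        clf = KNeighborsClassifier(n_neighbors=k)
        clf.fit(X[train_idx], y[train_idx])
        accuracies[k].append(clf.score(X[test_idx], y[test_idx]))
```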
Results: The results obtained for each of the databases after applying
KFCV on 10 occasions are shown in Fig. 3. The database that contains
the 4 features presents the highest assertiveness index, with an
average of 99.02%, obtained when 13 neighbours are used to classify
the test set. Ten iterations of KFCV have been run, and in each of
them 5 different training and test sets have been chosen, for a total
of 50 tests. Fig. 3 shows the assertiveness index for each of the 50
tests performed on the database that contains the 4 features, varying
the number of neighbours.
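Assuming the accuracies dictionary from the previous sketch, the per-k average over the 50 tests and the best number of neighbours can be obtained with a short aggregation step; the reported figures (k = 13, 99.02%) come from the authors' data, not from this placeholder example.

```python
import numpy as np

# Mean accuracy over the 50 tests for every k, then the best value of k.
mean_acc = {k: float(np.mean(v)) for k, v in accuracies.items()}
best_k = max(mean_acc, key=mean_acc.get)
print(best_k, mean_acc[best_k])
```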