3.1.4. Step four
The aim of this step is to obtain a representative average error value for every class in __ds. The neighbors’ summary in (\ref{103702}) is split into \(k\) subsets, where \(k\) is the number of classes in __ds, so that every subset contains only observations belonging to the same class. For each row, the index of the maximum value is taken; if this index differs from the class index, the corresponding observation in __ds is flagged. Each class thus collects the observations that are misclassified according to the largest neighbor count in (\ref{103702}). Next, the sum of the \(n\) variables of each flagged observation is obtained, as in Equation (\ref{eq:4}), and the average of these sums is calculated for each class, as in Equation (\ref{eq:5}). Finally, the per-class average errors collectively form a vector \(v\) of size \(k\). Vector \(v\) is named parameter_5 and is referred to as the err vector. Fig. (\ref{760789}) illustrates the outcome of this step using the showcase example in (\ref{993524}).
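The procedure in this step can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used in this work. It assumes that the neighbors’ summary is a matrix with one row per observation and one column per class (holding neighbor counts), that class labels are the integers \(0,\dots,k-1\), that the sums are taken over the flagged observations only, and that a class with no flagged observation receives an error of 0. The names `err_vector`, `data`, `labels`, and `neighbor_summary` are hypothetical.

```python
from typing import List, Sequence


def err_vector(
    data: Sequence[Sequence[float]],
    labels: Sequence[int],
    neighbor_summary: Sequence[Sequence[int]],
    k: int,
) -> List[float]:
    """Return one average error value per class (hypothetical sketch)."""
    # One list per class, holding the row sums of its flagged observations.
    flagged_sums: List[List[float]] = [[] for _ in range(k)]
    for row, label, counts in zip(data, labels, neighbor_summary):
        # Index of the class with the largest neighbor count
        # (the first such index is taken on ties).
        predicted = max(range(k), key=lambda c: counts[c])
        if predicted != label:
            # Flag the observation and store the sum of its n variables.
            flagged_sums[label].append(sum(row))
    # Average the sums per class; 0.0 if nothing was flagged (assumption).
    return [sum(s) / len(s) if s else 0.0 for s in flagged_sums]


# Toy data: four observations, n = 2 variables, k = 2 classes.
data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
labels = [0, 0, 1, 1]
neighbor_summary = [[3, 0], [1, 2], [2, 1], [0, 3]]
v = err_vector(data, labels, neighbor_summary, k=2)
print(v)  # [7.0, 11.0]
```

In this toy input, the second observation (class 0) and the third observation (class 1) are flagged, so each class average is simply the variable sum of its single flagged observation.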
Fig. (\ref{464572}) summarizes the names and their references that result from the building stage.