Agreement measures
The exact agreement in scores between the consultants, averaged over the three sessions, is provided in Table 2. The overall percentage of observed inter-rater agreement is consistent across sessions, with a mean value of 67.7% on the 5-category scale, increasing to 91.4% on the 3-category scale.
There is greater variability in the performance of the consultants in the 5-category intra-rater study, with the overall percentage agreement for a consultant between the three sessions ranging from 63.9% to 88.9%. The mean intra-rater agreement of the six consultants is 78.3%, with a standard deviation of 9.7%. With the 3-category scale, the mean intra-rater agreement improves by 14.8 percentage points to 93.1%, and the variability in performance between consultants also falls, with the standard deviation of the mean agreement measure reducing 3-fold.
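As an illustration of how an overall percentage agreement of this kind can be computed, the following is a minimal sketch in Python. It assumes that agreement is the share of cases given identical scores by two sets of ratings, averaged over all pairs of rating sets (pairs of consultants for the inter-rater case, pairs of sessions for the intra-rater case). The function names, the toy scores and, in particular, the mapping used to collapse the 5-category scale into 3 categories are hypothetical and are not taken from the study.

```python
from itertools import combinations


def percent_agreement(a, b):
    """Percentage of cases on which two equal-length score lists match exactly."""
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mean_pairwise_agreement(score_lists):
    """Average percent agreement over every pair of score lists."""
    pairs = list(combinations(score_lists, 2))
    return sum(percent_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical collapse of a 5-category scale into 3 categories
# (an assumption for illustration; the study's grouping may differ).
COLLAPSE_5_TO_3 = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def collapse(scores, mapping=COLLAPSE_5_TO_3):
    """Map each 5-category score onto the coarser 3-category scale."""
    return [mapping[s] for s in scores]


# Toy scores for three raters on six cases (invented data).
rater_a = [1, 2, 3, 4, 5, 2]
rater_b = [1, 1, 3, 5, 5, 2]
rater_c = [2, 2, 3, 4, 4, 2]
raters = [rater_a, rater_b, rater_c]

agreement_5 = mean_pairwise_agreement(raters)
agreement_3 = mean_pairwise_agreement([collapse(r) for r in raters])
print(f"5-category: {agreement_5:.1f}%  3-category: {agreement_3:.1f}%")
```

In this toy example every disagreement is between adjacent categories that the hypothetical mapping merges, so the agreement rises when the scale is collapsed, which mirrors the direction of the effect reported above.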
The specific agreement between consultants for each category, averaged over the three sessions, is provided in Table 3.
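For readers unfamiliar with category-specific agreement, the sketch below shows one common two-rater formulation: for category k, twice the number of cases that both raters place in k, divided by the total number of times either rater uses k. Whether the study uses exactly this definition, or a multi-rater generalisation of it, is an assumption here; the function name and toy scores are hypothetical.

```python
from collections import Counter


def specific_agreement(a, b):
    """Per-category specific agreement between two equal-length score lists.

    For category k: 2 * (cases both raters score k) / (uses of k by either rater).
    """
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    both = Counter(x for x, y in zip(a, b) if x == y)  # concordant cases per category
    total = Counter(a) + Counter(b)                    # uses of each category by either rater
    return {k: 2.0 * both[k] / total[k] for k in sorted(total)}


# Toy scores for two raters on six cases (invented data).
scores_a = [1, 2, 3, 4, 5, 2]
scores_b = [1, 1, 3, 5, 5, 2]

per_category = specific_agreement(scores_a, scores_b)
for category, value in per_category.items():
    print(f"category {category}: {value:.2f}")
```

Averaging such per-category values over rater pairs and sessions would give a table of the same shape as the one referred to above, although the study's own averaging procedure is not specified in this section.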