Table 3: Rating system used by Wiley colleagues for analysis (R-score)
To eliminate bias in the rating process, journal identifiers, including subject areas, were not revealed to any of the raters until after the initial qualitative and quantitative analysis had been completed.
To improve inter-rater reliability, each rater flagged answers that needed further discussion with the other raters. In addition to rating answers, the raters also highlighted examples of interesting and exceptional practice that would form the basis of identifying quality peer review. They also highlighted examples of potential obstacles preventing improvements in a given area.
Once all journal answers had been rated, another team member (SP) who had not been involved in the rating process carried out further qualitative and quantitative analysis.
The SA-score for each journal’s response was subtracted from the R-score; the difference enabled us to assess journals’ levels of understanding or awareness, which in turn helped us evaluate how the Self-Assessment is working (Supplementary Table 2). The scores and the differences were assessed by journal subject area, by Essential Area of peer review, and by each question within each Essential Area.
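As a rough illustration of this step, a minimal sketch of the score-difference calculation and the three levels of aggregation is given below. The file name and column names (sa_score, r_score, subject_area, essential_area, question) are assumptions made for illustration, not the study’s actual data layout.

```python
import pandas as pd

# Sketch only: file and column names are hypothetical.
responses = pd.read_csv("self_assessment_scores.csv")

# Difference between the rater-assigned score (R) and the journal's
# self-assessed score (SA); subtracting SA from R as described above.
responses["difference"] = responses["r_score"] - responses["sa_score"]

# Aggregate scores and differences at the three levels of analysis:
# subject area, Essential Area, and individual question within an Essential Area.
cols = ["sa_score", "r_score", "difference"]
by_subject = responses.groupby("subject_area")[cols].mean()
by_area = responses.groupby("essential_area")[cols].mean()
by_question = responses.groupby(["essential_area", "question"])[cols].mean()
```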
We analysed the qualitative responses to determine best practice and obstacles to good practice. To find examples of better peer review, we extracted the highlighted responses with an R-score of three, and to find obstacles to better peer review, we extracted the highlighted responses with an R-score of one. We also extracted answers scored ‘N/A’ to determine how questions might be applicable only to certain subject areas. From this we produced a synthesised set of best practice recommendations for each Essential Area, as well as potential obstacles to good practice in peer review. We published this online at https://secure.wiley.com/better-peer-review, with an interactive infographic to help editors and researchers explore ways in which they can foster and experience better peer review.
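A similarly minimal sketch of the extraction step is shown below. The ‘highlighted’ flag and ‘answer_text’ column are hypothetical names, and ‘N/A’ answers are assumed to be stored as missing R-scores; the study’s actual coding scheme may differ.

```python
import pandas as pd

# Sketch only: file, 'highlighted' flag, and 'answer_text' column are hypothetical.
responses = pd.read_csv("self_assessment_scores.csv")

# Keep only the answers that raters flagged as notable.
highlighted = responses[responses["highlighted"]]

# Highlighted answers scored 3 suggest examples of better peer review;
# highlighted answers scored 1 point to obstacles to better peer review.
best_practice = highlighted.loc[highlighted["r_score"] == 3, "answer_text"]
obstacles = highlighted.loc[highlighted["r_score"] == 1, "answer_text"]

# 'N/A' answers, grouped to see which questions apply only to certain subject areas.
not_applicable = responses[responses["r_score"].isna()]
na_by_question = not_applicable.groupby(["subject_area", "question"]).size()
```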

Results

Quantitative analysis

A total of 132 journals across a range of disciplinary areas completed the Self-Assessment, yielding 6,336 responses to the 48 questions. Each journal took an average of 69 minutes to answer the 50 questions in the Self-Assessment. The subject areas represented by the journals are shown in Table 4.