2.5. Quality assessment
The risk of bias was assessed based on the OECD “Guidance Document for
Describing Non-Guideline In Vitro Test Methods” and the SciRAP (Beronius
et al., 2018) tool for evaluating in vitro toxicity studies. Two
reviewers (BMF and ETBM) independently assessed the quality of each
study. Any discrepancies were resolved by the expert and topic reviewers
(IGD and EMD). We assessed the reporting quality, methodological
quality, and relevance of the in vitro studies. For reporting and
methodological quality, the response options for each signaling question
were “Not determined”, “Partially fulfilled”, “Fulfilled”, or “Not
fulfilled”. For the relevance criteria, the response options for each
item were “Not determined”, “Directly relevant”, “Indirectly
relevant”, or “Not relevant”. The resulting color profiles are especially
useful for determining where the strengths and weaknesses of a study lie. A
numerical score for reporting quality and methodological quality was
calculated. The calculation can be summarized as: SciRAP score =
[(F + 0.5 × PF) / T] × 100, where F is the sum of the weights of the
fulfilled criteria, PF is the sum of the weights of the partially
fulfilled criteria, and T is the sum of the weights of all criteria
included in the evaluation. In other words, the score is the percentage of
fulfilled and partially fulfilled criteria included in the evaluation,
taking the weight of individual criteria into account, where partially
fulfilled criteria (subjective analysis) contribute half the value of
fulfilled criteria. The SciRAP score can have a value ranging from 0
(all criteria are judged as “not fulfilled”) to 100 (all criteria are
judged as “fulfilled”). The output from the SciRAP evaluation was
used to rank the studies according to their relative reliability and
to divide them into categories of reliability, e.g., the Klimisch
categories “reliable without restrictions”, “reliable with
restrictions”, and “not reliable”. Ranking and categorization were
done on a case-by-case basis. SciRAP does not dictate cut-off values
or categorization based on SciRAP scores or color profiles.
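
The score calculation described above can be illustrated with a minimal Python sketch. This is not the SciRAP implementation: the function name `scirap_score`, the string labels, and the default of equal weights are assumptions made for illustration. The sketch also assumes that criteria judged “not determined” stay in the denominator T and contribute nothing to the numerator, which the text above does not specify.

```python
# Illustrative sketch of the weighted score: [(F + 0.5 * PF) / T] * 100.
# All names and labels here are hypothetical, not part of the SciRAP tool.

FULFILLED = "fulfilled"
PARTIALLY_FULFILLED = "partially fulfilled"
NOT_FULFILLED = "not fulfilled"
NOT_DETERMINED = "not determined"

VALID_JUDGMENTS = {FULFILLED, PARTIALLY_FULFILLED, NOT_FULFILLED, NOT_DETERMINED}


def scirap_score(judgments, weights=None):
    """Return a score from 0 to 100 for a list of criterion judgments.

    judgments: one label per criterion included in the evaluation.
    weights:   one non-negative weight per criterion (assumed 1.0 if omitted).
    """
    if weights is None:
        weights = [1.0] * len(judgments)
    if len(judgments) != len(weights):
        raise ValueError("judgments and weights must have the same length")
    unknown = set(judgments) - VALID_JUDGMENTS
    if unknown:
        raise ValueError(f"unknown judgment labels: {sorted(unknown)}")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    # T: weighted total of all criteria included in the evaluation.
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one criterion with positive weight is required")

    # F and PF: weighted sums of fulfilled and partially fulfilled criteria.
    fulfilled = sum(w for j, w in zip(judgments, weights) if j == FULFILLED)
    partial = sum(w for j, w in zip(judgments, weights) if j == PARTIALLY_FULFILLED)

    # Partially fulfilled criteria contribute half the value of fulfilled ones.
    # Assumption: "not determined" adds nothing here but is still counted in T.
    return (fulfilled + 0.5 * partial) / total * 100


if __name__ == "__main__":
    example = [FULFILLED, PARTIALLY_FULFILLED, NOT_FULFILLED, FULFILLED]
    print(scirap_score(example))  # equal weights: (2 + 0.5) / 4 * 100
```

With equal weights, two fulfilled criteria, one partially fulfilled criterion, and one unfulfilled criterion give (2 + 0.5)/4 × 100 = 62.5. Raising the weight of a fulfilled criterion raises the score, which is how the weighting lets some criteria count for more than others.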