2.5. Quality assessment
The risk of bias was assessed based on the OECD “Guidance Document for Describing Non-Guideline In Vitro Test Methods” and the SciRAP tool (Beronius et al., 2018) for evaluating in vitro toxicity studies. Two reviewers (BMF and ETBM) independently assessed the quality of each study. Any discrepancies were resolved by the expert and topic reviewers (IGD and EMD). We assessed the reporting quality, methodological quality, and relevance of the in vitro studies. For reporting and methodological quality, the response options for each signaling question were “Not determined”, “Partially fulfilled”, “Fulfilled”, or “Not fulfilled”. For the relevance criteria, the response options for each item were “Not determined”, “Directly relevant”, “Indirectly relevant”, or “Not relevant”. The color profiles generated from these responses are especially useful for determining where the strengths and weaknesses of a study lie. A numerical score was calculated for reporting quality and for methodological quality. The SciRAP score is calculated as: SciRAP score = [(F + 0.5 × PF)/T] × 100, where F is the number of fulfilled criteria, each multiplied by its individual weight; PF is the number of partially fulfilled criteria, each multiplied by its individual weight; and T is the total number of criteria, each multiplied by its individual weight. In other words, the score is the percentage of fulfilled and partially fulfilled criteria included in the evaluation, taking the weight of individual criteria into account, where partially fulfilled criteria (subjective analysis) contribute half the value of fulfilled criteria. The SciRAP score can range from 0 (all criteria judged as “not fulfilled”) to 100 (all criteria judged as “fulfilled”). The output from the SciRAP evaluation was used to rank the studies according to their relative reliability and to divide them into categories of reliability, e.g. the Klimisch categories “reliable without restrictions”, “reliable with restrictions”, and “not reliable”. The ranking and categorization were done on a case-by-case basis, because SciRAP does not dictate cut-off values or categorization based on SciRAP scores or color profiles.
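The score calculation described above can be illustrated with a short sketch. This is not the SciRAP tool's own implementation; the function name `scirap_score`, the (rating, weight) input format, and the example ratings are hypothetical. It also assumes that criteria rated “Not determined” add nothing to the numerator but remain in the total T, which follows from the definition of T as all criteria but should be checked against the SciRAP documentation.

```python
from typing import Iterable, Tuple

# Rating labels follow the response options listed in the text.
FULFILLED = "Fulfilled"
PARTIALLY_FULFILLED = "Partially fulfilled"
NOT_FULFILLED = "Not fulfilled"
NOT_DETERMINED = "Not determined"
VALID_RATINGS = {FULFILLED, PARTIALLY_FULFILLED, NOT_FULFILLED, NOT_DETERMINED}


def scirap_score(ratings: Iterable[Tuple[str, float]]) -> float:
    """Return the weighted score (0 to 100) for (rating, weight) pairs.

    Hypothetical helper: implements [(F + 0.5 * PF) / T] * 100 as
    described in the text, not the SciRAP tool's actual code.
    """
    f = 0.0   # weighted sum of fulfilled criteria
    pf = 0.0  # weighted sum of partially fulfilled criteria
    t = 0.0   # weighted sum of all criteria
    for rating, weight in ratings:
        if rating not in VALID_RATINGS:
            raise ValueError(f"Unknown rating: {rating!r}")
        t += weight
        if rating == FULFILLED:
            f += weight
        elif rating == PARTIALLY_FULFILLED:
            pf += weight
        # Assumption: "Not fulfilled" and "Not determined" add nothing
        # to the numerator but still count toward T.
    if t == 0:
        raise ValueError("At least one criterion with non-zero weight is required")
    return 100.0 * (f + 0.5 * pf) / t


# Invented example: 10 equally weighted criteria,
# 6 fulfilled, 2 partially fulfilled, 2 not fulfilled.
example = (
    [(FULFILLED, 1.0)] * 6
    + [(PARTIALLY_FULFILLED, 1.0)] * 2
    + [(NOT_FULFILLED, 1.0)] * 2
)
print(scirap_score(example))  # 70.0
```

In this invented example, the two partially fulfilled criteria together count as one fulfilled criterion, so the score is (6 + 1)/10 × 100 = 70. Mapping a score such as this to a reliability category is left to the reviewers, since no cut-off values are prescribed.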