Statistical analysis
Outcomes are summarized as mean differences (MDs) and risk ratios (RRs) with 95% confidence intervals (CIs). Although our main findings were derived from a frequentist network meta-analysis, a conventional meta-analysis was performed first to briefly compare HIF-PHIs as a whole with ESAs. The overall heterogeneity of effect sizes was tested. If there was substantial between-study heterogeneity (\(I^{2}>50\%\)) in the primary outcome, the mean change in hemoglobin level from baseline, a random-effects model was used; otherwise, a fixed-effects model was used. In addition, Cochran’s Q statistic was calculated under a design-by-treatment interaction random-effects model to assess the consistency of the networks[19-21]. Publication bias was evaluated with funnel plots. Treatments were ranked by their surface under the cumulative ranking (SUCRA) scores, a metric used in network meta-analyses to assess which treatment is likely to be the most efficacious (0: the treatment is certain to be the worst; 1: the treatment is certain to be the best)[22, 23]. The SUCRA score is calculated using the formula:
\begin{equation} \text{SUCRA}_{i}=\frac{\sum_{j=1}^{n-1}\text{cum}_{ij}}{n-1}\nonumber \end{equation}
where \(i = 1, 2, \ldots, n\) indexes the treatments, \(n\) is the number of competing treatments, \(j = 1, 2, \ldots, n-1\) is the rank, and \(\text{cum}_{ij}\) is the cumulative probability that treatment \(i\) is among the \(j\) best treatments. The influence of mean age, sex ratio, and duration of treatment was investigated through subgroup analysis using the Bayesian model. Finally, the network meta-analysis was repeated using the Bayesian model as a sensitivity analysis[24, 25]. All analyses were performed in R 4.2.0 with the packages netmeta (version 2.1-0) and gemtc (version 1.0-1).
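To illustrate the SUCRA formula, the following is a minimal Python sketch. It is not the code used in this analysis, which was run in R. The function name `sucra` and the rank-probability matrix are hypothetical and chosen only for illustration; the sketch assumes that each row holds the probabilities of one treatment taking rank 1 (best) through rank n.

```python
def sucra(rank_probs):
    """Return one SUCRA score per treatment.

    rank_probs[i][j] is the (assumed) probability that treatment i
    takes rank j + 1, where rank 1 is the best of n treatments.
    """
    n = len(rank_probs)
    if n < 2:
        raise ValueError("SUCRA needs at least two treatments")
    scores = []
    for probs in rank_probs:
        if len(probs) != n:
            raise ValueError("each treatment needs one probability per rank")
        cum = 0.0    # cum_ij: probability of being among the j best
        total = 0.0  # running sum of cum_ij over j = 1, ..., n - 1
        for j in range(n - 1):
            cum += probs[j]
            total += cum
        scores.append(total / (n - 1))
    return scores


# Hypothetical rank probabilities for three treatments (illustration only).
example = [
    [0.7, 0.2, 0.1],  # treatment A: most likely to rank first
    [0.2, 0.5, 0.3],  # treatment B
    [0.1, 0.3, 0.6],  # treatment C: most likely to rank last
]
scores = sucra(example)
print([round(s, 2) for s in scores])  # [0.8, 0.45, 0.25]
```

In this made-up example, treatment A has the highest SUCRA score (0.8) and would be ranked first. A treatment that is certain to be the best scores 1, and one that is certain to be the worst scores 0, in line with the definition above.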