Results
The search strategy identified 1723 citations; after removal of
duplicates and screening, 52 full-text articles were assessed for
eligibility (PRISMA flow diagram, Figure 1). Eleven studies, reporting a
total of 11 final prediction models, were included in this review.
The populations of the included studies are shown in Tables 2 and
3. Four studies included only women with placenta praevia, four
included only vaginal deliveries, two included CS (planned and
unplanned), and one covered the general obstetric population.
The key findings of the studies are detailed in Table 2, including
whether each study is to be interpreted as exploratory (requiring more
research) or confirmatory (of use in clinical practice), as judged by the
primary study authors. All candidate predictors, and the predictors
included in the final published models, are listed in Table 3. The
included studies were set in hospitals across the following
countries: Italy, China, France, United States, United Kingdom, South
Korea, Netherlands, Spain, Zimbabwe, Denmark and Egypt. The study
designs comprised eight cohort studies, of which one used whole-population
registry data, and three case-control studies, of which one was
nested within a population cohort. The number of participants
ranged from 110 in a prospective cohort study to 56,967 in
a retrospective cohort.
Although all studies aimed to predict PPH, the chosen outcomes
differed. Five studies listed PPH or massive haemorrhage as an
outcome, three listed blood transfusion or massive blood
transfusion, two reported postpartum blood loss,
and one used a combined outcome of peripartum complications
encompassing perioperative blood transfusion, uterine artery
embolization or caesarean hysterectomy. The definition and method of
measurement of each outcome also varied, as shown in Table 2.
Study quality, assessed for risk of bias using the CHARMS checklist,
is summarised in Table 4. Overall, there was a high
risk of bias across the studies. The source of data was deemed at
low/moderate risk of bias in eight studies owing to the use of a
retrospective design for measurement of predictors and outcome. Two
studies were at high risk of bias because of a lack of definition or
method of measurement of the outcome to be predicted. Three studies were
at high risk of bias for the candidate predictors, owing to a lack of
definition or to predictors requiring subjective interpretation. Regarding
sample size, six studies were at high risk of bias as a
result of a low number of events per variable (EPV). Risk of bias for
missing data was uncertain for all papers because none reported any
missing data.
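For readers unfamiliar with the EPV criterion above, events per variable is simply the number of outcome events divided by the number of candidate predictors considered, with values below roughly 10 conventionally taken to indicate an inadequate sample. A minimal sketch with invented numbers (not figures from any included study):

```python
# Events-per-variable (EPV) rule of thumb used in the sample-size
# assessment above. The example numbers below are hypothetical.

def events_per_variable(n_events: int, n_candidate_predictors: int) -> float:
    """EPV = outcome events / candidate predictors screened."""
    return n_events / n_candidate_predictors

# A hypothetical study with 40 PPH events and 20 candidate predictors
# gives EPV = 2, well below the conventional minimum of ~10,
# which would flag a high risk of bias for sample size.
print(events_per_variable(40, 20))
```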
Across the 11 studies, a total of 97 unique variables were selected as
candidate predictors (range 5-23 per study), and 56 variables were
selected as predictors in the final models (range 5-15 per study). The
following predictors were found to be predictive in two or more studies:
parity (n=4 studies), low antenatal haemoglobin (n=3), antepartum
haemorrhage/bleed (n=3), maternal age ≥35 years (n=4), high neonatal
weight (n=2), multiple pregnancy (n=2), BMI ≥25 (n=2), previous CS
(n=3), anterior placenta (n=2) and retained placenta (n=2).
The predictive ability of the statistical models was evaluated using
measures of calibration (the agreement between the predicted
probabilities of the outcome and the observed proportions of the
outcome) and discrimination (how well the model differentiates
between high-risk and low-risk patients), reported in four and six
of the 11 studies respectively. Of the four studies reporting
calibration, two used the Hosmer-Lemeshow (H-L) test: Kim et al.
reported good calibration (p=0.44), whereas Rubio-Alvarez et
al. did not report a result. However, the Hosmer-Lemeshow test is not
recommended for assessing calibration because it is difficult to
interpret, indicating neither the direction nor the magnitude of
miscalibration, and has limited power in small samples. Biguzzi
presented a calibration plot demonstrating overall good performance;
however, there was inadequate information on how the curve was
developed. Ahmadzia et al. reported calibration plots and the
association between predicted probability of transfusion and observed
incidence in deciles of the risk-score distribution. However, the
authors neither reported, at the very least, a Hosmer-Lemeshow test nor
demonstrated a suitable calibration plot.14 The calibration plots are
described as curves but display only a point for each decile, without
95% confidence intervals. Ideally, the calibration slope should be
reported alongside a calibration curve showing the non-parametric
relationship between observed outcome and predicted risk.28
Discrimination was reported as the area under the receiver operating
characteristic curve (AUC), where 1 represents perfect discrimination
and 0.5 is no better than a coin toss. The AUC
ranged from 0.70 to 0.90 across studies, as shown in Table 2.
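For readers less familiar with these performance measures, both can be illustrated with a short, self-contained sketch on toy data (all numbers are invented and correspond to no included study): the AUC computed via the Mann-Whitney interpretation (the probability that a randomly chosen event receives a higher predicted risk than a randomly chosen non-event), and calibration checked by comparing observed event rates with mean predicted risk within risk-ordered groups, in the spirit of the decile approach described above.

```python
# Illustrative sketch only; not code from any included study.

def auc(y_true, y_prob):
    """AUC = probability that a random event is assigned a higher
    predicted risk than a random non-event (ties count as half)."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    nonevents = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((e > n) + 0.5 * (e == n) for e in events for n in nonevents)
    return wins / (len(events) * len(nonevents))

def calibration_table(y_true, y_prob, groups=4):
    """Mean predicted risk vs observed event rate per risk-ordered group,
    analogous to the decile comparison described above."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    size = len(order) // groups
    rows = []
    for g in range(groups):
        idx = order[g * size:(g + 1) * size] if g < groups - 1 else order[g * size:]
        expected = sum(y_prob[i] for i in idx) / len(idx)
        observed = sum(y_true[i] for i in idx) / len(idx)
        rows.append((round(expected, 3), round(observed, 3)))
    return rows

# Toy example: a perfectly discriminating, well-calibrated-by-group model.
y = [0, 0, 1, 1]
p = [0.1, 0.2, 0.8, 0.9]
print(auc(y, p))                          # 1.0
print(calibration_table(y, p, groups=2))  # [(0.15, 0.0), (0.85, 1.0)]
```

In practice a proper calibration curve is fitted non-parametrically over the full range of predicted risks, with confidence intervals, rather than summarised at a handful of grouped points.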
Of the 11 studies, four presented validated models deemed by their
primary study authors as ready for use in clinical
practice.14,18,19,22 Ahmadzia et al. presented an
online risk calculator developed in patients who underwent CS, and
Dunkerton et al. presented a decision tree, based on Hothorn et al.'s
non-parametric recursive partitioning algorithm, also developed in women
who underwent CS. Kim et al. presented a scoring system developed in
women with placenta praevia, and Rubio-Alvarez et al. presented an Excel™
risk tool developed in women delivering singletons vaginally. However,
Ahmadzia et al. and Dunkerton et al. did not externally validate their
models, an important requirement before use in clinical
practice.29 On external validation, the models of Kim et al. and
Rubio-Alvarez et al. discriminated well, with AUCs of 0.88 and 0.83
respectively.