Statistical methods
Because the data are observational, propensity score methods were
chosen to estimate the treatment effect of EA. Prior to any inferential
analysis, multiple imputation with 10 imputed datasets was performed to
handle missing values. For each imputed dataset, a propensity score for
receiving EA was calculated. The propensity score models were based on
factors potentially influencing the decision of obstetric caretakers
whether or not to use EA: maternal age, weight, height, gestational
week, foetal position, gender, year, hospital category, and length and
head circumference of the foetus. Predicted
probabilities of receiving EA from these models were used as propensity
scores in all further analyses. Results from the multiply imputed
datasets were combined using Rubin’s rules as implemented in the R
package mice [16]. For each endpoint, only cases with no missing value
in that endpoint prior to imputation were used.
Linear regression models were used for the continuous endpoints pH, BE
and Apgar scores after 1, 5 and 10 minutes. Logistic regression models
were used for admission to NICU, perinatal mortality, and AS5 < 7. Each
model included the propensity score and EA (yes/no) as covariates.
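The pooling step can be illustrated with a minimal Python sketch of Rubin’s rules for a single regression coefficient. It is not the mice implementation used in the study: the function name pool_rubin and the example numbers are hypothetical, and the degrees-of-freedom calculation that accompanies the pooled variance is omitted.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets (Rubin's rules).

    estimates: per-dataset point estimates of the coefficient
    variances: per-dataset squared standard errors of that coefficient
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


# Hypothetical EA coefficients and squared standard errors from 3 imputations
# (the study used 10); these numbers are made up for illustration.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var ** 0.5)    # pooled estimate and its standard error
```

The total variance adds the between-imputation spread, inflated by 1 + 1/m, to the average within-imputation variance, so that uncertainty due to the missing values is carried into the pooled standard error.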
Since two primary objectives were investigated, Bonferroni correction
for multiple testing was applied with a significance level of 0.05/2 =
0.025. Accordingly, 97.5% confidence intervals for the effect of EA
were reported for the primary objectives. Since p‑values for secondary
objectives served only descriptive purposes, no multiple testing
correction was applied and 95% confidence intervals were reported. As
the duration and mode of delivery as well as an episiotomy may indicate
cases with higher perinatal morbidity, additional multivariable
regression models adjusting for these potential confounders were fitted
for all outcome variables.
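The arithmetic behind the adjusted significance level and the matching confidence level can be sketched as follows. This is an illustrative Python sketch that assumes a Wald interval with a normal reference distribution, which may differ from the reference distribution used for the pooled models in the study; the function names and the example estimate and standard error are hypothetical.

```python
from statistics import NormalDist


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


def wald_ci(estimate, std_error, alpha):
    """Two-sided (1 - alpha) Wald interval, normal approximation (assumption)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * std_error, estimate + z * std_error


alpha_primary = bonferroni_alpha(0.05, 2)   # two primary objectives
level_primary = 1 - alpha_primary           # confidence level for primary objectives

# Hypothetical effect estimate and standard error, for illustration only.
print(level_primary, wald_ci(0.10, 0.04, alpha_primary))
print(0.95, wald_ci(0.10, 0.04, 0.05))      # uncorrected interval, secondary objectives
```

For the same estimate and standard error, the interval at the corrected level is wider than the uncorrected 95% interval, which is the price of controlling the error rate across both primary objectives.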
Differences in rates of higher-degree perineal laceration, duration of
birth and instrumental delivery were reported descriptively. As a
sensitivity analysis, the same analysis strategy (except for the
imputation-related steps) was applied to the original data. This led to
relevant differences in the estimated effect sizes, i.e. the results
appear to depend heavily on the analysis strategy. All analyses were
performed using R (version 3.5.1; R Foundation for Statistical
Computing, Vienna, Austria).
Results