Statistical analyses
The hazards of all-cause mortality associated with HRT were initially estimated by Cox proportional hazards regression model. The outcome was time from study entry to death from any cause in years. The model included second order interaction effects of all variables with main exposure HRT, and interactions of all medical conditions with the lifestyle variables. Backward elimination was applied to select the variables at 5% significance level for the main exposures, and 1% significance level for the interactions. The contribution of the covariates in explaining the variation of the hazard in the Cox regression model was assessed by ANOVA. Grambsch and Therneau’s test33 was performed to check the non-proportionality of hazards at 5% level of significance and was found to be significant. Also, the underlying baseline hazards of this study population were found to follow the Weibull distribution.  Consequently, a model34 which we refer to as Weibull Double-Cox model was fitted to estimate the shape parameters of the variables with time-variant hazards, and the scale effects. In principle, this model replaces the unspecified baseline hazard in the Cox model by Weibull hazard function and incorporates an additional Cox regression term for shape.  General practices were included in the model as a random effect or frailty to account for unobserved heterogeneity of patients between practices. Four separate survival models were also fitted to assess the impact of HRT by 5-year age group at initiation on all-cause mortality. The same sets of explanatory variables were adjusted for in the full-case (all age combined) model and in age subgroup analyses.
There were missing values for smoking, BMI, deprivation, and hypertension status (Table S1). Multilevel multiple imputation (MI) was used to deal with missing data. Ten imputed datasets were generated and analyzed independently for the full-case model as well as for each subgroup model. It is widely accepted that when missingness varies from 10-50%, MI can be used to deal with missing data, and 5-10 imputations are sufficient as having more imputations does not affect the results.35,36 The distributions of the variables with missing records in complete and imputed datasets were similar (Table S2). Rubin’s rules37 were applied to pool the estimated parameters. Complete case analyses were performed to validate the imputation models (Figure S1). The overall performance of the models was assessed by the concordance, and its values of 0.7 in full model, and 0.75−0.81 in the subgroup models indicate a good fit.38Kaplan-Meier (KM) survival analysis techniques were used to analyze the time to diagnosis of some selected medical conditions at follow-up. All analyses were performed in statistical software R version 3.6.1 using the packages “survival”, “MASS”, “rms”, and “jomo”.