Methods

Study design and participants

The source data for the present study came from the UK Biobank, the design of which has been described previously[8]. Briefly, individual participant data were collected from 2006 to 2010. Data related to the RRS were derived from touch-screen questionnaires. Our analysis included participants aged 39–71 years and excluded those with missing values for any RRS component or with major illnesses, such as CVD, IHD or stroke. Individuals with self-reported cancer at baseline were also excluded. Long-term follow-up for CVD was performed through hospital admission records and death registers. The Northwest Multicenter Research Ethics Committee granted ethical approval, and all participants provided signed informed consent.

Definition of reproductive risk score

We followed a standard procedure in constructing the RRS. Health-related pregnancy outcomes were considered in the screening process, including sexual intercourse, pregnancy-induced hypertension, gestational diabetes, menstruation, childbearing history, miscarriage, contraceptive use and surgery of the genital tract. We excluded the variables for pregnancy-induced hypertension and gestational diabetes because of their strong correlations with other reproductive factors (i.e. Spearman’s rank correlation coefficient > 0.7). We also excluded single variables with a small sample size (n < 100,000).
After careful consideration and a literature search, we selected 17 questions from the questionnaire to build the RRS, including lifetime number of sexual partners [9], start/end of menstruation [10, 11], reproductive information (number of children [12], childbearing age [13, 14], weight of first child [15]), surgery of the genital tract [16, 17] (hysterectomy and ovariectomy and removal date), abnormal pregnancy events [18] (stillbirth, spontaneous miscarriages, terminations) and contraceptive intake [19]. A risk response to a question scored 1 point; any other response scored 0 points. Questions with a progressive relationship were scored together as a group. For example, question FH4 asks ‘Have you ever had any stillbirths, spontaneous miscarriages or terminations?’, and the numbers of stillbirths, spontaneous miscarriages and terminations recorded in FH4A to FH4C, respectively, have a progressive relationship with FH4; these items were therefore assigned to one group with a score range of 0–3. In summary, the RRS was constructed with a range of 0–16, with higher scores indicating higher reproductive health risk. The 17 variables are listed in Table S1.
To describe the RRS more fully, the cohort was further divided into four groups: low-risk group (RRS: 0–1); low-intermediate group (RRS: 2–3); high-intermediate group (RRS: 4–5); high-risk group (RRS: 6–13). We also defined participants with scores in the top 30% as the unhealthy RRS group. We examined the baseline characteristics of the four groups (Table 1).
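The grouping rule above can be illustrated with a minimal Python sketch. It is not the code used in the study: the item names are hypothetical placeholders rather than UK Biobank field identifiers, and the high-risk group is coded as any score of 6 or above.

```python
from typing import Mapping


def rrs_total(item_scores: Mapping[str, int]) -> int:
    """Sum the points contributed by each RRS item or item group."""
    return sum(item_scores.values())


def rrs_group(score: int) -> str:
    """Map a total RRS to the four risk groups described in the text."""
    if score < 0:
        raise ValueError("RRS cannot be negative")
    if score <= 1:
        return "low-risk"
    if score <= 3:
        return "low-intermediate"
    if score <= 5:
        return "high-intermediate"
    return "high-risk"  # 6 or above


# Hypothetical participant; the keys are illustrative names only.
example_items = {
    "menstruation_item": 1,
    "genital_surgery_item": 0,
    "abnormal_pregnancy_group": 2,  # grouped item scored 0-3
    "contraceptive_item": 1,
}
example_total = rrs_total(example_items)
example_group = rrs_group(example_total)
```

Here the hypothetical participant accumulates 4 points and falls in the high-intermediate group.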

Definition of the healthy lifestyle score

The healthy lifestyle score (HLS), a composite of common lifestyle indicators, was used to explore the extent to which a healthy lifestyle reduces the adverse effects of reproductive deficits. The HLS included five lifestyle factors: body mass index (BMI), physical activity, diet, smoking status and alcohol intake [20]. Each factor was classified as healthy or unhealthy and scored 1 or 0, respectively. BMI < 25 kg/m² was defined as healthy; ≥ 150 minutes of moderate activity per week, ≥ 75 minutes of vigorous activity per week, or a mixture of the two was defined as healthy; a healthy diet was defined as including four or more of vegetables, fruits, fish, processed meat and non-processed meat; never smoking was defined as healthy; alcohol intake ≤ 14 g/day was defined as healthy. A higher total HLS indicated a healthier lifestyle. The participants were also divided into three groups according to HLS: unhealthy lifestyle group (HLS: 0–1), intermediate lifestyle group (HLS: 2–3) and healthy lifestyle group (HLS: 4–5).
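As an illustration of the scoring rule, the following Python sketch computes the HLS for one participant. It is a sketch under stated assumptions, not the study code: the function and argument names are hypothetical, the diet input is taken to be the number of dietary components already judged healthy, and the "mixed" activity criterion, which the text does not specify, is assumed to count each vigorous minute as two moderate minutes.

```python
def healthy_lifestyle_score(bmi: float,
                            moderate_min_week: float,
                            vigorous_min_week: float,
                            diet_components_met: int,
                            never_smoker: bool,
                            alcohol_g_day: float) -> int:
    """Return the HLS (0-5): one point per healthy lifestyle factor."""
    # ASSUMPTION: a mixed pattern is healthy when moderate minutes plus
    # twice the vigorous minutes reach 150 per week.
    active = (moderate_min_week >= 150
              or vigorous_min_week >= 75
              or moderate_min_week + 2 * vigorous_min_week >= 150)
    factors = [
        bmi < 25,                  # BMI < 25 kg/m^2
        active,                    # physical activity
        diet_components_met >= 4,  # four or more dietary components
        never_smoker,              # never smoking
        alcohol_g_day <= 14,       # alcohol intake <= 14 g/day
    ]
    return sum(1 for healthy in factors if healthy)


def hls_group(score: int) -> str:
    """Map the HLS to the three lifestyle groups described in the text."""
    if not 0 <= score <= 5:
        raise ValueError("HLS must lie between 0 and 5")
    if score <= 1:
        return "unhealthy"
    if score <= 3:
        return "intermediate"
    return "healthy"


# Hypothetical participant: healthy BMI, activity and smoking status only.
example_hls = healthy_lifestyle_score(23.0, 160, 0, 3, True, 20.0)
example_hls_group = hls_group(example_hls)
```

The hypothetical participant scores 3 points and is placed in the intermediate lifestyle group.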

Outcomes

The primary outcomes were total CVD, ischemic heart disease (IHD) and stroke. Prevalent and incident cases were classified according to the 9th and 10th revisions of the International Classification of Diseases (ICD9 and ICD10, respectively). Self-reported diseases were also used to identify prevalent cases at baseline. For example, incident CVD was defined by ICD10 codes I20–I23, I24.1, I25, I46 and I60–I64, whereas baseline prevalence of CVD was defined by ICD9 codes 410–414, 429.79 and 430–438 together with self-report codes 1066, 1074, 1075, 1081, 1086, 1491 and 1583. Detailed information on the outcome definitions is provided in Table S2.
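The ICD10 definition of incident CVD given above can be expressed as a simple prefix match. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that recorded codes may appear with or without the decimal point (for example, I24.1 or I241), so dots are removed before matching.

```python
# ICD10 code prefixes for incident CVD, taken from the example in the text:
# I20-I23, I24.1, I25, I46 and I60-I64 (dots removed for matching).
CVD_ICD10_PREFIXES = (
    "I20", "I21", "I22", "I23", "I241", "I25", "I46",
    "I60", "I61", "I62", "I63", "I64",
)


def is_incident_cvd_code(code: str) -> bool:
    """Return True if an ICD10 code falls within the incident CVD definition."""
    # ASSUMPTION: codes may be stored with or without a dot, in any case.
    normalised = code.strip().upper().replace(".", "")
    return normalised.startswith(CVD_ICD10_PREFIXES)


# Hypothetical hospital record codes for one participant.
example_codes = ["I10", "I21.4", "E11.9"]
has_incident_cvd = any(is_incident_cvd_code(c) for c in example_codes)
```

Subcodes such as I21.4 match through their three-character parent, whereas I24.0 does not match because only I24.1 is listed.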

Covariates

Several covariates measured at baseline were included in the analysis. Specifically, we included sociodemographic characteristics (age, level of education and medical record region), lifestyle factors (smoking status, daily alcohol intake and physical activity), anthropometric measurements (systolic blood pressure, diastolic blood pressure and BMI) and family medical history (diabetes, CVD and cancer). Age at recruitment was divided into three groups according to the World Health Organisation criteria (≤ 44, 45–60, ≥ 60 years)[21]. Levels of education included vocational, lower secondary, upper secondary, higher and none of the above. Alcohol intake equivalents were calculated from daily intakes of red wine, champagne, beer, spirits and fortified wine, with 1 alcohol intake equivalent containing 14 g of pure alcohol. Physical activity was expressed in metabolic equivalent of task (MET) units, calculated from walking, moderate physical activity and vigorous physical activity.

Statistical analysis

We used the area under the receiver operating characteristic curve (AUC) to describe the performance of the RRS and to assess improvement in discrimination. The baseline characteristics are described according to RRS group (low-risk group; low-intermediate group; high-intermediate group; high-risk group) as means (standard deviation, SD) for continuous variables and numbers (percentage, %) for categorical variables. To examine the associations of RRS with age and HLS groups in more detail, the mean RRS and the prevalence of the unhealthy RRS group (top 30% of scores) were plotted by age and HLS group. We also examined the proportions of RRS groups according to different baseline characteristics.
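For readers unfamiliar with the AUC, it equals the probability that a randomly chosen case has a higher score than a randomly chosen non-case, with ties counted as one half. The Python sketch below computes it directly from that definition. It is a didactic illustration with hypothetical data, not the routine used in the study, and its pairwise loop would be too slow for a cohort of this size.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the proportion of case/non-case pairs ranked correctly."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for non_case_score in non_cases:
            if case_score > non_case_score:
                concordant += 1.0
            elif case_score == non_case_score:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(non_cases))


# Hypothetical scores (e.g. RRS values) and outcome indicators (1 = event).
example_scores = [1, 3, 5, 2, 6, 2]
example_labels = [0, 0, 1, 0, 1, 1]
example_auc = auc(example_scores, example_labels)
```

In this toy example, 7.5 of the 9 case/non-case pairs are ranked correctly, giving an AUC of about 0.83.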
Follow-up person-years were calculated from enrolment at baseline to the earliest of the incidence date of the outcome, loss to follow-up or end of follow-up (January 31, 2018). Kaplan–Meier survival curves were plotted to compare the cumulative incidence rate (and confidence interval, CI) during follow-up between RRS groups. The Cox proportional hazards model was used to estimate the associations between RRS (as both a categorical and a continuous variable) and the outcomes. We constructed three models: model 0 without adjustment for any covariates; model 1 adjusted for age and record region; model 2 additionally adjusted for BMI (kg/m²), systolic blood pressure (mmHg), diastolic blood pressure (mmHg) and family history (CVD, cancer, diabetes). Multivariable-adjusted population-attributable risk fractions were calculated to estimate the proportions of incident CVD, IHD and stroke attributable to an unhealthy lifestyle (HLS < 4) in different RRS groups. In stratified analysis, we used the likelihood ratio test to test for interactions between RRS and baseline age (≤ 44, 45–60, ≥ 60 years), systolic blood pressure (< 120, 120–140, 140–160, ≥ 160 mmHg), diastolic blood pressure (< 80, 80–90, 90–100, ≥ 100 mmHg), family history (CVD, cancer, diabetes) and BMI (< 18.5, 18.5–25, 25–30, ≥ 30 kg/m²). The population-attributable risk (PAR%) was calculated between RRS and the three outcomes. The relative excess risk due to interaction (RERI) in the additive joint analysis between RRS and HLS was also examined.
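The two summary measures named above can be illustrated with their commonly used definitions. The Python sketch below is an assumption-laden illustration, since the text does not state which formulas were applied: RERI is computed as HR11 − HR10 − HR01 + 1 from the hazard ratios for joint and single exposure, and PAR% uses the case-based (Miettinen) form, in which the proportion of cases exposed is multiplied by (HR − 1)/HR. All function names and input values are hypothetical.

```python
def reri(hr_both: float, hr_a_only: float, hr_b_only: float) -> float:
    """Relative excess risk due to interaction on the additive scale.

    hr_both is the hazard ratio for exposure to both factors, and
    hr_a_only / hr_b_only are those for each factor alone, all relative
    to the doubly unexposed group. RERI > 0 suggests positive interaction.
    """
    return hr_both - hr_a_only - hr_b_only + 1.0


def par_percent(prop_cases_exposed: float, hr: float) -> float:
    """Population-attributable risk (%) using the case-based formula.

    ASSUMPTION: PAR% = 100 * pd * (HR - 1) / HR, where pd is the
    proportion of cases that were exposed and HR is the adjusted
    hazard ratio for the exposure.
    """
    if not 0.0 <= prop_cases_exposed <= 1.0:
        raise ValueError("proportion of cases exposed must be in [0, 1]")
    if hr <= 0.0:
        raise ValueError("hazard ratio must be positive")
    return 100.0 * prop_cases_exposed * (hr - 1.0) / hr


# Hypothetical hazard ratios, not results from the study.
example_reri = reri(hr_both=3.0, hr_a_only=1.5, hr_b_only=2.0)
example_par = par_percent(prop_cases_exposed=0.4, hr=2.0)
```

With these hypothetical inputs the RERI is 0.5 and the PAR% is 20, that is, one fifth of cases would be attributed to the exposure under the usual causal assumptions.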
All statistical analyses were performed in Stata/SE (version 15.0) and R (version 4.1.2). A two-sided p < 0.05 was taken to indicate statistical significance.