Methods
Study design and participants
The source data used in the present study were from the UK Biobank, the
original design of which has been described previously [8]. Briefly, individual participant data were
collected from 2006 to 2010. Data related to the reproductive risk score (RRS) were derived from
touch-screen questionnaires. Our analysis included participants aged
39–71 years and excluded those with either missing values of RRS components or major
illnesses at baseline, such as cardiovascular disease (CVD), ischemic heart disease (IHD) or stroke. Individuals with self-reported
cancer at baseline were also excluded. Long-term follow-up of CVD was
performed through hospital admission records and death registers. The Northwest
Multicenter Research Ethics Committee granted ethical
approval, and all participants provided signed informed consent.
Definition of reproductive risk score
We followed a standard procedure in constructing the RRS. Health-related
pregnancy outcomes were considered in the screening process, including
sexual intercourse, pregnancy-induced hypertension, gestational
diabetes, menstruation, childbearing history, miscarriage, contraceptive
use and surgery of the genital tract. We excluded
pregnancy-induced hypertension and gestational diabetes because of their
strong correlations with other reproductive factors (i.e. Spearman’s
rank correlation coefficient > 0.7). We also excluded
single variables with a small sample size (sample size < 100 K).
After meticulous consideration and literature search, we selected 17
questions from the questionnaire to build the RRS, including lifetime
number of sexual partners [9], start/end of
menstruation [10, 11], reproductive information
(number of children [12], childbearing age [13, 14], weight of first child [15]), surgery of the genital tract [16, 17] (hysterectomy and ovariectomy, and removal
date), abnormal pregnancy events [18] (stillbirth,
spontaneous miscarriages, terminations) and contraceptive intake [19]. For each question, the risk response was
scored as 1 point and any other response as 0 points. Questions
with a progressive relationship were grouped and scored together.
For example, in question FH4, ‘Have you ever had any stillbirths,
spontaneous miscarriages or terminations?’, the number of ‘stillbirths,
spontaneous miscarriages and terminations’ in FH4A to FH4C,
respectively, had progressive relationships with FH4, and so were
assigned to a group with a score range of 0–3. In summary, the RRS was
constructed with a range of 0–16, with higher scores indicating higher
reproductive health risk. The 17 variables are listed in Table S1.
To describe the RRS more fully, the cohort was further divided into four
groups: low-risk group (RRS: 0–1); low-intermediate group (RRS: 2–3);
high-intermediate group (RRS: 4–5); high-risk group (RRS: 6–13). We
also defined the participants with scores in the top 30% as the
unhealthy RRS group. We examined the baseline characteristics among the
four groups (Table 1).
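The grouping described above can be illustrated with a minimal sketch. It assumes the RRS has already been summed to an integer; the function name `rrs_group` is hypothetical, and treating every score of 6 or more as high-risk is an assumption (the text reports the high-risk range as 6–13).

```python
def rrs_group(score: int) -> str:
    """Map a reproductive risk score (RRS) to one of the four groups in the text."""
    if score < 0:
        raise ValueError("RRS cannot be negative")
    if score <= 1:
        return "low-risk"
    if score <= 3:
        return "low-intermediate"
    if score <= 5:
        return "high-intermediate"
    # Assumption: any score of 6 or more is treated as high-risk.
    return "high-risk"


if __name__ == "__main__":
    for s in (0, 2, 5, 9):
        print(s, rrs_group(s))
```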
Definition of the healthy lifestyle score
The healthy lifestyle score (HLS), a composite of various common
lifestyle indicators, was used to explore the extent to which the adverse effects
of reproductive deficits can be reduced. The HLS included five lifestyle
factors: body mass index (BMI), physical activity, diet, smoking status
and alcohol intake [20]. Each aspect was divided
into healthy and unhealthy groups with scores of 1 and 0, respectively.
BMI < 25 kg/m² was defined as healthy; ≥ 150
minutes of moderate activity per week, ≥ 75 minutes of vigorous
activity per week or a mix of the two was defined as healthy; a healthy diet was
defined as including four or more of vegetables, fruits, fish, processed
meat and non-processed meat; never smoking was defined as healthy;
alcohol intake ≤ 14 g/day was defined as healthy. A higher total HLS
indicated better health. The participants were also divided into three
groups according to HLS: unhealthy lifestyle group (HLS: 0–1),
intermediate lifestyle group (HLS: 2–3) and healthy lifestyle group
(HLS: 4–5).
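As a rough illustration of the scoring rules above, the following sketch computes the HLS and its group for one participant. The class and field names are hypothetical, the diet input is simplified to a count of the five diet items that are met, and the reading of "mixed" activity as moderate minutes plus twice the vigorous minutes reaching 150 is an assumption that the text does not state.

```python
from dataclasses import dataclass


@dataclass
class Lifestyle:
    bmi: float                # body mass index, kg/m^2
    moderate_min_week: float  # minutes of moderate activity per week
    vigorous_min_week: float  # minutes of vigorous activity per week
    diet_items_met: int       # number of the five diet items met (0-5)
    never_smoker: bool
    alcohol_g_day: float      # grams of pure alcohol per day


def healthy_lifestyle_score(p: Lifestyle) -> int:
    """Sum the five binary lifestyle components (1 = healthy, 0 = unhealthy)."""
    # Assumption: a "mixed" pattern counts 1 vigorous minute as 2 moderate minutes.
    active = (
        p.moderate_min_week >= 150
        or p.vigorous_min_week >= 75
        or p.moderate_min_week + 2 * p.vigorous_min_week >= 150
    )
    components = [
        p.bmi < 25,
        active,
        p.diet_items_met >= 4,
        p.never_smoker,
        p.alcohol_g_day <= 14,
    ]
    return sum(components)


def hls_group(score: int) -> str:
    """Map an HLS of 0-5 to the three lifestyle groups in the text."""
    if score <= 1:
        return "unhealthy"
    if score <= 3:
        return "intermediate"
    return "healthy"


if __name__ == "__main__":
    person = Lifestyle(23.0, 160, 0, 4, True, 10)
    score = healthy_lifestyle_score(person)
    print(score, hls_group(score))
```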
Outcomes
The primary outcomes were total CVD, ischemic heart disease and stroke.
Prevalent and incident cases were classified according to the
9th and 10th revisions of the
International Classification of Diseases (ICD9 and ICD10, respectively).
Self-reported diseases were also used to identify prevalent cases at
baseline. For example, ICD10 codes I20–I23, I24.1, I25, I46 and I60–I64
defined incident CVD, whereas ICD9 codes 410–414, 429.79 and 430–438
and self-report codes 1066, 1074, 1075, 1081, 1086, 1491 and 1583 defined
the baseline prevalence of CVD. Detailed information on the outcome
definitions is provided in Table S2.
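To make the code-based definition concrete, the sketch below flags ICD10 codes that fall in the incident CVD ranges given in the example above. It covers only those example codes, not the full definitions in Table S2; the function name and the prefix-matching approach are assumptions made for illustration.

```python
# ICD10 prefixes for incident CVD, taken from the example in the text:
# I20-I23, I24.1, I25, I46 and I60-I64 (dots removed for matching).
INCIDENT_CVD_PREFIXES = (
    "I20", "I21", "I22", "I23", "I241", "I25", "I46",
    "I60", "I61", "I62", "I63", "I64",
)


def is_incident_cvd(icd10_code: str) -> bool:
    """Return True if an ICD10 code falls within the example CVD ranges."""
    # Normalise so that "i24.1", "I24.1" and "I241" are treated alike.
    normalized = icd10_code.strip().upper().replace(".", "")
    return normalized.startswith(INCIDENT_CVD_PREFIXES)


if __name__ == "__main__":
    for code in ("I21.4", "I24.1", "I24.0", "I50"):
        print(code, is_incident_cvd(code))
```

Prefix matching means that any subcode of a listed category (for example I21.4 under I21) is counted, whereas I24.0 is not, because only I24.1 is listed.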
Covariates
Several covariates measured at baseline were included in the analysis.
Specifically, we included sociodemographic characteristics (age, level
of education and medical record region), lifestyle factors (smoking
status, daily alcohol intake and physical activity), anthropometric
measurements (systolic blood pressure, diastolic blood pressure and BMI)
and family medical history (diabetes, CVD and cancer). Age at
recruitment was divided into three groups according to the World Health
Organisation criteria (≤ 44, 45–60, ≥ 60 years) [21]. Levels of education included vocational,
lower secondary, upper secondary, higher and none of the above. The
alcohol intake equivalents were calculated by daily intakes of red wine,
champagne, beer, spirits and fortified wine, with 1 alcohol intake
equivalent containing 14 g of pure alcohol. Physical activity was
measured in metabolic equivalent of task (MET) units, which were calculated from
walking, moderate physical activities and vigorous activities.
Statistical analysis
We used the area under the receiver operating characteristic curve (AUC)
and the integrated discrimination improvement to describe the performance
of the RRS. The baseline characteristics are described according to
different RRS groups (low-risk group; low-intermediate group;
high-intermediate group; high-risk group) with the means (standard
deviation, SD) for continuous variables and number (percentage, %) for
categorical variables. To determine the associations of RRS with age and
HLS groups more accurately, the mean RRS and prevalence of unhealthy RRS
groups (top 30% scores) were plotted by age and HLS groups. We also
examined the ratio of RRS groups according to different baseline
characteristics.
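For readers unfamiliar with the AUC, the sketch below computes it directly from its rank-based definition: the probability that a randomly chosen case has a higher score than a randomly chosen non-case, with ties counted as one half. This is a generic illustration on made-up scores, not the software routine used in the study, and the function name is hypothetical.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], noncase_scores: Sequence[float]) -> float:
    """AUC as the share of case/non-case pairs in which the case scores higher."""
    if not case_scores or not noncase_scores:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5  # ties contribute half a pair
    return concordant / (len(case_scores) * len(noncase_scores))


if __name__ == "__main__":
    # Illustrative RRS values only, not study data.
    print(auc([3, 4], [1, 2, 3]))
```

The pairwise loop is quadratic in the sample size, so it suits small illustrations; a cohort-scale analysis would use a rank-based routine from a statistics package.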
Follow-up person-years were calculated from baseline
enrolment to whichever came first: the incidence date of the
outcome, loss to follow-up or the end of follow-up (January 31, 2018).
Kaplan–Meier survival curves were plotted to compare the cumulative
incidence rate (and confidence interval, CI) during follow-up between
different groups of RRS. The Cox proportional hazards model was used to
estimate the associations between RRS (both categorical and continuous
variables) and the outcomes. We constructed three models:
model 0 without adjustment for any covariates; model 1 adjusted for age
and record region; model 2 additionally adjusted for BMI
(kg/m²), systolic blood pressure (mmHg), diastolic
blood pressure (mmHg) and family history (CVD, cancer, diabetes).
The multivariable-adjusted population-attributable risk fraction was used to estimate
the proportions of incident CVD, IHD and stroke attributable to the
unhealthy lifestyle group (HLS < 4) in different RRS groups.
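The follow-up time rule described above (baseline to the earliest of outcome, loss to follow-up or end of follow-up) can be sketched as follows. The function name is hypothetical, and converting days to years by dividing by 365.25 is an assumption; the text does not state the conversion used.

```python
from datetime import date
from typing import Optional, Tuple

END_OF_FOLLOW_UP = date(2018, 1, 31)  # end of follow-up stated in the text


def follow_up(
    baseline: date,
    event_date: Optional[date] = None,
    lost_date: Optional[date] = None,
    end: date = END_OF_FOLLOW_UP,
) -> Tuple[float, bool]:
    """Return (person-years, event observed) for one participant."""
    # Exit at the earliest of outcome, loss to follow-up and end of follow-up.
    exit_date = min(d for d in (event_date, lost_date, end) if d is not None)
    if exit_date < baseline:
        raise ValueError("exit date precedes baseline")
    # Assumption: one year is taken as 365.25 days.
    years = (exit_date - baseline).days / 365.25
    # The outcome counts only if it occurred on or before the other exit dates.
    observed = event_date is not None and event_date == exit_date
    return years, observed


if __name__ == "__main__":
    print(follow_up(date(2008, 1, 31), event_date=date(2012, 1, 31)))
    print(follow_up(date(2010, 1, 31)))
```

The pair returned for each participant (time at risk and event indicator) is the usual input to Kaplan–Meier curves and Cox models.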
In stratified analysis, we used the likelihood
ratio test to test for interactions between RRS and baseline age (≤ 44, 45–60, ≥ 60 years),
systolic blood pressure (< 120, 120–140, 140–160, ≥ 160 mmHg),
diastolic blood pressure (< 80, 80–90, 90–100, ≥ 100 mmHg),
family history (CVD, cancer, diabetes) and BMI (< 18.5,
18.5–25, 25–30, ≥ 30 kg/m²). The population-attributable risk (PAR%) was
calculated between RRS and the three outcomes. The relative excess risk
due to interaction (RERI) in the additive joint analysis between RRS and
HLS was also examined.
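For orientation, the two quantities named above have standard textbook forms, sketched below. The RERI formula is the usual additive-interaction definition, with hazard ratios standing in for relative risks (an assumption). The PAR% function uses Levin's unadjusted formula, which may differ from the multivariable-adjusted estimator used in the study. Function names and input values are illustrative only.

```python
def reri(hr_both: float, hr_rrs_only: float, hr_hls_only: float) -> float:
    """Relative excess risk due to interaction on the additive scale.

    hr_both: hazard ratio with both exposures present (e.g. high RRS and
    unhealthy lifestyle), relative to the group with neither exposure.
    A value above 0 suggests more-than-additive joint effects.
    """
    return hr_both - hr_rrs_only - hr_hls_only + 1.0


def levin_par_percent(prevalence: float, relative_risk: float) -> float:
    """Levin's population-attributable risk, as a percentage."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    excess = prevalence * (relative_risk - 1.0)
    return 100.0 * excess / (1.0 + excess)


if __name__ == "__main__":
    # Made-up inputs, not study results.
    print(reri(3.0, 2.0, 1.5))
    print(levin_par_percent(0.30, 2.0))
```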
All statistical analyses were performed in Stata/SE (version 15.0) and R
(version 4.1.2). A two-sided p < 0.05 was taken to
indicate statistical significance.