Data Source
This investigation draws on a general population health cohort survey conducted across the Beijing-Tianjin-Hebei region between 2020 and 2023 [12]. The study population comprised urban residents of Baoding's Jingxiu District and included 1,853 participants. Eligible participants had lived in the community for at least one year, were 20 years of age or older, and were willing to take part in the study's follow-up activities.
Data Collection Tools and Techniques
Questionnaire
A questionnaire was administered in face-to-face interviews by uniformly trained investigators. It covered gender, age, ethnicity, residence, educational level, personal monthly income, smoking and drinking habits, physical labor and exercise, dietary patterns (intake frequency of eggs, fruits, milk, and dairy products), and, for women, menstrual status [13]. In addition, all participants underwent a physical health assessment that measured height, weight, blood pressure, body composition, and blood biochemistry.
Marital status was coded in four groups: 1 = unmarried; 2 = married; 3 = divorced; 4 = widowed. Monthly income, reported in CNY, was grouped into twelve brackets ranging from less than 1,000 CNY to 15,000 CNY or more. Educational attainment was coded from illiteracy to graduate education: 1 = illiterate; 2 = primary school; 3 = junior high school; 4 = high school; 5 = college and undergraduate; 6 = graduate student. The consumption frequency of eggs, dairy products, and fruits was recorded on five levels, from daily (5–7 days per week) to never.
Blood Pressure
Participants were instructed not to consume alcohol or coffee, smoke, or eat before the blood pressure assessment. Measurements were taken with the participant seated, after a 5-minute rest. Blood pressure was recorded three consecutive times at 1-minute intervals on the right brachial artery using an Omron electronic blood pressure monitor (model HEM-907) [14].
Physical Examination
Height was measured with a stadiometer to the nearest 0.1 cm, and weight was measured with a body composition analyzer (TANITA BC-420) to the nearest 0.1 kg. Measurements were performed by uniformly trained health examination staff. Body mass index (BMI) was calculated as weight in kilograms divided by the square of height in meters (kg/m^2) [15].
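The BMI definition above can be illustrated with a short calculation. The following is a minimal Python sketch, not the authors' code (the study reports using R); the function name `body_mass_index` and the example values are hypothetical and chosen only to show the unit conversion from centimeters to meters.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """Return BMI in kg/m^2 from weight in kilograms and height in centimeters."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # the stadiometer reading is in cm
    return weight_kg / height_m ** 2


# Hypothetical participant: 65.0 kg and 170.0 cm.
example_bmi = body_mass_index(weight_kg=65.0, height_cm=170.0)
print(round(example_bmi, 1))  # 22.5
```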
Blood Biochemistry
Participants were required to fast for at least 8 hours before blood collection. A total of 9 ml of peripheral venous blood was drawn, left to stand for 30 minutes, and centrifuged on site; serum was aliquoted within 4 hours. Samples were then stored at -20°C [16]. Serum biochemistry analyses were performed at the Chinese People's Liberation Army General Hospital.
Data Processing
Preprocessing
Data were preprocessed in R version 4.3.1. Preprocessing covered outlier identification and handling, missing-value management, and the selection and definition of the analysis variables.
The unmarried group consisted of 72 individuals (26 males and 46 females) aged 20 to 64 years, most of them between 20 and 30 years. Because economic status and dietary habits were expected to differ between unmarried participants and those who were married, divorced, or widowed, pooling the unmarried group with the other participants without a spouse (divorced or widowed) was considered unsuitable. The unmarried group was therefore excluded from further analysis.
To simplify interpretation, marital status was consolidated into two categories: normal marriage (married), coded 0, and marital turmoil (divorced or widowed), coded 1.
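The recoding rule can be written out explicitly. The sketch below is in Python for illustration only (the study's preprocessing was done in R). It uses the four questionnaire codes given earlier (1 = unmarried; 2 = married; 3 = divorced; 4 = widowed); the function name `recode_marital_status` and the sample list `raw_codes` are hypothetical.

```python
from typing import Optional

# Questionnaire coding of marital status, as described in the text.
MARITAL_LABELS = {1: "unmarried", 2: "married", 3: "divorced", 4: "widowed"}


def recode_marital_status(code: int) -> Optional[int]:
    """Map the four-level code to the binary analysis variable.

    Returns 0 for normal marriage (married), 1 for marital turmoil
    (divorced or widowed), and None for unmarried participants,
    who are excluded from the analysis.
    """
    if code not in MARITAL_LABELS:
        raise ValueError(f"unknown marital status code: {code}")
    if code == 1:
        return None
    return 0 if code == 2 else 1


# Hypothetical raw codes for five participants (not study data).
raw_codes = [2, 1, 3, 4, 2]
recoded = (recode_marital_status(c) for c in raw_codes)
analysis_codes = [r for r in recoded if r is not None]
print(analysis_codes)  # [0, 1, 1, 0]
```

Returning `None` for unmarried participants keeps the exclusion step visible and separate from the binary coding, rather than silently folding the group into either category.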
After preprocessing, 1,498 participants were included in the final analysis: 459 males (440 in the normal marriage category and 19 in the marital turmoil category) and 1,039 females (903 in the normal marriage category and 136 in the marital turmoil category).
Descriptive Statistics
Continuous variables were summarized as means with 95% confidence intervals, and categorical variables as frequencies and proportions. Group differences were assessed with t-tests for continuous variables and chi-square tests for categorical variables, with statistical significance set at p < 0.05.
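The two summary steps can be sketched in a few lines. This Python example is illustrative and is not the authors' R code. It makes two assumptions that the text does not state: the confidence interval uses a large-sample normal approximation, and the chi-square test is the Pearson test without continuity correction. The function names and the systolic readings are hypothetical; the 2x2 table reuses the sex by marital status counts reported in the preprocessing section purely to show the mechanics.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_with_ci(values, level=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    m = mean(values)
    standard_error = stdev(values) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return m, m - z * standard_error, m + z * standard_error


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table, no continuity correction.

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical systolic readings in mmHg (not study data).
m, lower, upper = mean_with_ci([120, 130, 125, 135, 128])

# Rows: males, females; columns: normal marriage, marital turmoil.
statistic, p_value = chi_square_2x2([[440, 19], [903, 136]])
print(f"mean={m:.1f}, chi2={statistic:.2f}, significant={p_value < 0.05}")
```

For small samples a t-based interval would be wider than the normal approximation used here.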
Construction of SEM
To support the reliability of the analysis, variables were first tested for normality. Univariate normality was assessed with skewness and kurtosis: data were deemed approximately normal if the absolute value of skewness was < 2 and the absolute value of kurtosis minus 3 (excess kurtosis) was < 7. Multivariate normality was assessed with a chi-square plot of Mahalanobis distances; variables were considered to follow a multivariate normal distribution, and thus to meet the prerequisites for SEM, if the plotted points fell largely along a straight line.
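The univariate screening rule can be expressed directly. The sketch below is in Python for illustration (the study used R) and assumes moment-based (population) estimators of skewness and kurtosis; the text does not say which estimator was used, and sample-adjusted versions give slightly different values. The function names and example data are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Return moment-based skewness and (non-excess) kurtosis."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    if m2 == 0:
        raise ValueError("variance is zero; skewness and kurtosis are undefined")
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def approximately_normal(values, max_abs_skew=2.0, max_abs_excess_kurtosis=7.0):
    """Apply the thresholds from the text: |skewness| < 2 and |kurtosis - 3| < 7."""
    skew, kurt = skewness_and_kurtosis(values)
    return abs(skew) < max_abs_skew and abs(kurt - 3.0) < max_abs_excess_kurtosis


# Hypothetical examples (not study data).
symmetric = [1, 2, 3, 4, 5]
heavily_skewed = [0] * 99 + [100]
print(approximately_normal(symmetric))       # True
print(approximately_normal(heavily_skewed))  # False
```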
Latent variables were constructed through exploratory factor analysis (EFA) of the selected variables. Suitability for EFA was confirmed by a Kaiser-Meyer-Olkin (KMO) value > 0.6 and a significant Bartlett's test of sphericity. Factors were extracted by principal component analysis, with the number of factors guided by the inflection point of the scree plot. Varimax rotation was applied, and variables with an absolute factor loading > 0.5 were retained.
This process determined the final latent variables, their names, and their constituent manifest variables. Marital status was included as a manifest variable, and the SEM was fitted by maximum likelihood estimation. Because the male and female sample sizes differed substantially, separate structural equation models were built for males and females to limit confounding by gender. The hypothesized model was informed by prior literature and findings. The SEM was constructed and validated with the 'lavaan' package in R. Differences and effects were considered statistically significant at p < 0.05.