Demographic and Clinical Characteristics. Demographic and clinical characteristics were obtained mostly by self-report using a SWAN-designed questionnaire. One clinical characteristic, body mass index (BMI), was calculated from height and weight measured at each visit using standardized study protocols (Hall et al., 2012). Demographic characteristics included age, race/ethnicity, marital status, level of education, annual household income, employment status, and social support. Clinical characteristics included overall perception of health and BMI.
Menopausal Stage. Menopausal stage was determined from self-reported menstrual bleeding patterns. Peri-menopause was defined as either having a menstrual period in the past 3 months with a change in menstrual cycle regularity in the past 12 months, or having no menstrual period in the past 3 months with intermittent menstrual bleeding within the past 12 months. Post-menopause was defined as having no menstrual period in the past 12 consecutive months (Bromberger et al., 2011; Min et al., 2022).
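The staging rule above can be expressed as a small decision function. The following is a minimal sketch only: the function and argument names are hypothetical (they are not SWAN variable names), and bleeding patterns that the definitions above do not cover are returned as "unclassified".

```python
def menopausal_stage(period_past_3_months: bool,
                     regularity_changed_past_12_months: bool,
                     bleeding_past_12_months: bool) -> str:
    """Classify menopausal stage from self-reported bleeding patterns.

    All argument names are illustrative assumptions, not study variables.
    """
    if period_past_3_months:
        # Period in the past 3 months plus a change in cycle regularity.
        if regularity_changed_past_12_months:
            return "peri-menopause"
        # Regular cycles are not covered by the two definitions in the text.
        return "unclassified"
    # No period in the past 3 months from here on.
    if bleeding_past_12_months:
        # Intermittent bleeding within the past 12 months.
        return "peri-menopause"
    # No bleeding in the past 12 consecutive months.
    return "post-menopause"
```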
Symptoms. A total of fourteen symptoms were selected based on the literature and the symptoms studied in SWAN (Greenblum et al., 2013; Min et al., 2021). Symptoms with a prevalence of less than 20% were excluded to increase the clinical significance of our study findings, as recommended by previous researchers (Albusoul et al., 2017). The final set of symptoms comprised frequent mood changes, anxiety, depression, headache, forgetfulness, trouble sleeping, night sweats, hot flashes, decreased sexual desire, decreased sexual arousal, decreased sexual satisfaction, vaginal dryness, getting up from sleep to urinate, and stiffness.
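The prevalence criterion can be illustrated with a short sketch. It assumes each symptom is available as a list of 0/1 indicators (1 = symptom reported); the function name and data layout are hypothetical and are not part of the SWAN dataset.

```python
def filter_by_prevalence(reports, threshold=0.20):
    """Return the symptoms whose prevalence is at least `threshold`.

    reports: dict mapping symptom name -> list of 0/1 indicators
             (hypothetical layout, one entry per participant).
    """
    kept = []
    for symptom, flags in reports.items():
        prevalence = sum(flags) / len(flags)
        # Symptoms with prevalence below the threshold are excluded.
        if prevalence >= threshold:
            kept.append(symptom)
    return kept
```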
The Center for Epidemiologic Studies Depression (CES-D) scale was used to measure depression; it has been shown to be a reliable and valid tool among midlife U.S. adults, with a Cronbach’s alpha of .90 (Cosco et al., 2017; Jones et al., 2020). Following previous literature, anxiety was measured with a composite score of four anxiety symptoms (irritability, nervousness, feeling fearful, and heart pounding), which has shown good convergent validity with the Generalized Anxiety Disorder (GAD-7) scale (Bromberger et al., 2013; Kravitz et al., 2014). The other symptoms were measured by self-report using a SWAN-designed questionnaire that asked about the frequency of each symptom in the past two weeks. A composite score ranging from 0 (mild) to 3 (severe) was derived to measure the severity of each symptom (Min et al., 2021, 2022).
Data Analysis.
Propensity Score Matching. Midlife peri-menopausal women (n = 1,719) were statistically matched with midlife post-menopausal women (n = 898) using propensity score matching with a ratio of 1:1, which resulted in a matched sample of 898 peri-menopausal and 898 post-menopausal women. The covariates used for propensity score matching were the demographic and clinical characteristics: age, race/ethnicity, marital status, level of education, annual household income, employment status, social support, overall perception of health, and BMI. Using the R-package MatchIt, propensity scores were calculated with logistic regression and matching was performed using the nearest neighbor matching algorithm. Covariate balance was then assessed before and after propensity score matching, using χ² tests for categorical variables and independent t-tests for continuous variables, to verify that matching had balanced the covariates across the two groups.
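To make the matching step concrete, the sketch below shows greedy 1:1 nearest-neighbor matching without replacement on propensity scores that have already been estimated. It is a simplified Python illustration of the general technique, not the MatchIt implementation; the function name, the processing order, and the tie-breaking rule are assumptions made for the example.

```python
def nearest_neighbor_match(focal_scores, pool_scores):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    focal_scores: propensity scores for the smaller group.
    pool_scores:  propensity scores for the larger group.
    Returns a list of (focal_index, pool_index) pairs.
    """
    available = set(range(len(pool_scores)))
    pairs = []
    for i, score in enumerate(focal_scores):
        if not available:
            break  # the pool is exhausted; remaining units stay unmatched
        # Closest still-unmatched unit; ties go to the lowest index (assumption).
        j = min(available, key=lambda k: (abs(pool_scores[k] - score), k))
        pairs.append((i, j))
        available.remove(j)  # each pool unit can be used only once
    return pairs
```

Because every unit in the smaller group receives exactly one partner, matching 898 women against a pool of 1,719 yields 898 pairs, which is consistent with the matched sample sizes reported above.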
Network Analysis. The symptom networks were constructed using the propensity-score matched midlife peri-menopausal and post-menopausal women. The following steps for network analysis were followed: (1) network assessment, (2) network accuracy and stability, and (3) network comparison. R statistical software version 3.6.2 was used to conduct network analysis.
(1) Network Assessment. A Pairwise Markov Random Field (PMRF), which is an undirected network model, was used to construct the network structure (Hevey, 2018; Papachristou et al., 2019). The R-package qgraph was used to estimate partial correlations for both groups. To estimate the network, the graphical least absolute shrinkage and selection operator (LASSO) algorithm with the extended Bayesian Information Criterion (EBIC) was used with a recommended gamma (γ) value of 0.25. This method was selected to minimize the number of spurious edges and potential issues of over-fitting and unstable estimates (Hevey, 2018). LASSO has been recommended for its flexibility with non-continuous data and its low likelihood of false positive edges, and EBIC-based model selection helps enhance the accuracy and interpretability of the constructed networks (Hevey, 2018).
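For reference, the quantities named above can be written out as follows. This is the standard formulation of the graphical LASSO, the EBIC, and partial correlations, given here as a sketch; the symbols are notation introduced for this illustration, and the exact parametrization used by the software may differ.

```latex
% Graphical LASSO: penalized maximum-likelihood estimate of the precision matrix K
\hat{K} = \arg\max_{K \succ 0} \; \log\det K - \operatorname{tr}(SK) - \lambda \sum_{i \neq j} |\kappa_{ij}|

% EBIC used to select the tuning parameter \lambda
\mathrm{EBIC} = -2L + E \log N + 4\gamma E \log P

% Edge weight between symptoms i and j: partial correlation
\rho_{ij} = -\frac{\kappa_{ij}}{\sqrt{\kappa_{ii}\,\kappa_{jj}}}
```

Here S is the sample covariance (or correlation) matrix, κ_ij are the elements of K, L is the log-likelihood of the fitted model, E is the number of non-zero edges, N is the sample size, and P is the number of nodes (the fourteen symptoms). A larger γ penalizes denser networks more heavily, and γ = 0 reduces the criterion to the ordinary BIC.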
Within the constructed network, nodes represent symptoms and edges represent the relationship between two symptoms after conditioning on all other nodes (Papachristou et al., 2019). A blue edge represents a positive relationship and a red edge represents a negative relationship between the two nodes. In addition, the strength of the relationship is indicated by the width of each edge, where a thick edge indicates a strong relationship and a thin edge indicates a weak relationship between the two nodes. To estimate the importance of each node within the network, three centrality measures (strength, betweenness, and closeness) were estimated. Strength measures how strongly a node is directly connected to other nodes, based on the sum of the absolute weights of all edges attached to that node (Hevey, 2018). Betweenness indicates the number of times that a node lies on the shortest path between two other nodes (Hevey, 2018). Closeness reflects how near a node is to all other nodes in the network, based on the inverse of its summed shortest-path distances to them (Hevey, 2018). Nodes with high centrality values were considered to be key symptoms within each network.
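The three centrality measures can be illustrated on a toy network. The sketch below is a simplified Python illustration, not the qgraph implementation: it assumes a connected network with at least two nodes and unique shortest paths, treats the distance along an edge as the inverse of its absolute weight, and uses hypothetical function and variable names.

```python
import math
from itertools import combinations

def centralities(nodes, weights):
    """Strength, closeness, and betweenness for an undirected weighted network.

    nodes:   list of node names.
    weights: dict mapping (node_a, node_b) -> edge weight.
    Assumes a connected network and unique shortest paths.
    """
    strength = {n: 0.0 for n in nodes}
    dist = {a: {b: (0.0 if a == b else math.inf) for b in nodes} for a in nodes}
    for (a, b), w in weights.items():
        if w == 0:
            continue  # a zero weight means the edge is absent
        strength[a] += abs(w)
        strength[b] += abs(w)
        # Stronger edges are "shorter": distance is the inverse absolute weight.
        dist[a][b] = dist[b][a] = 1.0 / abs(w)
    # Floyd-Warshall all-pairs shortest paths.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    # Closeness: inverse of the summed distances to all other nodes.
    closeness = {n: 1.0 / sum(dist[n][m] for m in nodes if m != n) for n in nodes}
    # Betweenness: count the node pairs whose shortest path passes through v.
    betweenness = {n: 0 for n in nodes}
    for s, t in combinations(nodes, 2):
        for v in nodes:
            if v not in (s, t) and math.isclose(dist[s][v] + dist[v][t], dist[s][t]):
                betweenness[v] += 1
    return strength, closeness, betweenness


# Toy chain A - B - C - D; the negative edge contributes its absolute value.
toy_nodes = ["A", "B", "C", "D"]
toy_weights = {("A", "B"): 0.5, ("B", "C"): -0.25, ("C", "D"): 0.5}
toy_strength, toy_closeness, toy_betweenness = centralities(toy_nodes, toy_weights)
```

In this toy chain the two interior nodes have the highest values on all three measures, which is the pattern that would flag them as key symptoms.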
(2) Network Accuracy and Stability. After the network assessment, network accuracy and stability were examined using the R-package bootnet. We calculated 95% confidence intervals to estimate the accuracy of edge weights. Because constructing a confidence interval requires knowledge of the sampling distribution of the estimate, which is often difficult to obtain, we used non-parametric bootstrapping (1,500 bootstrap iterations) to repeatedly re-estimate the model and obtain the required statistics. Then, we conducted case-drop bootstrapping to estimate the correlation stability coefficient (CS-coefficient), which quantifies the stability of the centrality indices. While there is no strict cut-off value for the CS-coefficient, researchers recommend a minimum value of .25 and preferably a value above .50 (Epskamp et al., 2018). Finally, we conducted a bootstrapped difference test with α = .05 based on 1,500 bootstrap iterations to test whether two node strengths or two edge weights differed significantly.
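The idea behind the non-parametric bootstrap can be sketched as follows. This is a deliberately simplified Python illustration and not what bootnet does internally: bootnet re-estimates the full regularized network on each resample, whereas the stand-in statistic here is a plain Pearson correlation between two symptom scores. All function names are hypothetical.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson correlation; returns None if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def bootstrap_ci(x, y, n_boot=1500, seed=0):
    """95% percentile bootstrap interval for the correlation of x and y."""
    if pearson(x, y) is None:
        raise ValueError("correlation is undefined for constant input")
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        # Resample participants (rows) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples
            estimates.append(r)
    # 39 cut points at 2.5% steps; the first and last bound the central 95%.
    cuts = statistics.quantiles(estimates, n=40)
    return cuts[0], cuts[-1]
```

A narrow interval indicates that the edge weight is estimated precisely; a wide interval that spans zero suggests the edge should be interpreted with caution.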
(3) Network Comparison. The R-package NetworkComparisonTest (NCT) was used to compare the symptom network properties between midlife women in peri-menopause and post-menopause. The NCT is a permutation-based hypothesis test that allows a direct comparison of two symptom networks by repeatedly testing for differences between them (1,000 iterations). It tests four invariance hypotheses concerning (1) global strength, (2) network structure, (3) one-to-one edge strength, and (4) a specific centrality measure. The global strength invariance hypothesis assumes the overall connectivity to be identical across subpopulations. The network structure invariance hypothesis assumes the structure to be completely identical across subpopulations. The one-to-one edge strength invariance hypothesis assumes a specific edge to be identical across subpopulations. Last, the specific centrality measure invariance hypothesis assumes that the two networks do not differ significantly in strength centrality (van Borkulo, 2018).
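The permutation logic behind the global strength test can be sketched as below. This is a simplified Python illustration of a generic permutation test, not the NetworkComparisonTest implementation: the network statistic is a stand-in (the sum of absolute pairwise Pearson correlations rather than regularized partial correlations), and the function names and the add-one p-value correction are assumptions made for the example.

```python
import math
import random
from itertools import combinations

def global_strength(rows):
    """Sum of absolute pairwise Pearson correlations across columns.

    Stand-in for the global strength of a regularized partial-correlation network.
    rows: list of equal-length tuples, one per participant.
    """
    total = 0.0
    for a, b in combinations(list(zip(*rows)), 2):
        n = len(a)
        mean_a, mean_b = sum(a) / n, sum(b) / n
        sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
        saa = sum((u - mean_a) ** 2 for u in a)
        sbb = sum((v - mean_b) ** 2 for v in b)
        if saa > 0 and sbb > 0:  # skip constant columns
            total += abs(sab / math.sqrt(saa * sbb))
    return total

def permutation_test(group_a, group_b, statistic, n_perm=1000, seed=0):
    """Two-sided permutation test on the absolute difference of a statistic."""
    rng = random.Random(seed)
    observed = abs(statistic(group_a) - statistic(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        # Randomly reassign participants to the two groups.
        rng.shuffle(pooled)
        diff = abs(statistic(pooled[:n_a]) - statistic(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero (assumption).
    return observed, (extreme + 1) / (n_perm + 1)
```

The same loop applies to the other invariance hypotheses by swapping the statistic, for example the largest absolute difference in edge weights for the network structure test.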