Exploring Young Adulthood Psychopathology Networks Related to Depression Using Partial Correlation Network and Bayesian Network


Characterizing depression in young adulthood is complicated because it is entangled with a broad spectrum of symptoms as well as traumatic experiences during development. However, previous symptom network studies have focused on undirected transdiagnostic associations among depression and anxiety symptoms. Our study investigated both undirected and directed connections among variables potentially associated with depression, such as anxiety, addiction, subjective distress caused by traumatic events, perceived emotional adversities, and support systems. Both regularized partial correlation network analysis and Bayesian network analysis were applied to 579 subjects screened for depression. Anxiety-related symptoms acted as hub nodes in both the partial correlation network and the Bayesian network. The vulnerability analysis of the partial correlation network showed that verbal abuse, social anxiety, concentration problems, and suicidal ideation had the strongest influence on changes in the network’s topology. In the Bayesian network analysis, loss of interest, depressed mood, and parental verbal abuse were located as parent nodes in the directed acyclic graph. From a disease network perspective, mental health management for young adults should attend not only to depressive symptoms but also to specific variables spanning multiple domains.


Introduction
A characteristic feature of the psychopathology of depression in young adulthood is a broad spectrum of symptoms. A family history of mood disorders, a future diagnosis of bipolar disorder, and accompanying symptoms such as substance abuse 1 as well as irritability and anxiety 2 are also known to be more common in young adults than in other age groups.
Therefore, when screening for depression in young adulthood, it is difficult for clinicians to determine whether particular symptoms are due to depression or to other mental disorders, including generalized anxiety disorder, social anxiety disorder, or substance use disorder 3. There are important developmental tasks to accomplish in the transitional period between the ages of 20 and 30. However, experiencing depression during this transitional period is likely to impair the functioning needed to accomplish these tasks 4. Functional impairments during this period appear to persist for a substantial time after the remission of depression 5,6. In addition, emotional adversities experienced before young adulthood cause structural [7][8][9], functional 10,11, and neurochemical 12 alterations in the brain. Therefore, various variables consisting of trans-diagnostic symptoms, related behavioral patterns, and current and past stresses may be intricately intertwined in young adulthood depression. Those variables can be connected in a subjective causal space (e.g., the experience of being criticized by one's parents as a child seems to make one nervous when a stressful situation arises). However, past psychopathology studies did not make clear how these variables are connected, and this uncertainty has confused clinicians. This is partly due to our poor understanding of how variables that can potentially affect depression are associated in certain age groups.
Network analysis does not treat observed variables merely as phenotypes of unobservable latent variables 13. Instead, it interprets phenotypes as fundamental components of a disorder. From the network perspective, episodes of mental disorder arise from interactions between symptoms 14,15. In the same vein, comorbidity of mental disorders arises from interactions between the entangled networks of psychopathologies. We considered a network-based method suitable for studying the relationships among psychopathologies related to depression in young adulthood, which can present with a wide range of symptoms. There have been several network analysis studies of psychopathology in young adulthood. However, most were conducted on general population groups, such as college students 16, or on the general population with a focus on eating disorders 17. Our previous study 18, which used the same sample, mainly examined the effect of verbal abuse on psychiatric symptoms. Other studies have investigated the symptom network of depression, but they were limited to networks of depression or anxiety symptoms only 19,20. In this study, by contrast, we focused on the high depression symptom (high-DS) group selected from the parent group. Through network analysis, we aimed to identify the associations of not only symptoms of young adulthood depression, but also other variables that may be related to it, such as social anxiety disorder, substance abuse, the number of mentors, the number of concerns, and experience of verbal abuse by parents.
In this study, we sought to determine how psychopathologies and potentially related variables (including psychological and environmental factors) are associated in the young adulthood age group, and how varied the symptoms of depressive disorder in young adulthood appear from a network analysis perspective. For that purpose, we conducted a network analysis of questionnaires on psychopathologies completed by 5,615 college students and by the 579 subjects among them who were screened for depression. To the best of our knowledge, this is the first study to explore the psychopathology of depression and potentially related variables associated with depressive symptoms both in the young adulthood age group and in the subgroup screened for depression based on the Patient Health Questionnaire-9 (PHQ-9) 21, which is widely used in the screening and diagnosis of depression 22.

Participants
We used data from self-reported questionnaires, which were part of the annual mental health screening of 5,685 college students at Korea Advanced Institute of Science and Technology (KAIST) in Daejeon, South Korea, between April 2014 and February 2015. Respondents knew that their responses would not be used for any purpose other than self-affirmation of their mental health. From the gathered data, the responses of 5,615 students between the ages of 18 and 30 were used, and samples with missing data were excluded. We used the PHQ-9 scores to define the subjects screened for depression. The group was defined as respondents who had PHQ-9 scores of five or more and reported functional impairment due to the depressive symptoms assessed by the PHQ-9 21. The group was expected to correspond to mild or more severe depression, herein called the high depression symptom severity group (high-DS group). Among the 5,615 subjects, 579 were classified into the high-DS group.
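The screening rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `is_high_ds` and its arguments are hypothetical, the 0 to 3 item scoring is the standard PHQ-9 scoring, and the functional impairment response is reduced to a single yes/no flag.

```python
def is_high_ds(phq9_item_scores, reports_functional_impairment, cutoff=5):
    """Return True if a respondent meets the high-DS rule sketched in the text.

    phq9_item_scores: nine item scores, each 0-3 (standard PHQ-9 scoring).
    reports_functional_impairment: hypothetical boolean flag for the
        functional impairment question.
    cutoff: total score threshold ("five or more" in the text).
    """
    if len(phq9_item_scores) != 9:
        raise ValueError("the PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in phq9_item_scores):
        raise ValueError("each PHQ-9 item is scored 0, 1, 2, or 3")
    total = sum(phq9_item_scores)
    return total >= cutoff and bool(reports_functional_impairment)


# Hypothetical respondents, not data from the study.
at_cutoff = is_high_ds([1, 1, 1, 1, 1, 0, 0, 0, 0], True)
below_cutoff = is_high_ds([1, 1, 1, 1, 0, 0, 0, 0, 0], True)
no_impairment = is_high_ds([3] * 9, False)
```

Both conditions must hold, so a respondent with a high total score but no reported impairment is not assigned to the group.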

Measures
The survey for this study was conducted through an online website while ensuring anonymity and confidentiality. We surveyed various types of variables that could be associated with depression in the students of KAIST and used the results as nodes in our network analysis: Gender; smoking history (Smoking); alcohol abuse (CAGE) 23; the number of concerns (Concerns); the number of mentors within and outside the family (Ment_intrafam, Ment_extrafam); the Smartphone Addiction Scale (SAS) 24; the Generalized Anxiety Disorder-7 (GAD) 25; the Liebowitz Social Anxiety Scale (LSAS) 26; the experience of verbal abuse from parents, superiors, and peers (VA_parents, VA_superiors, VA_peers) 27; the Impact of Event Scale-Revised (IESR) 28; and the 9 items of the Patient Health Questionnaire-9 (PHQ_01, PHQ_02, PHQ_03, PHQ_04, PHQ_05, PHQ_06, PHQ_07, PHQ_08, PHQ_09) 21.
To focus on the high-DS group, whose sample size was only about 10% of the parent group, we used the total score of each measure, except for the PHQ-9, whose items were entered individually. This differs from our previous study 18, which constructed a network using all individual items. The details of the variables we used are presented in Table 1.
In the first analysis, a graphical Gaussian model was used to construct the networks, in which the edges represent partial correlations between nodes after controlling for the effects of all other nodes. To construct the regularized partial correlation networks, we used the graphical LASSO algorithm 29 implemented in the EBICglasso function of the R package qgraph 30. The network construction process is briefly summarized as follows. Because the LASSO algorithm makes the network parsimonious while maintaining the partial correlation network's explanatory power, the LASSO penalty was used to shrink edges with negligibly small partial correlation values to zero. Using model comparison with the extended Bayesian information criterion, we then found the optimal tuning parameter (λ of the LASSO penalty) and built the parsimonious network that best describes the data using that parameter. We set the hyperparameter (γ) to 0.5 31. The degree of association between two connected nodes was indicated by the thickness of the edge, and the sign of the partial correlation was indicated by the color of the edge.
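The two quantities behind this procedure can be illustrated with a small sketch. It is not the EBICglasso implementation: no LASSO penalty is applied, the covariance matrix is a toy example rather than study data, and the function names are hypothetical. It shows only how partial correlations follow from the inverse covariance (precision) matrix, and the commonly used form of the extended Bayesian information criterion, in which γ = 0 reduces to the ordinary BIC.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(covariance):
    """Partial correlation of i and j given all other variables."""
    precision = invert(covariance)
    n = len(precision)
    return [[1.0 if i == j else
             -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]


def ebic(log_likelihood, n_edges, n_obs, n_nodes, gamma=0.5):
    """Extended BIC; lower is better, gamma=0 gives the ordinary BIC."""
    return (-2.0 * log_likelihood + n_edges * math.log(n_obs)
            + 4.0 * gamma * n_edges * math.log(n_nodes))


# Toy correlation matrix for a chain X - Y - Z (illustrative values only):
# X and Z are correlated marginally but not once Y is controlled for.
toy = [[1.0, 0.5, 0.25],
       [0.5, 1.0, 0.5],
       [0.25, 0.5, 1.0]]
pcor = partial_correlations(toy)
```

In the toy example the marginal correlation between the first and third variables is 0.25, but their partial correlation is zero, which is why such an edge would be absent from a partial correlation network.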
Furthermore, we estimated the strength, closeness, and betweenness centrality to explore each node's importance in the obtained network. Strength was computed by summing the absolute weights of the edges connected to each node. Closeness was computed as the inverse of the sum of the lengths of the shortest paths between each node and all other nodes. Betweenness was computed by counting the number of times a specific node appeared on the shortest path connecting each possible pair of other nodes. We assumed that nodes with high values of these metrics would play a more important role in the network. In addition, we used the R package bootnet 32 to measure the stability of the network. Using 1,000 bootstrap samples, we estimated metrics including the confidence interval of each edge's strength.
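As an illustration of the three metrics, the sketch below computes them for a toy weighted network. Several choices are assumptions rather than details taken from the analysis: the length of an edge is taken as the inverse of its absolute weight, betweenness counts each node pair once (which is exact only when shortest paths are unique), and the node names and weights are invented.

```python
def centrality(nodes, weights):
    """Strength, closeness, and betweenness for an undirected weighted network.

    nodes: list of node names.
    weights: dict mapping (node_a, node_b) to an edge weight.
    Edge length is assumed to be 1/|weight| (stronger edge = shorter path).
    """
    inf = float("inf")
    dist = {a: {b: (0.0 if a == b else inf) for b in nodes} for a in nodes}
    strength = {v: 0.0 for v in nodes}
    for (a, b), w in weights.items():
        strength[a] += abs(w)
        strength[b] += abs(w)
        if w != 0:
            dist[a][b] = dist[b][a] = min(dist[a][b], 1.0 / abs(w))

    # Floyd-Warshall all-pairs shortest paths.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]

    closeness = {}
    for v in nodes:
        total = sum(dist[v][u] for u in nodes if u != v)
        closeness[v] = 0.0 if total in (0.0, inf) else 1.0 / total

    # Count the pairs (s, t) whose shortest path passes through v.
    betweenness = {v: 0 for v in nodes}
    for i, s in enumerate(nodes):
        for t in nodes[i + 1:]:
            if dist[s][t] == inf:
                continue
            for v in nodes:
                on_path = abs(dist[s][v] + dist[v][t] - dist[s][t]) < 1e-9
                if v not in (s, t) and on_path:
                    betweenness[v] += 1
    return strength, closeness, betweenness


# Toy chain A - B - C with invented weights.
toy_nodes = ["A", "B", "C"]
toy_weights = {("A", "B"): 0.5, ("B", "C"): 0.25}
strength, closeness, betweenness = centrality(toy_nodes, toy_weights)
```

In this chain the middle node has the highest value on all three metrics, which is the pattern one would expect of a hub node.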

Bayesian networks (DAG)
We computed the Bayesian network using the random-restart hill-climbing algorithm implemented in the R package bnlearn 33. First, the algorithm searches for a network structure that optimizes the Bayesian information criterion by adding, removing, or reversing edges of the network. This search was repeated with 100 random restarts to prevent the hill-climbing algorithm from becoming trapped in local optima. This procedure determines only whether each edge exists in the network and what its direction is.
To obtain stable results, we extracted 1,000 sample networks through bootstrap resampling. Then, we checked how often each edge appeared in the networks obtained through bootstrapping. If an edge appeared in more than 85% 34 of the sample networks, we included it in the final averaged DAG. In addition, if at least half of the 1,000 sampled networks had the same direction for an edge, that direction was shown in the final averaged network. Note that while the DAG construction algorithm was applied, edges directed toward gender or verbal abuse from any other node were excluded ("blacklisting") 33.
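The averaging rule can be sketched as follows. This is a literal reading of the two thresholds described above and not the bnlearn implementation, which may define direction frequencies differently; the function name, the representation of each bootstrapped network as a list of (source, target) pairs, and the toy input are all hypothetical.

```python
from collections import Counter


def average_dag(bootstrap_dags, edge_threshold=0.85, direction_threshold=0.5):
    """Combine bootstrapped DAGs into one averaged network.

    bootstrap_dags: list of DAGs, each a list of (source, target) edges.
    An edge is kept if it appears, in either direction, in MORE than
    `edge_threshold` of the samples. It is drawn as directed if one direction
    appears in at least `direction_threshold` of all samples.
    Returns sorted (source, target, is_directed) tuples.
    """
    n = len(bootstrap_dags)
    if n == 0:
        raise ValueError("at least one bootstrapped network is required")

    directed = Counter()
    for dag in bootstrap_dags:
        directed.update(set(dag))

    pairs = Counter()
    for (src, dst), count in directed.items():
        pairs[frozenset((src, dst))] += count

    averaged = []
    for pair, count in pairs.items():
        if count / n <= edge_threshold:
            continue
        a, b = sorted(pair)
        forward, backward = directed[(a, b)], directed[(b, a)]
        if forward > backward and forward / n >= direction_threshold:
            averaged.append((a, b, True))
        elif backward > forward and backward / n >= direction_threshold:
            averaged.append((b, a, True))
        else:
            averaged.append((a, b, False))
    return sorted(averaged)


# Toy input: 10 bootstrapped networks with invented edges.
toy_dags = ([[("X", "Y"), ("Y", "Z")] for _ in range(8)]
            + [[("X", "Y")], [("Y", "X")]])
toy_average = average_dag(toy_dags)
```

In the toy input the X, Y edge appears in all ten networks and points from X to Y in nine of them, so it is kept as directed, while the Y, Z edge appears in only eight of ten networks and is dropped.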
Vulnerability analysis of regularized partial correlation network
In the field of neuroimaging, there have been studies on how changes in regional brain networks affect the overall network topology 35,36. In our analysis, we investigated how local changes in nodes (or a therapeutic approach to the nodes) affect the overall integrity of the network. Artificial intervention was applied by replacing the values of each node with random values drawn from a uniform distribution between 0 and 1. The intervened regularized partial correlation network for each node was sampled through jackknife resampling. In other words, sampled intervened regularized partial correlation networks were obtained for each of the 22 intervened nodes (579 × 22 networks). The distribution of the intact network was also obtained using jackknife resampling. The distributions of global efficiency in the 579 intervened networks for each node and the 579 intact networks were compared through 22 Wilcoxon rank-sum tests with correction for multiple comparisons (Bonferroni correction). This comparison allowed us to evaluate how much an intervention on each node (or a therapeutic approach to each node) would affect the global efficiency or clustering coefficient of the entire psychopathology network. Through this analysis, we tried to determine which nodes' preferential treatment would be most effective in terms of network topology.
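The building blocks of this procedure can be sketched as below. The sketch covers only the global efficiency of a weighted network, the uniform-noise intervention on one variable, and leave-one-out (jackknife) resampling; network estimation and the rank-sum tests are omitted. The edge length of 1/|weight|, the function names, and the toy data are assumptions made for illustration.

```python
import random


def global_efficiency(nodes, weights):
    """Mean of 1/d(i, j) over all ordered node pairs.

    Edge length is assumed to be 1/|weight|; unreachable pairs contribute 0.
    """
    if len(nodes) < 2:
        return 0.0
    inf = float("inf")
    dist = {a: {b: (0.0 if a == b else inf) for b in nodes} for a in nodes}
    for (a, b), w in weights.items():
        if w != 0:
            dist[a][b] = dist[b][a] = min(dist[a][b], 1.0 / abs(w))
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    pairs = [(i, j) for i in nodes for j in nodes if i != j]
    return sum(1.0 / dist[i][j] for i, j in pairs) / len(pairs)


def intervene(rows, variable, rng):
    """Return a copy of the data with one variable replaced by U(0, 1) noise."""
    return [{**row, variable: rng.random()} for row in rows]


def jackknife_samples(rows):
    """Yield every leave-one-out version of the data set."""
    for i in range(len(rows)):
        yield rows[:i] + rows[i + 1:]


# Toy network and toy data with invented values.
toy_efficiency = global_efficiency(["A", "B", "C"],
                                   {("A", "B"): 0.5, ("B", "C"): 0.25})
toy_rows = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}, {"x": 5.0, "y": 6.0}]
toy_intervened = intervene(toy_rows, "x", random.Random(0))
toy_jackknife = list(jackknife_samples(toy_rows))
```

With n subjects the jackknife yields n data sets of n - 1 subjects each, which is consistent with the 579 networks per condition mentioned above.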

Ethical Standard
The

Partial correlation network
The partial correlation network of high-DS subjects is shown in Fig. 1. The number of non-zero edges was 66 (28.6%). Most of the edges in the network had positive correlation coefficients (83.33%). Nodes belonging to the same symptom domain tended to form a cluster and were connected within the cluster. For example, PHQ_01, 02, 06, and 09, called cognitive/affective symptoms, tended to cluster together, and PHQ_03, 04, 05, 07, and 08, which are somatic symptoms 37, tended to cluster together. IESR (subjective distress caused by traumatic events), GAD (generalized anxiety), and LSAS (social anxiety), which are classified as anxiety-related symptoms, were also clustered with strong correlations. Interestingly, nodes related to perceived distress due to verbal abuse by parents, peers, and superiors were also clustered. PHQ_02, GAD, and IESR appear to act as hub nodes that mediate the symptoms of the psychopathology network. Regarding

Bayesian Network
The core symptoms of depression (PHQ_01 and PHQ_02) were at the top of the DAG. The nodes with the greatest numbers of edges in the DAG were PHQ_06, IESR, and GAD. Female gender, known as a risk factor for depressive disorder, had no significant effect on the interaction of depression- or anxiety-related symptom clusters. Instead, male gender seemed to be related to alcohol and nicotine addiction; however, this cluster was not associated with depression or anxiety and formed an 'island'. In our results, these three nodes (Gender, Smoking, and CAGE) did not seem to contribute to maintaining the network of depression and potentially related variables. Among the symptoms of addiction, only smartphone addiction was connected with the depression network.
The edges related to VA_parents were noteworthy as well. Edges from VA_parents to SAS and PHQ_03 were observed in the network of all subjects. However, edges connecting VA_parents to other nodes were not observed in the network of high-DS subjects. See

Discussion
In this study, we performed a network analysis with depressive symptoms and variables potentially related to depression in subjects screened for depression (N=579). The nodes with high centrality measures identified in the regularized partial correlation network were IESR, GAD, and PHQ_02. PHQ_01, PHQ_02, PHQ_03, and VA_parents were in the upper part of the Bayesian network. Our results were consistent with the expectation that various variables would be closely linked beyond each domain, and the networks adequately reflect existing medical knowledge. For example, the core symptoms of depression, namely loss of interest (PHQ_01) and depressed mood (PHQ_02), occupied the upper part of the causal structure and showed relatively high centrality scores. In addition, when the connected edges were removed (vulnerability analysis), the five nodes that had the most significant impact on the decrease in the integrity of the entire network were PHQ_09, LSAS, VA_parents, PHQ_07, and VA_peers. Our results suggest that these nodes could become primary treatment targets that can effectively reduce the integrity of the entire psychopathology network of high-DS subjects.
It is well known that depression at a younger age is associated with more anxiety symptoms 38. It is also common for these two symptoms to occur together 39. Some researchers have even argued that anxiety and depression can be explained by a single factor model 3,40. Indeed, the specifier "with anxious distress" was added to the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5) 41 to describe the symptoms of anxiety that frequently accompany depression. That patients become anxious when depressed and depressed when anxious is often observed in clinical settings. Corresponding results were also observed in our study; the symptoms of anxiety and depression were intertwined in our networks. Notably, anxiety-related symptoms (IESR, GAD, and LSAS) appeared to be hub nodes in the psychopathology networks. This finding was observed in both the regularized partial correlation network and the Bayesian network. Generalized anxiety seemed to act as a bridge node between the cognitive/affective (PHQ_01, 02, 06, 09) and somatic (PHQ_03, 04, 05, 07, 08) factors of depression 37 in the Bayesian network. GAD was also the node linking subjective distress caused by traumatic events (IESR) and depressive symptoms (PHQ_02, PHQ_06). These findings are consistent with those of previous studies showing that depressed mood and anxiety are associated with somatic symptoms 42. In addition, the effects of generalized anxiety disorder, panic disorder, and major depressive disorder on somatic complaints have been studied using structural equation modeling 43. No direct associations between emotional awareness and somatic complaints were found; however, there were direct associations among depression, anxiety, and somatic complaints.
Recent studies on the heartbeat evoked potential (HEP) have also found that generalized anxiety 44 and social anxiety 45 were associated with an inadequately increased HEP. These results suggest that anxiety symptoms are associated with abnormally increased sensitivity to bodily sensations. Our study seems to be in line with these findings as well, because our network analysis results obtained from young adult subjects can be interpreted as showing a psychopathology in which cognitive/affective symptoms spread to somatic symptoms and other associated symptoms of depression through generalized anxiety, in both the regularized partial correlation network and the Bayesian network.
One notable point was that depressive symptoms were more strongly associated with SAS (smartphone addiction) than with CAGE or Smoking in the network of high-DS subjects. In the regularized partial correlation network, SAS was connected to LSAS, Concerns, VA_parents, VA_peers, and PHQ_07. In the Bayesian network, it was connected to Concerns and LSAS. However, CAGE and Smoking were not connected to depression or anxiety symptoms. This may reflect a bias of the college student sample.
However, some studies have reported that smartphone addiction was related to shyness, loneliness 46, low self-esteem, and aggressive behaviors 47. These studies commonly mentioned that smartphone addiction might promote the development of depressive disorders. Our study suggests that smartphone addiction might be linked to stresses (the number of concerns, verbal abuse), social anxiety symptoms (LSAS), and concentration problems (PHQ_07). Hence, it could be a facilitating factor for depression in young adults. Contrary to what was previously assumed, it may be important to consider that smartphone addiction could be more strongly associated with depressive symptoms than substance addiction is in young adulthood depression.
In addition, we examined changes in the topology of the whole network by comparing the network after damage to each node with the intact network. We were able to determine which nodes were more effective in reducing the integrity of the whole psychopathology network. Inducing a local change in a network and observing the change in the overall topology is different from determining a node's importance through a centrality measure in an intact network 36. The former makes it possible to observe how the topology of the whole variable network changes in response to a change in each node, whereas the latter only represents the importance of the nodes that make up the whole psychopathology network. The analysis allowed us to identify symptoms that require intervention to reduce the connectivity of the entire disease network. If intervention for certain variables were prioritized, such as social anxiety (LSAS), verbal abuse (VA_parents and VA_peers), concentration problems (PHQ_07), and suicidal ideation (PHQ_09), it would be expected to stabilize the overall psychopathological network more efficiently in terms of both global efficiency and clustering coefficient.
One advantage of our study was that we observed the psychopathology network from several statistical perspectives: the partial correlation network, the Bayesian network, and the change in the topology of the whole network when each node is intervened on. In addition, rather than using only scales limited to depression or anxiety, our study used a variety of scales, including generalized anxiety, social anxiety, subjective distress due to traumatic events, addiction (alcohol, nicotine, smartphone), the number of concerns, the number of mentors, and perceived verbal abuse. This is expected to be more advantageous than previous analyses using only depression or anxiety symptoms, in that both social and environmental factors were included to explain the psychopathology networks. It enabled a more appropriate network analysis because it used as many variables as possible that could affect the network of psychopathology. However, our study was still limited in that we used data collected cross-sectionally at a particular time point. To compensate for this limitation, we used not only a graphical Gaussian model but also Bayesian network analysis, which may carry information about causal relationships, because Bayesian network (DAG) analysis is relatively useful for inferring causal relationships among symptoms when time-series data are not available. However, it is worth noting that the directions of arrows in the DAG do not necessarily indicate causal relationships. The graph from A to B to C and the graph from C to B to A are identical in terms of conditional independence. The DAG certainly represents at least the associations between nodes; however, it is difficult to be sure that it represents the causal relationships between them.
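The equivalence of the two chains can be checked numerically with a small sketch. The binary variables and probability values below are invented for illustration and are not estimates from the study. A joint distribution is generated from the chain A to B to C and then refactorized along the reversed chain C to B to A; if the two graphs encode the same conditional independence (A and C independent given B), the reversed factorization reproduces the joint distribution.

```python
from itertools import product

# Invented conditional probability tables for the chain A -> B -> C.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.3, 1: 0.7}}

# Joint distribution implied by A -> B -> C, keyed by (a, b, c).
joint = {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
         for a, b, c in product((0, 1), repeat=3)}


def marginal(keep):
    """Sum the joint over every variable whose position is not in `keep`."""
    out = {}
    for assignment, probability in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + probability
    return out


p_b = marginal((1,))
p_c = marginal((2,))
p_ab = marginal((0, 1))
p_bc = marginal((1, 2))


def joint_reversed(a, b, c):
    """Joint probability under the reversed chain C -> B -> A."""
    p_b_given_c = p_bc[(b, c)] / p_c[(c,)]
    p_a_given_b = p_ab[(a, b)] / p_b[(b,)]
    return p_c[(c,)] * p_b_given_c * p_a_given_b


# Largest discrepancy between the two factorizations (zero up to rounding).
max_difference = max(abs(joint[k] - joint_reversed(*k)) for k in joint)
```

Because both factorizations describe the same distribution, cross-sectional data alone cannot distinguish between the two directions, which is the caution raised above.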
In the future, we anticipate that network analysis of psychopathology should be conducted using information gathered at various time points.
In terms of psychopathology, our research revealed that significant variables are not confined to a single domain; rather, variables in multiple domains should be considered comprehensively. In addition, the analysis of the network showed that certain variables were more important in terms of centrality, causal relationship, and the potential to lower the integrity of the network. These specific variables were not limited to depressive symptoms but encompassed various domains. We suggest that an understanding centered on the hub or bridge nodes of the network, and treatment centered on the nodes that can significantly lower the integrity of the network, would be helpful in the diagnosis and treatment of depression in young adults.

Figure 1
Regularized partial correlation network of high-DS subjects, depicting the regularized partial correlations between nodes representing depressive symptoms and nodes potentially related to depression. Blue indicates edges that have positive correlation coefficients. Red indicates edges that have negative correlation coefficients. The thickness of an edge indicates the magnitude of the correlation between two nodes.

Figure 2
Confidence intervals for the strength of edges obtained by 1,000 bootstrapped networks in high-DS subjects.
Bayesian network of high-DS subjects, depicting causal relationships between nodes representing depressive symptoms and nodes potentially related to depression.