Application of latent class analysis in assessing the mental health of medical students during the COVID-19 Epidemic

Background: The novel coronavirus disease 2019 (COVID-19) is a global public health emergency that has caused worldwide concern. The mental health of medical students during the COVID-19 epidemic has attracted much attention. This study aims to identify subgroups of medical students based on mental health status and to explore the influencing factors during the COVID-19 epidemic in China. Methods: A total of 29,663 medical students were recruited during the COVID-19 epidemic in China. Latent class analysis of the mental health of medical students was performed using Mplus software to identify subtypes of medical students. The latent class subtypes were compared using the chi-square test. Multinomial logistic regression was used to examine associations between identified classes and related factors. Results: Three distinct subgroups were identified, namely, the high-risk group, the low-risk group and the normal group. Medical students can therefore be divided into three latent classes, with 4325, 9321 and 16,017 students in the respective classes. The multinomial logistic regression results showed that, compared with the normal group, the factors influencing mental health in the high-risk group were, in descending order of influence, insomnia, perceived stress, family psychiatric disorders, fear of being infected, drinking, individual psychiatric disorders, sex, educational level and knowledge of COVID-19. Conclusions: Our findings suggest that latent class analysis can be used to categorize medical students into mental health subgroups during the COVID-19 outbreak. The main factors influencing the high-risk group and low-risk group are basic demographic characteristics, disease history, COVID-19 related factors and behavioral lifestyle, among which insomnia and perceived stress have the greatest impact. School administrative departments could adopt more specific, targeted measures for the different subgroups.


Background
Since December 2019, an outbreak of novel coronavirus pneumonia (COVID-19) has persisted in Wuhan. The World Health Organization declared that the COVID-19 outbreak constitutes a public health emergency of international concern [1]. In January 2020, the Ministry of Education issued a notice requiring colleges to appropriately postpone the start of the school term. For college students, extended holidays, long-term stays at home, fewer trips out of the home, and an inability to attend school and participate in social activities may affect their academic performance and aggravate their anxiety and depression [2][3][4]. A recent study identified social networking as the strongest protective factor against depression and suggested that reducing sedentary activities, such as watching TV and daytime naps, could also help reduce the risk of depression [5]. This epidemic led not only to a risk of death from infection but also to unbearable psychological pressure.
As a special group of future medical workers, medical students are an important part of the backbone of health care [6], and their healthy development can effectively promote the positive development of healthcare in the future. Staying at home was a major contrast to their normal way of living and learning. In fact, the epidemic has affected mental health more among those in the medical industry than among the general public, and they must be treated appropriately to adapt to this change [7]. Mental health problems may continue into adulthood if they are not detected or properly treated. For students in clinically related disciplines, these problems can lead to many undesirable personal and professional consequences [8,9]. Therefore, it is necessary to pay attention to the mental health of medical students during the epidemic period and to take targeted action for students with different characteristics.
In research on the mental health of medical students, latent mental health status is measured indirectly through observed and measurable behavior. Previous studies generally used the total scores of self-assessment scales as the standard for categorizing the mental health of medical students [10,11]. This categorization standard was too simple to distinguish group characteristics. Latent class analysis (LCA) can address this problem and provide a more rigorous method for classifying medical students' mental health during the epidemic. LCA is a statistical method that classifies the latent characteristics of a population based on the score probability of each item [12]. At present, it is widely used in sociology, psychology and disease classification or diagnosis [13,14].
Current research on COVID-19 has focused on pathogenesis, epidemiology and clinical research [15][16][17][18]. There has been no latent class research on mental health during the COVID-19 epidemic. Therefore, this study uses the LCA method to explore the factors influencing medical students' mental health under the stress of the COVID-19 pandemic, in order to support targeted psychological interventions and to provide accurate decision-making references for relevant education departments.

Participants
Participants in our study came from a large cross-sectional survey conducted from March to April 2020 during the COVID-19 epidemic in China. The survey selected three medical universities, and a cluster stratified random sampling method was used. Considering the severity of COVID-19, the data were collected through an online platform rather than through face-to-face interviews. All questionnaires were completed on the WeChat public account platform. Ultimately, a total of 29,663 valid questionnaires were collected. The response rate was 96.99%.
The protocol of this survey was approved by the Biomedical Ethics Committee of Xinxiang Medical University and Hainan Medical University. All the participants signed online informed consent before completing the online questionnaire.

Measures
The survey consisted of seven parts: basic demographic characteristics, the psychiatric history of individuals and family members, depression, anxiety, perceived stress, insomnia, and COVID-19 related factors. Anxiety and depression are the most common mental health problems found in Chinese medical students. In our study, mental health status was assessed by the depression and anxiety scales.
We focused on symptoms of depression, anxiety, insomnia, and stress for all students using the Chinese versions of the following measurement tools, which have good validity and reliability. The Patient Health Questionnaire-9 (PHQ-9) includes 9 items and was adopted to screen for depressive symptoms in our study. Each item was scored from 0 to 3 (0, not at all; 1, several days; 2, more than half of all the days; 3, nearly every day), with the total score ranging from 0 to 27. Higher scores indicated greater severity of depressive symptoms [19]. The Generalized Anxiety Disorder-7 (GAD-7) scale is a practical self-report anxiety questionnaire that comprises seven items based on seven core symptoms. The participants reported their symptoms using a 4-point rating scale ranging from 0 (not at all) to 3 (almost every day), such that the total score ranged from 0 to 21 [20]. The Insomnia Severity Index (ISI) was used to assess the severity of insomnia symptoms and comprises seven items. The participants responded to items on a 5-point scale ranging from 0 to 4 (0, not at all; 1, mild; 2, moderate; 3, severe; 4, extremely severe), with the total score ranging from 0 to 28 [21]. The Perceived Stress Scale (PSS) is a measure of perceived stress and has shown good reliability and validity. The participants were asked to answer each question using a 5-point rating scale ranging from 0 (never) to 4 (very often) and reported the frequency of events associated with each item in the last month. The total score ranged from 0 to 56, with higher scores reflecting higher levels of stress [22]. The PHQ-9 and GAD-7 items were recoded into binary variables for the LCA.
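The scoring rules above can be illustrated with a minimal sketch. It is not the authors' code. The item counts and item ranges follow the text, except that the 14-item length of the PSS is inferred here from the stated 0 to 56 range. The binarization cutoff (an item counts as endorsed when scored 1 or higher) is purely an assumption, because the text does not state how the PHQ-9 and GAD-7 items were recoded. The names `SCALES`, `total_score` and `binarize` are hypothetical.

```python
from typing import Dict, List

# Item counts and per-item maxima as described in the text.
# ASSUMPTION: 14 PSS items, inferred from the 0-56 total with 0-4 items.
SCALES: Dict[str, Dict[str, int]] = {
    "PHQ-9": {"n_items": 9, "max_item": 3},
    "GAD-7": {"n_items": 7, "max_item": 3},
    "ISI": {"n_items": 7, "max_item": 4},
    "PSS": {"n_items": 14, "max_item": 4},
}


def total_score(scale: str, responses: List[int]) -> int:
    """Sum item responses after checking their number and range.

    Reverse-keyed items, if any, are assumed to be recoded already.
    """
    spec = SCALES[scale]
    if len(responses) != spec["n_items"]:
        raise ValueError(f"{scale} expects {spec['n_items']} items")
    if any(r < 0 or r > spec["max_item"] for r in responses):
        raise ValueError(f"{scale} items must lie in 0..{spec['max_item']}")
    return sum(responses)


def binarize(responses: List[int], cutoff: int = 1) -> List[int]:
    """Recode ordinal items to 0/1 indicators for the LCA.

    ASSUMPTION: the cutoff of 1 is illustrative; the paper gives no cutoff.
    """
    return [1 if r >= cutoff else 0 for r in responses]


if __name__ == "__main__":
    phq9 = [0, 1, 2, 3, 0, 0, 1, 0, 0]  # hypothetical respondent
    print(total_score("PHQ-9", phq9), binarize(phq9))
```

With a different cutoff the binary indicators, and therefore the fitted classes, could change, so the recoding rule is worth reporting alongside the model.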

Statistical Analysis
LCA models were fitted to identify data-driven subgroups using Mplus Version 8.2. LCA can compensate for a limitation of factor analysis and structural equation modeling, which can only handle continuous latent variables [23]. In LCA, classes are identified based on a set of categorical indicators, assuming that the latent categorical variable can explain the association among a set of observed variables [24]. In our study, we fitted models with one to six latent classes to determine the optimal number of latent classes.
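For reference, the assumption that the latent class explains the association among the indicators (local independence) is commonly written as below for binary indicators. The notation is introduced here for illustration and does not appear in the original: $C$ is the number of classes, $J$ the number of binary indicators (presumably the binarized PHQ-9 and GAD-7 items), $\pi_c$ the proportion of class $c$, and $\rho_{jc}$ the probability of endorsing item $j$ in class $c$.

```latex
P(Y_i = y) \;=\; \sum_{c=1}^{C} \pi_c \prod_{j=1}^{J}
  \rho_{jc}^{\,y_j}\,\bigl(1-\rho_{jc}\bigr)^{1-y_j},
\qquad \sum_{c=1}^{C} \pi_c = 1 .
```

Under this parameterization a $C$-class model with $J$ binary indicators has $(C-1) + CJ$ free parameters.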
The model fit indices used for the LCA included information criteria, the Lo-Mendell-Rubin (LMR) test, the bootstrap likelihood ratio test (BLRT), and the entropy [25]. In addition, the interpretability of subgroup membership is another important factor in determining the optimal model. The information criteria include the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the adjusted Bayesian information criterion (aBIC). Among these fit indices, the preferred model is the one with the highest entropy and the lowest AIC and BIC. The entropy is an indicator of classification accuracy, with values close to 1 indicating greater accuracy [26]. Lower AIC and BIC values indicate that the model provides a better description of the data. The LMR and BLRT are significance tests that compare the improvement in model fit between models with κ classes and κ-1 classes. A significant p value suggests that the model with κ classes fits better than the model with κ-1 classes.
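The fit indices described above can be sketched as follows. This is a minimal illustration using the usual textbook definitions (the aBIC replaces n with (n + 2)/24, and the relative entropy is normalized to the 0 to 1 range). It is not Mplus output, and the function names and example numbers are hypothetical.

```python
import math
from typing import Dict, Sequence


def n_free_parameters(n_classes: int, n_binary_items: int) -> int:
    """(C - 1) class proportions plus C * J item-endorsement probabilities."""
    return (n_classes - 1) + n_classes * n_binary_items


def information_criteria(log_lik: float, n_params: int, n_obs: int) -> Dict[str, float]:
    """AIC, BIC and sample-size-adjusted BIC from a model log-likelihood."""
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2.0 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "aBIC": deviance + n_params * math.log((n_obs + 2) / 24.0),
    }


def relative_entropy(posteriors: Sequence[Sequence[float]]) -> float:
    """1 means perfectly separated classes; 0 means uninformative posteriors."""
    n_obs = len(posteriors)
    n_classes = len(posteriors[0])
    if n_classes == 1:
        return 1.0  # a single class is trivially classified
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # treat 0 * log(0) as 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n_obs * math.log(n_classes))


if __name__ == "__main__":
    # Hypothetical log-likelihood; the sample size is the one reported in the text.
    k = n_free_parameters(n_classes=3, n_binary_items=16)
    print(k, information_criteria(log_lik=-150000.0, n_params=k, n_obs=29663))
```

Because the information criteria typically keep decreasing as classes are added, they are usually read together with the LMR and BLRT tests, the entropy and the interpretability of the classes, as the text describes.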
After the appropriate number of latent classes was identified, the medical students were assigned to their most likely subgroup based on their highest posterior class probability. Chi-square tests were conducted to examine the distribution of related factors across the subgroups. Multinomial logistic regression was performed to estimate the associations between related risk factors and the subtypes. Statistical significance was set at a 2-sided p < 0.05.
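Assignment by highest posterior class probability can be sketched as below, assuming binary indicators and local independence. All parameter values in the example are hypothetical and are not estimates from this study. The function names are illustrative.

```python
from typing import List, Sequence, Tuple


def posterior_class_probabilities(
    response: Sequence[int],
    class_proportions: Sequence[float],
    item_probabilities: Sequence[Sequence[float]],
) -> List[float]:
    """Bayes' rule: P(class | response) is proportional to
    P(class) * product over items of P(item response | class)."""
    joint = []
    for pi_c, rho_c in zip(class_proportions, item_probabilities):
        likelihood = pi_c
        for y, rho in zip(response, rho_c):
            likelihood *= rho if y == 1 else (1.0 - rho)
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


def modal_class(posteriors: Sequence[float]) -> Tuple[int, float]:
    """Index and probability of the most likely class."""
    best = max(range(len(posteriors)), key=lambda c: posteriors[c])
    return best, posteriors[best]


if __name__ == "__main__":
    # HYPOTHETICAL parameters for three classes and three binary items.
    proportions = [0.15, 0.31, 0.54]
    endorsement = [
        [0.90, 0.90, 0.80],  # class 0: high endorsement on every item
        [0.50, 0.40, 0.30],  # class 1: moderate endorsement
        [0.05, 0.05, 0.02],  # class 2: low endorsement
    ]
    post = posterior_class_probabilities([1, 1, 1], proportions, endorsement)
    print(post, modal_class(post))
```

Modal assignment discards the uncertainty in the posterior probabilities, which is one reason a high entropy is desirable before the classes are used as an outcome in later regression.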

Demographic Characteristics
A total of 29,663 medical students were surveyed, including 10,185 males and 19,478 females. The average age of the medical students was 21.46 years (SD = 2.50). The demographic characteristics of the medical students are shown in Table 1.

Model Fit Indices of LCA
Model fit indices for models with different numbers of latent classes are listed in Table 2. LCA with 1 to 6 classes was performed. The results showed that the AIC, BIC and aBIC decreased as the number of classes increased. The model with 1 class had the largest AIC, BIC and aBIC, suggesting that this model fit the data the worst among the models. The 2-class model had the highest entropy value, but the LMR test was not significant. In the 3-class and 4-class models, the LMR and BLRT values reached significance (p < 0.0001), but the entropy value of the 3-class model was higher, indicating that the 3-class model fit the data better than the 4-class model did. After comprehensively considering the above indicators, we selected the 3-class model because it was parsimonious and exhibited better class separation than the other solutions (Table 2).

Definition of Latent Class
The score probability of the first class was high, which showed that the medical students in this category had poor mental health status during the epidemic period and could not adjust effectively, so they were labeled the 'high-risk group'. A high prevalence of insomnia and perceived stress was also a feature of this subgroup. The second class was defined as the 'low-risk group' because it had a moderate score probability of depression and anxiety, which was lower than that observed in the first class and higher than that observed in the third class. Medical students of this type had a certain self-regulation ability during the epidemic. The third class had a low probability of scoring in all categories, indicating that medical students of this type had better mental health status during the epidemic and could effectively regulate their psychological condition, so it was named the 'normal group'. Figure 1 illustrates the profiles of mental health subtypes for the 3-class model. In Figure 1, the y-axis represents the probability of depression and anxiety symptoms, while the x-axis shows the indicator variables used for the LCA. The three lines show the symptom patterns for the three mental health subtypes. No crossing was observed among the three lines, suggesting that the modelled subtypes differed in symptom profiles. In particular, for the item regarding suicide or self-harm, a low probability was shown across the three groups.

Univariate Analysis of Latent Class of Mental Health
Gender, education level, smoking, drinking, the psychiatric disorders of the individual and family members, perceived stress, insomnia, knowledge of COVID-19, contact with confirmed or suspected patients with COVID-19, fear of being infected, and participation in mental health education on COVID-19 were significantly different across the three subtypes (p < 0.05) (Table 3). In particular, the number of medical students suffering from insomnia and perceived stress was significantly higher in the high-risk group than in the normal group.
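The chi-square comparison of a categorical factor across the three subtypes can be sketched with the Pearson statistic for a contingency table. The counts in the example are hypothetical and are not taken from Table 3. The closed-form p value used here holds only for even degrees of freedom (for example, a two-level factor across three classes gives df = 2); a statistics library would be used in practice.

```python
import math
from typing import Sequence, Tuple


def chi_square_independence(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def chi_square_p_value_even_df(statistic: float, df: int) -> float:
    """Upper-tail p value, exact when df is even:
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form implemented for positive even df only")
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # HYPOTHETICAL counts: rows are two levels of a factor,
    # columns are the high-risk, low-risk and normal groups.
    stat, df = chi_square_independence([[30, 10, 10], [10, 10, 30]])
    print(stat, df, chi_square_p_value_even_df(stat, df))
```

With a sample of nearly 30,000 students, even small differences in proportions can reach p < 0.05, so effect sizes from the regression are the more informative summary.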

Multinomial Logistic Regression Analysis of Latent Class of Mental Health
The latent class membership was used as the dependent variable, the third class (normal group) was used as the reference group, and the significant factors in the univariate analysis were used as independent variables for the multinomial logistic regression analysis. The results showed that compared with the normal group, medical students in the high-risk group were more likely to be female (OR = 1.732, p < 0.001), have a postgraduate degree or above (OR = 1.740, p < 0.  (Table 4). In particular, as the severity of insomnia increased, the risk of being assigned to the high-risk group increased, and the risk of medical students with severe insomnia belonging to the high-risk group was 57 times higher than that of medical students without insomnia.
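The link between a multinomial logistic regression coefficient and the reported odds ratio can be illustrated as follows. This is a sketch under stated assumptions: the OR of 1.732 for female sex is taken from the text, while the standard error, the linear predictors and all function names are hypothetical and serve only to show the arithmetic.

```python
import math
from typing import List, Sequence, Tuple


def odds_ratio(beta: float) -> float:
    """An odds ratio is the exponentiated regression coefficient."""
    return math.exp(beta)


def odds_ratio_ci(beta: float, std_err: float, z: float = 1.96) -> Tuple[float, float]:
    """Wald confidence interval on the odds-ratio scale (95% by default)."""
    return math.exp(beta - z * std_err), math.exp(beta + z * std_err)


def class_probabilities(linear_predictors: Sequence[float]) -> List[float]:
    """Class probabilities when the reference class has linear predictor 0.

    `linear_predictors` holds one value per non-reference class; the
    reference class (here, the normal group) is returned last.
    """
    exponentiated = [math.exp(eta) for eta in linear_predictors]
    denominator = 1.0 + sum(exponentiated)
    return [value / denominator for value in exponentiated] + [1.0 / denominator]


if __name__ == "__main__":
    beta_female = math.log(1.732)  # coefficient implied by the reported OR
    print(round(beta_female, 3), odds_ratio(beta_female))
    # HYPOTHETICAL standard error, for illustration only.
    print(odds_ratio_ci(beta_female, std_err=0.04))
    # HYPOTHETICAL linear predictors for the high-risk and low-risk groups.
    print(class_probabilities([-1.3, -0.5]))
```

Strictly, an odds ratio compares odds rather than risks, so an OR of 1.732 means that the odds of belonging to the high-risk group rather than the normal group are 1.732 times as large for female students, with the other covariates held fixed.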

Discussion
The outbreak of COVID-19 has caused public panic and psychological pressure [27,28]. To prevent the escalation of the epidemic, schools have taken measures such as extending holidays to ensure that the majority of students are isolated in their current residences and complete their school-related responsibilities remotely [29]. College students have had to reduce the frequency of their outings and have been unable to participate in social activities, which may affect their learning progress and exacerbate their anxiety and depression. Therefore, the mental health status of medical students is of great concern to medical universities [30]. In the current study, LCA was used to classify medical students' mental health during the COVID-19 epidemic. LCA is an important research method in the social sciences that assumes that individuals can be grouped into classes with similar behavior patterns according to their responses to a set of observed indicators [31]. Three interpretable subtypes of mental health based on LCA models were detected in the present analysis, and the entropy of the 3-class model (0.88) indicated excellent membership classification. This is consistent with previous reports involving LCA, which classified child mental health at the population level and determined the reliability of identified classes [32][33][34]. Meanwhile, a number of researchers have published papers encouraging the use of LCA in the classification of mental health issues because it is well suited to addressing pertinent questions [35][36][37][38]. For example, Essau CA encouraged the application of LCA for studying complex multidimensional phenomena, such as mental disorders, because multiple aspects of individual functioning can be studied holistically [39]. Other researchers have suggested that LCA is an important analytic tool for studying health risk behaviors in college students [40][41][42][43][44]. Furthermore, it can also be used to examine the clustering of modifiable health risk behaviors and to explore the relationship between these identified clusters and mental health outcomes [45].
This study found that the mental health of medical students had obvious grouping characteristics during the COVID-19 pandemic, and the statistical indicators supported three latent classes, namely, the 'normal group', the 'low-risk group' and the 'high-risk group'. Most medical students in this study belonged to the 'normal group', and they had low probability scores for each factor of mental health, which showed that most medical students had strong psychological adjustment ability and adaptability in isolation at home during the epidemic period. The probability score plot shows that all the medical students had a low probability of scoring on thoughts of dying or harming themselves. It is possible that the students were in a sensitive period of youth and had some psychological problems, but they did not have thoughts of self-harm or suicide.
In the high-risk group, the probability score plot showed that the mental health problems of medical students occur in clusters rather than independently. The high-risk group had a higher probability of scoring on all factors except x9, which can partly be attributed to the stressful training experience [46], such as the long length of schooling, academic pressure, and the stress of clinical practice [47]. This subtype of students may have multidimensional psychological problems, with long-term consequences for well-being and professional relationships. This is in accordance with previous studies showing that most of the students with depression symptoms were also diagnosed with generalized anxiety symptoms [48,49]. This co-existence was related to shared risk factors and symptoms [50][51][52]. The symptoms of depression and anxiety in medical students may include slowness of thought, decreased energy, low self-worth, disturbed sleep, and difficulty concentrating, which are known to jeopardize academic development [53,54]. To prevent their behaviors from becoming extreme, these students urgently need corresponding psychological treatment measures and should be the focus of prevention and treatment. Computer-delivered cognitive behavior therapy (CCBT), which has become widely used with the growth of the internet and smartphones, can be considered [55,56].
Multinomial logistic regression analysis showed that compared with the 'normal group', there were more females in the 'high-risk group' and the 'low-risk group'. In particular, the odds of female students entering the 'high-risk group' were 1.732 times those of male students, indicating that the mental health problems of female students were more prominent, which may be due to differences in hormones and stressful events. Consistent with previous studies, gender differences have always existed in the mental health of medical students [57][58][59]. Regarding educational level, we found that the higher the educational level, the higher the risk of entering the 'high-risk group' and the 'low-risk group'. Medical students with many years of education are more likely to have psychological problems, which may be related to the higher pressure from scientific research and work [60]. Similarly, medical students with drinking habits also had a higher risk of psychological problems, which was in accordance with the findings of previous studies [61,62]. Compared with the normal group, medical students in the high-risk group with individual or family psychiatric disorders had a higher risk of mental health problems than did students without psychiatric disorders. A history of psychiatric disorders has consistently been found to be a significant correlate of depression and anxiety [63,64].
This study also found significant differences in perceived stress and insomnia among medical students with different types of mental health status, and the high-risk group had more serious stress and insomnia problems. As students spend more time in medical universities, they face multiple challenges such as examination pressures, fear of failure, intense competition, lack of leisure time, exposure to patients' suffering, and scientific research pressure. During the outbreak of COVID-19 in particular, long-term home confinement made it difficult for medical students to cope with academic and employment demands, which further aggravated the pressure on them. These factors can lead to high stress levels that negatively impact the physical, mental, and emotional health of students. High levels of stress in medical students are important predictors of anxiety and depression [65,66]. In addition, we found that insomnia might be a significant feature distinguishing the normal group from the high-risk group. A lower prevalence of insomnia was reported by medical students in the normal group than by those in the high-risk and low-risk groups. Insomnia has been reported to be a common residual symptom and predictor of mental health problems, which suggests that it should be taken seriously as one of the most important candidate intervention targets in the treatment of depression and anxiety [67,68]. Medical students can experience these symptoms simultaneously, with long-term consequences for wellbeing and professional relationships [69].
Apart from traditional factors, epidemic-related factors were also observed in our study. Compared with the normal group, the higher the awareness of COVID-19, the lower the risk of psychological problems for medical students in the high-risk and low-risk groups. This suggests that the better medical students understand preventive measures against COVID-19, the more actively they cope with the epidemic. Therefore, improving medical students' knowledge of COVID-19 is beneficial to their mental health. Relevant government departments and universities should make use of social platforms, social software and other new media to encourage medical students to actively receive health education on epidemic prevention measures and related knowledge of COVID-19. Similarly, compared to the normal group, the risk of mental health problems in the high-risk group with fear of being infected with COVID-19 was three times higher than that in students without this fear. These results indicated that the outbreak of COVID-19 might have a significant effect on the risk of mental health issues for medical students. This was consistent with previous studies conducted in Guangzhou, which suggested that the psychological consequences of COVID-19 could be serious in college students [70]. Under the stress of the COVID-19 epidemic, the mental health status of medical students had clustering characteristics. It is urgent to implement targeted psychological interventions and health education measures according to the latent class group.
Nevertheless, the present study had several potential limitations. First, this was a cross-sectional study, which precludes conclusions on causality and limits the dynamic analysis of mental health problems in medical students. Second, mental health in our study was measured entirely with self-rating scales, which may influence the accuracy of the results. Third, medical students' mental health problems include not only depression and anxiety but also other psychological problems that were not taken into consideration in our study. This may lead to underestimation of medical students' psychological problems.
In conclusion, this is the first study to use LCA technology to explore mental health subgroups of medical students during the COVID-19 epidemic. LCA is a useful tool for studying and classifying mental health at the population level. It was found that the mental health status of medical students had clustering characteristics. The results will be highly relevant to medical education and could be a very important reminder of the current mental health status of medical students.