Study design, setting, and participants
We carried out a descriptive cross-sectional study between December 2018 and February 2019 at Maharajgunj Medical Campus (MMC) in Kathmandu, Nepal. All medical students and residents who had spent at least one year in the medical school were eligible for the study. Students who were not available during the data collection period were excluded.
Sample size and sample procedure
(see Sample size and sample procedure in the Supplementary Files)
Depression, anxiety, and burnout were the outcome variables. Depression and anxiety were assessed using the validated Nepali version of the Hospital Anxiety and Depression Scale (HADS), which consists of two subscales, an anxiety scale (HADS-A) and a depression scale (HADS-D), each with seven items. Characteristic items include: “I feel tense or ‘wound up’”, “I can laugh and see the funny side of things”, and “I have lost interest in my appearance”. All 14 items were rated on a four-point Likert scale of frequency, scored from zero to three. Summing the item scores within each subscale gave a total of 0 to 21 for that subscale. Totals from 0 to 7 were considered normal, 8 to 10 borderline, and 11 to 21 suspicious. In our study, we classified both the borderline and the suspicious groups as having depression. Anxiety was classified in the same way from the HADS-A total: participants in the borderline or suspicious group were classified as having anxiety and the rest as not having anxiety. The internal consistency of the Nepali version of HADS has been reported as satisfactory (HADS-A α=0.76 and HADS-D α=0.68). In the current study, the Cronbach’s alpha reliability coefficients were good (HADS-A α=0.74 and HADS-D α=0.73).
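The HADS scoring rule described above can be sketched in a few lines. This is an illustrative Python sketch, not the authors' analysis script (the study used R). The function names are hypothetical, and the sketch assumes that each item score has already been oriented so that a higher value means more symptoms.

```python
from typing import Sequence


def hads_subscale_total(item_scores: Sequence[int]) -> int:
    """Sum the seven 0-3 item scores of one HADS subscale (HADS-A or HADS-D)."""
    if len(item_scores) != 7:
        raise ValueError("each HADS subscale has exactly seven items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("item scores must be integers from 0 to 3")
    return sum(item_scores)


def hads_band(total: int) -> str:
    """Map a subscale total (0-21) to the band used in the text."""
    if not 0 <= total <= 21:
        raise ValueError("a subscale total must lie between 0 and 21")
    if total <= 7:
        return "normal"
    if total <= 10:
        return "borderline"
    return "suspicious"


def is_case(total: int) -> bool:
    """Borderline and suspicious totals are both counted as a case."""
    return hads_band(total) != "normal"
```

Applying `is_case` to the HADS-D total gives the depression outcome, and applying it to the HADS-A total gives the anxiety outcome.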
Burnout was assessed using the Copenhagen Burnout Inventory (CBI), which consists of three subscales measuring personal burnout (6 items), work-related burnout (7 items), and client-related burnout (6 items). We replaced the term ‘client’ with ‘patient’ and ‘work’ with ‘work/study’ throughout the questionnaire, so ‘client-related’ burnout became ‘patient-related’ burnout and ‘work-related’ burnout became ‘work/study-related’ burnout. Twelve items were rated on a five-point Likert scale of frequency from ‘100 (always)’ to ‘0 (never/almost never)’. The remaining seven items were rated on intensity, from ‘to a very low degree’ to ‘to a very high degree’. One item in the work-related burnout subscale required inverse scoring: “do you have enough energy for family and friends during leisure time?” Typical items in the scale were: “how often do you feel worn out?”, “do you feel burnt out because of your work?”, and “do you find it hard to work with patients?”. The level of burnout was classified according to the score obtained: a score below 50 implies ‘no/low’, 50 to 74 implies ‘moderate’, 75 to 99 implies ‘high’, and a score of 100 implies ‘severe’ burnout. All the items had high internal consistency, were straightforward, and related to the relevant subscale. In our study, the Cronbach’s alpha reliability coefficients of the three CBI subscales were high (personal burnout α=0.79; work-related burnout α=0.87; and patient-related burnout α=0.85). A student was defined as having burnout if any one of personal, work-related, or patient-related burnout was present.
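A minimal Python sketch of CBI subscale scoring follows. It is illustrative only and is not the authors' script. Two points are assumptions that the text does not state: the subscale score is taken as the mean of its item scores (the usual CBI convention), and a subscale is counted as "present" at a score of 50 or more (a hypothetical cut-off, exposed as the `cutoff` parameter). All function names are hypothetical.

```python
from statistics import mean
from typing import Dict, Iterable, Sequence


def reverse_score(score: float) -> float:
    """Inverse scoring on the 0-100 scale (for example, 100 becomes 0)."""
    return 100 - score


def subscale_score(item_scores: Sequence[float],
                   reversed_positions: Iterable[int] = ()) -> float:
    """Mean item score of one subscale (assumed convention).

    reversed_positions holds the zero-based positions of inverse-scored items.
    """
    reversed_set = set(reversed_positions)
    adjusted = [reverse_score(s) if i in reversed_set else s
                for i, s in enumerate(item_scores)]
    return mean(adjusted)


def burnout_level(score: float) -> str:
    """Map a 0-100 subscale score to the level used in the text."""
    if score < 50:
        return "no/low"
    if score < 75:
        return "moderate"
    if score < 100:
        return "high"
    return "severe"


def has_burnout(subscale_scores: Dict[str, float], cutoff: float = 50) -> bool:
    """True if any subscale reaches the cut-off (the cut-off is an assumption)."""
    return any(score >= cutoff for score in subscale_scores.values())
```

Because mean scores need not be whole numbers, the bands are implemented as half-open intervals, so a score of 74.5 falls in the moderate band.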
Socio-demographic factors included age, gender, religion, nationality, socioeconomic class, current residence, and year of training in medical school. The variable ‘age’ was grouped into four age groups starting from 18 years, with 5-year bands after the first group (group 1 = 18-24 years, group 2 = 25-29 years, group 3 = 30-34 years, group 4 = 35-39 years). Socioeconomic class was measured using the modified Kuppuswamy’s Socioeconomic Status scale adapted to the context of Nepal. The scale consists of three criteria (educational, occupational, and economic, the last based on monthly family income), each of which is given a score. The sum of these three scores determined the socioeconomic class (26-29 = Upper, 16-25 = Upper middle, 11-15 = Lower middle, 5-10 = Upper lower, <5 = Lower).
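The two groupings above are simple lookups. The following Python sketch is illustrative, with hypothetical function names. It takes the three Kuppuswamy component scores as given and does not reproduce how each component is scored, since that is not described here.

```python
def kuppuswamy_class(education: int, occupation: int, income: int) -> str:
    """Map the sum of the three component scores to a socioeconomic class."""
    total = education + occupation + income
    if total > 29 or total < 0:
        raise ValueError("total score outside the 0-29 range of the scale")
    if total >= 26:
        return "Upper"
    if total >= 16:
        return "Upper middle"
    if total >= 11:
        return "Lower middle"
    if total >= 5:
        return "Upper lower"
    return "Lower"


def age_group(age_years: int) -> int:
    """Return the age group number (1-4) used in the study."""
    if 18 <= age_years <= 24:
        return 1
    if 25 <= age_years <= 29:
        return 2
    if 30 <= age_years <= 34:
        return 3
    if 35 <= age_years <= 39:
        return 4
    raise ValueError("age outside the 18-39 range covered by the groups")
```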
Behavioral and clinical factors
Behavioral factors included relationship status, substance use, involvement in extracurricular activities, and sleep hours. Substance use was defined as the use of alcohol, cigarettes, or marijuana. The variable ‘sleep hours’ was divided into two groups, adequate sleep and inadequate sleep, using a cut-off of 7 hours. Clinical factors included stressors, satisfaction with career choice, satisfaction with academic performance, previous history of mental health problems, family history of mental health problems, and current treatment for mental health issues.
The stressors were identified using the Medical Students’ Stressor Questionnaire-20 (MSSQ-20). The MSSQ-20 consists of 20 items representing six stressor domains: academic related stressors (ARS), intrapersonal and interpersonal related stressors (IRS), teaching and learning-related stressors (TLRS), social related stressors (SRS), drive and desire related stressors (DRS), and group activities related stressors (GARS). All 20 items were rated on a five-point Likert scale of intensity from ‘zero (causing no stress at all)’ to ‘four (causing severe stress)’. The mean score of each domain was calculated, and the severity of the stress caused by that domain was classified accordingly (0-1 = Mild, 1.01-2 = Moderate, 2.01-3 = High, and 3.01-4 = Severe). A score in the High range for a particular stressor domain indicated that it caused a lot of stress, disturbed emotions, and mildly compromised daily activities. The MSSQ-20 is a valid and reliable instrument with high internal consistency, as shown by a Cronbach’s alpha coefficient of 0.95. In the current study, the reliability coefficient was high (α=0.91). The Cronbach’s alpha for each stressor domain was also high (ARS α=0.87, IRS α=0.89, TLRS α=0.77, SRS α=0.74, DRS α=0.70, GARS α=0.76). Each of the stressors ARS, IRS, TLRS, SRS, DRS, and GARS was dichotomized as ‘absent’ or ‘present’. The ‘absent’ group included participants who experienced only the mild form of the stressor, while the ‘present’ group included all other participants, who experienced a moderate, high, or severe degree of it.
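The domain scoring and the absent/present grouping can be sketched as follows. This Python sketch is illustrative, with hypothetical function names. The assignment of the 20 items to the six domains is not given here, so the caller passes the item scores of one domain. Mean scores that fall between the printed bands (for example 1.005) are assigned to the higher band, which is an assumption.

```python
from statistics import mean
from typing import Sequence


def domain_mean(item_scores: Sequence[int]) -> float:
    """Mean of the 0-4 item scores belonging to one stressor domain."""
    if not item_scores:
        raise ValueError("a domain needs at least one item score")
    if any(score not in (0, 1, 2, 3, 4) for score in item_scores):
        raise ValueError("item scores must be integers from 0 to 4")
    return mean(item_scores)


def severity(mean_score: float) -> str:
    """Map a domain mean (0-4) to the severity class used in the text."""
    if not 0 <= mean_score <= 4:
        raise ValueError("a domain mean must lie between 0 and 4")
    if mean_score <= 1:
        return "Mild"
    if mean_score <= 2:
        return "Moderate"
    if mean_score <= 3:
        return "High"
    return "Severe"


def stressor_present(mean_score: float) -> bool:
    """A stressor is 'present' when its severity is above Mild."""
    return severity(mean_score) != "Mild"
```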
Methods of data collection
Before data collection, the investigators briefly described the aim of the study and answered participants' questions about it. The questionnaire was then distributed. To minimize recall bias, participants were asked to choose the response closest to how they had been feeling in the past week. The questionnaire contained questions on the socio-demographic, behavioral, and clinical characteristics of the participants, along with the scales for measuring depression, anxiety, burnout, and stressors. We used reliable and validated instruments to minimize information bias. Our survey questionnaire is provided as Additional file 1.
R software (version 3.5.3) and various R packages were used for statistical analyses. The ‘gmodels’ package was used to construct cross-tables; ‘caret’ and ‘caTools’ for multivariable logistic regression analyses; ‘rcompanion’ for calculating Cox and Snell, and Nagelkerke pseudo R squared; ‘lmtest’ for the likelihood ratio test; ‘ROCR’ and ‘Metrics’ for calculating the AUC and plotting the ROC curve; ‘ResourceSelection’ for the Hosmer-Lemeshow test; ‘survey’ for the Wald test; ‘corrplot’ for plotting the contingency table; and ‘ggplot2’ for making bar plots.
A total of 43 values (34 numerical, 9 categorical) were missing across 19 observations. Missing values of a numerical variable were replaced by its mean and those of a categorical variable by its mode. Descriptive statistics were used for sociodemographic, behavioral, and clinical variables. We used an alpha level of 0.05 as the cut-off for statistical significance. Univariate analysis was used to find the association of depression, anxiety, and burnout with the independent variables. Multivariable logistic regression analysis was used to determine the predictors of depression, anxiety, and burnout. We tested for multicollinearity by calculating the variance inflation factor (VIF) for each variable in the predictor models using the “vif” function of the “car” package. We set a cut-off VIF of 10 and found two variables (“age” and “year”) with higher VIF scores. We therefore removed one variable (“age”) and rechecked for collinearity among the remaining variables. As we found no collinearity among the remaining variables, we reconstructed the logistic models using all variables except “age”. Stepwise logistic regression using the backward elimination method was performed in R. The scripts used to prepare these models in R are available from the author upon reasonable request.
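The imputation rule and the VIF cut-off above can be illustrated with a short Python sketch. It is not the authors' R script, and the function names are hypothetical. Missing values are represented as `None`, which is an assumption about the data layout. The VIF helper uses the standard definition VIF = 1 / (1 - R²), where R² comes from regressing one predictor on all the others, so a cut-off of 10 corresponds to R² of 0.9.

```python
from statistics import mean, mode
from typing import List, Optional, Sequence


def impute_numeric(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing numerical values with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_categorical(values: Sequence[Optional[str]]) -> List[str]:
    """Replace missing categorical values with the most frequent category."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    # With tied counts, Python 3.8+ returns the first category encountered.
    fill = mode(observed)
    return [fill if v is None else v for v in values]


def vif_from_r_squared(r_squared: float) -> float:
    """VIF of a predictor, given R² from regressing it on the other predictors."""
    if not 0 <= r_squared < 1:
        raise ValueError("R² must satisfy 0 <= R² < 1")
    return 1 / (1 - r_squared)
```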
The model with the lowest AIC (Akaike Information Criterion) value was selected as the best-fitted model. The likelihood ratio test and the Hosmer-Lemeshow test were used to assess the goodness of fit of the model. The Cox and Snell, and Nagelkerke pseudo R squared values approximated the proportion of variance in the outcome variable that was explained by the variables in the model. Tests of individual predictors were done to assess the relative importance of the variables in the model, namely Variable Importance using the 'caret' package and the Wald test using the 'survey' package. Predicted values were validated by constructing the ROC curve using the 'ROCR' and 'Metrics' packages.
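The fit statistics named above all follow from the log-likelihoods of the fitted model and of the intercept-only (null) model. The Python sketch below uses the standard formulas. It is illustrative, the function names are hypothetical, and it does not reproduce the output of the R packages the study used.

```python
import math


def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_parameters - 2 * log_likelihood


def likelihood_ratio_statistic(ll_null: float, ll_model: float) -> float:
    """Likelihood ratio statistic comparing the fitted model with the null model."""
    return 2 * (ll_model - ll_null)


def cox_snell_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Cox and Snell pseudo R squared for n observations."""
    return 1 - math.exp(2 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke pseudo R squared: Cox and Snell rescaled to a maximum of 1."""
    max_cox_snell = 1 - math.exp(2 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_cox_snell
```

Nagelkerke's value is never smaller than the Cox and Snell value, because the latter cannot reach 1 for a binary outcome and the rescaling divides by its maximum.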
Ethical clearance was obtained from the Institutional Review Committee of the Institute of Medicine (Reference number 305/075/076). A cover letter containing the informed consent statement was attached to each questionnaire; it included a description of the study and the participants' rights to decline altogether or to leave questions unanswered. Consent was implied through completion of the questionnaire. The name, address, and signature of the participant were not included in the questionnaire, to keep the participant anonymous. Participants did not receive any incentives or financial compensation for participating in the study.