A Rasch Analysis of the 10-item Kessler Psychological Distress Scale (K10) in the Urban-rural Fringe of China


Background: The impact of urbanization on the physical and mental health of the rural-urban population has been largely overlooked. The primary objective of this investigation was to demonstrate the reliability, validity, and responsiveness of the 10-item Kessler Psychological Distress Scale (K10) in measuring psychological distress in the rural-urban fringe population. Methods: Data were obtained from the mental health section of a chronic disease survey in Longzihu District, an urban-rural fringe area of Bengbu City, with 3354 participants. The Mandarin version of K10 was administered in face-to-face interviews. The Rasch model was used to analyze the psychometric characteristics and differential item functioning (DIF) of K10. Results: Rasch analysis revealed that the K10 scale showed ordered response categories. The results of principal component analysis (PCA) and the information-weighted fit statistic (infit) mean square (MNSQ) indicated that the K10 scale conformed to unidimensionality. The Cronbach's alpha coefficient of K10 was 0.916 (95% confidence interval (CI): 0.907, 0.924), indicating good reliability, although the coefficient would increase if the fifth item were removed. The Rasch model showed that all 10 items in the K10 scale fit well (infit MNSQ values, 0.928-1.072). No significant DIF by age or gender was found for K10. Overall, the K10 items were relatively difficult for this sample, and the psychological distress scores of the subjects were generally low. Conclusion: Rasch analysis showed that the Mandarin version of K10 was an effective and reliable scale for measuring and screening mental distress among residents of the urban-rural fringe. However, further research is recommended to address inappropriate and difficult items.


Introduction
Mental health and well-being have become a key issue in modern public health. Around one billion people worldwide meet the criteria for common mental disorders [1], and about one-third of them have disabilities [2]. Mental disorders are the second largest contributor to the global burden of disease, accounting for 7-13% of disability-adjusted life years [2]. Mental illness has thus become one of the main drivers of the global burden of disease [3], and it is important to monitor, screen, and manage the mental health symptoms of residents.
With the continuous expansion of cities worldwide, the urbanization rate has been rising, and developing countries have contributed 90% of the urbanization process [4]. From 1992 to 2015, China experienced rapid and large-scale urban expansion. Urban land use increased nearly five times, with an average annual growth rate of 8.10% [5]. A previous study has shown that urban expansion is squeezing the activity space of residents in the urban-rural fringe, causing air, noise, and water pollution, reducing green space per capita, and increasing social exclusion [6]. A recent study has shown that air pollution is associated with deteriorating mental health. In addition, noise and water pollution increase the risk of psychological stress and disease [7]. Many studies have linked green spaces to mental health [8,9]. A cluster randomized trial found a 62.8% reduction in mental distress among people living near green spaces [10]. The urban-rural fringe, as the area in direct contact with the urbanization process, exposes residents more often to the harms caused by urbanization, and the urban-rural floating population has experienced social exclusion, which may have a significant impact on mental health [6]. When health and well-being are discussed within the frameworks of "health in all policies" [11] and sustainable development [12], the psychological distress of residents in the urban-rural fringe should be an important social and health issue worthy of attention.
At present, there are many tools for measuring mental health, among which the 10-item Kessler Psychological Distress Scale (K10) is widely favored by researchers because of its good screening characteristics and ease of use [13-17]. The Kessler Psychological Distress Scale has been validated in Canada [18], the United States [19], Australia [13], and elsewhere. In addition, K10 has been tested among military personnel [14], people with diabetes [20], adolescents [21], and the elderly [22]. The tool has also been validated in different language versions, including Mandarin [23,24], Arabic [15], Korean [25], and Danish [26]. Although the scale has been verified in different populations and languages, our literature review did not find an empirical study on the measurement properties of K10 among residents of the urban-rural fringe. It is surprising that research on the psychological distress of rural-urban populations has been neglected.
Therefore, the purpose of this study is to evaluate the psychometric characteristics of the Mandarin version of K10 [27] based on epidemiological survey data from the urban-rural fringe, and to evaluate the differential item functioning of K10 across resident characteristics in the urban-rural fringe of China.

Study design and participants
The data were collected from the 2015 survey on the health status of residents in Longzihu District, Bengbu City, China. Multi-stage stratified random sampling was used to select residents from 7 community service centers. A total of 3591 people were surveyed, and 3354 valid responses were obtained. A total of 56 community health educators were trained to conduct the health check-ups and questionnaire surveys.

Measures and instrument
The indicators were obtained by questionnaire survey and included residents' basic information, illness in the previous two weeks, mental health, and health-related literacy, among others.
The 10-item Kessler Psychological Distress Scale (K10) is a non-specific psychological distress scale consisting of 10 items about anxiety and depression symptoms experienced by the respondents in the past 4 weeks [28]. K10 was developed for the redesigned National Health Interview Survey (NHIS). Kessler's research showed that it is very effective in screening for severe mental disorders. The measure had five response categories (P1 = "none of the time"; P2 = "a little of the time"; P3 = "some of the time"; P4 = "most of the time"; P5 = "all of the time"), scored from 0 (none of the time) to 4 (all of the time). The total score ranged from 0 to 40, with higher scores indicating higher levels of psychological distress [14]. The Mandarin version of K10 was used for the mental health survey, which asked subjects how they had felt about each item of psychological distress over the past four weeks [23].

Factor variables of differential item functioning
Age and gender are important factor variables. Age, a continuous variable, was transformed into a categorical variable. Participants were divided into two groups: a younger group (aged 18-59 years) and an older group (aged 60-90 years).
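The scoring and grouping rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names `k10_total` and `age_group` are hypothetical, and treating every age of 60 or above as "older" is an assumption made here for simplicity.

```python
from typing import Sequence


def k10_total(item_scores: Sequence[int]) -> int:
    """Sum ten K10 responses coded 0 (none of the time) to 4 (all of the time)."""
    if len(item_scores) != 10:
        raise ValueError("K10 requires exactly 10 item responses")
    if any(score not in range(5) for score in item_scores):
        raise ValueError("each response must be an integer from 0 to 4")
    return sum(item_scores)


def age_group(age: int) -> str:
    """Dichotomize age: 18-59 is 'younger'; 60 and above is 'older' (assumed cut)."""
    if age < 18:
        raise ValueError("the survey covered adults only")
    return "younger" if age <= 59 else "older"


# Hypothetical respondent, for illustration only.
example_total = k10_total([0, 1, 0, 0, 2, 1, 0, 0, 0, 0])
example_group = age_group(63)
```

A total near 0 corresponds to choosing "none of the time" for almost every item, while the maximum of 40 requires "all of the time" on all ten items.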

Statistical Analysis
The Rasch model is a parametric latent trait model within item response theory (IRT). Rasch analysis, developed by the Danish mathematician Georg Rasch, is a formal test of the results of a scale using mathematical models [29]. Rasch analysis provides an estimate of a person's level on the latent variable and allows for subsequent statistical analysis. In addition, the model is robust for measuring latent traits and has been shown to address some of the weaknesses of classical test theory (CTT) [30,31]. Rasch analysis is widely used to evaluate the psychometric properties of questionnaires and scales. It can measure person ability and item difficulty simultaneously on the same scale [29].
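For reference, one common way to write a polytomous Rasch model for a five-category item is the partial credit form below. This is a sketch under the assumption that a partial credit parameterization applies; the text does not state which polytomous Rasch variant was fitted. The symbols are notation introduced here for illustration: \theta_n is the distress level of person n, \delta_{ij} is the j-th threshold of item i, and X_{ni} is the response of person n to item i, coded 0 to 4.

```latex
P(X_{ni} = k \mid \theta_n)
  = \frac{\exp\!\left(\sum_{j=1}^{k} (\theta_n - \delta_{ij})\right)}
         {\sum_{m=0}^{4} \exp\!\left(\sum_{j=1}^{m} (\theta_n - \delta_{ij})\right)},
  \qquad k = 0, 1, \dots, 4,
  \qquad \text{where } \sum_{j=1}^{0} (\cdot) \equiv 0 .
```

Because \theta_n and \delta_{ij} enter only through their difference, persons and item thresholds are located on a single logit scale.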
In the current research, Rasch analysis was used to test the quality of the items. To validate the Chinese version of the K10 in the urban-rural fringe, six key indicators were used: (1) Information-weighted mean-square statistics (infit MNSQ). Infit MNSQ values between 0.7 and 1.3 were considered to indicate unidimensionality [32]. An infit MNSQ value greater than 1.3 indicated that the variance of measurement was too large, and an infit MNSQ value less than 0.7 indicated that the variance of measurement was too small [33]. (2) Principal component analysis (PCA). A PCA of the residuals was performed to test the unidimensionality of the K10 [34]. Two criteria were used. The first was that the variance explained by the first principal component should be sufficient (> 50%). The second was that if the ratio of the first factor eigenvalue to the second factor eigenvalue was close to or greater than 3.0, the scale was considered unidimensional [32,35]. (3) Category threshold order. The order of the category thresholds reflected by the category probability curves was an important parameter for demonstrating how the response categories were used, and it was the basis for distinguishing person ability from item difficulty. When respondents had difficulty distinguishing between the ordered response options, disordered thresholds appeared [36]. (4) Differential item functioning (DIF). DIF analysis is an indispensable part of psychometric analyses aiming to assess measurement invariance across sample groups, e.g., males and females [37]. (5) Cronbach's alpha coefficient. Cronbach's alpha coefficient was a reliability index indicating the internal consistency of the measurement. Its value ranges from 0 to 1, and a value greater than 0.9 was taken to indicate good consistency [38]. (6) Person-item map, which Rasch model users call a "Wright map".
In the Rasch model, the item difficulty parameter is on the same scale as person ability θ, so item difficulty can be interpreted on the same scale as person ability. The person-item map is a useful tool for understanding how items and persons are distributed along θ [39]. Ideally, the item difficulties would be distributed across the meaningful range of the latent trait, so that the items could measure ability across the entire range.
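Of the indicators listed above, Cronbach's alpha is the simplest to compute by hand. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), in plain Python. It is an illustration rather than the authors' R workflow; the function name `cronbach_alpha` and the toy response matrices are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (not from the study): two items that move together perfectly,
# and two items that are unrelated to each other.
alpha_consistent = cronbach_alpha([[0, 0], [1, 1], [2, 2]])
alpha_unrelated = cronbach_alpha([[1, 0], [0, 1], [1, 1], [0, 0]])
```

Perfectly consistent items give an alpha of 1, and unrelated items give an alpha of 0, which brackets the 0-1 range described in the text.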
All statistical analyses and graphic plotting were performed using R software, version 4.0.3 (The R Foundation for Statistical Computing, Vienna, Austria). The software packages "mirt" and "ltm" were used to build the Rasch model. The "lordif" function was used to analyze DIF.

Results
Characteristics of the participants
A total of 3354 participants were included in the study. Table 1 shows the distribution of K10 scores. The K10 had a mean score of 3.34 (standard deviation 5.28) and a median score of 1.00, and the scale's score range was 0-40. The study comprised 1885 females (56.2%) and 1469 males (43.8%), with a mean age of 56.0 years (range 18-91 years). The Mann-Whitney U test showed significant differences in total K10 scores by gender and by age. The mean psychological distress score was higher in females than in males, and higher in the elderly than in young people. In addition, the single-item tests showed that the mean scores of all items except item 10 were higher in women than in men. The mean scores of the elderly on items 5 and 6 were higher than those of young people, and there was no significant difference between the elderly and the young in the mean scores of the remaining items.

Unidimensionality
Principal component analysis showed that the first principal component explained 60.22% (greater than 50%) of the original variance. Moreover, the ratio of the eigenvalue of the first factor to that of the second factor was 7.642 (far greater than 3). The infit MNSQ value for each item (see Table 2) was acceptable (0.928-1.072). Both results indicated that there was no evidence that K10 was multidimensional.
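The two PCA criteria stated in the methods (a first-component share above 50% and a first-to-second eigenvalue ratio close to or greater than 3.0) can be checked mechanically once the eigenvalues are available. A minimal sketch follows; the function name `unidimensionality_check` and the eigenvalues in the example are hypothetical and are not the study's values, and the ratio rule is simplified here to "greater than or equal to 3.0".

```python
def unidimensionality_check(eigenvalues, min_share=0.50, min_ratio=3.0):
    """Return (share, ratio, passes) for the two PCA criteria."""
    ordered = sorted(eigenvalues, reverse=True)
    share = ordered[0] / sum(ordered)  # proportion of variance on component 1
    ratio = ordered[0] / ordered[1]    # first-to-second eigenvalue ratio
    passes = share > min_share and ratio >= min_ratio
    return share, ratio, passes


# Hypothetical eigenvalues for a 10-item scale (illustrative only).
toy_eigenvalues = [6.0, 1.0, 1.0, 0.5, 0.5, 0.25, 0.25, 0.25, 0.125, 0.125]
share, ratio, passes = unidimensionality_check(toy_eigenvalues)
```

With these toy values the first component carries 60% of the variance and the ratio is 6, so both criteria are met.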

The goodness of fit of the Rasch model
In the Rasch model, infit MNSQ values are usually used to evaluate the goodness of fit of the items. Following the suggestion of Wright and Linacre (1994), an item was considered to fit poorly when its infit MNSQ value was greater than 1.5 or less than 0.5. Rasch model analysis showed that all 10 items in the K10 scale fit well (infit MNSQ values, 0.928-1.072), as shown in Table 2.
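The infit MNSQ of an item is the information-weighted mean square: the sum of squared residuals (observed minus model-expected response) across persons, divided by the sum of the model variances of those responses. The sketch below shows the arithmetic only, assuming the expected values and variances have already been obtained from a fitted model. The function names and toy numbers are hypothetical, and the default bounds of 0.7 and 1.3 are the ones given in the statistical analysis section.

```python
def infit_mnsq(observed, expected, variances):
    """Information-weighted mean square for one item across persons."""
    squared_residuals = sum((x - e) ** 2 for x, e in zip(observed, expected))
    return squared_residuals / sum(variances)


def within_fit_range(value, lower=0.7, upper=1.3):
    """True when an infit MNSQ value lies inside the acceptable band."""
    return lower <= value <= upper


# Toy dichotomous example (not study data): each person has a 0.5 chance
# of endorsing the item, so the model variance is 0.5 * 0.5 = 0.25.
toy_observed = [1, 0, 1, 0]
toy_expected = [0.5, 0.5, 0.5, 0.5]
toy_variances = [0.25, 0.25, 0.25, 0.25]
toy_infit = infit_mnsq(toy_observed, toy_expected, toy_variances)
```

A value of 1.0 means the observed variation matches what the model predicts; the reported range of 0.928-1.072 sits close to that ideal.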

Threshold order and Category probability curves
There was no evidence of disordered thresholds in the category probability curves, and the category thresholds increased in an orderly manner (as shown in Figure 1 and Table 2). All items had five response categories, so each item was expected to have four thresholds. Items 1, 3, 5, and 10 in K10 showed slightly weaker discrimination: P2 (a little of the time) was difficult to distinguish from P1 (none of the time) and P3 (some of the time), so P2 was poorly differentiated. Because these items did not perform well, they could be candidates for deletion when the screening indicators are refined subsequently.
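To make the link between thresholds and category probability curves concrete, the sketch below computes the five category probabilities for one item and checks whether its four thresholds are ordered. It assumes a partial credit parameterization of the polytomous Rasch model, which the text does not confirm, and the function names and threshold values are hypothetical.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under a partial credit model."""
    # Cumulative sums of (theta - threshold); category 0 has an empty sum of 0.
    logits = [0.0]
    for delta in thresholds:
        logits.append(logits[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(thresholds):
    """True when each threshold is strictly higher than the previous one."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))


# Hypothetical thresholds (in logits) for a five-category item.
toy_thresholds = [-1.0, 0.0, 1.0, 2.0]
probs = pcm_probabilities(-1.0, toy_thresholds)
```

At a person location equal to the first threshold, the first two categories are equally likely, which is where their probability curves cross. When thresholds are ordered, each category is the most likely response somewhere along the scale; a category that is never the most likely, as described for P2 above, signals weak differentiation.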

Differential item functioning (DIF)
The items of K10 were assessed for DIF across gender and age (Table 3). We used the likelihood ratio (LR) chi-square test to detect DIF and set the detection level α to 0.01. Significant DIF was found on item 3 when assessing age. However, DIF between age groups was not significant when α was adjusted by the Bonferroni method (0.01/10 = 0.001). No DIF was found between genders.
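The Bonferroni adjustment used here simply divides the detection level by the number of items tested. The short sketch below illustrates how an item can be flagged at α = 0.01 yet not at the adjusted level of 0.001. The function names are hypothetical, and the p-values are invented for illustration; they are not the values in Table 3.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni adjustment."""
    return alpha / n_tests


def flag_dif(p_values, threshold):
    """Mark each item whose DIF test p-value falls below the threshold."""
    return [p < threshold for p in p_values]


# Hypothetical p-values, one per K10 item (illustrative only).
toy_p_values = [0.40, 0.22, 0.004, 0.65, 0.31, 0.09, 0.58, 0.77, 0.12, 0.45]

unadjusted_flags = flag_dif(toy_p_values, 0.01)
adjusted_flags = flag_dif(
    toy_p_values, bonferroni_threshold(0.01, len(toy_p_values))
)
```

In this toy case the third item is flagged before adjustment and no item is flagged afterwards, mirroring the pattern reported for item 3.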
The box plot on the left of Figure 2a shows the dispersion of K10 scores between age groups. The shaded box is the interquartile range, representing the middle 50% of the differences (bounded by the bottom and top of the shaded box), which ranged from -0.002 logit to +0.003 logit, with a median of about 0.001 logit. The left of Figure 2b shows a box plot of the differences by gender. The interquartile range, representing the middle 50% of the differences (bounded by the bottom and top of the shaded box), ranged from -0.002 to 0.003, with a median of about 0.001.
In the graph on the right, the same difference scores were plotted against the initial scores ignoring DIF ("initial theta"), separately for younger and older individuals. Guidelines were placed at 0.0 (solid line), i.e., no difference, and at the mean of the differences (dotted line). The positive values on the left side of the graph showed that, for older individuals, accounting for DIF would decrease the score (i.e., the score ignoring DIF minus the score accounting for DIF was greater than 0, so the score accounting for DIF was lower than the score ignoring DIF); the opposite held for younger individuals. Figure 2b showed that the middle 50% of the differences for gender DIF ranged from -0.010 to 0.008. The solid line (reference line) on the left coincided with the dashed line (the mean of the differences), indicating that DIF did not differ between genders. The influence of DIF on the K10 scores of males and females could be ignored.

Person-item map
In the person-item map, the histogram of the estimated person ability θ̂ is shown at the top. The black dots on the lines at the bottom of Figure 3 are the item difficulty thresholds. In the K10 scale, individuals' ability θ̂ was between -3.00 logit and -2.00 logit, and individuals were more inclined to choose the low-scoring options "none of the time" and "a little of the time". In other words, participants were mostly responding to items with difficulty higher than 1.00 logit. Figure 3 showed that the difficulty of items 1, 3, and 5 was around 0.00 logit, while the difficulty of the other items was generally around 1.00 logit, so the overall difficulty of K10 was relatively high.

Discussion
The purpose of this study was to evaluate the applicability of K10 in measuring the psychological distress of residents in the urban-rural fringe of China. The Mandarin version of K10 showed high internal consistency and reliability. The results of the PCA and the Rasch model supported the unidimensional structure of K10, which was consistent with a previous study [28]. Rasch analysis showed that the response categories of all items were ordered by severity. Although DIF for item 3 (Did you feel so nervous that nothing could calm you down?) was statistically significant between age groups before adjustment, there was no significant DIF by gender or age after adjustment. Our research supported the view that the K10 distress scale is an effective measurement tool [13], which could be used to quantify the non-specific psychological distress of residents in the urban-rural fringe of China.
Rasch analysis is a powerful tool that allows the identification of items that are not sensitive to the latent trait. The analysis of K10 showed that although not all items fully complied with the Rasch model, all items were effective for measurement; that is, the infit MNSQ values were less than 1.3 [32]. These results showed that K10 was a good measure of a latent construct. Principal component analysis of the residuals further supported the unidimensionality of K10 [34]. Although some studies have found that K10 was two-dimensional, with differing two-factor structures [24,40,41], this study supported the unidimensional hypothesis of K10 [28]. The Cronbach's alpha coefficient of K10 in this study was very satisfactory and was similar to the reliability reported for the Mandarin and other versions [23,24]. Psychological distress differed by gender and age. It has been reported that female respondents showed significantly higher anxiety than males [18,42,43]. In line with that, we found that, with the exception of the "feel worthless" item, the mean item scores of females were higher than those of males. Moreover, our study revealed that the level of psychological distress of the elderly was higher than that of the young on the items "feel restless or fidgety" and "feel so restless that you could not sit still", whereas there was no difference between the older and the younger groups on the remaining items. However, Evelyne Bougie [18] found that people aged 55 or older suffered significantly less than younger people, which was contrary to our conclusion.
Our study demonstrated ordered thresholds in the K10 category probability curves, which meant that respondents who reported more mental distress on an item had more anxiety than those who reported less. In other words, the K10 scale could accurately distinguish subjects with low and high levels of mental distress [36]. Our results were consistent with previous studies [13-17], suggesting that K10 was a good measure for people with mental distress. However, the discrimination ability of items 1, 3, 5, and 10 in the K10 scale was poor. Specifically, the distinction between "none of the time", "a little of the time", and "some of the time" was not good. This is consistent with a rural study in Bangladesh [44], which found disordered thresholds in the category probability curves and suggested merging the disordered categories.
When the K10 scale was used to measure the psychological distress of residents in the urban-rural fringe, no DIF by age or gender was found. This was consistent with the results of K10 validation in rural areas of Bangladesh [44]. The evidence thus indicated that K10 measured consistently across genders and age groups [37].
The person-item map represents the location of item difficulty and the distribution of persons along the logit scale [45]. In the K10 scale, the ability of individuals was between -3.00 logit and -2.00 logit, and individuals were more inclined to choose the low-scoring options "none of the time" and "a little of the time". In other words, the K10 scale was relatively difficult, and it was not easy to distinguish among subjects with low levels of psychological anxiety [39]. A similar phenomenon was also found in rural Bangladesh [44]. This might be partly because the participants in the evaluated group were more satisfied with their current psychological state and therefore generally reported a lower level of psychological anxiety than expected [44]. Future research should consider data on the floating population in the urban-rural fringe, which could better verify these measurement properties of the K10 scale.
Some limitations should be considered in interpreting our findings. Firstly, all participants were recruited from one district, so there might be selection bias; the results need to be verified in random samples from other urban-rural fringe areas before they can truly represent the psychological distress of urban-rural fringe areas in China. Secondly, the subjects of this study were adults, and there is a lack of measurement data for children and adolescents in the urban-rural fringe, which limits the extrapolation of the model. In addition, although our study analyzed DIF across different characteristics, we only examined DIF by age and gender, and DIF across other characteristics needs to be explored later. Finally, population mobility in the urban-rural fringe is high, and although remedial measures were taken during the study period, this might affect the generalizability of the results.

Conclusions
In conclusion, the Mandarin version of K10 had high reliability and structural validity in measuring and identifying psychological distress among residents of the urban-rural fringe. The K10 scale satisfied most of the assumptions of the Rasch model and did not show DIF for age or gender. According to the ordered thresholds of the category probability curves, the K10 scale could accurately distinguish subjects with low and high levels of psychological distress. This study conducted a more comprehensive analysis of the measurement properties of the Mandarin version of K10, but further research is still recommended to address the problems of inappropriate items and high difficulty.

Table 1 Score of K10 on age (younger and older) and gender (male and female)

These graphs showed the difference between scores that ignored DIF and scores that accounted for DIF. The graph on the left showed a box plot of these differences. The interquartile range represented the middle 50% of the differences (bounded by the bottom and top of the shaded box). In the graph on the right, the same difference scores were plotted against the initial scores ignoring DIF ("initial theta"), separately for younger and older individuals. Guidelines were placed at 0.0 (solid line), i.e., no difference, and at the mean of the differences (dotted line).