Reliability of pictorial Longshi Scale for informal caregivers to evaluate the functional independence and disability

Abstract
Aim: The pictorial Longshi Scale was designed to assess patients' functional ability in the Chinese context and is increasingly used by informal caregivers. However, the reliability of their assessments relative to those of healthcare professionals has not been examined.
Design: A multi-centre cross-sectional study conducted in 24 Chinese hospitals.
Methods: We recruited dyads of patients undergoing rehabilitation treatment and their informal caregivers. Informal caregivers and healthcare professionals evaluated patients' functional ability using the Longshi Scale according to three levels (bedridden, domestic and community). The Kappa coefficient and McNemar-Bowker test were used to examine the consistency and accuracy between the two parallel assessments.
Results: This study involved 947 patient-caregiver dyads (mean patient age: 64.86 ± 12.94 years; mean caregiver age: 46.07 ± 11.72 years). Most patients were males (66.3%), while most caregivers were females (60.7%). Over 70% of patients and caregivers had a secondary-school education or lower. Around 90% of caregivers were relatives of patients (spouse, 42.8%; offspring, 20.7%; siblings, 13.3%; parent, 12.0%). Agreement in the sub-levels of the Longshi Scale between caregivers and healthcare professionals ranged from 73% to 89%, with corresponding Kappa coefficients from 0.504 to 0.786. Compared with healthcare professionals, caregivers tended to assign fewer patients to the bedridden group and more to the domestic group. The subgroup analysis by education level indicated that the difference in assigning patients to the three degrees of functional disability was significant only among caregivers with a primary-school education, and non-significant among those with a secondary-school education or higher.
Conclusion: The evaluation outcomes of functional ability using the Longshi Scale are similar between informal caregivers and healthcare professionals. However, informal caregivers' education level is a dominant factor affecting their assessment accuracy relative to healthcare professionals. The findings support independent evaluation of patients' functional ability by informal caregivers with a secondary-school education or higher.


| INTRODUCTION
Approximately 15% of adults worldwide live with some kind of disability (Bethge et al., 2014), a proportion that is projected to increase annually (Lee et al., 2021). Disability is caused by a range of factors, including trauma, ageing and acute or chronic diseases (Shakespeare & Officer, 2011).
Accurately assessing functional independence and disability facilitates the design of rehabilitation strategies and service guidelines, further promoting the recovery of patients' functional ability (Liu et al., 2022).
Currently, functional ability is mainly evaluated by healthcare professionals using specialized scales, which are time-consuming and laden with terminology. Moreover, the assessment outcomes are challenging for patients and their families to understand (Prodinger et al., 2017). A simple and reliable tool with which patients and their informal caregivers can assess functional ability is warranted.
Informal caregivers are defined as "individuals who provide ongoing care and assistance, without pay, for family members and friends in need of support due to physical, cognitive, or mental conditions" (Madara Marasinghe, 2016). They play essential roles in the rehabilitation setting, supporting the rehabilitation and subsequent discharge of patients (Young et al., 2014). Family care, the most common subtype of informal care, accounts for 80% of total care in Europe (Verbakel et al., 2017). Likewise, Asian cultural norms that encourage families to care for their elders substantially increase the family care rate (Ansah et al., 2016). Accumulating evidence has highlighted the significance of family care in supplementing the shortfall of professional care and enhancing the quality of long-term care (Ansah et al., 2016; Wang et al., 2017).

| BACKGROUND
Some studies have shown that families, non-professional healthcare workers and social workers can observe the functional impairments and disease symptoms of older people (Ranhoff, 1997; Wang et al., 2019), effectively improving early disease diagnosis and related treatments. In China, functional disability assessment at different stages of recovery remains inconsistent.
Professionals can accurately evaluate the varying degrees of patients' functional disability when patients are admitted to hospital.
However, continuous assessment after discharge is unavailable during family care (Wang et al., 2019) because of the lack of specialist physicians and of assessment tools for non-professionals (Bethge et al., 2014). Therefore, a family-friendly functional disability assessment tool that requires no specialized training is needed. Current tools for assessing functional disability are designed using written language, including the Barthel Index scale (Liu et al., 2022), the Functional Independence Measure scale (FIM) (Prodinger et al., 2017), the modified Rankin Scale (mRS) (Banks & Marotta, 2007), and the World Health Organization (WHO) disability assessment scale (Chen et al., 2020). These scales contain many specific medical terms and require participants to report functional limitations verbally, which makes them less feasible for people with illiteracy, language barriers or dementia. Pictorial scales have shown greater feasibility and applicability than text-based scales for those population groups in the medical field (Akena et al., 2018; Hadjistavropoulos et al., 2014; Theou et al., 2019; Tomlinson et al., 2010).

| Research question
In 2013, our team developed a pictorial Longshi Scale (Figure 1) based on a survey of 1,862 people with functional disabilities in China (Wang et al., 2019). To our knowledge, this is the first pictorial scale to assess functional ability. The reliability and validity of the Longshi Scale have been assessed among therapists, interns and personal health aides, with an intraclass correlation coefficient >0.8, indicating good intra- and inter-rater reliabilities (Wang et al., 2019). However, the reliability of the Longshi Scale among informal caregivers remains unassessed. Additionally, one study demonstrated that the education level of evaluators might influence the assessment outcomes. Therefore, we aimed to verify the reliability of the Longshi Scale among informal caregivers and to explore the influence of education level on assessment outcomes.

| Study design and setting
This multi-centre cross-sectional study was conducted in the departments of rehabilitation of 24 hospitals located in 11 cities across China from 11 to 31 December 2020. Initially, this study was designed to assess the accuracy and time consumption of informal caregivers using the Longshi Scale and Barthel Index scale to evaluate patients' functional independence, compared with professional healthcare workers. The present analysis used the basic demographic information and the Longshi Scale scores.

| Sample size
KEYWORDS: functional ability, healthcare professional, informal caregiver, Longshi Scale
In the original protocol, we planned to include 744 subjects, while a total of 1,006 eligible subjects were initially recruited. After evaluation against the inclusion and exclusion criteria, 947 subjects were included for analysis in this study. However, considering that the purpose of this study was to explore the reliability of informal caregivers using the Longshi Scale to assess patients' functional independence and disability, we recalculated the sample size using the following formula:
$$ n = \frac{\mu_{\alpha}^{2}\,\pi(1-\pi)}{\delta^{2}} $$
where π is the proportion of adults with disability (π = 15%) (Wang et al., 2019); μ_α is the critical value of the two-sided test for the type I error probability (μ_α = 2.580); and δ is the allowable error (δ = 0.05). We added 20% to allow for non-response and incomplete study instruments. Thus, we calculated that the minimum sample size was 408 in this study.
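As a check on the arithmetic, the stated inputs reproduce the reported minimum when the standard formula for estimating a single proportion, n = μ_α² π(1 − π) / δ², is assumed. The function name and the rounding-up step in this short Python sketch are illustrative choices, not taken from the study protocol.

```python
import math

def required_sample_size(pi, mu_alpha, delta, inflation=0.20):
    """Sample size for estimating a proportion, inflated for non-response.

    Assumes n = mu_alpha^2 * pi * (1 - pi) / delta^2, then adds `inflation`
    (20% here) and rounds up to the next whole subject.
    """
    base = (mu_alpha ** 2) * pi * (1 - pi) / (delta ** 2)
    return math.ceil(base * (1 + inflation))

# Inputs as stated in the text: pi = 15%, mu_alpha = 2.580, delta = 0.05
n_min = required_sample_size(pi=0.15, mu_alpha=2.580, delta=0.05)
print(n_min)  # 408
```

The uninflated value is about 339.5; adding 20% gives about 407.4, which rounds up to 408.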

| Participants
In this study, we recruited participants via posters at the nurse stations and in the wards of the surveyed hospitals. Nurses contacted the patients and their caregivers in the wards according to the bed number. Qualified participants were invited to participate in the study.
A total of 1,006 consecutive inpatients were enrolled according to the inclusion criteria. For the purpose of this study, we selected only adult patients (>18 years) who had functional disabilities after being diagnosed with cerebral haemorrhage, stroke, spinal cord injury, post-operative brain tumours or brain trauma. Those who suffered from mental illness or serious cognitive dysfunction, or who were unable to understand the images shown in the Longshi Scale, were excluded. All patients' cognitive status was assessed by a nurse using the Mini-Mental State Examination (MMSE) before the disability evaluation. We excluded patients with mental illness or cognitive dysfunction (MMSE < 27) because the Longshi Scale is a pictorial scale and patients with MMSE scores < 27 had difficulty recognizing its pictorial items. Additionally, patients who were participating simultaneously in other clinical studies were excluded. Informal caregivers were selected according to the patients they cared for. The inclusion criteria for informal caregivers were adults who took care of their patients without pay, including patients' families, relatives, friends or colleagues. Those who were hired as formal caregivers were excluded.
Non-Mandarin-speaking caregivers were also excluded because of a lack of translation services. The participants were from 24 hospitals in 11 cities of China, and every city might have its own dialect, but the professional evaluators and nurses who contacted the participants were unlikely to speak every dialect. To allow researchers, evaluators and patients to communicate more smoothly, we included only Mandarin-speaking participants. The informal caregiver was asked to assess their patient using the Longshi Scale independently. Written informed consent was obtained from the patient and caregiver dyads.
Eighteen healthcare professionals were willing to participate. They were randomly divided into six groups, each including one therapist and two interns. Before the assessment, all healthcare professionals were given a brief, half-day training session on how to score the Longshi Scale accurately. Each healthcare professional interviewed one patient at a time, and each patient received a Longshi Scale evaluation from one of three healthcare professionals. The assessments by healthcare professionals and by informal caregivers were conducted on the same day.

| Data collection
The sociodemographic information of the patients and informal caregivers was collected by nurses using online questionnaires after the informed consent forms were signed. Then, functional independence and disability were assessed using the Longshi Scale by healthcare professionals and informal caregivers, respectively. All data were recorded and uploaded to the Mike website through a dedicated account. Mike is an online form production website that can collect, store and manage data (MikeCRM Co., Ltd., https://www.mikecrm.com/). First, one nurse logged into the preregistered Mike account and created electronic forms online, including two basic information forms and the Longshi Scale. Each electronic form could generate a unique link or identification code. Second, the nurses used the link or identification code to collect the basic information of the patients and informal caregivers separately.
Thereafter, healthcare professionals and informal caregivers collected the Longshi Scale scores of patients on a face-to-face basis using another link or identification code. Once data collection was completed, the data could not be changed. Finally, all the data were reviewed and checked by the study assistants in the data management platform of the Mike website. Any record with missing information was excluded from the study.

| Instruments
The Longshi Scale assessment was divided into three steps. First, all patients were allocated to the bedridden, domestic or community group, depending on their ability to move out of bed, move outdoors and return indoors. Second, patients in each group were evaluated using a 3-point Likert subscale (form): (1) the bedridden group subscale (Form 1, including bladder and bowel management, feeding and leisure activities); (2) the domestic group subscale (Form 2, including toileting, grooming and housework); and (3) the community group subscale (Form 3, including community mobility, shopping and social participation). Third, we calculated the total score of each subscale (minimum independence = 3 and maximum independence = 9) (Figure 2).
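The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scale's official scoring procedure: the two yes/no inputs used for group allocation are a simplified reading of the mobility criteria, each item is assumed to be scored as an integer from 1 to 3 (consistent with the stated total range of 3 to 9), and all function and variable names are hypothetical.

```python
# Items per subscale, named as in the description of Forms 1-3.
GROUP_ITEMS = {
    "bedridden": ("bladder and bowel management", "feeding", "leisure activities"),
    "domestic": ("toileting", "grooming", "housework"),
    "community": ("community mobility", "shopping", "social participation"),
}

def allocate_group(can_leave_bed, can_go_outdoors_and_return):
    """Step 1 (assumed rule): allocate a patient by mobility."""
    if not can_leave_bed:
        return "bedridden"
    if not can_go_outdoors_and_return:
        return "domestic"
    return "community"

def subscale_total(group, item_scores):
    """Steps 2-3: validate the three 3-point items and sum them (range 3-9)."""
    total = 0
    for item in GROUP_ITEMS[group]:
        score = item_scores[item]
        if score not in (1, 2, 3):
            raise ValueError(f"{item!r} must be scored 1, 2 or 3, got {score!r}")
        total += score
    return total

# Hypothetical patient who can leave bed but cannot go outdoors and return.
group = allocate_group(can_leave_bed=True, can_go_outdoors_and_return=False)
total = subscale_total(group, {"toileting": 2, "grooming": 2, "housework": 1})
print(group, total)  # domestic 5
```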

| Analysis
Statistical analyses were conducted using SPSS (version 22.0; IBM Corp., Armonk, NY, USA). Demographic characteristics were presented as numbers. Demographic characteristics included age, sex (male and female), marital status (such as married, unmarried, divorced and widowed), ethnicity (Han and minority), living pattern (such as living alone, living with family, living with tender, living in nursing institution and other), annual household income (less than 50,000, 50,000-100,000, 100,000-150,000 and more than 150,000 yuan) and degree of education (primary and lower, high school, college and higher). Religion and retirement were each coded as "yes" or "no".
Descriptive statistics (i.e. frequency, percentage, mean and standard deviation) were calculated. The Kolmogorov-Smirnov test was used to examine the normality of the data distribution. The chi-square test or McNemar-Bowker test was used to compare the nominal variables in the three groups (bedridden, domestic and community). The Kruskal-Wallis test was used to compare differences in age among the groups. The Mann-Whitney test was performed to determine statistically significant differences between the Longshi Scale scores of the three groups. Scatter plots were used to compare the mean differences in Longshi Scale sum score between healthcare professionals and informal caregivers. The closer a scatter point is to the mean difference line, the better the consistency. The level of significance was set at p < 0.05.
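For readers unfamiliar with the McNemar-Bowker test of symmetry, the following Python sketch shows how the statistic is formed from a square cross-classification of two raters. It is an illustration only: the study itself used SPSS, the 3 × 3 table below is hypothetical, and the convention of skipping off-diagonal pairs whose counts are both zero (which reduces the degrees of freedom) is an assumption about how empty pairs are handled.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if df <= 0:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{k<df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum with half-integer gamma values.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (k + 0.5)
    return total

def mcnemar_bowker(table):
    """Return (statistic, df, p) for a square table of paired ratings.

    table[i][j] counts patients placed in category i by one rater and in
    category j by the other.
    """
    k = len(table)
    stat, df = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            n_ij, n_ji = table[i][j], table[j][i]
            if n_ij + n_ji > 0:  # empty pairs carry no information
                stat += (n_ij - n_ji) ** 2 / (n_ij + n_ji)
                df += 1
    p = chi2_sf(stat, df) if df > 0 else 1.0
    return stat, df, p

# Hypothetical counts: rows = professional rating, columns = caregiver rating
# (bedridden, domestic, community).
example = [[50, 10, 0],
           [4, 30, 6],
           [0, 2, 20]]
stat, df, p = mcnemar_bowker(example)
print(round(stat, 3), df, round(p, 3))
```

A perfectly symmetric table gives a statistic of 0 and p = 1. As a point of reference, `chi2_sf(8.759, 2)` evaluates to roughly 0.013.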
The healthcare professionals' scores served as reference standards. Cohen's kappa statistic (κ), a special type of correlation coefficient, was used as a standardized measure of agreement. Because all items were scored on an ordinal scale with more than two alternatives, the weighted kappa coefficient was used. The degree of agreement evaluated by the κ coefficient at the item level has the following standard definitions: poor (κ = 0.00-0.20), fair (κ = 0.21-0.40), moderate (κ = 0.41-0.60), good (κ = 0.61-0.80) and very good (κ = 0.81-1.00) (Wang et al., 2019). The marginal homogeneity test was used to examine asymmetry bias.
FIGURE 1 Longshi Scale for assessing the activities of daily living

| RESULTS

| Characteristics of participants
The sociodemographic characteristics of informal caregivers and their patients are summarized in Table 1. A total of 1,006 patients were invited to participate in the study. Of these, 59 were excluded because of refusal to participate (n = 18), missing data (n = 26) or duplicate data (n = 15). The remaining 947 eligible patients and their caregivers were included in this study. Among all the patients, 419 (44.2%), 298 (31.5%) and 230 (24.3%) were classified into the bedridden, domestic and community groups, respectively. The mean ages in the bedridden, domestic and community groups were 65.70 ± 13.588, 65.42 ± 12.987 and 62.55 ± 11.372 years, respectively. The majority of patients were male (n = 628, 66.3%), of Han ethnicity (n = 946, 99.9%), had a secondary school educational level (n = 544, 57.4%), had retired (n = 556, 58.7%), and had a family annual income of 50,000-100,000 yuan (n = 446, 47.1%).
The mean age of informal caregivers was 46.07 ± 11.715 years.

| Scores of Longshi Scale in healthcare professionals and informal caregivers
The Longshi Scale scores deviated significantly from a normal distribution in both healthcare professionals and informal caregivers (Kolmogorov-Smirnov = 5.018, p < 0.001 vs Kolmogorov-Smirnov = 5.049, p < 0.001). The scores of Longshi Scale items in the bedridden, domestic and community groups are presented as boxplots in Figure 3.
For healthcare professionals, the mean scores of the three items were 5.14 ± 1.946, 4.77 ± 1.421 and 7.55 ± 1.959, respectively.
For informal caregivers, the mean scores of the three items were 5.00 ± 1.936, 5.37 ± 1.656 and 7.18 ± 2.055, respectively. The mean score of each item was also compared, and there were no differences between healthcare professionals and informal caregivers in each item of the three groups (p > 0.05).

| Reliability analysis
All of the Longshi Scale items had kappa coefficients above 0.50, indicating at least moderate agreement between healthcare professionals' and informal caregivers' scores (Table 2). For the "community mobility" and "shopping" items, the kappa coefficients were higher than 0.70, indicating good agreement, and the agreement rates between healthcare professionals and informal caregivers were 86.4% and 89.3%, respectively. However, for the "bladder and bowel management," "entertainment," "toileting," "grooming and bathing" and "housework" items, the kappa coefficients were lower than 0.60, indicating moderate agreement, and the agreement rates were 73.6%, 74.5%, 73.9%, 72.7% and 85.7%, respectively.
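The two agreement measures reported here can be illustrated with a short Python sketch. The text does not state which weighting scheme was used, so linear weights are assumed; the ratings are made-up values for one 3-point item, and the function names are hypothetical. The interpretation bands are those given in the Analysis section.

```python
def agreement_rate(rater_a, rater_b):
    """Proportion of patients given the same score by both raters."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def weighted_kappa(rater_a, rater_b, categories=(1, 2, 3)):
    """Cohen's weighted kappa with linear disagreement weights (assumed)."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)          # 0 on the diagonal, 1 at the extremes
            num += w * observed[i][j]          # observed weighted disagreement
            den += w * row[i] * col[j] / n     # disagreement expected by chance
    if den == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - num / den

def interpret_kappa(kappa):
    """Map kappa to the agreement bands used in this study."""
    for upper, label in ((0.20, "poor"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "good"), (1.00, "very good")):
        if kappa <= upper:
            return label
    return "very good"

# Hypothetical scores for six patients on one item.
professional = [1, 1, 2, 2, 3, 3]
caregiver = [1, 2, 2, 3, 3, 3]
kappa = weighted_kappa(professional, caregiver)
print(round(agreement_rate(professional, caregiver), 3), round(kappa, 3),
      interpret_kappa(kappa))
```

In this toy example four of six ratings match, and the weighted kappa is 0.625, which falls in the "good" band. The sketch also shows why a high raw agreement rate can coexist with a more modest kappa: kappa discounts the agreement expected by chance from the marginal distributions.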
FIGURE 2 Flow chart of assessment using the Longshi Scale. The first step of the Longshi Scale is to assess whether subjects belong to the bedridden, domestic or community group according to whether they can transfer out of bed or outdoors and return. Each subject is then further evaluated using the corresponding form (subscale) of the Longshi Scale. Finally, the total score of each form (subscale) is calculated.
According to the evaluation results of informal caregivers, there were 591, 221 and 135 patients in the bedridden, domestic and community groups, respectively; according to the evaluation of healthcare professionals, there were 569, 241 and 137 patients in these groups. Without stratification by the education level of informal caregivers, no statistically significant difference existed between informal caregivers and healthcare professionals (McNemar-Bowker = 7.413, p > 0.05). Considering that education level was an important factor influencing the evaluation results, we conducted a subgroup analysis by education level of informal caregivers, using healthcare professionals as reference standards. The results showed no statistically significant difference in the secondary school or above groups (McNemar-Bowker between 0.707 and 4.714, p > 0.05). However, in the primary school group, the Longshi Scale evaluation differed significantly between healthcare professionals and informal caregivers (McNemar-Bowker = 8.759, p = 0.013). The results are shown in Table 3. Figure 4 shows the difference in the Longshi Scale sum scores between the healthcare professionals and informal caregivers. The mean differences were 0.14 ± 1.57, −0.59 ± 1.65 and 0.37 ± 1.96 in the bedridden, domestic and community groups, respectively. No significant bias existed in the sum scores: the informal caregivers scored only slightly higher than the healthcare professionals (5.41 ± 2.027 vs 5.39 ± 2.036, p > 0.05).
The scatter plot showed that 6, 3 and 1 data points fell outside the range (mean ± 2SD) in the bedridden, domestic and community groups, respectively. These findings implied good or better agreement on Longshi Scale scores between healthcare professionals and informal caregivers.
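The mean ± 2SD criterion for flagging out-of-range points can be sketched in Python as below. The scores are hypothetical, the difference is taken as professional minus caregiver (the direction consistent with the group means reported earlier), and the use of the sample standard deviation is an assumption, since the text does not say which estimator was applied.

```python
import statistics

def difference_limits(professional, caregiver):
    """Return mean difference, SD, (lower, upper) limits and outlier indices.

    Differences are professional minus caregiver; a point is an outlier when
    its difference lies outside mean +/- 2 SD (sample SD assumed).
    """
    diffs = [p - c for p, c in zip(professional, caregiver)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    lower, upper = mean - 2 * sd, mean + 2 * sd
    outliers = [i for i, d in enumerate(diffs) if d < lower or d > upper]
    return mean, sd, (lower, upper), outliers

# Hypothetical sum scores for ten patients in one group.
professional = [5, 5, 5, 5, 5, 5, 5, 5, 5, 9]
caregiver = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
mean, sd, limits, outliers = difference_limits(professional, caregiver)
print(round(mean, 2), round(sd, 2), outliers)
```

Only the last patient, whose two sum scores differ by 4 points, falls outside the limits in this example.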

FIGURE 3 Scores of Longshi Scale in healthcare professionals and informal caregivers. The Longshi Scale divided patients into three groups (bedridden, domestic and community); each group was evaluated using a 3-point Likert subscale, which included three different items about functional ability. The mean scores of Longshi Scale items in the bedridden group (a), domestic group (b) and community group (c) were compared using t-tests. The level of significance was set at <0.05. In the bedridden group, the mean scores of "bladder and bowel management," "feeding" and "entertainment" were 1.61, 1.84 and 1.58, respectively, according to informal caregivers, and 1.60, 1.84 and 1.63, respectively, according to healthcare professionals. In the domestic group, the mean scores of "toileting," "grooming and bathing" and "housework" were 2.15, 1.50 and 1.22, respectively, according to informal caregivers, and 2.12, 1.54 and 1.21, respectively, according to healthcare professionals. In the community group, the mean scores of "community mobility," "shopping" and "social participation" were 2.58, 2.55 and 2.64, respectively, according to informal caregivers, and 2.60, 2.59 and 2.68, respectively, according to healthcare professionals.
TABLE 2 Agreement rate of Longshi Scale scoring between healthcare professionals and informal caregivers. The results of the Longshi Scale evaluated by healthcare professionals served as reference standards. MNB, the McNemar-Bowker test, was used to compare the differences between healthcare professionals and informal caregivers with different education levels.

| DISCUSSION
There are approximately 42 million people with disabilities in China, and most are living in rural areas (Ansah et al., 2021). A large number of healthcare professionals are needed for the identification of high-risk groups, disability evaluation and nursing care (Bai et al., 2021).
However, the limited nursing staff and medical resources in some poverty-stricken areas prevent people with disability from being assessed and cared for (Qiao et al., 2022). Informal caregivers represent the most abundant personnel resource in looking after people with disabilities (Ranhoff, 1997). Training them to make a skilled assessment of functional independence helps lighten the burden on nursing staff and ease the effects of unbalanced medical resources. So far, the existing functional independence and disability scales are word-based and have some potential limitations, such as too many evaluation items, difficulty of use for non-professionals, being time-consuming and requiring more than two healthcare professionals to evaluate, which takes up extensive medical resources (Wang et al., 2019). In contrast, the Longshi Scale is a pictorial scale that can be used by non-professionals, and it takes no more than a minute to finish the evaluation, which greatly reduces the time and medical resources required. The results from this study indicate moderate or good agreement between healthcare professionals' and informal caregivers' scores on the Longshi Scale items and their sum scores. A few disagreements can be explained by within-patient variability due to day-to-day variation (Wang et al., 2019).
The grooming and bathing item demonstrated the poorest agreement (kappa 0.504), which might be related to variation in personal hygiene.
Ordinarily, informal caregivers might be able to observe functional decline in their patients (Blanco et al., 2020). Presumably, other informal caregivers, such as home helpers, could also be trained to score activities of daily living (ADL) reliably and evaluate functional independence accurately after a short introductory course (Tam & Schmitter-Edgecombe, 2019). In the current study, the informal caregivers worked in teams together with the healthcare professionals; therefore, the scores may have been biased. Although they were instructed not to communicate about the scores, this confounding bias was inevitable. The consistency of ADL assessment between informal caregivers and healthcare professionals suggests that double-track assessment might be a potential approach to address the insufficient supply of professional care. Previous studies indicated that Barthel Index assessment by a physician from patient interviews was not reliable (Liu et al., 2020; MacIsaac et al., 2017; Wang et al., 2019). The results of this study indicate that ADL assessment by informal caregivers is a better method for detecting decline in functioning than a doctor's interview about ADL tasks, particularly among stroke survivors in the subacute and recovery stages.
The results obtained from the informal caregivers in this study are not suitable for the assessment of patients suffering from mental illness, because their scores might be biased (Ranhoff, 1997; Zhao et al., 2021).
FIGURE 4 Difference in Longshi Scale sum score scored by healthcare professionals and informal caregivers. The Longshi Scale divided patients into three groups (bedridden, domestic and community); each group was evaluated using a 3-point Likert subscale. The average differences in Longshi Scale sum score between healthcare professionals and informal caregivers were 0.14 ± 1.57, −0.59 ± 1.65 and 0.37 ± 1.96 in the bedridden, domestic and community groups, respectively. Outlier points were defined as those outside the range (mean ± 2SD). The scatter plot showed 6, 3 and 1 outlier points in the bedridden, domestic and community groups, respectively.
In addition, our results implied moderate or good agreement on the Longshi Scale evaluation between healthcare professionals and informal caregivers, especially for patients with disability in the community and domestic groups. This might be associated with the ceiling effect of the Longshi Scale. Similarly, the Barthel Index scale is known to have a ceiling effect that makes it insensitive to slight functional impairments in previously well-functioning patients (Sarker et al., 2012). Although a significant ceiling effect was found in the bedridden group in our previous study, the internal consistency of all three groups was acceptable for group comparison (Wang et al., 2019). However, the Barthel Index scale quantifies ADL on an ordinal, hierarchical scale that ranges from 0 to 100, which limits interpretation of numeric changes in the total score. For informal caregivers, it is difficult to understand how large a score change is meaningful. A distinct feature of the Longshi Scale is its categorization and scoring system, which facilitates informal caregivers' understanding of patients' functional independence.
Moreover, the pictorial scale may allow a much simpler and more inclusive assessment across all populations, especially for people with aphasia and reading difficulties (Quinn et al., 2011). In this study, we found no statistically significant differences between the two groups of raters, except among caregivers with a primary-school education or lower. Future research should focus on interventions that enable informal caregivers with a low degree of education to make reliable Longshi Scale assessments.
This study included 947 pairs of informal caregivers and patients from 24 clinical settings in 11 cities of China. To our knowledge, this study is the first to address the reliability of the pictorial Longshi Scale when used by informal caregivers to evaluate the functional independence and disability of inpatients. We believe that these findings provide insights into disability evaluation and medical resource allocation in some impoverished areas. Healthcare strategies for functional disability may integrate healthcare professionals with informal caregivers to improve the effectiveness of rehabilitation.

| LIMITATIONS
The interpretation of these results also needs to consider the following limitations. First, the cross-sectional design of this study precluded identification of causal relationships with functional independence. Second, the sampling method was non-random, and the included hospitals were collaborative organizations of the authors' departments. Although this inherent bias could not be avoided, our study covered 24 hospitals in 11 cities to support generalizability. Moreover, our study selected only patients aged over 18 years, which may make the findings inapplicable to populations under 18 years of age. Finally, most variables were measured by self-report; thus, we invited experienced investigators to assess functional disability and combined their assessments with medical records to reduce recall bias as much as possible.

| CONCLUSION
There is good or moderate agreement between healthcare professionals and informal caregivers on Longshi Scale evaluation.
However, informal caregivers' education level is a dominant factor affecting their assessment accuracy relative to healthcare professionals. The findings support independent evaluation of patients' functional ability by informal caregivers with a secondary-school education or higher.

ACKNOWLEDGEMENTS
We would like to thank Miss Chunli Cai and Wanqi Fu for offering technical assistance in data collection and participant recruitment.
We would also like to thank all the healthcare professionals, informal caregivers and patients who participated in this study.

CONFLICT OF INTEREST
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

DATA AVAILABILITY STATEMENT
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

ETHICAL APPROVAL
The study protocol was approved by the Medical Ethics Committees of the Shenzhen Second People's Hospital (project identification code: 20201105004). Written informed consent was obtained from all patients and their informal caregivers, who agreed to participate in the study.