Development of the QLICD-PU
QLICD-PU consists of a general module, QLICD-GM, and a module specific to PUD. The development of QLICD-GM has been described in another paper [20]; here we briefly summarize the steps and results. QLICD-GM was developed with a programmed procedure that included focus group discussions, in-depth interviews, pre-testing, and four quantitative statistical analyses. The final QLICD-GM has 30 items covering 3 domains and 10 facets. Based on data from 620 patients with seven kinds of chronic diseases, such as coronary heart disease and hypertension, QLICD-GM showed good psychometric properties (reliability, validity, and responsiveness) [20].
For the specific module, 29 items reflecting symptoms were selected to form the initial item pool. These items focus on the side effects and mental health issues unique to PUD. We selected them from literature reviews and nominal/focus group discussions. Focus group members rated the importance of each item by ranking the items independently and then discussing the rankings, and the 9 lowest-ranked items were excluded. The remaining 20 items formed a preliminary questionnaire, which was used in the pilot test and in interviews with 29 PUD patients and 14 clinicians and researchers with extensive experience. We focused on patient opinion, which is most important for assessing the acceptability of interventions and related compliance. Based on the pilot data, the items were re-screened using a procedure similar to that of the general module (statistical procedures and focus group discussion). The final specific module consists of 14 items, coded PU1-PU14 (see Table 1 for details), classified into 6 facets.
Validation of the QLICD-PU
Data Collection and Scoring
In this study, we enrolled participants with PUD at any stage who were (1) able to provide written informed consent and (2) able to read and write, with assistance. The protocol placed no requirements on the specific clinical treatment of patients; physicians treated patients as they deemed clinically appropriate.
The survey was carried out at the First Affiliated Hospital of Kunming Medical University after approval by the ethics committee of Kunming Medical University. Researchers, including doctors and medical graduate students, explained the purpose of the study and obtained informed consent before the survey. Each participant was asked to complete the questionnaire upon admission. To assess test-retest reliability, a randomly selected subsample completed a second assessment on the second day of hospitalization. All patients available at the scheduled third assessment, at discharge, completed the questionnaire again so that its responsiveness could be assessed. Because there is no recognized gold standard for evaluating quality of life in PUD, the Chinese version of SF-36 [24] was also administered in the formal survey to evaluate the criterion-related validity, convergent validity, and discriminant validity of QLICD-PU. Baseline socio-demographic characteristics, including age, gender, education level, marital status, clinical history, and treatment, were recorded from hospital medical records. Each investigator checked the answers immediately to ensure that they were complete.
Each item uses a five-point Likert format, with response options ranging from "not at all" to "very much". Positively stated items are scored directly from 1 to 5, while negatively stated items are reverse-scored. Domain, facet, and overall scale scores are obtained by summing the scores of the relevant items, and all of them are linearly converted to standardized scores on a scale of 0-100. For both raw and standardized scores, a higher QLICD-PU score indicates a better quality of life.
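The scoring rule can be illustrated with a minimal sketch. The text states only that scores are "linearly converted" to 0-100, so the range-based conversion used here, (raw sum minus lowest possible sum) times 100 divided by the possible range, is an assumption, as are the function names `score_item` and `standardized_score`.

```python
def score_item(raw, positive=True):
    """Score one 1-5 Likert response; negatively stated items are reversed."""
    if raw not in (1, 2, 3, 4, 5):
        raise ValueError("response must be an integer from 1 to 5")
    return raw if positive else 6 - raw


def standardized_score(responses, positive_flags):
    """Sum the item scores of one domain/facet and rescale the sum to 0-100.

    Assumed conversion: (raw sum - lowest possible sum) * 100 / possible range.
    """
    scored = [score_item(r, p) for r, p in zip(responses, positive_flags)]
    raw_sum = sum(scored)
    n_items = len(scored)
    lowest, highest = 1 * n_items, 5 * n_items
    return (raw_sum - lowest) * 100.0 / (highest - lowest)


# Hypothetical facet with two positive items and one negative item.
facet_score = standardized_score([5, 1, 3], [True, False, True])
```

In this hypothetical facet the negative item answered 1 is reversed to 5, so the raw sum is 13 out of a possible 3 to 15.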
Psychometric Analysis
The validity, reliability, and responsiveness of QLICD-PU were then evaluated. Construct validity was evaluated with the Pearson correlation coefficient r between each item and each domain. Criterion-related validity was assessed by correlating the corresponding domains of QLICD-PU and SF-36. Multi-trait scaling analysis [25] was used to test the convergent and discriminant validity of the items, with two criteria: (1) convergent validity is supported when the correlation between an item and its own domain is 0.40 or higher; (2) discriminant validity is supported when an item's correlation with its own domain is higher than its correlations with the other domains. For reliability, the internal consistency of each domain/facet and of the overall scale was assessed with Cronbach's alpha coefficient, using the data from the first measurement (at admission). Test-retest reliability was evaluated with the Pearson correlation coefficient and the intra-class correlation coefficient (ICC) [26-27] between the first and second assessments. Responsiveness (sensitivity to detect change) was assessed with a paired t-test comparing the mean scores before and after treatment, and with the standardized response mean (SRM) [28-29].
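Three of these statistics can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions rather than the software used in the study: the function names are hypothetical, the item-domain correlations are not corrected for overlap (the text does not say whether such a correction was applied), and the SRM is taken as the mean change divided by the standard deviation of the change.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for an (n_persons, n_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_var / total_var)


def scaling_checks(items, domains, threshold=0.40):
    """Apply the two multi-trait scaling criteria to every item.

    domains[j] is the domain label of item j. Returns one
    (convergent_ok, discriminant_ok) pair per item.
    """
    items = np.asarray(items, dtype=float)
    labels = sorted(set(domains))
    # Domain score = sum of the items that belong to that domain.
    sums = {d: items[:, [j for j, dj in enumerate(domains) if dj == d]].sum(axis=1)
            for d in labels}
    results = []
    for j, own in enumerate(domains):
        r = {d: np.corrcoef(items[:, j], sums[d])[0, 1] for d in labels}
        convergent = bool(r[own] >= threshold)
        discriminant = all(r[own] > r[d] for d in labels if d != own)
        results.append((convergent, discriminant))
    return results


def standardized_response_mean(before, after):
    """SRM: mean of the score changes divided by their standard deviation."""
    diff = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
    return diff.mean() / diff.std(ddof=1)
```

The paired t-test itself could be run on the same before and after scores with `scipy.stats.ttest_rel`, which is left out here to keep the sketch dependent on NumPy only.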
Generalizability Theory Analysis
In addition to the classical test theory analysis, we applied Generalizability Theory (GT) to study the reliability of the QLICD-PU scores. GT is a modern test theory that combines classical test theory with analysis of variance. It was proposed as a method for improving the design of measurement procedures so as to obtain reliable data [30-33]. To control measurement error, GT introduces into the measurement model the independent variables or factors that interfere with test scores, such as differences between the objects of measurement, item difficulty, scoring criteria, and the interactions between these factors. Analysis of variance is then used to assess the impact of these variables or factors on test scores, with the variance component as the index. GT comprises a G study and a D study. The G study quantifies the amount of variance associated with each facet (factor) under examination. The D study indicates which protocol is best for a particular measurement by generating a generalizability (G) coefficient, which can be interpreted as a reliability coefficient that takes all facets of the study into account.
In our research, the G study and the D study were both carried out within one measurement model, a single-facet crossed design [person-by-item (p × i) design], to estimate the variance components and the reliability coefficients. We defined the patient's quality of life as the object of measurement and the item as a facet of measurement error. For the G study, we defined the universe of admissible observations, made up of the object of measurement and the measurement facet, and estimated the variance components. For the D study, we defined the universe of generalization, that is, the measurement conditions over which the researchers are willing to generalize, based on the object of measurement and the measurement facet. The variance components of the facets and their interaction, together with the generalizability coefficient and the index of dependability, were then calculated.
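For a p × i design with one response per person-item cell, the G study and D study computations can be sketched as below. This is a minimal illustration, not the software used in the study: it assumes the standard expected-mean-square estimators for a two-way crossed design without replication, it sets negative variance estimates to zero, and the function names `g_study` and `d_study` are hypothetical.

```python
import numpy as np


def g_study(scores):
    """Estimate variance components for a crossed person-by-item design.

    scores: (n_persons, n_items) array with one observation per cell.
    Returns the components for persons, items, and the person-by-item
    interaction, which is confounded with residual error in this design.
    """
    x = np.asarray(scores, dtype=float)
    n_p, n_i = x.shape
    grand = x.mean()
    p_mean = x.mean(axis=1)
    i_mean = x.mean(axis=0)
    ms_p = n_i * ((p_mean - grand) ** 2).sum() / (n_p - 1)
    ms_i = n_p * ((i_mean - grand) ** 2).sum() / (n_i - 1)
    resid = x - p_mean[:, None] - i_mean[None, :] + grand
    ms_res = (resid ** 2).sum() / ((n_p - 1) * (n_i - 1))
    return {
        "p": max((ms_p - ms_res) / n_i, 0.0),   # persons (object of measurement)
        "i": max((ms_i - ms_res) / n_p, 0.0),   # items (facet)
        "pi,e": ms_res,                          # interaction plus error
    }


def d_study(components, n_items):
    """Return (G coefficient, index of dependability) for n_items items."""
    rel_error = components["pi,e"] / n_items
    abs_error = (components["i"] + components["pi,e"]) / n_items
    g_coef = components["p"] / (components["p"] + rel_error)
    phi = components["p"] / (components["p"] + abs_error)
    return g_coef, phi
```

Calling `d_study` with different values of `n_items` shows how the coefficients would change if the module were shortened or lengthened. When `n_items` equals the number of items actually administered and no estimate is truncated at zero, the G coefficient from this design equals Cronbach's alpha.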