Palliative Performance Scale: Polish validation in hospice setting - a prospective observational study

Background: Measuring functional status in palliative care may help clinicians to assess a patient’s prognosis, recommend adequate therapy, avoid futile or aggressive medical care, consider hospice referral, and evaluate provided rehabilitation outcomes. An optimized, widely used, and validated tool is preferable. The Palliative Performance Scale Version 2 (PPSv2) is currently one of the most commonly used performance scales in palliative settings. This study aimed to translate this tool into Polish and to validate the translation (PPSv2-Polish).
Methods: Two hundred patients consecutively admitted to a free-standing hospice were evaluated twice on 2 consecutive days for test-retest reliability. In the first evaluation, two different care providers independently evaluated the same patient to establish inter-rater reliability values. PPSv2-Polish was compared with the Karnofsky Performance Score (KPS), Eastern Cooperative Oncology Group (ECOG) Performance Status (ECOG PS), and Barthel Activities of Daily Living (ADL) Index to determine its construct validity.
Results: A high level of full agreement between test and retest was seen (63%), and a good intra-class correlation coefficient of 0.85 (P<0.0001) was achieved. Excellent agreement between raters was observed when using PPSv2-Polish (Cohen’s kappa 0.91; P<0.0001). Satisfactory correlations with the KPS and good correlations with ECOG PS and Barthel ADL were observed. Persons who had shorter prognoses and were predominantly bedridden also had lower scores measured by the PPSv2-Polish, KPS and Barthel ADL. A strong correlation of 0.77 between PPSv2-Polish scores and survival time was noted (P<0.0001). Survival time correlated moderately with KPS, ECOG PS, and Barthel ADL scores (0.41, -0.62, and 0.58, respectively; P<0.0001).
Conclusion: PPSv2-Polish is a valid and reliable tool measuring performance status in a hospice population and can be used in daily clinical practice in palliative care and research.

The importance of measuring functional status in palliative care is incontestable. It may help clinicians to assess a patient's prognosis, recommend further oncological therapy, avoid futile or aggressive medical care, consider hospice referral, and evaluate provided rehabilitation outcomes. All measuring tools may have disadvantages, and an optimal one that could also be used to monitor the outcomes of physical therapy, specifically in palliative care, is lacking [1]. The Palliative Performance Scale version 2 (PPSv2) is currently one of the most commonly used performance scales in palliative settings. It was developed as a modification of the Karnofsky Performance Scale in 1996 [2] and validated later [3]. It has also been demonstrated that PPSv2 correlates with patients' survival time [4]. Although occasionally used in Poland, it has not yet been validated there. It is reasonable to disseminate knowledge of the tool among professionals and to promote its use through cultural adaptation.

Aim and design
The aim of this study was to translate PPSv2 into Polish and to validate the translation (PPSv2-Polish). This is a prospective observational study of cancer patients performed by the hospice care team at St Lazarus Hospice, Krakow, Poland. All participants were evaluated twice on two consecutive days for test-retest reliability by the same palliative team member (a trained and experienced palliative care nurse, psychologist, or physiotherapist) who cared for the patient.
Additionally, at the first evaluation, two care providers from different professions independently evaluated the same patient to establish inter-rater reliability values.
Each patient's evaluation encompassed the PPSv2-Polish, which was compared with 3 additional performance scales to establish its construct validity.

Participants
Two hundred adult patients (aged ≥ 18 years) consecutively admitted to an in-patient hospice were recruited and enrolled in the study. Eligible patients were native Polish speakers, clinically stable according to the attending physician, able to communicate and complete the questionnaires, and without cognitive impairment. The sample size for this survey was based on general guidelines for conducting qualitative research [5].

Measures
Palliative Performance Scale version 2 (PPSv2) and the Polish version of the tool
The PPS provides a functional assessment of a patient's ambulation, activity level, evidence of disease, self-care, food/fluid intake, and level of consciousness. The PPS has 11 categories, from 100% (fully mobile and healthy) to 0% (dead), in increments of 10%. In 2006, PPS version 2 (PPSv2) was introduced after clarification of the instructions for its use [6].
A modified combined translation technique [7] was applied to translate PPSv2 into Polish. It consisted of 1) independent forward translations by a physician, a psychologist and a Polish native speaker, 2) team discussion of the differences identified between these 3 versions until agreement was reached, 3) independent backward translations by a physician, a psychologist and a native English speaker, and 4) a second team discussion of any differences between the original and back-translated versions until all agreed that the two versions were semantically identical. This method was consistent with the Victoria Hospice translation guidelines sent to the authors. The discussed back-translated version was then preliminarily tested on patients to obtain the final version of the tool (PPSv2-Polish; see Additional file 1). The implementation process encompassed education of the medical staff participating in this study during one training session based on the Victoria Hospice Society instructions, which also provided an opportunity to obtain feedback from the team members.

Karnofsky Performance Score (KPS)
The KPS is an 11-point scale that runs from 100% (perfect health) to 0% (dead). First published in 1948 [8] to evaluate the ability to survive chemotherapy for cancer, it has recently undergone several evaluation adjustments [9]. The KPS provides great consistency of ratings by different oncology professionals [10]. It may also serve as a survival predictor [11].

Eastern Cooperative Oncology Group (ECOG) Performance Status (ECOG PS)
This scale, also called the World Health Organization (WHO) score, was published in its current form by ECOG in 1982 [12] to assess a patient's level of functioning in terms of self-care ability, daily activity, and physical ability, in order to measure the impact of the disease or treatment on performance status. It has good intra- and inter-observer agreement in the assessment of cancer patients' performance status [13]. It consists of 6 categories, from 0 (fully active) to 5 (dead). It is simpler to use than the KPS, may be clinically preferable to it [14], and is widely used in the literature.

Barthel Activities of Daily Living (ADL) Index
This "simple index of independence" was published in 1965 for measuring improvement during rehabilitation of the chronically ill [15]. The original version was modified in 1988 to a 20-point scale that measures in increments of 1 point, from 0 (fully dependent) to 20 (fully independent) [16]. The final score can be multiplied by 5 to obtain a 100-point score, and it is proposed that scores of 0-20 indicate "total" dependency, 21-60 indicate "severe" dependency, 61-90 indicate "moderate" dependency, and 91-99 indicate "slight" dependency. It is already widely used to measure activities of daily living and has become a standard measure of physical disability in practice [17]. Ten categories are assessed: feeding, grooming, bathing, dressing, toilet use, presence of fecal incontinence, presence of urinary incontinence, transfers (e.g., moving from wheelchair to bed), walking on an even surface (or propelling a wheelchair if unable to walk), and ascending and descending stairs.
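The rescaling and dependency bands described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the label returned for a score of 100 is an assumption, because the bands quoted in the text stop at 99.

```python
def barthel_100(score_20: int) -> int:
    """Rescale a 20-point Barthel score to the 100-point scale (multiply by 5)."""
    if not 0 <= score_20 <= 20:
        raise ValueError("20-point Barthel score must be between 0 and 20")
    return score_20 * 5


def dependency_band(score_100: int) -> str:
    """Map a 100-point Barthel score to the dependency bands quoted in the text."""
    if not 0 <= score_100 <= 100:
        raise ValueError("100-point Barthel score must be between 0 and 100")
    if score_100 <= 20:
        return "total"
    if score_100 <= 60:
        return "severe"
    if score_100 <= 90:
        return "moderate"
    if score_100 <= 99:
        return "slight"
    # Assumption: the text does not label a score of 100.
    return "independent"


if __name__ == "__main__":
    # Hypothetical patient scoring 7 of 20 points.
    scaled = barthel_100(7)
    print(scaled, dependency_band(scaled))
```

Because the rescaled score is always a multiple of 5, the bands are effectively 0-20, 25-60, 65-90, and 95.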

Statistical analysis
We summarized the baseline demographics using descriptive statistics, with medians and interquartile ranges (IQR) for non-normally distributed data. A Wilcoxon signed-rank test was used to compare paired ordinal scores in the test-retest and inter-rater comparisons. Non-parametric data within subgroups of patients were compared using a Mann-Whitney U test. The strength of the relationship between the test and retest variables, and between the tool scores and survival time, was calculated with Spearman's rank correlation coefficient. The inter-rater reliability was estimated using Cohen's kappa statistic. Data were analyzed using STATISTICA 13.0 (TIBCO Software Inc. 2017) data analysis software. A P-value of < 0.05 was considered the level of statistical significance. As there are no absolute rules for the sample size needed to validate a questionnaire, a fair size of 200 patients was chosen [5].
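For readers unfamiliar with the inter-rater statistic named above, the following is a minimal Python sketch of Cohen's kappa. It is not the authors' analysis, which was run in STATISTICA. It assumes the unweighted form of kappa (the text does not state whether a weighted variant was used), and the ratings shown are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of patients")
    n = len(rater_a)
    # Observed proportion of patients given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical PPS scores (%) given by two raters to eight patients.
    nurse = [30, 30, 40, 50, 20, 30, 40, 10]
    physiotherapist = [30, 30, 40, 50, 20, 40, 40, 10]
    print(round(cohen_kappa(nurse, physiotherapist), 3))
```

Kappa corrects the raw proportion of full agreement for the agreement expected by chance, which is why it can be well below the percentage of identical scores when a few categories dominate.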

Results
According to the hospice staff involved in this study, the PPSv2-Polish was a feasible and acceptable assessment tool in their practice. At the beginning of implementation, in the training phase of this study, two out of six team members observed that the 5-step PPSv2-Polish assessment was somewhat prolonged in comparison with the 1-step assessment of the KPS or ECOG PS. All participating staff emphasized the need to observe a patient for a reasonable period of time during the day shift to accurately assess his or her true "capable" functions based on the "observed" ones.
All the recruited subjects were able to complete the evaluation according to the study protocol (no missing values, a response rate of 100%). The majority of patients were elderly, had advanced cancer with a short prognosis (weeks) according to the Gold Standards Framework (GSF), and ultimately died at the hospice (Table 1).

Test-retest reliability
The median PPSv2-Polish value at the first measurement was 30 (IQR 10), which correlated with the data obtained two days later (median 30; IQR 20) by the same care provider. We achieved a high level of full agreement between test and retest (63%) and a good intra-class correlation coefficient of 0.85 (P < 0.0001).
A high level (94-99%) of full agreement between raters was observed, with the exception of ECOG PS, where this agreement was not achieved (Table 2).
Table 2 Inter-rater agreement of scales used in the study.
(Table 3).

Known-group validity
Persons who had a shorter prognosis and were predominantly bedridden also had lower scores measured by the PPSv2-Polish, KPS and Barthel ADL (Table 4).
Table 4 Responsiveness of the scales.
Moderate survival correlations were observed between KPS, ECOG PS, and Barthel ADL Index scores (0.41, -0.62, and 0.58, respectively; P < 0.0001).
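The rank correlations between scale scores and survival time reported here are Spearman coefficients. As a hedged illustration only, the sketch below computes Spearman's rho in plain Python by ranking both variables (ties receive the average rank) and taking the Pearson correlation of the ranks. The function names and the data are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical PPS scores (%) and survival times (days) for six patients.
    pps = [10, 20, 30, 30, 40, 50]
    survival_days = [2, 5, 9, 14, 30, 21]
    print(round(spearman_rho(pps, survival_days), 2))
```

A negative coefficient, such as the one reported for ECOG PS, is expected for a scale on which higher scores mean worse function.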

Discussion
The PPSv2-Polish, which was created with a combined translation technique, appears to be a valuable clinical assessment tool in the hospice setting within the Polish population of cancer patients. The implementation process and training of experienced palliative care medical staff proceeded without any particular difficulties. The team members reported that the PPSv2-Polish tool was clear and easy to use in daily practice, although somewhat time consuming, and that it required observation of a patient for a significant period of time (e.g. through a whole shift) to assess the potential capability of the evaluated functions. When various performance tools are compared, none appears to be statistically superior to the others in terms of inter-rater reliability [18]. The ECOG PS and KPS are often used to determine eligibility for clinical trials. However, there can be substantial disagreement between oncologists in the assessment of performance status, even when using a tool as simple as the ECOG PS [19].
Patients' age, preferences and socio-economic background may also influence the assessment.
Numerous studies have confirmed good correlations between the KPS, ECOG PS, and PPS tools, and a meta-analysis favors the KPS as descriptively better [18]. The authors of the original version of the PPSv2 noted that, in contrast to the KPS, this scale does not focus on the need for hospitalization (which is poorly defined and does not help in defining performance) but instead assesses food/fluid intake and level of consciousness [2]. The ambiguity of the KPS assessment when patients were bedridden (KPS ≤ 40%) led to a modification of the scale in Australia [20]. PPSv2 usage involves a 5-step assessment, which can be problematic at the beginning but, after training, may be more comprehensive and accurate.
Both KPS and PPSv2 scales need standardized, appropriate instructions regarding performance evaluation. Compared with the 5-point ECOG PS, the 11-point PPSv2 seems to be a more precise tool, especially for lower performance statuses. This phenomenon was also affirmed in our observations, where the ECOG PS did not achieve significant reliability between the scoring of different types of professionals.
The high level of agreement, with a very good correlation, in serial scoring by one rater over a two-day interval appeared better than in another study with two weeks between consecutive assessments [3].
The chosen interval of 2 days between assessments was long enough that the rater was unlikely to recall the first score, yet short enough that performance did not change in most cases.
The excellent inter-rater concordance observed in this study was higher than that reported in a recently published updated meta-analysis [18]. This could be partially explained by the careful selection of rating staff, who attended to patients for several hours daily and had a thorough understanding of their mobility and functionality [10].
A strength of this study was the high inter-rater agreement achieved each time professionals of different types (nurse, psychologist, or physiotherapist) took part in the assessment. Evaluations of a patient's performance status with the same tool can be inconsistent between different types of professionals (doctors rated patients as healthier than nurses did when using the PPSv2 scale), which may be explained by the different amounts of time spent with the evaluated person [10]. However, in one study even research assistants rated patients similarly to physicians (oncologists or radiation therapists) [21]. Ideally, patients should be scored within interdisciplinary team meetings to obtain a more accurate assessment [22].
Our study supports the use of the PPSv2-Polish in the prognostication of patients with advanced cancer, which is in line with previous studies [4] [6]. Among the analysed tools, the PPSv2-Polish showed the strongest correlation with survival time, a notable finding that, to the best of our knowledge, has not been published before. In another recent study of advanced cancer patients with a prognosis in terms of weeks, the PPSv2 assessment was as accurate as subjective clinical survival predictions [23]. The similarity between the attending physician's subjective prognosis assessment, expressed by GSF staging, and the PPSv2 scoring in our observations supports the good responsiveness of this newly validated tool to patients' changing prognosis and physical condition.
This finding could be explained by the strong relationship between the hospice staff and the patient, as this factor was described to have an impact on accuracy [24].
This study was not without limitations. First, the majority of the patients recruited had a low performance status, were mostly sitting or bedridden, and were not representative of the whole palliative population. Second, only in-patient subjects were recruited, and most of the unstable patients were excluded because of the test-retest requirements.