4.1 Item generation and scoring
The demographics and characteristics of the second-panel hospital managers are shown in Table (1). Content validation led to the removal of one item and indicated that eight items needed revision. The revised items required either further clarification and rewording or modification for specific participant groups. For example, the CVR results indicated that financial and price items should not be included for nonprofit hospitals. Additionally, the CVI results showed that particular items are relevant only to inpatients. This step raised the S-CVI, CVI-UA, and CVR from 0.90, 0.63, and 0.95 to 0.95, 0.78, and 0.97, respectively.
Table (1) is to be inserted here.
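The content validity indices reported above can be illustrated with a minimal sketch. The formulas used in the study are not stated in this section, so the sketch assumes the conventional definitions: Lawshe's CVR, the item-level CVI as the share of experts rating an item 3 or 4 on a 4-point relevance scale, the S-CVI as the average of the item-level CVIs, and the CVI-UA as the share of items with universal agreement. The function names and the ratings are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR = (n_e - N/2) / (N/2), assumed here as the CVR formula."""
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(ratings, relevant_from=3):
    """Item-level CVI: share of experts rating the item as relevant (3 or 4)."""
    return sum(1 for r in ratings if r >= relevant_from) / len(ratings)


def scale_cvi(ratings_by_item):
    """Return (average S-CVI, universal-agreement CVI) for per-item rating lists."""
    i_cvis = [item_cvi(r) for r in ratings_by_item]
    average = sum(i_cvis) / len(i_cvis)
    # Universal agreement: every expert rated the item as relevant.
    universal = sum(1 for v in i_cvis if v == 1.0) / len(i_cvis)
    return average, universal


# Hypothetical relevance ratings from five experts for four items.
example_ratings = [
    [4, 4, 3, 4, 4],
    [4, 3, 4, 4, 2],
    [3, 4, 4, 4, 4],
    [2, 4, 3, 2, 4],
]
s_cvi_ave, s_cvi_ua = scale_cvi(example_ratings)
```

In this hypothetical example, two of the four items reach universal agreement, which shows why the CVI-UA is typically lower than the averaged S-CVI, as in the values reported above.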
4.2 The instrument’s structure and items
The patient sociodemographics and hospital characteristics section included age, gender, scientific degree, working sector, and insurance availability and type. Moreover, the number of visits to the evaluated hospital was recorded to compare the attitudes of new and previous customers. The number of earlier visits is considered necessary in the analysis since past customer behavior tends to be a good predictor of future behavior [21]. The information source on which the respondent based the evaluation was also recorded since perceptions and attitudes may emerge from direct personal experience or from observing other people’s experiences, such as those of family and friends [20]. The second section of the questionnaire was designed to measure patient experiences in light of the BSC perspectives and patient attitudes toward them, including patient satisfaction, PQ, PI, and loyalty.
4.2.1 The financial perspective
This perspective evaluated the affordability of the prices of health services and medications. This section was answered only by patients who did not have insurance.
4.2.2 The internal perspective
This perspective assessed safety, time, and service availability in the experience section. In the attitude section, the PI of the cure rate, accuracy, and complications and the PQ of services and medication were measured.
4.2.3 The knowledge and innovation perspective
Information and training provided to patients were assessed in the experience section. Additionally, we assessed the PI of hospital technology and employee competencies in the attitude section.
4.2.4 The customer perspective
It assessed patient-centeredness and the HCW-patient communication experience. The attitude section assessed actual patient satisfaction and loyalty attitudes. In previous studies, validated items for loyalty measurement included satisfaction measurement and loyalty attitude measurement, specifically the recommendation and return intentions [22, 26]. Using a single item to directly assess actual patient satisfaction was suggested to be better than its assessment through multidimensional items [61].
4.2.5 The environment perspective
This perspective evaluated experiences of the hospital building environment, hospital capacity, ease of access, and female concerns. In the attitude section, a comparison with the other hospitals’ medical and social PIs was included.
Finally, reverse-worded items were included in the instrument: PIN9, which assessed long waiting time, and PIN4, PIN5, and PIN6, which assessed expectations of the probability of readmission, referral to other hospitals, and postoperative infection, respectively.
4.3 The pretest and the internal consistency
The pretest was performed at one NGO hospital in the south of the West Bank. Patients found the length of the questionnaire appropriate, and the layout was clear and well accepted. They gave minor comments, mainly on the wording of a few items, which were incorporated. The time for completing the questionnaire was less than 10 minutes.
Consequently, few modifications were made after piloting. Cronbach’s alpha was calculated per BSC perspective. All perspectives had a Cronbach’s alpha above 0.7 at the pretest, except for the environmental perspective, which had an alpha of 0.59. Hence, some of its items were moved to other perspectives, and five items were deleted. As a result, 52 and 50 items remained for inpatients and outpatients, respectively.
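The per-perspective internal consistency check can be sketched as follows. This is a minimal illustration that assumes the standard formula for Cronbach's alpha, alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score), computed on item scores that have already been reverse-coded where needed. The function name and the response data are hypothetical and are not the study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of four patients to a two-item perspective.
responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(responses)
meets_threshold = alpha > 0.7  # 0.7 is the cutoff applied in the pretest
```

A perspective whose alpha falls below the cutoff, as the environmental perspective did, would be flagged for item reallocation or deletion.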
4.4 Linguistic validation and translation
The final English and Arabic questionnaire forms were ready for use.
4.5 Sample size and characteristics
Since the research coincided with the COVID-19 pandemic, obtaining hospital approvals took six to nine months. Only 15 of the 18 hospitals agreed to participate. Data collection was performed between January and September 2021. The pretest data from one hospital were excluded. Next, we distributed 1000 questionnaires at the remaining 14 hospitals, of which 740 were returned (a response rate of 74%). The characteristics and sociodemographics of the respondents are shown in Tables (2) and (3).
Table (2) is to be inserted here.
Table (3) is to be inserted here.
4.6 Statistical analysis
The Shapiro–Wilk test showed that the data were not normally distributed, so nonparametric tests were used. Construct validity was then assessed for the instrument.
4.6.1 Construct validity in EFA
EFA resulted in 37 items with loadings higher than 0.50 on 12 components. The eigenvalues of all components were higher than one. The KMO was 0.901, reflecting very high sampling adequacy [48, 56], and Bartlett’s test was also significant. The cumulative variance was 63.29%. See Table (4). The first 11 components were patient attitude toward BSC perspectives (BSCP ATT), patient experience (PT EXR), service experience (SERV EXR), price experience (PR EXR), building experience (BUIL EXR), access experience (ACC EXR), complication perceived image (COMP IMAGE), technology experience (TECH EXR), information experience (INFO EXR), hospital social responsibility perceived image (HSRP IMAGE), and waiting time experience (WT EXR). One item (SAT2) loaded on the 12th component; however, this item had a higher loading on BSCP ATT. None of the inpatient-specific items had loadings higher than 0.50. Moreover, the scree plot indicated that the last three components should be deleted.
Table (4) is to be inserted here.
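The eigenvalue criterion and the cumulative variance reported above can be illustrated with a short sketch. It assumes that the components were extracted from a correlation matrix, so that the eigenvalues sum to the number of items and the variance explained by the retained components is their share of that sum. The function name and the eigenvalues are hypothetical and do not reproduce Table (4).

```python
def retained_components(eigenvalues, cutoff=1.0):
    """Return (number of components with eigenvalue > cutoff, cumulative variance %)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)  # equals the number of items for a correlation matrix
    kept = [value for value in ordered if value > cutoff]
    cumulative_percent = sum(kept) / total * 100
    return len(kept), cumulative_percent


# Hypothetical eigenvalues for a five-item correlation matrix.
n_kept, cumulative = retained_components([2.5, 1.2, 0.6, 0.4, 0.3])
```

The eigenvalue rule alone can retain more components than are interpretable, which is why the scree plot was also consulted before the number of components was fixed.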
4.6.2 Construct validity in CFA
The nine components retained from EFA were tested using the Amos program. The model was edited iteratively based on the item loadings, the model fit indices, and the convergent validity, discriminant validity, CR, IIC, and CITC calculated at each step until we arrived at the best model. First, two items that did not have loadings in EFA showed good loadings in CFA when added to the INFO EXR construct. The same held for the BSCP ATT and TECH IMAGE constructs. Second, splitting the BUIL EXR component into two separate constructs, building environment experience (BUILENV EXR) and building capacity experience (BUILCAP EXR), improved the item loadings and the model fit. Third, the PEN9 and PLE7 items were removed from the PT EXR construct because they had loadings lower than 0.50. In contrast, PIN14 and PIN16 were added to this construct since both had loadings higher than 0.50 and improved the model fit. Moreover, merging the TECH IMAGE and COMP IMAGE items into the BSCP ATT construct resulted in loadings lower than 0.5 and an IIC lower than 0.30. Hence, three separate constructs were retained in the attitude section. Finally, the modification indices in the Amos program were utilized to improve the model. In the final model, the CMIN/df, CFI, GFI, TLI, RMSEA, and SRMR indices met or were close to the cutoff points, reflecting good model fit. Despite that, the P value was <0.001, which can be attributed to the sensitivity of this test to departures from normality. See Figure (3) and Table (5).
Figure (3) is to be inserted here.
Table (5) is to be inserted here.
4.6.3 Composite reliability and interitem correlations
The composite reliabilities of all constructs were higher than 0.6 except for the SERV EXR construct. However, the IIC and CITC of this construct were higher than 0.3. The other constructs also had IICs higher than 0.3, and their CITCs ranged from 0.328 to 0.853, reflecting satisfactory IIC and CITC. See Table (6).
Table (6) is to be inserted here.
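The reliability measures in this subsection can be sketched as follows. The sketch assumes the usual formula for composite reliability from standardized loadings, CR = (sum of loadings)² / [(sum of loadings)² + sum of (1 - loading²)], and takes the CITC as the Pearson correlation between an item and the sum of the remaining items in its construct. The function names, loadings, and responses are hypothetical.

```python
from math import sqrt


def composite_reliability(loadings):
    """CR from standardized loadings, with error variance taken as 1 - loading^2."""
    total = sum(loadings)
    error = sum(1 - value * value for value in loadings)
    return total * total / (total * total + error)


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def corrected_item_total(rows):
    """CITC per item: correlation of the item with the sum of the other items."""
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]
        result.append(pearson(item, rest))
    return result


# Hypothetical standardized loadings and item responses for one construct.
cr = composite_reliability([0.7, 0.7, 0.7])
citc = corrected_item_total([[1, 2], [2, 1], [3, 4], [4, 3]])
```

With only two items, the CITC of each item equals the correlation between the two items, so the IIC and CITC coincide in this small example.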
4.6.4 Convergent and discriminant validity
The AVE, used to assess convergent validity, was less than 0.5 for BSCP ATT, BUILENV EXR, PTCOMINF EXR, SERV EXR, and COMP IMAGE. However, the CR, IIC, and CITC showed satisfactory results [58], except for SERV EXR, which had a CR of 0.50 but an IIC and CITC higher than 0.3. Regarding discriminant validity, the square roots of the AVE were higher than the off-diagonal correlations between constructs. Additionally, lower correlations between constructs indicate the uniqueness of each construct. The correlations between the independent constructs were either negligible or low, except for the correlation between PT EXR and INFO EXR, which was moderate. Merging these two constructs lowered the loadings and the model fit indices in CFA. The same was observed when merging the BUILENV EXR and BUILCAP EXR constructs. Consequently, separate constructs were retained, as mentioned earlier. No high or very high correlations existed between the independent constructs, which establishes discriminant validity and the uniqueness of these constructs. The same holds true for the dependent constructs. In summary, convergent validity was met for all constructs except SERV EXR, whereas discriminant validity was met for all constructs, as shown in Tables (7) and (8).
Table (7) is to be inserted here.
Table (8) is to be inserted here.
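The comparison between the square roots of the AVE and the inter-construct correlations can be sketched as follows. The sketch assumes that the AVE is the mean of the squared standardized loadings and that discriminant validity is judged with the Fornell-Larcker criterion, under which the square root of each construct's AVE must exceed its correlations with the other constructs. The construct labels A, B, and C, the AVE values, and the correlations are hypothetical and do not reproduce Tables (7) and (8).

```python
from math import sqrt


def average_variance_extracted(loadings):
    """AVE as the mean of the squared standardized loadings."""
    return sum(value * value for value in loadings) / len(loadings)


def fornell_larcker_violations(ave_by_construct, correlations):
    """Return construct pairs whose correlation is not below both sqrt(AVE) values.

    ave_by_construct maps a construct name to its AVE.
    correlations maps a (name, name) pair to the inter-construct correlation.
    """
    violations = []
    for (first, second), r in correlations.items():
        limit = min(sqrt(ave_by_construct[first]), sqrt(ave_by_construct[second]))
        if abs(r) >= limit:
            violations.append((first, second))
    return violations


# Hypothetical constructs: A has an AVE below 0.5 but low correlations.
ave = {"A": average_variance_extracted([0.7, 0.7, 0.7]), "B": 0.64, "C": 0.36}
correlations = {("A", "B"): 0.30, ("A", "C"): 0.65, ("B", "C"): 0.10}
violations = fornell_larcker_violations(ave, correlations)
```

The example shows that a construct with an AVE below 0.5 can still satisfy the criterion when its correlations with the other constructs are low, which is consistent with the pattern described above.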