Inductive and deductive methods were employed for item generation in this study. Boateng et al. (2018) describe the combination of these methods as the most appropriate technique for this purpose (43).
Face, content and construct validity, as well as reliability, were assessed for this instrument. Both qualitative and quantitative approaches can be used to assess face validity (44), and we applied both. For the qualitative approach, nurses (the target group) were interviewed; this helped ensure the suitability and completeness of the instrument's content and eased the comprehension of the items and the completion of the instrument.
The impact score method was used for the quantitative assessment of face validity. The minimum acceptable impact score for an item was set at 1.5, based on a 5-point Likert scale with a mean importance of 3.0 and a frequency of 50%.
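The impact score described above is the product of an item's frequency (the proportion of respondents rating it as important) and its mean importance among those respondents. A minimal sketch of this computation (the function name, cut-off choice and ratings are illustrative, not taken from the study):

```python
def item_impact_score(ratings, cutoff=4):
    """Impact score = frequency x importance.

    frequency  : proportion of respondents rating the item 4 or 5
                 on a 5-point Likert importance scale
    importance : mean rating among those respondents
    """
    high = [r for r in ratings if r >= cutoff]
    if not high:
        return 0.0
    frequency = len(high) / len(ratings)
    importance = sum(high) / len(high)
    return frequency * importance

# Hypothetical ratings from 10 respondents: 4 of 10 rate the item
# important (frequency 0.4) with mean importance 4.5, so the score
# is 0.4 * 4.5 = 1.8, above the 1.5 threshold.
score = item_impact_score([5, 4, 4, 5, 2, 3, 1, 2, 3, 2])
```

A frequency of 50% combined with a mean importance of 3.0 yields exactly the 1.5 threshold, which is where that cut-off comes from.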
Both qualitative and quantitative approaches were used to assess content validity in this study. The expert panel was asked to review the grammar, wording, item allocation and scaling of the instrument; review of an instrument's content by experts is regarded as one of the most suitable forms of evidence in support of its credibility. For the quantitative assessment, the CVR and CVI were measured. Tuyntha et al. (2004) argued that two points must be considered when examining content validity: ensuring that the most important and appropriate content has been selected, and designing the items in the most suitable form for measurement. The first is addressed by the CVR and the second by the CVI (45).
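The two quantitative indices can be sketched directly from their standard definitions (Lawshe's CVR and the item-level CVI); the expert counts below are hypothetical:

```python
def cvr(n_essential, n_experts):
    """Lawshe's content validity ratio: (ne - N/2) / (N/2),
    where ne experts rate the item 'essential' out of N experts."""
    half = n_experts / 2
    return (n_essential - half) / half

def i_cvi(n_relevant, n_experts):
    """Item-level content validity index: the proportion of experts
    rating the item relevant (3 or 4 on a 4-point relevance scale)."""
    return n_relevant / n_experts

# Hypothetical panel of 10 experts: 9 rate an item essential/relevant.
ratio = cvr(9, 10)    # (9 - 5) / 5 = 0.8
index = i_cvi(9, 10)  # 9 / 10 = 0.9
```

The CVR ranges from -1 to 1 and is 0 when exactly half the panel rates the item essential; the minimum acceptable value depends on panel size (per Lawshe's table).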
Construct validity was tested by factor analysis, which is a valuable tool for categorising items into factors (subscales). Each factor represents a distinct feature and guides the researcher in grouping and interpreting the items. Factor analysis comprises two methods, EFA and CFA (46).
The KMO index and Bartlett's test were computed before the exploratory factor analysis to check sampling adequacy. The KMO value was 0.879 and Bartlett's test was significant (χ² = 9148.396; P < 0.01). KMO values of 0.7–0.8 are considered adequate and 0.8–0.9 highly adequate (47).
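Both adequacy checks can be computed from the item correlation matrix. A minimal sketch, assuming an `n × p` data matrix of item scores (the simulated data and function names are illustrative, not the study's code):

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(data):
    """Bartlett's test of sphericity: tests whether the correlation
    matrix differs from the identity matrix."""
    n, p = data.shape
    R = np.corrcoef(data, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return chi2, stats.chi2.sf(chi2, df)

def kmo(data):
    """Kaiser-Meyer-Olkin measure: squared correlations relative to
    squared correlations plus squared partial correlations."""
    R = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d  # partial correlations from the inverse
    np.fill_diagonal(R, 0)
    np.fill_diagonal(partial, 0)
    return (R ** 2).sum() / ((R ** 2).sum() + (partial ** 2).sum())

# Hypothetical one-factor data: 300 respondents, 5 items.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 1))
data = latent + 0.5 * rng.normal(size=(300, 5))
chi2, p = bartlett_sphericity(data)  # significant: items correlate
adequacy = kmo(data)                 # close to 1 for factorable data
```

A significant Bartlett's test and a KMO above roughly 0.7, as reported above, indicate that the correlation matrix is suitable for factoring.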
Principal axis factoring, varimax rotation and the scree plot were used to determine the dimensions of the OPPMDS. The analysis was performed on 330 samples, with an eigenvalue > 1 and a factor loading > 0.4 as the retention criteria. The scree plot is a visual aid for determining the appropriate number of extractable factors (48). Varimax rotation extracted three factors meeting these criteria: "elderly-related factors" (10 items), "healthcare provider-related factors" (25 items) and "system-related factors" (4 items). Based on the three-indicator rule, there should be at least three observed items for each latent factor (46). In this study, each factor was named according to the common theme of the variables loading meaningfully on it.
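The eigenvalue-greater-than-one retention rule (Kaiser criterion) used above can be sketched via an eigendecomposition of the item correlation matrix; the scree plot displays these same eigenvalues in descending order. This sketch uses a plain eigendecomposition rather than the study's principal axis factoring, and the simulated two-factor data are hypothetical:

```python
import numpy as np

def kaiser_retained_factors(data):
    """Return the eigenvalues of the correlation matrix (descending)
    and the count of eigenvalues greater than 1."""
    R = np.corrcoef(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(R)[::-1]  # sort descending
    return eigvals, int((eigvals > 1).sum())

# Hypothetical data with two latent factors, three items each.
rng = np.random.default_rng(1)
f1 = rng.normal(size=(300, 1))
f2 = rng.normal(size=(300, 1))
data = np.hstack([f1 + 0.5 * rng.normal(size=(300, 3)),
                  f2 + 0.5 * rng.normal(size=(300, 3))])
eigvals, n_factors = kaiser_retained_factors(data)  # two factors retained
```

In practice the eigenvalue rule is combined with the scree plot and the interpretability of the rotated loadings, as done in the study.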
Confirmatory factor analysis was employed to examine the fit of the OPPMDS. Results showed that all the CFA indices (RMSEA, CFI, NFI and chi-squared) confirmed the fit of the model; all of these indices need to be assessed together for such a confirmation (46, 49).
Correlations between the measurement errors of e4/e6, e14/e16, e19/e25 and e33/e35 were identified in the final factor structure of the OPPMDS. Munro (2005) stated that correlated measurement errors arise when a variable in the model is not directly assessed, is unclear, or affects item responses (46). Correlated errors may also be a side effect of the assessment procedure (e.g. self-report) or arise from similarities in the wording of the items (50).
The reliability of the OPPMDS was evaluated using two methods: internal consistency and stability. Cronbach's alpha for the whole instrument was 0.956, indicating good internal consistency. Conventionally, this coefficient needs to be greater than 0.7 (51).
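Cronbach's alpha follows directly from the item and total-score variances: α = k/(k−1) · (1 − Σ item variances / total variance). A minimal sketch (the score matrix below is illustrative):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Three perfectly parallel hypothetical items give alpha = 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
alpha = cronbach_alpha(np.column_stack([x, x, x]))
```

When items covary strongly the total variance greatly exceeds the sum of item variances, pushing alpha toward 1, which is why 0.956 indicates high internal consistency.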
The stability of the scale was measured using test-retest and ICC methods. Terwee et al. (2007) introduced repeatability, comprising agreement and the intraclass correlation coefficient, as a necessary index for measuring the reliability of a scale (52).
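A common test-retest ICC is the two-way random-effects, absolute-agreement, single-measure form, often denoted ICC(2,1), computed from the ANOVA mean squares. A minimal sketch, assuming an `n × k` matrix of scores (rows are subjects, columns are occasions; the data are hypothetical and the study does not specify which ICC form it used):

```python
import numpy as np

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement,
    single measure, on an (n_subjects x k_occasions) matrix."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    subj_means = scores.mean(axis=1)
    occ_means = scores.mean(axis=0)
    msr = k * ((subj_means - grand) ** 2).sum() / (n - 1)   # between subjects
    msc = n * ((occ_means - grand) ** 2).sum() / (k - 1)    # between occasions
    resid = scores - subj_means[:, None] - occ_means[None, :] + grand
    mse = (resid ** 2).sum() / ((n - 1) * (k - 1))          # residual error
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Identical test and retest scores yield an ICC of 1.
perfect = icc_2_1([[1, 1], [2, 2], [3, 3]])
```

A systematic shift between occasions lowers the absolute-agreement ICC even when subjects keep their rank order, which is why Terwee et al. distinguish agreement from consistency.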
Results showed that the value of the MDC was greater than the MIC, indicating positive agreement.