With the aging of the population, sexuality among elderly individuals has become an essential concept (Sinković et al., 2019). This study investigated the validity and reliability of the Turkish version of ASKAS, which was developed for use by healthcare professionals, nursing home staff, and relatives caring for the elderly. The study results show that ASKAS_TR is valid, reliable, and usable in practice for Turkish society. The original English version of ASKAS was translated into Turkish in the first phase of the research. Although there is no standard for translating scales in cross-cultural studies (Cha et al., 2007), a systematic translation process consisting of different phases was followed. These phases, which follow the method recommended by the WHO for adapting instruments developed in other languages, were used to reduce differences arising from cross-cultural psycholinguistic features (World Health Organization, 2021). Validity refers to the ability of a measurement tool to measure accurately and completely the property it aims to measure. This study's content validity results reveal that the expert opinions are in consensus and that the language and content validity criteria are met (Karaahmetoğlu & Alpar, 2017). A CVI value above 0.80 is considered sufficient (Polit & Beck, 2012). The content validity ratios of the items ranged from 0.85 to 1.00, and the total CVI score for the scale was 0.92.
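The way a content validity index is obtained from expert ratings can be sketched as follows. This is a minimal illustration and not necessarily the procedure used in this study: it assumes each expert rates each item on a 4-point relevance scale, that ratings of 3 or 4 count as "relevant", and that the scale-level CVI is the average of the item-level values. The function names and the example ratings are hypothetical.

```python
def item_cvi(ratings, relevant=(3, 4)):
    """Proportion of experts who rated the item as relevant (assumed: 3 or 4)."""
    return sum(r in relevant for r in ratings) / len(ratings)


def scale_cvi(ratings_per_item):
    """Scale-level CVI, taken here as the mean of the item-level values."""
    values = [item_cvi(ratings) for ratings in ratings_per_item]
    return sum(values) / len(values)


# Hypothetical ratings: two items, four experts each.
example = [
    [4, 4, 4, 4],  # all experts find the item relevant
    [4, 3, 2, 4],  # one expert rates the item below 3
]
print([item_cvi(r) for r in example], scale_cvi(example))
```

An item or scale value computed this way would then be compared with the 0.80 criterion cited above.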
In assessing the psychometric properties of the knowledge subscale, item analyses and reliability studies were conducted. In the item analysis, two items were eliminated according to their item-total score correlations. According to Nunnally's (1967) guidelines, items with extreme P values (below 0.20 or above 0.80) are excluded from the scale. No items in the knowledge domain were excluded on this basis, since their difficulty values (P) ranged from 0.21 to 0.81. In the reliability analysis, the Kuder-Richardson-20 (KR-20) coefficient was used. The KR-20 coefficient can be considered the counterpart of Cronbach's alpha for dichotomously scored items (for example, true/false). The Kuder-Richardson formulas are called internal consistency coefficients because they are based on the assumption that each item of the test measures the same variable, that is, that the test is homogeneous (Ercan & Kan, 2004). KR-21 is used for tests without item analysis, and the difficulty of the test items is assumed to be equal. As a result, the coefficient calculated using the KR-21 method is considered the lower limit of reliability. If a test's KR-20 or KR-21 reliability is high, it can be assumed that all of the items measure the same attribute (the test is one-dimensional) and that the test scores are free of random errors (Karakoç & Dönmez, 2014). Since the KR-20 internal consistency coefficient in this study was 0.80, the final version of the knowledge domain consisted of 33 items. The knowledge domain's mean score was found to be 50.29 ± 10.77.
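The item difficulty screening and the two Kuder-Richardson coefficients described above can be illustrated with a short sketch. It assumes responses are coded 1 (correct) and 0 (incorrect), uses the population variance of the total scores (some texts use the sample variance instead), and runs on hypothetical toy data, not on the data of this study. All function names are illustrative.

```python
from statistics import mean, pvariance


def item_difficulty(responses):
    """P value of each item: proportion of respondents answering correctly."""
    n_items = len(responses[0])
    return [mean(row[i] for row in responses) for i in range(n_items)]


def flag_extreme_items(p_values, low=0.20, high=0.80):
    """Indices of items whose difficulty falls outside the 0.20-0.80 band."""
    return [i for i, p in enumerate(p_values) if p < low or p > high]


def kr20(responses):
    """KR-20: uses the difficulty of every item separately."""
    k = len(responses[0])
    p_values = item_difficulty(responses)
    total_var = pvariance([sum(row) for row in responses])
    item_var_sum = sum(p * (1 - p) for p in p_values)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def kr21(responses):
    """KR-21: assumes all items are equally difficult, so only totals are needed."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    m = mean(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * pvariance(totals)))


# Hypothetical data: 6 respondents (rows) by 4 true/false items (columns).
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
]
print(flag_extreme_items(item_difficulty(toy)), kr20(toy), kr21(toy))
```

On the same data, KR-21 does not exceed KR-20, which is why it is read as a lower limit of reliability.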
The mean score of the knowledge subscale was 64.19 ± 17.25 in the original scale (White, 1982), 49.00 ± 8.00 in the study conducted with gynecologists in New York (Langer-Most & Langer, 2010), 65.21 ± 12.32 in the study conducted with persons over 60 years of age in Poland (Cybulski et al., 2018), and 22.8 ± 4.69 in the study conducted with caregivers of elderly persons in Australia (Chen et al., 2017). As can be seen, the ASKAS knowledge subscale scores differ across studies conducted in different regions and samples.
As part of the validity and reliability study of the attitude subscale, a confirmatory factor analysis, an item analysis, an exploratory factor analysis, and an internal consistency analysis were conducted. Construct validity evaluates how accurately a measurement instrument can measure an abstract concept, behaviour, or dimension that cannot be directly observed and is difficult to measure but is theoretically explained (Güleç & Kavlak, 2013). An exploratory factor analysis was conducted to assess the construct validity of the attitude subscale. Before conducting factor analysis, various analyses are performed to evaluate whether the sample is large enough. The Kaiser-Meyer-Olkin (KMO) sampling adequacy test was used in this study. A KMO value between 0.80 and 0.90 indicates good sampling adequacy (Karaahmetoğlu & Alpar, 2017). A significant result on Bartlett's test, another test applied before factor analysis, shows that the correlation matrix of the scale items is suitable for factor analysis (Polit & Beck, 2012). The KMO value of 0.87 in this study indicated that the sample was adequate for factor analysis. The significant result of Bartlett's test showed that the items had a suitable correlation matrix.
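How the two preliminary statistics are obtained from an item correlation matrix can be sketched as follows. This is a minimal illustration in plain Python, assuming the overall KMO is computed from squared correlations and squared partial correlations, and that Bartlett's statistic uses the usual chi-square approximation; the p-value lookup is omitted. The example matrix and the sample size of 100 are hypothetical, not values from this study.

```python
import math


def invert_and_det(matrix):
    """Gauss-Jordan elimination with partial pivoting; returns (inverse, determinant)."""
    n = len(matrix)
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        scale = a[col][col]
        det *= scale
        a[col] = [v / scale for v in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a], det


def kmo(corr):
    """Overall KMO: squared correlations relative to squared partial correlations."""
    inv, _ = invert_and_det(corr)
    p = len(corr)
    r2 = a2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                a2 += partial ** 2
    return r2 / (r2 + a2)


def bartlett_sphericity(corr, n_obs):
    """Chi-square statistic and degrees of freedom of Bartlett's test."""
    p = len(corr)
    _, det = invert_and_det(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


# Hypothetical correlation matrix for three items.
example = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
print(kmo(example), bartlett_sphericity(example, n_obs=100))
```

A KMO value closer to 1 and a large chi-square statistic relative to its degrees of freedom both point towards a correlation matrix that is suitable for factoring.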
In exploratory factor analysis, rotation is used to make the factors more independent and easier to interpret. Varimax rotation, one of the most commonly used orthogonal rotation techniques, was used in this study. The higher the total variance explained by the factors resulting from the analysis, the stronger the factor structure of the scale. While single-factor structures should explain at least 30% of the total variance, this rate should be higher in multifactorial structures (Ayre & Scally, 2014). This scale's three factors explained a considerable portion of the total variance (62.77%). Therefore, the factor structure is appropriate. In factor analysis, three basic criteria are considered. The first is that items must have high loadings on the factor to which they belong (Lawshe, 1975). The bounds on the factor loadings that explain the correlations of the items with the factors are not clear, but it has been reported that the lowest acceptable factor loading is 0.20 (Strickland, 2003). Because the lowest factor loading in this study was 0.22, no item was removed from the scale. The second criterion is that the items have high loadings on one factor and low loadings on the other factors. When this criterion is met, it is possible to examine independent structures. The loadings are expected to be as high as possible, yet how small a difference can be ignored is debatable. The difference between an item's two highest factor loadings is expected to be at least 0.10 (Tavakol & Dennick, 2011). No exploratory factor analysis was reported for the original form of the scale or in its previous validity and reliability studies. In addition, the attitude dimension of the scale was previously used as a one-dimensional scale.
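The two loading criteria described above (a minimum primary loading of 0.20 and a gap of at least 0.10 between an item's two highest loadings) can be expressed as a simple screening step. The sketch below is illustrative only: the function name, the item labels, and the loading values are hypothetical and do not come from this study's rotated factor matrix.

```python
def screen_loadings(loadings, min_loading=0.20, min_gap=0.10):
    """Return items that fail the primary-loading or cross-loading criterion.

    `loadings` maps an item label to its loadings on each factor.
    """
    flagged = {}
    for item, row in loadings.items():
        ranked = sorted((abs(v) for v in row), reverse=True)
        reasons = []
        if ranked[0] < min_loading:
            reasons.append("low primary loading")
        if len(ranked) > 1 and ranked[0] - ranked[1] < min_gap:
            reasons.append("cross-loading")
        if reasons:
            flagged[item] = reasons
    return flagged


# Hypothetical rotated loadings on three factors.
example = {
    "item_a": [0.70, 0.10, 0.05],  # clear primary loading
    "item_b": [0.45, 0.40, 0.10],  # two loadings too close together
    "item_c": [0.15, 0.01, 0.02],  # primary loading below 0.20
}
print(screen_loadings(example))
```

Absolute values are compared so that negatively loading items are screened in the same way as positively loading ones.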
In scale adaptation studies, it is recommended to improve the model fit indices by adding error covariances between scale items whose high covariance reduces the fit (Mishra et al., 2019). In this study, error covariances among the scale items were assigned in accordance with the modification suggestions. However, an increasing number of error covariances means that the model is losing more and more of its confirmatory properties. Therefore, defining more than two or three covariances may cast doubt on the goodness of the model, although this does not eliminate the validity of the established model (Akgul, 2005). In this study, covariances were assigned between items that significantly affect the model's structure and have similar theoretical meanings.
Measurement invariance between groups or populations is tested using confirmatory factor analysis (Brown, 2015). In this study, confirmatory factor analyses were conducted to test the construct identified in the original study. Model fit can be evaluated with many different fit indices, but there is no absolute consensus on which values should be reported (Costello & Osborne, 2005). A chi-square to degrees of freedom ratio (χ²/df) of less than two indicates good fit, and less than five is acceptable. An RMSEA value of less than 0.05 indicates good fit, and less than 0.08 is acceptable. GFI, CFI, and IFI values above 0.95 indicate good fit, and values above 0.90 are acceptable (Costello & Osborne, 2005). In the study, the following values were obtained with a three-factor structure: χ²/df: 2.89, CFI: 0.87, RMSEA: 0.076, GFI: 0.83, and IFI: 0.87. The six-factor results of this study showed that the original scale structure did not have an acceptable fit, while the three-factor structure had an acceptable level of fit.
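The cutoffs listed above can be applied index by index, as in the following sketch. It encodes only the thresholds stated in this paragraph, labels the stricter cutoff as "good", and takes the three-factor values reported above as input; the function name and the label wording are illustrative assumptions.

```python
# (stricter cutoff, acceptable cutoff) for each index, as stated in the text.
LOWER_IS_BETTER = {"chi2_df": (2.0, 5.0), "rmsea": (0.05, 0.08)}
HIGHER_IS_BETTER = {"gfi": (0.95, 0.90), "cfi": (0.95, 0.90), "ifi": (0.95, 0.90)}


def classify_fit(index, value):
    """Label a single fit index against the cited cutoffs."""
    if index in LOWER_IS_BETTER:
        good, acceptable = LOWER_IS_BETTER[index]
        if value < good:
            return "good"
        if value < acceptable:
            return "acceptable"
        return "outside cutoff"
    if index in HIGHER_IS_BETTER:
        good, acceptable = HIGHER_IS_BETTER[index]
        if value > good:
            return "good"
        if value > acceptable:
            return "acceptable"
        return "outside cutoff"
    raise KeyError(f"no cutoff defined for {index!r}")


# Values reported for the three-factor structure.
reported = {"chi2_df": 2.89, "cfi": 0.87, "rmsea": 0.076, "gfi": 0.83, "ifi": 0.87}
for name, value in reported.items():
    print(name, value, classify_fit(name, value))
```

The sketch labels each index separately against the cited cutoffs; it does not replace an overall judgment of model fit, which usually weighs several indices together.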
The Cronbach's alpha coefficient, which indicates the internal consistency of measurements, is generally considered reliable in the range of 0.60–0.80 and highly reliable in the range of 0.80–1.00 (Tavşancıl, 2014). The alpha coefficients for the total scale and the subscale "Emotional Attitude of Caregivers towards Sexual Life of the Elderly" indicated high reliability, while those for the subscales "Behavioural Attitudes of the Caregivers towards Sexual Life of the Elderly" and "Cognitive Attitudes of the Caregivers towards Sexual Life of the Elderly" indicated reliability. The Cronbach's alpha coefficient for the scale total was 0.86 in the original version of the scale. The Cronbach's alpha coefficient of the scale was 0.90 in the study conducted with students in a nursing school in Israel (Gewirtz-Meydan et al., 2018), 0.93 in the study with nurses in Brazil (Evangelista et al., 2019), 0.87 in the study with nurses in Belgium (Mahieu et al., 2016), and 0.92 in the study with older people in the United States (Syme & Cohn, 2016). In this study, the scale's internal consistency was found to be highly reliable, in line with the literature.
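For reference, the computation of Cronbach's alpha and its interpretation against the ranges cited above can be sketched as follows. The sketch assumes complete data (no missing responses), uses sample variances, and assigns a coefficient of exactly 0.80 to the higher band; the function names and the toy responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def reliability_band(alpha):
    """Interpretation bands as cited in the text (0.60-0.80 and 0.80-1.00)."""
    if alpha >= 0.80:
        return "highly reliable"
    if alpha >= 0.60:
        return "reliable"
    return "below the cited range"


# Hypothetical data: 5 respondents (rows) by 3 Likert-type items (columns).
toy = [
    [3, 4, 3],
    [4, 4, 5],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3), reliability_band(alpha))
```

Applied to the coefficients discussed above, `reliability_band(0.86)` falls in the higher band.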