Aim
The purpose of this study was to psychometrically examine the Arabic versions of HLS-EU-Q16 and HLS-EU-Q6 and their response patterns among Arabic-speaking persons in Sweden.
Study design
The study had a prospective psychometric design and is part of a larger project aiming to evaluate and subsequently measure CHL and electronic health literacy among Swedish and Arabic-speaking persons in Sweden [24-27]. The project has been approved by the Regional Ethical Review Board in Stockholm, Sweden (No. 2019/5:1). All respondents were informed verbally and in writing, in Arabic and Swedish, about the purpose and procedures of the study, and were told that participation was voluntary and that they could withdraw at any time. They were guaranteed confidentiality and secure data storage, and were told that by answering the questionnaire they were giving their informed consent to participate in the study.
Sample, setting and data collection
Data collection was carried out from May to September 2019. Inclusion criteria were: 18 years of age or older, having Arabic as their mother tongue, and being present on the day of data collection. Convenience sampling was used, and the respondents were recruited by the last author visiting various arenas in three large Swedish cities, such as courses in civic orientation for newly arrived refugees, an Arabic language school and some informal Arabic-language networks. For more details, please see Wångdahl et al. 2021 [25]. Information about the study was given on site, both orally and through an information letter handed out on the day of data collection at each arena.
The sample size of around 300 respondents was selected based on guidelines for psychometric testing of instruments [28]. Accordingly, 335 people were invited to participate in the study. Of those invited, 49 were asked to respond to the questionnaire twice, approximately 7 days apart, in order to examine the test-retest reliability. A sample size of at least 25 people has been suggested as applicable for evaluating test-retest reliability [29]. Oversampling was used to minimize the risk of having too small a test-retest group in the case of dropouts at the second assessment. To make it possible to match the test and retest questionnaires, the respondents had to mark them with a study-specific personal code consisting of the first three letters of their mother's name and the year they were born. Twelve participants were excluded from the analysis for different reasons, such as incomplete questionnaires or absence at the second measurement. The final test-retest group therefore consisted of 37 respondents.
Questionnaires
The Ar-HLS-EU-Q16 consists of 16 items. Each item has the following four response alternatives: very difficult, difficult, easy, very easy. In the analysis, an HLS-EU-Q16 index score ranging from 0 to 16 is calculated (requires a response on at least 14 items). This is done by first dichotomizing the response alternatives into difficult (difficult and very difficult) and easy (easy and very easy), giving difficult the value 0 and easy the value 1, and then adding up the values of all items. Thereafter, the study population was divided into sub-groups based on CHL level. Based on the recommendations of the developer [30], the threshold values were set to: 0-8 for inadequate CHL, 9-12 for problematic CHL, and 13-16 for sufficient CHL.
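The scoring rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the analysis code used in the study: the function names `q16_index` and `q16_level`, the encoding of responses as lowercase strings, and the use of `None` for a missing answer are all hypothetical. The sketch also assumes that an unanswered item contributes 0 to the sum when 14 or 15 items are answered, which the text does not specify.

```python
from typing import Optional, Sequence

EASY = {"easy", "very easy"}
DIFFICULT = {"difficult", "very difficult"}


def q16_index(responses: Sequence[Optional[str]]) -> Optional[int]:
    """Return the 0-16 index score, or None if fewer than 14 items are answered."""
    if len(responses) != 16:
        raise ValueError("expected exactly 16 item responses")
    answered = [r for r in responses if r is not None]
    if len(answered) < 14:
        return None  # no valid index score
    for r in answered:
        if r not in EASY and r not in DIFFICULT:
            raise ValueError(f"unknown response alternative: {r!r}")
    # Dichotomize: easy / very easy -> 1, difficult / very difficult -> 0.
    # Assumption: missing items simply add nothing to the sum.
    return sum(1 for r in answered if r in EASY)


def q16_level(score: int) -> str:
    """Map an index score to a CHL level using the thresholds 0-8, 9-12, 13-16."""
    if not 0 <= score <= 16:
        raise ValueError("score must be between 0 and 16")
    if score <= 8:
        return "inadequate"
    if score <= 12:
        return "problematic"
    return "sufficient"
```

For example, a respondent answering "easy" on 13 items and "difficult" on 3 would receive an index score of 13 and be classified as having sufficient CHL.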
The Ar-HLS-EU-Q6 consists of 6 items (all included in the Ar-HLS-EU-Q16), and its index score is calculated differently from that of the HLS-EU-Q16. First, each response alternative is coded separately (very difficult = 1, difficult = 2, easy = 3 and very easy = 4), then the values of the items are added together and divided by the total number of items (requires a response on at least 5 items). This gives the HLS-EU-Q6 index score. Based on the recommendations of the developer [31] and previous research [10], the threshold values were set to: ≤2 for inadequate CHL, >2 and ≤3 for problematic CHL, and >3 for sufficient CHL.
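A comparable sketch for the six-item index is given below. Again, the names `q6_index` and `q6_level` and the string/`None` encoding are hypothetical. One point is an explicit assumption: when only 5 items are answered, the sketch divides by the number of answered items (a mean score), whereas the text above says only "the total number of items".

```python
from typing import Optional, Sequence

CODES = {"very difficult": 1, "difficult": 2, "easy": 3, "very easy": 4}


def q6_index(responses: Sequence[Optional[str]]) -> Optional[float]:
    """Return the mean item score (1-4), or None if fewer than 5 items are answered."""
    if len(responses) != 6:
        raise ValueError("expected exactly 6 item responses")
    values = [CODES[r] for r in responses if r is not None]
    if len(values) < 5:
        return None  # no valid index score
    # Assumption: divide by the number of answered items.
    return sum(values) / len(values)


def q6_level(score: float) -> str:
    """Map a mean score to a CHL level: <=2, >2 and <=3, >3."""
    if not 1 <= score <= 4:
        raise ValueError("score must be between 1 and 4")
    if score <= 2:
        return "inadequate"
    if score <= 3:
        return "problematic"
    return "sufficient"
```

Note that a respondent answering "easy" on every item obtains a mean of exactly 3 and is therefore classified as problematic rather than sufficient under these thresholds.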
The questionnaire also included demographic questions about age, biological sex, education level, country of birth, years lived in Sweden and health status. Health status was measured with the widely used and established question “How do you assess your overall health status?” with the response options “very poor, poor, fair, good, or very good” [32]. Electronic health literacy, i.e. “the ability to seek, find, understand, and appraise health information from electronic sources and apply the knowledge gained to addressing or solving a health problem” [33], was measured using the Arabic Electronic Health Literacy Scale (Ar-eHEALS) consisting of 8 items [25]. The items are answered on a Likert scale ranging from 5 (strongly agree) to 1 (strongly disagree). The values of the items are added together to produce an Ar-eHEALS sum score [25].
Psychometric testing and data analysis
The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guided the choice of appropriate psychometric tests [34-37]. Data are presented with numbers and percentages or with mean, standard deviation (SD) and intervals, depending on what is appropriate for the type of data. Potential differences in biological sex, age, years in Sweden, educational level and self-perceived health between participants with a valid and a non-valid HLS-EU-Q16 index score, and between participants in the test and retest groups, were assessed using the chi-square test, the independent-samples t-test and the Mann-Whitney U test. Two-tailed p values < 0.05 were considered statistically significant. Floor and ceiling effects (the number of respondents with the lowest or highest possible score on the instrument) were examined by calculating the percentage of respondents who had those scores. If >15% of respondents have the lowest score, a floor effect could be considered, and if >15% of respondents have the highest score, a ceiling effect could be considered [37]. The frequency of missing data for each item was calculated and evaluated based on the criterion of < 5% [38].
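The floor/ceiling check and the per-item missing-data check amount to simple percentage calculations, which might look roughly like this in Python. The function names `floor_ceiling` and `missing_pct` and the use of `None` for a missing response are assumptions made for this illustration.

```python
from typing import Dict, Optional, Sequence, Union


def floor_ceiling(
    scores: Sequence[float],
    min_score: float,
    max_score: float,
    threshold_pct: float = 15.0,
) -> Dict[str, Union[float, bool]]:
    """Percentage of respondents at the lowest/highest possible score,
    flagged when it exceeds threshold_pct (15% by default)."""
    if not scores:
        raise ValueError("no scores supplied")
    n = len(scores)
    floor_pct = 100 * sum(1 for s in scores if s == min_score) / n
    ceiling_pct = 100 * sum(1 for s in scores if s == max_score) / n
    return {
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
        "floor_effect": floor_pct > threshold_pct,
        "ceiling_effect": ceiling_pct > threshold_pct,
    }


def missing_pct(item_responses: Sequence[Optional[str]]) -> float:
    """Percentage of respondents with no answer (None) on a single item."""
    if not item_responses:
        raise ValueError("no responses supplied")
    n_missing = sum(1 for r in item_responses if r is None)
    return 100 * n_missing / len(item_responses)
```

For the HLS-EU-Q16 index, `min_score` and `max_score` would be 0 and 16. An item would meet the missing-data criterion when `missing_pct(...) < 5`.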
Construct validity, which describes how well the results from an instrument are consistent with a hypothesis (i.e., assessing the concept that it is designed to measure) [35], was examined by analysing the associations between Ar-HLS-EU-Q16 index score, Ar-HLS-EU-Q6 index score, age, level of education, self-perceived health and electronic health literacy, using Spearman's rank correlation. Previous studies have found negative correlations between health literacy and high age [39-41], and positive correlations between health literacy and high level of education [21, 39, 40, 42], high self-perceived health [20, 39, 43], years in Sweden [26] and high electronic health literacy [24]. The magnitude of a correlation coefficient was viewed as negligible between 0 and 0.1, weak between 0.1 and 0.39, moderate between 0.4 and 0.69, strong between 0.7 and 0.89, and very strong between 0.9 and 1.0 [44].
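As an illustration of this step, the sketch below computes Spearman's rank correlation (as the Pearson correlation of average ranks, which handles ties) and labels its magnitude using the bands given above. The function names are hypothetical, and the treatment of boundary values is an assumption: the bands are read here as half-open intervals, so that exactly 0.1 counts as weak and exactly 0.4 as moderate.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(vx * vy)


def magnitude_label(rho: float) -> str:
    """Label |rho| using the bands in the text (boundaries are an assumption)."""
    r = abs(rho)
    if r < 0.1:
        return "negligible"
    if r < 0.4:
        return "weak"
    if r < 0.7:
        return "moderate"
    if r < 0.9:
        return "strong"
    return "very strong"
```

The sign of the coefficient is interpreted separately from its magnitude: a hypothesised negative association with age, for instance, is supported by a negative rho, while `magnitude_label` describes only its strength.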
Criterion validity was examined by assessing the agreement between CHL levels defined by the Ar-HLS-EU-Q16 and CHL levels defined by the Ar-HLS-EU-Q6 using the Cohen κ coefficient. The Cohen κ coefficient was also used to assess test-retest reliability, i.e., the agreement between the two points in time. A Cohen κ coefficient value > 0.7 was considered acceptable [36]. Internal consistency reliability (the correlation between the items in the instruments) was assessed using Cronbach α (>0.7 indicating good reliability) [36]. Split-half reliability was calculated using the Spearman-Brown coefficient, with a reliability coefficient of 0.70 to 0.95 considered acceptable [36].
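The three coefficients named above follow standard textbook formulas, sketched below in plain Python. This is not the software used in the study, and the function names and data layout (one list of item scores per respondent) are assumptions. The sketch computes unweighted Cohen κ, Cronbach α from sample variances, and the Spearman-Brown correction applied to a correlation between two test halves. How the halves were formed in the study is not stated, so the split itself is left out.

```python
from collections import Counter
from statistics import variance
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two classifications of the same respondents."""
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    n = len(a)
    p_observed = sum(1 for x, y in zip(a, b) if x == y) / n
    count_a, count_b = Counter(a), Counter(b)
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / n ** 2
    if p_expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; rows[i][j] is respondent i's score on item j."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(half_correlation: float) -> float:
    """Step up the correlation between two test halves to full-test reliability."""
    return 2 * half_correlation / (1 + half_correlation)
```

As a usage example, passing the Q16-based and Q6-based CHL levels of the same respondents to `cohen_kappa` would give the agreement used for criterion validity, and passing the levels from the first and second measurement would give the test-retest agreement.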