Improving the Face Validity of Self-Report Scales through Cognitive Interviews Based on Tourangeau Question and Answer Framework: A Practical Work on the Nursing Talent Identification Scale

Background: The Nursing Talent Identification Scale is a recently developed self-report instrument that assesses the fit of nursing applicants' characteristics with the profession. In such scales, respondents may perceive items in a variety of ways. Hence, cognitive interviews are used during scale development to identify problematic and ambiguous items. The present study aimed to determine how respondents understand and answer the items, using cognitive interviews to assess the user-friendliness of the scale and increase its face validity.
Methods: In this qualitative descriptive study, Tourangeau's four-stage question and answer model was used as the theoretical framework. The participants were 20 first-year nursing students from three western provinces of Iran. Data were collected through thinking aloud and concurrent and retrospective verbal probing. For data analysis, the framework proposed by d'Ardenne and Collins (2015) was used.
Results: The 20 interviews identified problems related to item comprehension, information retrieval, judgment, and reporting appropriate answers. Based on the results, 20 of the 95 items were modified. The 'instructions' section of the scale was also revised by adding the necessary explanations and an example.
Conclusions: Cognitive interviewing was effective in identifying problematic items of the Nursing Talent Identification Scale. Although cognitive interviewing is very time-consuming and costly, using this method helps ensure that the scale has the necessary validity to assess the fit of nursing applicants with the characteristics of the profession.


Background
The human resources of each discipline need specific talents and individual characteristics based on the philosophy of the discipline and the professional context (1,2). Identifying talented individuals is critical to improving organizational performance (3). Achieving the philosophy and goals of the nursing profession requires that nurses' talents fit the characteristics of the profession, which ultimately improves the health of society (3). Nurses have a wide range of roles and responsibilities, including care, support, protection, coordination, and training. Performing such activities and tasks requires special talents and characteristics in nurses (4). The Nursing Talent Identification Scale is a self-report questionnaire with 54 items, which aims to identify the nursing talents of applicants willing to enter the profession. In this way, the most suitable individuals are employed, which contributes to job satisfaction and to the quality of nursing care provided by nurses after employment.
An instrument that aims to measure a subjective concept like aptitude has to be constructed carefully. For researchers to judge the quality of the instrument, the necessary information about its development process and psychometric properties must be provided (5). In selecting health measurement instruments, three characteristics of the instrument should be considered: validity, reliability, and responsiveness (6). One of the important types of validity in instrument development is face validity (7). Face validity means that the target population considers all instrument items relevant. If participants feel that a measurement lacks face validity, they are likely to withdraw from the study (5,8). Therefore, it has been suggested that participants be included in the process of developing self-report instruments (6). The validity of self-administered instruments can be threatened by three components: comprehension problems, validity problems, and processing difficulties (9). The more effort is put into building validity, the greater the trustworthiness of the instrument. For this purpose, mixed methods, including a quantitative method (item impact index) (10) and a qualitative method (cognitive interview), should be used to determine face validity (5,11).
Cognitive interviewing is used to determine how respondents understand and respond to instrument items (12,13). By conducting semi-structured interviews with the target population, researchers can identify problems in the questionnaire and make the necessary corrections to facilitate responding to the items (14). This reduces incomplete data collection and response error (15). Thinking aloud and verbal probing are common techniques in cognitive interviewing (16,17). In the thinking aloud method, respondents are requested to express their thoughts while finding answers to the items. Verbal probing can be concurrent or retrospective; in other words, it can be performed during or after completion of the questionnaire. In concurrent probing, the respondent verbally explains his/her thoughts while answering the instrument. In retrospective probing, after the respondent has answered the items, he/she reflects on the given answers and verbally describes any problems with them. Both methods are recommended (17,18).
The Nursing Talent Identification Scale has been newly developed to measure the fit of nursing applicants' characteristics with the profession. In such scales, respondents may perceive items in a variety of ways. When developing such scales, cognitive interviewing is used to identify problematic and ambiguous items (12). The present study aimed to determine how respondents perceive and respond to items, using cognitive interviews to assess the user-friendliness of the scale and increase its face validity.
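The quantitative method mentioned above, the item impact index, is not defined in this paper. As a minimal sketch, assuming the definition commonly used in the psychometric literature (the proportion of respondents who rate an item as important, multiplied by the mean importance rating), it could be computed as follows. The function name, the "important from 4" threshold on a five-point scale, and the example ratings are all illustrative assumptions and are not data from this study.

```python
from statistics import mean


def item_impact_score(ratings, important_from=4):
    """Impact score = proportion of 'important' ratings x mean importance rating.

    ratings: importance ratings on a 1-5 scale (assumed format).
    important_from: lowest rating counted as 'important' (assumed to be 4).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    frequency = sum(r >= important_from for r in ratings) / len(ratings)
    return frequency * mean(ratings)


# Hypothetical importance ratings from ten respondents for a single item.
example_ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]
score = item_impact_score(example_ratings)
print(f"impact score = {score:.2f}")
```

In the literature, a cut-off of about 1.5 is often used to decide whether an item is retained; that value is likewise an assumption here and is not reported by the authors.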

Methods
Study Design
The present descriptive qualitative study is part of an approved doctoral dissertation at Tabriz University of Medical Sciences, Iran. The theoretical framework of the study is Tourangeau's four-stage question and answer model. This model examines the respondent's comprehension of items, information retrieval, the judgment used to find the answer, and reporting of the appropriate response (19). Based on this model, problems of misinterpretation, missing or excessive details, incomplete recall, and social desirability are identified (20).

Participants and Setting
The participants of this study included 20 first-year nursing students from Nursing and Midwifery Schools in three large provinces in western Iran. Purposive sampling was used and continued until data saturation.

Data Collection
Cognitive interviews were conducted with the participants in the Nursing and Midwifery Schools. The probing questions used in the interview process are listed in Table 1. To collect data, the think-aloud and concurrent and retrospective verbal probing methods were used (21). During the completion of the instrument, the behavior of the respondents was observed and notes were taken about skipping items, changing the response of the items, scale-related problems, and hesitations when responding.
Participants were requested to think aloud as they completed the scale and to tell the interviewer whatever came to their mind. Semi-structured interviews were also conducted with the aim of exploring clarity, comprehensibility of items and appropriate face validity of the scale. Each interview lasted approximately 40 to 50 minutes (11,21). When obtaining informed consent, the interviewer provided the participants with the necessary explanations about both of the interviewing techniques and thinking aloud. The interviews were audio-recorded so that the interviewer could concentrate on the interviews and not be distracted by taking notes.

Ethical Considerations
This study is part of an approved doctoral dissertation in Nursing at Tabriz University of Medical Sciences, Iran. The study protocol was approved by the Ethics Committee of the University (IR.TBZMED.REC.1397.583). Prior to the study, the objectives of the study were explained to the participants and written consent to participate in the study and audio-recording of the interviews was obtained from all participants.

Data Analysis
Data analysis was performed based on the proposed framework for analyzing cognitive interviews. First, the transcript of each interview was read several times by a researcher to gain a full understanding of the text. This helped to identify the main problems with the items. Each item was then placed in a separate matrix with eight columns. The column headings were based on the framework proposed by d'Ardenne and Collins (2015) as well as common problems identified in the items from the interviews: respondent details, survey answers, findings from observations and the think-aloud method, general probes, comprehension probes, retrieval probes, comfort probes, and other findings (22,23) (Table 2). After the data were analyzed, the findings were discussed with the members of the research team in a panel session.
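The eight-column matrix described above can be pictured as a simple data structure. The following is only an illustrative sketch: the class name, field names, and helper function are hypothetical, and the single example row uses the comment by participant P.8 on item 85 that is quoted in the Results; fields that the text does not report are left empty.

```python
from dataclasses import dataclass, fields


@dataclass
class MatrixRow:
    """One respondent's entry in the analysis matrix of a single item."""
    respondent_details: str
    survey_answer: str
    observation_and_think_aloud: str
    general_probes: str
    comprehension_probes: str
    retrieval_probes: str
    comfort_probes: str
    other_findings: str


# The eight column headings, in the order listed in the text.
COLUMN_NAMES = [f.name for f in fields(MatrixRow)]

# One matrix per item: item number -> list of rows (one row per respondent).
matrices = {}


def add_row(item_number, row):
    """Append a respondent's row to the matrix of the given item."""
    matrices.setdefault(item_number, []).append(row)


add_row(85, MatrixRow(
    respondent_details="P.8",
    survey_answer="",                 # not reported in the text
    observation_and_think_aloud="",   # not reported in the text
    general_probes="",                # not reported in the text
    comprehension_probes=(
        "Did not understand 'the relationships between the components'; "
        "found it vague"
    ),
    retrieval_probes="",              # not reported in the text
    comfort_probes="",                # not reported in the text
    other_findings="",                # not reported in the text
))

print(COLUMN_NAMES)
print(len(matrices[85]), "row(s) recorded for item 85")
```

Keeping one matrix per item makes it straightforward to read all respondents' comments on the same item side by side, which is the comparison the panel discussion relies on.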

Results
The mean age of participants was 20 years (standard deviation [SD] = 2.5) and 11 (57.89%) of them were female. Problems related to the general aspects of the scale, misinterpretation, missing or excessive details, and incomplete item recall were identified and corrected. Based on the results, 20 of the 95 items were revised. Findings are reported based on Tourangeau's four-stage question and answer model (19). Table 3 presents the results of the cognitive interviews.

Comprehension (Assessing the Respondent's Comprehension of Items)
The participants provided detailed feedback about the problematic items, as follows.
Participants did not interpret item 31 ["I am in contact with many people"] in the intended manner. The students reported that the item could be associated with any kind of communication and was incomprehensible to them. They asked "why one should interact with everyone" (P.12). Thus, based on the participants' feedback, this item was changed to: "I have good social relationships with others".
In item 34 ["I am an eloquent and audible speaker"], the participants reported that "the words eloquent and audible have difficult meaning" (P.3). Meanwhile, some participants expressed that "This item implies speaking frankly" (P.11). Therefore, this item was modified as follows: "I speak to others in a simple and understandable way".
In item 72 ["I can do a lot of things without feeling tired"], the participants stated that the item looked a bit unusual: "Surely a lot of work tires a person. It is better to say that I get tired quickly with the least amount of work" (P.7). This item was changed according to the participants' suggestion.
In item 85 ["I evaluate the situations with a broad view and through considering the relationships between the components"], the participants reported: "I did not understand what you mean by 'the relationships between the components'; it is vague" (P.8). The item was modified as follows: "I try to consider all aspects of a problem".
In item 93 ["I am slow in performing my tasks accurately"], the participants' perception was that "To do things accurately, you must be slow" (P.18), which was completely different from the meaning intended by the research team. The participants also stated that "The words 'accurate' and 'slow' imply opposite meanings and make the sentence difficult to understand" (P.17). The item was modified as follows: "While I am fast, I am careful enough when doing things".
In item 94 ["I respond quickly to visual and auditory stimuli such as sound and light"], the participants' understanding was negative and differed from that of the research team. Their perception was that "This item means that they are bothered by sound and light" (P.5). The item was modified as follows: "I use my senses to be aware of my surroundings".

Information Retrieval (determining how to find the answer to an item)
This component evaluates participants' responses based on the strategies they use when responding to an item. During the interviews, no problems with information retrieval were identified while participants responded to the items. Participants retrieved past and present information from memory to respond to the items. They reflected on their actions and behaviors during these periods and then responded to the items.

Judgment (respondent's judgment to find the answer)
Many participants did not interpret item 7 ["I promptly notify an error when it needs to be corrected"] in the intended manner and there were different interpretations. Participants reported that "It is not clear whether the error was made by the individual or other people" (P.9). The item was corrected as follows: "If I make a mistake that needs to be corrected, I report it immediately".
In item 87 ["I pay attention to important details in doing things compared to others"], the participants stated that "A person may be much more precise (being very meticulous is not really needed in nursing profession) compared to the respondent" (P.16). When this item was rewritten, the phrase "compared to others" was removed.
Response (choosing answer options or finding the right words to respond to the item)
The participants were asked to comment on the usefulness of the scale guide. They suggested that the way of responding to the items be explained with an example in the guide section. Participants were also asked to comment on the appropriateness of the item response options. They agreed on a five-point Likert scale (strongly agree to strongly disagree) for responding to the items, giving answers such as "They are reasonable" (P.13) and "I did not notice a problem" (P.8). However, some participants suggested that the answer options be reduced to two or three: "'I agree' and 'I disagree' options are OK and there is no need for 'strongly agree' or 'strongly disagree' options" (P.11). Meanwhile, many participants considered the five-point Likert scale easier and more accurate than a two- or three-point scale.
In addition, the interviewer observed that participants had difficulty responding to negatively worded items. This was especially evident in item 83: "I do not insist on my wrong ideas and opinions". This item was reviewed and rewritten as an affirmative sentence.
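As a compact summary of this section, the items quoted above can be grouped by the stage of Tourangeau's model under which they were reported. This is a minimal sketch that covers only the item numbers named in the text, not all 20 revised items; the variable names are illustrative.

```python
# Stages of Tourangeau's four-stage question and answer model.
STAGES = ["comprehension", "information retrieval", "judgment", "response"]

# Item numbers quoted in the Results text, grouped by the stage heading
# under which each one is discussed.
items_discussed = {
    "comprehension": [31, 34, 72, 85, 93, 94],
    "information retrieval": [],  # no problems were identified at this stage
    "judgment": [7, 87],
    "response": [83],
}

counts = {stage: len(items_discussed[stage]) for stage in STAGES}
total_discussed = sum(counts.values())

for stage in STAGES:
    print(f"{stage}: {counts[stage]} item(s) {items_discussed[stage]}")
print(f"{total_discussed} of the 20 revised items are quoted in the text")
```

Most of the items quoted in the text fall under the comprehension stage, which is consistent with the emphasis the Discussion places on wording and clarity.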

Discussion
Cognitive interviewing was very helpful in optimizing the scale items and significantly improved their clarity, comprehensibility, and quality. This study showed that finding the appropriate answer for each item involves a complex process of comprehension, information retrieval, judgment, and response (19). In this study, items likely to produce response errors were identified through cognitive interviews. In addition, the way respondents comprehend and interpret the items was identified, along with problems in the instrument itself. Correct understanding of items by respondents is one of the important components of the cognitive interview process (5). This study showed that small changes in the appearance or wording of the scale can lead to greater clarity in understanding items. According to the COSMIN instructions, such items need not be deleted; instead, to improve their clarity, words appropriate to the respondents' perception can be substituted (6). Research findings also indicate that scales designed with clear and unambiguous words for the target population enable respondents to answer the items successfully (24,25).
Previous studies have reported that non-response to scale items occurs when the respondent is unable to understand the meaning of the item (25,26). In cognitive interviewing, the process of responding to items is examined from the perspective of the respondents and the necessary corrections are made to improve the clarity of the items, which reduces item non-response (26). In this study, participants had difficulty understanding some words and their meanings. It should be noted that the meaning of words should be considered in the context in which the scale is used. Words that are familiar to one group may be unfamiliar to, or have a different meaning for, another group. By identifying the problems respondents have with the vocabulary of the scale, cognitive interviewing facilitates the understanding and interpretation of the items (27).
In this study, the participants had difficulty responding to items containing negative words, which can confuse respondents when choosing a suitable option. Hence, many studies have recommended avoiding negative words in items as much as possible (5,12).

Limitation
In our study, cognitive interview participants were selected by purposive sampling from only three western provinces of Iran. Therefore, the sample may not reflect the full range of potential respondents.

Conclusion
Cognitive interviewing was effective in identifying problematic items in the Nursing Talent Identification Scale. Participants' feedback led to a significant improvement in the items of the scale. Although cognitive interviewing is very time-consuming and costly, using it in the psychometric phase of scale development helps ensure that the Nursing Talent Identification Scale is a valid instrument for measuring the fit of nursing applicants with the characteristics of the profession.