Continuous and Binary Sets of Responses are Not the Same: Evidence from the Field

This paper reports a pre-registered study comparing binary and continuous sets of responses in questionnaires. Binary responses consist of two opposed options (Yes/No). Continuous responses are numerical: respondents indicate an answer on a horizontal blind line running from 0 to 10. We study whether binary and continuous feasible sets of responses yield the same outcome (distribution) and the same cost (duration in minutes). We collect data from 360 households in Honduras that were randomly assigned either to Yes/No questions or to a slider (visual scale, 0-10) to mark their responses; we therefore provide causal evidence. We find that respondents are 13% more likely to say “Yes” and spend 2.1 minutes less in the binary setting. In addition, we find that most of the differences between binary and continuous settings arise from questions that include negative wording.


Introduction
When dealing with opinion surveys, question design is key to obtaining the desired level of detail and to avoiding difficulties in expressing opinions. Both capturing the true response of the subject and avoiding any bias from the design are of utmost importance. Therefore, the type of question used in a survey is crucial and will depend on the aim of the researcher.
The literature on survey design, and more specifically on the choice of response format, offers a wide range of question types to evaluate the degree of agreement with an issue [1; 2; 3; 4; 5]. Agree-disagree scales are used to obtain the level of agreement with an issue [6]. If the scale options directly refer to the issue under evaluation, item-specific scales are then recommended [7; 8]. In the social sciences, dichotomous scales (binary responses), in which two responses with opposed directions are offered (typically Yes or No), are widely used when the goal is to evaluate the direction of the answer (agreeing or disagreeing with an issue). This type of question is very clear in meaning. There is considerable consensus on the meanings of the options [9] and they are easy for respondents to understand. They also imply low costs and little time for both the researcher and the respondent [10]. In addition, for factual questions it has been shown that people correctly answer binary questions more often when the correct answer is "Yes" [3]. This format also requires less interpretative effort than longer ones [9]. However, the lack of neutral responses may alter the results.
It may lead to acquiescence bias, which is the tendency for a respondent to agree with a statement without considering the content of the item or to please the researcher [11; 12; 13; 3]. Moreover, it could be problematic for respondents with neutral attitudes because of the lack of accurate mapping between moderate responses and the dichotomous options offered. As a result, it would require a major cognitive effort [14; 9; 15].
Continuous rating scales allow the researcher to obtain the exact level of agreement with a given issue. When continuous rating scales are numerical, respondents can indicate an option on a horizontal line, marking precise numbers that express the exact level of agreement with an issue relative to a maximum number, usually 10 or 100. The use of this metric scale allows subjects to easily transform their opinions, whether extreme or mild, into replies, reducing distortion [16]. Nevertheless, point meanings may become less clear, as the respondent cannot differentiate between one number and the next [9].
When choosing between binary and continuous rating scales, it is important to evaluate the advantages and disadvantages of each type of question.
In terms of statistical analysis, continuous scales allow a wider range of procedures [17]. Regarding survey length and complexity, continuous scales require more time to explain the set of possible answers and to think about their conversion into numbers, implying higher reasoning costs [16; 10]. However, compared to binary responses, continuous sets of feasible answers provide better quality of information [9]. In binary scales, where respondents have to convert mild responses into Yes-No answers, the respondent is forced to transform a middling opinion into an extreme answer. The result is an inaccurate choice, which may consequently manifest a higher level of agreement than would be reached if middle responses were available (i.e., acquiescence or satisficing [13; 3]). Moreover, binary scales do not allow for nuance in respondents' answers or for studying respondents' expectations at an individual level. Therefore, if respondents have problems converting their mild thoughts into extreme responses, biases arise.
This study investigates the extent to which the use of binary and continuous scales yields different results. Specifically, we study whether the probability of agreement with a statement is the same when using binary and continuous sets of responses. To do so, we conduct a field experiment in Honduras with 360 randomly selected participants. Following a random assignment, half of the participants were assigned to the binary treatment, in which subjects had to respond Yes or No. The other half were assigned to the continuous treatment, in which they had to answer the same questions but with a different set of responses. Specifically, subjects in the continuous treatment indicated the level of agreement with a statement using a slider with a 0 to 10 scale. To compare replies between treatments, we discretize continuous answers. We assign the binary option "No" if the option indicated is below 5. Those who claim to agree by more than 5 are assigned to the binary option "Yes". If the option indicated is 5, the answer is randomly assigned to "Yes" or "No". Additionally, we study whether question wording, such as the use of negative words (e.g., no, anything) or prescriptive expressions (e.g., act as, take care of, limit activities), also has an effect on the probability of agreement. We also analyze the differences in terms of survey length.
We find that respondents are more likely to say "Yes" in the binary setting than in the continuous setting. In particular, respondents answer "Yes" 2.3 more times when asked in binary instead of continuous. Regarding survey length, binary questionnaires took 2.1 minutes less, which implies an average reduction of 41% with respect to continuous questionnaires. Our results hold when excluding males from the sample and those subjects with lower cognitive abilities.
We extend the analysis to the wording of the questions. The literature has shown that question wording also plays an important role in acquiring accurate responses. Negative wording has been defined as a question in which disagreement would be a positive answer [12]. Negative wording is usually introduced in surveys to avoid fatigue throughout the questionnaire, to reduce acquiescence [18; 19; 12], and to prevent respondents from choosing the same answer repeatedly [20]. Nevertheless, [12; 21; 22], among others, draw attention to the additional processing effort that negative questions imply, especially in binary sets, where the respondent has to decide whether to agree or disagree with a question that includes negation, which may generate confusion (see [23]).
Along the same lines, our data show that when questions use negative wording, the probability of agreement is 41% higher in binary sets than in continuous sets.

Results I: Aggregate data
We start by showing results on the behavior of participants over the entire experiment. Recall that every participant replied to 30 questions in a random order (52% starting from block A, and 48% starting from block E). Figure 1 shows the average number of "Yes" answers, along with the average time (in minutes) respondents spent on the entire questionnaire, for both continuous and binary answers. We observe that the probability of agreement with the survey question is higher when asking in binary, and that the average time in the binary setting is lower. Table 1 shows the regression analysis. Model 1 does not include controls. Model 2 includes controls for age, gender, and ethnicity. Model 3 includes controls for age, gender, ethnicity, school, sufficient income, having a daughter, education, and the order of the questions in the survey.
In addition, after controlling for these characteristics, the difference between binary and continuous is statistically significant at the 1% level (Table 1).
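To make the no-controls specification concrete, the following is a minimal sketch, assuming the outcome is each respondent's number of "Yes" answers and the only regressor is a 0/1 indicator for the binary treatment. Under that assumption the OLS slope equals the difference in group means. The function name and the data are hypothetical and purely illustrative; they are not the study's data or code.

```python
from statistics import mean

def ols_treatment_effect(yes_counts, binary):
    """Regress yes_counts on a 0/1 treatment dummy; return (intercept, slope).

    With a single dummy regressor, the slope is the difference between the
    mean outcome of the treated (binary = 1) and untreated (binary = 0) groups.
    """
    x_bar = mean(binary)
    y_bar = mean(yes_counts)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(binary, yes_counts))
    sxx = sum((x - x_bar) ** 2 for x in binary)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up numbers of "Yes" answers (out of 30) for eight hypothetical respondents.
yes_counts = [18, 20, 17, 21, 15, 16, 14, 17]
binary = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = binary treatment, 0 = continuous

intercept, slope = ols_treatment_effect(yes_counts, binary)
mean_binary = mean(y for y, b in zip(yes_counts, binary) if b == 1)
mean_continuous = mean(y for y, b in zip(yes_counts, binary) if b == 0)
print(intercept, slope, mean_binary - mean_continuous)
```

Adding controls, as in the richer specifications, requires a multiple regression routine; this sketch covers only the single-regressor case.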
It is important to recall that subjects were assigned to treatments randomly. Hence, our results can be interpreted causally. Therefore, we conclude that:
Result 1: Binary settings produce higher shares of "Yes".
[...] [24] developed by [25].
We therefore conclude that Results 1 and 2 are not driven by the respondents' cognitive abilities.
In conclusion, using a binary (instead of a continuous) set of responses makes subjects respond "Yes" more often (or "No" less often) and complete the survey much faster. These results are robust to excluding men and low-ability respondents.

Results II: Question by question
Figure 3 shows the average number of "Yes" (and "No") answers question by question. It shows that Result 1 (a higher number of "Yes" when asking in binary) does not hold for every single question. Indeed, many items are almost identical in binary and continuous.
In particular, we find that 10 out of 30 questions are different and, most importantly, the direction of the difference is always the same (see Model 3 in Table S7, p < 0.05). That is, all ten questions that report different numbers of "Yes" show a higher probability of agreeing with the survey question when asking in binary instead of continuous. This difference is statistically significant at the 0.05 level.
Result 3: In 63% of the cases there are no differences between binary and continuous settings. In the other cases, the differences are always in the same direction (positive) as in Result 1.
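As an illustration of what a question-level comparison involves, the sketch below uses a pooled two-proportion z-test on the share of "Yes" answers to a single question. This is a simpler stand-in technique, not the regression models reported in the supplementary tables, and the counts are made up for illustration.

```python
import math

def two_proportion_z_test(yes_a, n_a, yes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Everyone gave the same answer in both groups: no detectable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts for one question: 80 of 100 "Yes" in the binary group
# versus 60 of 100 "Yes" (after discretization) in the continuous group.
z, p_value = two_proportion_z_test(80, 100, 60, 100)
print(z, p_value)
```

Running such a test on each of the 30 questions raises a multiple-comparison issue that this sketch does not address.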
For the sake of completeness, we repeat the analysis for women only and for the sub-sample of subjects who passed the cognitive test. Results remain in the same direction (Table S8).
Our results also hold when we remove those individuals choosing the neutral option 5 (see Table S9) or when we use the value of 7 as the cut-off (see Table S10). We selected this value since it is the minimum grade required to pass exams at school in Honduras (out of a total of 10) [26], and it could be considered a turning point between agreement and disagreement.
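The two robustness checks can be sketched as alternative recodings of the slider responses. This is a minimal illustration under stated assumptions: the function names are hypothetical, the responses are made up, and treating a response of exactly 7 as "Yes" (by analogy with a passing grade) is an assumption, since the text does not say how a value equal to the cut-off is coded.

```python
def share_yes_excluding_neutral(responses, neutral=5):
    """Share of "Yes" after dropping responses equal to the neutral option."""
    kept = [r for r in responses if r != neutral]
    return sum(r > neutral for r in kept) / len(kept)

def share_yes_with_cutoff(responses, cutoff=7):
    """Share of "Yes" when responses at or above the cut-off count as agreement.

    Assumption: a response equal to the cut-off is coded as "Yes".
    """
    return sum(r >= cutoff for r in responses) / len(responses)

# Made-up slider responses on the 0 to 10 scale for one question.
responses = [0, 3, 5, 5, 6, 7, 8, 10]

share_no_neutral = share_yes_excluding_neutral(responses)
share_cutoff_7 = share_yes_with_cutoff(responses)
print(share_no_neutral, share_cutoff_7)
```

A stricter cut-off mechanically lowers the share of "Yes" in the continuous treatment, so it is the more demanding of the two checks for any comparison with the binary treatment.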

The impact of wording
We next explore the role of question wording in explaining Result 3. To do this, we distinguish three types of questions: negative, prescriptive, and other. Negative refers to questions that include words like "no" or "not".
Prescriptive refers to those questions that include words like "have to" or "must be". Other refers to those questions that cannot be classified as negative or prescriptive.
First, we analyze the differences at the aggregate level when asking in binary and continuous for each group of questions (negative, prescriptive, and other). That is, we compare the total number of "Yes" answers that subjects give in binary and continuous. We start with the whole sample. Table S4 reports the average difference in the number of "Yes" when asking in binary instead of continuous for each group of questions. Regarding the group of negative questions, those who answered binary questionnaires said "Yes" 1.4 more times (41% more) than those who answered the continuous questionnaires.
The average number of "Yes" is also higher in binary when considering the group of other questions (1.02 more times, or 8.43%). The difference is not statistically significant when considering the group of prescriptive questions.

Result 4: The probability of agreeing with the survey question is higher when using a binary set of responses and negative wording (41%).
We repeat our analysis for women only and after removing those subjects with low ability according to the results obtained from [24] (see Table S4, middle and bottom), and Result 4 is confirmed.

Discussion
We investigate how answers change when questions are asked using binary Yes-No answers or by offering a continuous rating scale.
Using data from 360 households in Honduras where subjects were randomly assigned to treatments, this paper provides causal evidence regarding differences in the probability of agreement and the length of the questionnaire. We also provide non-causal evidence of the impact of negative wording.
We compare binary and continuous responses by discretizing the continuous variable, collapsing values into 0 ("No") and 1 ("Yes") answers. That is, those who indicate an option below 5 are assigned to 0, while those indicating more than 5 are assigned to 1. Those choosing the neutral option 5 are randomly assigned to either 0 or 1.
Our first result shows that, on average, the probability of agreement is 13% higher in binary sets than in continuous sets. The fact that the binary setting yields higher agreement rates is not new, since this type of question is affected by acquiescence, the tendency to agree with the question without considering the content of the item (see [13; 3]).
Second, we examine the length of the interviews and, as expected, Result 2 shows that continuous settings require more time than binary settings.
Specifically, opting for Yes-No responses reduced the average length of the survey by 2.1 minutes, which implies a reduction of 41.5%. This result is also expected given the wider range of options offered to the respondents, who may need more time to accurately map their opinion onto a 0 to 10 interval.
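The absolute and relative figures together pin down an implied baseline. The sketch below is a back-of-the-envelope calculation, assuming the 41.5% reduction is measured relative to the average continuous questionnaire; the resulting averages are implied values derived from the two reported figures, not numbers reported in the paper, and the helper name is hypothetical.

```python
def implied_baseline(absolute_change, percent_change):
    """Baseline level implied by an absolute change and its percentage of the baseline."""
    return absolute_change / (percent_change / 100)

# Reported: binary questionnaires are 2.1 minutes (41.5%) shorter.
continuous_minutes = implied_baseline(2.1, 41.5)  # implied continuous average
binary_minutes = continuous_minutes - 2.1         # implied binary average

print(round(continuous_minutes, 2), round(binary_minutes, 2))
```

Under this assumption, the continuous questionnaire would take roughly five minutes on average and the binary one roughly three.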
Furthermore, the use of the slider to collect answers is also time-consuming. This is in line with [16; 10], which reported longer completion times when using sliders. Sliders are more engaging than other visual scales [27; 28], and increase the time and effort needed to provide an answer [15]. Sliders also allow respondents to communicate exact values, and their use simultaneously conveys to the respondent how precise the expected answer should be [29].
Additionally, we consider respondents' ability using an extension of [24] (see also [25]). We find that 71% of the sample responded correctly to these questions. Using this sub-sample, we test our data and find similar results: high-ability respondents require on average 2.3 minutes less when using a binary rather than a continuous questionnaire, which implies a reduction of 45%, 2pp less compared to the whole sample. We also find that the probability of agreement is higher in binary sets compared to continuous sets. Specifically, respondents answer "Yes" 2.4 more times (or 14.1%) in binary sets. [13; 3].
Third, we also estimate the differences in responses for every single question. Result 3 shows that 37% of all questions present significant differences between binary and continuous sets, showing in all cases higher agreement in the binary setting. These results hold when we only consider women and when we take the high-ability respondents' sub-sample. To go a step further, we analyze whether question wording has an effect on our results. In particular, we distinguish negatively worded questions, which include negative adverbs in the sentence. Specifically, 8 out of 30 questions include negative words, of which 7 show significant differences when we compare the binary and the continuous treatment. This specific group of questions reports 41% more agreement in binary than in continuous sets. The group of questions classified as "other" also reports 8.43% more agreement in binary than in continuous sets. However, the group of questions with prescriptive words shows no differences.

Concluding remarks
This paper tests whether asking questions using binary or continuous sets provides different responses. To do so, we design a quasi-experimental survey where subjects were randomly assigned to the binary or continuous treatment; hence, we provide causal evidence. The experiment was conducted in Honduras, where 360 participants were randomly selected. At the aggregate level, we find that subjects are more likely to say "Yes" in the binary setting (Result 1). We also find that subjects in the binary treatment spend 41.5% less time responding to the whole questionnaire (Result 2).
Thus, a continuous set of responses comes at a higher cost. We test whether both results still hold when the sample is composed only of women (87% of the sample), and results are similar. We also test whether results remain constant when we control for cognitive abilities. The proportion of subjects classified as low ability is 29%. Results after removing this subgroup of participants are similar to the ones obtained using the whole sample. Then we estimate the difference between the binary and continuous treatments question by question. We find two relevant results. First, the differences between binary and continuous sets do not occur in every single question but in 37% of them. Second, the difference between binary and continuous always goes in the same direction. That is, the probability of agreeing with the survey question is higher when using a binary set of responses instead of a continuous one.
Finally, we analyze the impact of question wording. To that end, we differentiate questions with negative, prescriptive, and other wording. We find that negative wording increases the probability of answering "Yes" by 41%. We find no differences between binary and continuous when using prescriptive wording. In sum, our paper provides causal evidence of the difference between binary and continuous sets of responses. We find that the type of answer is important and could yield different results. In addition, question wording could play an important role in explaining these results.

Methods
Protocol
The experiment took place in Santa Rosa de Copán (Honduras) between [...]
We are also interested in comparing the time spent when using a binary instead of a continuous set of responses. To that end, we recorded the time (in minutes and seconds) each participant took to complete the whole questionnaire (note that, given the paper-based nature of the questionnaire, we could only account for the time of the whole questionnaire, not question by question). Finally, to account for respondents' ability, we make use of an extension of [24] developed by [25]. This test consists of several questions

Sample
The total sample consists of 353 subjects. However, 7 of them did not answer all questions. Table S5 presents descriptive statistics of the key variables in our analysis. All the subjects surveyed were residents of Honduras at the time of the survey and had a child between 6 and 9 years old in one of the 12 schools considered (see Table S6 for further information). The average age of the participants is 34 years. They were mostly women (87%) with secondary or less than secondary education, and with a daughter living at home (65%). Most individuals report no particular ethnicity (70%), and 29% are low ability (see S3 for more details about measuring ability).
In addition, 77% of the individuals in the sample indicate that they have enough money to feed their children.
All the subjects answered the same questionnaire with only one difference, the type of answer (binary vs continuous). 52% of the participants answered binary questions (Yes/No questions), and the remaining 48% answered continuous questions.
The average number of times that the subjects answered "Yes" is higher in binary questionnaires (see standardization below). In addition, the response time in binary questionnaires is lower (about 2 minutes less per questionnaire).
Finally, we provide the balance of the randomization across treatments in our experiment (Table S5, [...]
If the individual claims to agree with a statement by indicating more than 5 (r > 5), the binary equivalent is "Yes" and it is assigned to 1. If the response is less than 5 (r < 5), the equivalent is "No", and it is assigned to 0. Finally, if the individual indicates the neutral response 5 (r = 5), we randomly assign the answer to 0 or 1.
We opt for splitting answers of 5 because 5 represents the midpoint of the continuous set. Using the slider instead of a "Yes/No" question, we can measure the intensity of an opinion. Similar to our case, the well-known Likert scale questions [34] also measure this intensity. The original 5-point Likert scale questions use the middle value to indicate "neither agree nor disagree"; values above it indicate agreement with the statement, and values below it indicate disagreement with the statement [35].
Following a similar reasoning and using our continuous questionnaire, we assume that: (1) when the answer is greater than 5, it means agreement with the survey question; (2) when the answer is less than 5, it means disagreement with the survey question; (3) when the answer is equal to 5, it means neither agree nor disagree. As a result, when the individual answers 5, we randomly assign him/her to "Yes" or "No".
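The recoding rule in (1) to (3) can be sketched as follows. This is a minimal illustration, assuming a fair coin for the neutral response (the text says only that the assignment is random); the function name, the seed, and the example responses are hypothetical.

```python
import random

def discretize(response, rng, midpoint=5):
    """Map a 0 to 10 slider response r to 1 ("Yes") or 0 ("No").

    r > midpoint  -> 1 (agreement)
    r < midpoint  -> 0 (disagreement)
    r == midpoint -> 0 or 1 at random (assumed equal probability)
    """
    if response > midpoint:
        return 1
    if response < midpoint:
        return 0
    return rng.randint(0, 1)

# A seeded generator makes the random split of neutral answers reproducible.
rng = random.Random(0)  # hypothetical seed

slider = [0, 2.5, 5, 5, 7, 10]  # made-up slider responses
recoded = [discretize(r, rng) for r in slider]
print(recoded)
```

Because neutral answers are split at random, the recoded share of "Yes" varies slightly with the seed whenever a question has many responses of exactly 5.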

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on request.