Design & Procedure
The design was a cross-sectional online survey conducted in Hong Kong, Australia, the USA, the United Kingdom, and Germany. The survey was programmed using the online survey platform Qualtrics. Participants were recruited using stratified quota sampling to ensure that each sample was representative of the respective general population with regard to sex, age, and educational attainment. No further eligibility criteria were applied. Data were collected between February and March 2021. We aimed for a sample size of 2500, taking into account the stratification and number of sites, the large number of predictors, and the expected small effect sizes of some of the putative predictors. The survey took 25 minutes in total, beginning with informed consent, followed by the socio-demographic assessment and the questionnaire battery; further details have been reported elsewhere32. To prevent missing data, participants were required to respond to all questions on each page before being able to continue. This left minimal missing data on isolated variables, caused either by initial software errors (missing data for “perceived infection risk”: 0.2%, n=7; “preferred sources of information”: 2.8%, n=72; “social adversity”: 0.1%, n=3) or by a “don’t know” response option (“size of the current home city”: 9.3%, n=234). Participants were excluded if they failed any of the attention checks, completed the survey in less than half of the median completion time, or showed patterns of machine responses or duplicate response patterns. The flow of participants across sites is shown in Table 1. All procedures were approved by the ethics committees of each of the institutions involved (i.e., (1) University of London Research Ethics Committee, Reference No. 2368; (2) Care New England - Butler Hospital Institutional Review Board, Reference No. 202012-002; (3) La Trobe University Human Research Ethics Committee, Application No. HEC21012; (4) Local Ethics Committee, Universität Hamburg, Application No. 2020_346; and (5) The Chinese University of Hong Kong Survey and Behavioural Research Ethics Committee, Reference No. SBRE-20-233).
Measures
Willingness to be vaccinated for COVID-19 was assessed with the following item: “If a COVID-19 vaccine was offered to you now, would you accept it?” The item was rated on a scale from 1=“Definitely not” to 5=“Yes, definitely”, adapted from Wong and colleagues33.
Sociodemographic data and related questions: Sociodemographic variables included age, sex assigned at birth, and current gender (options: “male”, “female”, “trans-male”, “trans-female”, “genderqueer”, and “other”), size of the current home city (rated in six categories from ≤100,000 to ≥10,000,000), highest educational degree achieved (rated in nine categories from elementary school degree to PhD), annual income (seven categories from “under £18,500/US$24,999/18,000€” to “above £112,000/US$150,000/109,000€”), employment status over the last year (nine categories), migrant status, minority status (five categories, each rated as present or absent), and having a mental-health diagnosis.
Risk perception variables included (1) COVID-19 anxiety, (2) personal experiences with COVID-19 in family members or friends, (3) perceived infection risk, and (4) perceived consequences of an infection. Following Shevlin et al.34, COVID-19 anxiety was assessed using the question “How anxious are you about the coronavirus COVID-19 pandemic?”, for which participants were provided with a slider to indicate their degree of anxiety from 0=“not at all worried” to 100=“very worried”. Personal experiences with COVID-19 in family members or friends were assessed with the following item: “Someone who is close to me has had a COVID-19 virus infection confirmed by a doctor”, rated with 1=“yes” and 0=“no”. Perceived risk of a COVID-19 infection was assessed with the item “What do you think is your personal percentage risk of being infected with the COVID-19 virus over the following time periods?”, rated from 1=“no risk” to 11=“great risk” for each time period (“the next month”, “the next three months”, and “the next six months”). Similarly, perceived consequences of an infection were assessed with “How bad do you think would be the consequences of you being infected with the COVID-19 virus over the following time periods?”, rated from 1=“not too bad” to 11=“very bad”. Mean scores of perceived risk and perceived consequences were calculated.
Political views were rated from 1=“very left-wing” to 7=“very right-wing”, and preferred sources of information (“How do you find out about what is going on in the world?”) were rated from 1=“always from mainstream media” to 5=“always from social media”9.
Specific mistrust variables included (1) COVID-specific paranoid ideation and (2) vaccine conspiracy beliefs. COVID-specific paranoid ideation was assessed with the Pandemic Paranoia Scale32, a 25-item scale assessing paranoid thinking specifically related to the COVID-19 pandemic. It comprises a COVID paranoia global score and the three facets pandemic persecutory threat (15 items, e.g., “People are deliberately trying to pass COVID-19 to me”), pandemic paranoid conspiracy (six items, e.g., “COVID-19 is a conspiracy by powerful people”), and pandemic interpersonal mistrust regarding health measures (four items, e.g., “I can’t trust others to stick to the social distancing rules”). Participants answered on a scale from 0=“not at all” to 4=“totally”. Based on the data used for this article, Kingston et al.32 reported good reliability (internal consistency: α=0.90; test-retest reliability: 0.60≤r≤0.78), factorial validity, and criterion validity. For this study, the three subscales and the global score were calculated. Vaccine conspiracy beliefs were assessed by adapting the general 7-item Vaccine Conspiracy Beliefs Scale35, a valid one-dimensional scale with high internal consistency. The adaptation involved referring to COVID-19 vaccines specifically and using the present tense (full item list in Supplement 1). Reliability in this study was α=0.97.
General mistrust variables included paranoid ideation and general conspiracy mentality. Paranoid ideation was measured with the Revised Green Paranoid Thoughts Scale36. This 18-item questionnaire assesses ideas of reference and persecutory ideation over the past fortnight on two scales. Each item (e.g., “Certain individuals have had it in for me”) is rated on a scale from 0=“not at all” to 4=“totally”. Higher scores indicate higher levels of paranoia. Reliability in this study was α=0.94 for ideas of reference and α=0.96 for persecutory ideation. General conspiracy mentality was assessed with the Conspiracy Mentality Questionnaire37, an instrument designed to efficiently assess differences in the generic tendency to engage in conspiracist ideation within and across cultures. A one-dimensional and time-stable construct has been confirmed across several language versions. It consists of five statements (e.g., “Many very important things happen in the world, which the public is never informed about”) that are rated in terms of their likelihood on a scale from 0=“0% chance” to 11=“100% chance”. Reliability in this study was α=0.91.
Social adversity was screened alongside the socio-demographic variables with a four-item self-report questionnaire used by Jaya and colleagues20. The items consisted of yes/no questions covering emotional neglect, psychological abuse, physical abuse, and sexual abuse (e.g., “Were you ever approached sexually against your will?”).
Generalized beliefs about self, others, and one’s own social rank were assessed with the Brief Core Schema Scales (BCSS)38 and the Social Comparison Scale (SCS)39. The BCSS assesses negative and positive beliefs about oneself and others on four subscales of six items each (e.g., “Other people are bad”), rated as yes versus no. For each yes response, the degree of conviction is assessed on a scale from 1=“no, do not believe it” to 5=“yes, believe it totally”. Reliability for the subscales in the current study ranged from α=0.85 to α=0.90. The SCS consists of 11 bipolar items (e.g., inferior-superior, left out-accepted), each rated from 0 to 10 with reference to the past four weeks. Lower scores indicate a more negative view of the self in comparison with others. Reliability in this study was α=0.95.
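The internal consistencies reported for the scales above are Cronbach's α. A minimal sketch of how such a coefficient can be computed from a participants-by-items response matrix is shown below; the function name and the toy data are illustrative, not the study's actual scoring code.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (participants x items) response matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                              # number of items
    item_vars = items.var(axis=0, ddof=1).sum()     # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of the sum score
    return (k / (k - 1)) * (1.0 - item_vars / total_var)
```

Perfectly correlated items yield α=1, and α rises as items covary more strongly; dedicated implementations (e.g., in `pingouin`) additionally provide confidence intervals.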
Analyses
Statistical analyses were conducted with SPSS 2240. First, we calculated Pearson correlations for all predictor variables. Next, we calculated multifactorial regression models for each of the variable clusters (1) extended socio-demographic data, (2) risk perception, (3) political view/news source, (4) specific mistrust, (5) general mistrust, (6) interpersonal trauma, and (7) beliefs about the self, others, and social rank in order to compare the explained variance in vaccination willingness for these different predictor types. In a final regression model, all variables were entered to evaluate the overall explained variance. All significance tests for correlations and predictors in regression models were two-tailed tests based on available data, without any further adjustments.
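The cluster-wise comparison of explained variance can be sketched as follows. The analyses were run in SPSS; this is an illustrative re-implementation on simulated data, with placeholder cluster names and column groupings rather than the study's actual variables.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 6))                           # simulated predictors
y = 0.5 * X[:, 0] + 0.3 * X[:, 3] + rng.normal(size=n)  # simulated willingness

# Hypothetical predictor clusters (column indices into X)
clusters = {
    "risk_perception": [0, 1],
    "specific_mistrust": [2, 3],
    "general_mistrust": [4, 5],
}

# One multifactorial regression per cluster; compare explained variance (R^2)
for name, cols in clusters.items():
    r2 = LinearRegression().fit(X[:, cols], y).score(X[:, cols], y)
    print(f"{name}: R^2 = {r2:.3f}")

# Final model: all predictors entered together
r2_all = LinearRegression().fit(X, y).score(X, y)
print(f"all predictors: R^2 = {r2_all:.3f}")
```

Because each cluster model is nested in the full model, the full model's training R² is at least as large as any single cluster's; the interesting comparison is how much each cluster contributes on its own.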
Next, to further test for optimization of prediction accuracy, we established a machine-learning algorithm to predict vaccination willingness (i.e., “definitely” or “probably” getting vaccinated) versus unwillingness (“definitely not” or “probably not” getting vaccinated) using all assessed variables (n=2116) and including missing values in the classification. The mid-category (“possible” willingness) was excluded from this analysis. Calculations of the machine learning models were carried out in Python 3.8.6 with the packages scikit-learn 0.23.241, as well as NumPy, pandas, and imblearn. For all tested models we used random forest classifiers and first conducted hyperparameter tuning on a class-balanced version of the dataset (see Supplement 2 for details). Next, we chose the hyperparameter configuration with the best testing accuracy and evaluated model performance by cross-validating across the five sites and by leave-one-person-out cross-validation19. Finally, we used the calculated machine learning model to evaluate the predictive value of the individual variables. We used permutation feature importance42 (see Supplement 3 for details) to estimate the importance of each variable in a given model. This allowed for the selection of the highest-ranking variables to test whether subsequent smaller machine learning models that use only a small selection of questionnaires retain accuracy. Furthermore, it allowed for the elimination of the highest-ranking variables/variable cluster to further explore their absolute relevance (i.e., whether they could be compensated for by other predictors).
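The core of this pipeline, leaving aside hyperparameter tuning and class balancing, can be sketched with scikit-learn as below. The data are simulated stand-ins; in the study, `X` would hold all assessed variables, `y` the binary willingness label, and `site` the five data-collection sites.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Simulated stand-in data: 8 predictors, binary outcome driven by the first two
rng = np.random.default_rng(42)
n = 600
X = rng.normal(size=(n, 8))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)
site = rng.integers(0, 5, size=n)       # five sites, used as CV groups

clf = RandomForestClassifier(n_estimators=200, random_state=0)

# Cross-validation across the five sites (leave-one-site-out):
# train on four sites, test on the held-out fifth
site_scores = cross_val_score(clf, X, y, groups=site, cv=LeaveOneGroupOut())

# Permutation feature importance on a fitted model: shuffle each feature
# and measure the resulting drop in accuracy
clf.fit(X, y)
result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]   # highest-ranking first
```

The `ranking` array supports both follow-up steps described above: retraining smaller models on only the top-ranked variables, or dropping the top-ranked variables to see whether the remaining predictors compensate.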