Bias In Algorithms Of AI Systems Developed For COVID-19: A Scoping Review

doi:10.21203/rs.3.rs-1321571/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Objective: to analyze which ethically relevant biases have been identified by academic literature in artificial intelligence (AI) algorithms developed either for patient risk prediction and triage, or for contact tracing to deal with the COVID-19 pandemic. Additionally, to specifically investigate whether the role of social determinants of health (SDOH) has been considered in these AI developments.

Methods: we conducted a scoping review of the literature, which covered publications from March 2020 to April 2021. Studies mentioning biases in AI algorithms developed for contact tracing and medical triage or risk prediction regarding COVID-19 were included.

Results: from 1054 identified articles, 20 studies were finally included. We propose a typology of biases identified in the literature based on bias, limitations and other ethical issues in both areas of analysis. Results on health disparities and SDOH were classified into five categories: racial disparities, biased data, socio-economic disparities, unequal accessibility and workforce, and information communication.

Discussion: SDOH need to be considered in the clinical context, where they still seem underestimated. Epidemiological conditions depend on geographic location, so the use of local data in studies to develop international solutions may increase some biases. Gender bias was not specifically addressed in the articles included.

Conclusions: the main biases are related to data collection and management. Ethical problems related to privacy, consent, and lack of regulation have been identified in contact tracing while some bias-related health inequalities have been highlighted. There is a need for further research focusing on SDOH and these specific AI apps.

Keywords: artificial intelligence, bias, digital contact tracing, COVID-19, patient risk prediction.

During the COVID-19 pandemic, one of the most widespread measures adopted to control, minimize and mitigate the impact of COVID-19 was the development of mobile apps that use a variety of technologies to log information used to infer the spread of the disease, physical symptoms of individuals, and possible close contacts. Digital contact tracing (DCT) via smartphone apps was established as a new public-health intervention in many countries in 2020 to reduce the levels of COVID-19 transmission (Colizza et al., 2021). Nevertheless, DCT systems may perpetuate some biases that influence the results obtained, while raising security and privacy concerns (Bengio et al., 2020; Sun et al., 2021).

Considerable effort has also been put into developing decision-support devices to help clinicians at the bedside. There is an increasing multiplicity of artificial intelligence (AI) systems and algorithms focused on the early detection of COVID-19 in at-risk patients and on their prognosis (Jamshidi et al., 2020). Studies show that these novel technologies support medical triage in circumstances where healthcare resources are scarce. Still, their results show some limitations due to technical issues or to social, cultural, and economic reasons.

While there are other reviews about the application of AI for COVID-19 (Guo et al., 2021), our intention is to focus this review on two specific topics: biases exclusively in AI systems developed for 1) DCT and 2) medical triage regarding COVID-19, as these have been two of the most widely developed automated systems during the early phases of the pandemic and they need to be evaluated. In addition, although certain social health conditions have been discussed in general terms, social determinants of health (SDOH) are explicitly addressed in neither clinical research nor the technical development of apps. Consequently, we hypothesize that there may be a lack of qualitative data analysis in their application, which causes health disparities and SDOH that affect people’s health to be overlooked. Clinical research and app development research are mostly focused on biological-only data, which may not only retain biases and exacerbate health inequalities (Röösli et al., 2021) but also underestimate social-related biases in their analysis.

Thus, in this scoping review we aim to summarize some of the ethically relevant types of bias that have already been identified in the literature in AI systems developed for DCT, and for patient risk prediction (PRP) or medical triage, to deal with the COVID-19 pandemic. A secondary goal is to analyze whether previous literature points out any relationship between these biases and social determinants of health in AI systems and algorithms developed for COVID-19.

A scoping review can be a useful approach when the information on a topic has not been comprehensively reviewed, which is the case for the themes of this research. This scoping review followed the recommended five-step framework for scoping reviews (Arksey & O’Malley, 2005; Munn et al., 2018; Pham et al., 2014; Tricco et al., 2016). The research questions were: What are the biases identified in the literature in AI systems or algorithms developed for COVID-19 triage or PRP and DCT? Do researchers and engineers consider the SDOH? Do they relate these biases to possible health disparities?

Search strategy and selection process

We carried out the search strategy in PubMed, Medline, CINAHL, Scopus, Wiley Online Library, Web of Science (WOS), and arXiv.org, between 1 March and 7 April 2021. The search strategy was initially developed in PubMed and then adapted to the other databases (Table 1).

Table 1. Search Strategy

A manual search was performed to retrieve additional relevant documents not located by the electronic search. Studies fulfilling the criteria shown in Table 2 were eligible. Following the removal of duplicates, two reviewers screened all studies by title and abstract. Full-text articles were then obtained for full-text screening, and each one was read by two reviewers and assessed against the inclusion and exclusion criteria. Discrepancies were discussed until a consensus was reached. When consensus was not possible, all four reviewers made a decision together.

Table 2

Inclusion and exclusion criteria
Inclusion criteria
Type of publication	Studies with quantitative methodology, mixed methodology, qualitative, interventions, narrative reviews, scoping reviews and systematic reviews/meta-analysis, randomized clinical trials, editorials and letters to editor.
Subject or domain being studied	Articles focused on or with mention of biases, algorithms or AI systems developed for COVID-19, used in triage, early detection of risk patients, and contact tracing.
Language	English or Spanish.
Participant/ population	AI systems or algorithms developed for any population group.
Intervention	Any intervention related to our condition.
Date	Articles published or accepted between March 2020 and 7 April 2021.
Exclusion criteria
Type of publication	Books and theses, preprints not accepted for publication, conference papers and abstracts.
Subject or domain being studied	Articles about COVID-19 which do not include any algorithm or AI system, or which do not mention biases. Articles focused on bias, but not on AI systems for COVID-19. Articles about topics other than triage, risk prediction or contact tracing (for instance, vaccines, clinical trials, etc.).
Language	Different from English or Spanish.
Participants/ population	No exclusion criteria.
Intervention	No exclusion criteria.
Date	Articles prior to 2020.

Data extraction and data synthesis

Data extraction was undertaken by the four reviewers using an agreed template designed ad hoc. We created a table gathering the main characteristics and results of the studies to collect information from the data extraction. Finally, we developed a narrative synthesis of the main findings.

After duplicates were removed, 1054 articles were identified. Screening by title and abstract left 134 for full-text reading, of which 16 met the criteria for the extraction of relevant information. A manual search provided 4 additional studies; thus, 20 were finally eligible for inclusion (Figure 1).

Fig. 1 Scoping Review Flow Diagram following the PRISMA 2020 statement proposed by Page et al. (2021)

Of those selected, we found: 10 narrative reviews (Grantz et al., 2020; Klingwort & Schnell, 2020; Mali & Pratap, 2020; Malik et al., 2020; Mbunge, 2020; Park et al., 2020; Roche, 2020; Röösli et al., 2021; Scott & Coiera, 2020; Shachar et al., 2020), 5 original articles (Casiraghi et al., 2020; Hisada et al., 2020; Marabelli et al., 2021; Moss & Metcalf, 2020; Ravizza et al., 2021), 2 case studies (Gulliver et al., 2020; Sáez et al., 2021), 2 systematic reviews (Mbunge et al., 2020; Wynants et al., 2020), and 1 rapid review (Anglemyer et al., 2020). Their main characteristics are shown in Table 3.

Table 3

Main characteristics of included studies
First author and year	Country	Goal	Design	Triage/ Patient risk prediction	Contact tracing
Anglemyer, Andrew (2020)	New Zealand	To assess the benefits and harms of digital solutions for identified positive cases of an infectious disease and to assess acceptability of this approach from qualitative studies.	Rapid Review		YES
Casiraghi, Elena (2020)	Italy	To develop an AI prediction model capable of processing clinical, radiological and laboratory data of patients related to COVID-19 to predict their risk.	Original article: Development of a predictive ML-based computing system	YES
Grantz, Kyra H. (2020)	USA	To review the different applications for mobile phone data in guiding and evaluating COVID-19 response and its potential selection bias; to discuss best practices and potential pitfalls for integrating these data into public health decision making.	Narrative review		YES
Gulliver, Robyn (2020)	Australia	To present a six-stage model to evaluate and design best practice infrastructure to use big data in social policy. To provide informative and actionable technical guidance for social policy makers and researchers setting to use big data in their projects.	Case study		YES
Hisada, Shohei (2020)	Japan	To identify clusters of COVID-19 through web search query logs of multiple devices and user location information from location-aware mobile devices.	Original article: tracking the activity on some websites		YES
Klingwort, Jonas (2020)	Germany	To review weaknesses and limitations of tools such as smartphone contact apps to monitor the spread of COVID-19, to prove that no useful results can be obtained and suggest feasible alternative data sources for valid and population covering COVID-19 indicator systems.	Narrative review		YES
Mali, Suraj N. (2020)	India	To review opportunities and risks of AI applied to COVID-19.	Narrative review	YES	YES
Malik, Yashpal Singh (2020)	India	To overview prospective applications of AI model systems in healthcare settings during the COVID-19 pandemic.	Narrative review	YES	YES
Marabelli, Marco (2021)	USA	To study how IT has affected and will affect individual, organizational and societal practices during and after the COVID-19 pandemic. To develop a theoretical construct called "digital scars", defined as ethically problematic sociotechnical innovations that outlast emergency rollouts.	Original article: role of ubiquitous computing during COVID-19		YES
Mbunge, Elliot (2020)	Swaziland	To provide a comprehensive review of emerging technologies to address COVID-19 with emphasis on characteristics, challenges and country of domiciliation.	Systematic review	YES	YES
Mbunge, Elliot (2020)	Swaziland	To analyze potential opportunities and challenges of integrating emerging technologies for the implementation of COVID-19 contact tracing.	Narrative review		YES
Moss, Emanuel (2020)	USA	To describe and consider the role of machine learning in social production of risk, the role of risk management in the effort to institutionalize ethics in the technology industry and its possible benefits during pandemic crisis.	Original article: risk management in Machine Learning		YES
Ravizza, Alice (2021)	Italy	To integrate the international framework of requirements to mitigate the known problems of mobile applications to monitoring and tracking and to suggest a method for clinical data collection that ensures researchers and public health institutions significant and reliable data.	Original paper: framework for epidemiological database creation		YES
Roche, Stéphane (2020)	Canada	To present an overview of the development of Contact Tracing and suggest a reflection on possible solutions for their ethical and sustainable deployment through a more active and transparent citizen engagement.	Narrative review		YES
Röösli, Eliane (2021)	USA	To review some of the risks of perpetuating biases using AI-based models to address COVID-19.	Narrative review	YES
Sáez, Carlos (2021)	Spain	To show the potential limitations that multisource variability may have for COVID-19 ML research on large international DRNs. To discover and classify severity subgroups using symptoms and comorbidities.	Case study	YES	YES
Park, Sangchul (2020)	USA-South Korea	To present the main concerns over privacy involving tracing strategy through IT in South Korea.	Narrative review		YES
Scott, Ian A. (2020)	Australia	To describe several applications of AI relevant to COVID-19.	Narrative review		YES
Shachar, Carmel (2020)	USA	To expose legal and ethical concerns in AI applications to combat COVID-19 (privacy, human rights, equality and actors involved) and to give frameworks to guide stakeholders.	Narrative review		YES
Wynants, Laure (2020)	The Netherlands	To review and appraise the validity and usefulness of published and preprint reports of prediction models for diagnosing COVID-19 in patients with suspected infection, for prognosis of patients with COVID-19, and for detecting people in the general population at increased risk of COVID-19 infection or being admitted to hospital with the disease.	Systematic review and critical appraisal.	YES

Table 4 summarizes the different apps identified in this study. We use the term “bias” to refer to the systematic errors in a computer system causing an inclination or prejudice for or against a person or a group of people that can be considered to be unfair and cause a deviation from the expected prediction behavior of an AI tool, and “limitations” to refer to facts or situations that allow only some actions and make others impossible.

Table 4

DCTApps identified in the scoping review
APP	COUNTRY	TYPE	TECHNOLOGY	MANDATORY/OPTIONAL	REFERENCE
TraceTogether	Singapore	Contact Tracing	Bluetooth (BlueTrace protocol)	Optional	(Roche 2020)
The-Corona-Warn-App	Germany	Contact Tracing	Control smartphone	Optional	(Mbunge 2020)
COVIDSafe	Australia	Outbreak Response	Bluetooth (BlueTrace protocol)	Optional	(Gulliver et al. 2020)
Stopp Corona	Austria	Contact Tracing	Bluetooth	Optional	(Mbunge 2020)
BeAware App	Bahrain	Contact Tracing	Bluetooth and Global System for Mobile Communications technology	Optional	(Mbunge et al. 2020)
HaMagen	Israel	Contact Tracing	Bluetooth and GPS	Optional	(Mbunge et al. 2020)
StayHomeSafe	China	Contact Tracing	Bluetooth, GPS and WiFi	Mandatory use	(Mbunge 2020)
CoronaApp	Colombia	Contact Tracing	Global Positioning System	Optional	(Mbunge et al. 2020)
Aarogya Setu	India	Contact Tracing	Global Positioning System	Mandatory use	(Mbunge 2020)
GH COVID-19	Ghana	Outbreak Response	Global Positioning System and GIS	Optional	(Mbunge et al. 2020)
CoronaMadrid	Spain	Symptom tracking	Global Positioning System	Optional	(Mbunge 2020)
Social Monitoring	Russia	Quarantine compliance	Global Positioning System	Optional	(Mbunge et al. 2020)
Yahoo! JAPAN App	Japan	Contact Tracing	WSSCI	Optional	(Hisada et al. 2020)
StopCovid	France	Contact Tracing	Bluetooth	Optional	(Roche 2020)
STOPV	France	Contact Tracing	Global Positioning System, Semantic Data, Epidemiological Data and Test Results	Optional	(Roche 2020)
Private Kit: Safe Paths	USA	Contact Tracing	Global Positioning System	Optional	(Roche 2020)
Covid Alert	Canada	Contact Tracing	Bluetooth	Optional	(Roche 2020)

a. AI systems developed for triage and PRP.

a.1) Bias

One of the most relevant aspects addressed in the literature is data-related biases. According to a systematic review (SR) of COVID-19 prognostic and risk prediction methods (Wynants et al. 2020), there is a high risk of bias in the studies included due to a poor description of the population, which raises concerns about the reliability of their predictions when applied to clinical practice. An immediate exchange of well-documented individual participant data from COVID-19 studies is needed to develop more rigorous prediction models and validate the existing ones through collaborative efforts.

Our results identified different types of data-related bias:

a.1.1) Data source variability contributes to bias in distributed research networks of COVID-19 data sharing (Sáez et al., 2021), and it plays an important role in data quality. The case study reported by Sáez et al. (2021) shows the limitations that multisource variability may have for COVID-19 machine learning (ML) research on international distributed research networks. They used the nCov2019 dataset, which includes patient-level data from several countries, to discover and classify severity subgroups, dividing them into six types: 1) mild disease with no comorbidity; 2) elderly + severe pulmonary disease + comorbidity; 3) middle-aged + severe pulmonary disease + no comorbidity; 4) elderly + mild disease + no comorbidity; 5) elderly + severe systemic disease + comorbidity; 6) elderly + severe pulmonary disease + heart failure. The problem appears when this division is conditioned by the data's country of origin. Data for Groups 1 and 4 were collected in China and data for Groups 2, 3, 5, and 6 in the Philippines. In the latter case, data came from the COVID-19 tracker owned by the Department of Health of the Republic of the Philippines; in the case of China, data came mostly from patient reports. Due to these variations in the sources, results show some inconsistencies, which limit the model. The potential biases that multisource variability introduces into ML can be generalized to large cross-border distributed research networks. How can we prevent such biases? Sáez et al. (2021) propose: 1) a routine assessment of the variability among data sources in ML and statistical methodologies, which could potentially reduce biases or extra costs; 2) a complete data quality assessment; 3) reporting data quality and its impacts as a routine practice in publications; 4) building consciousness about data quality and variability.
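The first proposal above, a routine assessment of variability among data sources, can be sketched in a few lines. The following is a minimal illustration only, not the procedure used by Sáez et al. (2021): it assumes each source reports counts over the same set of categories and compares sources pairwise with the Jensen-Shannon distance. The source names and counts are hypothetical.

```python
import math

def normalize(counts):
    """Turn raw category counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def js_distance(p, q):
    """Jensen-Shannon distance (log base 2), bounded between 0 and 1."""
    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))

def pairwise_distances(sources):
    """Distance between every pair of sources, keyed by sorted name pair."""
    names = sorted(sources)
    return {(a, b): js_distance(sources[a], sources[b])
            for i, a in enumerate(names) for b in names[i + 1:]}

# Hypothetical counts per severity category (mild, severe, critical).
sources = {
    "source_A": normalize([80, 15, 5]),
    "source_B": normalize([78, 16, 6]),
    "source_C": normalize([20, 50, 30]),
}

distances = pairwise_distances(sources)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A source whose distance to all others is markedly larger, such as the hypothetical source_C here, would be a candidate for closer inspection before its data are pooled with the rest.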

a.1.2) Casiraghi et al. (2020) developed an explainable PRP model for COVID-19 risk assessment aimed at avoiding data bias. Their model was designed to be used in emergency departments for an early assessment of PRP in COVID-19 patients, integrating clinical, laboratory, and radiological data. The study carried out a comparative evaluation of different imputation techniques to manage the problem of missing data in the prediction for COVID-19 patients. However, the lack of a shared dataset hindered an objective comparative evaluation with the best models (Casiraghi et al., 2020).
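To make the idea of comparing imputation techniques concrete, the sketch below hides values that are actually known, fills them in with two simple strategies, and measures the error of each. It is an assumption-laden illustration: mean and median imputation are stand-ins chosen for brevity, not the techniques evaluated by Casiraghi et al. (2020), and the laboratory values are hypothetical.

```python
import statistics

def evaluate_imputers(values, hidden_indices):
    """Hide known values, impute them, and return mean absolute error per strategy."""
    hidden = sorted(hidden_indices)
    observed = [v for i, v in enumerate(values) if i not in hidden]
    # Each strategy fills every hidden position with a single constant.
    fills = {
        "mean": statistics.mean(observed),
        "median": statistics.median(observed),
    }
    return {
        name: statistics.mean([abs(values[i] - fill) for i in hidden])
        for name, fill in fills.items()
    }

# Hypothetical lab measurements; 25.0 is a deliberate outlier.
lab_values = [4.1, 3.9, 4.3, 4.0, 25.0, 4.2, 3.8, 4.4]
errors = evaluate_imputers(lab_values, hidden_indices={1, 5})
print(errors)
```

With a skewed variable such as this one, the median fill lands much closer to the hidden values than the mean fill, which shows how the choice of imputation method can itself become a source of bias in the downstream model.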

a.1.3) There are biases in COVID-19 prediction models due to unrepresentative data samples, a high probability of model overfitting, imprecise information on the study populations, and the use of models that were not well suited to the task (Röösli et al., 2021). Models developed in elite and affluent academic health systems that are not representative of the general population lack external validity (Röösli et al., 2021).

a.1.4) The quick development of AI systems carries great risk due to skewed training data, lack of reproducibility, and lack of a regulated COVID-19 data resource (Röösli et al., 2021). Without comprehensive bias mitigation strategies, this can exacerbate existing health disparities. “The source code of any AI model should be shared publicly to ensure that the models can be widely applied, generalized, and transparently compared” (Röösli et al., 2021: 191).

a.2) Limitations

A sufficient amount of high-quality data is crucial for the successful implementation of AI in COVID-19 management. Designing practical AI-based algorithms is challenging because of the huge and complex data that emerge as a consequence of the varied manifestations of COVID-19 infection, ranging from asymptomatic to severe clinical disease (Malik et al., 2020). Moreover, the principal obstacle to implementing these systems in the clinical context is the regulation of the exchange of data obtained by the AI application. Additionally, AI-based algorithms can offer a binary answer to a specific question about the disease in context, but cannot offer alternative predictions (Malik et al., 2020). Finally, it is necessary to consider how meaningful and in-depth data can be generated at every point of healthcare activity (Malik et al., 2020).

a.3) Other ethical issues

Transparency in AI algorithms is essential to understand predictions and target populations, unrecognized biases, class imbalance problems, and their capacity to generalize emerging technologies across hospital settings and populations (Röösli et al., 2021). To ensure that models can be broadly applied, generalized, and compared, the source code of an AI system should be shared publicly, and regulatory frameworks should be created to facilitate data sharing (Röösli et al., 2021).

b. AI systems developed for DCT

b.1) Bias

b.1.1) Uncontrolled application development could generate inadequate data collection and biases due to the loss of some data or an insufficient frequency of monitoring, which can lead to an inability to compare data collected from different regions (Ravizza et al., 2021). Although this does not affect the core functionality of the app, it can influence further use of the collected data: most ML models have relied on Chinese data, which can limit scalability to other populations (Scott & Coiera, 2020).

b.1.2) The media alter the nature of searches, producing biases in areas of potential clusters. Whenever the media report the location of a positive COVID-19 patient, many people who are close to the reported location search for additional information related to COVID-19 (Hisada et al., 2020).

b.1.3) DCT Applications (DCTApps) pose a high risk of discrimination, especially to affected people (Mbunge, 2020). Specifically, Internet-of-Things (IoT) based DCTApps collect data from the entire population in real-time which is later analyzed to map COVID-19 hotspots. Such data include ethnic information, demographic details, and socioeconomic status, which can influence the allocation and distribution of COVID-19 resources potentially leading to discrimination.

b.1.4) False negatives are an obstacle and may be deliberately generated because infected people do not want to reveal their true status (Klingwort & Schnell, 2020). To overcome this problem, the detection of relevant contacts should be refined as the issue is fundamentally a problem of microscale spatial analysis. Applications must develop the microgeographic analytical capability to specify what kind of proximity constitutes a sufficient contagion risk to trigger a notification (Roche, 2020).

b.2) Limitations and technical problems

b.2.1) Accuracy

The most widely proposed type of COVID-19 application uses Bluetooth signals to track encounters with people diagnosed as infected after the encounter; the accuracy of automatic DCTApps suffers from Bluetooth-based measurement errors (Klingwort & Schnell, 2020). These errors are due to the devices’ different signal strengths and the fact that the signal is not transmitted in all directions. Characteristics of the physical environment (windows, walls, or doors) can affect the range of discoverable devices. In addition to the four efficiency conditions (mass adoption, well-equipped population, numerous diagnostic tests, and fair and transparent uses), these monitoring applications have many reliability limitations, especially in the Bluetooth reading forecast and the calibration (Roche, 2020). This can add noise and produce many false positives.
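The sensitivity to signal strength and environment described above can be illustrated with the log-distance path-loss model, a common textbook approximation for turning a received signal strength (RSSI) into a distance. This is a sketch under stated assumptions: none of the reviewed apps is claimed to use this exact formula, and the reference power, path-loss exponents, and 1.5 m threshold are hypothetical values chosen for illustration.

```python
def estimate_distance_m(rssi_dbm, tx_power_dbm=-59.0, path_loss_exponent=2.0):
    """Log-distance path-loss estimate.

    tx_power_dbm is the RSSI expected at 1 m for the transmitting device;
    path_loss_exponent is about 2 in free space and higher when obstructed.
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exponent))

RISK_THRESHOLD_M = 1.5  # hypothetical notification threshold
reading = -65.0         # the same RSSI reading, interpreted three ways

scenarios = {
    "default calibration": estimate_distance_m(reading),
    "weaker transmitter": estimate_distance_m(reading, tx_power_dbm=-65.0),
    "obstructed room": estimate_distance_m(reading, path_loss_exponent=3.0),
}

for label, distance in scenarios.items():
    flagged = distance <= RISK_THRESHOLD_M
    print(f"{label}: {distance:.2f} m, flagged as close contact: {flagged}")
```

The same reading maps to roughly 1 m or 2 m depending on how the device is calibrated, so whether an encounter is flagged depends on hardware and surroundings as much as on the true distance between two people.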

b.2.2) Data-related problems.

Most applications have not reached operational maturity (Scott & Coiera, 2020) and their effectiveness has not been proved (Anglemyer et al., 2020). Modelling studies provide only low-certainty evidence of a reduction in secondary cases if contact tracing is used together with other public health measures such as self-isolation. Cohort studies provide very low-certainty evidence that DCT may produce more reliable counts of contacts and reduce the time needed to complete contact tracing (Anglemyer et al., 2020). The performance of emerging technologies is not yet stable on account of the lack of a sufficient COVID-19 dataset, the inconsistency of some of the available datasets, the non-aggregation of the datasets, and missing data and noise (Mbunge et al., 2020).

DCTApps may use the IoT to transfer data to national health systems. However, they are not globally standardized, and they face many interoperability problems (heterogeneity of connection standards and communication protocols, data semantics, formats, different operating systems, and programming languages). Consequently, each country has developed its own app. Data formats and structures should be standardized to avoid noise, prevent incomplete data, and improve data quality (Mbunge, 2020; Mbunge et al., 2020). Determining a standardized list of data, symptoms, clinical signs, risk factors, and comorbidities associated with coronavirus can help to ensure compatibility of databases between regions and countries and to improve interoperability (Ravizza et al., 2021).

DCT becomes less effective when dealing with asymptomatic individuals since symptom checkers and apps rely on pulse, temperature, and sleeping patterns (Hellewell, 2020, cited in Mbunge, 2020). Due to built-in privacy mechanisms, the resulting data for scientific research based on these applications is limited to counts of positive or negative encounters from selective populations, where the odds of encounters cannot be calculated (Klingwort & Schnell, 2020).

b.3) Other ethical issues

b.3.1) Privacy concerns

The use of DCTApps raises ethical, legal, security, and privacy concerns (Roche, 2020). To be acceptable, this interference with fundamental rights must be justified, reasonable, proportionate, and politically consensual. DCTApps provide little or no privacy to infected people and require them to disclose their data, raising difficult issues of consent, privacy, ethics, and trade-offs between public and private goods (Scott & Coiera, 2020).

DCTApps violate the security, confidentiality, integrity, and availability of the data of COVID-19 patients and their contacts, which can sometimes cause mental health issues such as stress, anxiety, or depression (Mbunge, 2020). Apps like TraceTogether, COVIDSafe, or BeAware allow data to be accessed from multiple points and support the monitoring and surveillance of infected or isolated people, which threatens the security of public health data and may imply a violation of privacy (Mbunge, 2020).

The study by Park et al. (2020) in South Korea illustrates privacy-related problems (Figure 2).

Fig. 2 Authors' summary of a case example about privacy-related problems described by Park et al. (2020).

Instead of disclosing data to the public, information could be used to sanitize establishments, potentially avoiding stigma and business decline. That is, instead of publicly revealing the precise locations of an infected individual, less granular data could be revealed, with the same effect on tracking and quarantine (Park et al., 2020).
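One simple way to release less granular location data is to round coordinates so that nearby points collapse into the same grid cell. The sketch below is illustrative only and is not a method described by Park et al. (2020); the coordinates are hypothetical, and rounding to two decimal places (cells of roughly one kilometre in latitude) is an arbitrary choice that does not by itself guarantee anonymity.

```python
def coarsen(lat, lon, decimals=2):
    """Round a coordinate pair so that nearby points share one grid cell."""
    return (round(lat, decimals), round(lon, decimals))

# Two hypothetical visits a few dozen metres apart.
visit_a = (37.56651, 126.97802)
visit_b = (37.56712, 126.97841)

cell_a = coarsen(*visit_a)
cell_b = coarsen(*visit_b)

print(cell_a, cell_b, cell_a == cell_b)
```

Both visits map to the same published cell, so an area can still be flagged for sanitation or quarantine guidance without naming the specific establishments visited.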

The correlation of data, the exchange of information, and the ability to extract information from different entry points contribute to the increasing fragility of the anonymization of data. This anonymization is even more fragile when information is collected over time and through data cross-referencing (Roche, 2020). The deactivation of DCTApps must be programmed so that monitoring does not continue beyond the health emergency and is not tacitly established as standard practice. Otherwise, risks of mass surveillance could arise.

b.3.2) Lack of regulation

There are no specific regulations for DCTApps. However, their use of data, access, or privacy has been adapted to international, national, and state laws such as the European Union’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). These legal frameworks can be adapted to help address concerns about privacy, human rights, due process, and equality (Shachar et al., 2020). In certain countries like the US, the lack of state regulation makes it more difficult to guarantee that these applications follow ethical standards (Shachar et al., 2020), and there are no global WHO guidelines on health data shared and transmitted via 5G technology (Mbunge et al., 2020). Though some countries’ regulations protect citizens better, potential “digital scars” are left in society as long as the governments and private institutions continue having long-term and unlimited access to citizens’ data for surveillance purposes.

b.3.3) Consent

The efficacy of a DCTApp depends on the level of population uptake, its ability to accurately detect infectious contacts, and the extent of adherence to self-isolation by notified contacts (Scott & Coiera, 2020). DCT must be handled with care: these technological solutions are proposed as the only tool available to ensure a process of deconfinement, a requirement that would make it a sine qua non condition accessible to police control. The risk of seeing such a form of socio-spatial “triage” become established, with patients and certain categories of the population ostracized, is huge (Roche, 2020).

Although DCTApps use WiFi, GPS, or Bluetooth protocols to monitor people’s movement, users have the right to opt out and configure their devices, which jeopardizes the monitoring of positive cases (Mbunge, 2020). DCTApps should allow people to withdraw consent (Mbunge, 2020), as problematic uses of technologies may well remain once the pandemic is over. This can potentially advantage powerful groups that can obtain financial and political benefits from perpetuating the use of IT while having questionable effects on society (Marabelli et al., 2021).

Privacy issues related to forcing a population to use an app can lead to much lower coverage rates (Klingwort & Schnell, 2020). However, we find opposite scenarios in countries that have not developed any specific app. Brazil, for example, has increased its technological surveillance in order to minimize the COVID-19 transmission chain (Mbunge et al., 2020). This enforcement of massive surveillance can raise issues about power, abuse, and data exploitation.

c. Health disparities and social determinants of health in AI systems developed for triage and DCT for COVID-19

c.1) Racial disparities

Health disparities are related to the emergence of biases in ML systems in the US context where Black and Latinx communities have been the most severely affected by COVID-19 (Moss & Metcalf, 2020). This is due to long-standing disparities in health outcomes for these communities, the impact of environmental determinants of health, and the disproportionate number of workers whose jobs do not allow them to stay at home (Moss & Metcalf, 2020).

c.2) Biased data

The reliance on AI may create a false sense of objectivity and fairness (Röösli et al., 2021). The pervasiveness of biases reflects a failure to develop mitigation strategies and has exacerbated the risk of existing health disparities, hindering the adoption of other tools that could actually improve patient outcomes. As an example, the Medical Information Mart for Intensive Care (MIMIC) is a publicly available, de-identified, and broadly studied dataset for critical care patients. A MIMIC-equivalent for COVID-19 from diverse data sources could incentivize urgently needed data sharing and interoperability to enable diverse, population-based tailored therapy, a step that could decisively reduce biases and disparities in healthcare while bolstering clinical judgment and decision-making. One of the main methodological problems is the selection process (Klingwort & Schnell, 2020). The sample of the population using the application will not be random, and subpopulations with a higher prevalence of undetected infections will likely have lower coverage. In addition, models that include comorbidities associated with worse outcomes in COVID-19 may perpetuate structural biases that have led to historically disadvantaged groups disproportionately suffering those comorbidities. To avoid further harm to minority groups already most affected by COVID-19, resource allocation models must go beyond a utilitarian foundation and must be able to identify needs amongst these patients (Moss & Metcalf, 2020).
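The selection problem raised by Klingwort & Schnell (2020) can be illustrated with simple arithmetic. The sketch below assumes two hypothetical subpopulations whose sizes, prevalences, and app coverage rates are invented for illustration, and it assumes that within each group app uptake is unrelated to infection status.

```python
def prevalence_estimates(groups):
    """Return (true prevalence, prevalence observed among app users)."""
    total = sum(g["size"] for g in groups)
    infected = sum(g["size"] * g["prevalence"] for g in groups)
    users = sum(g["size"] * g["coverage"] for g in groups)
    infected_users = sum(
        g["size"] * g["prevalence"] * g["coverage"] for g in groups
    )
    return infected / total, infected_users / users

# Hypothetical groups: the higher-prevalence group uses the app less.
groups = [
    {"size": 7000, "prevalence": 0.02, "coverage": 0.60},
    {"size": 3000, "prevalence": 0.10, "coverage": 0.20},
]

true_prev, app_prev = prevalence_estimates(groups)
print(f"true prevalence: {true_prev:.3f}")
print(f"prevalence among app users: {app_prev:.3f}")
```

Under these assumed numbers the app-based figure (3.0%) understates the true prevalence (4.4%), because the group with more infections contributes proportionally fewer users.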

c.3) Socio-economic disparities

In DCT, the ability to make use of notifications to minimize one’s own risk by self-quarantining is far too dependent on personal wealth and the capacity to afford staying at home (Moss & Metcalf, 2020). DCTApps’ designers must be attuned to the context of social life in which such systems can produce harmful, difficult-to-foresee effects that replicate or amplify pre-existing inequalities. Attending to the contextual use of such a system could collectivize risk by identifying and emphasizing the necessary forms of social support for self-quarantine and medical care: adequate sick leave and quarantine leave policies, robust testing, and economic relief that targets individual workers over large companies.

During the pandemic, ML has been involved in the production and distribution of risk through society (Moss & Metcalf, 2020), both generating risks and distributing them unevenly. Many of the predictive surveillance algorithms used in DCT focus attention on populations where bias is very present, especially highly racialized or lower-income populations (Moss & Metcalf, 2020). In this sense, ML can be epidemiologically effective yet unethical.

c.4) Unequal accessibility

An AI-based global health initiative is recommended, since AI-based approaches may not be accessible in countries with limited resources (Malik et al., 2020). Regarding socioeconomic disparities and the digital gap, the lack of population coverage can leave certain populations at risk (Ravizza et al., 2021). Digital solutions can exacerbate existing disparities affecting those who, because of ethnicity, socio-economic status, or age, do not have access to smartphones or live in areas without connectivity (Anglemyer et al., 2020), with equity implications for at-risk populations with poor access to the Internet and digital technology. Digital deserts or data poverty in certain geographical areas are concerning, especially because the effectiveness of DCTApps depends on their massive voluntary adoption and on systematic screening (Roche, 2020). Across country borders, the health gap and inequalities in healthcare pose a problem for the integration of emerging technologies. Even in developed countries, risk groups may not have access to broadband, smartphones, or wearable technology. For a community to benefit from this technology, most people need to be equipped with mobile devices. This is the case for only 80% of the US population, 65% of Russians, and 45% of Brazilians (Marabelli et al., 2021). Children, the elderly, and individuals with fewer resources are excluded from the stored information (Grantz et al., 2020).
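A short worked calculation shows why device coverage matters so much for DCT. A contact can only be registered when both people involved carry a device. The sketch below takes the device shares reported by Marabelli et al. (2021) and assumes, purely for illustration, that device ownership is independent between the two people in a contact. That independence assumption and the function name are ours and do not come from the reviewed studies. Actual app adoption is a further, separate factor that is not modelled here.

```python
# Device shares as cited from Marabelli et al. (2021).
device_share = {"United States": 0.80, "Russia": 0.65, "Brazil": 0.45}


def detectable_contact_share(share: float) -> float:
    """Upper bound on the share of contacts in which both people carry a device.

    Assumes device ownership is independent between the two people,
    which is a simplification made only for this illustration.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return share ** 2


for country, share in device_share.items():
    both = detectable_contact_share(share)
    print(f"{country}: {share:.0%} with devices, about {both:.0%} of contacts covered")
```

Under this simplification the covered share of contacts falls much faster than device ownership itself, so a population with under half of its members equipped could register only around a fifth of its contacts.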

c.5) Workforce and Information and Communication Technologies infrastructure

Developing an app and maintaining the system require a specific workforce and a consistent Information and Communication Technologies (ICT) infrastructure that may be lacking. Some countries may struggle with the technological infrastructure, especially countries with a high incidence (such as Chad or the Central African Republic) where ICT infrastructures are very poor. These factors can hinder the development of technological innovation policies as part of their response to COVID-19 (Mbunge, 2020).

4. Discussion

This is the first study to offer a categorization by typology of the main biases, limitations, and related ethical issues identified in current scholarship on AI systems developed for triage or PRP and DCTApps during COVID-19. The specific focus on the role of SDOH in the emergence of bias in these systems and its analysis is also a novel contribution.

One of the main findings from this review is that while references to "health disparities" are relatively frequent in the study of AI systems, references to "SDOH" are rather uncommon. We may also point out that definitions and terminologies vary from one author to another, which has made it more difficult to identify and systematize them. Based on the results, we argue that SDOH and health disparities are rarely taken into account in triage and PRP studies and are mostly discussed in relation to DCTApps. These findings suggest that SDOH are undervalued in a clinical context and need to be given more consideration.

Our review shows that data is geographically dependent and that its use as training data across regions and countries results in biases in the outcomes. The use of local data to develop international solutions can increase biases in other local populations due to epidemiological differences.
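One simple way to see how epidemiological differences alter the behaviour of a transferred model is through the positive predictive value (PPV), which follows from Bayes' rule and depends on local prevalence. The sketch below is an illustration under stated assumptions, not a result from the reviewed literature: the sensitivity, specificity, and the two prevalence values are hypothetical, and the model's sensitivity and specificity are assumed to stay the same across regions, which may not hold in practice.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Probability that a positive prediction is a true positive (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical risk model with fixed sensitivity and specificity.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# Hypothetical prevalences: the region that supplied the training data
# versus another region where the same model is deployed.
ppv_training_region = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.20)
ppv_other_region = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.02)

print(f"PPV where prevalence is 20%: {ppv_training_region:.2f}")
print(f"PPV where prevalence is 2%:  {ppv_other_region:.2f}")
```

With these assumed values, the same positive prediction is far less reliable in the lower-prevalence region, even though the model itself is unchanged.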

We have also found that ML solutions can be epidemiologically effective and, at the same time, ethically fraught because of design biases. The rapid development of DCT solutions has been shown to accelerate the identification of infected people, but at an ethical cost, raising worrying issues of lack of privacy, biased data, and socio-economic disparities. Our findings reiterate the lack of comparative studies and literature on the effectiveness and convenience of DCTApps during the pandemic outbreak, as previous research has concluded. This makes it more difficult to evaluate the success of these solutions and their opportunity cost, as well as to assess changes and improvements in this technology. Further research on this issue is required.

Finally, gender is never examined as a possible source of bias in the results of published studies about biases in AI systems for COVID-19. Race, age, and socioeconomic status, by contrast, appear more often as indicators in the data used in AI systems.

Our findings are in agreement with Ausín and Andreu Martínez (2020), who have found some ethical elements to take into consideration for DCT:

1. The security and safety of technological systems respond to the duty not to cause harm, or to minimize it, protecting individuals and groups.

2. The intervention must be proportional and beneficial given the severity of the situation.

3. The installation of the app must be voluntary and require people’s consent, not carrying a penalty for its non-acceptance.

4. The data must be pseudonymized to protect privacy.

5. Applications and other technologies must be available and accessible to everyone, regardless of economic or technological level.

There are some limitations to our study. First, we had some difficulty distinguishing AI from other algorithms or statistical and mathematical methods without AI because of a lack of clarity in the literature. Second, our search strategy may have overlooked some concepts related to SDOH or included them in the category. Third, the majority (13 out of 20) of the included articles are narrative reviews (n = 10), rapid reviews (n = 1), or case studies (n = 2). Narrative reviews can be associated with selection bias (“COVID‐19 Session,” 2020; Pae, 2015). However, these articles were selected because they addressed bias and other ethical issues more explicitly than other types of articles or empirical studies, and because our aim was to identify currently known biases in the literature so that future research can look for other kinds of possible biases during the development and application of AI support systems. Finally, since we restricted our review to studies published in English and Spanish, we might have missed relevant work published in other languages.

Recognizing these limitations, we hope that our scoping review can help to document the types and extent of biases actually present in specific AI algorithms for triage or PRP and DCT in the context of the COVID-19 pandemic.

5. Conclusions

The analysis of previous literature shows that the main sources of bias identified in both triage or PRP and DCT AI systems for COVID-19 are related to data source variability and inadequate data collection. In addition, ethical problems related to privacy, consent, and the lack of regulation have been identified in DCTApps. Biases related to health disparities and SDOH are not the main topics of the studies but are in some way included, especially in narrative reviews of DCTApps. Although there is some concern about the topic, a theoretical framework aimed at researchers and engineers would facilitate the comprehension and identification of potential biases in future technologies and their uses.

References

Anglemyer, A., Moore, T.H.M., Parker, L., Chambers, T., Grady, A., Chiu, K., Parry, M., Wilczynska, M., Flemyng, E., Bero, L. (2020). Digital contact tracing technologies in epidemics: A rapid review. Cochrane Database of Systematic Reviews, 8. https://doi.org/10.1002/14651858.CD013699
Arksey, H., O’Malley, L. (2005). Scoping studies: Towards a methodological framework. International Journal of Social Research Methodology: Theory and Practice, 8(1), 19–32. https://doi.org/10.1080/1364557032000119616
Ausín, T., Andreu Martínez, M.B. (2020). Ética y protección de datos de salud en contexto de pandemia: una referencia especial al caso de las aplicaciones de rastreo de contactos. Enrahonar: An International Journal of Theoretical and Practical Reason, 65, 47–56. https://doi.org/10.5565/rev/enrahonar.1304
Bengio, Y., Janda, R., Yu, Y.W., Ippolito, D., Jarvie, M., Pilat, D., Struck, B., Krastev, S., Sharma, A. (2020). The need for privacy with public digital contact tracing during the COVID-19 pandemic. The Lancet Digital Health, 2, 342–344. https://doi.org/10.1016/S2589-7500(20)30133-3
Casiraghi, E., Malchiodi, D., Trucco, G., Frasca, M., Cappelletti, L., Fontana, T., Esposito, A.A., Avola, E., Jachetti, A., Reese, J., Rizzi, A., Robinson, P.N., Valentini, G. (2020). Explainable Machine Learning for Early Assessment of COVID-19 Risk Prediction in Emergency Departments. IEEE Access, 8, 196299–196325. https://doi.org/10.1109/ACCESS.2020.3034032
Colizza, V., Grill, E., Mikolajczyk, R., Cattuto, C., Kucharski, A., Riley, S., Kendall, M., Lythgoe, K., Bonsall, D., Wymant, C., Abeler-Dörner, L., Ferretti, L., Fraser, C. (2021). Time to evaluate COVID-19 contact-tracing apps. Nature Medicine, 27(3), 361–362. https://doi.org/10.1038/s41591-021-01236-6
Grantz, K.H., Meredith, H.R., Cummings, D.A.T., Metcalf, C.J.E., Grenfell, B.T., Giles, J.R., Mehta, S., Solomon, S., Labrique, A., Kishore, N., Buckee, C.O., Wesolowski, A. (2020). The use of mobile phone data to inform analysis of COVID-19 pandemic epidemiology. Nature Communications, 11(1), 1–8. https://doi.org/10.1038/s41467-020-18190-5
Gulliver, R., Fahmi, M., Abramson, D. (2020). Technical considerations when implementing digital infrastructure for social policy. Australian Journal of Social Issues, 1–19. https://doi.org/10.1002/ajs4.135
Hisada, S., Murayama, T., Tsubouchi, K., Fujita, S., Yada, S., Wakamiya, S., Aramaki, E. (2020). Surveillance of early stage COVID-19 clusters using search query logs and mobile device-based location information. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-75771-6
Jamshidi, M., Lalbakhsh, A., Talla, J., Peroutka, Z., Hadjilooei, F., Lalbakhsh, P., Jamshidi, M., Spada, L., Mirmozafari, M., Dehgani, M., Sabet, A., Roshani, Sa., Roshani, So., Bayat-Makou, N., Mohamadzade, B., Malek, Z., Jamshidi, A., Kiani, S., Hashemi-Dezaki, H., Mohyuddin, W. (2020). Artificial Intelligence and COVID-19: Deep Learning Approaches for Diagnosis and Treatment. IEEE Access, 8, 109581–109595. https://doi.org/10.1109/ACCESS.2020.3001973
Klingwort, J., Schnell, R. (2020). Critical limitations of digital epidemiology: Why COVID-19 apps are useless. Survey Research Methods, 14(2), 95–101. https://doi.org/10.18148/srm/2020.v14i2.7726
Mali, S.N., Pratap, A.P. (2020). Targeting infectious Coronavirus Disease 2019 (COVID-19) with Artificial Intelligence (AI) applications: Evidence based opinion. Infectious Disorders - Drug Targets, 20. http://doi.org/10.2174/1871526520666200622144857
Malik, Y.S., Sircar, S., Bhat, S., Ansari, M.I., Pande, T., Kumar, P., Mathapati, B., Balasubramanian, G., Kaushik, R., Natesan, S., Ezzekouri, S., El Zowalaty, M.E., Dhama, K. (2020). How artificial intelligence may help the Covid-19 pandemic: Pitfalls and lessons for the future. Reviews in Medical Virology, e2205. https://doi.org/10.1002/rmv.2205
Marabelli, M., Vaast, E., Li, J.L. (2021). Preventing the digital scars of COVID-19. European Journal of Information Systems, 30(2), 176–192. https://doi.org/10.1080/0960085X.2020.1863752
Mbunge, E. (2020). Integrating emerging technologies into COVID-19 contact tracing: Opportunities, challenges and pitfalls. Diabetes & Metabolic Syndrome: Clinical Research & Reviews, 14(6), 1631–1636. https://doi.org/10.1016/j.dsx.2020.08.029
Mbunge, E., Akinnuwesi, B., Fashoto, S.G., Metfula, A.S., Mashwama, P. (2020). A critical review of emerging technologies for tackling COVID-19 pandemic. Human Behavior and Emerging Technologies, 3(1), 25–39. https://doi.org/10.1002/hbe2.237
Moss, E., Metcalf, J. (2020). High Tech, High Risk: Tech Ethics Lessons for the COVID-19 Pandemic Response. Patterns, 1(7). https://doi.org/10.1016/j.patter.2020.100102
Munn, Z., Peters, M.D.J., Stern, C., Tufanaru, C., McArthur, A., Aromataris, E. (2018). Systematic review or scoping review? Guidance for authors when choosing between a systematic or scoping review approach. BMC Medical Research Methodology, 18(1), 1–7. https://doi.org/10.1186/S12874-018-0611-X
Pae, C.U. (2015). Why Systematic Review rather than Narrative Review? Psychiatry Investigation, 12(3), 417–419. https://doi.org/10.4307/pi.2015.12.3.417
Page, M.J., McKenzie, J.E., Bossuyt, P.M., Boutron, I., Hoffmann, T.C., Mulrow, C.D., Shamseer, L., Tetzlaff, J.M., Akl, E.A., Brennan, S.E., Chou, R., Glanville, J., Grimshaw, J.M., Hróbjartsson, A., Manoj, M.L., Li, T., Loder, E.W., Mayo-Wilson, E., McDonald, S. … Moher, D. (2021). The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ, 372(71). https://doi.org/10.1136/bmj.n71
Park, S., Choi, G.J., Ko, H. (2020). Information Technology-Based Tracing Strategy in Response to COVID-19 in South Korea - Privacy Controversies. JAMA - Journal of the American Medical Association, 323(21), 2129–2130. https://doi.org/10.1001/jama.2020.6602
Pham, M.T., Rajić, A., Greig, J.D., Sargeant, J.M., Papadopoulos, A., McEwan, S.A. (2014). A scoping review of scoping reviews: advancing the approach and enhancing the consistency. Research Synthesis Methods, 5(4), 371–385. https://doi.org/10.1002/JRSM.1123
Ravizza, A., Sternini, F., Molinari, F., Santoro, E., Cabitza, F. (2021). A Proposal For COVID-19 Applications Enabling Extensive Epidemiological Studies. Procedia Computer Science, 181, 589–596. https://doi.org/10.1016/j.procs.2021.01.206
Roche, S. (2020). Smile, you’re being traced! Some thoughts about the ethical issues of digital contact tracing applications. Journal of Location Based Services, 14(2), 71–91. https://doi.org/10.1080/17489725.2020.1811409
Röösli, E., Rice, B., Hernandez-Boussard, T. (2021). Bias at warp speed: how AI may contribute to the disparities gap in the time of COVID-19. Journal of the American Medical Informatics Association, 28(1), 190–192. https://doi.org/10.1093/jamia/ocaa210
Sáez, C., Romero, N., Conejero, J.A., García-Gómez, J.M. (2021). Potential limitations in COVID-19 machine learning due to data source variability: A case study in the nCov2019 dataset. Journal of the American Medical Informatics Association: JAMIA, 28(2). https://doi.org/10.1093/jamia/ocaa258
Scott, I.A., Coiera, E.W. (2020). Can AI help in the fight against COVID-19? Medical Journal of Australia, 213(10), 439–441.e2. https://doi.org/10.5694/mja2.50821
Shachar, C., Gerke, S., Adashi, E.Y. (2020). AI Surveillance during Pandemics: Ethical Implementation Imperatives. Hastings Center Report, 50(3), 18–21. https://doi.org/10.1002/hast.1125
Sun, R., Wang, W., Xue, M., Tyson, G., Camtepe, S., Ranasinghe, D.C. (2021). An Empirical Assessment of Global COVID-19 Contact Tracing Applications. 43rd International Conference on Software Engineering (ICSE), 1085–1097. https://doi.org/10.1109/ICSE43902.2021.00101
Tricco, A.C., Lillie, E., Zarin, W., O’Brien, K., Colquhoun, H., Kastner, M., Levac, D., Ng, C., Pearson Sharpe, J., Wilson, K., Kenny, M., Warren, R., Wilson, C., Stelfox, H.T., Straus, S.E. (2016). A scoping review on the conduct and reporting of scoping reviews. BMC Medical Research Methodology, 16(1), 1–10. https://doi.org/10.1186/S12874-016-0116-4
Wynants, L., Van Calster, B., Collins, G.S., Riley, R.D., Heinze, G., Schuit, E., Bonten, M.M.J., Damen, J.A.A., Debray, T.P.A., De Vos, M., Dhiman, P., Haller, M.C., Harhay, M.O., Henckaerts, L., Kreuzberger, N., Lohmann, A., Luijken, K., Ma, J., Andaur Navarro, C.L., Van Smeden, M. (2020). Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal. BMJ, 369. https://doi.org/10.1136/bmj.m1328
