Geographic Information Systems and COVID-19: The Johns Hopkins University Dashboard

Background: This article presents a single case study on the development of a GIS for global monitoring of the coronavirus (COVID-19). To that end, it presents concepts about GIS, its use and evolution in epidemic events, the context of the current coronavirus outbreak, and the significant results of consolidating a dashboard with reliable data. Methods: A single case study was carried out of a GIS in continuous development, with data sharing and comments from the scientific community. Because it is neither a post-mortem analysis nor a follow-up to a completed successful case, it was not possible to use more rigorous and systematic approaches such as those used by Lee (1989) and Onsrud, Pinto and Azad (1992) for case studies in GIS. Results: The case study presents the results of the development of a monitoring dashboard, as well as the consolidated data made available by researchers at Johns Hopkins University, who delivered a reliable platform that became a world reference for the health community. Conclusions: Efforts to develop a dashboard and provide data on the coronavirus outbreak resulted in the immediate replication of several other information systems with different approaches (Power BI, R, Tableau), and the dashboard became a reference for any new global epidemic outbreak.


Introduction
This article deals with the development of a Geographic Information System (GIS) and of data on the recent coronavirus outbreak and its monitoring by Johns Hopkins University (JHU), which has become a world reference in the monitoring of epidemic outbreaks. To that end, the researchers monitored the dashboard daily, following its evolution up to its current stage, in which it has become a global success story. One sign of this day-by-day follow-up is that the research initially used the term 2019-nCoV (the same term used in the header of the initial versions of the JHU dashboard). This name was temporary; on February 11, 2020, the disease received the name COVID-19 (PAHO Brasil, 2020). The importance of the case goes beyond the intrinsic aspects of GIS and extends to solutions for monitoring global epidemic outbreaks. We present concepts related to the theme, as well as the evolution of the way epidemic outbreaks have been mapped since John Snow's map of cholera in 1854. Finally, we hope that this work will contribute not only to the evolution of GIS in the health area but also to its use for any other phenomenon involving location (violence against women, natural disasters, earthquakes, energy, water, housing density, murders, robberies, thefts, environmental pollution, among others).
An example is the effort of the Bill & Melinda Gates Foundation, which developed a digital mapping system in Nigeria to help health professionals target specific areas for immunization in the fight to eradicate polio (Gates Notes, 2012). The vaccination teams carry mobile phones that serve as field tracking devices and collect GPS location data. This example shows the power of GIS not only for storytelling, such as sharing places (Flickr or Picasa), but also for decision making, such as predicting which city or region will be affected by a natural catastrophe.
Among the benefits of an information system, better information, improved services, and increased productivity can be highlighted (Nickerson, 1998). Decision-making processes and their results can be affected by several factors, among which data quality is critical. Although this fact is widely known, data quality is still a critical issue in organizations due to the vast volumes of data available on their systems. Poor-quality data can lead to very bad decisions. Cerchiari and Erdmann (2008) report that epidemiology deals with the study of the factors that determine the appearance, frequency, distribution, evolution, control measures, eradication, and prevention of diseases. Buss (1995), writing about Brazil, reports that there are difficulties in discussing health conditions, as well as deficiencies in the indicators customarily used to measure them and precariousness in the available information, in particular on morbidity and mortality.
Thus, this article, through the presentation of a successful case, aims to contribute not only to the information systems area but also to making legislators and health services managers, in general, aware of the need to expand the use of location-based systems. Some questions to be answered about the dashboard are common to information systems, such as: What problem is JHU trying to solve? What gaps exist in dashboard performance? What are its goals? Where does the data reside? How many data sources is the dashboard using? What are the data delay requirements?

Geographic Information Systems
A Geographic Information System (GIS) is a computer-based technology designed to analyze, manage, store, and display geospatial data (Chang, 2014). GIS are systems equipped with software capable of analyzing and displaying data using digitized maps in order to improve planning and decision making (Laudon & Laudon, 2011). A GIS allows us to see, understand, question, interpret, and visualize our world in order to reveal patterns, relationships, and trends in the form of maps, globes, reports, and graphs. The technologies related to geographic information are broad and involve analytical methods, cartography and visualization, design, data modeling, geospatial data, geocomputing, and data manipulation, in addition to organizational and institutional aspects and links to society (DiBiase et al., 2006).
A GIS is an information system that supports decision making based on geographic location, since certain information depends on where it originated. A GIS includes a database in which all data are organized by geographic location, and almost any type of data can be stored in it. The data are stored in layers that are joined by their common geographic location (Nickerson, 1998).
GIS is a particular category of DSS (Decision Support Systems) (Laudon & Laudon, 2011; João, 2015) that, thanks to data visualization technology, analyzes and displays data for planning and decision making in the form of digitized maps. The software can collect, store, manipulate, and display information geographically, tying data to points, lines, and areas on a map. A GIS also has a modeling feature, allowing managers to change data and automatically review business scenarios in search of better solutions.
Geographic analysis is the main use of GIS. Depending on the project, there are many different analytical approaches to choose from. GIS modeling tools make it relatively easy to perform simple or complex analyses and to create new results.
As Chang (2014) mentioned, GIS can store, manipulate, and display geospatial data on computer systems. After the data is collected, edited, and referenced to a designed coordinate system, the following step is to make the data readily available to users to make maps, assist in fieldwork and perform spatial analysis.

Web GIS, the product of integrating web and GIS technologies, differs from traditional GIS in that it masks the differences between the various types of databases, networks, hardware, and software (Lu et al., 2010). It is a platform that provides GIS resources to many organizations, which share and collaborate on these resources over the Internet in order to access and use geographic information easily (Law, 2013). Web GIS therefore provides accessible, manageable, and shareable global geographic data and information to all users without distinction (Liu et al., 2009).
Modern GIS use GPS (Global Positioning System), which is one of the GNSS (Global Navigation Satellite Systems). There are others, such as the Russian GLONASS system and the European Galileo system. Even social networks like Twitter provide spatial data, in the form of latitude and longitude for each tweet posted, which facilitates tracking in outbreak situations. The science and technology associated with imaging the earth is called remote sensing. More advanced systems use lasers for mapping, such as those used by LIDAR (Light Detection and Ranging). This technique is capable of generating extremely detailed three-dimensional models of the earth's surface and was used, for example, to measure the effects of Hurricane Sandy on the coast of New York and New Jersey in 2012. Earthquakes are another example of GIS use, in which depth and, of course, the reported magnitude of the event are recorded in addition to latitude and longitude. Roux and Mair (2010) studied neighborhoods and health, that is, how residential environments can affect health and contribute to ethnic and racial inequalities in health, focusing on chronic disease outcomes (specifically obesity and related risk factors) and mental health (specifically depression and depressive symptoms). They state that the explosion of GIS and spatial analysis techniques allows space to be examined in a much more detailed and sophisticated way than was possible in the past.
The use of GIS for disease outbreaks is not new. An example of a proto-GIS is the work carried out by Dr. John Snow in the late summer of 1854, when he mapped the detected cases of a cholera outbreak in London's Soho district in order to locate its source. From August 31 to September 3, 127 people died of cholera (ARCGIS, 2020). Within a week, 500 people had died, and about one in seven people who developed cholera eventually died. All of this occurred within 250 meters of the intersection of Cambridge Street and Broad Street (Figure 1, shown in detail on the right). This proto-GIS allowed Snow to accurately locate the contaminated water that was the source of the outbreak.
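The proximity reasoning behind Snow's map (which cases lie within a given distance of a suspected source) can be illustrated with a short sketch. It uses the standard haversine great-circle formula; the coordinates below are hypothetical points chosen for illustration only, not historical data, and the names `PUMP`, `CASES`, and `cases_within` are assumptions introduced here.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cases_within(center, cases, radius_m=250.0):
    """Return the cases located at most radius_m meters from center."""
    return [c for c in cases
            if haversine_m(center[0], center[1], c[0], c[1]) <= radius_m]


# Hypothetical (latitude, longitude) pairs, for illustration only.
PUMP = (51.5133, -0.1367)
CASES = [(51.5135, -0.1370), (51.5140, -0.1360), (51.5200, -0.1300)]

nearby = cases_within(PUMP, CASES)
print(f"{len(nearby)} of {len(CASES)} cases lie within 250 m of the suspected source")
```

A production GIS would normally rely on projected coordinates and spatial indexes rather than a linear scan, but the question being asked of the data is the same.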
Although the techniques have advanced considerably since the John Snow cholera map, the basic principles established by Snow still exist in current epidemiological thinking. Figure 2 presents a rereading of Snow's cholera map using current techniques.
An example of how to develop a GIS project is presented by Brewer (2006), who describes the development of a system for monitoring prostate cancer mortality along a timeline (how a set of events unfolds over time) and across different populations.

About Wuhan and Coronavirus
Wuhan is a traditional industrial center. Of the 500 largest global companies, 230 have investments in the city (BBC News, 2020). In the automobile sector alone, there are ten factories, with Dongfeng Peugeot Citroen, Nissan, Honda, and GM standing out. There is also a nascent innovation industry. [...] are not yet fully known (Gardner, 2020). Still according to Gardner (2020), infected travelers (mainly air travelers) are known to be responsible for introducing the virus outside Wuhan. On January 13, Thailand reported the first international case outside of China, while the first cases in China outside Wuhan were reported on January 19 in Guangdong and Beijing. On January 20, the National Health Commission of China (NHC) confirmed that the coronavirus could be transmitted between humans. On the same day, Japan and South Korea confirmed human infections with COVID-19, and the following day cases were detected in the US and Taiwan in travelers returning from Wuhan. On January 21, several provinces in China were also registering new cases, and the infection was confirmed in 15 health professionals, with six deaths reported. Additional travel-related cases were confirmed in Hong Kong, Macau, Singapore, and Vietnam.
On January 22, a WHO emergency committee met to discuss whether the outbreak should be classified as a Public Health Emergency of International Concern (PHEIC) under the International Health Regulations. The committee was initially undecided due to a lack of information and then decided against the declaration.
Of immediate concern was the risk of additional transmission resulting from high volumes of travel and mass gatherings in celebration of the Chinese New Year on January 24. In an attempt to mitigate local transmission in China, unprecedented outbreak control strategies were implemented in (initially) three cities. On January 23, 2020, Wuhan suspended all public transport and air travel (into and out of the city), quarantining all 11 million people in the city. On January 24, Huanggang and Ezhou, cities adjacent to Wuhan, were also placed in a similar quarantine. Besides, many cities have

The Coronavirus dashboard
The GIS coronavirus (COVID-19) dashboard from the CSSE (Center for Systems Science and Engineering) at Johns Hopkins University (Figure 4) was developed in response to this public health emergency. It is an online dashboard, although not a real-time one, for viewing and tracking reported cases on a daily time scale. Its development included the team from Esri Living Atlas and the data services team from Johns Hopkins University (JHU Data Services). An important detail is that the complete data set can be downloaded, initially as a Google spreadsheet and later as CSV files, for use by other applications around the world. The Johns Hopkins dashboard is thus also a reference for monitoring the outbreak's evolution.
Initial data were collected from various sources, including the WHO (World Health Organization), the US CDC, the China CDC (CCDC), the ECDC (European Centre for Disease Prevention and Control), the NHC, and DXY. DXY is a Chinese website that aggregates situation reports from the NHC and local CCDC offices in near real-time, providing more current regional case estimates than national reporting organizations are capable of; it is therefore used for all Mainland China cases reported on the dashboard (confirmed, suspected, recovered, deaths). US cases (confirmed, suspected, recovered, deaths) are taken from the US CDC, and case data for all other countries (suspected and confirmed) are collected from the corresponding regional health departments. The dashboard aims to provide the public with an understanding of the outbreak situation as it unfolds, with transparent data sources. A comprehensive view of the various data sources is provided at https://github.com/CSSEGISandData/COVID-19.
Gardner, cited by Donovan (2020), said of the dashboard initiative: "We created this dashboard because we think it is essential for the public to understand the situation of the outbreak as it develops with transparent data sources. For the research community, this data will become more valuable as we continue to collect it over time. Making data available for download is 'critical' for researchers." The dashboard is structured into several small panels, with an emphasis on total confirmed cases (Total Confirmed), total confirmed deaths (Total Deaths), total recovered patients (Total Recovered), cases confirmed by country or province, and the date of the last update. One panel plots, on a linear scale, the evolution of total cases for Mainland China and other locations; it now also includes total recoveries, as well as graphs on a logarithmic scale and of the daily increment (new confirmations versus new recoveries), which make the growth and subsequent decline of new cases clearer. Finally, a panel presents information on the dashboard itself and its various data sources (Table 1).
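The relationship between the cumulative, daily-increment, and logarithmic views described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dashboard's actual implementation: the function names are introduced here and the sample numbers are invented.

```python
import math


def daily_increments(cumulative):
    """Convert a cumulative series into day-over-day increments.

    The first value is kept as-is because there is no previous day to subtract.
    """
    if not cumulative:
        return []
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


def log10_scale(values):
    """Return log10 of each value; non-positive values map to None (log undefined)."""
    return [math.log10(v) if v > 0 else None for v in values]


# Invented cumulative totals of confirmed cases on five consecutive days.
confirmed = [100, 150, 250, 400, 1000]

print("new cases per day:", daily_increments(confirmed))
print("log10 of totals:  ", [round(x, 2) for x in log10_scale(confirmed)])
```

On a logarithmic axis, a constant growth rate appears as a straight line, so a flattening curve signals slowing growth; the daily increments make the same slowdown visible as shrinking bars.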

Data and its consolidation
All data collected and displayed are made available, initially through Google Spreadsheets (Figure 5), with one sheet presenting the data for each bulletin (upper part of Figure 5) and another presenting the data as a time series (lower part). These data are now provided in CSV (comma-separated values) text format through the GitHub repository (https://github.com/CSSEGISandData/COVID-19), along with the dashboard's feature layers, which are now included in the Esri Living Atlas. CSV files can be used directly by software such as Excel, by BI tools, or by languages such as R and Python, allowing their immediate use by the international scientific community. Between January 22 and 31, all data collection and processing was done manually, and updates were usually made twice a day, morning and evening (Eastern Time). As the outbreak evolved, the manual reporting process became impracticable, and as of February 1 a semi-automated data loading process was adopted (Dong, Du, & Gardner, 2020). (> 21-60) and, finally, (1-21) confirmed cases. The maps can be customized in multiple styles: imagery hybrid, streets, topographic, navigation, streets at night, terrain with labels, light and dark gray canvas, geography style map, oceans, and, finally, OpenStreetMap.
As we can see, each daily bulletin is composed of Province/State, Country or Region, date and time of the information, number of confirmed cases, number of deaths, and number of recovered cases (patients who returned to normal health). Greater detail can be seen in the structure of the data files themselves. The application structure consists of three layers: Deaths, Cases, and Cases_Country. According to Dong, Du, and Gardner (2020), before each manual update the data were confirmed with the regional and local health departments, which resulted in more reliable data.
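As an illustration of how a bulletin with the fields listed above can be consumed by other tools, the sketch below aggregates province-level rows into country totals using only the Python standard library. The exact header strings and all numbers in `SAMPLE_BULLETIN` are assumptions made for this example; the real files in the repository should be inspected before reuse.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; the header names are assumed from the fields described in the text.
SAMPLE_BULLETIN = """Province/State,Country/Region,Last Update,Confirmed,Deaths,Recovered
Hubei,Mainland China,2020-02-01 11:53,100,5,10
Guangdong,Mainland China,2020-02-01 11:53,20,0,2
,Thailand,2020-02-01 11:53,3,0,1
"""

COUNT_FIELDS = ("Confirmed", "Deaths", "Recovered")


def totals_by_country(csv_text):
    """Sum confirmed, deaths, and recovered counts per country or region."""
    totals = defaultdict(lambda: {field: 0 for field in COUNT_FIELDS})
    for row in csv.DictReader(io.StringIO(csv_text)):
        country = row["Country/Region"]
        for field in COUNT_FIELDS:
            # Empty cells are treated as zero.
            totals[country][field] += int(row[field] or 0)
    return dict(totals)


for country, counts in totals_by_country(SAMPLE_BULLETIN).items():
    print(country, counts)
```

The same aggregation corresponds conceptually to the step from a province-level layer to a country-level layer such as Cases_Country, although how the dashboard itself performs that step is not described in the sources cited here.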

Final Considerations
Despite the efforts of governments and international organizations, such as the WHO daily bulletins (WHO, 2020), the JHU dashboard became a world reference for monitoring the coronavirus, with more than 400 million accesses on February 5, 2020. Companies linked to the global financial market, such as Bloomberg, and information providers started to use data from the JHU dashboard in their information services. Media outlets like Newsweek, PBS News Hour, and ABC News cited the dashboard in their reports on the outbreak.
Also, according to Dong, Du, and Gardner (2020), given the popularity and impact of the dashboard, Johns Hopkins plans to continue hosting and managing the tool throughout the COVID-19 outbreak cycle and to develop its capabilities, as a permanent tool, to monitor and report future outbreaks. In their words, "We believe that our efforts are crucial to help inform the modeling efforts and control measures during the early stages of the outbreak." New systems were developed based on data from the JHU dashboard. On February 24, there was at least one application each in Tableau (Jean-Paul Cavalier, 2020), Power BI (http://bit.ly/38qeMbz), and R Markdown (https://github.com/kevinlanning/2019-nCoV), which in itself shows the power of making data available to the community.
Kwan and colleagues (Kwan, 2004; Kwan & Lee, 2004) demonstrated that space-time analyses aided by GIS reveal the highly individualized and complex spatial routines that people follow in their daily lives. This type of work shows that expanding studies to include the measurement of individual exposure to multiple "contexts" in time and space would be an essential step forward.
Environmental epidemiology uses information about existing risk factors (physical, chemical, biological, mechanical, ergonomic, and psychosocial), about the characteristics of the environment that interfere with the population's health standards, about the people exposed, and about the adverse health effects. According to Croner, Sperling, and Broome (1996), the use of GIS in environmental epidemiology improves the ability of researchers to study environmental risk factors for diseases.
However, we must highlight the human errors involved in the process, among which we can mention: 1) slow communication of the beginning of the outbreak, with the mayor of Wuhan, Zhou Xianwang, admitting that he did not immediately report the first suspicions of the virus; 2) lack of tests (diagnostic kits), which only arrived at the hospitals in Wuhan on January 20; and 3) mistaken assessment by the World Health Organization (WHO), which initially pointed to a moderate risk of epidemic and only subsequently declared an international emergency.
Pei (2020) points out that China has a long history of epidemic outbreaks, such as SARS (severe acute respiratory syndrome) in 2002, and argues that the Chinese Communist Party needs to keep the Chinese public convinced that everything is going as planned by the Party. An example is an official statement claiming that there was no evidence that the new disease could be transmitted between humans and that no health worker had been infected. These statements proved to be false. Another issue was that there was initially little coverage in the Chinese press, in addition to censors removing references to the outbreak and tighter government control over the Internet, the media (including the popular WeChat), and civil society.