COVID-19: visualizing peaks and waves in recorded cases across the globe

COVID-19 has affected different parts of the globe in waves. Yet, waves are not specifically defined. Using a simple and transparent approach, I provide a bird’s-eye view of waves of recorded cases of COVID-19 at different locations across the globe. Using a widely available dataset, I compile and visualize the peak size and number of waves for over 150 locations, with a population of at least a million each. Most locations experienced one to three waves. Whether the waves increased or decreased in size within locations varies. I have created a Shiny app that helps the reader interact with the data graphs.


Introduction
Defining and characterizing waves is a basic step towards understanding a key feature of the pandemic, one that is relevant to citizens, policymakers and modellers. Although a wave of infections is a familiar phenomenon for epidemiologists, it can be puzzling for non-specialists.
Data on recorded cases of COVID-19 are now publicly available for a sufficient length of time; I use the data on new cases (smoothed) per million for over 150 locations across the globe [6]. Only locations with a population of at least a million each are considered. Recorded cases are often poor estimates of real cases; yet, they are often used in narratives by the media and in official accounts.
Institute of Economic Growth, Delhi, India. E-mail: vik-day@iegindia.org
Many studies have documented certain aspects of waves of COVID-19 in specific locations. Zhang et al. [10] is notable for a global perspective. They provide an example in the US of the vice president and a leading infectious disease expert making contradictory statements about waves on the same day. According to Zhang et al. [10], a wave may be characterized as an upward and downward movement sustained for a period of time. They use a model-based approach to define and operationalize waves, and then estimate numbers and durations of waves. Zhang et al. [10] state: "While it is hard to compare the daily or total COVID-19 deaths and cases between countries due to their different sizes, waves capture the changes (upward or downward) within a country and hence present an alternative approach to assess the fluctuation of epidemics in a country or a region".
I use an alternative approach that is simple, intuitive and transparent. I use a function in the ggpmisc package in the R software to find peaks in a time series. A peak is the highest value within a window of a given number of days (the 'span' parameter) that also lies above a given fraction of the range observed in the data (the 'ignore threshold' parameter). Since sustained high values of cases cause great suffering, this approach has intuitive appeal. I take values of 61 for 'span' (30 days on either side of the peak) and 0.4 for 'ignore threshold'. As a result, I can provide a bird's-eye view of peaks and waves in recorded COVID-19 cases across the globe.
Data visualizations play an important role in communicating insights about the pandemic, and Shiny apps are a useful tool [8]. But data visualizations can also motivate us to think about the processes that generated the data, and to revise our conceptual model. In the case of the epidemic of plague in Bombay, if we simply graph the deaths in 1906, we can find a model that fits the data well, and we see that deaths rose and then fell. However, if we graph the deaths from 1897 to 1911, then the clear evidence of seasonality in the plague deaths will prompt us to revise our model, as done by [2].
I have developed a Shiny app that accompanies this note. Users can interact with the data graphs, move between the global view and a single country to zoom in on, and see how sensitive the results are to the two parameters that are used to characterize and operationalize a wave (available at https://vikday.shinyapps.io/Wave_peak_1June21/).

Methods
I use a data science approach to work with the data and visualize it using the R software [5].
Finding peaks in a time series is important in many applications. Palshikar [4] provides a discussion of simple algorithms to find peaks that is instructive. In a time series, we can find a local peak in a window of a given size. Whether a local peak is a real peak depends on the overall time series; for example, we may use the relative size of the peak as a criterion [4].
The R package ggpmisc [1] can find and plot peaks. In a wave we have a sustained upward surge. Two inputs are required: (1) 'span': the duration; for COVID-19, a peak with a month on either side is reasonable, and (2) 'ignore threshold': the threshold below which we ignore a local peak (in terms of the range of the values observed). Several other packages in R were used: shiny [3], tidyverse [9], and plotly [7].
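To make the roles of the two inputs concrete, the following Python sketch marks a point as a peak when it is the strict maximum of a centred window of `span` points and lies more than `ignore_threshold` times the observed range above the minimum. It is an illustration only and is not the ggpmisc implementation: the function name `find_window_peaks`, the truncation of windows at the ends of the series, and the synthetic series in `demo_series` are all assumptions.

```python
from typing import List, Sequence


def find_window_peaks(values: Sequence[float], span: int = 61,
                      ignore_threshold: float = 0.4) -> List[int]:
    """Return indices of local peaks (hypothetical helper, not ggpmisc).

    A point counts as a peak if it is strictly greater than every other
    point in a centred window of `span` points, and if it exceeds
    min + ignore_threshold * (max - min) of the whole series.
    """
    if span < 3 or span % 2 == 0:
        raise ValueError("span must be an odd integer >= 3")
    if len(values) == 0:
        return []
    half = span // 2
    lowest, highest = min(values), max(values)
    cutoff = lowest + ignore_threshold * (highest - lowest)
    peaks = []
    for i, v in enumerate(values):
        if v <= cutoff:
            continue  # too small relative to the range of the series
        # Assumption: windows are simply truncated at the series ends.
        start = max(0, i - half)
        end = min(len(values), i + half + 1)
        if all(v > values[j] for j in range(start, end) if j != i):
            peaks.append(i)
    return peaks


def demo_series() -> List[float]:
    """Synthetic series: a large bump (height 100) and a small one (30)."""
    series = [0.0] * 200
    for d in range(-40, 41):
        series[50 + d] = 100.0 - 2.5 * abs(d)
        series[150 + d] = 30.0 - 0.75 * abs(d)
    return series


if __name__ == "__main__":
    print(find_window_peaks(demo_series()))                        # [50]
    print(find_window_peaks(demo_series(), ignore_threshold=0.2))  # [50, 150]
```

With the default threshold of 0.4 only the large bump is reported, because the small bump (30) lies below 40% of the range; lowering the threshold to 0.2 admits it as a second peak. This is the kind of sensitivity to the two parameters that the Shiny app lets the reader explore.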
The size of peaks and order of waves were extracted for each location and compiled and then visualized.
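The compilation step can be sketched as follows, assuming the peaks have already been found for each location and are available as (day index, peak size) pairs. The names `WavePeak`, `compile_waves` and `waves_per_location`, as well as the locations and values in the example, are hypothetical and do not come from the dataset.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Tuple


class WavePeak(NamedTuple):
    location: str
    wave_order: int    # 1 for the first wave, 2 for the second, ...
    peak_size: float   # new cases per million (smoothed) at the peak


def compile_waves(
    peaks_by_location: Dict[str, List[Tuple[int, float]]]
) -> List[WavePeak]:
    """Order each location's peaks by day and number them as waves."""
    rows = []
    for location, peaks in peaks_by_location.items():
        for order, (_day, size) in enumerate(sorted(peaks), start=1):
            rows.append(WavePeak(location, order, size))
    return rows


def waves_per_location(
    peaks_by_location: Dict[str, List[Tuple[int, float]]]
) -> Counter:
    """Count how many locations had 0, 1, 2, ... waves."""
    return Counter(len(peaks) for peaks in peaks_by_location.values())


if __name__ == "__main__":
    # Made-up example: (day index, peak size) pairs per location.
    example = {
        "Location A": [(300, 120.0), (90, 50.0)],
        "Location B": [(310, 80.0)],
        "Location C": [],
    }
    for row in compile_waves(example):
        print(row)
    print(waves_per_location(example))
```

The resulting rows (location, wave order, peak size) are in the form needed for a plot of peak size against wave order, and the counter gives a tabulation of the number of waves per location.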

Results and discussion
The data cover 157 locations, each with a population of at least a million, from 2020-01-01 to 2021-06-01. Figure 1 shows new cases per million (smoothed), henceforth called new cases, for a random sample of six locations; the red points are peaks of waves (span = 61, ignore threshold = 0.4). Three of the locations experienced more than one wave. In Germany the first wave was larger than the second, whereas in Jordan the opposite was true, and in Belarus the middle wave was the largest.
Of the 157 locations in all, 17 had no wave, 47 had one wave, 64 had two waves, and 24 had three waves; the remaining 5 had four or more. Figure 2 shows a bird's-eye view of peaks and waves for all locations, by continent. The y-axis shows the size of the peak of each wave (new cases per million (smoothed)) and the x-axis shows wave order (first to fourth waves). In this plot wave order is restricted to the first four to avoid crowding; the Shiny app shows all wave orders. Most locations experienced at least one wave, but the number of waves and the size of their peaks vary widely across locations.
The Shiny app that accompanies this paper helps the user by displaying the data underlying the points in the graph. In Africa, South Africa had the highest peak size of new cases, 321, in its second wave, which was higher than its first wave peak size. In Asia, Georgia had the highest peak size, 1121, and only one wave. Kuwait had five waves, with the last having the greatest peak size, 345. In Europe, Belgium had a very high peak size of 1536, with only one wave. A number of locations in Europe experienced a large increase in the size of the peak from the first to the second wave; for example, Portugal went from a peak size of 630 in the first wave to 1264 in the second. In North America, Honduras and El Salvador had four waves each. Panama and the US had the highest peak sizes, 842 and 758 respectively, experienced in one wave. Oceania saw low peak sizes. In South America, Ecuador had six waves, with the first the largest. Uruguay had only one wave and had the highest peak size in South America, at 1130.

Conclusions
Learning about waves of the pandemic is of interest not only to analysts but also to policymakers and citizens. Of the 157 locations in all, 17 had no wave, 47 had one wave, 64 had two waves, and 24 had three waves; the remaining 5 had four or more. Experiences thus varied widely across locations, which suggests that diverse factors, biophysical and social, were operating in different locations.

Conflicts of interest/Competing interests Not Applicable
Availability of data and material The data is publicly available [6].
The Shiny app is available at https://vikday.shinyapps.io/Wave_peak_1June21/.
Readers can explore the data interactively in the Shiny app accompanying the paper. Hovering over a point will reveal data values of points in the app. In this plot I have ignored wave order 5 or greater to avoid crowding, but in the app I show all wave orders.