Statistical distribution of novel coronavirus in Iran

Background and Aim: The coronavirus disease-2019 (COVID-19) pandemic, caused by a novel coronavirus (nCoV), began in 2019, and by March 27, 2020, 199 countries, including Iran, were affected. Prevention and control of the infection is the most important public health priority today. Predicting the behavior of COVID-19 is an important problem. Therefore, in the present research, we compared different distributions of COVID-19 cases based on the daily reported data in Iran. Materials and Methods: We focused on the first 36 daily observations of deaths and new cases with confirmed 2019-nCoV infection in Iran, taken from official reports from governmental institutes. We fitted three continuous distributions: Normal, Lognormal, and Weibull. Results: The Weibull distribution was the best fit to the data. However, the parameters of the distribution differed between the data on new cases and the data on daily deaths. Conclusion: According to the mean and median of the best-fitted distribution, we can expect to pass the peak of the disease. In other words, the death rate is decreasing. In the long run, similar behavior of COVID-19 can be seen in Iran and China.


Introduction
Emerging infectious diseases, such as severe acute respiratory syndrome (SARS) and Zika virus disease, serve as significant warnings for public health [1][2][3][4]. Despite significant medical research, however, how, when, and where new diseases will appear is still subject to considerable uncertainty [5]. On December 31, 2019, the Wuhan Municipal Health Commission in Wuhan City, Hubei Province, China, reported a cluster of pneumonia cases of unknown etiology in people with a history of exposure to Wuhan's Huanan Seafood Wholesale Market (a wholesale fish and live animal market selling different animal species). On January 9, 2020, the China CDC reported that a novel coronavirus (2019-nCoV) had been detected as the causative agent, and the genome sequence was made publicly available. Sequence analysis showed that the newly identified virus belonged to the SARS-CoV clade [6]. In an effort to contain the spread, travel restrictions were imposed on Wuhan from January 23 and have since been extended to 12 other cities, and large social gatherings were prohibited [7,8]. The coronavirus disease-2019 (COVID-19) pandemic, caused by 2019-nCoV, began in 2019, and by March 27, 2020, 199 countries, including Iran, were affected. According to worldwide statistics, the mortality rate is 3.4%. Early symptoms of COVID-19 include pneumonia, fever, myalgia, and fatigue. To date, no vaccine or antiviral agent has been clinically approved for COVID-19. Therefore, prevention and control of the infection is the most important public health priority [9,10].
During the 2019-2020 coronavirus pandemic, Iran reported its first confirmed cases of SARS-CoV-2 infection on February 19, 2020, in Qom [11]. As of March 27, 2020, according to Iranian health authorities, there had been 2378 COVID-19 deaths in Iran and more than 32,000 confirmed infections. On the same date, Iran had the fourth highest number of COVID-19 deaths after Mainland China, Italy, and Spain, and the highest in Western Asia. The mean age of patients in Iran was 59 years, and the sex ratio (male/female) was 1.4. Among deaths related to SARS-CoV-2, 59% were male and 41% were female. In early March 2020, sources outside the Iranian government estimated that the number of SARS-CoV-2 infections was much higher than the official figures [12][13][14][15][16].
There are considerable uncertainties in assessing the risk of this disease, owing to a lack of detailed epidemiological analyses. Extensive research into 2019-nCoV is needed to fully elucidate its pathway and pathogenic mechanisms and to identify potential therapeutic targets, which can be effective in developing common preventive and therapeutic measures. Predicting the behavior of COVID-19 is an important problem. Therefore, in the present research, we compared different distributions of COVID-19 cases based on the daily reported data.
Available at www.onehealthjournal.org/Vol.6/No.2/8.pdf

Ethical approval
This study used publicly available data and therefore did not require ethical approval.

Study design
We used data on deaths and new cases with confirmed 2019-nCoV infection in Iran based on official reports from governmental institutes [10,14]. We collected the data either directly from governmental websites or from news sites that directly quoted governmental statements. The data were collected in real time and thus may be updated as more details on cases become publicly available. The compiled data are available as Online Supplementary Material (file S1). The latest update to the dataset was on March 27, 2020, for cases reported up to March 26.
We used a bootstrap method, based on case resampling, to compute the 95% confidence intervals. We fitted three continuous distributions: Normal, Lognormal, and Weibull. The Akaike information criterion was used to identify the best-fitting model. We also report the median and mean of the best-fitted model to locate the peak.
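The case-resampling bootstrap mentioned above can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function name `bootstrap_ci`, the percentile method, the number of resamples, and the example counts are all assumptions made here, and the statistic is left generic rather than tied to a fitted distribution parameter.

```python
import random
import statistics


def bootstrap_ci(data, statistic, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval by case resampling."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Resample the observed cases with replacement, keeping the sample size.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    alpha = 1.0 - level
    # Read the interval limits off the sorted bootstrap estimates.
    lower = estimates[int((alpha / 2) * (n_resamples - 1))]
    upper = estimates[int((1 - alpha / 2) * (n_resamples - 1))]
    return lower, upper


# Hypothetical daily counts, for illustration only (not the study data).
example_counts = [12, 30, 45, 80, 120, 150, 210, 260, 340, 400, 520, 610]
low, high = bootstrap_ci(example_counts, statistics.mean)
```

Passing a function that returns a fitted shape or scale parameter in place of `statistics.mean` would give an interval for that parameter under the same resampling scheme.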

Normal distribution
A normal (or Gaussian, Gauss, or Laplace-Gauss) distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right)$$
The parameter μ is the mean or expectation of the distribution (and also its median and mode) and σ is its standard deviation. The variance of the distribution is σ². A random variable with a Gaussian distribution is said to be normally distributed and is called a normal deviate [17].

Lognormal distribution
A positive random variable X is lognormally distributed if the logarithm of X is normally distributed [18]:
$$\ln(X) \sim N(\mu, \sigma^{2})$$

Weibull distribution
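The probability density function of a Weibull random variable is:
$$f(x; \lambda, k) = \begin{cases} \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, & x \geq 0 \\ 0, & x < 0 \end{cases}$$
where k > 0 is the shape parameter and λ > 0 is the scale parameter of the distribution [19].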
All statistical analyses were performed using the fitdistrplus package in R (version 3.6.3) [20].
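The model comparison described in this section can be illustrated with a short, self-contained sketch. It is written in Python rather than R and does not reproduce fitdistrplus: the function names are hypothetical, the Weibull shape is found by simple bisection on the likelihood equation, the data are assumed to be strictly positive, and the sample is synthetic rather than the study data.

```python
import math
import random


def normal_loglik(xs):
    """Maximized log-likelihood of a Normal model (MLE mean and variance)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(xs):
    """Maximized log-likelihood of a Lognormal model (Normal fit on log x)."""
    logs = [math.log(x) for x in xs]
    # The change of variables adds the Jacobian term -sum(log x).
    return normal_loglik(logs) - sum(logs)


def weibull_mle(xs):
    """Return (shape k, scale lam) maximizing the Weibull likelihood."""
    n = len(xs)
    c = max(xs)
    ys = [x / c for x in xs]          # rescale so that powers cannot overflow
    logs = [math.log(y) for y in ys]
    mean_log = sum(logs) / n

    def score(k):
        # Likelihood equation for k; its root is the MLE of the shape.
        w = [y ** k for y in ys]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1 / k - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0:              # expand until the root is bracketed
        hi *= 2
    for _ in range(100):              # bisection on the increasing score
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    lam = c * (sum(y ** k for y in ys) / n) ** (1 / k)
    return k, lam


def weibull_loglik(xs):
    """Maximized log-likelihood of a Weibull model."""
    k, lam = weibull_mle(xs)
    n = len(xs)
    return (n * math.log(k) - n * k * math.log(lam)
            + (k - 1) * sum(math.log(x) for x in xs)
            - sum((x / lam) ** k for x in xs))


def aic_table(xs):
    """AIC = 2p - 2 log L, with p = 2 parameters for each candidate."""
    fits = {"normal": normal_loglik(xs),
            "lognormal": lognormal_loglik(xs),
            "weibull": weibull_loglik(xs)}
    return {name: 2 * 2 - 2 * ll for name, ll in fits.items()}


# Synthetic Weibull data for illustration only (not the study data).
rng = random.Random(1)
sample = [rng.weibullvariate(100.0, 1.5) for _ in range(5000)]
aic = aic_table(sample)
best = min(aic, key=aic.get)          # candidate with the smallest AIC
```

Because all three candidates have two parameters, ranking by AIC here is equivalent to ranking by maximized log-likelihood; the penalty term matters only when candidates differ in parameter count.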

Results
We used 36 daily observations from Iran, beginning on February 21, 2020. The smallest number of new cases was 11 (February 22, 2020) and the highest number was 2926 (March 27, 2020). The smallest number of daily deaths was 2 (February 21, 22, and 23, 2020) and the highest number was 157 (March 26, 2020). Figure-1 shows the histogram of daily new cases together with the three fitted distributions.
Table-1 presents the goodness of fit of the three distributions. The best fit is the Weibull distribution. Assuming a Weibull distribution for daily new cases, the mean, median, and mode of the data are 897.60, 624.61, and 4.59, respectively. Figure-2 shows the histogram of daily deaths together with the three fitted distributions. Table-2 presents the goodness of fit of the three distributions. Again, the best fit is the Weibull distribution. Assuming a Weibull distribution for daily deaths, the mean, median, and mode of the data are 66.44, 41.11, and 0, respectively.
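The mean, median, and mode quoted above follow from the standard closed-form expressions for a Weibull distribution with shape k and scale λ. The sketch below shows those expressions; the function name `weibull_summary` and the parameter values are hypothetical and are not the fitted values from Table-1 or Table-2.

```python
import math


def weibull_summary(shape, scale):
    """Mean, median, and mode of a Weibull(shape k, scale lam) distribution."""
    mean = scale * math.gamma(1 + 1 / shape)
    median = scale * math.log(2) ** (1 / shape)
    # The density peaks at zero when k <= 1 and at an interior point otherwise.
    mode = scale * ((shape - 1) / shape) ** (1 / shape) if shape > 1 else 0.0
    return mean, median, mode


# Hypothetical parameter values, chosen only to show both regimes.
print(weibull_summary(0.8, 50.0))   # shape < 1: mode is 0
print(weibull_summary(1.5, 50.0))   # shape > 1: mode is positive
```

A mode of 0 therefore corresponds to a shape parameter of at most 1, while a small positive mode corresponds to a shape parameter slightly above 1.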

Discussion
On March 11, 2020, the Director-General of the WHO said, "In the days and weeks ahead, we expect to see the number of cases, the number of deaths, and the number of affected countries climb even higher. We have therefore made the assessment that COVID-19 can be characterized as a pandemic" [21]. In the present study, we described the distribution of COVID-19 data in Iran using several continuous distributions. One of the distributions we used was the Weibull distribution. The Weibull distribution is related to a number of other probability distributions; in particular, it interpolates between the exponential distribution (k = 1) and the Rayleigh distribution (k = 2 and λ = √2σ). Its complementary cumulative distribution function is a stretched exponential function [22]. We used real daily data (positive infections and deaths) reported by the Ministry of Health [10,14]. Our study showed that the Weibull distribution was the best fit to the data. However, the parameters of the distribution differed between the new cases data and the daily deaths data. The shape parameter for the daily deaths data was <1, which indicates a decreasing rate. On the other hand, the shape parameter for the new cases data was approximately 1, in which case, as mentioned above, the Weibull distribution reduces to the exponential distribution. We could not find any similar studies with which to compare our results. If we use the China COVID-19 data (63 daily observations), we see the same distribution. On the other hand, if we use only the first 36 daily observations from China (matching the length of our data), the Lognormal distribution has the best fit. It is therefore possible to see similar behavior of COVID-19 in both countries in the long run.
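The link drawn above between the shape parameter and a decreasing rate can be made explicit under the usual time-to-event reading of the Weibull distribution. That reading is an assumption here, since the study fits the distribution to daily counts. With the complementary cumulative distribution function S(x) = exp(-(x/λ)^k), the rate (hazard) function h(x) and its derivative are:

```latex
h(x) = -\frac{d}{dx}\ln S(x)
     = \frac{d}{dx}\left(\frac{x}{\lambda}\right)^{k}
     = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1},
\qquad
\frac{dh}{dx} = \frac{k(k-1)}{\lambda^{2}}\left(\frac{x}{\lambda}\right)^{k-2}
```

The sign of the derivative is the sign of k - 1, so the rate decreases for k < 1, is constant at 1/λ for k = 1 (the exponential case), and increases for k > 1.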
According to the mean and median of the best-fitted distribution, the peak of the disease is expected to pass, even though it is currently the New Year holiday in Iran, when people travel and visit each other a great deal. Contrary to government warnings, many people are still setting off on journeys. Moreover, the World Health Organization has reported an incubation period for COVID-19 of between 2 and 10 days [23]. Further peaks should therefore be expected if health instructions are not followed. There are considerable uncertainties in assessing the risk of this event, owing to the lack of detailed epidemiological analyses. Outbreaks of infectious diseases require investigation by combined teams from fields such as human health, animal health, and wildlife [4].

Conclusion
Our study has shown that the Weibull distribution was the best fit to the data. The death rate is decreasing, and according to the results, we can expect to pass the peak of the disease. Furthermore, similar behavior of COVID-19 has been observed in both Iran and China in the long run.

Availability of Data and Materials
The data that support the findings of this study are available from: https://www.worldometers.info/coronavirus/.