Predictive model for COVID-19 curve - An evolutionary approach

In this manuscript we propose a novel method that models the evolution, spread and transmission of the COVID-19 pandemic. The proposed model is inspired partly by the evolutionary, state-of-the-art genetic algorithm. The rate of virus evolution, the spread and transmission of COVID-19, and the associated recovery and death rates are modeled using principles inspired by evolutionary algorithms. Furthermore, interaction within a community and interaction outside the community are modeled. In this model, the maximum healthcare threshold is fixed as a constraint. Our evolutionary model distinguishes between individuals in the population by the severity of their symptoms/infection, based on the fitness value of each individual. There is a need to differentiate between the virus-infected diagnosed (self-isolated) and virus-infected non-diagnosed (highly interacting) sub populations/groups. In this study the model results are not compared with any curves based on actual real-time data. However, the results from the model demonstrate that a strict lockdown and social-distancing measures, in conjunction with more testing and contact tracing, are required to flatten the ongoing COVID-19 pandemic curve. A reproductive number of 2.4 during the initial spread of the virus is predicted by the model for the randomly considered population. The proposed model has the potential to be further fine-tuned and matched accurately against real-time data.


Introduction
As the second wave of the COVID-19 pandemic unfolds around the globe, the spread of COVID-19 has become a topic of everyday discussion. While many researchers worldwide are working to understand the nature and mechanism of the spread, healthcare systems remain in need of an accurate model to forecast the "COVID curve". A "COVID curve" is defined as the curve that represents the daily new cases with respect to time or the cumulative deaths with respect to time. Many predictive models based on curve fitting and stochastic approaches are being produced with exact scales to forecast (extrapolate) the COVID-19 curve. These predictions offer insight into how quickly the COVID-19 curve will grow and into the extrapolated consequences. Forecast models play a vital role in understanding healthcare demands/needs [1], including how many intensive care unit beds, ventilator units and personnel will be required to respond effectively. However, the authors believe that the first and foremost objective of a predictive/forecast model is to estimate the relative effect of increased human interaction and increased social intervention measures [2], [3] on the COVID curve (cumulative deaths/daily new cases vs. time). Simplified models with curve-fitting approaches may provide less valid forecasts against accurately scaled real-time data because they cannot extrapolate or account for the exact scale and extent of human interactions.
The SEIR (susceptible, exposed, infectious, removed) model is a commonly cited model. Lin et al. extended a SEIR model to consider various risks and the cumulative number of cases [4]. Stochastic transmission models have also been developed and considered [5]. The results from the model are stated with substantial uncertainty intervals. From the ongoing trends of the first and second waves and the interpretation of data, it is evident that epidemics will not follow similar paths in all places globally, even when geographically specific factors such as age distribution are considered. A rational way to distinguish geographies would be by population density, information on how busy the place is, and information on age distribution. Detailed reviews of some of the existing models are given in [6], [7]. The Standardized Infection Ratio (SIR) is the metric employed by the National Healthcare Safety Network (NHSN) to track healthcare-associated infections (HAIs). The SIR is calculated as the ratio of the number of observed infections to the number of predicted infections. The model proposed in this manuscript is intended, in future, to help calculate the SIR, with the objective of obtaining an SIR close to 1.
Numerous modified SEIR (Susceptible-Exposed-Infectious-Recovered) models and other accepted predictive models are listed on the CDC website [8]. The model proposed in this manuscript is presently not a data-based model, as it only aims to demonstrate a methodology based on an evolutionary approach. No curve fitting is incorporated in the method. The parameters varied in the population are similar to those of most of the models listed in [8]. The novelty of the proposed method lies in the way interaction and mobility are modeled based on evolutionary principles. Furthermore, the complex mechanism of human interaction and sudden social intervention measures can be captured by employing evolutionary principles, which the authors have attempted in this manuscript. The authors propose a new evolutionary model for the COVID-19 pandemic with a predefined population, with a methodology partly inspired by the state-of-the-art Genetic Algorithm. Though the genetic algorithm [9] is understood as an optimization methodology, parts of the methodology have been adopted to mimic the fitness of an initially assumed population community, wherein crossover is replaced by interaction (spread and transmission) within the community. Genetic algorithms (GAs) are biologically inspired optimization algorithms that derive their methodology from the process of evolution. GAs are widely used to solve highly nonlinear optimization problems that demand globally optimal solutions. GAs always involve parents and offspring as part of the sub population. Every individual in a population is modelled as having a particular fitness.
Only individuals with high fitness values survive each generation (survival of the fittest). This theme of the genetic algorithm has inspired the authors to model the spread of the virus and the survival of individuals in a population community. De Jong [9] cautioned against understanding GAs only as optimization tools.
The author stated that considering GAs only as an optimization tool is not the right approach.
Furthermore, the author strongly suggested perceiving GAs as a tool to simulate natural processes.
This motivates the work in this manuscript to adopt an evolutionary approach to model the spread of the virus. Henceforth, we use a common notation, Evolution Programs (EP), for genetic-algorithm-based models [10].
Genetic algorithms are a class of evolutionary programs [10]. The evolutionary model is a probabilistic algorithm that preserves a population of individuals over generations based on their fitness, which is the key to survival. Each individual is identified by a fitness value. A new generation is formed by preserving the fit individuals of the current generation. Randomly selected individuals from the initial population/sub population undergo modification by means of "genetic" operators (operators that change individual fitness) to form new solutions. A chosen sub population undergoes transformations that affect the fitness values of the individuals [10], and during this evolution from one generation to the next (in units of time) the individuals strive for survival. Crossover has inspired the modeling of the transmission and spread of the virus, and mutation the modeling of the evolution of the virus, in a population community. The step-by-step methodology is explained in Fig 1. The idea of evolution programming is not new [11]-[13], and many different versions of evolutionary systems exist. However, in this manuscript we discuss a novel evolutionary approach to model the spread of COVID-19 in a community. The most challenging part of evolutionary modelling of the pandemic spread is the implementation of constraints. The violation of constraints can be associated with greater penalties. In the algorithm proposed in this manuscript the penalty is imposed naturally by constraining the interaction, since more interaction leads to more spread of the infection. This is a great advantage of EPs, as the constraint naturally grows or flattens the COVID curve. Davis [14] believes that GAs are the most suitable algorithms to model many real-world problems.
Devis et al. [15] stated that GAs are powerful tools suited to modeling the adaptation of species to a changing environment, as the underlying principle of GAs is the survival of the fittest.
From the literature surveyed, it is evident that treating any evolutionary algorithm as just an optimization algorithm is not advised. This manuscript develops a model based on evolutionary principles to understand the spread of COVID-19 in a population. In the proposed algorithm we do not use parents and offspring; rather, we have infected and non-infected people. The infected population may or may not show symptoms (incubation). Furthermore, our model distinguishes between infected diagnosed and infected non-diagnosed individuals, because the latter continue to interact in a population community and spread the infection further, while the former are self-quarantined or isolated.
Additionally, in this model, we consider the probability rate of the individuals in the recovered sub population becoming prone to the infection again.

Methodology
The flowchart of the methodology is shown in detail in Fig 1. The first and most sensitive input to the model is the initial population. The population can be distinguished in terms of the fitness value. Fitness is defined as the decimal equivalent of the binary string. For each individual the binary string length is fixed at 8 in this study. Once the population is generated, the process of interaction, spread and transmission inside and outside the population is simulated by modifying the binary string, choosing the most appropriate transmission site and interaction probability.
The maximum fitness for a binary string of length 8 is 1 1 1 1 1 1 1 1 = 255.
The minimum fitness for a binary string of length 8 is 0 0 0 0 0 0 0 0 = 0.
Any individual with a fitness value of 0.4 to 1 times the maximum fitness qualifies as part of the host population. An individual with fitness less than 0.1 times the maximum fitness qualifies as part of the virus population. Any individual whose fitness lies between the bounds of these two sub populations becomes part of the incubation sub population. These bounds can be changed or modeled accurately when exact age-wise population data are available. The age of an individual could be directly correlated with the individual's fitness value.
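The fitness definition and the sub population bounds described above can be illustrated with a minimal sketch. The function names and the treatment of the boundary values (a fitness of exactly 0.4 times the maximum counts as host) are assumptions made for illustration; the bounds 0.4 and 0.1 are those stated in the text and are exposed as parameters because the manuscript notes they can be changed.

```python
STRING_LENGTH = 8
MAX_FITNESS = 2**STRING_LENGTH - 1  # 255 for an 8-bit string


def fitness(bits):
    """Fitness is the decimal equivalent of the binary string."""
    return int(bits, 2)


def classify(bits, host_lower=0.4, virus_upper=0.1):
    """Assign an individual to a sub population from its fitness.

    host_lower and virus_upper are fractions of the maximum fitness.
    How ties at the bounds are resolved is an assumption of this sketch.
    """
    f = fitness(bits)
    if f >= host_lower * MAX_FITNESS:
        return "host"
    if f < virus_upper * MAX_FITNESS:
        return "virus"
    return "incubation"


if __name__ == "__main__":
    for individual in ("11111111", "00110011", "00000000"):
        print(individual, fitness(individual), classify(individual))
```

With these bounds, fitness values of 102 and above fall in the host sub population, values of 25 and below in the virus sub population, and the remainder in the incubation sub population.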

Initial population
The initial population for the population community can be generated in two different ways. In this study, if the fitness value of an individual is greater than 30% of the maximum fitness, the individual becomes part of the host sub population. If the fitness value is less than 30% of the lower bound of the host sub population (the least fitness among the host individuals), the chromosome becomes part of the virus sub population. A chromosome whose fitness is intermediate between host and virus becomes part of the incubation sub population.
By definition, the incubation sub population consists of individuals who are potentially infected by COVID-19 but show no symptoms; they are still potential transmitters/carriers of the virus.
The population generated from available data can be subdivided into groups based on age. The population above the age of 65 can be regarded as the population with weak immunity. The fitness of an individual is a direct function of age. After the initial population is generated, the whole population is subjected to a random mutation. Mutation is defined as the process of changing 0s to 1s and vice versa. The input required to perform the mutation is the probability of mutation. The total number of bits/genes that undergo mutation is determined from the mutation probability. Once the mutation is completed, the sub populations are formed.
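The random mutation step can be sketched as follows. The sketch assumes that the number of mutated bits is the mutation probability times the total number of bits in the population, rounded to the nearest integer, and that the mutated positions are drawn uniformly without replacement; the manuscript does not state these details, and the function name is hypothetical.

```python
import random


def mutate_population(population, p_mutation, rng):
    """Flip a number of randomly chosen bits across the whole population.

    population: list of equal-length binary strings.
    p_mutation: mutation probability, used here to fix the number of flips.
    rng: a random.Random instance, passed in for reproducibility.
    """
    length = len(population[0])
    total_bits = len(population) * length
    n_flips = round(p_mutation * total_bits)
    # Choose distinct bit positions over the flattened population.
    positions = rng.sample(range(total_bits), n_flips)
    genes = [list(individual) for individual in population]
    for pos in positions:
        i, j = divmod(pos, length)
        genes[i][j] = "1" if genes[i][j] == "0" else "0"
    return ["".join(g) for g in genes]


if __name__ == "__main__":
    rng = random.Random(0)
    print(mutate_population(["00000000"] * 5, 0.2, rng))
```

After this step the sub populations would be formed by classifying each mutated string according to its fitness.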

Interaction within a population community
The interaction of individuals within a population community is modeled through the mechanism of transmission. A pictorial representation of the transmission mechanism is shown in Figure 2. The direction of interaction between the various sub populations is shown in Figure 3.
The individuals from the sub populations that interact are chosen randomly. This mimics the real scenario, wherein every individual is equally susceptible. As seen from Figure 3, the host population is always a receiver and the virus population is always a transmitter, whereas the incubation population is both a transmitter and a receiver. As the interaction begins, two new sub populations emerge, as stated earlier.
They are the recovered sub group and the dead/expired sub group. Recently published research makes it evident that the recovered population is still susceptible and a potential transmitter. This algorithm models the recovered sub group in two different ways with a set of assumptions. During the initial period the curve of virus-infected persons is of prime interest. It is evident from the curves on the Financial Times website that the curves of some countries flatten early and others flatten later, which can be attributed purely to the level of social intervention.
During the process of transmission, the transmission site is defined randomly. The transmission site is not defined/controlled by any function, as the level of interaction is not controlled in reality. Therefore, the inputs required to model the transmission are the probability of transmission and the transmission site.
The total probability of interaction = 1

Total probability = Probability of host interaction with virus + Probability of virus interaction with host + Probability of incubation interaction with host + Probability of incubation interaction with virus.
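As an illustration of how the four interaction probabilities could drive the simulation, the sketch below draws one interaction type per step with weights that sum to 1. The numerical values in INTERACTION_PROBS are hypothetical placeholders and are not taken from the manuscript; the function name is likewise an assumption.

```python
import random

# Hypothetical probabilities for the four interaction types; they must sum to 1.
INTERACTION_PROBS = {
    ("host", "virus"): 0.4,
    ("virus", "host"): 0.3,
    ("incubation", "host"): 0.2,
    ("incubation", "virus"): 0.1,
}


def pick_interaction(probs, rng):
    """Draw one interaction type according to its probability."""
    total = sum(probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("interaction probabilities must sum to 1")
    pairs = list(probs)
    weights = [probs[pair] for pair in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    print(pick_interaction(INTERACTION_PROBS, rng))
```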
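To make the algorithm relevant to any population community, the sub population sizes are normalised:
h = H/P,  v = V/P,  i = I/P
where H, V and I are the numbers of individuals in the host, virus and incubation sub populations and P is the total number of individuals in the population community.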
For example, let us consider
Individual 1 = a1 a2 a3 a4 a5 a6 a7 a8
Individual 2 = b1 b2 b3 b4 b5 b6 b7 b8
The bit positions that contribute most to the fitness of an individual are those at the extreme left (a1, a2, a3 and b1, b2, b3). Hence the individual with maximum fitness has the value 1 in bit locations a1 to a4 and b1 to b4. For example, if we assume that individual 2 is low in fitness while individual 1 is high in fitness, then, since the transmission is one-way as indicated in Fig 3, the transmission site and the number of transmitted bits play a major role in determining the fitness of individual 1. If the number of bits that can be changed is 4, the resulting fitness of individual 1 becomes a function of the exact positions of the bits that undergo change. Figure 4 shows the effect of the bit position and the number of mutated bits on the reduction in fitness value.
Hence, the level of interaction can be constrained by choosing a low value for the number of bits "n".
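The dependence of the receiver's fitness on the transmission site and on the number of bits n can be made concrete with a small sketch. It assumes that one-way transmission copies n consecutive bits from the transmitter into the receiver, starting at the transmission site; this is one plausible reading of the mechanism and not necessarily the exact operator used by the authors. The function name is hypothetical.

```python
def transmit(receiver, transmitter, site, n_bits):
    """One-way transmission: overwrite n_bits of the receiver, starting at
    index `site` (0 is the leftmost, most significant bit), with the
    corresponding bits of the transmitter."""
    end = min(site + n_bits, len(receiver))
    return receiver[:site] + transmitter[site:end] + receiver[end:]


if __name__ == "__main__":
    individual_1 = "11111111"  # high fitness (receiver)
    individual_2 = "00000000"  # low fitness (transmitter)
    for site in (0, 2, 4):
        result = transmit(individual_1, individual_2, site, n_bits=4)
        print(site, result, int(result, 2))
```

For a receiver of 11111111 and a transmitter of 00000000 with n = 4, transmission at the leftmost site reduces the fitness from 255 to 15, whereas transmission at the rightmost four bits reduces it only to 240, which is why a low n and a rightward site correspond to weaker interaction.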

Interaction from outside population
The interaction of the sub population with the outside population community is modeled through one-way mutation. One-way mutation is the process of changing 1's to 0's. This implies that interaction with the outside population either deteriorates an individual's fitness or leaves it unchanged, but never improves it. The mutation probability can be varied based on the specific population community. For example, when modeling busy cities with the world's busiest airports, the mutation probability can be set to a high value. For towns and less busy cities, the mutation probability is kept low. The mobility data published on the Unacast [16] website clearly indicates that the reduction in mobility within and outside the population community has a great influence on the COVID-19 curve. The World Health Organization and the CDC state that social distancing is currently the most effective way to steadily flatten the curve of COVID-19 spread [8]. Furthermore, migration patterns of people from each city and state have been captured using the available GPS data. The data indicates that the reduction in migration rate has contributed to the flattening of the curve in each population community.
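A minimal sketch of the one-way mutation for outside interaction is given below. It assumes that each 1 bit is independently changed to 0 with the mutation probability; the manuscript does not specify how the mutated bits are selected, and the function name is hypothetical.

```python
import random


def outside_interaction(bits, p_mutation, rng):
    """One-way mutation: each 1 may become 0, a 0 is never changed.

    Fitness can therefore only decrease or stay the same.
    """
    return "".join(
        "0" if b == "1" and rng.random() < p_mutation else b
        for b in bits
    )


if __name__ == "__main__":
    rng = random.Random(1)
    individual = "10110010"
    # A higher probability mimics a busy, highly connected city.
    for p in (0.05, 0.5):
        mutated = outside_interaction(individual, p, rng)
        print(p, mutated, int(mutated, 2))
```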

Modeling recovery
The recovered population is a very dynamic population that forms in the course of interaction. It is a sub group formed from the virus sub population. In reality not all of the virus-infected population (infected with the virus and showing symptoms) gets hospitalized. Of the hospitalized population, some recover and some die. The population that gets hospitalized is chosen based on the initial fitness of the individuals. Each individual of the hospitalized population either recovers or expires. The recovered population can still transmit and be susceptible. The input required to model recovery is the probability of recovery. Recovery is modeled via one-way mutation by changing 0's to 1's. This means that recovery is a process of improving an individual's fitness value. The probability of recovery is kept small, as recovery is gradual and slow.
The flow chart of the modeling approach is shown in Figure 4.
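The recovery operator described in this section can be sketched in the same spirit. The sketch assumes that each 0 bit is independently changed to 1 with the recovery probability at every time step; the per-bit selection rule, the value 0.05 and the function name are illustrative assumptions.

```python
import random


def recover(bits, p_recovery, rng):
    """One-way mutation for recovery: each 0 may become 1, a 1 is never
    changed, so fitness can only improve or stay the same."""
    return "".join(
        "1" if b == "0" and rng.random() < p_recovery else b
        for b in bits
    )


if __name__ == "__main__":
    rng = random.Random(7)
    individual = "00010010"  # a low-fitness, hospitalized individual
    # A small probability applied over many time steps gives gradual recovery.
    for day in range(1, 31):
        individual = recover(individual, 0.05, rng)
        if day % 10 == 0:
            print(day, individual, int(individual, 2))
```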

Constraints
The constraints on interaction are brought about by awareness of social distancing, self-isolation and hand washing. The interaction rate is controlled by changing the rate of the interaction operator. The healthcare threshold, i.e. the number of hospital beds available in a population community, is also a potential constraint. Awareness of the ongoing situation serves as an important constraint for any population community to flatten the curve. The objective function of the algorithm, from the virus perspective, is to maximize the fitness of the virus sub population. The maximum value of the normalized fitness is 1. The problem is solved with both a constrained and an unconstrained approach. The proposed algorithm can be used to analyze and forecast the curve knowing only approximate values of the parameters. Like the modified SEIR models listed on the CDC website [8], the model proposed in this manuscript does not make specific assumptions about which interventions have been implemented or will remain in place.

Reproduction number R0
The reproduction number is defined as the number of non-infected people possibly infected by one infected person. This number is crucial to understanding the intensity of spread in a population community.
In this model the value of R0 is calculated as a function of time.

Results And Discussion
The results and discussion below pertain to the following parameters in the algorithm. The normalised population is defined as the ratio of the size of the respective sub population to the total size of the population community. The maximum value of the normalised population is 1 and the minimum value is 0. Figure 5 shows the normalised population with and without social intervention.

Effect of social intervention
The rate of increase in the number of infected individuals and the associated decrease/increase in the other sub populations are shown in Fig 6. NSI indicates "No Social Intervention" and WSI indicates "With Social Intervention". The results clearly indicate that social intervention measures have a substantial impact on the growth of the virus-infected population. It is also evident that social intervention measures can flatten the curves of the host population and the incubation population. The intensity of the social intervention measures can be increased by introducing a much higher decay rate for the interaction operator. The plot assumes that the social intervention measures are introduced around day 50.
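The role of the decay rate of the interaction operator can be illustrated with a small sketch. An exponential decay of the interaction rate after the intervention day is assumed here purely for illustration; the manuscript does not state the functional form, and the base rate and decay values are hypothetical. Only the intervention day of 50 is taken from the text.

```python
import math


def interaction_rate(day, base_rate=0.5, start_day=50, decay=0.05):
    """Interaction rate on a given day.

    Before start_day there is no social intervention (NSI) and the rate is
    constant. From start_day onward (WSI) the rate decays exponentially;
    a larger decay represents a more intense intervention.
    """
    if day < start_day:
        return base_rate
    return base_rate * math.exp(-decay * (day - start_day))


if __name__ == "__main__":
    for day in (0, 50, 60, 100):
        print(day, interaction_rate(day), interaction_rate(day, decay=0.2))
```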

Effect of hospitalization and recovery without interventions
The four curves shown in the figure correspond to the virus sub population. From the model results, it is evident that combined social distancing and community-wide testing are essential for early flattening of the curve. A reasonable amount of testing without social intervention measures does not lead to early flattening. However, social intervention measures combined with moderate testing/hospitalization can still be very effective.

Effect of interaction
The effect of interaction within a population community on the growth of the virus-infected number is demonstrated in Fig 7. The output from the model suggests that the lower the interaction, the earlier the curve flattens. The results from the Shamam model [17], based on a metapopulation SEIR model, indicate the same trend. The authors concluded that individuals living in highly populated neighborhoods were infected more.

Effect of interaction with PPE
The effect of interaction with personal protective equipment (PPE) such as face masks and gloves is shown in Figure 8. The results indicate that the use of PPE during interaction has a considerable effect on the rate of growth of the virus sub population and delays the rise in the curve, as seen for the case of high PPE. High PPE refers to a case where more people use PPE while interacting.

Effect of population density
Figure 9 shows the trend in the number of cases for a place with high population density and a place with low population density. The results from the model make it evident that population density has a great impact on the rate of increase of the virus sub population. A population density (PD) of 1 indicates high population density and a PD of 0.3 indicates low population density. The trend of the daily infected cases from the model is seen to be strongly correlated with the population density of a particular geographical location.

Effect of interaction from outside community
Figure 10 shows the effect of mutation probability on the growth in the number of cases. The effect of mobility on the flattening of the curve is published by Unacast [16]. Though social distancing and hand washing within a population community can slow down the rate of spread of the virus, interaction with the outside population community is seen to have a major impact on the early rise of the curve, indirectly contributing to the rise of the curve at later stages. The results from the model make it evident that more interaction with an outside community that is possibly infected with the virus can have an adverse effect on the trend of the curve. Hence the borders should remain closed. The results from [16] also indicate that mobility has a direct impact on the COVID curve.

Recovery plot, death plot
Figure 11 and Figure 12 show the trends of the increasing dead and recovered populations from the model. The results make it evident that the death rate spikes after a certain period of spread of COVID-19. The exact time and magnitude of the spike can be obtained by much more detailed modeling using real-time data. The results shown in Figure 12 make it evident that the recovered sub population becomes a major part of the population as time progresses. These results make it evident that the trends of the dynamic sub populations, namely recovered and expired, can be matched with real-time data using data on local topography and real-time intervention measures.

Reproduction number
The basic reproduction number (R0) is a statistical parameter that helps predict the longevity of the virus spread in a population. The physical interpretation of R0 is the average number of secondary cases that emerge from a single primary infected case (diagnosed or non-diagnosed) in a totally susceptible population.
The results from the model show the variation of the reproduction number R0 with weeks. As seen from

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Ethics approval and consent to participate
Not applicable

Consent for publication
All authors have approved the paper and its submission to journal and preprint services.

Availability of data and material
Not applicable

Code availability
Not applicable