4P Model for Dynamic Prediction of Covid-19 with statistical and machine learning approach: Social Dynamics, Government Roles and Pathogen Behavior on Bangladesh Perspective

Introduction: Around the world, scientists are racing to understand how the Covid-19 epidemic is spreading and growing, and are trying to find ways to prevent it before medications become available. Many different models have been proposed so far, correlating different factors. Some of them are too localized to indicate a general trend of the pandemic, while others have established transient correlations only. Methods: Hence, in this study, a 4P model has been proposed based on four probabilities (4P), which have been found to hold for all affected countries, taking Bangladesh as a case. Efficiency scores have been estimated from survey analysis not only for the governing authorities' management of the situation (P(G)) but also for the compliance of the citizens (P(P)). Since the immune responses of all people to a specific pathogen are not uniform, the probability of a person getting


Introduction
From the beginning of civilization, the human race has witnessed epidemics of various forms and degrees. The recorded history of the Greek epidemic dates back to 430-427 BCE, which claimed the lives of an estimated 25-35 Greeks according to the contemporary historian Thucydides. Before Covid-19, there were many epidemics, most notably the Black Death (1346-1353), which wiped out nearly half the population of Europe, and the Spanish Influenza (1918-1919), which infected about 500 million people around the world. A detailed account of the history of epidemics and pandemics can be found in [1]. Throughout history, it has been observed that inadequate knowledge about the disease, misinformation and misconceptions among the populace, and improper handling of the situation caused more damage than the disease otherwise would have [2,3,4,5,6,7,8,9].
Unsurprisingly, in the case of Covid-19 we have seen confusing and sometimes conflicting information about the epidemic, which hampers the proper functioning of government. As successive populations are infected through social contact, the countries that took early measures have been more successful in containing the outbreak. Even before any scientific correlation could be found among the factors of contagion, basic measures could have saved most of the lives that are being lost. Government inaction and citizens' ignorance put the current situation at risk. While the nature of the virus and its contagiousness are yet to be determined, proper actions by the legislative body or government and compliance by citizens could significantly reduce the spread and improve containment. One study found that social distancing policies driven by public awareness and voluntary action had the strongest causal impact on reducing social interactions, which resulted in a 37% decline in the rate of infection after fifteen days [10]. Mahmud et al. [11] also based their model on social dynamics, but it excludes the government's role in control. The number of tests per population size is very important, as it allows authorities to isolate and treat infected individuals and so avoid further spread. In this paper, we focus on an epidemic prediction model for Covid-19 that considers the role of governance and citizens' consciousness along with other necessary variables [12]. The model shows the concerned audience how their actions and behaviors change the outcome of the outbreak by accepting user inputs for the aggregated parameters on an interactive web console available at Covid-19 Bangladesh Projection. Therefore, this work is not just a model that runs on a computer and reports results in a way that the general public cannot understand; it is also a functional system that communicates information to the concerned parties.
The hard part of the model is initializing the parameter values accurately enough to fit the reported data so that the future can be reasonably projected. Accuracy at an early phase is not our main concern; rather, we have focused on the model's correct behavior under changing social and pathological dynamics. The model updates its parameters at regular intervals as new data come in from the relevant bodies and scientists. Hypothesis testing and the methodology are described in the Methodology section, followed by model inception and implementation.

Social Dynamics
Scherer & Cho (2003) showed how one's social networking behavior as well as perception affects the health behavior of the respondent [12]. Governments in several countries have been unsuccessful in assuring their people that they have control over the situation. In the interim, both the number of newly infected cases and the number of deaths continue to increase every day. Pessimistic decisions may arise either when a hazard is publicized aggressively or when a dreadful circumstance is presented with exaggerated believability [13]. The slow response of several countries to the severity of COVID-19 can be explained by geographic proximity. Research led by Fischhoff et al. (2003) revealed that the perception of risk from a hostile event is shaped by proximity to the hazard [14]. In another study, perceptions and behaviors towards the hazard of the avian influenza (H5N1) virus striking Europe between 2005 and 2006 were significantly associated with proximity to the hazard [15].
Besides, the inability of political leaders to win people's confidence in combating COVID-19 would strongly influence people to defy the government's orders regarding social distancing and lockdown [16]. The government should clarify the motives of lockdown to its people, particularly the younger ones, who have been found to breach lockdown rules constantly [17]. Confidence in mainstream media has been found to correlate with support for lockdown measures as well [18]. In a country like Bangladesh, with its poor governance, governments have traditionally failed to build proper awareness among the people during emergencies and to manage crises politically [19].

Prediction Models and their Components
The SIR model [20,21] is one of the oldest compartmental models in epidemiology for projecting infectious diseases like COVID-19 [22,23,24,25,26,27], and numerous derivations have come out of it. The basic SIR model comprises three compartments.
S: The number of susceptible individuals. Susceptible individuals are the members of the population who are at risk of being infected; once infected, they move to the infectious compartment I. The transmission rate from susceptible to infected is assumed to be βSI/N² (as a fraction of the population per unit time), where N is the total population and the transmission parameter β is the average number of individuals that one infected individual will infect per unit time.
I: The number of infectious individuals. The population of this compartment are individuals who have been infected and are capable of infecting susceptible individuals. A segment of the population of this infected class will be shifted to the next removal compartment at a recovery rate γ, so that 1/γ is the average period during which an infected individual remains infectious.
R: The number of removed individuals. Individuals who have been infected and have either recovered from the disease or died enter the removed compartment.
The basic reproductive number R0, which is used to quantify the transmission of pathogens, is the ratio β/γ. That is, R0 is the average number of people infected by one infected individual over the infectious period in a susceptible population [28]. The model assumes that each infectious person has an equal probability of transmitting the disease to others, regardless of social practices and the rules in place. The model treats infected people in quarantine the same as those who are not. Hence, both are assumed to have the same transmission rate β, which might not be true in real cases. Finally, the assumption that β stays constant throughout the pandemic might also not be realistic. The SIR model is poorly suited to representing the incubation period, when an infected person moves around freely without showing any symptoms.
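To make the roles of β, γ and R0 concrete, the following is a minimal sketch of the basic SIR model integrated with forward Euler steps. It is written for head counts, so the infection term appears as βSI/N. The parameter values, the population size and the function name `simulate_sir` are hypothetical choices for illustration and are not taken from this study.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    Returns a list of (S, I, R) tuples, one entry per whole day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt  # S -> I
            new_removals = gamma * i * dt                    # I -> R
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical values chosen only for illustration.
beta, gamma = 0.4, 0.2
r0 = beta / gamma  # basic reproductive number as the ratio beta / gamma
trajectory = simulate_sir(population=1_000_000, initial_infected=10,
                          beta=beta, gamma=gamma, days=200)
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print(f"R0 = {r0:.1f}, infections peak on day {peak_day}")
```

Because β and γ are held constant for the whole run, the sketch also exhibits the limitation noted above: it cannot represent quarantine or changes in behavior over time.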
The SEIR model is the most widely used epidemic model derived from the SIR model. In addition to the SIR compartments, it introduces an intermediate compartment E for the exposed population, covering the incubation period during which individuals have been infected but are not yet infectious themselves; σ is the incubation rate at which these latent individuals become infectious. The SEIRS model additionally allows recovered individuals to return to the susceptible state at a rate ξ due to loss of immunity [21,29,30,31,32,33].
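For reference, the compartments described above correspond to the following textbook SEIRS equations, written here for head counts (so the infection term appears as βSI/N) and without births or deaths. Setting ξ = 0 gives the SEIR model. This is the standard form, shown as a sketch, and is not a formulation taken from this study.

```latex
\begin{align}
\frac{dS}{dt} &= -\frac{\beta S I}{N} + \xi R \\
\frac{dE}{dt} &= \frac{\beta S I}{N} - \sigma E \\
\frac{dI}{dt} &= \sigma E - \gamma I \\
\frac{dR}{dt} &= \gamma I - \xi R
\end{align}
```

The four right-hand sides sum to zero, so the total N = S + E + I + R stays constant.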
Whether a person is susceptible to a specific pathogen partly depends on the person's immune response. The person could have natural active immunity or might have developed antibodies through non-deliberate contact with the pathogen. Natural passive immunity acquired from the mother is also possible given the length of the Covid-19 epidemic [34]. Furthermore, the type of strain is another determinant of susceptibility, since it may overcome the population's natural immunity. Body Mass Index (BMI) is said to influence the development or aggravation of the disease, but the studies are complicated by nutrition factors that should counteract the effect of BMI, so we chose not to consider it in our model [35].
After getting infected (I in the SIR model), an infective person contributes to the transmission rate in a population N, where N should be a population in proximity that has the means to come into close contact, either active or passive, and not just a mass of people in a location. By passive contact, we mean places and artifacts shared by multiple persons at different times within an effective interval, that is, how long SARS-CoV-2 can survive given its surface and aerosol stability [36]. R, the removal by recovery or death, appears to have little significance since the necessary transmission has already occurred within the I compartment, and N has to be observed in a real social context. In our computation, we have taken the median incubation period of 5.2 days as the time during which the infected I infects others [37]. In the SEIR model, exposed individuals (E), rather than the susceptible population (N), bear greater significance in calculating the rate of infection. After analyzing all conceivable notions and dynamics, the basic reproductive number has been broken down into several practical factors described in the Kinetics-Modeling-Fitting section.
A popular way of assessing risk was proposed by the statistician Fergus Simpson, who expressed the risk of catching the virus in terms of a number of dice [38]. According to him, in the first week after a virus starts to spread across a country, one person in ten million catches the virus each day, equivalent to a nine-dice risk of infection, which is very low. However, if the virus can spread unrestrictedly, the accompanying risk increases fast. By week 2, the risk is approximately six times that of week 1, equivalent to an eight-dice risk. In the third week it corresponds to a seven-dice risk, by week 4 to a six-dice risk, and so on. He also cautioned that his formula might not work where the number of tests performed each day is very low relative to the population size. Hence, his approach has not been adopted in this study. Although several researchers initially anticipated that population density was positively correlated with the spread of COVID-19 [39,40,41,42,43], some recent studies show that this assumption might not be fully supported [44,32]. Government health expenditure was identified as a significant determinant of deaths from COVID-19 [45,46], whereas topographical locations along with their climatic conditions were found to be prominent factors in the diffusion of the virus [47]. Although temperature has been observed to be a relevant factor in COVID-19 transmission in some studies [48,49,50], there are counterarguments as well [51,52]. Males, as well as older people, were observed to be highly vulnerable to COVID-19 [53,54]. Even though a few studies have so far examined the overall COVID-19 situation in Bangladesh and its various aspects [55,56,57,58], none of them has focused explicitly on projecting it.
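The dice analogy described at the start of the paragraph above can be checked with a few lines of arithmetic. The sketch below assumes that an "n-dice risk" means the chance that n fair dice all show a six, which matches the quoted figure of roughly one in ten million for nine dice. The function names are hypothetical.

```python
import math


def dice_risk_probability(n_dice):
    """Probability that n fair dice all show a six: (1/6) ** n."""
    return (1 / 6) ** n_dice


def dice_equivalent(probability):
    """Number of dice (possibly fractional) whose all-sixes odds match `probability`."""
    return -math.log(probability) / math.log(6)


# Weeks 1 to 4 as described in the text: nine, eight, seven and six dice.
for week, n_dice in enumerate([9, 8, 7, 6], start=1):
    one_in = 6 ** n_dice
    print(f"week {week}: {n_dice}-dice risk, about 1 in {one_in:,} per day")
```

Removing one die multiplies the probability by exactly six, which is why a roughly sixfold weekly increase in risk maps onto one die fewer per week.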
In their study, Azad & Hussain (2020) attempted to fit the data available to date in Bangladesh using several conventional models, such as the exponential, Richards, logistic, compartmental, and Gompertz models, but these traditional models do not take into account crucial parameters associated with actions and behaviors [59].

LSTM Components
LSTM: Among state-of-the-art deep learning methods, the Recurrent Neural Network (RNN) has proved to be the most robust for prediction, as it can automatically extract the necessary features from the training samples, passing the activation from the previous time step as input to the present time step through the network's self-connections. Long Short-Term Memory (LSTM) is one of the most powerful and well-known RNN architectures and is able to learn order dependence in sequence prediction problems.
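Sequence models of this kind are usually trained on sliding windows, where a short run of past values is the input and the next value is the target. The sketch below shows one common way to prepare a cumulative case series in that form. The window length, the min-max scaling and the sample numbers are all assumptions for illustration; the study does not state its preprocessing.

```python
def min_max_scale(series):
    """Rescale a series linearly to the range [0, 1]."""
    lo, hi = min(series), max(series)
    return [(value - lo) / (hi - lo) for value in series]


def make_windows(series, window):
    """Split a sequence into (input window, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Hypothetical cumulative case counts, not data from the study.
cumulative_cases = [3, 5, 8, 14, 20, 27, 33, 39, 48, 56]
scaled = min_max_scale(cumulative_cases)
X, y = make_windows(scaled, window=3)
print(f"{len(X)} training pairs, each with {len(X[0])} past values")
```

Each element of `X` would then be fed to the network one time step at a time, with the matching element of `y` as the value to predict.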

Methodology
Conventional compartmental models in epidemiology, such as SIR and SEIR, start from the size and density of the population [60,61,62,63,64]. But since the outbreak of COVID-19, epidemiologists and public health experts have questioned whether population size, density, and urbanization matter. The population of the U.S. is about 5.5 times that of Italy. If population size determined the number of infections, nearly 5.5 times as many infected cases would be expected in the U.S. as in Italy. Likewise, Sweden has experienced almost the same death rate as Ireland despite having a lower population density, while Spain has had about the same mortality rate as Italy although Italy's population density is more than twice that of Spain. Moreover, cities like Shanghai, Seoul, and Singapore, with enormous population densities, have performed better in combating COVID-19 than many cities with lower density. In this study, three correlation coefficients, namely Pearson's product-moment correlation coefficient, Kendall's correlation coefficient, and Spearman's rank correlation coefficient, have been computed to evaluate the strength of the bivariate relationships among confirmed COVID-19 positive cases, deaths, population density, population size, and urbanization. The results are displayed graphically in Figure 1 and indicate weak pairwise correlations between the variables. To avoid the aforementioned unsuitability of population size and density, as well as the difficulty of estimating the transmission rate kinetics in the SIR and SEIR models, a probabilistic approach has been adopted in this study that incorporates government control, people's compliance with the norms and rules for COVID-19, Test Positivity, and infection transmission frequency.
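The three coefficients named above can be computed as in the following sketch, written in plain Python so that each definition is visible. Kendall's coefficient is implemented in its tau-a form, which ignores ties; whether the study used tau-a or tau-b is not stated. The density and case figures are hypothetical and only show the calling pattern.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical figures: population density and confirmed cases per million.
density = [25, 120, 240, 410, 1100, 8000]
cases_per_million = [900, 300, 1500, 200, 1100, 600]
for name, fn in [("Pearson", pearson), ("Spearman", spearman), ("Kendall", kendall_tau_a)]:
    print(f"{name}: {fn(density, cases_per_million):+.3f}")
```

Values close to zero from all three measures would support the reading of weak pairwise correlation, since Pearson captures linear association while the two rank-based measures capture any monotonic one.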
The secondary data used in this study have been extracted from the WHO (url: http://www.who.int) and the Institute of Epidemiology, Disease Control and Research (IEDCR) (url: https://iedcr.gov.bd), while the primary data on people's compliance and government control have been collected from respondents through a sample survey [DTH1]. A separate survey [DTH2] has been used to collect information about the number of people expected to come into contact with an infected person if s/he leaves home for whatever reason. Finally, a modified exponential regression model has been developed to fit the observed data, incorporating the probability of citizens' compliance together with government control, Test Positivity, and infection transmission frequency. To demonstrate the dynamic nature of all the probabilistic parameters, an LSTM neural network is trained and compared against the cumulative positive COVID-19 cases of Bangladesh.

Kinetics-Modeling-Fitting
Identifying the relevant causes, together with interventions, is essential to limit the spread of an epidemic. Epidemiological models serve as a guide for planning around an outbreak. In the case of the extremely infectious Covid-19, most affected nations have planned social distancing and self-isolation measures after vigilant observation of the kinetics of the virus's evolution. Given the multifaceted social and pathological dynamics of Covid-19, it is hard to predict when and how it grows and ends [65]. Different groups of scientists have come up with many different models, still with wide deviations [66,67,68,69,70,71]. Selecting too many parameters would increase the risk of non-sampling error due to imprecise and inadequate data; e.g., estimating the number of people commuting en masse [DTH3] at a given time and location with acceptable accuracy in a country like Bangladesh could be very difficult, if not impossible. As mentioned earlier, this issue has been addressed through a sample survey covering adults from every cluster, such as age, sex, profession and so on. The principal focus of this study has been on pandemic predictability through factors that influence the outcome of an exponential model, by fitting four probabilities (4P) which can be adjusted over time as new observations come in.

State's control and citizens' compliance: P(G) is the probability of the state's control over the situation, and P(P) is the probability of the citizens' compliance. P(G) has direct control over P(P), whereas P(P) has passive control over P(G), so the relationship between them is asymmetric. However, poor scores for both P(G) and P(P) would only degrade the situation. Hence, both must function together in order to have an impact on the overall situation. Therefore, a joint probability of P(G) and P(P), namely P(C), has been designed for the study, which is inversely proportional to the propagation rate of a pandemic.
P(C) = P(G) P(P), where P(C) is the joint factor of the State and its citizens.
To calculate the probability of overall control P(C), the data collected through the sample survey have been used, as mentioned before. The survey provides data on public compliance (P) and government control (G), each scored on a scale from 1 to 10 (1 indicates the lowest degree of compliance, 10 the highest), along with information on whether the respondent has been affected by COVID-19. Exactly 1119 respondents participated in the survey, and P(C) has been estimated from their responses. The unweighted probability of government control for a specific scale value i is p_i = (number of respondents giving scale value i for control) / (total number of respondents). From the survey data of this study, using the aforementioned formulas, we have P(P) = 0.324664879 and P(G) = 0.295084897. Finally, we estimate P(C) = P(P) P(G) = 0.095803703.
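The two computations described above can be sketched as follows. The first part derives the unweighted proportions p_i from a list of 1 to 10 scores; the toy responses are hypothetical, and how the p_i are then aggregated into P(G) and P(P) is not reproduced here. The second part multiplies the reported P(P) and P(G) to recover P(C).

```python
from collections import Counter


def scale_proportions(responses, scale=range(1, 11)):
    """Unweighted proportion of respondents choosing each scale value."""
    counts = Counter(responses)
    total = len(responses)
    return {value: counts.get(value, 0) / total for value in scale}


# Hypothetical toy responses on the 1-10 scale (the real survey had 1119).
toy_responses = [3, 3, 4, 2, 5, 3, 1, 4, 6, 3]
p = scale_proportions(toy_responses)
print({value: share for value, share in p.items() if share > 0})

# Values reported in the text.
P_P = 0.324664879  # citizens' compliance
P_G = 0.295084897  # government control
P_C = P_G * P_P    # joint factor of the State and its citizens
print(f"P(C) = {P_C:.9f}")
```

Because P(C) is a product, a low score on either side pulls the joint factor down, which reflects the statement that both must function together.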

Pathogen's Reproduction Number and Contagiousness
The reproduction number R0, which indicates how contagious an infectious disease is, is a complicated factor to compute [26,68,72,73,74,75,76,77]. It states how many persons can be infected by a single infected person with a previously unknown disease. There are three known variants of SARS-CoV-2 (A, B, C), the nucleotide sequence of the genome keeps changing, and we do not know which variant is prevailing in the South Asian region [78]. Nine new mutations have been reported in Bangladesh according to an unpublished report [79]. Also, we do not know why some people are more affected and show serious symptoms, while others remain immune or asymptomatic. We do not know why and how the virus has created havoc in some parts of the world while the African continent is only mildly affected. So, it is extremely difficult to determine the contagiousness here in Bangladesh. Since it depends on the type of pathogen and on how people interact with each other in particular social settings, we considered this part in P(C). Many scientists have hypothesized the possibility of Herd Immunity [80] or Active Natural Immunity (ANI), which is very hard to quantify as a factor of R0. Because the virus is still new to our understanding and no vaccine (artificial active immunity) has been developed, we have no idea how an individual's immune system would respond at first exposure [81]. On the other hand, the probability of a person catching the infection once exposed, designated P(I), is an essential factor for understanding the velocity and magnitude of the contagion.

P(I) = probability of a person getting infected after being exposed, where P(I) is primarily a function of:
1. Strength of Pathogen Type, S_V: value not known
2. Human Natural Immunity, I_N: value not known
3. Viral Load, L_V: value not known
L_V is the threshold of viral load to which a target must be exposed to be infected.
Viral load, also called the infectious dose, is a measure of virus particles: the amount of virus needed to establish an infection. For influenza viruses, exposure to as few as 10 virus particles can be enough, while for other human viruses as many as thousands are needed to cause infection. W. David Hardy noted that 'The virus is spread through very, very casual interpersonal contact' [82]. We did not find an estimate of how many SARS-CoV-2 virus particles are needed to trigger infection, but COVID-19 is clearly very contagious, probably because few particles are needed to cause infection, that is, the infectious dose is low [83]. Since the values of the above components are yet to be reliably established, we have determined P(I) indirectly through statistical studies on various data; it is proportional to the growth of a pandemic and varies from region to region.
Thus, P(I) combines the strength of the pathogen type, human natural immunity and viral load, and we have tried to determine P(I) from the rate of propagation in a sample population whose members were possibly exposed and of whom a part became infected. In this study, the value of N is estimated to be 10 from a different survey of 359 respondents who went outside the home for regular activities at the risk of being infected by an infected person, and IP is the incubation period.

Test Positivity
Test Positivity is the ratio of the number of daily positive cases to the number of tests done. A high percentage of positive results may suggest that the right people are being tested and that more tests would yield more positive results. Though the association between the number of tests and positive cases is not linear, testing helps contain the disease at an early stage [84]. The Test Positivity proportion goes down either when the number of tests increases or when contagiousness decreases. Two types of testing approach have been observed around the world, depending on the state's capacity and policy: (1) Reactive Tests, where only people with acute symptoms are tested, and (2) Proactive Tests, where subjects are tested at random. Countries that relied on reactive testing obtained high Test Positivity by testing only people with acute symptoms, while less symptomatic or asymptomatic cases were free to roam and infect others upon contact. Conversely, countries with proactive testing, where people are tested at random in every suspected scenario, obtained a lower Test Positivity. The result is shown in Table 1, and Table 2 shows how the containment of the pandemic is proportional to its Test Positivity. Table 2 shows the increase over total cases for 29 days; the situation achieved in some countries is strongly correlated with the percentage Test Positivity. The daily trend can be found in various authentic data sources such as [85,86].
The World Health Organization (WHO) reported an incubation period (the time from exposure to the development of symptoms) for COVID-19 of between 2 and 10 days [85]. The mean incubation period was 5.2 days (95% confidence interval) [37]. In our model, Test Positivity P(T) acts as the probability of testing positive among those tested, and it is updated with the moving average of Test Positivity over the incubation period (IP). Rather than running into a complex set of incomprehensible parameters, we tried to keep the model simple, with a core equation built on the four probabilities (4P) that can be learned over time to give a closer prediction of reality.
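The update of P(T) described above can be sketched as a trailing moving average of daily Test Positivity. The window of 5 days (the 5.2-day mean incubation period rounded to whole days), the use of a trailing rather than centred window, and the daily figures are assumptions made for illustration.

```python
def daily_positivity(positives, tests):
    """Daily Test Positivity: positive cases divided by tests done."""
    return [p / t for p, t in zip(positives, tests)]


def moving_average(values, window):
    """Trailing mean over up to `window` of the most recent values."""
    averaged = []
    for k in range(len(values)):
        chunk = values[max(0, k - window + 1):k + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


INCUBATION_DAYS = 5  # assumption: 5.2 days rounded to whole days

# Hypothetical daily figures, not data from the study.
positives = [50, 60, 80, 90, 120, 150, 160]
tests = [500, 500, 800, 900, 1000, 1000, 800]

positivity = daily_positivity(positives, tests)
p_t = moving_average(positivity, INCUBATION_DAYS)
print(f"latest P(T) estimate: {p_t[-1]:.3f}")
```

Averaging over the incubation window smooths day-to-day swings in the number of tests, so a single day of unusually heavy or light testing does not dominate the estimate.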
Finally, the regression model fitted with the data becomes - here α, µ and n are constants.

LSTM
The input gate selects the information that can be passed to the cell. The forget gate decides which data from the previous memory are to be ignored. The control gate controls the update of the cell. The output layer updates both the hidden state h t−1 and the output. tanh is used to normalise the values into the range -1 to 1. W denotes the weight matrices, and σ is the activation function, taken to be the sigmoid.
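For reference, the gates described above are commonly written as follows. This is the standard textbook LSTM formulation, given as a sketch; the bias terms b, the cell state C_t, the candidate state and the concatenation [h_{t-1}, x_t] are notation assumed here and may differ from the authors' own equations.

```latex
\begin{align}
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{input gate} \\
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{forget gate} \\
\tilde{C}_t &= \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{candidate cell state} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell update} \\
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{output gate} \\
h_t &= o_t \odot \tanh(C_t) && \text{hidden state}
\end{align}
```

Here ⊙ denotes element-wise multiplication, so each gate scales the corresponding component of the cell state by a value between 0 and 1.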

Results and Discussions
Scientists are analyzing COVID-19 data for Bangladesh and other countries and forecasting its propagation using the conventional SIR and SEIR models, initialized with some implicit compartmental rate constants [87,88]. Hazardous social interaction spreads transmissible viral particles, and it can be controlled through compliance with health norms and with the government's laws and orders. Test Positivity is a vital indicator of the contagion trend, showing how unknowingly suspected individuals are being infected. Yet these vital factors have been ignored by most pandemic models. In our proposed model, these key factors have not only been addressed but have also been integrated, with their implications, into fitting the propagation trail. Those who worked with traditional compartmental models, treating the whole population as susceptible and density as a multiplier of the rate constants, overlooked the correlation between these factors and the outbreaks. Some studies showed a correlation between the degree of contagion and body mass index (BMI), but they ignored nutrition factors. In this study, population size and density are ignored because their correlation is insignificant. The reproduction number and the implicit compartmental rate constants, which are difficult to measure because of the lack of reliable data in Bangladesh, are circumvented. We explicitly addressed the impact of citizens' awareness of the hazard of Covid-19, health norms and rules, government laws and obligations, and citizens' compliance with them. We also considered Test Positivity and the probability of being infected. Figure 3 shows how closely the model output overlaps the real values. Figure 4 shows the growth rate of new positive cases estimated both from the model and from the real cases. Although the growth rate is declining, its sluggish gradient indicates a lengthier presence of the coronavirus in Bangladesh.
The scenario may change depending on how aware the citizens are of this contagious microbe. As mentioned earlier, the joint probability P(C) of the state's control and citizens' compliance regulates the propagation rate. The propagation has been estimated under a ±5% change in controlled compliance. Figure 5 shows how a decrease of only 5% in P(C) and an increase of 5% in P(T) force the growth curve upward. Similarly, Figure 6 shows exactly the opposite case, with a 5% increase in P(C) and a 5% decrease in P(T) from the estimated values. In both cases we kept P(I) unchanged. The daily positive cases are estimated from the projected propagation by taking the first derivative and averaging over the incubation period. Figure 7 compares the reported daily positive cases with the estimated ones. An estimated Mean Absolute Prediction Error (MAPE) of 0.20 indicates reasonable prediction by our 4P model [2].

Fig. 7 Daily new positive cases, Reported vs Projected.

Figure 8 shows the estimated flattening of the curve based on the last calculated P(C), P(T) and P(I), which is subject to change. On June 26, the United States observed an all-time surge in new coronavirus cases; the curve had been about to flatten during the first quarter of June, but premature reopening, public unrest, civil disobedience, etc. evidently lowered the country's P(C) score. Reports on critical events are available on the JHU Timeline of COVID-19 [89]. Although improving, the Test Positivity in the USA is still above 12% on average at the time of writing. In Bangladesh, a similar surge could be expected during the Islamic festival Eid-Ul-Adha at the end of July, even though our current projection shows a slowdown in new cases. However, in every possible scenario, we continue our study to obtain the values of the determinants and update the projection accordingly. Since there will be no periodic version of this paper, we have deployed our model at Covid-19 Bangladesh Projection.
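The steps from a projected cumulative curve to daily cases and an error score can be sketched as below. The first derivative is approximated by a one-day difference, the smoothing uses a trailing window of 5 days as a stand-in for the incubation period, and the error is computed as the mean absolute error relative to the reported value, expressed as a fraction. All three choices, and the sample series, are assumptions; the study's exact definitions are not given in the text.

```python
def daily_from_cumulative(cumulative):
    """One-day differences of a cumulative series (discrete first derivative)."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def trailing_mean(values, window):
    """Trailing mean over up to `window` of the most recent values."""
    smoothed = []
    for k in range(len(values)):
        chunk = values[max(0, k - window + 1):k + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def mape(reported, projected):
    """Mean absolute error relative to the reported values, as a fraction."""
    errors = [abs(r - p) / r for r, p in zip(reported, projected) if r != 0]
    if not errors:
        raise ValueError("no non-zero reported values to compare against")
    return sum(errors) / len(errors)


# Hypothetical cumulative series, not data from the study.
reported_cumulative = [100, 130, 170, 220, 280, 350, 430]
projected_cumulative = [100, 128, 168, 224, 286, 352, 425]

reported_daily = daily_from_cumulative(reported_cumulative)
projected_daily = trailing_mean(daily_from_cumulative(projected_cumulative), window=5)
print(f"error score on daily cases: {mape(reported_daily, projected_daily):.2f}")
```

On this definition a score of 0.20 would mean that the projected daily figures differ from the reported ones by 20% on average.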
The LSTM method is used to forecast COVID-19 cases in Bangladesh. Table 3 lists three results with different parameter settings for which the prediction accuracy is quite high. Numerous settings were tested, but the results were not satisfactory. Among them, setting number 3 is quite satisfactory. Figure 9 also shows that the accuracy of result number three is good. In that the 1st layer unit is 15,20,25

Conclusion
History tells us that nature plays its role on mankind, and it is human wisdom and action that ultimately determine the outcome. In this study, more emphasis has been placed on human actions than on SARS-CoV-2 itself or on the human as a host, about which very little is known so far. Its association with the probability of an exposed person becoming infected has been determined empirically, omitting ambiguous factors. For successful containment of the epidemic, effective social distancing, hygiene, large-scale testing and isolation are recommended at the earliest possible time. The 4P model provides a strong basis for decision making by demonstrating the causes of the epidemic over which the state and its citizens have control, and the machine learning outcome indicates that the four probabilistic parameters are not merely fitting parameters but are sufficient to train a machine learning model.

Conflict of Interest Statement:
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors and Contributors:
This work was carried out in close collaboration between all co-authors. first defined the research theme and contributed an early design of the system. further implemented and refined the system development. wrote the paper. All authors have contributed to, seen and approved the final manuscript.

Compliance with Ethical Standards:
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed Consent:
Informed consent was obtained from all individual participants included in the study.