An analytical framework for understanding infection progression under social mitigation measures

While there has been much computational work on the effect of intervention measures, such as vaccination or quarantine, the influence of social distancing on the epidemics’ outbursts is not well understood. We present a realistic, analytically solvable, framework for COVID-19 dynamics in the presence of social distancing measures. The model is a generalization of the compartmental SEIR model that accounts for the effects of these measures. We derive a closed-form mathematical expressions for the time dependence of epidemiological observables, in particular, the detected cases and fatalities. These analytical solutions indicate simple quantitative relations between the model variables and epidemiological observables, which give insights into cause-effect connections that underlie the outburst dynamics but are obscured in more standard (numerical) approaches. While the obtained results and conclusions are based on the study of the COVID-19 pandemic, the presented analysis has general applicability to infection outbursts. Our findings are particularly important in the emergence of new pandemics when effective pharmaceutical treatments are unavailable, and one must rely on well-timed and appropriately chosen social mitigation measures.


Introduction
Numerical computation methods and/or simulations are routinely used to study epidemiological dynamics.These tools can easily produce predictions of even very complex epidemiological models.Yet, in a certain sense, the numerical computation/simulation is comparable to a black-box system that converts given model parameters to the prediction outcomes, hardly facilitating any direct and deeper understanding of the influences of individual parameters on model observables.Additionally, inferring the parameter values from the observed data, analogous to reverse-engineering of the black-box functioning, often becomes a difficult task.
While it is evident that social distancing measures seriously impact the epidemiological dynamic, their effects are not yet well understood [25][26][27], despite a lot of undergoing effort to incorporate the influence of these measures into the existing epidemiological models [4,8,22,[28][29][30][31][32][33][34][35][36][37][38][39].These studies often use an analytical approach to analyze and reveal some of the system's characteristics-e.g., system equilibria, positivity and boundness of variables, stability, sensitivity and bifurcation analysis [8,[35][36][37][38][39]. Thus, they allowed insights into some important aspects of the system's behavior, and the role played by the social measures.For example, analytical calculations provided a better understanding of the interventions required to control the disease, such as the interplay between governmental actions and the public response [39], and quarantine [35].However, to our knowledge, no attempts were made to analytically solve complex systems of nonlinear differential equations stemming from these extended epidemiological models (neither from the models aimed to include the effects of mitigation measures nor from those trying to incorporate other relevant factors).Instead, computer simulations and/or numerical solving of differential equations are commonly used to compute and analyze the time evolution of the epidemic (e.g., to make comparisons with the actual epidemiological data).Consequently, despite the ability to include intricate details of the models, these studies lacked the full analytical approach to establish direct functional relations between variables and outcomes, thus missing the advantages of having analytical formulas describing the entire time evolution of the system variables.
While the advantages of the analytical approach are clear, all of the studied models in epidemiology, including the very basic ones such as SIR (Susceptible-Infected-Removed [40]), are nonlinear and the corresponding differential equations cannot be solved in a straightforward manner [41].Nevertheless, in the last two decades, there was significant progress in the field.To our knowledge, the earliest advances in this direction appeared with the turn of the century and were related to the SIR model, on networks [42] and its stochastic variant [43], followed by solutions given in the forms of convergent series, for SIR [44] and then also for SIR and SIS (Susceptible-Infected-Susceptible) models [45].Further progress was made by Harko et al. in 2014 [46], by providing the SIR model analytical solutions, albeit in parametric form and up to an integral inexpressible by elementary functions.COVID-19 pandemic has intensified the efforts in this direction, resulting in new forms of solutions for SIR model: as infinite series (using Asymptotic Approximants method) [47], as an inverse solution with an approximated integral [48], using integrals inexpressible in terms of elementary functions [49], and a direct approximated solution [50,51].Some of these approaches preserve (approximative) integrability even when constant model parameters (such as infection rate) are replaced by time-varying functions (though with certain constraints).Recently, there has been also some progress even with compartmental models more complex than the basic SIR variant.More precisely, papers [52] and [53] analytically treat SEIR (Susceptible-Exposed-Infected-Removed) model and find, respectively, a solution in the form of a well converging infinite series and as an exact solution to an approximative reduced model.
Along with the attempts to analytically solve the equations governing the epidemic dynamics, there was an ongoing effort to find useful analytical relations between the relevant epidemiological variables.One such simple but transcendental equation, relating the basic reproductive number with the final size of the susceptible compartment, was an immediate consequence of the SIR model [40], and a similar result was obtained for the SEIR model [53] (the robustness of such relations is discussed in [54]).Moreover, it often turns out to be possible to derive useful simple analytical expressions for the time of the infection peak and/or the fraction of immunized required to stop the epidemic [48,50,51,53,55,56].These formulas, obtained at a low cost of applying practically negligible approximations to the starting equations or intermediary expressions, turn out to be very accurate (when compared to precise numerical computations).
All these analytical results were obtained only in the context of SIR and SEIR models, and the presence of intervention measures likely renders them unus-able.Therefore, this study aims, for the first time, to establish a realistic-analytically solvable-model of COVID-19 dynamics in the presence of social distancing measures and to demonstrate the benefits of such an approach.The model is a generalization of the compartmental SEIR model-so-called SPEIRD [57,58]that accounts for the influence of social measures by mathematically representing them as an effect that removes susceptibles from the transmission process (and places them in a newly introduced Protected compartment).A similar approach to introduce the effects of distancing in compartmental models can be found in [32,[34][35][36][37] and is in certain aspects advantageous over simply varying transmission coefficients in time, implemented in many models [4,8,22,33,38,39].SPEIRD also introduces additional compartments that directly correspond to epidemiological observables: the number of detected cases and fatalities.The model was introduced in [57], but due to the complex system of nonlinear differential equations, no attempts were made to solve them analytically.However, it turned out that, despite complicated equations, it was possible to analytically compute the closed-form expression for the number of infected cases as a function of time.This partial result was published in [58] and presents a starting point in this study.However, due to its complexity, we were unable to make further progress toward computing other system variables.We only recently managed to analytically compute the time evolution of all relevant variables (i.e., the closed-form expressions for the full-time evolution of the number of confirmed (detected) cases and fatalities).These two variables have been used as crucial measures for tracking epidemic progression and its severity, and this manuscript presents the full analytical treatment of the equations by deriving their closed-form mathematical expressions.
The time evolutions derived here allow us to analytically investigate the effect of different model parameters on the infection outcome and to recognize particular interplays.Knowing explicitly the relations between input and output variables not only vastly simplifies our ability to extract model parameters from experimental data (e.g., by distinguishing special regimes in which equations have characteristic behavior [58]) but, more importantly, to straightforwardly draw conclusions that may simplify public policy decision making.In particular, the asymptotic expressions obtained here for the cumulative numbers of COVID-19 detected cases and fatalities, quantitatively reveal a simple interplay between the basic reproduction number R 0 , the strength of introduced social measures, and the timing of their introduction.Namely, it turns out, as we show in Sect. 4 and further elaborate in Sect.5, that more flexible, i.e., relaxed measured can be easily afforded if they are timely introduced (to the same overall effect as with far more stringent measures introduced just a week or two later).Moreover, we provide simple analytical expressions for estimating the timing of the peak, the tipping points of the epidemic wave, as well as the maximum of detected cases per day.Providing an analytical framework to understand such epidemiological behavior may have a significant impact both on the economy and the quality of human lives.
While the obtained results and conclusions were based on the study of the COVID-19 pandemic (and tested on corresponding epidemiological data), the analysis presented here is of general validity and therefore directly applicable to any possible infection outbreak in the future.While at this stage of the COVID-19 pandemic, we already have additional tools to combat the disease (e.g., vaccines and mounting clinical experience), general conclusions drawn from the models like this one can be a particularly valuable aid in the initial stages of any new epidemic.

Methodology
To assess the dynamics of COVID-19 infection, we employ a mechanistic model SPEIRD, which we developed in Refs.[57,58].More precisely, concerning the standard compartmental models in epidemiology [25][26][27], which comprise susceptible (S), exposed (E), infectious (I ), and undiagnosed recovered (R) pools, we added a few additional compartments.These are introduced to take into account the available data (direct observable quantities), such as the total number of detected (confirmed and thereupon quarantined) cases (D), active cases (A), and fatalities (F).It also accounts for the effects of the social protection measures by adding a compartment (P) of protected individuals (denoting those effectively removed from susceptible category due to social distancing).The corresponding equations, describing the model dynamics, read: In the above equations, N stands for the total population number, while parameters have the following meaning: β− the transmission rate; α− strength of the social measures; σ − the inverse of the latency period; γ − the inverse of the infectious period; δ− the inverse of the infected detection period; − the detection efficiency; h and m− the recovery and the mortality rate, respectively.According to [59][60][61], we assume σ = 1/3 day −1 and γ = 1/4 day −1 .The gradual effect of social measures is introduced through Eq. ( 2) where t 0 denotes the time when the measures are enforced.In [58], it was shown that Hill function [62] can be reliably replaced by the step function of the form αθ(t − t 0 ), which we further apply.We also assume that, after t 0 , the 2 nd term in Eq. (1) dominates over the 1 st term, i.e., S(t) ∼ e −αt , which we previously numerically showed to hold well for a wide range of countries in the first COVID-19 wave [58].In Fig. 1, we provide a schematic representation of the SPEIRD model (Eqs.( 1)-( 8)).Thus, Eq. ( 1) determines the rate of depletion of the susceptible population due to both the infection process and the social distancing measures.These measures result in a transition of the susceptible to the protected compartment, described by Eq. ( 2).In Eq. ( 3), the exposed pool increases due to the infection events.Simultaneously, it decreases due to a transition to the infectious compartment with the rate σ .Equation (4) describes how the number of infectious individuals grows due to the transition from exposed to the infectious category, while it decreases with the rate γ (recovery of undiagnosed infected to R compartment) or with rate δ (corresponding to their detection and subsequent quarantine, resulting in a transition to the detected compartment D).Equation (7) fixes the rate at which the active compartment is depleted through two channels − by being healed (rate h) or by dying from the virus (rate m).Equation ( 8) describes the growth of the fatalities.
The model corresponds to the infection phase, where social distancing measures are first introduced (in our case, the first COVID-19 wave), corresponding to the effective transition from S to P compartment.As the measures later got eased, there can also be a transition from P back to S. This is, for simplicity, not implemented in SPEIRD (and usually neither in similar models, see e.g.Refs.[32,[34][35][36][37]), i.e., only the first phase of epidemics (and the social measure introduction) is considered.We implement our model deterministically, as publicly available COVID-19 counts [63] are very high in most countries, making the relative importance of fluctuations low, and the deterministic description appropriate [62].The advantage of such an approach is the analytical tractability and more intuitive understanding of the obtained results.
Despite the complexity of the system of equations ( 1)-( 8), we show that the model can be analytically treated.As a starting point in our calculation, we use the explicit expression for the number of infectious individuals as a function of time, obtained in [58]: where K(n, x) stands for the modified Bessel function of the second kind [64], , θ(t) denotes Heaviside step function and I 0 is the initial number of infectious individuals in population.
The main result of this paper is the derivation of the time-dependent expressions for the numbers of fatalities F(t) and detected cases D(t), which are continuously tracked and publicly available quantities for a vast number of countries and regions [63].The analytical derivations of these expressions are presented in Appendix A. Essentially, Eqs. ( 6)-( 8) lead to two disentangled differential equations determining F(t) and D(t): Here, I (t) is given by Eq. ( 9) and C stands for the constant of integration.All constants of integration are determined from the requirement of continuity of functions, their first derivatives, and/or the initial conditions: F(0) = 0 and D(0) = D 0 , where D 0 is the initial number of detected cases at onset of infection progression in a population.Solving these equations is highly nontrivial due to the complex form (see Eq. ( 9)) of the function I (t).To accomplish this task, aside from commonly used methods for solving differential equations (i.e., known methods for second-order inhomogeneous Cauchy-Euler equation), we used a number of special properties of modified Bessel functions of the first and second kind, of upper incomplete gamma functions, and of regularized generalized hypergeometric functions, as detailed in Appendix A.

Analytical results
Using the methodology outlined in the previous section, in Appendix A, we derived closed-form expressions for the number of fatalities and the cumulative number of detected cases, as functions of time.The number of fatalities F(t) has the following form: where: In these formulas , while p Fq (a 1 , a 2 , ...a p ; b 1 , b 2 , ...b q ; z) denotes regularized generalized hypergeometric function [65].
While these expressions jointly constitute the exact solution to the differential equations governing the dynamics of F(t), their form is too complex to convey a useful conclusion about the epidemic dynamics.However, by applying a combination of appropriate mathematical transformations and by using Hankel's asymptotic expression for Bessel functions (see Appendix A.1 for details), we arrive at the following approximate expression for F(t): This expression replaces expressions ( 12)-( 15) while not significantly deviating from the full solution (the deviations are essentially negligible, having in mind the limited precision with which we know most of the COVID-19 epidemiological parameters).
As expected, the F(t) saturates at some maximal, asymptotic value as t → ∞ (for details see Appendices A. 3 and A.4).This value corresponds to the total (final) number of COVID-19 fatalities F f in in the observed epidemic outburst: Furthermore, due to the fact that for realistic values of epidemiological parameters [58] is high enough, we utilize Stirling's formula [66] (for large n): After making use of 4βσ (γ + δ − σ ) 2 (for multiple times) Eq. ( 17) reduces to: where By applying a similar procedure, we straightforwardly obtain the equivalent expressions for the time evolution of the number of detected cases (see Appendix A.2).For brevity, we here provide only the final results: and for the total number of detected cases at the end of the epidemic wave: that can be again approximated to the form: Consequently, the estimate for Case Fatality Rate (CFR), which is defined as the final number of fatalities per detected cases (C F R = F f in /D f in ), acquires a simple form [67], involving a ratio of mortality (m) and healing (h) rates: further supporting that our analytical derivation is plausible.
The explicit formulas ( 18) and ( 21) provide analogsthough now in the entirely different context with social distancing-of the final size (i.e.asymptotic value) relations for SIR [40,54] and SEIR models [53].Based on our analytical solution, we can also provide a fairly simple relation for the timing of the epidemic peak (see Appendix A.5 for details), which is another quantity that has been previously analytically derived in simpler scenarios of SIR [48,50,51,55,56] and SEIR [53] models: From Eq. ( 23), we can also obtain the maximal number of detected cases per day during the outburst: which we will further analyze in the next section.
The infection tipping points (see Appendix A.5) correspond to the inflection points of the bell-shaped infected curve.We can explicitly express the two tipping points, from which we can obtain the duration of the epidemic peak (cf. Figure 1D in [58]), corresponding to the time interval between these two points: When α << (γ +σ ) this formula effectively simplifies to: where we used [68] ln Note that Δt peak is independent of the transmission rate β, which is a nontrivial result.Δt peak provides an estimate of the time interval during which the infected are at a high level, and the chance of contracting the infection is the highest.

Numerical analysis
To test the model reliability, we first compare our fullfledged predictions for case counts (detected cases and fatalities) with the observed data counts for three representative countries: Austria, Switzerland, and Israel.Figure 2 shows that we obtain an overall good agreement between our predictions and the data for all three countries.In particular, we see that the time delay, much discussed in interpreting COVID-19 case counts data [12,61], is also well reproduced.This agreement provides confidence that the analytical expressions, which we derived from our model, indeed reasonably represent the observed case counts data.
The obtained results allow us to analyze the effect of each model parameter on D(t) and F(t) curves.For succinctly presenting the results in Figs. 3, 4, 5 and 6, we will further use the parameter values that were extracted for Israel in Fig. 2. We note that the obtained results are independent of the analyzed country, as long as the data are continuously tracked with moderate testing capacities and fairly transparent reporting policies.First, we will consider the influence of the parameter-the fraction of the infections that are detected and consequently quarantined.We here, and in the subsequent analysis, fix the initial (observed) number of detected cases D 0 = D(t = 0).The trivial (direct) proportionality dependence of D(t) on in Eq. ( 6) then gets absorbed in D 0 , so that we investigate the nontrivial effect of the quarantine ( ) on the case count dynamics from Eq. ( 4).For this, we opt for two limiting cases = 0.1 and = 0.5, which respectively correspond to the low and high detection rates inferred from the observed data [58].From Fig. 3, which represents the time evolution of detected cases and fatalities, we observe that a substantial change of this parameter leads to at most ∼ 20% change in saturation value of D and F. In comparison to changes due to variations of other parameters, see below, we will see that this is a minor effect.Expectedly, better detection efficacy (i.e., the higher value) leads to lower overall numbers of case counts, simply because the larger part of infected individuals is timely quarantined.The value of has practically no impact on the steepness of these time-evolution curves.Consequently, detected cases and fatalities are nearly independent of , and, for simplicity, we will further set this value to be 0.1.(Note that, if not indicated otherwise, full-fledged expressions are used for generating predictions through the entire section.) Next, we assess how sensitive case counts are on the initial slope (on a log scale) of the exponential growth of the infectious curve, quantified by λ + .Indirectly, in this way, we also test the dependence of the curves on the closely related basic reproduction number R 0 (quantifying inherent virus transmissibility in population [69]), which relates to λ σ γ [57,70].To this end, we vary λ + in the range between 0.1/day and 0.4/day, based on our previously inferred λ + values for a large number of analyzed countries [70,71].From the left plot in Fig. 4, we observe that D(t) is highly sensitive to this parameter, to the extent that the linear-linear plot is unable to adequately illustrate this dependency.The log-linear plot in the inset is much better suited for this: it clearly demonstrates that the greater the λ + is, the D(t) curve is steeper, and the final saturation value becomes exponentially larger.The same tendency is preserved in the case of F(t) dependence on λ + (see middle plot in Fig. 4), where we ab initio opt for log-linear display.
To gain some further insight on the influence of λ + on case counts saturation values, we note that the dom-Fig. 2 Comparison between model prediction and data for Austria (the left plot), Switzerland (the central plot), and Israel (the right plot), as indicated by abbreviations in the upper left corners.The predictions are generated for the evolution of detected cases (the purple curves) and fatalities (the gray curves).The observed detected counts are denoted as purple dots, while the fatality counts are represented by gray dots.The left and the right y-axis of each plot corresponds to detected cases and fatalities, respectively, while the ticks on the x-axis denote the dates given in DD/MM/YY format.The data for detected cases and fatalities are taken from [63] Fig. 3 The dependence on the infection detection efficiency of detected cases (the left plot) and fatalities (the right plot).The blue and the orange curves correspond to = 0.1 and = 0.5, respectively inant λ + dependence in Eqs. ( 18) and ( 21) is exponential, of the form e (t 0 + 2 α )λ + .However, it can be shown that λ + terms outside of the exponent in Eqs.(18) and ( 21) still contribute by effectively lowering the exponent from e (t 0 + 2 α )λ + to t α = t 0 + 1/α.In Ref. [57], we introduced t α as the, so-called, protection time (the time at which ∼ 1  2 of the population moves to the protected category), so that there is a simple, overall, dependence of the case counts on λ + as ∼ e t α λ + .
The λ + dependence of three curves: and our estimate ∼ e (t 0 + 1 α )λ + , is presented in the rightmost plot of Fig. 4. The practically overlapping D f in and D approx f in curves confirm the adequacy of the approximation applied in Eqs.(18) and (21).Moreover, the fact that the simple exponential dependence e t α λ + very well qualitatively and quantitatively reproduces D(λ + ), highlights usefulness of the protection time in assessing infection risks.
Finally, we concentrate on the effect of the social distancing strength α on the case counts.We assume that α lies in the range between 0.03 and 0.3, as inferred from the observed data by our previous numerical analysis [58].All plots on Fig. 5 are again presented in loglinear scale.From the left and central plot in Fig. 5, we see that both D(t) and F(t) are strongly affected by the change of α.As the epidemic progresses, the stronger social measures result in a significantly lower plateau of case counts.The effect of α on these two observables is the opposite of the one of λ + , as expected.
To further clarify the dependence of D and F on α, we again use Eq.(21).After taking a logarithm, we obtain: where × ln γ + δ + σ + 2λ + and κ 2 = γ + δ + σ .The dominant term on the right-hand side of Eq. ( 27) is proportional to 1/α, i.e., the logarithmic terms could be neglected.To test this obtained simple dependance, in the rightmost plot of Fig. 5 we compare D f in , D approx f in and our estimate ∼ e 0.8λ + /α , as functions of α, on log-linear scale.The very good agreement between D f in and D approx f in again confirms the validity of our approximations.Moreover, we demonstrate that the influence of the strength of social distancing on detected cases (and fatalities) can be reduced (at saturation) to a simple dependence of the form ∼ e 0.8λ + /α .From Figs. 4 and 5, we also see that λ + has no significant effect on infection extinguishing time, i.e., the time for case counts to enter saturation.That is, its most prominent effect is on the total case counts.On the other hand, the strength of social distancing measures affects (i.e., diminish) both the total case counts and the time needed for extinguishing the infection.
Finally, in Fig. 6, we show how the maximal value of detected cases per day depend on λ + and α.For λ + , we see that, similarly to D f in and F f in , simple exponential dependence e t α λ + well reproduces ( dD dt ) max , further highlighting the importance of the lower protection time for reducing the infection risks.For social distancing strength, we see that ( dD dt ) max is also strongly affected by α, where this dependence can be approximated by the simple expression ∼ e 0.6λ + /α .

Discussion
The main goal of the present paper was to provide a fully-analytical treatment of our epidemiological model with introduced social distancing, which predicts observables that are readily accessible in a largescale epidemic.We accomplished this task by calculating these observables as explicit functions of time, expressed in closed-form: Eqs. ( 16) and (19).Particularly useful are the derived (approximate) expressions ( 18) and ( 21) for the detected cases and fatalities at saturation (i.e., at the end of an epidemic wave).We tested the model predictions against the observed COVID-19 data, obtaining a very good agreement (as illustrated in Fig. 2).
The obtained analytical results facilitate a better understanding of the influence of individual epidemiological parameters on COVID-19 observables.In particular, the effect of varying the detection efficacy (within reasonable boundaries) turned out to be exponentially smaller than the influence of the λ + value, as illustrated in Figs. 3 and 4. Since the λ + value is directly related to the basic reproduction number R 0 , which characterizes the inherent biological transmission of the virus in an (initially) completely unprotected pop- and ∼ e 0.8λ+/α are denoted by the solid black, the dashed blue and dot-dashed orange curves, respectively.All plots are on loglinear scale Fig. 6 The effect of initial slope of the infectious curve λ + and social distancing strength α on the maximum of detected cases per day ( dD dt ) max is shown on the left and right plots, respectively.( dD dt ) max and approximate expressions are denoted by the solid black and dot-dashed orange curves, respectively.All plots are on a log-linear scale ulation (λ + and R 0 are roughly proportional, in realistic parameter ranges) [57], this conclusion has a simple and yet important interpretation.Namely, it means that medical, demographic, and environmental predispositions (determining R 0 ) [70,71] should be expected to strongly influence the behavior of the considered observables (detected cases and fatalities attributed to the virus) [72][73][74].On the other hand, R 0 seems not to significantly impact the infection extinguishing time.
Our analysis has shown that the strength of the implemented social measures, quantified by α, effectively appears in the denominator in the exponent of the number of total COVID-19 casualties, as follows from Eq. ( 27).The impact of the social measures is thus, unsurprisingly, highly significant-as illustrated in Figs. 5 and 6.
Even more interesting is the interplay between t 0 (the timing of the introduction of measures) and α values in the asymptotic behavior of the COVID-19 observables.
Careful analysis of the functions for COVID-19 fatalities (18) and detected cases (21) at saturation, as well as the maximum of detected cases per day (24), revealed that this dependence is, in all three cases, effectively of the form: e (t 0 + 1 α )λ + (see also Figs. 4 and 6).The significance of the value t α = t 0 + 1 α has already been numerically recognized in [57], where this combination of parameters was named "protection time".Here we show that the total number of fatalities and detected COVID-19 cases for the entire infection wave does not depend on either t 0 or α individually, but only jointly, through the protection time t α .The implications of this analytical result are striking: introducing very strict (and economically debilitating) social measures can have precisely the same effect (infected case-wise and mortality-wise) as implementing much weaker measures somewhat earlier.
For example, consider a decision to introduce weakto-moderate social measures, corresponding to α = 0.1, some t 0 days from the epidemic onset.If this decision is only one week postponed (i.e., measures put into practice at t 0 + 7), it would require far more restrictive policies, corresponding to α = 0.3, to compensate for the delay and retain the same number of fatalities (according to our analysis [58] α = 0.3 corresponds to the strongest social measures that had been imposed in practice, in the first wave of COVID-19 pandemic).
"The sooner the better" conclusion regarding the introduction of social measures is hardly surprising [28][29][30] and does not require such a sophisticated analysis to reach.However, the analysis presented here provides us with additional insight that was not so obvious: even reasonably weak measures introduced quickly and without prolonged preparations seem to have much stronger mitigation potential than very strong measures introduced with hesitation and delay.
Furthermore, since the numbers of COVID-19 fatalities and detected cases in saturation depend upon the protection time t α as e t α λ + , it is clear that the measures must be put into force sooner and/or be stronger (i.e., t α lower) in populations with higher SARS-CoV-2 reproduction number R 0 .While this conclusion is fairly natural, having this quantitative relation and combining it with machine learning techniques to estimate R 0 dependence upon demographic and environmental parameters [71] may substantially aid the policymaking process, in multiple ways.For example, by estimating how much sooner the measures must be implemented in areas with greater median age, higher prevalence of comorbidities, or higher pollution [70,71].Also, to properly allocate medical resources according to this interplay of R 0 , t 0 , and α.

Conclusion
In this manuscript, we demonstrated that it is possible to carry out a full analytical treatment of a compartmental epidemiological model that takes into account the effects of social distancing measures.The model has shown a good agreement with publicly available data on COVID-19 case counts.
As a benefit of having analytical solutions to the model equations, we were able to point out simple quantitative relations between the model variables and epidemiological observables.Such relations provide insights into cause-effect connections that underlie the epidemiological dynamics and are obscured in more standard, i.e., purely numerical, approaches.
Our analysis revealed a quantitatively expressible interplay between the strength of the social measures and the time of their introduction, showing that the two variables effectively combine into a single relevant parameter called protection time.This not only implies that stringent measures can be often substituted by more relaxed ones introduced at earlier times but provides a direct analytical expression to quantify this trade-off.
We emphasize that our model and its implications are not COVID-19 specific and should hold for any potential future epidemic.Many epidemiologists believe that outbursts of new infectious diseases (or even pandemics) are likely, possibly even in the near future [75].Thus, our results might be even more useful in the future, i.e., in the early stages of some new pandemic, when vaccines or effective pharmaceutical treatments are still not at our disposal, but we can only count on well-timed and appropriately chosen control measures.
Throughout the derivations, we will distinguish twotime regions: I) t ≤ t 0 and II) t > t 0 , and will denote the variables in these two regions accordingly.
First, we concentrate on deriving the expression for the time evolution of the number of fatalities.To this end, we start from Eqs. ( 7)- (8), which lead to a single equation (10).In region I, this second-order inhomogeneous differential equation, after taking into account the first term of Eq. ( 9) (i.e., I I (t) = I 0 e λ + t ), reduces to: If we assume F I (t = 0) = 0 and F I (t = 0) = 0, the time-dependent fatalities for t ≤ t 0 are given by: In region II, for simplicity we shift t − t 0 → t, and take into account that the expression for infectious now reads (the second term of Eq. ( 9)).Since I I I (t) has this complex form, the further derivations are quite demanding.By substituting t → y = 2 √ e −αt βσ α , it can easily be verified that Eq. ( 10) is in region II reduced to the well-known second-order inhomogeneous Cauchy-Euler equation [76,77] ax 2 y + bx y + cy = g(x).After dividing thus obtained differential equation by y 2 , we obtain where . Next, we apply the standard procedure for solving inhomogeneous differential equation given by Eq. ( 30): the full solution equals homogeneous plus particular solution F I I (y) = F I I,h (y) + F I I, p (y).It is straightforward to show that auxiliary/characteristic equation yields the following simple form of homo-geneous solution (where linearly independent solutions are F I I,1 (y) = 1 and ), while for obtaining particular solution F I I, p (y) we used the Lagrange's method of variation of parameters [76].More precisely, we assume that , where again linearly independent solutions of the homogeneous equation are employed.The unknown functions C i (i = 1, 2) of variable y are sought for via the standard procedure where f (y) denotes the right-hand side of Eq. ( 30), while Wronskian [76,77] is given by W = F I I,1 F I I,2 − . This leads to: In spite that the above form of the result will be more useful in what follows, we nevertheless provide also its full-fledged form: where √ βσ α , while p Fq (a 1 , a 2 , ...a p ; b 1 , b 2 , ...b q ; z) denotes regularized generalized hypergeometric function [65].The general solution of Eq. ( 30), when returned to variable t, has a form: where the only unknown parameters are C 1 and C 2 .In order to determine these constants we use the following boundary conditions: F I I (0) = F I (t 0 ) and F I I (0) = F I (t 0 ).After some cumbersome calculation steps we obtain the following expressions: Now, if we use the definition of regularized generalized hypergeometric functions, the above expressions can be significantly simplified and reduced to the form similar to Eq. (32).Namely [65] 1 F2 (a , (37) where (a 1 ) 0 = 1, while for k 1 Let us first concentrate on C 1 , where from Eq. ( 35) it follows that z = βσ/α 2 .By denoting a 1 = (γ + δ)/α and a 2 = σ/α, it is straightforward to infer that the expression in the square brackets of Eq. ( 35) can be rewritten in a form: After multiplying the right-hand side by z a 1 +a 2 2 /z a 1 +a 2 2 we obtain: where we made use of the fact that (n + 1) = n (n) and (a)(a) k = (a + k) (see Eq. ( 38)).To further simplify Eq. ( 40), we adopt the following notation and differentiate it with respect to z: As already mentioned, the main idea is to try to reduce Eq. ( 35) to a form similar to Eq. ( 32), i.e., to relate the above expression to modified Bessel functions, and the modified Bessel function of the first kind [64] is defined as: By comparing Eqs. ( 41) and ( 42), we observe that their right-hand sides are of a similar form, if x/2 → √ z, n → a 1 − a 2 , that is: Proceeding in a similar manner in the case of the remaining term in Eq. ( 40) J 2 = ∞ k=0 z k+a 2 k!(k+a 2 ) (k+1+a 2 −a 1 ) we arrive at: Note, from Eq. (40), that . By substituting integrated Eqs. ( 43) and (44) in Eq. ( 40), and thus obtained expression in Eq. ( 35), for C 1 we finally obtain: Note that, in obtaining the above expression the identity [66,68] relating modified Bessel function of the first and the second kind I −n (x)−I n (x) = 2 π sin(nπ)K n (x) was used.
The remaining constant C 2 (see Eq. ( 36)) from fatality counts expression is simplified by applying the same procedure as in the case of C 1 .To avoid redundant derivations (i.e., the repetition of the above calculations), we simply outline the final expression: with the only distinction that, in the process of simplification, parameters a 1 and a 2 now read (γ + δ − h − m)/α and (σ − h − m)/α, respectively.Now that all terms of fatalities count (given by Eq. ( 34)) are determined, we note that all constants (C 1 , C 2 , C 1 (t) and C 2 (t)) are still in their integral form.Since, for all countries that we consider it holds 2 √ βσ /α 1, we may further simplify these integrals by utilizing Hankel's asymptotic expression [66]: which holds for large x = 2 √ βσ /α.
Finally, upon implementing thus calculated constants of Eq. ( 48) in Eq. ( 34), and by taking into account Eq. ( 29), we obtain the expression (16) for the general solution of Eq. ( 10) at the entire t region.

A.2 Analytical derivation of detected cases
Next, we concentrate on the time evolution of detected counts.To this end, we make use of Eqs. ( 6) and (9), i.e.
where C i (i = 3, 4) stands for the constant of integration.In region I the integration of I I (t) (the first term in Eq. ( 9)) is straightforward and yields D I (t) = I 0 δ λ + e λ + t + C 3 , while C 3 is obtained from the initial conditions D I (t = 0) = D 0 ≡ I 0 δ λ + , leading to: In region II, the integration is more demanding, due to the form of I I I (t) (the second term of Eq. ( 9)).To address this, we employ the following substitution of . Thus: Note that, as opposed to our previous notation during the derivation of F I I (t), now we do not make use of the substitution t − t 0 → t.Because of this, the boundary condition reads: D I I (t 0 ) = D I (t 0 ), which is used for determining the integration constant C 4 = I 0 δ λ + e λ + t 0 .So, the expression for detected cases reads: After performing the same algebraic manipulations of regularized generalized hypergeometric functions as in the previous Subsect.A.1, as well as applying Hankel's approximation (47), we obtain the expression (19) for the general solution of Eq. ( 11) at the entire t region.Alternatively, the same expression ( 19) could be obtained more straightforwardly.Namely, the expression for the number of infectious individuals (Eq.( 9)) can be simplified by utilizing Eq. ( 47): Now that I (t) has been determined in the desired form, the detected counts can be calculated from Eq. ( 11).In region I, the derivation is straightforward (and therefore omitted) and leads to Eq. ( 51).In region II, the integration is more demanding, due to the form of 2 ) e − 2 ( The only difference compared to Eq. ( 48) is that now the lower boundary of integration is not zero but some positive real number.This integral is solved by applying the identity .By combining this result with Eq. ( 51), we finally arrive at the expression (19) for the number of detected cases.

A.3 Simplified expressions for fatalities and detected cases
In this section, we show that certain terms in the expressions for fatalities and detected cases can be neglected (when epidemiological parameters are in realistic ranges), without significant loss of predictive precision.First, we start with Eq. ( 16) for fatality counts.We notice that for t > t 0 first two terms are much smaller than the remaining terms, due to 2 √ βσ /α 1, and therefore can be neglected.Additionally, for the same reason all (s, 2 √ βσ /α) → (s, ∞) are approximately equal to zero (i.e., the gamma integral is effectively ∞ ∞ ).Therefore, instead of Eq. ( 16), the following formula can be safely used in practice: ( We have numerically tested and confirmed that the full-fledged (given by Eq. ( 16)) and simplified (given by Eq. ( 56)) fatality curves are practically overlapping (and the same for the detected cases).
A.4 Expressions for fatalities and detected cases at saturation We will evaluate the saturation values of fatalities and detected counts, that is, their expressions in the limit of very large t.This means that we can concentrate only on t > t 0 , i.e., region II, where we set t → ∞.Building on the results of the previous section, from Eq. ( 56) we observe that, in this limit, the second term in the square brackets can be neglected, due to e −(h+m)(t−t 0 ) → 0. Likewise, , 2 e −α(t−t 0 ) βσ α Therefore, we obtain the expressions ( 17) and (20) for the saturation values.

A.5 Expressions for the epidemics peak and tipping points
Other important quantities characterizing infection dynamics during the first wave are epidemics peak time and inflection (tipping and turning) points, for which we here provide analytic expressions.Namely, the epidemics peak time is the moment when infected curve reaches its maximal value (i.e., dI /dt = 0), or equivalently d 2 D/dt 2 = 0.The second derivative of Eq. ( 19) in region II (or equivalently Eq. ( 57) in the same region) yields: ( In deriving the above expression, we made use of equality d (s, x)/dx = −x s−1 e −x , following from the definition of incomplete gamma functions [66].Note that, in this subsection, we are interested in region II, where all relevant points lay.After equating the second derivative of D I I with zero, for the epidemics peak time, we obtain: By evaluating dD dt at t = t max , we can straightforwardly obtain the maximum of detected cases per day, given by dD dt max = D 0 λ + e λ + t 0 + 2 α + 1 Eq. ( 61) has two solutions, which correspond to the infection tipping points The duration of the epidemic peak can then be defined as a difference between these two tipping points and is equal to:

Fig. 1
Fig. 1 Scheme of the SPEIRD model.The compartments are represented with circles and named as indicated in the text.The transitions between different compartments are denoted by arrows, with labels corresponding to the appropriate transition rates, as explained in the text.The two main categories for our

Fig. 4
Fig.4 The effect of initial slope of infectious curve λ + on the main observables.The time evolution of detected cases is presented in the left plot, with inset on a more pronounced log-linear plot.The central plot shows fatalities versus time on a log-linear scale.λ + values are indicated in the legend.In the right plot, λ +

Fig. 5
Fig.5 The effect of social distancing strength α on the main observables.The time evolution of detected cases and fatalities are presented in the left and central plots, respectively.α values are indicated in the legend.The α dependence of detected

√e 2 ,
−α(t−t 0 ) βσ α .To address this, again we employ the following substitution of variable t − t 0 → x = 2 √ βσ α e − α(t−t 0 ) resulting in the boundary condition D I I (t 0 ) = D I (t 0 ) (which is used for determining the integration constant C 4 ).In a similar manner as before, we again encounter the incomplete gamma functions[66]:D I I (t) x dx + C 4 .
b a x s−1 e −x dx = b 0 x s−1 e −x dx − a 0 x s−1 e −x dx = (s, a) − (s, b)

1 2. ( 60 ) 2 (
+ δ + σ ) − α − γ + δ+σ αAlong the same lines, the epidemics inflection points are defined as d 2 I /dt 2 = 0, or equivalently d 3 D/dt 3 = 0:d 3 D I I dt 3 = I 0 δe λ + t 0 + 2 √ βσ α e − α(t−t 0 ) ) for modified Bessel function of the second kind contained in C and C 0 .Note that the second terms of C 1 and C 1 (t) on one side, and the second terms of C 2 and C 2 (t) on the other have the same exponents (keep in mind that, according to Eq. (34), there is still a