Exploring the interaction of influenza A subtypes H1N1 and H3N2 based on an evolution-driven transmission model

Two influenza A subtypes (H1N1 and H3N2) co-circulate and pose a serious disease burden globally. It is unclear about the extent of cross-protection between these two influenza A subtypes as well as the impact of cross-protection on influenza dynamics. Based on an evolution-driven mathematical transmission model, the interaction between H1N1 and H3N2 was explored based on monthly incidence data in the USA from the years 2012 to 2019. The results showed that significant cross-protection between H1N1 and H3N2 exists. Besides, the cross-protection for H1N1 from H3N2 is much stronger than the other way around. The results also show that cross-protection between H1N1 and H3N2 has substantial influence on the epidemics. The model was also validated in Germany, Austria and Norway. A forecasting framework was proposed based on the model with feasible accuracy for both retrospective and prospective forecasts of H1N1 and H3N2 in the USA. Our findings quantify and highlight the importance of cross-protection on the co-circulating of seasonal influenza A H1N1 and H3N2.


Introduction
Influenza epidemics have created a serious medical burden in many regions of the world and cause significant excess mortality rates in different ages [1]. Of the two influenza types, influenza A (H1N1 and H3N2) is associated with more severe symptoms and clinical outcomes than influenza B [2]. Infection of one type/subtype can provide cross-protection against other types/subtypes mediated by high-affinity antibodies against conserved epitopes [3][4][5]. Virus-specific T cell can target influenza highly conserved epitopes and have potential for cross-protection [6][7][8]. In ferrets and mice, although infection with one subtype did not prevent infection with the other, reinfection resulted in shorter viral shedding times and reduced clinical symptoms [9,10]. Population cohort studies have shown that cross-subtype protection exists in humans during continuous and/or concurrent epidemics caused by two influenza subtypes [11], while another study identified no heterosubtypic protection following infection [12]. Serological findings also indicate that the cross-protection may be weak and short-lived [13]. Overall, the extent of crossprotection between influenza subtypes is still controversy and needs to be quantitatively studied at the population level.
In temperate regions, different influenza subtypes share the co-seasonality under the common influence of environmental factors [14,15]. Local outbreaks of seasonal influenza are often associated with an increased incidence of more than one influenza type or subtype [16]. Early circulation of one influenza subtype is associated with a reduced total incidence of the other subtypes, consistent with the presence of interference between subtypes [17]. The traditional time series correlation analysis methods are not suitable to this case when accounting for autocorrelation and confounding factors in time series, which may result in false positive or negative correlation [18]. Dynamic models are widely used to explore pathogen interactions at the population level, inference ability depends on the strength and time course of the underlying mechanism: stronger and more persistent interactions are easier and more precisely quantified [19][20][21]. Previous studies [22,23] have used mathematical models to study the interaction of influenza viruses. However, the results may be biased due to the nonnegligible effect of the punctuated antigenic changes on influenza dynamics, especially for H3N2.
It has been demonstrated that significant differences exist for models with or without viral antigenic changes as covariates regarding either capture the temporal patterns or make forecasts [24,25]. An evolution-informed model has already been constructed and used to make predictions for H3N2, but the interaction between H3N2 and H1N1 was simplified and neglected [24]. The theoretical modeling results illustrate a cross-protection of 50% between the two subtypes is sufficient to explain why a single influenza illness peak was observed in temperate countries [16]. Accounting for cross-immunity driven by exposures in previous outbreaks explicitly is expected to improve the accuracy of epidemiological modeling forecasts during influenza outbreaks [26]. More importantly, forecasts separated by subtype generally produced more accurate predictions and, when summed, produced more accurate predictions of total influenza incidence [27].
Based on the previous works, an evolution-driven mathematical transmission model was built to explicitly and quantitatively explore the interaction between influenza A H1N1 and H3N2 with the realistic evolutionary change information of the virus incorporated into the model in the present study. Based on the model, retrospective and prospective forecasts can be made, which will help decision makers to take timely control measures and allocate medical resources.

Epidemiological data and virological data
Hemagglutinin (HA) protein sequences of influenza A virus H1N1 and H3N2 were downloaded from the Global Initiative on Sharing Avian Influenza Data (GISAID, https://www.gisaid.org/) and then were aligned with MUSCLE v3.7 using default settings [28]. Undetermined amino acids were replaced by gaps, and only the HA1 domain was retained for further analysis [29]. Outliers based on a reconstructed phylogenetic tree using FastTree 2 [30] with default settings were manually removed. Outpatient illness surveillance data and viral surveillance data from the years 2012 to 2020 were downloaded from Global Influenza Surveillance and Response System (GISRS). Yearly national level population estimates were downloaded from World Health Organization (WHO). The final weekly subtype-specific incidence was calculated as the product of influenza-like illness (ILI) incidence rate, proportion of ILI samples that tested positive for influenza, subtype-specific proportion and population size. Weekly incidence data were then aggregated to monthly data.

Time-series analysis
The wavelet decomposition of the time series from the years 2012 to 2019 for all subtypes was performed to disentangle periodicity in time-frequency space using the R package developed by Grinsted et al. (2004) [31].
The wavelet transform decomposes a signal using narrow functions (wavelets) when high-frequency features are present and widen functions on lowfrequency structures [32]. Before wavelet decomposition, the monthly incidence of influenza A virus H1N1, H3N2, and influenza B was square-rooted transformed. To explore how strongly two subtypes are correlated, the cross-correlations were also computed between them.

Two-subtype epidemiological model
An evolution-driven mathematical transmission model was proposed to explore the interaction between influenza A viruses H1N1 and H3N2 based on monthly incidence time series data of the USA from the years 2012 to 2019. The model considers both the transmission characteristic and antigenic evolution. The transmission model divided the population into several compartments according to their infectious and immune status (Fig. 1), where S represents the susceptible population, I H3 and I H1 are the infected populations by H3N2 and H1N1. R cH3 and R cH1 are the population recovered from infection H3N2 and H1N1 and may transiently cross-protected from H1N1 and H3N2, respectively. R H3 and R H1 are the population completely protected from H3N2 and H1N1 without cross-protection from the other. Population J H3 and J H1 are infected by H3N2 and H1N1, but also immune to H1N1 and H3N2, respectively, due to repeated infection, and R is recovered class with immune to both subtypes. The variables and related parameters for these two subtypes are described by corresponding subscripts (H3 and H1). Each differential equation represents the number of people flowing in and out of the compartment per unit time as indicated by the arrows in Fig. 1. The complete model is given by: Three aspects of cross-protection on the dynamics were considered in the model.
(1) The influence of cross-protection on susceptibility individuals infected with one subtype differ in susceptibility to the other, and the parameters c H3 and c H1 are the two parameters quanties the strength of partial cross-protection from subtypes H1N1 and H3N2, respectively. These two parameters represent the difference in the probability of reinfection with the other subtype between cross-protected and uncrossprotected populations. P H3 and P H1 represent the duration of cross-protection. (2) The influence of cross-protection on transmission individuals with cross-protection differ in transmission compared to those without cross-protection, yielding the force of infection The parameter a was used to represent the heterogeneous mixing effects on the spread of pathogen in the population. Those effects include immune response difference due to age effect and complex transmission pattern due to intrinsic property of psychosocial behavior, etc 13,33 . N(t) is the population size at time t. The parameter n represents the relative transmission difference between I i and J i , which was used to distinguish the difference in transmissibility between newly infected and reinfected population. The contact rate is given by , which includes two components: seasonality implemented through six b-splines s i t ð Þ with coefficients b i , and environmental noise simulated by a gamma distribution U [34].
(3) The influence of cross-protection on infectious period individuals with cross-protection differ in infectious period compared to those without cross-protection. The parameter D denotes the infectious period. The parameter x represents the infectious period difference between I i and respectively. R H3 and R H1 are the population completely protected from H3N2 and H1N1 without cross-protection from the other. Population J H3 and J H1 are infected by H3N2 and J i , which was used to represent the impact of reinfection on infectious period.
The evolution of influenza can influence the proportion of the susceptible population, which can elevate the proportion of susceptible people [22], so the immune waning of H1N1 and H3N2 is given by: where L 0 i is the basic immune protection period and w i is the scaling factor. The evolutionary index E i ðtÞ was used to measure the degree of evolutionary change of the virus, the index is calculated as a weighted average distance $ d i ðs;tÞ between the viruses in current month t and those in the past two seasons in that this period is close to the average time of cluster replacements [35]. The biology meaning behind E(t) relates to the immune memory or protection existing in the human population against new viruses: The more similar this variant is to viruses in the past, the less likely it will be to infect people [24]. The weight h is taken to be a decaying function of the intersequence interval. The evolutionary index is given by: For the evolutionary index E i ðs 0 Þ for Germany, Austria and Norway, the evolutionary distance was calculated between the viruses in current season s 0 and those in the past two seasons due to limited sequence data, which is given by: The other parameters s and l are the external importation rate and death rate, which were fixed to 36.5 per year and 0.015 per year, respectively, as the previous study [24]. The number of births per unit time was quantified as lN t ð Þ þ dN t ð Þ dt , to reproduce the natural increase of susceptible population.

Parameter estimation
H i ðtÞ was defined to be the total number of new infections come to I i and J i , the monthly case reports, C i ðtÞ, are normally distributed, that is, where q is the reporting ratio and r s is the standard deviation which is different in summer (April-September) and winter (November-March in the next year). For simplicity, the variance was assumed to scale linearly with the mean, that is r s ¼ r 0 s qH i ðtÞ, r 0 s is the scaling coefficient (s = 1 for summer and s = 2 for winter, respectively). Both parameters and initial conditions were estimated on the basis of the likelihood function using MIF in the R package pomp [36], which is given by: The priors of all the parameters were uniform distributions (Table S1). The iterated filtering (IF2) algorithm will be free to search parameter space outside the box. The maximum likelihood estimation of target parameters and its confidential interval (CI) were obtained based on the Monte Carlo adjusted profile algorithm [37].

Forecasts
The model was trained on the data set before the target season and the initial conditions were estimated based on exactly the same procedure described above. Next, forecasts were obtained through forward simulations for the target season starting from these initial conditions. Some external variables or drivers whose values must be specified independently from the dynamics. The projected population size N(t) was produced by the United States Census Bureau. Especially for the weighted average distance $ d i ðs;tÞ , we use the mean distance of $ d i ðs;tÞ from the months April to September in the last season and make it constant for the next season as a rough approximation in the USA.

Model validation and Sensitivity analysis
To establish whether the inference framework can offer valid and robust parameters estimation, parameters were re-estimated based on synthesized data that generated based on pre-defined parameters. Only the critical interaction parameters c H3 and c H1 were considered, and three scenarios were tested: c H3 ¼ 0:5; c H1 ¼ 1; c H3 ¼ 1; c H1 ¼ 0:5;c H3 ¼ 1; c H1 ¼ 1. Other parameters in the dynamics model are fixed at maximum likelihood estimations. Based on the simulated time series, the target parameters were estimated and compared with the pre-defined values. To further test whether the model framework would detect weak cross-protection, one of the two parameters c H3 and c H1 was fixed at 1 and the value of the other parameter (c H1 or c H3 ) shifts from 0.25 to 0.95. Furthermore, we compare the shape of the log-likelihood profiles as the size of the data varies under three scenarios (c H3 ¼ 0:5; c H1 ¼ 1; c H3 ¼ 1; c H1 ¼ 0:5; c H3 ¼ 1; c H1 ¼ 1) to investigate the precision of inference as the length of time series varies.
The difference of infectious period of population with or without cross-protection was specifically considered in the study using the scaling parameter x. The product between parameters D and x makes a strong correlation and undistinguishable between them. To test if this correlation could affect the other key parameters estimation, the infectious period D was fixed at 0.04 or 0.02 year without x in the model. The other key parameters (c i ; n) were randomly searched on the basis of the likelihood function.

Data and Epidemiological model
The monthly incidence of influenza A virus H1N1, H3N2, and influenza B from the years 2012 to 2019 for the whole USA shows annual epidemic ( Fig. 2A and Figure S1). The cross-correlation between H1N1 and H3N2 has the largest value at 1-year lag, which is the same as that of H1N1 and B ( Figure S2). However, the cross-correlation between H3N2 and B shows the strongest correlation at 0-year lag. Since the crossprotection between influenza B and A is weak [38,39], and the overall prevalence of influenza B is much lower than the others in the USA during the study period, only interaction of H1N1 and H3N2 was explored in the present study.
An evolution-driven mathematical transmission model was built by incorporating the interaction between H1N1 and H3N2. The transmission model divided the population into several compartments according to their infectious and immune status including the first infected (I H3 and I H1 ), the second infected by another virus (J H3 and J H1 ), the recovered that can be cross-protected (R cH3 and R cH1 ), and the recovered without cross-protection (R H3 and R H1 ). The model also specifically incorporates three aspects of cross-protection on epidemics (Fig. 1). The multiannual variation of H1N1 and H3N2 dynamics can be simulated by the model simultaneously and accurately (Fig. 2B). The maximum likelihood estimations for all the parameters are shown in Table S1. The monthly evolutionary index E i ðtÞ was used as a driving factor to account for the influence of antigenic evolution on the epidemics, which is illustrated in Fig. 2C.

Evaluating the influence of cross-protection on epidemics
The inference framework was tested on synthesized data sets and the key parameters could be correctly reestimated, no matter the strength of cross-protection is different between H1N1 and H3N2 ( Figure S3). Our results show that there exists cross-protection for H1N1 from H3N2 (c H3 : 0:36ð0:05 $ 0:53Þ), but not the other way around (c H1 : 0:98ð0:55 $ 1:23Þ) ( Fig. 3A and Table S1). The probability of people recovering from H3N2 infection being reinfected with H1N1 decreased to 0.36 compared to people not infected with H3N2 previously. However, populations were equally likely to be reinfected with H3N2 regardless of whether they were infected with H1N1 or not. The cross-protection from H1N1 coffered by H3N2 has larger influence on the H1N1 incidence than H3N2, since the multiples of cumulative H1N1 incidence change faster and larger as the parameter c H3 increases (Fig. 3B). The sensitive analysis showed that the variation of parameters n and x could also affect the epidemic intensity and long-term trend of influenza epidemics obviously ( Figure S4 and Figure S5). Moreover, the change of H1N1 incidence was more pronounced than H3N2 ( Figure S6). The coexisting heterosubtypic and homosubtypic immunity would last for around yearsðP H3 : 2:78 1:68 $ 3:28 ð Þ , P H1 : 1:69ð1:21 À 2:71ÞÞ (Table S1), which is consistent with the finding that the cross-correlation between H1N1 and H3N2 has the largest value at 1-year lag ( Figure S2). The homosubtypic immunity alone is much longer and would sustain for several years not considering the antigenic changes ðL 0 H3 : 54:65 5:05 $ 61:75 ð Þ ; L 0 H1 : 83:94ð46:91 $ 125:51ÞÞ (Table S1). However, the cross-protection effect could be false negative when estimated if the magnitude itself was weak ( Figure S7). At the same time, with longer time series, the narrower 95% confident intervals can be obtained and the accuracy of inference will be improved ( Figure S8).

Applying the model to other countries
In order to verify the generality of our model and the key parameters (c i ; n; x), the model was applied to Germany, Austria and Norway with sufficient data, respectively, as external validation. Because of limited or biased epidemiological and antigenic data of these countries, it is not sufficient to estimate all parameter values simultaneously [40]. Therefore, key parameters obtained from the USA were constrained by letting 0\c H3 \1; c H1 ¼ 1; 0\n\1; 0\x\1; and the other parameters were randomly searched on the basis of the likelihood function. When calculating the evolutionary index E i ðs 0 Þ, the average evolutionary distance between seasons was calculated and used due to lack of sequence data in the summer ( Figure S9). The model is capable to capture the dynamics of H1N1 and H3N2 in Germany, Austria and Norway (Fig. 4A, 4B and 4C, respectively). Monthly predictions and observations are significantly correlated ( Figure S10). The evolutionary index of H3N2 is often greater than that of H1N1 in either the USA (Fig. 2D), Germany, Austria or Norway (Fig. 4A, 4B and 4C, respectively), which causes more rapid loss of homosubtypic immunity in the model. Blue, red and green curves are for influenza A virus H1N1, H3N2 and influenza B, respectively. B Model simulations compared with the observed incidence. The black curve is the monthly observed incidence; the colored curve is the predicted monthly median incidence with shaded 95% uncertainty intervals (2.5 and 97.5% quantiles) from 5000 random simulations with the maximum likelihood estimations. C The Monthly evolutionary change E(t). Blue and red curves are for H1N1 and H3N2, respectively

Retrospective and prospective forecast
To test the predicting ability, the model was trained on the data set before the target season and tested on the data set after the season 2015/2016. Because the model is nonautonomous, some covariates should be specified manually and independently before making predictions. For the weighted average distance $ d i ðs;tÞ , the mean distance of $ d i ðs;tÞ from the months April to September in the last season was used and keep it constant for the next season as a rough approximation. The retrospective forecasts of 2016/2017, 2017/2018, 2018/2019 and a prospective forecast 2019/2020 show an increasing trend for H1N1 and a decreasing trend for H3N2. Most observations fall within the 95% uncertainty intervals, which matches well with historical data except for A(H3N2) in 2018/2019 (Fig. 5A). There are significant correlations between predictions and observations of H1N1 and H3N2 (r ¼ 0:85and0:54, respectively, with both P \ 0.01) from the retrospective and prospective forecasts (Fig. 5B). Our results demonstrate the feasibility of epidemiological forecasts by incorporating the interaction between H1N1 and H3N2, the transmission characteristic and antigenic evolution.

Discussion
An evolution-driven mathematical transmission model that specifically incorporates antigenic evolution and cross-protection between subtypes was proposed in this study for exploration of interaction between H1N1 and H3N2 based on data in the USA from the years 2012 to 2019. Key parameters and quantitative description of strength of the interaction between them were revealed. A forecasting approach was also proposed based on the model and the forecast feasibility was tested. Our findings highlight the importance of cross-protection on the co-circulating of H1N1 and H3N2.
The model is capable to simulate the epidemic trend and relative epidemic intensity of H1N1 and H3N2 not only in the USA but also in other countries. There exists significant cross-protection for H1N1 from H3N2 (c H3 : 0:36ð0:05 $ 0:53Þ), which is consistent with the general perception that H3N2 outcompetes H1N1, as a result, causes more frequent and larger epidemics [41]. The cross-protection for H3N2 from H1N1 was not detected in this study (c H1 : 0:98ð0:55 $ 1:23Þ), but sensitivity analysis indicates that it could be much weaker than the other way around if it exsisted ( Figure S7). Furthermore, individuals without cross-protection contribute less to Fig. 3 The influence of cross-protection on epidemics. A The profiling likelihood for c H3 and c H1 , and their corresponding 95% confident intervals (CIs). (B) The effects of cross-protection on the epidemic size. The parameter c H3 or c H1 varies from 0 to 1 while the other parameters were fixed at the maximum likelihood estimations (Table S1). The accumulated incidence was calculated from the years 2012 to 2019 and the fold change of accumulated incidence was computed relative to the total incidence under the scenario c H3 ¼ c H1 ¼ 0 influenza transmission than individuals with crossprotection (n : 0:01ð0:00 $ 0:41Þ), which can probably be explained by much more health-seeking behavior and more intense self-isolation than individuals with cross-protection [42][43][44], who have shorter infectious period as well (x : 0:32ð0:29 $ 0:59Þ). Further sensitivity analysis showed that the direction and strength of the cross-protection between H1N1 and H3N2 is not influenced by the other assumptions (Table S2).
A forecasting approach was proposed based on the evolution-driven mathematical model to make predictions for H1N1 and H3N2 simultaneously ahead of season. Earlier forecasts of incidence dynamics can aid policy makers to take better preventive and control measures, plan and allocate of resource, and develop vaccines. The prediction of incidence dynamics depends on the prediction of the evolution index. However, due to the discontinuity of influenza surveillance data in many countries, the model cannot be applied to more countries and used to make Fig. 4 Applying the model to other countries. Monthly incidence and evolutionary change E(t) for Germany A Austria B and Norway C, respectively. The black curve is the monthly observed incidence. The colored curve is the predicted monthly median incidence with shaded 95% uncertainty intervals (2.5 and 97.5% quantiles) from 5000 random simulations. The evolutionary index E i s 0 ð Þ was calculated at the scale of season due to limited sequence data (check Materials and Methods for more details) predictions. The influenza virus surveillance should be further strengthened to provide a data basis for exploration of influenza subtypes epidemic patterns and make earlier forecasts available.
Several limitations should be acknowledged. First, the compartmental model did not include age structure for lack of data in different age groups. People of different ages have different immune responses to natural infection. The immune imprint left by the initial infection lasts for life and affects the immune response to later infection with antigen-similar viruses [13,45,46]. Second, it is not always appropriate to calculate hamming distance based on mutations in HA to measure the degree of antigen changes. Taking into account variations in other parts of the influenza virus and changes in the influenza phenotype will help us to better measure the extent of antigenic changes [47][48][49]. Third, we considered only the immune , and a prospective forecast for the 2019/2020 influenza season (in orange). The black curve is the monthly observed incidence; the colored curve is the predicted monthly median incidence with shaded 95% uncertainty intervals (2.5 and 97.5% quantiles) from 5000 random simulations with the maximum likelihood estimations. B Spearman rank's correlations were computed between monthly median predictions and observations of H1N1 and H3N2 from the retrospective and retrospective forecasts as Fig. 4A. The error bars represent 95% uncertainty intervals of the predictions response caused by natural infection and did not distinguish the difference between natural infection and vaccine induced immune response [50,51]. Fourth, the model consistently underestimates cumulative incidence especially for H3N2 epidemics although the observed incidence lies within the 95% uncertainty intervals. The underestimation of epidemics could result from combinations of many factors as mentioned above, such as population heterogeneity, measurement of antigenic changes and vaccination, which should be investigated further.
Author Contribution XD designed the study. GW, SL and FT collected the data, GW, BZ and SL built the model. GW performed the analysis. XD, GW, BZ, SL, YZ and DT interpreted the data. XD and GW prepared the manuscript. XD, GW, BZ, YZ and DT edited the paper. All authors reviewed and approved the submitted manuscript.