Scarlet fever is one of the most common respiratory infectious diseases in China, and its incidence has been increasing in recent years, with similar epidemic trends observed nationwide and worldwide[7, 23]. The prevention and control of scarlet fever still face major challenges. An adequate understanding of the epidemiological trends of scarlet fever and reliable prediction of its incidence are important for guiding prevention and control and for allocating the related medical resources. After direct network reporting of scarlet fever was implemented in 2004[24], the reported incidence rate in Jiangsu Province initially remained low. It began to rise steeply in 2010 and then fluctuated, but overall incidence increased significantly compared with historical levels. In 2019 it reached the highest level recorded since direct network reporting began, and the situation for epidemic prevention and control was severe.
This study analyzed the trend of scarlet fever incidence in Jiangsu Province, China, from 2013 to 2021. Over this period, incidence showed a clear seasonal pattern with a 12-month cycle and two peaks per year: one from April to June and another from December to January of the following year. These two peaks coincide with the local school terms and are similar to those reported in other Chinese provinces[23, 25–27]. On the one hand, the gradual rise in cases may reflect the start of the school term in March or April, which brings more contact between students[27, 28]. On the other hand, it may be related to how hemolytic streptococcus adapts to changes in temperature[29]. It is therefore crucial to strengthen scarlet fever prevention and control measures in schools from April to June. Based on the 2013 to 2021 data, this study then established two time series models, an ARIMA model and a TBATS model, and used them to predict the number of scarlet fever cases in Jiangsu Province in the first half of 2022. Although the predictions deviated from the observed values, the approach offers a new idea for predicting and preventing scarlet fever, a disease with multiple seasonal peaks.
The ARIMA model is a classical time series method that is widely used to predict infectious diseases and has performed well for scarlet fever. For example, Wu et al. constructed an ARIMA(3, 1, 3)(3, 1, 0)12 model to predict the incidence of scarlet fever in Chongqing, China, and achieved good results[30]. The ARIMA model was developed for time-varying data whose observations are mutually dependent, and it can extract the seasonal component of a series for diseases whose incidence is seasonal. However, ARIMA models ignore the nonlinear component of the time series, have limited ability to make long-term predictions, and cannot handle complex time series with multiple seasonality, high-frequency seasonality, non-integer seasonality, or dual-calendar effects[18]. The TBATS model can handle data with multiple seasonality and uses a Box-Cox transformation to deal with nonlinear features in the time series. In this study, the TBATS model outperformed the ARIMA model on both the training and test sets, which is consistent with previous studies[31, 32]. The TBATS model has two further advantages: its results are stable, and it requires fewer initial parameters to be estimated, so the decomposition of the time series into components is more stable. Given the epidemiological characteristics of scarlet fever incidence, this gives TBATS the potential to describe the long-term prevalence of the disease[33].
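To make the role of the Box-Cox transformation more concrete, the following is a minimal sketch of the transform and its inverse. It is an illustration only, not the implementation used in this study: the case counts are invented, and the value of `lam` is an assumption (TBATS estimates this parameter from the data).

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value z back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Hypothetical monthly case counts (illustrative only, not the study data).
cases = [120, 340, 910, 450, 130, 80]

# Assumed value for illustration; in TBATS this parameter is estimated.
lam = 0.5

transformed = [box_cox(y, lam) for y in cases]
restored = [inverse_box_cox(z, lam) for z in transformed]
```

With `lam` below 1, the transform compresses the seasonal peaks relative to the troughs, which is one way a model with linear components can accommodate variance that grows with the level of the series.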
The results of this study show that both the ARIMA and TBATS models fitted the training data better than they predicted the test period. The predicted values were higher than the observed values, with MAPE greater than 20%, so prediction accuracy needs to be further improved. This may be related to the characteristics of the two models themselves. Time series analysis relies only on historical data and temporal information, whereas the onset of scarlet fever, as a respiratory infectious disease, is influenced by a variety of factors, including meteorological factors such as temperature, relative humidity, and precipitation[34–37]. This can make single-factor prediction models inaccurate. In addition, the historical span chosen for this study is long, up to nine years, and changes in meteorological factors over such a period will have a greater impact on the model, reducing its long-term prediction performance. More importantly, time series analysis ignores the impact of public health policies. From March 2022, the COVID-19 outbreak in Shanghai, which borders Jiangsu Province, led to strict lockdown measures by the Chinese government. As a result, the number of scarlet fever cases from March to June decreased significantly compared with previous periods, and the models overpredicted.
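For reference, MAPE (mean absolute percentage error) can be computed as in the short sketch below. The observed and predicted values are invented for illustration and are not the results of this study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observed and predicted monthly cases (illustrative only).
actual = [100, 200, 400]
predicted = [130, 260, 480]

score = mape(actual, predicted)  # about 26.67, i.e. above the 20% level
```

Because each error is divided by the observed value, months with unusually few cases, such as a period under movement restrictions, contribute large percentage errors even when the absolute error is moderate.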
Based on the above limitations, we propose the following suggestions for future research. First, monitoring data should be collected continuously to improve the accuracy of scarlet fever case reports[38]. Second, different models can be combined so that the strengths of each contribute to better prediction results. Third, an ARIMAX model can be developed from the ARIMA model by adding spatial information or other covariates (such as the meteorological factors temperature, humidity, wind speed, and rainfall) to improve prediction accuracy.
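As one possible reading of the second suggestion, forecasts from two models could be combined with weights inversely proportional to each model's validation error. The sketch below is a hypothetical illustration under that assumption: the forecasts, the validation MAPE values, and the weighting scheme itself are not taken from this study.

```python
def inverse_error_weights(errors):
    """Weights proportional to 1/error, normalized to sum to 1."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be strictly positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def combine(forecasts, weights):
    """Weighted average of several forecasts over the same horizon."""
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("all forecasts must cover the same horizon")
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts))
        for t in range(horizon)
    ]


# Hypothetical three-month forecasts from two models (illustrative only).
arima_forecast = [300.0, 420.0, 510.0]
tbats_forecast = [280.0, 400.0, 470.0]

# Hypothetical validation MAPE (%) for ARIMA and TBATS, in that order.
validation_mape = [30.0, 20.0]

weights = inverse_error_weights(validation_mape)
combined = combine([arima_forecast, tbats_forecast], weights)
```

The model with the lower validation error receives the larger weight. Other combination schemes, such as equal weights or regression-based stacking, are equally plausible and would need to be compared on held-out data.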