Pneumoconiosis is a group of heterogeneous occupational interstitial lung diseases related to the reaction of lung tissue to inhaled mineral dust, which eventually leads to irreversible lung injury[1]. Because of inadequate control of workplace dust, failure to diagnose the disease early, and a lack of effective treatments, pneumoconiosis remains a serious global public health problem. According to the Global Burden of Disease (GBD) Study 2017[2], the number of pneumoconiosis cases worldwide increased by 66.0%, from 36,186 in 1990 to 60,055 in 2017, and approximately 125,000 people with pneumoconiosis die every year[3]. China has the largest working population in the world and is the country most seriously affected by pneumoconiosis. In 2018, its working population was about 776 million, and most workers spend half their lives working. According to estimates of the National Health Commission of China, the total number of occupational disease cases reported by 2018 was 97,500, and 90% of them were identified as pneumoconiosis[4]. Tianjin is an industrial city in northern China dominated by processing and manufacturing industries. In recent years, the number of pneumoconiosis cases in Tianjin has been increasing, with cases spread across a wide range of industries and occurring at younger ages. Although China has taken a variety of measures to prevent and control pneumoconiosis over the past few decades, its occupational health field is still in its infancy compared with those of the United States and Britain, and the outlook for pneumoconiosis prevention and control remains grim. Every year, pneumoconiosis imposes a huge disease burden and economic losses on Chinese workers, families, and society.
Disease burden assessment is an important public health tool for guiding risk reduction and preventing diseases caused by workplace exposure. The disability-adjusted life year (DALY) was developed by the WHO and the World Bank to quantify the burden of human diseases and injuries in the Global Burden of Disease Study[5]. The DALY combines the time lived with disability and the time lost due to premature death, and is adjusted through a set of social preference values[6]. Different age weights and discount rates can be applied to different age groups and time periods. The DALY therefore provides an objective and quantitative description of the gap between the ideal health status and the actual health status of a population[7]. Because of these advantages, the DALY method has been applied in many fields, such as cancer[8], cardiovascular diseases[9], and the impact of environmental pollution on health[10]. However, it has been applied relatively rarely in the field of occupational diseases.
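For reference, the structure of the measure described above can be written out as a short sketch. The notation below is the commonly used GBD formulation and is an assumption here, since the present text does not define symbols: N is the number of deaths, L_1 the standard life expectancy at the age of death, I the number of incident cases, DW the disability weight, L_2 the average duration of disability, a the age at onset, r the discount rate, and β and C the age-weighting parameters (values of r = 0.03 and β = 0.04 are often cited, but the values used in any given study should be checked against its methods).

```latex
\begin{align}
  \mathrm{DALY} &= \mathrm{YLL} + \mathrm{YLD}, \\
  \mathrm{YLL}  &= N \times L_1, \\
  \mathrm{YLD}  &= I \times DW \times L_2 .
\end{align}
% With age weighting and discounting, the burden of a health state with
% weight D (D = 1 for death, D = DW for disability) lasting L years
% from onset age a is usually written as
\begin{equation}
  \int_{a}^{a+L} D \, C \, x \, e^{-\beta x} \, e^{-r (x - a)} \, \mathrm{d}x .
\end{equation}
```

Setting the age weight to a constant and r = 0 reduces the integral to D × L, which recovers the simple products above.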
The time-series forecasting method, also known as the historical extension forecasting method, is an extrapolation method that reflects the development trend of a phenomenon through its time series[11]. Accurately predicting the trend of the pneumoconiosis burden enables monitoring and early warning of the disease. Common traditional time-series prediction methods include the autoregressive integrated moving average (ARIMA) model and the Holt-Winters exponential smoothing method, among which the ARIMA model is the most classical and popular[12, 13]. In time-series analysis, the ARIMA model takes into account trend changes, periodic changes, random disturbances, and other related random components of the series. Owing to its simple structure, broad applicability, and ability to interpret data sets, the ARIMA model has been applied successfully in the medical and health fields[14].
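To make the ARIMA idea concrete, the following is a minimal sketch of an ARIMA(1,1,0) forecast written in plain Python: the series is differenced once (the "integrated" step), an AR(1) model with intercept is fitted to the differences by ordinary least squares, and forecasts are accumulated back to the original scale. The model order, the estimation method, the function names, and the example values are all assumptions chosen for illustration; they are not the specification or the data used in this study, and a statistical package would normally be used in practice.

```python
from typing import List, Tuple


def difference(series: List[float]) -> List[float]:
    """First-order differencing: the 'I' (d = 1) step of ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs: List[float]) -> Tuple[float, float]:
    """Fit w_t = c + phi * w_(t-1) to the differenced series by least squares."""
    x, y = diffs[:-1], diffs[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx              # autoregressive coefficient
    c = mean_y - phi * mean_x    # intercept
    return c, phi


def forecast_arima_110(series: List[float], steps: int) -> List[float]:
    """Forecast `steps` values ahead with the ARIMA(1,1,0) sketch."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # predict the next difference
        level += last_diff               # undo the differencing
        forecasts.append(level)
    return forecasts


if __name__ == "__main__":
    # Hypothetical yearly values, made up for illustration only.
    demo_series = [10.0, 14.0, 17.0, 19.5, 21.75, 23.875]
    print(forecast_arima_110(demo_series, steps=2))
```

The sketch omits the moving-average terms, order selection, and diagnostic checks that a full ARIMA analysis would include.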
In recent years, deep learning technology has developed rapidly and is widely used to extract information from various kinds of data. Among deep learning methods, the artificial neural network (ANN) is widely used because it can overcome the limitations of linear models[15]. In time-series prediction, the recurrent neural network (RNN) dominates and achieves higher prediction accuracy than the traditional artificial neural network[16, 17]. However, when the sequence is too long, the training time of an RNN increases significantly, and the network is prone to vanishing and exploding gradients[18]. To address these problems, a novel recurrent network architecture called the Long Short-Term Memory (LSTM) neural network was proposed[19]. Combined with an appropriate gradient-based learning algorithm, it improves the hidden layer of the RNN and extends the memory function of the network, so that the model can retain information over longer time spans[20, 21]. It can learn to bridge time lags of more than 1,000 steps, even for noisy, incompressible input sequences, without losing its ability to handle short time lags[22]. In recent years, the LSTM model has been increasingly applied in fields such as traffic flow prediction, speech recognition, and disease prediction[23, 24]. To the best of our knowledge, no study has used the LSTM model to predict the disease burden of pneumoconiosis.
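The memory mechanism described above can be illustrated with a minimal sketch of a single scalar LSTM cell in plain Python. It uses the widely taught formulation with input, forget, and output gates, which is an assumption here and may differ from the exact variant in the cited works; the parameter names (`wf`, `uf`, `bf`, and so on) and the hand-picked parameter values are hypothetical, and no training is performed.

```python
import math
from typing import Dict, List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x: float, h_prev: float, c_prev: float,
              p: Dict[str, float]) -> Tuple[float, float]:
    """One time step of a scalar LSTM cell; returns (hidden state, cell state)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # additive cell-state update
    h = o * math.tanh(c)     # gated output
    return h, c


def run_cell(inputs: List[float], p: Dict[str, float],
             h0: float = 0.0, c0: float = 0.0) -> Tuple[float, float]:
    """Feed a whole sequence through the cell and return the final states."""
    h, c = h0, c0
    for x in inputs:
        h, c = lstm_step(x, h, c, p)
    return h, c


if __name__ == "__main__":
    # Hypothetical parameters: forget gate almost fully open (bf = 20),
    # input gate almost fully closed (bi = -20), all other weights zero.
    params = {k: 0.0 for k in
              ("wf", "uf", "bf", "wi", "ui", "bi",
               "wo", "uo", "bo", "wg", "ug", "bg")}
    params["bf"], params["bi"] = 20.0, -20.0
    h_final, c_final = run_cell([0.0] * 1000, params, c0=1.0)
    print(c_final)  # stays close to the initial cell state of 1.0
```

With the forget gate held near 1, the cell state is carried almost unchanged across 1,000 steps, whereas a state multiplied by a factor such as 0.9 at every step would shrink to practically zero. This additive update is the property that helps the LSTM avoid vanishing gradients over long sequences.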
In this study, the ARIMA and LSTM models were used to fit and predict the time series of the pneumoconiosis burden in Tianjin, China. By comparing the fitting performance and prediction accuracy of the two models, we sought the more suitable model for predicting the level of pneumoconiosis disease burden in Tianjin.