Study on typhoon disaster assessment by mining data from social media based on artificial neural network

Typhoon disaster is a major threat to the economy and personnel safety in coastal areas. After the disaster, the objective assessment of typhoon disaster economic losses can provide an important reference for the post-disaster rescue and reconstruction and improve the scientific decision making. In this study, social media data, disaster causing factors, disaster bearing carriers (exposure and vulnerability) and other factors in traditional disasters were combined to achieve rapid disaster assessment. Convolutional neural networks were used to train a text classifier and perform automatic text classification of social media data. The correlations were discussed between various texts and disaster economic losses. It was found that there was a strong correlation between the geographical distribution of texts describing disaster damage and disaster economic losses. A back-propagation neural network was used for supervised learning to realize disaster loss assessment. To prove the reliability and applicability of the evaluation model, typhoons “Mangkhut” and “Lekima” were selected as study cases to realize the whole process from information collection to final assessment. The introduction of social media data modified the assessment results obtained by traditional methods and reduced the difference between the estimated disaster loss value and the actual value.


Introduction
Over the past years, typhoon disasters have become increasingly frequent and the impact of disasters has become much more severe with global warming (Coronese et al. 2019). Typhoons are one of the major natural disasters in China. With the rapid economic development of China and the increasing concentration of population and wealth in coastal cities, the damage caused by typhoon disasters has been increasing. In 2019, typhoon disasters caused direct economic losses of 58.9 billion yuan in China, which accounts for 18% of annual natural disaster losses (UNDRR 2019). Objective assessment and prediction of typhoon disaster economic losses can provide an important reference basis for post-disaster relief operations in affected areas, greatly improve the efficiency of relief operations and reduce typhoon disaster losses.
Conventionally, disaster economic loss assessment mainly relies on two models. The HAZUS-MH hurricane model integrates a typhoon hazard model, building change model, vulnerability model, and economic income change model to quantify the relationships among loss assessment factors and to simulate regional loss and disaster prevention capability (Vickery et al. 2006a). The HAZUS-WIND assessment model is constructed with the technical support of GIS and is used to assess the potential damage level of different types of buildings in each region (Vickery et al. 2006b). In addition, many other methods have been used to predict typhoon disaster losses, such as fitting the relationship between insured loss and frequency of typhoon disaster (Unanwa et al. 2000), calculating the regional wind distribution function (Mitsuta et al. 1996), and analyzing post-disaster images collected by remote sensing techniques (Kakooei and Baleghi 2017). In large-scale natural disasters, traditional disaster assessment tools lack timeliness and cannot promptly reflect the possible secondary derivative disasters caused by typhoons, while the assessment methods based on remote sensing images require extremely high-resolution remote sensing data in real time, and factors such as satellite revisit time also limit their timeliness (Kakooei and Baleghi 2017). Therefore, it is necessary to develop new methods that assess disaster damage rapidly after a disaster occurs.
In recent years, with the rapid development of social media and the increasing number of users, active users of social media platforms have become real-time sensors, reflecting the surrounding environment and transmitting information about the events occurring around them at any time (Imran et al. 2018). Internet users not only share what they see and hear but also express their feelings and opinions on the platforms. When natural disasters occur, people relate their personal experiences and opinions about disaster events or rescue operations on social platforms (Tang et al. 2021). Scholars have therefore begun to pay more attention to the performance of social media in disasters. Researchers have found that social media data can reflect the type and scale of disaster damage to some extent (Shan et al. 2019). Disaster-related Twitter activity was closely related to the value of emergency relief funds for disasters such as hurricanes, tornadoes and earthquakes (Kryvasheyeu et al. 2016). Another researcher correlated emotional data in voice and text communication with population density and disaster severity to build a disaster emotional loss assessment model (Teodorescu 2013). However, applying only social media information can lead to inaccurate assessment results. Social media activities are often influenced by regional socioeconomic factors, such as population and household income (Dargin et al. 2021; Fan et al. 2020). Moreover, social media data cannot fully reflect other factors, such as the disaster-bearing capacity of the affected areas.
This study combines social media data and other disaster loss-related information to achieve a rapid assessment of disaster losses. Social media data are used as real-time data related to disaster loss. Social media, disaster-causing factors, and disaster-bearing carriers (exposure and vulnerability) are considered when assessing disaster economic loss, and neural networks are trained to achieve rapid disaster assessment. A text classifier based on a convolutional neural network is trained, and the text of social media data is classified to examine the effectiveness of social media data for disaster loss assessment. Then, the correlation between each type of text and the economic loss of the disaster is discussed, and effective indicators are selected for disaster loss assessment. A back-propagation neural network is used to integrate the relevant factors of disaster loss and perform supervised learning to achieve disaster loss assessment. To demonstrate the reliability and applicability of the assessment model, the typhoons "Mangkhut" in Guangdong in 2018 and "Lekima" in Zhejiang in 2019 were selected as study cases. The contents of this paper are organized as follows: First, the data sources of the factors related to the disaster economic loss assessment are identified. Second, the research methods used in this paper are described in detail. Third, the results of the study and the strengths and limitations of the research are discussed. Finally, the research contributions and the next steps of this paper are summarized.

Overview
This paper focuses on relevance analysis and economic loss prediction of typhoon disasters using multi-source data. In this study, typhoon meteorological data, city statistics and social media data were obtained. The typhoon meteorological data were used as disaster-causing factors, including the maximum wind speed in the region during the typhoon and the daily maximum local rainfall. Urban statistics were used as a measure of urban disaster-bearing carriers, including urban GDP, resident population, urban annual rainfall, and urban agricultural land area. Figure 1 shows the framework of this study. Two main parts are considered. One part is to conduct a correlation analysis of disaster loss and social media data; the other is to carry out economic loss prediction when a typhoon disaster happens. Firstly, the text convolutional neural network model (textCNN) is trained to classify tweets collected from the microblog, and the correlation between the various categories and disaster loss is discussed. Secondly, the BP neural network model is applied to obtain a reliable rapid damage assessment. For the correlation analysis, the concentration of disaster losses is considered to be the main factor, and the different kinds of information in social media data are considered to be the correlation factors (Fig. 2).
The study considered social media data as real-time data in the evaluation model. Through the analysis of social media data, we can timely perceive the loss caused by typhoon disaster. Social media data can be collected in real time by means of web crawlers. The BP neural network was trained through the combination of social media data, typhoon disaster causing factor, disaster bearing carrier and other data, which can quickly realize the evaluation of typhoon disaster loss.

Text categorization-textCNN
In typhoon disasters, people use social media platforms to post various typhoon disaster-related news. Analyzing the information released on social media platforms in a timely manner can help governments and social groups gain an awareness of the disaster situation, understand the disaster situation in different locations and formulate corresponding rescue and recovery actions based on the analysis results.
The textCNN model is used in this study to explore the relationship between social media data and disasters. Yoon Kim applied convolutional neural networks (CNNs) to the task of text classification (Nguyen et al. 2016). The model utilizes multiple kernels of different sizes to extract key information in sentences (similar to n-grams with multiple window sizes), which can better capture local correlations. No changes have been made to textCNN in the network structure compared with the traditional image CNN network. The traditional textCNN model consists of four parts: input layer, convolutional layer, pooling layer, and fully connected layer. The first layer is the input layer. The input layer is an n × k matrix, where n is the number of words in a sentence, and k is the dimension of the word vector corresponding to each word. In addition, the padding operation is performed on the original sentence to make the vector length consistent. The second layer is a convolutional layer. Each convolution operation is equivalent to a feature vector extraction. By defining different windows, different feature vectors are extracted to form the output of the convolution layer. The third layer is the pooling layer, the role of which is to pool sentences of different lengths to obtain fixed-length vector representations. Commonly used pooling methods are 1-max pooling, k-max pooling, and average pooling. The last layer is the fully connected layer, which is used to map the learned feature representation to the label space of the sample, and use the softmax activation function to output the classification category probability (Kalchbrenner et al. 2014).
Fig. 2 The textCNN model architecture for an example sentence
Compared with traditional models, CNN models do not rely on well-designed features and complex natural language processing tools and have the advantages of simple network structure, less computation, and faster training speed. By introducing already trained word vectors, the CNN model performs well on multiple datasets. The word vectors used in this study were provided by the Word2Vec model (Mikolov et al. 2013). The model has two channels of word vectors, one static and one dynamic: one is kept static while the other is fine-tuned through back-propagation during training. In this two-channel architecture, each filter is applied to both channels and the results are added.
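The convolution and pooling steps described above can be illustrated with a minimal pure-Python sketch. It is not the classifier trained in this study: the word vectors and kernel weights are random, the dimension k = 4 and the window sizes 2, 3 and 4 are hypothetical, and the function names are introduced only for illustration. The sketch shows why 1-max pooling yields a fixed-length vector regardless of sentence length.

```python
import random

def conv1d_over_words(sentence, kernel, bias=0.0):
    """Slide an (h x k) kernel over an (n x k) sentence matrix.

    Returns n - h + 1 ReLU activations, one per window of h consecutive words.
    """
    h = len(kernel)
    feats = []
    for start in range(len(sentence) - h + 1):
        s = bias
        for i in range(h):
            for j in range(len(kernel[i])):
                s += kernel[i][j] * sentence[start + i][j]
        feats.append(max(0.0, s))
    return feats

def textcnn_features(sentence, kernels):
    """Apply 1-max pooling to every feature map: one value per kernel."""
    return [max(conv1d_over_words(sentence, kern)) for kern in kernels]

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
k = 4  # word-vector dimension (hypothetical)
# Three window sizes with two filters each (hypothetical configuration)
kernels = [rand_matrix(h, k) for h in (2, 3, 4) for _ in range(2)]

short_sentence = rand_matrix(5, k)  # 5 words
long_sentence = rand_matrix(9, k)   # 9 words
short_vec = textcnn_features(short_sentence, kernels)
long_vec = textcnn_features(long_sentence, kernels)
```

Both sentences map to a vector with one entry per kernel, which is what allows a fully connected softmax layer to follow the pooling layer.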

Disaster assessment based on BP neural network model
The back-propagation (BP) neural network continuously corrects the network weights and thresholds through training on sample data, so that the error function decreases along the negative gradient direction and the output approaches the desired output. It is a widely used neural network model, mostly applied to function approximation, pattern recognition and classification, data compression and time-series prediction (Xiao-Ling et al. 2011). The BP network consists of an input layer, a hidden layer, and an output layer. The hidden layer can have one or more layers. Figure 3 shows the three-layer m × L × n BP network model used in this study. The related factors are set as the input layer, and the value of disaster economic loss is set as the output layer. There are two main steps in establishing the BPNN model. (1) Model establishment and correlation analysis: program the process and train the network on sample data to determine the parameters and obtain a trained neural network. (2) Model evaluation: input new related factor data to get the estimated economic loss from the output layer, compare the estimated value with the observed value and evaluate the accuracy of the trained network.
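The training loop of a three-layer m × L × n network can be sketched in a few lines of pure Python. This is an illustrative toy under stated assumptions, not the model fitted in this study: the sigmoid units and a squared-error function are assumed, the learning rate, layer sizes and the four training samples are made up, and the class name `BPNetwork` is hypothetical. The biases play the role of the thresholds mentioned above.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class BPNetwork:
    """m x L x n network trained by plain gradient descent (toy sketch)."""

    def __init__(self, m, L, n, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(L)]
        self.b1 = [0.0] * L
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(L)] for _ in range(n)]
        self.b2 = [0.0] * n

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        o = [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
             for row, b in zip(self.w2, self.b2)]
        return h, o

    def train_step(self, x, t, lr=0.5):
        h, o = self.forward(x)
        # Output deltas for E = 1/2 * sum (t - o)^2 with sigmoid output units
        d_out = [(oi - ti) * oi * (1.0 - oi) for oi, ti in zip(o, t)]
        # Hidden deltas: propagate the output deltas back through w2
        d_hid = [hj * (1.0 - hj) * sum(d_out[q] * self.w2[q][j] for q in range(len(o)))
                 for j, hj in enumerate(h)]
        # Move weights and biases along the negative gradient
        for q, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] -= lr * d_out[q] * h[j]
            self.b2[q] -= lr * d_out[q]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * d_hid[j] * x[i]
            self.b1[j] -= lr * d_hid[j]

    def error(self, data):
        return sum(0.5 * sum((ti - oi) ** 2 for ti, oi in zip(t, self.forward(x)[1]))
                   for x, t in data)

# Made-up samples: three input factors and one loss value, all scaled to [0, 1]
data = [([0.1, 0.2, 0.9], [0.2]), ([0.8, 0.7, 0.1], [0.9]),
        ([0.5, 0.4, 0.5], [0.5]), ([0.9, 0.9, 0.2], [0.95])]
net = BPNetwork(m=3, L=4, n=1)
error_before = net.error(data)
for _ in range(2000):
    for x, t in data:
        net.train_step(x, t)
error_after = net.error(data)
```

Because the output unit is a sigmoid, targets have to be scaled into the interval (0, 1) before training; a linear output unit would be an alternative design for unbounded loss values.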
To obtain a better fitting effect and avoid overfitting, we use the K-fold cross-validation method to validate the model. The K-fold cross-validation method randomly divides the training set into K groups, uses (K-1) groups for modeling, and uses the remaining group for validation.
In the process of building a BP neural network model, it is very important to choose the number of hidden neuron nodes (L). The following two empirical formulas are commonly used to calculate the number of hidden nodes (Wu 2011).
Formula (1): . In these two formulas, the parameter m represents the number of neurons in the input layer, the parameter n represents the number of neurons in the output layer, and a is an empirical integer between 1 and 10.
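For orientation, one rule of thumb that is widely quoted for three-layer BP networks and uses exactly the symbols m, n and a is written out below. Whether it is one of the two formulas the authors applied is an assumption, and the values m = 8 and n = 1 in the worked line are hypothetical, chosen only to show the arithmetic.

```latex
% Widely quoted rule of thumb (assumed, not confirmed by the source)
L = \sqrt{m + n} + a, \qquad a \in \{1, 2, \dots, 10\}

% Hypothetical example: m = 8 input neurons, n = 1 output neuron
L = \sqrt{8 + 1} + a = 3 + a \quad\Longrightarrow\quad L \in \{4, 5, \dots, 13\}
```

Under this rule the formula only narrows the search to a range of candidate values; the final L would still be selected by comparing validation errors.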
The BP network uses $f(x) = \frac{1}{1+e^{-x}}$ as the sigmoid transfer function and $E = \frac{1}{2}\sum_{i}(t_i - O_i)^2$ as the back-propagation error function ($t_i$ is the expected output, $O_i$ is the calculated output of the network). The BP neural network drives the error function E to a minimum by continuously adjusting the network weights and thresholds. This study uses root-mean-square error (RMSE), mean absolute error (MAE), and coefficient of determination ($R^2$) to evaluate the performance of the model. RMSE represents the standard deviation of the differences between the actual loss and the evaluation result, MAE represents the mean absolute difference between the actual loss and the evaluation result (Willmott and Matsuura 2005), and $R^2$ represents the proportion of the variance of the actual loss that is explained by the evaluation result (Draper and Smith 1998). The formulas for the three metrics are as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \tilde{y}_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \tilde{y}_i\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \tilde{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $y_i$ is the actual damage, $\tilde{y}_i$ is the assessed damage, $\bar{y}$ is the mean value of the actual damage, and $n$ is the number of samples used for calculating model performance.
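The three performance metrics can be computed directly from their standard definitions. The sketch below uses made-up damage values, and the function names are chosen here for illustration only.

```python
import math

def rmse(actual, assessed):
    """Root-mean-square error between actual and assessed damage."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, assessed)) / n)

def mae(actual, assessed):
    """Mean absolute error between actual and assessed damage."""
    return sum(abs(y - p) for y, p in zip(actual, assessed)) / len(actual)

def r_squared(actual, assessed):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, assessed))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot

actual = [2.0, 4.0, 6.0, 8.0]    # made-up actual damage values
assessed = [3.0, 4.0, 5.0, 8.0]  # made-up assessed damage values

print(rmse(actual, assessed))       # sqrt(0.5), about 0.707
print(mae(actual, assessed))        # 0.5
print(r_squared(actual, assessed))  # 0.9
```

Lower RMSE and MAE and a higher coefficient of determination indicate a better fit; RMSE penalizes large individual errors more strongly than MAE.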

Data collection
In order to ensure that the study area is distributed from mild to severe disasters, typhoons "Mangkhut" and "Lekima" are selected in this paper. A brief introduction of the two typhoons is given in Table 1, which includes landfall date, landfall location, max wind level, affected population, and economic loss.
Guangdong Province and Zhejiang Province are selected as study cases. The two provinces have a large land area and large population. Once the typhoon disaster occurs, it will cause huge economic losses. In addition, the two provinces have a high frequency of typhoon landfall. Abundant typhoon data provided great convenience for research.
To prepare the inputs for disaster loss assessment, data on hazard factors, disaster carriers, and social media were collected. As for the output, the direct economic loss of each area was collected.

Real-time data-social media information
The latest research shows that there is a correlation between social media data and disaster losses (Kryvasheyeu et al. 2016;Yuan and Liu 2018;Chen and Ji 2021). The introduction of social media data will enable the original assessment system to respond to disaster losses in near real-time. In this study, social media data is derived from Sina Microblog, which is the most used and active social media platform in China. Sina Microblog allows posting messages within 140 characters and several images through the platform. Users can use the symbol '#' to mark the subject of the messages and use the symbol '@' to interact with other users during the process. In the event of major natural disasters and emergencies, Microblog provides the individual with a channel to release real-time emergencies and express their feelings, as well as provides a platform for governments and social organizations to disseminate early warning information and report emergency news.
The purpose of this study is to provide a rapid assessment of disaster losses. However, there could be some false or exaggerated information in social media, which may affect the evaluation results of the model. To address this problem, only original social media data, that is, posts written by the users themselves, were selected as input data in the evaluation process. The original social media data did not include forwarded microblogs. On the one hand, only a small number of people released false information during the typhoon disaster, and the impact of false information was limited when it was not forwarded (Kryvasheyeu et al. 2016). On the other hand, the correlation between original social media data and disaster losses in disaster-stricken areas was considered to be higher than that of forwarded information (Donner and Rodríguez 2008). Therefore, only original tweets were collected in this study. In addition, the social media data acquisition time was limited to the impact period of the typhoon, and the location was limited to the disaster-affected areas. The location filters were set at the provincial level, while the disaster assessment was carried out at the prefecture level. More information on the social media data collected is listed in Table 2.

Hazard factors data
As a typical meteorological disaster, the typhoon is characterized by various types of weather information (e.g., atmospheric pressure, rainfall amount, wind speed). Commonly, strong winds destroy homes, billboards, and power transmission towers; meanwhile, short-term heavy rainfall can lead to floods and cause damage to homes and infrastructure. Max wind speed and 24-h rainfall amount are two critical parameters because most of the damage caused by typhoons is related to strong winds, floods, waterlogging, and their secondary derived disasters. In this study, the paths of the typhoons are provided by the Oceanographic Data Center, Chinese Academy of Sciences (CASODC) (http://msdc.qdio.ac.cn).

Disaster carrier data
According to previous studies, GDP is a vital indicator for disaster-stricken areas (Donner and Rodríguez 2008): it represents regional infrastructure value and measures the extent of economic damage that may result from a disaster. The population, on the other hand, reflects community exposure to disaster (Santos 2019). In this study, GDP and population were collected from the Statistical Yearbook (National Bureau of statistics of the People's Republic of China 2019, 2020). The official website of each Provincial Bureau of Statistics publishes the previous year's statistical data for prefecture-level cities every year. In addition, typhoon disasters have a huge impact on regional agricultural development. Strong winds and heavy rains wreak havoc on food crops, cash crops, etc. According to Typhoon Disaster Statistics in China, agricultural economic losses account for a very high proportion of direct economic losses (National Bureau of statistics of the People's Republic of China 2019, 2020). Therefore, it is necessary to consider the agricultural land area as an important factor for disaster assessment. The areas of food crops, cash crops, and other crops are taken into consideration; they were collected from the Guangdong Statistical Yearbook and the Zhejiang Statistical Yearbook.

Disaster economic loss
In the post-disaster stage, the emergency management bureaus in the disaster-stricken areas usually count disaster loss in five categories: agricultural losses, industrial and mining losses, infrastructure losses, public service losses, and household property losses (Ministry of Emergency Management 2020). The statistical results are reported to the National Disaster Mitigation Center. In this study, disaster economic loss is considered as the output. In addition, storm surges occur only in coastal areas, so coastal and non-coastal areas cannot be assessed in the same way. Therefore, if a prefecture-level city includes a coastal area, the economic losses in the coastal area are deducted from the total.

Spatiotemporal distribution of social media data
We collected social media data from the microblog. The data collection time was from 24 h before the typhoon landed to 24 h after the typhoon left. The time-series distribution of social media data is shown in Fig. 4. The graph shows the number of typhoon-related tweets per hour. The number of tweets related to the typhoon disaster began to increase before the typhoon landed, peaked on the day the typhoon made landfall, and then declined rapidly. There is a lag between the time of the typhoon's landing and the outbreak of public opinion: the number of tweets surged a few hours after the typhoon landed. In addition, the number of tweets is affected by netizens' work and rest patterns. From 0:00 am to 6:00 am, the number of public opinion texts is lower, while the number of tweets is higher at noon and in the evening (Fig. 5).
Geographically, on the one hand, the number of tweets is affected by the typhoon, and the high-frequency areas are mainly distributed on both sides of the typhoon path. In typhoon "Lekima", the areas with more than 500 tweets are mainly Taizhou, Shaoxing, Hangzhou, and Zhoushan, and the areas with more than 800 tweets are all around the typhoon track. The number of tweets is thus closely related to the typhoon track. On the other hand, it is related to the population distribution in the affected area. In typhoon "Mangkhut", a large number of people were concentrated in Guangzhou and its surrounding urban agglomerations (Guangzhou, Shenzhen, Zhuhai, and Dongguan). The number of tweets posted in these regions is much higher than in other regions, and most of the areas with more than 1 000 tweets lie in these regions. Therefore, there is a certain correlation between the number of tweets and the typhoon disaster, and also a strong correlation with the population distribution in the affected areas.
(Figure caption: The geographic distribution of social media data)

Text classification of social media data and correlation with disaster losses
In this section, we formed a classified dataset by extracting part of the social media data and performing manual annotation of classified texts at first. Then, the TextCNN model was trained using the labeled dataset. Finally, the rest of the social media data was classified and processed using the trained TextCNN model, and the correlation between different categories of social media data and typhoon disaster losses was discussed.

TextCNN classification
This study classified social media data in typhoon disaster environments into eight categories: pre-disaster prevention, disaster damage description, reminder and advice, emotional expression, rescue and recovery, disaster notification, volunteer activities, and irrelevant information (Nguyen et al. 2016). The number of each type of marked text is shown in Table 3.
The ROC (receiver operating characteristic) curve of the TextCNN training model is shown in Fig. 6. Each point on the ROC curve corresponds to one classification threshold. The horizontal axis is the false-positive rate (FPR), which is the proportion of actual negative instances that the classifier predicts as positive among all negative instances. The vertical axis is the true-positive rate (TPR), which is the proportion of actual positive instances that the classifier predicts as positive among all positive instances. The performance of a classifier can be measured by the area under the ROC curve (AUC). The larger the AUC value, the higher the accuracy that the classifier can obtain after selecting an appropriate threshold. Table 4 presents the classification performance of the textCNN model on classified instances. The precision, recall and F1 score of the textCNN multi-class classifier are given in the table. Judging by precision, recall and F1 score together, the trained textCNN model is acceptable for text classification. Table 5 shows the performances of the different classifiers. Compared with the unsupervised approach, the F1 scores of supervised classification models are higher. Among supervised learning methods, neural network classification is better than traditional machine learning. Furthermore, CNN outperformed the other classifier models in terms of precision, recall, and F1 score. Hence, this research selected CNN as the classifier model.
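How FPR, TPR and AUC are obtained from classifier scores can be shown with a small binary example. For the multi-class setting of this study, one curve per category in a one-vs-rest fashion is assumed here; the labels and scores below are made up and the function names are illustrative.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the threshold over all scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        # Everything scored at or above the threshold is predicted positive
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

labels = [1, 1, 0, 1, 0, 0]                # 1 = text belongs to the category
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]    # made-up classifier probabilities
points = roc_points(labels, scores)
area = auc(points)
print(area)  # 8/9, about 0.889
```

The AUC equals the fraction of positive-negative pairs that the classifier ranks correctly: here 8 of the 9 pairs, because one negative text (score 0.7) outranks one positive text (score 0.6).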

Social media data text classification results and their correlation with disaster losses
This section focuses on the correlation between social media data and disaster losses. The disaster loss data come from the post-disaster loss survey data of typhoon "Mangkhut" in 2018 and typhoon "Lekima" in 2019. Figure 7 shows the spatial distribution of direct economic losses from typhoon disasters. It can be seen that, without considering the losses caused by storm surges, the typhoon landfall area is the area with relatively large disaster losses. At the same time, after excluding the impact of storm surges, the coastal areas suffered less damage, while the inland areas were still …
The social media data were classified by the text classifier in the previous section, and the distribution of each category of text in each region was obtained. The correlation between typhoon disaster losses and social media data was measured by the correlation coefficient. The Pearson correlation coefficient is suitable for describing linear relationships, and the Spearman correlation coefficient is often used to describe monotonic relationships that may be nonlinear. The Pearson and Spearman correlation coefficients between disaster losses and social media data are shown in Table 6. It can be seen that there is a moderate correlation between the number of texts describing disaster damage and disaster losses, and both its Pearson and Spearman correlation coefficients are higher than those of the total number of tweets posted. Disaster losses are also moderately correlated, while the numbers of texts on disaster prevention, emotional expression, reminders and suggestions, disaster reporting, and volunteer activities are less correlated with disaster losses.
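The difference between the two coefficients can be seen in a short sketch: Spearman's coefficient is Pearson's coefficient computed on ranks, so it reaches 1 for any strictly increasing relationship, while Pearson's coefficient drops when the relationship is not linear. The per-region text counts and losses below are made up, not values from Table 6.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            result[order[pos]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

damage_texts = [12, 30, 45, 80, 150, 400]    # made-up text counts per region
losses = [0.1, 0.5, 0.9, 3.0, 20.0, 900.0]   # made-up losses, monotonic but nonlinear

r_pearson = pearson(damage_texts, losses)
r_spearman = spearman(damage_texts, losses)
```

Reporting both coefficients, as Table 6 does, therefore helps separate a relationship that is merely monotonic from one that is close to linear.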
As can be seen from Table 6, the number of disaster damage description texts is an important factor that can describe disaster losses, and it is feasible to use the number of disaster damage texts as real-time data for disaster loss assessment. However, social media data are greatly affected by the population of the region: the larger the population base, the greater the number of posts on the same topic. A densely populated area may therefore produce a large number of texts even though it suffers only a small amount of damage.
To reduce the impact of such situations, two solutions are proposed. One is to consider the correlation between per capita loss and social media data. The other is to consider the correlation between the proportion of each social media category in the total number of tweets and disaster losses. The correlations obtained by the two methods are shown in Table 7. When considering per capita disaster loss, the correlation between disaster damage description and disaster loss decreases, the correlation between each category of social media data and per capita disaster loss decreases, and the volunteer-related category shows a slight negative correlation with per capita disaster loss. Considering the proportion of different types of disaster texts, the correlation between disaster damage description and disaster loss becomes higher, while reminders and suggestions, emotional expressions, and irrelevant information have a moderate negative correlation with disaster losses, and pre-disaster prevention has a mild negative correlation with disaster losses. The proportion of disaster damage texts in the total number of tweets is shown in Fig. 8. Comparing Fig. 7 and Fig. 8, it can be seen that areas with a high proportion of disaster damage texts also have high disaster losses, and the two have a high rank correlation. The reason for the above results may be that people in the more severely affected areas are more willing to share disaster damage information with others through the Internet, hoping to gain attention and help from other areas, while people in the less affected areas, whose daily life is less disrupted, are more willing to express sympathy and support for the hardest-hit areas and to issue reminders and suggestions for disaster avoidance and self-rescue.
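The effect of the second normalization can be illustrated with two hypothetical regions. All names and numbers below are invented for the example, and only three of the eight categories are shown for brevity: a densely populated city posts more damage texts in absolute terms, yet the share of damage texts is higher in the smaller, harder-hit town.

```python
def damage_text_proportion(category_counts, category="disaster damage description"):
    """Share of one text category in the total number of tweets of a region."""
    total = sum(category_counts.values())
    if total == 0:
        return 0.0
    return category_counts.get(category, 0) / total

def per_capita_loss(loss, population):
    """Economic loss divided by resident population."""
    return loss / population

# Hypothetical regions: text counts per category, loss and population
regions = {
    "dense_city": {
        "counts": {"disaster damage description": 100,
                   "emotional expression": 700,
                   "reminder and advice": 200},
        "loss": 50.0, "population": 1000.0,
    },
    "small_town": {
        "counts": {"disaster damage description": 60,
                   "emotional expression": 30,
                   "reminder and advice": 10},
        "loss": 80.0, "population": 100.0,
    },
}

proportions = {name: damage_text_proportion(r["counts"]) for name, r in regions.items()}
losses_per_capita = {name: per_capita_loss(r["loss"], r["population"])
                     for name, r in regions.items()}
print(proportions)        # dense_city: 0.1, small_town: 0.6
print(losses_per_capita)  # dense_city: 0.05, small_town: 0.8
```

In this toy case the raw count ranks the city above the town (100 versus 60 texts), whereas the proportion reverses the order and agrees with the larger loss of the town.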

Typhoon disaster loss assessment
The method proposed in this study can quickly realize the loss assessment of typhoon disasters. On the one hand, real-time information comes from Weibo and other public social media platforms; it can be collected in real time and fed to the trained classification model to complete the rapid processing of social media data. On the other hand, data on typhoon disaster factors (rainfall, wind speed) and part of the disaster carrier data (distance from the eye of storm) can be obtained promptly during the typhoon disaster. Statistics on affected areas (resident population, GDP, agricultural land area, annual rainfall) are also available from the local Bureau of Statistics before the disaster.
Selecting representative feature data from basic data can simplify the model and improve its evaluation accuracy. In the loss assessment of typhoon disasters, this study extracts representative characteristic data based on previous studies from the perspective of disaster-causing factors, disaster-bearing carriers, and real-time data. From the perspective of disaster-causing factors, rainfall and wind speed are selected as disaster description features. From the perspective of disaster-bearing carriers, GDP, distance from the eye of storm, and population are selected as the characteristics of regional exposure, and annual rainfall and agricultural land area as the characteristics of regional vulnerability. A sample of the disaster assessment data is given in Table 8. This study collected typhoon disaster data in 30 regions as the sample data of the BP neural network model and divided the sample data into a training set (70%) and a test set (30%). To discuss the role of social media data, two models were compared: Model I, built on hazard-bearing carriers and hazard-causing factors, and Model II, built on hazard-bearing carriers, hazard-causing factors, and social media. The results of the disaster loss assessment are shown in Fig. 9. To better show the difference between the model evaluation results, the actual disaster loss values and the evaluation values were converted to base-10 logarithms. The relative magnitude of the errors in the graph is indicated by color, and the diagonal (Y = X) represents the ideal case where the estimated value equals the actual value. As can be seen from the figure, Model I (without social media data) had a larger error in the evaluation results. When the losses were low, the evaluation results of Model I tended to be too high. When the disaster losses were high, the evaluation results of Model I were lower than the actual values. Model II introduced social media data into disaster loss assessment, which corrected the assessment results of Model I to some extent. In moderately affected areas, the evaluation results of Model II are closer to the actual values than those of Model I.
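The 70/30 grouping of the 30 regional samples and the base-10 logarithm used for plotting can be sketched as follows. The random shuffle, the seed and the function names are assumptions made for illustration; the source does not state how the regions were assigned to the two sets.

```python
import math
import random

def train_test_split(samples, train_fraction=0.7, seed=42):
    """Shuffle a copy of the samples and cut it into training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def log10_pair(actual, estimated):
    """Base-10 logarithms of an (actual, estimated) loss pair for plotting."""
    return math.log10(actual), math.log10(estimated)

region_ids = list(range(30))  # 30 regions, identified here by index only
train_ids, test_ids = train_test_split(region_ids)
print(len(train_ids), len(test_ids))  # 21 9

# Made-up loss pair: on the log scale a tenfold underestimate is a gap of 1
log_actual, log_estimated = log10_pair(1000.0, 100.0)
```

With only 9 test regions the test-set metrics are sensitive to which regions are drawn, which is one reason to combine the split with K-fold cross-validation on the training set.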
Table 9 shows the performance parameters of the two evaluation models on the test set, with and without social media data. All performance parameters of the model that considers social media data are better than those of the traditional model.
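The text here does not list which performance parameters Table 9 reports. As a hedged illustration only, the sketch below computes three common regression measures (mean absolute error, root mean squared error, and the coefficient of determination) for two sets of estimates. The function name `regression_metrics` and all numerical values are hypothetical; they are not taken from Table 9.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, RMSE, and R^2 for paired actual/predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}

# Hypothetical log10 losses for four test regions (not from the paper).
actual = [1.0, 2.0, 3.0, 4.0]
# Hypothetical Model I output: too high for low losses, too low for high ones.
model_i = [2.0, 2.0, 3.0, 3.0]
# Hypothetical Model II output: the same pattern with smaller errors.
model_ii = [1.5, 2.0, 3.0, 3.5]

m1 = regression_metrics(actual, model_i)
m2 = regression_metrics(actual, model_ii)
print("Model I :", m1)
print("Model II:", m2)
```

With these made-up inputs, a lower MAE and RMSE and a higher R^2 would indicate the better model, which is the kind of comparison the table summarizes.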
Further comparison of the disaster-related loss factors, the assessment results, and the actual disaster losses shows that areas with high assessed losses have larger GDP, permanent residents amount (PRA), and area of agricultural land (AAL) but smaller max wind speed (MWS) and daily rainfall amount (DRA). For the hardest-hit areas, the assessed loss is significantly lower than the actual loss. The reason is that secondary and derivative disasters such as floods and landslides occurred, which the traditional model could not take into consideration. When the typhoon disaster happened, social media reflected the local disaster situation in nearly real time. In severely affected areas, victims tend to describe scenes of disaster damage; in lightly affected areas, there are fewer such scenes, the normal life of residents is less affected, and the proportion of posts related to disaster damage is low. Therefore, the traditional disaster loss assessment model can be corrected by taking social media data into account.
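The per-region proportion of damage-related posts discussed above can be sketched as follows. This is a minimal illustration, assuming each post has already been assigned a region and a class label by the text classifier; the label names, region names, and the function `damage_proportion` are hypothetical.

```python
from collections import Counter

def damage_proportion(posts, damage_label="damage"):
    """Return, for each region, the share of posts labeled as damage descriptions."""
    totals = Counter()
    damage = Counter()
    for region, label in posts:
        totals[region] += 1
        if label == damage_label:
            damage[region] += 1
    return {region: damage[region] / totals[region] for region in totals}

# Hypothetical classified posts: (region, predicted label).
posts = [
    ("region_a", "damage"), ("region_a", "damage"),
    ("region_a", "damage"), ("region_a", "other"),
    ("region_b", "damage"), ("region_b", "other"),
    ("region_b", "other"), ("region_b", "other"), ("region_b", "other"),
]

ptdd = damage_proportion(posts)
print(ptdd)
```

Using a proportion rather than a raw count makes regions with very different posting volumes comparable, which matches the argument that lightly affected areas show a low share of damage-related posts.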
In summary, the method proposed in this study can quickly assess typhoon disaster losses. On the one hand, real-time information comes from Weibo and other public social media platforms; it can be collected in real time and fed into the trained classification model for rapid processing. On the other hand, data on typhoon disaster factors (rainfall, wind speed) and part of the disaster carrier data (distance from the eye of the storm) can be obtained promptly after the typhoon disaster. Statistics on the affected areas (resident population, GDP, agricultural land area, annual rainfall) are also quickly available.

Conclusion
This paper analyzes the correlation between social media data and disaster losses, and uses social media data, disaster-causing factors, and disaster-bearing carriers for disaster loss assessment with neural network models. The areas affected by typhoons "Mangkhut" and "Lekima" were selected as the research objects. This study collected social media data from the disaster-affected areas and discussed the correlation between different categories of texts in the social media data and disaster losses. The results show that the proportion of tweets related to disaster damage descriptions is more strongly correlated with disaster losses than the number of tweets is, and that it can be used as real-time data for disaster loss assessment. In the disaster assessment stage, the maximum wind speed and rainfall at the time of the disaster are used to characterize the hazard-causing factors; GDP, distance from the eye of the storm, and resident population are used as the exposure characteristics of the disaster-bearing carrier; the average annual rainfall and agricultural land area are used as the vulnerability characteristics of the disaster-bearing carrier; and the social media data are used as real-time data features. Finally, a fast evaluation method based on supervised learning is proposed. The evaluation results show that adding the proportion of disaster description texts in social media data can effectively improve the accuracy of disaster loss assessment and revise traditional assessment models.
Table 8 Sample of disaster assessment data. Abbreviations: proportion of tweets with damage descriptions (PTDD), max wind speed (MWS), daily rainfall amount (DRA), distance from the eye of storm (DES), annual rainfall amount (ARA), permanent residents amount (PRA), area of agricultural land (AAL).
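The comparison between the two social media indicators can be illustrated with a Pearson correlation coefficient. The source does not state which correlation measure was used, so Pearson is an assumption here, and all numbers below are invented purely to show the computation; they are not results from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five regions (illustrative only).
losses = [1.0, 2.0, 3.0, 4.0, 5.0]               # e.g. log10 economic loss
damage_share = [0.10, 0.25, 0.30, 0.45, 0.50]    # proportion of damage posts
tweet_count = [500, 100, 400, 200, 300]          # raw number of posts

r_share = pearson(damage_share, losses)
r_count = pearson(tweet_count, losses)
print("proportion vs. loss:", r_share)
print("count vs. loss:", r_count)
```

An indicator whose coefficient has the larger absolute value tracks the losses more closely, which is the sense in which one indicator is described as more correlated than the other.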
Building on the traditional disaster assessment model, this assessment model adds social media data as real-time data to quickly assess typhoon disaster losses. The evaluation results can provide a reference for post-disaster recovery, rapid rescue, material distribution, and other actions. In future research, disaster loss assessment can also be combined with the historical typhoon disaster data of the affected areas and with their disaster prevention and mitigation work.
It should be noted that the spread of false information and rumors on social media will bias the evaluation model. To make the model more robust to such interference, future research will address the detection of false information and the recognition of rumors in social media data. Research in these areas will improve the accuracy of the present method.