2.1. Study area
Guangdong Province and the Guangxi Zhuang Autonomous Region are located on the southern coast of China and are commonly referred to together as the Guangdong and Guangxi regions. Both are rapidly growing, densely populated regions of China, and they share significant similarities in location, size, climate, topography, and resources. The geographical location and administrative divisions of the study area are shown in Fig. 1.
From north to south, Guangdong Province spans central subtropical, southern subtropical and tropical climates. The northern half of the Guangxi Zhuang Autonomous Region has a central subtropical climate, while the southern half has a southern subtropical climate. Both regions are rich in light, heat and water resources; rainfall and heat peak in the same season, with precipitation concentrated mainly from April to September (Baidu Encyclopedia 2022).
Affected by the monsoon climate, Guangdong and Guangxi are among the regions of China with the highest summer precipitation and the greatest inter-annual variability. Annual precipitation in Guangdong Province ranges from 1179.6 mm to 2320.9 mm (Guangdong Meteorological Bureau 2022), and in the Guangxi Zhuang Autonomous Region from 1210.4 mm to 2190.9 mm (Meng J 2019). In addition, both regions suffer from problems such as low elevation, aging drainage networks and inadequate flood control works (Zhang and Tang 2020; Zhao and Zhang 2017). Therefore, during the annual period of concentrated rainfall, the Guangdong and Guangxi regions face a severe risk of waterlogging.
2.2. Data collection
The remote sensing images, DEM data and administrative boundaries of the Guangdong and Guangxi regions in 2020 were obtained through the open-source map software LocaSpace Viewer. Weibo texts were chosen as the data source. Using the open interface of the Weibo platform, query phrases were formed from keywords such as "flood", "waterlogging" and "trapped", and the original check-in data on urban waterlogging disasters in Guangdong and Guangxi from June 1, 2020 to September 30, 2020 were retrieved. Duplicate posts, and posts containing only pictures, videos or other non-text content, were removed through manual screening, leaving 6125 valid text records. The numbers of Weibo texts by date and by city are shown in Fig. 2 and Fig. 3.
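The duplicate-removal step of the manual screening can be sketched as an order-preserving deduplication over the retrieved post texts. This is a minimal illustration only; the sample posts are hypothetical and the actual screening was done by hand.

```python
def drop_duplicates(texts):
    """Remove exact duplicate posts while preserving their original order."""
    seen = set()
    unique = []
    for t in texts:
        if t not in seen:
            seen.add(t)
            unique.append(t)
    return unique

# Hypothetical example posts: the repeated text is kept only once.
posts = [
    "Heavy waterlogging on the main road",
    "Heavy waterlogging on the main road",
    "Car trapped in floodwater near the station",
]
print(drop_duplicates(posts))  # 2 unique posts remain
```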
2.3. Methods
The framework and approach used in this study are schematically depicted in Fig. 4 and mainly comprise three parts: construction of the Weibo database, analysis of the disaster situation and public emotion, and correlation analysis between the disaster situation and public emotion.
2.3.1. Weibo database construction
Firstly, Web Scraper was used to obtain flood-related Weibo texts posted between June 1, 2020 and September 30, 2020. Web Scraper is a free, Google Chrome-based crawler tool that requires little programming skill from the user (Sspai Web 2020). Duplicate posts, images and videos were then removed through manual screening, yielding a database of flood-themed Weibo texts.
2.3.2. Analysis of disaster situation
To extract useful information from the Weibo texts, the collected texts were first segmented into words using "jieba" (Chen and Song 2020). "jieba" (Chinese for "to stutter") is a Python module for Chinese word segmentation; its functions include text segmentation, keyword extraction, part-of-speech tagging and word position query (GitHub Web 2020). With the segmentation results in hand, the analysis of the disaster situation and public emotion could be carried out.
For the disaster analysis, Baidu Maps was used to obtain the coordinates of the waterlogging locations mentioned in the Weibo texts. These waterlogging points were plotted in ArcGIS, and the spatial distribution characteristics of waterlogging were then explored with Kernel Density Analysis (Allen et al. 2021). Kernel Density Analysis calculates the density of point and line features within a specified neighborhood, visualizing the distribution of discrete measurements over a continuous area. Its result is a smooth surface whose values are largest near the features and decay toward the periphery, with each raster cell storing the density. Kernel Density Analysis uses the following kernel function:
$$D=\frac{3{\left(1-{scale}^{2}\right)}^{2}}{\pi {r}^{2}}$$
In the above formula, r is the search radius and scale is the ratio of the distance from the raster cell center to the point or line feature to the search radius. For a point feature, the volume enclosed between its kernel density surface and the plane below approximates the measured value at that point. For a line feature, this volume approximates the product of the line's measured value and its length (IDesktop Web 2022).
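The kernel function above can be sketched as follows: the density at a location is the sum of the quartic kernel values of all waterlogging points within the search radius. This is a simplified illustration of the formula, not the ArcGIS implementation itself.

```python
import math

def kernel_density(center, points, r):
    """Planar kernel density at `center`: each point within radius r
    contributes D = 3/(pi*r^2) * (1 - scale^2)^2, where scale = dist/r."""
    density = 0.0
    for x, y in points:
        dist = math.hypot(x - center[0], y - center[1])
        if dist < r:
            scale = dist / r
            density += 3.0 / (math.pi * r**2) * (1.0 - scale**2) ** 2
    return density

# A single waterlogging point contributes its full kernel value at distance 0,
# i.e. 3 / (pi * r^2):
print(kernel_density((0, 0), [(0, 0)], r=2.0))  # 3 / (4*pi) ≈ 0.2387
```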
2.3.3. Analysis of public emotion
Baidu Natural Language Processing was used to analyze public emotion (Zhang and Gan 2020). Baidu Natural Language Processing is a tool for analyzing the sentiment orientation of Chinese text, built on deep learning technology and Baidu's big data. It automatically determines the sentiment polarity of a text as positive, negative or neutral, and reports a corresponding confidence level. The sentiment polarity of each Weibo text was determined, and positive, negative and neutral were then coded as 1, -1 and 0, respectively, to quantify the public emotion score (Wang et al. 2020).
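The quantification step can be sketched as a simple mapping from polarity labels to numeric scores, assuming each text has already been labeled by the sentiment analysis service (the API call itself is not reproduced here).

```python
# Coding scheme from the text: positive = 1, negative = -1, neutral = 0.
SCORE = {"positive": 1, "negative": -1, "neutral": 0}

def emotion_score(labels):
    """Aggregate public emotion score for a set of labeled Weibo texts."""
    return sum(SCORE[label] for label in labels)

# Hypothetical labels for four posts: two negative, one neutral, one positive.
labels = ["negative", "negative", "neutral", "positive"]
print(emotion_score(labels))  # -1
```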
Based on the text segmentation results, the TF-IDF method was used for semantics-based classification of the Weibo texts (Sarirete A 2022). TF-IDF is a statistical measure of how important a word is to a document within a document set or corpus: a word's importance increases with its number of occurrences in a document but decreases with its frequency across the corpus. The term frequency component is defined as:
$$t{f}_{i,j}=\frac{{n}_{i,j}}{{\sum }_{k}{n}_{k,j}}$$
In the above formula, \({n}_{i,j}\) is the number of times that term \({t}_{i}\) appears in Weibo text \(j\): the numerator is the count of the word in the text, and the denominator is the total count of all words in the text (Natural Language Processing Column 2022).
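The formula above, together with the standard inverse-document-frequency factor, can be sketched directly over segmented documents. The toy corpus below is hypothetical, and the common logarithmic IDF form is assumed.

```python
import math

def tf(term, doc):
    """Term frequency: tf_{i,j} = n_{i,j} / sum_k n_{k,j}."""
    return doc.count(term) / len(doc)

def idf(term, docs):
    """Inverse document frequency, log(N / df); assumes term occurs somewhere."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)

def tf_idf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)

# Hypothetical segmented posts: "flood" is frequent in the first document
# but common across the corpus, so its IDF is low.
docs = [["flood", "road", "flood"], ["rain", "road"], ["flood", "trapped"]]
print(tf("flood", docs[0]))          # 2/3 ≈ 0.6667
print(tf_idf("trapped", docs[2], docs))
```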
Combining existing studies (Wu et al. 2018) with the contents of the collected Weibo texts, the weighted feature terms were divided into three categories: disaster descriptions; pre-disaster warnings and news reports; and emotional expressions and related thoughts.
2.3.4. Correlation analysis between disaster situation and public emotion
The waterlogging-point counts and negative emotion scores of each city were tested and found to follow a normal distribution. Pearson correlation analysis and partial correlation analysis were then used to analyze the correlation between waterlogging and negative public emotion (Ates and Guran 2021). Pearson correlation analysis, proposed by the British statistician Karl Pearson, is widely used to measure the degree of correlation between two variables; its result is the correlation coefficient \(r\):
$${r}_{X,Y}=\frac{n\sum XY-\sum X\sum Y}{\sqrt{\left[N\sum {X}^{2}-{\left(\sum X\right)}^{2}\right]\left[N\sum {Y}^{2}-{\left(\sum Y\right)}^{2}\right]}}$$
In the formula, \(n\) is the sample size, and \(X\) and \(Y\) are the observed values of the research variables. By convention, \(r>0\) indicates that the two variables are positively correlated and \(r<0\) that they are negatively correlated; the larger the absolute value of \(r\), the stronger the correlation between the two variables (Machine Learning and Artificial Intelligence Column 2021).
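The computational formula above can be sketched directly, without relying on a statistics library; the sample data are illustrative only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient via the computational formula
    r = (n*Σxy - Σx*Σy) / sqrt([n*Σx² - (Σx)²][n*Σy² - (Σy)²])."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sx2 - sx**2) * (n * sy2 - sy**2)
    )

# A perfectly linear positive relationship gives r = 1:
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```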
Partial correlation analysis removes the effect of a third variable when both variables of interest are correlated with it, so that only the degree of correlation between the two variables under study is measured. When analyzing the correlation between two variables X and Y with Z as the control variable, the partial correlation coefficient between X and Y is defined as \({r}_{xy\left(z\right)}\):
$${r}_{xy\left(z\right)}=\frac{{r}_{xy}-{r}_{xz}{r}_{yz}}{\sqrt{1-{r}_{xz}^{2}}\sqrt{1-{r}_{yz}^{2}}}$$
In the formula, \({r}_{xy}\), \({r}_{xz}\) and \({r}_{yz}\) are the pairwise correlation coefficients of x and y, x and z, and y and z, respectively. The larger the absolute value of \({r}_{xy\left(z\right)}\), the stronger the correlation between x and y after controlling for z (Mengte Web 2022).
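The partial correlation formula takes the three pairwise coefficients as inputs, so it can be sketched as a one-line function; the numeric values below are illustrative.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of x and y controlling for z:
    r_xy(z) = (r_xy - r_xz*r_yz) / (sqrt(1 - r_xz²) * sqrt(1 - r_yz²))."""
    return (r_xy - r_xz * r_yz) / (
        math.sqrt(1 - r_xz**2) * math.sqrt(1 - r_yz**2)
    )

# If z drives both x and y equally (r_xz = r_yz = 0.8) and the observed
# r_xy = 0.64 is entirely explained by z, the partial correlation vanishes:
print(partial_r(0.64, 0.8, 0.8))  # ≈ 0.0
```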
2.3.5. Natural Breaks Classification (Jenks)
In any series of numbers there are natural turning points and breakpoints that are statistically significant, and these turning points can be used to divide the objects of study into clusters of similar nature; the natural breakpoints themselves therefore serve as good class boundaries (GIS Column 2016). Natural Breaks classification was used in this study to classify the kernel density of waterlogging points and the emotion scores.
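The idea behind Jenks Natural Breaks can be sketched for small samples by exhaustively choosing the class boundaries that minimize the within-class sum of squared deviations. This brute-force version is for illustration only; production GIS tools use an optimized dynamic-programming variant.

```python
from itertools import combinations

def natural_breaks(values, k):
    """Brute-force Jenks-style classification: partition the sorted values
    into k contiguous classes minimizing total within-class squared deviation."""
    data = sorted(values)
    n = len(data)

    def sdcm(cls):
        # Sum of squared deviations from the class mean.
        m = sum(cls) / len(cls)
        return sum((v - m) ** 2 for v in cls)

    best_cost, best_classes = None, None
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        classes = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sdcm(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes

# Two clearly separated clusters are recovered as two classes:
print(natural_breaks([1, 2, 3, 10, 11, 12], 2))  # [[1, 2, 3], [10, 11, 12]]
```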