DOI: https://doi.org/10.21203/rs.3.rs-1188742/v1
Objectives: To determine whether differences in cultural services between types of urban green area can be distinguished through atypical (unstructured) text expressions.
Context: Urban green spaces provide important ecosystem services, with cultural ecosystem services (CES) playing a significant role in citizens’ lives. Nevertheless, these are often undervalued because it is difficult to quantitatively evaluate individuals’ subjective perceptions of urban space. By examining social media, we can analyze user-created content and identify the values that users demand.
Methods: This study analyzed urban green areas in the inland of Ansan city in Gyeonggi-do, South Korea. Data were collected twice, on October 3, 2017 and October 4, 2018, to verify that the extracted keywords were representative. We extracted keywords from blog posts related to CES and evaluated the possibility of using them as quantitative indicators.
Results: The results indicate that the perceived expression words were different depending on the type of green space. Certain CES such as “exhibit” and “climbing” are affected by green space type. However, it was difficult to identify emotional responses to CES. We found that some words contained double meanings, which made it difficult to evaluate individuals’ perceptions of CES based on the frequency of specific words.
Conclusions: This study demonstrates that social media data on CES greatly extends the type and, especially, the volume and scale of information derived from traditional survey methods. The significance of this study lies in its attempt to quantitatively evaluate the recognition of CES in daily life.
In urban green spaces, ecosystems not only fulfill the ecological function of sustaining the urban environment but also provide an important living area for urban residents. The concept of ecosystem services conceptualizes human environmental interactions through a series of linked components that relate ecological processes to human well-being (MA 2005; Kosanic and Petzold 2020). As evidenced by numerous studies, urban green spaces provide the inhabitants with various ecosystem services (Bolund and Hunhammar 1999; Kremer et al. 2015). For example, regulation services may help urban residents by filtering air pollutants and mitigating flooding (Aram et al. 2019; Grote et al. 2016; Prudencio and Null 2018). Thus, an adequate management of urban ecosystem services is essential to preserve the urban environment (MA 2005; Luederitz et al. 2015; Tzoulas et al. 2007; Zhang and Ramírez 2019).
Cultural ecosystem services (CES) are defined as “all the non-material outputs of ecosystems (biotic and abiotic) that affect physical and mental states of people” (Haines-Young and Potschin 2018). CES are not only the functions provided for humans by the natural ecosystem, but also the interaction between humans and the environment (Oteros-Rozas et al. 2017; Peña et al. 2015). They play an important role in the mental and physical well-being of residents based on their activities in the natural environment (Filho et al. 2020; Plieninger et al. 2013). In the urban ecosystem, the demand for ecosystem services has greatly increased, and CES are important due to the frequency and impact of human use (Ko and Son 2018; Wilkerson et al. 2018). At the city level, urban ecosystems such as parks, mountains, and beaches provide scenic beauty and a refuge from everyday busy life (Chen et al. 2019; Martín-López et al. 2018; Schnell et al. 2019; Subiza-Pérez et al. 2020).
Regulating and provisioning services are the most commonly examined aspects of urban ecosystem services (Luederitz et al. 2015; Martínez-Harms and Balvanera 2012). Moreover, urban green spaces are often evaluated for their economic value more than their cultural value, though the latter is more important to residents than the former (Christie et al. 2012; D’Amato et al. 2016; Spangenberg and Settele 2010). CES are often undervalued compared with other ecosystem services due to the subjective nature of individuals’ perceptions, which presents a challenge for quantitative assessments (Cheng et al. 2019; Lee et al. 2020; Luederitz et al. 2015; Riechers et al. 2016; Stålhammar and Pedersen 2017; Tilliger et al. 2015).
To provide quality CES through limited urban green spaces, it is necessary to understand users’ perception of the CES provided by the city’s green spaces (Andersson et al. 2015; Dickinson and Hobbs 2017). Various ecosystems such as forests, agricultural lands, rivers, lakes, parks, and roadside trees exist in urban areas, and each serves a distinct function. Additionally, CES are not uniformly embedded in all green spaces, but vary based on the type of green space (Plieninger et al. 2013; Ko and Son 2018). Previous studies have shown that differences in CES can exist depending on the characteristics of green areas constituting the urban ecosystem (Beninde et al. 2015; Dade et al. 2020; Threlfall and Kendal 2018).
Socio-cultural approaches are useful for exploring user perception and preferences for CES. Numerous methods have been used to evaluate CES, including questionnaires, photographic analysis, and visitor-employed photography with short comments. The most common method is the perception survey, which has temporal and spatial limitations (Haase et al. 2014; Leetaru et al. 2013). This method is most suitable for small regional-scale studies; however, it has limitations in large-scale regional studies (Bragagnolo et al. 2016; Gosal et al. 2019; Paracchini et al. 2014; Peichao et al. 2019).
As an alternative to the perception survey method, studies have increasingly used big data, which can improve existing empirical research methods and identify a variety of previously unknown values based on user-created content (Gosal et al. 2019; Hu et al. 2015; Mckenzie et al. 2013; Scholte et al. 2015). Spontaneously generated social media data differ from data collected for research in a structured manner (Heikinheimo et al. 2017; See et al. 2016; Sonter et al. 2016). The subjective intervention of the researcher can be reduced through social media data, as its collection does not involve direct contact between the researcher and participants (Chen et al. 2018). Such data may provide information on various users’ experiences and preferences over a brief period, which may contribute to improving other empirical methods (Hausmann et al. 2017; Richards and Friess 2015; Yoshimura and Hiura 2017).
Additionally, social media data include attached metadata, such as a title, descriptions, time stamp, and photo tags; such data contain a considerable amount of information from which interactions with nature or space use can be inferred (Dorwart et al. 2009; Hollenstein and Purves 2010; Johnson et al. 2019; Runge et al. 2020). Some research has confirmed public awareness of ecosystem services and a correlation between landscape features and CES using photographic geographical information (Dunkel 2015; Gosal et al. 2019; Oteros-Rozas et al. 2018; Vigl et al. 2021). Other studies have proposed methods of socio-cultural usage indicators, including tags and other text associated with photos on social media (Barry 2014; Ghermandi et al. 2020; Hamstead et al. 2018; Hollenstein and Purves 2010; J. Retka et al. 2019; Lee et al. 2019). Social media posts provide an information-rich source through which one can not only understand the diverse ways in which people interact with landscapes, but also identify the expressions used to describe CES (Hale et al. 2019; Lee et al. 2020). Some studies have used these newly available data to maximize reliability and validity (Brown et al. 2014; Johnson et al. 2019; Karasov et al. 2020). Previous studies have shown that social network data are similar to data obtained from traditional survey methods (Johnson et al. 2019; Keeler et al. 2015; Richards and Tunçer 2018; Sonter et al. 2016; Wood et al. 2013). Many studies have focused on social media platforms offering easy data access, such as Flickr. However, the efficacy of these data differs based on users and the type of social media platform itself (Heikinheimo et al. 2020; Manikonda et al. 2016; Ruiz-Frau et al. 2020; Tenkanen et al. 2017).
Some studies have discussed the representativeness of social media (Li et al. 2013; Liu et al. 2016) and methodological issues related to the representation of CES by crowdsourced data (Zhang et al. 2020). However, some platforms and social media posts make it difficult to understand users’ perception of CES and what they did in the urban green space because photographs and tags do not provide those data (Hale et al. 2019; Richards and Tunçer 2018).
Text, which is unstructured data, accounts for 80% of the data in the world (Chakraborty and Pagolu 2014; Gantz and Reinsel 2012). However, existing studies have used short texts constrained by the length limits of social network services (SNSs) (Di Minin et al. 2015; Figueroa-Alfaro and Tang 2017; Johnson et al. 2019; Tenkanen et al. 2017). To our knowledge, no existing study has taken a data-centric approach to quantifying CES using long texts that are free of such character limits. Long user-written texts can contain important evaluations of products or services, and in some cases provide insightful and important information (Jo et al. 2018; Johnson et al. 2019).
In this study, we examined the relationship between the meanings of words contained in social media posts through natural language processing, to understand user perceptions of CES provided by urban green spaces. To do this, we first extracted keywords representing CES through text network analysis and identified the characteristics of CES for each type of urban green space. This was followed by analyzing the relationship between the keywords and greenery types derived from text mining. We then aimed to determine the possibility of distinguishing between differences in cultural services based on the type of urban green area, through atypical expressions.
This study analyzed urban green areas in the inland of Ansan city in Gyeonggi-do, South Korea. Ansan was the country’s first planned city. A 1977 development plan turned Ansan into an industrial eco-friendly city by separating the industrial area from residential and commercial zones. With a total area of 154.23 km² and a population of 740,000, Ansan became what is commonly classified as a mid-sized city in 2017. Ansan is bordered by the Yellow Sea to the west and mountains to the east. The areas designated for agricultural and industrial activities are concentrated in the city’s periphery. The inland area is 109.34 km² and is mostly smooth flatland, except for mountainous regions. The city includes several types of urban green spaces integrated into historical and cultural sites (Fig. 1).
As of 2017, the urban forest area per citizen was 9.02 m², which is relatively high compared to other municipalities in Korea. The parks are quite evenly distributed due to the planned nature of the city, and green spaces are all equally accessible, making Ansan suitable for this initial big data study.
This study collected and evaluated online text data regarding users’ perceptions of urban green spaces in the inland area of Ansan city. The text data were collected from public blogs obtained through a widely used Korean search engine, Naver (www.naver.com). Referred to as “the Google of South Korea,” Naver handled 74.7% of all web searches in the country and had 42 million enrolled users as of September 2017 (https://dbpedia.org/page/Naver). It has a blog app that enables users to easily publish user-generated content, including posts, photos, product reviews, or location information. A blog contains individual daily information based on experiences, which is then shared and spread to a large number of users through a portal site. Therefore, it can be used as a tool to understand the user’s individual experience.
The data were extracted with a Python crawler through the API, which returned content in the order of the portal site’s engine algorithm, with top-ranked content first in order of relevance. We collected only data that users had made public.
We collected data twice, on October 3, 2017 and October 4, 2018, to verify that the extracted keywords were representative. The datasets were then manually filtered using Python and Excel to retain only relevant information for further analysis. We removed duplicate posts, posts made outside the study area, and data posted before 2007. After comparing the keywords extracted from the collected data in 2017 and 2018, the 50 most common keywords were extracted.
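The filtering and comparison steps described above can be sketched as follows. This is a minimal illustration, not the authors' script: the record fields (`url`, `posted`, `in_study_area`), the example URLs, and the rule for ranking shared keywords (combined frequency across the two collections) are all assumptions made for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical post records; the field names are assumptions, not the authors' schema.
posts = [
    {"url": "blog.example/a/1", "posted": date(2016, 5, 1), "in_study_area": True},
    {"url": "blog.example/a/1", "posted": date(2016, 5, 1), "in_study_area": True},   # duplicate
    {"url": "blog.example/b/7", "posted": date(2005, 3, 9), "in_study_area": True},   # before 2007
    {"url": "blog.example/c/2", "posted": date(2017, 9, 2), "in_study_area": False},  # outside area
    {"url": "blog.example/d/4", "posted": date(2017, 4, 8), "in_study_area": True},
]


def filter_posts(records, earliest=date(2007, 1, 1)):
    """Drop duplicate URLs, posts dated before `earliest`, and posts outside the study area."""
    seen = set()
    kept = []
    for record in records:
        if record["url"] in seen:
            continue  # duplicate post
        seen.add(record["url"])
        if record["posted"] < earliest or not record["in_study_area"]:
            continue
        kept.append(record)
    return kept


def common_top_keywords(counts_a, counts_b, n=50):
    """Return up to n keywords present in both collections, ranked by combined frequency.

    Ties are broken alphabetically so the result is deterministic.
    """
    shared = set(counts_a) & set(counts_b)
    ranked = sorted(shared, key=lambda w: (-(counts_a[w] + counts_b[w]), w))
    return ranked[:n]


kept_posts = filter_posts(posts)
top_shared = common_top_keywords(
    Counter({"way": 5, "time": 3, "festival": 1}),
    Counter({"way": 4, "time": 6, "lake": 2}),
    n=2,
)
```

Keywords that occur in only one collection are dropped before ranking, which is one plausible reading of using the second collection to check representativeness.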
The keyword search was performed using main place names in Ansan (Gosal et al. 2019; Paracchini et al. 2014). Urban green spaces in Ansan, including forests, rivers, parks, and public cultural education facilities (PCEFs), were selected as search targets and examined via preliminary surveys. Places for which search results related to the space’s use were insufficient, owing to advertisements unrelated to actual use or a lack of content on the topic, were excluded during the preliminary surveys. A total of 17 search keywords related to green spaces were found, including three for forests, two for rivers, eight for parks, and four for greenery in PCEFs. Fig. 2 presents this study’s workflow.
The blog posts used in this study contain information written in users’ natural language. Natural language exhibits lexical ambiguity, in which one word has various meanings (Agirre and Edmonds 2007). As the true meaning of natural language is information contained in the everyday language of those who speak it, we must examine frequently used words, the order in which they are used, and which words appear together. Unstructured atypical text analysis requires an understanding of natural language processing and semantic complexity (Loebner 2002). A social network analysis is based on graph theory, which posits that a system of connected elements can be defined as a network (Freeman 1979). On the network, keywords are defined as nodes, and relationships between keywords are defined as links. The semantic analysis was performed using NetMiner 4.4, a social network analysis program (Fig. 3).
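To make the node and link definitions concrete, the sketch below builds a keyword co-occurrence network from tokenized posts. It assumes that a link is counted once for each post in which two keywords appear together; the study does not state the co-occurrence window used in the network analysis software, so this rule and the toy documents are illustrative only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_links(documents):
    """Count, for each unordered keyword pair, the number of documents containing both.

    Keywords are the nodes; each counted pair is a weighted link.
    """
    links = Counter()
    for tokens in documents:
        # sorted(set(...)) gives every pair one canonical (alphabetical) orientation
        for a, b in combinations(sorted(set(tokens)), 2):
            links[(a, b)] += 1
    return links


# Toy tokenized posts (hypothetical content)
docs = [
    ["course", "climbing", "way"],
    ["way", "strolling", "picture"],
    ["climbing", "course", "picture"],
]
links = cooccurrence_links(docs)
```

Here the pair ("climbing", "course") receives weight 2 because the two keywords share two posts, while pairs that never share a post have no link.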
The collected data were refined via repetitive preprocessing of unstructured text data. Although the characteristics of long texts differ depending on the author, in general, various keywords are combined to describe a single topic. Moreover, although detailed opinions can be found if all words are examined, words with a low frequency tend to be less useful (Luhn 1958). To clearly examine the relationship between keywords, it is necessary to simplify the sentences. Thus, unnecessary words were removed using a stop-word dictionary to simplify the relationship between terms. For example, “then,” “now,” “so to speak,” and “to summarize” are stop words. In this study, we also excluded words such as “got” (“some space” in Korean) and “e-got” (“here” in Korean). We examined the top keywords that were common nouns, excluding the names of administrative districts, search terms, and proper nouns.
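A minimal sketch of this refinement step is shown below. It assumes the posts are already tokenized (the morphological analysis needed for Korean text is outside the scope of the example), and the `min_count` threshold is a hypothetical parameter standing in for the removal of low-frequency words. The stop words and excluded place names are taken from the examples given in the text.

```python
from collections import Counter

# Stop words quoted in the text; a real dictionary would be much larger.
STOP_WORDS = {"then", "now", "so to speak", "to summarize"}
# Administrative district names excluded from the keyword list.
EXCLUDED = {"Ansansi", "Gyeonggido"}


def refine(tokens, stop_words=STOP_WORDS, excluded=EXCLUDED):
    """Remove stop words and excluded names from one tokenized post."""
    blocked = stop_words | excluded
    return [t for t in tokens if t not in blocked]


def keyword_frequencies(documents, min_count=2):
    """Count refined keywords over all posts and drop those below min_count."""
    counts = Counter(t for doc in documents for t in refine(doc))
    return {word: n for word, n in counts.items() if n >= min_count}


# Toy tokenized posts (hypothetical content)
docs = [
    ["now", "way", "Ansansi", "strolling"],
    ["then", "way", "picture"],
    ["way", "strolling", "to summarize"],
]
frequencies = keyword_frequencies(docs)
```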
Centrality and community analyses were conducted, and representative keywords were extracted through a semantic analysis of the collected data. A total of 530 words/phrases were extracted from both sets of data (Table 1).
Type | Search keywords (place names) | Refined words (n) |
---|---|---|
Forest (3) | Mt. Gwangdeok, Mt. Nabong, Suambong Peak | 137 |
River (2) | Ansancheon Stream, Hwajeongcheon Stream | 66 |
Parks (8) | Nojeokbong Park, Sangnoksu Park, Seongho Park, Ansan Park, Central Park, Lake Park, Wadong Park, Hwarang Amusement Park | 488 |
Greenery in PCEF (4) | Gyeonggi Museum of Modern Art, Danwon Art Museum, Seongho Memorial Hall, Ansan Arts Center | 256 |
Greenery in PCEF: Greenery in public cultural education facility
Each type of data was refined using the same preprocessing, and because of the large number of nodes, the analysis was performed with respect to the parent node for visualization. During this process, a few nodes had such high frequency and centrality values that the network was skewed toward them, making it impossible to analyze all types in the same manner. We therefore removed the top-ranked, high-value nodes that were connected to each node before the analysis. Nodes were removed sequentially, starting from the top-ranked node, until the point at which the cluster scattered. To interpret keyword meaning, we examined the source text containing each word through an ego-network analysis of the same preprocessed nodes. We focused on degree centrality and eigenvector centrality to interpret the structure of the networks. A network analysis provides insights into system properties and identifies critical nodes with high centrality (Roth and Cointet 2010; Topirceanu et al. 2018). Because the centrality values derived from the analysis are standardized, the importance of a word can be compared across networks.
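The two centrality measures can be illustrated with a short, dependency-free sketch. It is not the procedure of the network analysis software used in the study: the normalization of degree by n − 1, the power-iteration settings, and the toy network are assumptions chosen for the example.

```python
from math import sqrt


def degree_centrality(adj):
    """Degree of each node divided by (n - 1), so values are comparable across networks."""
    n = len(adj)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}


def eigenvector_centrality(adj, iterations=200, tol=1e-9):
    """Power iteration on A + I for an undirected, unweighted network.

    Adding the identity leaves the leading eigenvector of A unchanged but
    prevents the oscillation that plain power iteration shows on bipartite
    graphs such as trees.
    """
    nodes = list(adj)
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in nodes}
        norm = sqrt(sum(x * x for x in new.values()))
        new = {v: x / norm for v, x in new.items()}
        converged = max(abs(new[v] - score[v]) for v in nodes) < tol
        score = new
        if converged:
            break
    return score


# Toy keyword network (hypothetical links), stored as symmetric adjacency sets
adj = {
    "way": {"strolling", "picture", "course"},
    "strolling": {"way"},
    "picture": {"way"},
    "course": {"way", "climbing"},
    "climbing": {"course"},
}
deg = degree_centrality(adj)
eig = eigenvector_centrality(adj)
```

In this toy network "way" has the highest value on both measures, while "course" ranks above "climbing" on eigenvector centrality because it is attached to the best-connected node.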
Next, because the number of search keywords differed by type, we refined each network under the same conditions. For “Forest,” 137 words remained after preprocessing; a component analysis that excluded the top 5 nodes (excluded keywords included: Ansansi, climbing) yielded 51 words. For “Rivers,” 66 words remained after refinement; excluding the top 3 nodes (excluded keywords included: Ansansi) yielded 27 nodes. For “Park,” for which eight search keywords were used in the collection, we analyzed the components of the 488 refined words after excluding the top 10 nodes (excluded keywords included: Park, Nojeokbong, Citizen, Ansansi, Gyeonggido). A cohesion analysis was then performed several times around the ancestor node, and the final 44 words were extracted for visualization and content analysis. For “Greenery” in public cultural institutions, a component analysis was performed on all nodes except the top 8 (excluded keywords: culture, exhibition, Ansansi, Gyeonggido) of the 256 words extracted after preprocessing. Because most nodes were connected, 51 words were finally extracted. All the words used in this study were translated using the Korean–English standard dictionary (https://en.dict.naver.com/#/main), with the Korean given in parentheses.
Rather than assigning words according to a predefined typology, we sought to identify word attribute types without constraining the meaning of CES in advance. Representative words were classified by attribute (place, activity, object, and image) based on the basic meaning of each keyword. This approach was inspired by the three components (forms, practices, relationships) of Stephenson (2008) and Bieling et al. (2014).
A library was built by creating a taxonomy of expression words related to cultural services mentioned in all types and clustering similar content by word attributes. Grouping keywords with similar attributes in a bottom-up manner resulted in 39 sub-categories. Regrouping keywords with similar properties yielded 12 intermediate classifications, which were broadly classified into place, activity, object, and image. We classified words as actions if they became verbs by adding “do.” If a word denoted a visible object or subject, it was grouped as an object; the same approach was applied to words denoting space and words expressing emotions. For example, seeing “autumn leaves” is a recreational and aesthetic CES value; “autumn leaves” is the object and “visit” is the activity. In the phrase “along the riverside bicycle path,” the activity of enjoying riding is for recreation and health promotion; the keyword “bicycle path” is the place and “riding” is the activity. As the text also includes content beyond the scope of CES, only the main keywords relevant to CES were examined.
Based on this, the relationship between CES and the semantic structure of expression was analyzed. First, we examined the relationship between keywords and CES by type of urban green space. Second, we looked at cultural services that appeared regardless of the type of green space. To simplify the complex networks, we transformed each network to one mode and applied a pathfinder network, retaining only important links with high similarity (weight). Links with low similarity were deleted so that only keywords with high similarity remained. Through iterative work, we extracted word pairs with strong links, that is, co-occurring word relationships that held regardless of the specific type. Many words with activity, object, and place properties were connected to one central keyword to form a word network. Third, multidimensional scaling (MDS) and correlation analysis were conducted on the overlap of cultural services between green space types. To limit duplication of CES as much as possible, we performed the MDS analysis only on the activity and emotion words in the CES library. Moreover, we analyzed the correlations between type categories by calculating the Pearson correlation coefficient between each pair of types for both CES words (nodes) and links.
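The correlation step can be sketched as below. The example assumes each green space type is represented by a vector of per-keyword values (for instance, standardized centrality) over a shared keyword list; the profile values are toy numbers invented for illustration and do not reproduce the study's results, and significance testing is omitted.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy per-keyword profiles (hypothetical values), one vector per green space type
profiles = {
    "Forest": [0.25, 0.02, 0.01, 0.15],
    "River": [0.01, 0.12, 0.10, 0.03],
    "Park": [0.02, 0.09, 0.11, 0.04],
}

# Lower-triangular correlation matrix as a dictionary keyed by type pair
correlations = {
    (a, b): pearson(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}
```

Each entry of `correlations` corresponds to one cell of a lower-triangular matrix of type pairs.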
During the data creation period of 10 years and 10 months, 5,536 blog users uploaded a total of 9,264 posts. The posts gradually increased from 143 in 2007 and 222 in 2008 to 1,552 in 2016 and 3,353 in 2017. In descending order, the months with the most content were September (1,178), August (1,098), April (1,036), May (1,020), June (922), July (871), and October (717). In terms of season, most of the posts appeared between spring and autumn, which could be because the green space is an outdoor space affected by weather. In order to check the bias of the collected data, we analyzed the uploaded content for each user based on the domain of the collected URL. A total of 5,536 users generated an average of 1.67 posts per person. Examining the number of uploaded posts per user, 3,362 users uploaded only one post during the period. The maximum number of posts uploaded by an individual user was 21, and 3 users created 20 posts.
The analysis results for major keywords by ecosystem type are shown in Table 2. We found differences in the main content, expressive vocabulary, CES-related activities, main space, and usage pattern through the analysis of content by green space type.
Forests were highly associated with health and aesthetic values centered around climbing. The top 30 words were related to mountaineering, such as “climbing,” “descending,” “traversing,” and “top of a mountain.” The places where cultural services were supplied could be identified through place words such as “entrances,” “forest baths,” “Dullegil,” “octagonal pavilion,” “stairs,” and “shelters connected around the course.” In addition, we understood perceptions of aesthetic value through words such as “picture,” “photos,” and “views,” and perceptions of spiritual value through words such as “leisure” and “relaxation.” Rivers were associated with high health values centered on “biking” and “walking.” Unlike in other types of urban greenery, words such as “tulips” and “maple” also appeared and were connected to festivals. These word pairs carry recreational, heritage, and aesthetic value meanings rather than a single cultural service.
Parks were more highly associated with recreation activities than the other types of green spaces. “Strolling” was the most frequent and representative activity of the park. In addition, cherry blossom viewing, water viewing, and biking were also major recreational activities of CES.
The activity keywords “exhibition,” “holding,” “participation,” and “artist” are connected with the keywords “exhibitions,” “events,” “works,” and “sculptures,” and thus carry the meaning of cultural heritage. Aesthetic value can also be confirmed through the connection between the activity keywords “photography,” “snapping,” and “appreciation” and the picture and object keywords “cherry blossoms,” “autumn leaves,” “roses,” “tulips,” “landscapes,” and “trees,” which are linked mainly to festivals. In addition, “experience” carries educational value as it is connected to “program,” “play,” “insect,” “ecology,” “nature,” and “green,” which are keywords of experiential activity. Although there were differences depending on the parks’ character and facilities, various cultural services are clearly being provided.
Greenery in PCEF exhibited distinct spatial characteristics that were generally associated with values in cultural heritage and education through the activity words such as art, experience, and drawing. However, it is possible to grasp the recreational value and the meaning of recreation through the words “outdoor greenery-sculpture,” “playground-experience-garden,” “lake,” “walk,” “date,” and “outing.”
No | Word | Whole | Forest* | River* | Park* | Greenery in PCEF* |
---|---|---|---|---|---|---|
1 | Way (길) | 0.494 | 0.151 | 0.146 | 0.109 | 0.075 |
2 | Time (시간) | 0.453 | 0.160 | 0.082 | 0.085 | 0.100 |
3 | Course (코스) | 0.408 | 0.231 | 0.028 | 0.041 | 0.017 |
4 | Climbing (산행) | 0.374 | 0.256 | 0.001 | 0.010 | 0.000 |
5 | Top of a mountain (정상) | 0.346 | 0.229 | 0.001 | 0.018 | 0.000 |
6 | House (집) | 0.331 | 0.085 | 0.093 | 0.073 | 0.090 |
7 | Citizen (시민) | 0.317 | 0.013 | 0.043 | 0.186 | 0.017 |
8 | Picture (사진) | 0.302 | 0.048 | 0.105 | 0.094 | 0.069 |
9 | Parking Lot (주차장) | 0.297 | 0.140 | 0.004 | 0.044 | 0.056 |
10 | Weather (날씨) | 0.278 | 0.081 | 0.065 | 0.053 | 0.085 |
11 | Strolling (산책) | 0.259 | 0.015 | 0.117 | 0.094 | 0.062 |
12 | Kid (아이) | 0.257 | 0.018 | 0.046 | 0.087 | 0.133 |
13 | Art gallery (미술관) | 0.250 | 0.000 | 0.001 | 0.005 | 0.390 |
14 | Place (장소) | 0.239 | 0.050 | 0.056 | 0.058 | 0.090 |
15 | Thought (생각) | 0.239 | 0.050 | 0.051 | 0.058 | 0.094 |
16 | Person (사람) | 0.236 | 0.048 | 0.086 | 0.063 | 0.051 |
17 | Street (거리) | 0.229 | 0.059 | 0.037 | 0.073 | 0.038 |
18 | Mountain climbing (등산) | 0.227 | 0.151 | 0.001 | 0.010 | 0.001 |
19 | Exhibition (전시) | 0.196 | 0.003 | 0.003 | 0.013 | 0.277 |
20 | Weekend (주말) | 0.186 | 0.038 | 0.034 | 0.043 | 0.086 |
21 | Arrival (도착) | 0.176 | 0.068 | 0.017 | 0.030 | 0.047 |
22 | Start (시작) | 0.176 | 0.060 | 0.046 | 0.036 | 0.025 |
23 | Nearby (근처) | 0.169 | 0.028 | 0.042 | 0.049 | 0.060 |
24 | Shape (모습) | 0.162 | 0.030 | 0.046 | 0.056 | 0.024 |
25 | Waterfall (폭포) | 0.162 | 0.001 | 0.005 | 0.109 | 0.007 |
26 | Use (이용) | 0.161 | 0.033 | 0.032 | 0.057 | 0.028 |
27 | Entrance (입구) | 0.155 | 0.061 | 0.009 | 0.022 | 0.054 |
28 | Family (가족) | 0.147 | 0.021 | 0.020 | 0.054 | 0.048 |
29 | Culture (문화) | 0.142 | 0.006 | 0.008 | 0.058 | 0.076 |
30 | Created (조성) | 0.142 | 0.006 | 0.055 | 0.070 | 0.006 |
Greenery in PCEF: Greenery in public cultural education facility
The pathfinder network analysis showed overlapping linked words, which suggests that the same CES benefits can be obtained in different types of ecosystems (Fig. 3). Four words have high centrality: way, strolling, exhibit, and climbing. The words way and strolling are closely connected with each other, while exhibit and climbing are connected through the mediating words house and course, respectively, forming separate clusters. The analysis was conducted without considering the type of green space, yet exhibit and climbing activities constitute separate networks. A specific CES can be identified through a word, such as climbing, that is associated with only one type (Forest) and has no other connecting node. In other words, we can infer which greenery activities are possible from the perceived word expressions alone, without spatial data. Our results indicate that the expression of perceived CES differs by green space type, as mentioned in our previous study (Ko and Son 2018).
In the process of network reduction, keywords with low frequency and link values were removed, leaving only “think” and “relaxation” among the emotion words of CES. The word “way” mentioned here can also be ambiguous, meaning a way of doing something (how to do, where to go for life), such as how to relax, rather than simply a “road.” This result means that simple words (e.g., social media tags) should be used with caution as CES indicators.
The MDS analysis showed that the explanatory power was 67.4% (Fig. 4). The x-axis represents the degree of activity, ranging from passive participation to positive activity. The y-axis represents the character of the activity, ranging from dynamic activity in nature to static activity in an artificial space. Activities that can be done individually and those done together were separated. The use of established facilities and active participation or physical movement each showed a distinct direction.
At the origin, multiple keywords of social relationships and recreation were superimposed on the same coordinates. Words such as inline, skate, gatherings, outing, and flower viewing were nested in the first quadrant (x>0, y>0), and concerts, festivals, recital, drawing, learning, and ecotourism in the fourth quadrant (x>0, y<0). Forests and greenery in PCEF can be identified by activity, whereas rivers and parks are mixed and not easy to distinguish clearly. Many words derived from rivers overlap with those from parks and indicate similar CES, presumably because the park and the river are spatially continuous, connected by the promenade and the bicycle path.
Correlation analysis can help in quantitatively checking the differences by type of green space (Table 3). We found that the Park category was significantly correlated with the other types and had relatively higher correlation values than the other categories (Rivers, Greenery in PCEF, and Forest). However, the correlation between Park and Forest was significant in the node-based analysis but not in the link-based analysis. There were no significant correlations among the green space types other than Park.
Type | Greenery in PCEF | Forest | Park | River |
---|---|---|---|---|
Forest | 0.0218 | |||
Park | 0.2095* | 0.1607* | ||
River | 0.0409 | 0.1351 | 0.3731** | |
* p < 0.05; ** p < 0.01; Bold text indicates strong positive correlations (≥0.10) |
Type | Greenery in PCEF | Forest | Park | River |
---|---|---|---|---|
Forest | 0.017 | |||
Park | 0.151* | 0.138 | ||
River | 0.029 | 0.130 | 0.348** | |
* p < 0.05; ** p < 0.01; Significant correlations (>0.90) are bolded |
Greenery in PCEF: Greenery in public cultural education facility
In this study, we applied an inductive free-listing approach to explore people’s subjective perception of CES, focusing on the words with which people express their experiences of using urban greenery (Bieling et al. 2014; Fagerholm et al. 2012). Although this research found that various words used to express opinions about CES can be classified based on the meaning of a specific keyword, we found that one keyword can overlap with one or more CES and have multiple meanings. For example, “flower viewing,” “outing,” and other physical activities can overlap with social activities such as “meeting” or “gathering.” Trees, cherry blossoms, and autumn leaves form part of the natural environment and are associated with aesthetic value; however, they also play a role in cultural heritage values because they are related to festivals and events. Therefore, in this study, the library was created by assigning the attributes of words after completing word analysis without assigning words to CES in advance. The characteristics of CES were examined through the created library. The association of certain types of CES is unsurprising given that many ecosystem service types (including CES) are bundled (Ament et al. 2017; Dade et al. 2019; Martín-López et al. 2012; Plieninger et al. 2013; Raudsepp-Hearne et al. 2010). This tendency can be understood as a demonstration of the interlinked, inseparable nature of different CES (Daniel et al. 2012; Plieninger and Bieling 2012).
In many studies, expression words were assigned to CES types and then evaluated by frequency or weight between words (Retka et al. 2019; Johnson et al. 2019; Karasov et al. 2020; Dai et al. 2019). However, evaluating words simply based on frequency may result in errors. Evaluating text from social media platforms is complicated because meanings can be ambiguous and difficult to quantify (Hearst 1999; Hirons et al. 2016). Keywords and frequencies derived from text mining analysis provide only relative information regarding CES (Ko and Son 2018). The frequency of occurrence has no absolute meaning in the current study.
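The point that frequencies carry only relative information can be shown with a small sketch. The token lists and the `relative_frequencies` helper below are assumptions made for illustration and do not reproduce the text mining pipeline used in this study.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Return each word's share of all tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


# Two made-up corpora with the same word mix but different sizes.
small_corpus = ["walk", "walk", "festival", "home"]
large_corpus = small_corpus * 10

# Raw counts differ by a factor of ten between the two corpora.
raw_small = Counter(small_corpus)
raw_large = Counter(large_corpus)

# Relative shares are identical, so only the proportions are comparable.
share_small = relative_frequencies(small_corpus)
share_large = relative_frequencies(large_corpus)
```

In this toy case the raw count of a word depends on how much text was collected, whereas its share of the corpus does not, so counts from differently sized collections should be compared as proportions rather than as absolute values.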
As CES expression words can overlap in meaning, the method of assigning words to each CES and evaluating them is not appropriate, and more research on expression words is needed. Social media-based indicators should be carefully considered when used to evaluate CES, as there is uncertainty regarding lexical representation. In other words, there is no clear answer regarding what types of CES people use and benefit from. As more usable big data are created, it is necessary to develop a method that can effectively use such data in studies. To efficiently use big data to evaluate words that denote user perceptions of CES, we require additional research into word-embedding and ontology methodologies that can evaluate expressions and word-specific ecosystem services. In natural language processing, word embedding is an effective method for extracting semantic and syntactic information from a large corpus (Lai et al. 2016). An index should be established to understand the value and meaning of individual words, word pairs, and the relationship between words related to CES. Although it is not easy to develop indicators using natural language (Bieling 2014), we expect that using visual and physical indicators in conjunction to assess CES will enable researchers to more clearly identify the nonmaterial benefits of CES.
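As a rough illustration of how vector representations of words can capture relationships between CES-related terms, the sketch below builds count-based co-occurrence vectors and compares them with cosine similarity. This is a simplified stand-in for a trained word-embedding model, not the method of Lai et al. (2016) or of this study; the documents and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(documents):
    """Build one sparse count vector per word from within-document co-occurrence."""
    vectors = defaultdict(Counter)
    for tokens in documents:
        unique = set(tokens)
        for word in unique:
            for other in unique:
                if other != word:
                    vectors[word][other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up tokenized posts, for illustration only.
docs = [
    ["park", "walk", "family"],
    ["forest", "climbing", "walk"],
    ["park", "picnic", "family"],
    ["forest", "climbing", "temple"],
]
vectors = cooccurrence_vectors(docs)
sim_park_picnic = cosine(vectors["park"], vectors["picnic"])
sim_park_climbing = cosine(vectors["park"], vectors["climbing"])
print(sim_park_picnic > sim_park_climbing)
```

In this toy corpus, "park" comes out closer to "picnic" than to "climbing" because the first pair shares more context words. A trained embedding model would learn such relationships from a much larger corpus, and an index of the kind proposed here would still need to map them onto CES types.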
In this study, different CES were provided by different types of urban green space; some were clearly distinguished based on user text, whereas others were difficult to distinguish by text alone. Although an inductive method was used to find as many expression words for cultural services as possible, it was difficult to identify cultural services by keywords other than activities. This study revealed the importance of distinguishing CES that can be evaluated through text indicators from those that cannot. We were unable to grasp other elements of CES, such as spiritual and religious values, because it was uncommon to find religious views discussed in blog posts. We analyzed emotion words separately, but their frequency and link values were low; therefore, very few emotion words remained among the main keywords. The inability to examine all CES expressions is a limitation of this study.
It is necessary to distinguish between the services that can be evaluated using indicators derived through the objectification of cognition and services, and those that are difficult to assess because of individual differences. Specifically, services can be evaluated through the development of indicators such as user activity, words related to activities, expressive vocabulary for service recognition, and emotional expressions of positive and negative experiences in a space (Bieling 2014). It is particularly difficult to evaluate artistic inspiration, spirituality, and religion through a text index that views space as a CES, because these depend completely on a subjective point of view (Allan et al. 2015; Retka et al. 2019; Richards and Friess 2015; Zhang et al. 2020). Allan et al. (2015) noted that the CES values of knowledge systems, education, and cultural heritage are hard to find through social interactions. Retka et al. (2019) noted that spirituality and religion may be considered disrespectful to photograph and share on public social media accounts, or may not be regarded as worthy of documentation. Additionally, even if the direct keyword “education” is not used, nodes such as nature, ecology, and experience indicate that educational value can still be acquired by experiencing the natural environment directly or indirectly.
CES are a specific type of ecosystem service; they are more important for users’ activities and experiences in the natural environment than natural ecological functions, can provide multiple services, and are understood as services when people recognize and use their value (Bertram and Rehdanz 2015; Termorshuizen and Opdam 2009). Structural factors such as the quality and management of facilities are important for understanding users’ activities; however, they are not a core element of CES (Voigt et al. 2014). It is more important to satisfy CES demands through individual experiences and to accurately measure them (Costanza et al. 2017; Dou et al. 2020).
The materials in this study were not prepared with ecosystem service evaluation in mind. Rather, people expressed their perceptions freely in their own words. As CES assessment requires public participation, social media could be the best path to “citizen science” (Chen et al. 2018). However, how to handle ethical issues and what assessment methods to use remain problems to be solved.
SNS platforms such as Instagram, Twitter, and Weibo are visited more often in daily life than real sightseeing events (Figueroa-Alfaro and Tang 2017; Heikinheimo et al. 2020; Vieira et al. 2018). We investigated whether big data can be used to replace survey or interview methods. Through text analysis, we identified the phrases “the mountain behind the house” and “near the house.” Moreover, the word “home” appeared with high frequency. We therefore presumed that writing about daily life was common in the form of thought blogs. Thus, by analyzing big data, we were able to understand daily activities, which are difficult to understand through typical survey methods.
The CES were determined by the urban and geographical landscape in Korea. The frequent use of keywords related to mountain climbing and forests is a result of the easily accessible high and low forests in Ansan. Additionally, due to Ansan’s geographical characteristics, urban parks are often located on low hills rather than in flat areas. CES located in forests may also be perceived as having cultural heritage, spiritual value, and religious value, due to the influence of temples, cemeteries, or totems found within (Ko and Son 2018). We can predict that other cities in the country may reveal equivalent results because of the topographical characteristics of Korean cities. In addition, because each region has indigenous religions or particular religious sites such as Seonangdangs (shrines to the village deity), the spiritual and religious values described for CES abroad are recognized differently here. As noted in previous studies, CES are often associated with indigenous and native languages (Schnegg et al. 2014; Wartmann and Purves 2018). Thus, due to the differences in the natural environment and cultural history of the region, ecosystem service assessment and evaluation criteria are needed for each country.
This study examined how CES are perceived and what keywords define urban green spaces through long texts written in everyday life. It confirmed that an attempt to quantify subjective perception is possible through text mining. Despite the limitation of working with a small amount of data, this bias was useful here because we were particularly interested in examining how words interact with CES in text data that many people use and reproduce for information retrieval.
Our work demonstrated that social media data on CES greatly extend the type and, especially, the volume and scale of information that can be derived from traditional survey methods (Retka et al. 2019; Zhang et al. 2020). However, the data from this study were limited, as only blog texts from social media platforms were analyzed. Nevertheless, as the number of users accessing social media from their smartphones increases and the user base continues to grow, studies relying on social media data could become more representative by including different age groups.
It will be very useful if the meanings of words and the subtle differences between them can be further subdivided by utilizing more texts generated in the post-corona era (an era where non-face-to-face methods are becoming more common). If the meaning of an expression can be derived and used as a relative indicator of CES, it is likely that the qualitative aspects, which are currently left out of evaluations of ecosystem services, can be analyzed.
We will continue to explore this area in subsequent studies. It is necessary to build a dictionary of CES that can be used as a lexicon and index like that of Vigl et al. (2021), which used text analysis by processing the entire Wikipedia ontology. If an ontology system can be developed that automatically establishes the concepts between words according to the type of ecosystem service, two things become possible. First, CES can be quantitatively evaluated through the relationships between keywords. Second, if the keyword network and natural asset information are linked, an integrated semantic information retrieval system can be implemented.
Funding
The author declares that this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Competing Interests
The author has no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Author Contributions
This research originated from a doctoral dissertation, with much of it performed by Ha-Jung Ko. The advice of Professors Seong-Woo Jeon, Dong-Kun Lee, and Yong-Hoon Son was sought and implemented during the doctoral thesis review process.
Data availability statement
The data used for this study are available from the author on reasonable request.