Using Mobile Phone Data to Understand the Demographic Characteristics and Behavioral Patterns of Park Visitors in a Megacity, Beijing, China

Urban parks are important places that allow urban residents to experience nature but are also associated with the risk of exposure to contaminated soil. Local research on demographic characteristics and population behavior patterns is the basis of soil exposure assessment. The objectives of this study were to determine park visitors' demographic characteristics and behavioral patterns.


Introduction
With the rapid development of urbanization, 55% of the world's population lived in urban areas in 2018 (United Nations, 2018). Urban space is typically characterized by artificial environments, architecture, traffic, and air pollution, so urban residents have few opportunities to experience nature (Orioli et al., 2019). As important public resources, urban parks are valued by urban residents because of their good natural environments and their benefits to human health, quality of life, and social interaction (Fischer et al., 2018; Kazmierczak, 2013). Nevertheless, urban park soils are frequently enriched in contaminants owing to urbanization and industrialization (Frimpong and Koranteng, 2019; Khan et al., 2016), and people may be exposed to contaminated soil through ingestion, dermal absorption, and inhalation while engaging in various activities in urban parks (Fig. S1). Consequently, the health risks associated with soil pollution in urban parks have attracted widespread attention (Huang et al., 2021; Penteado et al., 2021; Qu et al., 2020), and the Chinese government published the first risk control standard for soil contamination of park land in 2018.
Human activity patterns are important for exposure modeling (Brandon and Price, 2020; Wu et al., 2011). The selection of sensitive receptors and the study of population behavior patterns are the premise of fine soil exposure assessment and health risk assessment. Different groups of people have different recreational activities and levels of contact with the soil in the park, so it is vital to select appropriate sensitive receptors when conducting soil exposure assessment. Identifying the demographic characteristics of park users is the first step in selecting sensitive receptors. In addition, the population behavior pattern significantly affects the soil exposure assessment results. For example, the average soil ingestion rate, a key parameter for soil exposure assessment, is correlated with the amount of time spent outdoors (CL:AIRE, 2014). Therefore, local park visiting patterns and the demographic characteristics and behavioral patterns of urban park visitors are of great importance to the study of the soil exposure levels of park visitors. Barrio-Parra et al. (2019) also emphasized the importance of characterizing urban gardeners' local activity patterns when assessing human health risks in urban garden scenarios. Moreover, Wu et al. (2011) suggested that changes in human activity patterns should be taken into account in exposure modeling.
Until now, traditional research methods, such as questionnaire surveys, field observation, and interviews, have been widely used to study the behavioral patterns and demographic characteristics of park visitors (Bertram et al., 2017; Fischer et al., 2018; Kabisch et al., 2021; Subramanian and Jana, 2018). However, it is well known that traditional methods are typically subjective, costly, and laborious (Li et al., 2019). In recent years, the use of big data, such as social media and mobile phone data, has gradually gained traction as a new method of studying park visitation patterns in view of the limitations of traditional methods and the rapid development of computer science- and internet-related techniques (Chen et al., 2018; Li et al., 2018; Lin et al., 2021; Song et al., 2020a). Many studies using big data focus on park accessibility (Guo et al., 2019a; Hamstead et al., 2018), the spatial and temporal distribution of urban park users (Chen et al., 2018; Liang and Zhang, 2021; Ullah et al., 2020), and the factors influencing park usage (Fan et al., 2021; Lyu and Zhang, 2019), with these studies aiming to provide references for the planning and construction of urban parks. Although some previous studies based on questionnaires and interviews have revealed that park visit patterns are influenced by the visitors' age and gender, unfortunately, such visitor data cannot be easily obtained (Song et al., 2020b; Ullah et al., 2019).
Consequently, studies that use big data methods to analyze the demographic characteristics of urban park visitors and the behavioral patterns of different groups of people in parks are rare, which limits the ability to obtain more objective results on population behavior patterns and highlights the importance of this study.
In this study, based on mobile phone location data that can identify the demographic characteristics of users, we aim to determine the demographic characteristics and behavioral patterns of urban park visitors to provide a reference for the refined assessment of park visitors' soil exposure levels. To achieve our research purpose, Beijing, a megacity, was chosen as the study site. The study's aims are as follows: to determine the visitation patterns of park visitors in Beijing based on mobile phone data; to identify the demographic characteristics of urban park visitors, such as gender, age, and local or foreign status; and to determine the visitors' length of stay in parks and its influencing factors. Our findings will be helpful for assessing urban park visitors' soil exposure levels and for refined risk assessment in urban park scenarios.

Brief description of the study area
Beijing is China's capital and the city with the largest built-up area. The longitudinal range of the territory of Beijing is 115°25'-117°30' E, and its latitudinal range is 39°26'-41°03' N (Feng et al., 2019). According to the Beijing Municipal Bureau of Statistics, at the end of 2019, the city's permanent population amounted to 21.54 million, and the total permanent population in the central urban area was 52.17% of that in Beijing overall. In recent years, the main residential areas have expanded to between the Fifth and Sixth Ring Roads. To ensure an environmentally friendly and livable city, Beijing is vigorously developing and constructing urban parks. By the end of 2019, there were 344 urban parks in Beijing covering a total area of 35,157 hectares. Moreover, Beijing has the greatest area of green coverage among China's built-up environments (National Bureau of Statistics of China, 2020).
In this study, 86 parks within Beijing's Sixth Ring Road, an area with high population density (Guo et al., 2019a; Liu et al., 2018), were selected. In view of the diversity of urban parks in Beijing, we divided the 86 urban parks into four categories based on the Standard for Classification of Urban Green Space, the parks' functions, and previous studies (Guo et al., 2019b). The four categories are community parks, comprehensive parks, country parks, and theme parks, with 17, 22, 21, and 26 urban parks classified into these four categories, respectively. Figure 1 illustrates the geographic area of the study and the distribution of parks within it.

Data sources
This study involved several data sources. First, the list of parks was obtained from the Beijing Gardening and Greening Bureau's website. Second, the parks' boundaries were extracted from Google Earth and then processed using ArcGIS 10.7 to obtain the parks' shapefiles. Third, we determined the park visit volumes and visitors' stay time in parks using mobile phone data obtained from one of the three communications operators in China. In anticipation of seasonal variations in the visitors' behaviors, 16 days' worth of data were collected, and these 16 days were evenly distributed across all four seasons in 2019. To avoid the impact of bad weather on park visits, study dates with good weather conditions were selected (Table S1).
The method of identifying park visitors was as follows: a) By overlaying park boundaries and communication operator base stations for analysis, stations located within the boundary of a given park were extracted. b) For each study date, users whose trajectories between 6 a.m. and 9 p.m. include these stations located within the park boundary were extracted as candidate visitors, and their movement trajectories were extracted. c) According to the candidate visitors' trajectories and excluding the park staff, we designated candidate visitors who stayed in the park for 0.5-5 h as park visitors. To ensure that the data were representative, we removed parks that attracted fewer than 100 visitors on each study date (Ullah et al., 2020), and visitor volume data were obtained for 86 parks. We further determined each park visitor's gender and age. The locations of the visitors' homes could also be determined by extracting the most frequent locations of the visitors at night (00:00-06:00 and 21:00-24:00). Local and foreign visitors could be distinguished because gender and age data could not be obtained for foreign visitors.
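The filtering rules in steps b) and c) and the night-time home rule can be illustrated with a minimal sketch. It is not the operator's actual pipeline: the record layouts, the function names, and the idea of a known set of staff identifiers are all assumptions made for illustration. Only the 0.5-5 h stay window and the night hours (00:00-06:00 and 21:00-24:00) come from the text.

```python
from collections import Counter

# Thresholds taken from the text: visitors stay between 0.5 h and 5 h.
MIN_STAY_H, MAX_STAY_H = 0.5, 5.0


def is_park_visitor(stay_hours):
    """True if a candidate's stay falls inside the 0.5-5 h window."""
    return MIN_STAY_H <= stay_hours <= MAX_STAY_H


def select_visitors(candidates, staff_ids):
    """Keep candidates who are not staff and whose stay is 0.5-5 h.

    candidates: iterable of (user_id, stay_hours) pairs (hypothetical layout).
    staff_ids:  set of user ids treated as park staff (hypothetical input).
    """
    return [
        user_id
        for user_id, stay_hours in candidates
        if user_id not in staff_ids and is_park_visitor(stay_hours)
    ]


def is_night(hour):
    """Night hours used for home detection: 00:00-06:00 and 21:00-24:00."""
    return hour < 6 or hour >= 21


def infer_home(pings):
    """Return the most frequent night-time location, or None if there is none.

    pings: iterable of (hour_of_day, location_id) pairs (hypothetical layout).
    """
    night_counts = Counter(loc for hour, loc in pings if is_night(hour))
    if not night_counts:
        return None
    return night_counts.most_common(1)[0][0]


if __name__ == "__main__":
    candidates = [("a", 0.4), ("b", 1.5), ("c", 2.0), ("d", 6.0)]
    print(select_visitors(candidates, staff_ids={"c"}))
    print(infer_home([(23, "s1"), (2, "s1"), (3, "s2"), (12, "s3")]))
```

In this sketch the stay window is treated as inclusive at both ends; the text does not say how boundary cases were handled.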
Finally, to study the factors influencing visitors' stay time in parks and based on existing research regarding park visitation (Aikoh et al., 2012; Bertram et al., 2017; Chen et al., 2018; Jiang et al., 2019; Subramanian and Jana, 2018), we selected three categories of variables that may influence visitors' stay times. The first category was the parks' attributes and surrounding environmental characteristics and included four independent variables: park area, number of dwellings and number of transportation facilities within 2 km of the park, and the distance from a given visitor's home to the park. We chose 2 km as the threshold distance because people are particularly likely to visit parks within 2 km of their homes (Lin et al., 2021; Tu et al., 2020). Park area data were obtained from the website of the Beijing Gardening and Greening Bureau. The numbers of dwellings and transportation facilities were expressed by the number of points of interest, which was obtained by crawling the Baidu map of the area. The distances from visitors' homes to the parks were calculated based on latitude and longitude. The second category included the characteristics of park visitors, such as the gender and age of the visitors. The third category was temperature, which represented the environmental variable. Temperature was acquired from an open platform (https://www.tianqi.com/).
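The text does not state which formula was used to turn latitude and longitude into a home-to-park distance. One common choice is the haversine great-circle distance, sketched below as an assumption rather than as the method actually applied. The function name, the mean Earth radius of 6371 km, and the example coordinates are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Hypothetical home and park coordinates inside Beijing.
    home = (39.95, 116.30)
    park = (40.02, 116.39)
    print(f"{haversine_km(*home, *park):.2f} km")
```

At the scale of a single city, the haversine result differs little from a projected planar distance, so either would serve for a variable such as distance from home to park.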

Statistical analysis
We first used Excel 2016 and SPSS 25.0 for basic data processing and statistical analysis. Because the data did not follow a normal distribution, the Kruskal-Wallis test was used to compare differences in visitors' gender and age across different seasons and different park types. It was also used to analyze the differences in the proportions of local and foreign visitors in different seasons. A Dunn-Bonferroni test was used for pairwise multiple comparisons. Kernel density estimation (KDE) was used in ArcGIS 10.7 to study the spatial differences in park visits between foreign and local visitors. KDE is an effective measurement tool used to describe the spatial structure of the density of visitors within each selected urban park. Random forest (RF) was used to analyze the factors influencing visitors' stay time in parks and their relative importance. RF is an ensemble machine-learning method that assigns importance measures to variables to facilitate the evaluation of each variable's importance. The RF model was implemented in Python 3.7. The optimized parameters of the RF model were as follows: n_estimators = 80, max_depth = 5, min_samples_split = 2.
Figure S2 illustrates the seasonal variation in the average visitor numbers at different parks. In general, the average visitor numbers were highest in autumn and lowest in winter, although few differences in the average visitor numbers were observed between spring, summer, and autumn. The average visitor numbers of comprehensive parks and theme parks were similar and significantly higher than those of community parks and country parks in every season. However, the seasonal variation in average visitor volume was not exactly the same for different park types. The seasonal variation in average visitor volume was largest for theme parks, followed by comprehensive parks. For these two park types, the visitor numbers in winter were significantly lower than those in other seasons. For community parks and country parks, the average visitor volumes exhibited little difference across the seasons.

Park visit patterns of local and foreign visitors
As Figure 2 illustrates, local and foreign visitors were almost equally divided in Beijing's urban parks. The respective proportions of local and foreign visitors in different seasons were as follows: spring, 50.36% and 49.64%; summer, 48.38% and 51.62%; autumn, 48.28% and 51.72%; and winter, 53.20% and 46.80%.
The number of foreign visitors in winter was significantly lower than that in other seasons. The results of a Kruskal-Wallis test showed statistically significant differences in the proportions of local visitors among different seasons (p < 0.05), and the same was true for foreign visitors. Dunn-Bonferroni tests revealed that the proportions of local visitors and foreign visitors differed significantly between autumn and winter (p < 0.05), whereas the proportions in spring, summer, and autumn were similar (p > 0.05).
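To make the seasonal comparison concrete, the Kruskal-Wallis H statistic can be computed by hand as in the sketch below. The study itself ran the test in SPSS, so this is only an illustration: the function name is an assumption, and the seasonal proportions in the example are hypothetical values, not the study's data. With four seasons the statistic has 3 degrees of freedom, for which the chi-square critical value at the 0.05 level is 7.815.

```python
from itertools import chain


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more samples."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                          # size of this tie group
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    return h / correction


if __name__ == "__main__":
    # Hypothetical proportions of local visitors (%), four study dates per season.
    spring = [50.1, 50.9, 49.8, 50.6]
    summer = [48.0, 48.9, 48.2, 48.5]
    autumn = [48.4, 47.9, 48.6, 48.1]
    winter = [53.5, 52.8, 53.1, 53.4]
    h = kruskal_wallis_h(spring, summer, autumn, winter)
    CRITICAL_DF3_ALPHA05 = 7.815  # chi-square critical value, df = 3
    print(f"H = {h:.3f}, significant at 0.05: {h > CRITICAL_DF3_ALPHA05}")
```

In practice a statistics package would also report the p-value; `scipy.stats.kruskal`, for example, returns both H and p. A pairwise post hoc step such as the Dunn-Bonferroni test is not shown here.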
The KDE method was used to analyze the spatial distribution of local and foreign visitors, and core agglomeration was used to reflect the gathering places where high-density spots of visitors in urban parks were located. As Figure 3 illustrates, the spatial agglomeration characteristics of local and foreign visitors differed significantly. For local visitors, two single-core agglomerations were observed with a high kernel density of visitor numbers. One was located near north Fifth Ring Road and mainly included the Olympic Forest Park and its surrounding parks (such as Yangshan Park). The other was located between west Fourth Ring Road and west Fifth Ring Road and included Beijing International Sculpture Park and Wukesong Olympic Park. Additionally, some differences in the spatial aggregations of local visitors across the seasons were observed. The differences in seasonal variation were reflected in the disappearance of two single-core agglomerations with low kernel density of visitor numbers in winter. The two single-core agglomerations were located within the Second Ring Road and mainly included Taoranting Park, Shichahai Park, Jingshan Park, Huangchenggen Ruins Park, Temple of Heaven, and Zhongshan Park. However, for foreign visitors, only one single-core agglomeration was located in the city center, and it included Zhongshan Park, Jingshan Park, and Shichahai Park. Furthermore, the spatial gathering of foreign visitors was almost identical across seasons.

Differences in park visits by gender and age
Figure 4a shows the ratio of male to female (M/F) visitors to urban parks. Because the gender and age of foreign visitors could not be identified, we only analyzed age, gender, and length of stay in the park for local visitors. Overall, it was clear that the gender ratio was greater than 1 for different types of parks, with an average ratio of 1.585. Moreover, according to the results of the Kruskal-Wallis test (p < 0.001), there were statistically significant differences in the M/F ratios for visitors to different types of parks. The ratio was highest in country parks (1.847) and lowest in theme parks (1.401). No statistically significant difference was observed in the M/F ratio between community parks and comprehensive parks, which were associated with average ratios of 1.559 and 1.57, respectively. Regarding the influence of season on the M/F ratio, significant differences were observed across different seasons based on the Kruskal-Wallis test (p < 0.01). The ratios in spring, summer, autumn, and winter were 1.589, 1.601, 1.543, and 1.605, respectively. The ratio was lowest in autumn, with this value significantly different from the ratios in summer and winter.
The age distribution of park visitors was also considered (Figure 4b). Given that the proportion of visitors under the age of 18 was low and the data may not be representative, we only considered visitors older than 18. Figure 4b illustrates the differences in the proportion of visitors across different age groups. The proportion of visitors aged 31-45 was the highest, followed by visitors aged 46-60. In general, the proportions of park visitors aged 19-30, 31-45, 46-60, and > 60 were 11.23%, 41.40%, 31.52%, and 15.85%, respectively. As Figure 4b shows, no significant seasonal change emerged in the proportions of visitors of different ages, but significant differences in the proportions of visitors of different ages were found for different park types. Specifically, the proportions of visitors of different ages varied significantly between country parks and theme parks. Compared with theme parks, the proportion of visitors aged 31-45 was higher at country parks, whereas the proportion of visitors aged over 60 was lower. The average proportion of visitors aged over 60 at country parks was 13.5%, whereas the proportion of visitors aged over 60 at theme parks was 20%.

Visitors' length of stay in a park and its influencing factors
First, we observed that the visitors' length of stay in the parks ranged widely, with the shortest time being close to 0.5 h and the longest being close to 5 h (Figure 5). The median length of stay in a park was approximately 1.54 h, and the highest proportion of visitors (42%) stayed in a park for 1-2 h. A small number of people (around 5%) remained for 4 h or longer. In addition, we observed significant differences in the time spent by visitors of different age groups in the parks (p < 0.001). Post hoc comparisons indicated that visitors aged 31-45 and 46-60 did not differ in terms of their stay times. The median visitor stay times for the 19-30, 31-45, 46-60, and > 60 age groups were 1.594, 1.525, 1.539, and 1.627 h, respectively. The stay time of visitors over 60 was slightly higher than those of visitors in other age groups. Visitor stay times were also compared across different park types, and the visit duration varied among different types of parks (p < 0.001). The results showed that visitors stayed slightly longer in community parks (median 1.591 h) and theme parks (median 1.59 h) than in comprehensive parks (median 1.513 h) and country parks (median 1.520 h). Although the differences are not obvious (Figure 5), the visitors' length of stay did differ significantly across different seasons (p < 0.001).
To explore the reasons for changes in visitor stay times, the RF method was used to quantify the influences of seven factors on stay time. As can be seen from Figure 6, the distance from a visitor's home to the park was the most important factor affecting their stay time, contributing 80.65% to the change in stay time. Compared with the distance from home to the park, other factors had less influence on the stay time. Park area, the number of dwellings within 2 km of the park, temperature, the number of transportation facilities within 2 km of the park, and the age and gender of visitors contributed 7.33%, 6.91%, 3.12%, 1.81%, 0.18%, and 0%, respectively. Figure 7 illustrates how some factors contribute to the variation in stay time. Given that gender had almost no effect on the length of stay, it was not included in the figure. Figure 7a demonstrates that the stay time and the distance from home to the park were significantly negatively correlated when the distance was short, whereas a significantly positive correlation was found when the distance was long. Park area, the number of dwellings, and the number of transportation facilities were all positively correlated with stay time (Figure 7). Moreover, the correlation between park area and stay time increased significantly when park area was larger, which indicates that visitors typically stay longer in larger parks. Lower temperatures were negatively correlated with stay time, whereas no correlation was observed at higher values. By contrast, no correlation between age and stay time was observed for younger groups, but a weak positive correlation was noted for older groups.
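The relative importance values above come from an RF model with n_estimators = 80, max_depth = 5, and min_samples_split = 2. The sketch below shows how such a ranking could be produced with scikit-learn's `RandomForestRegressor`. The choice of scikit-learn, the feature names, the random seed, and above all the synthetic data are assumptions: the study's data are not reproduced here, and the text does not say which importance measure was reported, so the impurity-based `feature_importances_` attribute is used only as one plausible option.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical names for the seven factors discussed in the text.
FEATURES = [
    "distance_km", "park_area_ha", "dwellings_2km",
    "transport_2km", "temperature_c", "age", "gender",
]


def make_synthetic_data(n=2000, seed=0):
    """Synthetic stand-in data in which stay time depends mostly on distance."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.uniform(0, 20, n),     # distance from home to park (km)
        rng.uniform(1, 700, n),    # park area (ha)
        rng.integers(0, 500, n),   # dwellings within 2 km
        rng.integers(0, 200, n),   # transportation facilities within 2 km
        rng.uniform(-5, 35, n),    # temperature (deg C)
        rng.integers(19, 80, n),   # age (years)
        rng.integers(0, 2, n),     # gender coded 0/1
    ])
    # Invented relationship: V-shaped in distance, weak area effect, plus noise.
    y = 1.5 + 0.1 * np.abs(X[:, 0] - 5) + 0.0003 * X[:, 1] + rng.normal(0, 0.1, n)
    return X, y


def fit_stay_time_model(X, y, seed=0):
    """Fit an RF regressor with the hyperparameters reported in the text."""
    model = RandomForestRegressor(
        n_estimators=80, max_depth=5, min_samples_split=2, random_state=seed
    )
    return model.fit(X, y)


X, y = make_synthetic_data()
model = fit_stay_time_model(X, y)
importance = dict(zip(FEATURES, model.feature_importances_))
for name, share in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.2%}")
```

Impurity-based importances are normalized to sum to 1, which matches the way the seven contributions in the text add up to 100%. They describe how much each variable reduces prediction error in the fitted trees, not the direction of its effect, which is why a separate view such as Figure 7 is needed for that.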

Park users' demographic characteristics and visit patterns
In proportional terms, almost half of the visitors to Beijing's urban parks were from elsewhere. This may first be attributed to the fact that Beijing is a metropolis with a large migrant population. Statistics show that at the end of 2019, Beijing's permanent migrant population amounted to 7.456 million, accounting for 34.6% of the permanent resident population (Beijing Municipal Bureau of Statistics, 2020). In addition, Beijing is a famous historical city with many places of interest. According to statistics recorded by the Beijing Municipal Bureau of Culture and Tourism, there were 227 scenic spots with star ratings in 2020 that attracted large numbers of tourists. According to the World Tourism Cities Development Report (2019), Beijing ranks first in the list of tourist destination cities in China, so foreign tourists may constitute a significant proportion of visitors to Beijing's urban parks.
Local and foreign visitors may have different park visiting preferences, but few studies have paid attention to this. We noted that most of the visitors who went to the parks located in the city center were foreign visitors. Similar results have been reported by previous studies. For example, a study of Beijing tourism found that the central city is typically the most densely populated area with respect to foreign tourists because the downtown district is home to several traditional scenic spots and modern shopping areas. The same explanation applies to the distribution of foreign visitors in urban parks (Song et al., 2020b). Regarding the influence of season on visitor distribution, only the geographic distribution of local visitors was weakly associated with seasonal changes, with the results showing that parks located in the center of Beijing are less attractive to local visitors during winter. This is because the parks located in Beijing's center are mainly theme parks, and their attraction for local visitors declines during the winter owing to the low temperatures and changes in scenery. This is consistent with the significant decline in the numbers of visitors to theme parks in winter, as reflected in Section 3.1. However, seasonal changes did not affect the geographical distribution of foreign visitors. This indicates that parks located downtown are attractive to foreign visitors across all seasons because of their popularity and proximity to the city center.
It also emerged that the number of visitors varied with gender and age, which is consistent with previous studies (Azagew and Worku, 2020; Lin et al., 2014; Liu et al., 2017). In this study, the overall M/F ratio was 1.585, which is higher than that of the Beijing population (1.032) (Beijing Municipal Bureau of Statistics, 2020). This indicates that men are more likely than women to visit parks in Beijing, as some researchers have reported in other contexts. For example, Cohen et al. (2007) found that the M/F ratio for visitors to Los Angeles' urban parks was around 1.63, based on field observation and face-to-face interviews. Lin et al. (2014) found that the number of male visitors was slightly higher than that of female visitors. Women are more likely to be discouraged from visiting parks owing to fear and a lack of companions or interest (Derose et al., 2018; Liu et al., 2017). The largest proportion of urban park visitors was aged 31-60, whereas the proportions of visitors aged 19-30 and over 60 were low. From examining the age distribution of park visitors who live in Beijing, we found that younger visitors aged 19-30 were not inclined to visit parks. However, with advancing age, people tend to visit parks more frequently to promote their health (Grilli et al., 2020). Liu et al. (2017a) found that the parents of children aged under 7 years were more likely to visit parks, so accompanying children may be another reason for middle-aged people over 30 to go to a park. Other studies have also reported on park visitors' age distributions. Veitch et al. (2015) found that the proportion of visitors aged 21-59 was 53%. Interestingly, Azagew and Worku (2020) found that the 16-25 age group used the park most, whereas those aged over 55 exhibited the least park use in Ethiopia, which is significantly different from our results. This suggests that the age distribution of park visitors may vary from region to region and may be related to the specific age distribution of the population in each region.
Additionally, significant differences in the gender and age of visitors were observed among different types of parks, which are likely related to the services provided by the park, the proximity of the park (Cohen et al., 2020), and the activities of visitors in the park. Women and elderly people are generally regarded as marginalized populations, and they are more likely to be reluctant to visit parks for multiple reasons. In this study, the median distances from a park visitor's home to different types of parks are as follows: community parks, 2.14 km; comprehensive parks, 3.09 km; country parks, 2.84 km; and theme parks, 3.23 km. The median distance from home to a country park is not much shorter than that to a comprehensive park or a theme park. However, the services and landscapes provided by country parks to visitors are not as good as those provided by comprehensive and theme parks. Furthermore, women and older people are less likely to participate in moderate-to-vigorous physical activities (Cohen et al., 2020; Subramanian and Jana, 2018). Sang et al. (2016) found that older people prefer to engage in activities related to nature, in contrast to younger people, and so they are more likely to visit parks with better scenery for entertainment and sightseeing purposes. Gu et al. (2020) also found that age has a negative impact on the likelihood of visiting country parks. In summary, country parks are less attractive to marginalized populations. Theme parks encompass many park types, such as historical parks, memorial parks, and parks with famous scenery, which have beautiful scenery and/or diverse functions, so they can attract different groups of people.

Factors influencing stay time and their relative importance
In this study, we found that most visitors spent 1-2 h in a park. To determine whether visitors from different regions spent different amounts of time in urban parks, we summarized the stay times of visitors from different regions (Table S2). The majority of visitors from most areas spent 1-2 h in the park visited. However, visitors from several areas such as Mianyang, Singapore, and Cape Town were observed to stay at a park for less than 15 min. Therefore, it is necessary to analyze the factors that influence visitor stay time.
Based on the results of RF modeling, the distance from the visitor's home to the park is the factor that exerts the greatest influence on stay time. This result is similar to that reported by Cohen et al. (2020), but it is not exactly the same. Cohen et al. (2020) only found that the average duration of stay in parks reached by active commuting tended to be shorter than that in parks reached by motor transport, and that visitors who lived closer to parks were more likely to use active commuting, which means that the greater the distance from home to the park, the longer the stay time. In our results, however, visitors were also likely to stay longer in a park when the distance from their home to the park was short. Longer stays in parks closer to home may be due to visitor behavior and activities in the park. Parks close to residential areas are typically designed to serve the surrounding residents, and it may be that visitors engage in activities in parks that take some time, such as accompanying children, dancing, playing chess, and exercising (Figure S1; Wang et al., 2021). Pham et al. (2019) showed that practicing sports and exercising were predominant in explaining longer stay times in parks. In Kansas City, 89.9% of park visitors participated in physical activities in parks, and they spent a mean of 77.1 min being physically active (Bai et al., 2013).
An excessively long distance from an individual's home to the park typically reduces the frequency of visits (Bertram et al., 2017; Cohen et al., 2020). Therefore, visitors who live far away usually choose to stay longer in the park to compensate for their less frequent visits and to obtain sufficient exposure to nature.
Park area was the second most important factor impacting stay time, although only 7.33% of the variation in stay time could be attributed to it. Clearly, one of the reasons that visitors stay longer in larger parks is that it usually takes longer to visit the entire park. Kazmierczak (2013) confirmed that visit length was associated with park size and that visitors tended to spend more time in large parks. Dwelling numbers can represent the population density near a park. Koohsari et al. (2020) found that population density was positively associated with sedentary behaviors, which may increase visitor stay time in a park. Temperature was negatively associated with stay time at low values, which means that people usually spend less time outdoors because of the cold. Transportation facilities may affect visitor stay time by influencing their travel behaviors. For example, visitors can stay late in parks with convenient transportation because they do not have to worry about a lack of transportation facilities to go home. In addition, a weak positive correlation was observed between age and stay time in parks among older people, consistent with findings from a study in Hong Kong, China (Wong, 2009). Given that 60 is the legal retirement age in China, this may be related to the fact that older people have more leisure time.

Implications for soil exposure assessment
To explore the behavioral patterns of urban park visitors, park visits, the visitors' demographic characteristics, and visitor length of stay were analyzed based on mobile phone data. The variations in the number of park visitors across different seasons and different park types suggest that season and park type affect visitor behaviors and may thus impact visitor soil exposure. In particular, no significant seasonal variation was evident in the number of visitors to community and country parks, suggesting that visitors are likely to visit these parks at a similar frequency throughout the year and that the frequency of visits to these parks may be close to 365 days per year. However, the number of visitors to comprehensive and theme parks dropped significantly in winter, indicating that many visitors may reduce the frequency of visits to these parks in winter. It can be inferred that the frequency at which visitors go to comprehensive parks and theme parks is likely to be lower than that for community and country parks, with a correspondingly lower frequency of soil exposure. In China, the frequency of soil exposure experienced by people who visit community parks is distinguished from that for other types of parks in the technical guidelines for risk assessment of soil contamination of land for construction. Unfortunately, many researchers have failed to consider differences in visitors' frequency of exposure at different types of parks. To ensure a more accurate risk assessment, it is necessary to use different exposure frequencies for different types of parks in future research.
The visitors' demographic characteristics indicate that Beijing's urban parks provide services not only for local people but also for large numbers of migrants, although their park visit behavioral patterns differ considerably. For the sake of conservatism, the parameters used in risk assessment are usually based on local populations. However, for a city like Beijing, which has many migrants and where the exposure duration for local people may be longer than that for foreign visitors, it may be more accurate to consider the soil exposure of visitors to parks dominated by foreign visitors. In addition, the number of male visitors is higher than that of female visitors in Beijing. In other words, it is too conservative to select only women as sensitive receptors in risk assessment. A better approach would be to take both men and women into account and to weight some exposure parameters, such as body weight and height, according to the proportions of male and female visitors.
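The weighting suggested here amounts to simple arithmetic, sketched below. An M/F ratio r implies a male share of r / (1 + r), and a visitor-weighted parameter is the share-weighted mean of the male and female values. The function names are illustrative, and the body weights in the example are hypothetical placeholders, not values from this study or from any guideline. Only the M/F ratio of 1.585 comes from the results.

```python
def male_share(mf_ratio):
    """Convert a male-to-female visitor ratio into the male share of visitors."""
    return mf_ratio / (1.0 + mf_ratio)


def visitor_weighted(male_value, female_value, mf_ratio):
    """Weight an exposure parameter by the male and female visitor shares."""
    p_male = male_share(mf_ratio)
    return p_male * male_value + (1.0 - p_male) * female_value


if __name__ == "__main__":
    MF_RATIO = 1.585        # overall M/F ratio reported in this study
    BW_MALE_KG = 70.0       # hypothetical adult male body weight
    BW_FEMALE_KG = 60.0     # hypothetical adult female body weight
    print(f"male share: {male_share(MF_RATIO):.3f}")
    print(f"weighted body weight: "
          f"{visitor_weighted(BW_MALE_KG, BW_FEMALE_KG, MF_RATIO):.2f} kg")
```

The same function could take a park-type-specific ratio, such as the 1.847 observed for country parks or the 1.401 for theme parks, if receptors are defined per park type.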
Although the duration of visitors' stay in a park varies widely and is significantly affected by the distance from home to the park, an occupancy period of 1.5 h may serve as a reference exposure parameter for risk assessment in a park scenario. In addition, most visitors spent 1-2 h in parks, significantly less time than in their residences (usually a minimum of 8 h), and visitors are often exposed to parkland at a lower frequency than to residential land. In China, however, no strict distinction is made between residential land and parkland when assessing the risk of contaminated sites, which may lead to an overestimation of the health risks faced by people in parkland. Therefore, to ensure more realistic results, different exposure parameters should be used for risk assessments of residential land and parkland.

Limitations
This study has several limitations. First, considering the availability of data and the complexity of human behavioral patterns, we used only 16 days of data to represent seasonal variation in park visits, and only visitors' stay time in a park as an indicator of their behavioral patterns. Future research should, if possible, collect more data that reflect seasonal changes and characterize visitors' behavioral patterns. Second, the mobile phone data used in this study cover children insufficiently and cannot accurately reflect patterns in children's visits to parks. Considering children's sensitivity to contaminated-soil exposure, additional research on children's park visits is warranted. Finally, although mobile phone data have the advantages of wide coverage and large amounts of information and can reflect park visit trends more objectively, park visitation preferences are subjective and vary among individuals. Thus, when human behavioral patterns are involved, it is usually necessary to investigate individuals' subjective willingness to visit a park. For example, visitors' attitudes toward parks influence their use of the park, but such data are usually only available through visitors' self-reports (Zhang and Tan, 2019). Therefore, it may be more fruitful to combine mobile phone data with data obtained from questionnaire surveys or field observations.

Conclusions
To our knowledge, this study is the first to use mobile phone data to analyze in detail the park-visiting behavior of visitors with different demographic characteristics. We found that males and visitors aged 31-45 are active users of urban parks and that foreign visitors are a part of Beijing's urban park visitors that cannot be ignored. Differences were evident in the park visitation behavior of visitors with different demographics. Most visitors stay in a park for 1-2 h, and the distance from a visitor's accommodation to the park was the most important factor affecting visitors' stay time. Big data offer a new means of studying the behavioral patterns of park visitors but should be combined with traditional survey methods to be more accurate and comprehensive. A more refined risk assessment may be achieved, for example, by distinguishing between residential land and parkland, different types of parks, and different groups of people.

Declarations
Ethics approval and consent to participate
Not Applicable.

Consent for publication
Not Applicable.

Availability of data and materials
The mobile phone data used in this study are available from one of China's three major telecommunications operators. They were used under license for the current study, and so are not publicly available.

Figure: Gender and age distribution of park visitors: (a) gender; (b) age. Note: 1, community park; 2, comprehensive park; 3, country park; 4, theme park.

Figure: Single factor partial dependence plot for the random forest model of stay time in park.