Streamlining urban forest monitoring based on a large-scale tree survey: a case study of highway vegetation in Hong Kong

Through the analysis of an urban tree inventory with the aid of machine learning, this study brought together different aspects of urban forestry. Urban tree monitoring is essential to successful urban forestry. Transport land use accommodates a huge tree stock which requires substantial monitoring efforts. In Hong Kong, more research is needed on how monitoring works can be improved in response to variations in tree stand characteristics. This case study aimed to illustrate the usefulness of a large-scale tree survey in mainstreaming future tree monitoring and management in transport land use. A total of 7209 trees were found in a large-scale tree survey conducted on 53 slopes and 52 verges along San Tin Highway in Hong Kong. Dominance by Corymbia citriodora (72%) was observed, especially on the highway verges. Using chi-square tests, significant associations were found among monospecific stands, habitat type, and tree risk rating. A logistic regression model was constructed to predict the occurrence of monoculture. For every metre increase in maximum tree height, the odds of a stand being monospecific were 1.22 times greater. Stands on verges had 5.26 times greater odds of being monospecific than those on slopes. The associations and relationships were attributed to the dominance of C. citriodora. By boosting the logistic model, model reliability increased as kappa rose from 0.51 to 0.63, while balanced accuracy improved from 0.72 to 0.85. The occurrence of monospecific stands could be reliably predicted using maximum tree height and habitat type of tree stands. These quantitative findings can guide urban forest monitoring. Through a better understanding of urban forest structure and composition, future monitoring can aid the mainstreaming of urban forestry in transport planning.


Introduction
Urban forestry is essential to urban sustainability. Transport land use provides important urban functions. At the same time, urban trees growing in and around transport facilities account for a considerable proportion of urban tree stock (Song et al., 2005). For example, roadside verges and slopes can make up as much as one-third of public urban green space (Marshall et al., 2019). Trees planted along roads and highways are desired by road users. Key ecosystem services, such as landscape beautification (Mu et al., 2022) and pollutant removal (Bandara & Dissanayake, 2021;Kaighn & Yu, 1996;Modlingerová et al., 2012) are ushered by trees. In recent research, tree species diversification has been recommended to augment both the quantity and quality of ecosystem services (Felton et al., 2016;Huang et al., 2021;O'Sullivan et al., 2017;Soga et al., 2014). Before all else, the monitoring of the large tree stock must be conducted effectively and efficiently. Therefore, the expected ecosystem services can be delivered from a healthy tree stock which receives optimal arboricultural care including but not limited to tree risk management and tree performance monitoring.
But research revealed less than ideal situations. In different cities, transport land use had low tree biodiversity (Liu & Slik, 2022; Sever Mutlu et al., 2017; Źróbek-Sokolnik et al., 2021). Hong Kong shared a similar situation. In Hong Kong, most major highways are lined by stately rows of trees, such as Corymbia citriodora, Acacia confusa, and Ficus microcarpa (Jim, 1990). In the past, these readily obtainable fast-growers were a quick fix to the boring, grey urban landscape. Growth rate and environmental tolerance were decisive in tree planting. At present, the trees are approaching maturity and forming low-diversity, if not monospecific, stands. Closer monitoring is required due to their colossal dimensions, possibility of structural failure, and severe consequences of failure (Dunster et al., 2017; Smiley et al., 2011). Tree risk has become the focus of monitoring. In the future, species diversification works are expected to be implemented. Ecological reconciliation and more naturalistic landscape design will be possible goals. The structure, composition, and biodiversity of urban forests will be subjected to more intensive monitoring.
Urban forest monitoring may be re-focused in foreseeable future. An opportunity arises for streamlining and fine-tuning the monitoring design. In this process, inputs from an arboricultural perspective would be useful. The habitat conditions of transport land use are subject to large-scale, artificial engineering which in turn impacts tree growth and survival. Common urban tree survey variables can guide urban forestry monitoring, such as biodiversity, structural soundness, and physiological performance (Zeng et al., 2011;Divakara et al., 2021;Lugo-Pérez & Sabat-Guérnica, 2011;Ma et al., 2020;Casanelles-Abella et al., 2021;Yang et al., 2022). Some qualitative variables have long been used in monitoring works such as tree risk assessment (Dunster et al., 2017;Harris et al., 2004;Smiley et al., 2011). Apart from tree risk, other aspects such as biodiversity and habitat conditions are also major topics in sustainable urban forestry. Thus, in this case study, following a stringent quantitative approach, various aspects of the urban tree stand along a heavily used highway would be explored. More findings may be found by using more advanced techniques such as machine learning so that the relationships between different parameters can be quantified.
Tree monitoring in transport land use can be difficult. Some tree stands occur in hardly accessible locations. If more variables are monitored, field work may become cumbersome and arduous. But if the intrinsic correlations among key parameters of urban forests are established, tree monitoring surveys can be made more compact and concise. Past research found that tree abundance increased with species richness (Kendal et al., 2014). As mentioned above, transport land use accommodates plentiful trees (Song et al., 2005; Yang et al., 2021). Thus, the possibility of using tree abundance as a proxy for species composition can be investigated. Apart from urban forest structure and composition, relationships regarding habitat types and tree risks can be investigated due to their important role in planting design (Hale & Morzillo, 2020; Orlóci & Stanek, 1980). Such data can be collected using a large-scale tree survey. Subsequent data analysis may guide future monitoring. Therefore, this study would attempt to demonstrate how a tree survey could assist in achieving effective urban forest monitoring.
Statistical modelling has been widely utilised in investigating relationships among variables in vegetation research, including allometry (Zheng et al., 2018), growing space requirement (Pretzsch et al., 2015), environmental justice (Violin et al., 2020), and spatial distribution pattern (Jim, 1989). Very often, explanatory or predictive models are constructed using multiple variables. Additional insights may be gained by model training, tuning, and testing from a machine learning approach. Such techniques can be applied on the data collected from tree surveys. The management strategies of monospecific and mixed-species stands could be different. Meanwhile, it is not uncommon to observe single-species dominance in roadside urban green space. The conditions which are conducive to the occurrence of monospecific stands can be inquired with the aid of machine learning. This may help identify priority areas where biodiversity enhancement can be implemented.
Data analysis procedures based on machine learning often require an adequately large data set. In some urban forest sites, sampling may be difficult or even impossible. Due to site restrictions and safety issues, conducting tree surveys on highway verges and slopes could be demanding, resulting in a small data set. But research showed that small samples were acceptable when appropriate analysis techniques and comprehensive model evaluation metrics were used (Fassnacht et al., 2014; Shataee et al., 2012; Zheng et al., 2018). Therefore, it is high time that urban foresters applied more advanced techniques in the investigation of tree management issues.
Urban forestry has gained increasing attention. Tree monitoring and management in different urban land use zones deserve more research efforts. Since transport land use can accommodate a massive tree stock, effective monitoring would be needed. It is argued that a large-scale tree survey could inform upcoming monitoring works. The focus of this research was urban trees along highways. As a case study, a large number of slopes and verges were randomly selected along a heavily used highway in Hong Kong. Data from a tree monitoring campaign were gathered for in-depth analysis. The aim of this research was to illustrate the usefulness of a large-scale tree survey in mainstreaming future tree monitoring and management in transport land use. Practical implications about urban forest management were distilled from the findings.

Study area
This research was based in Hong Kong (22.32° N, 114.17° E). With more than 7.403 million residents on just 1114 km² of land area, Hong Kong is regarded as a compact city (HKSARG Census & Statistic Department, 2022; HKSARG Planning Department, 2021). The transport of people and goods in Hong Kong relies on the Strategic Route and Exit Number System, which is composed of inter-connected highways and expressways (HKSARG Highways Department, 2022; HKSARG Transport Department, 2020). This research focused on San Tin Highway, a major section of Route 9, which is the longest route in the territory (Fig. 1). San Tin Highway is a 6.3 km-long, three-lane dual carriageway in northwest Hong Kong. Yuen Long, a major residential and business area, lies at the southern exit of San Tin Highway, whereas the northern exit leads to the border between Hong Kong and Shenzhen, another prosperous city in mainland China.
Despite its heavy use, San Tin Highway features an enormous tree stock. Many trees growing on slopes and verges were planted more than a decade ago. Technical circulars have been published and continually updated to guide tree monitoring and care in transport infrastructures in Hong Kong (e.g. HKSARG Development Bureau, 2012). The technical circulars are applicable to the landscaping works of San Tin Highway. In the past, tree planting featured a limited palette of species, resulting in low-diversity stands and even monoculture. But the importance of biodiversity has been increasingly accentuated.
With the permission of the Highways Department, 52 verges and 53 slopes along San Tin Highway were randomly selected. Due to the difference in characteristics and the management approach, verges and slopes were regarded as two distinct habitat types. In this study, all trees on a slope or a verge were collectively treated as a stand. The Highways Department acknowledged that monospecific stands were common along San Tin Highway. This study also attempted to identify the critical factors to be monitored in the management of such tree stands.

Large-scale tree survey
In this research, 53 slopes (Fig. 2a) and 52 verges (Fig. 2b) along San Tin Highway were randomly selected for sampling. Each slope or verge was visited once by a team of independent arborists and horticulturalists under the official coordination by the Highways Department. The tree survey campaign lasted from 4 March 2021 to 4 July 2021. A stand is composed of many trees. The structure of each stand was quantified by its abundance and maximum tree height. Abundance was defined as the number of trees in a stand. Maximum tree height was defined as the height of the tallest tree in the stand, which was measured using a laser hypsometer (TruPulse 200, Laser Technology, Centennial).
The species composition of the stands was measured using different variables. Species richness was defined as the number of species found on each slope or verge. The identification of species was based on the official taxonomical publications in Hong Kong (Hong Kong Herbarium & South China Botanical Garden, 2007, 2009; Hong Kong Herbarium, 2012). Stands which consisted purely of one tree species were labelled as monospecific, and those with more than one species as mixed-species. Therefore, a stand could be dichotomously classified as monospecific or mixed-species. In order to quantify the biodiversity of the slopes and verges, three biodiversity indices, namely, Shannon-Wiener diversity index (H), Simpson's index (D), and Pielou's evenness index (J) were derived as follows:
$$H = -\sum_{i=1}^{S} P_i \ln P_i$$
where $P_i$ was the proportion of tree species i, computed as the number of trees of species i over the total number of trees in each site. S was the total number of species thereon. A monoculture would be denoted by H = 0.
$$D = \frac{\sum_{i=1}^{S} n_i (n_i - 1)}{N (N - 1)}$$
where $n_i$ was the number of trees of species i. N was the total number of trees on each slope and verge. D was equal to the probability of two randomly and independently selected trees belonging to the identical species. In other words, a larger D value would imply a greater dominance by one or a few species. D would be one in the case of monoculture.
$$J = \frac{H}{\ln S}$$
where H was Shannon-Wiener diversity as above. S was the number of species. The greater the J, the more even the species distribution was. At J = 0, monoculture would occur. H, D, and J have been applied in the study of Hong Kong's urban forests (Lee et al., 2019, 2021). The three biodiversity indices were forwarded to subsequent analysis.
In addition, tree risk assessment was conducted because tree risk was subsumed under road safety. In Hong Kong, the tree risk assessment protocol emanated from the best management practices published by the International Society of Arboriculture (American National Standard, 2017; Dunster et al., 2017; Smiley et al., 2011). A limited visual tree risk assessment (level 1 assessment) was carried out. For safety reasons, time-consuming assessment exercises were ruled out in order to minimise the exposure of the personnel to busy road traffic. To uphold consistency in sampling, the assessment was performed by the same team of arborists for every stand. The outcome of each risk rating process was expressed by one of four ordinal categories, namely, low, moderate, high, and extreme, in ascending order of tree risk level. The risk rating was treated as a categorical variable.
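The three indices defined above can be illustrated with a minimal Python sketch. It assumes the common finite-sample form of Simpson's index, with $n_i(n_i - 1)$ in the numerator, and follows the stated convention that a monoculture has J = 0. The function name and the example stands are hypothetical and are not taken from the survey data.

```python
import math
from collections import Counter


def diversity_indices(species_labels):
    """Return (H, D, J) for one stand, given one species label per tree."""
    counts = Counter(species_labels)
    n_total = sum(counts.values())   # N: number of trees in the stand
    richness = len(counts)           # S: number of species in the stand

    # Shannon-Wiener diversity: H = -sum(P_i * ln P_i)
    h = sum(-(n / n_total) * math.log(n / n_total) for n in counts.values())

    # Simpson's index (assumed without-replacement form); a single-tree
    # stand is treated as fully dominated, i.e. D = 1.
    if n_total > 1:
        d = sum(n * (n - 1) for n in counts.values()) / (n_total * (n_total - 1))
    else:
        d = 1.0

    # Pielou's evenness: J = H / ln S, set to 0 for a monoculture (S = 1)
    j = h / math.log(richness) if richness > 1 else 0.0
    return h, d, j


# Hypothetical stands for illustration only
monoculture = ["Corymbia citriodora"] * 10
two_species = ["Corymbia citriodora"] * 5 + ["Acacia confusa"] * 5

h_mono, d_mono, j_mono = diversity_indices(monoculture)
h_mix, d_mix, j_mix = diversity_indices(two_species)
```

For the hypothetical monoculture, the sketch gives H = 0, D = 1, and J = 0, which agrees with the monoculture conditions stated in the text.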

Data analysis
Abundance, maximum tree height, species richness, H, D, J, and risk rating, with respect to monoculture and habitat type, were summarised. Several statistical tests were run in order to distil specific management implications. Due to non-normality, abundance (Shapiro-Wilk W = 0.712, p < 0.01) and species richness (Shapiro-Wilk W = 0.647, p < 0.01) were log- and exponential-transformed, respectively. Stand structure and species composition were compared between verges and slopes by independent t-tests. Another round of t-tests was conducted on abundance and maximum tree height between monospecific and mixed-species sites. Then, chi-square tests were administered to search for associations between monoculture status, habitat type, and risk rating.
In order to understand the occurrence of monospecific stands, a binomial logistic regression analysis was conducted. The dichotomous response variable had two classes, namely, monospecific and mixed-species. Abundance, maximum tree height, habitat type, and risk rating served as predictors. The former two and the latter two were quantitative and categorical variables, respectively. Since the data set was small (n = 105), some precautions were taken when implementing a machine learning approach in evaluating model performance. With stratification, the data were split into a training set (75%) and a testing set (25%). Class imbalance in the training set was addressed by the Synthetic Minority Oversampling Technique (SMOTE), with 100% over-sampling and 200% under-sampling in the R package DMwR (Torgo, 2010). A preliminary linear model with a logit link function was constructed using the training set and evaluated using the testing set. In order to improve the model, boosting was applied to the logistic regression model. After a random hyperparameter search with 50 attempts, the optimal number of iterations in boosting was found to be 30. Leave-group-out cross-validation with 1000 rounds of resampling was chosen due to the limited sample size. Model diagnostics were compared between the boosted and the preliminary models. All analyses in this study were supported by RStudio (RStudio Team, 2019), and the packages caret (Kuhn, 2008, 2016), caTools (Tuszynski, 2021), ggpubr (Kassambara, 2020), and tidyverse (Wickham et al., 2019).
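The stratified 75%/25% split described above can be sketched in plain Python. This is an illustration only and not the R workflow used in the study: the function name, the rounding rule, and the random seed are assumptions, and the class sizes (68 monospecific and 37 mixed-species stands) are those reported for the 105 surveyed stands.

```python
import random


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split indices into train/test sets while preserving class proportions."""
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility

    # Group the row indices by class label
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Assumed rounding rule: nearest integer number of training cases
        n_train = round(train_fraction * len(shuffled))
        train.extend(shuffled[:n_train])
        test.extend(shuffled[n_train:])
    return sorted(train), sorted(test)


labels = ["monospecific"] * 68 + ["mixed-species"] * 37
train_idx, test_idx = stratified_split(labels)
test_monospecific = sum(labels[i] == "monospecific" for i in test_idx)
test_mixed = len(test_idx) - test_monospecific
```

Under these assumptions, the training set holds 79 stands and the testing set 26 stands (17 monospecific and 9 mixed-species). Leave-group-out cross-validation repeats a random split of this kind many times and averages the evaluation metrics over the repetitions.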

Results and discussion
Urban forest structure of the surveyed slopes and verges
This research was conducted along San Tin Highway in Hong Kong, with an intention to improve tree monitoring within the context of transport land use. Fifty-three slopes and 52 verges were surveyed. In total, 7209 trees were found. When all stands were considered, the mean and median abundance were 69 and 43 trees, respectively (Fig. 3a). Skewness, which was indicated by the divergence between mean and median, was attributed to 12 outlier stands with very high abundance. These outliers were all slope tree stands. In fact, the mean abundance of slope tree stands (93 trees) nearly doubled the median (50 trees). But if only verges were considered, skewness was almost absent. Nevertheless, neither mean nor median difference in abundance between slopes and verges was significant, as revealed by t-test (t = 1.54, p > 0.05) and Mann-Whitney U test (U = 1,596, p > 0.05).
The mean and median abundance of monospecific stands was 60 and 47 trees, respectively. In contrast, mixed-species stands had a higher mean abundance (85 trees) but a lower median (33 trees). However, the differences in tree abundance between monospecific and mixed-species stands were insignificant as suggested by t-test (t = −0.55, p > 0.05) and Mann-Whitney U test (U = 1,259, p > 0.05). Therefore, differences in tree abundance with respect to habitat type and monoculture status were insignificant among the surveyed stands. Very often, tree monitoring involves field measurement of key dimensions. If the tree monitoring work is planned with respect to habitat type, the time period to be reserved for making tree measurements will be more or less equivalent.
Considering the maximum tree height of all stands, the mean and median were 17.21 m and 18.00 m, respectively (Fig. 3b). Unlike other variables, skewness was less obvious. Mean and median showed convergence for tree stands on slopes (mean = 13.09 m, median = 12.00 m) and verges (mean = 21.40 m, median = 20.00 m). However, the maximum tree height on the verges was noticeably greater than that on slopes. Such large slope-versus-verge differences were highly significant, as concluded from the results of t-test (t = −10.1, p < 0.001) and Mann-Whitney U test (U = 310, p < 0.001). A taller tree can lead to more severe consequences in case of structural failure. Therefore, if any tree monitoring campaigns are centred on tree risk, more efforts may be spent on the verges.
Maximum tree height was interpreted in relation to species composition. Monospecific stands showed significantly greater maximum tree height than mixed-species stands, as confirmed by t-test (t = 5.58, p < 0.001) and Mann-Whitney U test (U = 1,952, p < 0.001) (Fig. 3b). Most monospecific stands featured Corymbia citriodora, whose mature height matched the mean (19.35 m) and median (20.00 m). The severity of tree failure increases with tree size (Dunster et al., 2017; Smiley et al., 2011). Again, monospecific stands, especially those along road verges, may warrant a higher level of tree risk assessment in regular monitoring.
In this case study, 7209 trees were recorded on just a 6.3 km-long highway section. A gigantic tree stock in transport land use was thus implied for Hong Kong's extensive 2193 km-long road network. Sharp contrasts were found with other land uses. A study on half of the public residential estates found 47,801 trees (Lee et al., 2019), while a roadside tree survey in a town centre found just 1111 trees (Lee, 2022). Such an important contribution of transport land use to urban forestry in Hong Kong echoed overseas situations (e.g. Marshall et al., 2019). Along San Tin Highway, the tree stock was made up of a considerable proportion of tall trees, especially on road verges. The fact that verges and monospecific stands had taller trees could be attributed to the common occurrence of C. citriodora. This species could easily attain > 20 m in urban locations (Jim, 1990). Tree failure of such a large species can cause severe consequences. Research has shown that a tall and large tree canopy may catch drivers' attention and raise their awareness of avoiding trees as dangerous objects (Marshall et al., 2018). But large tree size was shown to raise collision risks (Bucsuházy et al., 2022). In order to ensure the safety of road users, a more detailed tree risk assessment is required in tree monitoring. In the examination of tree abundance and tree height, the tree survey results were considered useful in the planning of urban forest monitoring fulfilling various goals. This study demonstrated that arboricultural implications can be revealed even with the use of simple but essential tree inventory parameters such as tree height and tree abundance. These parameters can also serve as input to more advanced data analysis techniques as shown in upcoming sections.

Fig. 3 Boxplots showing the distribution of (a) abundance, (b) maximum tree height, (c) species richness, (d) Shannon-Wiener diversity, (e) Simpson's dominance, and (f) Pielou's evenness. In each graph, from left to right, the boxes were constructed using data from all stands (n = 105), verges (n = 52), slopes (n = 53), monospecific stands (n = 68), and mixed-species stands (n = 37). The box boundaries were defined by the inter-quartile range (IQR). The horizontal line inside each box and the upper and lower whiskers indicated median, maximum, and minimum values. Any values outside 1.5 × IQR were outliers and visualised by dots

Species richness and biodiversity
Overall, species richness was very low. Only 23 tree species were identified (Fig. 4). Considering all 105 sites, the mean richness was 1.9 species, and the median was 1.0 species (Fig. 3c). In fact, the median represented monoculture, most of which contained C. citriodora only.
Slope tree stands had higher species richness, with a mean and median of 2.7 and 2.0 species, respectively, which were significantly higher than those on the verges, as confirmed by t-test (t = 2.13, p < 0.05) and Mann-Whitney U test (U = 2,152, p < 0.001). As shown in the species composition charts, more species occurred in the slope tree stands (Fig. 4). But even after excluding the monocultures, the mixed-species stands only had a mean and median richness of 3.6 and 3.0 species, respectively. Optimistically speaking, much potential work can be conducted to enhance tree biodiversity in transport land use.
Biodiversity indices, namely, diversity (H), dominance (D), and evenness (J), were computed. These indices were related to species richness, and thus showed some similarities (Fig. 3c-f). Compared with those on verges, slope tree stands were significantly more diverse (t = 6.81, U = 2,168), less dominated by certain species (t = −5.65, U = 605), and more even (t = 6.71, U = 2,155).

Fig. 4 (species composition charts) The pies placed on the right were an extension of the slice representing the species with relatively low frequency in the main pie
Tree stands on verges had obvious influences on the overall data set, as both showed identical median values, namely, H = 0, D = 1, and J = 0, which indicated monoculture. Complete dominance by a single species, mostly C. citriodora, was reflected by D = 1. Such tree stands may be the priority areas for biodiversity enhancements. However, if the goal for urban tree monitoring was to enhance biodiversity, species richness would be a simple and reliable parameter. In this case study, both species richness and biodiversity indices of tree stands suggested higher biodiversity on slopes. For brevity, species richness alone may be an adequate variable in urban forest monitoring. In this sense, the field work of urban forest monitoring can be streamlined. Biodiversity indices, which are derived quantities, could be computed afterwards if needed.
Among the 23 identified species, 10 were common in Hong Kong (Jim, 2008). In other parts of the world, past research also found limited species richness. For example, only 11 tree species were found along road verges of South Africa (O'farrell & Milton, 2006). It would be easy to locate guidelines and materials for the management of common species. However, the management objectives of roadside green space may evolve with time. For instance, biodiversity improvement has been increasingly valued in a mission to optimise roadside habitats for the provision of ecosystem services (O'Sullivan et al., 2017). Monospecific stands may be converted into mixed-species stands (Juchheim et al., 2020). Nevertheless, clear objectives and a management plan must be formulated before putting biodiversity improvement plans in effect. Visual amenity, soil erosion control, noise interception, and various other ecosystem services could be ushered by suitable species selection and target-oriented maintenance. In the case of transport land use with trees in great abundance, low-maintenance plantings may be desired in order to meet drivers' expectations (Booze-Daniels et al., 2000;Wolf, 2006). If a more diverse selection of tree species is desired, corresponding management guidelines must be extended to cater for less common species. In this research, the species with relatively low frequencies could be surveyed for their health and vigour, which may serve as an indication of their suitability in future use in similar habitats. With sufficient empirical evidence, the tree stock of transport land use may contain a greater diversity of species in the future.
Associations among habitat type, monoculture status, and tree risk ratings
The relationships between various urban forest properties were examined. A highly significant association (χ² = 31.90, p < 0.001) was found between habitat type and monoculture status. The significance was caused by the prevalence of monocultures on the verges. The observed count of monospecific stands on verges (48) well exceeded the expected count (34) (Table 1(a)). Conversely, slopes had a higher observed count of mixed-species stands (33) than expected (19). Different tree species possess different physiological traits. Some species are more prone to structural failure. Habitat-biodiversity associations can be useful for fine-tuning monitoring parameters. For example, C. citriodora monocultures were common on verges. This species is susceptible to branch failure. In order to safeguard road users and traffic, related aspects can be the focus of tree risk monitoring with detailed parameters, such as branch attachment angle, localised disorders and decay on branches, and so on. But for the slope trees, such a level of detail may be unnecessary.
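The habitat-by-monoculture association can be checked with a short, standard-library Python sketch. Two assumptions are made and should be read as such: the two cell counts not quoted in the text (4 mixed-species verges and 20 monospecific slopes) are derived from the reported totals (52 verges, 53 slopes, 68 monospecific and 37 mixed-species stands), and a Yates continuity correction is applied. The text does not state that a correction was used, although it reproduces the reported statistic.

```python
def yates_chi_square_2x2(a, b, c, d):
    """Chi-square statistic with Yates correction for the table [[a, b], [c, d]].

    Returns the statistic and the table of expected counts.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))

    # Expected count = row total * column total / grand total
    expected = [[rows[r] * cols[k] / n for k in range(2)] for r in range(2)]

    chi2 = 0.0
    for r in range(2):
        for k in range(2):
            # Yates correction shrinks each absolute deviation by 0.5
            deviation = max(abs(observed[r][k] - expected[r][k]) - 0.5, 0.0)
            chi2 += deviation ** 2 / expected[r][k]
    return chi2, expected


# Rows: verges, slopes. Columns: monospecific, mixed-species.
# 48 and 33 are quoted in the text; 4 and 20 are derived from the totals.
chi2, expected = yates_chi_square_2x2(48, 4, 20, 33)
```

Under these assumptions, the expected counts round to 34 monospecific verges and 19 mixed-species slopes, and the statistic is approximately 31.90, in line with the values reported above.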
Moreover, monoculture status and tree risk rating were significantly associated (χ² = 23.94, p < 0.001) (Table 1(b)). Divergence was found between the observed and expected frequencies of mixed-species stands with moderate tree risk, at 18 and 8, respectively. Also, monoculture had a greater observed count (64) than expected (54). In brief, monospecific tree stands were linked to high tree risks, while mixed-species stands were associated with a moderate rating. Tree monitoring is a target-oriented practice (Harris et al., 2004). In this case study, as monospecific stands were associated with high tree risk, a shorter-than-normal monitoring interval can be applied. But for mixed-species stands, a regular interval can be maintained.
Last but not least, by extending the two significant associations discussed above, habitat type and tree risk rating were significantly associated (χ² = 24.86, p < 0.001) (Table 1(c)). Slopes were associated with moderate tree risk, whose observed frequency (22) doubled the expected one (11). The tree risks of all verges were rated as high, thus deviating completely from the expected count of the moderate class (11). The associations elaborated in this section corroborated one another.
Tree risk rating is highly relevant to transport planning. Tree failure occurring along a highway could cause traffic accidents and disruption. In this case study, consistent associations were found among risk rating, species composition, and habitat type. Monitoring results can inform arboricultural works. At the moment, based on the results, relatively more resources can be spent on tree risk mitigation measures for monospecific stands on verges. In the future, the monitoring plan for mixed-species stands on roadside slopes could focus on environmental parameters leading to the accumulation of tree risks. But, for monospecific stands, more monitoring efforts can be spent on the examination of tree structural conditions.

Predicting and detecting the presence of monospecific stands
Binomial logistic regression was used for predicting the occurrence of monospecific stands. In the initial model, there were only two significant predictors, namely, maximum tree height (p < 0.01) and habitat type (p < 0.05). Tree abundance and risk rating were insignificant predictors. The log odds were exponentiated for more intuitive interpretation here (Table 2). Each one-metre increase in maximum tree height would multiply the odds of the occurrence of monoculture by 1.22. In terms of habitat type, the odds of a verge being monospecific were 5.34 times greater than those of a slope. In more general terms, a verge with tall trees would be expected to be monospecific, such as those with C. citriodora in this case study. Such a finding is helpful in situations where species composition has to be estimated and only ground-based or remote-sensing-based measurements of tree height and land use type are available.
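Because an odds ratio acts multiplicatively, the per-metre figure can be compounded over larger height differences. The Python sketch below illustrates this with the reported odds ratio of 1.22. The baseline odds of 1.0 are purely hypothetical, since the model intercept is not quoted in the text, and the function names are illustrative.

```python
def odds_multiplier(odds_ratio_per_unit, delta):
    """Factor by which the odds change when a predictor increases by `delta` units."""
    return odds_ratio_per_unit ** delta


def odds_to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


HEIGHT_ODDS_RATIO = 1.22   # reported odds ratio per metre of maximum tree height
BASELINE_ODDS = 1.0        # hypothetical baseline: a 50% chance of monoculture

# A stand whose tallest tree is 5 m taller than the hypothetical baseline stand
multiplier_5m = odds_multiplier(HEIGHT_ODDS_RATIO, 5)
probability_5m = odds_to_probability(BASELINE_ODDS * multiplier_5m)
```

With these illustrative inputs, a 5 m difference multiplies the odds by about 2.70, which would raise a hypothetical 50% probability of monoculture to roughly 73%. The habitat odds ratio could be applied in the same multiplicative way.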
In the present study, boosting enhanced the performance of the binary logistic model to a certain extent (Table 3). With or without boosting, the models exhibited an identical accuracy of 0.81, and McNemar's test p-values exceeding 0.05. However, the boosted model was more reliable, as evinced by its higher kappa of κ = 0.63. In other words, a greater proportion of the correct predictions made by the boosted model could be attributed to the model's capabilities instead of random chance. Consequently, the boosted model was preferred in predicting the occurrence of monoculture.
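The relationship among accuracy, kappa, sensitivity, specificity, and balanced accuracy can be illustrated with a small Python sketch. The confusion matrix used here is hypothetical: it is one set of counts that happens to be consistent with the figures reported for the initial model, and it is not taken from Table 3. Monospecific is treated as the positive class, which is also an assumption.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute common classification metrics from a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # correctly detected positives
    specificity = tn / (tn + fp)   # correctly detected negatives
    balanced_accuracy = (sensitivity + specificity) / 2

    # Agreement expected by chance, from the actual and predicted class totals
    chance = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
    kappa = (accuracy - chance) / (1 - chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Hypothetical counts: 17 monospecific stands all predicted correctly,
# 4 of 9 mixed-species stands predicted correctly.
initial = binary_metrics(tp=17, fn=0, fp=5, tn=4)
```

For these hypothetical counts, accuracy rounds to 0.81, kappa to 0.51, and balanced accuracy to 0.72. The example shows how a model can reach a fairly high accuracy while kappa stays moderate, because much of the agreement could arise by chance when one class dominates.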
The accuracy of classifying a stand as monospecific or mixed-species was examined. The initial model was entirely correct in predicting the presence of monoculture but had noticeable inaccuracies in predicting mixed-species sites (Table 3). This translated into a sensitivity and a specificity of 1.00 and 0.44, respectively. In contrast, the boosted model had remarkable accuracy in predicting mixed-species stands, as shown by the specificity of 1.00, but less so for predicting monospecific stands, as reflected by a sensitivity of 0.69. Still, in terms of balanced accuracy, the boosted model (0.84) surpassed the initial model (0.72). All in all, boosting, as an ensemble method, raised the predictive accuracy. Before improving or replacing monocultures, a good detection mechanism would be necessary. The present findings corroborated past studies showing that variables related to forest structure could predict tree species diversity. For instance, the present study echoed the work by Hakkenberg et al. (2016) in that maximum tree height acted as a strong predictor. However, an alternative perspective was provided due to a divergence in the direction of the effect of maximum tree height. In past research, species diversity (Hakkenberg et al., 2016) and species mixing (Pommerening & Uria-Diez, 2017) increased with maximum tree height. In contrast, according to the modelling results in this research, the probability of the presence of monoculture increased with maximum tree height.
The possible underlying causes of such differences were identified. The effect of maximum tree height on species diversity should be interpreted in tandem with other variables. No matter how variable the growth rates of different tree species in an urban forest are, tree height increases with time. Thus, past studies accounted for the effects of stand age and history (e.g. Hakkenberg et al., 2016). During forest stand development, new species might move in as a form of spontaneous vegetation growth, while the existing trees accumulated height increments. For the present case study, however, such linkages explaining the positive relationship between maximum height and species diversity might be inapplicable.
In this case study, all stands were located along the same highway. Site establishment history or age posed little or no confounding effect because the landscaping works were carried out in the same phase. Instead, the monospecific stands were dominated by mature C. citriodora, whose maximum tree heights were likely to exceed those of trees in mixed-species stands. As a result, the tie between the tall specimens of C. citriodora and monospecific stands contributed to the positive coefficient of maximum tree height in the model (Table 2). This research lacked other key tree dimensions such as diameter at breast height or crown spread. If such dimensional parameters had been included, they would be expected to show a positive effect on the odds of the occurrence of monospecific stands in the context of this research.
Tree monitoring can benefit from the results of this large-scale tree survey. In this case study, maximum tree height was a significant predictor of the probability of a stand being monospecific. This study demonstrated a possible workflow for the identification of the relationship between species composition, tree risk, and tree performance. In the context of this study, transport land use accommodates a substantial proportion of urban green space. A major difficulty in the monitoring of highway trees is the low accessibility due to heavy traffic. Taking tree risk monitoring as an example, when an assessor observes from afar a stand composed mostly of tall trees, such a stand could be considered a monospecific C. citriodora stand. Since tree risk and performance are related to tree species and size, the assessor can fill in the assessment items accordingly without entering the roadside greening area.
Table 2 Estimates of the log-odds for a unit change in the predictors in the initial logistic regression model for predicting whether a stand would be monospecific or mixed-species. For easier interpretation, the log-odds values were exponentiated. If the resulting odds ratio was greater than one, a unit increase in the predictor would increase the probability of a stand being monospecific, and vice versa. Predictors which were significant at α = 0.05 were underlined. Habitat and risk rating were categorical variables. The log-odds of a stand being on a slope against a verge were presented. The log-odds of a moderate risk rating against a high risk rating were displayed.
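To illustrate how exponentiated log-odds are read, the following Python sketch uses the two odds ratios stated in the abstract (1.22 per metre of maximum tree height, and 5.26 for a verge against a slope). The baseline odds are hypothetical, chosen only to show the conversion from odds to probability, and the function names are illustrative rather than part of the study's workflow.

```python
import math

OR_HEIGHT_PER_M = 1.22     # odds ratio per metre of maximum tree height
OR_VERGE_VS_SLOPE = 5.26   # odds ratio for verge against slope

def odds_multiplier(extra_height_m, on_verge):
    """Factor by which the odds of a monospecific stand change.

    Log-odds add up, so odds ratios multiply.
    """
    log_odds = extra_height_m * math.log(OR_HEIGHT_PER_M)
    if on_verge:
        log_odds += math.log(OR_VERGE_VS_SLOPE)
    return math.exp(log_odds)

def odds_to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1 + odds)

# Hypothetical baseline odds for a reference stand on a slope (assumption).
BASELINE_ODDS = 0.25

taller_on_verge = BASELINE_ODDS * odds_multiplier(10, on_verge=True)
print(f"odds multiplier for +10 m on a slope: {odds_multiplier(10, False):.2f}")
print(f"probability, +10 m on a verge: {odds_to_probability(taller_on_verge):.2f}")
```

The reciprocal of the verge-against-slope ratio gives the slope-against-verge ratio, which is the direction in which the habitat effect is presented in Table 2.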
In the future, advanced techniques, such as unmanned drones and satellite images, can be utilised for tree height mapping. Areas with exceptionally tall trees may feature low species diversity. Field surveys can then be conducted in such stands, which may become priority zones for tree biodiversity enhancement. Subsequently, tree monitoring with special attention to biodiversity-related parameters can be conducted.
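The screening step described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the stand records, the field names, and the 20.0 m height threshold are all hypothetical, and a real workflow would derive heights from drone or satellite products and calibrate the threshold against field data.

```python
from dataclasses import dataclass

@dataclass
class Stand:
    stand_id: str        # hypothetical identifier
    habitat: str         # "verge" or "slope"
    max_height_m: float  # maximum tree height from a height map

def prioritise_for_survey(stands, height_threshold_m=20.0):
    """Flag stands at or above the threshold, tallest first.

    The threshold is an assumption for illustration, not a value
    recommended by the study.
    """
    flagged = [s for s in stands if s.max_height_m >= height_threshold_m]
    return sorted(flagged, key=lambda s: s.max_height_m, reverse=True)

# Hypothetical stand records.
stands = [
    Stand("V01", "verge", 24.5),
    Stand("S01", "slope", 12.0),
    Stand("V02", "verge", 21.0),
    Stand("S02", "slope", 20.0),
]

priority_ids = [s.stand_id for s in prioritise_for_survey(stands)]
print(priority_ids)
```

Stands flagged in this way would still need a field survey to confirm their species composition before any enhancement works are planned.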

Conclusions
This case study featured a large-scale tree survey. One hundred and five tree stands along a busy highway in Hong Kong were investigated: 53 slopes and 52 verges were randomly selected and surveyed. Each tree stand was characterised by abundance, species, maximum height, and biodiversity indices. A total of 7209 trees were found on the surveyed sites. Monitoring such an enormous tree stock must be effective. Significant differences were discovered in the structure and species composition of tree stands between the two habitat types, namely verge and slope, suggesting the possibility of optimising the allocation of resources in monitoring. The maximum tree height of the verges, which could exceed 20.0 m, was significantly greater than that of the slopes, but the slopes showed higher biodiversity as reflected by the diversity, dominance, and evenness indices. With possible biodiversity enhancement works in urban forests in the foreseeable future, the monitoring focus of tree stands can be adjusted based on the biodiversity situation. Chi-square tests found significant associations of the verges with higher tree risk and monoculture. These associations were attributed to the prevalence of mature C. citriodora. The logistic regression model indicated that the likelihood of a monoculture would increase with maximum tree height and would be greater when the habitat type was verge. Such findings are helpful when species composition has to be estimated from ground-based or remote-sensing-based measurements that cover only tree height and land use type. However, the effects of the predictors differed from those reported in past research. Within the context of this case study, tree stock composition and planting history could be the possible explanations. Boosting increased not only the balanced accuracy of the regression model but also its reliability.
This research demonstrated how a comprehensive analysis of the results of a tree survey can streamline urban forest monitoring. In future research, more parameters could be sampled and monitored for the detection of possible monospecific tree stands. Other advanced techniques could be used in tandem to facilitate tree monitoring and management in urban transport land use.