Urban biotope classification incorporates urban forest and green infrastructure for improved environmental land-use planning in Mexico City

Urban forests are recognized worldwide as the most critical component of green infrastructure due to their capacity to provide various environmental goods and services. As cities continue to expand and their environmental problems intensify, there is a growing need for urban forests and green infrastructure to be better incorporated into strategic land-use planning, especially in developing cities. The first step in building an urban forest management plan is to capture characteristics of the urban forest and how these change across the built environment. Here, we used an urban biotope approach to classify urban forests and environmental characteristics in Mexico City. We sampled 500 fixed-area randomly stratified plots across the city to characterize urban forest structural and compositional variables. PCA and the broken-stick method were used to reduce 25 urban forest variables to five significant principal components that accounted for 78% of the data's cumulative variation. Ward's method helped classify biotopes into a hierarchical system with seven finer-level biotopes defined by urban forest characteristics (Dunn = 0.09, AC = 0.98), nested within two broader-level biotopes defined by forest canopy conditions (Silhouette = 0.59, AC = 0.99). A no-tree canopy biotope was extracted from sampling locations with no trees. The biotopes derived here can underpin biotope mapping and support decision-making in urban forest planning, including the identification of available planting spaces, tree diversity targets, and canopy protection. Our work in Mexico City demonstrates how the biotope approach can be adapted and used to better incorporate urban forests and green infrastructure into future management planning for any city.


Introduction
Urban areas are a mosaic of different land-use types with varying extents of developed lands and green areas. Within the urban fabric, there are patchworks of natural, seminatural, anthropogenic, and planted vegetation that differ in their origin, structure, composition, and functions. Various plants, from herbs and shrubs to trees, are distributed across parks, public green spaces, green corridors, roofs, vertical greening, private gardens, lawns, and streets (Ahern 2007). Trees, collectively referred to as urban forests, are the most significant vegetation and green infrastructure component (Samson et al. 2019). Due to the size, biology, and longevity of their trees, urban forests provide diverse ecological functions and services that are directly related to the forest composition, structure (Nowak and Crane 2000; Nowak et al. 2016; Samson et al. 2019), distribution, and variability over urban space (Escobedo and Nowak 2009). The urban forest has been defined differently across geographical regions. In this study, the urban forest included all trees across different land uses, from individual trees to groups of trees, private and public trees, and trees planted or in remnant ecosystems, similar to definitions used in North America and some European countries (Nowak et al. 2001; Alvey 2006; Konijnendijk et al. 2006; Tree Canada 2019).
Urban forest characteristics and functions and their variation are driven by the characteristics of the built environment (Bourne and Conway 2014; Steenberg et al. 2015). Variation of urban forest composition and structure has been explored 1) as a function of development along urban gradients (Burton et al. 2005; Ortega-Álvarez et al. 2011; Bourne and Conway 2014); 2) in relation to land use categories (Sudha and Ravindranath 2000; Dobbs et al. 2013; Fan et al. 2019); or 3) as a function of urban socio-economic heterogeneity (Iverson and Cook 2000; Conway and Bourne 2013). Additionally, classifications of ecologically relevant spatial units across a landscape are developed to understand the spatial patterns of the urban forest (Cadenasso et al. 2007; Steenberg et al. 2015). Ecologically based classifications that combine information on vegetation (i.e., urban forest), built surfaces, and surface materials are shown to capture the urban land and ecological heterogeneity (Cadenasso et al. 2007), and enable strategic integration of urban forest planning and management into land-use planning (Steenberg et al. 2015).
In recent decades, there has been a worldwide surge in diverse green initiatives that focus on incorporating ecological issues into urban planning (Niemelä 1999), including the multifunctional landscape assessment tool (Lovell and Johnstone 2009), the life cycle assessment approach (Lovell and Taylor 2013), the iTree methodology, the urban forest ecosystem classification (Steenberg et al. 2015), and the urban biotope approach (Sukopp and Weiler 1988), among others. Among the ecologically driven approaches for defining, capturing, and mapping the relationship between urban vegetation and its environments, we chose the biotope approach as it can be applied to both natural and built environments (Sukopp and Weiler 1988; Löfvenhaft et al. 2002), and is suitable for spatially heterogeneous landscapes. Biotopes are land units defined based on a combination of vegetation and environmental characteristics and are practical, flexible, and applicable for incorporating ecology into land use, spatial planning, and relevant decision-making (Löfvenhaft et al. 2002).

Urban biotopes
A biotope (bio = life, topos = place) is an area with relatively uniform environmental conditions that support a specific assemblage of plants and animals (Sukopp and Weiler 1988). Originally, biotope classifications were developed and applied to guide the environmental protection of natural and seminatural ecosystems in Germany. In the last decades, the biotope concept has been extended and applied to spatial planning in urban areas, consequently termed "Urban biotopes" (Sukopp and Weiler 1988). Besides Germany (Sukopp and Weiler 1988; Maurer et al. 2000), several countries such as New Zealand (Freeman and Buck 2003; Stewart et al. 2009), Sweden (Cousins and Ihse 1998; Löfvenhaft et al. 2002; Gao et al. 2012), Turkey (Mansuroglu et al. 2006), the UK (Jarvis and Young 2005), Brazil (Weber and Bedê 1998), China (Lu and Wang 2018), and South Korea (Hong et al. 2005) have applied the biotope approach to land use and spatial planning in anthropogenic and built environments. To date, the biotope approach has not been applied in North and Central American cities.
Biotopes are readily mappable land units that have found applications in landscape and urban planning, and have been used to inform land-use planning, support landscape monitoring programs, develop and evaluate land-use policies, and plan for biodiversity enhancement and management at different scales (Cousins and Ihse 1998; Löfvenhaft et al. 2002; Freeman and Buck 2003; Gao et al. 2012). Traditionally, biotopes have been mapped based on visual interpretations of aerial imagery and various environmental properties of a site (e.g., soil, vegetation). Such mapped units are further described using field data from vegetation sampling (Cousins and Ihse 1998; Freeman and Buck 2003; Mansuroglu et al. 2006; Gao et al. 2012). This top-down approach relies on manual or automatic interpretation of boundaries in areas with detailed vegetation classification, soil information, mapping, and field data. However, for areas where such detailed information does not exist (Heiden et al. 2003), such as Mexico City, there is a need for an alternative approach. Here we investigate a bottom-up approach, wherein field data are first collected, then combined with readily available spatial data and remotely sensed information, and processed using statistical modeling. The outcome of this classification approach can be refined as more field or spatial data are acquired. This approach also enables the integration of different sets of data (e.g., physical, ecological, socioeconomic), allowing biotope classes to be derived based on specific needs.
The urban landscape of Mexico City is a suitable setting to develop and test this statistically based, data-driven biotope classification approach due to its spatial heterogeneity determined by its built-up physical characteristics, built-up density, and land cover classes (Taubenböck et al. 2008). The city has 16 administrative boroughs which differ in their socioeconomic (Fernández-Álvarez 2017) and ecological characteristics measured by canopy cover (PAOT 2010; Bravo-Bello et al. 2020). Across its developed and built-up lands, Mexico City lacks city-wide urban forest information and strategic spatial planning. The existing urban forest and green space planning is done site by site and addresses specific needs related only to public lands (Programa General de Desarrollo del Distrito Federal 2013). In addition, to date there is no city-wide urban forest research. The existing urban forest-related research in Mexico City has targeted a fraction of the urban forest. For example, Ortega-Álvarez et al. (2011) examined the urban forest in the northwest part of the city looking at specific land-use types. Research on the "Bosque de Chapultepec", the oldest and largest urban park in Latin America, looked at the dendrological characteristics of the stands (Benavides Meza and Fernández Grandizo 2012). The urban forest of the neighborhood "Escandón" was targeted due to its impact on high CO2 emissions (Velasco et al. 2014). While fragments of research and urban forest management exist in Mexico City, there is no overall assessment of the urban forest nor an understanding of its composition and structure across the entire urbanized area. To address this knowledge gap and to explore how urban forest characteristics are associated with urban structure at the city-wide scale, we developed a biotope classification by combining biotic (urban forest) and abiotic (environmental) information.
The selection of urban forest and environmental characteristics variables was dependent on their use in previous urban forest classifications and available data for the study area. Urban forest variables included structural and compositional characteristics, and environmental variables included surface type (impervious and pervious), indicators of land-use intensity, and soil types.
The objective of this study was to develop a statistically based and adaptable urban biotope classification by combining structural and compositional characteristics of the urban forest and environmental characteristics of Mexico City. We expected that urban forest characteristics would be strong drivers of urban biotopes. Specifically, we aimed at 1) characterizing the compositional and structural characteristics of the urban forests to support biotope classification; 2) deriving urban biotopes by integrating urban forest and environmental characteristics; and 3) interpreting and characterizing the derived urban biotope classes.

Study area
The study area is Mexico City, located in central Mexico (19.4326° N and 99.1332° W). The city covers an area of 1,494 km², of which 42% is urban development (790 km²) while the remaining 58% is still unurbanized and under conservation lands (Fig. 1). The unique combination of two extreme environments within the city resulted in most studies focusing on conservation lands and natural vegetation (e.g., González-Hidalgo et al. 2001; Castillo-Argüero et al. 2004), while only a handful of studies targeted urbanized land (Ortega-Álvarez et al. 2011; Velasco et al. 2014). The boundary of the study area was determined by the urban development area of the city defined by the Ministry of Urban Development and Housing (SEDUVI 2003) as areas with high population density, large proportions of built-up and impervious surfaces, traffic, and a variety of industrial activities. Therefore, the present study excludes the conservation lands and focuses on the urbanized part of Mexico City and its urban forest.
Within the study area, residential land use is predominant (34% of the urban area), followed by transportation networks (19%), mixed residential-commercial (15%), green areas and open spaces (15%), urban services (9%), and industry (3%). The remaining marginal land use (5%) is regulated through special programs of urban development (SEDUVI 2003). Tree canopy cover for the urbanized part of Mexico City is 10.6% (Bravo-Bello et al. 2020) and is unevenly distributed across the city's 16 boroughs (autonomous administrative units). Boroughs located in the west and south parts of the city have the highest canopy cover (18 to 26%), while boroughs distributed in the north and east parts have less than 8% canopy cover (PAOT 2010; Bravo-Bello et al. 2020).
Mexico City is one of the highest-elevation cities in the world, located at 2,240 m asl. It lies in a geologically and ecologically complex area termed the Trans-Mexican Volcanic Belt, where the Neotropical and Nearctic biotas overlap (Morrone 2010). The northeast and east parts of the city are situated on a lacustrine plateau, and the south and west parts have volcanic slopes with extrusive igneous substrates. The types of soil in Mexico City are Andosol, Lithosol, Phaeozem, and Solonchak. Andosols are volcanic soils used in agriculture and are distributed in the south of the city. Lithosols are shallow and rocky soils, also found in the south. Phaeozems have a humus-rich surface layer and are the dominant type across the city. Solonchaks, which are saline soils, are found in the east part of the city (INEGI 1999). The climate is mostly temperate with dry winter conditions (Cwb). The annual average temperature for the region is 16 °C, ranging from a maximum average of 27 °C registered in the warmest months (March to May) to a minimum average of 3 °C registered from November to January. The area experiences a rainy season from June to September, when it receives about 73% of the average annual precipitation of 625 mm.

Data sources
Data to support the classification of urban biotopes is a combination of urban forest field and spatial data and environmental data. Urban forest data included 25 urban forest structural and compositional variables derived from field sampling, and tree canopy cover derived from supervised classification of remotely sensed images. Environmental data was represented by eight environmental variables that were: the percentage of impervious surface and pervious surface, the density of roads (km/ha) and dwellings (dwellings/ha), and soil types (Andosol, Lithosol, Phaeozem, and Solonchak). Urban forest and environmental data will be explained in detail in the following sections.
Field sampling was guided by a stratified random sampling design to ensure efficient field sampling across Mexico City's urban forest conditions and heterogeneous urban structure (i.e., built-up physical characteristics, built-up density, land cover classes) (Taubenböck et al. 2008). The city was divided into a 1-ha hexagon grid and each hexagon was assigned land use, census, and canopy cover information. The decision to use 1-ha hexagons was based on the spatial resolution of the available satellite imagery for this study (RapidEye imagery, 5 m pixel; Sentinel-2, 10 m pixel; and Landsat, 30 m pixel) (RapidEye 2012; Landsat 8 OLI 2019; Copernicus Sentinel-2 2021). With high-resolution imagery (e.g., 1 m pixel), a smaller hexagon size is more adequate. Conversely, for low-resolution imagery (e.g., 1 km pixel), a 1-ha hexagon may yield mixed pixel values, thus reducing the accuracy of the measured variables. Using k-means clustering, hexagons were grouped into eleven strata. From the stratified hexagons, 500 were randomly selected using the sampling design tool (Buja and Meza 2012) for ArcGIS 10.4.1 (ESRI 2016).
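The stratified random selection of hexagons can be illustrated with a minimal sketch. It assumes proportional allocation across strata, which the text does not state, and uses a hypothetical grid of 1,100 hexagons; the function name `stratified_sample` and all values are illustrative and are not part of the ArcGIS sampling design tool.

```python
import random
from collections import defaultdict


def stratified_sample(strata_by_hexagon, n_total, seed=0):
    """Draw roughly n_total hexagon IDs, allocated in proportion to stratum size.

    strata_by_hexagon maps hexagon ID -> stratum label (e.g. a k-means cluster).
    Proportional allocation is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for hex_id, stratum in strata_by_hexagon.items():
        groups[stratum].append(hex_id)

    n_hex = len(strata_by_hexagon)
    sample = {}
    for stratum, members in sorted(groups.items()):
        # At least one plot per stratum, never more than the stratum holds.
        k = max(1, round(n_total * len(members) / n_hex))
        k = min(k, len(members))
        sample[stratum] = rng.sample(sorted(members), k)
    return sample


# Hypothetical grid: 1,100 hexagons spread evenly over eleven strata.
hexagons = {i: i % 11 for i in range(1100)}
plots = stratified_sample(hexagons, n_total=55)
```

Because allocations are rounded per stratum, the realized sample size can differ slightly from `n_total` when strata are uneven.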
Sampling urban trees in the entire 1-ha area was not feasible due to the time and costs associated with sampling, and thus, in the center of each selected hexagon, a sampling plot was established where urban forest measurements were completed. A plot size of 400 m², commonly utilized in urban forest assessments (e.g., United States Forest Service; Nowak and Crane 2000), was selected to capture urban forest structure and composition. Within each fixed-area plot, all trees and shrubs with a diameter at breast height (DBH) ≥ 5 cm were identified to the species level and their DBH and canopy width were measured (Nowak and Crane 2000). Of the 500 originally targeted sampling plots, 320 (64%) plots that contained trees were used for further analysis. Of the 180 field plots without trees, 37 were within 1-ha hexagons without tree canopy cover and were used to derive the no-tree canopy cover biotope class.
From urban forest field measurements, 25 variables representing urban forest composition and structure were derived. Compositional variables were overall species richness; richness of native and introduced species; richness of tropical, subtropical, and temperate species; and richness of evergreen and deciduous tree species. Structural variables included the number of trees, basal area (BA) (m²), and canopy cover (m²); and BA and canopy cover of native, introduced, tropical, subtropical, temperate, evergreen, and deciduous species per plot. These variables were selected because they are deemed to be relevant descriptors of urban forest composition and structure (Nowak and Crane 2000; Pataki et al. 2013) and have applications in urban forest planning and management and biotope classifications (Freeman and Buck 2003; Gao et al. 2012).
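As an illustration of how plot-level variables of this kind can be derived from tree measurements, the following sketch computes basal area from DBH with the standard circle-area formula and summarizes one plot. The circular-crown assumption for canopy area, the record layout, and the example trees are hypothetical; the text does not describe how canopy area was computed from canopy width.

```python
import math

MIN_DBH_CM = 5.0  # inclusion threshold stated in the text


def basal_area_m2(dbh_cm):
    """Stem cross-sectional area in m² from DBH in cm."""
    return math.pi * (dbh_cm / 200.0) ** 2


def crown_area_m2(canopy_width_m):
    """Crown area in m², assuming a circular crown (an assumption of this sketch)."""
    return math.pi * (canopy_width_m / 2.0) ** 2


def summarize_plot(trees):
    """Derive a few structural and compositional variables for one plot."""
    kept = [t for t in trees if t["dbh_cm"] >= MIN_DBH_CM]
    return {
        "n_trees": len(kept),
        "richness": len({t["species"] for t in kept}),
        "native_richness": len({t["species"] for t in kept if t["native"]}),
        "basal_area_m2": sum(basal_area_m2(t["dbh_cm"]) for t in kept),
        "canopy_m2": sum(crown_area_m2(t["canopy_width_m"]) for t in kept),
    }


# Hypothetical plot with placeholder species labels.
plot = [
    {"species": "species A", "native": True, "dbh_cm": 20.0, "canopy_width_m": 4.0},
    {"species": "species B", "native": False, "dbh_cm": 35.0, "canopy_width_m": 6.0},
    {"species": "species B", "native": False, "dbh_cm": 3.0, "canopy_width_m": 1.0},
]
summary = summarize_plot(plot)
```

The third record falls below the 5 cm threshold and is excluded from every variable.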
Urban tree canopy cover is the most common measure of urban forest extent and variation of tree canopy across cities (Nowak et al. 1996). Tree canopy for Mexico City was derived using 2012 RapidEye satellite imagery. Specifically, the Normalized Difference Vegetation Index (NDVI) as well as the Red, Green, and Blue bands were used in supervised classification. Based on validation data, the classification accuracy of tree canopy had a value of kappa = 0.79, which is considered a good classification (Jensen 2005). Tree canopy derived from RapidEye enabled estimating percent canopy cover per 1-ha hexagon (using zonal statistics in ArcGIS 10.4.1, ESRI 2016).
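For readers unfamiliar with the two quantities named here, a minimal sketch of NDVI and Cohen's kappa follows. The formulas are the standard ones; the confusion matrix is hypothetical and is not the validation data behind the reported kappa of 0.79.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from near-infrared and red reflectance."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def cohen_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (rows: reference, columns: classified)."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class validation (tree canopy vs. other), 100 points.
confusion = [[45, 5],
             [5, 45]]
kappa = cohen_kappa(confusion)
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than the raw proportion of correctly classified points.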
Environmental variables used to describe biotopes were (Stewart et al. 2009; Conway and Bourne 2013; Steenberg et al. 2015): the percentage of impervious surface and pervious surface, the density of roads (km/ha) and dwellings (dwellings/ha) (as per census data characterizing households), and soil types (Andosol, Lithosol, Phaeozem, and Solonchak). Data on impervious and pervious surfaces were obtained from the National Commission for the Knowledge and Use of Biodiversity (CONABIO 2016). Road density was calculated from a road network spatial layer (PAOT 2010). Dwelling density was calculated using census data (INEGI 2015). Soil type information was retrieved from the national soil mapping (INEGI 1999). To avoid correlated variables, the eight environmental variables were evaluated in a Pearson correlation matrix. Pairs of variables with significant correlations (p < 0.05) were identified, and one variable of each pair was removed. Significant correlations were identified between the percentage of impervious surface and road density (r = 0.48) and between Phaeozem and Lithosol (r = -0.7). Within correlated pairs, the impervious surface percentage was retained as it captures information on roads and other structures and to some extent provides information on tree growing space (Nowak et al. 2004). Phaeozem was retained as Lithosol occurs in a very limited part of the study area. Each 1-ha hexagon was assigned the values of the six uncorrelated environmental variables.
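The correlation screening can be sketched as follows. The Pearson coefficient is computed directly; the p-value uses a normal approximation to the t statistic, which is a simplification adopted here and is reasonable only for large samples. The sample size of 320 is an assumption, since the text does not state how many records entered the correlation matrix.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for r, approximating the t distribution by a normal one."""
    if abs(r) >= 1.0:
        return 0.0
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Reported coefficient for impervious surface vs. road density; n is assumed.
p_impervious_roads = approx_p_value(0.48, n=320)
flag_as_correlated = p_impervious_roads < 0.05
```

An exact test would use the t distribution with n - 2 degrees of freedom, which the Python standard library does not provide.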

Urban forest variables
The 25 variables depicting urban forest structure and composition were analyzed using Principal Components Analysis (PCA). PCA was used to reduce the dimensionality of the data and derive new urban forest variables while retaining as much of the variation in the original data as possible (Jolliffe 2002). The number of components to extract and the significance of loadings with their corresponding principal component were determined based on the broken-stick method, where PCA axes and loadings with percentages of variance larger than the broken-stick variances were considered significant (Jackson 1993; Peres-Neto et al. 2003). The broken-stick method was performed in R using the "PCAsignificance" function in the "BiodiversityR" package (Kindt and Coe 2005). Principal components were derived using the "principal" function in the "psych" package (Revelle 2019) for R (R Development Core Team 2019).
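The broken-stick criterion can be made concrete with a short sketch. Under the broken-stick model, the expected proportion of variance for component k out of p is (1/p) times the sum of 1/i for i from k to p. The code compares these expectations with the variance proportions reported in the results for the first five components; it is an independent illustration in Python, not the "PCAsignificance" function, and proportions for the sixth and later components are not given in the text.

```python
def broken_stick(p):
    """Expected variance proportion for each of p components under the broken-stick model."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def significant_components(observed, p):
    """Count leading components whose observed proportion exceeds the expectation."""
    keep = 0
    for obs, exp in zip(observed, broken_stick(p)):
        if obs <= exp:
            break
        keep += 1
    return keep


# Variance proportions reported for PC1 to PC5 (25 input variables).
reported = [0.21, 0.16, 0.15, 0.14, 0.12]
n_keep = significant_components(reported, p=25)
```

With 25 variables the first expectation is about 0.153 and the fifth about 0.069, so each of the five reported proportions exceeds its broken-stick value.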
The output of the PCA was interpreted as new uncorrelated urban forest variables, and component scores were assigned to 1-ha hexagons under the assumption that the PCA scores captured at the plot level are transferable to a hexagon. The advantage of using PCA scores rather than the raw data is that they simplify the description of urban forest structure, and thus the subsequent analysis of relationships between urban forest characteristics and environmental variables, and mitigate the effect of correlation among the original variables (Huang et al. 2001). The principal component scores were used as input in clustering.

Classification of urban biotopes
Given the different units of measurement of variables, each variable was standardized using z-scores (Steenberg et al. 2015). A hierarchical cluster analysis was performed using a data set of 320 records (hexagons) depicting urban forest and environmental variables. Hierarchical agglomerative clustering methods were selected due to their common application in vegetation classifications (Wallace and Dale 2005) and their effective applications when combining biotic and environmental data (Steenberg et al. 2015). Four different widely used agglomerative clustering functions were applied: average, complete, single linkage, and Ward's method. Clustering was computed in R using the "agnes" function of the package "cluster" (Maechler et al. 2019). The performance of the four clustering methods was evaluated using the Agglomerative Coefficient (AC) and considering results with balanced cluster sizes (number of cases per cluster) (Schmidtlein et al. 2010).
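A compact sketch of z-score standardization followed by agglomerative clustering with Ward's criterion is given here for illustration. It is a naive cubic-time implementation written for clarity, not the "agnes" routine, and the six example records (canopy % and impervious %) are hypothetical.

```python
from statistics import mean, stdev


def zscores(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]  # a constant column would give a zero divisor
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if point sets a and b are merged."""
    ca = [mean(c) for c in zip(*a)]
    cb = [mean(c) for c in zip(*b)]
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_clusters(rows, k):
    """Merge clusters greedily by Ward's criterion until k remain; returns index lists."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([rows[m] for m in clusters[i]],
                                 [rows[m] for m in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical hexagons: [canopy %, impervious %].
data = [[5, 95], [8, 90], [10, 92], [60, 30], [65, 35], [55, 40]]
groups = ward_clusters(zscores(data), k=2)
```

Standardizing first keeps variables measured on large scales, such as dwellings per hectare, from dominating the distance computation.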
To determine the final number of clusters, the "clValid" package (Brock et al. 2008) for R was used. The decision of how many clusters to derive was supported by Dunn's Index (Dunn 1974) and the silhouette width (Rousseeuw 1987), because both are examples of non-linear combinations of compactness and separation (Brock et al. 2008). The Dunn Index has a value between zero and infinity and should be maximized (Dunn 1974). The silhouette width ranges from -1 to +1, and values closer to +1 indicate better goodness of clustering (Brock et al. 2008). Using the "aggregate" function of the "dplyr" package (Wickham et al. 2020), the average values of each variable per cluster were obtained and used to characterize biotopes. The classes derived from the cluster analysis were interpreted, described, assigned a biotope class, and named according to their dominant urban forest or built characteristics.
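The two internal validation measures can be sketched with their textbook definitions: the Dunn Index as the smallest between-cluster distance divided by the largest within-cluster distance, and the silhouette width as the mean of (b - a) / max(a, b) over all observations. The code is an illustration in Python rather than the "clValid" implementation, and the four example points are hypothetical.

```python
import math


def dunn_index(clusters):
    """Smallest distance between clusters divided by the largest cluster diameter."""
    min_between = min(math.dist(p, q)
                      for i, a in enumerate(clusters)
                      for b in clusters[i + 1:]
                      for p in a for q in b)
    max_within = max(math.dist(p, q) for c in clusters for p in c for q in c)
    return min_between / max_within


def silhouette_width(clusters):
    """Average silhouette over all points; singleton clusters score 0."""
    scores = []
    for ci, c in enumerate(clusters):
        for idx, p in enumerate(c):
            if len(c) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other members of the point's own cluster.
            a = sum(math.dist(p, q) for j, q in enumerate(c) if j != idx) / (len(c) - 1)
            # b: mean distance to the nearest other cluster.
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for oi, other in enumerate(clusters) if oi != ci)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical two-cluster solution in two dimensions.
solution = [[(0, 0), (0, 1)], [(5, 0), (5, 1)]]
dunn = dunn_index(solution)
silhouette = silhouette_width(solution)
```

Because the two measures weigh compactness and separation differently, they can favor different numbers of clusters for the same data.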

Urban forest characteristics
In total, 1,640 trees were surveyed in 2017 and 2018. Of the sampled trees, 106 species, 72 genera, and 44 families were found. Overall, 70% of all the sampled species were introduced, and 30% were native to the Trans-Mexican Volcanic Belt. Evergreen species represented 64% and deciduous species represented 36% of the tree species. According to species biogeographical origin, most tree species were sub-tropical (45%), followed by temperate and tropical species (27.5% each).
Of the 25 urban forest variables from field data, PCA revealed five significant principal components. The new, uncorrelated, orthogonal principal components (PCs) served as derived urban forest variables. Urban forest variables that contributed significantly to the variance captured by a particular component were used to interpret the principal components as follows: (PC1) Evergreen-subtropical canopy, (PC2) Introduced-evergreen richness, (PC3) Temperate basal area and canopy, (PC4) Deciduous basal area and canopy, (PC5) Tropical basal area and canopy.
The five principal components accounted for 78% of the data's cumulative variation; PC1 and PC2 accounted for 37% of the variation. The PC1 explained 21% of the variation in urban forest characteristics and was most strongly related to the canopy cover of evergreen, subtropical and native trees, and the basal area of subtropical and evergreen trees. The PC2 explained 16% of the variation and was most strongly related to the richness of introduced, evergreen and subtropical species. PC3 accounted for 15% of the variation and was formed by the canopy cover and basal area of temperate trees, the richness of tropical species, the basal area of natives and the number of trees. PC4 explained 14% of the variation and included the canopy cover and basal area of deciduous trees. Finally, PC5 accounted for 12% of the variation and represented the basal area and canopy cover of tropical trees (Table 1).

Urban biotopes classification
The optimal number of clusters was identified at two (Silhouette width = 0.59) and seven (Dunn = 0.09), and both results were evaluated. The uniformity of cluster sizes in clustering solutions varied between clustering methods (Fig. 2). For the solution of two clusters, Ward's method and average linkage produced more balanced clusters, Ward's with clusters of 97 and 223 samples, and average linkage with clusters of 25 and 295 samples. The complete and single linkage produced unbalanced clusters with 9 and 311, and 1 and 319 samples, respectively (Fig. 2). For the solution of seven clusters, Ward's method produced clusters that ranged in size between 9 and 110 samples per cluster. Complete linkage and average linkage clustering gave more unbalanced solutions with cluster sizes ranging from 2 to 217 samples per cluster. Single linkage clustering produced the most unbalanced solutions with clusters ranging from 1 to 284 samples and six clusters with fewer than 20 samples (Fig. 2). Accordingly, the Agglomerative Coefficient (AC) showed the strength of the clustering structure obtained by Ward's method (AC = 0.99 for two clusters, AC = 0.98 for seven clusters), complete clustering (AC = 0.96 for two clusters, AC = 0.95 for seven clusters), and average clustering (AC = 0.92 each), as compared to single linkage clustering (AC = 0.83 each).
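The comparison of cluster-size balance can be reproduced from the two-cluster sizes reported in this paragraph. The sketch normalizes sizes to the largest cluster and ranks them in ascending order, as described for Fig. 2; using the smallest relative size as a single balance score is an assumption made here for illustration.

```python
def relative_sizes(sizes):
    """Cluster sizes divided by the largest cluster, ranked from smallest to largest."""
    largest = max(sizes)
    return sorted(s / largest for s in sizes)


# Two-cluster solutions reported in the text (320 samples each).
two_cluster_sizes = {
    "ward": [97, 223],
    "average": [25, 295],
    "complete": [9, 311],
    "single": [1, 319],
}

# Balance score: relative size of the smallest cluster (1.0 would be perfectly even).
balance = {method: relative_sizes(sizes)[0]
           for method, sizes in two_cluster_sizes.items()}
most_balanced = max(balance, key=balance.get)
```

By this score Ward's method is the most balanced of the four two-cluster solutions, with the smaller cluster at about 43% of the size of the larger one.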
Considering the results of the four agglomerative clustering methods, Ward's method was selected to derive the urban forest biotopes. A dendrogram was produced and illustrates the hierarchical and agglomerative clusters derived (Fig. 3). Classifications resulting in two clusters were interpreted as "Broader-level biotope groups" characterized by their canopy percentages. Then, seven clusters nested within the two broader biotopes were identified as "Finer-level biotope classes" and interpreted as biotopes defined by urban forest and environmental characteristics. A no-tree canopy biotope class was directly extracted from field and spatial data hexagons without trees and tree canopy and was characterized by zero canopy cover and the average values of its environmental characteristics.

Broader-level biotope groups
Three biotope groups were identified as: 1) defined by impervious surfaces, 2) defined by the canopy and urban forest characteristics, and 3) defined by the absence of trees. The biotope group defined by impervious surfaces had an average of 85.5% (± 24.7) of impervious surfaces, 18.5% (± 18.7) of canopy cover, 1.1% (± 5.7) of pervious surfaces, and the PC scores in this group were represented by negative values, indicating the low influence of urban forest components. The nested finer-level biotopes within this group are biotopes 1 to 3 (Fig. 3). The biotope group defined by canopy cover and urban forest had an average of 46% (± 34.3) canopy cover, 58% (± 37.9) of impervious surfaces, and 5% (± 19.1) of pervious surfaces; all urban forest variables were important in the formation of this cluster. Biotopes 4 to 7 were grouped in this cluster (Fig. 3). Finally, the biotope group defined by the absence of trees had 0% canopy cover, 82% (± 22.2) of impervious surfaces, and 2.3% (± 1.0) of pervious surfaces.
Fig. 2 Cluster sizes in clustering solutions with seven and two clusters, using four hierarchical clustering methods. Cluster sizes were normalized to the maximum cluster size and ranked from smaller to larger cluster size. Points are the relative number of cases per cluster

Finer-level biotope classes
Urban tree canopy across urban biotope classes ranged from 0 to 63% per unit (hexagon). Biotope 7 (63.2% ± 36.1%) and Biotope 6 (60.2% ± 29.8%) had the highest tree canopy cover, and all urban forest variables were important in the formation of those clusters; these biotopes differ considerably in their percentages of pervious and impervious surfaces. Biotopes 3 and 1 had the lowest tree canopy cover (6.8% ± 8.5, and 9.5% ± 9.4, respectively), and urban forest characteristics were not meaningful in defining these clusters. The average impervious surface cover between biotopes ranged from 34% (± 28.0) in biotope 7 to 95% (± 7.4) in biotope 1. The density of dwellings/ha ranged from 269 (± 151) in biotope 2 to 690 (± 366) in biotope 7. Pervious surface cover ranged from 2.3% (± 1.0) in biotope 8 to 25.6% (± 41.1) in biotope 7. Biotopes 1 and 2 were represented with more than 90% by Phaeozem, and biotope 3 by Solonchak, whereas the rest of the biotope classes were a mix of soil types. Tables showing the average
Biotope 1, "Average 95% impervious surfaces", had a high percentage of impervious surfaces (95.2% ± 7.4%) and was formed without the influence of any of the urban forest variables, as indicated by their negative component scores. Biotope 1 had 9.5% (± 9.4) canopy cover, 0.1% (± 1.0) of pervious surfaces, and a high density of dwellings of 623 per hectare (± 304). Phaeozem was the dominant soil type (98.8% ± 2.6).
Biotope 3, "Average 90% impervious surfaces, introduced trees", was represented by an average of 87.3% (± 17.6) of impervious surfaces, and PC2 (Introduced-evergreen richness) was the urban forest variable with the most influence in the formation of this cluster, with a component score of 0.15 (± 2.1). It had 0% of pervious surfaces, a low density of 269 dwellings/ha (± 151), and Phaeozem was the dominant soil type (98.0% ± 4.2%).
Biotope 8 or "Average 80% impervious surface without trees", was not derived from the cluster analysis as it was directly interpreted as a biotope class from hexagons without trees and canopy cover. This biotope was characterized by 82.6% (± 22.2) of impervious surfaces, 2.3% (± 10.4) of pervious surfaces, and a dwelling density of 578 (± 353). Phaeozem and Solonchak were the types of soils present in this biotope, with 66.9% (± 47.0) and 32.1% (± 46.7), respectively.
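The mean (± standard deviation) values used to characterize each biotope can be obtained by grouping hexagon records by class label. The sketch assumes a simple list-of-dictionaries layout; the field names and example values are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_class(records, label_key, value_keys):
    """Return {label: {variable: (mean, sample SD)}} for the given variables."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[label_key]].append(rec)

    summary = {}
    for label, rows in groups.items():
        summary[label] = {}
        for key in value_keys:
            values = [r[key] for r in rows]
            # The sample SD is undefined for a single record, so report 0.0 in that case.
            sd = stdev(values) if len(values) > 1 else 0.0
            summary[label][key] = (mean(values), sd)
    return summary


# Hypothetical hexagon records with a biotope label and two variables.
hexagon_records = [
    {"biotope": 1, "canopy_pct": 8.0, "impervious_pct": 96.0},
    {"biotope": 1, "canopy_pct": 10.0, "impervious_pct": 94.0},
    {"biotope": 1, "canopy_pct": 12.0, "impervious_pct": 95.0},
    {"biotope": 7, "canopy_pct": 60.0, "impervious_pct": 30.0},
    {"biotope": 7, "canopy_pct": 70.0, "impervious_pct": 40.0},
]
table = summarize_by_class(hexagon_records, "biotope", ["canopy_pct", "impervious_pct"])
```

Large standard deviations relative to the mean, as reported for pervious surface in several classes, indicate that a variable is highly variable within that class.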

Discussion
Our work represents the first application of the biotope approach to Mexico City's urbanized area, and an effort to develop an urban biotope classification based on field and spatial data, and urban forest and environmental variables. The objective of developing an urban biotope classification was attained with the selection of variables and methods followed in this study. The methods selected, PCA and cluster analyses for landscape classifications, have been reported in other studies (e.g., Huang et al. 2001; Owen et al. 2006), and were useful to derive urban forest biotopes. The results showed the potential use of statistically based, data-driven classifications to develop urban biotopes across urban areas. Our study also represents the first city-wide characterization of urban forest compositional and structural variables across Mexico City.
Statistically based biotope classification has advantages over traditional expert-based classifications, which are subjective in that the output of the classification often depends on the skill of the interpreter (Löfvenhaft et al. 2002). Even though our approach had some similarities with expert-based classifications (e.g., forest mapping), the core difference lay in the use of combined field and spatial data further processed through statistical modeling. Here, we presented an alternative approach to deriving biotopes based on the available data and unsupervised clustering, which has been widely employed in vegetation classifications (Wallace and Dale 2005). The hierarchical clustering approach allowed the grouping of observations into clusters based on similar urban forest and environmental characteristics. Internal validation measures helped identify the optimal number of clusters while retaining those results that maximized homogeneity within clusters and heterogeneity between clusters. The optimality criterion indicated that the system was hierarchical, meaning that each biotope group was formed by several finer-level biotopes nested within broader-level biotopes. Among the clustering methods compared, Ward's method yielded better clustering results and a more balanced number of clusters, similar to other studies (e.g., Schmidtlein et al. 2010; Steenberg et al. 2015). We are aware that classification outcomes are often affected by the selection of variables, the available data, the classification algorithm, and the optimality criterion used, and do not represent the ultimate truth (Schmidtlein et al. 2010; Tichý et al. 2010). The classification scheme presented here is flexible and allows modifications as more resources and data are acquired (e.g., high-resolution imagery, ancillary data, field data).
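The clustering logic described above can be illustrated with a minimal sketch. It is not the code used in this study: it is a naive, standard-library implementation of Ward's criterion (merge the pair of clusters with the smallest increase in within-cluster sum of squares) combined with the mean silhouette width as one possible internal validation measure. The function names (`ward_partitions`, `mean_silhouette`) and the toy two-dimensional "component scores" are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist

def ward_partitions(points):
    """Naive Ward agglomeration. Returns {k: clusters}, where each cluster
    is a list of point indices and k is the number of clusters."""
    clusters = [[i] for i in range(len(points))]
    centroids = [tuple(p) for p in points]
    partitions = {len(clusters): [c[:] for c in clusters]}
    while len(clusters) > 1:
        def cost(pair):
            # Increase in within-cluster sum of squares if the pair is merged.
            a, b = pair
            na, nb = len(clusters[a]), len(clusters[b])
            return na * nb / (na + nb) * dist(centroids[a], centroids[b]) ** 2
        a, b = min(combinations(range(len(clusters)), 2), key=cost)  # a < b
        na, nb = len(clusters[a]), len(clusters[b])
        centroids[a] = tuple((na * x + nb * y) / (na + nb)
                             for x, y in zip(centroids[a], centroids[b]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b], centroids[b]
        partitions[len(clusters)] = [c[:] for c in clusters]
    return partitions

def mean_silhouette(points, clusters):
    """Mean silhouette width for a partition with at least two clusters."""
    label = {i: c for c, members in enumerate(clusters) for i in members}
    widths = []
    for i, p in enumerate(points):
        own = clusters[label[i]]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist(p, points[j]) for j in other) / len(other)
                for c, other in enumerate(clusters) if c != label[i])
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)

# Hypothetical component scores for seven sampling units (toy data).
scores = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0)]
partitions = ward_partitions(scores)
silhouettes = {k: mean_silhouette(scores, partitions[k]) for k in (2, 3, 4)}
best_k = max(silhouettes, key=silhouettes.get)
```

Because every partition is produced by merging clusters of the previous one, the partitions are nested by construction, which is the property that allows finer-level classes to sit inside broader-level ones. For real data sets, an established library implementation would normally be preferred over this quadratic-time sketch.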
This study aimed to develop an urban forest biotope classification specific to Mexico City, and our results showed two levels of detail in the description of urban forest biotopes: broader-level and finer-level. The broader-level biotopes were defined by canopy cover levels, and the finer-level biotope classes described specific urban forest structural and compositional characteristics. This classification scheme had similarities with previous research on landscape and biotope classifications. For instance, the landscape classification developed by Steenberg et al. (2015) in Toronto, Canada, reported categories of canopy cover levels such as very low, low, moderate, and high. The results from their classification are comparable with the broader-level classification results reported here (defined by impervious surfaces, defined by the canopy and urban forest characteristics, and defined by the absence of trees). While only canopy cover information is necessary to derive broader-level biotopes, field data should be incorporated as additional urban forest variables to derive finer-level biotope classes. Previous biotope models described compositional forest characteristics, including native or exotic species in Dunedin, New Zealand (Freeman and Buck 2003), and deciduous or evergreen trees in Helsingborg, Sweden (Gao et al. 2012). The outcome of our finer-level classification conveyed similar urban forest descriptors (deciduous or evergreen; native or non-native; and tropical, subtropical, or temperate species), indicating that the classes derived for Mexico City are relevant to other cities.
Results from our work showed how urban biotope classifications can be developed utilizing compositional and structural urban forest variables, which allows for a better understanding of which ecological traits are more relevant for urban forest planning and management. For instance, the three broader-level biotope groups described the three main canopy conditions across the urbanized area of Mexico City and can be used by planners and decision-makers to identify areas for conservation, monitoring, assessing ecological services, or for interventions to increase canopy cover when possible. Targeted biotopes can be managed to emphasize the delivery of desired ecosystem services. For instance, the biotope with higher canopy cover can be managed so that the surface under the canopy is permeable, allowing water infiltration and thus reducing flood risks; tree management and tree canopy protection allow for better capture and absorption of air pollutants. Previous research has shown that biotopes with a more complex structure in terms of tree size are likely to provide some ecosystem services more efficiently (e.g., temperature regulation, air filtering) (Vihervaara et al. 2012). In the biotope dominated by hard surfaces and the biotope without trees, increasing the tree canopy will help reduce the heat island effect, regulate microclimates, and mitigate these areas' vulnerability under changing environmental conditions. Biotope information can assist in determining the level of transformation of these areas or inform spatial trade-offs in places where interventions are not possible. Efforts to transform areas without trees should be prioritized; they could provide ecological services that are not currently being delivered and start addressing issues of environmental justice in certain areas of the city.
The finer-level biotope classification reflects the variation in compositional and structural urban forest variables and can be useful to set tree diversity and canopy targets within individual biotope classes, as it allows the evaluation of important ecological aspects, such as species diversity or native species content (Ordóñez and Duinker 2012). For instance, planting trees is feasible in biotopes 4, 5, and 7, as indicated by their availability of pervious surfaces; in biotope 4, native species should be prioritized to offset the dominance of introduced species. In contrast, due to their higher percentages of impervious surfaces, planting actions are restricted in biotopes 1, 2, 3, and 8, and thus other options for greening should be considered, such as the implementation of green infrastructure planning (e.g., green walls, green roofs). Protection of canopy cover and tree management is recommended in biotopes 5, 6, and 7, which had over 50% canopy cover and were influenced by all urban forest structural and compositional variables, indicating a higher urban forest diversity and a more complex forest structure.
The selected environmental variables were used to describe the abiotic conditions in which urban trees develop. For instance, soil type is an important driver of tree species suitability and growth, surface type is an indicator of urban density and available space for planting trees, and the density of dwellings and roads was used as an indicator of the intensity of land use. Not surprisingly, biotopes dominated by impervious surfaces were not influenced by urban forest variables or canopy cover, which is consistent with previous research exploring the relationship between canopy cover and urban form (Nowak et al. 1996; Heynen and Lindsey 2003; Steenberg et al. 2015). Even though biotope classifications have been based on a combination of different variables reflecting the biophysical environment (e.g., vegetation, soil, climate, slope), land management and uses, substrate characteristics, topography, built (land-use, housing, surface type), and human population characteristics (income, education, etc.) (Cousins and Ihse 1998; Sukopp and Weiler 1988; Löfvenhaft et al. 2002; Freeman and Buck 2003; Cadenasso et al. 2007; Steenberg et al. 2015), here, the selection of variables was limited by data availability and the quality of the available data. For instance, income and topography were considered, as they often reflect variation in urban forest conditions and have been used in urban forest classifications (Iverson and Cook 2000; Heynen and Lindsey 2003). However, for Mexico City, income data is only available at broader scales (boroughs), and the variation of topography across the urbanized area is not sufficient to reflect spatial heterogeneity in the biotopes; therefore, it was not possible to include these variables in our classification. Nevertheless, the role of socioeconomic and additional biophysical explanatory variables needs to be investigated, both as potential variables for biotope classifications and to better understand the patterns and relations between the urban canopy and the urban fabric in Mexico City.
Our research contributed to bridging the knowledge gap in urban forest research in Mexico City. Our city-wide analysis allowed the characterization of 25 urban forest structural and compositional variables. While these variables provide detailed information about the urban forest, they are redundant and hard to use in classifications and extrapolation across the landscape. The PCA enabled the reduction of the number of urban forest variables to five significant principal components that explained 78% of the variation in the urban forest data, which is appropriate, as values between 70 and 90% are considered to preserve and retain most of the original data information (Jolliffe 2002). The first two PCs revealed that the canopy of evergreen-subtropical species and the richness of introduced-evergreen species were strong descriptors of urban forest characteristics across Mexico City. These results are not surprising, as most of the tree species sampled were evergreen and introduced, similar to results reported in previous studies conducted in Mexico City (Ortega-Álvarez et al. 2011; Velasco et al. 2014). The third PC captured variation in the basal area and canopy cover of temperate trees, consistent with the distribution of these species within the mountainous area of Mexico City and within the larger Nearctic biogeographical region where the city is located (Morrone 2010). PCs 4 and 5 represented the basal area and canopy cover of deciduous trees and the basal area and canopy cover of tropical trees. Most of the sampled deciduous and tropical trees are originally from Asia, and Central and South America, which are regions with different climates and rain regimes than those found in Mexico City.
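As an illustration of how the broken-stick method can decide the number of components to retain, the following sketch compares the observed share of variance of each component with the share expected if the total variance were divided at random into as many pieces as there are components. It is only a sketch under stated assumptions: the eigenvalues are hypothetical toy values, not those obtained in this study, the function names are ours, and the rule of stopping at the first component that fails the comparison is one common convention.

```python
def broken_stick_expected(p):
    """Expected proportion of variance for components 1..p under the
    broken-stick model: b_k = (1/p) * sum_{j=k}^{p} 1/j."""
    return [sum(1.0 / j for j in range(k, p + 1)) / p for k in range(1, p + 1)]

def retained_components(eigenvalues):
    """Count leading components whose observed proportion of variance
    exceeds the broken-stick expectation (stop at the first failure)."""
    total = sum(eigenvalues)
    observed = [ev / total for ev in sorted(eigenvalues, reverse=True)]
    expected = broken_stick_expected(len(observed))
    kept = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:
            break
        kept += 1
    return kept, observed, expected

# Hypothetical eigenvalues of a six-variable correlation matrix (toy data).
toy_eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.3, 0.2]
kept, observed, expected = retained_components(toy_eigenvalues)
cumulative_variance = sum(observed[:kept])
```

In this toy case the first two components exceed their broken-stick expectations and the third does not, so two components would be retained; `cumulative_variance` then gives the share of variation they explain, analogous to the cumulative percentage reported for the retained components.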
We found a higher frequency of non-native tree species as compared to natives. However, this is not always the case for urban areas: in cities such as Chennai, Guangzhou City, and New York City, native tree species dominate over introduced ones (Jim and Liu 2001; Nowak et al. 2007; Muthulingam and Thangavel 2012). The higher representation of non-native species in Mexico City can be explained by the preference of local managers and residents for exotic species because of their visual properties (e.g., Jacaranda mimosifolia), and further by the growth rate and quick establishment of some of these trees (e.g., Eucalyptus spp., Casuarina equisetifolia) (Ortega-Álvarez et al. 2011; Chimal-Hernández and Corona 2016). Planting programs should aim at augmenting the native diversity of the urban forest, and when using non-native species it is important to know the species' origin, maintenance requirements, maximum tree and crown size, and root development (Chimal-Hernández and Corona 2016), as the selection of tree species for the right planting site conditions is essential to ensure tree health, tree survival, and optimum service provisioning. The role of native vs introduced tree species in urban areas is still under investigation, and it is important in the planning process to plant trees that will live longer, will be better adapted to novel conditions, and thus will maximize ecological services. Additional considerations regarding the selection of tree species are important in terms of the urban forest's ecological integrity (Ordóñez and Duinker 2012), particularly under climate change conditions (Davis et al. 2011; Simberloff 2011; Sjöman et al. 2016).

Conclusion
This study shows that it is possible to classify urban biotopes from a bottom-up approach based on field surveys followed by modeling methods, as an alternative to the top-down expert-based approach that requires detailed spatial data. This work represents one of the first efforts to develop an urban biotope classification in the urbanized area of Mexico City and in a North American city, and can be replicated and adapted across larger and spatially heterogeneous urban areas.
The selection of variables and methods followed in this study allowed the classification of three urban biotopes describing canopy conditions, and seven biotopes describing specific compositional and structural urban forest characteristics. The conditions of the urban biotopes derived can inform urban forest management, tree planting, tree protection, and the setting of tree diversity targets, and can serve as a foundation for biotope mapping at larger scales.
Our urban biotope classification is useful for identifying urban forest planning and management opportunities within biotopes; however, some steps are necessary to successfully incorporate urban forest biotopes into urban planning. The main limitation of our approach is that the classification currently provides information about areas representing biotope conditions at limited sampling sites across the urbanized area, and thus information on the distributions and extents of urban biotopes and how they relate to broader-scale factors such as land use is still needed.