We generated two-decade time-series greenspace maps from the high-resolution satellite data for global large cities using the linear spectral unmixing approach. We further combined the long-term fine-scale greenspace and population mappings with a population-weighted exposure framework to explore temporal changes in greenspace coverage, human greenspace exposure, and the associated inequality. To decipher the drivers controlling the inequality in human exposure to greenspace over time, we proposed a Venn conceptual model to quantify the individual and joint contributions from greenspace and population for the past two decades.
Global urban areas
We selected 1028 major urban areas globally as our study targets. The boundaries of these urban areas were extracted from the latest global urban boundaries (GUB) product, which was generated based on the 30-m Landsat-based global artificial impervious area (GAIA) data 46. We chose the urban areas in 2018 with a geographic area large than 100 km2 as the largest area extent during the past two decades to allow for: 1) the exploration of the urban expansion impacts on greenness change and 2) the collection of sufficient samples for measuring human greenspace exposure and inequality 18. It is noted that we refer to these 1028 urban areas as ‘cities’ throughout the manuscript.
Greenspace
We used 19-year (2000–2018) Landsat surface reflectance products from three satellite sensors (i.e., Landsat-5, Landsat-7, and Landsat-8) with a 30-m spatial solution to quantify the spatial distribution and temporal dynamics of greenspace. Landsat provides the longest high-quality temporal record of global surface reflectance data together with the pixel-level quality assurance (QA) layer indicating cloud, cloud shadow, snow, and ice conditions 47. We used the surface reflectance products of three visible (i.e., blue, green, and red) and one near-infrared bands. To minimize the uncertainty caused by Landsat-7 scan line off failure 48, we primarily focused on the use of Landsat-5 and Landsat-8 satellite data, with data availabilities of 12-year (2000–2011) Landsat-5, 2-year (2012–2013) Landsat-7, and 5-year (2014–2018) Landsat-8.
Several data pre-processing steps are conducted to ensure high-quality inputs for greenspace mapping. We first harmonized Landsat-5 and Landsat-7 data using Landsat-8 data as a baseline to generate consistent time-series data that removes potential impacts due to the difference in spectral settings across satellite sensors 49. With the QA bitmask layer, we excluded the pixels that are contaminated by cloud cover, shadow, and snow. We further calculated the normalized difference vegetation index (NDVI) and normalized difference water index (NDWI) to quantify the spectral characteristics of green vegetation and water body, respectively. Finally, we adopted a maximum value composite approach to generate the annual greenest vegetation green metrics by selecting the pixel-based maximum NDVI values from the cloud-free time series. In addition to the maximum NDVI value, we also recorded the corresponding NDWI and spectral reflectance of blue, green, red, and near-infrared bands for the following fractional greenspace coverage mapping.
Population
We used the WorldPop dataset from 2000–2018 to map the population’s spatially explicit distribution and temporal dynamics. WorldPop provides a global annual update of demographic datasets (e.g., population density, age and sex structures, and urban growth) at a 100-m spatial resolution using a random forest regression tree-based mapping approach 50. We chose the WorldPop population density dataset for its advantages of high spatial resolution, annual update frequency, and global coverage over alternatives such as Gridded Population of the World (GPW) 51, LandScan 52, and High-Resolution Settlement Layer (HRSL) 53.
Greenspace coverage
We adopted the linear spectral unmixing model to map fractional greenspace coverage from the composited Landsat surface reflectance-NDVI-NDWI time series, which can capture subpixel greenspace signals 27. The linear spectral unmixing model assumes that one pixel’s spectral signature (including reflectance and its derivative index) is a linearly weighted sum of a few spectrally pure endmembers and their fraction covers within pixel 54.
\({R}_{i}=\sum _{k=1}^{n}{f}_{ik}\bullet {C}_{ik}+{\epsilon }_{i}\) (1)
where \({R}_{i}\) represents the spectral signatures of pixel i, including spectral reflectance of three visible (i.e., blue, green, red) and one near-infrared bands, NDVI, and NDWI, \({C}_{ik}\) represents the spectral signature of the kth endmember, \({\epsilon }_{i}\) is the unmodeled residual in the kth pixel, n is the total number of endmembers, \({f}_{ik}\) is the fraction of kth endmember within pixel i, which is usually calculated from the least-squares method with the following physical constraints:
\(\sum _{k=1}^{n}{f}_{ik}=1 and {f}_{ik}\ge 0 \forall k = 1, \cdots , n\) (2)
We selected vegetation, impervious areas, and water as the three endmembers (n = 3). To collect pure spectra of endmembers and minimize their annual variations, in addition to Eq. (2), we included four more physical constraints for three endmembers: (1) vegetation endmembers should have an NDVI value > 0.8, (2) impervious endmembers should have an NDVI value < 0.2, (3) water endmembers should have an NDWI value > 0, and (4) three endmembers should be temporally stable over the past two decades, namely, constraints (1–3) should be satisfied for the endmembers for each year from 2000 to 2018.
With the endmember spectra and the associated physical constraints, we first calculated the pixel-level fractional greenspace coverage from Eq. (1). To remove the impacts of residue cloud contaminations, we conducted a pixel-level data smooth to generate the high-quality fractional greenspace coverage time series by using the Savitzky–Golay (SG) filtering approach 55. We aggregated the 30-m fractional greenspace coverage to 100-m resolution using the nearest neighbor resampling approach to ensure the derived greenspace map is spatially consistent with the 100-m population data. Then, we calculated the city-level physical greenspace coverage rate by overlapping the 100-m resampled pixel-level fractional greenspace coverage with the city boundary and averaging all greenspace coverages of pixels within the city. To further explore the residual uncertainty in the spectral unmixing process, we proposed a spectral unmixing-based threshold classification approach for the sensitivity analysis of physical greenspace coverage mapping. We first generated a greenspace binary map by classifying the 100-m pixel-level fractional greenspace coverage into greenspace (i.e., fractional greenspace coverage ≥ threshold) or non-greenspace (i.e., fractional greenspace coverage < threshold) components using a threshold approach 56. We adopted the thresholds of 0.3, 0.4, and 0.5 for this classification. Then, we aggregated the greenspace binary map to 100 m for the calculation of city-level physical greenspace coverage.
To validate the accuracy of the Landsat-derived city-level physical greenspace coverage, we used the classification maps of 1-m resolution National Agricultural Imagery Program (NAIP) aerial imagery for 2003–2015 and 10-m resolution Sentinel-2 satellite for 2016–2018 as benchmarks, following the approach in Chen et al. (2022) 18. This task includes three steps. First, we generated the annual vegetation green metrics from NAIP and Sentinel-2. For Sentinel-2 data, we applied a maximum value composite approach to generate the yearly greenest vegetation green metric across 1028 urban cities like Landsat data. Since NAIP only collected aerial imagery during the agricultural growing season in the sampling regions of the continental United States, we first chose the summer NAIP data (i.e., June - September) of the sampled United States cities as candidates. We then excluded these NAIP candidate data whose observation dates are significantly different from the peak growing season by visually comparing them with the corresponding Sentinel-2 and Landsat images. Consequently, a total of 639 United States cities with 19, 580 NAIP images were selected (Supplementary Table 6). Second, we generated the vegetation classification maps from the annual NAIP and Sentinel-2 imageries using a random forest machine learning approach. To minimize the impacts of inter-annual variations, we selected training samples from each annual vegetation green metric to calibrate the random forest algorithm with a decision tress parameter of 15. With the annually calibrated random forest models, we classify the NAIP and Sentinel-2 images into vegetation and non-vegetation components. Third, we calculated city-level physical greenspace coverage from the binary vegetation maps as references to evaluate the accuracy of Landsat-derived greenspace coverage. The overall consistency results with high Pearson’s correlation coefficients (Supplementary Fig. 15) supported the feasibility and acceptable accuracy of using Landsat-derived physical greenspace coverage to measure urban greenspace provision.
Greenspace exposure
We adopted the population-weighted exposure framework 14, 16, 18, 27, which can model the spatial interactions between greenspace and population, to quantify the probability and level of human exposure to greenspace according to Eq. (3):
\({GE}^{d}= \frac{\sum _{i =1}^{M}{P}_{i}\times {G}_{i}^{d}}{\sum _{i =1}^{M}{P}_{i}}\) (3)
where \({P}_{i}\) represents the population of pixel i, \({G}_{i}^{d}\) represents the fractional greenspace coverage of pixel i that considers both the central and nearby green environment with a buffer size of d (500 m is used in this study), M is the total pixel number within the city, and \({GE}^{d}\) is the population-weighted greenspace exposure at a city level.
Greenspace exposure inequality
We used three commonly used economic metrics (including the Gini coefficient, Atkinson, and Theil indices) to measure the inequality in human exposure to greenspace for global cities following the framework and method of Chen et al. (2022) 18. The calculations of these three metrics are provided in the Supplementary materials. Three metrics range between 0 and 1, where 0 indicates absolute equality and 1 indicates absolute inequality, and increasing value means a larger level of inequality.
Temporal trend analysis
We used the non-parametric Mann-Kendall statistic and Theil–Sen slope estimator approaches, which are insensitive to data distribution and outliers 57, to calculate the magnitude and direction of the city-level monotonic trends of physical greenspace coverage, human greenspace exposure, and greenspace exposure inequality. We adopted a significance level of 0.05 to assess the significance of time series trends.
Attribution of greenspace exposure inequality
We proposed a Venn conceptual model to quantify the contributions of greenspace and population to the temporal changes of greenspace exposure inequality. As shown in Fig. 3, the change of greenspace exposure inequality measured by the Gini index can be decomposed into three parts: individual greenspace provision (region I in Fig. 3a), individual population change (region III in Fig. 3a), and joint greenspace and population impacts (region II in Fig. 3a). When greenspace exposure inequality changes from year i to i + 1, the individual contribution of greenspace provision (green regions I + II in Fig. 3a) can be modeled as:
\(\text{Ⅰ} + \text{Ⅱ} = \varDelta {Gini}_{green, i}= Gini\left({g}_{i+1},{p}_{i}\right) - Gini\left({g}_{i},{p}_{i}\right)\) (4)
where \(\varDelta {Gini}_{green,i}\) denotes the contribution of greenspace provision to the overall changes of greenspace exposure inequality measured by the Gini index in year i (i = 2001, …, 2018), \(Gini\left({g}_{i+1},{p}_{i}\right)\) is the Gini index with greenspace coverage \({g}_{i+1}\) in year i + 1 and population \({p}_{i}\) in year i, \(Gini\left({g}_{i},{p}_{i}\right)\) is the Gini index with greenspace coverage \({g}_{i}\) in year i and population \({p}_{i}\) in year i.
Similarly, the individual contribution of population distribution to the overall greenspace exposure inequality (orange regions I + III in Fig. 3a) can be modeled as:
\(\text{Ⅱ} + \text{Ⅲ} = \varDelta {Gini}_{pop, i}= Gini\left({g}_{i},{p}_{i+1}\right) - Gini\left({g}_{i},{p}_{i}\right)\) (5)
where \(\varDelta {Gini}_{pop,i}\) denotes the contribution of population to the overall changes of the Gini index in year i (i = 2001, …, 2018), \(Gini\left({g}_{i},{p}_{i+1}\right)\) is the Gini index with greenspace coverage \({g}_{i}\) in year i and population \({p}_{i+1}\) in year i + 1.
The joint contribution of greenspace provision and population change (regions I + II + III in Fig. 3a) can be modeled as:
\(\text{Ⅰ} + \text{Ⅱ}+ \text{Ⅲ} = \varDelta {Gini}_{all, i}= Gini\left({g}_{i+1},{p}_{i+1}\right) - Gini\left({g}_{i},{p}_{i}\right)\) (6)
where \(\varDelta {Gini}_{all, i}\) denotes the contribution of greenspace provision and population to the overall changes of the Gini index in year i (i = 2001, …, 2018), \(Gini\left({g}_{i+1},{p}_{i+1}\right)\) is the Gini index with greenspace coverage \({g}_{i+1}\) and population \({p}_{i+1}\) in year i + 1.
By solving Eqs. (4–6), we can quantify the contributions of greenspace provision, population distribution, and joint greenspace and population to the change of greenspace exposure inequality.