Pixel-based temporal aggregation is a paradigm that applies a suite of user-defined rules to reduce redundant information and generate cloud-free, radiometrically and phenologically consistent, spatially contiguous image composites (White et al. 2014). From the S2 archive, three images were generated, one that corresponds to the median spectral response, one that captures the peak value of NDVI in the seasonal cycle of vegetation, and a two-date composite.
A crucial point of land cover mapping applications using multitemporal satellite images is the radiometric consistency of the acquired images (Syariz et al. 2019). The degree of influence of clouds, sun angle, water-vapour, aerosols and directional surface reflectance is minimized in maximum NDVI composite imagery with increasing length of the quantizing interval (Holben 1986). Specifically, in order to apply pixel-based composite techniques, the availability of cloud-free images in the study period is mandatory (Griffiths et al. 2013). Accuracy in thematic map production is a function of the quantity and quality of imagery available; data deficiency during the peak of the growing season is a severe limitation (Johnson 2019). In this study, the cloud coverage values per date and the number of cloud-free images per pixel over the growing season were analysed.
For the data assessed in this study and the four-month period considered, at least one cloud-free image is ensured every 13 days. This value is high, considering that the study area belongs to a monsoon regime with a rainy season concentrated in the growing season (Aliaga, Ferrelli, and Piccolo 2017). In areas of Australia characterised by a low precipitation regime, the period between two cloud-free S2 acquisitions was on average less than two weeks (Sudmanns et al. 2020). S2 data have higher frequency than Landsat 8, the alternative satellite for use in this type of studies; this is because the study area is located in the centre of the Landsat 8 orbit, so scenes do not overlap and the maximum revisit frequency is 16 days. In practice, this frequency cannot be met because of cloud coverage (Frantz et al. 2016), which is high especially when considering that the analyzed growing season coincides with the rainy season.
The inconsistency in the number of cloud-free pixels available over a period is one of the major challenges of using image temporal aggregation techniques. In this study, we found that for a four-month period that includes the summer crop season, at least 25 cloud-free images are ensured. The values observed are far better than those observed by Carrasco et al. (2019), who found a minimum of seven cloud-free images per pixel for a one-year interval of S2 data in the United Kingdom. In our study, such high number of free-cloud images per pixel allowed us to generate composites. Composites are helpful for characterizing vegetation phenology, and have shown to improve classification accuracies (Azzari and Lobell 2017; Bey et al. 2020; Johnson 2019). How to generate a cloud-free input dataset for analysing land use/cover change in GEE is an important and crucial step, and the effects of the selected method on the classification results is generally underestimated (Phan, Kuch, and Lehnert 2020).
Here, the two-date method yielded the highest accuracy indexes, followed by the median method, with the max NDVI method having the poorest results. These findings are consistent with those of Carrasco et al. (2019) and Phan, Kuch, and Lehnert (2020), who also found that traditional multi date datasets obtained the highest classification accuracy compared with median and max NDVI methods applied to Sentinel 2 data.
Here, patch size and homogeneity of forest and crop classes were detected as possible characteristics that may increase classification accuracy. In central Córdoba, forests are characterized by a net decrease of species richness due to grazing pressures and fire recurrence (Aguilar et al. 2018), whereas crop fields are characterized by large-scale commercial mechanized farms. The simplicity of the forest and the machinery used for crop sowing, planting and harvesting increases homogeneity of these types of land cover patches. This homogeneity and the large mean area (MA) size of forest patches and field plots (Table 4) contribute to a clearer differentiation of these classes using these methods (Bey et al. 2020). Using multi-date images in classification enhances the spectral separability of the land cover classes and increases the overall accuracy of the land cover maps (Rujoiu-Mare et al. 2017). On the other hand, difficulties in distinguishing the built-up area from the bare land patches may be explained by their similar spectral response.
The use of the median method was found to not only significantly reduce data volume but also remove noisy, very dark, and very bright pixels that may occur due to shadow and haze (Amani et al. 2019; Azzari and Lobell, 2017; Okoro et al. 2016; Phan et al. 2020). These factors may explain the relatively good performance of land cover maps based on the median method with respect to max NDVI. Even though the max NDVI method is useful to indicate the highest level of vegetation abundance (Huang et al. 2017), it did not perform well as a temporal aggregation method. Depending on the study context, max NDVI may be useful for identifying specific land use changes, such as conversion of cropped land to conservation, non-irrigated to irrigated land, or forest land variation (Maxwell and Sylvester 2012).
Landscapes with high values of PD and PARA, and low GM can generally be considered spatially heterogeneous (Awuah et al. 2018; Silva et al. 2018; Statuto, Cilis, and Picuno 2018; Tran, Julian, and De Beurs 2014). Here, the land cover maps with the highest landscape heterogeneity were obtained using the max NDVI method. Bey et al. (2020) also found that Seasonal Greenest Pixel composites included more hazy and shaded pixels than the Median composite. These authors suggested that the inclusion of hazy pixels may be indicative of a more systematic bias, since hazy pixels often had higher reflectance values than clear pixels in the bands required for NDVI calculation. Ruefenacht (2016) and Xie et al. (2019) also found that the max NDVI composite has Landsat anomalies, clouds, shadows, and abnormal pixelation. Therefore, it is likely that max NDVI is not a convenient method for constructing a cloud-free composite from time series.
Landscape metrics integrate all patch classes over the full land extent and represent the spatial complexity of the pattern of the entire land cover map (McGarigal 2015). Land cover map with either high or low CONDENT values can generally be considered a land cover map with a complex pattern (Altieri, Cocchi, and Roli 2018; Awuah et al. 2018; Nowosad and Stepinski 2019). Again, our results showed that the max NDVI method was the one revealing the most complex pattern. The comparison among the 10 sites showed that site 1 had the land cover map with the highest complexity, regardless of the temporal aggregation method used to generate the composite database. This result was also observed in PD values, with site 1 showing the highest fragmentation. Site 1 is characterized by an urbanized natural area, with a very low proportion of impervious surfaces compared with urban environments. Awuah et al. (2018) also found the highest values of spatial heterogeneity in the transition zone between urban and rural areas. In those cases, the authors suggest the use of finer resolution images that can retain more spatial and thematic details in land cover.
In the current study, landscape metrics values were found to be significantly related to accuracy indexes, which impacts map quality. Maps exhibiting high levels of landscape fragmentation-heterogeneity and complexity were closely correlated with low accuracy assessment values. This observation is supported by the hypothesis that highly heterogeneous land cover classes generally have lower accuracy with respect to simple and homogeneous land cover patches, not only because of the possibility of geolocation error, but also because these subsets include all edge pixels that are generally more difficult to classify correctly than centre patch pixels (Glick et al. 2016; Stehman and Foody 2019). Awuah et al. (2018) also found that high fragmentation can result in an increase in the number of mixed pixels, affecting classification accuracy. This fact can be observed in our results, since the max NDVI method was related to high heterogeneity and low overall accuracy.
It is known that uncertainty of landscape metrics is negatively related to classification accuracy. In most landscape metrics, the values tend to stabilize as the overall accuracy of image classification approaches 90%. This means that a high degree of classification accuracy is required for ensuring the consistency and reliability of landscape metrics (Shao and Wu 2008). The two-date method showed the best performance in accuracy indexes and, at the same time, it exhibited the lowest value of fragmentation and complexity in land cover maps. For small areas, its implementation is also computationally and technically less demanding than that of the median and max NDVI methods, since GEE has a built-in command that facilitates the procedure.
An interesting future research avenue is the use of this methodology in a wider area and including several crop growing seasons; we will focus on the precise discrimination of principal crops to determine the predominant crop sequences in the area, because proper sequencing of crops is an important component of successful dynamic cropping systems (Ashworth et al. 2017; Krupinsky et al. 2006). Moreover, the current open access to historical satellite data sets and free GIS tools give us the opportunity to identify relationships between the main dynamics in land use, land use policies and socioeconomic contexts through techniques and methodologies that can be applied around the globe.