An uneven history of satellite data limitations
Between 1982 and 2022, the Landsat archive contains 17,553,123 daytime images (per-pixel avg.: 1618, ± 1599; Extended Data Fig. 1a), meaning that 33.1% of images that would be expected under a 16-day revisit frequency are missing (see Methods). Moreover, the average pixel effectively lost an additional 44.8% (± 17.6%) of available observations due to cloud cover or degradation of Landsat images (Extended Data Fig. 1b/d), leaving an average quality-weighted number of 1,429 (± 887.4) images per pixel across the 40 years (Extended Data Fig. 1c). Data gaps are spatially and temporally highly uneven and particularly prevalent in the Tropics, Arctic, and Antarctic (Fig. 1b) and in earlier years (Fig. 2a). Notwithstanding the regionally important role of cloud cover28, extensive data gaps primarily relate to the historical development of the Landsat program20.
Global coverage of Landsat data evolved only gradually (Fig. 2a, Extended Data Fig. 3) with the support of a global network of receiving stations established in all continents except Antarctica (Extended Data Fig. 2a). Whereas the archive grew by an average of 118,362 (± 67,104) images per year during the 1980s, this rate increased nearly five-fold by the 2010s (to 575,169, ± 159,924; Extended Data Fig. 3). However, improvements were not uniform. Many countries lacked (or still lack) the infrastructure and know-how to continuously collect and preserve data from overpassing satellites20 (Extended Data Fig. 2a-b). As a result, many world regions, particularly at low and very high latitudes, have lagged behind in developing a dense Landsat data record (Fig. 2a). Improvements happened particularly late over islands, with, for example, 65.4% of Oceanian Island areas never observed until the 1990s.
Beyond these gradual increases, changes in Landsat satellite technologies caused several abrupt global changes in data coverage. Since the launch of Landsat 7 in 1999, Landsat satellites include on-board data storage that reduced the reliance on global networks of receiving stations for assuring data collection and archiving20. Combined with improving data transmission and warehousing, this helped the Landsat archive expand rapidly during the 21st century. Since 2003, however, mechanical issues in Landsat 7 degraded as much as 25% of pixels per image29. With the end of Landsat 5 in 2010, the quality of available images decreased until the launch of Landsat 8 in 2013, after which data coverage improved dramatically (Fig. 2a). The average proportion of countries’ lands with full-year coverage increased from 17.5% (± 30.8%) for the years before 2013 to 89.4% (± 26.4%) thereafter. Additionally, interoperability between Landsat 8 and earlier missions is hindered by differences in the sensed spectral information30, a challenge not addressed explicitly in state-of-the-art mapping applications.
With the exception of much of North America, nearly all land areas have continuous gaps lasting ≥ 1 year in their post-1982 Landsat data record (Fig. 2b). Of those lands, 36.2% have ≥ 1-year interruptions after their first data coverage, including most islands, most of Africa, Mesoamerica and north-eastern South America, northern Beringia, Patagonia, and Antarctica (Extended Data Fig. 4a). Often, these interruptions persisted over extended periods (averaging 5.4 years, ± 3.7), with several Central African regions not having a single usable Landsat image for > 10 consecutive years (Extended Data Fig. 4b). These archive interruptions often reflect short-lived or inconsistent receiving and storage capacities (Extended Data Fig. 2b), and are particularly severe during the mid-1990s20.
When data are available, their frequency and quality vary (Fig. 2c-d). Whereas much of the world is covered by a large number of Landsat images of any quality (Extended Data Fig. 1a), consistently high coverage with high-quality images only exists for dryland regions of the Middle East, North Africa, Australia, and North America (Extended Data Fig. 1c). By contrast, for most equatorial forest regions of the world, less than half of existing images are usable, due to persistent cloud cover28 (Extended Data Fig. 1b). In some areas such as the Sahel belt and much of Central Asia, annual fluctuations in quality-weighted image numbers exceed annual averages (Fig. 2c-d). In similar regions, seasonal data coverage is highly incomplete, often with fewer than three months covered with data in a given year, and fluctuations of a similar magnitude between years (Fig. 2e-f).
Data limitations bias perceived changes in SDG indicators
Unless carefully accounted for, the described spatial and temporal differences in data coverage and quality can introduce biases into the derived time-series products used for monitoring SDGs. For example, discontinuous historical satellite-data coverage impairs our perception of the timing of change events, while varying frequencies of quality images affect our ability to perceive time-sensitive changes, and fluctuations in seasonal data completeness limit our ability to distinguish multi-year changes from seasonal fluctuations. These biases in monitoring can lead to poor policy-making. For example, biased information on trends in forest and wetland areas may lead to ill-conceived protection and restoration goals as Nationally Determined Contributions under the Paris Agreement. In the following paragraphs, we will highlight how different types of Landsat data limitations indeed bias perceptions of changes in different SDG indicators derived from state-of-the-art monitoring products.
Gaps in quality data bias perceived timings of deforestation events. Changes in forest areas affect multiple SDGs and are the focus of SDG indicator 15.1.1. Tropical moist forests accounted for > 90% of global deforestation since 200031, and are particularly important for capturing and storing carbon (SDG 13), preserving and restoring biodiversity (SDG 15), and providing billions of people with income, food, and/or medicine from forest products (SDGs 1, 2, and 3). Accordingly, the recently published Landsat-based Tropical Moist Forest product (TMF)19, which maps the onset of deforestation since 1982, is poised to play a prominent role in global monitoring under diverse policy frameworks, including the Paris Agreement, the Post-2020 Global Biodiversity Framework, and EU regulations on deforestation-free supply chains. The TMF addresses temporal gaps in the Landsat archive by carrying the last-recorded classes forward into the gap periods and only mapping class changes once new satellite images confirm them19. The resulting annual time-series thus hide uncertainties regarding the true timing of forest-change events. This means that the inferred deforestation years and perceived change trajectories may commonly be biased by unaccounted gaps and quality differences in the Landsat archive.
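The timing effect of carrying the last-recorded class across a data gap can be illustrated with a minimal sketch. It is not the TMF algorithm; the pixel history, the class labels, and the helper names `carry_forward` and `mapped_change_year` are hypothetical and chosen only to show the mechanism.

```python
from typing import List, Optional

def carry_forward(observed: List[Optional[str]]) -> List[Optional[str]]:
    """Fill years without usable images (None) with the last recorded class."""
    filled, last = [], None
    for cls in observed:
        if cls is not None:
            last = cls
        filled.append(last)
    return filled

def mapped_change_year(years: List[int], classes: List[Optional[str]],
                       target: str = "deforested") -> Optional[int]:
    """Return the first year in which the series shows the target class."""
    for year, cls in zip(years, classes):
        if cls == target:
            return year
    return None

years = list(range(1994, 2001))
# Hypothetical pixel: cleared in 1996, but no usable images for 1995-1998.
truth = ["forest", "forest"] + ["deforested"] * 5
observed = ["forest", None, None, None, None, "deforested", "deforested"]

filled = carry_forward(observed)
true_year = mapped_change_year(years, truth)
mapped_year = mapped_change_year(years, filled)
delay = mapped_year - true_year  # years by which the mapped event lags the true one
print(true_year, mapped_year, delay)
```

In this toy case the clearing is assigned to the first year with renewed coverage rather than to the year in which it occurred, which is the kind of shift discussed above.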
We examined the TMF and, indeed, found strong indications of such biases. Globally, perceived annual deforestation affected disproportionately large portions of the tropical moist forest biome during two periods since 1990, both of which mark periods of particularly rapid improvements in Landsat satellite-data coverage and quality (Fig. 3a). For example, from 1999 to 2000, right after the launch of Landsat 7, the world seemingly experienced an increase in deforested areas of 65.2%, more than twice the maximum year-over-year increase (31.9%) registered anywhere between 1990 and 1999. Similarly, the 2012–2013 deforestation increase of 60.5% coincides with increased image frequencies following the launch of Landsat 8 and is nearly twice the recorded maximum over the 2001–2012 period (30.8%), in which Landsat 7 was the sole data source. Regions experiencing ≥ 1-year periods with either no data or potentially unusable, low-quality data show disproportionately higher deforestation rates during years immediately following those periods, compared to their smoothed trend line (Fig. 3a; see Methods). This results in 67,329,203 ha of globally deforested areas that are potentially allocated to the wrong year (Extended Data Fig. 5a; see Methods), corresponding to 58.4% of total gross deforestation mapped since 1990.
We wanted to know whether gaps in Landsat data actually bias the perceived timing of deforestation. To this end, we used a formal causal analysis technique developed for detecting causal relationships between two time-series called Convergent Cross Mapping32,33 (CCM; see Methods). Based on the results of these analyses, we attribute deforestation anomalies to anomalies in maximum annual image quality in preceding periods in 68.8% of tropical-moist-forest countries (Fig. 3b; see Methods).
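For readers unfamiliar with CCM, the core idea can be sketched in a few lines. The sketch below is a generic, simplified version of the technique (delay embedding, nearest-neighbour weighting, Pearson correlation as cross-map skill) and not the analysis pipeline used here; the embedding dimension, the lag, the synthetic coupled logistic maps, and all function names are illustrative assumptions.

```python
import math
from typing import List, Optional, Sequence

def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def embed(series: Sequence[float], E: int, tau: int) -> List[List[float]]:
    """Delay-embed a series: one vector (s[t], s[t-tau], ...) per valid t."""
    start = (E - 1) * tau
    return [[series[t - k * tau] for k in range(E)]
            for t in range(start, len(series))]

def cross_map_skill(effect: Sequence[float], cause: Sequence[float],
                    E: int = 2, tau: int = 1,
                    library: Optional[int] = None) -> float:
    """Estimate `cause` from the shadow manifold of `effect`; return Pearson r."""
    n = len(effect) if library is None else library
    vectors = embed(effect[:n], E, tau)
    targets = list(cause[(E - 1) * tau:n])
    predictions = []
    for i, v in enumerate(vectors):
        # E + 1 nearest neighbours on the manifold, excluding the point itself
        nearest = sorted((math.dist(v, w), j)
                         for j, w in enumerate(vectors) if j != i)[:E + 1]
        d1 = max(nearest[0][0], 1e-12)  # guard against zero distance
        weights = [math.exp(-d / d1) for d, _ in nearest]
        total = sum(weights)
        predictions.append(
            sum(w * targets[j] for w, (_, j) in zip(weights, nearest)) / total)
    return pearson(predictions, targets)

def simulate(n: int = 300):
    """Synthetic example: `driver` forces `response`, not the reverse."""
    driver, response = [0.4], [0.2]
    for _ in range(n - 1):
        x, y = driver[-1], response[-1]
        driver.append(x * (3.8 - 3.8 * x))
        response.append(y * (3.5 - 3.5 * y - 0.1 * x))
    return driver, response

driver, response = simulate()
skill_short = cross_map_skill(response, driver, library=30)
skill_long = cross_map_skill(response, driver, library=300)
print(round(skill_short, 3), round(skill_long, 3))
```

In CCM, a cross-map skill that rises and levels off as the library of points grows is read as evidence that the estimated series influences the embedded one. In the setting described above, the embedded series would correspond to deforestation anomalies and the estimated series to image-quality anomalies.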
Resulting biases in perceived deforestation years may bias any timing-sensitive applications related to achieving SDGs, including modelling of carbon emissions5, restoration prioritization to mitigate extinction debts34, or attributions of forest changes to changing socio-political conditions35. For example, deforestation inside the Luo Scientific Reserve (Democratic Republic of Congo) that reportedly happened during the first Congo war (1996–1997) due to human displacement35 would be falsely attributed to processes in the immediate post-war period, for which Landsat data are again available (Extended Data Fig. 5b). These biases may also cast unfair perceptions of national progress in curbing deforestation. Twelve countries indicated by the TMF as having increasing deforestation rates around the Landsat 8 launch – when data improvements were particularly strong (Fig. 2a) – in fact reported decreases in the Forest Resource Assessments (relative to the previous reporting period)36, including countries with successful restoration and conservation programs over that period (e.g., Cuba, India, Vietnam, Thailand)31.
Increasing frequencies of quality data miss regional arable-land losses. Accurately capturing dynamics in arable-land extents is a critical component of measuring SDG indicator 2.4.1 on agricultural lands under sustainable use, and is also closely linked to indicators aimed at avoiding deforestation (15.2.1) and loss of water-related ecosystems (6.6.1)6.
Mapping arable-land requires temporally dense satellite observations to capture phenological land-surface changes driven by crop planting and harvesting, and as such is highly sensitive to cloud-related gaps in Landsat data37. To tackle this, a recently developed global product (GLAD)18 maps arable-land in four-year epochs, exploiting the highest-quality images of an entire epoch (aggregated into an annualized 16-day time-series) for more accurate detections of cropping-related phenological patterns18. Yet, this approach is not immune to increasing densities of high-quality images in the Landsat archive over time (Extended Data Fig. 3), which may lead to overestimations of arable-land gains by reducing the likelihood of missing existing arable-lands, compared to earlier time periods. Simultaneously, changes in newer Landsat sensors relative to earlier missions (e.g., different band spectral ranges)30 are likely to misinform classification algorithms fed mainly with data from earlier periods.
In fact, we identified 123 countries where the GLAD mapped gains despite reported losses in national statistics (Extended Data Fig. 6), casting doubts on 74,975,550 ha of arable-land expansion, an area larger than the total of all arable-lands across all Amazonian countries in 202138 (Extended Data Fig. 5c). Most (80.0%) of these positive disagreements relative to statistics are associated with improvements in the frequency of quality Landsat images (Fig. 3c). These disagreements peak between the 2008–2011 and 2012–2015 epochs (36.0% of cases), coinciding with the 2013 launch of Landsat 8, which massively increased quality-image numbers (Fig. 2a). Doubtful arable-land gains concentrate in Southern Asia (19.5% of doubtful gains), South America (18.8%), and Western Africa (18.5%; Extended Data Fig. 5c), with countries such as Ghana and Nepal consistently experiencing positive disagreements between all epochs. Our causal analysis using CCM attributes these positive disagreements to improvements in the frequency of quality images in 48.4% of countries (Fig. 3d; see Methods).
These biases in perceived arable-land changes can severely distort perceptions of global food security issues. The 54 countries with moderate to high bias-causing effects include top food-producing countries (e.g., China, Russia, France) and together accounted for 38.4% of global cereal production in 202139. However, they also include many food-insecure countries (e.g., Central African Republic, Niger, Somalia, South Sudan, Yemen, Zimbabwe), where mistaking losses of arable-lands for gains carries the risk that policy-makers fail to recognize emerging crises.
Overestimated arable-land gains can also lead to unfair evaluations of progress towards SDG target 2.4 (sustainable food production) that exaggerate conflicts of food security with ecosystem protection and climate-change mitigation. For example, two recent studies40,41 using GLAD data reported extensive cropland expansion into global protected areas, with massively accelerating expansion rates between the mid-2000s and mid-2010s. The above-described data biases associated with the 2013 launch of Landsat 8 (Fig. 2a, Fig. 3c), however, may render these assessments unreliable. This is illustrated in India, where sudden changes in Landsat data led the GLAD mapping algorithm to falsely re-classify an entire protected Ramsar wetland of > 3,000 ha into arable-land (Extended Data Fig. 5d).
Improving seasonal data completeness exaggerates water gains. SDG Indicator 6.6.1 tracks changes in surface water bodies, such as lakes, rivers, and reservoirs, and is informed by the Landsat-based Global Surface Water product (GSW)17. Particularly in many dryland regions of the world, seasonal water bodies that only exist for a few months per year play a crucial role for water security, both as seasonal sources of drinking water and water for livestock and cropping42, as well as for filling aquifers that sustain water supplies during dry seasons. Even outside drylands, seasonal flooding of river plains affects both natural nutrient inputs in, and leaching from, major agricultural production regions43. The GSW maps seasonal (as well as permanent) surface water extents annually based on monthly classifications of water occurrences.
These data show a nearly 5-fold increase in global seasonal water areas between 1984 and 2020, with increasing trends over 90.1% of the maximum seasonal-water extent. However, because the GSW maps water if as little as 43.5% of expected images per year are available, seasonally biased distributions of those images could either entirely miss seasonal water occurrences or misclassify seasonal for permanent water (if only covering the dry or wet season, respectively). Therefore, long-term increases in seasonal completeness could be falsely mapped as increasing seasonal-water extents44.
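The two failure modes described above can be made concrete with a toy example. The sketch below is not the GSW classification; the rule used (water in every observed month means permanent, water in some observed months means seasonal) is a simplifying assumption, and the pixel, the months, and the function name `classify_water` are hypothetical.

```python
from typing import Dict, Set

def classify_water(water_by_month: Dict[int, bool],
                   observed_months: Set[int]) -> str:
    """Classify a pixel-year using only the months that have usable images."""
    seen = [water_by_month[m] for m in sorted(observed_months)]
    if not seen:
        return "no data"
    if all(seen):
        return "permanent"
    if any(seen):
        return "seasonal"
    return "no water"

# Hypothetical pixel that holds water only during a June-September wet season.
truth = {month: month in (6, 7, 8, 9) for month in range(1, 13)}

full_year = classify_water(truth, set(range(1, 13)))
dry_only = classify_water(truth, {1, 2, 3, 11, 12})  # images from dry season only
wet_only = classify_water(truth, {6, 7, 8})          # images from wet season only
print(full_year, dry_only, wet_only)
```

With images from the dry season only, the seasonal water body is missed; with images from the wet season only, it is labelled permanent. As coverage extends to more months over the years, the same unchanged pixel would begin to appear as seasonal water.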
We found that, indeed, global seasonal surface water gains correlate with improvements in seasonal data completeness (number of months with usable data; r² = 0.80; Fig. 3e, Extended Data Fig. 5e), and that these gains are largely unsupported by local discharge measurements (64.5% of gauge stations show disagreements, Extended Data Fig. 7; see Methods). Our causal analysis using CCM found moderate to high bias-causing effects in 144 countries (Fig. 3f), including several with severe water stress (e.g., Yemen, Sudan)39, mischaracterizing persistent and expanding water scarcity issues driven by increasing drought frequencies45 (e.g., in Somalia6; Extended Data Fig. 5f).
Biases disproportionately affect lower-income countries
We found that Landsat data limitations, as well as the resulting biases in perceptions of land changes, occur disproportionately often in countries with lower financial capacity to sustain remote sensing monitoring programs. Specifically, biasing effects on perceived arable-land and seasonal water trends were significantly more frequent in lower-income than in higher-income countries (McNemar’s tests, arable-land: 51.9% of lower-income vs. 46.1% of higher-income, p < 0.01; water: 89.7% vs. 73.1%, p < 0.01; note there was a near-significant difference in deforestation bias in the opposite direction among the respective income groupings of tropical-moist-forest countries; 51.9% vs. 46.1%, p = 0.07; details in Methods). Unless ensuing biases in SDG indicators are accounted for, misperceptions of progress in food- and water-security goals in those countries may hamper adequate international support and timely policy interventions.
Similarly, we found higher average frequencies of years without any usable data in lower-income countries (Wilcoxon test, p < 0.05, avg. of 4.9% [± 2.9] vs. 3.7% [± 4.0] for higher-income countries), affecting 43.3% of their combined area, compared to only 17.2% of the combined area of higher-income countries (mainly high-latitude and offshore territories). Pixels in lower-income countries were also more frequently affected by fluctuations in usable-data frequencies exceeding the expected frequencies under a 16-day recurrence (60.2%, vs. 40.6% for upper-middle-/high-income countries; p < 0.01), and by fluctuations in usable-data months exceeding a typical climate-season length (88.2% for lower-income vs. 70.7% for higher-income countries; p < 0.05).
Future needs: bias corrections, fair product validations, and support to users
While this paper focuses on Landsat data as the most important resource for long-term, global land-change monitoring, all satellite data archives are affected by uneven data coverage and quality46,47. Given the importance of satellite-based land-change observations for sustainability policy, monitoring, and related scientific fields, addressing the highlighted biases caused by limitations in global satellite archives becomes imperative. This will require more rigorous bias-control and more honest validations by data developers, as well as better support for (and commitment by) data users for detecting and addressing remaining uncertainties.
Firstly, expert communities developing remote-sensing-based time-series products should raise standards for correcting for satellite data limitations before applying classification algorithms. An increasing array of sophisticated approaches can fill gaps in satellite archives48,49, for example, by fusing sparse Landsat with coarser-resolution but less incomplete data from the MODIS and AVHRR satellite missions to generate global, seamless data cubes50, but such approaches remain rarely applied in operational land-surface monitoring. To further improve their performance, information on data coverage and quality, as provided here for Landsat (see Data availability and Code availability sections), could be made available for all sensor systems, enabling its explicit use by gap-filling models for correcting satellite data to desired, high-quality levels.
Secondly, we need higher standards for assessing uncertainties in the derived land-change products. All three products scrutinized here were, in fact, extensively validated by their developers. Yet, validation samples were mostly generated by visually interpreting Landsat images17–19 – as is true for nearly all global time-series, especially for pre-2000 periods, where few alternative sources of validation data exist50. For accuracy assessments to be meaningful, however, accuracies must be comparable between validated and non-validated pixels and years51. In reality, the selection of validation samples and their correct visual interpretations are both biased away from the most data-limited regions and periods, which is also where the classification algorithms are most likely to fail. This likely results in exaggerated accuracy scores and hence unwarranted trust in Landsat-based monitoring products. To be honest towards data users, accuracy tests should directly incorporate information on limitations in both satellite and validation data. Again, models could be used to generate seamless predictions of class-confusion probabilities in between existing samples that are representative of all pixels and years, including those with limited data. Much more than allowing ‘corrections’ of all pixel values51, this should allow mapping remaining uncertainties in ways that enable their due propagation into change assessments and indicators52, for example, in the form of probability-mass functions of alternative class sequences for each pixel.
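As an illustration of what a probability-mass function over alternative class sequences could look like for a single pixel, consider the sketch below. It assumes, purely for simplicity, that per-year class probabilities are independent between years, which ignores that forest loss is rarely reversed within a few years; the probabilities and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

def sequence_pmf(yearly_probs: List[Dict[str, float]]
                 ) -> Dict[Tuple[str, ...], float]:
    """PMF over class sequences, assuming independent years (illustration only)."""
    pmf = {}
    for combo in product(*(p.items() for p in yearly_probs)):
        sequence = tuple(cls for cls, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        pmf[sequence] = prob
    return pmf

def first_change_pmf(pmf: Dict[Tuple[str, ...], float],
                     target: str = "deforested") -> Dict[Optional[int], float]:
    """PMF of the index of the first year showing `target` (None if never)."""
    out: Dict[Optional[int], float] = {}
    for sequence, prob in pmf.items():
        index = sequence.index(target) if target in sequence else None
        out[index] = out.get(index, 0.0) + prob
    return out

# Hypothetical class probabilities for three consecutive years of one pixel;
# the middle year is data-poor, so its classification is uninformative.
yearly = [
    {"forest": 0.9, "deforested": 0.1},
    {"forest": 0.5, "deforested": 0.5},
    {"forest": 0.1, "deforested": 0.9},
]

pmf = sequence_pmf(yearly)
change = first_change_pmf(pmf)
for index, prob in sorted(change.items(), key=lambda kv: (kv[0] is None, kv[0])):
    print(index, round(prob, 3))
```

Instead of a single mapped deforestation year, such a representation would let a change assessment carry forward how much probability falls on each candidate year, including the possibility that no change occurred.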
Such higher standards would imply more time needed for the development and quality-assurance of time-series products, and thus fewer, more transparent products that pass peer-review and enter the market every year. This would be desirable from the perspective of data users, who are already overwhelmed by too many products to choose from with little guidance on which products they should trust26. Many data users may be similarly overwhelmed by fewer but more voluminous products with rich, pixel-level uncertainty information, as they lack the technical capacity to effectively use them. Thus, we additionally need easy-to-use tools helping with their use, as well as with selecting the most fit-for-purpose product for a given desired application. For example, software packages could support easier incorporation of data uncertainties into, and propagation between, different types of applications (mapping, change assessment, causal analysis, etc.). Similarly, cloud-based tools could automatically test where within a user-specified region and period a given product could plausibly support the desired application, given the product’s uncertainties and/or underlying satellite data limitations.
At the same time, data users should acknowledge that remote-sensing “data” on land changes are not facts, but model-based interpretations of (satellite) data that often inherit large uncertainties. Ultimately, data users carry the burden of validating their original results on land changes, even when they rely on existing products. Easy-to-use and freely available webtools for exploring historical time-series of high-resolution images (e.g., Google Earth Pro) empower them to do so.
By highlighting data limitations and biases in perceived land changes, and offering data-quality layers and suggestions for addressing these, we provide an essential first step. We hope that this may serve as a starting point for the needed collaborative actions, to ensure that satellite-based data can reliably guide progress towards a sustainable future.