The paper describes the production steps and accuracy assessment of an analysis-ready, open-access European
data cube consisting of 2000–2020+ Landsat data, 2017–2021+ Sentinel-2 data and a 30 m resolution
Digital Terrain Model (DTM). The main purpose of the data cube is to make annual continental-scale
spatiotemporal machine learning tasks accessible to a wider user base by providing a spatially and temporally
consistent multidimensional feature space. This has required systematic spatiotemporal harmonization,
efficient compression, and imputation of missing values. Sentinel-2 and Landsat reflectance values were
aggregated into four quarterly averages approximating the four seasons common in Europe (winter, spring,
summer and autumn), as well as the 25th and 75th percentiles, in order to retain intra-seasonal variance.
Remaining missing data in the Landsat time series were imputed with a temporal moving window median
(TMWM) approach. An accuracy assessment shows that TMWM performs better in Southern Europe
and worse in mountainous regions such as the Scandinavian Mountains, the Alps, and the Pyrenees. We
quantify the usability of the different component data sets for spatiotemporal machine learning tasks with a
series of land cover classification experiments, which show that models utilizing the full feature space (30 m
DTM, 30 m Landsat, 30 m and 10 m Sentinel-2) yield the highest land cover classification accuracy, with
different data sets improving the results for different land cover classes. The data sets presented in the
paper are part of the EcoDataCube platform, which also hosts open vegetation, soil, and land use / land
cover (LULC) maps. All data sets are available under a CC-BY license as Cloud-Optimized GeoTIFFs
(ca. 12 TB in size) through a SpatioTemporal Asset Catalog (STAC) and the EcoDataCube data portal.
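
To make the aggregation and gap-filling steps concrete, the following is a minimal per-pixel sketch in Python with NumPy. It is an illustration under stated assumptions, not the processing chain used to build the data cube: the function names `seasonal_aggregate` and `tmwm_impute` are hypothetical, and the window rule (a symmetric window over the same season in neighbouring years that widens until at least one valid value is found) is an assumption, since the exact TMWM rules are not given here.

```python
import numpy as np


def seasonal_aggregate(values, season_ids, n_seasons=4):
    """Aggregate one pixel's observations into per-season statistics.

    values     : 1-D reflectance values, NaN marks a missing observation
    season_ids : 1-D integers in 0..n_seasons-1, one per observation
    Returns an (n_seasons, 3) array with columns P25, mean, P75;
    seasons without any valid observation stay NaN.
    """
    values = np.asarray(values, dtype=float)
    season_ids = np.asarray(season_ids)
    out = np.full((n_seasons, 3), np.nan)
    for s in range(n_seasons):
        obs = values[season_ids == s]
        obs = obs[~np.isnan(obs)]
        if obs.size:
            out[s] = (np.percentile(obs, 25), obs.mean(), np.percentile(obs, 75))
    return out


def tmwm_impute(series, max_half_window=None):
    """Fill NaN gaps with the median of valid values in a temporal window.

    series : 1-D values of one season across consecutive years.
    ASSUMPTION: the window is symmetric and grows one step at a time
    until it contains at least one valid (original, non-imputed) value.
    Gaps with no valid value within max_half_window remain NaN.
    """
    series = np.asarray(series, dtype=float)
    filled = series.copy()
    n = series.size
    if max_half_window is None:
        max_half_window = n - 1
    for i in np.flatnonzero(np.isnan(series)):
        for w in range(1, max_half_window + 1):
            window = series[max(0, i - w): i + w + 1]
            valid = window[~np.isnan(window)]
            if valid.size:
                filled[i] = np.median(valid)
                break
    return filled


# Example: summer values of one pixel over six years, with three gaps.
summer = [0.1, np.nan, 0.3, np.nan, np.nan, 0.6]
summer_filled = tmwm_impute(summer)
```

Reading only from the original series, rather than from already imputed values, keeps filled gaps from propagating into later fills; this is a design choice of the sketch.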