Records of historical floods. Data on historical floods are taken primarily from the HANZE (Historical Analysis of Natural HaZards in Europe) database, v2.1 (29). The dataset was constructed from more than 800 sources of information, ranging from news media through government reports to scientific papers. It records the date, location (by subnational regions from Paprotny and Mengel (32), based on the European Union’s NUTS level 3 classification, v2010 (49)), type and impact of 2521 floods that caused significant impacts between 1870 and 2020. Of these, 2037 events occurred within this study’s timeframe (1950–2020), and 1504 events were detected in the factual flood catalogue (30), described below. A reported event was considered matched with modelled floods if both the hydrological and impact thresholds defined in the catalogue were reached in at least one NUTS3 region impacted by the reported event, within the same timeframe as that event. Not all events had all impact statistics available (fatalities, population affected, economic loss). In such cases, impacts were gap-filled with model estimates from Paprotny et al. (33), also described below. In addition to the HANZE database, 237 floods were identified in Paprotny et al. (30) as significant, based on qualitative or partially quantitative data. Of these, 225 were included in the attribution study, with impacts estimated in Paprotny et al. (30), bringing the dataset to 1729 events. The remaining 12 events were excluded because each represented a phase of a flood event from the HANZE database that the model indicated as a separate event, but whose impacts were already included in the “main” model event connected with HANZE. In total, our attribution dataset includes an estimated 83% of fatalities, 96% of the population affected and 95% of direct economic losses from flooding in Europe between 1950 and 2020 (Supplementary Table S2).
Owing to the limited resolution of the meteorological data and hydrological models, many small or rapid flash floods, in particular, could not be included.
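The matching rule between reported and modelled floods can be illustrated with a minimal Python sketch. The data structures and names (`ReportedEvent`, `ModelledFlood`, `is_matched`) are hypothetical and are not taken from HANZE or the flood catalogue. The sketch assumes that the modelled floods passed in have already reached both the hydrological and impact thresholds, and it simplifies "the same timeframe" to overlapping date ranges. The region codes and dates are illustrative only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReportedEvent:
    regions: frozenset  # NUTS3 codes reported as impacted
    start: date
    end: date


@dataclass(frozen=True)
class ModelledFlood:
    region: str  # NUTS3 code where both thresholds were reached (assumed pre-filtered)
    start: date
    end: date


def is_matched(event, floods):
    """Return True if at least one modelled flood overlaps the event in space and time."""
    return any(
        f.region in event.regions and f.start <= event.end and f.end >= event.start
        for f in floods
    )


# Illustrative inputs, not real records
event = ReportedEvent(frozenset({"DE111", "DE112"}), date(2002, 8, 10), date(2002, 8, 20))
floods = [
    ModelledFlood("DE112", date(2002, 8, 12), date(2002, 8, 15)),
    ModelledFlood("FR101", date(2002, 8, 12), date(2002, 8, 15)),
]
matched = is_matched(event, floods)        # one flood shares a region and overlaps in time
unmatched = is_matched(event, floods[1:])  # only a flood in an unrelated region remains
```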
Factual flood data. Each historical event in the study was reconstructed using a set of connected, pan-European models of hazard, exposure and vulnerability. We highlight the main points of the methodology, but refer to the underlying publications on the individual components for technical details.
Climate change (riverine and flash floods). The basis of the riverine flood catalogue is the Hydrological European ReAnalysis (HERA) dataset (31), in which river discharge for Europe was simulated for the period 1950–2020 using the state-of-the-art hydrological model LISFLOOD (50), with a spatial resolution of 1 arc minute (1.8 km at the equator) and a temporal resolution of six hours. The model was forced with climate reanalysis data (ERA5-Land (51)), bias-corrected and downscaled to the model resolution with weather observations using the ISIMIP3BASD method (52). Extreme river discharge events along river segments with an upstream catchment area of at least 100 km² were identified in the dataset and merged in space and time. Then, based on the maximum river discharge at each grid cell, they were combined with a stack of flood hazard maps for different discharge scenarios (53, 54) to derive the footprint and water depths of each event. Only regions with sufficient potentially flooded area, and events with sufficient impact potential (based on population and asset value maps (32)), were included in the final catalogue of 11,205 riverine floods (30).
Climate change (coastal and compound floods). Similarly to river discharge, the coastal flood catalogue was constructed by combining a set of extreme sea level events with coastal flood maps. To obtain sea levels, a Delft3D-based simulation of storm surges along European coasts (55) was carried out, driven by ERA5 reanalysis data (56). Hourly storm surge heights were combined with data on tidal elevations, significant wave height (converted to wave run-up), long-term sea level rise, mean dynamic topography of the ocean, and glacial isostatic adjustment to estimate the total water level. Based on those water levels, flood hazard maps were computed according to the methodology of Vousdoukas et al. (57) to derive footprints and water depths. The final number of potential coastal events in the catalogue is 2436. In addition, where potential riverine and coastal floods co-occurred in the same NUTS3 region, a separate ‘compound’ event was added to the catalogue (1058 events in total).
Catchment alteration. The hydrological model used to derive river discharges for the factual flood catalogue included changes in socio-economic input maps for every year of simulation (31). This accounted for new reservoirs (based on the year of construction), land use changes (rice, other irrigated land, forest, sealed surfaces, open water, and other, i.e. non-irrigated agriculture, non-forest natural and pervious artificial), and water demand changes.
Population and economic growth. This driver represents the changing population and gross domestic product (GDP) at the level of NUTS3 regions. A harmonised dataset of historical statistics, based on almost 400 different data sources, was created as part of the HANZE database (32) spanning the period 1870–2020.
Land use and economic structure. Historical statistics in HANZE were applied to an exposure model that downscales NUTS3-region level changes in land use, population, GDP and fixed asset value to a 100 m grid. The modelling approach combines high-resolution predictor maps for changes in exposure distribution with rule-based and statistical methods (32). The dataset can capture historical changes in the spatial redistribution of exposure due to population and economic growth, land use change (especially through urbanization and building of new infrastructure), change in the structure of the economy (GDP and asset composition) and change in asset-to-GDP ratio per economic sector.
Flood protection levels. In the factual dataset, the locations where flood defences were insufficient for the return period of the event were identified as the NUTS3 regions recorded as impacted in the HANZE database. Other NUTS3 regions with potentially significant flooding identified in the modelled flood catalogue were considered to be protected from flooding.
Flood vulnerability. For each event, within the historical impact zone, a set of four vine-copula models was applied to estimate fatalities, population affected and economic loss. The models were built using data on historical, reported impacts in HANZE converted to relative impacts using potential impact in the modelled flood catalogue (30). A set of hydrological and socioeconomic predictors was derived to predict the changes in flood vulnerability in space and time, and then applied to all events in this study. Where available, reported impacts were used, while any gaps in the data were filled with the model predictions. Two vine-copula models were derived to estimate fatalities: one for the probability of any fatalities (as a large share of events had no fatalities) and the other for the magnitude of fatalities. However, in most cases (all 1504 events from the HANZE database) the occurrence or non-occurrence of fatalities was known, hence only the second model was applied to estimate the magnitude of impacts. For the remaining 225 events, the estimated fatalities integrate the probability of fatality occurrence and the estimated magnitude. As one of the parameters in the vulnerability models is the total relative impact of the event, if under a certain counterfactual scenario (climate change, catchment alteration, flood protection level) additional regions are affected beyond the factual ones, the vulnerability estimate under that scenario is used only for those additional regions, and not for the factual impact zone, to maintain a consistent estimate of vulnerability.
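The two-stage treatment of fatalities can be sketched as follows. This is one plausible reading only: it assumes that "integrating" the occurrence probability and the magnitude means taking their product (an expected value), which the text does not state explicitly. The function name and the inputs are hypothetical stand-ins for the outputs of the two vine-copula models.

```python
def estimate_fatalities(magnitude, p_occurrence=None, occurred=None):
    """Combine the occurrence and magnitude models for a single event.

    magnitude    -- predicted fatalities, conditional on fatalities occurring
    p_occurrence -- predicted probability of any fatalities (used if occurrence is unknown)
    occurred     -- True/False when occurrence is known from reports, else None
    """
    if occurred is not None:
        # Occurrence known: only the magnitude model is needed
        return float(magnitude) if occurred else 0.0
    if p_occurrence is None:
        raise ValueError("p_occurrence is required when occurrence is unknown")
    # Occurrence unknown: assumed expected value, probability times magnitude
    return float(p_occurrence) * float(magnitude)


known_fatal = estimate_fatalities(magnitude=8.0, occurred=True)
known_nonfatal = estimate_fatalities(magnitude=8.0, occurred=False)
unknown = estimate_fatalities(magnitude=8.0, p_occurrence=0.25)
```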
Counterfactual flood data. Each event was modelled with counterfactual scenarios, each representing the climatic and socioeconomic conditions of the year 1950. Every combination of factual and counterfactual scenarios (64 in total) was computed, but here we present the scenario in which only the driver in question is set to counterfactual, while all other drivers remain factual. Modelled impacts under all scenarios were scaled by a factor corresponding to the difference between modelled and reported impacts, if such reported data were available in HANZE. However, this adjustment was made only for those regions that were reported as impacted in HANZE. Additional regions impacted under counterfactual scenarios were not adjusted. Likewise, if no fatalities were reported for the event, the counterfactual model estimates were not adjusted.
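The scenario design can be illustrated with a short Python sketch. The list of six drivers is inferred from the headings of this section and from the count of 64 combinations (2^6); the identifiers are hypothetical. The scaling helper assumes that the adjustment factor is the ratio of reported to modelled factual impact, which is an interpretation and not a documented formula.

```python
from itertools import product

# Driver names inferred from the section headings (assumption)
DRIVERS = (
    "climate_change",
    "catchment_alteration",
    "population_economic_growth",
    "land_use_economic_structure",
    "flood_protection",
    "flood_vulnerability",
)

# Every combination of factual and counterfactual states across the drivers
scenarios = [
    dict(zip(DRIVERS, states))
    for states in product(("factual", "counterfactual"), repeat=len(DRIVERS))
]

# Scenarios in which exactly one driver is set to counterfactual
single_driver = [
    s for s in scenarios if sum(v == "counterfactual" for v in s.values()) == 1
]


def scale_to_reported(modelled, modelled_factual, reported):
    """Rescale a modelled impact by reported / modelled factual impact (assumed ratio)."""
    if reported is None or modelled_factual == 0:
        return modelled  # no reported data available: leave the estimate unadjusted
    return modelled * reported / modelled_factual


adjusted = scale_to_reported(modelled=80.0, modelled_factual=100.0, reported=150.0)
```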
Climate change (riverine and flash floods). River discharge was converted to counterfactual time series based on the LISFLOOD run under static 1950 socioeconomic conditions (see ‘Catchment alteration’ below), in which variation of discharge is only due to changes in climate forcing. The detrending of extreme discharges was carried out using the transformed-stationary extreme value analysis (tsEVA) approach (58). The method decouples the detection of non-stationary patterns from the fitting of the extreme value distribution, here assumed to be the generalized Pareto distribution (GPD). It was applied considering: (a) discharges above the 95th percentile (at 6-hourly resolution), (b) a minimum distance between flood peaks of 3 days, (c) a moving 30-year time window and (d) the seasonal (monthly) cycle of discharges. The method converted extreme discharges at a given return period (assessed at the time of the event under historical climate) to the discharge for the same return period in 1950.
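Settings (a) and (b) correspond to a peaks-over-threshold selection, which can be sketched in Python as below. This is not the tsEVA implementation and covers neither the moving-window transformation nor the GPD fit; it shows only one common way to pick independent peaks, keeping the largest exceedance within each 3-day neighbourhood. The function name and the synthetic series are hypothetical.

```python
import numpy as np

STEPS_PER_DAY = 4  # 6-hourly resolution


def select_peaks(discharge, percentile=95.0, min_gap_days=3):
    """Return indices of independent peaks above the percentile threshold."""
    q = np.asarray(discharge, dtype=float)
    threshold = np.percentile(q, percentile)
    min_gap = min_gap_days * STEPS_PER_DAY
    candidates = np.flatnonzero(q > threshold)
    # Visit exceedances from largest to smallest, so the highest peak in a cluster wins
    order = candidates[np.argsort(q[candidates])[::-1]]
    peaks = []
    for i in order:
        if all(abs(int(i) - j) >= min_gap for j in peaks):
            peaks.append(int(i))
    return sorted(peaks), float(threshold)


# Synthetic 6-hourly series: flat baseline with two nearby peaks and one isolated peak
series = np.ones(400)
series[100], series[105], series[200] = 10.0, 8.0, 6.0
peaks, threshold = select_peaks(series)
```

In this example the exceedance at index 105 lies within 3 days (12 time steps) of the larger one at index 100, so it is discarded.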
Climate change (coastal floods). Three components of extreme sea levels were converted to counterfactual. Hourly storm surge height and significant wave height were calculated for 1950 using the tsEVA approach, as for river discharge. However, the parameters of the transformation were modified by using the 99th percentile (due to the higher temporal resolution and shorter duration of coastal events) and by omitting seasonal changes (as they are less relevant here and omitting them allows more confident estimation of return periods). Additionally, long-term sea level rise after 1950 was removed from the series, with data for 1950–1999 from a high-resolution reconstruction of sea levels (59) and data for 2000–2020 from satellite altimetry (60, 61). The tidal cycle from the FES2014 model (62) and glacial isostatic adjustment from the ICE-6G_C model (63, 64) were preserved in the counterfactual.
Catchment alteration. The riverine discharge simulation was repeated with fixed socioeconomic input maps (land use, reservoirs, water demand) for the year 1950. In the case of water demand, the intra-annual (monthly) cycle for the domestic and energy sectors was preserved, as it varies depending on the temperature distribution within the year. The time series was scaled proportionally so that the annual totals were reduced to the 1950 level.
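The proportional scaling of water demand can be sketched as follows. The function name and the numbers are hypothetical; the sketch assumes a single multiplicative factor per year, which preserves the monthly shape while matching the reference annual total.

```python
import numpy as np


def scale_to_reference_total(monthly_demand, reference_annual_total):
    """Scale 12 monthly values so that they sum to the reference (e.g. 1950) total."""
    monthly = np.asarray(monthly_demand, dtype=float)
    return monthly * (reference_annual_total / monthly.sum())


# Illustrative monthly demand for a later year (arbitrary units, annual total 105)
demand_later_year = np.array([12, 11, 10, 8, 7, 6, 6, 6, 7, 9, 11, 12], dtype=float)
scaled = scale_to_reference_total(demand_later_year, reference_annual_total=52.5)
```

Each month keeps its share of the annual total, so the seasonal cycle of the later year is retained at the lower 1950 volume.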
Population and economic growth, land use and economic structure. Exposure data at regional (NUTS3) and gridded levels for the 1950 timestep were used in the impact calculations.
Flood protection levels. Factual and counterfactual protection level, defined as the probability of the hydrological event causing socioeconomic impacts, was modelled both at the event level and for each NUTS3 region constituting the event’s impact zone. Two vine-copula models were used, based on historical data on flood impact occurrence and non-occurrence within the modelled flood catalogue, and a set of hydrological and socioeconomic predictors (33). An event was considered prevented by defences in the counterfactual scenario, if the mean counterfactual model prediction was ‘no flooding’, while the factual prediction was correctly identified as ‘flooding’. At the NUTS3 level, counterfactual predictions of impacts were only considered for regions that were correctly classified by the model under factual conditions as ‘flooding’ or ‘no flooding’. This is a relatively conservative assumption, but increases the confidence in the results.
Flood vulnerability. The vulnerability models were applied under 1950 socioeconomic conditions and flood experience. For floods with zero reported fatalities, fatalities were considered possible under counterfactual conditions if the mean counterfactual prediction of the vine-copula model was ‘fatalities’, while the factual prediction was correctly identified as ‘no fatality’. For floods with reported fatalities, the reverse approach was applied: only floods with correct factual model predictions were allowed to transition to a ‘no fatality’ counterfactual. As for flood protection, this is a relatively conservative assumption, but increases the confidence in the results.
Uncertainty calculation. An uncertainty distribution of each driver of change was estimated using a method appropriate for the underlying modelling approach. When aggregating uncertainties of many floods, e.g. per flood type or country, we assume that the uncertainty of each flood is independent of all others. Consequently, the relative uncertainty of individual events is higher than that of any aggregate (Supplementary Fig. S2).
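The effect of the independence assumption can be illustrated analytically. Under independence, the variances of event-level errors add, so the standard deviation of a sum grows with the square root of the number of events while the total grows linearly. The sketch below uses standard deviations for simplicity and does not reproduce the sampling procedure of the study; all values are hypothetical.

```python
import numpy as np


def aggregate_sd(event_sds):
    """Standard deviation of a sum of independent errors (variances add)."""
    return float(np.sqrt(np.sum(np.square(event_sds))))


# 100 hypothetical events, each with an impact of 10 and a standard deviation of 5 (50%)
impacts = np.full(100, 10.0)
sds = np.full(100, 5.0)

total_sd = aggregate_sd(sds)                # sqrt(100 * 25) = 50
relative_event = sds[0] / impacts[0]        # 0.5 for a single event
relative_total = total_sd / impacts.sum()   # 50 / 1000 = 0.05 for the aggregate
```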
Climate change, catchment alteration. We assume that the uncertainty in extreme sea levels and river discharge can be represented by a normal distribution with a location parameter of zero. The scale parameter of the distribution is assumed to be equal to the mean absolute error between modelled and observed extreme sea levels and river discharge during events recorded in the factual modelled flood catalogue (30). The mean absolute errors from the validation were averaged per country. Countries with very few data points or no validation data at all were combined with neighbouring countries with more data. River discharge in this analysis was normalized per km² of upstream area. Then, a transformation factor of sea level or discharge into impacts was estimated per NUTS3 region using available flood hazard maps and exposure maps for 2020. In this way, the uncertainty distribution of sea level or discharge can be sampled and then converted into a percent deviation of flood impacts from the model’s deterministic estimate.
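The sampling step can be sketched in Python as follows. The sketch assumes that the transformation factor acts linearly, i.e. as a percent change in impact per unit of sea level or normalized discharge, which is a simplification for illustration. The function name, parameter values and random seed are hypothetical.

```python
import numpy as np


def sample_impact_deviation(mae, transformation_factor, n_samples=10_000, seed=0):
    """Sample hazard errors from N(0, mae) and convert them to percent impact deviations."""
    rng = np.random.default_rng(seed)
    errors = rng.normal(loc=0.0, scale=mae, size=n_samples)
    # Assumed linear conversion: percent impact change per unit of hazard error
    return errors * transformation_factor


# Hypothetical values: MAE of 0.2 m in sea level, 50% impact change per metre
deviation_pct = sample_impact_deviation(mae=0.2, transformation_factor=50.0)
lower, upper = np.percentile(deviation_pct, [2.5, 97.5])
```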
Population and economic growth. We assume that the uncertainty of population and GDP can be represented by a normal distribution with a location parameter of zero. The scale parameter of the distribution was assumed to be 2% for population and 3% for GDP per capita in 2020. The value of the parameter for population is constant for all years, while for GDP it increases by 0.1 pp. for each year further back in time, reaching 10% in 1950. The value for population is based on scarce post-enumeration evaluations of population censuses in developed countries (65–67), which indicate that the error of those is usually less than 2%. The 3% value for GDP in 2020 is equivalent to the average revision of historical GDP figures in European countries when transitioning from the ESA 1995 to the ESA 2010 system of national accounts (based on current and archived Eurostat data (32)). The 10% figure is roughly equivalent to the revision of GDP figures for European countries in 1950 between the 2010 and 2023 iterations of the Maddison Project Database of historical GDP estimates (68). It also reflects the lower availability of subnational GDP data in early decades compared to almost complete data in the most recent two decades.
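The year-dependent scale parameter follows directly from the figures above: 3% in 2020 plus 0.1 pp. for each of the 70 years back to 1950 gives 10%. A minimal Python sketch, with hypothetical names:

```python
POPULATION_UNCERTAINTY_PCT = 2.0  # constant for all years


def gdp_uncertainty_pct(year):
    """Scale parameter (%) of the GDP uncertainty: 3% in 2020, 10% in 1950."""
    if not 1950 <= year <= 2020:
        raise ValueError("year must lie within the study period 1950-2020")
    # 0.1 percentage points are added for every year before 2020
    return 3.0 + (2020 - year) / 10.0


gdp_2020 = gdp_uncertainty_pct(2020)
gdp_2000 = gdp_uncertainty_pct(2000)
gdp_1950 = gdp_uncertainty_pct(1950)
```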
Land use and economic structure. The uncertainty was sampled from the uncertainty bounds (nearly uniformly distributed) of the HANZE exposure model (32). It represents the uncertainty of changes in population distribution due to urbanization, modelled with copulas, and the uncertainty of land-use transitions, modelled with a Bayesian network.
Flood protection levels, flood vulnerability. The uncertainty was sampled from the probabilistic output of the vine-copula models (33). For flood protection levels, the change in impacts due to uncertainty of impacts at the NUTS3 level was assessed using potential impact with depth-damage functions without applying the vine-copula vulnerability models. This was done to avoid reapplying the vine-copula model for each sample of possible flood footprints, which would be too computationally demanding.