**Individual projection results**

For each of the 118 examined test sites, we simulated the future weekly groundwater level development based on six climate projections (see also Table 1). Since these climate projections differ considerably in detail for individual future time periods, we also obtained six different future groundwater level simulations, which should only be interpreted on the basis of longer time periods (at least 30 years)36. Figure 2 depicts the trend as the relative development, in percent, of the annual mean for each of the six projections (A), as well as of the annual upper extreme (97.5%) quantile (B) and the annual lower extreme (2.5%) quantile (D), for all test sites in 2100 compared to the start of the simulation (2014) and normalized to the individual historic range as explained in the methods section. For each site, all relative developments are shown ordered by the strength of the change; the order therefore does not correspond to the numbering of the projections. The boxplots in Figure 2C provide more detailed information for the three maps as well as on the development of the 25% and 75% quantiles; relative and absolute values of the presented changes are given in Table 2. The values of the non-significant trends are not shown in the boxplots, which has to be kept in mind for interpretation, especially for quantiles with many non-significant trends (compare Table 2).
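
The quantities shown in Figure 2 can be illustrated with a minimal sketch. It assumes linearly interpolated quantiles and uses an ordinary least-squares line as a stand-in for the trend estimate; the actual trend estimator and significance test are those described in the methods section and are not reproduced here. All function names are hypothetical.

```python
def quantile(values, q):
    """Linearly interpolated quantile of `values` for q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def annual_statistics(weekly_by_year):
    """Return {year: (mean, 2.5% quantile, 97.5% quantile)} of weekly levels."""
    return {
        year: (
            sum(levels) / len(levels),
            quantile(levels, 0.025),
            quantile(levels, 0.975),
        )
        for year, levels in weekly_by_year.items()
    }


def relative_change(years, values, historic_min, historic_max):
    """Change along a least-squares line from the first to the last year,
    in percent of the historic range (assumption: OLS stands in for the
    trend estimator of the methods section; `years` must be sorted)."""
    n = len(years)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (v - mean_value) for y, v in zip(years, values))
    slope = sxy / sxx
    change = slope * (years[-1] - years[0])
    return 100.0 * change / (historic_max - historic_min)
```

For example, an annual statistic that falls by 1 m between 2014 and 2100 at a site with a historic range of 2 m would correspond to a relative change of -50% in this sketch.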

In case of the mean, approximately 54% of all simulations (387 of 708, i.e. six projections for each of the 118 sites) show a significant trend until 2100. At least one of the projected developments is considered significant (p<0.05) at every site, which, however, also means that there are several sites with mainly non-significant trends (grey). The large majority of the significant trends are negative, with medians ranging between -23% for p1 and -6.6% for p6 (Table 2). In Figure 2C we observe that p1 systematically shows the strongest declines until 2100, which are significant for 117 of the 118 wells. The overall maximum decline is -46%, clearly indicating the different character of p1 compared to the other projections. Projections p3-p5 in particular show more moderate changes of the mean (medians range from -8% to -13%), with many non-significant trends (35%-54%). Simulations based on p2 and p6 find significant trends for only around 30% of all sites, and their significant results are moderate as well. Three projections (p2, p3, but mainly p6, compare Table 2) even show some positive developments until 2100; overall, however, such developments are rare and occur at sites where other projections simultaneously show at least non-significant or even negative trends. In absolute numbers, the mentioned median changes are on the order of -0.1 m to -0.4 m, which is highly dependent on the individual groundwater level range at each site. Despite many non-significant and some positive trends, there is a clear tendency of declining mean groundwater levels until 2100. Additionally, we observe a slight spatial tendency with more and stronger significant negative trends in some areas of northern and eastern Germany, where we also find the strongest overall relative declines. In southern Germany, many wells show several non-significant trends, and most positive changes are also found scattered in this region; however, some of the southernmost wells show very strong declines for single simulations, comparable to the strong declines in eastern Germany.

In case of the upper extreme value quantile (97.5%), this spatial pattern is partly confirmed. In Figure 2B we clearly observe many significant declines in eastern Germany, while the large majority (>70%) of the trends across Germany are considered non-significant. Increasing trends are found comparatively often for the 97.5% quantiles, with increases of up to 20%. Comparing the projections with each other (Figure 2C), we find a similar behavior as before: p1 shows the strongest significant decreases (down to -47%); p3, p4 and p5 tend to move in the moderate negative range (medians around -12%); and p2 and p6 more often show positive trends (positive medians of the significant trends). We therefore observe a development of the upper extreme values that partly contradicts that of the mean. The absolute numbers of the mentioned changes are again on the order of a few tens of centimeters upwards and downwards. The strongest simulated absolute increase (max. of p6) is almost 5 meters; however, it occurs in a karstic well in southern Germany, which has a high variability anyway.

The tendency of declining groundwater levels that we observed for the mean becomes clearer for the lower extreme values (2.5% quantile) shown in Figure 2D. We still observe around 36% non-significant trends; however, the remaining roughly 65% show almost exclusively negative changes, with a maximum decline of -81% (Table 2). The median change of the 2.5% quantile ranges from -38% for p1, which again shows the strongest declines, through -21% for p4, to around -10% each for p2, p3, p5 and p6. The latter four, and especially p6, contain the majority of non-significant trends; the changes shown in the boxplots therefore tend to be overestimated. There are only a few sites where only one result is considered significant. These occur mainly near the Baltic Sea coast, in the central and eastern part of northern Germany, and in the central area of southern Germany. In the latter, however, there are at the same time quite strong relative decreases, just as we also find them in eastern Germany and in the western part of northern Germany. This pattern is largely consistent with the spatial pattern of the mean mentioned above. Most median decreases (p2-p6) are on the order of -0.1 to -0.4 m; for p1 the median decrease even reaches -0.7 m for the annual lower extreme value quantile. All projections except p6 agree that every significant change amounts to a decrease of at least 0.1 m (max. values for 2.5% quantile in Table 2).

Considering all results, we see a clear overall tendency toward declining groundwater levels, with stronger declines for lower quantiles, i.e. groundwater level lows will occur more frequently and will be more severe in the future. At the same time, mostly no trends or even increasing trends are found for the upper extreme values, which means that the overall variability will increase significantly by the end of the century.

Table 2: Detailed numbers for each projection on relative changes (left), already shown as boxplots (Figure 2C). Right tables show associated absolute changes in meters.

Figure 3 shows the detailed development at four selected sites (black boxes in Figure 2). For each site we plot the six projected groundwater level time series for the far future (2070-2100) (A1-D1), as well as the complete simulations, separately as heatmaps with years as rows and weeks as columns (A2-D2). The time series plots show the diverging development of some projections in the far future; however, there is no strict sequence of projections in terms of absolute groundwater height, and the order can change throughout the years. Most heatmaps show the development described above by displaying generally declining groundwater levels (more and darker red, as well as lighter or constant blue shadings towards 2100 in the lower part of the heatmaps). What we additionally see is that the duration of low groundwater levels increases (red shadings get wider) for all sites. The period of higher groundwater levels throughout the year shows two possible developments: it either gets shorter (blue shadings get narrower, e.g. B2-p1, or even change to red, e.g. D2-p4) or stays constant in length (width of blue shadings does not change, e.g. A2-p2 and A2-p6), in some cases even with increasing peak height (darker blue, e.g. A2-p6). In both plot types we can also recognize sequences of several more extreme years, such as several dry years around 2090 in B1-p4, which are also reflected in a dark-red stripe in the corresponding heatmap (B2-p4). Such sequences are especially critical because effects accumulate and dependent ecosystems are not able to recover but are instead particularly vulnerable to further damage in subsequent years due to reduced resilience.
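
The heatmap layout described above (years as rows, weeks as columns) can be sketched as follows. This is only an illustration of the data arrangement, assuming that ISO calendar weeks are used and that the series has one value per week; the function name and the choice of 53 week slots are assumptions, not the authors' plotting code.

```python
from datetime import date, timedelta


def heatmap_matrix(first_day, weekly_levels):
    """Arrange a weekly series as {ISO year: [53 week slots]}.

    Slots without a value (e.g. week 53 in most years) stay None.
    """
    matrix = {}
    for i, level in enumerate(weekly_levels):
        iso = (first_day + timedelta(weeks=i)).isocalendar()
        iso_year, iso_week = iso[0], iso[1]
        row = matrix.setdefault(iso_year, [None] * 53)
        row[iso_week - 1] = level  # column index is the ISO week minus one
    return matrix
```

Each row of the resulting matrix can then be color-coded, so that widening red or narrowing blue bands from the top to the bottom row correspond to longer low-level or shorter high-level periods.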

**Average projection results**

We consolidated the separate projection results for each site into one by calculating the mean of the significant trends shown in Figure 2. Only sites with at least four (and thus a majority of) significant results are included; the rest are depicted as not significant on average. Results are shown in Figure 4. The development of the mean is depicted in the upper left map, and we find that, according to the aforementioned definition, 41% of the wells (49 of 118) are considered significant on average and show a median change of -13%. We do not find any wells with increasing mean trends and observe a similar spatial pattern as before, with the strongest decreases in eastern Germany. For wells in southern Germany we observe noticeably many non-significant changes. All in all, we simulated significant average decreases between -0.2 m and -2.4 m for about 25 wells, and a decrease of at least 10 cm for all 49 wells in Figure 4A (max. abs. value of the mean in Figure 4D). In case of the upper extreme value quantile (97.5%), the consolidated results show mainly no trends, especially for southern Germany; the upper extreme values will therefore probably remain at their current level. A few sites (5), all of them in northern Germany, are expected to show increased upper extreme values of up to 15% or 1.5 m; however, we still observe a spatial pattern of decreasing upper extreme values in eastern Germany, down to -30% or -0.7 m. Hence, in this area the groundwater levels will probably decrease in every part of the annual cycle and with comparably high certainty (many consistent significant simulations). This also applies to the lower extreme values (2.5% quantile), which show significant average decreases for more than half of the examined sites all over Germany, with median decreases of -19% (equivalent to -0.3 m, compare Figure 4C, D). On this map, a clear spatial pattern is no longer recognizable.
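
The consolidation rule described above can be written down as a minimal sketch. The threshold of four significant results and the significance level p<0.05 are taken from the text; the function name and the input layout (one relative trend and one p-value per projection) are assumptions for illustration.

```python
def consolidate(trends, p_values, alpha=0.05, min_significant=4):
    """Mean of the significant trends of one site.

    Returns None if fewer than `min_significant` of the projections are
    significant, i.e. the site counts as not significant on average.
    """
    significant = [t for t, p in zip(trends, p_values) if p < alpha]
    if len(significant) < min_significant:
        return None
    return sum(significant) / len(significant)
```

With hypothetical relative trends of -20%, -10%, -30%, -40%, +5% and 0%, of which only the first four are significant, the consolidated value would be -25%; with only three significant results the site would be reported as not significant on average.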

**Annual maximum and minimum timing aspects**

Besides the relative and absolute developments of the groundwater height, we also investigated timing aspects of the groundwater dynamics. For a possible shift of the annual minimum (Figure 5), we found significant (p<0.05) results for p1 (41 of 118) and also for p4 (33 of 118), with median shifts of 3.4 and 3.1 weeks, respectively (positive, i.e. later). A spatial pattern exists, showing significant and stronger shifts with increasing proximity to the coast in the north and no or even negative (i.e. earlier) shifts in the south. However, please note that most results are not significant and the shown pattern may only serve as an indication for further interpretation.
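
How such a timing shift can be derived from weekly simulations is sketched below. The sketch assumes that the timing is the week index of the lowest value within each year and that the shift is read off a least-squares line between the first and the last year; the significance test used in the study is not reproduced, and minima close to the turn of the year (where week numbers wrap around) are not treated specially. Function names are hypothetical.

```python
def week_of_annual_minimum(weekly_levels):
    """1-based week index of the lowest groundwater level within one year."""
    return weekly_levels.index(min(weekly_levels)) + 1


def timing_shift(years, weeks_of_minimum):
    """Shift in weeks along a least-squares line from the first to the last
    year; positive values mean a later minimum (`years` must be sorted)."""
    n = len(years)
    mean_year = sum(years) / n
    mean_week = sum(weeks_of_minimum) / n
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum(
        (y - mean_year) * (w - mean_week)
        for y, w in zip(years, weeks_of_minimum)
    )
    return (sxy / sxx) * (years[-1] - years[0])
```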

Even fewer significant shifts were found in case of the annual maximum timing (not shown). Especially for snow-dominated regions, a shift of the peak timing from spring towards winter is expected in the context of climate change; however, Germany as a whole cannot be considered snow-dominated. This is in accordance with our findings, because we found mainly non-significant shifts (fewer than 10 significant shifts per projection). Only in case of p4 did we detect a slightly larger number of significant shifts (29 of 118). Here, the maximum even occurs a median of 4 weeks later during the annual cycle, contrary to the shift expected for snow-dominated regions.

**Model input analysis**

From the combined analysis of our groundwater level simulations and the model inputs shown in the introduction, we can conclude that temperature, rather than precipitation, is the main driving factor for declining groundwater levels. The reasoning is that mostly no significant change or even an increase in precipitation is projected, yet our models still frequently show declining groundwater level tendencies, which are therefore most likely caused by the significant temperature increase until the end of the century. Our results are thus consistent with other studies, which indicate that the reduction in water availability in the future is driven primarily by changes in temperature34.

This is also reflected in the model interpretability approach (SHAP values) that we used to check the plausibility of our model outputs. The minimum SHAP value for T is mostly lower than the minimum SHAP value observed for P (except for eight sites), i.e. the models have learned that high temperatures can cause stronger groundwater level decreases than low precipitation. This is, however, only an interpretation of what the models learned, which agrees with our conception; causality cannot be derived from it.
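
The comparison of minimum SHAP values can be illustrated with a short sketch. It assumes that per-site SHAP values for the inputs T and P have already been computed elsewhere; the data layout and the function name are hypothetical, and no SHAP library call is shown.

```python
def count_temperature_dominated(shap_by_site):
    """Count sites where the most negative SHAP value of T lies below
    the most negative SHAP value of P.

    `shap_by_site` maps a site id to {"T": [...], "P": [...]}.
    """
    return sum(
        1
        for values in shap_by_site.values()
        if min(values["T"]) < min(values["P"])
    )
```

Applied to all sites, the remainder (total number of sites minus this count) corresponds to the exceptions mentioned above.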