## Region of the Gangetic Plain

In order to define the region of the Gangetic Plain, the land topography is obtained from ETOPO1 Global Relief Model32. The land surface elevation data from Ice Surface grid-registered version of ETOPO1 within the region from 75°E to 93°E and from 21°N to 28°N are interpolated into a 1°⋅1° grid, and grid boxes where the altitude is lower than 150m are treated as the region of Gangetic Plain. The topography of South Asia is shown in Fig. S1, and the definition of the GP region is shown as the red contour in Fig. 1a and Fig. S1.

### Precipitation Data

For precipitation observation, several datasets are used including rain-gauge based precipitation products and satellite estimates. These datasets are monthly time series products by the Global Precipitation Climatology Centre (GPCC, version 2020)33, the Global Precipitation Climatology Project (GPCP, version 2.3)34, the CPC Merged Analysis of Precipitation (CMAP)35 and NOAA’s Precipitation Reconstruction over Land (PREC/L)36. Data from 1979 to 2019 are used to analyze the South Asia summer monsoon precipitation trend since the start of satellite era. The precipitation data from these datasets are linearly interpolated into a 1°⋅1° grid to calculated the regional mean time series of summer monsoon precipitation over the Gangetic Plain. Results of GPCC is presented in the main text while trends of other datasets are shown in supporting information.

### Aerosol Optical Properties

We use Aerosol Robotic Network (AERONET) observations as a reference of aerosol optical depth and single scattering albedo trends. For this study, Level 2.0 monthly AOD products37 and Level 1.5 monthly SSA products38 retrieved at 440 nm, 675 nm, 870 nm and 1020 nm at Kanpur station (26.5°N, 80.2°E) are used to represent the observed trend of AOD and SSA in northern India, since Kanpur station is identified as the only long-term station covering the 2000 to 2020 period compared to the other stations in South Asia. A monthly mean is considered valid only if there are more than 5 measurements for that month. The linear trends of AOD and SSA are calculated from the monthly time series directly. To compare with models, AOD measured at the above four wavelengths are linearly interpolated to 550 nm on a logarithm scale, and SSA at these wavelengths are interpolated to 550 nm by a second-order polynomial fit.

### Model Configuration And Experiment Design

In order to estimate the impacts of aerosols on South Asia summer monsoon precipitation, a set of aerosol control simulations are performed with the Community Earth System Model version 1.2.2 (CESM)39. Both the Community Atmosphere Model version 4 (CAM4)40 and version 5 (CAM5)41 are used in our study. CAM4 is forced by three-dimensional monthly mean distributions of aerosol mass for sulfate, sea salt, black carbon, organic carbon and dust, while in CAM5 these aerosol species are simulated by the three-mode version of the modal aerosol model (MAM-3) which includes Aitken, accumulation and coarse modes.

We first use CAM5 to obtain the aerosol mass concentration fields by reference to the emission in CMIP6 historical input files at the 1980s and 2010s level. The average anthropogenic emissions from 1980 to 1984 (namely 1980s level), from 2000 to 2004 (namely 2000s level) and from 2010 to 2014 (namely 2010s level) in CMIP6 historical input files are calculated. Then the ratio between 1980s level (2010s level) and 2000s level is multiplied with the original CAM5 present-day emission data (at year 2000) so that the emission files of 1980s level (2010s level) fitting for CAM5 are obtained. Driven by these emission files, CAM5 experiments are simulated for 8 years, and the first two years are treated as spin-up.

The ratio between 2010s and 1980s level of vertically integrated mass concentration of sulfate, sea salt, black carbon, organic carbon and dust simulated by CAM5 are multiplied with the prescribed historical aerosol fields of CAM4. The first two simulations of CAM4 are forced by these fields at 1980s and 2010s level, namely aer_control_1980s and aer_control_2010s. The ratio of AOD in aer_control_1980s and aer_control_2010s simulations are calculated and then multiplied with the mass concentration fields of all aerosol species at 2010s level, so that the aerosol composition is kept at 2010s level while the total mass of aerosols is scaled into 1980s level, namely aer_frac_2010s. Similar processes are performed to obtain mass concentration fields with aerosol composition at 1980s level and total mass at 2010s level, namely aer_frac_1980s. The summary of these experiments is shown in Table S1. The horizontal resolution of the atmospheric component is set to 2.5°⋅1.875°. All the other forcing, including greenhouse gases and sea surface temperature, are kept at present-day level (2000). Each experiment contains 10 ensemble members with a random perturbation of temperature to the initial conditions, and all cases are simulated for 2 years. Difference between aer_control_2010s and aer_control_1980s represents the responses of climate system to the change of aerosols, named as AER experiment. Differences between aer_frac_1980s, aer_frac_2010s and aer_control_1980s stand for the contributions of AOD and SSA change, namely AOD experiment and SSA experiment, respectively.

### Cmip6 Model Simulations

Monthly precipitation data as well as other meteorological variables from the Coupled Model Intercomparison Project phase 6 (CMIP6)42 are used to calculate a multi-model-mean result. Four experiments including AMIP, historical, hist-GHG and hist-aer are taken into consideration. AMIP simulations are atmospheric model simulations forced by observed sea surface temperature. Historical simulations are fully coupled simulations forced by multiple forcing including greenhouse gases, anthropogenic aerosols and their precursors, while hist-GHG and hist-aer are single-forced experiments with changing GHGs and anthropogenic aerosols, respectively43. Monthly outputs of these scenarios are available for 13 models in CMIP6, and each model has multiple ensemble members. Details of the selected models are shown in Table S2. Data from 1980 to 2014 are used to calculate the trends since the 1980s.

### Moisture Budget Analysis

To estimate the contributions of physical processes to precipitation trend, here we use the moisture budget analysis, which has been widely used to understand the mechanisms that induce precipitation associated changes44–46. The vertically integrated moisture budget of each grid box can be written as

$$P-E\approx -<\nabla \bullet \left({V}q\right)>=-<\omega {\partial }_{p}q>-<{v}\bullet \nabla q>+res$$

where *P* represents precipitation, *E* represents evaporation, V represents wind vector, *q* represents specific humidity, *ω* represents pressure velocity, and v represents horizontal wind vector. The pressure velocity is set to zero at the surface and the top of atmosphere, and *< >* stands for mass integration through the whole column. The change of precipitation can be divided into the contributions of evaporation, vertical moisture advection \(-<\omega {\partial }_{p}q>\), horizontal moisture advection \(-<{v}\bullet \nabla q>\) and residual term. The residual term includes transient eddies and nonlinear effects.

### Trend Estimation And Significance Test

The linear trends of precipitation, meteorological fields, moisture budget terms and other variables are estimated using the Sen’s Slope47. The Sen’s slope *b* is calculated as

$$b=Median\left(\frac{{X}_{i}-{X}_{j}}{i-j}\right) \forall j<i$$

where *X**i* and *X**j* represent the value at the *i*th and *j*th time step in the time series, respectively.

Then the significance test of the linear trends is performed using the Mann-Kendall statistical test48,49. The test statistic *S* is calculated as

$$S=\sum _{i}^{n-1}\sum _{j=i+1}^{n}sgn({X}_{j}-{X}_{i})$$

where *n* stands for the length of time series, and *sgn* stands for the sign function as

$$sgn\left({X}_{j}-{X}_{i}\right)=\left\{\begin{array}{c}1 if {X}_{j}>{X}_{i}\\ 0 if {X}_{j}={X}_{i}\\ -1 if {X}_{j}<{X}_{i}\end{array}\right.$$

The variance of *S* is

$$Var\left(S\right)=\frac{1}{18}n(n-1)(2n+5)$$

And the standard normal test statistic *ZS* is calculated as

$$ZS=\left\{\begin{array}{c}\frac{S-1}{\sqrt{Var\left(S\right)}} if S>0\\ 0 if S=0\\ \frac{S+1}{\sqrt{Var\left(S\right)}} if S<0\end{array}\right.$$

if *n* > 30. The 95% confidence level is met when the absolute value of *ZS* is larger than 1.96 according to the normal distribution table.

The significance test of the differences between CESM CAM4 simulations are performed using Student’s *t* test. The Student’s *t* is calculated as

$$t=\frac{{\stackrel{-}{x}}_{1}-{\stackrel{-}{x}}_{2}}{\sigma \sqrt{\frac{1}{{n}_{1}}+\frac{1}{{n}_{2}}}}$$

where \(\stackrel{-}{x}\) is the sample means, *n* is the sample sizes, and σ is the pooled standard deviation, which is calculated as

$$\sigma =\sqrt{\frac{{n}_{1}{s}_{1}^{2}+{n}_{2}{s}_{2}^{2}}{{n}_{1}+{n}_{2}-2}}$$

where *s* is the standard deviation. The test statistic under the null hypothesis has Student’s *t* distribution with *n**1* *+ n**2**-2* degrees of freedom.