A Long-term Global Comparison of IMERG and CFSR with Surface Precipitation Stations

As the Integrated MultisatellitE Retrievals for Global Precipitation Measurement (GPM) (IMERG) products are expected to be available at least until mid-2030, an extensive spatio-temporal comparison of IMERG with in-situ observations is highly desirable to guide its potential usage, especially for hydro-climatic applications in data-scarce conditions. For reference, we selected the widely used Climate Forecast System Reanalysis (CFSR) precipitation product and compared both IMERG and CFSR with monthly precipitation time series from global in-situ stations of the National Oceanic and Atmospheric Administration (NOAA). The comparison was made over 2001–2020 for 5 geographical regions, 7 continents, 105 countries and ~50,000 surface stations using the standard metrics NSE, VE, KGE, R, RMSE and PBIAS. IMERG shows a satisfactory to good simulation of monthly rainfall in the majority of the regions, continents, countries, and stations and outperforms CFSR. For CFSR, a satisfactory simulation of monthly precipitation was found only for Europe (VE = 0.61), North America (0.56) and Australia (0.56), and an unsatisfactory simulation was observed for the other continents. At the country level, 64 countries show a significantly better mean NSE with IMERG. Considering the value of automated global access to the latest precipitation data for hydrologic modelling, and the better quality of IMERG, this study also introduces a public web service, the Worldwide Weather Service (W3S), for preprocessing and dissemination of IMERG precipitation for use in hydrologic modelling. The outcomes of the study are expected to guide water resources managers in using these datasets for sustainable water resources management.


Introduction
In-situ measurements of precipitation, a key hydrological variable, are sparsely and unevenly distributed over the globe and hardly represent spatial rainfall patterns (Kidd and Levizzani 2011; Navarro et al. 2019). In contrast, satellite precipitation products and reanalysis products provide spatially inclusive precipitation estimates for large global domains (Navarro et al. 2019). Hassler et al. (2021) and Wang and Zhao (2022) are among the recent studies comparing the outputs of different precipitation providers. Hassler et al. (2021) compared eleven precipitation datasets (six reanalysis and five observational) globally to identify their strengths and shortcomings. In their comparison, ERA5 agreed well with observations for Central Europe and the South Asian Monsoon region but underestimated very low precipitation rates in the Tropics. Wang and Zhao (2022) evaluated the accuracy of precipitation estimates from eight high-resolution gridded precipitation products against precipitation observations from 23 stations over the Heihe River basin (HRB). Their results highlight the uncertainties of several frequently used precipitation datasets in the high mountains and poorly gauged regions of the HRB.
While many global gridded precipitation products are freely available these days, they differ because of their sources and generation processes (Sun et al., 2017). Such gridded and continuous precipitation products benefit hydrological applications in transboundary, geographically complex and data-scarce domains. Additionally, such data support researchers and water resources managers in water resources management, basin plan development, flood and drought monitoring and forecasting, and gap-filling of observed meteorological data. In particular, the popularity and applicability of satellite data products for hydro-meteorological applications are increasing due to increasing accessibility, improvements in earth system representation and high-speed computing (Jiang and Wang 2019). One such satellite product, the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) near real-time product (1998-December 2019) (Owusu et al. 2019; Logah et al. 2021), has been applied widely in data-scarce regions for hydro-climatic assessments (Hussain et al. 2018; Foroumandi et al. 2022a, b). As an advancement over TMPA, the Integrated MultisatellitE Retrievals for Global Precipitation Measurement (GPM) (IMERG) products (launched in February 2014) were conceived with a better spatial resolution (0.1° vs. 0.25°), a better temporal resolution (sub-daily vs. daily), the inclusion of snowfall estimates and multiple satellite products (Huffman et al. 2015, 2020). In 2019, IMERG version 06 (V06) was released, which estimates precipitation from different satellites under the GPM constellation at a fine spatial (0.1°) and temporal (30 min) resolution (Huffman et al. 2020). The spatial coverage of these data extends over 90° N-S (60° N-S in previous versions of IMERG), and the temporal coverage has been extended back to June 2000 using TRMM (until 2014) and GPM (after 2014) estimates.
Several studies have suggested that IMERG V06 improves on its predecessors (V05 and others) and on other products in estimating sub-daily, daily, and monthly rainfall (Table 1).
In addition, Early runs of IMERG, which have the minimum latency, are expected to help in real-time flood forecasting applications but do not undergo the different correction algorithms within the model. Late runs have a latency of almost 14 h but undergo forward and backward propagation of sensor and rainfall data within the IMERG algorithm, which gives them better accuracy than Early runs. Similarly, Final runs (which differ from Late runs by an adjustment factor) are generated after almost 3 months of latency but have been recommended for research purposes by the data providers (GPM 2020).
Table 1 Summary of applications of IMERG products (surviving entry: the Final run did not add improvement to the product, and the Early run itself has potential for use in real-time flood monitoring)
Additionally, the Final run accounts for the GPCC monthly gauge monitoring and the revised precipitation retrievals from ERA-5; for data-scarce catchments, excluding these might not change the estimates significantly from the previous runs. Therefore, we consider the IMERG Final run for this study.
Meanwhile, most of these studies considered an application period of only a few years for the IMERG Final run (Table 1). Moreover, to the best of the authors' knowledge, an extensive comparison of the data with observed rainfall has not been reported. Therefore, as a reference for the IMERG Final run, a reanalysis gridded precipitation product, namely the Climate Forecast System Reanalysis (CFSR) [0.38°] (Garibay et al. 2021; Sun et al. 2021), is considered in this study (Supplementary section S1.2: https://data.mendeley.com/datasets/fsfp797mb9). Easy accessibility and popularity are the reasons behind this selection, although the spatio-temporal resolutions of IMERG are finer. No detailed comparison has yet been performed at a global scale to establish whether one product performs better than the other. Therefore, this study aims to compare, globally, the IMERG Final run, CFSR and surface precipitation information at multiple spatial scales (i.e., station, country, continent, and geographical circles) during 2001-2020. To the best of the authors' knowledge, this is the first study with these variables. The study may further serve as a guide for the potential hydrological application of these data in data-scarce regions for sustainable water resources management. Furthermore, the authors introduce an easy-to-use, free data access platform that provides the IMERG Late run product at a daily timestep for a user-specified area of interest, to increase its application in hydro-met studies. Figure 1 presents the detailed flow of the methodology. A detailed methodological description of each component is presented in Supplementary sections S1-S3.

Geographical Regions
A quick inspection of geographical regions revealed that the Tropic of Cancer (TC) had a total of 2868 stations, for which an average of 72-133 (31-58%) points (monthly precipitation values) were available for evaluation, with higher coverage in India, Thailand, Mexico, [...] Fig. 2.
Table 1 (entry): Tapiador et al. (2020). Validated IMERG v05B Early, Late and Final runs; Spain; hydrological simulation. Satisfactory performance in areas where no rain gauges were used for the calibration of IMERG and a very good performance in areas where at least one rain-gauge station existed to calibrate the IMERG rainfall.
Table 1 (note): Different versions of IMERG have been compared with other products and applied over the years.
From Fig. 2, it is evident that the monthly mean of GSOM is better replicated by IMERG in some geographical regions, while in others CFSR and IMERG have similar data characteristics. Both products capture normal monthly precipitation relatively better for FC, ArC and TCa than for AnC and TC. Similarly, CFSR has a higher spread of precipitation than IMERG in the TCa, ArC, and FC, and IMERG has a higher spread in the remaining two. The other performance metrics, calculated at stations and aggregated for geographical regions, also give a clear signal that IMERG outperforms CFSR in simulating mean monthly precipitation across the globe.
The NSE values computed from 2001 to 2020 with monthly precipitation values of IMERG, CFSR and GSOM indicate that IMERG has a better ability to simulate wet months than CFSR for each region, as shown in Fig. 3(a).
Because NSE is weighted towards higher precipitation values, larger NSE values indicate better capture of the wet season's climatology in the region, which might be important from a flood simulation and monitoring perspective. For IMERG, the highest median NSE was observed for TCa (0.85), followed by AnC (0.76), ArC (0.71), TC (0.64) and FC (0.47), which indicates good performance in the first three regions, satisfactory performance in the fourth and unsatisfactory performance in the fifth. CFSR, on the other hand, had a satisfactory performance in the TCa (NSE = 0.59) and an unsatisfactory performance in the remaining regions.
Not only the high-precipitation months but also the dry months in general were better simulated by IMERG for all geographical circles, as shown by the VE values in Fig. 3(b). VE represents how much of the rainfall is delivered at the proper time, and its remainder represents the fractional volumetric mismatch; it is thus valued by water resources managers (Ghimire et al. 2019b).
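As an illustration of the two agreement metrics discussed above, the following minimal sketch computes NSE and VE for a single station. It assumes the commonly used definitions (NSE as one minus the ratio of squared error to the variance of the observations, VE as one minus the ratio of absolute volume error to observed volume); the exact formulations used in this study are given in the Supplementary material and may differ in detail. The function names and the monthly values are hypothetical.

```python
from typing import Sequence


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((obs-sim)^2) / sum((obs-mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def ve(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Volumetric efficiency: 1 - sum(|sim-obs|) / sum(obs)."""
    abs_err = sum(abs(s - o) for o, s in zip(obs, sim))
    return 1.0 - abs_err / sum(obs)


# Hypothetical monthly totals (mm) at one station, for illustration only.
gauge = [10.0, 20.0, 30.0, 40.0]
product = [12.0, 18.0, 33.0, 37.0]

nse_value = nse(gauge, product)
ve_value = ve(gauge, product)
print(f"NSE = {nse_value:.3f}, VE = {ve_value:.3f}")
```

Because NSE squares the errors, a miss of a few millimetres in a wet month lowers it far more than the same relative miss in a dry month, whereas VE weights every millimetre of mismatch equally.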
In terms of the fractional volume match of unit precipitation, ArC had the best statistics (median VE = 0.72), followed by AnC (0.7), TCa (0.68), FC (0.62) and TC (0.57), when simulated by IMERG. These statistics indicate good delivery of unit monthly precipitation in the first two regions and satisfactory performance in the latter three. CFSR, on the other hand, had a satisfactory simulation of unit precipitation in ArC (VE = 0.57) and AnC (0.57) and unsatisfactory performance in the remaining three regions. This indicates that researchers aiming to employ the IMERG dataset in the TC and FC can expect a higher fractional mismatch with observed precipitation. Similarly, water resources management in the data-scarce regions of the Tropics and the Frigid region might not benefit from the monthly rainfall series of the CFSR dataset. The relatively poor performance of both datasets in FC could be due to their inability to estimate snow properly, as discussed by Huffman et al. (2020). Similarly, the relatively lower efficiency of the datasets in the tropical circles could be due to their inability to depict primary climatological features of tropical rainfall such as the annual mean, annual cycle and monsoon domain, as discussed by Wang and Ding (2008).
Fig. 2 Scatterplots & boxplots of mean monthly normal precipitation (mm) simulated by IMERG (green cross) and CFSR (red triangle) compared with observed GSOM (black circle) at stations during 2001-2020 across different geographical regions
NSE and VE measure the relative agreement between observed and simulated precipitation values, whereas RMSE measures the magnitude of the differences between GSOM and IMERG (CFSR).
Although the ideal value of RMSE is zero, it is almost unachievable in rainfall comparison studies like this one; a larger value indicates greater problems with the associated data. A geographical comparison of the precipitation products again revealed that IMERG has smaller errors than CFSR in simulating monthly precipitation across all geographical regions, as presented in Fig. 3(c).
In general, the RMSE of both products was highest in the TC, followed by TCa, ArC, AnC, and FC. Because the number of stations differs in each circle, this descending order of RMSE values should be interpreted qualitatively. The Indian subcontinent, southeast Asia and the Amazon (i.e., TC and TCa) generally receive higher precipitation than other regions of the globe, so any inconsistency is likely to appear there in larger quantities than in the other geographical regions (Cobon et al. 2020). However, less disagreement with the surface precipitation is observed for IMERG. In general, CFSR underperformed in estimating surface precipitation characteristics in all geographical circles at the monthly time step.
The biases (percent) in IMERG and CFSR products are found in general to be positive, indicating an overestimation of monthly precipitation by both products, as shown in Fig. 3(d).
IMERG (CFSR) had the least median bias in TCa, i.e., 5.9% (0.4%), followed by AnC, i.e., 8.6% (0.8%), ArC, i.e., 9% (11.5%), TC, i.e., 12.9% (24.7%) and FC, i.e., 28% (60.35%). The spread of biases is smaller in IMERG than in CFSR, which indicates its slight overestimation in all geographical regions. However, the bias itself might not be a significant issue in using such meteorological information from satellite products, mostly because different bias correction techniques are known to reduce systematic biases in estimated rainfall series compared to the observed rainfall climatology (Ghimire et al. 2019b). A similar outperformance of IMERG over CFSR compared to GSOM precipitation is observed for two other metrics, KGE and R, as shown in Fig. 3(e) and 3(f), respectively.
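The remaining metrics can be sketched in the same way. The block below is a minimal illustration, assuming the standard definitions of RMSE and Pearson's R, the Gupta et al. (2009) form of KGE, and a PBIAS sign convention in which positive values mean overestimation (chosen here to match the interpretation in the text). The formulations actually used in the study are those of the Supplementary material. All names and values are hypothetical.

```python
import math
from typing import Sequence


def _mean(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error, in the units of the data (mm)."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def pbias(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Percent bias; positive means the product overestimates (assumed convention)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def pearson_r(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Pearson correlation coefficient between observed and simulated series."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def kge(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Kling-Gupta efficiency from correlation, variability ratio and mean ratio."""
    r = pearson_r(obs, sim)
    mo, ms = _mean(obs), _mean(sim)
    sd_o = math.sqrt(sum((o - mo) ** 2 for o in obs) / len(obs))
    sd_s = math.sqrt(sum((s - ms) ** 2 for s in sim) / len(sim))
    alpha = sd_s / sd_o  # variability ratio
    beta = ms / mo       # mean (bias) ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical station series: the product overestimates every month by 10%.
gauge = [10.0, 20.0, 30.0, 40.0]
product = [11.0, 22.0, 33.0, 44.0]

print(rmse(gauge, product), pbias(gauge, product))
print(pearson_r(gauge, product), kge(gauge, product))
```

In this example the correlation is perfect, yet KGE stays below one because the mean and the variability are both inflated by 10%. This is the kind of systematic bias that the bias correction techniques mentioned above are meant to remove.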
The differences in the means of the performance metrics for each geographical region were tested for significance using Welch's t-test (Supplementary Section S2.2: https://data.mendeley.com/datasets/fsfp797mb9); the test reveals that IMERG is significantly better at simulating precipitation than CFSR, as presented in Table S3.
The outperformance of IMERG over CFSR could be due to the different correction algorithms used by IMERG (Huffman et al. 2015). Furthermore, IMERG uses the Global Precipitation Climatology Project (GPCP) monthly climatology to correct its biases (Huffman et al., 2018), albeit at a coarser resolution of 2.5° (Pendergrass et al., 2020). Better statistics are observed in general for ArC and AnC, followed by TCa, TC and FC, which is likely due to the better representation of precipitation in the GPCP and the set of satellites that continuously record the energy information. The low capture of precipitation values in the TC and TCa could also be due to key monsoon characteristics of the region being missed (Wang and Ding 2008).
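For readers unfamiliar with the test, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom for two samples with unequal variances. It is a generic illustration, not the procedure of Supplementary Section S2.2. The sample values are hypothetical station NSE scores. To obtain a p-value, a call such as `scipy.stats.ttest_ind(a, b, equal_var=False)` could be used instead.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical station-level NSE values for the two products in one region.
nse_imerg = [0.8, 0.7, 0.9, 0.6]
nse_cfsr = [0.5, 0.4, 0.6, 0.3, 0.2]

t_stat, dof = welch_t(nse_imerg, nse_cfsr)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

Welch's form does not assume equal variances or equal sample sizes, which suits a comparison where the spread of the metrics differs between the two products.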

Continent-wise Comparison
A comparison of the precipitation stations located inside each continent revealed that IMERG outperforms CFSR in all continents in terms of the different performance metrics.
The fractional matching of unit precipitation was good in Europe (VE = 0.72), North America (VE = 0.72), Asia (VE = 0.71) and Australia (VE = 0.70) and satisfactory in South America (VE = 0.66), Russia (VE = 0.58) and Africa (VE = 0.52) when IMERG was used to simulate monthly precipitation during 2001-2020 (Fig. 4(b)). However, when CFSR was used to simulate monthly precipitation, only Europe (VE = 0.61), North America (VE = 0.56) and Australia (VE = 0.56) were found to be satisfactorily simulated, while the other continents had an unsatisfactory simulation of precipitation for water resource management (Fig. 4(b)). Africa, which was least represented in this study because few rainfall stations reported for 2001-2020, had the least utility of both the IMERG and CFSR products evaluated in this study. Similar trends in the other performance indicators (NSE, KGE and R) for the continents are presented in Fig. 4(a), 4(e) and 4(f), respectively.

Country-wise Comparison
The metrics, computed at stations, were aggregated by their median values and presented at the country level. These results are expected to guide researchers in using IMERG/CFSR data in their country of interest. However, these statistics only indicate potential applicability and might not reflect the ground reality, as many countries have very few or no stations included in this analysis. Figure S2 presents the density of rainfall stations (number of stations per 1000 km² of land area) in different countries. From Figure S2, it is evident that the countries located in South America, Africa and Asia have a sparse representation of GSOM rainfall stations compared to the countries in Australia, Europe, and North America. Higher confidence can therefore be placed on the results where station densities are higher (Tian et al. 2018). Accordingly, the median NSE values computed for the stations located inside each country by comparing IMERG, CFSR and GSOM precipitation during 2001-2020 are presented as spatial plots in Fig. 5(a).
A significant difference between the IMERG- and CFSR-generated NSE is observed at the country level when compared with GSOM precipitation. In 64 out of 105 countries where more than one station was available for comparison, the mean NSE difference between IMERG and CFSR was significant at P < 0.05. A far superior performance of IMERG is evident at the country level throughout the globe, with the majority of countries in North America, Europe, Asia, and Australia having satisfactory to good performance. However, countries like Mongolia, Kazakhstan, Middle Eastern nations, Indonesia, Papua New Guinea, and many countries in Africa and South America exhibit an unsatisfactory simulation of high-rainfall months, as indicated by poor NSE values. CFSR, in contrast, has only a few countries with a satisfactory simulation of wet months. Again, this superior performance of IMERG may be attributed to the set of passive and active satellite estimates of rainfall and the further correction techniques employed with GPCP precipitation (Huffman et al. 2020). When the differences between IMERG and CFSR were tested for significance, a similar outperformance of IMERG at the country level was observed in other indices such as VE, KGE and R (Fig. 5(c), 5(d) and 5(e), respectively).
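The country-level aggregation described above can be sketched as follows. This is a minimal illustration with hypothetical country codes, metric values and land areas; it only shows the median-by-group step and the station-density calculation (stations per 1000 km² of land area), not the authors' actual processing chain.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def median_by_country(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Aggregate station-level metric values to one median per country."""
    groups = defaultdict(list)
    for country, value in records:
        groups[country].append(value)
    return {country: median(values) for country, values in groups.items()}


def station_density(n_stations: int, land_area_km2: float) -> float:
    """Number of stations per 1000 km2 of land area."""
    return 1000.0 * n_stations / land_area_km2


# Hypothetical (country, station NSE) pairs, for illustration only.
station_nse = [("A", 0.8), ("A", 0.6), ("A", 0.7), ("B", 0.2), ("B", 0.4)]
country_nse = median_by_country(station_nse)
density_a = station_density(n_stations=50, land_area_km2=100_000.0)

print(country_nse, density_a)
```

The median is less sensitive than the mean to a few poorly performing stations, but with only one or two stations per country it says little about the country as a whole, which is why the station density matters when reading the country-level maps.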
It is also inferred that the NSE values of CFSR are lower for countries like Russia and Canada and in north-east Europe, which are classified as snow-dominated climates as per the Köppen-Geiger classification (Figure S3). Similarly, the countries with warm temperate climates in Europe and Asia have good NSE values when simulated with IMERG. The disagreement among the precipitation products, represented by RMSE values aggregated at the country level, is shown in Fig. 5(b).
Despite the low agreement between IMERG and GSOM precipitation across certain countries like Mongolia, Egypt, Iraq and others, as indicated by NSE (Fig. 5(a)), VE (Fig. 5(c)), KGE (Fig. 5(d)) and R (Fig. 5(e)), the RMSE values appear at the lower end of the spectrum (Fig. 5(b)). This is due to the arid climatology of these countries (Figure S3), where even low RMSE values could cause significant disagreement in rain/no-rain status.
Interestingly, CFSR appears to follow the trend of IMERG for most of the countries, albeit with higher disagreement (higher RMSE values). The biases between IMERG (CFSR) and GSOM precipitation are accordingly calculated and presented in Fig. 5(f). Similarly, the summary of statistics is presented in Table S5 for 138 countries, which aligns with the above-discussed findings of better performance of IMERG compared to CFSR.

Station-wise Comparison
A comparison of monthly precipitation at the station level during 2001-2020 for the selected 50,000+ stations across the globe provides a clear snapshot that the IMERG Final run has higher accuracy than CFSR in simulating precipitation. For example, the VE values computed at the station level clearly show an outperformance of IMERG precipitation, as can be seen from Fig. 6(c).
A similar outperformance of IMERG over CFSR can be seen in terms of other metrics like NSE (Fig. 6(a)), KGE (Fig. 6(d)) and R (Fig. 6(e)). The disagreement among the data in terms of RMSE also shows higher RMSE values at stations located in South and Southeast Asia and the Amazon forests, with a clear outperformance of IMERG, as shown in Fig. 6(b).
The biases computed at the station level are presented in Fig. 6(f), and the complete station-level statistics are available in Table S6.

Applications in Hydrology
Considering the value of automated global access to the latest precipitation data for hydrologic modelling, and the better quality of IMERG (as assessed in Sect. 3.1-3.4), this study also introduces a public web service, the Worldwide Weather Service (W3S), for preprocessing and dissemination of IMERG precipitation for use in hydrologic modelling. The framework of the W3S is presented in Figure S4. The W3S produces hydrologic model-ready precipitation data in two formats: CSV and Soil and Water Assessment Tool (SWAT) (Arnold et al. 2012)-ready. The W3S framework is generic in design and may be enhanced in the future to include other gridded weather products for use in hydrologic modelling. This framework is an extension of the Can-GLWS (Shrestha et al. 2021) and APWS (Ghimire et al. 2019a) frameworks and has three connected layers (Figure S4). W3S is deployed on a University of Guelph server (https://www.uoguelph.ca/watershed/w3s/). It ultimately aims to reduce the redundant effort of manually preprocessing precipitation data for hydrological modelling: the user specifies the area of interest, data period and format, and the data are downloaded automatically in little to no time.

Conclusion
To the best of the authors' knowledge, this study is the first to compare IMERG satellite and CFSR reanalysis precipitation data with observed long-term (2001-2020) monthly precipitation using the performance indicators NSE, VE, KGE, R, RMSE and PBIAS for 5 geographical regions, 7 continents, 105 countries and more than 50,000 surface stations. The results indicate that:
• IMERG has a satisfactory to good simulation of monthly rainfall in most of the regions, continents, countries, and stations and outperforms CFSR.
• IMERG had an unsatisfactory simulation of rainfall in the Frigid region, whereas CFSR had an unsatisfactory simulation of rainfall in all regions except the Tropic of Capricorn.
• In 64 out of 105 countries (where more than one station was available for comparison), IMERG had significantly better performance than CFSR.
Thus, the outcomes of the study are expected to guide water resource managers in using these datasets for sustainable water resources management. However, owing to the sparse to nil station density in many of the countries, the results for these countries still need to be verified, and the authors therefore caution potential users to perform in-depth analysis with different data products before finalizing them for research and application. Future studies are encouraged to evaluate IMERG and similar satellite products at a daily/sub-daily resolution so that a better understanding of their daily rainfall simulation characteristics is developed. Daily/sub-daily precipitation data are crucial for applications including near real-time flow monitoring, flood forecasting and reservoir operation. It is also recommended to include recent products like Multi-Source Weighted Ensemble Precipitation (MSWEP) (Beck et al. 2019) in future studies. MSWEP is available at a sub-daily time step and 0.1° spatial resolution, which makes it more comparable with the IMERG Early, Late and Final runs and may provide more clarity on their applicability in real-time hydro-met applications.

Acknowledgements, Statements & Declarations.
• Ethical Approval: The authors acknowledge the approval of Dr. Andrey Savtchenko, Dr. George Huffman, and Andrea M. Portier from NASA Global Precipitation Measurement Mission to use and redistribute the IMERG data for use and analysis in this study. Similarly, the approval of Dr. Wei Shi from the Climate Prediction Center, NOAA to use the CFSR data and redistribute is also acknowledged.
• Consent to Participate: We agree among ourselves to outline the roles and responsibilities towards one another throughout the whole research and publication process.
• Availability of data and materials: Due to privacy, ethical concerns, and confidentiality agreements the supporting data cannot be made available. However, we have provided the source and name of the contact persons in the 'Acknowledgements, Statements & Declarations' section. We can provide more details on 'how to request data' to the peers upon request.