Objective Evaluation of Satellite Precipitation Datasets for Heavy Precipitation Events Caused by Typhoons in the Philippines


 Extreme weather events, such as typhoons, have occurred more frequently in the last few decades in the Philippines. The heavy precipitation caused by typhoons is difficult to measure with traditional instruments, such as rain gauges and ground-based radar, because these instruments have an uneven distribution in remote areas. Satellite precipitation datasets (SPDs) provide integrated spatial coverage of rainfall measurements, even for remote areas. This study performed subdaily (3-hour) assessments of SPDs (i.e., the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement [IMERG], Global Satellite Mapping of Precipitation [GSMaP], and Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks datasets) during five typhoon-related heavy precipitation events in the Philippines between 2016 and 2018. The aforementioned assessments were performed through a point-to-grid comparison by using continuous and volumetric statistical validation indices for the 34-knot wind radii of the typhoons, rainfall intensity, the terrain, and wind velocity effects. The results revealed that the IMERG exhibited good agreement with rain gauge measurements and exhibited high performance in detecting rainfall during five typhoon events, whereas the GSMaP exhibited high agreement during peak rainfall. All the SPDs tended to overestimate rainfall during light to moderate rainfall events and underestimate rainfall during heavy to extreme events. The IMERG exhibited a strong ability to detect moderate rainfall events (5–15 mm/3 hours), whereas the GSMaP exhibited superior performance in detecting heavy to extreme rainfall events (15–25, 25–50, and >50 mm/3 hours). The GSMaP exhibited the best performance for detecting heavy rainfall at high elevations, whereas the IMERG exhibited the best performance for rainfall detection at low elevations. The IMERG exhibited a strong ability to detect heavy rainfall under various wind speeds. 
A strong ability to detect heavy rainfall events for different wind speeds in the western and eastern parts of the mountainous region of Luzon was found for the GSMaP and IMERG, respectively. This study demonstrated that the IMERG and GSMaP datasets exhibit promising performance in detecting heavy precipitation caused by typhoon events.

The multisource datasets used in this study can be categorized into four types: typhoon event data; traditional observational rainfall data obtained from surface rain gauges; precipitation information estimated from satellite measurements; and wind vector data, which constitute a reanalysis dataset. The following subsections provide a brief description of these four types of data.

Table 1 [...] Haima, and Nock-ten), and tropical storms (Doksuri).

Data from Rain Gauge Measurements
The data from rain gauge measurements were used as a reference to evaluate the performance of the SPDs. Three-hour rainfall observation data for typhoons making landfall in the Philippines were obtained from PAGASA, the Department of Science and Technology, the Republic of the Philippines. PAGASA provides rainfall data from 222 automatic weather stations distributed across the Philippines. A total of 66 rain gauge stations were selected on the basis of whether they were located within the 34-knot wind radii (R34) of the typhoons and the completeness of the desired data. Table 1 lists the number of selected rain gauge stations within the R34 during the passing of the storm. PAGASA has made available high-quality rain gauge data for the five considered typhoon events.

IMERG Dataset
The high-resolution IMERG dataset is an improvement on the TRMM Multisatellite Precipitation Analysis dataset, whose global coverage data were made available from June 2000. The IMERG program was initiated by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA). Its algorithm intercalibrates, merges, and interpolates all available satellite microwave precipitation measurements, microwave-calibrated infrared measurements, surface rain gauge analyses, and other possible rainfall estimates on wide temporal and spatial scales for nearly the entire globe (Huffman et al. 2019). The IMERG dataset provides half-hourly, daily, and monthly rainfall estimation at a spatial resolution of 0.1°. The IMERG dataset contains three types of data in terms of time release, namely early-, late-, and final-run data. The time-release delay is 4 hours for the early-run data, 12 hours for the late-run data, and 3.5 months for the final-run data (Huffman et al. 2019). This study used the latest Level-3 IMERG half-hourly data from version 06B of the final-run dataset. The final-run dataset exhibits superior performance to that of the early- and late-run datasets. The final-run dataset is also more appropriate for use in climate and hydrological studies than the other two datasets are (Tang et al. 2016). The IMERG dataset is available online at https://gpm.nasa.gov/data/directory.

GSMaP Dataset
The GSMaP dataset is a satellite-based precipitation dataset constructed by the [...] retrieval algorithm is primarily based on integrated infrared imagery from geosynchronous satellites, with forecasts generated by an artificial neural network to transform infrared imagery into global rainfall data (Sorooshian et al. 2000).

Wind Data
The wind data used in this study were from ERA5, which is a gridded reanalysis dataset [...] terms of the 3-hour temporal scale, rainfall intensity, the terrain, and wind velocity effect by comparing the precipitation estimates with the rain gauge measurements. The SPD estimates were compared with the rain gauge measurements only while the rain gauge was within the R34 during the passing of the storm. The half-hourly IMERG estimation and hourly GSMaP and PERSIANN estimation data were converted into 3-hour rainfall data so that their temporal resolution matched that of the rain gauge measurements. Only a few data points were missing from both the rain gauge stations and SPDs, and they were excluded from the analysis.

The 3-hour rainfall estimates obtained by the SPDs were assessed as functions of rainfall intensity. The 3-hour rainfall intensities for all precipitation datasets were categorized into the following five groups: 0–5 mm/3 hours (light rain events), 5–15 mm/3 hours (moderate rain events), 15–25 mm/3 hours (heavy rain events), 25–50 mm/3 hours (very heavy rain events), and >50 mm/3 hours (extreme rain events). The performance of the SPDs in terms of the terrain effect was evaluated by dividing the rain gauge stations into two elevation categories: ≤1000 m (low altitude) and >1000 m (high altitude). The evaluation of the SPD performance in terms of wind velocity was conducted by dividing wind speed into the following six categories: 0–5, 5–10, 10–15, 15–20, 20–25, and ≥25 m/s. The distribution of the SPD performance in terms of wind direction was modeled as a wind rose, in which wind direction was divided into eight categories: north (N), northeast (NE), east (E), southeast (SE), south (S), southwest (SW), west (W), and northwest (NW).

The performance of the SPDs was evaluated by conducting a quantitative analysis of two categories of validation statistics.
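The temporal matching and intensity grouping described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names are hypothetical, the inputs are assumed to be rain rates in mm/h on a regular time step, and the upper bound of each intensity class is assumed to be inclusive (the text does not say which class a boundary value belongs to).

```python
import numpy as np

# Upper class edges in mm per 3 hours, taken from the five intensity groups in the text.
INTENSITY_EDGES = [5.0, 15.0, 25.0, 50.0]
INTENSITY_LABELS = ["light", "moderate", "heavy", "very heavy", "extreme"]


def to_three_hourly(rates_mm_per_hr, step_hours):
    """Accumulate a rain-rate series (assumed mm/h) into 3-hour totals (mm/3 h).

    step_hours is 0.5 for half-hourly data and 1.0 for hourly data.
    """
    rates = np.asarray(rates_mm_per_hr, dtype=float)
    per_window = int(round(3.0 / step_hours))
    n_windows = rates.size // per_window  # an incomplete trailing window is dropped
    trimmed = rates[: n_windows * per_window].reshape(n_windows, per_window)
    # A window containing a missing value (NaN) stays NaN, so it can be excluded later.
    return trimmed.sum(axis=1) * step_hours


def intensity_class(total_mm_3h):
    """Map a 3-hour total to one of the five intensity labels (upper bound inclusive)."""
    idx = int(np.digitize(total_mm_3h, INTENSITY_EDGES, right=True))
    return INTENSITY_LABELS[idx]
```

For example, six half-hourly values of 2 mm/h give a single 3-hour total of 6 mm, which falls in the moderate class.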
The first statistical category was continuous statistics, which describe the differences between satellite rainfall magnitude and ground rainfall station measurements and include bias ratio (BR), correlation coefficient (R), mean error (ME), and root mean square error (RMSE). BR refers to the tendency of SPDs to underestimate or overestimate rainfall compared with the rain gauge station measurements. The perfect score for BR is 1. A BR below 1 indicates that the satellite datasets tend to underestimate rainfall compared with the ground rainfall measurements, and a BR above 1 indicates that the satellite datasets tend to overestimate rainfall. The parameter R measures the strength of the linear association between the satellite rainfall estimates and the ground-based observations. A value of 1 is the ideal score for R. ME indicates the average error in rainfall measurements between the SPDs and the ground-based observations. RMSE reflects the average deviation in absolute magnitude between the SPD data and the ground-based observations. The ideal value of ME and RMSE is 0. BR, R, ME, and RMSE were computed using the following equations [...]

The ability of the SPDs to estimate rainfall during heavy precipitation events caused by typhoons was evaluated in terms of rain rate intensity, elevation, and wind velocity by using continuous statistics (i.e., BR, R, ME, and RMSE) and volumetric indices (i.e., VHI, VFAR, and VCSI). High R, VHI, and VCSI values; low ME, RMSE, and VFAR values; and BR values close to 1 indicated a high performance level.
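Because the equations themselves are not recoverable in this section, the sketch below uses commonly used definitions of these metrics, which may differ in detail from the authors' formulations. BR is taken as the ratio of total satellite rainfall to total gauge rainfall, ME as the mean of satellite minus gauge, and the volumetric hit index (VHI), volumetric false alarm ratio (VFAR), and volumetric critical success index (VCSI) as volume-weighted counterparts of the usual categorical scores above a threshold. The function names and the handling of missing pairs are assumptions.

```python
import numpy as np


def _paired(sat, gauge):
    """Return satellite and gauge arrays with any pair containing NaN removed."""
    sat = np.asarray(sat, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    valid = ~(np.isnan(sat) | np.isnan(gauge))
    return sat[valid], gauge[valid]


def continuous_stats(sat, gauge):
    """BR, R, ME, and RMSE for paired 3-hour totals (assumed definitions)."""
    s, g = _paired(sat, gauge)
    diff = s - g
    return {
        "BR": s.sum() / g.sum(),              # ideal value 1
        "R": np.corrcoef(s, g)[0, 1],         # Pearson correlation, ideal value 1
        "ME": diff.mean(),                    # ideal value 0
        "RMSE": np.sqrt((diff ** 2).mean()),  # ideal value 0
    }


def volumetric_indices(sat, gauge, threshold):
    """VHI, VFAR, and VCSI above a rainfall threshold in mm/3 h (assumed definitions)."""
    s, g = _paired(sat, gauge)
    hit = (s > threshold) & (g > threshold)
    miss = (s <= threshold) & (g > threshold)
    false_alarm = (s > threshold) & (g <= threshold)
    hit_vol = s[hit].sum()            # satellite volume on correctly detected events
    miss_vol = g[miss].sum()          # gauge volume the satellite failed to detect
    false_vol = s[false_alarm].sum()  # satellite volume with no matching gauge event
    # A zero denominator yields NaN (with a NumPy warning) rather than an exception.
    return {
        "VHI": hit_vol / (hit_vol + miss_vol),
        "VFAR": false_vol / (hit_vol + false_vol),
        "VCSI": hit_vol / (hit_vol + miss_vol + false_vol),
    }
```

Calling `volumetric_indices` at thresholds of 5, 15, 25, and 50 mm/3 hours gives one set of indices per threshold, which is the form needed to place each dataset on a performance diagram.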

Performance of SPDs During Typhoon Events
The agreement between the rain gauge observations and the satellite rainfall datasets for each typhoon event was determined by using the scatter plots in Figure 2. [...] To demonstrate the performance of three SPDs more comprehensively, Figure 3 [...] Philippines during the five typhoon events. The highest values for average 3-hour rainfall were different for each typhoon event, probably due to the differences in atmospheric conditions and the complexity of the typhoon structure. In general, the patterns of temporal variations of precipitation found using the three SPDs were in good agreement with those of the rain gauge measurements. The GSMaP dataset exhibited superior agreement with the rainfall station observations during peak rainfall. The IMERG and PERSIANN datasets considerably underestimated rainfall during rainfall peaks in the typhoon events.

Performance of SPDs Under Different Rainfall Intensities
The BR values between the rain gauge station measurements and the data of the IMERG, GSMaP, and PERSIANN datasets for different rainfall rate intervals were derived. Figure 5 [...] The performance of the SPDs was also assessed at various rainfall thresholds: 5, 15, 25, and 50 mm/3 hours. Figure 6 presents the performance diagram for the IMERG, GSMaP, and PERSIANN datasets in terms of the volumetric indices (VHI, 1 − VFAR, VCSI, and BR) for 3-hour precipitation under various rainfall thresholds. The ability of these three SPDs to detect precipitation decreased as the rainfall threshold increased: VHI and VCSI decreased and VFAR increased with increasing rainfall intensity. These results [...]

Performance of the SPDs at Different Elevations
The variation in rainfall in the island area is caused by orographic uplift and the complexity of topography (Lee et al. 2014). Topography has a prominent effect on precipitation (Chen et al. 2020). The altitudes of the rain gauge stations used in this study were divided into two categories: ≤1000 m (low altitude) and >1000 m (high altitude). Table 3 presents an assessment of statistical metrics for the IMERG, GSMaP, and PERSIANN datasets for 3-hour precipitation estimates at different elevations. According to the BR and ME values, the SPDs tended to overestimate rainfall at low elevations and underestimate rainfall at high elevations. The IMERG dataset had the highest R and lowest RMSE values at both high and low altitudes. The IMERG dataset exhibited superior performance at low altitudes because it had the best scores in the continuous statistical analysis (BR, R, ME, and RMSE). The BR of the GSMaP dataset was 0.96 at high altitudes, which indicates that this dataset underestimated rainfall by about 4% compared with the rain gauge measurements. The high BR of the GSMaP dataset at high elevations was possibly caused by the inclusion of a topographic dataset from the Shuttle Radar Topography Mission 30 Arc Second to classify orographic and nonorographic rainfall (Yamamoto and Shige 2015).

Satellite rainfall estimates performed better in detecting heavy precipitation at high altitudes than at low altitudes. This result might have been caused by orographic uplift (Tang et al. 2018). In terms of the ability of SPDs to detect heavy rainfall at different elevations, the PERSIANN dataset exhibited the lowest VHI and VCSI values at low altitudes, whereas the GSMaP dataset exhibited the worst VFAR values at low elevations (Table 3).
The GSMaP dataset exhibited the highest VHI at both altitudes; the IMERG dataset exhibited the best VFAR values at low altitudes; and the PERSIANN dataset had a perfect VFAR value at high altitudes. The performance diagram summarizes the three SPDs' ability to detect heavy rainfall accurately at different altitudes (Figure 7). The GSMaP dataset outperformed the other datasets in terms of the ability to detect heavy rainfall at high elevations, whereas the IMERG dataset outperformed the other datasets at low elevations. The PERSIANN dataset performed poorly at both elevations, probably because its rainfall estimation algorithm does not contain a terrain component (Nguyen et al. 2018).

Performance of the SPDs Under Different Wind Velocities
The levels of infrastructural and environmental damage caused by typhoon events are influenced by wind intensity. High wind intensity is also associated with heavy rainfall, which is another hazard of typhoon events (Bloemendaal et al. 2020). In a previous study, the rainfall caused by typhoon events was forecasted using satellite estimates of rainfall data, typhoon intensity, and wind vectors (Kidder et al. 2005). The effect of wind velocity on the ability of SPDs to detect heavy precipitation caused by typhoon events should therefore be investigated. In this study, the wind vector components u and v from the ECMWF, averaged over the 925–850 hPa pressure levels at the considered rain gauge stations, were processed into wind speed and direction. The frequency distribution indicates the relationship between wind speed and the continuous performance statistics (Figure 8). The IMERG and PERSIANN datasets underestimated rainfall compared with the gauge station measurements, yielding a high-frequency concentration of negative MEs (−20 to 0 mm/3 hours) and a BR below 1 for the distribution of wind speed. The GSMaP dataset tended to overestimate rainfall, with a distribution of frequency concentrated on positive MEs (0 to 20 mm/3 hours) and a BR above 1. The IMERG dataset exhibited superior agreement with the rain gauge observations at different wind speeds, with the distribution frequency of R ranging from 0.4 to 1. For the PERSIANN and GSMaP datasets, the distribution frequency of R ranged from 0.1 to 1 and from 0 to 0.9, respectively. The frequency distributions of RMSE at each wind speed for the IMERG and PERSIANN datasets ranged from 0 to 30 mm/3 hours, whereas those for the GSMaP dataset ranged from 0 to 40 mm/3 hours.
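The conversion of the u and v components into wind speed and an eight-sector wind direction can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, the inputs are assumed to be already averaged over the 925–850 hPa levels, and direction follows the meteorological convention (the direction the wind blows from, in degrees clockwise from north).

```python
import numpy as np

# Eight compass sectors used for the wind rose, each 45 degrees wide and centered on
# its nominal bearing (N = 0, NE = 45, ..., NW = 315).
SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def wind_speed_direction(u, v):
    """Convert eastward (u) and northward (v) components in m/s to speed and direction."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    speed = np.hypot(u, v)
    # Meteorological convention: bearing the wind comes FROM, clockwise from north.
    direction = np.degrees(np.arctan2(-u, -v)) % 360.0
    return speed, direction


def sector(direction_deg):
    """Assign a direction in degrees to one of the eight wind-rose sectors."""
    idx = int(((direction_deg + 22.5) % 360.0) // 45.0)
    return SECTORS[idx]
```

For instance, u = 3 m/s and v = 4 m/s give a speed of 5 m/s and a southwesterly wind, because air moving toward the northeast arrives from the southwest.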
Among the three SPDs, the IMERG dataset was the most consistent with the rain gauge measurements for most of the continuous statistical parameters at the different wind speeds. The distribution frequencies of ME, RMSE, R, and BR for the IMERG dataset were concentrated around the near-perfect value for the continuous statistics.

The distribution percentage of each volumetric index presented in Figure 9 was used to describe the association between wind speed and the ability of the SPDs to detect heavy rainfall events caused by typhoons. In terms of the VHI distribution, the GSMaP dataset exhibited the best performance, followed by the IMERG and PERSIANN datasets. The GSMaP dataset yielded a high frequency distribution for a VHI range of 0.9–1.0 in the wind speed range of 7.5–12.5 m/s. The IMERG dataset exhibited a high frequency distribution for a VHI range of 0.9–1.0 in the wind speed range of 10–12.5 m/s, and the PERSIANN dataset exhibited a high frequency distribution for a VHI range of 0.5–0.6 in the wind speed range of 7.5–10 m/s. In terms of false rainfall estimates, the IMERG dataset outperformed the GSMaP and PERSIANN datasets. The IMERG dataset had a high-frequency distribution at a lower VFAR than did the other SPDs. The comprehensive evaluation of the volumetric index performance indicates that compared with the other SPDs, the IMERG dataset exhibited a stronger ability to detect heavy rainfall at various wind speeds.

[...] The IMERG dataset exhibited a strong ability to detect heavy rainfall under various wind speeds. The GSMaP dataset exhibited a stronger ability to detect heavy rainfall events in terms of wind velocity in the western part of the mountainous region than in its eastern part. By contrast, the IMERG dataset exhibited better performance in the eastern part of the mountainous region than in its western part.

The accurate detection and estimation of heavy precipitation with SPDs remains a challenge in archipelagos with complex terrain or mountainous areas. In this study, the IMERG and GSMaP datasets demonstrated a promising ability to detect heavy precipitation caused by typhoon events. An in-depth investigation is required before the IMERG and GSMaP datasets are applied to tropical-cyclone-related studies.

Table 1. Information regarding five typhoon events in the Philippines.