Application of Multivariate Techniques for Spatial Drought Modelling using Satellite Rainfall Estimate in Fiji


Monitoring hydrological extremes is essential for developing risk-mitigation strategies. One of the limiting factors for this is the absence of reliable on-the-ground monitoring networks that capture data on climate variables, which is highly evident in developing states such as Fiji. Fortunately, increasing global coverage of satellite-derived datasets is facilitating the use of this information for monitoring dry and wet periods in data-sparse regions. In this study, three global satellite rainfall datasets (CHIRPS, PERSIANN-CDR and CPC) were evaluated for Fiji. All satellite products had reasonable correlations with station data, and CPC had the highest correlation and the lowest error values. The Effective Drought Index (EDI), a useful index for understanding hydrological extremes, was then calculated. Thereafter, a canonical correlation analysis (CCA) was employed to forecast the EDI using sea surface temperature anomaly (SSTa) data. A high canonical correlation of 0.98 was achieved between the PCs of mean SST and mean EDI, showing the influence of ocean–atmospheric interactions on precipitation regimes in Fiji. CCA was used to perform a hindcast and a short-term forecast. The training stage produced a coefficient of determination (R²) value of 0.83 and mean square error (MSE) of 0.11. The results in the testing stage for the forecast were more modest, with an R² of 0.45 and MSE of 0.26. This easy-to-implement system can be a useful tool for disaster management bodies to aid in enacting water restrictions, providing aid, and making informed agronomic decisions such as planting dates or extents.


Introduction
Fiji is highly exposed to extreme events, with evidence indicating that many local communities are sensitive to various hydro-meteorological hazards such as flood and drought. The 1997-1998 El Niño-induced drought cost the Fijian government approximately FJ$1 million in emergency supplies, and the country experienced a decline of 3.7% in gross domestic product (GDP) owing to the collapse of the sugar industry (Terry and Raj 1999). Hazard modelling can be a means to increase preparedness among the government and local communities to reduce risks.
However, development of models is reliant on a long series of continuous and homogeneous data for various parameters, which are often lacking in small island developing states (SIDS). The restricted data availability in Fiji may be due to a lack of instrumental hydrological records and equipment (Grimes et  data have advantages such as higher accuracy for point measurements; poor accessibility, lack of dense data networks and uneven distribution of gauges are often a deterrent for carrying out studies using recorded data (Sun et al. 2018). Fortunately, satellite data can be used to overcome some of these challenges. Satellite products have many advantages, including broader spatial coverage, equidistant data points and availability at multiple time resolutions (Kidd 2001). Satellite products are widely applied in global and regional climate trend analysis, as well as in monitoring droughts and floods (Sun et al. 2018). Products that estimate rainfall from cold-cloud duration, such as the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) and the Climate Prediction Centre (CPC) US unified precipitation, are claimed to be better for longer timescales, while passive microwave sensors such as the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR) are better for instantaneous rain in well-defined regions (Kidd 2001). The appropriateness of a satellite product depends on the regional topography, seasonal weather patterns, number of heavy rain events and influence of orographic effect (Xu et al. 2015). Importantly, where possible, satellite data should be checked for reliability and bias to test the suitability of the products in a given geographical location.
The Fiji Meteorological Services presently uses the Standardised Precipitation Index (SPI) to monitor the country's wet and dry climate periods (Pahalad and McGree 2002). A number of studies have used SPI for drought-related research on Fiji (Deo 2011; Rhee and Yang 2018a). Deo (2011) performed a Mann-Kendall test using SPI and concluded that most stations indicated negative values, depicting a reduction in rainfall.
Recently, SPI has also been successfully used to carry out drought monitoring in Fiji using a hybrid approach that combines a multi-model ensemble seasonal climate forecast with a machine learning method (Rhee and Yang 2018a). However, one of the drawbacks in the calculation of SPI is that it assumes stationary climate conditions (Kim et al. 2009). Additionally, a common practice has been to use timescales on a monthly or annual basis (Heim Jr 2002). Although the use of monthly scales in other drought indices yields a good judgement of drought for long-term planning, it provides minimal information on the status within a short time span. SPI is known to yield errors when calculated at shorter timescales (1-month) (Wu et al. 2007). Therefore, in this study we used the Effective Drought Index (EDI) by Byun  Unlike floods, which are typically caused by point source locations such as rivers affecting an area at a time, droughts have a larger spatial extent, which is why it is important to understand the spatial distribution of droughts (Ionita et al. 2012). CCA can be used to understand the spatial distribution of droughts through spatial mapping. Maps containing geographical information are an important communication tool for the wider public and stakeholders because of the ease of presenting and understanding the information. Importantly, climatic variations are increasingly recognised in the Pacific, and traditional coping mechanisms are no longer sufficient to deal with the adversities (Kuleshov et al. 2014). Therefore, seasonal forecast maps can be used to enhance the coping capacity and reduce vulnerability of communities in SIDS (Mercer et al. 2007).
The aim of this research was therefore to obtain spatial drought forecasts in Fiji by employing CCA using satellite-derived rainfall estimates.

Study Region
The study area, Fiji, is located at 17.7134° S and 178.0650° E. It occupies a total area of 18,270 square kilometres, with a maximum elevation of 1324 metres (Neall and Trewick 2008). The nation consists of 322 islands with two main islands, Viti Levu and Vanua Levu. The region falls within the tropics and therefore experiences wet and dry seasons. The mean annual temperature is 25 degrees Celsius (Neall and Trewick 2008) and the mean precipitation ranges from 1676 to 3544 mm at different locations (Kumar et al. 2014). During the southern winter, the South Pacific Convergence Zone (SPCZ) moves northward, away from Fiji; therefore, rainfall during winter is a result of regional features such as orographic lift. Fiji has mountainous topography with prevailing south-easterly trade winds, which results in orographic lift (Terry 2005). This is highly evident in the dry season, whereby the leeward side (Western and Northern Divisions) receives 20% of the annual rainfall, while the windward side (Central Division) receives 33% of the annual rainfall (Terry 2005). Most of the rain-fed agriculture is located in the Western and Northern Divisions, including the sugar industry. The intra-seasonal movement of the SPCZ northward and southward results in a warm-phase El Niño and a cool-phase La Niña event, respectively, causing unexpected rainfall anomalies (Juillet-Leclerc et al. 2006). Precipitation overall in the country is driven by the SPCZ, the El Niño-Southern Oscillation (ENSO) and tropical cyclones, which are accompanied by heavy rain events during the wet season.

Satellite Rainfall Data
The satellite precipitation data were selected on the basis of the start date of the record and the spatial resolution (Table 1). Because Fiji occupies a small area, we needed to ensure that products with fine resolutions were selected. Rainfall can be highly variable in different areas despite being in the same zone; therefore, the smallest spatial scale was chosen for validation (Xu et al. 2015). Following the methods of Dembélé and Zwart (2016), satellite data were extracted using the point-to-pixel method based on the precise observation station coordinates.
CHIRPS: This dataset, starting from 1981 with a spatial extent ranging from 50° S to 50° N, was retrieved from http://chg.ucsb.edu/data/. CHIRPS data have a resolution of 0.05° and use thermal infrared bands and in situ station data to create the final gridded product. The satellite information is used to account for sparsely gauged locations, and precipitation estimates are available on daily, pentadal and monthly scales. CHIRPS data have not been explored before for Pacific island topographies; however, they have been tested in other areas such as the Caribbean. Evaluation of CHIRPS in Mozambique has indicated a good ability to detect rainfall during hurricanes (Toté et al. 2015), which is especially useful given Fiji's proneness to flooding during cyclones.
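The point-to-pixel extraction described above can be illustrated with a minimal sketch. It assumes a regular latitude-longitude grid whose cell (0, 0) is centred at a known south-west coordinate; the function name, the grid origin and the station coordinates below are hypothetical and are not taken from the study.

```python
import math

def nearest_pixel(lat, lon, lat0, lon0, res, n_rows, n_cols):
    """Return (row, col) of the grid cell whose centre is closest to a station.

    lat0, lon0: centre coordinates of cell (0, 0), assumed to be the
                south-west corner cell of a regular grid (hypothetical layout).
    res:        grid spacing in degrees (0.05 for a CHIRPS-like grid).
    """
    # Round to the nearest cell centre along each axis.
    row = int(math.floor((lat - lat0) / res + 0.5))
    col = int(math.floor((lon - lon0) / res + 0.5))
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise ValueError("station lies outside the grid")
    return row, col

# Illustrative station location on a hypothetical 100 x 100 grid.
row, col = nearest_pixel(-17.76, 177.44, lat0=-20.0, lon0=177.0,
                         res=0.05, n_rows=100, n_cols=100)
```

The satellite time series for a station is then simply the series stored at that (row, col) cell, which is what makes the comparison a point-to-pixel one rather than an areal average.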

In Situ Rainfall Data
The daily and monthly precipitation dataset was obtained from Fiji Meteorological Services, which operates over 20 manual and automatically telemetered rain gauges over the Fiji islands. Six high-quality stations were used in the current evaluation: Labasa, Nadi, Navua, Ba, Matei and Nausori (Fig. 1). These meteorological stations were selected considering the variation in local climate, topography and orographic effects observed in Fiji (Table 2). The Central Division stations, Navua and Nausori, are in the windward region, which is characterised mainly by wet weather conditions (see Table 2 and Fig. 2 for station climatology). The Western Division is the drier side of the island, represented by the stations Rarawai and Nadi. Labasa station is the main station on the island of Vanua Levu, while Matei is a town situated on Taveuni (a smaller outer island); both serve to provide datasets for the Northern Division of Fiji. The data were checked for homogeneity, and missing values were replaced with the long-term monthly averages. The number of missing data points ranged from 0 to 5 for most stations, while Matei had 31 missing data points.
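As a minimal sketch of the gap-filling step, the following replaces each missing value with the long-term mean of the same calendar month. The function name and the toy series are hypothetical; the study does not state how the averages were computed in practice.

```python
def fill_with_monthly_means(months, values):
    """Replace None entries with the long-term mean of the same calendar month.

    months: calendar month (1-12) of each record.
    values: rainfall totals, with None marking a missing record.
    """
    sums, counts = {}, {}
    for m, v in zip(months, values):
        if v is not None:
            sums[m] = sums.get(m, 0.0) + v
            counts[m] = counts.get(m, 0) + 1

    filled = []
    for m, v in zip(months, values):
        if v is None:
            if m not in counts:
                raise ValueError(f"no observations available for month {m}")
            v = sums[m] / counts[m]
        filled.append(v)
    return filled

# Toy example: the missing January value becomes the mean of the other Januaries.
filled = fill_with_monthly_means([1, 2, 1, 2, 1], [100.0, 50.0, None, 70.0, 200.0])
```

Because the fill value is a climatological mean, it damps variability at the filled time steps, which matters more for a station with many gaps than for one with only a few.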

Sea Surface Temperature Data
The Kaplan SST data used in this analysis were retrieved from https://climatedataguide.ucar.edu/climate-data/kaplan-sea-surface-temperature-anomalies. The dataset has a temporal coverage starting from 1856 and a global spatial extent with a spatial resolution of 5.0° × 5.0°. The data are produced by various steps, such as Empirical Orthogonal Functions (EOF) projection, optimal interpolation (OI), the Kalman Filter (KF) forecast, KF analysis and an optimal smoother (OS), which ultimately minimise the number of missing data and error values (Reynolds and Smith 1994). A bounding box was created in ArcGIS, ranging from 150° E to 80° W between the Tropic of Cancer and the Tropic of Capricorn, which resulted in 259 grid points. This ensured that the SST would be extracted for the region surrounding Fiji. The series length extracted was the same for the EDI and SST, from 1980 to 2017 (Fig. 2). The data were checked for homogeneity, and grids with missing data were omitted.

Satellite Rainfall Validation
Continuous verification statistics were applied to test the satellite-derived precipitation products (Jolliffe and Stephenson 2003) for the six stations (Fig. 1). The following performance measures were used to evaluate the products: coefficient of determination (R²), root-mean-square error (RMSE) and Lin's concordance. The R² measures the extent of association between the recorded data and the satellite data. The RMSE summarises the magnitude of the errors between the recorded and the satellite values, with lower values indicating a smaller difference between the two (Adamowski et al. 2012). Lin's concordance, as described by Nickerson (1997), is calculated based on the degree to which two variables fall on the 45° line that passes through the origin. The calculation was undertaken in R using the DescTools package (Signorell et al. 2016). Lin's concordance is unitless and ranges from -1 to 1; as with R², a higher value indicates a stronger relationship. The unit for RMSE is millimetres. The daily data were aggregated to monthly data and were tested using the identified performance metrics. A complementary check was also made on Nadi station to test for differences between the wet and dry seasons for the CPC product (results shown in Table 5). The eyeball method is a subjective form of verification (Ebert 2007); in this study we made scatter plots of observed versus satellite data for Nadi station and created kernel density plots to determine instances of overestimation and underestimation.
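The three performance measures can be sketched in a few lines. This is an illustrative implementation, not the DescTools routine used in the study: it assumes R² is taken as the squared Pearson correlation and uses population (1/n) variances in Lin's concordance; the function name and the toy data are hypothetical.

```python
import math

def validation_metrics(obs, sat):
    """Return R² (squared Pearson correlation), RMSE and Lin's concordance."""
    n = len(obs)
    if n != len(sat) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")

    mean_o = sum(obs) / n
    mean_s = sum(sat) / n
    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    var_s = sum((s - mean_s) ** 2 for s in sat) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sat)) / n

    r2 = cov ** 2 / (var_o * var_s)
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sat)) / n)
    # Concordance penalises both scatter and systematic departure from the 1:1 line.
    ccc = 2 * cov / (var_o + var_s + (mean_o - mean_s) ** 2)
    return {"r2": r2, "rmse": rmse, "ccc": ccc}

# Toy monthly totals (mm): the satellite series has a constant +5 mm bias.
metrics = validation_metrics([10.0, 20.0, 30.0, 40.0], [15.0, 25.0, 35.0, 45.0])
```

In this toy case R² stays at 1 because the two series are perfectly linearly related, while the concordance drops below 1 because of the bias. This is why the concordance is a useful complement to R² when checking a satellite product against gauges.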

Index Calculation
For further analyses, CPC data were used because these data rendered the best correlation results with local station data (Table 4). Following point extraction, the dataset was checked for quality and homogeneity, whereby the zoo package (Zeileis and Grothendieck 2005) in R was used to carry out linear interpolation in the case of missing precipitation data. Point linear interpolation is calculated using the values measured on either side of the missing data (North and Livingstone 2013). The drought index was calculated on the basis of the principles identified by Byun and Wilhite (1999) for each of the grid cells retrieved from the CPC data. The daily Effective Drought Index (EDI) values were aggregated into monthly EDI values because the climate index data were available at monthly timescales (Fig. 3).
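To make the index calculation concrete, the sketch below follows the commonly cited form of the Byun and Wilhite (1999) approach: effective precipitation (EP) is the sum, over n = 1 to the window length, of the mean rainfall of the most recent n days, and the index is the standardised deviation of EP from its mean. It is a simplified illustration under stated assumptions: the mean and standard deviation are taken over the whole series rather than per calendar day, and the function names and toy data are hypothetical.

```python
import statistics

def effective_precipitation(precip, window):
    """EP for every day that has a full window of antecedent rainfall.

    For each day, the mean rainfall of the latest n days is accumulated for
    n = 1..window, so recent rainfall is weighted more heavily than older rainfall.
    """
    ep = []
    for t in range(window - 1, len(precip)):
        running, total = 0.0, 0.0
        for n in range(1, window + 1):
            running += precip[t - n + 1]   # rainfall n-1 days before day t
            total += running / n           # mean of the latest n days
        ep.append(total)
    return ep

def edi_sketch(precip, window=365):
    """Standardised EP anomaly (simplified: one mean and SD for the whole series)."""
    ep = effective_precipitation(precip, window)
    mean_ep = statistics.fmean(ep)
    dep = [e - mean_ep for e in ep]        # deviation of EP from its mean
    sd = statistics.pstdev(dep)
    if sd == 0:
        raise ValueError("EP is constant; the index is undefined")
    return [d / sd for d in dep]

# Toy daily series with a 2-day window: a single 10 mm event.
ep = effective_precipitation([0.0, 0.0, 10.0, 0.0], window=2)
edi = edi_sketch([0.0, 0.0, 10.0, 0.0], window=2)
```

The decaying weights are what let the index respond at daily resolution: the 10 mm event contributes fully on the day it falls and progressively less afterwards.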

Multivariate Analysis
CCA identifies linear combinations of multiple response and explanatory variables that are maximally correlated (Ouarda et al. 2001), while PCA employs the same approach on a single multi-dimensional dataset. The relationships between the weights of the response and explanatory data are known as loading patterns, and these can be used to visualise potential physical mechanisms in large-scale processes (Shabbar and Barnston 1996). The multivariate analysis was performed according to the guidelines provided by Wilks (2011). SST was used as the explanatory variable and the gridded EDI as the response variable, from which a probabilistic forecast was generated. If a dataset x(t) leads y(t), then CCA can be used to forecast the y variable by using the lagged x variable in a gridded form. A lag of 2 months was introduced in the data matrix between the EDI and the SST; consequently, a short-term forecast of EDI (2 months ahead) could be provided. Short-term forecasts are not only accurate but also adequate to assist managers and farmers in making useful management and agronomic decisions (Anshuka et al. 2019). As a first step, the data were standardised and centred to remove any trend. This ensures that the principal components (PCs) are mutually orthogonal (Westra and Sharma 2010).
Previous studies have also applied pre-processing of data using empirical orthogonal analysis (Landman and Mason 1999; Tang et al. 2000; Yu et al. 1997). This has a number of advantages: it reduces dimensionality and noise in the data, and it gives equal opportunity for predictors to contribute to forecasting models (Shabbar and Barnston 1996).
We selected the first 10 PCs from the SST and EDI to perform the CCA. Forecast evaluation was performed by splitting the data into two parts: a hindcast was generated by training the model on data from 1980 to 2004, and the forecast was tested using the data from 2005 to 2017. The forecast skill was verified using the coefficient of determination (R²) and mean square error (MSE). A spatial plot was generated to determine how well the forecast and the hindcast predicted the EDI classes shown in Table 3.
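The workflow above (PCA pre-filtering, CCA on the retained PCs, a 2-month lag and a train/test split) can be sketched with NumPy on synthetic data. This is an illustration under stated assumptions rather than the implementation used in the study: the data are random, only three PCs are retained, the CCA is computed with a QR/SVD formulation, and all function names are hypothetical.

```python
import numpy as np

def pca(X, k):
    """Return the column means, the first k EOFs (k x n_grid) and PC scores (n_time x k)."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k], U[:, :k] * s[:k]

def fit_cca(A, B):
    """CCA between centred score matrices A and B via QR decomposition and SVD."""
    Qa, Ra = np.linalg.qr(A)
    Qb, Rb = np.linalg.qr(B)
    U, rho, Vt = np.linalg.svd(Qa.T @ Qb)   # rho holds the canonical correlations
    Wa = np.linalg.solve(Ra, U)             # weights for the predictor PCs
    Wb = np.linalg.solve(Rb, Vt.T)          # weights for the response PCs
    return Wa, Wb, rho

def cca_forecast(x_train, y_train, x_test, k):
    """Fit PCA + CCA on the training period and predict the response field for x_test."""
    mx, eofs_x, a = pca(x_train, k)
    my, eofs_y, b = pca(y_train, k)
    Wa, Wb, rho = fit_cca(a, b)
    a_new = (x_test - mx) @ eofs_x.T                   # project onto training EOFs
    b_hat = ((a_new @ Wa) * rho) @ np.linalg.inv(Wb)   # predicted response PCs
    return my + b_hat @ eofs_y, rho

def demo(seed=0, n_time=240, n_train=160, lag=2, k=3):
    """Synthetic test: the 'EDI' field responds to the 'SST' field two steps later."""
    rng = np.random.default_rng(seed)
    driver = rng.standard_normal((n_time, k))
    sst = driver @ rng.standard_normal((k, 20)) + 0.1 * rng.standard_normal((n_time, 20))
    edi = np.zeros((n_time, 8))
    edi[lag:] = driver[:-lag] @ rng.standard_normal((k, 8))
    edi += 0.1 * rng.standard_normal((n_time, 8))

    x, y = sst[:-lag], edi[lag:]            # pair SST at t with EDI at t + lag
    y_hat, rho = cca_forecast(x[:n_train], y[:n_train], x[n_train:], k)
    y_test = y[n_train:]
    resid = y_test - y_hat
    mse = float(np.mean(resid ** 2))
    r2 = 1.0 - float(np.sum(resid ** 2) / np.sum((y_test - y_test.mean(axis=0)) ** 2))
    return r2, mse, rho

r2, mse, rho = demo()
```

Fitting the EOFs and canonical weights on the training period only, and then projecting the test-period SST onto them, keeps the test-period skill free of information leaked from the verification data. When all retained canonical modes are used, as here, the prediction is equivalent to a multivariate least-squares regression of the response PCs on the predictor PCs.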

Satellite Rainfall Validation
First, the satellite data products were validated for the Fiji region. From the results (Table 4), the best satellite product was selected and used to calculate the EDI. The CPC had the best results for most of the stations (Table 4). Nadi station had the highest correlation (0.85), a high Lin's concordance (0.88) and the lowest RMSE (75 mm). The best- and worst-performing stations were the same across all three datasets: Nadi had the best performance and Navua had the poorest correlation. There was a marginal difference in the performance measures for CPC and CHIRPS data. The PERSIANN-CDR data, although having the lowest performance of the three datasets, still had modest results. Navua had the weakest correlation and the highest error, followed by Matei station.
The density plot (Fig. 4) indicated that the distribution was skewed to the right; that is, most of the precipitation values ranged from 0 to 200 mm. Out of the three datasets, the CPC most closely matched the distribution of the gauge data. The datasets showed instances of overestimation, especially for low rainfall, and, to a lesser extent, underestimation of heavy rainfall events. However, not all events were captured; in particular, some of the high rainfall events were under-predicted. None of the datasets were able to detect extreme heavy rainfall events, but the CPC was able to follow the gauge data closely up to values as high as 1000 mm. The scatterplot for Nadi station (Fig. 5) shows that CPC data represent the rain gauge data well.
Additionally, validation for Nadi station was performed separately by considering the two seasons in Fiji: the wet season including January, February, March, April, November and December and the dry season including May, June, July, August, September and October. Validation of data for the wet and dry seasons at Nadi showed that the correlations were higher in the dry season with low error values (Table 5). On the basis of satellite validation performance, CPC data was selected to calculate the Effective Drought Index.

Spatial Drought Modelling

Principal Component Analysis
A principal component analysis was performed on EDI and SST. The first five Principal Components (PCs) explained approximately 86% of the variation within the SST data, while the first five PCs explained 98% of the variation in the EDI data (Table 6). The first 10 PCs were retained for further analysis. pair shows that 98.6% of the variation in the EDI was explained by the variation in the SST (Table 7), indicating a strong relationship between the two variables.

Spatial Modelling
The predictive skill of the model was assessed for both the hindcast and the forecast (Fig. 6). The hindcast was the model training stage (Fig. 7), which obtained an R² score of 0.83 and an MSE value of 0.12.
The model was tested in the forecasting stage (Fig. 8) with a noted decrease in the R² skill and increase in the MSE value by almost 50%. The colour sequence revealed that the EDI classes in both the hindcast (Fig. 7) and the forecast (Fig. 8) were generally well predicted. The grid cells that fall in the Central Division, which is mainly affected by orographic rainfall, were not correctly predicted in either the training or the testing stage. The time series plot (Fig. 6) shows instances of under-prediction in the testing stage and, to a lesser extent, in the training stage. Comparisons between observed mean EDI showed an increasing precipitation trend across the Fiji island group. In 2004, a number of dry events were observed; however, in 2017 no dry conditions were observed across Fiji.

Satellite Rainfall Verification
All the satellite-derived precipitation products were reasonably well correlated with the observed precipitation data and therefore can be used in further hydrological studies. Both the CPC and CHIRPS gridded precipitation products showed high correlations with observed data, except for Navua and Matei stations (Table 4). The high correlation is likely due to the data products being calibrated using the same local rain gauge data as we used to verify the products. Specifically in CPC, gauge reports are collected and compared with historical records, other nearby stations and numerical forecasts, and are subsequently converted into a field of precipitation by factoring in the orographic effect (Xie et al. 2010). Regardless, the quality of the gridded data is partially dependent on the availability of data on the ground, as poor availability on the ground will result in a poor satellite-derived product (Toté et al. 2015). Therefore, future research could verify the products against gauges that were not used in the calibration process.
All three data sources showed overestimation and underestimation of the frequency of rainfall volumes, as indicated by the density plot (Fig. 4). Gridded datasets are derived from gauge-based products that undergo interpolation, resulting in smoother extreme values (Sun et al. 2018). This is an important consideration: in drought monitoring studies, one needs to ensure that the satellite-derived product does not overestimate rainfall; similarly, models that underestimate rainfall may not be suitable for flood monitoring studies (Toté et al. 2015).
Similarly, the seasonal correlations for Nadi were better in the dry season than in the wet season (Table 5). This may result in better forecasts for the dry season than for the wet season in hydrological modelling studies. This could also be a possible explanation for the low correlations achieved for Navua, which is the wettest station in the country with a mean annual rainfall of 3573 mm. Because performances are generally better in the dry season than in the wet season, the datasets are less able to make accurate estimates for stations that exhibit extremely wet characteristics, such as Navua.
A bias correction specific to the study site can also be undertaken for low rainfall or high rainfall, depending on the focus of the study (Vernimmen et al. 2012; Yin et al. 2008).
The satellite products use algorithms that cannot be applied equally to different terrains. The cloud-top temperature method of rain detection used by CHIRPS may fail to identify warm orographic clouds, which rise higher than other rain/non-rain producing clouds in mountainous and coastal regions (Dinku et al. 2010). Matei, which is situated in Taveuni  In contrast, comparison between TAMSAT and the CPC for Southern Africa showed better results for the CPC over mountainous regions due to adjustment of the final product for orographic rainfall (Thorne et al. 2001). This could also explain the better overall performance of the CPC data, which take into account rainfall caused by the orographic effect in parts of the Fiji island group. While CHIRPS was marginally less well correlated with local rainfall data, it has a finer spatial resolution, so it, or potentially a hybrid of the products, should also be investigated.

Spatial Drought Modelling
The analyses revealed that there is potential to develop spatial monitoring tools using multivariate techniques with reasonable accuracy. An approach that identifies physical relationships between precipitation and climate drivers can yield useful information to understand the risks better ( winds are responsible for much of the rain in the south-eastern side of the island (Suva, Navua and Nausori) and less rain in the north-western side during the dry (winter) season is observed. The Western Division in Fiji has lower correlation with the Southern Oscillation Index during the winter (dry season) (Walsh et al. 2001). This may indicate that winter precipitation has no systematic link to large-scale circulations controlled by SST in particular and is influenced by regional features such as orographic lift. Tropical cyclone activity can also influence forecasts, as heavy rains associated with cyclones are not usually reflected in the SST (Yu et al. 1997). Therefore, SST alone may not be the best predictor of rainfall in Fiji, and features such as spatial atmospheric pressure data (Chu and He 1994) and ocean thermocline (Ruiz et al. 2006) warrant further investigation.

Conclusion
The results suggest that all satellite products give reasonable correlations with observed rainfall for the Fiji region. However, CPC data give the best correlation with the recorded rainfall data. Therefore, it is recommended that CPC data are used for future hydrological studies. The mean SST and annual mean EDI series in Fiji indicate an increasing trend in temperature and precipitation, respectively. The study has also identified a strong link between the SST and the EDI, with a correlation of 0.98. A CCA model shows reasonable skill in predicting EDI over the Fiji region; hence, this model can be used as a 2-month advance spatial drought forecasting tool for Fiji. Future studies can incorporate sea level pressure and ocean thermocline as explanatory variables of the EDI to further improve the forecasts. The loading patterns can also be explored to assess the effect of major climate drivers in different parts of the island group in Fiji and the South West Pacific region as a whole. In the absence of an operational early-warning system, this relatively simple and easy-to-implement system can be made available locally in Fiji to various stakeholders.

Declarations
Conflict of interest: The authors declare that they have no conflict of interest.