Spatial Variation in the Relationship between Leptospirosis Incidence and Climate Determinants in Sarawak (Malaysia)

Abstract

Background
Leptospirosis is a zoonotic disease caused by spirochete bacteria in the genus Leptospira, and it has become a significant public health challenge in Malaysia. Environmental survival and persistence of this pathogen are highly dependent on environmental conditions such as moisture content, pH and temperature. These conditions are further shaped by the natural climate system, including precipitation and humidity, which is highly heterogeneous at a geographical scale. This paper describes the spatial and temporal distribution of leptospirosis incidence in relation to climate factors using a Geographical Information System and stratifies the climate factors based on their association with the disease incidence rate.

Methods
Leptospirosis surveillance data from 2012-2016 were integrated into this study along with seven geospatial climate variables for the state of Sarawak, Malaysia. High and low clustering of incidence rate was explored using the Getis-Ord Gi* statistic. A Geographically Weighted Regression model was used to study the relationship between the incidence rate and selected climate variables.

Results
Spatial analysis revealed seven districts in the state of Sarawak as hot spot areas and six cold spot areas, with GiZ scores ranging from -3.092 to 3.203. The cumulative incidence rate demonstrated an increasing trend towards the South-East region of Sarawak, with an average of 162 cases per 100,000 population. The univariate analysis reported a significant relationship (p<0.05) between leptospirosis incidence rate and temperature seasonality, with the lowest AIC value of 31741.4. The results showed that temperature seasonality explained 99% (R2: 0.99) of the spatial variance in incidence rate, with 42.31% of the localities showing a significant positive relationship.

Conclusion
The present study highlighted the importance of temperature seasonality as a potential determinant for the spatial distribution of leptospirosis cases, with the strength and direction of the association differing between localities. This suggests that specific interventions at localities with strong climate determinants should be implemented to combat the burden of leptospirosis.

Background
The global burden of leptospirosis is estimated at 0.10-975 cases per 100,000 population, with a case fatality of 6.85%, depending on the prevalent serovars, healthcare services and economic status of the population [1]. The World Health Organization (WHO) has recognized leptospirosis as a neglected tropical disease of global importance, thus requiring further research to understand its epidemiology, disease biology and ecology as well as its transmission dynamics. The severity of this disease is attributed to spirochetes of the genus Leptospira that colonize the kidneys of a wide diversity of peridomestic animals (rats, horses, cows, dogs, and pigs) and feral animals (bats, coyotes and sea lions) [2]. Exposure of abraded skin and mucous membranes to leptospire-contaminated water facilitates the entry of these bacteria into the human body, where they establish infection. The local burden of human leptospirosis was discussed in a previous study [3], which emphasized the endemicity of this disease in the tropical and subtropical regions, with prominently high incidence in Southeast Asia [4,5,6,7].
Climatic conditions are at the forefront among the contributing factors of endemicity and the persistence of Leptospira sp. in the natural environment [8,9,10], particularly during the rainy season.
Malaysia is considered an epicentre of leptospirosis due to the suitable weather and climatic conditions that favour the growth and propagation of these bacteria [11]. Unprecedented outbreaks following extreme weather events are well documented in tropical countries such as the Philippines [4], Laos [7], India [12] and Malaysia [13].
In recent years, the development of Geographical Information Systems (GIS) has provided a robust and rapid ability to examine spatial and temporal patterns and processes by incorporating metadata into its analysis. This, in turn, has fostered the utilization of GIS analytics and geospatial statistics for environmental analyses of infectious diseases [14]. Previous studies have adopted GIS analysis tools in the study of ecological models to explore and analyse spatial variations in relationships between local environmental factors and the occurrences of leptospirosis [13,15,16,17,18].
As baseline data on vector population is currently not available, this study was structured by employing the leptospirosis surveillance database as the starting point for investigating the spatial and temporal pattern of leptospirosis case distribution. Considering the substantial role of climatic factors in facilitating the spread of leptospires, the existence of potential locations with strong climatic determinants should be further explored. As climate phenomena are non-stationary and transit from one locality to another in a complicated manner, it is hypothesized that spatial heterogeneity exists and plays an essential role in the transmission and distribution of leptospirosis in Sarawak. Thus, the objectives of this study are to describe the spatial and temporal distribution of leptospirosis incidence in Sarawak in relation to climate factors, and to evaluate and map the spatial heterogeneity of leptospirosis using Geographical Information System (GIS) application tools. To the best of our knowledge, this is the first study in the state of Sarawak to evaluate the spatial influence of climate factors on the spatial distribution of leptospirosis cases.

Study Area
The state of Sarawak is located between latitudes 0° 50' and 5° N and longitudes 109° 36' and 115° 40' E on the island of Borneo. With a total area of 124,450 km², Sarawak is the largest state in Malaysia.
According to the 2016 census, Sarawak has 2,738,700 inhabitants, with a population density of 22 people per km² [19]. The population is concentrated in the western regions such as Kuching (451 people per km²) and Samarahan (238 people per km²), while the South-East area of Sarawak such as Kapit is sparsely populated with 16 people per km² [19]. The climate is warm throughout the year, with heavy rainfall during the monsoon months from November to March. Based on the official classification, there are 12 divisions with 57 districts and sub-districts in Sarawak. As the evaluation of spatial heterogeneity of climate factors with the probable transmission of leptospires is part of the study objective, seven geospatial climatic variables were included in this study: precipitation in the wettest month (PWM), precipitation in the wettest quarter (PWQ), precipitation in the driest month (PDM), precipitation in the driest quarter (PDQ), precipitation seasonality (PS), temperature seasonality (TS) and water vapour pressure (WVP). The datasets were retrieved from the CHELSA [20] and WorldClim [21] databases. All data layers were projected in World Geodetic System (WGS 1984) coordinates and resampled to a spatial resolution of ~1 × 1 km in ArcGIS Desktop 10.3 [22].

Differences in case distribution for demographic factors
Differences in the distribution of leptospirosis cases across demographic factors such as gender, occupation type and age group were analysed using the Chi-square goodness-of-fit test in the statistical package SPSS (version 12) [23].
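As an illustration of the goodness-of-fit test described above, the following minimal sketch recomputes the statistic for gender using the case counts reported in the Results (2075 male, 793 female). It assumes equal expected counts in each category, which is not stated in the text; the function names are hypothetical and the analysis in the study itself was run in SPSS.

```python
import math


def chi_square_gof(observed):
    """Chi-square goodness-of-fit statistic, assuming equal expected counts."""
    total = sum(observed)
    expected = total / len(observed)
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    df = len(observed) - 1
    return statistic, df


def p_value_df1(statistic):
    """Upper-tail p-value of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Case counts by gender as reported in the Results: male, female.
gender_counts = [2075, 793]
stat, df = chi_square_gof(gender_counts)
print(f"chi2({df}) = {stat:.2f}, p < 0.01: {p_value_df1(stat) < 0.01}")
```

Under the equal-expectation assumption the statistic comes out at about 573.06 with one degree of freedom, which agrees with the value reported for gender in the Results.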

Spatial and temporal distribution of leptospirosis incidence
The annual incidence rate of leptospirosis (cases per 100,000 population) was calculated using the census data for Sarawak in 2016. One-way ANOVA was employed to test for significant differences in the mean incidence rate by year, month and division. The spatial and temporal distribution of the cumulative incidence rate was explored by interpolating the district incidence rates for the years 2012 to 2016 using ArcGIS 10.3 [22].
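The incidence calculation can be sketched as follows. This is an illustrative example only: the function name is hypothetical, the population is the 2016 census figure quoted in the Study Area section, and the total case count is assumed here to be the sum of the male and female cases reported in the Results.

```python
def incidence_per_100k(cases, population):
    """Incidence rate expressed as cases per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


SARAWAK_POPULATION_2016 = 2_738_700  # 2016 census figure from the Study Area section
total_cases = 2075 + 793             # assumption: male + female cases, 2012-2016

rate = incidence_per_100k(total_cases, SARAWAK_POPULATION_2016)
print(round(rate, 1))
```

With these inputs the state-wide cumulative rate is roughly 104.7 cases per 100,000 population. District-level rates would use the same function with district case counts and populations.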

Hot Spot Analysis
Hot spot analysis was performed to identify high and low clustering of cumulative incidence rate in the study area by calculating the Getis-Ord Gi* statistic for each feature in the dataset. The resultant z-scores and p-values indicate where features with either high or low values cluster spatially.
For statistically significant positive z-scores, the larger the z-score, the more intense the clustering of high values (hot spot); for statistically significant negative z-scores, the smaller the z-score, the more intense the clustering of low values (cold spot). A statistically significant hot spot is a feature with a high value that is also surrounded by other features with high values.
The statistical equation for calculating Gi and Gi* can be written as follows:

Due to technical limitations, equation 1 can be found in the supplemental file section [24]
where Wij(d) is a spatial weight vector with values for all cells j within distance d of target cell i, W*i is the sum of weights, S*1i is the sum of squared weights and s* is the standard deviation of the data in the cells. The output feature class is a shapefile displaying the polygons (districts) of high and low clusters of leptospirosis cumulative incidence rate.
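For reference, the standard form of the Gi* statistic that matches the symbols defined above is sketched below. It is assumed, not confirmed, that this is the form given in the supplemental file; the symbols x_j (incidence value of cell j), x̄ (mean of all values) and n (number of cells) are introduced here for illustration.

```latex
G_i^*(d) = \frac{\sum_{j=1}^{n} W_{ij}(d)\, x_j - W_i^* \bar{x}}
                {s^* \sqrt{\dfrac{n S_{1i}^* - (W_i^*)^2}{n - 1}}},
\qquad
W_i^* = \sum_{j=1}^{n} W_{ij}(d), \quad
S_{1i}^* = \sum_{j=1}^{n} W_{ij}(d)^2, \quad
s^* = \sqrt{\frac{1}{n}\sum_{j=1}^{n} x_j^2 - \bar{x}^2}
```

Written this way, Gi* is itself a z-score, which is why its values can be read directly as the hot spot and cold spot intensities described above.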

Trend Surface Analysis
Trend surface analysis is a smoothing method that produces a modelled surface and identifies broad, overall patterns in geographic data [25]. The present study incorporated trend surface analysis to determine whether north-south and east-west trends in leptospirosis incidence rate were systematically present in Sarawak, and compared these trends to the geographic patterns of climatic variables using Pearson's correlation test. This study employed the Empirical Bayesian Kriging method in the Geostatistical Analyst extension of ESRI's ArcGIS software [22] to compute and map spatial trends in cumulative leptospirosis incidence (CI) rate. Using Bayes' rule, the weights for each semivariogram are computed:

Due to technical limitations, equation 2 can be found in the supplemental file section [26]
where θi is the i-th set of semivariogram parameters (nugget, sill and range), W(θi | Z) is the weight for the i-th semivariogram, ƒ(Z | θi) is the likelihood that the observed data can be generated from the i-th semivariogram, and P(θi) is the probability of the i-th set of parameters θi among the simulated semivariogram spectrum [27].
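Based on the definitions above, Bayes' rule for the semivariogram weights can be sketched as follows. This is a reconstruction from the stated symbols under the assumption that the weights are normalised over all simulated semivariograms; the summation index k is introduced here and the supplemental file may present the expression differently.

```latex
W(\theta_i \mid Z) = \frac{f(Z \mid \theta_i)\, P(\theta_i)}
                          {\sum_{k} f(Z \mid \theta_k)\, P(\theta_k)}
```

The denominator ensures that the weights of all simulated semivariograms sum to one.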

Geographically Weighted Regression Model
Spatial heterogeneity in the relationship between leptospirosis incidence rate and geospatial climate variables was analysed using a Geographically Weighted Regression (GWR) model. In evaluating the relationship between dependent and independent variables at specific localities, a collection of digital gazetteers containing structured dictionaries of geographical places was used as the spatial reference. Gazetteers for Sarawak, Malaysia were retrieved from the DIVA-GIS website [28]. The raster cell values for dependent and independent variables were extracted to each point. The form of the GWR model used was similar to global regression models; however, the parameters vary with spatial location:

Due to technical limitations, equation 3 can be found in the supplemental file section [15]
where (μi, νi) is the coordinate of the i-th gazetteer point and βj (μi, νi) are the coefficients estimated by the weighting function for that point. For this study, an adaptive kernel was employed in which the 1000 points nearest to each regression point were included, as described by Mayfield et al. [15]. The efficacy of different regression models was evaluated based on the corrected Akaike Information Criterion (AICc), and model significance was tested using analysis of variance (F-test) [29].
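For orientation, the general GWR model form consistent with the symbols defined above is sketched below. It is assumed, not confirmed, that the supplemental file uses this form; y_i (incidence rate at point i), x_ij (value of climate variable j at point i), p (number of explanatory variables) and ε_i (error term) are symbols introduced here.

```latex
y_i = \beta_0(\mu_i, \nu_i) + \sum_{j=1}^{p} \beta_j(\mu_i, \nu_i)\, x_{ij} + \varepsilon_i
```

In the univariate models used in this study, p = 1, so each local model has one intercept and one slope per regression point.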

Differences in case distribution for demographic factors
The chi-square goodness-of-fit test showed a statistically significant difference in case distribution among age groups (χ2(7) = 772.35, p<0.01), with more cases observed in the 20-29 years age group (N = 614) than in other age groups. Individuals working in the agriculture-based and plantation sectors accounted for the highest number of cases (N = 487), which was statistically significant (χ2(12) = 1534.13, p<0.01). The test also revealed a statistically significant difference (χ2(1) = 573.06, p<0.01) in prevalence between males and females, with more cases in males (N = 2075) than in females (N = 793). A total of 67 fatal cases were reported during the study period, with the highest number (23 cases) reported in 2014. The demographic analysis is summarized in Table 1.

(TS), (r = -0.693, n = 200, p<0.01). This suggests that the increasing spatial trend of leptospirosis incidence follows the reduction trend in temperature seasonality.

Geographically Weighted Regression Model
Univariate GWR models were utilized to explore the spatial variability of explanatory climate variables and their influence on leptospirosis incidence rate. Seven independent variables, namely PS, PWM, PWQ, PDM, PDQ, TS and WVP, were analysed in the model. The GWR models are summarized in Table 2. (Table 4). All local models were statistically significant (p<0.05) except for the model with water vapour pressure (p = 0.83).
The parameter estimates for each independent variable vary spatially across the study region, as illustrated in Figure 4 (A)-(G). High coefficients of determination (R2) for temperature seasonality were concentrated in 13 districts, namely Asajaya, Bau, Daro, Kuching, Lundu, Matu, Oya, Padawan, Pendam, Samarahan, Sematan, Serian, and Siburan. This suggests that the GWR model with temperature seasonality as the sole explanatory variable fitted better in these districts than in other districts in the study region.
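To illustrate how a univariate GWR yields locally varying coefficients and local R2 values, the sketch below fits a weighted least-squares line around each regression point using an adaptive neighbourhood. It is not the ArcGIS implementation used in the study: the bisquare kernel, the function names and the synthetic grid data are all assumptions made for illustration, and the demo uses 15 neighbours rather than the 1000 used in the study.

```python
import math


def bisquare(distance, bandwidth):
    """Bisquare kernel: weight 1 at the regression point, falling to 0 at the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def local_fit(points, target, k):
    """Weighted least-squares fit of y = b0 + b1 * x around `target`.

    points: list of (u, v, x, y); target: (u, v) regression point;
    k: number of nearest points that sets the adaptive bandwidth.
    Returns (b0, b1, local_r2).
    """
    if not 3 <= k <= len(points):
        raise ValueError("k must be between 3 and the number of points")
    by_distance = sorted(
        (math.hypot(u - target[0], v - target[1]), x, y) for u, v, x, y in points
    )
    nearest = by_distance[:k]
    bandwidth = nearest[-1][0]  # adaptive: distance to the k-th nearest point
    weights = [bisquare(d, bandwidth) for d, _, _ in nearest]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all neighbours lie at the bandwidth distance")
    x_mean = sum(w * x for w, (_, x, _) in zip(weights, nearest)) / total
    y_mean = sum(w * y for w, (_, _, y) in zip(weights, nearest)) / total
    sxx = sum(w * (x - x_mean) ** 2 for w, (_, x, _) in zip(weights, nearest))
    sxy = sum(w * (x - x_mean) * (y - y_mean) for w, (_, x, y) in zip(weights, nearest))
    if sxx == 0.0:
        raise ValueError("x does not vary within the local neighbourhood")
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    ss_res = sum(w * (y - (b0 + b1 * x)) ** 2 for w, (_, x, y) in zip(weights, nearest))
    ss_tot = sum(w * (y - y_mean) ** 2 for w, (_, _, y) in zip(weights, nearest))
    local_r2 = 1.0 - ss_res / ss_tot if ss_tot > 0.0 else float("nan")
    return b0, b1, local_r2


def make_demo_points():
    """Synthetic 10 x 10 grid: positive slope in the west, negative slope in the east."""
    pts = []
    for u in range(10):
        for v in range(10):
            x = float((3 * u + 7 * v) % 5)
            slope = 2.0 if u < 5 else -1.0
            pts.append((float(u), float(v), x, 1.0 + slope * x))
    return pts


demo = make_demo_points()
west = local_fit(demo, (0.0, 5.0), k=15)
east = local_fit(demo, (9.0, 5.0), k=15)
print("west slope %.2f, east slope %.2f" % (west[1], east[1]))
```

In the synthetic data the sign of the relationship differs between the two halves of the grid, so the local slopes differ in direction even though a single global regression would report one averaged coefficient. This is the kind of spatial variation in strength and direction that the mapped parameter estimates are meant to show.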

Discussions
The average incidence rate of leptospirosis during the study period was 104 cases per 100,000 population, slightly higher than the incidence rate reported for Sarawak by Benacer et al. [3]. The difference is attributed to the different time frames of the two studies: Benacer and colleagues [3] analysed data from 2004 to 2012, whereas this study included data from 2012 to 2016. The present study showed a low incidence rate in 2016 (mean of 24 cases per 100,000 population) due to incomplete monthly data reported for that particular year. [30].
Agricultural-based and plantation workers accounted for the majority of the cases, with 487 cases, and disease occurrence was higher in males, with an average of 28.52 cases per 100,000 population. Higher infection rates in males and in individuals working in the agricultural-based and plantation sectors corroborate previous epidemiological studies, which outlined the influence of occupational exposure and recreational activities among men that contribute to a higher risk of infection [3,31,32]. This disease is prevalent among farmers, abattoir workers, miners, sewer workers and cleaners, as these occupations involve continuous contact with water and soil that may have been contaminated by Leptospira sp. [10].
Regarding the spatial and temporal distribution of leptospirosis in our study, high incidence rates were found predominantly in the South-East region (Kapit), with 388 cases per 100,000 population. In Sabah and Sarawak, significantly more cases occurred during the wet season from October to February, as continuous rain forms muddy ponds, small lakes and streams which prolong the survival of Leptospira sp. [3]. This seasonality pattern is comparable to other South East Asian countries such as Thailand, Laos and Indonesia, where marked incidence was recorded during the wet season [7,9,33]. The survival and longevity of leptospires once they are shed into the environment have a direct bearing on infection risk, which implies that environmental survival and persistence are highly dependent on environmental conditions [2].
The trend surface analysis illustrated a continuous geographical distribution of leptospirosis incidence and facilitated the visualization of transmission risk patterns across the study area. The initial inference from the trend surface analysis map suggested that leptospirosis incidence demonstrated a spatially heterogeneous pattern, with higher risk towards the central districts of Kapit, Song and Belaga (95% CI: 33.01-605.10). The results of the present study indicate that the trend of leptospirosis incidence inclines toward regions with high precipitation in the driest quarter and low temperature seasonality. This inference is supported by the hot spot analysis, which classified seven districts as significant hot spots. Hot spot areas are concentrated in the South-East region of the study area, matching the increasing trend of incidence demonstrated by Empirical Bayesian Kriging. This result suggests that high incidence clusters (hot spots) are concentrated in regions with higher precipitation in the driest quarter and in the driest month.
In contrast, low incidence clusters (cold spots) dominate regions with higher precipitation in the wettest month, precipitation in the wettest quarter, temperature seasonality, precipitation seasonality and water vapour pressure. This finding is in agreement with a previous study [34], where high seroprevalence was reported in the Rejang Basin (Kapit, Sibu, Sarikei) and further emphasized the endemicity of leptospirosis in Sarawak. It is suggested that high annual rainfall received in the Rejang Basin, and seasonal flooding during monsoon coupled with the proximity of living settlements to forest fringes contributed towards the high seroprevalence in these areas [34].
The geographically weighted regression (GWR) model allows for local spatial variation in the relationship between variables across a study region [35,36]. The GWR model in the present study identified geographical variation in the intensity of climate drivers for the distribution of leptospirosis cases in Sarawak. All models were statistically significant except for the model with water vapour pressure as the explanatory variable. The corrected Akaike Information Criterion (AICc) value, which assesses the relative quality of models given trade-offs between model fit and model complexity [15], showed that the model with temperature seasonality (TS) was more efficient (AIC: 31743.42) in explaining the geographical distribution of leptospirosis transmission risk. This result corroborates findings from other studies showing that geographically weighted regression can offer improvements and additional insights over standard non-spatial regression models for eco-epidemiological studies of leptospirosis [13,15,16,17,18].
The present study revealed that temperature seasonality alone could explain 99% (R2: 0.99) of the variance in leptospirosis incidence, while water vapour pressure could explain about 41% (R2: 0.41).
Thus, temperature seasonality appears to be more strongly related to disease incidence than water vapour pressure. Analysis of central tendency and dispersion for both variables highlighted a higher deviation in temperature seasonality (SD: 66.91) as compared to water vapour pressure (SD: 0.28) across the study area. Also, temperature seasonality demonstrated a positive relationship with the majority of the localities, which implies that as temperature variation increases, so does the incidence of leptospirosis. The association between temperature seasonality and leptospirosis incidence is most influential in two regions, namely the Central and South West regions. The link between temperature and the distribution of leptospirosis incidence demonstrated in the present study is supported by previous investigations in Thailand [33], Reunion Island [37], the Philippines [38], and China [39]. Climate condition is a network of multiple climatic processes. Water is substantial in the motility and dispersal of leptospires to the susceptible human host, and changes in water temperature will undoubtedly impact the survival of this pathogen. Surface water temperatures follow the evolution of the air temperature, which is dependent on geographical location and altitude. Thus, survival of leptospires in water and soil may depend on the month, the nature of the water and location (east to west, coast vs mountain) [37]. As emphasized by Medlock & Leach [41], vector-borne diseases are highly sensitive to changes in weather and climate. However, land-use changes and adaptation to climate change are also likely to affect the geographical distribution and incidence of vector-borne disease [41].
The results in this study should be interpreted in light of the study's limitations. First, the present study addresses the association between climate variables and the incidence of leptospirosis rather than causation. In order to study the causation of this disease in a community-based setting, detailed surveillance data and other geospatial environmental and socioeconomic covariates should be included for a more robust investigation. Another limitation of the present study involves the regression modelling, in which only the individual effect of each climate variable was explored by univariate analysis. The burden of leptospirosis involves a complex network of covariates that should be explored by multivariate analysis to capture the combined effects of multiple risk factors.

Conclusions
In conclusion, the present study has identified a significant climate variable associated with the distribution of leptospirosis cases using a Geographical Information System (GIS). The findings support the study hypothesis that significant spatial variations in the effect of climate variables exist within the state of Sarawak. This demonstrates the value of GIS in investigating the spatiotemporal dynamics of infectious disease and predicting the distribution of biological agents in relation to the spatial heterogeneity of environmental conditions. As leptospirosis continues to be a public health challenge in Sarawak, it is hoped that this information could assist in risk assessment in local areas and guide public health personnel to optimize the allocation of public health resources and enhance preparedness against future outbreaks according to region-specific conditions. Furthermore, the concepts and methods in this study should be complemented with ecological studies of Leptospira spp. in the natural environment for a better understanding of how environmental changes affect them and, by extension, human health.

The variation in coefficient of determination (local R2) for temperature seasonality derived from the univariate analysis.
