Total hospital admissions ranged from 0 (68 sub-districts) to 4,077 (Balah sub-district in Yala), with high numbers in mountainous areas along the southern border with Malaysia. Sub-district populations ranged from 1,678 (Ta Che sub-district in Yala) to 148,284 (Hat Yai sub-district in Songkhla) (Fig. 2).

The 68 sub-districts with no cases over the entire 11-year period were omitted. Only 5,948 of the 54,384 records in the study region (11 years × 16 gender-age groups × 309 sub-districts) had a malaria occurrence, giving an occurrence rate of 10.9%. The disease incidence rate is defined as the number of cases per 1,000 population in the corresponding cell.
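The record count and occurrence rate quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text. The variable names and the `incidence_per_1000` helper are illustrative assumptions, and the helper assumes the incidence rate is cases per 1,000 population within a cell.

```python
# Figures taken from the text; names are illustrative, not the authors'.
n_years = 11
n_gender_age_groups = 16
n_sub_districts = 309

# One record (cell) per year x gender-age group x sub-district.
n_cells = n_years * n_gender_age_groups * n_sub_districts
cells_with_malaria = 5_948

occurrence_rate = cells_with_malaria / n_cells


def incidence_per_1000(cases: int, population: int) -> float:
    """Cases per 1,000 population for one cell (hypothetical helper)."""
    return 1000 * cases / population


print(n_cells)                   # 54384
print(f"{occurrence_rate:.1%}")  # 10.9%
```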

A linear model predicting malaria incidence from gender, age group, year, and sub-district fits very poorly, as shown by the Q-Q plots of the studentised residuals, because malaria incidence has a highly right-skewed distribution. When the same linear model is fitted to the logarithms of the incidence rates, it fits very well, and R² almost doubles to 62.8% (Fig. 3). This R² reflects the model's ability to predict the incidence rate per 1,000 population.
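The effect of the log transformation on fit can be illustrated with synthetic data. This is a minimal sketch, not the study's model: it uses a single numeric predictor instead of the categorical factors, and the data are simulated under the assumption that effects act multiplicatively on the rate. It only shows why R² tends to rise when a right-skewed outcome is modelled on the log scale.

```python
import math
import random


def r_squared(x, y):
    """R^2 of a simple least-squares line y ~ a + b*x (the squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x = [rng.gauss(0, 1) for _ in range(2000)]

# Synthetic right-skewed "incidence rate": linear effect plus noise on the log scale.
rate = [math.exp(1.0 + 0.9 * xi + rng.gauss(0, 0.6)) for xi in x]

r2_raw = r_squared(x, rate)                         # fit on the raw, skewed scale
r2_log = r_squared(x, [math.log(r) for r in rate])  # fit on the log scale

print(f"R^2 raw: {r2_raw:.3f}, R^2 log: {r2_log:.3f}")
```

On data generated this way the log-scale fit explains markedly more variance, which mirrors the qualitative pattern reported for Fig. 3.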

A logistic model predicting malaria occurrence from gender-age group, year, sub-district, and population was assessed using a receiver operating characteristic (ROC) curve, which shows how well a model predicts a binary outcome. It plots sensitivity (the probability of predicting an outcome when it is present) against the false positive rate (the probability of predicting an outcome when it is absent). The cut-off point marked by the dot is the one at which the total predicted number of occurrences agrees with the number observed in the cells of the model. The ROC curve shows that a model containing gender-age group, year, and sub-district, as well as cell population, fits the occurrence data very well. The model has an AUC of 0.8523 (Fig. 4) and a predictive accuracy of 91.79% (the number of correct predictions divided by the total number of predictions).
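The quantities in this paragraph (sensitivity, false positive rate, AUC, and accuracy) can be computed directly from fitted probabilities and observed outcomes. The sketch below uses toy values, not study data, and all function names are hypothetical. `matching_cutoff` encodes one reading of the cut-off rule, namely choosing the threshold at which the number of predicted occurrences is closest to the number observed. That reading is an assumption.

```python
def roc_points(scores, labels):
    """(false positive rate, sensitivity) at every cut-off, strictest first."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def accuracy(scores, labels, cut):
    """Share of cells whose predicted class (score >= cut) matches the observed one."""
    return sum((s >= cut) == (y == 1) for s, y in zip(scores, labels)) / len(labels)


def matching_cutoff(scores, labels):
    """Cut-off whose count of predicted occurrences is closest to the observed count."""
    observed = sum(labels)
    return min(sorted(set(scores)),
               key=lambda c: abs(sum(s >= c for s in scores) - observed))


# Toy fitted probabilities and observed occurrence (1 = at least one case).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

cut = matching_cutoff(scores, labels)   # 0.6: four cells predicted, four observed
print(auc(roc_points(scores, labels)))  # 0.75
print(accuracy(scores, labels, cut))    # 0.75
```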

Confidence intervals of malaria occurrence for the levels of each predictive factor in the logistic regression model, including cell population, were plotted against the overall mean of 10.9% (Fig. 5). The age patterns differ from those seen for incidence, with a peak at age 20–29 for males, a broader, lower peak for females, and a decrease with age for each sex. Sub-districts vary, with areas of high occurrence, especially in Yala. Thus, the disease is more likely to occur at these levels of the predictive factors than at others.

Confidence intervals from the log-linear model for the malaria incidence rate at each level of the predictive factors were plotted (Fig. 6). The overall median incidence rate was 431 cases per 1,000 population, whereas the overall mean was 948 cases per 1,000 population. The mean exceeds the median because the distribution of malaria incidence rates is highly right-skewed. This does not affect the results from the model, because the model was fitted to the logarithms of the incidence rates, which are normally distributed.
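The gap between mean and median under right skew, and the way a log scale narrows it, can be seen with a handful of toy values. The numbers below are illustrative assumptions, not study data.

```python
import math
import statistics

rates = [1, 2, 3, 4, 50]  # toy right-skewed values: one large value pulls the mean up

mean_rate = statistics.mean(rates)      # 12
median_rate = statistics.median(rates)  # 3

# On the log scale the large value has far less leverage. Back-transforming
# the mean of the logs gives the geometric mean, which sits much closer
# to the median than the arithmetic mean does.
log_rates = [math.log(r) for r in rates]
geometric_mean = math.exp(statistics.mean(log_rates))

print(mean_rate, median_rate, round(geometric_mean, 2))
```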

The incidence rate patterns show moderate increases with age for each sex, a decrease over the decade from 2008 to 2018, and high variation among sub-districts, with pockets of high incidence rate in Yala and Narathiwat.

A confidence interval plot of malaria incidence rate was used to divide sub-districts into three groups according to whether their intervals lie completely above, around, or completely below the overall median (Fig. 7B). The red color indicates sub-districts with a high incidence rate. Sub-districts with high malaria incidence are all located in the forested mountain range to the south-west. Most of them are in Yala (all sub-districts of Than To and Kabang districts).

Similarly, a confidence interval plot of malaria occurrence was used to divide sub-districts into three groups according to whether their intervals lie completely above, around, or completely below the overall mean (Fig. 7A). The mountainous area bordering Malaysia, where the incidence rate is high, also has high occurrence. The map also shows that areas in the coastal plain of the Gulf of Thailand to the northeast have low to moderate malaria occurrence.
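The three-way grouping used for both maps reduces to comparing each interval with a reference value (the overall median for incidence rate, the overall mean for occurrence). A minimal sketch follows, with a hypothetical `classify` function and made-up intervals for three unnamed sub-districts.

```python
def classify(lower: float, upper: float, reference: float) -> str:
    """Place a confidence interval relative to a reference value."""
    if lower > reference:
        return "above"   # whole interval lies above the reference
    if upper < reference:
        return "below"   # whole interval lies below the reference
    return "around"      # interval contains the reference


# Made-up intervals for illustration only.
intervals = {"A": (5.1, 9.8), "B": (1.2, 4.9), "C": (0.4, 1.6)}
overall_median = 3.0

groups = {name: classify(lo, hi, overall_median)
          for name, (lo, hi) in intervals.items()}
print(groups)  # {'A': 'above', 'B': 'around', 'C': 'below'}
```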

Table 1 summarizes the characteristics of malaria occurrence and incidence rate. Malaria occurrence was highest in 2008 (19.9%) and lowest in 2018 (4.6%), and the incidence rate followed the same pattern. Occurrence was highest among both males and females aged 20–29 years, whereas the incidence rate generally increased with age in both sexes. Both occurrence and incidence rate were highest in Yala province; occurrence was lowest in Pattani province, and the incidence rate was lowest in Songkhla province.

Table 1

Occurrence and incidence rate of malaria by socio-demographic determinant

Determinant | Occurrence (%) | Incidence rate/1,000 population |
Year | | |
2008 | 19.9 | 6.90 |
2009 | 12.2 | 4.91 |
2010 | 12.6 | 5.65 |
2011 | 7.9 | 3.51 |
2012 | 9.1 | 4.33 |
2013 | 13.5 | 4.26 |
2014 | 12.4 | 4.08 |
2015 | 5.9 | 2.84 |
2016 | 13.9 | 4.40 |
2017 | 8.3 | 3.04 |
2018 | 4.6 | 2.34 |
Gender-age group | | |
Male | | |
0–9 years | 13.5 | 3.16 |
10–19 years | 17.7 | 3.24 |
20–29 years | 20.0 | 3.16 |
30–39 years | 15.7 | 3.74 |
40–49 years | 13.1 | 4.55 |
50–59 years | 8.9 | 5.87 |
60–69 years | 6.6 | 8.93 |
70+ years | 3.6 | 9.35 |
Female | | |
0–9 years | 12.6 | 3.31 |
10–19 years | 12.7 | 3.48 |
20–29 years | 13.0 | 3.44 |
30–39 years | 10.9 | 4.35 |
40–49 years | 10.2 | 4.15 |
50–59 years | 7.6 | 6.85 |
60–69 years | 5.2 | 7.75 |
70+ years | 3.5 | 7.57 |
Province | | |
Yala | 28.4 | 7.19 |
Pattani | 5.0 | 3.24 |
Narathiwat | 10.9 | 4.01 |
Songkhla | 7.6 | 2.55 |

A thematic map of all combinations of occurrence and incidence rate levels is shown in Fig. 8. Spearman's correlation coefficient between occurrence and incidence is 0.29. The area on this map where malaria occurrence and incidence rate were both high closely matches the area on the map of the number of cases (Fig. 2) where every sub-district reported 25 or more cases over the 11 years.
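Spearman's coefficient is the Pearson correlation of the ranks of the two variables. The sketch below shows the calculation using the four province-level rows of Table 1 as input. This is an assumption made purely for illustration: the reported value of 0.29 comes from data the table does not list, so the result here is not expected to reproduce it. Function names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Province rows of Table 1: Yala, Pattani, Narathiwat, Songkhla.
occurrence = [28.4, 5.0, 10.9, 7.6]
incidence = [7.19, 3.24, 4.01, 2.55]

rho = spearman(occurrence, incidence)
print(round(rho, 2))  # 0.8
```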