This study used a cross-sectional survey design. The dataset was the Rwanda Malaria Indicator Survey (MIS), which was part of the 2010 Demographic and Health Survey (DHS).
Inclusion Criteria
The study included males and females who were tested for malaria (rapid malaria test and confirmatory blood smear test), irrespective of whether the result was positive or negative.
Study area
This work was conducted in Rwanda, a Central African country located south of the equator between latitudes 1°4' and 2°51' South and longitudes 28°63' and 30°54' East. It has a surface area of 26,338 square kilometers and is bordered by Uganda to the north, Tanzania to the east, the Democratic Republic of the Congo to the west and Burundi to the south. Rwanda is landlocked and lies 1,200 kilometers from the Indian Ocean and 2,000 kilometers from the Atlantic Ocean [20]. The country is divided into five geographically based provinces: North, South, East, West and the City of Kigali. The provinces are further subdivided into 30 districts, 416 sectors, 2,148 cells and 14,837 villages [20].
Sampling technique
Two-stage sampling of Rwanda malaria indicator survey (MIS)
The data used for the analysis were obtained from the 2010 Malaria Indicator Survey (MIS) conducted by the Rwanda Demographic and Health Survey (RDHS) program. That survey collected data on basic demographic and health indicators, malaria prevention, treatment and morbidity.
Sampling in the original survey was done in two stages. In the first stage, 492 villages, which formed the clusters, were selected with probability proportional to village size. The village population size also indicates the number of households in the village. All households in the selected villages were then mapped and listed, and the resulting list served as the sampling frame for the second stage of sample selection. All 492 clusters surveyed for the 2010 RDHS were included in the modelling. The selected data comprised 11,865 households that consented to participate in the study and completed the individual questionnaires. Data for children less than five years of age were collected from their mothers [20, 21].
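The first-stage selection with probability proportional to size (PPS) can be illustrated with a short sketch. It assumes systematic PPS selection along the cumulative list of village sizes, which is one common way to implement PPS; the text does not state which procedure the survey actually used. The function name and the village sizes are hypothetical.

```python
import random


def pps_systematic(sizes, n_clusters, seed=0):
    """Select n_clusters units with probability proportional to size.

    sizes: list of positive household counts, one per village (hypothetical data).
    Returns the list of selected village indices (a very large village
    may be selected more than once).
    """
    rng = random.Random(seed)
    total = sum(sizes)
    step = total / n_clusters          # sampling interval
    start = rng.uniform(0, step)       # random start within the first interval
    selected = []
    cumulative = 0.0                   # total size of villages already passed
    idx = 0
    for k in range(n_clusters):
        target = start + k * step
        # Advance until the target falls inside the current village's range.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Illustrative village sizes (number of households), not survey data.
village_sizes = [120, 80, 200, 50, 150, 100]
chosen = pps_systematic(village_sizes, n_clusters=3, seed=42)
```

Larger villages cover a wider stretch of the cumulative list, so they are more likely to contain one of the equally spaced selection points.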
Dependent variable
The malaria outcome variable for this study was defined according to the case definition criteria of the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO). The CDC and WHO definitions indicate that there are three possible states of malaria infection [22, 23]:
- No malaria
- Probable (symptomatic or asymptomatic) malaria infection
- Confirmed malaria infection
Two types of test were conducted on the surveyed population: a rapid malaria test and a confirmatory blood smear laboratory test. Participants with a negative result on both tests were placed in the category "No malaria". Participants who tested positive on the rapid malaria test but either did not have a confirmatory blood smear test or had a negative blood smear result were classified as "Probable (symptomatic or asymptomatic) malaria". Participants with a positive confirmatory blood smear test were classified as "Confirmed malaria", regardless of whether their rapid malaria test was positive or negative.
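The classification rule can be written as a small function. This is a minimal sketch of the rule as described in the text, not the code used in the study. The function and label names are hypothetical. The case of a negative rapid test with no blood smear result is not covered by the description, so it is returned as "unclassified" here as an assumption.

```python
from typing import Optional


def classify_malaria(rapid_positive: bool, smear_positive: Optional[bool]) -> str:
    """Map the two test results to the three-level malaria outcome.

    rapid_positive: result of the rapid malaria test.
    smear_positive: result of the confirmatory blood smear,
                    or None if the smear was not done.
    """
    if smear_positive is True:
        # A positive smear confirms malaria regardless of the rapid test.
        return "confirmed"
    if rapid_positive:
        # Rapid test positive, smear negative or not done.
        return "probable"
    if smear_positive is False:
        # Both tests negative.
        return "none"
    # Rapid test negative and smear missing: not defined in the text (assumption).
    return "unclassified"
```

Because the three categories are ordered (none, probable, confirmed), the resulting variable is suitable as the outcome of an ordinal regression model.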
Independent variables
The independent variables were grouped into four categories: individual-level variables (education in years, age in years, socio-economic status and health insurance), a household-related variable (number of rooms used for sleeping), ecological variables (cluster altitude in meters and region) and behavioural variables (access to a clean drinking-water facility and sleeping under bed nets).
Factor analysis (FA) was used to derive socio-economic status (SES) from indicator data in the 2010 Rwanda Demographic and Health Survey (RDHS). Factor analysis is a useful tool for investigating relationships among variables for complex concepts such as socio-economic status [24]. It allows researchers to investigate concepts that are not easily measured directly by collapsing a large number of variables into a few interpretable underlying factors. The process used to create the socio-economic status variable is shown in Supplementary S1 and S2.
Conceptual framework
In 2008, Reiter studied the effects of temperature on malaria transmission. His study proposed that temperature, rainfall and humidity cannot be considered in isolation from human behaviour. Additional factors that influence malaria infection directly and/or indirectly include household and individual-level variables (SES, age, education and health insurance), behavioural variables (clean drinking-water facility and sleeping under a bed net) and ecological variables (region (north, east, west and south) and cluster altitude in meters).
This study considers all the direct and indirect factors, including human behaviour, as suggested by Reiter in 2008 [25]. It considers not only human behaviour but also the related direct and indirect factors mentioned above, which have not been considered in previous studies.
In the conceptual model, variables are divided into three main categories (ecological, household and/or individual level, and behavioural) on the premise that these determinants operate through a standard set of proximate variables, direct or indirect, that influence malaria morbidity [26, 27]. Variables are endogenous (dependent), exogenous (explanatory) or both, and can be modelled using generalised structural equation models. Details of these models are provided in Figure 4 and Figure 5.
Statistical analysis
Power computation
No additional sampling was needed for this study; instead, a power computation was done using a Stata ado-file [28]. The computation used a sample prevalence of 2.1% (1.4% in children and 0.7% in adults), taken from the Rwanda Malaria Indicator Survey [3], an assumed population prevalence of 3% and an alpha level of 0.05. A design effect of 10 and an intracluster correlation coefficient (ICC) of 0.071 were used, with a total of 492 clusters and an average of 115 households per cluster. The resulting power was 89%.
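The link between the ICC, the cluster size and the design effect can be sketched with Kish's approximation, DEFF = 1 + (m - 1) × ICC, where m is the average cluster size. Whether the ado-file [28] uses this formula is not stated, so the sketch below is an assumption for illustration only, and the function names are hypothetical.

```python
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Kish's approximate design effect for equal-sized clusters."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


def effective_sample_size(n: float, deff: float) -> float:
    """Sample size of a simple random sample with equivalent precision."""
    return n / deff


# Values quoted in the text: 115 households per cluster, ICC = 0.071.
deff = design_effect(115, 0.071)  # about 9.09
```

Under this approximation the quoted cluster size and ICC give a design effect of about 9.1, which is of the same order as the design effect of 10 used in the power computation.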
Descriptive analysis and bivariate analysis
Descriptive analysis was conducted using the survey chi-square test (Rao-Scott adjustment), adjusted for the cluster effect. For categorical variables, weighted percentages (proportions) with the adjusted F-statistic are reported in Table 1. For continuous variables, means and confidence intervals are reported in Table 1.
Bivariate analysis was done using the survey chi-square test, adjusted for the cluster effect, to establish the relationship between two categorical variables (such as malaria morbidity and gender). For continuous explanatory variables such as age, survey ordinal univariate analysis adjusted for the cluster effect was done. For the bivariate analysis, odds ratios with p-values are reported in Table 2.
Multivariate analysis
Multivariate analysis was done using forward stepwise selection in survey ordinal logistic regression models adjusted for random effects, and using generalised structural equation modelling (G-SEM) to obtain total (direct and indirect) effects on malaria infection. Demographic and Health Survey (DHS) data follow a hierarchical structure: individuals are nested within clusters, and clusters are nested within regions. Respondents who live in the same cluster or region may not be independent of one another. Unlike regular individual-level regression analyses, which assume that all individuals are independent, the multilevel modelling approach accounts for the fact that people who live in the same area may have some characteristics in common. Individuals residing in the same household are likewise not independent. Both the cluster and the household effects were therefore included in this analysis, given that multiple individuals from the same household were commonly interviewed.
Another important feature of the random-effects model is that it gives information on the proportion of total variation that is explained by the cluster-level, individual-level and household-level predictors. The model was fitted with cluster number and household number as random effects to account for the correlation between individuals from the same cluster and household.
The models were also checked using ordinal regression diagnostics, namely goodness of fit and multicollinearity. Variables that showed multicollinearity were removed from the final model. The suest test was used to check goodness of fit and to assess whether adding a variable improved the model. It combines the estimates and (co)variance matrices of the coefficients of two models and gives an adjusted p-value.
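The share of variation attributable to each level can be sketched with the latent-variable formulation commonly used for logistic random-intercept models, in which the individual-level residual variance is fixed at π²/3. This is an illustrative assumption about how such proportions are typically computed, not a description of the study's output. The function name and the variance values are hypothetical.

```python
import math


def variance_partition(cluster_var: float, household_var: float) -> dict:
    """Proportion of latent-scale variance at each level of a
    random-intercept (ordinal) logistic model.

    cluster_var, household_var: estimated random-intercept variances.
    The individual-level residual variance is pi^2 / 3 for a logit link.
    """
    residual_var = math.pi ** 2 / 3
    total = cluster_var + household_var + residual_var
    return {
        "cluster": cluster_var / total,
        "household": household_var / total,
        "individual": residual_var / total,
    }


# Hypothetical variance estimates, for illustration only.
shares = variance_partition(cluster_var=0.5, household_var=0.3)
```

The three shares sum to one, and the "cluster" share corresponds to the intracluster correlation on the latent scale.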
Generalized structural equation modeling (G-SEM)
Generalized structural equation modeling (G-SEM) was used to model direct and indirect effects on the malaria outcome. The conceptual framework provides a graphical display of the relationships between the explanatory and dependent variables, and these relationships are quantified using G-SEM. G-SEM is a systematic way of evaluating hypotheses involving pathway analysis against multivariable data [29]. It is therefore used for testing and estimating causal relationships (direct and indirect effects) by combining statistical data with qualitative causal assumptions.
A direct pathway is defined as an effect of the exposure that is not mediated by a given set of potential mediators. The direct pathways are described in Supplementary S-3.
An indirect pathway is defined as an effect of the exposure that is mediated by a given set of potential mediators. The indirect pathways are described in Supplementary S-4.
The G-SEM was built on variables that were significant in the regression analysis, in order to quantify the relationships along the relevant pathways. A few variables that are important factors in malaria infection were also included in the G-SEM even if they were not significant in the regression model. From the final survey ordinal regression model adjusted for random effects, individual-level, household-related, ecological and behavioural variables were selected as underlying determinants for the G-SEM.
G-SEM has the advantage of being able to adjust for multiple factors while taking several outcomes into account. Furthermore, it gives a more refined estimate of positive and negative effects than multivariable analyses.
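As a sketch of how direct and indirect effects combine, assume a linear path model with a single mediator M between exposure X and outcome Y. The symbols a, b and c' are illustrative and are not taken from the study's models.

```latex
\begin{aligned}
M &= i_M + aX + \varepsilon_M \\
Y &= i_Y + c'X + bM + \varepsilon_Y \\
\text{direct effect} &= c' \\
\text{indirect effect} &= ab \\
\text{total effect} &= c' + ab
\end{aligned}
```

This additive decomposition is exact only for linear models. With an ordinal outcome and a logit link, as in G-SEM, it applies on the scale of the linear predictor and should be read as an approximation.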
The selection was based on the variables that were most significant in the final survey ordinal regression model. Model fit was assessed using the root mean square error of approximation (RMSEA) because of its sensitivity to the number of estimated model parameters and its ability to handle large samples [30]. Studies show that an RMSEA below 0.08 is evidence of good fit [30]; hence the RMSEA of 0.03 from our G-SEM model indicated a good fit. All statistical analyses were carried out using Stata® 13.1 (Copyright 1985–2013, StataCorp LP). ArcGIS was used for mapping the prevalence and variance of malaria infection in Rwanda.
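For reference, the RMSEA mentioned above is usually computed from the model chi-square statistic, its degrees of freedom and the sample size. The sketch below uses the common definition with N - 1 in the denominator; some software uses N instead. The function name and the input values are hypothetical, since the text does not report the chi-square or degrees of freedom of the fitted model.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation.

    chi2: model chi-square statistic.
    df:   model degrees of freedom (must be positive).
    n:    sample size (must exceed 1).
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    # Negative values of (chi2 - df) are truncated at zero.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Hypothetical inputs, for illustration only.
example = rmsea(chi2=110.0, df=10, n=1001)
```

A model whose chi-square does not exceed its degrees of freedom has an RMSEA of zero, and larger values indicate poorer fit per degree of freedom.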
Ethics approval
This study was granted ethical approval by the University of the Witwatersrand's Human Research Ethics Committee (Medical) (Clearance Certificate No. M151040). Approval to use the MIS data was obtained from the Measure DHS website. In the primary study, in which the data were collected, verbal informed consent for testing of children was obtained from the child's parent or guardian at the end of the household interview, and ethical clearance was obtained from the Rwandan authorities before the study started.