Predictive Risk Factors of Hypertension in Sub-Saharan Africa: A Fixed Effect Modelling Study in Burundi


Background: Hypertension, signalled by persistently high systolic and diastolic blood pressure, is a major threat to public health globally. In sub-Saharan African countries in particular, it coexists with a high burden of infectious diseases, creating a complex public health situation that is difficult to address. Tackling it will require targeted public health interventions based on evidence that clearly defines the at-risk population. In this study, using retrospective data from two referral hospitals in Burundi, we model the risk factors associated with hypertension in Burundi.
Methods: We used retrospective data on a sample of 353 patients randomly selected from a population of 4,380 patients admitted in 2019 to two referral hospitals in Burundi: the Military Hospital and the University Teaching Hospital of Kamenge. Predictive risk factors were identified by fixed effect logistic regression. Model performance was assessed by the area under the curve (AUC). The model was internally validated via bootstrapping with 2000 replications. All analyses were conducted in R.
Results: Overall, 16.7% of the patients were found to be hypertensive. After adjustment of the model for confounding covariates, the associated risk factors were advanced age, from 40 years (AOR: 6.03, 95% CI: 1.86-17.19) and above 60 years (AOR: 12.76, 95% CI: 3.30-14.26). Patients with comorbid chronic kidney failure were 4.95 times more likely (95% CI: 1.83-15.82) to be hypertensive, and among those with a family history of hypertension the adjusted odds were twice as high. Compared to non-smokers, smokers were 2.87 times more likely to develop hypertension (95% CI: 0.87-9.15). The highest probabilities were observed in patients who were at the same time smokers, overweight, with chronic kidney failure, with a family history of hypertension, and with secondary or university education as their highest educational level. The model had excellent predictive performance, with an AUC of 88.71% (95% CI: 84.17%-92.5%).
Conclusion: The relatively high prevalence of hypertension in Burundi and its associated risk factors are a cause for concern, especially in a context with an equally high burden of infectious diseases and of other chronic conditions, including chronic malnutrition. Targeting interventions based on these identified risk factors will allow judicious channelling of resources and effective public health planning.


Background
Hypertension corresponds to a permanently raised blood pressure in the arteries and arterioles. It is defined as a systolic blood pressure equal to or above 140 mmHg and/or a diastolic blood pressure equal to or above 90 mmHg. Hypertension is a threat to global public health because it strains the vessels and the heart and damages artery walls [1]. It is a major risk factor for cardiovascular diseases [2], with high morbidity and mortality rates [3]. If not identified and treated early, arterial hypertension may result in serious complications including stroke, coronary artery disease, kidney disease and hypertensive heart disease [4,5,6], which are among the leading causes of mortality in the world. Cardiovascular diseases accounted for approximately 17.8 million deaths in 2017 [7], nearly one third of all deaths [8], of which more than three quarters occurred in low- and middle-income countries (LMICs). Complications of hypertension account for 9.4 million (52.8%) of these deaths every year [7]. Hypertension is responsible for 45% of deaths due to heart disease and 51% of stroke-related deaths [8,9]. Premature death and health care expenditure due to hypertension take an economic toll on families and push many into poverty [10]. At the macro level, these high expenses and human losses significantly affect economic growth and reduce productivity [11,12].
In 2015 alone, the prevalence of hypertension in adults was 40%, with an estimated 1.13 billion people living with different forms of hypertension [11]. Data from the World Health Organisation Global Health Observatory Repository [12] show the highest prevalence of hypertension in the African region (46%), followed by the Americas (35%) and other regions; the majority of those affected were undiagnosed and untreated [13]. In sub-Saharan Africa (SSA), as in other settings, hypertension has been associated with lifestyle, diet, physical inactivity, urbanization and socio-economic status [14,15,16]. More than 125 million people are expected to have hypertension by the year 2025 in SSA alone [17,18]. By 2030, hypertension and other non-communicable diseases are projected to surpass communicable diseases as the leading cause of mortality on the continent [1,17]. From 2011 to 2025, the cumulative lost output due to non-communicable diseases is projected to be US$7.28 trillion in low- and middle-income countries, a loss of approximately US$500 billion per year [19]. Cardiovascular diseases, including hypertension, account for nearly half of this cost [20]. Despite this, SSA faces major problems with early screening, timely treatment and control of hypertension [21].
Burundi at present ranks 16th worldwide and 12th in sub-Saharan Africa for age-standardized hypertension and related mortality. Yet studies of the epidemiology of hypertension and its associated factors in the context of Burundi are lacking, which prompted this study [22]. We therefore determine the overall prevalence of hypertension, evaluate predictive risk factors, and predict the associated probabilities. Knowing these factors could support effective public health planning and help policy makers formulate sound policies to fight hypertension and its complications.

Sources of data and sampling methods
Data used in this study were collected in 2018 in different departments of two referral hospitals in Burundi: the University Teaching Hospital of Kamenge and Kamenge Military Hospital. The cross-sectional population of 4380 patients was stratified into 2 groups in both hospitals. Random sampling was done with allocation proportional to the number of admitted patients, by service and by hospital.

Inclusion Probability and Sample Size Calculation
The inclusion probability $p_i$ is the same for all patients admitted to the same service $i$ of the selected hospital. The minimal sample size $n$ was calculated from the quantile of the normal distribution at the 95% confidence level (1.96), the population size $N$, the prevalence of hypertension $p$ and the acceptable margin of error $y$ (5%). As the prevalence of hypertension is unknown, a value of $p = 0.5$ was selected. With these parameters, the minimal sample size is 353 patients. A respondent was selected if they met the following criteria: admitted to the internal medicine service in 2018, with diastolic and systolic blood pressure measured three times.
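The sample size equation itself did not survive in the text, so the following is only a sketch under a stated assumption: it uses the common finite-population formula for estimating a proportion, which happens to reproduce the reported 353 patients from the parameters given above. The function name `minimal_sample_size` is hypothetical, and the exact form the authors used may differ.

```python
def minimal_sample_size(population, prevalence=0.5, margin=0.05, z=1.96):
    """Finite-population sample size for estimating a proportion.

    Assumed form (the original equation is not recoverable from the text):
        n = z^2 p (1 - p) N / (y^2 (N - 1) + z^2 p (1 - p))
    """
    spread = z ** 2 * prevalence * (1 - prevalence)
    return spread * population / (margin ** 2 * (population - 1) + spread)


n_exact = minimal_sample_size(population=4380)
n_reported = round(n_exact)  # nearest whole patient

# Under proportional allocation every stratum is sampled at the same rate,
# so the inclusion probability equals n / N in each service (before rounding
# of the per-service sample sizes).
inclusion_probability = n_reported / 4380

print(f"{n_exact:.2f} {n_reported} {inclusion_probability:.4f}")
```

With N = 4380, p = 0.5, a 5% margin and the 1.96 quantile, the assumed formula gives a value that rounds to 353, and the implied inclusion probability is roughly 8%.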

Outcome and Independent Variables
In this study, quantitative and qualitative variables were used. Hypertension was the outcome variable and was defined as systolic blood pressure ≥ 140 mmHg and/or diastolic blood pressure ≥ 90 mmHg. Body mass index (BMI) was calculated as weight in kilograms divided by the square of height in meters and was categorized into underweight (BMI < 18.5 kg/m²), normal (BMI 18.5-24.9 kg/m²), overweight (BMI 25.0-29.9 kg/m²) and obesity (BMI ≥ 30 kg/m²).
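The outcome and BMI definitions above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the category boundaries are applied as half-open intervals (for example, any BMI below 25.0 that is at least 18.5 counts as normal), which is an assumption about how values between 24.9 and 25.0 are handled.

```python
def is_hypertensive(systolic_mmhg, diastolic_mmhg):
    """Outcome definition: systolic >= 140 mmHg and/or diastolic >= 90 mmHg."""
    return systolic_mmhg >= 140 or diastolic_mmhg >= 90


def body_mass_index(weight_kg, height_m):
    """Weight in kilograms divided by the square of height in meters."""
    return weight_kg / height_m ** 2


def bmi_category(bmi):
    """Map a BMI value (kg/m^2) to the four categories used in the study."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obesity"


# Hypothetical patient: 80 kg, 1.75 m, blood pressure 135/92 mmHg.
example_bmi = body_mass_index(80, 1.75)
print(round(example_bmi, 1), bmi_category(example_bmi), is_hypertensive(135, 92))
```

The hypothetical patient is classified as overweight and as hypertensive, because the diastolic reading alone meets the threshold.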

Data Analysis
Data analysis was undertaken in several steps: descriptive statistics, binary logistic modelling with fixed effects, evaluation of the predictive power of the final model, and prediction of probabilities. Risk factors associated with hypertension were identified first via univariate and then via multivariate logistic regression. We calculated the odds ratios (ORs) with 95% confidence intervals for each covariate to identify predictors of hypertension. The risk estimate equation for multiple logistic regression is as follows:
$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \sum_i \beta_i X_i + \varepsilon$$
where $p$ is the probability that the outcome occurs, $\beta_0$ the intercept, $\beta_i$ the coefficients, $X_i$ the independent variables and $\varepsilon$ the error. Variables significant at the 15% threshold were entered into the multivariate logistic model to determine their combined effect on the outcome. Finally, the predictor variables of the model were selected manually, step by step, by backward elimination at a 5% threshold. The likelihood ratio test, the score test and the Wald test were used to determine the significance of the independent variables for the outcome [23]. To select the best model for this study, the Akaike Information Criterion (AIC), which is based on model fit [24], was used. It is given by:
$$\mathrm{AIC} = 2k - 2\ln(L)$$
where $L$ is the maximum value of the likelihood function of the model and $k$ the number of model parameters. The best model is the one with the lowest AIC value. The adequacy of the final model for prediction was assessed by the Pearson residuals test. The receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used to compare and evaluate the performance and predictive power of the model. Furthermore, the ROC curve was used to determine the discriminatory performance of the model through its false positive and false negative rates. The Mann-Whitney statistic showed that the two score distributions were offset: normotensive people had, on average, higher scores than hypertensive people.
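As a rough illustration of the modelling quantities described above, the sketch below turns a linear predictor into a probability, converts a coefficient into an odds ratio, and compares two candidate models by AIC. All function names and the numerical inputs (the log-likelihoods and parameter counts in particular) are hypothetical; only the odds ratio of 2.87 for smoking is taken from the abstract.

```python
import math


def logistic_probability(intercept, coefficients, covariates):
    """Invert the logit: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


def odds_ratio(coefficient):
    """The odds ratio of a covariate is the exponential of its coefficient."""
    return math.exp(coefficient)


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood


# Coefficient implied by the adjusted odds ratio reported for smoking.
beta_smoking = math.log(2.87)
print(round(odds_ratio(beta_smoking), 2))

# Hypothetical comparison: the model with the lower AIC is preferred.
candidates = {"reduced": aic(-120.0, 9), "full": aic(-118.5, 12)}
print(min(candidates, key=candidates.get), candidates)
```

In this made-up comparison the reduced model is preferred, because its small loss in log-likelihood is outweighed by the penalty for the three extra parameters of the full model.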
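Each individual's score was ranked in ascending order, and the AUC, which measures how well the model separates hypertensive from normotensive patients, was calculated from these ranks, with the highest values interpreted as exceptional discrimination [25]. Influential points of the model were analysed with the Hoaglin and Welsch criterion:
$$h_{ii} > \frac{2p}{n}$$
where $h_{ii}$ is the leverage of observation $i$, $p$ the number of parameters of the model and $n$ the sample size.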
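The rank-based AUC mentioned above can be sketched through its link with the Mann-Whitney U statistic. This is a minimal illustration, not the authors' code: it assumes that higher scores indicate hypertension (label 1), that tied scores share their mean rank, and the function names and the four-observation example are hypothetical.

```python
def average_ranks(values):
    """Ascending ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def auc_from_ranks(scores, labels):
    """AUC = U / (n1 * n0), with U the Mann-Whitney statistic of the positives."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Toy data: two normotensive (0) and two hypertensive (1) patients.
print(auc_from_ranks([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

In the toy data, three of the four hypertensive-normotensive pairs are ordered correctly, so the AUC is 0.75.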
R software (version 3.5.0) with the foreign and forestmodel packages was used to produce the results of this study [26].

Results
More than 75% of the patients had high normal blood pressure according to the World Health Organisation's (WHO) classification (Table 1). Overall, the prevalence of hypertension is 16.71% (Table 1). This prevalence is 2 times higher in overweight patients and 3 times higher in diabetic patients. Proportions above the overall prevalence are observed in people with cardiovascular comorbidity (16.9%), married people (19.3%), people aged 40-60 years (21.4%), people aged 60 years and over (33.3%), people with chronic kidney failure (29.3%), men (17.7%), smokers (28.6%), obese people (39.1%), and people with secondary (18.2%) or university (29.4%) education.

Logistic Regression Modelling of Predictive Risk factors of Hypertension
Table 2 shows the univariate logistic regressions, in which hypertension is modelled by each explanatory variable. Table 3 shows the variables significantly associated with hypertension.    .001) rejects the null hypothesis and therefore confirms the alternative hypothesis that at least one coefficient is significantly different from zero. The Pearson residuals test (χ² = 266.17, df = 344) gave a p-value of 0.99, which shows that the model fitted the observations well. The McFadden statistic (R² = 0.35) also indicates that this model has a good fit.
The analysis of influential points based on the Hoaglin and Welsch criterion shows that only 9 points are influential. Cook's distance shows that 3 points (108, 114 and 199) are outliers, which means that influential points are few. The analysis of studentized residuals (Figure 1) shows that 97% (343/353) lie between -2 and 2. Ten observations (9, 34, 105, 108, 114, 199, 212, 232, 265 and 273) have residuals greater than 2 (Figure 2). No observation has a studentized residual below -2, indicating that the number of outliers is negligible.

Cross validation and Probabilities predictions
Figures 3 and 4 show, respectively, the ROC curve and the complexity parameters from the decision tree. A bootstrap method with 2000 replications gave an AUC of 88.71% (95% CI: 84.17%-92.5%), which suggests excellent discrimination. This implies that the saturated model has excellent predictive power and that the predicted probabilities accurately identify people with hypertension based on the identified characteristics. On the decision tree, the root node error is 59/353 = 0.167 and the substitution error is 0.1331.
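A bootstrap confidence interval for the AUC, as reported above, can be sketched as follows. The text does not say which bootstrap interval was used, so this example assumes a simple percentile interval; the data are synthetic, and the function names `auc` and `bootstrap_auc_ci` are hypothetical. Only the 2000 replications and the 95% level come from the study.

```python
import random


def auc(scores, labels):
    """Share of positive-negative pairs ranked correctly (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, replications=2000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample patients with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < replications:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if 0 < sum(resampled_labels) < n:  # both classes are needed for an AUC
            estimates.append(auc([scores[i] for i in idx], resampled_labels))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * replications)]
    upper = estimates[int((1 + level) / 2 * replications) - 1]
    return lower, upper


# Synthetic scores: 10 hypertensive and 50 normotensive patients.
demo = random.Random(1)
labels = [1] * 10 + [0] * 50
scores = [demo.gauss(1.5 if y else 0.0, 1.0) for y in labels]
point = auc(scores, labels)
lower, upper = bootstrap_auc_ci(scores, labels)
print(f"AUC {point:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Resampling whole patients keeps each score paired with its outcome, which is why the indices rather than the scores are drawn at random.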
Table 5 shows the predicted probabilities of becoming hypertensive under different scenarios, with either a single risk factor or a combination of risk factors. The first individual, who has no risk factors, was taken as the reference individual. Fifteen predictions were generated, from the reference individual to the individual with all hypertension risk factors. The reference individual has a probability of 0.002 of becoming hypertensive. This probability triples when this individual is a smoker. An individual aged between 40 and 60 years and an overweight individual have the same probability of becoming hypertensive (p=0.011). This probability is six times higher for an individual aged 60 years and over. For an individual aged between 40 and 60 who is at the same time a smoker and overweight, the probability of becoming hypertensive is 0.031. If this individual is 60 years old or over, the probability is 5.2 times higher (p=0.161). If an overweight individual born into a family of hypertensive people has chronic kidney disease, the probability of becoming hypertensive is around 50% (p=0.492).
That probability doubles when this individual is also a smoker. Individuals with all risk factors have a probability of at least 0.850. The highest probabilities are observed first in patients who are at the same time smokers, with secondary education, with chronic kidney failure and born into a hypertensive family, and second in patients with the same profile but with university as their highest educational level. Their probabilities are 0.972 and 0.999 respectively. Table 6 shows the levels of cardiovascular risk according to the combination of blood pressure stage and hypertension risk factors. Risk is divided into five groups: zero risk, low risk (<15%), moderate risk (15-20%), high risk (20-30%) and very high risk (>30%). In this study, only 15 patients had zero risk of cardiovascular disease. More than one third had low risk (<0.15), 25 had moderate risk (between 0.15 and 0.20), 126 had high risk (below 0.30) and 35 had very high risk (above 0.30). The higher the blood pressure, the higher the risk. Likewise, the more risk factors an individual accumulates, the higher the risk of developing cardiovascular disease. Finally, the combination of the two in the same person increases the probability of developing cardiovascular disease.
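To make the scenario probabilities above easier to follow, the sketch below shows how an odds ratio shifts a baseline probability in a main-effects logistic model. It is an illustration under stated assumptions: it uses the rounded reference probability of 0.002 and the adjusted odds ratio of 2.87 for smoking from the abstract, it assumes no interaction terms, and the function name `apply_odds_ratios` is hypothetical. It is not a reconstruction of Table 5.

```python
def apply_odds_ratios(baseline_probability, odds_ratios):
    """Multiply the baseline odds by each odds ratio, then convert back."""
    odds = baseline_probability / (1 - baseline_probability)
    for ratio in odds_ratios:
        odds *= ratio
    return odds / (1 + odds)


reference = 0.002                                # reference individual, no risk factors
smoker = apply_odds_ratios(reference, [2.87])    # same individual, but a smoker
print(round(smoker, 4), round(smoker / reference, 2))
```

Because the baseline probability is very small, odds and probabilities are nearly equal, so the smoker's probability comes out at a little under three times the reference value, in line with the tripling described in the text.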

Discussion
In this study, we determined the prevalence of hypertension, identified the principal predictive risk factors of hypertension and predicted the probabilities of becoming hypertensive given a combination of risk factors. Overall, the prevalence of hypertension was 16.7%. Considering women only, the prevalence was 17.7% (95% CI: 17.24%-23.31%). This prevalence is similar to the results of a recent study conducted in Lesotho, which showed a prevalence of 17.3% among women [37]. In another study, conducted in Saudi Arabia in 2014, 15.2% of those aged 15 years and above had different levels of hypertension [28]. This study did not show a significant difference in prevalence (χ² = 0.080, df = 1, p = 0.778) between women (16%) and men (17.7%) [28]. This finding is consistent with a study conducted in Benin, where there was no significant difference between men (32.8%) and women (33.0%) [29]. The highest prevalence of hypertension was observed in diabetic patients (41.2%) and the lowest in the youngest patients, aged under 40 years (2.2%) [29].
The literature on marital status and hypertension is inconclusive and mostly compares never-married to currently married persons [30]. In line with this, our study did not show an association between marital status and hypertension. This is contrary to what has previously been reported on this association [31]. After adjusting for other covariates in the logistic regression model, high educational level, smoking, advanced age, overweight, chronic kidney failure and family history of hypertension were the factors associated with hypertension. Similar findings were reported in several previous studies in developing countries, namely Malawi, Uganda, Northwest Ethiopia and Myanmar (Burma) in 2018, 2015, 2015 and 2016 respectively, which showed that the factors associated with the odds of hypertension were overweight, smoking, education level and older age [32,33,34,35]. The association between advanced age and a high risk of hypertension could be due to the biological effect of arterial resistance, which increases with age [36]. Our study did not find an association between hypertension and residence or marital status. Furthermore, alcohol was not associated with hypertension in this study, which is consistent with two studies conducted in European countries and in Benin [29,37].
This study shows that the predicted probability of becoming hypertensive is low in young people aged under 40 years. High probabilities are observed in the oldest people with many risk factors (Table 5). The highest probabilities, more than 60%, were observed in people aged 40 years and above with all other risk factors present, as shown by the 13th, 14th and 15th individuals, with probabilities of 85.0%, 97.2% and 99.9% respectively. Also, more than one third had low risk (<0.15), 25 had moderate risk (between 0.15 and 0.20), 126 had high risk (below 0.30) and 35 had very high risk (above 0.30), underscoring that the higher the blood pressure, the higher the cardiovascular risk. Similar probabilities were found in a recent Chinese study conducted in 2020, which showed a cumulative risk of hypertension of <20% for 57.62% of participants, 20-40% for 27.24%, 40-60% for 12.19% and >60% for 2.96% [38].
One strength of our study is that it examined hypertensive and normotensive people at the same time, combining descriptive and inferential statistics (logistic regression with fixed effects, Wald test, deviance test) and building the ROC curve and the complexity parameters using a decision tree. Another strength is that it estimated the area under the curve with a bootstrap confidence interval, analysed the model's residuals using the Welsch-Kuh distance, and predicted the probabilities of becoming hypertensive given a combination of risk factors. Despite these strengths, some limitations should be noted during interpretation and policy formulation. First, our study used secondary data and, as such, we were unable to measure the quantity and type of alcohol and tobacco consumed or to obtain information on physical activity, which have been found to be associated with hypertension. Second, caution should be taken when generalizing findings on high blood pressure, as the data came from only two hospitals. To validate the findings, additional studies should be conducted in other hospitals in the country and should take into account other characteristics, including more biomarkers. A random effect logistic regression, or a Bayesian regression based on Markov chain Monte Carlo simulation (Hamiltonian Monte Carlo and Langevin algorithms), could give more precise estimates of the model's parameters together with Bayesian credible intervals; these methods are therefore recommended for future research.
The main interest of this study is to identify predictive risk factors, which allows the probability of hypertension to be predicted and cardiovascular risk to be evaluated while controlling for possible confounders.

Conclusion
This study showed that the prevalence of hypertension was 16.7%, with no significant difference between men and women. Predictive risk factors of hypertension were advanced age (40-59 years, 60 years and over), smoking, presence of chronic kidney failure, cardiovascular comorbidity, educational level and overweight.
The lowest predicted probability is observed in young people with no risk factors. Predicted probabilities of becoming hypertensive above 85% are observed in people with all risk factors. Resources in Burundi are scarce; therefore, tackling the high burden of cardiovascular diseases should be based on instituting systems for early detection and prompt treatment, especially for those identified as high risk. At the community level, efforts should be channelled towards intensifying innovative and inclusive health promotion aimed at behaviour change. At the health system level, creating a risk-based nomogram based on these identified risk factors could allow those at high risk to be identified early and well targeted with the needed treatment. Finally, the provision of long-term care for identified cases will depend not just on consistent treatment but also on overall health system strengthening. This will ensure the sustainability and effectiveness of public health interventions aimed at tackling chronic diseases along with other high-burden infectious diseases.