Statistical Joint Modeling on Longitudinal Body Weight and CD4 Cell Progression with Survival Time-to-Death Predictors on HIV/AIDS Patients in Mekelle General Hospital, Ethiopia

Background: This study used statistical joint modeling to assess the impact of two repeatedly measured biomarkers on survival time-to-death, and to determine potential predictors of mortality among HIV/AIDS patients on ART in Mekelle General Hospital, Ethiopia. Methods: A retrospective cohort study was conducted among HIV/AIDS patients who were under ART follow-up from September 11, 2013 to September 5, 2016 at Mekelle General Hospital, Ethiopia. A total of 216 HIV/AIDS patients were selected from the ART follow-up records using a systematic random sampling technique. The two longitudinal biomarkers and the survival outcome were analyzed first with separate univariate longitudinal and survival models and then simultaneously with a statistical joint modeling approach. Results: The associations of the two biomarkers, CD4 cell count and body weight, with the risk of death were statistically significant and negative. Thus, death is less likely to occur in HIV/AIDS patients with higher CD4 cell count and body weight progression. In the event process sub-model, baseline CD4, fair and good adherence, HIV/TB (yes) and sex (male) were significant predictors of survival time-to-death. In the first longitudinal process sub-model, baseline CD4, ambulatory functional status, HIV/TB (yes), Time*Ambulatory functional status, Time*Working functional status and Time*Baseline CD4 were significant factors of √CD4 cell count progression. In the second longitudinal process sub-model, visit time of follow-up, age, sex (male), baseline weight, Time*Ambulatory and Time*Working functional status were significant factors of log10(body weight) progression. Conclusion: Both governmental and non-governmental stakeholders should pay special attention to HIV-positive adults, especially those who have developed HIV/TB co-infection, are male, are bedridden, have poor adherence or have a lower baseline CD4 cell count, so that mortality due to HIV/AIDS can be reduced.


Background
For more than a quarter of a century, the Human Immunodeficiency Virus (HIV) epidemic has been a menace to the world. Following the reports of the first cases in the early 1980s, the spread of HIV/AIDS increased at an alarming rate. In spite of global awareness, more than 70 million people have been infected and 35 million people have died due to HIV infection [1]. Nearly 36.7 million people were living with HIV [1,2,3], and more than 1 million people died globally in 2016 [2]. While about 0.8% of adults aged 15 to 49 worldwide are affected, the infection in Sub-Saharan Africa continues to be severe (4.2%), and the region accounts for about two-thirds of the people living with HIV in the world.
Although the infection varies considerably across regions, HIV/AIDS is concentrated in Sub-Saharan Africa, which accounts for about 60% of the world's HIV/AIDS cases. Of the 800,000 new HIV cases reported in Eastern and Sub-Saharan Africa, 12.9% of the infected people were assessed for ART, and around 380,000 people died of HIV/AIDS in 2017 [4].
Ethiopia is one of the seven countries hit hardest by the HIV/AIDS pandemic in Sub-Saharan Africa [5]. With approximately one million people living with HIV, the disease has become the leading cause of mortality among those aged 15-49 years, accounting for 43% of all deaths in 2016, and approximately 114,690 people died of AIDS-related conditions [6].
Anti-retroviral treatment (ART) was formally started in Ethiopia in 2003 on a cost-sharing basis. The country then initiated free ART in 2005 with the support of the Global Fund to Fight AIDS, Tuberculosis and Malaria (GFATM) and the U.S. President's Emergency Plan for AIDS Relief (PEPFAR). In the study area, Mekelle General Hospital has provided ART services and follow-up since September 2011 and has so far served more than 10,000 HIV/AIDS patients [7]. However, the effectiveness of the ART initiatives has not yet been fully investigated, in particular how far they prolong the survival of HIV patients in the study area and in the country at large. We therefore focused on examining the existing status, determining the potential predictors that could affect the survival of HIV patients, and suggesting realistic interventions to optimize patients' quality of life. The CD4 cell count of a healthy individual is at least 500 cells/mm3; a count below this level indicates an unhealthy condition, and when the CD4 cell count falls below 200 cells/mm3 the person meets the criterion for a diagnosis of AIDS [8,9].
CD4 counts decrease over time in persons who are not receiving ART, and the reverse holds for those who are. Moreover, CD4 cell counts are affected by various demographic factors [10].
This study focuses on patients' body weight and CD4 cell progression, for which both survival and longitudinal data are usually considered. Survival analysis involves the modeling of time-to-event data; in this context, death or failure is considered "an event". Separate modeling of each outcome is possible, but joint modeling is usually recommended for such studies, and here it is compared with the separate models. Joint modeling is appropriate when one wants to predict the time to an event with covariates that are measured longitudinally and are related to the event. An underlying random effects structure links the survival and longitudinal sub-models and allows for individual-specific predictions.
Thus, the main purpose of the current study was to examine current patient status relative to baseline and to determine potential predictors of shortened survival time-to-death, through joint modeling of longitudinal body weight and CD4 cell count as biomarkers of disease progression, among HIV/AIDS patients on ART follow-up in Mekelle General Hospital.
The paper is organized as follows. Section 2 describes the materials and methods. The basic findings of the study are presented and discussed in Sections 3 and 4. Finally, concluding remarks are provided in Section 5.

Study Area and period
This study was conducted at Mekelle General Hospital, Tigray Region, Northern Ethiopia, on HIV/AIDS-positive patients who initiated ART follow-up between September 11, 2013 and September 5, 2016.

Study design and Data Source
Data were obtained through a retrospective cohort study design, and joint longitudinal and survival modeling was used to determine potential predictors.

Study Population
The study population comprised all HIV/AIDS-positive patients aged 15 years and above who were on ART follow-up from September 11, 2013 to September 5, 2016 in Mekelle General Hospital and who fulfilled the inclusion criteria.

Sample size determination and sampling procedure
A systematic random sampling procedure was used to select sample subjects from the list of the study population. The required sample size was then obtained using a sample size determination formula for survival analysis [11], so that the study could detect statistically significant effects.
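The systematic random sampling step described above can be sketched as follows. This is a minimal illustration rather than the authors' actual procedure: the function name `systematic_sample`, the patient ID labels, the frame size of 1,080 records and the seed are all hypothetical; only the sample size of 216 comes from the study.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select n units from an ordered sampling frame by systematic random sampling.

    A random start is drawn inside the first sampling interval k = len(frame) / n,
    and one unit is then taken from every subsequent interval of length k.
    """
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the frame size")
    rng = random.Random(seed)
    k = len(frame) / n                  # sampling interval (may be fractional)
    start = rng.uniform(0, k)           # random start within the first interval
    last = len(frame) - 1
    # min() guards against a floating-point edge case at the end of the frame
    return [frame[min(int(start + i * k), last)] for i in range(n)]


# Hypothetical frame of 1,080 patient record IDs (the frame size is an assumption).
frame = [f"P{idx:04d}" for idx in range(1, 1081)]
sample = systematic_sample(frame, n=216, seed=2016)
```

With a frame five times the sample size, the sketch picks roughly every fifth record after the random start, which is what makes the design simple to apply to a register of follow-up charts.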

Data Collection Procedure
The data were collected in Mekelle General Hospital using a standardized structured questionnaire. The relevant data were extracted from the ART follow-up charts of HIV/AIDS patients by two professional data collectors under one supervisor.

Quality of data
The quality of the data was controlled by data controllers from the ART section of the hospital. The controllers had received intensive training from the Ministry of Health for different services. The data extraction tools and the variables included in the study were tested, and necessary amendments were made to the final data collection sheet.

Study variables
The response and predictor variables considered in this study are defined as follows.

Response Variables
The longitudinal response variables were the progression of body weight (in kg) and CD4 cell count (in cells/mm3) of HIV/AIDS-positive patients during follow-up from the date of ART initiation. The survival response variable was the survival time-to-death or censoring (in months) of HIV/AIDS-positive patients during follow-up from the date of ART initiation. It was measured from the start of treatment until the patient's death or, for censored patients, the last visit (i.e., right censoring was present in our study).

Independent Variables
The covariates (predictors) in the current study were those considered likely to affect body weight and CD4 cell count progression and, in turn, to hasten the death of HIV/AIDS patients. The study variables were selected based on the authors' experience and the related empirical literature, for example [6,13,14,15].

Statistical Models of data analysis
In this study, a survival model was used to investigate the factors that affect survival time from the start of treatment to death, and univariate longitudinal models were used to identify the factors that affect the longitudinal change of CD4 cell count and of body weight separately. In addition, statistical joint longitudinal and survival analysis was used to assess the impact of the longitudinal change of body weight and CD4 cell count on the survival time-to-death of the HIV/AIDS patients. The data were analyzed using the statistical software R, version 3.5.1.
In this study, the following three types of statistical models were applied:

Linear Mixed Model
Longitudinal data arise when multiple observations are made on the same subject over time. Measurements of the same variable on the same subject are likely to be correlated (i.e., the measurements of body weight and CD4 cell count for a given HIV/AIDS patient tend to be correlated with each other over time). Another important outcome commonly measured in a longitudinal study is the time until a key clinical event of interest occurs, such as disease recurrence or death (time-to-event data). Longitudinal studies are used to characterize normal growth, to assess the effect of risk factors on human health and to evaluate the effectiveness of treatments [16].
Longitudinal response data may arise in two common situations:
(1) when the measurements are taken on the same subject at different times (for example, when multiple measurements of CD4 are made on the same subject over time) and (2) when the measurements are taken on related subjects. In both cases, the outcome variables (responses) are likely to be correlated [23].
Two sources of variation are considered in longitudinal data analysis: within-subject (intra-individual) variation, which arises among the measurements within each subject, and between-subject (inter-individual) variation, which arises among the measurements on different subjects. Modeling within-subject variation is used to study changes over time, while modeling between-subject variation is used to understand differences between subjects [18]. A linear mixed model (LMM) is a parametric linear model for repeatedly measured data that quantifies the relationships between a continuous dependent variable and various predictor variables when the response follows a normal distribution, and that includes random effects to account for the correlation among repeated measurements. An LMM may include both fixed-effect parameters associated with one or more continuous or categorical covariates and random effects associated with one or more random factors; the mix of fixed and random effects gives the LMM its name. Whereas fixed-effect parameters describe the relationships of the covariates to the dependent variable for an entire population, random effects are specific to subjects within a population [16].
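The Linear Mixed Model is defined as:

$$y_i = X_i\beta + Z_i b_i + \varepsilon_i, \qquad b_i \sim N(0, D), \qquad \varepsilon_i \sim N(0, \Sigma_i),$$

with $b_1, \ldots, b_N, \varepsilon_1, \ldots, \varepsilon_N$ independent, where $y_i$ is the $n_i \times 1$ response (longitudinal outcome) vector for the observations on the $i$th subject, $X_i$ and $Z_i$ are the design matrices of the fixed effects $\beta$ and the random effects $b_i$, and $\varepsilon_i$, distributed as $N(0, \Sigma_i)$, is a vector of residual components combining measurement error and serial correlation. The random-effects parameters $b_i$ are distributed as $N(0, D)$, independently of each other and of the within-subject residuals $\varepsilon_i$; that is, $\mathrm{Cov}(b_i, \varepsilon_i) = 0$. Furthermore, $\Sigma_i = \delta^2 I_{n_i}$ is the $n_i \times n_i$ positive-definite variance-covariance matrix for the errors in subject $i$, where $I_{n_i}$ denotes the $n_i \times n_i$ identity matrix. Marginally, the vector $y_i$ is normally distributed with mean $X_i\beta$ and variance-covariance matrix $V_i = Z_i D Z_i^T + \delta^2 I_{n_i}$. Here $D$ is a $k \times k$ positive-definite covariance matrix for the random effects. Conditional on $b_i$, $y_i$ is normally distributed with mean $X_i\beta + Z_i b_i$ and variance-covariance matrix $\Sigma_i$. It can also be rewritten as

$$y_i = \underbrace{X_i\beta}_{\text{fixed}} + \underbrace{Z_i b_i}_{\text{random}} + \varepsilon_i .$$

That is, given the random effects $b_i$, the dependent variable $y_i$ is normally distributed with variance-covariance structure $\Sigma_i$ [20].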
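To make the marginal covariance concrete, the sketch below computes V = Z D Z^T + δ² I for a single subject with a random intercept and a random slope on time. All numerical values (visit times, the random-effects covariance D and the residual variance) are hypothetical and chosen only for illustration; they are not estimates from this study, and the helper names are not taken from any package.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def marginal_cov(z, d, resid_var):
    """Return V = Z D Z^T + resid_var * I for one subject."""
    zdzt = matmul(matmul(z, d), transpose(z))
    n = len(z)
    return [[zdzt[r][c] + (resid_var if r == c else 0.0) for c in range(n)]
            for r in range(n)]


# Hypothetical subject measured at months 0, 6 and 12.
times = [0.0, 6.0, 12.0]
Z = [[1.0, t] for t in times]        # columns: random intercept, random slope
D = [[4.0, 0.1], [0.1, 0.05]]        # assumed covariance of the random effects
V = marginal_cov(Z, D, resid_var=1.0)
```

The off-diagonal entries of `V` are non-zero even though the residual errors are independent, which is how the random effects induce correlation among repeated measurements on the same patient.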

Statistical Joint Modeling of Longitudinal and Survival Data Analysis
In many longitudinal studies, the outcomes recorded on each subject include both a sequence of repeated measurements at pre-specified times and the time at which an event of particular interest occurs, e.g., death or dropout from the study. The event time for each subject may be recorded exactly, interval censored or right censored. The term statistical joint modelling refers to the statistical analysis of the resulting data while taking account of any association between the repeated measurements and the survival time-to-event outcome [17]. In joint longitudinal-survival models of this kind, the association between the two endpoints is induced by shared random effects.
This study mainly focuses on a joint model in which the longitudinal and survival processes are assumed to be conditionally independent given unobserved random effects. That is, the key assumption of the joint model is that the random effects underlie both the longitudinal and survival processes. These random effects therefore account for both the association between the longitudinal and event outcomes and the correlation between the repeated measurements in the longitudinal process (conditional independence assumption). This type of joint model is also called a shared parameter model, as both processes share these random effects [22].
Therefore, in this study, the authors adopted some of the methodology, notation and equations of [21] and employed statistical joint models that belong to the shared random effects parameter framework, as both sub-models share the same random effects. Let $T_i$ represent the observed failure time for the $i$th individual, at which either censoring or the event has occurred. Without loss of generality, our aim is to associate the true and unobserved value of the longitudinal outcome at time $t$, denoted by $m_i(t)$, with the event outcome $T_i$. The longitudinal and survival components of the joint model are typically linked (joined) through the trajectory function:

$$h_i(t \mid \mathcal{M}_i(t), w_i) = h_0(t)\exp\{\phi^T w_i + \gamma_1 m_{i1}(t) + \gamma_2 m_{i2}(t)\},$$

where $\mathcal{M}_i(t) = \{m_{ik}(s),\ 0 \le s < t\}$ represents the history of the true (unobserved) longitudinal response, $h_0(t)$ is the baseline hazard, $w_i$ represents the vector of baseline covariates with corresponding parameter estimates $\phi$, and $\gamma_1$ and $\gamma_2$ measure (quantify) the effect of the longitudinal outcomes on the risk of an event (i.e., in our case, the effect of CD4 cell count and body weight on the risk of death, $k = 1, 2$). Hence, with this formulation, the risk of an event at time $t$ depends on the true value of the longitudinal endpoint at that time.

Statistical Estimation and Inference
The parameter estimates in this study were obtained by maximum likelihood (ML), a very general approach to statistical estimation that has been widely used to handle many difficult estimation problems.

Linear Mixed Effects Estimation
In general terms, likelihood-based estimation, either ML or restricted maximum likelihood (REML), was used to obtain estimates of the covariance parameters in the LMM, noting that REML is usually preferable to ML because it reduces the well-known finite-sample bias in the estimation of the covariance structure [19].
The difference between ML and REML lies in the construction of the likelihood function. The two methods are asymptotically equivalent and often give very similar results; the difference becomes important only when the number of fixed effects is relatively large. Because the outcomes of the $i$th subject share the same random effects, they are marginally correlated, so we assume that $p(y_i \mid b_i; \theta) = \prod_{j=1}^{n_i} p(y_{ij} \mid b_i; \theta)$. That is, the longitudinal responses of a subject are independent conditionally on its random effects. As the random effects have expected values of zero and therefore do not affect the mean, the marginal distribution of $y_i$ has mean vector $X_i\beta$ and covariance matrix $V_i$. Taking into account that we assume independence across subjects, the likelihood function is simply the product of the density functions for each subject.
The log-likelihood of a linear mixed model is given by:

$$\ell(\theta) = \sum_{i=1}^{N} \log p(y_i; \theta) = -\frac{1}{2}\sum_{i=1}^{N}\left\{ n_i \log(2\pi) + \log|V_i| + (y_i - X_i\beta)^T V_i^{-1} (y_i - X_i\beta) \right\}.$$

Given $V_i$, the estimates of the fixed-effects parameters are obtained by maximizing the log-likelihood function, conditionally on the parameters in $V_i$, and have a closed-form solution:

$$\hat{\beta} = \left(\sum_{i=1}^{N} X_i^T V_i^{-1} X_i\right)^{-1} \sum_{i=1}^{N} X_i^T V_i^{-1} y_i .$$

Finally, ML and REML share the merit of being based on the likelihood principle, which leads to useful properties such as consistency, asymptotic normality and efficiency. However, REML produces less biased estimators in many special cases [18].
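The finite-sample bias that REML corrects can be seen in the simplest special case: a model with only an intercept and independent errors, where the ML variance estimate divides the residual sum of squares by n and the REML estimate divides by n - 1. The sketch below uses hypothetical body weights and a hypothetical function name; it illustrates the principle only and is not the mixed-model fit used in this study.

```python
def variance_estimates(y):
    """Return (ML, REML) estimates of the error variance for an intercept-only model.

    ML divides the residual sum of squares by n; REML divides by n - 1,
    accounting for the one fixed effect (the mean) that was estimated.
    """
    n = len(y)
    mean = sum(y) / n
    rss = sum((value - mean) ** 2 for value in y)
    return rss / n, rss / (n - 1)


# Hypothetical baseline body weights in kg (illustrative values only).
weights = [48.0, 50.5, 47.2, 52.1, 49.3]
ml_var, reml_var = variance_estimates(weights)
```

The REML estimate is larger than the ML estimate by the factor n / (n - 1), so the two converge as the sample grows, in line with the asymptotic equivalence noted above.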

Model Selection Criteria
Model selection is the process of selecting a statistical model from a set of candidate models for the given HIV/AIDS patient data. Of the different criteria that can be used to select an appropriate statistical joint model (JM), LMM or multivariate LMM, the two most commonly used, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), were considered in this study to select the model that best predicts the survival status of HIV/AIDS patients on ART follow-up. AIC and BIC are likelihood-based measures penalized for the complexity of the model [3,13]:

$$\mathrm{AIC} = -2\log L + 2p, \qquad \mathrm{BIC} = -2\log L + p\log N,$$

where $-2\log L$ is twice the negative log-likelihood value for the model, $p$ is the total number of estimated parameters in the model and $N$ is the total number of observations used to fit the model. Smaller values of AIC and BIC reflect an overall better-fitting model.
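The two criteria can be computed directly from a fitted model's log-likelihood, as in the following sketch. The candidate model names, log-likelihood values, parameter counts and number of observations are hypothetical placeholders, not results from this study.

```python
import math


def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 log L + 2p."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: -2 log L + p log N."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


# Hypothetical candidates: name -> (log-likelihood, number of parameters).
candidates = {"model_A": (-1510.0, 18), "model_B": (-1495.0, 20)}
n_obs = 1200  # assumed total number of observations

scores = {name: (aic(ll, p), bic(ll, p, n_obs))
          for name, (ll, p) in candidates.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

Because BIC's penalty grows with log N, it favors smaller models more strongly than AIC once N exceeds about 7 observations; the two criteria can therefore disagree on larger candidate models.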

Results
Descriptive statistics of the baseline covariates are given in Table 1. Among the 216 HIV/AIDS-positive patients considered in the current study, 31 (14.4%) died, while the remaining 185 (85.6%) were censored: they may still be alive, may have died of other causes (competing events) or may have been lost to follow-up. The mean age, hemoglobin level and body weight of the HIV/AIDS patients at the start of ART were 34.8 years, 13.6 g/100 ml and 49.2 kg, respectively. The average baseline CD4 cell count was 311.04 cells per mm3 with a standard deviation of 161 cells per mm3 of blood, implying that patients were at higher risk of HIV/AIDS-related illness. Of the 216 HIV/AIDS patients, 134 (62%) were female and 130 (60%) lived in urban areas. Similarly, 23 (10.6%) of the sampled patients were HIV/TB co-infected. Table 1 also shows that the percentage of deaths among males (9.7%) was higher than among females (4.6%). Moreover, the percentage of deaths among HIV/TB co-infected patients (43.48%) was higher than among patients who did not have TB (10.88%). Finally, the percentage of deaths among patients who lived in rural areas (9.3%) was higher than among those who lived in urban areas (5%).
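As a quick arithmetic check, the reported percentages can be converted back to death counts. The sketch below rests on an assumption that is not stated in the text: the sex and residence percentages are taken relative to all 216 patients, whereas the HIV/TB percentages are taken within each co-infection group (23 co-infected, 193 not). Under that assumption each breakdown sums to the 31 reported deaths.

```python
TOTAL = 216          # patients in the sample
DEATHS = 31          # reported number of deaths
TB_YES = 23          # HIV/TB co-infected patients
TB_NO = TOTAL - TB_YES


def count(pct, denom):
    """Convert a reported percentage back to a whole number of patients."""
    return round(pct / 100 * denom)


# Assumed denominators: whole sample for sex and residence, group size for TB.
by_sex = count(9.7, TOTAL) + count(4.6, TOTAL)           # males + females
by_tb = count(43.48, TB_YES) + count(10.88, TB_NO)       # co-infected + not
by_residence = count(9.3, TOTAL) + count(5.0, TOTAL)     # rural + urban
```

If the assumption holds, the sex and residence percentages are shares of the whole sample rather than within-group death rates, so they are not directly comparable with the HIV/TB percentages.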

Statistical joint model analysis

Separate and Joint Model Analysis on Longitudinal √CD4 Progression and Survival Time-to-Death
To determine the variables to be included in the statistical joint model, an automatic backward variable selection method was used (stepAIC in R). The survival sub-model was consistent with the results from the separate survival analysis. The differences in the magnitudes of the parameter estimates were minor, although some parameters differed in statistical significance. The results from the separate and joint analyses were broadly similar. In the survival sub-model, fair adherence, good adherence, baseline CD4, HIV/TB (yes), baseline weight and sex (male) were statistically significant predictors of the risk of death. However, the categories of functional status and education level were not statistically significant. In addition, the risk of shortened survival time-to-death was significantly lower for HIV/AIDS patients with higher baseline CD4 and baseline weight. While the estimated parameters of the two models were similar but not identical, the estimate of the association parameter in the joint model was significantly different from zero. This association indicates that higher √CD4 cell progression was associated with a lower risk of death.
The residual variability was smaller in the joint analysis (5.9731) than in the corresponding linear mixed effects analysis (6.0025), probably because the standard errors were adjusted for the correlation between the longitudinal and survival responses. Finally, when the overall performance of the separate and joint statistical models was assessed in terms of parsimony and goodness of fit, the joint model performed better, as it had a smaller total AIC than the separate models. The statistical significance of the association parameters was further evidence that the statistical joint model analysis was better than the separate models [24].

Separate and Joint Model Analysis on Log10 (Body-Weight) progression and Survival Time-to-Death
In this sub-section, the survival sub-model was consistent with the results from the separate survival analysis (Table 3). The differences in the magnitudes of the parameter estimates were minor, although some parameters differed in statistical significance. The results from the separate and joint model analyses were broadly similar. This survival sub-model is given below. Fair adherence, good adherence, baseline CD4, baseline weight, ambulatory functional status, working functional status, primary education level, secondary education level, tertiary education level and sex (male) were statistically significant predictors of the risk of shortened survival time-to-death.

As shown in Table 3, in the longitudinal sub-model of the joint model, age, baseline CD4, baseline weight, ambulatory functional status, working functional status, Time*Baseline CD4 and sex (male) were all significant predictors of log10(body weight) progression. The estimated parameters of the two models (separate and joint) were similar but not identical. However, the estimates of the association parameters in the statistical joint model analysis were significantly different from zero, which provides evidence of association between the two sub-models. The estimate of association (Gamma_2 = -1.975) indicates that higher log10(body weight) progression is associated with a lower risk of death. When the overall performance of the separate and joint models was evaluated in terms of parsimony and goodness of fit, the joint model performed better, as it had a smaller total AIC than the separate models. The statistical significance of the association parameters was further strong evidence that the joint model analysis was better than the separate models [24].
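To give the association estimate a concrete reading: under a proportional hazards sub-model, an association coefficient acts on the log-hazard per unit of the current longitudinal value, so the hazard ratio for a change Δ in that value is exp(γΔ). The sketch below applies this to the reported Gamma_2 = -1.975. The choice of a 0.1 increase in log10(body weight), which corresponds to a body weight about 26% higher, is an illustrative assumption, and the helper name is hypothetical.

```python
import math

GAMMA_2 = -1.975  # association estimate for log10(body weight) reported in the text


def hazard_ratio(assoc, delta):
    """Hazard ratio implied by a change `delta` in the current longitudinal value."""
    return math.exp(assoc * delta)


delta_log10_weight = 0.1                   # assumed change, for illustration only
hr = hazard_ratio(GAMMA_2, delta_log10_weight)
weight_ratio = 10 ** delta_log10_weight    # same change on the original kg scale
```

Under these assumptions the hazard ratio is about 0.82, that is, roughly an 18% lower hazard of death for a patient whose current body weight is about 26% higher, holding the other covariates fixed.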

Statistical Joint Model Analysis on √CD4 & Log10 (Body-Weight) progression with Survival Time-to-Death
Building on the preceding sub-sections, the researchers sought the factors that affect the two longitudinal biomarkers and survival time-to-death when modeled jointly. Table 4 shows that the first longitudinal process sub-model was for √CD4 count and the second was for log10(body weight). Based on this table, the associations of √CD4 count and of log10(body weight) with the survival time-to-death of HIV/AIDS patients were statistically significant when the two repeatedly measured longitudinal biomarkers and survival time-to-death were fitted simultaneously. The association parameters were negative for both biomarkers (i.e., indicating an inverse relationship), and this result is similar to an earlier study [25].
The risk of death was higher for patients who had developed HIV/TB co-infection than for those who had not, controlling for the other independent variables. These results conform to previous studies [6,12] and show that TB is a major known risk factor for mortality in HIV/AIDS patients. There was also a significant sex differential in the risk of mortality: males were more affected than females, controlling for the other predictor variables. This finding contradicts a previous study [14]. The risk of death among HIV/AIDS patients with fair or good adherence was lower than among those with poor adherence, controlling for the other predictor variables. This suggests that patients with fair or good adherence have better survival, a better understanding of the disease condition and better comprehension of the instructions given on drug usage than patients with poor adherence, an observation that agrees with an earlier study [26]. Moreover, lower baseline CD4 was associated with a higher risk of death in this retrospective cohort; that is, a patient's baseline CD4+ count significantly affects his or her survival time. This agrees with earlier studies [6,12]. The progression of √CD4 count was lower for patients who had developed HIV/TB than for those who had not, controlling for the other independent variables, in agreement with the findings of [12]. Regarding the functional status of patients on ART, the progression of √CD4 count was greater for ambulatory patients than for bedridden patients, controlling for the other independent variables. This shows that bedridden patients have a higher risk of death than ambulatory patients, a result in agreement with earlier findings [26].
Baseline CD4 cell count was positively associated with the progression of √CD4 count during ART; that is, a high baseline CD4 count contributed to an increase in √CD4 count progression. This result is similar to an earlier study that reported positive associations between these characteristics [27].
The interaction of follow-up visit time with functional status had a statistically significant effect on the mean √CD4 count. This suggests that, as the number of visits increased, the average √CD4 count of HIV/AIDS patients with ambulatory and working functional status was higher than that of bedridden patients by factors of 2 (p-value = 0.0009) and 2.2 (p-value = 0.0003), respectively, holding the other variables constant. The length of follow-up on ART was an important predictor of improvement in log body weight progression (Beta = 0.0305; p-value = 0.0039), a result also found in [13]. The progression of log(body weight) was positively associated with baseline body weight, indicating that higher body weight at baseline was associated with higher progression of log(body weight). This confirms the result reported previously [14]. Similarly, in the longitudinal sub-model of the joint model for log(body weight), ambulatory and working functional status were significant predictors of body weight progression; that is, ambulatory and actively working patients had better log body weight than bedridden patients, as reported earlier [13]. Finally, higher age predicted improvement in log body weight progression, which is similar to the earlier study [14].
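Because the second longitudinal outcome is modeled on a log scale, a slope on visit time translates into a multiplicative change in body weight. The sketch below back-transforms the reported coefficient (Beta = 0.0305) under two assumptions that are not restated in this passage: the coefficient is on the log10 scale, and it applies per unit of visit time as coded in the model, for the reference levels of any covariates that interact with time. The function name is hypothetical.

```python
BETA_TIME = 0.0305  # reported coefficient of follow-up time on log body weight


def percent_change_per_unit(beta_log10):
    """Percentage change in the original outcome per unit of the covariate,
    for a coefficient estimated on the log10 scale."""
    return (10 ** beta_log10 - 1) * 100


pct = percent_change_per_unit(BETA_TIME)
```

Under these assumptions the coefficient corresponds to an increase in body weight of a little over 7% per unit of follow-up time; if the model had instead used natural logarithms, the back-transformation would use exp() and give a smaller figure.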

Conclusion
From the relationship between the progression of the two biomarkers, CD4 cell count and body weight, and survival time-to-death, the authors conclude that the associations were negative and statistically significant, and that the statistical joint model performed better than the separate models. The statistical joint model should therefore be preferred over separate models for the simultaneous analysis of repeated biomarker measurements and survival time-to-death.
The association parameters were negative for both biomarkers in the statistical joint model, indicating that higher log10(body weight) and √CD4 count progression are significantly associated with a lower risk of death. In other words, as √CD4 count and log10(body weight) increased, the mortality rate decreased. We recommend that special attention be given to HIV-positive adults, especially those who have developed HIV/TB co-infection, are male, are bedridden, have poor adherence or have a lower baseline CD4 cell count. To address this problem, continuous and timely medical care is important to minimize the risk of death, and we suggest that health experts measure these biomarkers repeatedly, because they are important indicators for planning measures to minimize morbidity and mortality due to HIV/AIDS. Moreover, we suggest that future studies of this nature include other important covariates, such as viral load, opportunistic infections, economic status, religion, smoking status, alcohol drinking and other risk factors that may play a major role.

Declarations

Ethics approval and consent to participate
Ethical clearance was obtained from the Ethical Review Committee of University of Gondar College of Natural and Computational Science. The names of the subjects were not extracted, to ensure the privacy of HIV/AIDS patients, and confidentiality was maintained throughout the data collection process and analysis. Permission to collect the data was obtained from the administrative officers of Mekelle General Hospital.

Consent for publication
Not Applicable.

Availability of data and material
The authors used HIV/AIDS datasets drawn from patient history cards at Mekelle General Hospital; these are attached as supplementary materials in the submission system.