Development and validation of a prognostic nomogram for predicting the survival of HIV/AIDS adults who received antiviral treatment: a cohort between 2003 and 2019 in Nanjing


Background

Although great achievements have been made since free antiretroviral treatment (ART) became available, timely and accurate prediction of survival for people living with HIV (PLHIV) is still needed for effective management. We aimed to establish an effective prognostic model to forecast the survival probability of PLHIV after ART.
Methods

Participants were drawn from a follow-up cohort enrolled between 2003 and 2019 in Nanjing and recorded in the Nanjing AIDS Prevention and Control Information System. A nested case-control design was used with HIV-related death as the outcome, and propensity-score matching (PSM) was applied at a ratio of 1:4 to select the patients. Univariate and multivariate Cox proportional hazards analyses of the training set were used to determine the risk factors. Discrimination was quantified using the area under the curve (AUC) and concordance index (C-Index). Calibration was evaluated using the calibration curve. The clinical benefit of the prognostic nomogram was assessed by decision curve analysis (DCA).
Results

Predictive factors including CD4 cell count (CD4), body mass index (BMI) and fasting blood glucose (GLU) were identified and included in the nomogram. In the training set, the AUC and C-Index (95% CI) were 0.826 and 0.793 (0.740, 0.846), respectively. In the validation set, the model still showed good discrimination, with an AUC and a C-Index (95% CI) of 0.750 and 0.776 (0.711, 0.839). The calibration curve also showed high consistency in predicting the survival of PLHIV (especially in the first three years after starting ART). Moreover, DCA demonstrated that the nomogram was clinically beneficial.
Conclusion

The nomogram is effective and accurate in forecasting the survival rate of PLHIV, and is therefore readily usable by medical workers in health administration.


Introduction
Over the past 30 years, HIV has become a major challenge to global public health [1] . Since 2003, when China launched free antiretroviral treatment (ART), great achievements have been made in improving patients' survival and quality of life, such as recovering their CD4+ T lymphocyte counts, lowering viral load (HIV RNA) and decreasing the HIV transmission rate [2] . Nevertheless, the poor prognosis of some people living with HIV (PLHIV) after ART is still a concern [3][4] . Therefore, timely and accurate prediction of death risk for PLHIV is essential for clinicians and health care providers to manage PLHIV effectively.
Since a combination of several independent indicators, rather than a single predictive factor, is more likely to improve predictive performance, several scoring systems based on multiple risk factors have been proposed to forecast the mortality of PLHIV. However, a widely accepted and effective scoring system to predict the survival probability of ART-treated PLHIV is still lacking.
In recent years, the clinical prediction model, a multi-factor model that estimates the probability of having a disease or of an outcome event, has been used extensively in medical diagnosis and treatment, prognosis management of patients and public health resource allocation, contributing greatly to public health. To establish a risk score system, we followed the recommendations of Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) [24] . The nomogram, a convenient way of presenting a clinical prediction model, is widely used to analyze prognosis [25] . Although previous studies have established prognostic nomogram models to assess survival probability after ART, their predictive performance has not met expectations. For example, the model by Margaret et al. [26] has a concordance index (C-Index) of 0.75 (95% CI: 0.74-0.81) in the training set and a C-Index of 0.69 (95% CI: 0.59-0.77) in the validation set.
This model is relatively satisfactory, but far from excellent. Few prognosis models constructed for PLHIV after ART present satisfying discrimination and calibration.
In the model established by Hou et al. [27] , the C-Indexes are as high as 0.91 (95% CI: 0.86-0.97) in the training set and 0.92 (95% CI: 0.82-1.00) in the validation set. However, the patients involved were randomly split into a training set and a validation set; when the development sample is small, such a split means that the development data cannot be fully utilized.
In the present study, to build a simple and effective prognostic model to forecast the survival probability of PLHIV after ART, a nested case-control design was employed with HIV-related mortality events, and propensity-score matching (PSM) was applied at a ratio of 1:4 to select the patients. To make the model more reliable and robust, bootstrap resampling was used for internal validation. The discrimination and calibration of the model were evaluated in the training set and the validation set. Decision curve analysis (DCA) was also used to evaluate the performance of the prediction model. The prognostic model was displayed as a nomogram.

Study design
The data used in this study were extracted from the Nanjing AIDS Prevention and Control Information System (AIDS-PCIS) for patients who received ART between 2003 and 2019. All patients received free combination antiretroviral treatment (cART) containing at least three antiretroviral medicines. Follow-up started after treatment initiation, and the participants were visited every three months. The observation end point was December 31, 2019, and the outcome was death. Survival time was defined as the duration from ART initiation to death or December 31, 2019. The inclusion criteria were: 1) living in Nanjing; 2) being visited at least once; 3) being over 18 years old when ART started; 4) having complete laboratory test data before starting ART.
At the end of the observation, 4573 patients met the inclusion criteria, and a total of 120 patients died from HIV/AIDS-related diseases during the follow-up.

Data collection
Demographic data and clinical information were retrieved from face-to-face surveys at the patients' enrollment or extracted from their medical records using a structured questionnaire designed specifically for AIDS-PCIS. The information included the date of birth, gender, height, weight, marital status, infection route and WHO clinical stage. The age of the patient was calculated from the date of birth to the date of starting ART. Body mass index (BMI) was calculated using the following formula: BMI = weight (kg) / (height (m) * height (m)).
The laboratory testing information was obtained from the Nanjing Center for Disease Control and Prevention (CDC) or local hospitals. The laboratory testing indicators included CD4, white blood cell (WBC), blood platelets (PLT), hemoglobin (HB), serum creatinine (CR), triglycerides (TG), serum total cholesterol (TC), fasting blood glucose (GLU), aspartate aminotransferase (AST), alanine aminotransferase (ALT) and total bilirubin (TBIL). All these laboratory tests were carried out at each visit by trained technical personnel, strictly following clinical guidelines, in the central laboratory of local hospitals or Nanjing CDC.
Routine blood biochemical indexes, such as TG, TC, GLU, CR, AST, ALT, and TBIL, were measured using a Beckman AU5800 automatic biochemical analyzer (Beckman COULTER K., Japan). Other indexes including WBC, HB and PLT were evaluated by Sysmex Xe-2100 automatic blood cell analyzer (Sysmex Corporation, Japan). CD4 was determined by the BD FACSCalibur flow cytometer (Becton Dickinson Corporation, USA).
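The two derived variables described above (age at ART initiation and BMI) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and counting age in completed years is an assumption, since the text only says that age runs from the date of birth to the date of starting ART.

```python
from datetime import date


def age_at_art_start(birth: date, art_start: date) -> int:
    """Age in completed years at ART initiation (assumed convention)."""
    years = art_start.year - birth.year
    # Subtract one year if the birthday has not yet been reached in the ART start year.
    if (art_start.month, art_start.day) < (birth.month, birth.day):
        years -= 1
    return years


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) / (height (m) * height (m)), as given in the text."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / (height_m * height_m)
```

For instance, a hypothetical patient weighing 70 kg with a height of 1.75 m would have a BMI of about 22.9.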

Statistical analysis
Data processing
Since the prediction model is based on a multi-factor regression model, there is no simple method to estimate the sample size. When the number of predictors is large relative to the number of outcome events, overfitting may occur. Previous literature suggests, as a conservative estimate, that each predictor requires at least 10 outcome events. In this study, there were 120 outcome events, so the number of predictors should be no more than 12.
Since directly dropping observations with missing values might not only lead to selection bias but also decrease the power of a test, missing values were imputed from the values of other variables before data analysis. The results are shown in Fig. 1. A sensitivity analysis was carried out to evaluate the effect of imputing the missing values (Table 1).
A total of 120 deaths caused by AIDS-related diseases formed the case group of the nested case-control study. To ensure that every subject in the case group had matching controls, PSM was applied at a ratio of 1:4 to select the participants (each case was matched by age and gender with 4 controls) [28] . Finally, 600 subjects were included in this study, 120 dead and 480 alive PLHIV, grouped into 120 matched blocks.
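The events-per-variable rule of thumb used above reduces to a single division. The sketch below only restates that arithmetic; the function name is hypothetical.

```python
def max_predictors(n_events: int, events_per_variable: int = 10) -> int:
    """Upper bound on the number of candidate predictors under an
    events-per-variable rule of thumb (10 events per predictor here)."""
    if events_per_variable <= 0:
        raise ValueError("events_per_variable must be positive")
    return n_events // events_per_variable


# With the 120 deaths reported in this study, the bound is 12 predictors.
limit = max_predictors(120)
```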
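The text does not state which matching algorithm, caliper or software routine was used for PSM. As an illustration only, the sketch below assumes greedy nearest-neighbour matching without replacement on a propensity score that has already been estimated; the function name and the input layout are hypothetical.

```python
def match_one_to_k(case_scores, control_scores, k=4):
    """Greedy 1:k nearest-neighbour matching without replacement.

    case_scores, control_scores: dict mapping subject id -> propensity score
    (assumed to be estimated beforehand, e.g. from age and gender).
    Returns a dict mapping each case id to a list of k control ids.
    """
    available = dict(control_scores)
    matched = {}
    for case_id, score in case_scores.items():
        if len(available) < k:
            raise ValueError("not enough unmatched controls left")
        # Pick the k remaining controls whose scores are closest to the case.
        nearest = sorted(available, key=lambda cid: abs(available[cid] - score))[:k]
        matched[case_id] = nearest
        for cid in nearest:
            del available[cid]  # each control is used at most once
    return matched
```

Applied to 120 cases with k = 4, a procedure of this kind yields 120 matched blocks of five subjects, that is, 600 subjects in total. Greedy matching depends on the order in which cases are processed; optimal matching is a common alternative.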

Establishment and validation of prediction model
The patients were randomly split into a training set and a validation set at a ratio of 7:3. The comparability of the training set and validation set was then evaluated. Continuous variables with a normal distribution were presented as mean ± standard deviation, and t-tests were used to assess the differences between the training and validation sets. Continuous variables with a skewed distribution were described using the median (first quartile, third quartile), and Wilcoxon rank-sum tests were employed for comparisons. Frequency (percentage) was used to describe categorical variables, and comparisons between the two sets were performed using chi-square tests or Fisher's exact tests.
The data in the training set were then used to fit a model, and the data in the validation set were used to evaluate the efficacy of the model. Based on the data in the training set, a univariate Cox proportional hazards regression analysis was performed on each variable and the corresponding p-value was calculated. Variables with p-values less than or equal to 0.2 were included in a multivariate Cox proportional hazards regression model. After multivariate analysis, the factors with p-values less than or equal to 0.05 were considered for the prediction model. According to Occam's Razor, among models with comparable performance, the one with the fewest variables is preferred [29] . Finally, we combined the statistically significant risk factors with practical considerations, such as the difficulty of measuring an index, the cost of measurement and the difficulty of application, to determine the predictive factors and select a prediction model with better predictive performance.
The reproducibility and generalizability of an established prediction model should be evaluated to assess its performance. A strict evaluation of the prediction model should include internal validation and external validation. Internal validation is performed using the same dataset as the training set. This study employed bootstrap resampling [30] for internal validation because no additional data were available to verify the model. The performances of the model over 1,000 resamples were averaged as the internal validation performance.
Discrimination and calibration are the two most common evaluation indicators. The discrimination of the prediction model was quantified using the area under the curve (AUC) and C-Index. The C-Index ranges from 0 to 1; the closer it is to 1, the better the discrimination of the model. A C-Index of 0.5 indicates that the model has no predictive ability, and a C-Index below 0.5 indicates that the model's predictions run contrary to the actual results. In general, a C-Index of 0.7 or above indicates good predictive performance. However, discrimination cannot reflect whether the absolute risk estimated by the prediction model is accurate, because it depends only on the ranking of risk scores or predicted probabilities. Therefore, calibration, which compares predicted with observed risk, is also needed to assess the prediction model. In this study, the calibration of the model was evaluated using the calibration curve.
We sorted the predicted probabilities of all participants from the smallest to the largest, and divided the patients into ten equal parts. The average predicted probability of the patients in each part was used as the x-axis and the proportion of actual events as the y-axis. Ideally, the calibration graph is a straight line with an intercept of 0 and a slope of 1. The predictive ability of the model was also evaluated using decision curve analysis (DCA).
Integrated discrimination improvement (IDI), net reclassification index or improvement (NRI) and other indicators that are used to compare models or evaluate the increase in predictive performance of individual predictors were not discussed in the present study.
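A 7:3 random split of the kind described above can be sketched as follows. The seed, the function name, and the choice to split individual subjects rather than whole matched blocks are assumptions made for illustration; the text does not specify them.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=2019):
    """Shuffle subject ids reproducibly and split them into two sets."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the split is repeatable
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# With 600 subjects this gives 420 for training and 180 for validation.
train_ids, validation_ids = split_train_validation(range(600))
```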
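The two-stage screening described above (p ≤ 0.2 after univariate analysis, then p ≤ 0.05 after multivariate analysis) amounts to two threshold filters. In the sketch below the p-values are made-up placeholders, not results from this study, and the Cox models that would produce them are not shown.

```python
def screen(p_values, threshold):
    """Return the variable names whose p-value is at or below the threshold."""
    return [name for name, p in p_values.items() if p <= threshold]


# Hypothetical univariate p-values (illustrative only).
univariate_p = {"CD4": 0.001, "BMI": 0.03, "TG": 0.15, "age": 0.45}
candidates = screen(univariate_p, 0.2)

# Hypothetical p-values after refitting a multivariate model on the candidates.
multivariate_p = {"CD4": 0.001, "BMI": 0.02, "TG": 0.30}
selected = screen(multivariate_p, 0.05)
```

The final choice in the study also weighed measurement cost and ease of application, which a p-value filter alone does not capture.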
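The averaging of a performance measure over bootstrap resamples can be sketched as below. This is a simplified illustration under stated assumptions: `metric` is a hypothetical caller-supplied function (for example, one that refits the model on the resample and returns its C-Index), and no optimism correction is applied, since the text does not say whether one was used.

```python
import random


def bootstrap_mean_performance(records, metric, n_resamples=1000, seed=0):
    """Average a performance metric over bootstrap resamples of the data.

    records: list of subject records (any structure understood by `metric`).
    metric: function taking a list of records and returning a number.
    """
    rng = random.Random(seed)
    n = len(records)
    total = 0.0
    for _ in range(n_resamples):
        # Draw n records with replacement from the original data.
        sample = [records[rng.randrange(n)] for _ in range(n)]
        total += metric(sample)
    return total / n_resamples
```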
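To make the C-Index concrete, the sketch below computes Harrell's concordance index directly from its pairwise definition. It is an illustration, not the routine used in the study (the analyses were run in R); the function name is hypothetical, ties in risk score are counted as one half, and pairs with tied survival times are skipped for simplicity.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-Index for right-censored survival data.

    times: follow-up times; events: 1 if death observed, 0 if censored;
    risk_scores: higher score means higher predicted risk of death.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i died strictly before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier death had the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A model that ranks every comparable pair correctly scores 1, a model that assigns the same score to everyone scores 0.5, and a model that ranks every pair the wrong way round scores 0.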
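The grouping step behind the calibration curve can be sketched as follows. For simplicity this illustration treats the outcome as a binary indicator at a fixed time point; with censored follow-up, the observed proportion in each group would normally be a Kaplan-Meier estimate instead. The function name is hypothetical.

```python
def calibration_points(predicted, observed, n_groups=10):
    """Return (mean predicted probability, observed event proportion) per group.

    predicted: predicted event probabilities; observed: 0/1 outcomes.
    Subjects are sorted by predicted probability and cut into n_groups parts.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    points = []
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue  # fewer subjects than groups
        mean_pred = sum(p for p, _ in group) / len(group)
        obs_rate = sum(o for _, o in group) / len(group)
        points.append((mean_pred, obs_rate))
    return points
```

Plotting these points against the diagonal y = x shows how far the predictions depart from the ideal line with intercept 0 and slope 1.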

Presentation of nomogram
The prediction model was visualized and presented by a nomogram. To calculate the score of each variable at each level, a scoring standard was developed based on the standard regression coefficients of all variables. Then using the scores of these factors, a total score was calculated to indicate the survival probability of each patient.
All data analyses and figures were made using R software version 3.6.3. All hypothesis tests were two-sided, with an α level of 0.05.
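One common way of turning regression coefficients into nomogram points is to rescale each predictor's contribution to the linear predictor so that the predictor with the widest effect spans 0 to 100 points. The sketch below follows that convention as an assumption; the coefficients and ranges in the usage example are hypothetical and are not the fitted values of this study.

```python
def nomogram_points(coefs, ranges, values):
    """Points per predictor on a 0-100 scale.

    coefs: dict name -> regression coefficient (log hazard ratio per unit).
    ranges: dict name -> (min, max) of the predictor shown on the nomogram axis.
    values: dict name -> the patient's value for that predictor.
    """
    # Effect span of each predictor across its range on the log-hazard scale.
    spans = {k: abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs}
    max_span = max(spans.values())
    points = {}
    for k, beta in coefs.items():
        low, high = ranges[k]
        # Zero points at the lowest-risk end of the range, so points are non-negative.
        reference = low if beta >= 0 else high
        points[k] = 100.0 * beta * (values[k] - reference) / max_span
    return points


# Hypothetical coefficients and ranges, for illustration only.
example = nomogram_points(
    coefs={"CD4": -0.01, "GLU": 0.5},
    ranges={"CD4": (0, 500), "GLU": (0, 4)},
    values={"CD4": 0, "GLU": 4},
)
total_points = sum(example.values())
```

The total score is then mapped to a survival probability through the baseline survival function of the fitted Cox model, a step not shown here.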

Establishment of prediction model
In this PSM-based nested case-control study, the characteristics of the 600 PLHIV (420 in the training set and 180 in the validation set) showed that the two sets were similar in all variables (Table 2). In the univariate Cox regression analysis of the training set, infection route, TB, WHO clinical stage, CD4, BMI, WBC, HB, CR, TC, GLU, AST and ALT were statistically related to the mortality of PLHIV (Table 3). Variables with p-values less than or equal to 0.2 in the univariate analysis were included in the multivariate Cox proportional hazards regression model. CD4, BMI, CR and GLU were found to be linked to HIV/AIDS-related death. To establish an optimal prediction model, the individual and combined performance of these four factors was then evaluated using ROC analysis and the C-Index. As shown in Fig. 2A, the individual AUCs of CD4, BMI, CR, and GLU in the training set were 0.796, 0.689, 0.524, and 0.689, respectively. The AUC of combine 1 (CD4 + BMI + CR + GLU) was 0.833, and the AUC of combine 2 (CD4 + BMI + GLU) was 0.826. To compare the predictive performances of combine 1 and combine 2, their corresponding C-Indexes were calculated; the results were 0.797 (95% CI: 0.745, 0.849) and 0.793 (95% CI: 0.740, 0.846), indicating that both models ordered about 80% of comparable patient pairs correctly. No statistically significant difference in the C-Indexes between the combine 1 and combine 2 models was observed (P = 0.9160) (Fig. 3A). The difference in discrimination between the two models was small, but combine 2 involved fewer variables. Thus, the combine 2 model was chosen, and the three variables CD4, BMI and GLU were preliminarily selected to construct a prediction model of the three-year and five-year survival probabilities of PLHIV after ART.

Validation of prediction model
To verify the efficacy of the model in predicting the survival probability of PLHIV, bootstrap resampling was used for internal validation of the model. In the validation set, the AUCs of CD4, BMI, CR, and GLU were 0.731, 0.729, 0.535 and 0.673 in the ROC analysis (Fig. 2B). The AUC of combine 1 was 0.751, and the AUC of combine 2 (prediction model) was 0.750. The C-Indexes of combine 1 (0.776; 95% CI: 0.712, 0.840) and combine 2 (0.776; 95% CI: 0.711, 0.839) were similar and the difference was not statistically significant (P = 0.999), which showed that the difference in discrimination between combine 1 and combine 2 (prediction model) was small (Fig. 3B). The calibration curve also showed high consistency in predicting the survival probability of PLHIV (especially in the first three years after ART initiation) (Fig. 4).
As shown in Fig. 5, in both the training set and the validation set, the prediction model (combine 2) showed better performance, thus ensuring the maximum clinical benefit. Overall, the DCA curve demonstrated that the prediction model (combine 2) could support valuable and beneficial judgements. In addition, among the four detected factors, CD4 was more beneficial than the other three routine clinical laboratory indicators in predicting the three-year and five-year survival probabilities of PLHIV.

Performance of nomogram
A nomogram was drawn according to the determined prediction model. As seen in Fig. 6, each selected predictor was assigned a corresponding score according to its value in the nomogram based on the

Discussion
Although the survival probability of PLHIV has improved significantly with the promotion of free ART, timely and accurate prediction of the survival probability of PLHIV is still necessary to guide HIV/AIDS management by health providers. For clinicians and disease control personnel, it can also benefit the personalized management of PLHIV and the allocation of limited medical resources [26] .
For prognosis, because predictors precede the outcome in time, a cohort design is used to analyze the data and fit a prognostic model. Randomized controlled clinical trials can be regarded as prospective cohort studies with more rigorous inclusion criteria, and can therefore be used to establish a prognostic model; however, their generalizability is limited. Due to population selection bias and information bias, retrospective cohort studies are less suitable for constructing a prognostic model, while nested case-control or case-cohort studies are more economical and feasible for studies with rare outcomes or expensive predictor measurements. Based on this nested case-control study from an HIV/AIDS ART cohort in Nanjing, the relationship between routine laboratory indicators and the survival probability of PLHIV was evaluated. A prognostic model (including CD4, BMI and GLU) with satisfactory discrimination and calibration was developed to predict the three-year and five-year survival probabilities of PLHIV receiving ART. The result of this prognostic model was then presented in the form of a nomogram.
The nomogram is simple, direct and effective in predicting the prognosis of PLHIV and can be easily understood by medical workers [24] . In this study, the multivariate Cox proportional hazards regression model indicated that four factors (CD4, BMI, CR and GLU) were associated with HIV/AIDS-related survival time. To overcome the limitation of a single predictor and simplify the prediction procedure, three of these factors (CD4, BMI and GLU) were combined to construct a prognostic model to predict the three-year and five-year survival probabilities of ART-treated PLHIV, which exhibited high consistency.
Some clinical indicators (such as WHO clinical stage) that are closely associated with the survival rate of PLHIV [13] were not included in the nomogram because the laboratory indicators (such as CD4) were more sensitive in predicting the survival rate of PLHIV than the clinical indicators. In recent years, many researchers have reported that some laboratory indicators are connected with the survival rate of PLHIV. In this study, CD4, BMI and GLU were significantly correlated with the survival rate of PLHIV, which is consistent with these published studies [10,16,21,26] .
An obesity paradox can be seen in this predictive nomogram of PLHIV, where those with a high BMI had a low risk of death. This may be because a higher BMI helps preserve the immune system response and slows the progression of HIV [31] . There is some evidence that a higher BMI is associated with more robust CD4+ T-lymphocyte recovery in ART-treated patients [32] . Previous studies also suggested that immune reconstitution on ART was often highest among patients classified as overweight [33] .
DCA is commonly applied to assess the efficacy of specific clinical prediction models [34] . In this study, DCA was used to assess the potential clinical benefits of the nomogram, which revealed that the nomogram was more effective and accurate than a single indicator in forecasting the survival rate of PLHIV.
The present model has a limitation: it was established based on only a small number of easy-to-collect and low-cost predictors due to the underdeveloped technology in the past. Yet, with the advancement of the economy and technology, clinical prediction models that draw on much larger volumes of data (big data) will be developed. Hopefully, more complex models and algorithms based on machine learning and artificial intelligence can provide more accurate results for the benefit of medical workers, PLHIV and medical decision makers.
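The quantity plotted in a decision curve is the net benefit at a given threshold probability. The sketch below uses the standard definition (true positives per subject minus false positives per subject weighted by the odds of the threshold). It is an illustration on binary outcomes with hypothetical function names, not the routine used in this study.

```python
def net_benefit(predicted, observed, threshold):
    """Net benefit of acting on subjects whose predicted risk >= threshold.

    predicted: predicted event probabilities; observed: 0/1 outcomes;
    threshold: threshold probability, strictly between 0 and 1.
    """
    n = len(observed)
    tp = sum(1 for p, o in zip(predicted, observed) if p >= threshold and o == 1)
    fp = sum(1 for p, o in zip(predicted, observed) if p >= threshold and o == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(observed, threshold):
    """Reference strategy in which every subject is treated as high risk."""
    prevalence = sum(observed) / len(observed)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all curve and the treat-none line (net benefit of 0).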

Declarations
Ethics approval and consent to participate
The data were extracted from the Nanjing AIDS Prevention and Control Information System (AIDS-PCIS), which was established by the China Center for Disease Control and Prevention (CCDC). All the methods carried out in our study were in accordance with relevant guidelines. The AIDS-PCIS protocol was approved by the institutional review boards at the CCDC. Informed consent was obtained from the subjects before their enrollment. The ethical approval for the study was also obtained from the Ethics Review Board of Nanjing Center for Disease Control and Prevention and the ethical committee of Nanjing Medical University ("F", "CH", "Nanjing Med U", "FWA00001501", "NANJING", 11/21/2004). I have read and have abided by the statement of ethical standards for manuscripts submitted to BMC Public Health.

Consent for publication
Not applicable.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.