Development and Validation of a New Clinical Prognosis Prediction Model for Metabolism in Cancer Patients

Background: Metabolic reprogramming has emerged as an important feature of cancer, and metabolism-related indexes are closely related to prognosis. We therefore developed and validated a large-sample clinical prediction model to predict prognosis in patients with solid tumors. Methods: This retrospective analysis was conducted on a primary cohort of 5006 patients with solid tumors from the INSCOC database. A total of 1720 cancer patients treated at the Fujian Cancer Hospital formed the validation cohort. A multivariate Cox regression analysis was performed to test the independent significance of different factors and to establish the model. The prediction model was simplified into a nomogram to predict the 1-, 3- and 5-year OS rates. To determine the discriminatory capacity and predictive accuracy of the model, the C-index and calibration curve were evaluated. Results: Multivariate analysis indicated that age, smoking history, tumor stage, tumor metastasis, PGSGA score, FBG, NLR, ALB, TG, and HDL-C were independent factors. Moreover, the nomogram combining the score and clinical parameters predicted patient survival accurately. Conclusions: Clinical indicators based on metabolic reprogramming could fit and predict the prognosis of cancer patients well, and could assist the individualized treatment of tumor patients in the clinic.


Introduction
Cancer is among the most threatening diseases to human health. Its incidence has been rising globally, and it is one of the leading causes of death worldwide. Some experts believe that cancer is mainly caused by gene mutation, but the effect of gene-targeted therapy in the fight against cancer has not been significant [1][2]. The gene mutations found in cancer are diverse, complex and heterogeneous, so it is difficult to identify the key rate-limiting genes for targeted treatment of tumors [3]. This suggests that genetic mutations may not be the origin of cancer. Decades ago, Warburg [4] first theorized that mitochondrial damage causes defects in energy metabolism and leads to cancer. When the respiration of tumor cells is damaged, the retrograde response (RTG) is activated, which transmits signals from the mitochondria to the nucleus, affecting the stability of the genome and leading to metabolic reprogramming [5][6]. This supports the theory that cancer is essentially a metabolic disease.
Metabolic reprogramming in cancer cells alters glucose metabolism, lipid metabolism, amino acid metabolism, and the tumor microenvironment (TME), leading to cancer progression [7]. The rapid proliferation of cancer cells requires a balance between catabolism and anabolism. Therefore, metabolism-related indicators can reflect tumor growth. Recent reports have suggested that patient prognosis is associated with certain molecular biomarkers involved in tumor metabolism. However, measuring these biomarkers requires expensive and time-consuming laboratory metabolomics technology. In contrast, routine blood tests are convenient and can be widely used in clinical practice. Changes in metabolic markers in blood tests can reflect tumor metabolic reprogramming and the prognosis of patients. Therefore, we developed and validated a metabolism-based prognostic prediction model to predict the survival of patients with solid tumors and to support decision making on early therapy.

Materials
This retrospective analysis was conducted on a primary cohort of solid tumor patients from the INSCOC (Investigation on Nutritional Status and its Clinical Outcomes of Common Cancers) database. INSCOC is a nationwide cross-sectional survey in China on nutritional status and clinical outcomes in patients with malignant tumors (registered at chictr.org.cn, ChiCTR1800020329). Patients were evaluated from January 2013 to August 2018 at 30 tertiary public hospitals in China. Inclusion criteria: 1) a histologic diagnosis of malignant solid tumor; 2) a complete medical history record and available follow-up data. An independent cohort of cancer patients meeting the same inclusion criteria was enrolled from the Fujian Cancer Hospital and formed the external validation cohort. The follow-up time was 1-60 months in both the primary cohort and the validation cohort, and the outcome was death. The study was approved by the Ethics Committees of all participating institutions, and all data were analyzed anonymously. The study is reported in accordance with the TRIPOD guidance for transparent reporting of prediction models [8].

Data collection
Demographic and clinicopathological data were collected, including sex, age, smoking history, drinking history, PGSGA score (Patient Generated Subjective Global Assessment score), NRS2002 score (Nutrition Risk Screening score), primary tumor site, tumor metastasis, TP (total protein), ALB (albumin), PAB (prealbumin), FBG (fasting blood glucose), TC (total cholesterol), TG (triglyceride), HDL-C (high density lipoprotein cholesterol), LDL-C (low density lipoprotein cholesterol), WBC (white blood cell count), and NLR (neutrophil/lymphocyte ratio). All continuous variables were converted to categorical variables according to clinical standards. Regardless of tumor type or origin, metabolic abnormalities are common features of most cancer cells. Therefore, 15 kinds of malignant solid tumors were included in the study and classified by body system. The NLR is an inflammatory marker that has been investigated as a prognostic indicator of post-therapeutic recurrence and survival in patients with cancer [9][10]. In our study, NLR was classified according to the optimum cutoff value (Figure S1). PGSGA was adapted from the SGA (Subjective Global Assessment) and is widely used for clinical assessment of malnutrition in cancer patients [11]. Patients with a PGSGA score ≥4 need nutritional intervention and symptomatic treatment. NRS 2002 is recommended by the European Society for Clinical Nutrition and Metabolism (ESPEN) as a nutritional risk screening method for patients [12]. Patients with an NRS 2002 score ≥3 are at risk of malnutrition and require nutritional support. Variables for which >10% of values were missing, or patients with a missing value for a specific variable, were excluded from the analysis. The missing data in the selected variables were multiply imputed to generate a complete data set.
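The conversion of continuous measurements into categories described above can be sketched as follows. This is a minimal illustration, not the authors' code: the PGSGA and NRS 2002 thresholds are those stated in this section, the FBG (6.1 mmol/L) and ALB (35 g/L) thresholds are the ones quoted in the Discussion, and the NLR cutoff is left as a parameter because the optimum value (Figure S1) is not given in the text. The class name, field names and the value 3.0 used in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Labs:
    neutrophils: float  # absolute neutrophil count, 10^9/L
    lymphocytes: float  # absolute lymphocyte count, 10^9/L
    fbg: float          # fasting blood glucose, mmol/L
    alb: float          # albumin, g/L
    pgsga: int          # PG-SGA score
    nrs2002: int        # NRS 2002 score


def categorize(labs: Labs, nlr_cutoff: float) -> dict:
    """Convert continuous measurements into binary categories.

    nlr_cutoff is an assumption supplied by the caller; the paper
    derives its own optimum cutoff, which is not reported in the text.
    """
    nlr = labs.neutrophils / labs.lymphocytes
    return {
        "NLR_high": nlr >= nlr_cutoff,
        "FBG_high": labs.fbg >= 6.1,
        "ALB_normal": labs.alb >= 35,
        "PGSGA_high": labs.pgsga >= 4,
        "NRS2002_at_risk": labs.nrs2002 >= 3,
    }
```

For example, `categorize(Labs(6.0, 1.5, 6.5, 33.0, 5, 2), nlr_cutoff=3.0)` flags a patient with NLR 4.0 as high NLR, high FBG, low ALB and PGSGA ≥4, but not at nutritional risk by NRS 2002.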

Statistical analysis
We used a primary cohort of solid tumor patients from the INSCOC database to develop a clinical prediction model. Categorical variables were reported as whole numbers and proportions. Overall survival (OS) was estimated using the Kaplan-Meier method and compared with the log-rank test. Univariate analysis was performed for all variables, and variables with P values < 0.05 were included in the multivariate analysis. A multivariate Cox regression analysis was performed to test the independent significance of different factors. Variables were selected by stepwise regression to fit a more parsimonious model. Nomograms are a pictorial representation of a complex mathematical formula that uses two or more known variables to calculate an outcome. The resulting model was simplified into a nomogram to predict the 1-, 3- and 5-year OS rates. We also performed a decision-curve analysis to assess the clinical usefulness of the model.
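As an illustration of the Kaplan-Meier method named above, the following sketch computes the product-limit estimate from follow-up times and death indicators. It is a plain-Python teaching example under the usual assumptions (right censoring, deaths counted before censorings at the same time point), not the R code used in the study; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up in months
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the share surviving time t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve
```

With five patients followed for 6, 12, 12, 24 and 36 months, of whom the second 12-month patient and the 36-month patient are censored, the estimate drops to 0.8, 0.6 and 0.3 at 6, 12 and 24 months.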
The area under the ROC curve (AUC) was used to evaluate predictive accuracy. Calibration was assessed graphically by plotting the observed rates against the predicted probabilities to evaluate their agreement. The Brier score was used to evaluate probability calibration. The discriminative performance of the model was measured using Harrell's C-statistic. An internal validation step was performed to counteract possible overfitting of the model to the data. Bootstrapping (B = 100) was used to validate the model and correct its over-optimism, yielding optimism-corrected measures of the C-statistic.
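The two validation ideas in this paragraph, Harrell's C-statistic and bootstrap optimism correction, can be sketched as below. This is a simplified illustration and not the study's implementation: `fit` is a hypothetical placeholder for whatever procedure refits the model (here it would be the Cox regression) and returns a function mapping covariates to a risk score, and the fallback value of 0.5 for a resample without comparable pairs is an assumption of this sketch.

```python
import random


def harrell_c(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair is comparable when the patient with the shorter follow-up died.
    Ties in predicted risk count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        return 0.5  # assumption: no comparable pairs means no information
    return concordant / comparable


def optimism_corrected_c(rows, fit, n_boot=100, seed=0):
    """Bootstrap optimism correction of the C-statistic.

    rows : list of (time, event, covariates)
    fit  : callable taking rows and returning a function covariates -> risk
    """
    rng = random.Random(seed)

    def c_of(model, data):
        t = [r[0] for r in data]
        e = [r[1] for r in data]
        return harrell_c(t, e, [model(r[2]) for r in data])

    apparent = c_of(fit(rows), rows)
    optimism = 0.0
    for _ in range(n_boot):
        boot = [rng.choice(rows) for _ in rows]  # resample with replacement
        model = fit(boot)
        # Optimism: performance on the sample the model was fitted to,
        # minus performance of that same model on the original data.
        optimism += c_of(model, boot) - c_of(model, rows)
    return apparent - optimism / n_boot
```

The corrected value is the apparent C-statistic minus the average optimism over the bootstrap resamples; a small average optimism indicates little overfitting.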
In all analyses, P values < 0.05 were considered to indicate statistical significance. All analyses were performed using the R software, version 3.6.1.
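The decision-curve analysis mentioned in this section rests on the net benefit at a chosen threshold probability. The sketch below shows that quantity for a binary outcome at a fixed time point, compared with the "treat all" strategy. It is a simplified illustration that ignores censoring, which is an assumption of this example rather than a description of the study's analysis; the function names are hypothetical.

```python
def net_benefit(pred_risk, died, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    pred_risk : predicted probability of death by the time point
    died      : 1 if the patient died by the time point, else 0
    """
    n = len(died)
    tp = sum(1 for p, d in zip(pred_risk, died) if p >= threshold and d == 1)
    fp = sum(1 for p, d in zip(pred_risk, died) if p >= threshold and d == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(died, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(died) / len(died)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A decision curve plots `net_benefit` against a range of thresholds; the model is clinically useful where its curve lies above both the "treat all" curve and zero (the "treat none" strategy).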

Clinicopathological characteristics
In the primary cohort, 5006 patients who met the inclusion criteria were finally enrolled in the study; the median follow-up time was 30.3 months. For the validation cohort, we studied 1703 patients admitted to a single institution; the median follow-up time was 27.0 months. The demographic features and clinical characteristics of the primary and validation cohorts are presented in Table 1. Figure S2 shows the cumulative survival of the primary and validation cohorts. The log-rank test gave P=0.052, indicating no significant difference between the two cohorts. Multivariate analysis identified age, smoking history, tumor stage, tumor metastasis, PGSGA score, FBG, NLR, ALB, TG, and HDL-C as independent factors (Figure 1). We simplified the model into a nomogram (Figure 2). The nomogram based on these ten factors was developed to predict the 1-, 3- and 5-year OS of cancer patients. The scales of the nomogram reflect coefficients from the Cox model rescaled to a user-friendly 100-point range. Internal validation with bootstrapping revealed that the optimism-corrected C-statistic of the predictive model was 0.776 and the optimism-corrected Brier score was 0.169. The average optimism of both the C-statistic and the Brier score was less than 0.001, reflecting a small degree of over-optimism. In the external validation cohort, the calibration curve for the risk estimation was also good, indicating that the model is well calibrated (Figure.

Cancer is a significant public health problem and is the second leading cause of death globally [13]. Metabolic alterations of tumors are recognized as one of the hallmarks of cancer [14]. Cancer cells adapt their metabolism to supply the energy needed for tumor progression and proliferation. Many cancer cells show metabolic reprogramming, including reprogrammed glucose, lipid and amino acid metabolism, to meet the demands of rapid proliferation. FBG, ALB, TG and HDL-C are routine clinical tests that reflect these metabolic changes well, so they were included in this study. In addition, tumor cells need to survive drastic changes in the microenvironment such as hypoxia, nutrient shortage, acidic pH and chronic inflammation [15]. The tumor microenvironment enforces metabolic plasticity and promotes tumor proliferation and progression [16]. Therefore, this study included the PGSGA score as a sensitive indicator of nutritional status and NLR as an inflammatory marker. We also included other indicators related to the survival and prognosis of tumor patients, such as age, smoking history, tumor stage and tumor metastasis, so as to predict the prognosis of cancer patients more comprehensively. Many other indicators can also reflect tumor metabolism and the microenvironment, but they were not included in this model due to the incomplete data, missing
Normally, the main way for the body to obtain energy is the oxidative phosphorylation of glucose under aerobic conditions. In cancer cells, even in the presence of oxygen, the main pathway of glucose metabolism is aerobic glycolysis, termed the Warburg effect [4], which reflects the reprogramming of tumor glucose metabolism. Hyperglycemia is a common phenomenon in patients with advanced cancer [17].
Hyperglycemia can provide cancer cells with a high-glucose fuel source to support rapid proliferation, drive the glycolytic pathway, and lead to a worse prognosis. Hyperglycemia can indirectly influence cancer cells by increasing the levels of insulin/IGF-1, thus activating the PI3K/AKT/mTOR signaling pathway and promoting the development of cancer [18]. Beyond that, hyperglycemia has a direct impact on cancer cell proliferation, metastasis, invasiveness, and antiapoptotic qualities [19][20][21]. In our study, FBG ≥ 6.1 mmol/L (HR 2.63, 95% CI 2.39-2.91) was a risk factor for the survival of tumor patients, which supports the harm of hyperglycemia. Hyperglycemia can promote glycolysis and raise the prevalence and mortality of certain malignancies. FBG is the most direct index of blood glucose and can serve as a predictor of cancer progression and glucose metabolism [22][23]. Therefore, patients with FBG ≥ 6.1 mmol/L should receive appropriate dietary or pharmacological intervention to improve prognosis.
In cancer cells, both protein synthesis and protein decomposition are enhanced, but anabolism exceeds catabolism, and cancer cells can even capture protein from normal tissues to meet the needs of their own growth. The amino acid metabolism of the tumor is also changed: tumor cells can obtain energy from glutamine and other amino acids. These changes lead to severe protein consumption, negative nitrogen balance and hypoproteinemia. Hypoproteinemia, which can be corrected by albumin supplementation, is associated with a greater risk of recurrence and mortality. Albumin (ALB) is an acute phase protein that decreases with inflammation and for other reasons, such as malnutrition, increased age and metabolic disorder. Albumin reflects nutritional state and amino acid metabolism, and is associated with the prognosis of cancer patients. In our study, ALB ≥ 35 g/L (HR 0.71, 95% CI 0.63-0.80) was a protective factor for overall survival. Kao HK et al. [24] showed that patients with increased serum albumin levels can have a better prognosis. Therefore, albumin can not only reflect amino acid metabolism, but also predict survival and prognosis as a biomarker [25][26].
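To make the reported hazard ratios easier to interpret, the following sketch converts a hazard ratio and its 95% confidence interval back to the Cox coefficient (the log hazard ratio) and an approximate standard error. The inputs are the FBG and ALB values quoted in this Discussion; the function name is hypothetical, and the standard error is an approximation that assumes the interval is symmetric on the log scale.

```python
import math


def hr_to_beta_se(hr, ci_low, ci_high, z=1.96):
    """Recover the Cox coefficient and its approximate standard error.

    beta = ln(HR); the 95% CI spans beta +/- z * SE on the log scale.
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


# Hazard ratios quoted in the text.
for name, hr, lo, hi in [
    ("FBG >= 6.1 mmol/L", 2.63, 2.39, 2.91),
    ("ALB >= 35 g/L", 0.71, 0.63, 0.80),
]:
    beta, se = hr_to_beta_se(hr, lo, hi)
    print(f"{name}: beta={beta:.3f}, SE={se:.3f}")
```

A positive coefficient (FBG) marks a risk factor and a negative one (ALB) a protective factor; these coefficients are the quantities that a nomogram rescales into points.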
There is increasing evidence for the role of lipid metabolism alterations as biomarkers of cancer prognosis and survival. Together with the Warburg effect and increased glutaminolysis, lipid metabolism plays a key role in cancer metabolic reprogramming. Highly proliferative cancer cells exhibit an intense lipid and cholesterol avidity, which they satisfy by increasing the uptake of dietary or exogenous lipids and lipoproteins [27]. In addition, the increase in de novo fatty acid synthesis and lipid synthesis in cancer cells requires efficient and complementary lipolytic mechanisms to accommodate the intracellular lipid content and provide materials for tumor cell proliferation [28]. This long-term metabolic change leads to the depletion of stored fat and promotes cancer cell metastasis [29]. TG and HDL-C, as indexes of lipid metabolism, are closely related to prognosis. Studies have shown that a high level of HDL-C can reduce the risk and progression of cancer [30][31] [39][40]. Therefore, NLR can be used as a biological indicator of inflammation to predict the prognosis of cancer patients.
Some limitations of this study should be considered when interpreting the results. First, many other laboratory indicators used in the clinic reflect the metabolic status of cancer patients, so more factors should be added to improve the model in the future. In addition, although internal validation was performed to prevent over-interpreting the data, and external validation showed that our findings are applicable in a single center, a prospective multicenter study is needed to confirm the results.

Conclusion
Although the interactions between cancer metabolism and clinical prognosis are intricate, we emphasize their importance and developed a model based on cancer metabolism to predict the prognosis of tumor patients. The results support the correlation between cancer metabolism and clinical prognosis. This prediction model can accurately predict prognosis from rapid and economical blood tests, and can be widely used in clinical practice.

DATA AVAILABILITY STATEMENT
The INSCOC data that support the findings of this study are available from the Chinese Cancer Society Nutrition and Support Committee, but restrictions apply to the availability of these data, which were used under license for the current study. Because the INSCOC data involve the confidentiality of multiple clinical centers and patient privacy, they are not publicly available. Data are, however, available from the authors upon reasonable request.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Capital Medical University Affiliated Beijing Shijitan Hospital. The patients/participants provided their written informed consent to participate in this study. All methods were carried out in accordance with relevant guidelines and regulations. We do not consent to the publication of identifying images or other personal or clinical details of participants that compromise anonymity.

Figure 1
Forest plot of the prediction model, showing the hazard ratios (HR) from the multivariate Cox regression analysis.

Figure 2
The nomogram developed to predict the overall survival of cancer patients with metabolic reprogramming. To use the nomogram, an individual patient's value is located on each variable axis, and a line is drawn upward to determine the number of points received for each variable's value. The sum of these numbers is located on the Total Points axis, and a line is drawn downward to the survival axes to determine the likelihood of survival at 1, 3 or 5 years.
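The reading procedure described in this caption can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the published nomogram: it uses only the two hazard ratios quoted in the Discussion (FBG and ALB) instead of all ten factors, scales points so that the largest coefficient equals 100, and takes the baseline survival as a hypothetical input because the paper does not report it.

```python
import math

# Log hazard ratios for the higher-risk level of each binary factor.
# FBG and ALB come from the hazard ratios quoted in the text; ALB < 35 g/L
# is the inverse of the protective HR 0.71 reported for ALB >= 35 g/L.
# A full nomogram would list all ten factors of the model.
BETAS = {
    "FBG >= 6.1 mmol/L": math.log(2.63),
    "ALB < 35 g/L": -math.log(0.71),
}
MAX_BETA = max(BETAS.values())


def points(factor):
    """Points on the 0-100 axis: coefficient rescaled by the largest one."""
    return 100.0 * BETAS[factor] / MAX_BETA


def survival(present_factors, baseline_surv):
    """Survival probability for a patient with the given risk factors.

    baseline_surv is S0(t) for a patient with zero points at the chosen
    time (1, 3 or 5 years); it is a hypothetical input in this sketch.
    """
    total = sum(points(f) for f in present_factors)
    linear_predictor = total * MAX_BETA / 100.0  # map points back to log HR
    # Cox model: S(t | x) = S0(t) ** exp(linear predictor)
    return baseline_surv ** math.exp(linear_predictor)
```

For instance, with an assumed 5-year baseline survival of 0.80, `survival(["FBG >= 6.1 mmol/L"], 0.80)` equals 0.80 raised to the power 2.63, which is the same result as summing points on the nomogram and reading the 5-year survival axis.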

Figure 3
The calibration curves and areas under the ROC curve (AUC) for predicting 5-year survival in the primary cohort and the validation cohort. Calibration plots showing the agreement between prediction and actual observation in the primary cohort (A) and the validation cohort (B). The ideal line with 45° slope represents a perfect prediction (the predicted probability equals the observed probability). Area under the ROC curve (AUC) for predicting survival at 5 years in the primary cohort (C) and the validation cohort (D).
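The points of a calibration plot such as panels A and B, and the Brier score reported in the Results, can be computed as in the sketch below. It is a simplified illustration that treats 5-year survival as a fully observed binary outcome, ignoring censoring, which is an assumption of this example; the function names and the equal-size grouping are hypothetical choices.

```python
def calibration_table(pred, observed, n_bins):
    """Group patients by predicted probability and compare with outcomes.

    pred     : predicted probability of surviving to the time point
    observed : 1 if the patient survived to the time point, else 0
    Returns [(mean predicted probability, observed rate)] per group;
    points on the 45-degree line indicate perfect calibration.
    """
    pairs = sorted(zip(pred, observed))
    n = len(pairs)
    table = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than patients
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, obs_rate))
    return table


def brier_score(pred, observed):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(pred, observed)) / len(pred)
```

Lower Brier scores indicate better probability predictions; a model that always predicts 0.5 scores 0.25.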
