3.1. Data Description
This study collected 848 patients who visited a hospital in Dalian between 2018.04.06 and 2020.12.15: 478 patients with early-stage lung cancer and 370 patients with benign lung nodules. There were 369 male patients, 186 in the lung cancer group (50.41%) and 183 in the benign nodule group (49.59%), and 479 female patients, 292 in the lung cancer group (60.96%) and 187 in the benign nodule group (39.04%). By age distribution, the 61-70 age group was the largest in both cohorts: it accounted for 41.21% of all early-stage lung cancer patients, and, with 119 patients, for 32.16% of all patients with benign nodules. The specific distribution is shown in Table 1.
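The group percentages above follow directly from the cohort counts; a minimal arithmetic check (counts taken from the study description):

```python
# Reproduce the sex-distribution percentages from the reported cohort counts.
male_total = 369
male_cancer, male_benign = 186, 183
female_total = 479
female_cancer, female_benign = 292, 187

def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)

male_cancer_pct = pct(male_cancer, male_total)       # 50.41
male_benign_pct = pct(male_benign, male_total)       # 49.59
female_cancer_pct = pct(female_cancer, female_total) # 60.96
female_benign_pct = pct(female_benign, female_total) # 39.04
```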
3.2. Performance Comparison of Data Index Screening Algorithms
In this study, we use the XGBoost model as a representative and apply two algorithms for data feature screening: stepwise regression and Boruta. Stepwise regression is a traditional statistical feature screening method; its basic idea is to reduce multicollinearity by eliminating variables that are less important and highly correlated with other variables. The Boruta algorithm is a popular feature screening method in machine learning. It is based on the same idea as the random forest classifier: adding randomness to the system and collecting results from random sample sets can reduce the misleading effects of random fluctuations and correlations. We also used the original dataset as a control group.

The results are shown in Table 2. In the original dataset, all 49 features were used. After Boruta screening, 19 features were included in the study, and after stepwise regression screening, 16 features were included. In terms of the number of included features, stepwise regression retained the fewest, which simplifies the subsequent operation process and shortens the operation time. Comparing accuracy, precision, F1 score, and recall, the accuracy of stepwise regression is 75.29%, the accuracy of Boruta is 72.55%, and the accuracy of the original dataset is 73.73%. The area under the receiver operating characteristic (ROC) curve (AUC) was 0.79 for the original dataset, 0.78 for Boruta, and 0.81 for stepwise regression (Figure 3). These results show that, when tested with the XGBoost model, the stepwise regression algorithm achieves the highest accuracy with the fewest features after filtering. Therefore, in the follow-up research, we choose the 16 features retained by stepwise regression screening as the dataset: Sex, Age, Arg, Asn, Glu, Orn, Ser, Val, C4-OH, C4DC, C5, C5DC, C12, C16, C22, C26.
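The stepwise-style screening described above can be sketched with scikit-learn's `SequentialFeatureSelector`, used here as a stand-in for classical stepwise regression. The data below is synthetic (the study's 49 clinical features are not public), and the target feature count is illustrative:

```python
# Hypothetical sketch of stepwise-style feature screening on synthetic data,
# using SequentialFeatureSelector as a stand-in for stepwise regression.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Synthetic binary-classification data; NOT the study's clinical dataset.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=5,   # the study kept 16 of its 49 features
    direction="forward",      # forward stepwise selection
)
selector.fit(X, y)

selected = np.flatnonzero(selector.get_support())
print("selected feature indices:", selected)
```

Boruta screening follows the same fit-then-mask pattern via the third-party `BorutaPy` package, wrapping a random forest estimator instead of a linear model.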
3.3. Performance Metrics Comparison of Machine Learning Algorithms
A random seed was used to divide the data into training and test sets at a 7:3 ratio, and four machine learning algorithms were compared on accuracy, precision, recall, and F1 score. The receiver operating characteristic (ROC) curve was used to assess predictive ability via the area under the curve (AUC); the larger the AUC value, the stronger the predictive ability. As shown in Table 3, the accuracy of the XGBoost model is 75.29% and its AUC value is 0.81, better than all other models. The random forest model has an accuracy of 72.55% and an AUC value of 0.78. The support vector machine model has an accuracy of 71.37% and an AUC value of 0.77. The K-nearest neighbors model has an accuracy of 66.67% and an AUC value of 0.69 (Figure 4).
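The 7:3 split and four-model comparison can be sketched as follows. The data is synthetic, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost to avoid an extra dependency; the real study would use the `xgboost` package:

```python
# Illustrative 7:3 split and model comparison on synthetic data
# (NOT the study's metabolite dataset).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=848, n_features=16, random_state=42)
# test_size=0.3 gives the 7:3 train/test split; random_state fixes the seed.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "gbdt": GradientBoostingClassifier(random_state=42),  # XGBoost stand-in
    "random_forest": RandomForestClassifier(random_state=42),
    "svm": SVC(probability=True, random_state=42),
    "knn": KNeighborsClassifier(),
}

results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]  # positive-class probability
    results[name] = {
        "accuracy": accuracy_score(y_te, model.predict(X_te)),
        "auc": roc_auc_score(y_te, proba),   # area under the ROC curve
    }
```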
3.4. Performance Comparison of Nomogram and Machine Learning Algorithms
The nomogram shows an accuracy of 68.24%, a sensitivity of 0.71, and a specificity of 0.64, while the machine learning model (XGBoost) shows an accuracy of 75.29%, a sensitivity of 0.74, and a specificity of 0.76. As shown in Table 4, the XGBoost model outperforms the nomogram on these metrics. In the subsequent index feature importance ranking, we therefore apply the XGBoost model.
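Sensitivity and specificity, reported above for both models, follow directly from the confusion matrix; a minimal sketch on toy labels:

```python
# Sensitivity (true positive rate) and specificity (true negative rate)
# computed from a confusion matrix; labels here are toy examples.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = lung cancer, 0 = benign nodule
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # fraction of true positives detected
specificity = tn / (tn + fp)  # fraction of true negatives detected
print(sensitivity, specificity)  # 0.75 0.75
```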
3.5. Index Importance Score Ranking
For the 16 included indicators, the XGBoost model was used to score indicator importance, yielding the ranking shown in Figure 6. The order of importance is Orn, Val, C16, Arg, Asn, Glu, Ser, Age, C4DC, C5DC, C5, C22, C4-OH, C12, C26, Sex. Among the amino acids, the most important indicator is ornithine (Orn); among the carnitines, the most important is palmitoylcarnitine (C16).
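Importance-based ranking of the 16 indicators can be sketched as below, again with `GradientBoostingClassifier` standing in for XGBoost and synthetic data in place of the clinical dataset (so the printed order is illustrative, not the study's result):

```python
# Sketch of gradient-boosting feature-importance ranking on synthetic data;
# feature names match the 16 indicators kept after stepwise screening.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

feature_names = ["Sex", "Age", "Arg", "Asn", "Glu", "Orn", "Ser", "Val",
                 "C4-OH", "C4DC", "C5", "C5DC", "C12", "C16", "C22", "C26"]

X, y = make_classification(n_samples=848, n_features=16, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Sort indices by importance score, descending, and map back to names.
order = np.argsort(model.feature_importances_)[::-1]
ranking = [feature_names[i] for i in order]
print(ranking)  # most to least important (synthetic data, order illustrative)
```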