Study population and design
The patients, all with a pathologically diagnosed solid tumor at any stage, came from the Investigation on Nutrition Status and its Clinical Outcome of Common Cancers (INSCOC) cohort in China. A detailed description of the design, methods, and development of the INSCOC study has been provided elsewhere [7, 18, 19]. All cancer participants who met the inclusion criteria were recruited from multiple institutions in China between 2013 and 2019. The inclusion criteria in the present study were: 1) age of 18 years or more; 2) a histological diagnosis of a solid malignant tumor; and 3) a hospital stay longer than 48 hours. The exclusion criteria were: 1) acquired immunodeficiency syndrome (AIDS) or transplanted organ(s); 2) admission to the intensive care unit (ICU) in a critical condition at the beginning of recruitment; and 3) refusal to participate or to cooperate with the questionnaire survey (Supplementary Table 1). The primary outcome was 5-year mortality. The baseline survey included an in-person interview, a questionnaire self-administered by the patient, and a physical examination or anthropometric measurement performed by a trained interviewer using standardized protocols for the PG-SGA. Follow-up consisted of annual in-person or telephone interviews to collect survival information. For the present analysis, follow-up was censored in December 2019.
Additionally, as shown in the study schematic (Figure 1), participants with missing critical clinical examination data, missing follow-up data, or more than 10% of all data missing were excluded. Finally, 8,749 patients were included in the current analysis as the primary cohort. Using the same inclusion and exclusion criteria, the validation cohort comprised 208 and 488 cancer patients who were followed at the First Affiliated Hospital of Sun Yat-Sen University and the First Affiliated Hospital of USTC, respectively, between 2010 and 2019.
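The completeness rule described above can be illustrated with a minimal sketch. It assumes each patient record is a dictionary in which a missing value is stored as None; the function name, the field names, and the choice of which fields count as critical are hypothetical and are not taken from the INSCOC protocol.

```python
from typing import Any, Dict, Iterable


def is_eligible(record: Dict[str, Any],
                critical_fields: Iterable[str],
                max_missing_fraction: float = 0.10) -> bool:
    """Return True if a record passes the completeness rule.

    A record is excluded when any critical field is missing, or when
    more than `max_missing_fraction` of all its fields are missing.
    Missing values are assumed to be stored as None (an assumption).
    """
    if not record:
        return False
    # Any missing critical examination or follow-up field excludes the record.
    if any(record.get(field) is None for field in critical_fields):
        return False
    # Otherwise, exclude only if more than 10% of all fields are missing.
    n_missing = sum(1 for value in record.values() if value is None)
    return n_missing / len(record) <= max_missing_fraction
```

For example, a record with ten fields and one missing non-critical value would be retained, whereas a record with two missing values out of ten would be excluded.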
The study was conducted in accordance with the Declaration of Helsinki, and its design was approved by the local ethics committees of all participating hospitals. All patients signed an informed consent form before participating in the study. The trial was registered at http://www.chictr.org.cn with registration number ChiCTR1800020329.
Data collection
Four patient-reported domains were used in the present study: 1) weight, 2) food intake, 3) symptoms, and 4) activities and function. Unintentional weight loss was evaluated by comparing the historical weight (according to the patient's self-report) with the measured weight, obtained upon admission while the patient was wearing a light hospital gown. Food intake was evaluated by comparing the present intake with the intake in the past month. Symptoms were defined as problems that had prevented patients from eating enough during the past two weeks. Food intake and symptoms were grouped because both can be used to reflect oral intake [20]. Activities and function were assessed by the presence of problems that decreased the functional abilities of the patient (Supplementary Table 2).
Clinical parameters and demographic information were also collected, including age, sex, body mass index (BMI), primary tumor site, TNM stage, co-morbidity, nutrition support interventions (enteral or parenteral), lifestyle habits (e.g., alcohol, smoking, or tea consumption), menstrual and reproductive history, medical history, occupational history, and family history. BMI was categorized using the classification for the Chinese population: underweight (< 18.5 kg/m²), normal weight (18.5 – 23.9 kg/m²), overweight (24 – 27.9 kg/m²), and obese (≥ 28 kg/m²). Fasting blood samples were collected within 48 hours of admission, and total protein, albumin, pre-albumin, and creatinine were measured with standard laboratory techniques. The scores for the PG-SGA were constructed using standard thresholds [21]. Overall survival (OS) was calculated from admission to death or the last contact. All pathological staging was defined according to the 8th edition of the AJCC TNM staging system.
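The BMI classification above can be written as a short function. This is a sketch only: the function name is hypothetical, and values that fall between the stated band edges (for example 23.95 kg/m²) are assigned to the lower band, which is an assumption because the text does not say how such values were handled.

```python
def bmi_category(weight_kg: float, height_m: float) -> str:
    """Classify BMI using the cut-offs stated for the Chinese population."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    bmi = weight_kg / height_m ** 2  # kg/m^2
    if bmi < 18.5:
        return "underweight"
    if bmi < 24.0:  # stated band: 18.5 to 23.9
        return "normal weight"
    if bmi < 28.0:  # stated band: 24 to 27.9
        return "overweight"
    return "obese"  # stated band: 28 or more
```

A patient weighing 65 kg with a height of 1.70 m has a BMI of about 22.5 kg/m² and would be classified as normal weight.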
Construction, Validation, and Calibration of the Nomogram
The least absolute shrinkage and selection operator (LASSO) method was used to select the most useful potential predictive features from the primary cohort. K-fold cross-validation was performed to select the best-fitted model according to the optimal lambda value. All potential risk factors were then entered into a multivariable Cox regression analysis to identify the independent predictors. Bidirectional stepwise selection was applied to further reduce the number of covariates, with Akaike’s information criterion as the stopping rule.
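For reference, the LASSO estimate for a Cox model is usually written as the maximizer of an L1-penalized partial log-likelihood. The form below is the generic textbook expression for untied event times, not a description of the exact software settings used here; the symbols (x_i for the covariate vector, δ_i for the event indicator, R(t_i) for the risk set at time t_i, p for the number of candidate features) are notation introduced for this illustration, and software implementations may scale the likelihood term differently.

```latex
\hat{\beta}(\lambda)
  = \arg\max_{\beta}
    \left\{ \ell(\beta) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
\qquad
\ell(\beta)
  = \sum_{i:\,\delta_i = 1}
    \left[ x_i^{\top}\beta
           - \log \sum_{k \in R(t_i)} \exp\!\left( x_k^{\top}\beta \right)
    \right]
```

Larger values of λ shrink more coefficients exactly to zero, which is why cross-validation over λ also acts as feature selection.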
Independent prognostic factors for survival were identified from multiple variables to generate a nomogram, following the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement. The nomogram was subjected to 1,000 bootstrap re-samplings for internal validation in the primary cohort. Discriminative ability was measured by calculating the concordance index (C-index). Calibration curves, generated by comparing the predicted survival with the observed survival after bias correction, were plotted to assess the nomogram’s calibration, accompanied by the Hosmer-Lemeshow test.
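To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is illustrative only and is not the routine used in the analysis. The function name is hypothetical, pairs with tied survival times are skipped for simplicity, and no bootstrap bias correction is applied.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's C-index: the fraction of comparable pairs in which the
    patient with the shorter observed survival has the higher risk score.

    A pair is comparable when the patient with the shorter time had an
    event (events[i] == 1). Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of patients by risk.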
Clinical Application
To evaluate the potential clinical net benefit of the model, we performed decision curve analysis (DCA) and compared the nomogram with existing models (the TNM staging system alone and the TNM staging system combined with the scored PG-SGA) using the C-index.
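The quantity plotted in a decision curve is the net benefit at each threshold probability. The sketch below shows the calculation for a binary outcome, which is a simplification: DCA for time-to-event data must also account for censoring, and that is not handled here. The function names are hypothetical.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int],
                predicted_risk: Sequence[float],
                threshold: float) -> float:
    """Net benefit of acting on patients whose predicted risk >= threshold.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    flagged = [(y, p) for y, p in zip(y_true, predicted_risk) if p >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: act on every patient regardless of predicted risk."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.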
Risk Group Stratification Based on the Nomogram
In addition to numerically comparing discrimination by the C-index, we sought to illustrate the independent discriminatory ability of the nomogram beyond that provided by the standard TNM staging system. Within each TNM category, patients in the primary cohort were ranked by their total risk scores (from highest to lowest) and distributed into different risk groups, after which the cut-off values of the nomogram were determined.
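Once cut-off values are available, assigning a patient to a risk group is a simple lookup. The sketch below assumes the cut-offs have already been determined; the numeric cut-offs and group labels in the usage note are hypothetical placeholders, not values from this study, and scores that equal a cut-off are placed in the higher group by assumption.

```python
from bisect import bisect_right
from typing import List, Sequence


def assign_risk_group(total_points: float,
                      cutoffs: List[float],
                      labels: Sequence[str]) -> str:
    """Map a nomogram total score to a risk group.

    `cutoffs` must be sorted in ascending order and `labels` must contain
    exactly one more entry than `cutoffs`.
    """
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly len(cutoffs) + 1 labels")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cutoffs must be sorted in ascending order")
    # bisect_right counts how many cut-offs are <= the score.
    return labels[bisect_right(cutoffs, total_points)]
```

With placeholder cut-offs of 100 and 200 points and the labels "low", "intermediate", and "high", a score of 50 maps to "low" and a score of 250 maps to "high".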
Statistical Analysis
Quantitative variables are expressed as means ± standard deviations. Differences were analyzed using Student’s t-test if the variables followed a normal distribution, or nonparametric tests (Mann-Whitney or Kruskal-Wallis) if they did not. Qualitative variables were analyzed using chi-square tests, with Fisher’s exact test when necessary. Kaplan-Meier curves were used to analyze the survival data. Cox regression analysis was used to evaluate the associations of OS with each factor. Adjusted hazard ratios (HRs) and their corresponding 95% confidence intervals (CIs) were derived from Cox models after adjusting for potential confounders. The optimal cut-off value was determined using maximally selected rank statistics from the 'maxstat' R package. Age was used as the timescale for all models, with entry time defined as the age at the baseline interview and exit time defined as the age at death, last follow-up, or December 2019, whichever came first. The significance level was set at P < 0.05 (two-sided). All analyses were performed using R version 3.6.2 (http://www.rproject.org) with the rms, survival, and survminer packages. DCA was performed using the source file “stdca.r”, downloaded from https://www.mskcc.org.
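As an illustration of the Kaplan-Meier method mentioned above, the following sketch computes the product-limit survival estimate for right-censored data. It is not the R routine used in the analysis: the function name is hypothetical, time is measured from a common origin such as admission, and delayed entry (as arises when age is the timescale) is not handled.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if a death was observed at times[i] and 0 if the
    observation was censored at that time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all observations that share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve
```

Censored observations do not produce a step in the curve, but they reduce the number at risk for later event times.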