Effects of surgery on survival of GSCC without distant metastasis: Propensity Score Analysis and Nomogram Construction based on SEER Database

Purpose: Surgery remains the preferred treatment for head and neck malignancies. We aimed to appraise the postoperative survival benefit in patients with primary gingival squamous cell carcinoma (GSCC) without distant metastasis and to construct a nomogram to predict overall survival (OS).
Method: Patients who were diagnosed with GSCC without distant metastasis and received active treatment between 2004 and 2015 were selected from the Surveillance, Epidemiology and End Results (SEER) database. The primary endpoint was OS. Univariate and multivariate Cox regression analysis and propensity score (PS) analysis were used to assess the association between surgical treatment and OS. The nomogram was built to predict 3-year and 5-year OS probabilities. The concordance index (C-index), the area under the curve (AUC) and the calibration plot were used to evaluate the performance of the nomogram model. The Decision Curve Analysis (DCA) curve was used to evaluate the clinical efficacy of the nomogram.
Results: We reviewed 2440 GSCC patients without distant metastasis. The median OS was 85 and 10 months in the surgery group and the non-surgery group, respectively. In univariate and multivariate Cox regression analysis, surgery for the primary tumor was an independent factor for OS. To balance the influence of confounding factors, we used propensity score matching (PSM) analysis to evaluate the influence of surgery on OS, and the results showed that OS still favored surgical treatment of the primary tumor. In addition, the independent factors identified by univariate and multivariate Cox analysis were used to build the nomogram survival prediction model. The model showed good predictive performance. The C-index of the model was 0.708 and the AUC for both 3-year and 5-year OS was 0.730. The calibration curves were close to the 45-degree line. Both the net benefit and net reduction curves showed that the nomogram model had good clinical benefit.
Conclusion: Our analysis suggests that surgery for the primary tumor is a viable option for GSCC patients without distant metastasis. We have also developed and verified a nomogram model that can effectively predict 3- and 5-year OS for GSCC patients and that showed good discrimination, calibration and clinical benefit. It can provide practical help for clinicians in decision making. Our results and model still need to be further validated with real-world data.


Introduction:
Oral and maxillofacial malignancies account for a large share of head and neck malignancies across the seven anatomical regions, and GSCC is one of the most common among them, accounting for 10%-25% of all oral and maxillofacial malignancies (1). GSCC is characterized by a high likelihood of metastasis and bone infiltration (2). Despite the continuous development of diagnostic and treatment techniques, GSCC is often detected at an advanced stage with a poor prognosis because it is difficult to distinguish from benign lesions such as periodontitis (3,4).
Once distant metastasis occurs, the 5-year survival rate of GSCC patients is markedly lower than that of patients with early primary disease. Although some researchers have reported a slight improvement in prognosis over the past 20 years, GSCC with distant metastases still carries a poor prognosis (5). Therefore, early diagnosis and active treatment to prevent recurrence and distant metastasis are effective ways to improve the survival of GSCC patients.
Advances in diagnostic modalities, chemotherapy, radiotherapy, targeted therapy, immunotherapy and reconstructive surgery have influenced the contemporary management of GSCC patients, but surgery remains the preferred treatment (6). Most current studies have focused on patients with advanced GSCC, and their results generally suggest that surgery with adjuvant chemoradiotherapy is the main treatment for these patients and has a positive prognostic effect (7,8). However, few studies have examined whether patients with early or non-distant metastatic GSCC should be treated with surgery alone or with surgery combined with adjuvant chemoradiotherapy. Previously, Ashley M. Nassiri et al. discussed the treatment of patients with stage T4aN0M0 GSCC. Their results suggest that, compared with combination therapy, surgery alone can significantly improve 5-year OS in patients with stage T4aN0M0 GSCC while avoiding the morbidity associated with adjuvant therapy and maintaining survival outcomes (9). However, that study has some limitations, such as the small number of cases included and the lack of comprehensive, in-depth analysis. Therefore, our study selected patients with stage M0 GSCC from the SEER database. Our primary aim was to explore the survival benefit of surgery in these patients and to compare the prognostic effect of surgery alone with that of surgery combined with adjuvant chemoradiotherapy. Our second aim was to construct a comprehensive nomogram to provide clinicians with a quantitative tool to evaluate OS in patients with non-distant metastatic (M0) GSCC.

Methods:
The data in this study were drawn from the SEER database. SEER is the authoritative cancer statistics database in the United States and contains basic information on incidence, mortality and disease status for millions of patients with malignant tumors in selected states and counties. All raw data were obtained after we submitted a SEER data access request and signed the data-use agreement, and were downloaded from the SEER website (https://seer.cancer.gov/data/) using SEER*Stat in client-server mode. Because the data are public and de-identified, the analysis was not subject to medical ethics review and did not require patient informed consent. All procedures involving human participants in this study comply with the 1964 Helsinki Declaration and its subsequent amendments or comparable ethical standards.

Study population
In this study, SEER*Stat software version 8.3.9 was used to obtain information on patients with GSCC from the SEER database for retrospective analysis. The detailed procedure for identifying patients with GSCC is described in a flow chart (Figure 1). We used the histology codes "8051/3", "8052/3", "8070/3", "8076/3", "8083/3" and "8084/3" of the International Classification of Diseases for Oncology [...] patients with more than one metastatic tumor; (III) age <18 years; (IV) diagnosis confirmed solely on the basis of clinical findings, without pathological analysis; (V) unknown or untested baseline data and clinical information; (VI) no active treatment of any kind, including surgery, chemotherapy and radiotherapy. The surgical group consisted of patients who underwent surgery for the primary tumor, while the non-surgical group included patients who did not undergo surgery but received chemotherapy and/or radiotherapy. A total of 2440 patients met our inclusion and exclusion criteria and were enrolled in the study.

Collected variables
The data collected comprised clinicopathological, therapeutic and follow-up information, including age, sex, race, year of diagnosis, marital status, site of the primary tumor, tumor size, tumor grade, chemotherapy status, radiotherapy status, surgical status, the American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) stage, T stage, N stage, M stage, survival status, OS time, and GSCC-specific survival (GCSS) time. In this study, we treated age as a categorical variable. The optimal cut-off values calculated with X-tile software (version 3.6.1, Yale University) were 77 years and 86 years. OS was defined as the time from diagnosis to death from any cause or the last follow-up. GCSS was defined as the time from diagnosis to death due to GSCC. AJCC TNM stage, T descriptor and N descriptor were reclassified according to the AJCC Cancer Staging Manual, 6th edition. Treatment strategies comprised surgery, chemotherapy and radiotherapy.

Cox regression analysis and Propensity score analysis
Univariate and multivariate Cox proportional hazards regression analyses were used to compare OS between the surgical and non-surgical groups and to estimate hazard ratios (HRs) with their 95% confidence intervals (CIs). P values, based on likelihood ratio tests, were used to compare patient mortality. Both analyses showed that surgery was an independent factor for overall mortality and for cancer-specific mortality.
Propensity score analysis, introduced in 1983, is a statistical technique used in a wide range of clinical research. It can effectively adjust for the influence of confounding factors in retrospective observational studies (10-12). To reduce potential confounding between the surgical and non-surgical groups, we performed propensity score analysis. First, the probability of each patient undergoing surgery was calculated from the confounding factors identified by Cox regression analysis, including age, sex, marital status, site of the primary tumor, tumor grade, AJCC stage and treatment strategy. We then matched patients in the surgical and non-surgical groups in a 1:1 ratio, with a caliper width of 0.2 standard deviations of the logit of the propensity score. These steps were performed with the "MatchIt" and "table1" packages in R. Baseline characteristics of the matched patients were then assessed to confirm that they did not differ significantly between groups. After PSM, we evaluated the value of primary tumor surgery for survival and prognosis by Cox regression analysis.
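The matching step described above can be illustrated with a minimal sketch. It assumes greedy 1:1 nearest-neighbor matching without replacement on the logit of the propensity score, with the caliper taken as 0.2 standard deviations of the logit pooled over all patients; the patient identifiers and scores are hypothetical, and this is not the exact algorithm or configuration of the "MatchIt" package used in the study.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def match_one_to_one(group_a_ps, group_b_ps, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbor matching on the logit propensity score.

    group_a_ps and group_b_ps map a patient id to an estimated propensity
    score. Returns a list of (id_from_a, id_from_b) pairs.
    """
    a_logit = {pid: logit(p) for pid, p in group_a_ps.items()}
    b_logit = {pid: logit(p) for pid, p in group_b_ps.items()}
    # Assumption: caliper = 0.2 SD of the logit score pooled over both groups.
    pooled = list(a_logit.values()) + list(b_logit.values())
    caliper = caliper_sd * statistics.stdev(pooled)

    available = dict(b_logit)  # candidates not yet matched
    pairs = []
    for a_id, a_val in a_logit.items():
        if not available:
            break
        # Closest remaining candidate on the logit scale.
        b_id = min(available, key=lambda k: abs(available[k] - a_val))
        if abs(available[b_id] - a_val) <= caliper:
            pairs.append((a_id, b_id))
            del available[b_id]  # matching without replacement
    return pairs


# Hypothetical propensity scores for three patients per group.
surgery = {"s1": 0.30, "s2": 0.10, "s3": 0.90}
no_surgery = {"n1": 0.32, "n2": 0.12, "n3": 0.50}
print(match_one_to_one(surgery, no_surgery))
# -> [('s1', 'n1'), ('s2', 'n2')]; s3 has no candidate within the caliper
```

Patients left unmatched, such as "s3" here, are dropped from the matched cohort, which is why the matched sample is smaller than the original one.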

Establishment and Validation of the nomogram
After univariate and multivariate Cox proportional hazards regression analyses, the significant variables were used to construct a visual prediction nomogram. The association between each prognostic factor and OS was then determined, and a new nomogram was developed on this basis.
We appraised the performance of the nomogram in three ways. First, we assessed its predictive capability with the C-index and a calibration plot. For internal validation, we used bootstrapping with 1000 resamples to reduce overfitting bias. The C-index ranges from 0.5 to 1, and a strongly predictive nomogram has a high C-index (13).
Second, we plotted the receiver operating characteristic (ROC) curve and used it to compare the predictive accuracy of the nomogram with that of other factors. The larger the area under the ROC curve, the greater the predictive ability. Third, we used the DCA curve to appraise the clinical efficacy of the nomogram.
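To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is an illustration with hypothetical data, not the "rms" routine used in the study, and it omits the bootstrap optimism correction.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (events[i] == 1). The pair is concordant when
    that patient also has the higher predicted risk. Ties in predicted
    risk count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), event flags (1 = death, 0 = censored)
# and predicted risk scores for four patients.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk = [0.9, 0.4, 0.1, 0.2]
print(concordance_index(times, events, risk))  # -> 0.75
```

In this example four pairs are comparable and three of them are ordered correctly by the risk score, giving 0.75. A value of 0.5 corresponds to random ordering and 1 to perfect ordering.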

Statistical Methods
Continuous variables were converted to categorical variables and expressed as percentages. The χ² test was used to assess differences between groups. Univariate and multivariate Cox models were used to determine the independent factors affecting prognosis. The Kaplan-Meier method was used to estimate survival outcomes. SPSS 22.0 software (version 24.0, SPSS Inc., Chicago, USA) and R (version 3.6.0, R Foundation for Statistical Computing, Vienna, Austria) were used for all statistical analyses. The R packages "foreign", "survival", "survminer", "rms", "MatchIt", "table1", "caret" and "survivalROC" were used. A P value <0.05 was considered to indicate a statistically significant difference between groups.
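As an illustration of how the Kaplan-Meier method yields a median survival time, the sketch below computes the product-limit estimate from follow-up times and event flags. The data are hypothetical, and the function is a simplified stand-in for the "survival" package routines used in the study (no confidence intervals are computed).

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve


def median_survival(curve):
    """First time at which estimated survival falls to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up in months; 1 = death observed, 0 = censored.
times = [2, 4, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
print(median_survival(curve))  # -> 6
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is what distinguishes this estimate from a simple proportion of deaths.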

Results:
Patient characteristics
A total of 2440 patients who met the criteria were enrolled in our study. Table 1 shows the basic information and clinicopathological features of the patients with GSCC. The study patients were mainly white (85.5%) and most had N0 lymph node status (75.6%). Overall, 2190 patients underwent surgery for the primary tumor and 250 patients received radiotherapy and/or chemotherapy without surgical treatment. Figure 2a shows the prognostic influence of primary tumor surgery on OS. The median OS was 85 months (95% CI, 77-93) in the surgery group and 10 months (95% CI, 9-14) in the non-surgery group. The risk of death in the non-surgery group was 3.55 times higher than in the surgery group. The Kaplan-Meier curves showed that patients who underwent surgery for the primary tumor had a higher OS rate than those who did not (P<0.001). Similar results were observed for GCSS.

PSM adjustment for patient characteristics
Univariate and multivariate Cox analysis showed that, in addition to surgery for the primary tumor, age, marital status, tumor grade, T stage and N stage were independent factors for OS. To account for potential bias in the association between surgery and OS caused by imbalance in these factors, propensity score analysis was used to assess whether surgery had a favorable prognostic effect on OS. Before matching, the surgical patients had a PS of 0.080±0.118 compared with 0.330±0.208 in the non-surgery group (P<0.001), suggesting a strong bias in the demographic and clinical characteristics observed in the two groups. After PS matching, the surgical group and the non-surgical group were well matched, with 186 patients in each group. There were no significant differences in clinical and pathological factors between the two groups (Table 1). After PS matching, the PS of the two groups was nearly equal (0.267±0.179 for both groups, P<0.001). Figure 3 shows the distribution of PS in the two groups before and after PS matching. After propensity matching, most parameters were balanced between the two groups of patients, except age and marital status. The multivariate Cox regression model showed that primary tumor surgery remained an independent prognostic factor for OS and GCSS after PSM (Table 2).

Nomogram and model performance
Independent prognostic factors were identified by univariate and multivariate analyses and were included in the nomogram together with the different treatment regimens. The nomogram for predicting 3-year and 5-year OS of GSCC patients without distant metastasis is shown in Figure 4.
Age had the largest risk score range, followed by N stage, T stage and tumor grade, suggesting that these four factors had the greatest impact on prognosis. Bootstrap resampling (1000 replicates) of the 2440 GSCC patients without distant metastasis was used to obtain a relatively unbiased estimate of the model's performance. The calibration plots for 3-year and 5-year OS prediction are shown in Figure 5, which compares the outcomes predicted by the nomogram with the observed outcomes. The C-index of the OS prediction model was 0.708. The AUC of the nomogram was 0.730 for both 3-year and 5-year OS (Figure 6).

DCA is a method proposed by Dr. Andrew Vickers of Memorial Sloan Kettering Cancer Center in 2006 (14). It is a simple way to appraise clinical predictive models and has the advantage of incorporating patient or decision-maker preferences into the analysis (15). It meets the practical needs of clinical decision making and is widely used in clinical analysis. The net benefit curves for 3- and 5-year OS showed that the survival prediction ability of the nomogram was better than that of the strategies of intervention for all or intervention for none. In the comparison of 3- and 5-year OS, the net reduction of the nomogram was also higher than that of other factors (T stage, N stage and tumor grade), indicating that the nomogram is superior in predicting survival in patients with GSCC (Figure 7).
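The net benefit plotted in a decision curve can be sketched as follows. This minimal example assumes a binary outcome with no censoring, which is a simplification: the study evaluates 3- and 5-year OS, where censoring has to be handled. The risks, outcomes and threshold are hypothetical. At a threshold probability pt, net benefit is the true-positive rate minus the false-positive rate weighted by pt / (1 - pt).

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of intervening when predicted risk >= threshold.

    outcome holds 1 for patients who had the event and 0 otherwise.
    """
    n = len(outcome)
    pairs = list(zip(predicted_risk, outcome))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm-to-benefit exchange rate
    return true_pos / n - false_pos / n * weight


def net_benefit_treat_all(outcome, threshold):
    """Net benefit of intervening in every patient regardless of risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and observed outcomes for six patients.
risk = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
outcome = [1, 1, 0, 0, 0, 1]
model_nb = net_benefit(risk, outcome, threshold=0.5)
all_nb = net_benefit_treat_all(outcome, threshold=0.5)
print(round(model_nb, 4), round(all_nb, 4))  # -> 0.1667 0.0
```

The strategy of intervening in no one has a net benefit of zero by definition, so a model is considered useful at a given threshold when its net benefit exceeds both zero and the treat-all value.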

Discussion:
This retrospective study mainly evaluated the influence of surgery on the survival of GSCC patients without distant metastasis. We collected data from the SEER database and used Cox regression analysis and propensity score analysis to appraise this influence. Our results show that surgical treatment can significantly improve the median OS in GSCC patients without distant metastasis. In addition, effective survival prediction for GSCC is of great significance for clinical treatment aimed at improving patients' survival status and prolonging survival time. We established a nomogram with good predictive performance to predict 3-year and 5-year OS, so as to guide the clinical management of GSCC patients.
GSCC is a rare disease that occurs more frequently in the elderly and often presents with clinical symptoms similar to those of inflammatory lesions (16-18). Because squamous cell carcinoma is characterized by rapid cell growth, progressive invasion, metastasis and migration, the overall 5-year survival rate is less than 50% (19). With the continuous progress of medical technology, tumor treatments are evolving rapidly. In addition to surgery and chemoradiotherapy, targeted therapy, immunotherapy, oncolytic virus therapy, bacterial therapy and other approaches have also performed well in tumor therapy (19-24). The National Comprehensive Cancer Network (NCCN) has proposed the following treatment guideline based on the TNM staging of oral cancer: surgery is the main treatment for oral cancer without distant metastasis (M0), especially in the early stage (T1-2, N0). Therefore, we selected GSCC patients with TNM stage M0 as the study population to explore the effects of surgery on survival. However, whether surgery combined with chemoradiotherapy provides a better survival benefit than surgery alone remains controversial (9).
Several retrospective analyses have shown that, compared with non-surgical treatment, surgical treatment significantly prolonged survival in patients with primary gingival cancer and improved survival time after recurrence, which is also supported by our data (25,26).
We analyzed and compared the survival time of 2440 GSCC patients without distant metastasis, and the results showed that OS and GCSS in the surgical group were significantly better than those in the non-surgical group. PSM was first proposed by Rosenbaum and Rubin in the 1980s to address confounding bias in non-randomized studies, so as to achieve an effect similar to randomization (27). To minimize the influence of confounding factors on the results and enhance the reliability of the data analysis (28), we matched the propensity scores of the two groups of patients using 1:1 matching to balance the covariates between groups. In total, 372 patients were successfully matched (186 patients in the surgery group and 186 patients in the non-surgery group). After PSM, the baseline data of the two groups were well balanced and the validity of the analysis was improved. In the multivariate Cox regression analysis after matching, the HR still showed a benefit for the surgery group. This finding further supports the benefit of surgery for GSCC patients without distant metastasis.
In addition, our clinical predictive model indicated that patients undergoing surgery combined with radiotherapy and/or chemotherapy and patients without any treatment had higher risk scores than patients who underwent surgery alone, which could be related to the adverse reactions of chemoradiotherapy. In a large retrospective analysis, patients who received chemoradiotherapy had a much higher rate of severe complications and decreased oral function than those who received surgery alone. A multicenter clinical study showed that the incidence of grade III or IV hematological toxicity was as high as 90% when docetaxel combined with nedaplatin was used to treat oral squamous cell carcinoma (OSCC). In another phase III clinical trial, the incidence of oral mucositis caused by radiotherapy was more than 80% (29-31). Radiotherapy for OSCC can cause acute and long-term complications, such as radiation necrosis, subcutaneous fibrosis and thyroid insufficiency. The cytotoxicity of chemotherapeutic drugs damages normal cells; platinum compounds, for example, can cause serious liver and kidney dysfunction, myelosuppression and gastrointestinal side effects. These side effects affect the prognosis of patients and become important factors limiting the survival of OSCC patients (30-33). The above results support surgery, as one of the most clinically beneficial treatments, as the first-line treatment of GSCC without distant metastasis.
The nomogram has become an important tool in modern medical decision making for predicting disease risk or survival outcomes. This type of model incorporates a variety of individual risk factors and quantifies the impact of each variable to visualize outcomes for different patients (34). To obtain reliable results, we extracted data from the SEER database for 2440 patients with GSCC diagnosed from 2004 to 2015. The independent factors influencing GSCC survival were identified by univariate and multivariate Cox analysis, and the nomogram prediction model was built. The traditional C-index and calibration plot were used to evaluate the nomogram. The C-index of the nomogram was 0.708, significantly superior to that of the AJCC staging system. The calibration plots also approached the 45-degree line, which indicates that the probability predicted by the model is close to the actual probability. AUC is defined as the area under the ROC curve and is often used as an indicator of model performance. Both the 3-year and 5-year AUC of the nomogram were 0.73, showing that the model performs well. In addition, to demonstrate the clinical applicability of the nomogram, we evaluated it with DCA, and the results showed that the prediction model has promising clinical application value. Compared with other existing prediction models, our nomogram can predict the prognosis of patients more intuitively based on the scores of the influencing factors, and the factors we included are more detailed and comprehensive (18).
Our research has certain limitations. First, because this study was a retrospective analysis, unobserved bias could not be controlled even though some confounding bias was reduced by PSM, and some important case data may have been screened out. Therefore, more randomized controlled trials are needed to validate the results of this study. Second, many other factors affect GSCC, such as diet, smoking and alcohol, but because the SEER database does not contain these factors, we could not include them in the study. Establishing prospective studies involving more factors is therefore the direction of our future efforts. Third, all the data in our study are from SEER datasets, and there may be overfitting bias in the nomogram. However, our study also has certain advantages. We used PSM to balance part of the confounding bias in the data. Our nomogram model is based on a large population of subjects, and both traditional and modern methods were used to verify the data internally and externally. It is a practical clinical model with high accuracy and strong clinical applicability.
In conclusion, surgery can significantly prolong the survival time of GSCC patients without distant metastasis. Compared with T stage, N stage and tumor grade, the nomogram model for GSCC without distant metastasis established from the SEER database can more effectively help clinicians predict the 3-year and 5-year survival rates of patients, and can support the selection of treatment options for patients with GSCC.

Table 1 Comparison of baseline features between surgical and non-surgical patients with M0 stage primary GSCC in the original and matched datasets.