Prediction of Lymph Node Metastasis in Penile Cancer: Evaluation of Clinicopathological Factors, Validation of an Existing Model, and Development of a Novel Nomogram

Objective: To investigate the predictive factors of lymph node metastasis (LNM) and evaluate the usefulness of prediction nomograms. This study analyzed data from 300 patients diagnosed with penile squamous cell carcinoma at West China Hospital (WCH) and 412 cases acquired from the Surveillance, Epidemiology, and End Results (SEER) program. Logistic regression analysis was performed on these cohorts to investigate the predictive factors of LNM. We evaluated a recently developed prediction nomogram for LNM, which was established based on the National Cancer Database (NCDB). Moreover, we developed a novel nomogram using cases from the WCH for the prediction of lymphatic metastasis.


Introduction
Penile squamous cell carcinoma (PSCC) is a relatively rare genitourinary tumor, with an overall incidence of < 1 in 100,000 males in the USA and Europe (1,2). However, the incidence is markedly higher and increasing in developing countries (3). Metastasis of penile cancer to the inguinal lymph nodes, the most common metastatic site for this type of malignancy, is associated with a poor prognosis (3,4).
While inguinal lymph node dissection (ILND) can assist in tumor grading and reduce the risk of mortality, this technique is also associated with a high incidence of complications (70%) (5,6). Hence, it is important to identify patients who will benefit from ILND and to avoid unnecessary surgery.
Previous studies demonstrated that the development of lymph node metastasis (LNM) depends on several clinicopathological factors, such as tumor T stage, nuclear grade, and lymphovascular invasion (LVI) (7,8). Models combining these factors could help in the accurate prediction of lymphatic metastasis (9-11). Peak et al. established an LNM prediction nomogram based on clinicopathological features (nuclear grade, cN stage, and LVI) of patients recorded in the National Cancer Database (NCDB); this nomogram exhibited high discrimination in its internal validation (11). In the present study, we analyzed the predictive value of several clinicopathological factors in patient cohorts from West China Hospital (WCH) of Sichuan University (Chengdu, China) and the Surveillance, Epidemiology, and End Results (SEER) program database, evaluated the clinical usefulness of the NCDB nomogram, and subsequently developed a novel nomogram using data from the PSCC cohort of our institute.

Patient selection
This study included patients who were diagnosed with PSCC and underwent complete excision of the lesion through partial or radical penectomy in the Department of Urology at WCH of Sichuan University between September 2008 and October 2020. The exclusion criteria were: 1) presence of unresectable disease or cN3 disease; 2) Eastern Cooperative Oncology Group score > 1; and 3) unwillingness of patients to provide information regarding their disease. Patients provided written informed consent prior to the collection of data. Finally, 300 patients with PSCC were included in this study.

Clinical and pathological features
Patient clinical data (e.g., age at diagnosis, smoking history, duration of disease, tumor growth velocity, clinical lymph node stage) were retrieved from the medical records of our hospital. All pathological reports were provided by the Pathology Department of our institute and included histopathological type, pathological T stage, tumor size, nuclear grade, and LVI. Pathological T stage was adjusted according to the 7th and 8th editions of the Union for International Cancer Control (UICC) TNM classification system, and the largest diameter of the tumor was recorded to determine its size. Clinical N stage was recorded at the first outpatient visit, which was 1 month after primary resection.
Follow-up ILND was recommended for patients with pT1G2 or higher-stage disease, and for those with palpable inguinal lymph nodes at the first postoperative outpatient visit. Patients were followed up through clinical examination once every 3 months during the first year and every 6 months thereafter. Ultrasonography of the groin was performed every 6 months for the first 2 years after surgery. Metastatic inguinal lymph nodes were confirmed by surgical resection or biopsy during follow-up.

SEER data resource and cohort selection
All cases of patients with PSCC for whom data were available in the SEER database were examined (SEER Research Data, 18 registries, Nov 2019 Sub; n=5222). We included those who had complete records in terms of cN stage, tumor nuclear grade, LVI, regional lymph node examination status, and survival time and status; those with cN3 disease were excluded. In addition to the above-mentioned factors, other clinicopathological data (e.g., age, race, tumor location, T stage, tumor size, and ILND history) were collected. Ultimately, 412 SEER cases were included in our study.

Statistical methods
Univariate and multivariate logistic regression analyses were performed to determine the clinicopathological parameters associated with LNM. Factors that were statistically significant in the univariate analysis were included in the multivariate analysis, and independent predictors of LNM were selected to generate a novel nomogram. Bootstrapping was used to calculate the corrected c-index, and a calibration curve was created. Moreover, patients (WCH and SEER cohorts) were scored using the NCDB nomogram. Receiver operating characteristic curve analysis was used to evaluate the predictive value of the different clinicopathological factors, the NCDB nomogram, and the newly established WCH nomogram. Statistical analyses were performed using SPSS Statistics 25 (IBM Corporation, Armonk, NY, USA) and R 4.0.4 (R Foundation for Statistical Computing, Vienna, Austria). P < 0.05 denoted statistically significant differences.

Results
Table 1 presents the clinicopathological data of the 300 patients with PSCC in the WCH cohort. The mean age was 54.2 years (standard deviation: 13.9 years), and the median follow-up time was 35.1 months (interquartile range: 15.0-86.6 months). Inguinal lymph node (ILN) metastasis occurred in 93 patients during follow-up. The 2-year cancer-specific survival rates for the lymph node-positive and lymph node-negative groups were 53.8% and 98.9%, respectively. The clinicopathological data of the 412 SEER cases are also shown in Table 1. The median follow-up time was 8.5 months (interquartile range: 4.0-15.0 months). In this cohort, LNM occurred in 6 patients during follow-up. The 2-year cancer-specific survival rates for the lymph node-positive and lymph node-negative groups were 37.5% and 83.7%, respectively.
The results of the univariate analysis are shown in Table 2. The multivariate analysis for the WCH cohort demonstrated that all factors carried forward from the univariate analysis, except the location of the lesion, were independent predictors of LNM (P < 0.05, Table 3). The predictive performance of these independent predictors for LNM was evaluated (Fig. 1A, B).
In the WCH cases, the highest areas under the curve (AUCs) were observed for cN stage and nuclear grade (0.754 and 0.722, respectively); the AUCs for all other factors were lower than 0.70, and the 8th edition T stage showed a better predictive effect than the 7th edition (AUC: 0.672 vs 0.636). In the SEER cases, factors with complete records (grade, LVI, and cN stage) were evaluated, and only cN stage had an AUC higher than 0.70. In addition, external validation of the NCDB nomogram (11) was performed using both the SEER and WCH cohorts: for these two patient cohorts, the AUC was 0.833 and 0.795, respectively (Fig. 1C, D).
To better predict LNM, a new nomogram was established using clinicopathological data from the WCH cases (Fig. 2). All factors included in the model were the independent predictors of LNM identified in the multivariate analysis. The bootstrap-corrected c-index of the model was 0.876, which was similar to the AUC (Fig. 1E). Figure 3 illustrates the consistency between the predicted risk and the observed incidence.
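As a minimal sketch of how a bootstrap-corrected c-index of this kind can be obtained, the Python code below applies an optimism correction to a small logistic model. For a binary outcome such as LNM, the c-index equals the AUC. The optimism-correction scheme is one common approach and is assumed here, since the exact procedure is not detailed in the text. The simulated data, the coefficients used to generate them, the gradient-descent fitting routine, and all function names are illustrative assumptions; the sketch does not reproduce the WCH model or its results.

```python
import math
import random

def auc(scores, labels):
    """c-index for a binary outcome: share of (positive, negative) pairs
    ranked correctly, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, steps=200, lr=0.5):
    """Plain gradient-descent logistic regression; returns [intercept, b1, ...]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def linear_predictor(w, X):
    return [w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) for xi in X]

def optimism_corrected_auc(X, y, n_boot=30, seed=0):
    """Apparent AUC minus the mean optimism estimated over bootstrap resamples."""
    rng = random.Random(seed)
    apparent = auc(linear_predictor(fit_logistic(X, y), X), y)
    optimism = []
    n = len(y)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample lacks one outcome class; skip it
        wb = fit_logistic(Xb, yb)
        # Optimism: performance on the resample minus performance on the original data
        optimism.append(auc(linear_predictor(wb, Xb), yb) - auc(linear_predictor(wb, X), y))
    if not optimism:
        return apparent, apparent
    return apparent, apparent - sum(optimism) / len(optimism)

# Simulated cohort (hypothetical): ordinal cN code, LVI flag, and a pure-noise covariate.
rng = random.Random(42)
X, y = [], []
for _ in range(80):
    cn, lvi, noise = rng.choice([0, 1, 2]), rng.choice([0, 1]), rng.gauss(0.0, 1.0)
    X.append([cn, lvi, noise])
    y.append(1 if rng.random() < sigmoid(-1.5 + 1.0 * cn + 1.2 * lvi) else 0)

apparent, corrected = optimism_corrected_auc(X, y)
print(f"apparent c-index {apparent:.3f}, optimism-corrected {corrected:.3f}")
```

A corrected value close to the apparent value, as reported for the WCH nomogram, suggests that little of the apparent discrimination is attributable to overfitting.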

Discussion
Metastasis of PSCC to the inguinal lymph nodes is linked to poor prognosis (12). Lymph node dissection is the most important approach for the prevention and treatment of LNM. However, surgeons must consider the balance between survival benefit and the high rate of complications (3). The use of dynamic sentinel node biopsy has been advocated to avoid unnecessary ILND, though the risk of false-negative results remains (13). Therefore, accurate prediction of LNM is necessary. This study demonstrated that age at diagnosis, cN stage, and pathological data (T stage, nuclear grade, and LVI) were independent predictors of LNM. The new nomogram established based on these factors showed good discrimination. Thus, we believe that it may be helpful in decision-making regarding ILND.
Previous studies identified several factors associated with LNM of PSCC, such as tumor size, nuclear grade, and LVI, as well as invasion of the corpus cavernosum, corpus spongiosum, urethra, and nerves (7,8,14). The UICC pathological T stage combines those factors that describe the growth and invasion of primary tumors, and was shown to be significantly correlated with LNM (9,10,12). The 8th edition of the UICC TNM classification includes the following changes in the definitions of T1, T2, and T3: T1 was stratified into two different groups depending on LVI; T2 denotes invasion of the corpus spongiosum; and T3 indicates invasion of the corpus cavernosum (15). In the present study, cN stage had the highest AUC in both the WCH and SEER cohorts, which indicates that postoperative examination of the groin area should not be neglected. Moreover, the 8th edition T stage showed a higher AUC than the 7th edition, which supports the use of the 8th edition T stage when predicting LNM.
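The AUC used for these comparisons has a simple rank interpretation: it is the probability that a randomly chosen patient with LNM has a higher stage code than a randomly chosen patient without LNM, with ties counted as one half. The short Python sketch below illustrates this for two ordinal predictors. The stage codes and outcomes are hypothetical toy values chosen for illustration, not WCH or SEER data, and the function name is an assumption.

```python
def auc_from_scores(scores, labels):
    """AUC as the share of (LNM-positive, LNM-negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: two ordinal staging variables coded 0..2
# and LNM status (1 = metastasis) for the same eight patients.
stage_a = [0, 0, 1, 1, 2, 2, 0, 1]
stage_b = [0, 1, 1, 0, 1, 2, 0, 1]
lnm     = [0, 0, 0, 1, 1, 1, 0, 1]

auc_a = auc_from_scores(stage_a, lnm)
auc_b = auc_from_scores(stage_b, lnm)
print(f"AUC stage_a = {auc_a:.4f}, AUC stage_b = {auc_b:.4f}")
```

Because staging variables take only a few levels, many patient pairs are tied, which is one reason a single coarse factor tends to discriminate less well than a model combining several factors.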
This study also demonstrated that younger patients are at a higher risk of developing LNM. This relationship has rarely been reported in the past. Geise et al. retrospectively reviewed 378 patients with PSCC and found that younger patients had a higher frequency of morphological features (16). In their study, the frequency of LNM was 49%, 34%, and 21% for patients aged < 40 years, 40-60 years, and > 60 years, respectively. However, Zhu et al. and Peak et al. did not report age-related differences with regard to LNM (10,11). In fact, age at onset is a factor that has long been neglected (17).
In recent years, a more simplified nomogram was established based on 1,636 patients from the NCDB (11). This model included grade, cN stage, and LVI, and its internal validation produced an AUC of 0.880. Our present study showed relatively high accuracy on external validation in both the WCH and SEER cases, which reflects the role of grade, cN stage, and LVI in the prediction of LNM. After combining these factors with the latest T stage and age at diagnosis, we obtained a high c-index of 0.876. It is anticipated that this novel model may have greater application value for Asian populations, to which our patient series belongs.
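To illustrate how a nomogram of this type turns categorical factors such as grade, cN stage, and LVI into a predicted risk, the Python sketch below rescales logistic regression coefficients to a 0-100 point axis and maps the total points back to a probability. All coefficients and the intercept are hypothetical values chosen for illustration; they are not the fitted NCDB or WCH models. The sketch also assumes that each reference level has a coefficient of zero and that all other coefficients are non-negative.

```python
import math

# Hypothetical log-odds coefficients (NOT the published NCDB or WCH estimates).
INTERCEPT = -3.0
COEFS = {
    "cN_stage": {"cN0": 0.0, "cN1": 1.2, "cN2": 2.0},
    "grade": {"G1": 0.0, "G2": 0.8, "G3": 1.5},
    "LVI": {"absent": 0.0, "present": 1.0},
}

# The largest single effect is assigned 100 points; other effects scale linearly.
MAX_EFFECT = max(max(levels.values()) for levels in COEFS.values())
POINTS_PER_LOGIT = 100.0 / MAX_EFFECT

def total_points(patient):
    """Sum of nomogram points over all factors for one patient."""
    return sum(COEFS[f][patient[f]] * POINTS_PER_LOGIT for f in COEFS)

def predicted_risk(patient):
    """Convert total points back to the linear predictor, then to a probability."""
    logit = INTERCEPT + total_points(patient) / POINTS_PER_LOGIT
    return 1.0 / (1.0 + math.exp(-logit))

low_risk = {"cN_stage": "cN0", "grade": "G1", "LVI": "absent"}
high_risk = {"cN_stage": "cN2", "grade": "G3", "LVI": "present"}

for name, patient in (("low", low_risk), ("high", high_risk)):
    print(f"{name}: {total_points(patient):.0f} points, risk {predicted_risk(patient):.3f}")
```

The point scale is only a presentation device: the ranking of patients, and therefore the c-index, is identical whether points or the underlying linear predictor is used.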
There were some limitations in this study. First, the population included in our nomogram was derived from a single source. Although some systematic errors (such as those related to religion, race, and medical-care conditions) could be reduced, the usefulness of this model in other geographic regions and in populations of other racial backgrounds could not be evaluated. Furthermore, we did not analyze molecular targets that may be associated with LNM. The present model incorporated only the most important and routine clinicopathological factors; thus, we hope that this approach may facilitate clinical practice and further evaluation of the model.

Conclusions
For patients with PSCC, age at diagnosis, pathological T stage, nuclear grade, LVI, and cN stage were independent predictors of LNM. The UICC 8th edition T stage has better predictive value for LNM than the 7th edition. The NCDB nomogram has acceptable predictive value in the WCH and SEER series. In this study, a novel prediction nomogram for LNM was generated based on WCH cases. This model incorporates the aforementioned independent predictors and shows good predictive power.

Declarations
Ethics approval and consent to participate
This retrospective study was approved by the Ethics Committee of West China Hospital, Sichuan University, with the whole process supervised. Patients and their authorized family members were fully informed before follow-up work was performed, and informed consent was signed.

Consent for publication
Consent for publication was obtained from all participants.

Availability of data and materials
The dataset supporting the conclusions of this article is included within the supplementary materials.

Competing interests
None declared.

Funding
This study was supported by the National Natural Science Foundation of China (Reference Number: 81672552), the Science and Technology Foundation of Sichuan Province (2017JY0226) and the 1.