Urine Cellular DNA Point Mutation and Methylation as Potential Biomarkers for the Detection of Urothelial Carcinoma

Previously, our team identified a seven-gene mutation panel in urine sediment to discriminate UBC from benign urological diseases. In the present study, we aimed to validate the panel in an expanded hematuria cohort that is close to the natural population. We also tried to optimize the panel by incorporating methylation biomarkers, and we performed external validation to investigate the robustness and stability of the novel panel.
Abbreviations: UBC: urothelial bladder carcinoma, NMIBC: non-muscle invasive bladder cancer, MIBC: muscle invasive bladder cancer, CT: computed tomography, UTUC: upper tract urothelial carcinoma, PPV: positive predictive value, NPV: negative predictive value, IQR: interquartile range, KIRC: Kidney renal clear cell carcinoma, PRAC: Prostate carcinoma, CIS: carcinoma in situ

Urothelial bladder carcinoma (UBC) is the most common malignancy of the urinary tract, with an estimated 550,000 new cases and 200,000 deaths per year worldwide 1. The majority of newly diagnosed cases are non-muscle invasive bladder cancer (NMIBC). Nearly 70% of these patients will experience recurrence, and 10-30% will progress to muscle-invasive bladder cancer (MIBC) 2,3.
Typical diagnosis and surveillance of UBC involve cystoscopy, cytology, FISH and computed tomography (CT) 4,5. Cystoscopy is regarded as the gold standard for the detection of UBC; it exhibits relatively high clinical sensitivity but low patient acceptance owing to its invasive nature 6. In contrast, urine cytology and FISH are noninvasive and specific but lack sensitivity, especially for low-grade tumors. CT is a useful tool but carries a risk of radiation damage. These facts, together with the high cost and follow-up biopsy procedures, have led to many attempts to develop alternative noninvasive methods to detect UBC.
Given the anatomical characteristics of the urinary tract, noninvasive strategies to identify UBC mainly include urine-based genetic, epigenetic and protein assays. The FDA has approved several tests for UBC diagnosis and surveillance, including NMP22 and UroVysion, with sensitivity ranging from 30% to 100% and specificity from 55% to 98% 7,8. However, owing to inconsistent assay performance, the technical expertise required and high cost, such assays have not yet been integrated into routine clinical practice. In addition, none of these assays has been validated for the detection of upper tract urothelial carcinoma (UTUC), which accounts for 5% of urothelial carcinoma (UC).
Previously, we identified a seven-gene mutation panel in urine sediment to discriminate UBC from benign urological diseases in hematuria patients 9. The seven genes cover 33 different point mutations. The model was $X = -0.6685 + 416.2208 \times \mathrm{TERT} + 16.3065 \times \mathrm{FGFR3} + 21.6375 \times \mathrm{TP53} + 1030.8943 \times \mathrm{HRAS} + 269.6423 \times \mathrm{KRAS} + (-6.6597) \times \mathrm{PIK3CA} + 1365.2377 \times \mathrm{ERBB2}$. The value $X$ was substituted into the sigmoid function $f(X) = 1/(1 + e^{-X})$ to obtain a fit value, $f(X)$. Malignancy is considered when $f(X) > 0.4491049$; otherwise the sample is classified as benign. In the present study, we aim to validate the panel in an expanded cohort that is close to the natural population. In addition, we try to optimize the panel by incorporating methylation biomarkers.
We also performed external validation to investigate the utility and stability of the novel panel.
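The scoring rule of the previous seven-gene panel can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the coefficients, intercept and cut-off are taken from the text, but the assumption that each gene enters the model as a mutation frequency expressed as a fraction is ours, and the function names `panel_score` and `classify` are hypothetical.

```python
import math

# Coefficients, intercept and cut-off as reported for the seven-gene panel.
INTERCEPT = -0.6685
COEFFICIENTS = {
    "TERT": 416.2208,
    "FGFR3": 16.3065,
    "TP53": 21.6375,
    "HRAS": 1030.8943,
    "KRAS": 269.6423,
    "PIK3CA": -6.6597,
    "ERBB2": 1365.2377,
}
CUTOFF = 0.4491049


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def panel_score(gene_values):
    """Return f(X) for a dict of per-gene values.

    Assumption: values are mutation frequencies as fractions (0.01 = 1%).
    Genes missing from the dict are treated as 0.
    """
    x = INTERCEPT + sum(
        coef * gene_values.get(gene, 0.0) for gene, coef in COEFFICIENTS.items()
    )
    return sigmoid(x)


def classify(gene_values):
    """Apply the reported cut-off: f(X) > 0.4491049 is called malignant."""
    return "malignant" if panel_score(gene_values) > CUTOFF else "benign"
```

Under these assumptions, a sample with no detectable mutation scores below the cut-off, while a TERT value of 0.01 alone is enough to push the score above it, which reflects the large TERT coefficient.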

Materials And Methods
Patients' characteristics and ethics statement
Participants were prospectively recruited (Chinese Clinical Trial Registry, ChiCTR2000029980) as approved by the Ethics Committees of Xiangya Hospital (XYH), The Second Xiangya Hospital (SXYH), Hunan Provincial People's Hospital (HPPH), Hunan Cancer Hospital (HCH) and Beijing Hospital (BJH) after written informed consent was obtained. Studies were conducted in accordance with the ethical principles of the Declaration of Helsinki.

Study design
In the prospective, multicenter expanded cohort, a total of 385 individuals with macroscopic or microscopic hematuria were initially enrolled from XYH (n=143), SXYH (n=87), HPPH (n=82) and HCH (n=73) (Hunan, China) between August 2019 and January 2020, of whom 333 were eligible for inclusion (Figure 1). The previous seven-gene panel was validated in this cohort to explore the possibility of discriminating UC from other urological diseases.
In the panel optimization stage, we identified several UBC-specific methylation biomarkers by comprehensive analyses of a series of TCGA and GEO datasets and an independent cohort from the Hunan multicenter study. Candidate methylation biomarkers were examined in the 333 participants. We selected important predictor features using the Boruta feature selection algorithm, based on the analysis of DNA mutations and methylation, and constructed a novel panel using the Random Forest algorithm.
In the external validation stage, 99 participants with hematuria were recruited from BJH (Beijing, China) from May 2020 to August 2020, of whom 89 were eligible to evaluate the stability and reproducibility of the optimal panel.

Sample collection and DNA Isolation
For all participants, each urine sample (at least 30 mL) was collected from the first morning void. The urine samples were centrifuged at 1,600 g for 10 min at 4 °C, the supernatant was discarded and the pellet was carefully transferred into new, empty 2 mL tubes. The same procedure was performed again at 12,000 g for 10 min at 25 °C. Then 200 μL of 1× PBS was added to each tube to resuspend the cells. DNA isolation was performed using the Tissue Genomic DNA Extraction Kit (cat DP304, Tiangen Biotechnology, China) according to the manufacturer's instructions. The modified DNA was stored at -80 °C for further processing.
Library Preparation and Sequencing (Mutation)
For each sample, 50 ng of genomic DNA was fragmented, repaired and tailed using the TIANSeq Fragment/Repair/Tailing Module (Tiangen, Cat: NG301), and then ligated to forward oligos carrying a UMI. After two rounds of purification with 1.2× AMPure XP beads (Beckman), the ligated product was PCR-amplified using specific backward primers and universal primers. After another round of purification using 1× AMPure beads, the final library pool was quantified on an ABI 7500 Fast Real-Time PCR system (Applied Biosystems) and sequenced on a NextSeq 500 system (Illumina, USA) to obtain paired-end 150 bp reads.
All reads were quality trimmed and adapter sequences were removed. The index sequences and UMI were appended to the read identifier for downstream analysis. Sequence reads were mapped to the human genome (hg19) using the Burrows-Wheeler aligner (BWA-MEM). Reads sharing the same UMI were clustered to obtain the final consensus sequence. Variant calling was performed using the Genome Analysis Toolkit (GATK v3.8) and the resulting variants were annotated using ANNOVAR.
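The idea of collapsing reads that share a UMI into one consensus sequence can be illustrated with a simplified Python sketch. It is not the pipeline used in the study: it assumes reads are given as hypothetical `(umi, sequence)` pairs, that reads within a family are already aligned to the same start position, and it uses a plain per-position majority vote. Production tools typically also group by mapping position and weigh base qualities.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Collapse reads sharing a UMI into one consensus sequence per UMI.

    reads: iterable of (umi, sequence) pairs.
    Assumption: sequences in a family start at the same position; the
    consensus is truncated to the shortest read in the family.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        length = min(len(s) for s in seqs)
        bases = []
        for i in range(length):
            # Majority vote at position i across the family.
            counts = Counter(s[i] for s in seqs)
            bases.append(counts.most_common(1)[0][0])
        consensus[umi] = "".join(bases)
    return consensus
```

A single discordant base in one read of a three-read family is outvoted, which is how UMI clustering suppresses PCR and sequencing errors before variant calling.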
Methylation-specific PCR (MS-PCR)
Sodium bisulfite conversion and purification of 100 ng genomic DNA were performed using the EZ DNA Methylation-Lightning Kit (Zymo Research Corporation, Irvine, California, USA), according to the manufacturer's protocol. GAPDH was set as the internal reference. Ct values represented the relative methylation quantity of the CpG markers and the internal reference gene (GAPDH), which were measured by FAM and VIC signals, respectively. The delta Ct (Δct) values were calculated as the methylation score.

Statistical analysis
Model performance was evaluated by the area under the curve (AUC). The sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) of the panel and of cytology in detecting UC were obtained by comparison to pathology and presented as univariate values in bar graphs. The percentage of cases in each variant subgroup, stratified by clinical characteristics, was also presented as univariate values in bar graphs. The Δct distributions were presented as boxplots with the median and interquartile range (IQR) marked. Random forest analysis was applied to highlight the most powerful combination of mutation and methylation biomarkers for distinguishing UC from controls. The chi-square test was used for categorical variables, the t-test for continuous variables and the Mann-Whitney U test for non-normally distributed variables. All statistical analyses and data visualizations were carried out in R (version 3.4.3) and GraphPad Prism 8 (version 8.0.2). Adobe Illustrator (CC 2017) was used for image processing. All hypothesis tests were two-sided, with a p value < 0.05 considered statistically significant.
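The performance measures listed above follow from a 2×2 comparison of the test result against pathology. The Python sketch below shows the standard definitions; the function name `diagnostic_metrics` and the counts in the usage note are illustrative and are not data from this study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic measures from confusion-matrix counts.

    tp/fn: pathology-confirmed UC cases called positive/negative.
    tn/fp: controls called negative/positive.
    """

    def ratio(num, den):
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

For example, with hypothetical counts of 80 true positives, 10 false positives, 90 true negatives and 20 false negatives, the function returns a sensitivity of 0.80, a specificity of 0.90 and an accuracy of 0.85.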

Baseline characteristics and flow chart
The baseline characteristics of the cohort are shown in Table 1. The flow chart is summarized in Figure 1. Figure 1 A~D). Through differential methylation analysis and a series of statistical filters to reduce the number of markers, we finally identified the 9 most powerful markers: cg13974773, cg16966315, cg17945976, cg21472506, cg23229261, cg24720571, cg25510609, cg25947619 and cg27404023 (Table S1).

Verification of putative methylation biomarkers
We then recruited an independent set of 71 UC patients (38 UBC and 33 UTUC) and 70 controls (31 benign controls and 39 malignant controls) to verify the methylation status of the 9 biomarkers using MS-PCR (Table 2, Figure 3 and Supplementary Figures 2 and 3). Based on a comprehensive analysis of AUC, cut-off value, sensitivity and specificity, cg16966315, cg17945976 and cg24720571 were selected for optimization of the panel. The cut-off values were set at 4.50, 9.19 and 9.73 for cg16966315, cg17945976 and cg24720571, respectively. The 3 biomarkers were then examined in the Hunan multicenter cohort using MS-PCR. Any Δct equal to or below the determined cut-off value was considered positive (+); otherwise the result was considered negative (-).
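The positive/negative call for the three selected CpG markers can be sketched as follows. The cut-off values come from the text; the definition of Δct as the marker Ct minus the GAPDH Ct is a common convention that we assume here because the text does not spell it out, and the function name `methylation_call` is hypothetical.

```python
# Cut-off values reported for the three selected CpG markers.
DELTA_CT_CUTOFFS = {
    "cg16966315": 4.50,
    "cg17945976": 9.19,
    "cg24720571": 9.73,
}


def methylation_call(marker, ct_marker, ct_gapdh):
    """Return '+' if the marker's delta Ct is at or below its cut-off, else '-'.

    Assumption: delta Ct = Ct(marker) - Ct(GAPDH), so a lower delta Ct
    means relatively more methylated template.
    """
    delta_ct = ct_marker - ct_gapdh
    return "+" if delta_ct <= DELTA_CT_CUTOFFS[marker] else "-"
```

With a marker Ct of 30.0 and a GAPDH Ct of 22.0 (Δct = 8.0), the same sample would be called positive for cg24720571 but negative for cg16966315, because each marker has its own threshold.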

Construction of the novel panel
A mutation frequency > 0.5% was set as the abnormal cut-off value, and the status of each of the 33 point mutations was converted to '+' or '-'. The Boruta feature selection algorithm was used to rank the 33 point mutation biomarkers and 3 methylation biomarkers by importance and to highlight the most powerful biomarker combinations for distinguishing UC from controls.
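The conversion of mutation frequencies into '+' or '-' status before feature selection can be illustrated with a short Python sketch. The 0.5% threshold is taken from the text; the assumption that frequencies are stored as fractions, the function name `mutation_status` and the example values are ours.

```python
# Threshold from the text: a frequency above 0.5% is called abnormal.
MUTATION_CUTOFF = 0.005


def mutation_status(frequencies):
    """Map each point mutation to '+' (frequency > 0.5%) or '-'.

    frequencies: dict of site name -> mutation frequency as a fraction.
    """
    return {
        site: "+" if freq > MUTATION_CUTOFF else "-"
        for site, freq in frequencies.items()
    }
```

Because the comparison is strict, a site at exactly 0.5% is reported as '-'. The resulting '+'/'-' table, together with the three methylation calls, is the kind of input a feature-ranking step such as Boruta would consume.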

Integrated analysis of two cohorts
When data from the two cohorts were combined (422 participants in total: 236 UC participants and 186 controls), the novel panel showed an overall sensitivity of 0.88 and specificity of 0.86 (Figure 6 A&B).
In subgroup analysis, the sensitivity of the optimal panel reached 0.91 for UBC and 0.74 for UTUC, a significant difference. In addition, the specificity of the panel was 0.89 for benign controls and 0.81 for malignant controls. To better understand how each biomarker contributes to the panel, we calculated the percentage of cases in each variant subgroup. A significantly lower frequency of TERT 250(G_A) and cg24720571 was observed in NMIBC than in MIBC and UTUC (Figure 6 C&D).
In further analysis of sensitivity and specificity across clinical variables, no obvious difference was observed by gender or smoking status. Likewise, no significant difference in gene mutation frequency or degree of methylation was found by gender or smoking history, except for TERT 228 (Figure S4).

Novel panel and cytology comparison
In the present study, urine cytology was available for only 210 UC patients and 119 controls (Table S2).

Discussion
Compared with the previous mutation panel, the focus shifted from genes to point biomarkers. The number of point mutations was reduced from 33 to 8, which significantly improved detection efficiency. The novel panel was also more precise and specific. In addition, the cohort was expanded and is close to the natural hematuria population. This suggests that the novel panel could be applied to screening of the hematuria population in the future.
Genetic mutations are often the subject of investigation and play basic roles in the malignant transformation of urothelial cells 10. However, not all UC harbor mutations in the most commonly altered oncogenes. The mutation panel produced sensitivities of only 0.70 and 0.67 in the previous validation group and the present expanded cohort, respectively. Abnormal DNA methylation status is also an important mark in the development of UC, and could be the first detectable neoplastic change associated with tumorigenesis 5,11. The novel panel, consisting of 8 point mutations and 1 methylation biomarker, showed a significant improvement in sensitivity. Epigenetic and genetic biomarkers can therefore complement and reinforce each other, resulting in a more stable diagnostic performance 12-14.
Cytology is highly specific, and in expert hands a positive result nearly always indicates the presence of urothelial malignancy. It is noninvasive, inexpensive, simple, and valuable for high-grade and flat lesions 15,16. However, cytology is not particularly sensitive, especially for low-grade and early-stage tumors. In the present study, cytology achieved sensitivities of only 0.21 and 0.39 in low-grade and < T2 tumors, respectively. The novel panel significantly outperformed cytology in nearly all aspects and exhibited comparable specificity. Moreover, the novel panel correctly identified 53 cases of low-grade UC, none of which were detected by cytology. This suggests that the novel panel might replace cytology for early detection of UC.
UTUC is an uncommon disease, accounting for only 5%~10% of UC 17. Currently, most UC biomarkers focus on UBC, and UTUC-associated biomarkers are relatively rare. Non-invasive and sensitive methods to
Neuritin 1 (NRN1), also named cpg15-1, is a GPI-anchored protein mainly involved in neuronal plasticity 19. Neuritin 1 is associated with mental illness, such as schizophrenia, bipolar disorder and depression 20,21. Recently, aberrant methylation of the NRN1 gene promoter region has been associated with tumor development, for example in gastric cancer and melanoma 22,23. In our study, the CpG site cg24720571, located in the promoter region of the NRN1 gene, was identified for the first time as a useful biomarker for detecting UC in urine. However, the biological function and methylation mechanism of NRN1 remain largely unknown, and further clarification is needed.

Conclusions
In summary, we developed an optimized model consisting of 1 methylation and 8 point mutation biomarkers for UC detection, which showed highly specific and robust performance. It may be used as an alternative approach for early detection of UC, resulting in less extensive examinations in patients at low risk.

The Δct distribution of candidate methylation biomarkers in the independent cohort (n=141). Statistical significance was assessed using the independent-samples t-test. The data are presented as median with interquartile range. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001; NS: no significance.