A Proteogenomic Atlas of Clear Cell Renal Cell Carcinoma in a Chinese Population

Renal cell carcinoma (RCC) is among the 10 most common malignancies 1. Clear cell RCC (ccRCC), which accounts for ~75% of RCC cases, is an aggressive histological subtype. In the last decade, large-scale multiomics studies have profoundly enhanced our understanding of this disease 2,3. However, despite differences in genomic alterations between Western and Eastern ccRCC 4,5, these studies focused mostly on Western populations. Here we conducted a comprehensive proteogenomic analysis of 232 tumor and adjacent non-tumor tissue pairs from Chinese ccRCC patients. Genomic analysis revealed unique genetic features of Chinese ccRCC and distinct mutation patterns associated with copy number alterations. Based on proteomic profiles, ccRCC showed extensive metabolic dysregulation, especially in one-carbon metabolism. We classified ccRCC into three subtypes (GP1–3), among which the most aggressive, GP1, exhibited dominant immune response, metastasis, and metabolic imbalance, linking the proteomic features, genomic alterations, and clinical outcomes of ccRCC. Nicotinamide N-methyltransferase (NNMT) and NNMT-mediated protein homocysteinylation were identified as a poor prognosis indicator and a drug target for GP1, respectively. We demonstrated that NNMT induces homocysteinylation of the DNA-dependent protein kinase catalytic subunit (DNA-PKcs), increases DNA repair, and promotes tumor growth in ccRCC. Treatment with N-acetyl-cysteine (NAC), an inhibitor of homocysteinylation, markedly reduced the radioresistance of tumor cells induced by NNMT overexpression. This study provides valuable insights into the biological underpinnings and prognosis assessment of ccRCC and reveals a targetable metabolic vulnerability.


Introduction
Multiomics strategies encompassing genome and expression profiling of multiple tumor types have elucidated novel molecular subtypes and abnormally activated signaling pathways, as well as potential therapeutic targets 3,6−12. The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC) have published landmark multiomics studies 2,3 that improved our understanding of ccRCC. CPTAC conducted an integrated proteogenomic analysis of 103 ccRCC cases, which revealed the immune signature of ccRCC and highlighted the necessity and significance of proteogenomics research.
In this study, we conducted genomic and proteomic profiling of 232 paired tumor and adjacent non-tumor samples from Chinese ccRCC patients with a median follow-up of 85 months (range, 3-138 months). Our study revealed distributions of genetic lesions that are unique to the Chinese cohort as well as lesions shared between the Chinese and Western populations. Integrated data analysis disclosed the associations among genetic aberrations, proteomic features, and clinical outcomes of ccRCC. Further, we identified NNMT as a biomarker of poor prognosis and verified that the dysregulation of homocysteine metabolism mediated by NNMT overexpression is a potential therapeutic opportunity in renal cell carcinoma.

Results
Proteogenomic Landscape of Chinese ccRCC
We collected 232 paired tumor and adjacent non-tumor tissues from Chinese ccRCC patients based on strict criteria (Figure S1A). Clinicopathological indicators, including age at surgery, sex, clinical manifestation, laterality, tumor size, chronic disease status, tumor node metastasis (TNM) stage, and International Society of Urological Pathology (ISUP) grading classification, are summarized in Table S1. Tumor and adjacent tissue regions were determined by pathological examination. Freshly frozen tissues were used for proteomics analysis and whole-exome sequencing (WES). WES was conducted in 224 paired samples; samples from 8 patients were excluded due to low DNA quality (Fig. 1A).
WES data of tumor-adjacent tissues were used as a reference to detect genetic variants in ccRCC. The mean sequencing coverage on the hg38 reference genome was 120.5× for tumor tissues and 68.72× for adjacent tissues (Table S2, Figure S1B). Among the 224 sample pairs, 10,475 non-silent mutations in 6,875 genes and 1,203 silent mutations were detected. VHL was the most frequently mutated gene in this cohort (64.3%), followed by PBRM1 (24.5%), BAP1 (10.7%) and SETD2 (8.9%) (Fig. 1B), consistent with previous studies 2,4,5,13 (Fig. 1C). Interestingly, the mutation frequencies of these genes varied by ethnicity and geography when compared with the TCGA 2 and European 5 cohorts. Specifically, VHL had the lowest mutation frequency in the TCGA cohort (64.3% in Chinese vs. 46.2% in TCGA vs. 73.4% in European). In contrast, PBRM1 (24.5% in Chinese vs. 42.1% in TCGA vs. 39.4% in European), SETD2 (8.0% in Chinese vs. 13.9% in TCGA vs. 19.1% in European), and MTOR (4.5% in Chinese vs. 9.2% in TCGA vs. 8.5% in European) had lower mutation frequencies in the Chinese cohort (Fig. 1C). Mutational spectra revealed that C > T transitions (27.0%) were the dominant mutation type in the Chinese and TCGA cohorts (Figure S1C). The frequency of A > T transversions was higher in the Chinese cohort than in the TCGA cohort (21.0% vs. 10.6%, Figure S1C). When we decomposed the mutation spectra using the Catalogue of Somatic Mutations in Cancer (COSMIC) database 14, five single-base substitution (SBS) signatures (SBS1, SBS5, SBS22, SBS40, SBS52) were detected (Figure S1D). Signatures SBS1, SBS5, and SBS40 are considered to correlate with patient age. SBS22 is associated with exposure to aristolochic acid (AA), a Chinese herbal ingredient associated with renal injury and ccRCC carcinogenesis 4,5,15, corroborating that AA exposure is a carcinogenic factor for Chinese ccRCC.
Moreover, patients with the AA signature showed higher mutational burden (p < 0.0001), but lower CNA values (p < 0.05) (Figures S1E-F).
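The mutational spectrum described above rests on collapsing each single-base change into one of six pyrimidine-referenced classes. The following is a minimal sketch of that bookkeeping, not the pipeline used in this study; the function names are illustrative assumptions. Under this convention an A > T change is reported as its reverse complement, T > A, which is a transversion, whereas C > T is a transition.

```python
from collections import Counter

# Complement map used to fold purine-reference changes onto the pyrimidine strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The two transition classes; the remaining four classes are transversions.
TRANSITIONS = {"C>T", "T>C"}


def substitution_class(ref: str, alt: str) -> str:
    """Collapse a single-base change into one of the six pyrimidine-based classes."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or ref not in COMPLEMENT or alt not in COMPLEMENT:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in "AG":  # purine reference: report the change on the opposite strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(variants):
    """Return the fraction of each substitution class among (ref, alt) pairs."""
    counts = Counter(substitution_class(r, a) for r, a in variants)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Toy input (hypothetical variants, not study data).
toy = [("C", "T"), ("G", "A"), ("A", "T"), ("T", "C")]
toy_spectrum = spectrum(toy)
```

Signature decomposition against COSMIC additionally uses the flanking bases (96 trinucleotide contexts), which this sketch deliberately omits.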
For proteomic data analysis, Spearman's correlation coefficient was calculated for all quality control (QC) runs using HEK293T cell samples (Figure S1G). The average correlation coefficient of the QC samples  1D). Significantly more proteins were identified in tumors (median, 9,697) than in paired adjacent tissues (median, 8,915) (paired t test, p < 0.0001, Figure S1J), indicating the complexity of the tumor microenvironment. In total, 16,915 proteins were detected in the 232 paired samples (Figure S1K), among which 14,159 proteins were common to tumor and adjacent tissues, whereas 1,606 and 1,150 proteins were detected specifically in tumor and adjacent tissues, respectively (Fig. 1E, Table S3). When we compared our data with those of the CPTAC ccRCC study 3, 10,581 proteins were found in both cohorts, whereas 6,332 and 774 proteins were detected specifically in the Chinese cohort and the CPTAC cohort, respectively (Fig. 1E). Proteome quantification was conducted using the iBAQ algorithm, followed by normalization to fraction of total (FOT) as reported previously 16,17 (Figure S1L). The dynamic range of the detected proteins spanned eight orders of magnitude (Figure S1L). In summary, this study provides a comprehensive landscape of Chinese ccRCC at both the genomic and the proteomic levels.
Chromosome 3p loss drives the distinct mutations of BAP1 and PBRM1 and regulates the expression of proteins in the complement and coagulation cascades in a trans-regulatory mode, which further influences clinical outcomes.
Proteomic Alterations in ccRCC Compared to Adjacent Tissues
To obtain a general insight into the proteomic alterations in ccRCC tumor tissues compared to adjacent tissues, 6,111 proteins detected in > 25% of the patients were further analyzed (Table S4). Principal component analysis (PCA) and hierarchical clustering analysis revealed a clear distinction between the proteomes of tumor and adjacent tissues (Figs. 3A, S5A). PCA distances among tumor tissues were significantly lower than those among tumor adjacent tissues, corroborating tumor heterogeneity. In total, 1,995 differentially expressed proteins (DEPs) were identified in tumor tissues compared with adjacent tissues (Benjamini-Hochberg-adjusted p < 0.05, t test, FC > 2), including 1,296 downregulated and 699 upregulated proteins (Fig. 3B, Table S4). Oncogene/tumor suppressor gene (TSG) information from the OncoKB database 27 was used to determine whether the predicted effects would manifest in the tumor based on the observed protein expression alterations of these genes. We identified 13 significantly upregulated oncogene products (AXL, BCL2, CCND1, CDK4, CDK6, DNMT1, EGFR, FLT1, LCK, LYN, NOTCH3, PIK3CD, and RBM15) and 10 significantly downregulated TSG products (ATP6V1B2, CDH1, EPCAM, FH, PTPRD, PTPRS, SDHA, SDHB, SDHC, and SFRP1) in the tumors (Figure S5B), among which 8 of the 13 upregulated oncogene products and all 10 downregulated TSG products were also identified in the CPTAC cohort. AXL, BCL2, CCND1, LCK, and NOTCH3 were significantly upregulated in the Chinese cohort, but not in the CPTAC cohort, revealing both similarities and distinctions in the carcinogenesis of ccRCC between the Chinese and CPTAC cohorts. Notably, higher protein expression levels of three oncogene products (DNMT1, LYN, and NOTCH3) and lower protein expression levels of two TSG products (EPCAM, SDHC) were related to poor prognosis (log-rank test, p < 0.05, Figure S5C).

Dysregulation of Metabolic Bioprocesses in ccRCC
The kidney is a metabolic organ. As ccRCC is characterized by aberrant metabolic pathways that control energetics and biosynthesis, it is important to learn how metabolic bioprocesses are altered at the proteome level in ccRCC. To this end, we surveyed metabolism-related pathways annotated in KEGG 30 and Reactome 31 (Table S5). Glycogen metabolism and glycolysis were upregulated in tumor tissues. In contrast, most metabolic pathways, including the tricarboxylic acid (TCA) cycle, oxidative phosphorylation (OXPHOS), amino acid metabolism, lipid metabolism, one-carbon metabolism, and metabolism of vitamins and cofactors, were downregulated (Fig. 4A). Correspondingly, only 149 metabolism-related proteins were upregulated in tumor tissues, compared with 502 downregulated metabolism-related proteins (Benjamini-Hochberg-adjusted p < 0.05, FC > 2). Overall, ccRCC is characterized by downregulation of most metabolic bioprocesses 32,33.
One-carbon metabolism, which supports various bioprocesses, including nucleotide biosynthesis, amino acid homeostasis, epigenetic maintenance, and redox defense, plays central roles in carcinogenesis and tumor progression 37. Loss of SHMT1 (T/TA = 0.19) and ALDH1L1 (T/TA = 0.22) attenuated formate clearance. Overexpression of MTHFD1L (T/TA = 4.37) and MTHFD2 (T/TA = 6.89, identified in 52 samples) in turn resulted in increased one-carbon flux to generate excess formate (Fig. 4C). NNMT (T/TA = 17.37) and DNMT1 (T/TA = 3.97), upstream enzymes of homocysteine (Hcy) metabolism, were overexpressed in tumors, leading to enhanced Hcy generation. Hcy can be removed through catabolic processes via different enzymes, such as methionine synthase (MTR), betaine-Hcy S-methyltransferase (BHMT, T/TA = 0.20; BHMT2, T/TA = 0.17), and cystathionine-beta-synthase 38. In tumor tissues, we observed impairment of the cytosolic one-carbon cycle (ALDH1L1, MTHFD1, and MTHFR) (Fig. 4C), limiting the generation of CH3-THF, a coenzyme of MTR. Thus, enhanced production and diminished removal of Hcy would be expected to result in Hcy accumulation in ccRCC. To verify this hypothesis, we examined the levels of Hcy metabolites in tumor and adjacent tissues. Hcy was 2.7-fold more concentrated in ccRCC tumors than in adjacent tissues (paired t test, p < 0.0001) (Fig. 4D). By surveying the CPTAC data 3, we found that uncoupling of mRNA and protein levels was observed not only in OXPHOS but also in one-carbon metabolism (Figure S6C). Notably, by investigating the enzymes involved in formate metabolism, we found that patients with higher expression of ALDH1L1 and SHMT1 had better prognostic outcomes, whereas patients with higher expression of MTHFD1L had poorer prognostic outcomes (log-rank test, p < 0.05, Fig. 4E).
As for Hcy metabolism, higher expression of NNMT and DNMT1 was associated with poor prognosis, whereas higher expression of BHMT and BHMT2 was associated with good prognosis (log-rank test, p < 0.05, Fig. 4F). Our study interrogated concrete protein expression alterations in one-carbon metabolism in ccRCC, highlighting the significance of one-carbon metabolism dysregulation during ccRCC pathogenesis.
We used ESTIMATE 41 to deconvolute tumor microenvironment (TME) compositions (Figure S9A) and conducted overrepresentation analysis of elevated proteins in each subtype (Fig. 5G). In total, 641, 1,838, and 97 proteins were upregulated in GP1, GP2, and GP3, respectively (Table S7). GP1 was characterized by a high degree of immune infiltration (Kruskal-Wallis test, p < 0.0001, Figure S9A), as indicated by the enrichment of multiple immune-associated pathways, including innate immune system, complement and coagulation cascades, antigen processing-cross presentation, interferon signaling, and T cell receptor (TCR) signaling (q < 0.05, Fig. 5G). Consistently, GP1 had the highest immunosuppression, CD8 cluster, and MHC I antigen-presenting machinery (APM) scores (Kruskal-Wallis test, p < 0.05, Figure S9B). GP2 displayed high tumor purity (Kruskal-Wallis test, p < 0.0001, Figure S9A) and increased metabolism-related pathways, including the TCA cycle and respiratory chain, amino acid metabolism, mitochondrial translation, lipid metabolism, and glycolysis/gluconeogenesis (q < 0.05, Fig. 5G). GP3 featured the highest stromal scores (Kruskal-Wallis test, p < 0.0001, Figure S9A), corresponding to upregulation of ECM-related pathways, including ECM organization, collagen formation, elastic fiber formation, and focal adhesion (q < 0.05, Fig. 5G). Classification of our proteome data according to established CPTAC subtyping signatures 3 provided further support for the diverse characteristics of the proteome subtypes in the Chinese ccRCC cohort (STAR Methods). Specifically, GP1 tumors were mainly CD8+ inflamed, GP2 tumors were mainly of the metabolic immune-desert subtype, and GP3 tumors were mainly CD8− inflamed (Figure S9C). The allocation of CPTAC subtypes in our data was also supported by the immune and stromal scores (Figures S9A, C).
Because GP1 patients had the poorest prognosis and would be assigned to a clinically high-risk category, they warrant further therapy. To determine potential therapeutic drug targets, we mapped GP1-representative proteins to DrugBank 46 druggable proteins (Fig. 5I, Table S7). Ten candidates, involved in metabolism (IL4I1, NNMT), immunity (C1QC, HM13, CASP1), and metastasis (P4HA1, P4HB, PLOD3, S100A4, PML), were identified. In brief, we identified three novel proteomic subtypes of Chinese ccRCC with distinct molecular features that connect the proteomic, genomic, and clinical features of ccRCC.

NNMT Promotes Cancer Cell Proliferation Through Hcy Accumulation
We conducted supervised analysis to identify robust and representative prognostic proteins, with the aim of screening for drug targets (Fig. 5I). NNMT, an important enzyme in Hcy metabolism, was overexpressed in ccRCC tumors (Fig. 6A) and significantly associated with poor prognosis (Fig. 6B). Furthermore, western blotting (Fig. 6C) and immunohistochemistry (IHC) (Figs. 6D, S10) confirmed that NNMT was overexpressed in ccRCC. Historically, ccRCC has been considered resistant to conventional chemo- and radiotherapy, indicating tolerance to genotoxic stress. Moreover, ccRCC cells are able to proliferate rapidly in a nutrient-depleted microenvironment 34. Thus, we tested whether high NNMT expression increased the viability of ccRCC cells under various stresses. NNMT overexpression promoted the proliferation of ACHN and 786-O cells and profoundly enhanced cell proliferation during nutritional stress or genotoxic stress (Figs. 6E-F). Nutritional and genotoxic stresses induce DNA damage in cells. NNMT overexpression reduced DNA damage in stressed ACHN, 786-O, and 769-P cells, as evidenced by the levels of γ-H2AX detected using immunofluorescence staining (Fig. 6G) and western blotting (Fig. 6H) and by DNA damage detection using the comet assay (Fig. 6I). These results indicated that NNMT overexpression may promote proliferation under stress. NNMT catalyzes methyl transfer from S-adenosyl methionine (SAM) to nicotinamide (NAM) and generates S-adenosyl homocysteine (SAH) and 1-methylnicotinamide (1MNA). Increased NNMT in cells resulted in a decrease in SAM and increases in SAH and 1MNA (Fig. 6J). The level of Hcy, the hydrolysis product of SAH, was also increased significantly (Fig. 6J). Supplementation of SAH or Hcy, but not supplementation of 1MNA or a reduction in SAM through knockdown of MAT, reduced DNA damage (Fig. 6K) and promoted cell proliferation under stress (Fig. 6L).
Furthermore, blockade of SAH hydrolysis by knockdown of S-adenosylhomocysteine hydrolase (SAHH) in NNMT-overexpressing cells abrogated the DNA damage-reducing effect of NNMT (Figs. 6M-N), suggesting that Hcy, but not SAH, plays a role in the DNA repair- and proliferation-promoting effects of NNMT.

Lysine Homocysteinylation of DNA-PKcs Enhances DNA Repair
When present at high levels, intracellular Hcy modifies protein lysine residues, which results in protein lysine-homocysteinylation in cells 47. We observed increased protein lysine-homocysteinylation (K-Hcy) levels in NNMT-overexpressing (Fig. 7A) and SAH-supplemented cultured renal cancer-derived ACHN, 786-O, 769-P, and A-498 cells (Fig. 7B). Meanwhile, Hcy (Fig. 4D) and K-Hcy (Fig. 6C) levels were increased in ccRCC tumors compared to adjacent tissues. In our previous cell-wide proteomics screen for K-Hcy substrates in HEK293T cells, we observed that DNA-PKcs, a protein required for the non-homologous end-joining pathway of DNA repair, was heavily modified by K-Hcy 47,48. In ccRCC tumors, we validated that four different lysine residues (K122, K712, K868, and K902) were modified by K-Hcy (Fig. 7C), suggesting that K-Hcy regulates DNA-PKcs-mediated DNA repair. Among the four lysine residues in DNA-PKcs (K122, K712, K868, and K902) that were modified by homocysteinylation, K122 is located within the interface between DNA-PKcs and KU70/KU80, while K712, K868, and K902 are located within the intramolecular interaction region of DNA-PKcs 49 (Fig. 7C). Proteins that physically interact with methionyl-tRNA synthetase (MARS) are more prone to being modified and regulated by K-Hcy 47. Accordingly, an interaction between DNA-PKcs and MARS was confirmed by co-immunoprecipitation assays using either exogenous DNA-PKcs and MARS in ACHN and 769-P cells (Figure S11A) or endogenous DNA-PKcs and MARS in 786-O cells (Fig. 7D). Elevated NNMT, SAH, Hcy, or MARS levels led to dose-dependent increases in DNA-PKcs homocysteinylation in ACHN and 769-P cells (Figures S11B-E). These results confirmed that DNA-PKcs is subject to MARS-mediated K-Hcy modification. Increased NNMT expression resulted in the activation of DNA-PKcs, as indicated by increased phosphorylation of DNA-PKcs and its downstream target protein p53 (at Ser15) in ACHN and 769-P cells (Fig. 7E).
Moreover, supplementation of either SAH or Hcy or overexpression of MARS elevated the K-Hcy level of DNA-PKcs and activated the DNA-PKcs pathway in ACHN and 769-P cells (Figures S11F-H). In contrast, reducing K-Hcy modification through knockdown of NNMT, SAHH, or MARS inhibited DNA-PKcs activity in ACHN and 769-P cells (Figures S11I-K). Furthermore, phosphorylation levels of DNA-PKcs and p53 were markedly increased in ccRCC tumors vs. adjacent tissues (Fig. 6C). These results were consistent with the comet assay results, which revealed that compared with adjacent tissues, tumors exhibited decreased DNA damage (Fig. 7F) and reduced γ-H2AX levels (Fig. 6C). The results also indicated that DNA-PKcs is activated in ccRCC tumors and that NNMT-induced hyper lysine-homocysteinylation might promote ccRCC by activating DNA-PKcs.

Lysine-Homocysteinylation Facilitates the Formation of the DNA-PKcs Complex
We next investigated the mechanism by which lysine-homocysteinylation activates DNA-PKcs. The DNA-PKcs-KU70/KU80 interaction was significantly enhanced by increased cellular K-Hcy levels induced by NNMT overexpression in ACHN and 769-P cells (Figs. 7G, H). As a result, the activity of DNA-PKcs was enhanced in NNMT-overexpressing cells, as determined by monitoring its kinase activity in phosphorylating its substrate p53 in vitro (Fig. 7I) and measuring ADP formation in an ADP-Glo-DNA-PK assay (Fig. 7J). We validated that at increased levels, K-Hcy activates DNA-PKcs, as determined by adding homocysteine thiolactone (HTL) to the in vitro DNA-PKcs assay (Fig. 7K) and measuring intracellular ADP formation in an ADP-Glo-DNA-PK assay (Fig. 7L). Moreover, to mimic the bulky side chain effects of K-Hcy 47, we created two mutant DNA-PKcs constructs in which either Lys122 within the KU70/KU80-binding interface or all four modifiable lysine residues were mutated to tryptophan ("KW" and "4KW" constructs, respectively). Relative to wild-type DNA-PKcs, the mutant DNA-PKcs showed increased binding affinity to KU70/KU80 (Figs. 7M, N) and enhanced DNA-PKcs activity, as determined by an in vitro DNA-PKcs assay (Fig. 7O). In addition, increased NNMT expression in 786-O and ACHN cells promoted xenograft tumor growth in nude mice, especially in IR-treated cell xenografts, whereas inhibition  20,21 .
Our study demonstrated that copy-neutral and low-degree 3p loss might be a novel predictor of poor prognosis in ccRCC. The degree of 3p loss was associated with the frequencies of driver mutations in BAP1 and PBRM1. Further studies dissecting the association between chromosome 3p copy number alterations and driver mutations should improve our understanding of the carcinogenesis and development of this disease.
To provide metabolic insights into ccRCC, we surveyed the expression of metabolism-related proteins in the 232 ccRCC tumor/adjacent tissue pairs. The results revealed three major metabolic imbalances in ccRCC, involving energy metabolism, lipid metabolism, and one-carbon metabolism. Concretely, the Warburg effect, manifested by upregulated glucose uptake and glycolysis together with a downregulated TCA cycle and OXPHOS, led to energy metabolic imbalance. Lipid metabolism imbalance manifested as increased fatty acid synthesis and decreased β-oxidation, suggesting lipid accumulation in ccRCC. One-carbon metabolism imbalance in ccRCC resulted in the accumulation of two oncometabolites, formate and Hcy. In particular, we detected Hcy accumulation in tumors relative to adjacent tissues, confirming our finding based on proteomic data. Overall, metabolic alterations were not only features of ccRCC but were also associated with advanced disease and poor clinical outcomes. Metabolic reprogramming facilitates the identification of novel and repurposed drugs that could potentially be used to treat ccRCC.
Based on the proteome profiles of tumor tissues, we conducted molecular subtyping of ccRCC, which revealed three subtypes. The subtypes exhibited dramatic diversity in proteomic signatures, genetic alterations, and patient survival. Among the three subtypes, GP1 was associated with the poorest survival, and 80% of GP1 cases eventually progressed. GP1 exhibited a dominant immune signature, with the highest CD8+ T cell infiltration and immunosuppression scores (Figure S9B), indicating adaptive immune resistance. Accordingly, GP1 showed higher APM scores (Figure S9B), which reportedly are associated with the immunogenicity of ccRCC tumors 50. Therefore, we hypothesized that GP1 patients might benefit from immune checkpoint inhibitor therapy.
For the targeted therapeutic strategy, we focused on GP1 patients because they had the poorest prognosis. Ten druggable candidates were screened, including IL4I1, NNMT, C1QC, HM13, CASP1, P4HA1, P4HB, PLOD3, S100A4 and PML. NNMT, a metabolic enzyme, was identified as an important carcinogenic factor and drug target. Some clinical studies using the candidate protein survey strategy 51-55 have reported that NNMT is overexpressed in various tumors, including lung, liver, bladder, colon, and kidney cancers. One proteomics study revealed NNMT as a master metabolic regulator of cancer-associated fibroblasts 56. In cultured cells, NNMT promotes cancer cell survival, proliferation, migration, and invasion 51-55. However, the exact oncogenic role of NNMT in ccRCC as well as its metabolic functions in cancer cells have not been determined.
Our recent study demonstrated that Hcy can modify protein lysine residues and thereby translate metabolic status into cell signaling in colorectal cancer 47. As NNMT is an upstream metabolic enzyme in Hcy metabolism, we linked NNMT and Hcy in ccRCC tumorigenesis and development. Mechanistically, NNMT overexpression increases Hcy and K-Hcy modification in tumor cells and promotes tumor proliferation. K-Hcy modification of DNA-PKcs enhances the DNA-PKcs-KU70/KU80 interaction and ultimately activates the DNA-PK complex. Xenograft experiments revealed that NNMT overexpression conferred resistance to radiation therapy in renal cell carcinoma. Inhibiting K-Hcy modification with NAC restored the damaging effect of radiation on tumors. The current study suggests that the NNMT-K-Hcy-DNA-PKcs axis can partially explain the radiotherapy resistance of ccRCC and may be considered a potential therapeutic target.
In summary, our study provides a comprehensive proteogenomic landscape of Chinese ccRCC. The dominant pathways that were altered in the ccRCC proteome subtypes revealed the potential molecular mechanisms underlying clinical phenotypes and outcomes. We identified a potential druggable protein, NNMT, and demonstrated the value of this multiomics approach. We believe that this study provides valuable information regarding ccRCC biology and paves the way to novel therapeutic strategies.

Clinical Sample Collection
We screened 1,556 consecutive patients who underwent radical or partial nephrectomy for the treatment of renal tumors at the Department of Urology of Fudan University Shanghai Cancer Center (FUSCC, Shanghai, China) from January 2007 to March 2014. Electronic medical records were screened retrospectively. In total, 232 eligible ccRCC patients who had undergone radical nephrectomy at the FUSCC were consecutively enrolled. Median follow-up was 85 months (range, 3-138 months). At the last follow-up, 79 patients (34.1%) had progressive disease and 49 patients (21.1%) had died of ccRCC.
Clinicopathological indicators, including age at surgery, sex, clinical manifestation, laterality, tumor size, chronic disease status, TNM stage, and ISUP grading classification, are summarized in Table S1. Tumor and adjacent non-tumor tissue samples were collected during surgery and are available from the FUSCC tissue bank. Samples were collected according to the following criteria: 1) tumor-adjacent tissues were collected > 2 cm from the tumor margin; 2) each tumor/adjacent sample was checked by an expert pathologist to confirm the sample quality. Hematoxylin and eosin (H&E)-stained slides of tumor and tumor-adjacent tissues were uploaded to Mendeley Data (https://data.mendeley.com/datasets/pb5tbs2by5/draft?a=1b8aa955-40a7-4df0-970a-e936666ffd99).
Among the 1,324 excluded patients, 161 patients were diagnosed with benign renal tumor, 118 with urinary tract carcinoma, 326 with non-clear RCC, and 89 with other simultaneous or heterochronous malignancies. Further, 577 patients (mainly those who underwent partial nephrectomy) were excluded because of unavailable adjacent normal tissues, and 53 samples failed the pathological quality check, for example because of a tumor cell rate < 90% (Figure S1A). All cases were staged according to the 2010 American Joint Committee on Cancer TNM staging system. H&E-stained sections were reviewed by an experienced genitourinary pathologist to determine the ISUP grade, and frozen sections were reviewed to determine the tumor cell rate of the ccRCC tissues. The study was compliant with the ethical standards of the Helsinki Declaration II and was approved by the institutional review board of FUSCC (050432-4-1212B). Written informed consent was obtained from each patient before any study-specific investigation was conducted.
(Thermo, 65306) was used for library purification, and P5/P7 primers (Nanodigmbio, ND10010) and HotStart ReadyMix (KAPA, KK2612) were used for library amplification. The amplified libraries were purified using SPRISELECT (Beckman, B23319). DNA quality was assessed using a Bioanalyzer High Sensitivity DNA Analysis kit (Agilent Technologies, 5067-4626). Samples underwent paired-end sequencing on a Nextseq CN500 platform (Illumina), with a 150-bp read length. The WES target region was 33 M. A mean coverage of 100×, a capture rate of 95%, and a dup rate of 40% were achieved for tumor sequencing.

Somatic Variant Detection
Read-depth statistics were calculated using the DepthOfCoverage function in the Genome Analysis Toolkit (GATK v3.8.1.0) 57. Paired-end reads in Fastq format were aligned to a reference human genome 58 (UCSC Genome Browser, hg38) using Burrows-Wheeler Aligner. Variant calling was conducted following GATK best practices. Somatic single-nucleotide variations and small insertions and deletions were detected using MuTect2 (GATK v4.1.2.0) and were annotated using ANNOVAR 59 based on UCSC known genes. The two longest genes, TTN and MUC16, were excluded because they tend to acquire numerous mutations by chance in large-scale genome/exome sequencing experiments. The Maftools R package 60 was used to display mutant genes with non-synonymous mutations. MutSigCV 61 was used with default parameters to identify significantly mutated genes. Genes with Benjamini-Hochberg-adjusted p < 0.01 were identified as significantly mutated genes.

Mutation Frequency Variances Across Regions
TCGA ccRCC genome data were downloaded from xenabrowser.net 62 and data for a European ccRCC cohort were obtained from 5 . The top 10 most frequently mutated genes in our Chinese cohort and these two cohorts were compared using Fisher's exact test.
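As an illustration of the cohort comparison described above, the following is a minimal sketch of a two-sided Fisher's exact test on a 2 × 2 table of mutated versus wild-type cases in two cohorts. It uses only the Python standard library and is not the software used in this study; the function name is an assumption. The Chinese VHL count (144 of 224, i.e. 64.3%) follows from the frequencies reported in this paper, whereas the comparison-cohort counts are hypothetical placeholders.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are cohorts; columns are mutated / wild-type case counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x mutated cases in the first cohort.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# VHL in the Chinese cohort: 144 mutated, 80 wild-type (64.3% of 224).
# The second row is a hypothetical comparison cohort, not real TCGA counts.
p_value = fisher_exact_two_sided(144, 80, 200, 233)
```

The small relative tolerance in the comparison guards against floating-point ties between equally likely tables.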

Mutual Exclusivity and Mutation Co-occurrence Analysis
Mutually exclusive or co-occurring sets of genes were detected using the somaticInteractions function in the Maftools R package, using pair-wise Fisher's exact test to detect significant gene pairs. p < 0.05 was used as a threshold for statistical significance.
Each gene in each sample is assigned a thresholded copy number that reflects the magnitude of its deletion or amplification. These are integer values ranging from -2 to 2, where 0 means no amplification or deletion of a magnitude greater than the threshold parameters described above. Amplifications are represented by positive numbers: 1 indicates amplification above the amplification threshold; 2 indicates amplification larger than the arm-level amplifications observed in the sample. Deletions are represented by negative numbers: -1 indicates deletion beyond the threshold; -2 indicates deletions greater than the minimum arm-level copy number observed in the sample.
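The five-level encoding described above can be sketched as a small function. This is an illustrative assumption of how the rule maps a gene-level log2 copy ratio to an integer, not the implementation used in this study; the function name, argument names, and all numeric thresholds in the usage lines are hypothetical placeholders.

```python
def thresholded_copy_number(value: float,
                            amp_thr: float,
                            del_thr: float,
                            max_arm_amp: float,
                            min_arm_del: float) -> int:
    """Map a gene-level log2 copy ratio to an integer in {-2, -1, 0, 1, 2}.

    amp_thr / del_thr: amplification / deletion thresholds (del_thr is negative).
    max_arm_amp / min_arm_del: largest arm-level gain and lowest arm-level
    copy ratio observed in the same sample.
    """
    if value > amp_thr:
        # Gains beyond every arm-level gain in the sample are scored 2.
        return 2 if value > max_arm_amp else 1
    if value < del_thr:
        # Losses below the lowest arm-level value in the sample are scored -2.
        return -2 if value < min_arm_del else -1
    return 0


# Hypothetical per-sample parameters, for illustration only.
params = dict(amp_thr=0.1, del_thr=-0.1, max_arm_amp=0.6, min_arm_del=-0.8)
calls = [thresholded_copy_number(v, **params) for v in (0.05, 0.3, 0.9, -0.3, -1.2)]
```

Because the arm-level bounds are sample specific, the same log2 ratio can receive different integer calls in different samples.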

LC-MS/MS
Samples were analyzed on a Q Exactive HF-X mass spectrometer (Thermo Fisher Scientific) coupled with a high-performance liquid chromatograph (EASY-nLC 1200 System, Thermo Fisher Scientific). Dried peptide samples were dissolved in solvent A (0.1% formic acid in water) and loaded onto a trap column (100 μm × 2 cm, home-made; particle size, 3 μm; pore size, 120 Å; SunChrom) with a maximum pressure of 280 bar using solvent A, then separated on a home-made 150 μm × 12 cm silica microcolumn (particle size, 1.9 μm; pore size, 120 Å; SunChrom) with a gradient of 5%-35% mobile phase B (acetonitrile and 0.1% formic acid) at a flow rate of 600 nL/min for 75 min. MS analysis was conducted with one full scan (300-1,400 m/z, R = 120,000 at 200 m/z) at an automatic gain control target of 3e6 ions, followed by up to 20 data-dependent MS/MS scans with higher-energy collision dissociation (target 5e4 ions, max injection time 20 ms, isolation window 1.6 m/z, normalized collision energy of 27%). Detection was performed in the Orbitrap (R = 7,500 at 200 m/z). Data were acquired using the Xcalibur software (Thermo Fisher Scientific).

MS Platform QC and ccRCC Proteome Quality Assessment
For QC of MS performance, tryptic digests of HEK293T cell lysates were measured as a QC standard every 2 days. The QC standard was made and run using the same method, conditions, software, and parameters as those used for ccRCC samples. Pairwise Spearman's correlation coefficients were calculated using the R package corrplot 65 for all QC runs, and the results are shown in Figure S1B. Precursor ion score charges were limited to +2, +3, and +4. The data were also searched against a decoy database so that protein identifications were accepted at an FDR of 1%. Label-free protein quantifications were calculated using a label-free, intensity-based absolute quantification (iBAQ) approach 16 . Match between runs 67 was used to improve parallelism between tumor/adjacent samples. We built a dynamic regression function based on common peptides in tumor/adjacent samples. Based on the correlation value R^2, Firmiana chooses a linear or quadratic function for regression to calculate the retention time (RT) of corresponding hidden peptides and checks for the existence of the extracted ion current (XIC) based on the m/z and calculated RT. The program determines the peak area values of existing XICs. We calculated peak area values as parts of corresponding proteins. Proteins with at least 1 unique peptide with a 1% FDR at the peptide level were selected for further analysis. The FOT was used to represent the normalized abundance of a particular protein across samples. FOT was defined as a protein's iBAQ divided by the total iBAQ of all proteins identified in each sample. FOT values were multiplied by 10^5 for ease of presentation, and missing values were assigned 10^-5 (Table S3).
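The FOT normalization can be sketched as follows. This is an illustration, not the Firmiana code: the function name fot_normalize is hypothetical, and the sketch assumes that unidentified proteins are stored as None (or 0) and that the value 10^-5 is assigned to them after scaling.

```python
def fot_normalize(ibaq, scale=1e5, missing=1e-5):
    """Convert per-sample iBAQ values to scaled FOT values.

    ibaq: dict mapping protein -> iBAQ in one sample (None or 0 if not identified).
    Returns a dict mapping protein -> FOT * scale, with missing values set to `missing`.
    """
    # Total iBAQ of all proteins identified in this sample
    total = sum(v for v in ibaq.values() if v)
    return {
        protein: (value / total * scale if value else missing)
        for protein, value in ibaq.items()
    }
```

For a sample with iBAQ values of 3 and 1 for two identified proteins, the scaled FOT values are 75,000 and 25,000, and any unidentified protein receives 10^-5.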

Protein and Pathway Alterations in Tumor vs. Adjacent Tissues
PCA was conducted to visualize the separation of tumor and tumor-adjacent proteomes using the R package factoextra v1.0.6 68 . In total, 6,111 proteins identified in >25% of both tumor and tumor-adjacent samples were used for subsequent analysis. Volcano plots were used to display DEPs in tumor and adjacent tissues by applying thresholds of fold change >2 and Benjamini-Hochberg-adjusted p < 0.05.
Among the DEPs, 1,296 proteins were significantly upregulated and 699 proteins were significantly downregulated in ccRCC tumor tissues. The DEPs were then subjected to KEGG pathway enrichment analyses in DAVID 69 , with a p value cutoff of 0.05 (Table S4). Signature proteins of the nephrons (including glomerulus, proximal tubule, distal tubule, and collecting duct) were obtained from the Human Protein Atlas database (https://www.proteinatlas.org/humanproteome/tissue/kidney).
Although the ESTIMATE algorithm was designed to analyze transcriptome data, some studies have used it for proteome analysis 3,7 . The results indicate that it is feasible to evaluate the engagement of each subtype of immune cells. APM, immunosuppression, and CD8 cluster signatures were obtained from previous reports 50, 70 and computed using single-sample GSEA 71 . Metabolic pathway scores for 232 paired ccRCC samples were computed using the R package GSVA v1.34.0 72 (Table S4) 73 . KEGG, Reactome, and HALLMARK gene sets downloaded from MSigDB v7.1 were set as background. FDR < 0.05 was used as a cutoff. The normalized enrichment score was used to reflect the degree of pathway overrepresentation.
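The DEP thresholds can be expressed as a short sketch. It assumes that fold change is the tumor/adjacent ratio and that downregulation is called symmetrically (ratio below 1/2); neither assumption is stated in the text, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg-adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def call_deps(fold_changes, pvalues, fc_cutoff=2.0, alpha=0.05):
    """Label each protein 'up', 'down', or 'ns' (not significant)."""
    calls = []
    for fc, q in zip(fold_changes, benjamini_hochberg(pvalues)):
        if q < alpha and fc > fc_cutoff:
            calls.append("up")
        elif q < alpha and fc < 1.0 / fc_cutoff:
            calls.append("down")
        else:
            calls.append("ns")
    return calls
```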

Associations Between Clinical Characteristics and the ccRCC Proteome
Specific clinical information is presented in Table S1. TNM stage- and ISUP grade-specific proteins were screened out based on a fold change > 1.5 and p < 0.05. Specific proteins of each TNM stage and ISUP grade were subjected to over-representation analysis using ConsensusPathDB (http://cpdb.molgen.mpg.de/) 74 . Clinical characteristics-associated pathways are listed in Table S6.

Proteomic Subtyping of ccRCC, and Subtype Features
Consensus clustering was conducted using the R package ConsensusClusterPlus 75 with Pearson correlation as the distance measure. The 1,000 proteins with the highest median absolute deviation in tumor samples were used for k-means clustering with up to five groups. Consensus matrices for k = 2, 3, 4, 5 clusters are shown in Figure S8E-F. The consensus matrix for k = 3 showed clear separation among clusters. The cumulative distribution function of the consensus matrix for each k-value was also measured (Figure S8F). The relative change in area under the cumulative distribution function curve increased by 33% from 2 clusters to 3 clusters, whereas others exhibited no appreciable increase. Thus, proteome clusters were defined using k-means consensus clustering with k = 3. Subtype-specific upregulated proteins were defined as those that were (1) detected in ≥25% of tumor samples and (2) expressed at higher levels than in the other subtypes (FC > 2, t test p < 0.05). Subtype-specific upregulated proteins were further analyzed in ConsensusPathDB 74 . DEPs of each subtype and relevant enriched pathways are listed in Table S7.
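The feature-selection step that precedes clustering can be sketched as below. This covers only the ranking of proteins by median absolute deviation (MAD), not the consensus clustering itself; the data layout (a dict of per-protein abundance lists) and the function names are assumptions made for illustration.

```python
from statistics import median

def mad(values):
    """Median absolute deviation (unscaled) of a list of numbers."""
    center = median(values)
    return median([abs(v - center) for v in values])

def top_variable_proteins(abundance, n_top=1000):
    """Return the n_top proteins with the highest MAD across tumor samples.

    abundance: dict mapping protein -> list of abundances, one per tumor sample.
    """
    ranked = sorted(abundance, key=lambda p: mad(abundance[p]), reverse=True)
    return ranked[:n_top]
```

The resulting protein list would then define the rows of the matrix passed to the clustering step.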

Validation of Proteomic Subtyping Performance
GSEA was conducted to identify signature proteins of each proteomic subtype using GSEA v4.0.3 73 , and the 20 proteins with the highest scores in each subtype were selected. Hierarchical clustering of the CPTAC ccRCC cohort 3 (available follow-up is three years at present) proteome data with the signature proteins also classified the CPTAC cohort into three subgroups with survival curves similar to those in our population, with GP1 showing distinctly worse survival than the other two subtypes (log-rank test, p = 0.001) (Figure S8).

Correlations Between Subtypes and Clinical Features
To evaluate correlations between proteomic subtypes and clinical features, Fisher's exact test was conducted on categorical variables, including driver gene mutations, significant arm-level CNA events, age, sex, hypertension status, obesity status, cardiovascular and cerebrovascular disease status, family history of cancer, TNM stage, ISUP grade, and CPTAC subtype. Only variables that varied significantly among the three proteome subtypes are shown in Figure 5A. Scaled CPTAC ccRCC proteome data were used to identify signature proteins of each subtype by GSEA. The 20 proteins with the highest GSEA scores were selected as support vectors to build a support vector machine classifier. The Chinese ccRCC cohort was divided into four CPTAC subtypes using this classifier.

Effects of CNAs
Spearman's correlations between CNA values (gene level) and protein abundances were calculated using 14,538 genes quantified at both CNA and proteome levels. CNAs with significant correlation with proteins were selected based on FDR < 0.01. In total, 89,992 CNA and protein pairs showed significant correlation. Correlations were visualized using the R package multiOmicsViz. Genomic alterations that affect gene expression at the same locus are said to act in cis (diagonal patterns in Figure 2C), whereas an impact on another locus is defined as a trans effect (vertical patterns in Figure 2C).
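A minimal sketch of the per-pair correlation and the cis/trans labeling is given below. It is not the multiOmicsViz implementation; it assumes that each CNA gene and each protein is a list of values over the same samples, and the helper names are hypothetical. Tied values receive average ranks.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def effect_type(cna_gene, protein_gene):
    """Label a CNA-protein pair as cis (same gene) or trans (different gene)."""
    return "cis" if cna_gene == protein_gene else "trans"
```

The p values for all pairs would then be corrected for multiple testing before applying the FDR < 0.01 cutoff.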

Survival Analysis
The Kaplan-Meier method was used for survival analyses, and groups were compared using the log-rank test. The R survival package 3.2-3 76 and survminer 0.4.8 were used for statistical tests and visualization.
The HR was calculated by Cox proportional hazards regression analysis. Variables with p < 0.05 were considered to significantly impact prognosis. OS was used as the primary endpoint. Clinical and molecular variables with p < 0.05 in univariate analysis were selected for multivariate Cox regression analysis (Table S1).
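For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines. This is a didactic illustration rather than the R survival package code; it assumes follow-up times paired with event indicators (1 = death observed, 0 = censored), and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored leave the risk set
        at_risk -= deaths + censored
    return curve
```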

Gene Silencing and Overexpression
For NNMT stable shRNA knockdown or overexpression, cells were co-transfected with pCMV-VSV-G, pCMV-Gag-Pol, and plasmids using the calcium phosphate method 77 . Transfected cells were cultured in DMEM containing 10% FBS for 6 h. Twenty-four hours after transfection, culture supernatant was collected and used for retrovirus preparation to infect cells at 10% confluency in 90-mm-diameter dishes. Cells were re-infected 48 h after the initial infection and selected using 5 μg/mL puromycin (Amresco).
NNMT shRNA was cloned into the AgeI and EcoRI restriction sites of the pMKO vector. NNMT was subcloned into the BamHI and EcoRI restriction sites of the pBABE vector using ClonExpress MultiS One Step Cloning Kit. The sequences of primers used were as follows: shNNMT-Forward (#9284), KU70 (#4103), and KU80 (#2753) were purchased from Cell Signaling Technology. Antibody against NNMT was purchased from Abcam (#ab58743). Antibody against Actin was purchased from Genscript (#A00702). Anti-K-Hcy antibody was generated as described previously 47 . Chemiluminescence was measured on a Typhoon FLA 9500 instrument (GE Healthcare).

IHC
Sections of ccRCC and adjacent tissues were obtained from formalin-fixed, paraffin-embedded tissue blocks (not enrolled in the proteogenomic cohort). Immunostaining was carried out as reported previously 78, 79 . Sections were stained using relevant antibodies and the Envision detection kit (Dako). Immunostaining was quantified based on the number of immunoreactive cells (quantity score) and the staining intensity (intensity score), as reported 78,79 .

Metabolite Quantification
Human tissues were homogenized in ice-cold phosphate-buffered saline (PBS) and centrifuged, and supernatants were collected for Hcy quantification. Hcy concentrations were determined using an Axis Homocysteine Enzyme Immunoassay Kit (Axis-Shield). To assay HTL, cells were harvested by PBS washing and denatured in pre-chilled 60% methanol (in ddH2O, pre-cooled at -80°C for 1-2 h). Cell lysates were centrifuged (10,000 × g) at 4°C for 5 min. Supernatants were vacuum-dried, re-dissolved in ddH2O, and subjected to ultrafiltration on a polyvinylidene fluoride low protein binding membrane (Millex-GV4 and Millex-HV4, Millipore). Metabolites were extracted and HTL was analyzed using LC-MS. SAM and SAH levels were detected using a SAM & SAH ELISA Combo Kit (Cell Biolabs). 1-Methylnicotinamide was measured using a UHPLC-QTOF-MS System (Agilent Technologies, 1290 LC, 6550 MS) as described previously 80 . NAD+ levels were determined using an NAD/NADH assay kit (Abcam) per the manufacturer's instructions. Each assay was performed in triplicate, and means were used for analysis.

Lysine-homocysteinylation Site Identification in ccRCC Tissues
To identify lysine-homocysteinylation sites in tissue samples, ccRCC tumor and non-tumor tissues were ground in 0.5% NP-40 buffer, and supernatants were immunoprecipitated with anti-DNA-PKcs antibody and digested with trypsin. LC-MS/MS experiments were conducted on an EASY-nLC100 chromatograph coupled with an Orbitrap Elite (both from Thermo Fisher Scientific) equipped with an online nanoelectrospray ion source. Peptides were desalted and suspended in 10 μL solvent A (A: water with 0.1% formic acid; B: acetonitrile with 0.1% formic acid). Each sample was loaded onto a self-packed C18 column (100 μm × 2 cm, 5-μm particle size) at a flow rate of 5 μL/min for 5 min and subsequently separated on the analytical column (C18, 75 μm × 20 cm) with a linear gradient from 5% to 90% B over 120 min. The column was re-equilibrated at initial conditions for 15 min. The column flow rate was maintained at 200 nL/min. The mass spectrometer was set as follows: ion-transfer capillary, 275°C; spray voltage, 2 kV; and full MS range, 400-2,000 m/z. Full mass spectra were acquired at 60,000 resolution with a target ion setting of 10^6. One full MS scan was followed by 15 MS/MS scans, and multistage activation was enabled. The dynamic exclusion function was set as follows: repeat count, 2; repeat duration, 30 s; and exclusion duration, 60 s.

DNA-PKcs In Vitro Kinase Assay
In vitro DNA-PKcs kinase assays were conducted as described previously 49 . In brief, 200 ng DNA-PKcs and 3 μg p53 were incubated in a buffer containing 50 mM HEPES (pH 7.4), 100 mM KCl, 10 mM MgCl2, 2 mM EGTA, 0.1 mM EDTA, and 1 mM ATP at 30°C for 30 min. Y-shape DNA and KU70/KU80 were added as indicated. Reactions were terminated by addition of sodium dodecyl sulfate (SDS) sample loading buffer and boiling for 5 min. Samples were subjected to SDS-polyacrylamide gel electrophoresis and immunoblotting using a site-specific antibody against p53.
Briefly, we isolated DNA-PKcs protein from cells subjected to various treatments. To measure DNA-PKcs activity, 1 μL 5% DMSO, 2 μL of enzyme, and 2 μL of substrate/ATP mix were added to the wells of a 384-well plate. The plate was incubated at room temperature for 60 min. Then, 5 μL of ADP-Glo™ reagent was added and the plate was incubated at room temperature for 40 min. Subsequently, 10 μL of kinase detection reagent was added and the plate was incubated at room temperature for 30 min. Luminescence was recorded with an integration time of 0.5-1 s.

Cell Proliferation Assay
Cell proliferation was assessed using the Cell Counting Kit-8 (Dojindo Laboratories). In brief, cells were seeded in a 96-well plate at 4 × 10^3 cells/well and allowed to adhere. Cell Counting Kit-8 solution (10 μL) was added to each well, and the cells were incubated in 5% CO2 at 37°C for 2 h. Cell proliferation was determined by measuring the absorbance at 450 nm.

Comet Assay
A Comet Assay Kit (Trevigen) was used to detect single- and double-stranded DNA breaks in cultured cells and tissues. Slides were examined under a Leica DMI 4000B epifluorescence microscope (425-500-nm excitation). Comet slides were used for each condition. In normal cells, fluorescence is mostly confined to the nucleus because intact DNA cannot migrate. In DNA-damaged cells, DNA is denatured with an alkaline or neutral solution to detect single- or double-stranded breaks, respectively; negatively charged DNA fragments are released from the nucleus and migrate toward the anode.

In Vivo Xenograft Studies
Four- to six-week-old BALB/c nude mice were obtained from Shanghai SLAC Laboratory Animal Co., Ltd.
Control and NNMT-overexpressing ACHN and 786-O cell lines were subcutaneously transplanted into the left and right flanks of each mouse. For the IR group, irradiated control and NNMT-overexpressing cells were transplanted into the left and right flanks of each mouse. For the IR+NAC group, irradiated control and NNMT-overexpressing cells were transplanted into the left and right flanks of each mouse, and the mice were intraperitoneally injected with NAC (500 mg/kg) every other day. At the end of the experiment, following euthanasia, tumors were excised, weighed, and imaged. All procedures were approved by the Animal Care Committee at Fudan University.

QUANTIFICATION AND STATISTICAL ANALYSIS
Quantification methods and statistical analysis methods for proteomic and integrated analyses are described and referenced in the respective Method Details subsections.
Additionally, standard statistical tests were used to analyze the clinical data, including but not limited to Student's t test, Fisher's exact test, the Kruskal-Wallis test, and the log-rank test. All statistical tests were two-sided, and p < 0.05 was considered statistically significant. To account for multiple testing, p values were adjusted using the Benjamini-Hochberg FDR correction. Kaplan-Meier plots (log-rank test) were used to describe overall survival. Variables associated with overall survival were identified using univariate Cox proportional hazards regression models. Significant factors in univariate analysis were further subjected to multivariate Cox regression analysis. All analyses of clinical data were performed in R and GraphPad Prism. Each functional experiment was repeated independently at least three times, and results were expressed as mean ± standard error of the mean (SEM). Statistical analysis was performed using GraphPad Prism.

DATA AVAILABILITY
Proteome raw datasets are publicly available at the iProx data portal: https://www.iprox.org/page/PSV023.html;?url=1605014585802S8oG, with the password rEiV. WES data files can be accessed at https://www.biosino.org/node/review/detail/OEV000128?code=XZPGYZGS.

Figure 1. Proteogenomic Landscape of Chinese ccRCC. A, Schematic representation of the multiomics analyses of ccRCC, including sample preparation, protein identification, WES, and function verification. B, Genomic profile and associated clinical features of 224 ccRCC patients. C, Comparison of frequently mutated genes among Chinese, European, and TCGA cohorts (Fisher's exact test). D, Overview of proteomic profiles of pairwise ccRCC samples. The dashed curves fitted by lasso regression show the distribution of protein identifications. The shading that underlies the lasso curves denotes the 95% confidence intervals. E, The upper Venn diagram shows the overlap of proteins identified in tumors and adjacent normal tissues. The lower Venn diagram shows that proteins identified in this study cover most of the proteins identified in the CPTAC ccRCC cohort.