Low CCL19 Expression Is Associated With Adverse Clinical Outcomes For Patients With Advanced Stage Follicular Lymphoma

Background: This study aimed to explore the interactions and relationships of genes and recognize the hub genes associated with prognosis in follicular lymphoma (FL) treated with rst-line rituximab combined with chemotherapy. Method: RNA sequencing data of dataset GSE65135 (n=24) were included in differentially expressed genes (DEGs) analysis. Weighted gene co-expression network analysis (WGCNA) was applied for exploring the coexpression network and identifying hub genes. Validation of hub genes expression and prognosis were applied in dataset GSE119214 (n=137) and patient cohort of Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College (n=32), respectively, by analyzing RNAseq expression data and serum protein concentration quantied by ELISA. CIBERSORT was applied for tumor-inltrating immune cells (TIICs) subset analysis. Results: A total of 3260 DEGs were obtained, with 1861 genes upregulated and 1399 genes downregulated. Using WGCNA and Cytoscape analysis, eight hub genes, PLA2G2D, MMP9, PTGDS, CCL19, NFIB, YAP1, RGL1, and TIMP3 were identied. Kaplan-Meier analysis and multivariate COX regression analysis indicated that CCL19 independently associated with overall survival (OS) for FL patients treated with rituximab and chemotherapy (HR = 0.47, 95%CI [0.25-0.86], p = 0.014). Higher Serum CCL19 concentration was associated with longer progression-free survival (PFS, p=0.014) and OS (p=0.039). TIICs subset analysis showed that CCL19 expression had a positive correlation with monocytes and macrophages M1, and a negative correlation with naïve B cells and plasma cells. Conclusion: CCL19 expression was associated with survival outcomes and might be a potential prognostic biomarker for FL treated with rst-line immunochemotherapy.

Therefore, identifying the high-risk patients with suboptimal treatment response before treatment and improve their survival outcomes are necessary for the management of FL.
Most previous studies focused on screening differential expression genes and establishing gene models in FL [7, 8=. However, the interconnectivity and co-expression of the genes was ignored. Weighted gene co-expression network analysis (WGCNA) is a useful method to explore the interactions and relationships within genes and can help to recognize the hub genes associated with clinical characteristics [9, 10=. WGCNA can recognize the core gene and provide information for potential clinical prognosis biomarkers and has been widely used in cancer genome-related researches, including other subtypes of lymphoma [11-13=. In this study, we sought to identify the hub genes for FL using WGCNA conducted in a cohort containing FL samples and normal samples. We focused on the prognosis value of the identi ed hub genes. The association of hub genes and survival outcomes was validated in two independent cohorts including both tumor samples and serum samples of FL patients receiving standard rst-line treatment, respectively. We aimed to raise new potential biomarkers for FL, which could bring more insight for diagnosis and clinical treatment in FL.

Materials And Methods
Patients and datasets GSE65135 containing RNA sequencing (RNA-seq) data of 24 cases, including 10 normal tonsil tissue (control group) and 14 FL tissue (cancer group) obtained from Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) was used for exploring the differential expressed genes (DEGs). GPL570 (Affymetrix Human Genome U133 Plus 2.0 Array) was used for this dataset and the annotation for the gene IDs was conducted using the hgu133plus2.db package in R software (Vienna, Austria. https://www.R-project.org/). The Affy package in R was applied for raw data quality control, preconditioning and sorting out[14=. The expression of repeated genes was expressed as mean values.
To validate the prognostic value of hub genes identi ed from GSE65135, GEO database was searched using the term "follicular lymphoma", with organism restricted to "Homo sapiens". Up to January 9th, 2021, a total of 147 FL related GEO series was screened. Datasets of expression data with complete prognostic information were included in the validation cohort. Only one dataset, GSE119214 contained expression data and survival data (FFS and OS) of pre-treated FFPE samples of 137 FL patients treated with rituximab and chemotherapy and was included in the validation cohort. GPL13938 (Illumina HumanHT-12 WG-DASL V4.0 expression beadchip) was selected for this microarray dataset and gene IDs were mapped to the microarray probes using the annotation information provided on GEO dataset. Differentially expressed genes screening The limma package in R was used for screening the differentially expressed genes (DEGs) between control group and cancer group in GSE65135 [15=. The thresholds for DEGs were as follows: (1) p < 0.05; Coexpression Network Construction and hub gene selection WGCNA package in R was applied for exploring the coexpression network and identifying hub genes [9, 10=. Expression data of DEGs were input into R and undergoing quality check before coexpression analysis. Samples and genes with poor quality would be excluded. Expression data of quali ed samples were included in the analysis and similar matrix was constructed by calculating the Pearson correlation coe cient of two genes. Soft thresholding power was explored to ensure a scale-free network. The mean connectivity and scale independence of network modules were analyzed using gradient under soft thresholding power ranging from 1 to 20. Hierarchical clustering dendrogram summarized the gene modules. The minimum number of each gene module was set at 50 to ensure the reliability of each module. Heatmap and topological overlap matrix (TOM) plot were drawn to display the intensity of interaction among the modules.
The genes of modules were then used to construct the functional protein network. To further analyze the hub genes of the network, edges and nodes of network were output under certain threshold value (0.4) to Cytoscape software (version 3.8.0; http://www.cytoscape.org/) for analyzing and visualizing[19=. CytoHubba plugin in Cytoscape was applied for analyzing the degree of connectivity for each gene. The degree of connectivity set to rank genes. Genes with connectivity degree over 20 were identi ed and were considered to be hub genes. Identi ed hub genes were uploaded in the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) online database (version 11.0, https://string-db.org/) for analyzing the protein-protein interaction (PPI).

Validation of hub genes
To validate the prognostic power of hub genes, Kaplan-Meier analysis were applied using the data of GSE119214 and CHCAMS cohort. The best expression cut-off value for each hub gene for OS was obtained by the maxstat package of R. To identify the independent prognostic value of hub genes, multivariate analysis was applied.
To further explore mRNA levels of hub genes, ONCOMINE database (http://www.oncomine.org), a publicly accessible online cancer microarray database that facilitates the discovery of genome wide expression analyses, was applied. Student's t test was conducted for comparing the mRNA level of cancer specimens and normal control datasets. Fold change value was set as 2 and p value < 0.05 were considered as signi cant.

Measurement of serum CCL19 concentration
Peripheral blood samples were obtained before rst-line treatment, and serum was extracted and stored at − 80°C until use. Serum CCL19 concentrations were quanti ed using the CCL19 human enzyme-linked immunosorbent assay (

Statistical analysis
Univariable analysis was performed using Kaplan-Meier survival analysis with log-rank test. Multivariable analysis was performed using COX regression model. Correlations between two groups were calculated with Spearman's coe cient (R). Comparison of two groups was performed using Mann-Whitney-Wilcoxon test and comparison of multiple groups was performed using Kruskal-Wallis test. All statistical analyses were performed and visualized by R studio software (version 4.0.3, https://www.r-project.org/) and GraphPad PRISM (version 8.0.2). Two-side p value < 0.05 was considered as statistically signi cant.

Overview of Differentially Expressed Genes
In differential expression analysis, a total of 3260 differentially expressed genes (DEGs) were obtained, of which 1861 genes were upregulated and 1399 genes were downregulated in tumor samples compared with normal samples (Fig. 1A to B).
To further analyze the biological function of the DEGs, the upregulated and downregulated genes were further analyzed with GO and KEGG enrichment analysis, respectively. Functional enrichment analysis was performed in the biological process. Up-regulated DEGs were enriched in extracellular matrix organization, extracellular structure organization and regulation of vasculature development, while down-regulated DEGs were enriched in nucleosome assembly, chromatin assembly or disassembly and chromatin assembly ( Fig. 1C and 1D). KEGG analysis showed that the upregulated DEGs were enriched in PI3K-AKT signaling pathway, cytokine-cytokine receptor interaction and cell adhesion molecules. The down-regulated DEGs were involved in alcoholism, neutrophil extracellular trap formation and systemic lupus erythematosus ( Fig. 1E and 1F).
As all the genes included in the GO and KEGG analysis were selected under the threshold set arti cially (p < 0.05; log2 [fold change] > 1 or <-1), the results might be different under different thresholds. Therefore, all expression dataset was included in GSEA analysis. Results showed that IL-6/JAK/STAT3 signaling, complement and mitotic spindle were three pathways that mostly enriched in tumor samples (supplementary Figure S1).

Weighted Gene Correlation Network Analysis
To explore the key modules and hub genes in FL, WGCNA was performed for network construction to nd highly-correlated genes. All the cancer samples and controlled samples were included in the analysis after quality check (supplementary Figure S2). Soft thresholding power was set at 6 to ensure a scale-free network, with scale-free R 2 = 0.88 and mean k = 134 (supplementary Figure S3). Gene modules were explored and the identi ed modules were showed in the hierarchical cluster tree (Fig. 2A). A total of 3260 DEGs were allocated in three modules, with 633 genes in blue gene module, 209 genes in brown gene module and 2418 genes in gene module. Heatmap plot show the topological overlap matrix (TOM) among all genes and the interactive relationships between all three coexpression modules (Fig. 2B). Figure 2C illustrated the relationship of the modules with FL. According to the Pierson correlation coe cient between a module and sample feature for each module, turquoise module was closely associated with FL (R = 0.99) and the genes of this module were further analyzed. The functional enrichment of genes in turquoise module was further explored. GO analysis showed that cell chemotaxis, extracellular matrix organization and extracellular structure organization were the most enriched pathways of genes in turquoise module. KEGG analysis indicated that genes of turquoise module were mostly enriched in alcoholism, neutrophil extracellular trap formation and systemic lupus erythematosus pathways (supplementary Figure S4).

Identi cation of hub genes
The threshold of edge weight was set to be more than 0.4 to localized hub genes. A total of 1791 edges and 382 nodes of turquoise module were included in Cytoscape analysis. The genes involved in the network were ranked by degree calculated in cytoHubba plugin to explore potential hub genes. Supplementary Figure S5A

RNA expression analysis for the validation of hub genes
Oncomine database was applied to analyze the mRNA expression of the eight hub genes. Supplementary Figure S6 illustrated the expression of the hub genes among different cancer types. All of the eight hub genes were overexpressed and MMP9, NFIB, YAP1, RGL1 and TIMP3 expression were down-regulated across different datasets.
Next, the expression and prognostic value of hub genes were validated in another independent GEO dataset with expression pro ling and survival information of 137 FL samples (GSE119214) [21=. The whole 137 patients were all received rituximab in combination with chemotherapy as rst-line treatment. The detailed baseline patient characteristics and survival information were shown in supplementary   Table S1. Kaplan-Meier analysis was performed and PLA2G2D, CCL19, and YAP1 were found to be signi cantly associated with OS while the expression of other ve genes (MMP9, PTGDS, NFIB, RGL1, and TIMP3) showed no signi cant relationship with OS ( Fig. 3A-H and supplementary Figure S7). To determine independent prognostic genes, multivariate COX regression analysis was performed among the eight hub genes. Higher expression of CCL19 and RGL1 were signi cantly associated with longer OS (HR As CCL19 showed prognostic value in both Kaplan-Meier analysis and multivariate COX analysis in the GEO dataset, the expression of CCL19 was further explored. Oncomine dataset analysis indicated that mRNA of CCL19 was ≥ 84.549 and ≥ 1.438 fold elevated in FL samples compared to normal tissue in Compagno lymphoma dataset and Brune lymphoma dataset, respectively (Fig. 4A to B). However, in Alizadeh lymphoma dataset and Rosenwald multi-cancer dataset, the expression level of CCL19 was showed no statistical signi cance between FL samples and normal tissue (Fig. 4C to D).

Serum CCL19 concentration measurement
The prognostic value of CCL19 was further validated in CHCAMS cohort. A total of 32 pre-treatment serum samples of treatment naïve FL were included in the ELISA test, with 16 (50%) patients were male.
The median age was 47 years old. All the patients received rst-line immunochemotherapy, including R-CHOP or R-CHOP-like regimens, with 10 of 32 (31.2%) receiving rituximab maintenance. Median follow-up time was 58.0 (range: 12.0-98.0) months. The detailed baseline patient characteristics and survival information were shown in supplementary Table S1. There were no difference of serum CCL19 concentration between difference age groups (mean concentration: 1.439 ng/ml in age over 60 years vs. 1.823 ng/ml in age less than 60 years, p = 0.468), Follicular Lymphoma International Prognostic Index risk groups (mean concentration: 2.122 ng/ml in low-risk group vs. 1.578 ng/ml in intermediated-group vs. 1.070 ng/ml in high-risk group, p = 0.348) and rituximab maintenance groups (mean concentration: 1.165 ng/ml in maintenance group vs. 1.578 ng/ml in no maintenance group, p = 0.278) (Fig. 5A to C). The best-cutoff value for serum CCL19 was 0.94ng/ml (supplementary Figure S8) and patients were grouped into high CCL19 concentration and low CCL19 concentration group according to the cutoff value. Kaplan-Meier curves showed that serum CCL19 concentration was associated with PFS and OS, with higher CCL19 concentration group having longer PFS (p = 0.014) and OS (p = 0.039) (Fig. 5D to F).

CIBERSORT analysis
To further elucidate which subset of TIICs might serve as regulatory role in different CCL19 expression groups, CIBERSORT analysis was applied in dataset GSE119214 to calculate the relationship of CCL19 expression and microenvironment cell subsets. Figure 6A illustrated the proportion of different subsets of TIICs between CCL19 low expression and high expression groups. The correlations were further quanti ed and CCL19 mainly had a positive correlation with monocytes (R = 0.363, p < 0.001) and macrophages M1 (R = 0.325, p < 0.001), and it had a negative correlation with naïve B cells (R = -0.304, p < 0.001) and plasma cells (R = -0.229, p = 0.007, Fig. 6B).

Discussion
FL is a kind of indolent lymphoma with heterogeneity. Previous studies focused on identifying related genes using differential expression analysis. In this study, we applied WGCNA analysis utilizing RNAseq data of FL and found three related co-expression gene modules, explored the gene network relationship, and identi ed eight key genes of the network. The prognostic value of eight key genes were validated in two independent FL patient cohorts and we identi ed that CCL19 was signi cantly associated with PFS and OS in FL patients receiving rst-line immunochemotherapy, which indicated that CCL19 could be a potential prognosis biomarker for FL. To our best knowledge, this study is the rst to report the potential prognostic value of CCL19 in FL.
In previous studies, different genes were reported to be related to the development of FL. About over 85%  [32, 33=. In this study, among the three modules identi ed, the turquoise module showed signi cantly association with FL tumors. Functional analysis showed that genes in turquoise module was enriched in cell chemotaxis, extracellular matrix organization and extracellular structure organization.
Results of our study further supported that pathway of cell interaction with chemokine may be closely related to the development of FL.
In this study, we identi ed that CCL19 was signi cantly associated with survival for FL patients treated with rituximab in combination with chemotherapy. CCL19 encoded cytokine that involved in immunoregulatory and in ammatory processes. CCL19 speci cally binds to chemokine receptor CCR7. CCR7 and CCL19 played an important role in organizing thymic architecture and function, lymph-node homing of naive and regulatory T cell, as well as homeostasis and in ammation-induced lymph-nodebound migration of dendritic cells, which indicated that CCR7 and CCL19 involved in the homeostasis, immune surveillance, and tumor formation[34=. In previous studies, CCR7/CCL19 was reported to be associated with some types of cancer. CCR7 chemokine receptor binds to the ligand CCL19/CCL21 and promotes lymphogenesis and metastasis in breast cancer[35=. CCL19 overexpression signi cantly inhibited gastric cancer cell proliferation and tumor growth through CCL19/CCR7/AIM2 pathway[36=. Also, CCL19 overexpression is associated with malignant transformation in cervical cancer[37=. For lymphoma, O'Connor et al. reported that CCL19-CCR7 interactions may contributed to the increasing risk of age-related central nervous system lymphoma[38=. In T cell lymphoma, higher expression of CCR7 was associated with distant metastasis as well as tumor cell migration in vitro and the underlying mechanism might be associated with PI3K/AKT signaling pathway[39=.
In our study, CCL19 was found to be overexpressed in different FL cell lines utilizing public database analysis. Previous studies have showed that strong in vitro chemotactic activities of CCL19 for T cells and DCs and might activate a LTα1β2-dependent pathway of normal and pathological lymphoid tissue formation[40=. In addition, CCL19 mRNA overexpression was related to higher survival rate. Also, elevated serum CCL19 concentration was associated with longer PFS and OS. The protective prognostic value of higher CCL19 expression level had been validated at both the transcription and protein concentration in two independent FL patient cohorts. In the analysis of immune cell in ltration in tumor microenvironment, the expression of CCL19 was associated with macrophages M1 and monocyte. The protective effect of CCL19 overexpression might be explained from the mechanism of CCL19 and the formation of FL. CCL19 was produced by T-zone broblastic reticular cells and are essential for the formation and maintenance of the T-cell zone in lymphoid organs, as well as T cells and DCs peripheral recruitment. Therefore, elevated CCL19 expression could induce the function of T-zone reticular cells and recruit T cells and DCs to tumor tissues[41=. Previous studies have reported the chemotaxis function of CCL19 for macrophages M1[42=. Also, decreased subpopulations of CD4+/CD8 + T cells, macrophages and dendritic cells in patients are associated with FL transformation and are predictors of worse survival[43, 44=, which was consistent with the result of TIIC subtype analysis in our study.
In B cell lymphomas, including FL, DCs have showed to be correlated with better prognosis, suggesting that DCs may act against tumor cells in lymphoma [45-47=. Although in this study, the DC proportion has no signi cant correlation with the expression of CCL19, it is still worthwhile to exploring the function of CCL19 and DC cells for potential clinical treatment. In fact, new generation of CAR-T cell has been engineered to expressed IL-7 and CCL19 to elevate anti-tumor e cacy[48=. Moreover, dendritic cell-based active immunotherapies [49 = and macrophage checkpoint inhibitor Hu5F9-G4 have showed e cacy in FL [50=. Result of this study indicated that CCL19 could act as a biomarker to identify FL patients who may have inferior e cacy derived from rst-line immunotherapy. Considering the DC and macrophages recruitment capability of CCL19, it is worthwhile to explore whether these patients could bene t from dendritic cell-based active or macrophage-mediated immunotherapies for stimulating the innate immune system.

Conclusion
CCL19 might be a potential survival biomarker for FL treated with rst-line immunochemotherapy. In vitro, in vivo, and clinical studies are needed to be explored the underlying interaction and regulation mechanism of CCL19 for FL in our future studies.

Declarations
Ethics approval and consent to participate Study of pre-treatment serum samples from patients was approved by the medical ethics committee of Cancer Hospital, CAMS & PUMC (No. 19/088-1873).

Consent for publication
Not applicable. and takes responsibility for the integrity of the data and the accuracy of the data analysis. All authors contributed to the development of the article and approved the nal version.