Cell type-specific pathways associated with Diffuse large B-cell lymphoma metastasis related to Neuro-diseases

Background: With the advancement of single-cell sequencing, it has become an urgent need to detect the cell-specific changes in Diffuse Large B-cell Lymphoma metastasis that lead to central nervous system disorders. Results: In this study, single-cell RNA-seq data of peripheral blood mononuclear cells from a human sample are curated and cell types related to lymphocytes are identified. Subsequently, the potential markers of Diffuse Large B-cell Lymphoma are found. It is noticed that the LEF1, TCF7 and CD79A/B markers of different cell types play an important role in the formation, progression and metastasis of this disease. To understand the impact of the markers, the associated pathways are studied in detail by establishing a pathway semantic network. Moreover, this association validates the channel through which the pathways are triggered within the cell environment and result in metastasis. The connection between Diffuse Large B-cell Lymphoma metastasis and other central nervous system disorders is demonstrated by constructing a disease network. Conclusion: The study reveals how cell types are responsible for the pathway shifts. Furthermore, this information provides a cell-specific channel that triggers the progression of central nervous system diseases among Diffuse Large B-cell Lymphoma patients.


Introduction
Diffuse large B-cell lymphoma (DLBCL) [1] is the most common lymphoma, and nearly 30-40% of lymphoma patients are found to suffer from DLBCL. For a long time it was considered a single-entity disease. With the advancement of sophisticated techniques, the genetically heterogeneous characteristics of DLBCL have gradually been revealed [2]. It is mainly subdivided into two groups: an activated B-cell subtype and a germinal centre B-cell subtype [3]. These two subtypes are related to different stages of B-cell differentiation and encompass distinct oncogenic mechanisms. Most DLBCL patients are curable with R-CHOP chemotherapy [4]. On the other hand, some patients fail to respond to this treatment [5]. Moreover, this cancer type quickly metastasizes or spreads to the central nervous system and results in long-term disorders (Alzheimer's, multiple sclerosis, etc.) [6,7]. To increase the cure rate among all patients, it is crucial to understand the molecular pathogenesis of the disease. Tissue-specific, cell type-based studies open a new insight. Cells that share a similar tissue type can disclose crucial information regarding different pathological states. Therefore, studies at the single-cell level are capable of unveiling the underlying biological complexity and uncovering the mechanisms of the biological processes involved. Due to the aggressive nature of DLBCL, it is important to understand the cell-specific changes that lead to metastasis. Although potential markers play an important role in metastasis, their associated pathways are also responsible for triggering the further spread of malignancies. To our knowledge, no studies have yet reported the cell-specific association of markers and their pathways to uncover the relation between DLBCL metastasis and CNS diseases, although this information can open a new direction of research on DLBCL.
Moreover, a detailed understanding of the CNS diseases related to DLBCL metastasis is also important in therapeutic studies.
In this study, single-cell RNA-seq data for peripheral blood mononuclear cells (PBMC) are studied. The differentially expressed markers of each cell type are extracted. From the resulting markers, DLBCL-specific potential markers are identified for further study. The pathways associated with the potential markers of a particular cell type are responsible for regulating cell functionality towards the progression and metastasis of the disease. In this regard, the connection between cell-specific pathways is unfolded by establishing a pathway semantic network. DLBCL patients tend to be affected by central nervous system (CNS) related disorders [8,9]. Therefore, a cell-specific disease network is established between the potential markers and diseases to visualize how the pathways are responsible for metastasis and finally lead to brain-related diseases.

Methodology
Publicly available single-cell RNAseq PBMC data is downloaded from http://www.10xgenomics.com. The complete flowchart of the proposed work is described in Figure 1.
Cell type identification
dropClust [10] is used to process the raw count matrix. Initially, a Unique Molecular Identifier (UMI) count threshold of 3 is set to reduce the effect of PCR amplification bias. As a result of the UMI threshold, the cells are filtered, and poor-quality genes are discarded based on a minimum number of cells. Subsequently, the selected genes undergo count normalization. In this process, the expression values of the raw count data are normalized by the median of the total counts. Moreover, the genes are ranked by their dispersion index and selected according to this rank. For dimensionality reduction, principal component analysis (PCA) is utilized. Alongside PCA, two clustering steps are used. First, an initial clustering is performed to approximate the structure of the data. Then each cluster is fine-tuned by using average-linkage hierarchical clustering to group the sampled cells based on the expression of the selected genes. Euclidean distance is used to measure the dissimilarity. Cells that are not subjected to hierarchical clustering are assigned to their respective clusters of origin using locality-preserving hash codes. Furthermore, the differentially expressed genes (DE genes) of each cluster are identified. The 30 top-ranked genes are selected to identify the cell type of each cluster by utilizing both the CellMarker [11] and PanglaoDB [12] databases.
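The filtering, normalization and gene-ranking steps described above can be illustrated with a minimal sketch in plain Python. This is not the dropClust implementation: the function names, the default thresholds (`min_count`, `min_cells`), the use of the variance-to-mean ratio as the dispersion index and the toy count matrix are all assumptions made for illustration, and the PCA and clustering steps are not covered.

```python
from statistics import median


def filter_genes(counts, gene_names, min_count=3, min_cells=2):
    """Keep genes with at least `min_count` UMIs in at least `min_cells` cells.

    Both thresholds are illustrative assumptions, not values taken from the paper.
    """
    keep = [
        j for j in range(len(gene_names))
        if sum(1 for row in counts if row[j] >= min_count) >= min_cells
    ]
    filtered = [[row[j] for j in keep] for row in counts]
    return filtered, [gene_names[j] for j in keep]


def median_normalize(counts):
    """Rescale every cell (row) so that its total equals the median cell total.

    Assumes no cell has a total count of zero.
    """
    totals = [sum(row) for row in counts]
    target = median(totals)
    return [[v * target / t for v in row] for row, t in zip(counts, totals)]


def dispersion_index(values):
    """Variance-to-mean ratio of one gene across cells (0.0 for a silent gene)."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return variance / mean


def top_dispersed_genes(normalized, gene_names, k):
    """Rank genes by dispersion index and return the k highest-ranked names."""
    per_gene = zip(*normalized)  # transpose: one tuple of values per gene
    scores = [(name, dispersion_index(col)) for name, col in zip(gene_names, per_gene)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:k]]


# Toy count matrix: 4 cells (rows) x 4 genes (columns); all values are invented.
toy_counts = [
    [5, 0, 5, 0],
    [10, 0, 10, 0],
    [0, 10, 10, 0],
    [0, 20, 20, 0],
]
toy_genes = ["g0", "g1", "g2", "g3"]

kept_counts, kept_genes = filter_genes(toy_counts, toy_genes)
normalized = median_normalize(kept_counts)
selected = top_dispersed_genes(normalized, kept_genes, k=2)
print(kept_genes, selected)
```

In the toy data, the gene that is never detected is removed by the filter, and the gene with identical normalized expression in every cell receives a dispersion index of zero, so it is ranked last.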
Determine the responsible genes for DLBCL
Each cluster is considered in order to identify the cell types that are responsible for Diffuse Large B-cell Lymphoma. Different studies show that T cells and B cells have a higher impact on this disease. Along with these two cell types, we observed that Natural killer (NK) cells are also responsible. To identify the potential markers of DLBCL, these three cell types and their subtypes are selected for further study. The DisGeNet database [13] is utilized to prepare the disease-specific list of genes, and this list is overlapped with the genes of the selected cell types. The resulting genes are considered as potential markers of the disease.
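The overlap step amounts to a set intersection per cell type. The sketch below is a hedged illustration only: `potential_markers` is a hypothetical helper, and the gene and cell type names are placeholders rather than entries from DisGeNet or from the clusters of this study.

```python
def potential_markers(cluster_markers, disease_genes):
    """Return, per cell type, the marker genes that also appear in the disease gene list.

    Gene symbols are compared case-insensitively; marker order is preserved.
    """
    disease = {gene.upper() for gene in disease_genes}
    return {
        cell_type: [gene for gene in genes if gene.upper() in disease]
        for cell_type, genes in cluster_markers.items()
    }


# Placeholder inputs (hypothetical names, not real annotations).
toy_cluster_markers = {
    "cell_type_1": ["GENE_A", "GENE_B", "GENE_C"],
    "cell_type_2": ["GENE_D", "GENE_E"],
}
toy_disease_genes = ["gene_b", "GENE_C", "GENE_E", "GENE_Z"]

overlap = potential_markers(toy_cluster_markers, toy_disease_genes)
print(overlap)
```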
Pathway semantic similarity calculation
The pathways of the selected genes of each cluster are considered [14] and used to calculate the pathway semantic similarity score. During pathway selection, the disease-specific pathways are eliminated. The interactions between pathways can be represented through a network constructed from the semantic scores [15]. Let a GO term G be represented as a directed acyclic graph DAG_G = (G, V_G, E_G), where V_G is the set of GO terms in the graph and E_G is the set of edges connecting them. The contribution of a GO term g1 to the semantics of GO term G is defined as the S-value of g1 related to G, denoted S_G(g1), which is computed for every term g1 in DAG_G.
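The defining expression for the S-value is not reproduced in this text. For orientation, the standard form used in the method of Wang et al. can be written in the notation above as follows. This is a reconstruction rather than the authors' own typesetting, and the symbols w_e (the semantic contribution factor of an edge, with 0 < w_e < 1) and SV(G) (the total semantic value of G) are taken from that method, not from this article.

```latex
\[
\begin{aligned}
S_G(G)   &= 1, \\
S_G(g_1) &= \max\bigl\{\, w_e \cdot S_G(g_1') \;:\; g_1' \in \text{children of } g_1 \text{ in } \mathrm{DAG}_G \,\bigr\},
            \qquad g_1 \neq G, \\
SV(G)    &= \sum_{g_1 \in V_G} S_G(g_1).
\end{aligned}
\]
```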
For two given GO terms, G and H, the semantic similarity (SS) between them is computed with the method proposed by Wang et al. [14] (equation 1). Here, S_G(g1) and S_H(g1) are the S-values of GO term g1 related to terms G and H, respectively, and V_H is the set of GO terms including term H as well as all its ancestors.
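As a hedged illustration of how such a similarity can be computed, the sketch below follows the usual formulation of Wang's measure: the S-values of the terms shared by V_G and V_H are summed for both terms and divided by the sum of the two total semantic values. The single edge weight of 0.8 for all relations is a simplifying assumption (the measure is commonly used with relation-specific weights), and the toy ontology with its term names is hypothetical.

```python
def s_values(term, parents, weight=0.8):
    """S-values of `term` and all of its ancestors, relative to `term`.

    `parents` maps each term to a list of its direct parent terms.
    The term itself gets 1.0; an ancestor gets the maximum, over its children
    in the sub-graph, of weight * S-value(child).
    """
    s = {term: 1.0}
    frontier = [term]
    while frontier:
        child = frontier.pop()
        for parent in parents.get(child, []):
            candidate = weight * s[child]
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                frontier.append(parent)  # re-propagate the improved value upwards
    return s


def wang_similarity(g, h, parents, weight=0.8):
    """SS(G, H) = sum over shared terms of (S_G(t) + S_H(t)) / (SV(G) + SV(H))."""
    s_g = s_values(g, parents, weight)
    s_h = s_values(h, parents, weight)
    shared = s_g.keys() & s_h.keys()
    numerator = sum(s_g[t] + s_h[t] for t in shared)
    return numerator / (sum(s_g.values()) + sum(s_h.values()))


# Hypothetical toy ontology: term_B and term_C are both children of term_A.
toy_parents = {
    "term_B": ["term_A"],
    "term_C": ["term_A"],
    "term_A": ["root"],
}

sim_bc = wang_similarity("term_B", "term_C", toy_parents)
sim_bb = wang_similarity("term_B", "term_B", toy_parents)
print(round(sim_bc, 4), round(sim_bb, 4))
```

Two sibling terms share every ancestor but not each other, so their similarity is below 1, while a term compared with itself scores 1.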

Disease Network
From each cell type, the selected potential markers are considered for establishing the disease network. The disease list is prepared by using the DisGeNet database. These diseases are selected based on (i) CNS diseases due to DLBCL metastasis reported in the literature and (ii) the selected markers and their pathways responsible for those diseases. Finally, the resulting diseases are shortlisted and the cell type-specific network is constructed between the CNS diseases and their responsible markers.
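The construction described above can be sketched as a bipartite marker-disease mapping built separately for each cell type. This is an illustrative sketch, not the authors' code: `build_disease_network` is a hypothetical helper, and all gene, disease and cell type names are placeholders rather than real gene-disease associations.

```python
from collections import defaultdict


def build_disease_network(markers_by_cell_type, gene_disease_pairs, shortlisted_diseases):
    """For each cell type, map every marker to the shortlisted diseases it is linked to.

    Markers without any shortlisted disease are left out of the network.
    """
    networks = {}
    for cell_type, markers in markers_by_cell_type.items():
        edges = defaultdict(set)
        for gene, disease in gene_disease_pairs:
            if gene in markers and disease in shortlisted_diseases:
                edges[gene].add(disease)
        networks[cell_type] = dict(edges)
    return networks


# Placeholder inputs (hypothetical, for illustration only).
toy_markers = {"cell_type_1": ["GENE_A", "GENE_B"], "cell_type_2": ["GENE_C"]}
toy_pairs = [
    ("GENE_A", "DISEASE_X"),
    ("GENE_A", "DISEASE_Y"),
    ("GENE_B", "DISEASE_W"),  # DISEASE_W is not shortlisted, so this edge is dropped
    ("GENE_C", "DISEASE_X"),
    ("GENE_Q", "DISEASE_X"),  # GENE_Q is not a selected marker
]
toy_shortlist = {"DISEASE_X", "DISEASE_Y"}

disease_networks = build_disease_network(toy_markers, toy_pairs, toy_shortlist)
print(disease_networks)
```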

Results
The scRNA-seq data help to reveal information regarding DLBCL. Hierarchical clustering is performed on the curated dataset to identify the cell types. The diverse cell types reflect the functionality of a particular tissue or organ and resolve the heterogeneity of an organism into a specific taxonomy. During this study, twelve clusters are identified, as shown in Figure 2A. Different colours are used to distinguish the clusters. In each cluster, genes are ranked by their expression value. Along with the cluster representation, the top differentially expressed genes of each cluster are visualized through a heatmap in Figure 2B. The cell type of each cluster is identified through the CellMarker and PanglaoDB databases. The potential biomarkers of each cell type, selected by using the DisGeNet database, are reported in Table 1. To show the expression level of the biomarkers, one biomarker from each cell type is displayed as a violin plot in Figure 3.
The potential markers are then used to identify the cell type-specific pathways. Pathways with the maximum number of associated markers and a high impact on DLBCL are curated from the Reactome database [16] and the KEGG Pathway database [17]. During this study, we noticed that most of the pathways are common among the cell types, but some are unique and cell type-specific, as reported in Supplementary Table T1. However, the marker associations of the common pathways are cell type-dependent. The biological processes of each associated pathway are considered. Based on these processes, a cell type-specific semantic network is established, as shown in Figure 4. Each coloured node represents a particular pathway, and the edge connecting two nodes depicts their similarity value. The highest semantic score among the pathways is highlighted in red. We found that the NF-κB signalling pathway and the Wnt signalling pathway possess high semantic scores in most of the cell types. Moreover, both TCR and BCR signaling pathways play a crucial role in DLBCL. Two cell types are not shown in Figure 4 because only one signalling pathway of each type is found in the databases. The semantic similarity networks of Natural Killer cells and Regulatory T cells are reported in Supplementary Figure S1. The cell-specific semantic network helps to reveal the pathways that are triggered by the potential markers and finally lead to metastasis. Furthermore, the literature shows that patients suffering from DLBCL are at high risk of brain metastasis. Interestingly, the pathways reported in our study play an important role in this progression. The brain-related diseases associated with both the selected potential markers and their pathways are considered to establish a cell type-specific disease network, shown in Figure 5. In each disease network, yellow oval nodes and blue rectangular nodes represent markers and diseases, respectively.
In addition, to highlight the connection of DLBCL with the diseases and markers, the DLBCL node is drawn as a green diamond.
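The pathway semantic network described in this section can be thought of as a complete weighted graph over the pathways of one cell type, in which the edge with the highest score is highlighted. The sketch below is only an illustration under that reading: the helper names, the pathway names and the similarity scores are hypothetical placeholders, and any pairwise similarity function could be passed in.

```python
from itertools import combinations


def build_semantic_network(pathways, similarity):
    """Return a dict mapping each unordered pathway pair to its similarity score."""
    return {(a, b): similarity(a, b) for a, b in combinations(sorted(pathways), 2)}


def strongest_edge(edges):
    """Return the pathway pair with the highest score, together with that score."""
    pair, score = max(edges.items(), key=lambda item: item[1])
    return pair, score


# Invented scores standing in for a real pathway semantic similarity measure.
toy_scores = {
    ("pathway_A", "pathway_B"): 0.42,
    ("pathway_A", "pathway_C"): 0.77,
    ("pathway_B", "pathway_C"): 0.15,
}


def toy_similarity(a, b):
    """Look up the placeholder score of an alphabetically ordered pathway pair."""
    return toy_scores[(a, b)]


network = build_semantic_network(["pathway_C", "pathway_A", "pathway_B"], toy_similarity)
best_pair, best_score = strongest_edge(network)
print(best_pair, best_score)
```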

Role of cell type-specific pathways in DLBCL
In our study, we found some important markers, namely LEF1, TCF7, CD79A and CD79B. The first two are from naïve CD8+ T cells, whereas the other two are from B cells. Each cell type possesses significant pathways with their own implications for the progression of DLBCL (e.g., Wnt signalling pathway, NOD-like receptor signaling pathway, Prolactin signaling pathway and Chemokine signaling pathway). LEF1 and TCF7 work as downstream transcription factors of the WNT signaling pathway [18]. This pathway plays a key role in embryonic development and maintenance, metabolism and cellular growth [19], and is also involved in the progression of carcinoma [20]. During the development of CD8+ T cells from naïve CD8+ T cells, the expression of these two markers decreases. Our detailed study of this cell type reported another pathway, i.e., cytokine-cytokine interaction. As the selected cells are part of the initial immune response, they are activated after detection of the antigen. Initially, the malignant and/or infected cells are fatally targeted by secreted cytokines [21]. However, disruption of this secretion may lead to pathogenic outcomes, e.g., multiple sclerosis. Furthermore, Reya et al. [22] reported that deficiency of LEF-1 hampers the proliferation and survival of pro-B cells, where the lack of LEF-1 contributes to the inability of the cells to respond to mitogenic WNT signals in the microenvironment. The WNT signaling pathway is directly connected to the BCR signaling pathway, which is the key regulator of B cells. Normal BCR signaling is associated with the proliferation, survival, apoptosis and differentiation of B cells. However, abnormal activation of this pathway is responsible for oncogenesis, e.g., DLBCL [23]. Interestingly, the two markers reported previously (CD79A, CD79B) are key players of the BCR signaling pathway in B cells.
In a subtype of DLBCL, mutations of these two markers are observed, which lead to dysregulation of the BCR signaling pathway and finally to oncogenesis. From the B-cell pathway list, we found another marker, CD40, which is significantly associated with multiple pathways. This marker is a surface receptor that can control the activity of Bruton tyrosine kinase (BTK) in B cells [24]. The expression of BTK is restricted; however, the pathway is involved in the differentiation and activation of the cell type. Also related to immune function, CD40 is responsible for transcription regulation and apoptosis modulation in both the Toll-like receptor (TLR) pathway and cytokine receptor signaling pathways. In the BCR signaling pathway, BTK is responsible for receiving signals from SYK and transducing them to initiate downstream signaling, i.e., the NF-κB signaling pathway, one of the important pathways in DLBCL [25]. Moreover, the WNT signaling pathway, along with the BCR signaling pathway, partially triggers the activation of the NF-κB signaling pathway [26]. The activation of this pathway can trigger oncogenic events mediated by mutations of CD79A, CD79B, CARD11, MYD88, CD40, etc. [27]. The classical NF-κB pathway is downstream of BCR signaling and of some other pathways such as the TCR signaling pathway, TNF signalling pathways and TLR signalling pathway. A large portion of adaptively induced antigen-specific regulatory T cells are present within tumor-infiltrating lymphocytes, peripheral blood lymphocytes, and/or regional lymph node lymphocytes [28]. Tumor-related immunosuppression, which can decrease the cytotoxicity, proliferation and cytokine secretion of type 1 T helper (Th1) cells, plays a crucial role in determining the severity of the disease and the therapeutic impact. The activity of regulatory T cells is highly controlled by the TCR signaling pathway [29]. No direct connection has been found between DLBCL and the TCR pathway.
However, different pathways that play a key role in DLBCL are activated due to TCR signaling. Therefore, it can be inferred that TCR signaling can partially control the progression of this aggressive B-cell lymphoma. Dong et al. reported that the typical co-inhibitory pair PD-1 and its ligands (PD-Ls) helps to balance immune function [30]. Usually, PD-1 negatively controls effector T-cell functions by inhibiting the T-cell receptor. However, this may lead to effector T-cell exhaustion and finally result in immune escape. This immune escape is highly responsible for oncogenesis, tumor aggression and metastasis in many types of malignancies. On the other hand, various studies show that the TCR signaling pathway has an important impact on the activation of the NF-κB signaling pathway, which contributes strongly to DLBCL progression [31]. Furthermore, regulatory T cells inhibit the function of important cell types such as CD8+ T cells and B cells through various mechanisms. Another cell type, natural killer cells, is also inhibited by regulatory T cells. We reported this cell type as one of the important cell types in Table 1 (cluster ID 9). Both CD8+ T cells and NK cells are cytotoxic effector cells of the immune system. However, their specificity, sensitivity and memory mechanisms are drastically different. Very little information is available on NK cell types. From the literature [32,33], it is observed that the marker IRF1 can critically regulate the TNF signalling pathway. Interestingly, we also noticed the association between this marker and the pathway in the NK cell type. The complex environment of DLBCL contains TNF-α along with chemokines and some cytokines produced in the mesenchymal matrix by different inflammatory cells. A recent study showed that sCXCL16 can reinforce TNF-α to mediate DLBCL cell proliferation [34].
In this process, NF-κB signaling is activated by sCXCL16 to increase the production of TNF-α in DLBCL cells.
Along with the pathways discussed above, some other significant pathways are also observed during the detailed study of cell-specific pathways. These pathways are also associated with the NF-κB signaling pathway and play an important role in DLBCL. Among these, the PI3K/AKT/mTOR signaling pathway is found to be activated in most DLBCL patients. Phosphatidylinositol-4,5-bisphosphate is phosphorylated to phosphatidylinositol-3,4,5-trisphosphate (PIP3) by the activated PI3K isoform. This conversion activates AKT, which plays an important role in the PI3K/AKT/mTOR pathway. Activated AKT (p-AKT) transmits pro-survival signals for anti-apoptosis, proliferation and cell growth through mTOR [35]. Studies of genetic ablation of BCR signaling in B cells reveal that the survival of mature B cells with BCR deficiency can be rescued by downstream PI3K signaling. Similarly, the downstream kinase PDK1 as well as activated PI3K are crucial for the survival of cell lines that carry a CD79B mutation. Inhibition of these activated pathways can be a promising target for future DLBCL therapeutic research. Also, the TLR signaling pathway acts as a stimulus to activate NF-κB in ABC-DLBCL (a subtype of DLBCL). TLR signaling is mediated by the adaptor protein MYD88 [36]. Another pathway reported for the CD8+ T cell type is the NOD-like receptor (NLR) pathway, which is responsible for multiple biological functions. These functions include initiating the activation of nuclear factor κB (NF-κB), stress kinases, interferon regulatory factors (IRFs), inflammatory caspases and autophagy. NLRs are divided into subfamilies according to the nature of their N-terminal domain; among these, the acidic transactivation domain subgroup (NLRA) shows a significant role in DLBCL. In the naïve CD8+ T cell type, a unique pathway, the Hippo-YAP signalling pathway, is noticed.
Yes-associated protein (YAP), a key effector of the Hippo pathway, is a transcriptional coactivator involved in balancing cell development and apoptosis, and is a drug target in a few human malignancies. Studies of the Hippo signalling pathway in the DLBCL environment are very limited, although there is evidence that DLBCL samples with a high expression of YAP show a poor prognosis. A recent study demonstrated that insulin-like growth factor-1 receptor (IGF-1R) acted as an upstream negative regulator of Hippo-YAP signaling [37]. It is noteworthy that IGF-1R inhibition resulted in decreased activation of Hippo-YAP signalling in DLBCL. It is also shown that the expression level and nuclear accumulation of YAP in DLBCL can be notably controlled by IGF-1R knockdown.

DLBCL metastasis leads to CNS disorder
It is well known that DLBCL patients have a tendency to develop central nervous system disorders. However, the pathways that cross the blood-brain barrier to help in the progression of these diseases are still unknown. Among the aforementioned pathways, alterations in the BCR, TLR and NF-κB pathways are found in more than 90% of primary CNS lymphomas [38]. As a result of DLBCL metastasis, a very rare type of CNS lymphoma is described as a subtype of DLBCL. In this microenvironment, the cancer cells either develop from lymph tissue in the brain or spinal cord (primary CNS lymphoma) or metastasize to the brain from other parts of the body (secondary CNS lymphoma). Another type, leptomeningeal lymphoma, is observed in these patients; it affects the spinal fluid that bathes the brain and spinal cord. This subtype is mostly considered a metastatic form of diffuse large B-cell lymphoma, and the patients show a very poor prognosis [39]. In Figure 5, the CNS diseases related to the markers of DLBCL are shown through a disease network.
Apart from CNS diseases, DLBCL patients may be affected by many other brain-related diseases, for example head and neck carcinoma. In a review paper, the authors reported that the NF-κB, Wnt/β-catenin and PI3-K/AKT/mTOR signalling pathways, which also have an important impact on DLBCL, are involved in head and neck carcinoma [40]. Some diseases similar to primary central nervous system lymphoma are glioma, meningioma and medulloblastoma. On the other hand, patients diagnosed with CNS lymphoma suffer from dementia, seizures and dizziness. During our study, we found genes associated with Alzheimer's disease (a common type of dementia), thymoma, multiple sclerosis, Parkinson's disease, etc. Studies show that CNS disorders can lead to these diseases in many cases. Based on this, we can conclude that primary and secondary CNS lymphoma may accelerate the occurrence of these diseases. Neoplasms are very uncommon in Parkinson's disease. Surprisingly, a study in 2001 described a woman with high-risk DLBCL presenting with Parkinson's disease [41]. Interestingly, we noticed a prolactin signalling pathway associated with the NK cell type. An elevated level of prolactin can have various causes, including the presence of DLBCL. In hyperprolactinemia, natural killer cells are reduced in number and function, and tumor progression starts in the pineal gland [42]. A case study [43] indicated a connection between lymphoma progression and hyperprolactinemia, although more clinical study is required for validation. Moreover, as the eye is the closest organ to the brain, ocular lymphoma may develop as an initial stage of primary CNS lymphoma. In this section, we discussed the connection between DLBCL and some of the CNS-related disorders that may result from the metastasis of DLBCL.
We noticed that potential markers of the selected cell types are involved in various brain-related diseases, and we showed the connections among the pathways and how they contribute to the progression of those diseases. However, further study based on the markers and pathways discussed here is required to establish a direct relationship between DLBCL and the reported diseases.

Conclusion
In this article, influential cell types for DLBCL are identified using a single-cell bioinformatics strategy. The objective is to identify the potential markers responsible for the pathway shift in a particular cell type. Moreover, these pathways are studied thoroughly to reveal the path through which DLBCL metastasis progresses and activates diverse CNS diseases. Initially, the cell types are identified from PBMC data. The significant cell types along with their potential markers are considered. We found that the markers LEF1 and TCF7, CD79A/B, and IRF1 belong to T cells, B cells and NK cells, respectively. They influence signalling pathways such as TCR, BCR, NF-κB and Wnt, which not only possess high semantic values but also play a vital role in the formation, progression and metastasis of DLBCL. Moreover, the aforementioned pathways are noticed to have an important involvement at the blood-brain barrier and to contribute to CNS disorder progression. Through a disease network, the relations between the markers and the possible diseases are established. Hence, the outcome of the study helps to bridge the gap in understanding the connection between DLBCL and CNS disorders.

Availability of data and material
All data generated or analyzed during this study are either taken from a publicly available database (https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.1.0/5k_pbmc_protein_v3) or included in this article.

Competing interests
The authors declare that they have no competing interests.

Funding
Not applicable.

Author's contributions
AD conceptualized the paper, performed the experiments and wrote the manuscript. UM corrected and edited the manuscript.

Figure 1 The flowchart of the proposed framework

Figure 4 Cell type-specific pathway semantic similarity networks of A. Naive CD8+ T cell, B. CD8+ T cell, C. Effector memory T cell, D. B cell are established by considering the biological processes associated with each pathway. The nodes represent particular pathways, and the edges connecting them define the weight between two pathways, in order to identify the most strongly shared biological processes. The bold red edge in each network indicates the highest association score between pathways.

Figure 5 The cell type-specific disease networks of A. Naive CD8+ T cell, B. CD8+ T cell, C. Effector memory T cell, D. B cell, E. Natural Killer cells and F. Regulatory T cells are constructed depending on the potential markers of each cell type. The yellow and blue nodes represent the markers and the associated CNS diseases, respectively. In each network, the targeted lymphoma is represented in green.

Tables
Table 1 The selected cell types along with their potential cell markers

Cluster ID   Cell type                Markers
3            Naive CD8+ T cell        LEF1, CCR7, FHIT, TCF7, CD27, RPS6
5            CD8+ T cell              SH2D1A, CCL5, LCK
6            Effector memory T cell   NCAM1, GZMB, FCGR3A, CD247, CCL4
7 & 11       B cell                   CD79A, MS4A1, TNFRSF13C, CD22, BLK, FCRLA, VPREB3, CD79B, PAX5, IGHM, CD19, HLA-DQA1, CD40, ARHGAP24, SPIB
9            Natural Killer cells     IRF1, MALAT1
10           Regulatory T cells       CD28, MAL