Single-cell RNA sequencing reveals the heterogeneity of peripheral blood mononuclear cells in renal cell carcinoma


Background

Peripheral blood mononuclear cells (PBMCs) are closely related to tumors, and the functions of T cells and B cells are related to tumor occurrence, development, and prognosis. Conventional second-generation sequencing cannot distinguish the characteristics of various peripheral blood mononuclear cell subsets, so it is impossible to study PBMCs accurately. Recently, single-cell sequencing technology has improved, and it provides a tool to study immune cells in circulating blood.
Methods

PBMCs from patients with renal cell carcinoma (RCC) were sequenced to characterize PBMC subsets, mainly B cells and Treg cells, and to study the relationship between RCC and peripheral blood immune cells.
Results

Our PBMC study of RCC patients successfully separated several types of immune cells from the blood. PBMCs can map the tumor to a certain extent.
Conclusions

We can expand the samples based on the current research, and further research will discover more meaningful information related to cancer.


Background
Renal cell carcinoma (RCC) is the most common type of renal tumor arising from the proximal renal tubules []. The 5-year survival rate is 44-92% for RCC patients who are diagnosed when the cancer is localized, and it decreases once the cancer has spread to regional lymph nodes or metastasized [2]. Early RCC symptoms are not typical and are often overlooked. As a result, many patients are diagnosed only at a late stage, when the tumor is large and the symptoms are obvious. Peripheral blood mononuclear cells (PBMCs) are composed of lymphocytes (T cells, B cells, NK cells) and monocytes. They have been shown to be involved in the formation and development of many diseases [3]. Using single-cell sequencing technology, previous studies have shown that immune cells in circulating blood are associated with tumors, autoimmune diseases, inflammation, and other diseases [4,5].
Currently, single-cell RNA sequencing (scRNAseq) is an important tool to establish cell lineage and identify tissue composition. Traditional RNA sequencing can only provide the average expression signal across the whole tissue and does not account for cell heterogeneity. In contrast, scRNAseq can describe subtle differences between cell subsets in complex mixtures such as tissues or blood samples [6,7]. In this study, PBMCs from renal cancer patients were sequenced to explore the characteristics of mononuclear cell subsets in RCC patients, and to study the relationship between RCC and PBMCs.

Methods
Ethics statement. The present study was approved by the Ethics Committee at Hainan Hospital. All samples were obtained from patients who were diagnosed with RCC between January 2019 and January 2020 at Hainan Hospital. All study subjects provided written informed consent.
Human samples. Samples were collected from four patients who were initially diagnosed with RCC. Blood was collected from the cubital vein before drug treatment or other antineoplastic therapy. PBMCs were isolated from human peripheral blood using Ficoll-Paque density gradient centrifugation. First, whole blood was diluted with an equal volume of phosphate buffered saline (PBS), and the diluted blood was layered onto the Ficoll-Paque gradient medium (GE Healthcare Bio-Sciences) and centrifuged at room temperature at 1000 × g for 20 min. The PBMCs were then carefully collected from the interface layer between the plasma and the Ficoll solution. The collected PBMCs were washed with PBS and centrifuged at 500 × g for 20 min, and the cell concentration was adjusted to 1 × 10^7 PBMCs/mL. A single-cell RNA library was then constructed.
Processing and analysis of single-cell RNA-seq data. The Seurat package (version 3.1.2) was used for gene expression data analysis. Cell demultiplexing was performed using the HTODemux function in the Seurat package. After single-cell identification, cells with more than 30% mitochondrial reads, fewer than 200 genes, or more than 5000 genes were excluded from the analysis. Downstream analysis only considered genes that were detected in more than five cells, and normalization, scaling, and dimensionality reduction steps were performed for each subset of PBMC data. Uniform Manifold Approximation and Projection (UMAP) was then used for two-dimensional representation of the data structure. After clustering, the "FindMarkers" and "FindAllMarkers" functions from the Seurat package were used to search for biomarkers of each cluster, and the cluster marker genes were determined from the expression differences. Markers that define a single cluster were identified by comparing that cluster with all other cells.
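The quality control thresholds above can be illustrated with a minimal Python sketch. This is not the Seurat code used in the study; the `CellQC` record, the function names, and the toy cells are hypothetical and serve only to show how the stated cutoffs (30% mitochondrial reads, 200 to 5000 genes, genes detected in more than five cells) act as filters.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CellQC:
    barcode: str
    n_genes: int      # number of genes detected in the cell
    pct_mito: float   # percentage of reads from mitochondrial genes


def passes_qc(cell, min_genes=200, max_genes=5000, max_pct_mito=30.0):
    """Keep a cell unless it has <200 genes, >5000 genes, or >30% mitochondrial reads."""
    return min_genes <= cell.n_genes <= max_genes and cell.pct_mito <= max_pct_mito


def genes_detected_in_enough_cells(counts_by_cell, min_cells=5):
    """Return the genes with a non-zero count in more than `min_cells` cells."""
    n_cells_per_gene = Counter()
    for gene_counts in counts_by_cell.values():
        for gene, count in gene_counts.items():
            if count > 0:
                n_cells_per_gene[gene] += 1
    return {gene for gene, n in n_cells_per_gene.items() if n > min_cells}


# Toy, made-up cells for illustration only (not data from this study).
cells = [
    CellQC("cell_a", n_genes=1500, pct_mito=4.2),   # kept
    CellQC("cell_b", n_genes=150, pct_mito=3.0),    # too few genes
    CellQC("cell_c", n_genes=6200, pct_mito=5.0),   # too many genes
    CellQC("cell_d", n_genes=900, pct_mito=42.0),   # too many mitochondrial reads
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

In this toy example only `cell_a` survives all three cell-level filters; the gene-level filter is applied separately to the count matrix.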
Identification of single-cell subpopulations. There are many automatic tools for single-cell subgroup identification, and they fall mainly into two categories: automatic recognition and semi-supervised. The most common automatic recognition tool is SingleR, which has built-in reference cell data from humans and mice. Its basic principle is to determine the cell type by calculating the correlation between a single cell and the built-in reference database. The advantage of this tool is that the user does not need to provide cell types and the corresponding marker genes. Its disadvantage is that it can only recognize cell types that already exist in the database, and it cannot recognize particularly fine cell subsets. Cell subpopulations can also be identified from classical marker genes of known cell types. Generally, a subgroup is not identified by a single gene but may require multiple genes. Classical marker genes are generally collected from the following two commonly used databases: CellMarker (http://biocc.hrbmu.edu.cn/CellMarker/) and PanglaoDB (https://panglaodb.se/index.html). In this study, we used classical marker genes combined with the automatic identification tool SingleR to identify the cell populations.
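The correlation-based principle described above can be sketched as follows. This is a simplified illustration and not SingleR itself: the use of Spearman rank correlation, the function names, and the four-gene toy profiles are assumptions made for the example.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def annotate(cell_profile, reference_profiles):
    """Assign the reference label whose profile correlates best with the cell."""
    scores = {label: spearman(cell_profile, ref)
              for label, ref in reference_profiles.items()}
    return max(scores, key=scores.get), scores


# Toy expression values over the genes CD3E, MS4A1, CD79A, NKG7 (made up).
reference = {
    "T cell": [9, 0, 1, 2],
    "B cell": [0, 8, 9, 1],
}
best, scores = annotate([1, 7, 6, 0], reference)
print(best, scores)
```

The sketch also makes the stated limitation concrete: a cell can only receive a label that is present in `reference`, so a cell type missing from the reference would still be assigned to its nearest available label.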

Results
Single-cell RNA-seq data quality. PBMCs were isolated and RNA-seq libraries were constructed for sequencing on the Illumina sequencing platform. Single-cell data sets usually contain various forms of unwanted signal, such as technical noise and batch effects. Handling these signals can improve downstream dimensionality reduction and clustering, and improve the reliability of data analysis. We obtained 7659 high-quality cells after quality control, based on the number of detected genes and the percentage of mitochondrial genes in the samples (Fig. 1A, B). After data normalization, highly variable genes were screened, and downstream analysis was performed using these highly variable genes (Fig. 1C).
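As a rough illustration of the normalization and gene screening steps, the sketch below log-normalizes counts per cell (scaling each cell to 10,000 total counts before a log1p transform, which mirrors the default of Seurat's NormalizeData) and then ranks genes by plain variance. Ranking by raw variance is a simplifying assumption; Seurat's own selection method is more elaborate. The function names and toy counts are hypothetical.

```python
import math
from statistics import pvariance


def log_normalize(cell_counts, scale_factor=10_000):
    """Scale a cell's counts to `scale_factor` total counts, then apply log1p.

    Assumes the cell has at least one non-zero count.
    """
    total = sum(cell_counts.values())
    return {gene: math.log1p(count / total * scale_factor)
            for gene, count in cell_counts.items()}


def top_variable_genes(normalized_cells, n_top):
    """Rank genes by variance across cells (missing genes count as zero)."""
    genes = sorted({gene for cell in normalized_cells for gene in cell})
    variance = {
        gene: pvariance([cell.get(gene, 0.0) for cell in normalized_cells])
        for gene in genes
    }
    return sorted(genes, key=lambda g: variance[g], reverse=True)[:n_top]


# Toy raw counts for four cells (made up for illustration).
raw_cells = [
    {"ACTB": 50, "MS4A1": 50},
    {"ACTB": 50, "MS4A1": 50},
    {"ACTB": 50, "FOXP3": 50},
    {"ACTB": 50, "CD3E": 50},
]
normalized = [log_normalize(cell) for cell in raw_cells]
print(top_variable_genes(normalized, 1))
```

In the toy data the housekeeping-like gene ACTB has identical normalized expression in every cell and therefore zero variance, whereas genes restricted to some cells rank highest, which is the behavior that makes highly variable genes useful for clustering.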
Visualization and exploration of single-cell sequencing data. In addition to PCA dimensionality reduction, Seurat provides several nonlinear dimensionality reduction techniques for visualization and exploration of single-cell sequencing data, which place similar cells together in low-dimensional space. Uniform Manifold Approximation and Projection (UMAP) is a recently published nonlinear dimensionality reduction technique. Compared with tSNE, UMAP has a faster running time, better consistency, more meaningful cell cluster organization, and better continuity preservation. In this study, PBMCs were clustered and seven major cell types were identified using UMAP: T cells, NK cells, monocytes, B cells, T regulatory (Treg) cells, plasmacytoid dendritic cells, and gamma delta T cells (Fig. 2A). SingleR was also used to annotate each of our PBMCs with the main cell type labels. The results included a heatmap of the group and label scores (Fig. 2B). Each group should exhibit a high score for one label relative to all of the other scores, indicating that the label assignment was unambiguous. The results showed that each cell group can be divided into different cell subtypes. For example, T cells can be divided into CD4+ effector memory, CD8+ effector memory, CD8+ naïve, and other subtypes.
B cell and Treg cell clusters in PBMCs. We identified each cell cluster and found that T cells and B cells accounted for the largest proportions, which is consistent with expectations (Fig. 3A, F). The UMAP plot shows that there is a large distance between B cells and other cell clusters. Further analysis showed that the genes CD79A, MS4A1, and IGHM had relatively specific expression in B cell clusters (Fig. 3C-E). We screened highly variable genes from B cells and analyzed these genes using GO terms (Fig. 3B). The results showed that they were mainly enriched in B cell activation, antigen receptor-mediated signaling pathways, antigen processing and presentation of peptide antigens through MHC class I, regulation of lymphocyte activation, interferon gamma-mediated signaling pathways, regulation of leukocyte proliferation, components of the lumen side of the endoplasmic reticulum, and B cell differentiation.
Tregs are an immunosuppressive subset of CD4+ T cells that are characterized by the expression of the major transcription factor forkhead box protein P3 (Foxp3). Tregs can inhibit tumor immunity, thereby hindering protective immune surveillance and the effective anti-tumor immune response of the host. This promotes tumor development and progression. Our UMAP analysis of Tregs showed that Foxp3, IL32, and GBP5 were highly expressed in Treg cells (Fig. 3H-J). We screened highly variable genes in Tregs and analyzed these genes using GO terms. They were enriched in regulation of T cell activation, leukocyte cell adhesion, regulation of lymphocyte activation, lymphocyte differentiation, cytokine secretion, lateral plasma membrane, focal adhesion, and cell-matrix adhesion junctions (Fig. 3G).
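The GO term analyses reported for the B cell and Treg clusters rest on an over-representation test. The text does not state which tool or statistic was used, so the sketch below assumes the common one-sided hypergeometric test; the function name and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """P(overlap >= n_overlap) when drawing n_selected genes at random.

    n_background: genes in the background set
    n_annotated:  background genes annotated with the GO term
    n_selected:   genes in the list being tested (e.g. highly variable genes)
    n_overlap:    tested genes that carry the GO term
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up numbers: 10 background genes, 3 annotated with the term,
# 3 genes selected, and all 3 selected genes carry the term.
p_value = hypergeom_enrichment_p(10, 3, 3, 3)
print(p_value)
```

A small p-value indicates that the term appears among the selected genes more often than expected by chance. In practice the p-values for all tested terms would also need correction for multiple testing.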

Discussion
In this study, we performed single-cell transcriptome analysis on PBMCs from RCC patients, and then focused on T and B cells. B cells outnumber T cells in bone marrow, but they are fewer than T cells in blood, lymph nodes, and the thoracic duct [8,9]. B cells begin as immature cells. Once activated, B cells differentiate into plasma cells and secrete antibodies, especially those against free antigens. In this study, PBMCs were isolated from RCC patients, a single-cell RNA library was constructed using a BD single-cell platform, and high-throughput sequencing was performed on a NovaSeq 6000 sequencing platform. Sequencing results were analyzed and cell clusters were identified. Classification of mononuclear cells showed that T cells and B cells accounted for the largest proportion. B cells were identified from the cell clusters, and CD79A, MS4A1, and IGHM were highly expressed B cell markers in B cells. GO term analysis of highly variable genes in B cell clusters showed that the results conformed to the functional expression of B cells. Seurat is an R package for single-cell RNA sequencing data that can identify and explain the sources of cellular heterogeneity in single-cell transcriptome data, and integrate various types of single-cell data for analysis and mining.
Treg cells are a subset of CD4+ T cells that have a significant immunosuppressive effect, which can inhibit the immune response of other cells, and they play an important role in maintaining immune balance and preventing autoimmune diseases and transplant rejection [10,11]. Treg cells can be divided into natural Treg cells and induced Treg cells (iTregs), but in this study they were not further classified. In peripheral blood, Treg cells account for 5-10% of the total number of CD4+ T cells [12,13]. Currently, it is believed that natural Tregs are derived from the thymus and that they play an inhibitory role mainly through a cell contact mechanism [10]. iTregs are generated from peripheral mature T cells under persistent stimulation by antigens as well as transforming growth factor (TGF-β) and other cytokines [14].
Foxp3, IL32, and GBP5 were markers of the Treg cell subset in our data. The GO terms enriched in this study mainly included regulation of T cell activation, leukocyte cell adhesion, regulation of lymphocyte activation, and lymphocyte differentiation. These GO terms agree with the known biology of Treg cells, which supports the reliability of our data.
This study had some limitations, including a small sample size. In addition, we only profiled the PBMC transcriptome and preliminarily classified PBMC subpopulations. We did not perform an in-depth analysis, in particular one that combines the sequencing data with clinical and pathological findings. Single-cell RNA sequencing is an important tool in tumor research, and this project has provided some information for our future research.

Conclusions
In summary, our PBMC study of RCC patients successfully separated several types of immune cells from the blood. PBMCs can map the tumor to a certain extent. We can expand the samples based on the current research, and further research will discover more meaningful information related to cancer.

Declarations
Ethics approval and consent to participate
All experimental protocols were performed in accordance with the Declaration of Helsinki and approved by the Ethics Committee 928th Hospital. All methods were carried out in accordance with relevant guidelines and regulations. Informed consent was obtained from all subjects and/or their legal guardian(s).

Consent for publication
Not applicable.
Availability of data and materials
The datasets that were used during the present study are available from the corresponding author on reasonable request.

Figures
Figure 1 Single-cell sequencing quality control. A, cell count, gene number, and mitochondrial gene percentage before quality control; B, cell count, gene number, and mitochondrial gene percentage after quality control; C, screening of highly variable genes.