Single-Cell RNA Sequencing Reveals Heterogeneity Among Adult Mouse, Primate and Human Kidney Cells

Multiple studies have been performed to map the kidney landscape of human and rodent, along with the development of sequencing technique. Although rodent disease models have been widely applied, many disadvantages also exist. Non-human primates (NHPs) are considered as the closest experimental animals to humans and show great advantages in the construction of animal models of human disease. Therefore, a comprehensive understanding of the heterogeneity and homogeneity between human and multiple animal kidney cells is important for further establishing animal models of human renal disease. Here, we generated the rst single-cell transcriptome data of normal adult cynomolgus monkey kidney using 10x Genomics scRNA-seq platform. Then, we further performed an in-depth comparison across species at the single-cell level, and our analysis indicated that the gene expression of adult primate kidney cells showed a better correlation with human kidney than mouse kidney. Furthermore, our results demonstrated that the cellular localization of GWAS-identied renal disease genes showed differences across species. The cellular localization of blood pressure associated genes in human displayed similarity to cynomolgus monkey. This study provided a reliable reference for further studies associated with renal diseases on NHPs. In addition, our results also provided a novel insight into the choice of renal disease animal model and a detailed explanation for close genetic relationship between NHPs and human at a single cell level. distinct types of kidney cells. Moreover, we demonstrate different gene expression proles across species at a single cell resolution and annotate the gene expression of complex traits disease in human, monkey and mouse. Our study will provide an experiment basis for establishing animal models of renal disease.

The nonhuman primate has been considered as important translational model that bridge the gap between laboratory research to clinical application. As early as in 1960s, Feldman et al. reported a rhesus monkey model of nephrotic syndrome with glomerulonephritis 21 . Liu et al. successfully established type 1 diabetic mellitus NHPs models via partial pancreatectomy and low-dose streptozotocin (STZ) administration, which were further induced to diabetic nephropathy (DN) models by long-term effects of hyperglycemia [22][23][24] . The DN monkeys showed a distinct metabolic disturbance which were similar to DN patients, including higher levels of VLDL/LDL, lipids, unsaturated lipids, uric acid, fumarate and hippurate, as well as lower levels of HDL, alanine, glutamate and NAD + . More importantly, the DN monkeys exhibited obvious glomerulosclerosis and tubulointerstitial lesions, which was consistent with previous reports about chronic renal injury in diabetic patients 24,25 . By means of intravenous administration of cisplatin, Moghadasali et al. induced acute kidney injury (AKI) and chronic kidney disease (CKD) in adult cynomolgus monkey 26,27 . Four days after injections of cisplatin, creatinine and urea levels were found to be signi cantly increased in treated monkeys in comparison with normal monkeys. The pathological changes of renal tubules in AKI monkeys showed cytoplasmic vacuolization, loss of brush border, epithelial cell detachment in addition to luminal cell debris and nuclear fragmentation, which appeared to better mimic human nephropathy 26 . Furthermore, the treated monkeys exhibited obvious pathological changes of CKD, including tubular loss, glomerular brosis and cortical brosis after long-term administration of cisplatin. 27 .
Using scRNA-seq, we constructed a single cell atlas of healthy adult cynomolgus monkey kidney and identify distinct types of kidney cells. Moreover, we demonstrate different gene expression pro les across species at a single cell resolution and annotate the gene expression of complex traits disease in human, monkey and mouse. Our study will provide an experiment basis for establishing animal models of renal disease.

Animals and ethics statement
One adult male and two adult female (10-12 years old) cynomolgus monkeys were raised at Wincon TheraCells Biotechnologies Co.Ltd (Guangxi, China), which is a non-human primate research facility accredited by AAALAC. The experimental protocols were approved by Institutional Animal Care and Use Committee of the Wincon TheraCells Biotechnologies. All animal experiments were conducted by certi ed veterinarians, according to the approved guidelines.

Preparation of single cell suspension from monkey kidney
To ensure the vitality of kidney tissues, three kidney tissue samples were harvested separately from three monkeys at different days. In brief, the monkeys were anesthetized with pentobarbital sodium (50mg/kg, iv), and then the left kidney was removed from the abdomen cavity and immediately immersion placed in ice-cold RPMI 1640 (WISENT, 350-006-CL) supplemented with 5% Fetal bovine serum (FBS, Gibco, 10099-141C) and 1% Antibiotic-Antimycotic (Gibco, 15240062). Then, the renal capsule was removed, and 0.5g of full-thickness kidney tissue were cut off using surgical scissors. After washing with 4℃ Dulbecco's phosphate-buffered saline (DPBS, WISENT, 311-425-CL) twice, the tissue was placed into a 60 mm cell culture dish containing DPBS and minced into 1 mm 3 pieces using micro scissors on ice. Subsequently, we collected the kidney tissue fragments into the centrifuge tube and centrifuged at 300 rpm for 3 minutes at 4℃. After discarding the supernatant, we used 10mL RPMI 1640 containing 0.1mg/ml Liberase TL (Roche, 05401020001) and 0.1 mg/ml DNase to digest the kidney tissue fragments for 40 min at 37℃. Digestion was terminated by RPMI 1640 containing 10% FBS. The cell suspension were then ltered through a 70 μm cell strainer and centrifuged at 300 rpm for 5min at 4℃. After discarding the supernatant, the red blood cells were removed by treating with RBC lysis buffer (Roche, 11814389001) for 5 minutes. The cells were then spun down and re-suspended in DPBS. After the supernatant was poured off, the cells were re-suspended in DPBS and ltered through a 40 μm cell strainer. The number of live cells were counted using trypan blue staining (Gibco, 420301).

Single-cell RNA library construction and squencing
The Chromium Single Cell 3' GEM, Library&Gel Bead Kit v3 (10X Genomics, 1000075) and Chromium Chip B Single Cell Kit (10X Genomics, 1000153) was used for single-cell RNA library preparation. The single cells were resuspended in PBS with 0.1% BSA to around 700-1000 cells per milliliter. Then, the cell suspension, Gel Bead and Partitioning Oil were dispensed into the Chromium Single-Cell B Chip, and the assembled chip was placed onto a Chromium Controller (10× Genomics) to generate single-cell Gel Bead-In-EMulsions (GEMs). GEM-reverse transcription (RT) was conducted using a T100 Termal Cycler (Bio-Rad). After RT, cDNA was recovered using Recovery Agent and then cleaned up using the Dynabeads Cleanup Mix. Subsequently, cDNA was ampli ed using a T100 Termal Cycler. The ampli ed cDNA product was cleaned up with SPRIselect Reagent Kit. Indexed sequencing libraries were constructed using Chromium Single Cell 3' Reagent Kit v3. Library quality was checked by the Agilent 2100 Bioanalyzer (performed by Genery Biotechnology, Shanghai). All libraries were sequenced on the Illumina Nova S6000 (performed by Genery Biotechnology, Shanghai).
scRNA-seq raw data processing Preliminary sequencing results (bcl les) were converted to FASTQ les with CellRanger (version 3.0, http://10xgenomics.com). The FASTQ les were then aligned to the cynomolgus monkey genome reference sequence, Macaca_fascicularis_5.0. Meanwhile, we also generated a le containing a barcodes table, a genes table, and a gene expression matrix using Cell Ranger. Next, we obtained an overview website containing a considerable amount of information, such as number of cells, median number of detected genes and others.
Due to these samples were obtained from three cynomolgus monkey and sequenced in batches, NHPs IDs were used to eliminate potential batch effect. After data normalization, we identi ed top 2000 variable genes and created potential Anchors with FindIntegrationAnchors function of Seurat. Then, IntegrateData function was used to integrate data and create a new matrix with 2000 features, in which potential batch effect was regressed out. We then used principal-component analysis (PCA), with the variable genes as input, and identi ed signi cant principal components based on the jackStraw function. A total of 20 PCs were used to perform the downstream analysis. The cell clusters were identi ed by FindCluster function with a resolution of 0.2 and then were visualized with 2D UMAP plots. Next, we nd differentially expressed genes (DEGs) between each type of cells using FindAllMarkers function.
Human and mouse data from NCBI were analyzed using a similar method. NCBI data available from the Gene Expression Omnibus (GEO) and Sequence Read Archive (SRA), accession GSE131685, GSE107585, SRR6348586, SRR6348583, SRR6348584 and GSE141115.

Gene expression comparison cross-species
For species-based pro le analysis, we rst transferred the cynomolgus monkey and mouse gene names into human homologous gene names with the ENSEMBL R package BioMart (version 2.44.4, http://www.biomart.org/). We discarded genes with no gene names and retained identical names between the human and mouse, human and cynomolgus monkey kidney database. Correlation coe cients between human and cynomolgus monkey, and between human and mouse kidney database were calculated, and regression curves were plotted via scatter diagram using ggplot2 package. Subsequently, we used the Subset function of Seurat to select mutual types of cell among human, monkey and mouse kidney database, and performed correlation analysis with a similar method.
Based on the average expression of homologous genes, we used pheatmap package to plot heatmap for the expression of complex traits kidney diseases related genes across species. Next, the expression of target genes related to hypertension therapy were plotted using Vlnplot function.

Results
scRNAseq pro ling and clustering of adult cynomolgus monkey kidney cells We sorted and sequenced a total of 43,899 cells from kidney cell suspensions from one male and two female adult healthy cynomolgus monkey. After stringent quality control, 28,019 high-quality cells were retained for further analysis. The initial UMAP analysis revealed 37 cell clusters (Additional le Fig. 1A). Based on cell-speci c markers, we identi ed each cell clusters unbiasedly. For example, based on the markers LRP2, ALDOB and MT1G, we identi ed cluster 0-2, 4, 7-8, 14-18, 20 and 26 cells as proximal tubule cells 8,28,29 . Cluster 5, 11 and 29 cells expressed high levels of EMCN and IGFBP5, which are well accepted as molecular markers for endothelium 30 . Similarly, cluster 6 and 28 cells were identi ed as intercalated cells and principle cells respectively according to ATP6V0D2, SPINK1, ATP6V1G3 and FXYD4 31 . Finally, our single kidney cell pro ling revealed 11 distinct cell types, including proximal tubule cells, endothelium, intercalated cells, principle cells, loop of Henle, distal tubule cells, T/NK cells, B cells, macrophages, myo broblasts and a small cluster of erythroid cells. (Fig. 1A, 1B) Integrating and clustering of human and mouse kidney scRNAseq database To gain a better comparison of the scRNAseq transcriptomic pro les of kidney across species, we integrated four healthy human and thirteen healthy mouse kidney scRNAseq database respectively by nding 'anchors' among different datasets 32 . After quality control, 44,795 human kidney cells and 64,966 mouse kidney cells were retained for downstream comparison. We identi ed 31 cell clusters representing 9 distinct cell types through scHCL analysis on integrated human database (Fig. 1B, 1D, Additional le Fig. 1B). Cluster 22 cells were identi ed as loop of Henle, which expressed high levels of UMOD and SLC12A1 33,34 , although they also expressed low levels of distal tubule cells speci c markers (CALB1 and SLC12A3). We could not de ne myo broblasts and podocytes due to the low expression levels of ACTA2, MYL9, TAGLN and NPHS2 within all clusters.
Integrated mouse database was identi ed into 28 cell clusters and 11 cell types. Podocytes were identi ed clearly according to the expression of Nphs2 and Cdkn1c 9 . (Fig. 1C, 1F, Additional le Fig. 1C)

Correlation analysis of kidney cells across species at single-cell level
We calculated the average gene expression of all cells in three databases within mutual clusters across species. cynomolgus monkey and mouse gene names were transferred into human homologous gene names using R package biomaRt 35 . Finally, 12,993 homologous genes derived from cynomolgus monkey database and 13,056 homologous genes derived from mouse database were retained for downstream analysis.
First, we compared the expression of all homologous genes between human and monkey kidney cells, as well as between human and mouse kidney cells ( Fig. 2A, 2B). The Pearson correlation coe cients were 0.806 and 0.778, respectively (Fig. 2C, 2D). Some genes, such as DHDH, MT3 and FUOM were highly expressed in human kidney cells, but showed low expression in kidney cells from cynomolgus monkey. On the contrary, IGFBP5, FUOM and GPX3 were highly expressed in monkey kidney cells, but is low in human kidney cells (Fig. 2C, Additional le Data1). The expression of CRYAB, RPL35 and GAPDH were lower in mouse kidney cells than human kidney cells. KLK1, SLC34A1 and FXYD6-FXYD4 were highly expressed in mouse kidney cells, but is low in human kidney cells (Fig. 2D, Additional le Data1).
Next, we analyzed all cell types in these three species and found that some cynomolgus monkey and human kidney cells were well correlated. Of all these cells, high correlations were observed between cynomolgus monkey and human PT cells, endothelium, and T cells, the Pearson correlation coe cients were 0.838, 0.824 and 0.836 respectively. Similarly, we observed high correlations between mouse and human PCs and B cells, the Pearson correlation coe cients were 0.794 and 0.815, respectively (Additional le Fig. 2). Interestingly, we also noticed that cynomolgus monkey and mouse macrophages showed well correlation with human neutrophils, the Pearson correlation coe cients were 0.84 and 0.814, respectively. Furthermore, we identi ed species-speci c and shared DEGs of these high correlated cells (HCCs) in cynomolgus monkey, human and mouse kidney. For example, between cynomolgus monkey and human speci c markers, B2M, SPP1 and GPX3 were speci cally expressed in human PT cells. In contrast, these genes were poorly expressed in cynomolgus monkey. On the contrary, MT3, FUOM and DHDH were speci cally expressed in cynomolgus monkey PT cells (Fig. 2E). In addition, ALDOB, PFDN1 and RARRES2 were expressed in both human and cynomolgus monkey (Additional le Data1).

To investigate the correlation of different subtypes of high correlated cells clusters among different species, we used Seurat SubsetData function to select the PT cells, endothelium and T cells in both human and cynomolgus monkey as well as B cells and PCs in both human and mouse databases, and
performed re-clustering respectively. Further correlation analyses were conducted based on these subsets of HCCs. We found that most subsets of HCCs were relatively well correlated, but heterogeneity still exists.
Poor correlations were observed between PT18 cells in cynomolgus monkey kidney and all the subtypes of PT cells in human kidney (Fig. 2F).

Complex trait disease genes show heterogeneity of cell-type speci city across species
A previous study had demonstrated that that hereditary kidney disease-related genes showed cell-type speci city in mouse kidney. However, there are very few similar studies in human and nonhuman primates. Therefore, we annotated the expression of complex trait disease genes that have been associated with chronic kidney disease (CKD), systemic lupus erythematosus (SLE), serum metabolite levels and blood pressure (BP) in human, cynomolgus monkey and mouse kidney cells (Fig. 3A-D). Our result indicated that most genes involved in these complex trait diseases show a similar cell type-speci c expression pattern in the three species, but differences were also observed across species. For example, the expression of some genes associated with CKD was mainly speci ed in human PT cells, endothelium and PCs, whereas these genes were mainly expressed in cynomolgus monkey PT cells and ICs and in mouse PT cells, PCs and podocytes. The genes associated with serum metabolite levels such as SLC17A3, BHMT and SLC7A9 showed strong enrichment for PT cells. Besides, we noticed that other genes associated with serum metabolite levels were also highly expressed in PCs, endothelium and aLOH of human kidney, but similar results were not observed in cynomolgus monkey and mouse kidney cells.
Among all of the genes related to complex trait diseases, the genes associated with blood pressure exhibited signi cant different expression patterns across species. Previous study has demonstrated that BP-associated genes were mostly expressed in mouse collecting duct cells (including ICs and PCs). In our integrated mouse database, the expression of BP-associated genes also showed enrichment for cell types of myo broblasts, podocytes and macrophages. However, the BP-associated genes showed very high expression in PT cells, collecting duct cells and endothelium in human and cynomolgus monkey although they are also expressed in myo broblasts in cynomolgus monkey. Furthermore, we found that the expression of genes of the renin-angiotensin system (including ACE, ACE2, MME, ECE1, AGTR1, AGTR2 and MAS1) displayed species differences (Fig. 4A-C). For example, the expression of AGTR1 and AGTR2 were not found in human kidney. Meanwhile, the expression of AGTR2 and MAS1 were not found in cynomolgus monkey kidney and the expression of Agtr2a was not found in mouse kidney. The expression of MME was mainly in PT cells of human and cynomolgus monkey kidney, whereas Mme were expressed in PT cells, aLOH and ICs in mouse kidney. Our single cell transcriptome analysis will provide valuable information to establishing animal model of kidney disease.

Discussion
Nonhuman primates (NHPs) play an important role in the establishment of disease models due to their similarity to humans in physiology, anatomy, immunology and genetics. For a half century, NHPs have been widely used in biomedical research. There has been a relative increase in the importance and advantage of using NHPs as pre-clinical models in infectious diseases 36,37 , ageing-related and neurological disorders 38,39 , endocrine diseases 40,41 and others. Nevertheless, the differences between human and NHPs should not be neglected, and more efforts need to be done to distinguish the cellular and genetic differences between NHPs and human.
Here, we provide a molecular de nition of cell types in normal adult cynomolgus monkey kidney by single-cell RNA-sequencing of 43,899 cells. In the current study, scRNAseq data revealed 11 distinct cell types by quantitative gene expression, including PT cells, endothelium, ICs, PCs, aLOH, DT, myo broblasts, three types of immune cells (NKT cells, macrophages and B cells) and a small cluster of erythroid cells, and these cells included almost all previously described cell types in human and rodent kidney [8][9][10] . Our work complements the de ciency of the research on single-cell transcriptome atlas of NHPs and provides a reference for future investigations of NHPs kidney cells as well as for studies on kidney diseases in NHPs at single cell resolution.
Our scRNAseq data also provides novel insights into the correlation between NHPs and human kidney cells. We demonstrated that transcriptome of human kidney was more closely correlated with that of nonhuman primate kidney than that of mouse kidney. The gene expression of PT cells, endothelium and ICs in cynomolgus monkey kidney showed the highest correlation with gene expression of these cells in human kidney. Previous study revealed the sex diversity in human and mouse renal PT cells as well as a shared and species-speci c differential gene expression between human and mouse PT cells 42 . In our study, we performed a correlation analysis of PT cells and subsets of PT cells across species. The result demonstrated that PT cells and most of the subtypes of PT cells in cynomolgus monkey kidney were well correlated with those cells in human kidney. Interestingly, we noticed that a speci c subtype of PT cells in cynomolgus monkey kidney showed a low correlation with all of the subtypes of PT cells in human kidney. The function of this cluster of PT cells in cynomolgus monkey kidney needs further investigation.
In short, this is the rst time that reveal the similarities and differences between human and monkey kidney cells at a single cell level.
Park et al. has demonstrated that the expression of kidney disease-associated genes is restricted to speci c kidney cell types 9 . In this study, we applied a similar approach to the analysis of the relation between human and monkey kidney cells and kidney diseases. Our results revealed the heterogeneity of the expression of complex trait kidney disease associated genes across species. Genes associated with plasma metabolite are predominantly expressed in human PT cells, endothelium and PCs, However, PCs and endothelium shows low enrichment for metabolite related genes in monkey and mouse. Additionally, we noticed that BP-associated genes also highly expressed in human and monkey PT cells and endothelium, whereas the previous study in mouse kidney demonstrated these genes were mainly expressed in collecting duct cells. Therapeutic target genes for hypertension also showed signi cant differences across species. For example, MME highly enriched in PT cell of human and cynomolgus monkey kidney, whereas it was widely expressed in PT cells, ICs, aLOH and podocytes in mouse kidney.
In summary, we generates the rst single-cell transcriptomic atlas of the adult primate kidney and uncovers across-species differences in kidney gene expression at single-cell level. This study will improve our understanding of primate kidney biology. Furthermore, the data in this study will also provide useful information for the establishment of animal models of various renal disease in different species and provides a foundation for future studies.

Declarations
Authors' Contributions J.L. and Y.L. performed RNA-seq experiments, made cDNA library; S.C. performed single-cell RNA-seq analyses, made gures, and wrote the paper; Y.C. wrote the paper; C.Z. and J.L. dissected monkey kidney tissues, performed RNA-seq experiments; Q.S. and L.W. discussed the draft paper, and critically reviewed the manuscript; C.H. and K.D. offered bioinformatics help. Z.C. and Z.M. conceived of and supervised the project, analyzed data, made gures, and wrote the paper; and all authors read and commented on the manuscript.   Cell-type speci c expression of GWAS-identi ed complex traits renal disease genes. (A-D) Averaged expression of genes was calculated in each cell type. The color scheme is based on z-score. In the heatmap, each row represents one gene and each column is single cell type.