Single-cell profiling of human OA cartilage chondrocytes
To explore the genes closely related to the development of chondrocytes, we acquired 480 chondrocyte samples from 3 patients with OA (OA1, OA2 and OA3). Each OA included 160 chondrocytes. The program Seurat were used to analyze data. The distribution pattern chart showed the gene numbers and the sequencing count of each cell (Figure 1a). Most cells of the 3OAs included 1000-5000 genes. The sequencing count of cells focus on the range of 950000-980000. We did not find the mitochondrial sequencing count from the data. By analyzed the correlation between nFeature and nCount, the significantly positive correlation between nFeature and nCount were found (Figure 1b). We also identified the gene symbols with significant difference across cells (Figure 1c), of which10 significantly differentially expressed genes (COL1A1, ZNF585B, SLC5A12, COL1A2, IBSP, SPARCL1, TMSB4X, SPP1, CCL2, COL14A1) across the cell samples have been found.
Furthermore, a PCA was performed based on the 1500 most variable genes. The correlated genes in each component were screened. The expression profile of 30 top relative genes and the correlation analysis of top relative genes in each component were shown in Supplementary Figure 1. The distribution of cells under the two principal components PC1 and PC2 showed an obvious dispersion, indicating that our PCA could be used to map the cells into two dimensions (Figure 1d). The other components were calculated with an estimated P value, and we selected the significant components for subsequent analysis (Figure 1e). Subsequently, cells were clustered based on a graph-based clustering approach, and were visualized in dimensional space using t-distributed Stochastic Neighbor Embedding (tSNE). By tSNE algorithm we classified the cells into 6 clusters (Figure 1f). However, subgroups of the cells could not be distinguished. The 6 clusters were all annotated as chondrocyte. Therefore, a deeper analysis was needed to find the subgroups of cells.
Cell trajectory analysis and annotations
In order to find the subgroups of cells, we first revealed the tendency curve of different clusters by cell trajectory analysis. Before conducting the cell trajectory analysis, we performed differential analysis using the limma package and identified a total of 1010 candidate marker genes with |log fold change| > 0.5 and adj Pval < 0.05 (Supplementary Table 3). The top 10 differential genes between the six clusters in the heatmap plot were illustrated in Figure 2a. Moreover, we constructed the trajectory analysis to reveal the tendency curve of the six clusters (Figure 2b). The results showed the cells of cluster2, cluster3 and cluster5 may had significant difference between each other and may belonged to different subgroup of chondrocyte. In addition, we empirically defined the subgroup of chondrocyte in each cluster by candidate marker genes. By annotation, we found the cells of cluster2 belonged to HTCs. The cells of cluster3 belonged to homeostatic chondrocytes (HomCs) which were defined by Ji et al. [16]. The cells of cluster5 belonged to FCs. We failed to annotate the cells of cluster0, cluster1 and cluster4. We suspected that the cells of cluster0, cluster1 and cluster4 may belong to different period of HTCs, HomCs and FCs. However, these existed a low degree of heterogeneity of the candidate marker genes among the 3 clusters. Therefore, we could not define the specific cell subtype or cell period of the 3 clusters. By analyzing these results, we suggested the trajectory analysis combined with the annotation to candidate marker genes of tSNE was helpful to defining the types of cells between different clusters. For the top 10 candidate marker genes of HTC (cluster2), HomC (cluster3) and FC (cluster5), we illustrated the significantly differential expressions of these genes in six clusters via bubble plot (Figure 2c-e). The plots showed most of these genes were specifically overexpressed in the cells of each cluster.
Function and pathway enrichment analysis
By cell trajectory analysis and annotation of candidate marker genes in 3 cluster (cluster2, cluster3 and cluster5), we defined HTCs, HomCs and FCs. To testify the validity of the cell annotation to 3 cluster and the function of candidate marker genes in the cells of 3 cluster, we carried out the GeneOntology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the candidate marker genes in the 3 cluster respectively. For the genes belonging to HTCs, most functions including oxidative stress, epithelial cell proliferation and growth factor activity were closely related to HTCs (Figure 3a). Similar to GO analysis, the KEGG analysis of the genes in HTCs also had some important pathways (ECM-receptor interaction, IL-17, TNF) (Figure 3b). The genes of these pathways all play important roles in HTCs. The results of GO and KEGG analysis to genes in HomCs showed some functions such as unfolded protein, temperature stimulus and heat shock protein could modulate cellular homeostasis (Figure 3c-d). These functions and pathways illustrated HomCs may regulated cell cycle, development, RNA metabolism and biosynthesis. FCs have close relationship with collagen formation. The function and pathway enrichment analysis of the genes in cluster 5 showed some notably functions with collagen (collagen metabolic, collagen trimer and collagen binding) (Figure 3e-f). These results further evidence the correctness of the annotation to cells in 3 clusters. The results of GO and KEGG analysis also indicate these candidate marker genes may play significant roles during the process of proliferation and differentiation for chondrocytes.
Determining the expression level of candidate marker genes among resCs, proCs, HTCs and FCs
The GO and KEGG analysis identified the annotation of clusters was accurate. In order to confirm the candidate marker genes indeed had aberrant expression in different subtypes of chondrocyte, we detected the expression level of top 10 candidate marker genes in 3 clusters among resCs, proCs, HTCs and FCs. We first used ATDC5 cells to separate resCs, proCs and HTCs. We separated the 3 subtypes of chondrocyte by comparing the differential expression of marker genes (resC: SOX9 and Fn1, proC: COL2A1 and Runx2, HTC: COL10A1 and COL1A1) among 7d, 14d and 21d in ATDC5 cells (Supplementary Figure 2). For FCs, we used L-929 cells which were proximal cartilage fibroblasts of mouse as FCs.
After finishing the preparation of subtypes of chondrocyte, we detected the expression level of top 10 candidate marker genes in 3 clusters among resCs, proCs, HTCs and L-929 cells. For HTCs, the results showed the 10 genes all had higher expression level in recCs, proC, HTCs compared with L-929 cells. The genes of TNFAIP6, GAS1, SLC7A2 and TNC were markedly upregulated in HTCs (Figure4a-b). The results of top 10 genes in HomCs showed the 10 genes all had higher expression level in recCs, proC, HTCs. However, compared with top 10 genes in HTCs the expression level of most genes in HomCs did not have obviously differential expression among resCs, proCs, HTCs (Figure4c-d). When examined the expression level of top 10 genes in FCs, we found the expression of most genes up-regulated in L-929 except for CFD, SFRP2 and SPARCL1. The gene expression of COL1A1 and COL1A2 also showed up-regulated in HTC compared with resC and proC (Figure4e-f). These results implied that most candidate marker genes may have special expression in their own subgroup.
Identification of hub genes from candidate marker genes in HTCs, HomCs and FCs
To further identify the key genes in each cell subset. Protein-protein interaction (PPI) network of candidate marker genes in 3 clusters was constructed by Cytoscape software respectively, based on STRING database (Supplementary Figure 3). The number of genes which were used to construct PPI in HTCs, HomCs and FCs were 104, 175 and 329, respectively. The top 20 candidate marker genes (HTCs: the degree of connectivity≥8, HomCs: the degree of connectivity≥19, FCs: the degree of connectivity≥40) with high degree of connectivity were displayed (Figure5a-c). These top genes might play critical roles in chondrocyte progression.
CD44, JUN and FN1 were the hub genes with highest degree of connectivity in each cell subset. In order to test the relationship between candidate marker genes and cell subset. The expression of the 3 genes in the 4 chondrocyte subsets was detected. The expression of CD44 was gradually increased from resC to HTC of chondrocytes (Figure 5d). The result of bubble plot also showed CD44 was specifically overexpressed in HTCs (cluster2) (Supplementary Figure 4). JUN was a candidate marker gene that was specifically overexpressed in HomCs (cluster3). The expression of JUN was contrary to CD44, it was high expressed in resC, proC and HTC compared with L-929 (Figure 5d). The expression level of FN1 was markedly upregulated in L-929 which was consistent with the result of FN1 in bubble plot (Supplementary Figure 4). These results combined with previous analysis confirmed some candidate marker genes can reveal the progression of chondrocytes. CD44, JUN and FN1 may play important roles in different subtypes of chondrocyte.