Cell purity assessments
The purity of the CD4+ cells from both tissues was ~95% (Supplementary Figures S1-3, Additional File 1); however, the thymic samples had a higher average viability than the blood samples (88% vs 77%; data not shown). The average purity of the CD8+ T cells was ~95% in the thymic tissue (Supplementary Figure S4, Additional File 1). Notably, the positive selection assay performed better in adult blood than in infant blood, with purity scores of 95% and 75%, respectively (Supplementary Figures S5-6, Additional File 1). CD3 staining of the CD8+ pool showed that > 90% of the T cells were CD3+ (Supplementary Figure S7, Additional File 1), suggesting that only a small portion could be NK/NKT cells, immature thymocytes or other CD8+CD3- cells. The average viability of thymic CD8+ cells was 63%, while that of blood CD8+ cells was 71% (data not shown). We detected suspected double-positive CD4+CD8+ thymocytes in the CD4+ thymocyte population (Supplementary Figure S1, Additional File 1), and vice versa (about 10%) (Supplementary Figure S4, Additional File 1). In the infant blood we observed 2% CD4+ cells in the CD8+ population (Supplementary Figure S5, Additional File 1), while in adult blood we observed 5% CD4+ cells in the CD8+ population (Supplementary Figure S6, Additional File 1). We also found traces of CD8+ T cells in the isolated CD4+ T cells. This was seen, to a lesser extent, in CD4+ adult blood (~2% CD8+ cells, Supplementary Figure S3, Additional File 1).
Descriptive statistics
Figure 1 provides a graphical overview of the experimental design and workflow. For the SP CD4+ and CD8+ T cells from infant thymus and blood, we used 3–5 biological replicates (ages 5 days to 15 months), while peripheral blood CD4+ and CD8+ T cells from adults were pooled from five individuals (23–45 years). Across all 18 transcriptome profiles generated, the sequencing depth ranged from 69 to 122 M reads (Supplementary Table S1, Additional File 2). However, the sequencing data, particularly from the CD8+ T cells, contained a considerable proportion of multimapping reads (28–86%). Nevertheless, after multimapping reads were excluded from further analysis, the estimated library sizes remained satisfactory for detecting DE genes (> 10 M) (18) in 14 of the 18 samples.
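The library-size criterion can be illustrated with a minimal sketch. The sample names, read depths and multimapping fractions below are hypothetical placeholders, not values from Supplementary Table S1; only the 10 M threshold is taken from the text. The sketch also assumes, as a simplification, that every read that is not multimapping contributes to the library, which ignores unmapped reads.

```python
# Hypothetical per-sample values; NOT taken from Supplementary Table S1.
MIN_LIBRARY_SIZE = 10_000_000  # threshold cited in the text (> 10 M reads)

samples = {
    "sample_A": {"total_reads": 100_000_000, "multimapping_fraction": 0.30},
    "sample_B": {"total_reads": 70_000_000, "multimapping_fraction": 0.86},
}


def usable_library_size(total_reads, multimapping_fraction):
    """Estimate the reads left once multimapping reads are excluded.

    Simplifying assumption: all remaining reads count towards the library.
    """
    return round(total_reads * (1.0 - multimapping_fraction))


def passes_threshold(total_reads, multimapping_fraction, minimum=MIN_LIBRARY_SIZE):
    """True if the estimated library size exceeds the minimum."""
    return usable_library_size(total_reads, multimapping_fraction) > minimum


for name, s in samples.items():
    size = usable_library_size(s["total_reads"], s["multimapping_fraction"])
    status = "ok" if size > MIN_LIBRARY_SIZE else "too small"
    print(f"{name}: {size:,} usable reads ({status})")
```

Under these assumed inputs, a deeply sequenced sample can still fall below the threshold when most of its reads are multimapping, which is the situation described for some of the CD8+ samples.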
The thymic and peripheral blood T cell transcriptome
RNA-seq of human CD4+ and CD8+ T cells, derived from infant thymus as well as from infant and adult peripheral blood, detected 44,282 known coding transcripts (Fig. 2A). In addition, 25,344 potentially novel alternative transcripts, 314 novel long non-coding RNAs (lncRNA) and 200 novel transcripts of uncertain coding potential (TUCP) were uncovered. The novel alternative transcripts displayed the largest range in number of exons, with 26.5% of the transcripts exceeding 20 exons (Supplementary Figure S1A, Additional File 3), showed a high coding probability (median 0.99, Supplementary Figure S1B, Additional File 3), and comprised the longest transcripts, with 30% exceeding 10 kb (Supplementary Figure S1C, Additional File 3). The median coding probability was also high for the generally shorter TUCP (0.67), while it was very low (0.004) for the novel lncRNA. Both TUCP and lncRNA had a median of two exons. When thymic SP T cells were investigated exclusively, 39,965 known transcripts, 20,764 potentially novel alternative transcripts, 252 potentially novel lncRNA and 171 transcripts of uncertain coding potential were detected (Supplementary Figure S1D, Additional File 3). When comparing the groups according to tissue and age, the combined CD4+ and CD8+ thymus-derived transcripts were the most numerous in all categories (known coding, novel lncRNA, novel alternative transcripts and TUCP; Table 1), and adult blood transcripts were consistently the least abundant.
Table 1
Number of known coding transcripts, potentially novel lncRNA, tentative alternative transcripts and TUCP (transcripts of uncertain coding potential) identified in CD4+ and CD8+ T cells derived from thymus, infant blood and adult blood.
Cell | Group | Known coding | Novel lncRNA | Novel alternative transcripts | TUCP |
CD4 | adult blood | 28878 | 91 | 12815 | 55 |
CD4 | infant blood | 34525 | 145 | 18691 | 86 |
CD4 | infant thymus | 34252 | 160 | 15306 | 76 |
CD8 | adult blood | 17000 | 44 | 6262 | 38 |
CD8 | infant blood | 25740 | 92 | 10718 | 58 |
CD8 | infant thymus | 34833 | 182 | 17151 | 145 |
Genes expressed in T cells from human thymus and blood
RNA-seq of the primary T cell subsets from human thymus and blood identified transcripts from 18,218 known genes in total, after filtering out lowly expressed genes (< 1 count per million) (Supplementary Figure S2, Additional File 3). Of these, 14,441 (79%) were protein coding (representing 61% of Ensembl protein-coding genes), 2,501 were lncRNA, 944 were pseudogenes and 332 were non-coding RNA (ncRNA). A multidimensional scaling (MDS) plot of the transcriptomes (Fig. 2B) revealed that the greatest variation was between the thymic and blood T cell subsets, regardless of age. The thymic CD4+ and CD8+ T cells showed less diversity than the CD4+ and CD8+ T cells from infant or adult blood. In total, 8,359 genes were shared between the CD4+ T cells (Fig. 2C) and 8,200 between the CD8+ T cells (Fig. 2D) from all three origins, at a cutoff of FPKM ≥ 2. Both thymic SP CD4+ and CD8+ T cells showed more uniquely expressed genes than the blood-derived T cells from infants or adults. More expressed genes were shared between thymic CD4+ and thymic CD8+ T cells than between infant blood and thymic T cells of the same cell population (Supplementary Figure S3A, Additional File 3). This pattern also held for genes associated with autoimmune diseases (Supplementary Figure S3B, Additional File 3).
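The two expression thresholds used here (a filter at 1 count per million, and a cutoff of FPKM ≥ 2 for the shared-gene counts) can be sketched as follows. This is an illustration under assumptions: the gene names and values are invented, the CPM filter is applied to a single sample rather than across replicates, and the function names are hypothetical rather than taken from the analysis pipeline.

```python
# Toy illustration only: gene names and values are invented.

def counts_per_million(counts):
    """Scale raw read counts to counts per million (CPM) for one sample."""
    total = sum(counts.values())
    return {gene: 1_000_000 * n / total for gene, n in counts.items()}


def filter_low_expression(counts, min_cpm=1.0):
    """Keep genes at or above min_cpm (the text removes genes with < 1 CPM)."""
    cpm = counts_per_million(counts)
    return {gene for gene, value in cpm.items() if value >= min_cpm}


def shared_expressed_genes(fpkm_by_origin, min_fpkm=2.0):
    """Return genes with FPKM >= min_fpkm in every origin."""
    expressed = [
        {gene for gene, value in fpkm.items() if value >= min_fpkm}
        for fpkm in fpkm_by_origin.values()
    ]
    return set.intersection(*expressed) if expressed else set()


raw_counts = {"GENE_A": 1_999_990, "GENE_B": 9, "GENE_C": 1}
kept = filter_low_expression(raw_counts)  # GENE_C has 0.5 CPM and is dropped

fpkm_by_origin = {
    "thymus": {"G1": 5.0, "G2": 2.0, "G3": 0.5},
    "infant_blood": {"G1": 3.0, "G2": 1.9, "G3": 4.0},
    "adult_blood": {"G1": 2.5, "G2": 8.0},
}
shared = shared_expressed_genes(fpkm_by_origin)  # only G1 passes everywhere
```

The intersection in `shared_expressed_genes` corresponds to the central region of a three-way Venn diagram such as those in Fig. 2C and 2D.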
Genes associated with autoimmune diseases
Of 555 loci associated with autoimmune diseases (AID), the majority were expressed in our T cell datasets. Only 123 (22.2%) of the annotated genes were not detected (at FPKM ≥ 2) in either CD4+ or CD8+ T cells from any of the three origins, while more than half of the genes (N = 285) were expressed in both T cell populations from all sample types (Supplementary Table S2, Additional File 2). The proportion of AID genes expressed varied across our T cell populations and between the diseases (Fig. 3). For each of the AIDs we investigated, at least half of the identified risk genes were expressed. Considering the T cell populations separately, 378 AID-associated genes were expressed by CD4+ T cells of any origin and 421 by CD8+ T cells of any origin (Supplementary Figure S3C-D, Additional File 3). Interestingly, 49 of the 432 expressed AID genes were not expressed in T cells from adult blood (Supplementary Table S2, Additional File 2). Of these, 18 AID risk genes were expressed only in thymic SP T cells, while 20 were detected only in peripheral T cells from children. These 49 loci were mainly associated with inflammatory bowel disease (N = 21), multiple sclerosis (N = 18), rheumatoid arthritis (N = 15) and type 1 diabetes (N = 10).
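The counts in this paragraph are related by simple arithmetic, which the short sketch below makes explicit. All input numbers are taken from the text; the final remainder is an inference that assumes the 49 genes absent from adult blood fall into three non-overlapping groups (thymus only, infant blood only, or both), which the text does not state outright.

```python
# Counts reported in the text.
TOTAL_AID_LOCI = 555
NOT_DETECTED = 123           # not detected at FPKM >= 2 in any sample type
EXPRESSED_EVERYWHERE = 285   # expressed in both T cell populations, all origins
ABSENT_IN_ADULT_BLOOD = 49
THYMUS_ONLY = 18
INFANT_BLOOD_ONLY = 20

# Genes expressed in at least one T cell population.
expressed_any = TOTAL_AID_LOCI - NOT_DETECTED

pct_not_detected = 100 * NOT_DETECTED / TOTAL_AID_LOCI
pct_everywhere = 100 * EXPRESSED_EVERYWHERE / TOTAL_AID_LOCI

# Inferred, not reported: genes absent from adult blood but present in
# both thymus and infant blood, assuming the three groups partition the 49.
thymus_and_infant_blood = ABSENT_IN_ADULT_BLOOD - THYMUS_ONLY - INFANT_BLOOD_ONLY

print(expressed_any, round(pct_not_detected, 1), round(pct_everywhere, 1))
```

The first two derived values reproduce the 432 expressed genes and the 22.2% reported above, and the third confirms that the 285 genes expressed everywhere are indeed more than half of the 555 loci.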
Differential expression was most pronounced between thymus and blood
In both CD4+ and CD8+ T cells, the largest number of differentially expressed genes (DEGs) was found when comparing T cells from thymus with those from infant blood, followed by the comparison of thymus with adult blood (Table 2). Comparing infant with adult blood T cells yielded fewer DEGs. Similarly, when comparing the transcriptomes of CD4+ with CD8+ T cells from the different origins (Table 2), the highest number of DEGs was observed between the two T cell subpopulations in thymus, followed by infant blood and, lastly, adult blood. Volcano plots of DEGs for the pairwise comparisons are shown in Supplementary Figure S4 (Additional File 3), and complete lists of DEGs with expression values for all samples are found in Supplementary Tables S3-11 (Additional File 2).
A heatmap of the top 10 DEGs from all comparisons revealed that the subsets clustered according to tissue of origin, then cell type and age, with one major clade for the thymic cells and one major clade for the blood-derived cells (Fig. 4A). Genes associated with V(D)J recombination and T cell commitment, including RAG2, HES1 and DNTT, were amongst the top 10 DEGs upregulated in thymic T cells. In CD8+ infant and adult blood T cells, the top upregulated genes included genes involved in cell migration and lineage commitment (S1PR5, PLEKHG3 and TBX21), while the interleukin receptors IL6R and IL4R, amongst others, displayed high expression in CD4+ infant and adult peripheral blood T cells.
Table 2
Number of significantly differentially expressed genes (DEGs) from the pairwise comparisons, at FDR < 0.05 and with the additional criteria logCPM > 1.5 and logFC > 1.
group | comparison | upregulated in | #DEGs | #DEGs total |
CD4+ | thymus vs infant blood | thymus | 1624 | |
| | infant blood | 1333 | 2957 |
CD4+ | thymus vs adult blood | thymus | 1451 | |
| | adult blood | 1237 | 2688 |
CD4+ | infant blood vs adult blood | infant blood | 246 | |
| | adult blood | 329 | 575 |
CD8+ | thymus vs infant blood | thymus | 1403 | |
| | infant blood | 1378 | 2781 |
CD8+ | thymus vs adult blood | thymus | 679 | |
| | adult blood | 804 | 1483 |
CD8+ | infant blood vs adult blood | infant blood | 250 | |
| | adult blood | 46 | 296 |
adult blood | CD4+ vs CD8+ | CD4+ | 339 | |
| | CD8+ | 336 | 675 |
infant blood | CD4+ vs CD8+ | CD4+ | 819 | |
| | CD8+ | 1176 | 1995 |
thymus | CD4+ vs CD8+ | CD4+ | 1107 | |
| | CD8+ | 921 | 2028 |
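The selection criteria in the Table 2 caption can be expressed as a simple filter. The sketch below is illustrative only: the gene names and statistics are invented, the `GeneResult` record and function names are hypothetical, and two assumptions go beyond the caption. First, the logFC criterion is applied to the absolute value, since genes upregulated in either group are counted. Second, a positive logFC is taken to mean higher expression in the first group of the comparison.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """One row of a differential expression result (hypothetical layout)."""
    gene: str
    log_fc: float   # log fold change, first group vs second group (assumed sign)
    log_cpm: float  # average log counts per million
    fdr: float      # false discovery rate


def is_deg(r, max_fdr=0.05, min_log_cpm=1.5, min_abs_log_fc=1.0):
    """Apply the Table 2 criteria; absolute logFC is an assumption."""
    return r.fdr < max_fdr and r.log_cpm > min_log_cpm and abs(r.log_fc) > min_abs_log_fc


def summarize(results):
    """Count DEGs by direction, mirroring the layout of Table 2."""
    degs = [r for r in results if is_deg(r)]
    up_first = sum(1 for r in degs if r.log_fc > 0)
    return {
        "up_in_first": up_first,
        "up_in_second": len(degs) - up_first,
        "total": len(degs),
    }


# Invented example values, not results from this study.
results = [
    GeneResult("GENE_A", 4.2, 5.0, 0.001),   # DEG, up in first group
    GeneResult("GENE_B", -3.1, 4.0, 0.010),  # DEG, up in second group
    GeneResult("GENE_C", 0.5, 6.0, 0.001),   # fails logFC criterion
    GeneResult("GENE_D", 2.0, 1.0, 0.001),   # fails logCPM criterion
    GeneResult("GENE_E", 2.0, 3.0, 0.200),   # fails FDR criterion
]
summary = summarize(results)
```

As in Table 2, the two directional counts add up to the total for each comparison.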
Differences in gene set enrichment profiles related to developmental stage
The DEGs upregulated in thymic SP CD4+ and CD8+ T cells, when compared to infant blood CD4+ and CD8+ T cells, were mainly involved in cell division and proliferation (Fig. 5A). The DEGs upregulated in infant blood CD4+ and CD8+ T cells, compared to the equivalent thymic subset, were enriched for multiple immune-related biological processes, such as defense response, cytokine production and intercellular signal transduction, as well as regulation of cell proliferation and differentiation. When comparing infant to adult blood T cells (Fig. 5B), the infant blood T cells were enriched for genes involved in proliferation and cell death, in addition to regulation of gene expression and immune system processes. The genes upregulated in adult blood T cells were engaged in response to stimulus, immune and defense response, cytokine production and biological adhesion. Comparing CD4+ to CD8+ T cells of the same tissue and age revealed that genes upregulated in thymic CD4+ T cells were heavily involved in chromosome organization and cell cycle, while enriched GO terms in CD8+ T cells in infant blood were dominated by immune-related processes (Supplementary Figure S5, Additional File 3).
T cell markers for differentiation and migration
To further investigate differentially expressed genes involved in T cell differentiation and migration, we extracted DEGs associated with the GO terms “lymphocyte migration” (GO:0072676) and “T cell differentiation” (GO:0030217), as well as relevant genes from the literature (Fig. 4B). The genes upregulated in thymic T cells included the recombination-activating genes RAG1 and RAG2; genes involved in adhesion and homing (ITGAE (CD103) and CCR9), T lineage commitment (SATB1) and cell proliferation (MKI67); and transcriptional regulators involved in T cell development (ID2, SOX4, LEF1 and BCL6). In adult blood T cells, several chemokines, interleukins and their receptors were upregulated (CCL5 (RANTES), IL12RB1, IL10RA, IL32, CCR2 and CCR5), as were genes involved in cell adhesion and migration (ADAM8, ITGB7 and SELPLG) and in lymphocyte function and activation, including SLAMF6, PIK3CD, TXK and NFATC2. Several genes involved in cell adhesion and lymphocyte homing, migration and maturation were upregulated in infant blood T cells: CD69, CD44, SELL (CD62L), CCR7, S1PR1, ITGA6, ITGA5, ITK and TESPA1.
Our data suggest that infant CD8+ T cells may express CD8B at a higher level than CD8A, whereas the opposite was seen in the adult pool of CD8+ T cells (Fig. 6C), although the difference was not statistically significant. The expression levels of CD8A and CD8B in the SP thymic T cells were equivalent. We explored the distribution of CD8B isoforms and detected the highest expression of CD8B-201 (ENST00000331469) in SP thymic CD8+ T cells, followed by the blood CD8+ T cells from adults and infants (Supplementary Figure S6, Additional File 3). The most abundant isoform was CD8B-203 (ENST00000390655), mainly expressed by the CD8+ mature thymocytes, followed by the infant blood T cells and, to a lesser degree, the adult CD8+ T cells.
Next, we investigated different CD45 isoforms (Fig. 6A). In general, the CD4+ T cells expressed a wider repertoire of PTPRC transcripts than the CD8+ T cells. In peripheral blood, the adults showed higher expression of CD45RO transcripts (PTPRC-201) in their CD4+ T cells than the children, while the opposite was observed for the CD45RABC isoform (PTPRC-209). The isoform patterns of CD45 have been less well characterized in CD8+ T cells. In CD8+ T cells, we observed tentative novel isoforms (Fig. 6B I and II) that share exons with CD45RABC and were not found to be expressed in CD4+ T cells. In the CD8+ cells, these novel PTPRC transcripts were expressed at levels similar to those of CD45RABC and CD45RO. We also observed that the CD45RB transcripts (PTPRC-203 and PTPRC-214) displayed higher expression in the peripheral blood CD4+ T cells than in the SP CD4+ T cells in the thymus, yet their overall expression was low compared to the RO and RABC isoforms.
We furthermore investigated the CD45RA/RO ratios of the CD4+ T cells at the surface protein level using FACS, comparing a thymic sample and blood from the same child with blood samples from two adults aged 30 and 70 years (Supplementary Figure S8, Additional File 1). Like others (5, 19), we observed high amounts of CD45RO in the thymic sample, while the blood sample from the same individual displayed fewer CD45RO-positive and more CD45RA-positive cells. Both adult samples, regardless of age, showed extensive co-expression of CD45RA and CD45RO (43–51%, Supplementary Figure S8, Additional File 1), yet the overall expression of CD45RA was low compared to infant blood.
Markers of thymic egress
In mice, the T cell egress phenotype has been determined as Cd3+ Cd27+ Cd45ra+ Cd62l+ Cd69- (20). Within each dataset, SELL (CD62L) expression was generally about two-fold higher than CD69 expression in all samples (Fig. 6D). Infant blood T cells displayed the highest expression of both SELL and CD69. In humans, CD31 and PTK7 have been described as RTE markers for CD4+ T cells (21, 22), and CD103 for CD8+ T cells (23). We detected PTK7 expression above the detection limit exclusively in SP T cells from thymus. In CD4+ T cells, PECAM1 (CD31) expression was higher in thymus than in blood, while in CD8+ T cells the highest expression levels were detected in infant blood. ITGAE (CD103) expression was high in thymus, and close to or below the detection level in infant and adult peripheral blood, in both CD4+ and CD8+ T cells.