CD8+ T cell clonotypes from prior SARS-CoV-2 infection predominate during the cellular immune response to mRNA vaccination

Almost three years into the SARS-CoV-2 pandemic, hybrid immunity is highly prevalent worldwide and more protective than vaccination or prior infection alone. Given emerging resistance of variant strains to neutralizing antibodies (nAb), it is likely that T cells contribute to this protection. To understand how sequential SARS-CoV-2 infection and mRNA-vectored SARS-CoV-2 spike (S) vaccines affect T cell clonotype-level expansion kinetics, we identified and cross-referenced TCR sequences from thousands of S-reactive single cells against deeply sequenced peripheral blood TCR repertoires longitudinally collected from persons during COVID-19 convalescence through booster vaccination. Successive vaccinations recalled memory T cells and elicited antigen-specific T cell clonotypes not detected after infection. Vaccine-related recruitment of novel clonotypes and the expansion of S-specific clones were most strongly observed for CD8+ T cells. Severe COVID-19 illness was associated with a more diverse CD4+ T cell response to SARS-CoV-2 both prior to and after mRNA vaccination, suggesting imprinting of CD4+ T cells by severe infection. TCR sequence similarity search algorithms revealed myriad public TCR clusters correlating with human leukocyte antigen (HLA) alleles. Selected TCRs from distinct clusters functionally recognized S in the predicted HLA context, with fine viral peptide requirements differing between TCRs. Most subjects tested had S-specific T cells in the nasal mucosa after a 3rd mRNA vaccine dose. The blood and nasal T cell responses to vaccination revealed by clonal tracking were more heterogeneous than nAb boosts. Analysis of bulk and single cell TCR sequences reveals T cell kinetics and diversity at the clonotype level, without requiring prior knowledge of T cell epitopes or HLA restriction, providing a roadmap for rapid assessment of T cell responses to emerging pathogens.

context of hybrid immunity, repeated S antigen exposure from mRNA vaccination after COVID-19 would expand S-specific T cell clonotypes from prior infection and potentially diversify S-specific T cell memory 19,20.
To investigate diversity and kinetics of S-specific T cells, we used activation-induced markers (AIM) and sequenced the TCRs of antigen-specific T cells (AIM-scTCRseq) 20. With TCR sequences as molecular barcodes, antigen-reactive cells were quantified longitudinally in immune cell repertoires obtained via high-throughput (i.e., bulk) sequencing of the TRB locus from blood. Matching AIM-scTCRseq with longitudinal bulk repertoire data permitted measurement of the recruitment and expansion of S-reactive clonotypes from convalescence through three mRNA vaccinations, allowing us to measure clonotype-level kinetics and to evaluate the diversity of S-specific CD8+ and CD4+ T cells. In the full cohort we sequenced 259 serial bulk TRB repertoires from 54 persons developing hybrid immunity with complementary clinical and serologic data. We also performed bulk repertoire sequencing at the infection-relevant nasal site. This permitted examination of the kinetics of S-specific T cell expansion as well as correlations between the severity of COVID-19 illness, the antibody response to vaccination, the longitudinal dynamics of S-specific T cells over time, and the presence of S-specific T cells at the site of pathogen entry. Finally, thousands of TCR sequences were clustered by an HLA-informed sequence similarity algorithm to permit inference of the likely restricting HLA allele for many S-reactive TCRs, providing a broadly applicable roadmap for "de-orphaning" TCR sequences without a priori knowledge of epitope-level specificity.

Vaccination provokes clonal expansion
To assess the impact of vaccination on repertoire structure, we compared TRB clonotype frequencies within-participant across time by pairwise differential relative abundance. Longitudinal repertoire sampling was used to measure clonal expansions and contractions following vaccination (representative participant, Fig. 1a; persons with E01 and E03 samples, Extended Data Fig. 2). Some participants had large expansions of individual TRB clonotypes, up to 1,000-fold or greater, after primary vaccination. To differentiate changes in clonal frequency likely due to vaccine-induced proliferation from natural background variation, we defined expanded and contracted clones as those with an absolute log2 fold-change greater than 2 and a statistically significant change in frequency between samples using Fisher's Exact Test with correction for multiple hypotheses (FDR-adjusted q-value < 0.05) (Supplemental Data Table 4, summary in Supplemental Data Table 5). This focuses the analysis on clonotypes with a robust and statistically significant expansion or contraction.
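The expansion criteria above can be illustrated with a minimal sketch. It assumes each repertoire is a dictionary of template counts per clonotype, uses a pure-Python two-sided Fisher's exact test and a Benjamini-Hochberg adjustment, and adds a pseudocount of 1 so that fold-changes are defined for clonotypes absent from one sample. The function names, the pseudocount, and the toy counts are illustrative assumptions and are not taken from the study's analysis code.

```python
from math import lgamma, exp, log2

def _log_table_prob(a, b, c, d):
    """Log hypergeometric probability of the 2x2 table [[a, b], [c, d]]."""
    return (lgamma(a + b + 1) + lgamma(c + d + 1) + lgamma(a + c + 1)
            + lgamma(b + d + 1) - lgamma(a + 1) - lgamma(b + 1)
            - lgamma(c + 1) - lgamma(d + 1) - lgamma(a + b + c + d + 1))

def fisher_two_sided(a, b, c, d):
    """Two-sided p-value: sum over tables no more probable than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    observed = _log_table_prob(a, b, c, d)
    p = 0.0
    for x in range(max(0, row1 + col1 - n), min(row1, col1) + 1):
        lp = _log_table_prob(x, row1 - x, col1 - x, n - row1 - col1 + x)
        if lp <= observed + 1e-7:
            p += exp(lp)
    return min(p, 1.0)

def benjamini_hochberg(pvals):
    """FDR-adjusted q-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals, running = [0.0] * m, 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        qvals[i] = running
    return qvals

def call_changed_clones(pre, post, min_abs_log2fc=2.0, q_cut=0.05, pseudocount=1):
    """Label clonotypes as expanded or contracted between two samples."""
    total_pre, total_post = sum(pre.values()), sum(post.values())
    clones = sorted(set(pre) | set(post))
    pvals, lfcs = [], []
    for clone in clones:
        a, c = pre.get(clone, 0), post.get(clone, 0)
        pvals.append(fisher_two_sided(a, total_pre - a, c, total_post - c))
        # Pseudocount is an assumption made here to avoid division by zero.
        f_pre = (a + pseudocount) / (total_pre + pseudocount)
        f_post = (c + pseudocount) / (total_post + pseudocount)
        lfcs.append(log2(f_post / f_pre))
    calls = {}
    for clone, lfc, q in zip(clones, lfcs, benjamini_hochberg(pvals)):
        if q < q_cut and abs(lfc) > min_abs_log2fc:
            calls[clone] = "expanded" if lfc > 0 else "contracted"
    return calls

# Toy counts (invented): "bg" pools all other templates into one entry.
pre = {"A": 1, "B": 400, "C": 50, "bg": 99549}
post = {"A": 200, "B": 10, "C": 55, "bg": 99735}
calls = call_changed_clones(pre, post)
print(calls)
```

Requiring both a fold-change threshold and a significant q-value means that a clonotype with a small but statistically detectable shift (such as the pooled background entry in the toy data) is not called.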
Among 32 participants with TRB repertoires at both E01 and E03, a median of 72 (IQR 51-104) statistically significant vaccine-expanded TRB clonotypes were detected per participant, comprising a median of 1.8% (IQR 0.9-3.5%) of each participant's productive circulating T cell repertoire. In one extreme case, 292 unique vaccine-expanded TRB clonotypes made up more than a quarter of the circulating repertoire after primary vaccination in participant P761 (Extended Data Fig. 2). We noted pronounced intra-cohort and dose-to-dose variability in the number and fold-expansion of expanding clonotypes as well as their pre-vaccine frequencies (two representative participants, Fig. 1b, c).
To track vaccine-expanded clonotypes over time, the productive frequencies of clonotypes meeting criteria for expansion between E01 and E03 were plotted over the time course from convalescence through booster for each participant (Extended Data Fig. 3a). Two representative participants illustrate heterogeneity in the vaccine T cell response. In participant P581 (Fig. 1b), many TRB clonotypes destined for vaccine-related expansion were present at 0.001 to 0.1% of the repertoire at early COVID-19 convalescence (E00). These clonotypes universally decreased in frequency between infection and vaccination. The rarer convalescent clonotypes tended to contract below the limit of detection (ND in Fig. 1b) but then proliferated at least 10- to 100-fold after the first mRNA dose (black lines in Fig. 1b between E01 and E02). In participant P581, there were also previously unseen clonotypes, observed in neither convalescent (E00) nor pre-vaccination (E01) repertoires, that expanded after dose 1 (shown as orange line in Fig. 1b). The 2nd dose provoked only modest additional proliferation for clonotypes that expanded after the 1st dose and entrained almost no additional clonotypes (Fig. 1b). In P581, 80% of the vaccine-expanding clonotypes were detected before vaccination, suggesting a dominant role for memory T cells. Dose 2 contributed few additional vaccine-expanded clonotypes and there was sparse evidence of clonotype expansion after a third dose (E05, Fig. 1b). In contrast, for participant P837 (Fig. 1c), clonotypes destined to expand after vaccination displayed much more kinetic heterogeneity, with clonotypes detected at E00 showing either stability or contraction prior to vaccination. Dose 1 prompted expansion in many previously unseen clonotypes but did little to expand the clonotypes persisting prior to vaccination. Dose 2, however, provoked expansion of previously detected clonotypes and entrained additional new clonotypes not observed in earlier repertoires.
In the year after the primary vaccine series, the clonotypes that had required two doses to expand tended to contract markedly, despite receipt of a 3rd dose (E05, Fig. 1c). Clonotype trajectories for the remaining 30 participants with E01 and E03 samples (Extended Data Fig. 3a) further illustrate inter-participant variability. While the majority of both memory and previously undetected vaccine-expanded clonotypes persisted throughout the time course, a higher percent of recalled memory clonotypes persisted past the 3rd dose to E05 (p-value 0.002, Extended Data Fig. 3b).
Vaccination boosts pre-existing clonotypes and elicits previously undetected clonotypes
The vaccine-responsive clonotypes detected in blood before vaccination likely represent memory to SARS-CoV-2 or other CoV infection(s), while the previously unseen vaccine-responsive clonotypes could be immunologically naive or rare memory clonotypes below the limit of detection. With a median repertoire size of 430,000 sequenced productive templates, a clonotype with a frequency of 1 cell per 200,000 would have > 85% chance of detection. While we could not differentiate these possibilities, we summarized recalled memory and previously unseen clonotypes over time across our cohort to compare the kinetics of these subpopulations and their relative contributions to the post-vaccine T cell response. After dose 1 (E02), the integrated abundance of previously unseen clonotypes comprised 0.01-1% of the total repertoire (Fig. 1d), whilst memory clonotypes contributed 0.1% to almost 10% of the repertoire. To assess cellular proliferation after each vaccine dose, we examined clones present at the pre-vaccine timepoint (E01) that expanded by E03. For these clonotypes, the median number of cell divisions after dose 1 (4.0 [2.4, 4.6]) was greater (p-value < 0.001) than after dose 2 (1.8 [1.2, 2.7]). By E03, after two mRNA doses, total frequencies of expanded clonotypes ranged considerably from 1 to 26% (Fig. 1e). T cells derived from memory were numerically dominant (Fig. 1e), but the diversity of unique clonotypes was more balanced between memory and previously unseen vaccine-responsive clonotypes (Fig. 1f).
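The detection-probability statement and the notion of "cell divisions" can be made concrete with a short calculation. This sketch assumes independent Poisson sampling of templates and treats the number of divisions as the base-2 logarithm of the fold-change in frequency, ignoring cell death and trafficking; both are simplifying assumptions for illustration and the function names are hypothetical.

```python
from math import exp, log2

def detection_probability(clone_frequency, templates_sequenced):
    """Chance of sampling at least one template of a clone (Poisson approximation)."""
    return 1.0 - exp(-clone_frequency * templates_sequenced)

def apparent_divisions(freq_before, freq_after):
    """Doublings implied by a fold-change in frequency (assumes no loss or influx)."""
    return log2(freq_after / freq_before)

# A clone at 1 per 200,000 cells in a repertoire of 430,000 templates.
p_detect = detection_probability(1 / 200_000, 430_000)
print(round(p_detect, 3))

# A 16-fold rise in frequency corresponds to four doublings.
print(apparent_divisions(1e-5, 1.6e-4))
```

Under these assumptions the detection probability is roughly 0.88, in line with the > 85% figure quoted above.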
To further analyze heterogeneity in vaccine response among all participants we calculated the (i) frequency and (ii) clonal breadth quantities within each repertoire from the defined set of expanded clonotypes. Frequency is the sum of the productive frequencies of all expanded clonotypes in a sample, whereas clonal breadth is the proportion of unique clonotypes that were expanded, an indicator of clonal diversity. We observed significant increases in the frequency of expanded clonotypes between visit E02 and visit E03 for both recalled memory and previously unseen clonotypes (Fig. 1d, right bar graph, p < 0.001). The breadth of previously unseen clonotypes also increased significantly after each dose (Fig. 1g, p < 0.05). The percentage of unique TRB clonotypes derived from memory T cells decreased from visit E02 to visit E03 in most participants (Fig. 1h, p < 0.001 signed-rank test), indicating that both doses expand naive or rare memory lymphocytes. Expanded TRB clonotypes that were detectable only after vaccination often persisted after the booster (Extended Data Fig. 3) but were less abundant than expanded memory clones (Fig. 1d, right bar graph). Notably, the repertoire of each person with a frequency of vaccine-expanded clones > 5% at E03 (participants P673, P582, P836, P761, and P581) was characterized by strong polyclonal expansion of pre-existing cellular memory (Fig. 1e, f).
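The two summary quantities defined above can be sketched as follows, assuming a repertoire is represented as template counts per clonotype and the expanded set is a collection of clonotype identifiers. The function name and toy data are illustrative only.

```python
def frequency_and_breadth(sample_counts, expanded_clones):
    """Return (summed productive frequency, clonal breadth) for an index set.

    frequency: share of all templates belonging to expanded clonotypes
    breadth:   share of unique clonotypes in the sample that are expanded
    """
    total_templates = sum(sample_counts.values())
    unique_clonotypes = sum(1 for n in sample_counts.values() if n > 0)
    present = [c for c in expanded_clones if sample_counts.get(c, 0) > 0]
    frequency = sum(sample_counts[c] for c in present) / total_templates
    breadth = len(present) / unique_clonotypes
    return frequency, breadth

# Toy repertoire (invented): clonotype "Z" is in the index set but not sampled.
sample = {"A": 50, "B": 30, "C": 20, "D": 900}
freq, breadth = frequency_and_breadth(sample, {"A", "B", "Z"})
print(freq, breadth)
```

Frequency is dominated by a few large clones, whereas breadth weights every unique clonotype equally, which is why the two measures can diverge between memory-derived and previously unseen clonotypes.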
Single cell sequencing identifies CD8+ phenotype in highly expanded clones
To obtain paired TCRαβ sequences and determine the phenotypes of vaccine-expanded clonotypes, we isolated S-reactive T cells. We chose the post-2nd dose sample (E03) from 17 participants with > 50 vaccine-expanded clonotypes (Supplemental Data Table 5). The subgroup was balanced for sex and vaccine manufacturer; four were hospitalized with COVID-19. T cells co-expressing the activation-induced markers (AIM) CD69 and CD137 after S peptide stimulation were single-cell sequenced (AIM-scTCRseq) at TRA and TRB loci. Clonotypes were classified as S-reactive if their productive TRB frequency was enriched within the AIM-scTCRseq fraction relative to their TRB bulk unsorted frequency at the same time point. From 17 E03 samples, we sequenced and phenotyped 24,557 S-reactive T cells and identified 5,733 unique S-reactive clonotypes (see Methods) (Supplemental Data Tables 6,7). S-reactive clonotypes were individually classified as CD4+ or CD8+, or inconclusive, using oligonucleotide-labeled mAb binding (Supplemental Data Table 7).
Overlay of AIM-scTCRseq data onto serial blood TRB repertoires showed that among expanded clonotypes, the frequency in serial blood TRB repertoires was highly correlated with the frequency in the AIM-scTCRseq fraction (Extended Data Fig. 5a; representative subjects, Extended Data Fig. 5b, c). Despite this overall agreement, it was notable that a large fraction of S-reactive clonotypes were not significantly expanded in the bulk TRB repertoires, suggesting that not all S-reactive clonotypes were expanded by vaccination and thus would not be identifiable by pre-post TRB differential testing alone (Fig. 1). Conversely, there were also some robustly vaccine-expanded TRB clonotypes that did not match a corresponding S-reactive AIM-scTCRseq TRB (Fig. 2a, Extended Data Fig. 4, 5), possibly reflecting vaccine-unrelated changes in the repertoire or suboptimal sensitivity of the AIM-scTCRseq assay. Despite these possibilities, AIM-scTCRseq recovered a balanced spectrum of unique, antigen-reactive CD4+ and CD8+ T cells across all participants, which we used as a set of index clonotypes to compare the clonal dynamics of S-specific CD4+ and CD8+ T cells.
CD4+ and CD8+ S-reactive T cells exhibit divergent kinetics
We matched the S-reactive TRB AIM-scTCRseq nucleotide sequences, from E03, to bulk TRB repertoires across time (representative participants, Fig. 2d; others, Extended Data Fig. 6). Twelve participants had matched TRB bulk repertoire sequencing at E00, E01, E02, E03, and E05 time points. Unsupervised clustering of clonotype-level trajectories in these 12 participants revealed that 5 groupings described the expansion or contraction trajectory of most T cell clonotypes (Extended Data Fig. 7): (i) minimal proliferation, (ii) proliferation after dose 1 followed by contraction, (iii) late proliferation after dose 2, (iv) proliferation without contraction, or (v) serial proliferation after both dose 1 and 2. CD8+ S-reactive T cells were observed to have more prolonged and greater expansion; in 12 of 12 participants CD8+ clonotypes were observed to have serial expansion (type v). The majority of CD4+ S-reactive T cells showed minimal expansion (type i). CD4+ clonotypes were not observed to have strong serial expansion and many contracted after dose 1 or 2 (Extended Data Fig. 7). Overall, comparison of the mean trajectories of S-reactive clonotype abundances for the 17 participants with AIM-scTCRseq data (Fig. 2e) illustrates the greater expansion of CD8+ than of CD4+ T cells after vaccination.
Despite less CD4+ clonal expansion in response to vaccination, longitudinal analyses of their S-reactive clones suggest a somewhat stable CD4+ S-reactive memory population within the blood. Their integrated abundance, defined as TRB sequences in blood matching CD4+ AIM-scTCRseq sequences, was 0.06% (IQR 0.02 to 0.1%) of the circulating repertoire after infection at time point E00, increasing to only 0.09% (IQR 0.04 to 0.2%) after two mRNA doses at E03. The breadth of S-reactive CD4+ T cell clonotypes increased from E00 to E02 in 11 of 17 participants (p-value 0.02, signed-rank test). However, the breadth of S-reactive CD4+ T cells appears primarily to reflect the recruitment of pre-existing clones; a median of 65% (IQR 61-77%) of S-reactive CD4+ clonotypes observed after vaccination (E03) were detected in pre-vaccine samples (Supplemental Data Table 8 columns X:AE).
The data illustrate that repeated antigen exposure likely diversifies detectable S-reactive CD8+ T cells; a median of 50% (IQR 33-61%) of the CD8+ TRB clonotypes observed after the second mRNA dose were not detectable prior to vaccination. Furthermore, a median 20% (IQR 11-32%) of the S-reactive CD8+ TRB clonotypes observed after dose 2 were also not yet detectable after dose 1. Thus, even in persons with a pre-existing S-reactive memory T cell population due to previous infection, a second mRNA dose may recruit previously naive or memory CD8+ T cell clonotypes, potentially capable of broadening recognition of S. The additional boosting effect of a two-dose vaccine regimen can be seen visually by longitudinal clonal tracking between E02 and E03 for participant P836 (Fig. 2d), and in participants P684, P527, and P525 (Extended Data Fig. 6).
To study the potential emplacement of SARS-CoV-2-specific T cells in the nose, a site of viral infection, we obtained nasal swabs at E05, several weeks after booster vaccination, from 48 persons. Productive TRB sequences were obtained from 184 to over 57,000 T cells per swab. Amongst 16 persons with nasal samples and AIM-scTCRseq data, 14 had confirmed S-reactive CD8+ T cell clonotypes detected in nasal swabs. Considerable heterogeneity was noted, as exemplified by recovery of diverse and abundant S-specific CD8+ T cells in participant P673, whereas only a few clonotypes, at low abundance, were detected in the nasal swab from participant P836. S-specific CD4+ T cells were occasionally also detected (Extended Data Fig. 9).
Hybrid immunity elicits highly public S-specific TCR motifs
TCRs recognizing a common ligand often exhibit convergent sequence features 26,27 in CDR3 peptide-contacting residues and other CDRs that contact the peptide-HLA ligand. Amongst 5,733 unique S-reactive TRA/TRB clonotypes recovered by AIM-scTCRseq from 17 participants and passing additional quality filters (see Methods), we computed pairwise sequence divergence using TCRdist, a multi-CDR position-weighted, biochemically aware distance metric, to search for public TCR clusters with similar CDR AA sequences 28. A similarity graph was constructed from the 1,458 clonotypes that had at least one other similar TCR in the dataset, with edges joining sufficiently similar TCRs (TCRdist metric ≤ 100, generally corresponding to similar TRBV/TRAV gene usage and one to four AA substitutions or deletions within TRA and TRB CDR3s). This graph of high-confidence SARS-CoV-2 S-specific TCRαβ sequences, identified without prior knowledge of the restricting HLA or peptide, enabled us to reveal diverse public clusters of TCRs, often characterized by distinct CDR3 motifs, that were expanded by mRNA vaccination. Overall, we found 284 such TCR clusters (Fig. 3a, Extended Data Figs. 8, 9). The ten largest clusters contained cells from 3-11 participants and contained between 25 and 144 unique member clonotypes.
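The graph construction described above (edges between TCRs within a distance threshold, clusters as connected components, singletons dropped) can be sketched as follows. The distance function here is a deliberately crude stand-in and is not TCRdist: it penalizes V-gene mismatch and counts CDR3 mismatches with arbitrary weights chosen only for illustration. The sequences are invented, and only the β chain is used to keep the example short.

```python
from itertools import combinations

def toy_distance(tcr_a, tcr_b):
    """Placeholder metric (NOT TCRdist): arbitrary weights, equal-length CDR3s only."""
    (v_a, cdr3_a), (v_b, cdr3_b) = tcr_a, tcr_b
    if len(cdr3_a) != len(cdr3_b):
        return 1000
    mismatches = sum(x != y for x, y in zip(cdr3_a, cdr3_b))
    return 24 * mismatches + (0 if v_a == v_b else 60)

def similarity_clusters(tcrs, distance, max_dist=100):
    """Join TCRs within max_dist and return (clusters with >1 member, edges)."""
    parent = {name: name for name in tcrs}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    edges = []
    for a, b in combinations(sorted(tcrs), 2):
        if distance(tcrs[a], tcrs[b]) <= max_dist:
            edges.append((a, b))
            parent[find(a)] = find(b)

    components = {}
    for name in tcrs:
        components.setdefault(find(name), set()).add(name)
    clusters = [members for members in components.values() if len(members) > 1]
    return clusters, edges

# Invented (V gene, CDR3) pairs for illustration only.
tcrs = {
    "t1": ("TRBV19", "CASSAAGGYF"),
    "t2": ("TRBV19", "CASSAVGGYF"),
    "t3": ("TRBV9", "CASSPWNNTF"),
    "t4": ("TRBV9", "CASSPWNNSF"),
    "t5": ("TRBV7", "CASRTDTQYFGG"),
}
clusters, edges = similarity_clusters(tcrs, toy_distance)
print(sorted(sorted(c) for c in clusters), len(edges))
```

The all-pairs loop is quadratic in the number of clonotypes, which is workable for a few thousand sequences; a real analysis would substitute an actual TCRdist implementation for `toy_distance`.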
We reasoned that if public clusters accurately identified groups of TCRs recognizing a shared epitope, each TCR group should contain exclusively CD8+ or CD4+ T cells and be enriched for participants sharing at least one HLA allele. Consistent with this hypothesis, greater than 97% of edges within clusters connected clonotypes with matching CD4+ or CD8+ assignments (Fig. 3a). Many of the public TCR clusters were formed from groups of persons expressing only one feasible shared HLA class I or class II allele, suggesting specificity for a peptide ligand restricted by this allele (Extended Data Fig. 9).
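The edge-level consistency check can be expressed as a simple proportion. This sketch assumes each clonotype carries a lineage label of "CD4", "CD8", or "inconclusive" and that edges touching an inconclusive clonotype are excluded from the denominator; how such edges were actually handled is not stated here, so that exclusion is an assumption.

```python
def edge_concordance(edges, lineage):
    """Fraction of scorable edges whose two clonotypes share a CD4/CD8 label."""
    definite = {"CD4", "CD8"}
    scorable = [(a, b) for a, b in edges
                if lineage.get(a) in definite and lineage.get(b) in definite]
    if not scorable:
        return None
    matching = sum(lineage[a] == lineage[b] for a, b in scorable)
    return matching / len(scorable)

# Toy graph (invented): one discordant edge and one edge with an unlabeled node.
toy_edges = [("t1", "t2"), ("t2", "t3"), ("t3", "t4"), ("t4", "t5")]
toy_lineage = {"t1": "CD8", "t2": "CD8", "t3": "CD4", "t4": "CD4",
               "t5": "inconclusive"}
concordance = edge_concordance(toy_edges, toy_lineage)
print(concordance)
```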
We also observed multiple novel and highly public HLA-A*03:01-associated clusters, with sequence motifs found in at least 6 of the 7 A*03:01-expressing participants. Two of the largest HLA-A*03-associated clusters shared CDR3α junctions defined by central NNNAG residues, which were paired with two distinct and V-gene biased CDR3β receptor motifs in Cluster 4 (TRBV19-dominated) and Cluster 5 (TRBV9-dominated) (Fig. 3d, e). Cluster 4 and 5 motifs shared similar central CDR3β junctional residues (i.e., SIKGG (Fig. 3d) and SPWGG (Fig. 3e)), where a hydrophobic residue in TRB CDR3 position 6 was frequently followed by a glycine residue in positions 7 or 8. This pattern also appeared as a unifying feature within the CDR3β motif of public HLA-A*03:01-associated cluster 6 (Fig. 3f). The prevalent HLA-A*11:01-associated receptor motif, found in 4 of 6 HLA-A*11:01 participants, is shown in Fig. 3h. To confirm the ligands of representative TCRs assigned to HLA-A*03:01, we expressed 6 AIM-scTCRseq-identified receptors from participant P673 (Fig. 1a, 2a) that were expanded strongly after vaccination, originating in clusters 5, 6, 10 (two TCRs with identical TRA), 24, and 269 (Fig. 3a). Each TCR showed strong, specific recognition of artificial antigen presenting cells (aAPC) co-expressing HLA-A*03:01 and either full-length S from ancestral strain Wu-1, or near full-length S from Wu-1 or Omicron BA.1, BA.2, or BA.4 SARS-CoV-2 (representative data, Extended Data Fig. 10a-e; summary, Extended Data Fig. 10f). Control APC expressing empty vector, S alone, other HLA-A or B from participant P673 with S, or HLA alone, were negative. All TCRs showed reactivity with peptide S 378-387, an HLA-A*03:01-restricted epitope 33 (Extended Data Fig. 10f,g). We observed potential differences in the structural requirements for T cell activation between the various TCRs, indicating that the biochemically-informed TCRdist metric may cluster TCRs into functionally meaningful groups (Extended Data Fig. 10g).
TCRs 1 and 4, from clusters 269 and 6, respectively, recognized 10-mer peptide 378-387 but neither internal 9-mer, requiring both N-terminal lysine 378 and C-terminal leucine 387. In contrast, TCR3 was versatile, equally recognizing the parent 10-mer and each 9-mer. TCR2 was intermediate, optimally recognizing 378-387 with partial response to both internal 9-mers. TCRs 8.1 and 8.2 were generally similar to TCR1/TCR4. In agreement with the transfection data, peptide containing variant amino acids at positions 373, 375, and 376, representing Omicron BA.4/BA.5, was recognized by each TCR. The TRB sequence of each reporter cell-confirmed S-specific TCR was detected in the nasal swab of participant P673, some at high abundance (Extended Data Fig. 9).
Severe disease imprints the S-reactive CD4+ T cell population
AIM-scTCRseq overlay on differential abundance plots (Fig. 2a, Extended Data Fig. 4, 5) showed that most highly vaccine-expanded clonotypes were CD8+. S-reactive, expanded CD4+ T cells were present, but not as strongly expanded or entrained into the circulating repertoire by vaccination (Fig. 2a, b, e, Extended Data Fig. 4). To more generally study the heterogeneity and longitudinal dynamics of SARS-CoV-2 reactive CD4+ T cells, we counted the breadth of clones in each sample exactly matching a set of CD4+-associated TRB sequences that were previously found to be enriched in SARS-CoV-2 convalescent versus healthy control repertoires 34 (diagnostic breadth). These TRB sequences, derived using the ImmunoSEQ assay (Adaptive Biotechnologies) 35-37, were previously assigned to S (n = 917) or non-Spike (n = 1564) SARS-CoV-2 antigens 38 (see Methods).
To examine if COVID-19 disease severity resulted in differential imprinting of the T cell repertoire, we compared diagnostic clonal breadth (defined in Methods) between hospitalized and non-hospitalized patients. Diagnostic clonal breadth is a measure of antigen-specific diversity computed as the percentage of unique diagnostic SARS-CoV-2 reactive TRB clonotypes assigned to CD4+ T cells amongst the total number of clonotypes detected (Fig. 4a, Supplemental Data Table 8 columns E:V). At the convalescent time point (E00), we observed greater diagnostic S-reactive CD4+ T cell breadth in hospitalized vs. non-hospitalized patients (0.014% vs 0.006%, p-value < 0.01). This difference was no longer observed by the pre-vaccination time point (E01) or after the first mRNA dose (E02). CD4+ S breadth was again elevated in previously hospitalized persons after the 2nd (E03) and 3rd (E05) doses (0.012% vs 0.008%, p-value = 0.02 at E03; 0.009% vs 0.005%, p-value < 0.01 at E05), suggesting that a diverse CD4+ memory population after severe infection may result in a more diverse repertoire after full vaccination. In contrast, the breadth of vaccine-expanded clonotypes at post-vaccine time points, which was skewed towards CD8+ T cells, did not correlate with hospitalization status (Fig. 4b). Spike and non-spike diagnostic breadth were weakly correlated (rank correlation ρ = 0.38, p = 0.019) at E00 (Fig. 4c), but the relationship was not evident after vaccination (ρ = 0.14, p = 0.2) (E03, Supplemental Data Fig. 3), consistent with increasing diagnostic breadth for CD4+ T cells recognizing S but not non-S epitopes after mRNA vaccination. Both S and non-S diagnostic breadth declined in the year between convalescence and vaccine dose 1 (E00 to E01, p < 0.001). Diagnostic S breadth increased promptly after dose 1 and then slowly declined, while in contrast non-S breadth remained stable throughout vaccination after the initial decline in the participants without breakthrough infection (Fig. 4d).
We did not observe any associations between CMV infection status and parameters of SARS-CoV-2 specific TRB repertoires early after infection (Fig. 4c) or at later time points (Supplemental Data Fig. 3).

Post-infection CD4+ T cell breadth correlates with antibody responses to vaccination
In the larger cohort that contains the persons studied here, we reported that in the first year after recovery from COVID-19, higher Nt50 was associated with COVID-19 requiring hospitalization 25. In the subset of subjects reported here, vaccination led to robust increases in Nt50 in previously infected persons, with no difference by hospitalization status (Fig. 4e, Supplemental Data Table 2). Nt50 declined in the year after infection (reciprocal dilution, median 80, IQR 50-160 at E00 to median 60, IQR 40-80 at E01), but was boosted to ≥ 640 in 51 of 52 participants measured after the first mRNA vaccination (Fig. 4e, median 2,560, IQR 2,250-5,120). No consistent increase in Nt50 occurred after further vaccination. Only participant P845, ambiguous for prior infection, did not reach Nt50 ≥ 640 after vaccination. CD4+ T cells provide help to B cells, and S-specific circulating T follicular helper-like cells (cTFH) have been associated with disease severity 39. To determine whether pre-vaccination CD4+ breadth might influence post-vaccination antibody neutralization, we conducted a lagged, temporal correlation analysis. The diagnostic CD4+ breadth metric, shown to be predictive of SARS-CoV-2 infection 22, at early convalescence (E00) was associated with Nt50 at the same time point (ρ = 0.49, p-value 0.00004) (Fig. 4f). This association of early convalescent diagnostic CD4+ breadth remained strong with Nt50 after one mRNA dose at E02 (ρ = 0.46, p-value 0.001) (Fig. 4g) but was not present between Nt50 and diagnostic CD8+ breadth (ρ = 0.2, p-value 0.1). Consistent with this, there was a positive association between Nt50 early after infection and after dose 1 (E02) (Fig. 4h, ρ = 0.38, p-value = 0.007). The rapid increase in Nt50 in persons with hybrid immunity after a single mRNA vaccine dose contrasts with CD8+ T cell expansion, which accumulates over both doses 1 and 2.
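A lagged rank correlation of the kind described above pairs a measurement at an early visit with a different measurement at a later visit in the same participants. The sketch below computes Spearman's ρ with average ranks for ties and restricts the comparison to participants present at both visits. Participant labels and values are invented, no p-value is computed, and the function names are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)

def lagged_correlation(early_metric, later_metric):
    """Spearman rho between an early-visit and a later-visit measurement."""
    shared = sorted(set(early_metric) & set(later_metric))
    return spearman_rho([early_metric[p] for p in shared],
                        [later_metric[p] for p in shared])

# Invented values: breadth at an early visit, neutralizing titer at a later one.
breadth_early = {"p1": 0.010, "p2": 0.020, "p3": 0.030, "p4": 0.500}
titer_later = {"p1": 640, "p2": 1280, "p3": 2560, "p9": 80}
rho = lagged_correlation(breadth_early, titer_later)
print(round(rho, 3))
```

Rank-based correlation is a natural choice here because titers are reported as reciprocal dilutions on a doubling scale and breadth values span orders of magnitude.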

Discussion
Given the high global prevalence of hybrid immunity from combined natural infection and vaccination, we sought to understand how sequential exposure to S impacts the circulating TCR repertoire. Using barcoded mAbs, we phenotyped single cells selected by expression of activation markers upon S peptide stimulation, identifying similar numbers of unique S-reactive CD4+ (n = 2430) and CD8+ (n = 2467) TCRαβ clonotypes from 24,557 S-reactive single cells. Tracking these clonotypes longitudinally, from convalescence through booster vaccination, in participant-matched bulk TCR repertoires, we observed divergent kinetics between S-reactive CD8+ and CD4+ T cells. Across almost all participants, comparisons of clonotype frequencies from before and after vaccination were marked by pronounced, vaccine-induced expansions of S-specific CD8+ clones: 93% of the highly vaccine-expanded S-reactive clonotypes were CD8+ (Fig. 2a, 2b), with fold-expansions ranging from 4- to > 100-fold in response to mRNA dose 1 and further expansions observed after dose 2.
Despite modest CD4+ T cell clonotype expansion after vaccination, we observed that CD4+ T cell breadth, measured using TRB sequences assigned to S, waned during convalescence but was boosted by vaccination to levels observed soon after acute illness. This may represent detection of both clonotypes recruited from the naive population and proliferation of memory clonotypes. Moreover, in our cohort, we found evidence that severe disease may leave an imprint on the S-reactive CD4+ T cell memory subset that persists through vaccination. Participants who were hospitalized had a greater diversity of S-reactive CD4+ T cells before and after vaccination. This may reflect the long-lived infection-imprinted IFN-γ and IL-10 cytokine profile of CD4+ T cell memory reported by Rodda et al. 15 in response to natural infection but not vaccination alone. Severe COVID-19 has previously been associated with SARS-CoV-2-specific CD4+ T cells with a cytotoxic phenotype soon after recovery 40 or with an altered Th2/Th17 balance 41, and with CD8+ T cells with markers of exhaustion 42. It is not yet known if these properties segregate by clonotypes or are imprinted by infection for carryover after vaccination amongst S-specific T cells. In contrast to S-reactive CD4+ T cells, neither vaccine-associated nAb titers nor CD8+ T cell magnitude or breadth were strongly associated with COVID-19 severity in our study.
Rapid expansion of CD8+ T cells has been described in naive persons receiving a first dose of mRNA vaccination 29; however, continued boosting of S-reactive CD8+ T cells by repeated S antigen exposure in COVID-19-recovered persons has not previously been described. Indeed, in naive persons, mRNA vaccines generally induce weaker CD8+ T cell responses than replication-incompetent adenovirus vaccines encoding the same antigen 43. Pseudouridine modification, used in SARS-CoV-2 mRNA vaccines, has been shown to upregulate Th1 recruiting cytokines in mice 44 49 ), such that congruency for the other markers was likely essential. The markers used for CD4+ T cells included CD200, differing from our use of CD69 and CD137. As discussed in Methods, a complex set of activation marker patterns can be expressed by S-reactive PBMC in the hybrid immunity context. Further work will be required to determine if this phenotypic diversity is associated with CD4+ T cell clonotype identity, and overall to understand tradeoffs between activation markers for sensitivity, specificity, and capture of T cells with different effector and memory phenotypes and expansion kinetics.
Strengths of this report include the long duration of serial TRB repertoire sequencing, up to 2 years, and study of 3-8 blood samples sequenced per participant. This permitted us to measure the frequency of Sreactive clonotypes from several weeks after infection until after three doses of vaccine. Critically, this allowed quantitative assessment of the contribution of post-infection memory to vaccine-elicited responses. We showed that the main contributor to CD8 + T cell responses in hybrid immunity is the expansion of memory cells detected in circulation after infection. However, we also observed that the rst two doses of vaccine appear to further increase circulating S-reactive clonotypic diversity, consistent with data from peptide-HLA oligomer-sorted T cells 20 . Another strength of our study was the con rmation of many vaccine-expanded TCRs (analyzed in Fig. 1) as functionally S-speci c using AIM (Fig. 2) or TCR reconstruction (Fig. 3). The large degree of overlap between expanded and scAIM-TCRseq-con rmed clonotypes indicates that comparison of serial blood TRB repertoires may be a suitable surrogate for resource-intensive AIM-or peptide-HLA oligomer-scTCRseq to assign TRB sequences as vaccine antigenspeci c. AIM-scTCRseq data from this report substantially augment databases 50 used to match to estimate T cell responses to vaccines, as reported for adenovirus-based SARS-CoV-2 products 51,52 . While limited data suggest that intramuscular mRNA vaccination alone can result in nasal S-speci c CD8 + T cells 11 , more research is required to assess the contribution of a priming or breakthrough infection 12 for the emplacement of SARS-Cov2-speci c T RM in the nose. Here we show that vaccine-expanded clonotypes in circulation are detectable in the nose in persons with hybrid immunity. With development of nasally-delivered SARS-CoV-2 vaccination 53 , nasal swabs and immune sequencing become a rational assay endpoint to evaluate mucosal immunity after vaccination. 
More speculatively, the novel S-specific TCR motifs identified here (Fig. 3, Extended Data Fig. 8, Supplemental Data Fig. 2) are candidates for non-exact, TCR distance-informed matching to serial blood or mucosal TCR repertoires from SARS-CoV-2 vaccine recipients. As knowledge of TCR-peptide-HLA triads grows, TCR sequencing-based immune monitoring can be benchmarked against T cell functional assays.
The use of an HLA-peptide oligomer-independent approach to identify and phenotype S-reactive T cells allowed us to identify and quantify immunogen-specific T cells in a relatively unbiased fashion, without foreknowledge of epitopes or restricting HLA. Thus, we were able to recover thousands of paired-chain TCRαβ sequences, revealing both α- and β-chain sequence features contributing to epitope specificity. In particular, we report the first large set of public paired receptor sequence motifs, with experimental validation, for an immunodominant A*03:01-restricted S epitope. As shown here, S specificity can result from TCRαβ pairs in which a single alpha chain pairs promiscuously with multiple beta chains, and vice versa, confirming that sequencing of isolated β (or α) chains may be imperfect for assigning specificity. Through experimental validation and cloning of representative TCRs within novel clusters of convergent S-reactive receptor sequences, we show a roadmap for de-orphaning TCR-peptide/HLA ligand pairings. This methodology could be valuably transferred to cohorts in less well-studied and geographically diverse populations with a distinct distribution of HLA alleles, which may not otherwise have cellular responses amenable to study by existing HLA-peptide oligomer reagents.
Our study has several limitations. We were only able to study hybrid immunity in the context of infection before vaccination. Given that breakthrough infections after vaccination are now common, work is needed to measure how the order of hybrid exposure shapes the TCR repertoire. Our study was limited to persons infected early in the pandemic who were generally older (median age 60.6 years), and the findings may not directly translate to younger persons. Also, while in general we found considerable overlap between TRB sequences expanding across serial blood TRB comparison and post-vaccine AIM-scTCRseq, many clonotypes that expanded with vaccination did not match an S-reactive receptor from single cell analysis.
Finite sampling of AIM-scTCRseq may contribute to partial coverage of low-abundance vaccine-expanded clones. Another factor could be our use of CD69/CD137 as activation markers, which would likely impact CD4+ clonotype detection more than CD8+ T cells. On the other hand, our use of 15 amino acid-long S peptides is more likely to underestimate CD8+ than CD4+ responses, given the greater permissiveness of HLA class II than HLA class I for binding of peptides with N- or C-terminal extensions 54. Thus, we hypothesize that many of the strongly expanded clonotypes not matching at TRB with cells recovered by AIM-scTCRseq represent CD8+ T cells whose epitopes are sub-optimally presented in our in vitro AIM assay. Detailed analyses of defined peptide sets and activation marker combinations may enable optimization in the future. Our study was limited to mRNA vaccination. Keeton et al. studied Ad26.COV2.S vaccination after infection and observed relatively equivalent boosts of S-specific CD8+ and CD4+ T cells. The assay used, intracellular cytokine staining, is similar to AIM in that it is dependent on the phenotypic markers chosen to denote T cell activation 55. In contrast, clonotype expansion (as in Fig. 1), while not testing T cell functional potential, is independent of prior assumptions about T cell phenotype.
Possibly, some expanded clonotypes may have been coincidentally amplified by non-SARS-CoV-2 antigen exposure over the vaccination time interval. We searched TRB from expanded (Fig. 1) and AIM-scTCRseq (Fig. 2, 3) clonotypes for TRB assigned to other antigens in public references 50. Fewer than 0.1% of these TCRs matched Epstein-Barr virus (EBV)- or cytomegalovirus (CMV)-assigned CDR3 sequences. In almost all of these cases, neither TRBV or TRBJ matching nor matching between reported HLA restricting alleles and participant haplotypes could be documented for these CDR3 sequences assigned to common herpesviruses. One TCR assigned to EBV matched a TRBV, TRBJ, CDR3, and participant HLA allele to a clonotype significantly expanded after dose 2. This is not unexpected, as EBV-specific T cells have been detected in several pathophysiologic states, though their significance is unknown 56-58.
Another limitation of our study is the inability to directly infer whether vaccine-expanded clonotypes that were below the limit of detection prior to vaccination came from very rare memory or from naive populations. Conclusive identification would require sorting large numbers of PBMCs from seropositive persons from time points prior to vaccination for separate repertoire sequencing, or identification of naive S-reactive clones by comparing the outcome of vigorous peptide stimulation protocols 42 between purified naive and memory cells, which was not feasible in our cohort. Despite not being able to definitively resolve whether the vaccine elicits naive cells at the time of vaccination in previously infected persons, we showed unambiguously that mRNA vaccination expands many low-abundance CD8+ clonotypes previously undetectable after infection.
In summary, vaccine formulations of mRNA encoding SARS-CoV-2 S in lipid nanoparticles, administered intramuscularly, induce profound, albeit variable, expansion of pre-existing circulating memory T cell clones. Given the ability of VOC to escape antibody responses, it is likely that virus-specific T cells contribute to protection, at least from severe disease, in the hybrid immunity context. Phenotypic analyses show that vaccine-driven clonotype-level expansion is much greater for CD8+ than for CD4+ T cells, while the overall diversity of circulating S-reactive CD8+ and CD4+ T cells after mRNA vaccination is similar. Sequence variation amongst S-specific TCR heterodimers, while large, is amenable to simplification by clustering algorithms. This allows accurate prediction of the HLA restriction of previously unseen TCRs. Further research is required to determine how the phenotype, durability, and

For CD4+ T cells, the sensitivity to detect peptide activation using CD69/CD137 was more limited (median 21.1%). However, no AIM molecule pair was consistently the most sensitive. CD69/CD137 showed better sensitivity than combinations of two TNFRs and was used to identify both CD4+ and CD8+ T cell activation by peptide stimulation.
Bulk TCR sequencing. Genomic DNA was extracted from frozen PBMC samples using the Qiagen DNeasy Blood Extraction Kit (Qiagen). Immunosequencing of CDR3 regions of TCR-β chains used the ImmunoSEQ™ Assay (Adaptive Biotechnologies, Seattle, WA, USA). Input DNA was amplified in a bias-controlled multiplex PCR, followed by high-throughput sequencing. Sequences were collapsed and filtered to identify and quantitate the absolute abundance of each unique TCR-β CDR3 region for further analysis, as described [35][36][37]. For analyses of bulk repertoires, the term clonotype is used for a unique TRB sequence: a complementarity determining region 3 (CDR3) nucleotide sequence and associated TRBV and TRBJ genes. These generally distinguish a unique T cell clonotype; occasionally, a single TRB may pair with >1 TRA in distinct T cell clonotypes.
Single-cell VDJseq and feature barcode data analysis and alignment. Raw sequencing data were processed with the Cell Ranger version 6.1.0 (10X Genomics) pipeline. Demultiplexing from raw .bcl data and conversion to .fastq data used Cell Ranger mkfastq. Surface feature barcode antibody binding analyses used Cell Ranger count with the reference feature barcode library (Supplemental Data Table 8).
TCR VDJ analyses used the Cell Ranger VDJ module and GRCh-Alts-ensembl-5.0.0. Output matrix data files for feature barcodes and TCR were initially analyzed with the Loupe package (10X Genomics).
CD4+/CD8+ assignments. For each cell we computed the percent of UMI counts corresponding to DNA barcodes for CD8 and CD4 assigned to each marker. To assign a phenotype per cell, we computed a score based on the natural logarithm of total CD8 counts divided by total CD4 counts. A score greater than 1 was classified as CD8 and a score less than −1 was classified as CD4. Values between 1 and −1 were considered ambiguous and not assigned a T cell phenotype. When a TCR clonotype (cells with identical TRA and TRB nucleotide sequences) was present in multiple cells, the median score was used to classify that clonotype.
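The scoring rule above can be illustrated with a minimal Python sketch. The function names, the input layout (a list of per-cell CD8 and CD4 UMI counts), and the pseudocount used to avoid division by zero are assumptions made for illustration; they are not taken from the study code.

```python
import math
from statistics import median

def cd8_cd4_score(cd8_umis, cd4_umis, pseudocount=1.0):
    """Natural log of CD8 over CD4 feature-barcode UMI counts.

    The pseudocount is an assumption added here so that cells with
    zero counts for one marker still receive a finite score.
    """
    return math.log((cd8_umis + pseudocount) / (cd4_umis + pseudocount))

def classify(score, threshold=1.0):
    """Score > 1 is CD8, score < -1 is CD4, anything between is ambiguous."""
    if score > threshold:
        return "CD8"
    if score < -threshold:
        return "CD4"
    return "ambiguous"

def classify_clonotype(cells):
    """Classify a clonotype from the median score of its cells.

    cells: list of (cd8_umis, cd4_umis) tuples for cells sharing
    identical TRA and TRB nucleotide sequences.
    """
    scores = [cd8_cd4_score(cd8, cd4) for cd8, cd4 in cells]
    return classify(median(scores))
```

Taking the median across cells makes the clonotype call robust to a single cell with unusually low antibody-barcode capture.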
Longitudinal analysis of S-reactive clones. We tested for enrichment of S-reactive clones in the AIM assay using a statistical test. The observed frequency of each AIM+ clonotype among the total AIM+ cells was compared with an expectation from a null model based on each clonotype's frequency in the bulk sequenced repertoire from the same sample. The p-value of the observed counts of each AIM+ clonotype under the null model was computed from the complement of the binomial cumulative distribution function: from the binomial cumulative distribution function, we computed the chance of observing k single cells of a given clonotype in a pool of n total AIM+ single cells, with the null success probability p equal to the fraction of the matching TRB in the unsorted bulk repertoire. We then applied a multiple hypothesis correction with the Benjamini-Hochberg procedure to compute an FDR-adjusted q-value for each AIM+ clonotype. We designated clonotypes with a q-value < 0.05 as stringently enriched by the AIM sort and thus high-confidence S-reactive clones. These highest-confidence S-reactive clonotypes were used for trajectory analysis and estimation of the total fraction of the repertoire composed of S-reactive CD4+ and CD8+ T cells, respectively.
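As a hedged illustration of this enrichment test, the sketch below computes an upper-tail binomial p-value followed by Benjamini-Hochberg adjustment using only the Python standard library. It assumes the tail is taken as P(X ≥ k); the text does not state whether the boundary is inclusive. All function names are hypothetical.

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), i.e. 1 - CDF(k - 1).

    Assumption: the upper tail includes k. Suitable for modest n only;
    for thousands of cells a library routine would be more robust.
    """
    if k <= 0:
        return 1.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def benjamini_hochberg(pvalues):
    """Return FDR-adjusted q-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

# Toy example: (cells observed among AIM+ cells, bulk repertoire frequency)
n_aim_cells = 200
clonotypes = [(5, 1e-4), (1, 0.01), (2, 0.02)]
pvals = [binom_sf(k, n_aim_cells, freq) for k, freq in clonotypes]
qvals = benjamini_hochberg(pvals)
enriched = [q < 0.05 for q in qvals]
```

In this toy input only the first clonotype, seen five times despite a bulk frequency of 0.01%, passes the q < 0.05 threshold.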
TCR sequence clustering. To compare and cluster paired TRA/TRB sequences between cells, we first filtered sequences from AIM-scTCRseq to those with a matching TRB from the deeply sequenced bulk blood repertoire from the same subject and time point. Next, we filtered out sequences with TRB occurring at a lower frequency in the set of clonotypes expressing AIM markers than in the bulk repertoire, that is, clonotypes that had not been enriched by AIM. Non-enriched clonotypes were assumed to have been sorted after bystander activation and were not considered further. Next, we computed pairwise dissimilarity between 5569 subject-unique TRA/TRB clonotypes using the TCRdist distance metric 27 as implemented using default parameters in tcrdist3 version 0.2.2 28. Pruning the pairwise distance matrix to include connections between sequences within 100 TCRdist units, we formed a sequence graph.
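The pruning step can be sketched as follows, assuming a precomputed symmetric matrix of pairwise TCRdist values (for example, as produced by tcrdist3). The toy matrix, the function names, and the choice to treat a distance of exactly 100 as connected are illustrative assumptions.

```python
from collections import deque

def build_graph(dist, max_dist=100):
    """Adjacency sets linking clonotypes whose distance is <= max_dist.

    dist: symmetric square matrix (list of lists) of pairwise distances.
    Assumption: the threshold is inclusive.
    """
    n = len(dist)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= max_dist:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def connected_components(adj):
    """Breadth-first search returning sorted lists of node indices."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components

# Toy distances between four hypothetical paired TRA/TRB clonotypes
toy_dist = [
    [0, 24, 96, 240],
    [24, 0, 130, 250],
    [96, 130, 0, 210],
    [240, 250, 210, 0],
]
toy_components = connected_components(build_graph(toy_dist))
```

Clonotypes 1 and 2 are more than 100 units apart but still fall in the same component because both connect to clonotype 0.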
When selecting samples to analyze by TCRdist, we restricted analyses to include ≥ 2 persons with prevalent HLA alleles, such that the subcohort studied included persons with HLA-A*02:01 (n = 11),

To examine whether connected components within the graph (i.e., any subgraph in which every pair of nodes is connected via an edge path) might recognize an HLA-restricted epitope, we used a graph-walking approach to discover minimal sets of feasible HLA alleles that participants shared within closely connected nodes. Briefly, for each node, we sorted all other nodes within its largest connected component in ascending order of TCRdist. Starting at the closest public node found in another HLA-genotyped participant, we took the intersection of the sets of all class I (CD8+ nodes) or class II (CD4+ nodes) HLA alleles, before moving on to the next-nearest connected node and taking the next stepwise intersection. If possible, the algorithm continues to narrow the set of feasible presenting HLA alleles to a minimal possible set. We inferred feasible HLA alleles, and in many cases only one allele was shared among closely connected S-reactive TCR sequences. This allele was assigned to the corresponding TCR cluster. Code to assign feasible HLA restriction from TCR sequence similarity analyses/graphs, using custom scripts run in Python version 3.9, is provided in the Code availability section. Graph visualization used the NetworkX v2.8.6 package 61.

TCR motif visualization. From the weighted sequence similarity graphs formed from all AIM-scTCRseq S-reactive clones, we identified clusters of similar sequences using the Louvain community detection algorithm with the communities v3.0.0 package in Python. For each public sequence cluster with sequences donated from 3 or more participants, we depict selected TCR clusters by six graphical elements, with the CDR3α and β junctions on the left and right, respectively (Fig. 3b-h).
The lower sequence logo shows the observed position-specific frequency of each amino acid within the TCR cluster, and the upper logo plots represent the position-specific information content in bits (i.e., a signal of selection) compared to CDR3α and β receptors, with the same V- and J-gene usages, randomly sampled from naive repertoires. The Sankey flow diagrams left of the CDR3 motifs show the frequencies of TRAV/TRAJ and TRBV/TRBJ gene usages within each cluster. Motifs were aligned and computed in palmotif v0.4, and graphics rendered using ggplot2 and ggseqlogo 62 in R version 4.2.
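The graph-walking inference of feasible HLA restriction described under TCR sequence clustering can be sketched as a stepwise set intersection. This is a minimal illustration, not the study code: the participant identifiers, genotypes, and function name are hypothetical, and the rule of stopping before the feasible set would become empty is an assumed reading of "a minimal possible set".

```python
def feasible_hla(anchor_participant, neighbors, hla_by_participant):
    """Narrow the feasible presenting alleles for one anchor node.

    neighbors: (tcrdist, participant_id) tuples for the other nodes in
    the anchor's connected component.
    hla_by_participant: participant_id -> set of class I alleles (for
    CD8+ nodes) or class II alleles (for CD4+ nodes).
    Assumption: stop as soon as an intersection would be empty and keep
    the last non-empty set.
    """
    feasible = set(hla_by_participant[anchor_participant])
    for _, participant in sorted(neighbors):  # nearest neighbor first
        if participant == anchor_participant:
            continue  # only nodes from other participants are public
        shared = feasible & hla_by_participant[participant]
        if not shared:
            break
        feasible = shared
    return feasible

# Hypothetical class I genotypes
hla = {
    "P1": {"A*02:01", "A*03:01", "B*07:02"},
    "P2": {"A*03:01", "A*24:02", "B*07:02"},
    "P3": {"A*03:01", "B*44:02"},
    "P4": {"B*08:01"},
}
toy_neighbors = [(48, "P3"), (12, "P2"), (30, "P1"), (60, "P4")]
restriction = feasible_hla("P1", toy_neighbors, hla)
```

In this toy case the nearest public neighbor (P2) narrows the candidates to two alleles, P3 narrows them to A*03:01 alone, and the more distant P4 node shares no allele, so the walk stops.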

HLA-
TRB repertoire analyses. Breadth of TRB sequences significantly expanding (or contracting) between serially collected blood specimens was calculated as the number of unique clonotypes meeting significance criteria. In brief, to determine longitudinal persistence and previous detection, TRB were filtered for productive sequences and analyzed at the nucleotide level (CDR3, TRBV, TRBJ). TRB from bulk sequencing data were defined as expanded if their log2 fold change was > 2 relative to the E01 time point and they met a second criterion of a statistically significant change in counts between samples using Fisher's Exact Test with correction for multiple hypotheses (FDR-adjusted q-value < 0.05). Analysis used custom Python scripts detailed in the Code Availability Statement. CDR3 amino acid sequence with V and J gene usage and HLA restriction (if published) was used to determine whether a clonotype had been shown previously to be associated with a known antigen.
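The two expansion criteria can be combined as in the sketch below, which uses only the Python standard library. Several choices are assumptions not specified in the text: a two-sided Fisher's exact test, a pseudocount of 1 template for clonotypes undetected at a time point, and Benjamini-Hochberg as the FDR procedure. Function and variable names are hypothetical.

```python
from math import comb, log2

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = (prob(x) for x in range(lo, hi + 1))
    # Sum all tables at least as extreme as the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))

def bh_adjust(pvalues):
    """Benjamini-Hochberg q-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

def log2_fold_change(count_pre, count_post, total_pre, total_post, pseudocount=1):
    """log2 ratio of clonotype frequencies; pseudocount is an assumption."""
    freq_pre = max(count_pre, pseudocount) / total_pre
    freq_post = max(count_post, pseudocount) / total_post
    return log2(freq_post / freq_pre)

def call_expanded(clones, total_pre, total_post, min_log2fc=2.0, fdr=0.05):
    """clones: name -> (templates at E01, templates at the later time point)."""
    names = list(clones)
    pvals = [
        fisher_exact_two_sided(pre, total_pre - pre, post, total_post - post)
        for pre, post in (clones[name] for name in names)
    ]
    qvals = bh_adjust(pvals)
    return [
        name
        for name, q in zip(names, qvals)
        if q < fdr
        and log2_fold_change(*clones[name], total_pre, total_post) > min_log2fc
    ]

# Toy repertoire with 1000 templates sequenced at each time point
toy_clones = {"A": (2, 40), "B": (10, 12), "C": (0, 3)}
expanded = call_expanded(toy_clones, total_pre=1000, total_post=1000)
```

Requiring both criteria keeps clonotype C, newly detected with only three templates, from being called expanded on fold change or detection alone.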
Separately, TRB CDR3 assigned to SARS-CoV-2 were generated by statistically comparing TRB CDR3 sequences from whole blood ImmunoSEQ TRB repertoires between persons with documented SARS-CoV-2 infection and healthy controls (HD). Sequences were assigned as likely to represent CD4+ T cells based on publicity between persons sharing HLA class II alleles, or as likely to represent CD8+ T cells based on publicity between persons sharing HLA class I alleles. Assignments to SARS-CoV-2 S or non-Spike specificity were performed using the output of multiplexed antigen restimulation assays (MIRA) 63-65.
Briefly, defined SARS-CoV-2 antigens were used to stimulate expanded PBMC from SARS-CoV-2-infected persons, and sorted CD4+ or CD8+ T cells expressing activation markers were bulk-sequenced at the TRB locus. Further refinements were performed to exclude non-SARS-CoV-2-specific TRB sequences associated with ubiquitous antigens such as CMV or EBV, or TRB sequences non-specifically associated with HLA alleles in a cohort of healthy controls 66. MIRA-enriched and statistically SARS-CoV-2-associated TRB sequences were co-analyzed to create sets of TRB sequences spanning CDR3 and assigned, when possible, as CD4+ or CD8+, or as S- or non-S-specific. The diagnostic breadth of blood TRB repertoires was calculated as described 66,67 and represents the proportion of productive TRB clonotypes present in a repertoire that are assigned as SARS-CoV-2-specific.
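The breadth calculation reduces to a proportion over unique productive clonotypes, as in the short sketch below. The input layout (pairs of a clonotype identifier and a productivity flag) and the function name are assumptions for illustration; the cited references define the metric used in the study.

```python
def diagnostic_breadth(repertoire, sars_cov2_assigned):
    """Fraction of unique productive clonotypes assigned as SARS-CoV-2-specific.

    repertoire: iterable of (clonotype_id, is_productive) pairs, where
    clonotype_id is any hashable key, e.g. a (TRBV, CDR3, TRBJ) tuple.
    sars_cov2_assigned: set of clonotype_ids in the assigned reference set.
    """
    productive = {clone for clone, is_productive in repertoire if is_productive}
    if not productive:
        return 0.0
    return len(productive & sars_cov2_assigned) / len(productive)

# Toy repertoire: four productive clonotypes, one of which is assigned
toy_repertoire = [("c1", True), ("c2", True), ("c3", False), ("c4", True), ("c5", True)]
toy_breadth = diagnostic_breadth(toy_repertoire, {"c2", "c3", "c9"})
```

Because breadth counts unique clonotypes rather than templates, it reflects diversity of the response and not clonal depth.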
TCR expression. TCR CDR3 sequences from AIM-scTCRseq were integrated into assigned TRA and TRB

Vaccine broadens pre-existing memory response by expanding low abundance clonotypes. We identified expanded and contracted clonotypes in bulk TRB repertoires. Expanded clonotypes are those with log2(fold change) > 2 and Fisher's exact test FDR-adjusted p value < 0.05 when comparing E01 (pre-vaccine) and E03 (post-2nd mRNA vaccine dose), as shown for a representative participant (a). Expanded clonotypes were assigned as detected prior to vaccination (black) versus only detected after vaccination (orange), shown for two representative participants (b, c). For each participant with paired E01 and E03 specimens we computed the sum of productive frequencies of all expanded clonotypes by whether they were previously detected. Comparisons of the relative frequencies of previously seen and unseen clonotypes at each post-vaccine time point are shown in the bar graph at the right (d). In the full cohort with E01-E03 matched samples, the summed frequencies of memory and previously unseen vaccine-expanded clonotypes (black) vary between participants after the second vaccine dose. Recalled memory clonotypes (black) predominate. Of note, participant P845, who did not seroconvert after reported natural infection, had the lowest integrated abundance of expanded clonotypes after vaccine dose 2 (e). The relative contributions of memory and previously unseen clonotypes to the diversity of expanded clonotypes varied across the cohort at both E02 and E03 timepoints, with serologically naive participant P845 having the lowest contribution of memory clonotypes. Numbers of unique expanded clonotypes after the 1st and 2nd mRNA vaccine doses are shown by participant with numbers of unique memory (black) or previously unseen (orange) clonotypes (f).
The cumulative frequencies of both memory and previously unseen expanded clonotypes increase after the 2nd mRNA dose compared to the first, but not after a booster. At each timepoint, memory clonotypes from prior infection(s) predominate (g). The proportion of unique expanded clonotypes accounted for by previously detectable TRB clonotypes (% memory) decreased during the course of the primary vaccination (h). Statistical comparisons between paired samples are signed-rank tests, and comparisons between groups are Wilcoxon tests. Asterisks represent level of statistical significance (ns not significant, *<0.05, **<0.01, ***<0.001, ****<0.0001).

Figure 2
Kinetics of S-reactive clonotypes defined by AIM-scTCRseq spanning convalescence and vaccination.
Overlay of TRB sequences from AIM-scTCRseq of T cells activated in response to S peptides onto bulk blood TRB clonotype frequency comparisons between pre- (E01) and post- (E03) vaccine in representative participants (P836 had no E00.5 sample). The numbers of expanded or contracted TRB clonotypes also seen in the AIM-scTCRseq datasets are shown in color while the numbers of such clonotypes not seen by AIM-scTCRseq are gray. Clonotypes are color-coded by CD8+ (blue) or CD4+ (green) phenotypes. Upward and downward triangles indicate expanded and contracted clonotypes. For clonotypes neither expanded nor contracted, only AIM-scTCRseq TRB-matched clonotypes are shown (a). PBMC TRB-defined clonotypes that match AIM-scTCRseq-derived CD8+ (blue) or CD4+ (green) S-reactive TRB clonotypes in