Whole blood (collected in BD Vacutainer K2EDTA tubes) and serum (collected in BD Vacutainer serum tubes) samples from consented subjects recovered from COVID were collected in EDTA tubes at the Brigham and Women’s Hospital, Boston, MA, USA. Results from PCR testing when available, disease and demographic information (Table 1) were collected after blood draws through a RedCap-administered survey. All studies were performed under the IRB protocol number 2020P000849 “Biorepository for Samples from those at increased risk for or infected with SARS-CoV-2.” approved at the Brigham and Women’s Hospital, Boston, MA, USA.
Serology and neutralization assays
Serum samples were aliquoted from serum tubes and aliquoted for storage at -80ºC. Serology was performed at the Broad Institute as described previously31. Briefly, MaxiSorp 384-well plates were pre-coated with 50uL/well of 2.5ug/mL SARS-CoV2 RBD then incubated with 50μL of 1:100 diluted serum samples for 30 min at 37ºC. After washing, HRP-anti human IgG and IgM (1:25000 dilution, Bethyl Laboratory) was added to each well for 30 min at RT, before washing and incubation with 40 μl/well of Pierce TMB peroxidase substrate (ThermoFisher). The reaction was stopped by adding 40 μl/well of 0.5 M H2SO4 and the OD was read at 450 and 570 nm on a BioTek Synergy HT.
For neutralization assays with pseudotyped viruses8,32, HEK293 cells were cotransfected with psPAX2 (AIDS Resource and Reagent Program), pLenti-CMV Puro-Luc (Addgene), and spike protein expressing pcDNA3.1-SARS CoV-2 SΔCT sequences for the Wuhan strain. Supernatant was collected 48 h post-transfection and mixed with 3-fold serial dilutions of heat-inactivated serum samples. After a 1-hour incubation at 37ºC, mixes were used to infect HEK293T-hACE2 cells seeded in 96-well plates at a density of 1.75 × 104 cells per well the previous night. After 48h, cells were lysed in Steady-Glo luciferase assay (Promega) and neutralization titers were defined as the sample dilution at which a 50% or 80% reduction in relative light unit was observed relative to the average of the virus control wells.
For neutralization assays with live SARS-CoV2 virus31, serially diluted patient sera samples were mixed with diluted SARS-CoV-2 live virus (D614) and incubated at 37°C for 1 h. Mixes were used to infect Vero E6-TMPRSS2 cells seeded the day prior at 10,000 cells per well in CellCarrier-384-well microplates. After 48h of culture, cells were fixed with 4% PFA for 2h, washed and incubated with a mouse anti-SARS-CoV2 NP antibody (Sino Biological) for 1.5h at RT, followed by Alexa488-conjugated goat-anti-mouse antibody (Jackson ImmunoResearch Labs) for 45 min at RT and nuclear staining with Hoechst 33342 (Thermo Fisher Scientific). Fluorescence imaging was performed using the Opera Phenix™ High Content Screening System (Perkin Elmer) and half-maximal inhibitory dilutions (ID50) were determined using a four-parameter, nonlinear curve fitting algorithm, with a total range of 20 to 10,240 (for samples where the minimal dilution did not achieve 50% neutralization or where the maximal dilution exceeded 50% neutralization, respectively).
PBMC isolation and T-cell restimulation
PBMCs were isolated within 3 hours of blood draw from subjects by density gradient separation using Ficoll-Paque PLUS (VWR- GE Life Sciences) and SepMate tubes (StemCell) according to manufacturer’s instructions. Cells were frozen in Cryostor (StemCell) and stored at -80°C. Either fresh or frozen isolated PBMCs were cultured at 37oC, 5% CO2 in RPMI supplemented with 10% fetal bovine serum (FBS), 1% glutamax, 0.5% NEAA and 1% penicillin-streptomycin. Isolated PBMCs from subjects were plated in triplicate at 1 million cells per well in a flat-bottom 96 well plate in 200μL of media. After letting the cells rest for 1 hr, each well was stimulated with 1μL of 2mg/mL peptide mix for a final concentration of 10 μg/mL of each peptide pool of SARS CoV-2 spike (Miltenyi; cat. # 130-126-701) or nucleoprotein (Miltenyi; cat. # 130-126-699) vs a control with water only. Nucleoprotein peptide mix is a tiled mix of 15mer peptides with an eleven amino acid overlap. Spike peptide mix includes selected immunodominant regions of N-terminal domain and a tiled mix of the C-terminal domain (15mers). Cells were stimulated for 18 hrs, at which time the cell supernatant was saved for further cytokine analysis.
Flow cytometry and cell sorting
After stimulation of PBMCs with the spike peptide mix, nucleoprotein peptide mix or a control (buffer only) for 18 hours, cells were washed and resuspended in a buffer containing 1% FBS in PBS before staining with surface markers. Cells were first stained with a 1/100 dilution of Fc block (Biolegend). To distinguish cells from different subjects and allow for pooled sequencing, cells were then hashed with the Total-Seq C barcode antibodies at a 1/1000 dilution (TotalSeqC-0251 to 0260, Biolegend). PBMCs were washed 3 times in Cell Staining Buffer (Biolegend, #420201) and combined in pools of up to 8 subjects/8 barcodes. Pools were then stained for 20 min with the following fluorescent antibodies: anti-CD45 (clone HI30), anti-CD3e (clone OKT3), anti-CD4 (clone RPA-T4), anti-CD8 (clone RPA-T8), anti-CD45RO (clone UCHL1), anti-OX40 (clone ACT35) and anti-CD154 (clone 24-31). A Zombie UV (Biolegend) live/dead marker was also utilized. All antibodies were purchased from Thermo-Fischer and used at the recommended dilution. Stained T-cells were sorted into eppendorf tubes based on expression of OX-40 (PE) and CD154 (APC) on a two-way cell sorter (Sony SH800). Only cells restimulated with spike or nucleoprotein peptide pools were sorted, both on CD4+ and CD8+ T-cells that were CD45, CD3 and CD45RO positive from each of the 16 subjects.
Single cell RNA sequencing
Cells were separated into droplet emulsions using the Chromium Next GEM Single-cell 5′ Solution (v1.1) and the 10x Chromium Controller. 10,000 cells were loaded per channel of the Chromium Next GEM single-cell 5′ (v1.1) Chip G. Following lysis of cells, barcoded mRNA reverse transcription, and cDNA amplification, a 0.6X SPRI cleanup was performed, and the supernatant was set aside for Feature Barcoding library construction as instructed by the Chromium NextGEM single-cell V(D)J v1.1 protocol. A final elution of 45 μL was saved for further construction of libraries. Using the saved supernatant of the 0.6X cDNA cleanup, Feature Barcoding libraries were completed according to the 5′ Next GEM (v1.1) Feature Barcoding library construction methods provided by 10x Genomics. Gene expression and V(D)J libraries were created according to the manufacturer's instruction (10x Genomics).
Gene expression, feature barcoding libraries, and TCR enriched V(D)J libraries were sequenced on a Nextseq500 (Illumina) using a high output 150 cycle flowcell, with the read configuration Read 1: 28 cycles, Read 2: 96 cycles, Index read 1: 8 cycles or sequenced on a HiSeq X (Illumina), using a 150 cycle flowcell with the read configuration: Read 1: 28 cycles, Read 2: 96 cycles, Index read 1: 8 cycles. Feature Barcoding libraries were spiked into the gene expression libraries (at 5% of the sample pool) prior to sequencing.
Cytokine bead arrays
Cytometric bead arrays (Legendplex Human T-helper cell 12-plex; Biolegend) for IFN-γ, TNF-ɑ, IL-2, IL-4, IL-5, IL-6, IL-9, IL-10, IL-13, IL-17A, IL-17F, and IL-22 cytokines were used to quantitatively assess cytokine responses in PBMCs from healthy subjects after in vitro stimulation in triplicate with 10ug/mL of the peptide pools of spike and nucleoprotein vs control-treated wells. CBAs were performed on culture supernatants using recommended protocol and analyzed by flow cytometry. Standard curves made from 2-fold dilutions of kit standards were used to interpolate pg/mL concentrations of cytokines in GraphPad Prism 8.1.2 from median PE fluorescence intensities. Median PE fluorescence intensities below the standard curve were set to the theoretical pg/mL detection limit determined by the vendor.
Multivariate linear regression for evaluating the associations between age, sex, infection status, and processing with cytokine level fold changes upon stimulation
For each cytokine k in the list of cytokines containing IL2, IL5, IL6, IL9, IL10, IL17A, IL17F, IL22, IFNg, TNF, let Yk_spike , Yk_nucleo , and Yk_control stand for the measurement levels (in log2 scale) of k in spike stimulated, nucleoprotein stimulated and unstimulated control conditions. We assessed the associations between subject age, sex (male or female), infection status (yes or no), and sample processing (fresh or frozen) with fold changes in the level of k upon spike (Yk_spike - Yk_control) and nucleo (Yk_nucleo - Yk_control) stimulations with two separate multivariate linear regression models:
Yk_spike - Yk_control = 𝜷0 + 𝜷1Age + 𝜷2Sex + 𝜷3Processing + 𝜷4Infected
Yk_nucleo - Yk_control = 𝜷0 + 𝜷1Age + 𝜷2Sex + 𝜷3Processing + 𝜷4Infected
Here 𝜷1, 𝜷2, 𝜷3 and 𝜷4 are the learned parameters that stand for the effect sizes of changes in age, sex (female over male), sample processing (frozen over fresh) and infected (yes over no) over the fold changes in k upon stimulation.
Preprocessing of scRNA-seq and V(D)J readouts
mRNA and VDJ sequence reads were mapped to the reference human genome GRCh38-3.0.0 with the cloud-based Cumulus workflows 33, using the CellRanger 3.0.2 software pipeline. Pooled cells were assigned back to the corresponding donors by using the Python implementation of the Souporcell algorithm 34.
For the mRNA data integration, count normalization, dimensionality reduction, clustering, cell scoring, and cluster marker genes detection Seurat R package 35,36 was employed. Batch effects, defined as the batch of the sequencing run, were regressed out from the normalized count values with the ComBat algorithm 37 implemented in SVA R Package version 3.38.0.
Cells which either do not have 10x standard high-quality heavy and light chain V(D)J sequences, or have more than 10% of their transcriptome reads coming from mitochondrial genes were filtered out before the downstream transcriptome analysis. Sets of TRAV, TRAJ, TRBV, TRBJ genes were discarded in all of the downstream analyses.
For the UMI count normalization step, gene expression counts for each cell were divided by the total counts for that cell and multiplied by 106, which was then log-transformed using log1p.
Dimensionality reduction, graph clustering, and UMAP visualization
Dimensionality reduction was done with PCA identifying the first 50 principal components. For clustering of the cells into expression clusters, a k-nearest neighbor (kNN) graph of the cells was constructed (k=20) using the 50 principal components. Then the Leiden algorithm 38 was used to find the clusters of the cells based on the generated kNN graph, with a resolution of 0.1. Expression levels of immunoglobulin genes were discarded during the clustering step. Uniform manifold approximation and projection (UMAP) 39 algorithm was run on the first 50 principal factors to obtain the 2D projections of the cells.
Scoring gene set signatures
Cell specific expression scores of the cell cycle genes and the cytokines were calculated as previously defined in 40, where for each gene in the gene-set, 100 genes were randomly selected as control genes.
Non-negative matrix factorization
The integrated single cell count matrix was non-negative matrix factorized using the Python package Consensus Non-negative Matrix Factorization (cNMF) 41. The number of high variance genes used to run the factorization was set to 3000, and the NMF loss function was frobenius. The optimal number of latent factors was chosen based on the identifiability of the independent pathways as determined by gene set enrichment analysis. The genes with significantly high loadings per factor had loadings greater than three interquartile ranges above the 75th percentile. The clusterProfiler R package was used to perform gene set enrichment of the selected genes 42.
Pseudotime analyses of the CD4+ and CD8+ T cell populations were performed using the R implementation of Monocle3 43. Root cells were selected based on the transcriptome analyses of the Leiden clusters.
TCR repertoire analysis
While generating the assembled V(D)J sequences, the 10x Genomics V(D)J contig assembly algorithm (https://support.10xgenomics.com/single-cell-vdj/software/pipelines/latest/algorithms/assembly) accounts for many types of noise specific to scRNA-seq data. Nonetheless, only cells with high-quality V(D)J contig sequences in the beta and alpha chains were chosen. The downstream repertoire analysis excluded cells with more than one high-quality alpha or beta chain sequence (i.e., double expressors). Cells having the same beta chain V and J genes as well as CDR3 nucleotide sequences were assumed to be clonally related. A small fraction of clones contained representatives in both CD4+ and CD8+ populations which might have represented contamination events, in those cases the clone was assigned to the dominant population for further analyses.