Distinct B cell subsets give rise to antigen-specific antibody responses against SARS-CoV-2

Summary
Discovery of durable memory B cell (MBC) subsets against neutralizing viral epitopes is critical for determining immune correlates of protection from SARS-CoV-2 infection. Here, we identified functionally distinct SARS-CoV-2-reactive B cell subsets by profiling the repertoire of convalescent COVID-19 patients using a high-throughput B cell sorting and sequencing platform. Utilizing barcoded SARS-CoV-2 antigen baits, we isolated thousands of B cells that segregated into discrete functional subsets specific for the spike, nucleocapsid protein (NP), and open reading frame (ORF) proteins 7a and 8. Spike-specific B cells were enriched in canonical MBC clusters, and monoclonal antibodies (mAbs) from these cells were potently neutralizing. By contrast, B cells specific to ORF8 and NP were enriched in naïve and innate-like clusters, and mAbs against these targets were exclusively non-neutralizing. Finally, we identified that B cell specificity, subset distribution, and affinity maturation were impacted by clinical features such as age, sex, and symptom duration. Together, our data provide a comprehensive tool for evaluating B cell immunity to SARS-CoV-2 infection or vaccination and highlight the complexity of the human B cell response to SARS-CoV-2.


SARS-CoV-2-specific B cell subsets
To discern the identities of distinct B cell subsets, we further analyzed the Ig repertoire and differentially expressed genes, and performed pseudotime analyses of integrated clusters. For pseudotime analysis, we rooted the data on cluster 2, as cells within this cluster expressed Ig genes with little to no SHM or CSR (Fig. 1f) and displayed low probe reactivity (Extended Data Fig. 1c), suggesting that this subset is composed of true naïve B cells. Pseudotime analysis rooted on cluster 2 identified clusters 0, 1, and 8 in various stages of differentiation, suggestive of recent activation (Fig. 2a-b). As they displayed little CSR or SHM (Fig. 1f), we categorized these subsets as innate-like or possibly germinal center independent. Clusters 3 and 5 appeared to be specific IgM memory subsets (Fig. 1f and Extended Data Fig. 1c), while clusters 4, 7, 9, and 12 displayed high specificity, CSR, and SHM, demonstrating an affinity-matured memory phenotype (Fig. 1f and Extended Data Fig. 1c). As naïve B cells and MBCs are quiescent, clusters 4, 5, 7, and 9 were similar to cluster 2 in pseudotime analysis 19. Lastly, cluster 6 was of interest as these cells displayed the greatest frequency of SHM and IgA CSR, and may have arisen in the context of a mucosal immune response.

In-depth analysis of select genes, including those related to B cell fate, MBC differentiation and maintenance, and long-lived plasma cells (LLPCs), helped to further reveal the identities of select clusters. Genes associated with MBCs (cd27, cd38, cd86, pou2af), repression of apoptosis (mcl1), early commitment to B cell fate (zeb2), repression of LLPC fate (spiB, pax5, bach2), and early B cell activation and proliferation (bach2) confirmed clusters 3, 4, 5, 7, and 9 as MBCs, though with varying degrees of differentiation, CSR, and SHM (Fig. 2b-c and Extended Data Fig. 2). Notably, we identified upregulation in cluster 7 of the transcription factor hhex, which has recently been shown to be involved in MBC differentiation in mice (Extended Data Fig. 2) 20. Lastly, cluster 12 appeared to be LLPCs or precursors thereof, based on expression of genes associated with LLPC fate, including prdm1, xbp1, and manf (Extended Data Fig. 2) 19,21,22. Together with our antigen-specific probe data (

The properties of B cells targeting immunogenic targets such as ORF8 and NP compared to the spike are unknown. We further analyzed isotype frequencies, VH SHM, VH gene usages, and frequencies of B cells against these targets within distinct B cell subsets. The majority of antigen-specific B cells were of the IgM isotype with a limited degree of CSR. There were no major differences between the isotypes of B cells specific to these distinct targets, with the majority of class-switched cells being of the IgG1 isotype. Consistent with a de novo response against the novel SARS-CoV-2, we observed that the majority of antigen-specific B cells had little to no VH SHM, though spike-reactive B cells displayed slightly increased amounts of SHM. Spike-specific B cells were primarily enriched in MBC and LLPC-like clusters 4, 5, 7, 9, and 12, while NP- and ORF8-specific B cells were largely found within naïve- and innate-like clusters but also within MBC clusters (Fig. 3a-l). Lastly, we did not observe differences in heavy chain (HC) or light chain (LC) complementarity determining region 3 length by antigen targeting (Extended Data Fig. 3a-b), though we did observe that HC and LC isoelectric points (pI) for spike-reactive B cells were generally lower than those of NP- or ORF8-reactive B cells (Extended Data Fig. 3c-

We next analyzed the VH gene usages of spike-, NP-, and ORF8-specific B cells and identified the most common VH usages per reactivity (represented by larger squares on each tree map) as well as shared VH usages across reactivities (shown by matching colors; Fig. 3m-p). Strikingly, we identified usage of particular VH gene loci that did not overlap between spike- and RBD-reactive B cells (shown in black). VH1-24, VH3-7, and VH3-9 were the most highly represented VH gene usages exclusively associated with non-RBD spike reactivity, and VH1-24 usage was enriched in cluster 7, an MBC-like cluster and Extended Data Fig. 1b). These results were confirmed by mAb data, which identified spike-specific mAbs utilizing VH1-24 and VH3-7 that did not bind to the RBD (Extended Data Table 3). Unique LC V gene usages were also evident amongst antigen-specific cells

Finally, public B cell clones were of interest, as the epitopes they bind can be targeted by multiple people and thus represent important vaccine targets. We identified five novel public clones from this dataset: three were present in two separate subjects, one was present amongst three subjects, and one amongst four subjects (Extended Data Table 4). Four of the clonal pools were specific to the spike protein, and the remaining clone to NP. The majority of clonal pool members were identified in MBC-like clusters 3, 4, 5, 7, and 9, suggesting that B cells specific to public epitopes can be established within stable MBC compartments.

Monoclonal antibody binding and neutralization
To simultaneously validate the specificity of our approach and investigate the properties of mAbs targeting distinct SARS-CoV-2 viral epitopes, we synthesized and characterized the binding and neutralization ability of 90 mAbs from our single cell dataset (Extended Data Table 3). B cells exhibiting variable probe binding intensities toward distinct antigens were chosen as candidates for mAb generation, as well as B cells that tended to bind multiple probes (exhibiting non-specificity or polyreactivity). MAbs cloned were representative of various clusters, reactivities, VH gene usages, mutational load, and isotype usages (Fig. 4a, Extended Data Table 3). Representative mAbs generated from cells specific to the spike, NP, and ORF8 exhibited high affinity by ELISA, though probe intensities did not meaningfully correlate with apparent affinity (KD) (Fig. 4b, Extended Data Fig. 4a). Only a small percentage of cloned mAbs to the spike, NP, and ORF8 exhibited non-specific binding (Fig. 4b). Notably, cells exhibiting non-specific binding were reactive to the PE-SA-oligo probe conjugate and were largely polyreactive (Extended Data

While mAbs targeting the RBD of the spike are typically neutralizing, little is known regarding the neutralization capabilities of mAbs targeting non-RBD regions of the spike, ORF8 and NP. We addressed the neutralization ability of all synthesized mAbs using a live virus plaque assay and determined that all mAbs cloned against NP and ORF8 were non-neutralizing, while mAbs against the RBD and other epitopes of the spike were largely neutralizing at varying degrees of potency (Fig. 4c-d). As anti-spike mAbs were predominantly neutralizing and enriched in memory, these MBC subsets may serve as a biomarker for superior immunity to SARS-CoV-2.

Antigen targeting and clinical features
Previous studies from our group and others have suggested serum antibody titers correlate with sex, SARS-CoV-2 severity, and age 6,14,23. We therefore investigated the frequencies of SARS-CoV-2-reactive B cells to assess whether reactivity toward particular SARS-CoV-2 antigens correlated with clinical parameters. By both serology and ELISpot, we identified that B cell responses against the spike/RBD and NP were immunodominant, though ORF8 antigen targeting was substantial (Fig. 5a, b). Consistent with our single cell dataset, spike-specific B cells were enriched in memory by ELISpot (Fig. 5b).

We next analyzed the distribution of B cell subsets and frequencies of B cells specific to the spike, NP, ORF7a, and ORF8 in sets of patients stratified by age, sex, and duration of symptoms from our single cell dataset. We normalized antigen probe signals by a centered log-ratio transformation individually for each subject; all B cells were clustered into multiple probe hit groups according to their normalized probe signals, and cells that were negative to all probes or positive to all probes (non-specific) were excluded from the analysis. We identified substantial variation amongst individual subjects in terms of the degree of spike, NP, ORF7a, and ORF8 antigen targeting (Fig. 5c). As subject age increased, the percentages of spike-reactive B cells relative to B cells targeting internal proteins decreased, and age positively correlated with increased percentages of ORF8-reactive B cells (Fig. 5d-e). Similarly, female subjects and subjects experiencing a longer duration of symptoms displayed reduced spike targeting relative to internal proteins

In summary, our study highlights the diversity of B cell subsets expanded upon novel infection with SARS-CoV-2. Using this approach, we identified that B cells against the spike, ORF8, and NP differ in their ability to neutralize, derive from functionally distinct and differentially adapted B cell subsets, and correlate with clinical parameters such as age, sex, and symptom duration.
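
The per-subject centered log-ratio (CLR) normalization of probe signals described in this section can be illustrated with a minimal Python sketch. It is not the authors' code (the study used Seurat in R); the pseudocount of 1, the choice to normalize each probe across the cells of one subject, and the dictionary keys `subject` and `spike` are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def clr(values, pseudocount=1.0):
    """Centered log-ratio: log of each value minus the mean of the logs.

    The pseudocount (an assumption here) keeps zero counts finite.
    """
    logs = [math.log(v + pseudocount) for v in values]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def clr_per_subject(cells, probe):
    """Apply CLR to one probe's counts, separately within each subject.

    `cells` is a list of dicts with a hypothetical 'subject' key and one
    key per probe. Returns normalized values in the same order as `cells`.
    """
    by_subject = defaultdict(list)
    for idx, cell in enumerate(cells):
        by_subject[cell["subject"]].append(idx)

    out = [0.0] * len(cells)
    for idxs in by_subject.values():
        normalized = clr([cells[i][probe] for i in idxs])
        for i, value in zip(idxs, normalized):
            out[i] = value
    return out


# Illustrative input only: three cells from two hypothetical subjects.
cells = [
    {"subject": "A", "spike": 1},
    {"subject": "B", "spike": 7},
    {"subject": "A", "spike": 3},
]
normalized = clr_per_subject(cells, "spike")
```

Because each subject is centered on its own mean log signal, differences in overall probe staining between subjects do not carry over into the normalized values.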

Discussion
The COVID-19 pandemic continues to pose one of the greatest public health and policy challenges in modern history, and robust data on long-term immunity are critically needed to evaluate future decisions regarding COVID-19 responses. Our approach combines three powerful aspects of B cell biology to address human immunity to SARS-CoV-2: B cell transcriptome, Ig sequencing, and recombinant mAb characterization. Our approach enables the identification of potently neutralizing antibodies and the characteristics of the B cells that generate them. Importantly, we showed that antibodies targeting key protective spike epitopes are enriched within canonical MBC populations.

Identification of multiple distinct subsets of innate-like B cells, MBCs, and apparent LLPC precursors illustrates the complexity of the B cell response to SARS-CoV-2, revealing an important feature of the immune response against a novel pathogen. The B cell clusters herein may provide biomarkers in the form of distinct B cell populations that can be used to evaluate future responses to various vaccine formulations. In particular, the identification of LLPC precursors in the blood following infection and vaccination has been long sought after, as they serve as a bona fide marker of long-lived immunity 24,25. Future studies elucidating distinct identities and functions of these subsets are necessary and will provide key insights into B cell immunology.

We identified that older patients, female patients, and patients experiencing a longer duration of symptoms tended to display reduced proportions of MBC clusters and reduced VH SHM, consistent with a previous study that identified limited germinal center formation upon SARS-CoV-2 infection 26. Notably, older patients had increased percentages of ORF8-specific B cells, which we identified as exclusively non-neutralizing. Mechanistically, these observations may be explained by reduced adaptability of B cells or increased reliance on CD4 T cell help for B cell activation, which have been observed in aged individuals upon viral infections 27,28. Furthermore, T cell responses to SARS-CoV-2 ORF proteins are prevalent in convalescent COVID-19 patients, and recent studies suggest impaired T cell responses in aged COVID-19 patients impact antibody responses 10,29,30,42. More research is warranted to definitively determine whether B cell targeting of distinct SARS-CoV-2 antigens correlates with age and disease severity. Addressing these questions will be critical for determining correlates of protection and developing a

Study cohort and sample collection
All studies were performed with the approval of the University of Chicago institutional review board IRB20-0523 and University of Wisconsin-Madison institutional biosafety committees. Informed consent was obtained after the research applications and possible consequences of the studies were disclosed to study subjects. This clinical trial was registered at ClinicalTrials.gov with identifier NCT04340050, and clinical information for patients included in the study is detailed in Extended Data Table 1

Computational analyses for single cell sequencing data
We adopted Cell Ranger (version 3.0.2) for raw sequencing processing, including 5' gene expression analysis, antigen probe analysis, and immunoprofiling analysis of B cells. Based on Cell Ranger output, we performed downstream analysis using Seurat (version 3.2.0, an R package, for transcriptome, cell surface protein and antigen probe analysis) and IgBlast (version 1.15, for immunoglobulin gene analysis). For transcriptome analysis, Seurat was used for cell quality control, data normalization, data scaling, dimension reduction (both linear and non-linear), clustering, differential expression analysis, batch effects correction, and data visualization. Unwanted cells were removed according to the number of detectable genes (cells with <200 or >2500 detected genes were removed) and the percentage of mitochondrial genes for each cell. A soft threshold for the percentage of mitochondrial genes was set to the 95th percentile of the current dataset distribution, and the soft threshold was subject to a ceiling of 10% as the maximum threshold in the case of particularly poor cell quality. Transcriptome data were normalized by a log-transform function with a scaling factor of 10,000, whereas cell surface protein and antigen probe data were normalized by a centered log-ratio (CLR) normalization.
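
The quality-control and log-normalization rules just described can be sketched in plain Python. This is an illustrative reimplementation, not the authors' Seurat code: the function names, the linear-interpolation percentile, the decision to keep cells exactly at the threshold, and the use of `log1p` for the log transform are all assumptions.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def mito_threshold(pct_mito, q=95.0, ceiling=10.0):
    """Soft threshold: the dataset's 95th percentile, capped at 10%."""
    return min(percentile(pct_mito, q), ceiling)


def passes_qc(n_genes, pct_mito, threshold, min_genes=200, max_genes=2500):
    """Keep cells with 200-2500 detected genes and acceptable mito fraction."""
    return min_genes <= n_genes <= max_genes and pct_mito <= threshold


def log_normalize(counts, scale_factor=10_000):
    """Scale one cell's counts to `scale_factor` total, then take log(1 + x)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]


# Illustrative values only: one poor-quality cell pushes the percentile
# above the ceiling, so the 10% cap applies.
threshold = mito_threshold([1, 2, 3, 4, 50])
kept = passes_qc(n_genes=1000, pct_mito=2.0, threshold=threshold)
```

The cap matters only for poor-quality datasets: when the 95th percentile already falls below 10%, the percentile itself is the threshold.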
We used variable genes in principal component analysis (PCA) and used the top 15 principal components (PCs) in non-linear dimension reduction and clustering. High-quality cells were then clustered by the Louvain algorithm implemented in Seurat at a resolution of 0.6. Differentially expressed genes for each cell cluster were identified using a Wilcoxon rank-sum test implemented in Seurat. Batch effects across different datasets were corrected using the anchor method implemented in Seurat. All computational analyses were performed in R (version 3.6.3).

standardize the assays, control antibodies with known binding characteristics were included on each plate and the plates were developed when the absorbance of the control reached 3.0 OD405 units. All experiments were performed in duplicate 2-3 times.

Polyreactivity ELISA
Polyreactivity ELISAs were performed as previously described 39,40. High-protein binding microtiter plates (Costar) were coated with 10 µg/ml calf thymus dsDNA (Thermo Fisher), 2 µg/ml Salmonella enterica serovar Typhimurium flagellin (Invitrogen), 5 µg/ml human insulin (Sigma-Aldrich), 10 µg/ml KLH (Invitrogen), and 10 µg/ml Escherichia coli LPS (Sigma-Aldrich) in 1X PBS. Plates were coated with 10 µg/ml cardiolipin in 100% ethanol and allowed to dry overnight. Plates were washed with water and blocked with 1X PBS/0.05% Tween/1 mM EDTA. MAbs were diluted to 1 µg/ml in PBS, serially diluted 4-fold, and added to plates for 1.5 hours. Goat anti-human IgG-HRP (Jackson Immunoresearch) was

Strain (Sigma-Aldrich), and 6 µg/ml CpG (Invitrogen) in complete RPMI in an incubator at 37°C/5% CO2 for 5 days. After stimulation, cells were counted and added to ELISpot white polystyrene plates (Thermo Fisher) coated with 4 µg/ml of SARS-CoV-2 spike that were blocked with 200 µl of complete RPMI. ELISpot plates were incubated with cells for 16 hours overnight in an incubator at 37°C/5% CO2. After the overnight incubation, plates were washed and incubated with anti-IgG-biotin and/or anti-IgA-biotin (Mabtech) for 2 hours at room temperature. After secondary antibody incubation, plates were washed and incubated with streptavidin-alkaline phosphatase (Southern Biotech) for 2 hours at room temperature. Plates were washed and developed with NBT/BCIP (Thermo Fisher Scientific) for 2-10 minutes, and reactions were stopped by washing plates with distilled water and allowed to dry overnight before counting. Images were captured with Immunocapture 6.4 software (Cellular Technology Ltd.), and spots were manually counted.

Neutralization assay
The SARS-CoV-2/UW-001/Human/2020/Wisconsin (UW-001) virus was isolated from a mild case in February 2020 and used to assess neutralization ability of mAbs. Virus (~500 plaque-forming units) was incubated with each mAb at a final concentration of 10 µg/ml. After a 30-minute incubation at 37°C, the virus/antibody mixture was used to inoculate Vero E6/TMPRSS2 cells seeded a day prior at 200,000 cells per well of a TC12 plate. After 30 minutes at 37°C, cells were washed three times to remove any unbound virus, and media containing antibody (10 µg/ml) was added back to each well. Two days after inoculation, cell culture supernatant was harvested and stored at -80°C until needed. A non-relevant Ebola virus GP mAb and PBS were used as controls.

To determine the amount of virus in the cell culture supernatant of each well, a standard plaque-forming assay was performed. Confluent Vero E6/TMPRSS2 cells in a TC12 plate were infected with supernatant (undiluted, 10-fold dilutions from 10⁻¹ to 10⁻⁵) for 30 minutes at 37°C. After the incubation, cells were washed three times to remove unbound virus and 1.0% methylcellulose media was added over the cells. After an incubation of three days at 37°C, the cells were fixed and stained with crystal violet solution in order to count the number of plaques at each dilution and determine virus concentration given as plaque-forming units (PFU)/ml. A stringent cutoff for neutralization was chosen as 100-fold greater neutralization relative to the negative control mAb.
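
The titer arithmetic behind the plaque assay and the 100-fold cutoff can be sketched as follows. This is a hedged illustration rather than the authors' analysis: the inoculum volume is not stated in the text, so `inoculum_ml` is a hypothetical parameter, and reading "100-fold greater neutralization" as an at least 100-fold lower titer than the negative control mAb is an assumption.

```python
def pfu_per_ml(plaque_count, dilution, inoculum_ml):
    """Titer from one countable well: plaques / (dilution * volume plated).

    `dilution` is the dilution factor of the supernatant (1.0 for
    undiluted, 1e-3 for the 10^-3 dilution). `inoculum_ml` is a
    hypothetical value; the plated volume is not given in the text.
    """
    return plaque_count / (dilution * inoculum_ml)


def is_neutralizing(titer_mab, titer_control, fold_cutoff=100.0):
    """True when the mAb well's titer is at least `fold_cutoff` times
    lower than the negative-control mAb well's titer (assumed reading)."""
    if titer_mab == 0:
        return True  # no plaques detected at any dilution
    return titer_control / titer_mab >= fold_cutoff


# Illustrative numbers only, not data from the study.
control_titer = pfu_per_ml(plaque_count=25, dilution=1e-4, inoculum_ml=0.1)
mab_titer = pfu_per_ml(plaque_count=20, dilution=1e-1, inoculum_ml=0.1)
neutralizing = is_neutralizing(mab_titer, control_titer)
```

In this made-up example the control well has 1,250-fold more virus than the mAb well, so the mAb would pass the cutoff.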

Statistical analysis
All statistical analyses were performed using Prism software (GraphPad Version 7.0). Sample sizes (n) are indicated directly in the figures or in the corresponding figure legends, and the specific tests used for statistical significance are indicated in the corresponding figure legends. P values less than or equal to 0.05 were considered significant. *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001.
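
The significance notation above maps P values to asterisks, which a short helper can make explicit. This is only a sketch: the `"ns"` label for non-significant results is a hypothetical addition, and because the text states both "less than or equal to 0.05" and "*p<0.05", the first tier here follows the "less than or equal to" sentence as an assumption.

```python
def significance_stars(p):
    """Map a P value to the asterisk notation used in the figure legends."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("P value must lie in [0, 1]")
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p <= 0.05:  # assumed: follows "less than or equal to 0.05"
        return "*"
    return "ns"  # hypothetical label; not used in the source text


labels = [significance_stars(p) for p in (0.2, 0.03, 0.004, 0.0002, 0.00001)]
```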