Patients and controls
Sixteen Norwegian patients with APS-1 (8 males, 8 females, mean age 35.9 years +- 16.2) and 16 Norwegian gender and age-matched controls (+/- 5 years) were included in the blood microarray expression study. Of the APS-1-patients, two sibling-trios were included, and two other patients were otherwise related. Fourteen of the same 16 patients and 4 additional APS-1 patients were included in the real time PCR verification experiments together with 8 additional controls for a total of 18 APS-1 patients and 24 healthy controls.
Nine APS-1 patients (6 males, 3 females, mean age 47.6 years +-13.1) and 16 controls (10 males and 6 females, mean age 45.4 years +- 14.1) were recruited to the B cell profiling experiments using flow cytometry, while 5 APS-1 patients (2 males, 3 females, mean age 44.6+-15.5; 3x AIRE c.967-979del13/c.967-979del13 and 2x AIRE c.879 + 1G > A/c.879 + 1G > A) and 4 age-matched healthy controls were subjected to single B naïve cells sequencing. No children < 16 years were included in any parts of the study.
The included APS-1 patients have all been reported previously and diagnosis was always confirmed by the clinical criteria for this syndrome, AIRE mutational analysis and/or autoantibody screening against IFN-ω [26]. All patients were additionally analyzed for antibodies against organ specific targets known for APS-I and most for the cytokine targets IL-22, IL-17A, IL-17F, IFN-α2 and IFN-α8 (see below for protocol). None of them were on immune suppressive treatments except for glucocorticoids to restore physiological cortisol levels for patients with autoimmune Addison’s disease. Data of the patients and degree of overlap between the data sets are summarized in Table 1. A flow chart describing the enrolled patients and the methodologies is included in Fig. 1.
Table 1
AIRE mutations: 1. c.769C>T; 2. c.1242_1243insA; 3. c.967-979del13; 4. c.22C>T; 5. c.402delC; 6. c.879+G>A; 7. c.1336T>G; 8. c.1249dupC; 9. del e1-e8
AIRE mutation group microarray analysis: 1. c.769C>T homozygous: Finnish; 2. C.769C>T + other: Finnish het.; 3. c.967-979del13 homozygous: 13 bp del.; 4. missense + ins/del: point mut.; 5. c.879+1G>A: Splice. Used to classify mutations in the microarray analysis interpretations.
Autoantibodies in serum: 1: Positive. wk: very weak positive response. NA: not analyzed. 0: negative.
Organ specific autoantibodies: The number represents how many of the following autoantibodies each patient has in serum (total N=8): 21-hydroxylase, 17-hydroxylase, glutamic acid decarboxylase-65, side-chain cleavage enzyme, aromatic L-amino acid decarboxylase, tryptophan hydroxylase, tyrosine hydroxylase, NACHT leucine-rich-repeat protein 5,
Global whole blood RNA expression analyses using microarray chip.
Experimental approach
Sampling of APS-1 patients and controls was performed in standardized manners in PAXgene blood RNA tubes (PreAnalytix, Qiagen, Hombrectikon, Switzerland) and stored at -80°C until use. Purification of RNA was achieved by the PAXgene blood RNA kit following the instructions from the manufacturer. Samples were quality assessed by Agilent Bioanalyzer, using the Agilent 6000 Nano kit (Agilent, Santa Clara, CA, USA), providing RNA with RNA integrity numbers (RIN) above 6.0. The samples were randomly distributed into 4 batches for RNA extraction, each with 4 patients and 4 sex- and age matched controls and was extracted by the same person on the same day. Following the procedures from Illumina, RNA was subsequently transformed to cRNA, and these constructs were labeled, amplified and quality-checked again by the Agilent Bioanalyzer. The cRNAs were then hybridized to 4 Illumina HumanRef-8 BeadChip microarrays, followed by washing and scanning according to the protocol. Quality control of the arrays was done by BeadStudio.
Bioinformatic approach
Microarray output was analyzed using Limma within R [27]. A list of differentially expressed genes (GRCh38p13), between patients and controls was then provided. Genes targeted by multiple probes were collapsed and the list comprising 11441 features was ranked according to log2 fold-change and interrogated using gene set enrichment analysis (gsea) [28]. The 26978 gene sets (GSs) identified were compiled on March 1st 2021 using the resources described in [29]. Permutation number was deemed adequate at 1000 iterations, 8972 GSs passed the size thresholds set at > 10 and < 500 member genes and for all other parameters default values were used.
The contribution of individual genes within each GS is subsequently addressed via leading edge (LE) analyses. They identify each GS’s member genes that appear in the ranked list at or before the point at which the running sum reaches its maximum deviation from zero. Thus, genes assigned to a GS’s LE are the specific genes that account for this GS’s significant enrichment or depletion signal [28].
To interpret gsea results comprehensively, GSs were mapped using network-based algorithm described previously [30–33]. In summary, the latter organize GSs that passed the significance threshold (FDR < 0.1) using forces of attraction, i.e., edge-weights, exerted in cases where GSs share more than or equal to 5% of their member genes. Consequently, pairwise shared LE-members between the GSs determined their position and organization within the network clusters. The emerging transcriptional landscape was then visualized using the yFiles organic layout in Cytoscape 3.8.2 [34].
Computational deconvolution techniques based on immune cell subset-derived gene expression signatures enable the approximation on the quantities of various immune-cell types per sample without the need for experimental cell sorting. Here, using ImmQuant’s DMAP-signatures [35], changes in immune cell-subset proportions in blood from APS-1 patients versus controls were assessed on the basis of the data matrix underlying differential expression analyses. All parameters were used per default prior to integration with the results from other analyses.
Global single cell RNA sequencing of naïve B cells.
PBMCs were purified from 5 APS-1 patients and 4 controls according to standard procedures using Ficoll Paque Plus. Naïve (CD19 + CD20 + CD27- IgD+) B cells were subsequently sorted into single wells in a 384 well plate directly in buffer obtained by the Eukaryotic single cell genomics facility (ESCG) at SciLifeLab, Sweden. Naïve B cells were chosen because of their implicated differential expression between patients and controls from the microarray study and because of their abundance in blood. The antibodies used were anti-CD19 clone HIB19 PerCp/Cy5.5, anti-CD20 clone 2H7 APC-Cy7, anti-CD27 clone 0323 PE/Cy7 and anti-IgD clone IA6 FITC, all from Biolegend (San Diego, CA, US) (Suppl. Table 1). The gating strategy for naïve B cells is given in Suppl. Figure 1.
ESCG performed the single cell full-length RNA sequencing using the Smart-seq2 kit for library preparation according to the protocol with minor modifications [36]. RNA sequencing was made on an Illumina HiSeq2500 sequencer by Scilifelab NGI, with 50 bp single reads using standard procedure with an estimated read depth of 0.5 million bp. The STARsolo single cell aligner [37] was used to align fastq files to the GRCh38 (hg38, [38]) version 100 reference genome using the options “--soloType SmartSeq”, “--soloUMIdedup NoDedup” and “--soloStrand Unstranded”.
The count matrices were imported into R (v. 4.0.2, R Core Team, 2014) and RStudio (v. 1.4.1103, RStudio Team, 2020) using the R packages scran (v. 1.18.7), scuttle (v. 1.0.4) and scater (v. 1.18.6,: [39]). AnnotationHub (object AH79689, v. 2.22.1, Morgan M, Shepherd L, 2021) was used for reference data, and used for calculating transcripts per million (TPM). Cells were discarded on the basis of a high percentage of mitochondrial gene expression, low number of genes detected, and low total read counts using scuttle’s quickPerCellQC. Genes with low expression were filtered out by requiring genes to have 1 read or more in at least 2 cells. Cell-type classification was performed with SingleR (v. 1.4.1, [40]) using the Monaco et al [41] reference dataset from celldex (v 1.0, [40]). Technical noise was modelled and pca were denoised using modelGeneVar and denoisePCA from scran, and plots, were generated using scater and uwot (v. 0.1.10 [42]).
The number of genes was reduced for differential expression analysis by requiring genes to have at least 4 reads in at least 10 cells. The data was modelled using a zero-inflated negative binomial using the zinbwave R package (v. 1.12.0, [43]). Zinbwave was run with the options “K = 0”, “epsilon = 1e12”, and “observationalWeights = TRUE”. The design used was “~group”, where group consists of a factor with either control or APS-1. Differential expression analysis using DESeq2 (v. 1.30.1, [44]) was performed using the options “fitType="local"”, “useT = TRUE”, “minmu = 1e-6”, “test="LRT"”, “minReplicatesForReplace = Inf”, “reduced = ~ 1”, and a significant level of 0.05. Subsequently, log fold change shrinkage was performed using lfcShrink with the option “type = "normal"”. MA plots, volcano plots, and violin plots were generated using ggplot2 (v. 3.3.5, H. Wickham, 2016) with the additional packages scales (v. 1.1.1), RColorBrewer (v. 1.1-2), viridis (v. 0.6.1), ggrepel (v. 0.9.1), and cowplot (v. 1.1.1 [45]). The expression values were then computed as reads per kilobase of gene model and million mappable reads (RPKMs) to normalize for varying sequencing depths across sequenced cells and the gene lengths as explained by Ramsköld et al. [46] and correcting for the uniquely alignable positions using MULTo [47]. The alignments of reads with the indicated genome assembly provided BAM files using SAM tools [48].
Real time PCR verification of selected genes from the affected pathways identified in the microarray and B cell single cell sequencing datasets
Custom array RT2 profiler PCR plates were ordered from Qiagen (Hilden, Germany) and the protocol from the manufacturer was followed using 200 ng of RNA as input with the QuantStudio5 equipment from Thermo Scientific (Waltham, Massachusetts, US). All samples were run in doublets and the mean of the results for the three housekeeping genes GADPH, ACTB and HPRT1 was used as normalisation base. The 2−∆∆Ct method was used to provide fold change values for each gene for each individual relative to the mean result of the healthy controls. Differences between groups were analysed by the Student’s t-test in Prism (Graph Pad Software, Inc., San Diego, California, USA), P < 0.01.
Flow cytometry analysis on PBMC with a B cell panel
PBMC isolation was performed on heparin-blood using Ficoll Paque Plus and cells were then frozen in human AB-serum (Sigma-Aldrich provided by Merck, Darmstadt, Germany) with 10% DMSO (Sigma-Aldrich) at -150 C until use. On the day of analysis, cells were thawed and transferred to 37°C Dulbecco's Modified Eagle Medium GlutaMAX (DMEM) (Gibco, Thermo Scientific), supplemented with 10% fetal bovine serum (FBS) (Gibco) and 1% Penicillin-Streptomycin (Sigma-Aldrich). Samples were centrifuged and subsequently resuspended in 4 ml PBS with 2% FBS and filtered using a MACS SmartStrainer 70 µm (Miltenyi Biotec, Bergisch Gladbach, Germany).
Two microliters undiluted Fc-block (BD, Franklin, New Jersey, US) was added, and cells were incubated for 20 min before stained with 1 µl live/dead marker (Zombie Violet BV421 (BioLegend)). After washing, samples were subsequently incubated with B cell relevant antibodies for extracellular staining; CD38-PerCP, CD27-FITC, CD10-Pe/Cy7, CD19-Pe, CD79b-APC, CD45BV785, CD74-BV650, CD5-BV605, CD3 V500 (all from Biolegend, except anti-human CD74 from BD) and a “dump mixture” with the aim to exclude non-relevant cells (biotinylated antibodies against CD34, CD11b, CD14, CD61, CD16, CD1c and TCR γ/δ). Cells were then stained with Strepavidin-BV421 (BioLegend) (Suppl. Table 1). Acquisition was conducted on the LSR Fortessa Flow Cytometer (BD), and analysis was performed using FlowJo version 10.7.1 (BD). The gating strategy is defined in Suppl. Figure 2. Differences between groups were analysed by the Student’s t-test in Prism (Graph Pad Software, Inc., San Diego, California, USA), P < 0.05.
Analysis of autoantibodies and BAFF levels in serum
Binding and neutralization autoantibodies against organ specific targets, IFN-α2, IFN-α8, IFN-ω, IL-22, IL-17A and IL-17F were analyzed by radioimmune assay (RIA) or enzyme linked immunosorbent assay (ELISA) and antiviral interferon neutralization assay (AVINA), respectively, as described previously [5, 7, 49, 50]. The organ specific targets for autoantibody analysis included in Table 1 were 21-hydroxylase, 17-hydroxylase, glutamic acid decarboxylase-65, NACHT leucine-rich-repeat protein 5, aromatic-L amino acid decarboxylase, tryptophan hydroxylase, tyrosine hydroxylase and side-chain cleavage enzyme.
B cell activating factor (BAFF) levels were analyzed in 25 uL serum from 21 APS-1 patients and 21 healthy controls by the Human BAFF/Blys/TNFSF13B Quantikine ELISA kit, Catalog number DBLYS0B from R&D systems (Abingdon, United Kingdom) in accordance with the manufacturers protocol.