Study participants
Samples were obtained after each study participant provided informed consent, in accordance with the Declaration of Helsinki and with approval from the ethics review board of the Graduate School of Medicine, Osaka University, Japan (No. 855). MPA was defined according to the 2012 Chapel Hill Consensus Conference nomenclature and definitions. Patients with AAV were diagnosed as having MPA according to the 2022 American College of Rheumatology/European Alliance of Associations for Rheumatology classification criteria34,35. Each diagnosis was verified by at least two rheumatologists. The Birmingham Vasculitis Activity Score (BVAS) 2008 version 3 was used to rate MPA disease activity.
Sample profiles
Eight patients with MPA (five females; median age, 72 years) and seven healthy donors (five females; median age, 62 years) were recruited for CITE-seq experiments. All PBMC samples were also submitted for cytometry by time-of-flight (CyTOF) analysis. All patients with MPA had new-onset disease and had not received any immunosuppressive therapy. An additional 43 patients with MPA (24 females; median age, 75 years) were recruited to evaluate clinical and laboratory parameters. All serum samples were analyzed by IFN-α ELISA.
Serum and PBMC preparation
Whole blood (3.5 mL) was collected in Vacutainer SST II tubes (BD Diagnostics). Tubes were centrifuged for 10 minutes at 1,200 × g. The resultant supernatant was collected as serum and stored at −80 °C. For PBMC collection, whole blood (20 mL) was collected into a Na-heparin blood collection tube (Terumo). PBMCs were separated using Leucosep (Greiner). PBMCs were washed and resuspended in Cellbanker 1plus (ZENOAQ) at a concentration of 1.0 × 10⁷ cells/mL before being stored at −150 °C.
Single-cell library construction
PBMCs were thawed and labeled with DNA-barcoded antibodies for CITE-seq. Information on the antibodies used for CITE-seq is shown in Supplementary Table 6. Single-cell suspensions were processed through the 10x Genomics Chromium Controller (10x Genomics). The libraries were constructed following the protocol outlined in the Chromium Single Cell 5′ Reagent Kits v2 (Dual Index) User Guide (10x Genomics). Briefly, up to 10,000 labeled live cells per sample were loaded separately into the 10x Genomics platform, without sample mixing, to create a barcoded cDNA library for individual cells. Data quality control was performed using the Bioanalyzer (Agilent). Individual libraries were pooled for sequencing on the HiSeq 2500 or NovaSeq 6000 platform (Illumina) to achieve at least 20,000 paired-end reads per cell for gene expression and 60,000 paired-end reads per cell for surface proteins. Sequence information is summarized in Supplementary Table 7.
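The per-cell sequencing targets above translate directly into a minimum read budget per library. The following is a minimal arithmetic sketch; the function name is hypothetical, and the 10,000-cell figure is the loading ceiling stated above, so it serves as an upper bound rather than the number of cells actually recovered.

```python
# Per-cell sequencing targets taken from the text above.
GEX_READS_PER_CELL = 20_000   # gene expression library
ADT_READS_PER_CELL = 60_000   # surface protein library

def min_read_pairs(n_cells: int) -> dict:
    """Return the minimum number of read pairs needed per library type.

    Hypothetical helper for illustration; n_cells is the number of cells
    expected in the library.
    """
    return {
        "gene_expression": n_cells * GEX_READS_PER_CELL,
        "surface_protein": n_cells * ADT_READS_PER_CELL,
    }

# Upper bound: all 10,000 loaded cells are recovered.
budget = min_read_pairs(10_000)
print(budget)
```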
Reference-based and manual annotation of CITE-seq data
Raw FASTQ files were aligned to the GRCh38 reference genome using Cell Ranger (version 6.0.6). Filtered HDF5 feature-barcode matrix files generated by cellranger count were used to establish a Seurat object. The Seurat R package (version 4.2.0) was used for data quality control, scaling, transformation, clustering, dimensionality reduction, differential expression analysis, and visualization. Of 117,791 putative cells, 109,350 were retained for further analysis based on the number of unique molecular identifiers (UMIs) per cell and the percentage of mitochondrial reads. Data were normalized and scaled using the SCTransform function. Cellular identity was determined by two rounds of clustering. In the first round, reference-based integration was applied to the query dataset using the CITE-seq dataset of 211,000 human PBMCs as a reference13. The FindTransferAnchors function was used to find anchors between the reference and the query using a precomputed supervised principal component analysis (supervised PCA) transformation for SCT-normalized data. The MapQuery function was then used to transfer cell type labels and protein data from the reference to the query. Platelets and erythrocytes were removed from the analysis. To identify clusters within each major cell type, we performed a second round of clustering on monocytes (CD14 Mono, CD16 Mono, and cDC) and CD8+ T cells (CD8 Naive, CD8 TCM, and CD8 TEM). The RunUMAP function was used for uniform manifold approximation and projection (UMAP) dimensional reduction with 30 precomputed supervised PCA dimensions. A nearest-neighbor graph was computed from the same 30 dimensions of the supervised PCA reduction using the FindNeighbors function, followed by clustering using the FindClusters function. The newly generated UMAP was visualized using the DimPlot function. Each cluster was manually annotated using gene expression and protein data. Doublets were manually removed, separately for each cell type, using cell-surface protein data (e.g., CD3, CD4, CD8, CD11c, CD19, CD56).
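The cell-level quality control described above, filtering on UMIs per cell and the percentage of mitochondrial reads, can be sketched as follows. The text does not report the cut-offs, so the threshold values, the class, and the function names here are hypothetical and purely illustrative; the analysis itself was carried out in Seurat.

```python
from dataclasses import dataclass

@dataclass
class CellQC:
    barcode: str
    n_umi: int        # total UMIs detected in the cell
    n_mito_umi: int   # UMIs assigned to mitochondrial genes

def percent_mito(cell: CellQC) -> float:
    """Percentage of a cell's UMIs that come from mitochondrial genes."""
    if cell.n_umi == 0:
        return 100.0
    return 100.0 * cell.n_mito_umi / cell.n_umi

def filter_cells(cells, min_umi=500, max_umi=25_000, max_pct_mito=10.0):
    """Keep cells inside the UMI range and under the mitochondrial cut-off.

    All three thresholds are assumed values, not those used in the study.
    """
    return [
        c for c in cells
        if min_umi <= c.n_umi <= max_umi and percent_mito(c) <= max_pct_mito
    ]

cells = [
    CellQC("AAAC", 3200, 160),    # 5% mitochondrial: kept
    CellQC("AAAG", 150, 5),       # too few UMIs: removed
    CellQC("AACT", 4100, 1230),   # 30% mitochondrial: removed
]
kept = filter_cells(cells)
print([c.barcode for c in kept])
```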
Differential abundance analysis using scRNA-seq data
Differential abundance analysis between patients with MPA and healthy donors was performed using scRNA-seq data. We used miloR (version 3.15) to detect sets of cells that are differentially abundant between conditions by modeling counts of cells in the neighborhoods of a k-nearest neighbor (KNN) graph18. We first used the buildGraph function to construct a KNN graph based on the precomputed supervised PCA with k = 10, using 30 principal components (d = 30). Next, we used the makeNhoods function to assign cells to neighborhoods based on their connectivity over the KNN graph. For computational efficiency, we subsampled 10% of the monocytes and CD8+ T cells. To test for differential abundance, Milo fits a negative binomial generalized linear model (NB GLM) to the counts for each neighborhood, accounting for different numbers of cells across samples using trimmed mean of M-values (TMM) normalization. We included age as a covariate in the testNhoods function. The log2 fold change in the number of cells between the two conditions in each neighborhood was used for visualization.
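To make the per-neighborhood log2 fold change concrete, the sketch below computes it from raw cell counts. It is a simplification under stated assumptions: it scales each sample by its total cell number instead of applying TMM normalization, adds a hypothetical pseudocount, and does not fit the NB GLM or adjust for age, so it illustrates the quantity being visualized rather than the Milo test itself. The sample names and counts are invented for the example.

```python
import math

def neighborhood_log2fc(counts, total_cells, groups,
                        case="MPA", control="HD", pseudocount=0.5):
    """Log2 ratio of the mean per-sample cell fraction in one neighborhood.

    counts:      sample -> number of that sample's cells in the neighborhood
    total_cells: sample -> total cells profiled for that sample
    groups:      sample -> condition label
    The pseudocount (an assumed value) avoids division by zero.
    """
    def mean_fraction(label):
        fractions = [
            (counts[s] + pseudocount) / total_cells[s]
            for s in groups if groups[s] == label
        ]
        return sum(fractions) / len(fractions)

    return math.log2(mean_fraction(case) / mean_fraction(control))

# Invented example data: two patients and two healthy donors.
groups = {"MPA1": "MPA", "MPA2": "MPA", "HD1": "HD", "HD2": "HD"}
counts = {"MPA1": 40, "MPA2": 60, "HD1": 10, "HD2": 15}
total_cells = {s: 1000 for s in groups}

lfc = neighborhood_log2fc(counts, total_cells, groups)
print(round(lfc, 3))
```

A positive value indicates a neighborhood enriched in patients with MPA; swapping the case and control labels flips the sign.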
Module scoring using scRNA-seq data
Cell-level gene scores were calculated using the AddModuleScore function, and the scores for each study participant were visualized using the DotPlot function. Interferon signature genes (ISGs) used for module scoring were previously reported36. Classical monocyte signature genes and CD8+ cytotoxic T lymphocyte (CTL) signature genes were determined in the human PBMC dataset13 as genes highly expressed in the CD14+ monocyte population and the CD8+ CTL population, respectively.
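The idea behind a module score is the average expression of the signature genes minus the average expression of a control gene set. The sketch below shows that core calculation only. It is an assumption-laden simplification: AddModuleScore selects control genes from matched expression bins, whereas here the control genes are supplied by the caller, and the gene names are placeholders rather than genes from the signature lists used in the study.

```python
from statistics import mean

def module_score(expr, signature, controls):
    """Mean signature expression minus mean control expression for one cell.

    expr: gene -> normalized expression in a single cell.
    Genes absent from expr are ignored.
    """
    sig = [expr[g] for g in signature if g in expr]
    ctl = [expr[g] for g in controls if g in expr]
    return mean(sig) - mean(ctl)

def participant_score(cells, signature, controls):
    """Average the cell-level scores across one participant's cells."""
    return mean(module_score(c, signature, controls) for c in cells)

# Placeholder gene names and invented expression values.
signature = ["ISG_A", "ISG_B"]
controls = ["CTRL_A", "CTRL_B"]
cell_1 = {"ISG_A": 2.0, "ISG_B": 3.0, "CTRL_A": 1.0, "CTRL_B": 0.5}
cell_2 = {"ISG_A": 1.0, "ISG_B": 1.0, "CTRL_A": 1.0, "CTRL_B": 1.0}

score = module_score(cell_1, signature, controls)
per_participant = participant_score([cell_1, cell_2], signature, controls)
print(score, per_participant)
```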
CyTOF assays
PBMCs were thawed and prepared at a concentration of 1 × 10⁷ cells/mL. Next, they were cultured in RPMI-1640 medium for 6 hours at 37 °C with GolgiStop supplementation (BD Biosciences). To limit batch effects, we barcoded each sample with a combination of seven types of anti-CD45 antibodies 30 minutes before the endpoint of the culture. Cell-ID Cisplatin (Fluidigm) (2 µM) was added 15 minutes before the endpoint of the culture. All barcoded samples were then combined and stained with antibodies specific for surface markers for 30 minutes at room temperature. To normalize the data across multiple batches, we included control PBMCs (Cellular Technology Limited) in every batch. The samples were fixed with 1 mL of Maxpar Fix and Perm buffer (Fluidigm) for 30 minutes at 4 °C. Cells were stained in 1 mL of Foxp3 Fixation/Permeabilization buffer (eBioscience) with antibodies specific for intracellular cytokines and Cell-ID Intercalator-Ir (Fluidigm) for 30 minutes at room temperature. The antibodies used for CyTOF are shown in Supplementary Table 8. The samples were suspended in Cell Acquisition Solution (Fluidigm) containing 10% Four Element Calibration Beads (Fluidigm). CyTOF data were collected with a Helios CyTOF system (Fluidigm). Raw FCS data underwent bead-based normalization with CyTOF software (version 7.0.8493; Fluidigm).
Normalization and population analysis of CyTOF data
In the preprocessing step, the FCS data were debarcoded by gating based on the staining patterns of anti-CD45 antibody–conjugated metals in Cytobank (https://premium.cytobank.org/cytobank/). We used CytoNorm37 to normalize the data across multiple batches based on the combined control sample. FlowSOM clustering was used to generate 10 clusters from the control samples, and for each cluster a spline was learned from 101 computed quantiles to transform the data. The combined samples were mapped to the FlowSOM clusters and normalized using the learned splines. The newly created FCS files were analyzed in Cytobank for PBMC analysis. For each study participant, 10,532–99,958 single live cells were identified and used for further analysis. UMAP was applied to all normalized samples. Cells were manually annotated using the surface proteins listed in Supplementary Table 8.
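Quantile-based batch normalization of this kind can be illustrated for a single marker in a single cluster. The sketch below computes 101 quantiles for a batch's control sample and for a target distribution, then maps values from one to the other. It rests on several assumptions: it uses piecewise-linear interpolation in place of the spline that CytoNorm learns, it treats one marker and one cluster only, and all function names and data are hypothetical.

```python
from bisect import bisect_right

def quantiles_101(values):
    """Return the 0th to 100th percentiles (101 values) by linear interpolation."""
    xs = sorted(values)
    n = len(xs)
    out = []
    for i in range(101):
        pos = i / 100 * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        out.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return out

def make_mapper(batch_control, goal):
    """Build a function mapping batch intensities onto the goal distribution."""
    src = quantiles_101(batch_control)
    dst = quantiles_101(goal)

    def apply(x):
        # Outside the observed range, shift by the offset at the nearest end.
        if x <= src[0]:
            return dst[0] + (x - src[0])
        if x >= src[-1]:
            return dst[-1] + (x - src[-1])
        j = bisect_right(src, x) - 1          # src[j] <= x < src[j + 1]
        t = (x - src[j]) / (src[j + 1] - src[j])
        return dst[j] + t * (dst[j + 1] - dst[j])

    return apply

# Invented data: the batch control reads 1.0 unit higher than the goal.
goal = [i / 10 for i in range(201)]
batch_control = [v + 1.0 for v in goal]
correct = make_mapper(batch_control, goal)
print(correct(6.0))
```

In this toy case the mapper removes the constant offset, so a batch value of 6.0 maps to approximately 5.0.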
Differential abundance analysis of CyTOF data
We used cydar (version 1.22.0)38 to detect sets of cells that were differentially abundant between patients with MPA and healthy donors using CyTOF data. Normalized FCS files were transformed using the transformation function and used to construct hyperspheres using the countCells function (downsample = 10), with the tolerance parameter chosen so that each hypersphere contained at least 50 cells, as estimated using the neighborDistances function. Hyperspheres from monocytes or CD8+ T cells were then extracted, and UMAP was applied to aggregated data from the 15 individuals using the umap function. Enrichment of each hypersphere in patients with MPA was visualized using the ggplot function.
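The hypersphere counting step can be pictured as follows: a subset of cells serve as centers in marker-intensity space, and for each center the cells of every sample within a fixed radius are counted. This is a brute-force sketch under assumptions, not the cydar implementation: the radius is passed in directly as a hypothetical value rather than derived from the tolerance parameter, every tenth cell is taken as a center to mirror downsample = 10, and the data are invented.

```python
import math

def count_cells_in_hyperspheres(cells, radius, downsample=10):
    """Count cells per sample inside hyperspheres centered on a subset of cells.

    cells: list of (sample_id, intensity_vector) tuples.
    Every `downsample`-th cell (by list position) becomes a center.
    Returns one dict of sample -> count per hypersphere.
    """
    centers = [vec for idx, (_, vec) in enumerate(cells) if idx % downsample == 0]
    table = []
    for center in centers:
        per_sample = {}
        for sample, vec in cells:
            if math.dist(center, vec) <= radius:
                per_sample[sample] = per_sample.get(sample, 0) + 1
        table.append(per_sample)
    return table

# Invented two-marker example with three cells from two samples.
cells = [("A", (0.0, 0.0)), ("B", (0.1, 0.0)), ("A", (5.0, 5.0))]
table = count_cells_in_hyperspheres(cells, radius=0.5, downsample=2)
print(table)
```

The resulting per-sample counts for each hypersphere are the input to the differential abundance test between conditions.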
Pseudo-bulk differential gene expression analysis using scRNA-seq data
Differential gene expression analysis was performed between patients with MPA and healthy donors. Pseudo-bulk samples were first created by aggregating gene counts within each sample and normalizing by the overall counts of that sample. Genes expressed in more than 15% of cells in either patients with MPA or healthy donors were included in the analysis. P-values were calculated using Student’s t-test. To characterize the differentially expressed genes (DEGs), we performed gene set enrichment analysis using Enrichr39 on genes highly expressed in MPA (fold change > 1.5). Human Gene Atlas from BioGPS40 and Reactome 201541 were used as gene set libraries, and adjusted p-values for each pathway were calculated by the Benjamini-Hochberg method. ISGs were identified using the gene sets termed “Interferon alpha/beta signaling” and “Interferon gamma signaling” in Reactome 2015, and “Interferon Alpha Response” and “Interferon Gamma Response” in MSigDB Hallmark 2020 from GSEA42. CD14 Mono-signature genes were identified using the gene sets termed “CD14+ Monocytes” or “CD33+ Myeloid” in Human Gene Atlas, and “CD14 Monocyte” and “Monocyte” in Azimuth Cell Types 202113.
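Two of the calculations above lend themselves to a short sketch: pseudo-bulk aggregation with depth normalization, and Benjamini-Hochberg adjustment. The code is illustrative only. The counts-per-million scaling factor is an assumption, since the text states only that counts were normalized by the overall counts, and the function names and data are hypothetical. In the analysis described, the Benjamini-Hochberg adjustment applies to the pathway p-values from the enrichment step.

```python
def pseudobulk(cell_counts, cell_to_sample, scale=1e6):
    """Sum per-cell gene counts within each sample, then scale by sample depth.

    cell_counts:    cell -> {gene: count}
    cell_to_sample: cell -> sample id
    scale:          assumed counts-per-million factor
    """
    totals = {}
    for cell, counts in cell_counts.items():
        agg = totals.setdefault(cell_to_sample[cell], {})
        for gene, n in counts.items():
            agg[gene] = agg.get(gene, 0) + n
    normalized = {}
    for sample, agg in totals.items():
        depth = sum(agg.values())
        normalized[sample] = {g: scale * n / depth for g, n in agg.items()}
    return normalized

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented example: two cells from one sample, two genes.
cell_counts = {"c1": {"A": 1, "B": 3}, "c2": {"A": 1, "B": 1}}
cell_to_sample = {"c1": "S1", "c2": "S1"}
bulk = pseudobulk(cell_counts, cell_to_sample)

adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(bulk, adj)
```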
Measurement of serum interferon-alpha (IFN-α) levels
Serum IFN-α concentrations were measured using a pan–IFN-α ELISA detection kit (PBL Assay Science) on a FlexStation 3 microplate reader (Molecular Devices).