Cell lines and recombinant SARS-CoV-2
A549 and HEK293T cell lines were obtained from the American Type Culture Collection (ATCC, Bethesda, MD). All parental cell lines and their genetically modified derivatives were cultured in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% heat-inactivated fetal bovine serum (FBS; #S11150, R&D Systems, Minneapolis, MN) and 100 μg/mL Normocin (#ant-nr-1, InvivoGen, San Diego, CA). A549-hACE2 cells, which stably express human ACE2, were generated previously30 and grown in culture medium supplemented with 10 μg/mL blasticidin (#A1113903; ThermoFisher, Waltham, MA). Cells were grown at 37 °C with 5% CO2. All cell lines were authenticated by short tandem repeat fingerprinting or by expression of the tagged markers used for genetic modification. A mycoplasma detection kit (#13100-01, SouthernBiotech, Birmingham, AL) was used to routinely monitor cultured cells for mycoplasma contamination. The maximum length of in vitro culture between thawing and use in the described experiments was two weeks.
The recombinant SARS-CoV-2 was generated with a previously described reverse genetic system based on strain 2019-nCoV/USA_WA1/2020, which was derived from the first patient diagnosed in the US48. This recombinant SARS-CoV-2 was used to screen for host factors regulating the cytopathic effect (CPE). The nanoluciferase-expressing severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2-Nluc) established in our previous study30 was used to evaluate the involvement of the identified host factors in viral entry and replication. Experiments with SARS-CoV-2 and SARS-CoV-2-Nluc were performed in a BSL-3 laboratory by personnel equipped with powered air-purifying respirators. All procedures followed biosafety protocols approved by the Institutional Biosafety Committees at the University of Houston and the University of Texas Medical Branch at Galveston.
Establishment of genetically modified cell lines
Lentivirus-based gene delivery was used either to knock out genes of interest (GOIs) or to express GOIs ectopically. To generate lentiviral supernatants, HEK293T cells were seeded 16 hours prior to transfection and then transfected with lentiviral vectors encoding different gRNAs or GOIs, together with the lentiviral packaging plasmids pCMV-VSV-G and psPAX2 (#8454 and #12260, Addgene), using the jetPRIME transfection reagent (#101000046, VWR, Radnor, PA) according to the manufacturer’s protocol. Viral supernatants were collected 72 hours post-transfection and passed through a 0.45 μm PVDF syringe filter unit (#SLHV033NK, MilliporeSigma, Burlington, MA) to remove cell debris. Designated titers of lentivirus were used to infect cells in the presence of 8 μg/mL hexadimethrine bromide (#107689, Sigma-Aldrich, St Louis, MO).
To generate A549 cells expressing Cas9 for gene editing, A549-hACE2 cells were transduced with lentivirus carrying lentiCas9-EGFP (#63592, Addgene). The Cas9-expressing A549-hACE2 (A549-AC) line was established by sorting GFP+ cells 72 hours after lentiviral transduction. To knock out GOIs, lentiviral gRNA-expressing vectors were constructed. Fully synthesized double-stranded DNA (dsDNA) fragments (Twist Bioscience, San Francisco, CA) encoding gene-specific gRNAs were inserted into lentiGuide-Puro (#52963, Addgene) as previously described20. The forward sequences of the dsDNA fragments are listed in Extended Data Table 5. A vector encoding a non-targeting gRNA was also constructed to generate control cell lines. A549-AC cells were transduced with the newly constructed gRNA-expressing vectors and then selected with 1 μg/mL puromycin (#A1113803, Gibco, Carlsbad, CA) to generate stable GOI-knockout cell lines; cells transduced with the non-targeting gRNA vector served as controls. To ectopically express GOIs, dsDNA fragments encoding Flag-tagged human open reading frames (ORFs) of DAZAP2 (NM_014764.4) and VTA1 (NM_016485.5) were inserted into the lentiviral vector pLVX-IRES-ZsGreen1 (#PT4064-5, Takara Bio, San Jose, CA). A549-hACE2 cells were transduced with the lentiviral ORF vectors, and stable cell lines were generated by sorting GFP+ cells 72 hours after transduction. Cells transduced with pLVX-IRES-ZsGreen1 served as controls.
CRISPR dropout screens
The optimized human genome-wide knockout (KO) CRISPR library (H1), consisting of 92,817 gRNAs targeting 18,436 genes (5 gRNAs per gene), was purchased from Addgene (Pooled Library #1000000132) and used to generate lentivirus as described above. A total of 1.5 × 10^8 A549-AC cells were transduced with the pooled library lentivirus at a low multiplicity of infection (MOI; ~0.15-0.2) so that most transduced cells would receive only one gRNA, as previously described20. Forty-eight hours after transduction, cells were cultured in growth medium containing 1 μg/mL puromycin to select transduced cells. Seventy-two hours after puromycin selection, 30 million cells were collected as the reference sample. Seven days after puromycin selection, the pooled gRNA-expressing A549-AC cells were re-seeded in T175 cell culture flasks (10 million cells per flask). The next day, for the infection group, 2 × 10^8 pooled A549-AC cells were infected with SARS-CoV-2 at an MOI of 5 for 48 hours. Pooled A549-AC cells cultured in the same assay medium were collected and served as the control group. Forty-eight hours post-infection, non-adherent cells were removed by repeated washes with pre-warmed PBS. Adherent cells were harvested with 0.25% Trypsin-EDTA (#15050065, ThermoFisher) and washed three times with PBS. Three replicate samples were collected for each group.
Genomic DNA was extracted from all cell samples using TRIzol (#15596026; ThermoFisher) according to the manufacturer’s protocol. DNA fragments containing gRNA sequences were amplified and barcoded with adapters by nested polymerase chain reaction (PCR) as previously described20. The concentration and quality of all PCR products were determined with the Qubit ssDNA high sensitivity assay kit (#Q10212; ThermoFisher) and the Bioanalyzer High Sensitivity DNA Kit (#5067-4626 2100; Agilent, Santa Clara, CA), respectively. Samples were then sequenced by Illumina next-generation sequencing (NGS) at the MD Anderson Cancer Center Advanced Technology Genomics Core.
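The reasoning behind the low MOI can be illustrated with a short calculation. The sketch below assumes that lentiviral integrations per cell follow a Poisson distribution, a common simplification that the text does not state explicitly; the function names are hypothetical and only the cell number, MOI range, and library size come from the paragraph above.

```python
import math

def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k integrations (Poisson assumption)."""
    return math.exp(-moi) * moi**k / math.factorial(k)

def single_guide_fraction(moi: float) -> float:
    """Among transduced cells (k >= 1), the fraction carrying exactly one gRNA."""
    p_transduced = 1.0 - poisson_pmf(0, moi)
    return poisson_pmf(1, moi) / p_transduced

def library_coverage(n_cells: float, moi: float, n_guides: int) -> float:
    """Expected number of transduced cells per gRNA in the library."""
    p_transduced = 1.0 - poisson_pmf(0, moi)
    return n_cells * p_transduced / n_guides

if __name__ == "__main__":
    # Values taken from the text: 1.5 x 10^8 cells, 92,817 gRNAs, MOI 0.15-0.2.
    for moi in (0.15, 0.2):
        frac = single_guide_fraction(moi)
        cov = library_coverage(1.5e8, moi, 92_817)
        print(f"MOI={moi}: single-gRNA fraction={frac:.3f}, coverage={cov:.0f} cells/gRNA")
```

Under this assumption, roughly nine in ten transduced cells carry a single gRNA across the stated MOI range, and the starting population corresponds to a few hundred transduced cells per gRNA.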
Assays to evaluate the virus-induced cytopathic effect
A series of genetically modified A549-AC cell lines were plated in clear flat-bottom 96-well plates (10,000 cells per well in DMEM containing 2% FBS). The next day, the pre-seeded cells were infected with recombinant SARS-CoV-2 at the designated MOIs (0.5, 2.5, or 5). Forty-eight hours after infection, 4 μL of Cell Counting Kit-8 (#CCK-8, Sigma-Aldrich, St Louis, MO) was added to each well. After incubation at 37 °C for 90 min, absorbance at 450 nm was measured using a Cytation5 multi-mode microplate reader (BioTek, Winooski, VT). Relative cell viability was calculated by normalizing the absorbance of each sample to that of the control groups (set as 100%). At least two independent experiments were performed to determine the sensitivity of genetically modified cells to the virus-induced cytopathic effect. In each experiment, all groups were assayed in triplicate.
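The normalization step can be sketched as follows. This is a minimal illustration, not the authors' script: the function name, the optional blank subtraction, and the example absorbance values are assumptions, and which wells serve as the 100% control depends on the experimental design.

```python
from statistics import mean

def relative_viability(sample_od, control_od, blank_od=0.0):
    """Percent viability of each replicate relative to the mean control absorbance.

    sample_od  -- A450 readings of the replicate wells for one condition
    control_od -- A450 readings of the control wells (defined as 100%)
    blank_od   -- optional medium-only background (assumption; defaults to none)
    """
    control_mean = mean(control_od) - blank_od
    if control_mean <= 0:
        raise ValueError("control absorbance must exceed the blank")
    return [100.0 * (od - blank_od) / control_mean for od in sample_od]

if __name__ == "__main__":
    # Made-up triplicate readings, for illustration only.
    infected = [0.52, 0.48, 0.50]
    control = [1.01, 0.99, 1.00]
    values = relative_viability(infected, control)
    print([round(v, 1) for v in values], "mean:", round(mean(values), 1))
```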
Assays to evaluate viral entry and replication
A series of genetically modified A549-AC cell lines were seeded in white opaque 96-well plates (10,000 cells per well in DMEM containing 2% FBS). The next day, the pre-seeded cells were infected with the recombinant SARS-CoV-2-Nluc virus at the designated MOIs (0.02, 0.1, and 0.5). Eighteen hours after infection, cells were subjected to Nano-Glo® Dual-Luciferase® reporter assays (#N1610; Promega, Madison, WI) according to the manufacturer's instructions. Luciferase signals from all samples were measured using a Synergy™ Neo2 microplate reader. At least two independent experiments were performed to determine the sensitivity of genetically modified cells to virus-induced cytopathic effect. In each experiment, all groups were assayed in triplicate.
Assays to evaluate viral entry and attachment
A series of genetically modified A549-AC cell lines were seeded in 12-well plates (25,000 cells per well in DMEM containing 2% FBS). The next day, the pre-seeded cells were infected with SARS-CoV-2 at an MOI of 1.0. For the viral entry assay, cells were co-incubated with SARS-CoV-2 at 37 °C for 1 h. For the viral attachment assay, cells were co-incubated with the virus at 4 °C for 1 h. After co-incubation, RNA was isolated from infected cells using TRIzol and the Direct-zol RNA Miniprep Kit (#R2050; Zymo Research, Irvine, CA) according to the manufacturer's instructions. The relative expression levels of the SARS-CoV-2 nucleocapsid (N) gene were determined by quantitative real-time PCR (qRT-PCR) using the iTaq Universal One-Step RT-qPCR Kit (#1725151; Bio-Rad, Hercules, CA). PCR reactions were run in triplicate in all assays. The expression levels of ACTB were used for data normalization. The sequences of the qRT-PCR primers are listed in Extended Data Table 6.
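One common way to normalize a target transcript to a reference gene is the 2^-ΔΔCt method. The text states only that ACTB was used for normalization and does not name the calculation, so the sketch below is an assumption about how such data could be processed; the function name and the Ct values are hypothetical.

```python
from statistics import mean

def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target (e.g. viral N) versus control cells by 2^-ddCt.

    Each argument is a list of technical-replicate Ct values; the reference
    gene (e.g. ACTB) is used to normalize input RNA within each sample.
    """
    delta_sample = mean(ct_target) - mean(ct_ref)          # dCt in modified cells
    delta_control = mean(ct_target_ctrl) - mean(ct_ref_ctrl)  # dCt in control cells
    return 2.0 ** -(delta_sample - delta_control)

if __name__ == "__main__":
    # Made-up Ct triplicates, for illustration only.
    fold = relative_expression(
        ct_target=[22.1, 22.0, 21.9], ct_ref=[18.0, 18.1, 17.9],
        ct_target_ctrl=[20.0, 20.1, 19.9], ct_ref_ctrl=[18.0, 18.0, 18.0],
    )
    print(f"N RNA relative to control cells: {fold:.2f}")
```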
Immunoblot analysis
To verify the expression of GOIs, proteins were extracted by lysing cells in RIPA Lysis and Extraction Buffer (#89900; ThermoFisher), and the concentrations of the protein samples were quantified with the Pierce BCA Protein Assay Kit (#23225; ThermoFisher). Western blot analysis was used to determine the expression of the proteins of interest. Protein bands were detected with Immobilon Western Chemiluminescent HRP Substrate (#WBKLS0500; MilliporeSigma, Burlington, MA) using the ChemiDoc Imaging System. The antibody targeting β-actin (8H10D10, #3700) was purchased from Cell Signaling Technology (Danvers, MA), human ACE2 (AF933) from R&D Systems, DAZAP2 (G-4, sc-515182) from Santa Cruz Biotechnology (Dallas, TX), KLF5 (21017-1-AP) from Proteintech Group (Rosemont, IL), and the monoclonal ANTI-FLAG antibody (M2, #F3165) from MilliporeSigma.
Bioinformatic analysis
The MAGeCK (v0.5.9.4) count module was used to calculate the read counts of individual sgRNAs in the different samples with the following parameters: “-l human_sgrna_sequences_A.library --control-sgrna human_sgrna_sequences_A.library.negctrl --norm-method control --sample-label C-1,C-2,C-3,control1,control2,control3,Ref-1 -n COVID-19_CRISPR_210212.count --fastq files.fq”. The MAGeCK test module was then applied with the parameters “-k COVID-19_CRISPR_210212.count.txt -c control1,control2,control3 -t C-1,C-2,C-3 --norm-method control --keep-tmp -n COVID-19_CRISPR_210212_C_control --control-sgrna human_sgrna_sequences_A.library.negctrl --gene-lfc-method secondbest” to identify genes that showed significant differential selection (|log2(fold-change)| ≥ 0.5 and p < 0.05) between the control and SARS-CoV-2 infection groups.
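The thresholding step can be sketched as a filter over the gene-level summary table that the MAGeCK test module writes. The column names used here (id, neg|lfc, neg|p-value, pos|lfc, pos|p-value) follow the MAGeCK gene summary format; the text does not say which p-value was paired with which direction, so using the negative-selection p-value for depleted genes and the positive-selection p-value for enriched genes is an assumption. The function name and the toy rows are hypothetical.

```python
import csv
import io

def significant_genes(gene_summary, lfc_cutoff=0.5, p_cutoff=0.05):
    """Split genes into depleted and enriched sets from a MAGeCK gene summary.

    gene_summary -- an open text handle on a tab-separated gene_summary file
    Returns (depleted, enriched) lists of gene identifiers.
    """
    depleted, enriched = [], []
    for row in csv.DictReader(gene_summary, delimiter="\t"):
        if float(row["neg|lfc"]) <= -lfc_cutoff and float(row["neg|p-value"]) < p_cutoff:
            depleted.append(row["id"])
        elif float(row["pos|lfc"]) >= lfc_cutoff and float(row["pos|p-value"]) < p_cutoff:
            enriched.append(row["id"])
    return depleted, enriched

if __name__ == "__main__":
    # Toy table with placeholder gene names; real input would be the
    # *.gene_summary.txt file produced by "mageck test".
    toy = io.StringIO(
        "id\tneg|p-value\tneg|lfc\tpos|p-value\tpos|lfc\n"
        "GENE_A\t0.001\t-1.2\t0.99\t-1.2\n"
        "GENE_B\t0.9\t0.8\t0.01\t0.8\n"
        "GENE_C\t0.4\t0.1\t0.6\t0.1\n"
    )
    print(significant_genes(toy))
```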
GWAS and interactome analysis
The COVID-19 GWAS meta-analysis results (release 6)49 for “Hospitalized covid vs. population” and “Very severe respiratory confirmed covid vs. population” were downloaded from the COVID19 Host Genetics Initiative (https://www.covid19hg.org/). The CRISPR screen hits located within ±10 kb of SNPs reaching a significance level of p < 0.001 in the GWAS meta-analysis were then identified as candidate genes associated with the “Hospitalized” and “Critically-ill” conditions.
We also performed an integrated analysis of the host-host and host-viral interactomes, based on protein-protein interaction (PPI) data from the BioGRID database (Release 4.4.205)22. Briefly, we first constructed a human-human and human-SARS-CoV-2 PPI (CoV-2_HsPPI) network containing the human-human or human-viral PPIs supported by at least two independent experiments, which resulted in a total of 114,366 interactions covering 13,716 human proteins and 30 SARS-CoV-2 proteins. By selecting the CRISPR screen hits and the SARS-CoV-2 proteins as the seed nodes, a sub-network was then constructed as the CRISPR screen hit-related host-host and host-viral PPI network.
In addition to the PPIs, we integrated four host-viral protein-RNA interactome (RPI) datasets50-53 that characterize the interactions between human proteins and SARS-CoV-2 RNAs to construct a “SARS-CoV-2 RNAs – human proteins” (CoV-2_HsRPI) network that included 452 nodes and 706 edges. By selecting the CRISPR screen hits as the seed nodes, a sub-network was then constructed as the CRISPR screen hit-related RPI network.
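The positional overlap described above can be sketched as an interval check. The sketch assumes that the window is applied to the gene body (gene start minus 10 kb to gene end plus 10 kb), which is one reasonable reading of the text; the function name, data layout, gene names, and coordinates are all hypothetical.

```python
def genes_near_snps(genes, snps, window=10_000, p_cutoff=1e-3):
    """Return screen hits lying within `window` bp of a significant GWAS SNP.

    genes -- dict mapping gene name to (chromosome, start, end)
    snps  -- iterable of (chromosome, position, p_value)
    """
    significant = [(chrom, pos) for chrom, pos, p in snps if p < p_cutoff]
    hits = set()
    for name, (chrom, start, end) in genes.items():
        for snp_chrom, pos in significant:
            if snp_chrom == chrom and start - window <= pos <= end + window:
                hits.add(name)
                break  # one supporting SNP is enough
    return sorted(hits)

if __name__ == "__main__":
    # Placeholder genes and SNPs, for illustration only.
    genes = {
        "GENE_A": ("1", 100_000, 120_000),
        "GENE_B": ("1", 500_000, 510_000),
        "GENE_C": ("2", 100_000, 120_000),
    }
    snps = [("1", 125_000, 1e-5), ("1", 505_000, 0.2), ("2", 200_000, 1e-6)]
    print(genes_near_snps(genes, snps))
```

For genome-scale summary statistics, the nested loop would normally be replaced by an interval index or a sorted sweep, but the selection logic stays the same.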
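The two network steps, filtering by evidence count and extracting a seed-based sub-network, can be sketched in plain Python. Two points are assumptions rather than details given in the text: that each input record represents one independent experiment reporting a pair, and that the sub-network keeps interactions whose endpoints are both seeds (the `require_both` flag switches to keeping any interaction that touches a seed). All names and records are hypothetical.

```python
from collections import Counter

def build_network(interaction_records, min_evidence=2):
    """Keep undirected protein pairs reported by at least `min_evidence` records."""
    counts = Counter(
        frozenset((a, b)) for a, b in interaction_records if a != b
    )
    return {pair for pair, n in counts.items() if n >= min_evidence}

def seed_subnetwork(edges, seeds, require_both=True):
    """Extract the part of the network anchored on the seed nodes."""
    seeds = set(seeds)
    if require_both:
        return {edge for edge in edges if edge <= seeds}  # both endpoints are seeds
    return {edge for edge in edges if edge & seeds}       # at least one seed endpoint

if __name__ == "__main__":
    # Placeholder records: (interactor A, interactor B), one per experiment.
    records = [
        ("HIT1", "nsp1"), ("nsp1", "HIT1"),
        ("HIT1", "OTHER"), ("HIT1", "OTHER"),
        ("HIT2", "nsp1"),  # single report, dropped by the evidence filter
    ]
    network = build_network(records)
    sub = seed_subnetwork(network, seeds={"HIT1", "HIT2", "nsp1"})
    print(sorted(sorted(edge) for edge in sub))
```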
scRNAseq analysis
scRNA-seq data of airway epithelial cells were extracted from the dataset generated by Wauters et al54, which includes 65,166 cells from 35 pneumonia patients, 22 of whom tested positive for SARS-CoV-2 and 13 of whom were infected by other pathogens. Of these patients, 14 experienced mild symptoms and 21 were severe cases. Data analysis was performed as described previously by Baggen et al9. Briefly, based on the source of the cells, epithelial cells were divided into four groups: non COVID19_mild, non COVID19_severe, COVID19_mild, and COVID19_severe. To test whether candidates identified in our screen were differentially expressed among epithelial cells from the different groups, we used the Kruskal–Wallis test to determine whether any group differed significantly from another; if so, the Wilcoxon rank-sum test was used as a post-hoc test to identify the pairs of groups that differed significantly.
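The two-stage testing procedure can be sketched with SciPy. This is an illustration under stated assumptions rather than the analysis code: `scipy.stats.mannwhitneyu` is used for the pairwise step because the Mann-Whitney U test is equivalent to the Wilcoxon rank-sum test, no multiple-testing correction is applied because none is mentioned in the text, and the function name and expression values are hypothetical.

```python
from itertools import combinations

from scipy.stats import kruskal, mannwhitneyu

def compare_groups(expr_by_group, alpha=0.05):
    """Kruskal-Wallis across all groups, then pairwise rank-sum tests if significant.

    expr_by_group -- dict mapping group label to a list of per-cell expression values
    Returns (global p-value, {(group1, group2): pairwise p-value}).
    """
    _, p_global = kruskal(*expr_by_group.values())
    pairwise = {}
    if p_global < alpha:
        for g1, g2 in combinations(expr_by_group, 2):
            _, p = mannwhitneyu(
                expr_by_group[g1], expr_by_group[g2], alternative="two-sided"
            )
            pairwise[(g1, g2)] = p
    return p_global, pairwise

if __name__ == "__main__":
    # Made-up expression values for one gene in the four groups.
    groups = {
        "non COVID19_mild": [1, 2, 3, 4, 5, 6, 7, 8],
        "non COVID19_severe": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5],
        "COVID19_mild": [11, 12, 13, 14, 15, 16, 17, 18],
        "COVID19_severe": [21, 22, 23, 24, 25, 26, 27, 28],
    }
    p_global, pairwise = compare_groups(groups)
    print(f"Kruskal-Wallis p = {p_global:.2e}")
    for pair, p in pairwise.items():
        print(pair, f"p = {p:.2e}")
```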
Statistical analyses
Summary statistics (e.g., mean, SEM) of the data are reported. Differences in continuous measurements between two groups were assessed using the two-sample t-test. Multiple-group comparisons were performed by analysis of variance (ANOVA) with repeated measures. A p-value of less than 0.05 was considered significant. Graph generation and statistical analyses were performed using Prism (GraphPad Software), Tableau 8.2 (Tableau Software), and the R programming language (version 3.1.0). The sample size for each experiment was chosen based on feasibility, given the exploratory nature of the study.
Data availability
Raw data files from the genome-wide CRISPR dropout screen are accessible from GEO (GSE209750).