CRISPR Cas9 Mediated Generation of Bex2 KO Mouse Model and Transcriptome Analysis of the Brain

Noor Bahadar, China Japan Union Hospital, Jilin University
Rajiv Kumar Sah, Northeast Normal University
Salah Adlat, Northeast Normal University
Hanif Ullah, Guangxi Medical University
Zinmar Oo, Northeast Normal University
Fatoumata Binta Bah, Northeast Normal University
Farooq Hayel, Northeast Normal University
Muhammad Umar, Shenzhen University
Xuechao Feng, Northeast Normal University
Yaowu Zheng, Northeast Normal University
Minru Zong (zongmr@jlu.edu.cn), China Japan Union Hospital, Jilin University
May Zun Zaw Myint, Northeast Normal University


Introduction
To date, five BEX genes have been identified in humans (BEX1-5), five in chimpanzees (BEX1-5), five in mice (Bex1-4 and Bex6), and four in rats (Bex1-4). According to the phylogenetic grouping, both mouse and rat lack the Bex5 member of the Bex gene family. Bex5 is predicted to have been lost in the murid lineage after it diverged from other mammals. All Bex family members are located on the X chromosome except Bex6, which lies on chromosome 16 of the murine genome (1). These genes show high sequence homology and are predominantly expressed in the brain (2). A notable feature of the BEX genes is their high expression in the mouse brain; they also account for more than 12% of the rat brain's expressed sequence tags (1,2). BEX proteins play a role in transcriptional regulation and in signalling pathways involved in neurodegeneration, the cell cycle, and tumour growth (3)(4)(5)(6).
Researchers have reported the involvement of BEX2 in various cancer types, such as glioblastoma, glioma, and breast cancer (5)(6)(7). BEX2 has been characterized in glioma development (8) and is vital to the tumorigenesis of cells with activated mTOR (9). BEX2 has a similar pro-survival function in LNT-229 glioma cells, and downregulation of BEX2 sensitizes LNT-229 cells to cell death mediated by a dominant-positive variant of p53 (10). The role of BEX2 in oncogenesis is supported by the fact that its downregulation impairs neo-angiogenesis and cell migration in oligodendroglioma (11). Because the expression pattern of BEX2 varies across tumour types, the evidence about its role in multiple cancers is conflicting. BEX2 expression in glioblastoma is very high and promotes the proliferation and survival of glioblastoma cells through NF-κB signalling (4). BEX2 is also involved in cell movement and invasion in glioma (5).
RNA sequencing (RNA-seq) is now more widely used than DNA microarray analysis for predicting new transcripts and for profiling the expression pattern of a particular gene across tissues or developmental stages (12). In this study, a Bex2 KO mouse model was generated using CRISPR Cas9 technology, and transcriptome analysis of the brain was performed. Several essential pathways and KEGG diseases, such as TNDM, non-syndromic X-linked mental retardation, neurodegeneration due to cerebral folate transport deficiency, and schizencephaly, were identified.

Animals
The Institutional Animal Care Committee and Animal Experimental Ethics Committee of Northeast Normal University (NENU/IACUC) approved the study (approval number NENU/IACUC, AP2018011). All NIH (USA) recommendations for the use of laboratory animals were strictly followed. Mice were kept in IVC cages (5-6 per cage) on a 12/12-hour light-dark cycle in a pathogen-free environment with free access to food and water. The temperature was maintained at 21 °C with 30-60% humidity. For every procedure, the mice were anaesthetized with 1% pentobarbital sodium at a dose of 10 mg/kg.

Designing of sgRNAs and construction of plasmid vector
The CRISPR Cas9 system relies on complementarity between a 20-nucleotide guide sequence and the target DNA (13). These twenty nucleotides form the targeting portion of the single-guide RNA (sgRNA). In the target DNA, the 20 nucleotides must be followed by a three-nucleotide motif, NGG, where "N" can be any nucleotide, termed the Protospacer Adjacent Motif (PAM) (14). The Benchling tool (https://www.benchling.com) was used to design the sgRNAs. Two sgRNAs were chosen to delete the coding exon. The plasmid pX330-U6-Chimeric_BB-CBh-hSpCas9 was purchased from Addgene (Addgene plasmid #42230). The sgRNAs were ligated into the linearized plasmid vector according to the laboratory's protocols. The designed sgRNAs and their positions on the genomic locus are shown in Table 1.
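The 20-nucleotide-plus-NGG rule described above can be illustrated with a short sketch. This is not the Benchling algorithm: the function name `find_sgrna_candidates` and the demo sequence are hypothetical, only the forward strand is scanned, and no off-target scoring is attempted.

```python
import re

def find_sgrna_candidates(seq: str, guide_len: int = 20):
    """Return (start, protospacer, pam) for each guide_len-mer followed by NGG.

    Assumption: forward strand only; a real design tool would also scan the
    reverse complement and rank candidates by predicted off-target activity.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Hypothetical demo sequence (not taken from the Bex2 locus).
demo = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
candidates = find_sgrna_candidates(demo)
for start, protospacer, pam in candidates:
    print(start, protospacer, pam)
```

In the demo sequence only the first 20-mer is directly followed by an NGG motif, so a single candidate is reported.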

Microinjection and genotype
The necessary preparations were completed before microinjection of the vector into oocytes of the C57BL/6J mouse strain. The plasmid vector was injected directly into the pronuclei of each oocyte. The injected oocytes were then transferred directly to the ovary of a prepared pseudo-pregnant female. An Olympus IX71 inverted microscope and a Narishige microinjector were used. Microinjection was carried out by the laboratory's technician. Bex2-specific primers targeting the regions upstream and downstream of the sgRNAs were designed using the NCBI Primer-BLAST tool (https://www.ncbi.nlm.nih.gov/tools/primer-blast/). When the pups were two weeks old, they were labelled by clipping their digits, and the clipped tissue was used as the template for genotyping. PCR was carried out according to the prescribed protocol. The PCR conditions were: initial denaturation at 95°C for 4 minutes; 32 cycles of amplification at 94°C for 30 seconds, 56°C for 30 seconds and 72°C for 45 seconds; a final extension at 72°C for 10 minutes; and cooling to 4°C. The PCR product was analyzed by 1% agarose gel electrophoresis for up to one hour. A GelDoc system was used to image the gels (15).
The transgenic littermates were separated accordingly. The chimaeras were obtained by crossing the littermates.

RNA extraction, synthesis of cDNA and qRT-PCR
Total RNA was extracted from brain tissues of Bex2 KO and WT male mice (n=3) using Trizol reagent (Takara, Dalian, China). The extracted RNA was converted to cDNA using a reverse transcription kit (Takara, Dalian). An Analytik Jena (Germany) system was used for RT-qPCR. Quantitative real-time PCR (qPCR) was conducted in triplicate with SYBR green mix (Takara, Dalian, China). All results were normalized according to the previously reported method (16).

RNA Extraction and Library Preparation for Transcriptome analysis
Total RNA from the brains of Bex2 KO and wild-type mice (n=3 biological replicates per genotype, 6 in total) was extracted using RNAiso Plus reagent (Takara, Dalian) following the manufacturer's instructions, followed by an additional DNase I digestion step to eliminate genomic DNA contamination. The sample size and the selection of animals were decided according to published protocols (17).
Male homozygous KO and pure WT mice were selected for further experiments since the Bex2 gene is X-linked. RNA quality and purity were checked on a NanoDrop™ One spectrophotometer (ThermoFisher Scientific, USA) and an Agilent 2100 Bioanalyzer (Santa Clara, USA). One µg of RNA per sample was used for the RNA-seq library. RNA-seq was performed on the BGISEQ-500 platform. Adaptor sequences and low-quality reads were removed from the data sets, converting the raw sequences into clean reads. Clean reads were mapped to the mouse genome (GCF_000001635.26_GRCm38.p6) using the Bowtie2 tool (18). The mRNA-seq raw data were deposited at the NCBI Sequence Read Archive under BioProject accession number PRJNA792087 (BioProject ID 792087).

GO and KEGG Analysis of the differentially expressed genes
All the DEGs were mapped to the Gene Ontology database. GO enrichment analysis was performed with the phyper function in R. Briefly, all DEGs were first mapped to each term in the Gene Ontology database (19). The number of genes in each term was counted, and the hypergeometric test was applied to find GO terms significantly enriched in DEGs compared with the background of all genes in the reference species. Bonferroni correction was used for p-value adjustment (20). GO terms with a Q value (corrected p-value) < 0.05 were defined as significantly enriched in DEGs. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway classification was performed by mapping all the DEGs to the KEGG pathway database (21). The phyper function in R was likewise used for the pathway enrichment analysis (16). Pathways with a Q value ≤ 0.05 were considered significantly enriched in differentially expressed genes (22).
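As a rough illustration of the enrichment test described above, the sketch below computes the upper tail of the hypergeometric distribution and applies a Bonferroni adjustment. It is written in Python rather than R, the function names are hypothetical, and the exact arguments the authors passed to phyper are not given in the text. The conventional R equivalent of `hypergeom_upper_tail(k, K, N, n)` would be `phyper(k - 1, K, N - K, n, lower.tail = FALSE)`, which is assumed here.

```python
from math import comb

def hypergeom_upper_tail(k: int, K: int, N: int, n: int) -> float:
    """P(X >= k): probability of seeing at least k term genes among n DEGs,
    when K of the N background genes are annotated to the term."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

# Toy numbers chosen for illustration only: 10 background genes, 4 in the term,
# 3 DEGs, 2 of which fall in the term.
p_toy = hypergeom_upper_tail(k=2, K=4, N=10, n=3)
q_toy = bonferroni([p_toy, 0.5, 0.001])
print(p_toy, q_toy)
```

A term would then be called enriched when its adjusted value falls below the 0.05 cut-off stated in the text.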

Statistical Analysis
Results are expressed as means ± standard error of the mean (SEM). A P-value < 0.05 (unpaired Student's t-test) was considered statistically significant. All graphics were prepared with GraphPad Prism 8 for Mac (GraphPad Software).
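For readers who want to reproduce the summary statistics outside GraphPad Prism, a minimal sketch follows. The function names and the example values are hypothetical, and the classical equal-variance form of the unpaired Student's t-test is assumed. Only the t statistic and degrees of freedom are computed; the p-value would come from the t distribution (for example via `scipy.stats.ttest_ind`).

```python
from math import sqrt
from statistics import mean, variance

def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample variance."""
    return mean(values), sqrt(variance(values) / len(values))

def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical relative-expression values, three replicates per group.
wt = [1.0, 2.0, 3.0]
ko = [2.0, 3.0, 4.0]
print(mean_sem(wt), mean_sem(ko), student_t(wt, ko))
```

With three replicates per group there are 4 degrees of freedom, so the two-sided 5% critical value is about 2.776; an absolute t below that would not reach P < 0.05.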

Strategy of Bex2 −/− mouse generation & Screening of mutants
Bex2 is a small gene of only 1672 bp. The third exon contains the entire coding region. To inactivate the gene, sgRNA_I was designed in front of the start codon, while sgRNA_II was targeted upstream of the stop codon. To screen for the fragment deletion, primer pairs were designed flanking the sgRNA targets (Figure 1A). Nine pups were obtained 19 days after the zygote transfer. Fifteen days after birth, the pups were weaned and their digits were clipped for labelling. The same biopsies were used for genotyping, following standard protocols. Two of the nine mice were found to be transgenic for Bex2, as shown in Figure 1B. The deleted (knocked-out) fragment was confirmed by ligating the PCR product into a pMD18 simple vector (Takara, China); the chromatogram is shown (Figure 1C). These two mice were mated with wild-type C57BL/6J mice to segregate the alleles. F2 progeny were used as experimental animals. The Bex2 expression level was confirmed by qPCR in the brain and lungs (Figure 1D).

Gene expression profile of Bex2 −/− and WT brain
The cDNA libraries were constructed from brain mRNA of 7-week-old Bex2 −/− and WT male mice in three replicates (n=3). The BGISEQ platform was used for RNA-seq. Total raw reads for Bex2 −/− and WT were 23.75 M for each sample. After adapter sequences and low-quality reads were filtered out, each sample yielded an average of 1.19 Gb of data (23.74 M clean reads), with Q30 base percentages of 93.13% and 92.97% and Q20 percentages of 98.10% and 98.03%, respectively (data not shown). The clean reads were then aligned to the mouse reference genome (GCF_000001635.26_GRCm38.p6), and the mapping efficiency of the clean reads was determined using Bowtie2 (23). Transcript expression levels were calculated with RSEM (24).

Identification of differentially expressed genes
Genes whose expression differs significantly between KO and WT mice are termed differentially expressed genes (DEGs). Differentially expressed genes were identified accordingly (25). The selection criteria for DEGs were an absolute log2 fold change > 1 and an FDR threshold of 0.001. By these criteria, a total of 93 genes were identified as differentially expressed between Bex2 −/− and WT, of which 57 were up-regulated and 36 were down-regulated (Figure 2).
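The selection rule can be written as a small filter. The sketch below assumes the fold-change cut-off applies to the absolute value (both up- and down-regulated genes are reported) and that the FDR boundary is inclusive; neither detail is stated explicitly in the text. The log2 fold changes for Tmsb15l, Six3 and Rnaset2b are those reported in this study, but the FDR values and the gene "GeneX" are invented placeholders for illustration.

```python
from typing import NamedTuple

class GeneResult(NamedTuple):
    gene: str
    log2fc: float
    fdr: float

def select_degs(results, lfc_cutoff=1.0, fdr_cutoff=0.001):
    """Split results into (up, down) lists of genes passing both cut-offs."""
    passing = [r for r in results if r.fdr <= fdr_cutoff]  # assumed inclusive bound
    up = [r.gene for r in passing if r.log2fc > lfc_cutoff]
    down = [r.gene for r in passing if r.log2fc < -lfc_cutoff]
    return up, down

# FDR values below are placeholders, not results from the study.
results = [
    GeneResult("Tmsb15l", 8.29, 1e-6),
    GeneResult("Six3", 2.98, 1e-4),
    GeneResult("Rnaset2b", -9.0, 1e-6),
    GeneResult("GeneX", 0.4, 1e-6),   # hypothetical: fold change too small
    GeneResult("GeneY", 3.0, 0.02),   # hypothetical: FDR too high
]
up, down = select_degs(results)
print(up, down)
```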

GO enrichment analysis
Gene Ontology (GO) is a standard gene function classification system that comprehensively describes the attributes of genes and gene products in organisms. GO is divided into three categories: biological process, cellular component and molecular function (26). All the identified DEGs were annotated with GO terms (Fig. 3A, 3B, 3C). The most significant GO terms enriched under Bex2 −/− regulation among Biological Processes were cellular process, biological regulation, regulation of biological process, metabolic process, multicellular organismal process, response to stimulus and signalling, etc. In the Cellular Component category, cell, cell part, organelle, organelle part, protein-containing complex, synapse, synapse part, etc. were enriched. In Molecular Function, binding, catalytic activity, molecular function, molecular transducer activity, etc. were enriched.

KEGG analysis identifies many processes under Bex2 −/− regulation
KEGG pathway analysis helps to understand the biological function of gene networks. KEGG pathway enrichment analysis was performed to find the pathways significantly enriched in DEGs relative to the entire genome background (Figure 4). The most significantly enriched pathways were cell adhesion molecules (CAMs), neuroactive ligand-receptor interaction, antigen processing and presentation, cAMP signalling pathway, cellular senescence, calcium signalling pathway, etc.

KEGG Analysis of Disease-associated pathways
The identified DEGs were subjected to KEGG disease enrichment analysis (Figure 5). The KEGG diseases enriched under Bex2 −/− regulation were transient neonatal diabetes mellitus (TNDM), postaxial polydactyly, non-syndromic X-linked mental retardation, congenital lactase deficiency, and neurodegeneration due to cerebral folate transport deficiency, among others. The genes involved in Postaxial polydactyly, Non-syndromic X-linked mental retardation.

Genes Encoding Transcription Factor Proteins
Transcription factors (TFs) are essential regulatory proteins that play a role in multiple biological processes. Multiple TFs were identified in this study. The TFs enriched under Bex2 −/− regulation were HMGA self-build, Otx1 transcription factor, zf-C4 self-build TFs, SAND DNA-binding protein domain, zinc fingers, and Homeobox transcription factor genes (Figure 6) (Figure 7). Otx proteins constitute a class of vertebrate homeodomain-containing transcription factors essential for anterior head formation, including nervous system development. The SAND DNA-binding protein domain is localized in the cell nucleus and has a vital function in chromatin-dependent transcriptional control. It is found solely in eukaryotes. Six3 (log2 FC = 2.98) was identified among the Homeobox transcription factors, which are vital in nervous system development (Figure 6).

Discussion
RNA-guided genome manipulation based on the type II prokaryotic CRISPR/Cas system has been used successfully to generate transgenic mouse models (27)(28)(29)(30)(31). In this study, a Bex2 KO mouse model was generated using the CRISPR Cas9 gene modification system. The gene was selected for its importance in multiple physiological functions in the brain. Deletion of the fragment was confirmed by Sanger sequencing and, subsequently, at the mRNA level by RT-qPCR. A notable feature of the BEX genes is their high expression in the mouse brain; they also account for more than 12% of the rat brain's expressed sequence tags (1,2). A brain transcriptome analysis was therefore conducted using RNA-seq. Genes whose expression differs significantly between KO and WT mice are termed differentially expressed genes (DEGs), and they were identified accordingly (25). The selection criteria were an absolute log2 fold change > 1 and an FDR threshold of 0.001. By these criteria, a total of 93 genes were identified as differentially expressed between Bex2 −/− and WT, of which 57 were up-regulated and 36 were down-regulated. The most up-regulated gene was Tmsb15l (log2 FC = 8.29); other researchers also found it up-regulated in fibroblasts of E14 using microarray analysis (32). H2-Q8 (log2 FC = 6.62) has likewise been found up-regulated by other researchers using microarray analysis (33). Gvin1 (log2 FC = 5.67) has been identified upon IFN treatment of neuron cultures (34). Tgtp2 (log2 FC = 5.45) is up-regulated by lncRNA-Tcam1 during spermatogenesis (35); in contrast, other researchers reported the opposite direction, with Tgtp2 down-regulated in Mcpt6 KO mice (36). For Serpina9 (log2 FC = 3.3), the SNP rs11628722 in the SERPINA9 gene was previously associated with incident ischemic stroke in the Atherosclerosis Risk in Communities (ARIC) study, as reviewed in (37).
Other researchers found that SNPs in the SERPINA9 gene showed race-specific associations with characteristics of carotid atherosclerotic plaques (38). One team suggests that, along with other genes, Serpina9 might be associated with Alzheimer's disease neuropathology in the neurodegenerative process (39). Six3 had a log2 FC of 2.98. Ectopic expression of Six3 in chicks has shown that Six3 is a direct negative regulator of Wnt1 expression (40). For Cldn2 (log2 FC = 2.3), CLDN2 expression was found to significantly inhibit the malignant phenotype of OS cells in vitro. Rnaset2b was the most down-regulated gene (log2 FC = -9.0). An inflammatory disease, Aicardi-Goutières syndrome, is related to mutation of this gene (41). Another team evaluated two patients with RNASEH2B mutations (42). Rex2 (log2 FC = -4.90) has been identified by one group of researchers as a lesser-studied gene (43). Mss51 (log2 FC = -3.23) is a metabolism-related gene (44,45). CD300lf (log2 FC = -2.7) is the primary physiologic receptor of murine norovirus (46). Stab2 (log2 FC = -2.39) has been associated with venous thromboembolic disease (47).
The most significant GO terms enriched under Bex2 −/− regulation among Biological Processes were cellular process, biological regulation, regulation of biological process, metabolic process, multicellular organismal process, response to stimulus and signalling, etc. In the Cellular Component category, cell, cell part, organelle, organelle part, protein-containing complex, synapse, synapse part, etc. were enriched. In Molecular Function, binding, catalytic activity, molecular function, molecular transducer activity, etc. were enriched. In the KEGG classification under Bex2 −/− regulation, several DEGs were annotated to multiple pathways from different categories. KEGG pathway enrichment analysis was then performed to find the pathways significantly enriched in DEGs relative to the entire genome background. The most significantly enriched pathways were cell adhesion molecules (CAMs), neuroactive ligand-receptor interaction, antigen processing and presentation, cAMP signalling pathway, cellular senescence, calcium signalling pathway, etc. The identified DEGs were also subjected to KEGG disease enrichment analysis. The KEGG diseases enriched under Bex2 −/− regulation were transient neonatal diabetes mellitus (TNDM), postaxial polydactyly, non-syndromic X-linked mental retardation, congenital lactase deficiency, and neurodegeneration due to cerebral folate transport deficiency, among others. X-linked mental retardation (XLMR) is an inherited disorder in which intellectual development is impaired owing to alterations in numerous genes on the X chromosome. XLMR is subdivided into syndromic and non-syndromic (NS-XLMR) types, depending on whether further abnormalities are found on physical examination, laboratory analysis and brain imaging.

Declarations
Ethics approval and consent to participate. The Institutional Animal Care Committee and Animal Experimental Ethics Committee of Northeast Normal University (NENU/IACUC) approved the study (approval number NENU/IACUC, AP2018011). All NIH (USA) recommendations for the use of laboratory animals were strictly followed. Mice were kept in IVC cages (5-6 per cage) on a 12/12-hour light-dark cycle in a pathogen-free environment with free access to food and water. The temperature was maintained at 21 °C with 30-60% humidity. For every procedure, the mice were anaesthetized with 1% pentobarbital sodium at a dose of 10 mg/kg. We confirm that all methods are reported in accordance with the ARRIVE guidelines (https://arriveguidelines.org) for the reporting of animal experiments.

Consent for publication. N/A
Availability of data and materials. The RNA-seq raw data are available at the NCBI SRA under accession numbers SRR17327319 and SRR17327320.
Competing interests. The authors declare no competing interests.
Funding. This research is supported by the College-Enterprise collaboration Project (2017YX244) and the Natural Science Foundation of Jilin Province (20200201127JC). The funders had no role in the design of the study or in its analysis.

KEGG Pathway enrichment of Bex2 −/−. The X-axis is the enrichment ratio (calculated as: Term candidate gene number/Term gene number). The Y-axis is the KEGG pathway. The size of the circle represents the number of genes annotated to the KEGG pathway. The colour represents the enrichment significance.