mRNA Analysis Identifies Deep Intronic Splicing Variants Leading to Alport Syndrome and Overcomes the Problem of Negative Results of Exome Sequencing

Mutations in COL4A3, COL4A4 and COL4A5 genes lead to Alport syndrome (AS). However, pathogenic variants in some AS patients are not detected by exome sequencing. The aim of this study was to identify the underlying genetic causes in five unrelated AS probands with negative next-generation sequencing (NGS) test results. Urine COL4A3–5 mRNAs were analyzed in the probands with an uncertain mode of inheritance, and COL4A5 mRNA from skin fibroblasts was analyzed in the probands with X-linked AS. RT-PCR and direct sequencing were performed to detect mRNA abnormalities. PCR and direct sequencing were used to analyze the exons with flanking intronic sequences corresponding to the mRNA abnormalities. Nine novel deep intronic splicing variants in COL4A4 and COL4A5 genes that cannot be captured by exome sequencing were identified in the five AS probands. Skipping of an exon was caused by four intronic variants, and retention of an intron fragment led to the remaining variant. Our results reveal that mRNA analysis for AS genes from either urine or skin fibroblasts can resolve genetic diagnosis in AS patients with negative NGS results. We recommend analyzing COL4A3–5 mRNA from urine as the first choice for these patients because it is feasible and non-invasive.
Corresponding to the cDNA anomalies, exons 2–4, 23, and 25–27 with flanking introns 2–3 and 25–26 were amplified by PCR from genomic DNA of proband 1 and her parents.
Sequence analysis [...] proband 1 [...] heterozygous variants intron [...] c.72 − [...], intron [...] c.1623 + 570A > G, intron [...] c.1804–158A [...] Proband 1 [...] 23 on splicing, COL4A4 mRNA [...] 1's [...]. Sequencing of the fragment 4 RT-PCR product showed that the father also had a heterozygous insertion of a 109 bp sequence of intron 23 between exon 23 and exon 24 (r.1623_r.1624 ins[1623 + 590_1623 + 698]) (data not shown), which indicated that the variant c.1623 + 702T > A located in intron [...] pathogenic.


Introduction
Alport syndrome (AS) is a hereditary nephritis characterized by hematuria, proteinuria, and progressive renal failure, and is sometimes accompanied by sensorineural deafness and ocular abnormalities 1 . The three genetic forms of Alport syndrome are distinguished by mode of inheritance: X-linked AS (XLAS), autosomal recessive AS (ARAS), and autosomal dominant AS (ADAS) 2,3 . XLAS is caused by pathogenic variants in the COL4A5 gene, while ARAS and ADAS are caused by pathogenic variants in the COL4A3 or COL4A4 gene 4 . Pathogenic variants in COL4A3, COL4A4 or COL4A5 genes lead to abnormal α3, α4 or α5 chains of type IV collagen in the glomerular basement membrane (GBM). The gold standard for clinical diagnosis of AS is the characteristic changes in GBM, including irregular thickening, splitting and "basket-weave" changes, seen under an electron microscope 5 . Genetic testing for pathogenic variants in COL4A3, COL4A4, or COL4A5 genes is now frequently used in the diagnosis of AS because of the increasing clinical availability of next-generation sequencing (NGS), including targeted NGS and whole exome sequencing 6 . NGS detects approximately 82-86% of pathogenic COL4A3-5 gene variants 6,7 . However, some genetic changes that cause AS, such as deep intronic splicing variants, somatic mosaicism, and copy number variants, are not detectable by NGS 8–10 .
mRNA sequencing is an effective method to identify intronic splicing variants. Several COL4A5 gene deep intronic splicing variants have been reported in studies that analyzed mRNA from skin fibroblasts 11 , peripheral blood lymphocytes 12 , hair roots 13 , or renal tissue 14 . In our clinical practice, since 2000, simultaneous examinations of α5(IV) staining in skin and COL4A5 mutation screening using mRNA extracted from cultured skin fibroblasts have been routinely performed in patients with suspected XLAS. However, this approach cannot be applied to Alport syndrome patients with autosomal inheritance patterns, since α3(IV) and α4(IV) are not expressed in skin. Two deep intronic variants in the COL4A3 gene were identified by analysis of mRNA from blood or urine 15 . However, to our knowledge, the value of detecting mutations in COL4A5 and COL4A4 genes in mRNA isolated from urine has not been adequately studied.
The aim of this study was to identify the genetic etiologies of five unrelated AS patients with negative NGS results. We used our approach for analysis of the entire coding regions of COL4A3, COL4A4, and COL4A5 mRNAs isolated from urine and COL4A5 mRNA extracted from cultured skin fibroblasts, and identified deep intronic splicing variants in the enrolled patients. These findings indicate that this approach may help guide medical practitioners and genetic counselors in providing personalized management of AS.

Clinical features of AS patients
Five unrelated Alport syndrome probands were enrolled according to the inclusion and exclusion criteria listed in the Materials and Methods. Patient clinical information and pedigrees are shown in Table 1 and Fig. 1, respectively. Patient 1 was diagnosed with AS based on characteristic AS features in GBM; the inheritance pattern was uncertain because of a negative family history and normal staining of α5(IV) chain in skin tissue. Patient 2 was highly suspected of having XLAS based on a positive family history of end stage renal disease; however, normal staining of α5(IV) chain in skin tissue did not support the diagnosis. XLAS was diagnosed in patients 3-5 based on abnormal staining of α5(IV) chain in skin specimens.

Gene mutations in the five probands
In proband 1, cDNA analysis detected no abnormal transcripts in COL4A3 and COL4A5 mRNAs isolated from the proband's urine (Suppl.1 and 2). However, agarose gel electrophoresis revealed a smaller COL4A4 mRNA transcript (fragment 1; Fig. [...]).
In proband 2, cDNA analysis detected no abnormal transcripts in COL4A3 and COL4A4 mRNAs isolated from the proband's urine (Suppl.3 and 4). Agarose gel electrophoresis revealed a smaller COL4A5 mRNA transcript (fragment 5, Fig. 3A) in proband 2's urine. Sequencing of 10 RT-PCR products revealed that exon 32 of the COL4A5 gene was skipped heterozygously (r.2678_r.2767del), which led to an in-frame deletion (p.Thr894_Gly923del) (Fig. 3B). COL4A5 exons 31-33 with the sequences of the flanking introns 31-32 were amplified by PCR from genomic DNA of proband 2, her husband, and her daughter and then sequenced. Proband 2 and her daughter were heterozygous for the intron 31 variants c.2677 + 487C > A and c.2677 + 646C > T (Fig. 3C and D). Neither of the two variants was identified in her husband.
In proband 3, agarose gel electrophoresis of RT-PCR fragment 2 products of COL4A5 mRNA from skin fibroblasts showed an abnormal transcript in addition to the wild-type transcript (Fig. 4A). Sequencing of 10 RT-PCR products revealed that a 128 bp sequence from intron 10 was inserted between exon 10 and exon 11 (r.609_r.610 ins[609 + 751_609 + 878]) (Fig. 4B), which led to premature termination of the α5(IV) chain (p.Gly204Valfs*7). COL4A5 intron 10 was amplified by PCR from genomic DNA of proband 3 and his mother. Sequencing revealed an A to G change in intron 10 at 879 bp downstream from exon 10 (IVS10 c.609 + 879A > G) (Fig. 4C) in proband 3 and his mother.
Neither of the two variants was identified in his mother.
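The reading-frame consequences reported above follow from the lengths of the deleted or inserted segments. The following is a minimal sketch of that arithmetic, using the coordinates quoted in the text; the function names are hypothetical and are not part of the authors' pipeline.

```python
def segment_length(first: int, last: int) -> int:
    """Length of an inclusive coordinate range (same numbering system)."""
    return last - first + 1


def is_in_frame(length: int) -> bool:
    """A change preserves the reading frame when its length is a multiple of 3."""
    return length % 3 == 0


# Proband 2: exon 32 skipping, r.2678_r.2767del
exon32_len = segment_length(2678, 2767)

# Proband 3: intron 10 fragment inserted, offsets +751 to +878 after position 609
intron10_len = segment_length(751, 878)

# Residues covered by p.Thr894_Gly923del
deleted_residues = segment_length(894, 923)

print(exon32_len, is_in_frame(exon32_len))      # 90 True
print(intron10_len, is_in_frame(intron10_len))  # 128 False
print(deleted_residues)                         # 30
```

The 90-nucleotide deletion is a multiple of three, consistent with the reported in-frame loss of 30 residues, whereas the 128 bp insertion is not, consistent with the reported frameshift and premature termination.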

In silico prediction of deep intronic pathogenic variants
Table 2 shows the output of HSF, NNSPLICE, and NetGene2 for each deep intronic splice variant identified in this study. Only two of the nine variants were predicted as deleterious by all three tools, and six variants were detected by only one tool; no effect was predicted for the COL4A5 intron 31 variant c.2677 + 646C > T. Because this variant and the variant c.2677 + 487C > A in the same intron segregated with AS in proband 2's family, and because not all available in silico prediction tools were used, it was difficult to consider this variant neutral.

Discussion
In this study, by analyzing COL4A3-5 mRNAs from urine or skin fibroblasts, nine deep intronic pathogenic variants were identified in five unrelated Alport syndrome patients with negative NGS results. These findings indicate that our approach may be applied to help provide personalized evaluation and care of patients and their families. In addition, this is the first report of compound heterozygous deep intronic splicing mutations in the COL4A4 gene in an Alport syndrome patient.
Numerous studies have shown that NGS is effective in finding single nucleotide variations and small indels in exons and the flanking intronic regions 16 . However, some genetic events such as deep intronic variants, copy number variants, and somatic cell mosaicism may be missed by NGS 17 . Therefore, for a patient with clinically diagnosed or suspected AS and no pathogenic variants detected by NGS, it is necessary to further analyze COL4A3-5 genes by mRNA sequencing, chromosome microarray analysis, droplet digital PCR or other approaches to improve genetic diagnosis 18,19 .
According to the literature and public databases (Human Gene Mutation Database and Leiden Open source DNA Variation Database), splicing variants account for 14.9-24.5% of pathogenic variants in the COL4A5 gene 20,21 . Approximately 70.4% (112/159) occurred at consensus splice sites, and only seven splicing variants occurred in introns more than 100 base pairs upstream or downstream of exon-intron junctions. Approximately 70% (23/32) of the pathogenic COL4A3 splicing variants occurred at consensus splice sites, and only two variants were located in introns more than 100 base pairs upstream of the exons. No deep intronic COL4A4 splicing variants have been reported to date. These findings indicate that deep intronic COL4A3-5 mutations are rare. The nine novel deep intronic pathogenic variants obtained in the present study extend the mutational spectrum of AS. These findings also highlight COL4A3-5 mRNA analysis as an effective supplementary approach to NGS in the molecular diagnosis of this disease.
Previous studies have reported that GBM collagen α3α4α5(IV) is synthesized solely by podocytes 22 , and that the urine podocyte detachment rate (assessed by podocin mRNA in urine pellets) is increased in AS patients 23,24 . Therefore, extraction of RNA directly from patient urine may be a valuable approach to the analysis of variants in all three Alport genes, as demonstrated by the findings of the present study. In addition, compared with the method that extracts RNA from urine-derived podocyte-lineage cells, our approach of isolating RNA directly from urine is simpler and more practical. A weak point of our approach is the requirement for patient cooperation to obtain enough fresh urine, which means that young patients who cannot rapidly drink 1000-1500 ml of water are not suitable for urine mRNA analysis.
Given that the deep intronic variants identified in the present study could be detected using whole genome sequencing, and that in silico splicing prediction tools are usually used in a molecular diagnostic setting to select variants predicted to affect splicing 25 , we assessed the reliability of HSF, NNSPLICE, and NetGene2 in discriminating between neutral and pathogenic variants. If a prediction from at least one tool that agreed with the transcript analysis is counted as correct, eight of the nine variants detected in this study were correctly predicted, which indicates that these tools are useful for selecting deep intronic variants that are likely to be worth RNA analysis. However, extensive in silico analysis should be compared with transcript analysis results to determine its benefit in the context of molecular diagnosis.
In summary, three novel pathogenic COL4A4 variants and six novel pathogenic COL4A5 splicing variants were detected in five unrelated AS patients with negative NGS test results. All identified variants were deep intronic variants. As obtaining urine is feasible and non-invasive, we suggest analyzing COL4A3-5 mRNA from urine as the preferred method for evaluation of patients with clinically diagnosed or suspected AS with negative NGS results.

Materials And Methods
All methods were carried out in accordance with relevant guidelines and regulations.

Ethical Considerations
The Ethical Committee of Peking University First Hospital approved the procedures in this study. Patients were excluded if informed consent was not obtained from either themselves or their parents. In total, five patients with AS were included in this study.

Patients
Analysis of COL4A3-5 mRNA from urine
For AS patients with an uncertain inheritance pattern, COL4A3-5 mRNAs from urine were analyzed. When available, RNA from parents was sequenced to assess the segregation of variants with the disease in the respective families. To obtain fresh urine, patients were asked to drink approximately 1000-1500 ml of water rapidly after emptying the bladder and to void spontaneously every 30-45 min. Approximately 500 ml of urine per patient was collected and aliquoted into 50 ml centrifuge tubes pre-treated with RNAlater (Qiagen, 145023696). Urine samples were centrifuged for 5 min (1200 rpm at 4°C), and the supernatants were carefully removed using pipettes. The urinary pellets were washed twice with ice-cold PBS supplemented with RNAlater (1 ml RNAlater per 50 ml PBS), and the samples were centrifuged for 5 min (1200 rpm at 4°C). Total RNA was isolated from urinary pellets using TRIzol reagent (Gibco, Grand Island, NY, USA) according to the manufacturer's instructions. The concentration of RNA was quantified with a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Reverse transcription was performed using the RevertAid First Strand cDNA Synthesis Kit (TAKARA, K1622). Ten pairs of PCR primers were designed for COL4A3 (NM_000091.5), COL4A4 (NM_000092.5), and COL4A5 (NM_000495.5); the sequences are listed in Table 3. The 'Touchdown' PCR procedure included annealing from 64°C to 57°C, descending 1°C every two cycles, followed by annealing at 57°C for 26 cycles. The PCR amplification products were checked by 2% agarose gel electrophoresis and sequenced on an ABI 3730XL (SinoGenoMax Company Limited, China). The reference sequences for the COL4A3, COL4A4, and COL4A5 transcripts were NM_000091.5, NM_000092.5, and NM_000495.5, respectively.
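The touchdown annealing profile described above can be written out cycle by cycle. The sketch below is one reading of that description, assuming the touchdown phase runs two cycles at each temperature from 64°C down to 58°C and that the 26 cycles at 57°C follow; the function name and parameters are hypothetical, and the text does not state whether 57°C is also part of the stepping phase.

```python
def touchdown_schedule(start_c=64, final_c=57, step_c=1,
                       cycles_per_step=2, final_cycles=26):
    """Return the annealing temperature (in °C) for each PCR cycle.

    Assumption: the stepping phase stops one step above `final_c`,
    and `final_cycles` cycles at `final_c` follow.
    """
    schedule = []
    temp = start_c
    while temp > final_c:
        schedule.extend([temp] * cycles_per_step)  # touchdown phase
        temp -= step_c
    schedule.extend([final_c] * final_cycles)      # fixed-temperature phase
    return schedule


annealing = touchdown_schedule()
print(len(annealing))   # total number of cycles under this assumption
print(annealing[:16])   # first cycles, showing the stepwise descent
```

Under this reading the program has 14 touchdown cycles plus 26 cycles at 57°C; if 57°C were also counted in the stepping phase, the total would differ by two cycles.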

Analysis of COL4A5 mRNA from skin fibroblasts
For patients with XLAS, COL4A5 mRNA from cultured skin fibroblasts was analyzed. Dermal fibroblasts were cultured as described previously 11 . COL4A5 cDNA analyses were performed using the same primers shown in Table 3. RT-PCR and direct sequencing followed the above protocol.

Genomic DNA analysis
Genomic DNA was extracted from peripheral blood lymphocytes. Once abnormal COL4A3-5 transcripts were detected, the corresponding exons with flanking intronic sequences were further analyzed using PCR and direct sequencing to identify the point variants that may create new splice sites. PCR primers are available on request.

In silico splice tools for identifying deep intronic pathogenic variants
To evaluate the reliability of in silico splicing prediction tools in discriminating the deep intronic pathogenic variants identified in this study, three tools were used: HSF (http://www.umd.be/HSF/), NNSPLICE (http://www.fruitfly.org/seq_tools/splice.html) and NetGene2 (http://www.cbs.dtu.dk/services/NetGene2/). NNSPLICE and NetGene2 report scores of 0-1 for each predicted site; the higher the score, the more likely the site is a splice site.
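One simple way to read such 0-1 scores is to compare the score at the candidate site in the wild-type and variant sequences. The sketch below illustrates that comparison only; the 0.5 cutoff, the class and function names, and the example numbers are all assumptions for illustration and are not values from Table 2 or defaults of either tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteScore:
    """Score (0-1) one tool assigns to the same candidate site in each sequence."""
    tool: str
    wild_type: float
    variant: float


def gains_site(score: SiteScore, threshold: float = 0.5) -> bool:
    """True if the variant raises the site above wild type and to the cutoff.

    The 0.5 threshold is an illustrative assumption, not a tool default.
    """
    return score.variant >= threshold and score.variant > score.wild_type


def supporting_tools(scores, threshold=0.5):
    """Names of the tools whose scores suggest a newly created splice site."""
    return [s.tool for s in scores if gains_site(s, threshold)]


# Placeholder numbers for illustration only; not taken from this study.
example = [
    SiteScore("NNSPLICE", wild_type=0.05, variant=0.91),
    SiteScore("NetGene2", wild_type=0.00, variant=0.32),
]
print(supporting_tools(example))
```

A tally of this kind only helps prioritize variants for RNA analysis; as noted in the Discussion, predictions still need to be checked against transcript results.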