Next Generation Exome Sequencing of Paediatric Asthma Identifies Rare and Novel Variants in Candidate Genes


Background: Association studies have implicated multiple genes in asthma predisposition. Paediatric patients often manifest more extensive disease and a particularly severe disease course, so genetic predisposition is likely to play a more substantial role in this group. This study aims to identify the spectrum of rare and novel variation in known paediatric asthma susceptibility genes using whole exome sequencing analysis in nine individual cases of childhood-onset allergic asthma. Data were processed through an analytical pipeline to align sequence reads, conduct quality checks, and identify and annotate variants where the patient sequence differed from the reference sequence.
Results: DNA samples from nine children with a history of bronchial asthma diagnosis underwent targeted exome capture and sequencing. For each patient, the entire complement of rare variation within strongly associated candidate genes was catalogued. The analysis identified 21 variants in the subjects, of which 13 had been previously identified and 8 were novel. Nineteen of the variants were non-synonymous and 2 were nonsense. Among the novel variants, 2 non-synonymous variants in the PRKG1 gene (PRKG1: p.C519W and PRKG1: p.G520W) were present in 4 cases, and a non-synonymous variant in the MAVS gene (MAVS: p.A45V) was identified in 3 cases. The variants found in this study will enrich the variant spectrum and build up the database for the Saudi population. The eight novel variants identified provide further evidence of genetic susceptibility to asthma among Saudi children.
Conclusion: A cohort of Saudi children was screened for the molecular identification of polymorphisms associated with the allergic asthmatic response. These polymorphisms, together with the clinical phenotypes, revealed genetic determinants for paediatric asthma, which we compared with similar previous reports. The study provides a genetic screening map for the molecular genetic determinants of allergic disease in Saudi children, with the goal of reducing the impact of chronic diseases on health and the economy. We believe that the advanced statistical filtration and annotation programs used in this study succeeded in producing these results in a preliminary study exploring the genetic map of the disease in Saudi children.


Background
Asthma and other allergic diseases, including allergic rhinitis, eczema and food allergy, cause a substantial burden of disease in childhood. Although a rapid increase in asthma and allergies was identified over the latter part of the 20th century, the reasons for this are still unknown [1]. Recent changes in environmental factors and their interactions with genetic profiles have been suggested as major factors responsible for the increase in asthma and allergic diseases [2].
Asthma, a chronic inflammatory respiratory condition characterized by hyper-responsive airways and reversible airflow obstruction, is a substantial public health problem that affects nearly 155 million individuals worldwide, and the prevalence of current asthma is higher in children than in adults [3][4][5]. Although environmental factors are important, there are strong genetic predispositions for the development of allergic diseases. More than 100 candidate genes, distributed across every chromosome, have been reported to show linkage with asthma, and the strength of association of single nucleotide polymorphisms (SNPs) in these genes with asthma varies in different parts of the world [6][7][8].
A better knowledge of asthma susceptibility will hold promise for a better understanding of the pathology, diagnosis, prevention, treatment and management of this increasingly frequent disease.
Next generation sequencing (NGS) technology, in particular exome sequencing, currently represents the most powerful and cost effective approach to identifying variation in the human genome and has already been shown to uncover important disease-causing variation missed by GWAS studies [9][10][11][12][13][14][15][16]. Previous studies have implicated rare variants in asthma and asthma-related traits for a number of relevant genes [17,18], suggesting that rare variants may indeed play an important role in asthma susceptibility.
Chromosome 17q21 was the first asthma susceptibility locus discovered by genome-wide association studies (GWAS) [13,17,19,20,21,22]. However, none of the genes within the locus had previously been implicated in asthma pathogenesis. SNPs in 17q21 showing highly significant associations with childhood asthma correlate with the expression of ORMDL3 transcripts, suggesting that ORMDL3 was a plausible asthma candidate gene in the locus [1]. Later, allele-specific gene expression was also observed for other genes in the 17q21 locus [23].
Despite these successes, no definitive causal variants have been identified in any of these genes. It is asserted that the associated variants are in linkage disequilibrium (LD) with the causal variants in these genes, but more effort must be made to identify causal variants so that the biology of these genes in the aetiology of asthma can be better understood.
Although GWAS have succeeded in identifying genetic variants associated with complex diseases, and more than 1000 studies conducted in the past few years have examined the genetic complexity of many immune diseases including asthma, this approach still explains only part of the heritability of asthma with regard to the clinical prediction of phenotypes and immunological pathways [24].
This study aims to reveal the genetic determinants of paediatric asthma in Saudi Arabia using whole exome sequencing and to review the results against similar international studies.

Recruitment of paediatric asthma cohort of patients
Seventy-nine children were initially selected for this study from the paediatric allergy and immunology clinic, Maternity Children Hospital, Makkah, KSA, between January 2014 and October 2015. All children had been diagnosed with asthma after clinical evaluation together with physiological assessment, including pulmonary function tests (PFT), and the International Study of Asthma and Allergies in Childhood (ISAAC) questionnaire (modified for the population). PFT was not considered for young children who were unable to perform the test properly. The children were aged between 5 and 14 years at the time of recruitment. Written informed consent was obtained from the attending parents of all the children.
In the initial recruitment interview, clinical data and venous blood samples (3 ml of whole blood for CBC and DNA extraction and 3 ml for plasma separation) were collected.
Additional comprehensive clinical data were extracted from the children's medical records with consent. For each patient, the information gathered included gender, dates of birth and initial diagnosis, laboratory investigations, physiological assessment, disease history, parental history of any allergic and/or other autoimmune diseases, medication history (use of steroids, immunomodulators, biological therapies), and history of exposure to potential allergens such as carpets, plants and/or animals. Only four non-asthmatic children (three male (75%) and one female) were recruited and sequenced in the study as a control group.

Selection of samples for whole exome analysis
Asthmatic patients were recruited based on the Global Initiative for Asthma (GINA) guidelines: a history of respiratory symptoms such as shortness of breath, chest tightness, wheezing and coughing that vary in intensity over time, together with variable expiratory airflow limitation. Nine patient samples were selected to represent the asthma cohort for exome sequencing based on change to final diagnosis (Table 1).

DNA extraction
Genomic DNA was extracted from EDTA-anticoagulated peripheral venous blood samples by a spin-column method using the Applied Biosystems DNA extraction kit. All genomic DNA was quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific).

Selection of a panel of known asthma target genes
All studies in the GWAS catalogue with "asthma" as a phenotype or keyword up to 28 March 2017 were reviewed; studies identifying asthma susceptibility variants that included paediatric subjects were identified and the significant loci retrieved. Sixteen additional genes from four studies were then added following manual review of the literature. The studies contributing to the candidate gene list are given in additional Table 3. In total, twenty loci encompassing 131 potential candidate genes were identified, of which 110 genes were covered in this analysis.
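The selection logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual procedure: the record layout (keys `trait`, `published`, `paediatric`, `genes`), the function name `select_panel_genes`, the publication dates and every gene name other than ORMDL3 are hypothetical.

```python
from datetime import date

# Cut-off date quoted in the text for the catalogue review.
CUTOFF = date(2017, 3, 28)


def select_panel_genes(records, manual_genes, cutoff=CUTOFF):
    """Return the set of candidate genes for the panel.

    `records` is a list of dicts with the hypothetical keys
    'trait', 'published', 'paediatric' and 'genes'.
    """
    genes = set()
    for rec in records:
        mentions_asthma = "asthma" in rec["trait"].lower()
        in_window = rec["published"] <= cutoff
        # Keep only asthma studies, within the review window,
        # that included paediatric subjects.
        if mentions_asthma and in_window and rec["paediatric"]:
            genes.update(rec["genes"])
    # Genes added after manual literature review are merged in last.
    return genes | set(manual_genes)


# Toy records; dates are invented and gene names other than
# ORMDL3 are placeholders.
records = [
    {"trait": "Childhood asthma", "published": date(2010, 1, 1),
     "paediatric": True, "genes": ["ORMDL3", "GENE_A"]},
    {"trait": "Asthma", "published": date(2018, 1, 1),
     "paediatric": True, "genes": ["GENE_B"]},    # after the cut-off
    {"trait": "Asthma", "published": date(2012, 1, 1),
     "paediatric": False, "genes": ["GENE_C"]},   # no paediatric subjects
]
panel = select_panel_genes(records, manual_genes=["GENE_D"])
print(sorted(panel))  # ['GENE_A', 'GENE_D', 'ORMDL3']
```

Using a set makes the merge with the manually reviewed genes idempotent, so a gene reported by several studies is counted once in the panel.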

Results
VerifyBamID freemix values for eight samples were below the 0.02 threshold and did not indicate substantial levels of contamination. One sample (subject 9) showed evidence of contamination and was excluded from further analysis. The minimum mean read depth was 47.7X and the minimum coverage at 10X depth was 86.6% across the eight samples (additional Table 2).
In total, 21 variants were identified in the 8 subjects following filtering for potential functionality, of which 13 had previously been identified in dbSNP and 8 were novel (Table 2). Nineteen of the 21 variants were non-synonymous and 2 were nonsense (stop gain). Amongst the 13 variants in dbSNP, 2 have been previously studied. The variant rs3923647 was reported to be associated with increased production of the Th1 cytokines IFN-γ and IL-2 following BCG vaccination [37]. The variant rs16889462 lies in the SLC30A8 gene, which has been reported to have several functions in type 2 diabetes (T2D) pathophysiology, including a lowering of T2D risk when SLC30A8 gene activity is reduced [38]. With regard to the novel variants, the 2 non-synonymous variants in the PRKG1 gene (54041969 and 54041970) were present in 4 cases (1, 3, 4 and 5), and a non-synonymous variant in the MAVS gene (3842992) was identified in 3 cases (4, 5, and 9).
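The contamination check described above amounts to a simple threshold filter followed by a summary over the retained samples. The sketch below is illustrative only: the `SampleQC` class, the `summarise` helper and the sample values are hypothetical, and only the 0.02 freemix threshold and the two reported minima (47.7X, 86.6%) come from the text.

```python
from dataclasses import dataclass

# Contamination threshold quoted in the text.
FREEMIX_THRESHOLD = 0.02


@dataclass
class SampleQC:
    sample_id: str
    freemix: float     # VerifyBamID freemix estimate
    mean_depth: float  # mean read depth over the target region
    pct_10x: float     # % of target bases covered at >= 10X


def passes_contamination_check(qc, threshold=FREEMIX_THRESHOLD):
    """A sample passes when its freemix value is below the threshold."""
    return qc.freemix < threshold


def summarise(samples):
    """Split samples by the contamination check and report the
    minimum depth and 10X coverage among the retained samples."""
    kept = [s for s in samples if passes_contamination_check(s)]
    excluded = [s for s in samples if not passes_contamination_check(s)]
    return {
        "kept": [s.sample_id for s in kept],
        "excluded": [s.sample_id for s in excluded],
        "min_mean_depth": min(s.mean_depth for s in kept),
        "min_pct_10x": min(s.pct_10x for s in kept),
    }


# Invented per-sample values for illustration only.
samples = [
    SampleQC("S1", 0.004, 47.7, 86.6),
    SampleQC("S2", 0.010, 60.2, 91.3),
    SampleQC("S9", 0.050, 55.0, 90.0),  # above threshold, excluded
]
summary = summarise(samples)
print(summary["kept"], summary["excluded"])
```

Computing the minima only over retained samples mirrors the way the text reports depth and coverage "across the eight samples" after exclusion.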

Discussion
This report describes the results of a pilot study of exome sequencing in Saudi children diagnosed with asthma. Exome sequencing of eight children identified 21 potentially deleterious SNPs in known asthma genes, comprising eight novel variants and 13 variants previously deposited in dbSNP.
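The split into known and novel variants summarised above can be pictured as a two-step filter: restrict calls to panel genes with a potentially functional consequence, then separate them by whether a dbSNP identifier is attached. The following is a minimal sketch under assumptions, not the pipeline used in the study: the dictionary keys, the consequence labels and the function name `filter_and_split` are hypothetical, and the toy calls are not the study's data.

```python
# Consequence classes treated as potentially functional (assumed labels).
FUNCTIONAL = {"nonsynonymous", "stopgain"}


def filter_and_split(variants, panel_genes):
    """Keep functional variants in panel genes, then split them
    into dbSNP-known and novel lists."""
    kept = [
        v for v in variants
        if v["gene"] in panel_genes and v["consequence"] in FUNCTIONAL
    ]
    known = [v for v in kept if v.get("dbsnp_id")]
    novel = [v for v in kept if not v.get("dbsnp_id")]
    return known, novel


# Toy calls for illustration; OTHER_GENE is a placeholder.
variants = [
    {"gene": "TLR1", "consequence": "nonsynonymous", "dbsnp_id": "rs3923647"},
    {"gene": "PRKG1", "consequence": "nonsynonymous", "dbsnp_id": None},
    {"gene": "PRKG1", "consequence": "synonymous", "dbsnp_id": None},
    {"gene": "OTHER_GENE", "consequence": "stopgain", "dbsnp_id": None},
]
known, novel = filter_and_split(variants, panel_genes={"TLR1", "PRKG1"})
print(len(known), len(novel))  # 1 1
```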
In this study, the genetic association was not replicated in the selected samples because the study was not designed as a family-based study. Our study was undertaken to define, for the first time in KSA, the genetic variants most strongly associated with the development of asthma in children in the Makkah region through a large asthma GWAS cohort study, to recognise the possible genetic causes implicated in asthma, and perhaps also to suggest related biological pathways that play a role in its pathogenesis. Using advanced filtration and annotation programs, the total set of variants was reduced to only 28 candidate genetic variants on 10 loci associated with childhood-onset disease.
On chromosome 1, three genetic variants were identified in three different samples (cases 3, 4 and 7), two of them novel. The previously reported variant, rs147827524, lies in the pyrin and HIN domain family member 1 gene (PYHIN1). PYHIN1 belongs to the HIN200 protein family, primarily nuclear proteins involved in the transcriptional regulation of genes important for cell cycle control, differentiation and apoptosis, and it has been reported to account for a surprisingly large proportion of asthma risk in people of African descent [39].
The two novel SNPs discovered, rs152488110 and rs180651520, lie in the cysteine-rich C-terminal 1 gene (CRCT1) and the xenotropic and polytropic retrovirus receptor 1 gene (XPR1), respectively. The CRCT1 SNP is located near the end of the exonic region of the gene, among many reported non-synonymous SNPs; hence, we suppose that it has no significant effect on protein function. XPR1, however, has been reported to play a role in modulating human airway smooth muscle (ASM) contraction, cell growth and pro-inflammatory cytokine production, which promote bronchoconstriction, airway inflammation and remodelling in asthma [40].
On chromosome 2, only one variant was identified, the previously reported rs184451758 in the tensin 1 gene (TNS-1), a gene essential for myofibroblast differentiation and extracellular matrix formation. Polymorphism in this gene has been reported to be significantly associated with chronic obstructive pulmonary disease (COPD) risk [41].
On chromosome 4, three polymorphic variants were identified. One, rs3923647, was previously reported and lies in the Toll-like receptor 1 gene (TLR1), polymorphisms of which seem to play a role in susceptibility to asthma, atopic eczema and allergic rhinitis [42]. The other two, rs745975778 and rs61529635, lie in the same gene, synaptopodin 2 (SYNPO2), which was reported to be associated with total serum IgE in asthmatics in an independent GWAS, suggesting a role for this gene in asthma [43].
On chromosome 6, two adjacent novel variants were identified, rs32946010 and rs32946011, both located in exon 8 (of 12 exons) of the same transcription factor gene, bromodomain (BRD2), whose protein has been shown to modulate transcription, in particular cell cycle-induced transcriptional activation. BRD2 protein inhibition has recently been reported to attenuate neutrophil-dominant allergic airway disease in mouse models [44].
On chromosome 8, one non-synonymous polymorphic variant was identified, the previously reported rs16889462, in the gene encoding one of the zinc efflux transporters, solute carrier family 30 member 8 (SLC30A8), which has been classified as one of the major components providing zinc for insulin maturation and/or storage processes in insulin-secreting pancreatic β-cells. Five genome-wide association studies (GWAS) identified the SLC30A8 polymorphism rs13266634 among Asian and European but not African populations [45].
On chromosome 9, only one previously reported polymorphism was identified, rs35642290, in the gene encoding the cytoskeletal protein talin 1 (TLN-1), which is considered one of the genes that might be associated with total IgE in asthmatics [46].
On chromosome 10, surprisingly, three variants in the same gene were identified: one previously reported SNP and two sequential novel variants, rs54041969 and rs54041970, which were identified in 50% of samples (cases 1, 3, 4 and 5) and are located in exon 14 (of 18 exons reported for that gene). All three variants lie in the protein kinase cGMP-dependent 1 gene (PRKG1). All PRKG protein isoforms act as key mediators of the nitric oxide/cGMP signalling pathway and are important components of many signal transduction processes in diverse cell types [47]. Asthma has been reported to be typically associated with high levels of exhaled nitric oxide (NO), which reduce the normal levels of S-nitrosothiols, which act as bronchodilators in the airway [47].
On chromosome 12, one previously reported SNP, rs747186265, was identified in only one sample (case 6). It lies in the otogelin-like gene, which has been reported to be expressed in the inner ear of vertebrates, with the highest level of expression at the embryonic stage. No significant ear complications were recorded for that child in our study. We suppose that this variant is not significantly involved in the pathophysiology of asthma development and that it does not affect protein structure and function even though it is classified as non-synonymous; further studies of the effects of this particular SNP are recommended.
On chromosome 19, four polymorphisms were identified, including two previously reported SNPs, rs142299823 and rs74939505, in the zinc finger protein family genes ZNF30 and ZNF154, respectively, which are involved in DNA-binding transcription factor activity. Additionally, two novel variants, one non-synonymous (rs7267725) and one stop-gain (rs7267726), were identified, both in two children (cases 4 and 5). Both variants lie in the insulin receptor (INSR) gene region, which was suggestively associated with asthma risk [48].
On chromosome 20, only one previously reported non-synonymous variant, rs779234123, was identified, in three samples (cases 4, 5 and 8). It lies in the mitochondrial antiviral-signalling gene (MAVS), which expresses a protein required for protein kinase activity that is essential for gene expression. Impaired antiviral interferon expression may be involved in the asthma exacerbations commonly caused by rhinovirus infections in asthmatic patients [49].
Numerous genetic and molecular studies of asthma have been carried out previously in both children and adults [50][51][52][53][54][55][56][57][58][59][60]. Among genetic studies, exome and next generation sequencing studies have been documented across global populations [61,62]. The confirmed observations, together with the genetic conclusions, support the existence of genetic susceptibility factors in asthma patients. It is possible that both novel and documented variants, genetic and non-genetic, play a major role in Saudi asthma patients. However, the limited number of patients and the lack of screening for the novel variants are limitations of this study.

Conclusions
In conclusion, this is the first exome sequencing study carried out in Saudi children diagnosed with asthma. Based on earlier studies and the results of this study, we assume that genetic variants might play a role in increased susceptibility to the development of asthma. The other variants found in this study cannot be disregarded, considering the high number of loci and their specific genetic roles in the disease in the global population. Future studies should screen more patients for the novel variants within the Saudi population to rule out their role in asthma.

Consent for publication
All members of the study team are pleased to submit an original research article entitled "Next Generation Exome Sequencing of Paediatric Asthma Disease Identifies Rare and Novel Variants in Candidate Genes" for consideration for publication in the journal BMC Genetics.

Availability of data and materials
All data generated or analysed during this study are included in this published article and its additional information files:
1. Additional file: Table 1: Mean depth, coverage at 10X depth across the target region and VerifyBamID freemix values for each sample.
2. Additional file: Table 2: Gene coverage for 107 genes from the asthma gene panel.
3. Additional file: Table 3: List of selected genes associated with asthma.

Competing interests
All investigators have NO affiliations with or involvement in any organization or entity with any financial interest or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.