Nexus Between Genome-Wide Copy Number Variations and Autism Spectrum Disorder in Northeast Han Chinese Population


Background: Autism spectrum disorder (ASD) is a common neurodevelopmental condition with an increasing prevalence worldwide. Copy number variation (CNV) is one of the genetic factors involved in ASD etiology. However, the location and frequency of some CNVs differ substantially in the general Asian population. Whole-genome studies of CNVs in Northeast Han Chinese samples are still lacking, which motivated us to investigate the characteristics of CNVs in a Northeast Han Chinese population with clinically diagnosed ASD.
Methods: We performed genome-wide CNV screening in Northeast Han Chinese individuals with ASD using array-based comparative genomic hybridization.
Results: We found that 22 kinds of CNVs (six deletions and 16 duplications) were potentially pathogenic. These CNVs were distributed across chromosomal regions 1p36.33, 1p36.31, 1q42.13, 2p23.1-p22.3, 5p15.33, 5p15.33-p15.2, 7p22.3, 7p22.3-p22.2, 7q22.1-q22.2, 10q23.2-q23.31, 10q26.2-q26.3, 11p15.5, 11q25, 12p12.1-p11.23, 14q11.2, 15q13.3, 16p13.3, 16q21, 22q13.31-q13.33, and Xq12-q13.1. Additionally, we found 20 potential pathogenic genes of ASD in our population, including eight protein-coding genes (six duplications [DRD4, HRAS, OPHN1, SHANK3, SLC6A3, and TSC2] and two deletions [CHRNA7 and PTEN]) and 12 microRNA genes (ten duplications [MIR202, MIR210, MIR3178, MIR339, MIR4516, MIR4717, MIR483, MIR675, MIR6821, and MIR940] and two deletions [MIR107 and MIR558]).
Limitations: The sample size of our study may confer limited statistical power to detect significant findings. The CNVs could not be classified as de novo or inherited because parental data were lacking.
Conclusions: We identified CNVs and genes implicated in ASD risk, providing insight that may help to further elucidate ASD etiology.


Background
Autism spectrum disorder (ASD) is a common neurodevelopmental condition with an increasing prevalence worldwide [1,2]. Persons with ASD manifest a wide range of symptoms and severity in perception and socialization with others, such as limited and repetitive patterns of behavior. Both genetic and environmental factors are involved in the pathogenesis of ASD. Environmental factors, including viral infections, medications during pregnancy, and air pollutants, may contribute to ASD [3]. Compared with environmental factors, genetic factors appear to be a prerequisite for ASD development: genetic changes (mutations) may increase ASD risk, and genes such as CHD8 [4], CNTNAP2 [5], DCC [6], neurexin genes [7], SHANK1 [8], SHANK2 [9], SHANK3 [10], and WNT2 [11] may affect brain development or communication between brain cells. The heritability of ASD has been estimated to be 50%, reflecting that genetic factors are a major component of ASD etiology [12].
ASD begins in early childhood. Children with ASD usually show symptoms of autism within the first year, and regress during a period between one and two years of age. Although there is no specific medication for ASD patients [13], early treatment can make a big difference in the lives of children with ASD. Gene-based testing provides a promising method to identify infants who may have ASD [8].

Study subjects
We enrolled 16 individuals with ASD aged 2 to 7 years from the Chunguang Rehabilitation Hospital in Jilin Province, after excluding cases with fragile X syndrome, Rett syndrome, chromosomal abnormalities, or any neurological or psychiatric disorders. The individuals with ASD were diagnosed by Pediatric Neurology and Neurorehabilitation doctors using the Diagnostic and Statistical Manual of Mental Disorders (5th edition) [32]. All the individuals with ASD were northeast Han Chinese. This study was approved by the ethics committee of Jilin University. The parents or guardians of each individual with ASD signed written informed consent forms.

DNA extraction and Detection of CNVs
Genomic DNA was extracted from peripheral blood samples using DNA extraction kits, according to the manufacturer's instructions (DP319 TIANamp Blood DNA Kit, TIANGEN Biotech Co. Ltd, Beijing, China) [33]. We used NanoDrop (Cat#ND-1000, ThermoFisher, Waltham, MA, US) and 1% agarose gel electrophoresis to check the quantity and quality of the isolated DNA. We used aCGH for genome-wide CNV screening (Agilent SurePrint G3 Human CGH 60K
CNVs were considered of strong putative interest when they met the following criteria: (1) they were classified as likely pathogenic or pathogenic; (2) they were of large size (> 100 kb); (3) they had been found in the knowledgebases for the genetic evidence of ASD (Simons Foundation Autism Research Initiative [SFARI, https://www.sfari.org/resource/sfari-gene/], or AutismKB [http://db.cbi.pku.edu.cn/autismkb_v2/index.php]); (4) they had been found in the Database of genomic variation and phenotype in Humans using Ensembl Resources (DECIPHER, https://decipher.sanger.ac.uk/about#overview); and (5) they contained previously reported ASD-related genes. All potential pathogenic CNVs showing ≥ 90% overlap with at least one common variant of the same type in the Database of Genomic Variants (DGV) were considered common CNVs, and the others were considered rare CNVs [36,37].

Identification of potential pathogenic genes of ASD
We selected potential pathogenic genes within potential pathogenic CNVs on the basis of the following criteria: (1) genes enriched in ASD-related pathways; and (2) genes among the 363 genes classified as high-confidence or strong-candidate in SFARI or the 228 genes classified as high-confidence in AutismKB.

Identification of potential pathogenic microRNAs of ASD
MicroRNAs (miRNAs) are involved in the pathogenesis of ASD [30,38]. Because some genes within the CNVs that we found encode miRNAs, we further selected miRNAs related to potential pathogenic CNVs by searching PubMed for experimental evidence documenting nervous system dysfunction.
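The common-versus-rare rule for CNVs (≥ 90% overlap with a DGV variant of the same type) can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not specify: overlap is measured as the fraction of the study CNV's length covered by the reference variant (not reciprocal overlap), coordinates are half-open, and the `Interval` type, the function names, and the toy coordinates are all hypothetical.

```python
from typing import Iterable, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # inclusive (assumed convention)
    end: int    # exclusive (assumed convention)
    kind: str   # "del" or "dup"


def overlap_fraction(cnv: Interval, ref: Interval) -> float:
    """Fraction of `cnv` covered by `ref`; 0.0 on different chromosomes."""
    if cnv.chrom != ref.chrom:
        return 0.0
    shared = min(cnv.end, ref.end) - max(cnv.start, ref.start)
    return max(shared, 0) / (cnv.end - cnv.start)


def classify(cnv: Interval, common_variants: Iterable[Interval],
             threshold: float = 0.90) -> str:
    """Return "common" if any same-type reference variant covers
    at least `threshold` of the CNV, otherwise "rare"."""
    for ref in common_variants:
        if ref.kind == cnv.kind and overlap_fraction(cnv, ref) >= threshold:
            return "common"
    return "rare"


# Toy reference set standing in for DGV entries (hypothetical coordinates).
dgv = [
    Interval("chr15", 900, 1950, "del"),
    Interval("chr15", 0, 5000, "dup"),
]

print(classify(Interval("chr15", 1000, 2000, "del"), dgv))  # 95% covered by a deletion
print(classify(Interval("chr15", 1000, 3000, "del"), dgv))  # 47.5% covered by a deletion
print(classify(Interval("chr1", 1000, 2000, "del"), dgv))   # no variant on chr1
```

Requiring the same variant type means a deletion fully contained in a common duplication is still treated as rare, which matches the "same type" wording of the rule.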

Identification of CNVs
To detect CNVs, aCGH was performed on DNA samples from all 16 subjects with ASD (13 males and 3 females). We identified 364 CNVs (153 deletions and 211 duplications) with an average genomic size of 211.982 kb (114.091 kb for deletions and 258.705 kb for duplications). The mean number of CNVs per subject was 22.750 (9.563 for deletions and 13.188 for duplications). The mean number of deletions in males (10.462) was greater than that in females (5.667) (Table 1).

Identification of potential pathogenic CNVs of ASD
A total of 20 of the 364 CNVs failed to be converted to GRCh37 (hg19); for the remaining 344, we obtained 72 benign, 65 likely benign, 9 VOUS, 167 likely pathogenic, and 31 pathogenic CNVs (Table 2). More than half of the CNVs were likely pathogenic or pathogenic. The distribution of the 115 kinds of CNVs across the chromosomes is visualized in a circular plot (Fig. 1).
The Gene Ontology (GO) and KEGG pathway analyses of the genes from potential pathogenic CNVs were performed using the clusterProfiler package in R 3.6.2 [39]. A P-value < 0.05 was considered statistically significant. The miRWalk 2.0 database, which contains 12 miRNA-target-prediction databases, was used to predict target genes of CNVs-miRNAs [40]. We selected target genes according to the criterion that a target gene be present in at least seven of the 12 databases. Moreover, the interactions between CNVs-miRNAs and target genes were presented using Cytoscape 3.8.0 (http://www.cytoscape.org/).
We constructed intersections among the 511 genes that we found, the 363 high-confidence or strong-candidate risk genes of ASD reported in the SFARI database, and the 228 high-confidence risk genes related to ASD reported in the AutismKB database (Fig. 3). After investigating genes in the intersections, we found that the cholinergic receptor nicotinic alpha 7 subunit gene (CHRNA7) was involved in the regulation of excitatory postsynaptic potential and cholinergic synapse; the dopamine receptor D4 gene (DRD4) was involved in the regulation of synaptic transmission, dopamine binding, and glutamatergic synapse; the HRas proto-oncogene (HRAS) played roles in the regulation of excitatory postsynaptic potential, glutamatergic synapse, and the mTOR signaling pathway; the oligophrenin 1 gene (OPHN1) was associated with the regulation of synaptic signaling, ionotropic glutamate receptor binding, and glutamatergic synapse; phosphatase and tensin homolog (PTEN) was implicated in the regulation of synaptic signaling, neuron differentiation in the central nervous system, ionotropic glutamate receptor binding, sphingolipid signaling, and mTOR signaling; the SH3 and multiple ankyrin repeat domains 3 gene (SHANK3) was involved in the regulation of synaptic signaling, ionotropic glutamate receptor binding, neuronal synapse, postsynaptic density, and asymmetric synapse; the solute carrier family 6 member 3 gene (SLC6A3) played roles in dopamine binding, neurotransmitter:sodium cotransporter activity, and neurotransmitter transport activity; and the TSC complex subunit 2 gene (TSC2) was involved in synapses, postsynaptic density, asymmetric synapses, and mTOR signaling pathways. Scores of all these genes (CHRNA7, DRD4, HRAS, OPHN1, PTEN, SHANK3, SLC6A3, and TSC2) in AutismKB and their corresponding ranks in SFARI are listed in Table 4. DRD4, HRAS, OPHN1, SHANK3, SLC6A3, and TSC2 were in regions of CNV duplication. CHRNA7 and PTEN were in regions of CNV deletion.
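As a quick arithmetic check, the per-subject means and the classification tally reported above can be recomputed from the stated counts. The short Python sketch below uses only numbers given in the text; the variable names are illustrative.

```python
# Counts reported in the text.
n_subjects = 16
n_del, n_dup = 153, 211
total = n_del + n_dup                      # 364 CNVs in all

# Mean number of CNVs per subject.
mean_total = total / n_subjects            # reported as 22.750
mean_del = n_del / n_subjects              # reported as 9.563
mean_dup = n_dup / n_subjects              # reported as 13.188

# Clinical classification of the CNVs converted to GRCh37 (hg19).
classes = {
    "benign": 72,
    "likely benign": 65,
    "VOUS": 9,
    "likely pathogenic": 167,
    "pathogenic": 31,
}
classified = sum(classes.values())         # CNVs that were classified
unconverted = total - classified           # CNVs that failed conversion

# Share of classified CNVs that are likely pathogenic or pathogenic.
share = (classes["likely pathogenic"] + classes["pathogenic"]) / classified

print(mean_total, mean_del, mean_dup)
print(classified, unconverted, f"{share:.1%}")
```

The tally is internally consistent: the five classes sum to 344, which is 364 minus the 20 unconverted CNVs, and 198 of 344 is more than half.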
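The "at least seven of the 12 databases" criterion is a simple voting rule. The Python sketch below shows one way to apply it, assuming the predictions have already been exported as one set of target genes per database; the database labels, gene names, and function name are hypothetical and do not reflect the miRWalk interface.

```python
from collections import Counter
from typing import Dict, Set


def consensus_targets(predictions: Dict[str, Set[str]],
                      min_databases: int = 7) -> Set[str]:
    """Keep genes predicted as targets by at least `min_databases` databases."""
    votes = Counter()
    for targets in predictions.values():
        votes.update(set(targets))  # each database votes once per gene
    return {gene for gene, n in votes.items() if n >= min_databases}


# Toy input: 12 placeholder databases with made-up gene labels.
names = [f"db{i:02d}" for i in range(1, 13)]
predictions = {name: {"GENE_C"} for name in names}   # GENE_C in all 12
for name in names[:8]:
    predictions[name].add("GENE_A")                  # GENE_A in 8 databases
for name in names[:6]:
    predictions[name].add("GENE_B")                  # GENE_B in 6 databases

print(sorted(consensus_targets(predictions)))
```

With the threshold of seven, GENE_A and GENE_C are retained and GENE_B is dropped.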
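The three-way intersection among the genes found in this study, the SFARI list, and the AutismKB list amounts to plain set operations. The following Python sketch uses placeholder gene labels (G1 to G7), not real list memberships, and a hypothetical helper name.

```python
from typing import Dict, Set


def venn_regions(found: Set[str], sfari: Set[str],
                 autismkb: Set[str]) -> Dict[str, Set[str]]:
    """Regions of a three-set Venn diagram that involve the study genes."""
    return {
        "found_and_sfari_only": (found & sfari) - autismkb,
        "found_and_autismkb_only": (found & autismkb) - sfari,
        "all_three": found & sfari & autismkb,
        # Study genes supported by at least one knowledgebase.
        "found_and_either": found & (sfari | autismkb),
    }


# Placeholder gene sets; real inputs would be the 511 study genes,
# the 363 SFARI genes, and the 228 AutismKB genes.
found = {"G1", "G2", "G3", "G4"}
sfari = {"G1", "G2", "G5"}
autismkb = {"G2", "G3", "G6", "G7"}

for region, genes in venn_regions(found, sfari, autismkb).items():
    print(region, sorted(genes))
```

The "found_and_either" region is the union of the other three and corresponds to study genes that appear in SFARI, AutismKB, or both.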

Identification and Analysis of potential pathogenic miRNAs with CNVs of ASD
We found 50 miRNAs related to potential pathogenic CNVs (45 encoded by duplication regions and 5 encoded by deletion regions). Searching PubMed for experimental evidence of nervous system dysfunction, we identified 12 miRNA genes that had previously been reported to be associated with brain or nervous system dysfunction (Table 5). We intersected the target genes predicted using the miRWalk 2.0 database with the union of SFARI and AutismKB, and present the interaction networks between CNVs-miRNAs and 219 target genes (Figs. 4 and 5). KEGG pathway enrichment analysis showed enriched key pathways, such as glutamatergic synapse (hsa04724), dopaminergic synapse (hsa04728), and Wnt signaling pathway (hsa04310). The top 20 pathways are presented in Fig. 6 and Supplementary Table 9.

Discussion
In the present study, we identified 22 kinds of CNVs (six deletions and 16 duplications), eight protein-coding genes, and 12 miRNA genes associated with ASD risk in northeast Han Chinese from Jilin province, China.
CNVs have repeatedly been found to correlate with ASD risk [41,42]. In our study, we filtered 22 potential pathogenic CNVs.
Individuals with deletions and duplications of 15q13.3 have been found to manifest neuropsychiatric disease and cognitive deficits [43]. In line with the discoveries of Chen et al. [44] and Pinto et al. [28], we further documented that CNVs at 22q13.33 and 15q13.3 are associated with ASD risk. Autism-related phenotypes are common in patients with a deletion or duplication at 22q13.3 [45-48]. Most of the defects are due to haploinsufficiency of SHANK3 [46]. Chen et al. found a deletion at 22q13.3 in two male children with ASD and a duplication at 22q13.31-q13.33 in one male child with ASD from Taiwan, China [44]. In our study, we found a duplication at 22q13.31-q13.33 that overlaps SHANK3 in two male children with ASD, indicating that the duplication at 22q13.31-q13.33 may play a key role in the etiology of ASD in our population. CNVs at 15q13.3 have been found to be involved in a variety of neuropsychiatric diseases, including intellectual disability/developmental delay, epilepsy, schizophrenia, and ASD [43,49-51]. The relation between CHRNA7 at 15q13.3 and neuropsychiatric disorder phenotypes has been validated intensively [50]. In accordance with the discovery of Pinto et al. [28], we also found that a deletion of CHRNA7 is associated with ASD risk. Apart from CHRNA7 and SHANK3, we found genes in CNV duplications (DRD4, HRAS, OPHN1, SLC6A3, and TSC2) and in CNV deletions (PTEN). For DRD4 and HRAS, we found that seven children with ASD had duplications at 11p15.5, which overlaps DRD4 and HRAS. Mutations in DRD4 are associated with ASD risk [52-54]. The mRNA expression levels of DRD4 in peripheral blood lymphocytes are higher in people with ASD than in healthy controls [55,56]. Herault et al. also found a positive association between HRAS and autism in a French Caucasian population [57,58]. [61]. We found duplications at 16p13.3 in two female children with ASD.
White matter pathology associated with PTEN loss in humans with ASD is consistent with that seen in mouse models of Pten loss [62]. We found deletions at 10q23.2-q23.31 overlapping PTEN in 13 male children with ASD but not in the 3 female children with ASD. Thus, these eight genes may be implicated in the etiology of ASD.
MiRNAs encoded within CNVs are important functional variants, providing a new dimension for understanding the association between genotype and phenotype [63]. MiRNAs play vital roles in governing essential aspects of inhibitory transmission and interneuron development in the nervous system [64]. Deletion or duplication of a chromosomal locus changes the levels of miRNAs, which in turn affects neuronal function and communication [38]. In our study, 12 candidate susceptibility miRNA-coding genes of ASD were identified (ten duplications [MIR202, MIR210, MIR3178, MIR339, MIR4516, MIR4717, MIR483, MIR675, MIR6821, and MIR940] and two deletions [MIR107 and MIR558]). BDNF (brain-derived neurotrophic factor), a member of the neurotrophic factor family, is a target gene of miR-202 [65]. Moreover, we further predicted that miR-4717-5p, miR-483-3p, and miR-940 also target BDNF. Skogstrand et al. found that lower BDNF levels in serum correlate with ASD risk [66,67]. miR-339-5p has been found to be a drug target for Alzheimer's disease; it is expressed at low levels in mature neurons and is related to axon guidance [68,69]. In our study, we found that miR-339-5p targets 42 genes associated with ASD risk. Among these genes, the association between DIP2A and ASD risk has been validated by our team [70]; moreover, Dip2a knockout mice exhibit autism-like behaviors, including excessive repetitive behavior and social novelty defects [71]. Notably, autism-like behaviors and germline transmission in MECP2 transgenic monkeys corroborate the association between miR-339-5p and MECP2 [72]. In addition, miR-202-5p, miR-483-3p, and miR-940 also target MECP2. For these reasons, miRNAs encoded within CNVs may be implicated in ASD etiology.

Limitations
Our study had some limitations: (1) the sample size of our study may confer limited statistical power to detect significant findings; (2) both genetic and environmental factors contribute to ASD risk, but data on environmental factors were not available to us; and (3) the CNVs could not be classified as de novo or inherited because parental data were lacking.

Conclusions
We identified 22 kinds of CNVs (six deletions and 16 duplications), eight protein-coding genes, and 12 miRNA genes implicated in ASD risk, providing insight that may help to further elucidate ASD etiology.

Availability of data and material
The datasets generated during the current study are available from the corresponding authors on reasonable request.
Competing interests

Figure 1 The distribution of CNVs on genome-wide chromosomes