5’UTR Variant rs6017916 Regulates SRC Expression and Contributes to the Susceptibility to Ischemic Stroke in the Chinese Han Population

Previous studies have reported that the SRC protein is involved in a variety of pathological mechanisms related to ischemic stroke (IS). In this study, we conducted a genetic association study between rs6017916, located in the 5’UTR of the SRC gene, and IS susceptibility. A total of 533 IS patients and 531 healthy controls were recruited. Genotyping was performed on the Sequenom MassARRAY platform. Quantitative polymerase chain reaction (qPCR) was used to detect SRC mRNA expression, and a dual luciferase reporter system was used to verify the regulatory effect of rs6017916 on SRC expression. SRC mRNA expression was significantly higher in IS patients than in controls (P<0.001). Receiver operating characteristic (ROC) curve analysis demonstrated that SRC mRNA expression differentiated IS patients from controls with an area under the curve (AUC) of 0.935, corresponding to a specificity of 0.820 and a sensitivity of 0.920. Genetic association analysis showed that rs6017916 was significantly associated with IS susceptibility under multiple genetic models, including the additive [OR (95% CI)=0.76 (0.60, 0.96), P adj=0.019] and dominant [OR (95% CI)=0.75 (0.58, 0.98), P adj=0.031] models. In addition, the dual luciferase reporter assay showed that the minor allele C of rs6017916 reduced luciferase activity compared with the major allele A. In summary, SRC mRNA expression was significantly increased in IS patients and is a potential diagnostic biomarker. Moreover, the 5’UTR variant rs6017916 of SRC was significantly associated with IS susceptibility and might affect the pathological process of IS by regulating SRC expression.


Introduction
Stroke is a common neurological disease and is the second leading cause of death (~12% of total deaths) and the third leading cause of disability (~6% of total DALYs) worldwide [1]. In China, the age-standardised stroke prevalence is 11.15‰, the highest among low- and middle-income countries, and the annual age-standardised incidence and mortality are 2.47‰ and 1.15‰, respectively [2]. More importantly, the incidence of ischemic stroke (IS) in China has kept increasing over the past three decades and is higher in rural areas than in urban areas [2]. IS accounts for about 63% of incident strokes and imposes a widespread and serious social and economic burden [1]. However, there is still no effective way to assess in advance the risk of IS in healthy individuals, which would be extremely important for reducing the disease burden.
A single nucleotide polymorphism (SNP) is a germline substitution of a single nucleotide at a specific position in the genome that is present in a sufficiently large fraction of the population. Since SNPs are inherent and stable, they are reliable indicators for identifying individuals who are susceptible to certain diseases.
Previous studies have revealed a large number of gene variants related to disease susceptibility. Larsson et al. found that genetic variants related to serum calcium levels were significantly associated with susceptibility to coronary artery disease and myocardial infarction [3], and rs1333049 was reported to be significantly associated with susceptibility to IS [4]. The identification of these risk variants has contributed greatly to identifying disease-susceptible individuals in the population. SNPs are generally classified by location into intronic SNPs, 5'UTR SNPs, 3'UTR SNPs, and coding sequence (CDS) SNPs, and different kinds of SNPs may participate in disease processes through different biological functions. For example, the intronic SNP rs3850997 exerts allele-specific remote regulation of GCLET by affecting the binding affinity of CTCF, thereby affecting the progression of gastric cancer [5]. SNPs located in the CDS can change the amino acid sequence of the protein encoded by the parent gene and affect its normal biological function [6]. In addition, 3'UTR SNPs can regulate mRNA stability by affecting miRNA binding [7,8]. The gene 5'UTR, which contains promoter and enhancer elements, mainly regulates transcriptional activity and thus affects gene expression, and 5'UTR SNPs can regulate the transcriptional activity of genes by affecting the binding of transcription factors [9]. For example, Buroker et al. found that SNPs in the 5'UTR of VEGFA changed the binding of multiple transcription factors and thereby regulated VEGFA expression [10]. These in-depth studies provide new insights into the mechanisms underlying the association between gene variants and diseases, as well as into early intervention in disease.
The SRC proto-oncogene, non-receptor tyrosine kinase (SRC) gene is located on chromosome 20 and encodes the SRC protein. Previous studies demonstrated that SRC is involved in a variety of physiological and pathological processes, including lipid metabolism and tumorigenesis [11,12]. Among neurological diseases, abnormally inhibited SRC participates in cognitive dysfunction in schizophrenia by enhancing synaptic N-methyl-D-aspartic acid receptor (NMDAR) function [13]. SRC is also involved in several pathogenic mechanisms of IS. After cerebral ischemia, activated SRC promotes the secretion of inflammatory factors, activates oxidative stress, promotes astrocyte apoptosis, and aggravates cerebral ischemic damage [14]. Moreover, cerebral ischemia triggers the release of a large number of molecules, including glutamate, oxygenated hemoglobin, cytokines, and reactive oxygen species (ROS), most of which activate SRC. Activated SRC then increases the sensitivity of vascular smooth muscle cells to calcium, resulting in calcium overload and apoptosis of these cells, which disrupts the blood-brain barrier and causes severe brain edema after cerebral ischemia [15,16]. In addition, activated SRC can aggravate neuronal apoptosis after ischemia by activating N-methyl-D-aspartic acid (NMDA) receptors and regulating excessive autophagy [17,18]. Together, this evidence indicates that SRC is related to the pathogenic mechanism of IS and is likely to be an injury factor that aggravates the severity of IS.
rs6017916 is a SNP located in the 5'UTR within the promoter region of SRC. To date, no study has reported an association between rs6017916 and susceptibility to any disease. Here, for the first time, we explored the association between rs6017916 and IS susceptibility, and further explored the effect of rs6017916 on the expression of its parental gene.

Research Participants
The current study recruited 1064 participants, including 533 ischemic stroke (IS) patients and 531 unrelated healthy controls. The IS patients were recruited from the Department of Neurology, First Affiliated Hospital of Guangxi University of Traditional Chinese Medicine, Guangxi, China. The healthy controls were recruited from the physical examination center of the same hospital. All participants were unrelated Han Chinese.
The inclusion and exclusion criteria were as follows. IS patients were diagnosed by at least two experienced doctors based on CT or MRI results; patients with other serious illnesses, such as hemorrhagic stroke, cancer, or severe shock, were excluded. Controls were healthy individuals without stroke, coronary heart disease, or other cardiovascular diseases. Each participant provided informed consent before inclusion in the study. This study was approved by the medical ethics committee of the Guangxi University of Chinese Medicine.

Collection of related clinical parameters
Plasma (2 ml) was collected using anticoagulant blood collection tubes. The StAgO Capact coagulation meter was used to measure the parameters of coagulation function in plasma, including activated partial thromboplastin time (APPT), D-dimer (D-D), fibrinogen (FIB), international normalized ratio (INR), prothrombin time (PT), prothrombin time activity (PTA), and thrombin time (TT).

DNA Isolation and Genotyping
Genomic DNA was extracted from the peripheral blood of each participant using a blood genomic DNA extraction kit (Aidlab Biotechnologies Co., Ltd) according to the manufacturer's instructions. The DNA samples were then quality-checked and stored at -80 ℃ until use. Genotyping was performed by Bomiao Biological Co., Ltd. (Beijing, China) on the Sequenom MassARRAY platform (Sequenom, San Diego, CA, USA). Specific primers for genotyping were designed with AssayDesigner 3.1 software and were as follows: F: 5'-ACGTTGGATGCAAGCCCATTCCACAGACTC-3' and R: 5'-ACGTTGGATGATTCTCCACATGGTGTGCTG-3'. For quality control, 5% of the DNA samples were randomly selected and genotyped in duplicate, with the operators blinded to sample identity. The concordance for these samples was 100%, which confirms the reliability of the genotyping.

Detection of SRC expression
SRC expression was measured in peripheral blood from a subset of participants, comprising 50 IS patients and 50 healthy controls. Total RNA was extracted from peripheral blood using the Trizol reagent (Invitrogen, Carlsbad, CA, USA). cDNA was synthesized with the PrimeScript™ RT reagent Kit with gDNA Eraser (Takara Bio Inc.) according to the manufacturer's instructions. Quantitative real-time polymerase chain reaction (qPCR) was conducted to detect SRC expression using the SYBR® Premix Ex Taq™ II kit (Takara Bio Inc.). The primer sequences were as follows. GAPDH (housekeeping gene) primers: F: 5'-GGTGGTCTCCTCTGACTTCAACA-3' and R: 5'-GTTGCTGTAGCCAAATTCGTTGT-3'; SRC primers: F: 5'-AGACTGGGCTCTGGCTCTGTTC-3' and R: 5'-TTGGCAAGATGCCACAAACTG-3'. The relative expression of SRC was calculated using the 2^(-ΔCT) method.
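The 2^(-ΔCT) calculation can be sketched as follows. This is a minimal illustration, assuming that mean Ct values for SRC and GAPDH are available for each sample; the function name and the example Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative expression of a target gene by the 2^(-dCt) method.

    dCt = Ct(target) - Ct(reference gene); a smaller dCt means that the
    target is more abundant relative to the reference gene.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for one sample (illustrative only).
src_ct = 26.0    # assumed Ct of SRC
gapdh_ct = 18.0  # assumed Ct of GAPDH (reference gene)
src_relative = relative_expression(src_ct, gapdh_ct)
print(src_relative)  # equals 2 ** -8
```

Because the value is computed per sample against GAPDH only, group comparisons are made on the resulting per-sample values rather than against a calibrator sample.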

Construction and transfection of luciferase reporter plasmid
SNPs located in the 5'UTR have been reported to have the potential to regulate parental gene expression [19]. To determine whether rs6017916 regulates SRC expression, we cloned the rs6017916 wild-type (WT: A) and mutant-type (MT: C) 5'UTR sequences upstream of the firefly luciferase coding region. The constructs were confirmed by sequencing (Shanghai Genechem Co., Ltd). The firefly luciferase constructs and the Renilla luciferase vector were transfected into HEK-293T cells using the X-tremeGENE HP reagent kit (Roche, 6366244001). Cells were collected 48 h after transfection for luciferase activity detection.

Luciferase Activity Assays
Firefly luciferase and Renilla luciferase activities were measured with the Dual-Luciferase® Reporter Assay System (Promega, E2910) according to the manufacturer's instructions. Firefly luciferase activity was normalized to Renilla luciferase activity to obtain the relative luciferase activity.
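The normalization described above can be sketched as follows. This is a minimal illustration, assuming paired firefly and Renilla readings from replicate wells; the function names and the readings are hypothetical and are not data from this study.

```python
from statistics import mean


def relative_luciferase(firefly, renilla):
    """Per-well firefly/Renilla ratio (relative luciferase activity)."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla must have the same length")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_vs_reference(sample_ratios, reference_ratios):
    """Mean relative activity of a construct divided by that of a reference."""
    return mean(sample_ratios) / mean(reference_ratios)


# Hypothetical readings from three replicate wells per construct.
wt_ratios = relative_luciferase([1000.0, 1200.0, 1100.0], [100.0, 100.0, 100.0])
mt_ratios = relative_luciferase([500.0, 600.0, 550.0], [100.0, 100.0, 100.0])
mt_fold = fold_vs_reference(mt_ratios, wt_ratios)
print(mt_fold)  # MT activity as a fraction of WT activity
```

Dividing by the Renilla signal in the same well corrects for differences in transfection efficiency and cell number, so the WT and MT constructs can be compared directly.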

Statistical Analysis
PLINK software (http://pngu.mgh.harvard.edu/~purcell/plink/) was used for genetic association analysis. The chi-square goodness-of-fit test was used to evaluate Hardy-Weinberg equilibrium. Genotype distributions and allele frequencies were compared between groups using the chi-square test. Unconditional logistic regression was used to evaluate the association between rs6017916 and IS susceptibility under multiple genetic models (additive, dominant, and recessive). The association between rs6017916 and the related clinical parameters was evaluated using a general linear model. SRC expression and luciferase activity were compared between groups using SPSS software (17.0). Statistical significance was set at P < 0.05, and all tests were two-tailed.
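For illustration, the Hardy-Weinberg goodness-of-fit test mentioned above can be sketched as follows. This is a minimal sketch and not the PLINK implementation; the function name and the genotype counts are hypothetical, and a biallelic SNP with both alleles observed is assumed. With one degree of freedom, the chi-square upper-tail probability equals erfc(sqrt(χ²/2)), so only the standard library is needed.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes (AA, AB, BB) and
    returns (chi2, p_value) with 1 degree of freedom.
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (illustrative only, not study data).
chi2_eq, p_eq = hwe_chi_square(360, 480, 160)  # counts in exact HWE proportions
chi2_dev, p_dev = hwe_chi_square(50, 0, 50)    # no heterozygotes at all
print(chi2_eq, p_eq)
print(chi2_dev, p_dev)
```

A large P value, as in the first example, gives no evidence of departure from equilibrium, whereas the second example departs strongly from it.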

SRC expression significantly increased in IS patients
The qPCR results showed that SRC expression was significantly higher in the peripheral blood of IS patients than in controls (Fig. 1). Receiver operating characteristic (ROC) curve analysis was performed to evaluate the potential diagnostic value of SRC expression for IS. The area under the ROC curve (AUC) was 0.935, and the sensitivity and specificity were 0.920 and 0.820, respectively (Fig. 2). This suggests that increased SRC expression is a promising potential diagnostic biomarker for IS.
Correlation analysis of SRC variant rs6017916 and IS susceptibility
A total of 533 IS patients and 531 controls were recruited in this study. Sex distribution and age were comparable and matched between IS patients and controls; details can be found in our previous study [20]. The genotype distribution of rs6017916 in controls was consistent with Hardy-Weinberg equilibrium (HWE) (P=0.583). There was no significant difference in rs6017916 genotype distribution between IS patients and controls (χ²=5.78, P=0.055, … There was no other association between rs6017916 and the other parameters of coagulation function, including APPT, D-D, FIB, INR, PTA, and TT; details are shown in Table 3.
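As an illustration of how such ROC statistics are obtained, the sketch below computes the AUC as the probability that a randomly chosen patient has a higher expression value than a randomly chosen control, and the sensitivity and specificity at a given cut-off. The function names, scores, and cut-off are hypothetical and are not data from this study. If all 50 patients and 50 controls entered the analysis, the reported sensitivity of 0.920 and specificity of 0.820 would correspond to 46/50 and 41/50, respectively.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    A pair counts 1 if the case score is higher and 0.5 if the scores tie.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    true_pos = sum(s >= threshold for s in case_scores)
    true_neg = sum(s < threshold for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical relative expression values (illustrative only).
cases = [0.9, 0.8, 0.7, 0.4]
controls = [0.5, 0.3, 0.2, 0.1]
auc = roc_auc(cases, controls)
sens, spec = sensitivity_specificity(cases, controls, threshold=0.6)
print(auc, sens, spec)
```

In this toy example 15 of the 16 patient-control pairs are ranked correctly, so the AUC is 15/16; the cut-off of 0.6 detects three of the four patients and misclassifies no controls.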
The minor allele C of rs6017916 inhibits SRC expression.
Previous studies reported that SNPs located in the 5'UTR can regulate parental gene expression at the transcriptional or post-transcriptional level. To clarify whether rs6017916 regulates SRC expression, we constructed wild-type (WT: rs6017916 major allele A) and mutant-type (MT: rs6017916 minor allele C) luciferase reporter vectors (Fig. 3). The luciferase activity assay demonstrated that the rs6017916 minor allele C significantly decreased luciferase activity compared with the major allele A (Fig. 4), which suggests that the rs6017916 minor allele C inhibits SRC expression.

Discussion
Here, we report that SRC expression was significantly increased in IS patients compared with controls. We then conducted a case-control study (533 IS patients and 531 controls) and found that the 5'UTR SNP rs6017916 of SRC was significantly associated with susceptibility to IS. Furthermore, the dual luciferase reporter system showed that the 5'UTR containing the rs6017916 minor allele C significantly reduced luciferase activity compared with the major allele A.
Increased SRC expression was observed in the peripheral blood of IS patients in the current study. The SRC protein encoded by SRC is a tyrosine-specific protein kinase that is expressed in almost all human tissues. SRC is the prototype of the Src family protein kinases (SFKs), which perform important biological functions by relaying extracellular stimulus signals to key downstream molecules and signaling pathways [21]. Structurally, SRC consists of four main domains: (1) a unique N-terminal myristoylation sequence that enhances association with the plasma membrane (SRC homology (SH) 4 domain); (2) a domain that preferentially binds proline-rich sequences (SH3 domain); (3) a domain that promotes the interaction of SRC with phosphorylated residues (SH2 domain); and (4) a domain responsible for catalytic kinase activity (SH1 domain) [22]. Among them, ligand binding by the SH2 and SH3 domains maintains the auto-inhibited conformation of SRC, which is critical for the transition from SRC inhibition to activation and for its biological functions [23,24]. In addition to its role in cancer occurrence and development, accumulated evidence indicates that SRC affects important IS-related pathogenic mechanisms such as neuronal apoptosis and blood-brain barrier disruption after cerebral ischemia. Jickling et al. reported that activated SRC can phosphorylate and activate N-methyl-D-aspartic acid (NMDA) receptors after cerebral ischemia [17]. Activated NMDA receptors allow a large amount of calcium to enter neurons, leading to neuronal excitotoxicity and apoptosis [25]. The MAPK signaling pathway has been reported to be involved in the inflammatory response and neuronal apoptosis after cerebral ischemia [26], and activated SRC can activate the three MAPK sub-pathways (ERK, JNK, and p38) to promote neuronal death after ischemia [14].
Moreover, SRC can transmit IL-17A and zinc stimulation signals from outside to inside the cell, thereby phosphorylating PP2 and PP2B, inducing excessive autophagy, and aggravating cerebral ischemic injury [18,27]. In addition to causing neuronal damage, SRC can increase the vascular permeability of the blood-brain barrier in ischemic areas by regulating vascular endothelial growth factor (VEGF), further disrupting the blood-brain barrier and inducing vasogenic brain edema that aggravates nerve damage [28,29]. The above evidence demonstrates that SRC can aggravate post-ischemic injury in multiple ways. Given our finding that SRC expression was increased in IS patients, we speculate that increased SRC expression is related to the development of IS and is likely to be a risk factor for IS.
Because intravenous thrombolysis, the conventional treatment for IS, is severely limited by a strict treatment time window (about 4.5 h), most IS patients have already missed the optimal treatment window by the time they are diagnosed. Therefore, the search for effective and specific IS diagnostic biomarkers is of great significance for the early diagnosis of IS and for reducing its disease burden. Various biomolecules have previously been explored as IS diagnostic biomarkers. Among proteins, Cometpay et al. reported that soluble tumor necrosis factor-like weak inducer of apoptosis (sTWEAK) could be used as a potential diagnostic biomarker for IS (AUC: 0.840; sensitivity: 0.778; specificity: 0.917) [30]. Among circular RNAs, circFUNDC1, circPDS5B, and circCDC14A were reported as a combined IS diagnostic biomarker (AUC: 0.875; sensitivity: 0.715; specificity: 0.910) [31]. Among miRNAs, miR-125a-5p, miR-125b-5p, and miR-143-3p have also been reported as a combined IS diagnostic biomarker (AUC: 0.930; sensitivity: 0.650; specificity: 0.975) [32]. In the current study, ROC analysis showed that SRC was a potential IS biomarker (AUC: 0.935; sensitivity: 0.920; specificity: 0.820). The AUC and sensitivity of SRC are higher than those of the previously reported biomarkers listed above, although its specificity is lower, which suggests that SRC is a competitive potential IS diagnostic biomarker. Although no biomarker can currently be applied to the timely diagnosis of IS, we believe that the potential diagnostic biomarkers identified so far will greatly help the future development of IS diagnosis and treatment.
Genetic association studies have revealed many SNPs closely related to disease susceptibility. The discovery of these risk SNPs has greatly promoted the development of personalized medicine and laid the foundation for future precision medicine [33]. An association between the SRC variant rs6017916 and susceptibility to any disease has not previously been reported. In this study, we report for the first time that rs6017916 was significantly associated with IS susceptibility. The frequencies of the A/C alleles of rs6017916 in healthy controls in this study were 0.802/0.198, which is consistent with the East Asian data from the 1000 Genomes Project available in the NCBI database. This further supports the reliability of our data and findings. In the current study, the frequency of the minor allele C of rs6017916 was lower in IS patients (0.159) than in healthy controls (0.198). These results suggest that the rs6017916 minor allele C may be a protective factor for IS. In addition, SNPs located in the 5'UTR have shown a powerful effect on the regulation of parental gene expression. Pan et al. found that the SNP (-32C>G) in the FGF13 5'UTR could inhibit mRNA translation by disturbing the interaction between FGF13 mRNA and polypyrimidine-tract-binding protein 2 (PTBP2), leading to defects in brain development and cognitive function [19]. rs3813946, located in the 5'UTR of the CR2 gene, affects the development of systemic lupus erythematosus by regulating the transcriptional activity of CR2 [34]. In addition, SNPs in the 5'UTR of the caspase-3 gene may affect caspase-3 expression by altering the binding of miR-339-3p [35]. Together, this evidence indicates that 5'UTR SNPs can participate in the regulation of parental gene expression in different ways at the transcriptional and post-transcriptional levels.
Our results demonstrated that the minor allele C of rs6017916 in the SRC 5'UTR could inhibit SRC expression, but the underlying mechanism is not clear. Nevertheless, based on the above evidence, it is reasonable to infer that the minor allele C of rs6017916 may reduce IS risk by suppressing SRC expression.
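As a rough consistency check on the allele frequencies quoted above, a crude allelic odds ratio for the minor allele C can be computed directly from the reported frequencies (0.159 in IS patients and 0.198 in controls). This is only a sketch: the function name is hypothetical, and the result is unadjusted and based on rounded frequencies, so it is not the adjusted logistic regression estimate reported by the study.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Crude odds ratio for carrying an allele, from its frequency in each group."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Minor allele C frequencies reported in the text.
or_c = allelic_odds_ratio(0.159, 0.198)
print(round(or_c, 2))
```

The crude estimate is roughly 0.77, close to the adjusted additive-model odds ratio of 0.76 given in the abstract and below 1, in line with a protective effect of the minor allele.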
Several potential limitations of this study must be noted. Our conclusions are derived from the Chinese Han population. Given the heterogeneity of gene variants between ethnic groups, the associations found in the current study should be extended to other ethnic groups with caution. In addition, although we showed that rs6017916 could affect the expression of its parental gene SRC, the mechanism underlying this regulation is still unclear, and further investigation is needed.
In summary, we revealed for the first time that the SRC variant rs6017916 was significantly associated with susceptibility to IS in the Chinese Han population. SRC mRNA expression was significantly increased in IS patients and is a potential diagnostic biomarker. In addition, the 5'UTR variant rs6017916 may affect susceptibility to IS by regulating SRC expression.

Conflicts of interest
All authors declare that they have no conflicts of interest.

Figure 1
Relative expression of SRC in the IS patients and controls.

Figure 2
Receiver operating characteristic (ROC) curve analysis of SRC expression for patients with IS.

Figure 3
Schematic diagram of wild-type (rs6017916 major allele) and mutant-type (rs6017916 minor allele) luciferase reporter vector construction.