Functional Analysis of Haplotypes and Promoter Activity at the 5’ Region of the ADH7 Gene

Background: The function of the 5’ regulatory region and the role of the different SNP loci have not been well characterized. This study investigated the effect of several ADH7 haplotypes on the regulation of gene expression in vitro and the functional sequences in the 5’ regulatory region of ADH7. Three SNPs (rs17537595, rs2851028, and rs2654847) and four different haplotypes (T-C-T, T-T-T, C-T-T, T-C-A) were identified by cloning and sequencing. Methods: The effects of 4 different haplotypes and 8 truncated fragments of the 5’ regulatory region on ADH7 gene expression were measured using a dual-luciferase reporter assay system. All recombinant plasmids were transfected into HEK-293, U87, and SH-SY5Y cells, and their relative fluorescence intensity was measured. Results: In the HEK-293, U87, and SH-SY5Y cell lines, the relative fluorescence intensity of haplotype T-C-T was significantly higher than that of haplotypes T-T-T, C-T-T, and T-C-A. Additionally, we found that the regions from -83 to -310 bp (ATG, +1), -560 to -768 bp, and -987 to -1203 bp up-regulated gene expression. In contrast, the region from -768 to -987 bp down-regulated gene expression. The regions from -1203 to -1369 bp and -1369 to -1626 bp down-regulated gene expression in the U87 and SH-SY5Y cell lines, but the trend was opposite in the HEK-293 cell line. The region from -310 to -560 bp up-regulated gene expression in the SH-SY5Y cell line, but down-regulated gene expression in the HEK-293 and U87 cell lines. Conclusions: This study has shown that polymorphisms of the ADH7 5’ regulatory region play an important role in the regulation of gene expression.


Introduction
The ADH7 gene encodes class IV alcohol dehydrogenase, which is mainly expressed in the upper digestive tract. ADH7 acts mainly as a retinol dehydrogenase that converts retinol (the main precursor of vitamin A) to retinal, is involved in the synthesis of retinoic acid, and regulates the metabolic pathway and metabolic process of retinoic acid (Ding et al. 2020). Extensive genomic studies have confirmed the association between the ADH7 gene and diseases such as alcohol dependence, drug dependence, severe depression, Parkinson's disease, and upper gastrointestinal cancer (Buervenich et  Gene expression is typically regulated at the transcriptional level through the binding of transcription factors to genomic regulatory regions called promoters and enhancers (Wang et al. 2018). The genomic regions with protein-coding capacity account for only 1% of the human genome, and the vast non-coding portion is more likely to underlie the genetic associations of many diseases (Hindorff et al. 2009). Jairam and Edenberg have shown that the ADH7 proximal promoter variant rs2851028 affects transcriptional activity. Upstream regulatory sequences generally showed a greater increase or a smaller reduction in activity when combined with the rs2851028-T promoter than with the rs2851028-C promoter (Jairam and Edenberg 2014). A previous study showed that the binding of the transcription factors AP-1 and C/EBP to the promoter region of the ADH7 gene significantly reduced its activity (Kotagiri and Edenberg 1998).
We previously identified three SNPs in the 5' proximal promoter region of the ADH7 gene, but the influence of the haplotypes and of different fragments of the proximal promoter region on protein expression in vitro is still not clear (Zeng et al. 2020). Therefore, we further explored the effects of four haplotypes and multiple recombinant fragments on gene expression at the protein level. The four haplotypes (T-C-T, T-T-T, C-T-T, and T-C-A) were selected as templates to amplify target fragments by PCR.

Materials and Methods
Primers were designed using Primer 5.0 software, and restriction enzymes were selected according to the multiple cloning site of the pGL3 plasmid. Recognition sites for the restriction endonucleases KpnI and HindIII were introduced at the 5' ends of the corresponding primers, and the target fragment was amplified with the PrimeSTAR® reagent kit (Takara, Dalian, China). The primer sequences were 5'-CGGGGTACCGATTTGGAGTCAGATG-3' (sense) and 5'-CCCAAGCTTCATCCTGTCTTTGTCTTG-3' (antisense). The underlined nucleotides (GGTACC and AAGCTT) introduce the KpnI and HindIII restriction sites, respectively. The purified PCR product was cloned into the pBM20S vector using the pBM20S Toposmart cloning kit (TIANGEN, Beijing, China). The recombinant pBM20S vector was transformed into JM109 competent cells, and the plasmid extracted with the SanPrep® Column Endotoxin-Free Plasmid Mini-Preps Kit (Sangon Biotech, Shanghai, China) was sequenced to confirm that the inserted fragment was correct.
Finally, the four haplotype inserts of the 5' proximal promoter region were subcloned from the pBM20S vectors into the pGL3 Basic vector (Promega, Madison, Wisconsin, USA) and transfected into eukaryotic cell lines.
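The restriction sites carried by the two primers above can be checked with a few lines of Python. This is a minimal sketch, not part of the original workflow: the primer sequences and the enzyme names come from the text, the recognition sequences GGTACC (KpnI) and AAGCTT (HindIII) are the standard ones for these enzymes, and the function name `find_sites` is an illustrative assumption.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"KpnI": "GGTACC", "HindIII": "AAGCTT"}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "sense": "CGGGGTACCGATTTGGAGTCAGATG",
    "antisense": "CCCAAGCTTCATCCTGTCTTTGTCTTG",
}


def find_sites(primer):
    """Return {enzyme: 0-based offset} for each recognition site in the primer."""
    found = {}
    for enzyme, site in SITES.items():
        pos = primer.find(site)
        if pos != -1:
            found[enzyme] = pos
    return found


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

In both primers the site starts after three protective bases at the 5' end, which is consistent with the sites having been appended to the gene-specific sequence.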

Construction of truncated sequence recombinant vectors in the 5' promoter region of ADH7 gene
Primers containing KpnI and BglII restriction sites at their 5' ends (Table 1) were used to amplify the target fragments. In this study, the common 3' end was at +107 bp (ATG, +1), and the longest fragment extended from -1626 bp to +107 bp (ATG, +1). The longest fragment was used as the amplification template to construct eight truncated fragments with 5' deletions. The purified target fragments were cloned into the pBM20S vector, and recombinant vectors carrying the correct insert were identified by sequencing. The inserts were then subcloned from the pBM20S recombinant vectors into the pGL3 Basic vector for transfection of eukaryotic cell lines.

Table 1 Primer sequences of the target fragments containing the cleavage sites

Target fragments    Primer sequences

Notes: The primer sequences are shown; the 5′ end of each primer contains the KpnI or BglII restriction enzyme cleavage site. The number in brackets is the 5′ end position in the ADH7 gene, with ATG as +1. F = forward, R = reverse.
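The deletion series described in this subsection can be laid out programmatically. The sketch below rests on two assumptions that the text does not state outright: the 5' ends of the eight constructs are taken from the region boundaries reported in the Results, and coordinates follow the common convention that there is no position 0 (so -1 is directly followed by +1). The names `fragment_length` and `deleted_intervals` are illustrative.

```python
COMMON_3PRIME_END = 107  # +107 bp relative to ATG (+1), as given in the text

# Assumed 5' ends of the eight constructs, inferred from the region
# boundaries reported in the Results (shortest to longest).
FIVE_PRIME_ENDS = [-83, -310, -560, -768, -987, -1203, -1369, -1626]


def fragment_length(start, end):
    """Length in bp of a fragment from `start` to `end` (ATG, +1).

    Assumes there is no position 0, so a fragment spanning
    -2..+2 covers -2, -1, +1, +2 and is 4 bp long.
    """
    if start < 0 < end:
        return end - start
    return end - start + 1


def deleted_intervals(ends):
    """Pair adjacent 5' ends; each pair is the region one deletion step removes."""
    return list(zip(ends, ends[1:]))


for start in FIVE_PRIME_ENDS:
    print(start, fragment_length(start, COMMON_3PRIME_END))
print(deleted_intervals(FIVE_PRIME_ENDS))
```

Eight constructs give seven adjacent intervals, which matches the seven regions whose effects are reported.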

Cell culture and transfection
Human embryonic kidney (HEK-293), glioma (U87), and human neuroblastoma (SH-SY5Y) cell lines were obtained from the cell bank of the Chinese Academy of Sciences (Shanghai) and cultured in a humidified environment at 37 °C with 5% CO2. HEK-293 and U87 cells were cultured in HyClone® DMEM high glucose medium containing 10% fetal bovine serum (Thermo Fisher Scientific, Massachusetts, USA), while SH-SY5Y cells were cultured in HyClone® DMEM/F-12 mixed medium containing 15% fetal bovine serum. When the cell density reached 90% or higher, the transfection experiment was carried out. The cells were seeded in 24-well plates at 2 × 10^5 cells per well. Following the manufacturer's protocol (Invitrogen, California, USA), the pGL3 recombinant plasmids containing the 4 haplotypes and 8 truncated fragments were each co-transfected with the Renilla luciferase expression vector pRL-TK (Promega) using Lipofectamine® 3000 reagent. After 24 hours of culture, the cells were collected. Finally, firefly luciferase activity (LUC) and Renilla luciferase activity (TK) were measured. In each experiment, every recombinant vector was assayed in triplicate, and the experiment was repeated three times.

Statistical analysis
The relative fluorescence intensity was calculated as LUC/TK. Comparisons between the different recombinants were performed using SPSS 19.0 software (IBM, Armonk, NY, USA). The relative fluorescence intensity was expressed as mean ± SD (standard deviation), and p < 0.05 (two-sided) was considered a statistically significant difference.
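The normalization described above can be sketched as follows. This is only an illustration: the readings are hypothetical placeholders, not data from the study, the helper names are assumptions, and the use of the sample standard deviation (n - 1 denominator) is also an assumption, since the text does not say which form of SD was reported.

```python
from statistics import mean, stdev


def relative_activity(luc, tk):
    """Firefly signal (LUC) normalized to the co-transfected Renilla control (TK)."""
    if tk <= 0:
        raise ValueError("Renilla (TK) reading must be positive")
    return luc / tk


def summarize(wells):
    """Mean and sample SD of LUC/TK over replicate wells.

    `wells` is a list of (LUC, TK) reading pairs, one per well.
    """
    ratios = [relative_activity(luc, tk) for luc, tk in wells]
    return mean(ratios), stdev(ratios)


# Hypothetical triplicate readings for one construct (not data from the study).
example_wells = [(1000.0, 400.0), (1500.0, 500.0), (1050.0, 300.0)]
m, sd = summarize(example_wells)
print(f"{m:.2f} ± {sd:.2f}")
```

Dividing by the Renilla signal in each well corrects for differences in transfection efficiency and cell number before replicates are averaged.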

Transcription factor prediction
The functional fragments that markedly affected reporter expression were analyzed bioinformatically. Transcription factor prediction for these fragments was carried out using JASPAR (http://jaspar.genereg.net) (Fornes et al. 2020). Transcription factors with a match score greater than 80% were retained.
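To make the 80% cutoff concrete, the sketch below scans a sequence with a position weight matrix and keeps windows whose relative score exceeds 0.8. It is an illustration under stated assumptions, not JASPAR's implementation: the matrix is a toy example with made-up values, the relative score is taken as (score - minimum) / (maximum - minimum), which is one common definition, and only the forward strand is scanned.

```python
# Toy log-odds matrix for a 4-bp motif (hypothetical values, not a JASPAR profile).
# One dict per motif position, mapping base -> score.
TOY_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def relative_score(window, pwm):
    """Scale the raw matrix score of `window` to the range 0..1."""
    raw = sum(col[base] for col, base in zip(pwm, window))
    lowest = sum(min(col.values()) for col in pwm)
    highest = sum(max(col.values()) for col in pwm)
    return (raw - lowest) / (highest - lowest)


def scan(seq, pwm, threshold=0.8):
    """Return (offset, window, score) for forward-strand windows above threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        score = relative_score(window, pwm)
        if score > threshold:
            hits.append((i, window, score))
    return hits


print(scan("TTACGTAA", TOY_PWM))
```

A real analysis would also scan the reverse complement and use the profiles distributed with the database.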

Construction of luciferase reporter gene vectors in the 5' proximal promoter region of ADH7 gene
The ADH7 gene fragment from -649 bp to +46 bp (ATG, +1) contained three SNPs (Fig. 1). Haplotypes in the 5' proximal promoter region were identified by cloning and sequencing. These sequences included four different haplotypes (5-1, 5-2, 5-3, and 5-4) (Table 2). In the previous study, we found that haplotype 5-1 was the major haplotype in the northern Han Chinese population. Moreover, eight ADH7 gene sequences of different lengths were successfully cloned into the pGL3 Basic vector (Fig. 2). Sequencing showed that the inserted fragments were consistent with the designed sequences.

Analysis of relative fluorescence intensity of haplotypes in three cell lines
The luciferase reporter gene assay was carried out in the three cell lines HEK-293, U87, and SH-SY5Y, and the results indicated that the 3 SNPs in the 5' proximal promoter region of the ADH7 gene were involved in ADH7 expression. In the HEK-293, U87, and SH-SY5Y cell lines, the relative fluorescence intensity of haplotype 5-1 (T-C-T) was significantly higher than that of haplotypes 5-2 (T-T-T), 5-3 (C-T-T), and 5-4 (T-C-A). In the HEK-293 cell line, the mean LUC/TK of haplotype 5-2 was the lowest of the four haplotypes. There were significant differences between haplotypes 5-1 and 5-2 and between haplotypes 5-2 and 5-3. In the U87 cell line, the mean LUC/TK of haplotype 5-1 was the highest and differed significantly from that of the other haplotypes. Haplotype 5-2 was significantly higher than haplotypes 5-3 and 5-4. In the SH-SY5Y cell line, haplotypes 5-2 and 5-4 were significantly lower than haplotype 5-1. In terms of relative fluorescence intensity, the overall luciferase activity was higher in HEK-293 cells than in the other two cell lines (Fig. 3).

Pairwise comparative analysis of relative fluorescence intensity of eight recombinant vectors in three cell lines
In this experiment, 8 recombinant vectors with different fragment lengths were transfected into three cell lines to assess the influence of the regulatory regions of the ADH7 gene on the expression of ADH7 protein.
We found that the regions from -83 to -310 bp (ATG, +1), -560 to -768 bp, and -987 to -1203 bp up-regulated gene expression. In contrast, the region from -768 to -987 bp down-regulated gene expression. The regions from -1203 to -1369 bp and -1369 to -1626 bp down-regulated gene expression in the U87 and SH-SY5Y cell lines, but the trend was opposite in the HEK-293 cell line. The region from -310 to -560 bp up-regulated gene expression in the SH-SY5Y cell line, but down-regulated gene expression in the HEK-293 and U87 cell lines (Fig. 4).
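The logic behind assigning an effect to each region is a comparison of adjacent constructs in the deletion series: if the construct that includes a region is more active than the next shorter one, the region is read as up-regulating, and the reverse as down-regulating. The sketch below illustrates that reading only. The function name and the activity values are hypothetical placeholders, not measurements from the study, and the sketch ignores statistical significance, which the actual comparison relies on.

```python
def region_effects(activity_by_5prime_end):
    """Classify each deleted region by comparing adjacent constructs.

    `activity_by_5prime_end` maps a construct's 5' end (e.g. -310) to its
    mean LUC/TK. A region is labelled "up" if the longer construct is more
    active than the next shorter one, "down" if it is less active.
    """
    ends = sorted(activity_by_5prime_end, reverse=True)  # shortest first: -83, -310, ...
    effects = {}
    for shorter, longer in zip(ends, ends[1:]):
        a_short = activity_by_5prime_end[shorter]
        a_long = activity_by_5prime_end[longer]
        if a_long > a_short:
            direction = "up"
        elif a_long < a_short:
            direction = "down"
        else:
            direction = "none"
        effects[(shorter, longer)] = direction
    return effects


# Placeholder values for illustration only (not measurements from the study).
demo = {-83: 1.0, -310: 2.0, -560: 1.5}
print(region_effects(demo))
```

Running the same comparison separately for each cell line is what allows a region to be called up-regulating in one line and down-regulating in another.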

Discussion
To elucidate the characteristics of the ADH7 gene promoter, we confirmed four haplotypes and eight fragments with different sequence lengths by cloning and sequencing the constructed vectors.
Haplotype 5-1 differed significantly from haplotype 5-2. A possible reason is that the two haplotypes carry different bases at the rs2851028 SNP. We also found that luciferase expression with the C allele of rs2851028 was significantly higher than with the T allele. Moreover, our results were consistent with the previous study, in which the promoter activity of the C allele in the CP-A and HepG2 cell lines was 1.6 to 2 times that of the T allele (Jairam and Edenberg 2014). Comparing haplotype 5-4 with haplotype 5-1, luciferase expression with the A allele of rs17537595 was significantly lower than with the T allele. Therefore, we speculated that the A allele of rs17537595 may inhibit transcriptional activity. Transcription factors are an important group of proteins that regulate gene expression at the transcriptional level by binding to specific DNA sequences. Through JASPAR analysis of the functional regions, we found transcription factors predicted to bind at rs2851028, such as the vitamin D receptor (VDR). VDR was also the transcription factor predicted to bind at rs17537595. Therefore, it is plausible that VDR could inhibit the transcriptional activity of the ADH7 gene by binding to the A allele of rs2851028 and rs17537595. VDR is a member of the nuclear receptor family; it controls transcriptional responses and regulates micro-RNA directed post-transcriptional mechanisms for the initiation of an effective immune response. It has been implicated in the regulation of several vital cellular processes, including apoptosis, cell migration, and calcium homeostasis (Costa et al. 2019; Matana et al. 2018). In the U87 cell line, the relative fluorescence intensity of haplotype 5-2 was significantly higher than that of haplotype 5-3, which might be due to increased specific expression of the rs2654847 T allele in human glioma cell lines. Thus, the specific regulatory role of a gene polymorphism is complex and depends on both the DNA sequence and the cell type.
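The pairwise comparisons in the preceding paragraph each attribute a difference in activity to the single SNP at which two haplotypes differ. The sketch below makes that bookkeeping explicit. The haplotype strings come from the text, but the order of the SNPs within a haplotype is an assumption: it is not stated explicitly, and the order used here is the one consistent with the comparisons made in the Discussion. The function name is illustrative.

```python
# ASSUMPTION: position order of the SNPs within each haplotype string,
# chosen because it is consistent with the comparisons in the Discussion.
SNP_ORDER = ("rs2654847", "rs2851028", "rs17537595")

# Haplotypes as named in the text.
HAPLOTYPES = {
    "5-1": "T-C-T",
    "5-2": "T-T-T",
    "5-3": "C-T-T",
    "5-4": "T-C-A",
}


def differing_snps(hap_a, hap_b):
    """Return {snp: (allele_in_a, allele_in_b)} for positions where two haplotypes differ."""
    alleles_a = HAPLOTYPES[hap_a].split("-")
    alleles_b = HAPLOTYPES[hap_b].split("-")
    return {
        snp: (a, b)
        for snp, a, b in zip(SNP_ORDER, alleles_a, alleles_b)
        if a != b
    }


for pair in [("5-1", "5-2"), ("5-1", "5-4"), ("5-2", "5-3")]:
    print(pair, differing_snps(*pair))
```

Each of the three pairs discussed differs at exactly one position, which is why a difference in reporter activity between them can be read as an allele effect at that SNP.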
Eukaryotic promoters contain various sites with different affinities for diverse transcription factors. Transcription factors bind to specific DNA sequences in the promoter regions of genes to activate or repress transcription of multiple target genes (Echigoya et al. 2020; Kim and O'Shea 2008). We performed additional detailed truncation analysis of the promoter region in the HEK-293, U87, and SH-SY5Y cell lines. We found that the regions from -83 to -310 bp (ATG, +1), -560 to -768 bp, and -987 to -1203 bp up-regulated gene expression. In contrast, the region from -768 to -987 bp down-regulated gene expression. There may be enhancer elements in the former regions and a repressive regulatory element in the latter. Through sequence analysis using JASPAR, binding sites for many transcription factors, such as ZNF384, ZEB1, and VSX1/X2, were predicted in the regions from -83 to -310 bp (ATG, +1), -560 to -768 bp, and -987 to -1203 bp. ZEB1 is at the center of the signaling pathway connecting the environmental risk factors hypoxia and blood-brain barrier destruction, and has become a target related to the pathogenesis of schizophrenia (Leduc-Galindo et al. 2019; Najjar et al. 2017). Transcription factors such as YY1, VDR, and ZNF354C were also predicted in the -768 to -987 bp region. ZNF354C acts as a transcriptional repressor through its KRAB domain, in a manner dependent on specific amino acids. Overexpression of ZNF354C leads to transcriptional inhibition of many genes (Oo et al. 2020). Expression differed among the three cell lines for the regions from -310 to -560 bp, -1203 to -1369 bp, and -1369 to -1626 bp. This heterogeneity might be due to differences in cell type.
In this study, uncertainty remains regarding the regulatory factors and functional elements that actually play a role in each region. Although transcription factors have been predicted in some regions, further experiments are needed to verify whether these transcription factors affect protein expression. Moreover, only the luciferase protein, and not the mRNA, was detected during the final phase of the study, making it difficult to determine whether the level of mRNA was consistent with protein expression.

Conclusions
ADH7 5' regulatory region polymorphisms have regulatory effects on gene expression. The regions from -83 to -310 bp (ATG, +1), -560 to -768 bp, -768 to -987 bp, and -987 to -1203 bp contain functional sequences that can promote or suppress ADH7 gene expression. Therefore, it is necessary to carry out further experiments to explore the structure and function of the ADH7 gene.

Funding Information
This work was supported by the National Natural Science Foundation of China (No. 81601653). The funding was held and provided by Jun Yao. The funding body was involved in the design of the study and the writing of the manuscript.