Study population
In this study, 404 patients with pathologic stage I, II, and IIIA (micro-invasive N2) lung adenocarcinoma were enrolled. One hundred sixty-six patients from the Kyungpook National University Chilgok Hospital, Daegu, Korea, and 238 patients from the Seoul National University Bundang Hospital, Bundang, Korea, underwent surgery with curative intent at their respective hospitals. All patients in this study were of Korean ethnicity. Written informed consent was obtained from all patients before surgery. This study was approved by the Institutional Review Boards of the Kyungpook National University Chilgok Hospital and Seoul National University Bundang Hospital (Approval No. KNUCH 2017-07-012). All experiments were performed in accordance with relevant guidelines and regulations.
Cell culture and antibodies
H2087 cells were obtained from the American Type Culture Collection (ATCC), and H1703 cells were purchased from the Korean Cell Line Bank (KCLB), Seoul, Korea. Antibodies used in this study included anti-Histone H3 antibodies (ab8580, ab4441, ab4729, and ab8895) and anti-GLRX3 antibody (ab226396) from Abcam (Cambridge, UK), anti-YY1 antibody (46395) from Cell Signaling Technology (Danvers, MA, USA), and anti-TFAP4 antibody (sc-166216X) from Santa Cruz Biotechnology (Dallas, TX, USA). Cells were cultured at 37 °C in a humidified atmosphere with 5% CO2 in Corning® RPMI medium (Corning Inc., Corning, NY, USA) supplemented with 10% Corning® Fetal Bovine Serum (Corning Inc., Corning, NY, USA), 100 U/ml penicillin, and 100 mg/ml streptomycin.
Chromatin immunoprecipitation (ChIP)-sequencing
ChIP assays were performed using the Pierce™ Magnetic ChIP kit (Thermo Fisher Scientific, Waltham, MA, USA), according to the manufacturer’s protocol. H2087 cells were crosslinked with 1% formaldehyde for 10 min, and the crosslinking was quenched with 0.125 M glycine for 5 min at room temperature. Cells were washed twice with cold 1× PBS. The cells were then lysed and sonicated to shear DNA. To immunoprecipitate protein/chromatin complexes, the diluted supernatants were incubated with 10 µg of H3K4me3 or H3K27ac antibody overnight, and then incubated for 2 hours after adding 50 µl of agarose/protein A or G beads. Ten percent of the diluted supernatants were saved as “input” for normalization. Several washing steps were followed by protein digestion using proteinase K. Reverse crosslinking was carried out at 65 °C, and DNA was subsequently purified. ChIP-seq library preparation was performed with the TruSeq® ChIP Library Preparation Kit (Illumina, San Diego, CA, USA). Sequencing was performed on an Illumina HiSeq4000. Sequence reads for each sample were aligned to the human genome using Bowtie (27). The reference genome sequence of Homo sapiens (hg19) and annotation data were downloaded from the UCSC table browser (http://genome.ucsc.edu). Peaks were called in the aligned sequence data using model-based analysis of ChIP-seq (MACS2 v2.1.1) (28). ChIPseeker (version 1.6.6) (29), a Bioconductor package for the R statistical programming environment that facilitates batch annotation of enriched peaks identified from ChIP-seq data, was used to identify genes and transcripts near the peaks obtained from MACS2.
RNA-sequencing
Total RNA from H2087 cells was isolated using TRIzol® (Invitrogen, Carlsbad, CA, USA). Sequencing was performed on an Illumina HiSeq4000, and the processed reads were aligned to the Homo sapiens reference genome (hg19) using HISAT v2.0.5 (30). Transcript assembly and abundance estimation were performed using StringTie v1.3.3b (31, 32). StringTie reports relative abundance estimates as FPKM values (Fragments Per Kilobase of exon per Million fragments mapped) for each transcript and gene expressed in each sample. FPKM values are already normalized with respect to library size.
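The FPKM definition given above can be written out as a short calculation. The following is a minimal sketch of the definition only, not of StringTie's internal abundance estimation, which works from assembled transcript coverage; the function name and the example numbers are hypothetical.

```python
def fpkm(fragments: float, length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments Per Kilobase of exon per Million fragments mapped.

    fragments: fragments assigned to the transcript or gene
    length_bp: exonic length of the transcript or gene, in base pairs
    total_mapped_fragments: all mapped fragments in the library
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Dividing by (length / 1e3) and (library size / 1e6) gives the 1e9 factor.
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2 kb transcript, 20 million mapped fragments.
example_fpkm = fpkm(fragments=500, length_bp=2_000, total_mapped_fragments=20_000_000)
print(example_fpkm)
```

Because the library size appears in the denominator, values computed this way are comparable across samples sequenced to different depths, which is the sense in which FPKM is normalized for library size.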
SNP selection and genotyping
As a result of the integrated analysis of ChIP-seq and RNA-seq data from H2087 cells, 66,713 SNPs within regions around genes with H3K4me3 and H3K27ac peaks (FPKM ≥ 100) were selected. Using the FuncPred utility for functional SNP prediction in the SNPinfo web server (https://snpinfo.niehs.nih.gov/), a total of 279 potentially functional variants with minor allele frequency ≥ 0.1 based on the HapMap JPT data were collected after excluding those in linkage disequilibrium (r² ≥ 0.8). Genotyping was performed using the iPLEX® Assay and MassARRAY® System (Agena Bioscience, San Diego, CA, USA). Approximately 5% of the samples were randomly selected and re-genotyped by a different investigator using a restriction fragment length polymorphism assay, and the results were 100% concordant.
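The two selection criteria (minor allele frequency ≥ 0.1 and exclusion of SNPs in linkage disequilibrium at r² ≥ 0.8) can be illustrated with a small greedy filter. This is only a sketch under stated assumptions: the selection in this study was done through the SNPinfo web server, whose exact pruning procedure is not described here, and the function name, the processing order (highest MAF first), and the SNP identifiers and values below are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def select_tag_snps(
    maf: Dict[str, float],
    r2: Dict[FrozenSet[str], float],
    maf_min: float = 0.1,
    r2_max: float = 0.8,
) -> List[str]:
    """Keep SNPs with MAF >= maf_min that are not in LD (r2 >= r2_max) with a kept SNP."""
    kept: List[str] = []
    # Assumption: visit SNPs from highest to lowest MAF (ties broken by name).
    for snp in sorted(maf, key=lambda s: (-maf[s], s)):
        if maf[snp] < maf_min:
            continue  # too rare
        # Pairs missing from the r2 table are treated as unlinked (r2 = 0).
        if any(r2.get(frozenset((snp, k)), 0.0) >= r2_max for k in kept):
            continue  # redundant with an already selected SNP
        kept.append(snp)
    return kept


# Hypothetical identifiers and values, for illustration only.
maf_table = {"rsA": 0.25, "rsB": 0.22, "rsC": 0.05, "rsD": 0.15}
r2_table = {frozenset(("rsA", "rsB")): 0.9, frozenset(("rsA", "rsD")): 0.3}
print(select_tag_snps(maf_table, r2_table))
```

In this toy input, rsC is dropped for low frequency and rsB is dropped because it is in strong LD with rsA, leaving rsA and rsD.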
Promoter-luciferase constructs and luciferase assay
We evaluated the effect of rs17583C > T or rs4751162A > G on the promoter activity of the CAPN1 or GLRX3 gene, respectively, by luciferase reporter assay. The rs17583C > T of the CAPN1 gene is located within an H3K4me3 peak, which marks active promoters, in the CAPN1 gene promoter. The 378-bp fragment including rs17583C > T was amplified by polymerase chain reaction from human genomic DNA and cloned into the XhoI/HindIII site of the pGL3-basic vector (Promega, Madison, WI, USA). The correct sequences of all clones were verified by DNA sequencing. The rs4751162A > G is located within an H3K27ac peak, which is an activation mark of enhancers, in an intron of the LINC00959 gene. The SNP is expected to regulate expression of the GLRX3 gene because it resides 26 kb downstream of GLRX3; both are on chromosome 10. The promoter region of GLRX3 (-980 to +38 bp; the transcriptional start site is designated as +1) was amplified by polymerase chain reaction from human genomic DNA and cloned into the XhoI/NcoI site of the pGL3-promoter vector (Promega, Madison, WI, USA) to generate pGL3-GLRX3pro. Two fragments carrying the rs4751162A or rs4751162G allele were amplified from genomic DNA samples, and the 283-bp products were each cloned into the BamHI/SalI site of pGL3-GLRX3pro to generate pGL3-GLRX3pro_A and pGL3-GLRX3pro_G, respectively. The cloning PCR primers are listed in Supplementary Table 1. All constructs were verified by direct sequencing before use. H1703 cells were transfected with 200 ng of each plasmid DNA (pGL3-CAPN1_C and pGL3-CAPN1_T for rs17583, and pGL3-GLRX3pro, pGL3-GLRX3pro_A, or pGL3-GLRX3pro_G for rs4751162) and 2 ng of pRL-SV40 Vector (Promega, Madison, WI, USA) using Effectene® transfection reagent (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. The cells were collected 48 h after transfection. Luciferase activity was measured using the Dual-Luciferase® Reporter Assay System (Promega, Madison, WI, USA). Firefly luciferase activity measurements were normalized with respect to pRL-SV40 Renilla luciferase activity to correct for variations in transfection efficiency. All experiments were performed in triplicate.
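The firefly-to-Renilla normalization can be sketched as follows. The readings are hypothetical triplicate values chosen for illustration, and the function name is an assumption; the sketch shows only the ratio step, not any statistical comparison between constructs.

```python
from statistics import mean
from typing import List, Sequence


def relative_luciferase(firefly: Sequence[float], renilla: Sequence[float]) -> List[float]:
    """Per-well firefly/Renilla ratios, correcting for transfection efficiency."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired per well")
    return [f / r for f, r in zip(firefly, renilla)]


# Hypothetical triplicate readings (arbitrary light units) for two constructs.
reference_ratios = relative_luciferase([1200.0, 1100.0, 1300.0], [400.0, 400.0, 400.0])
variant_ratios = relative_luciferase([2400.0, 2600.0, 2200.0], [400.0, 400.0, 400.0])

# Activity of the variant construct relative to the reference construct.
fold_change = mean(variant_ratios) / mean(reference_ratios)
print(fold_change)
```

Taking the ratio within each well before averaging means that a well with poor transfection lowers both signals together and does not distort the comparison between alleles.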
RNA preparation and quantitative reverse transcription-PCR (qRT-PCR)
CAPN1 and GLRX3 mRNA expression was examined by qRT-PCR. Total RNA from tumors and paired non-malignant lung tissues (n = 114) was isolated using TRIzol® (Invitrogen, Carlsbad, CA, USA). Real-time PCR was performed using a LightCycler® 480 (Roche Applied Science, Mannheim, Germany) with QuantiFast SYBR® Green PCR Master Mix (Qiagen, Hilden, Germany). The real-time PCR primers for the CAPN1, GLRX3, and β-actin genes are listed in Supplementary Table 1. Each sample was run in duplicate. Relative target gene mRNA expression was normalized to β-actin expression and then evaluated using the 2^(−ΔΔCt) method (33).
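For reference, the 2^(−ΔΔCt) calculation can be sketched as below. The sketch assumes roughly 100% amplification efficiency for both target and reference genes, averages duplicate Ct values before subtraction, and uses paired non-malignant tissue as the calibrator; the function name and all Ct values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    ct_target_sample: Sequence[float],
    ct_reference_sample: Sequence[float],
    ct_target_calibrator: Sequence[float],
    ct_reference_calibrator: Sequence[float],
) -> float:
    """Relative expression by the 2^(-ddCt) method from replicate Ct values."""
    # dCt: target gene Ct minus reference gene (e.g. beta-actin) Ct.
    delta_sample = mean(ct_target_sample) - mean(ct_reference_sample)
    delta_calibrator = mean(ct_target_calibrator) - mean(ct_reference_calibrator)
    # ddCt: sample dCt minus calibrator dCt.
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical duplicate Ct values for a tumor and its paired non-malignant tissue.
fold = relative_expression(
    ct_target_sample=[24.1, 23.9],
    ct_reference_sample=[18.0, 18.0],
    ct_target_calibrator=[26.0, 26.0],
    ct_reference_calibrator=[18.1, 17.9],
)
print(fold)
```

A value above 1 indicates higher normalized expression in the tumor than in the paired non-malignant tissue, and a value below 1 indicates lower expression.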
ChIP-quantitative PCR (qPCR) assay
Chromatin from H2087 cells was immunoprecipitated with the Pierce™ Magnetic ChIP kit (Thermo Fisher Scientific, Waltham, MA, USA), using 10 µg of anti-H3K4me3, anti-H3K9ac, anti-H3K27ac, anti-H3K4me1, anti-YY1, or anti-TFAP4 antibody, or 2 µg of normal rabbit IgG, per immunoprecipitation reaction. Immunoprecipitated chromatin was subjected to real-time qPCR using SYBR® Green PCR Master Mix (Qiagen, Hilden, Germany). The qPCR was performed as follows: 95 °C for 10 min, followed by 45 cycles of 95 °C for 15 s and 60 °C for 1 min. ChIP-qPCR enrichment analysis was performed by the comparative Ct method. Each sample was normalized to the input, and the fold difference between sample and IgG was calculated using 2^(−ΔΔCt) (34).
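The enrichment calculation described here can be sketched as follows. The input fraction of 10% is an assumption carried over from the ChIP-seq protocol and is not stated for the qPCR assay; the function names and Ct values are hypothetical.

```python
import math


def adjusted_input_ct(ct_input: float, input_fraction: float) -> float:
    """Shift the input Ct to what 100% of the chromatin would have given."""
    if not 0.0 < input_fraction <= 1.0:
        raise ValueError("input_fraction must be in (0, 1]")
    return ct_input - math.log2(1.0 / input_fraction)


def percent_input(ct_ip: float, ct_input: float, input_fraction: float = 0.10) -> float:
    """Immunoprecipitated signal as a percentage of total input chromatin."""
    return 100.0 * 2.0 ** (adjusted_input_ct(ct_input, input_fraction) - ct_ip)


def fold_over_igg(ct_ip: float, ct_igg: float, ct_input: float,
                  input_fraction: float = 0.10) -> float:
    """Fold enrichment of the specific antibody over IgG by 2^(-ddCt)."""
    reference = adjusted_input_ct(ct_input, input_fraction)
    delta_ip = ct_ip - reference      # dCt of the specific antibody
    delta_igg = ct_igg - reference    # dCt of the IgG control
    return 2.0 ** (-(delta_ip - delta_igg))


# Hypothetical Ct values at one amplicon.
print(fold_over_igg(ct_ip=25.0, ct_igg=30.0, ct_input=24.0))
print(percent_input(ct_ip=25.0, ct_input=24.0))
```

When the antibody and IgG reactions share the same input, the input term cancels in the fold-over-IgG ratio; normalizing to input matters when comparing percent-input values across chromatin preparations.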
Enriched motif analysis
We identified transcription factor binding motifs enriched in the regions containing rs17583C > T or rs4751162A > G. Motifs were analyzed using TOMTOM, a motif database scanning algorithm in the MEME Suite web server (17), for comparison against the Human and Mouse (35) and SwissRegulon (36) databases of known transcription factor motifs.
Statistical analyses
Hardy-Weinberg equilibrium was evaluated by a goodness-of-fit χ² test with 1 degree of freedom. Overall survival (OS) was measured from the date of surgery to the date of death or the last follow-up. Disease-free survival (DFS) was calculated from the date of surgery until the first evidence of disease recurrence or, for patients who were free of disease, the last date of follow-up. Survival estimates were calculated using the Kaplan-Meier method. The log-rank test was used to compare OS and DFS across genotypes. Multivariate Cox proportional hazards models were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) after adjusting for age (< 64 years vs ≥ 64 years), gender (male vs female), smoking status (never vs ever), pathological stage (I vs II-IIIA), and adjuvant therapy (yes vs no). All analyses were carried out using Statistical Analysis System for Windows, version 9.4 (SAS Institute, Cary, NC, USA).
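As an illustration of the Hardy-Weinberg goodness-of-fit test with 1 degree of freedom, a minimal sketch for a biallelic SNP is given below. The analyses in this study were run in SAS; this Python version, including the function name and genotype counts, is hypothetical and only shows the arithmetic.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square statistic and p-value (1 df) for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (AA, AB, BB).
chi2_stat, p_val = hwe_chi_square(120, 160, 120)
print(chi2_stat, p_val)
```

The test has 1 degree of freedom because there are three genotype classes, one constraint from the fixed sample size, and one allele frequency estimated from the data.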