Cell culture
H358 cells were maintained in RPMI 1640 (Gibco), and A549 cells were cultured in DMEM/F12 (Gibco); both media were supplemented with 10% newborn bovine serum (HyClone) and 1% penicillin/streptomycin/gentamicin. 293T cells were maintained in DMEM (Gibco) supplemented with 10% fetal bovine serum (HyClone) and 1% penicillin/streptomycin/gentamicin. All cells were incubated at 37 °C in a humidified chamber containing 5% CO2. All cell lines tested negative for mycoplasma contamination.
Plasmids and lentivirus transfection
The LSH lentiviral construct was generated by inserting the LSH cDNA into the plvx-EF1a-puro vector (Clontech, Mountain View, CA), and the empty vector was used as a negative control. All plasmid vectors were verified by sequencing. Plasmid transfection was performed using LipoMax (Sudgen Biotech, Bellevue, WA, USA) in accordance with the manufacturer’s protocol. Cell colonies were selected using puromycin (1 μg/ml). Overexpression of LSH was confirmed by western blot.
Western blot analysis
Cells were harvested, washed twice with ice-cold phosphate-buffered saline (PBS), lysed in lysis buffer [10 mM Tris–HCl, pH 8.0, 1 mM EDTA, 1% SDS, 5 mM dithiothreitol, 10 mM phenylmethylsulfonyl fluoride, 1 mM Na3VO4, 1 mM NaF, 10% (vol/vol) glycerol, protease inhibitor cocktail tablet (Roche)], sonicated and centrifuged at 15,000 × g for 10 min. The supernatants were collected as whole-cell lysates, and 50 μg of total protein was used for western blot analysis. The primary antibody against β-actin was purchased from Sigma-Aldrich (A5441), and the primary antibody against LSH was purchased from Santa Cruz Biotechnology (sc-46665).
DNA extraction and bisulfite treatment
The cells were collected, and genomic DNA was extracted using a genomic DNA kit (Sangon Biotech, Cat. #B518201–0100); bisulfite conversion was then performed using the EZ DNA Methylation Direct kit (Zymo Research Corporation, Cat. #D5020). The concentration and quantity of DNA were measured with a NanoDrop instrument (NanoDrop Technologies, Wilmington, DE, USA). All procedures followed the manufacturers’ instructions.
RRBS library preparation and data analysis
The RRBS libraries were prepared according to previously published protocols [61]. Briefly, genomic DNA was digested with the MspI enzyme, followed by end-repair and ligation of sequencing adaptors. The fragments were then size-selected (40–220 bp) and bisulfite-converted prior to a PCR amplification step. The quality of the libraries was checked using a bioanalyzer, and two libraries were sequenced on an Illumina HiSeq X Ten machine (100 bp, single-end run). The signals produced by the sequencer were converted into base sequences by base calling, yielding the raw data (raw reads). To ensure the quality of the downstream analysis, the raw reads were filtered by removing reads that contained adapter sequences, reads with more than 10% N content and reads with more than 50% low-quality bases. The filtered data were regarded as clean reads.
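The three filtering criteria can be illustrated with a minimal Python sketch. This is not the pipeline used in this study; the function names, the adapter string passed in, and the Phred cutoff used to call a base "low-quality" (`lowq_phred`) are assumptions introduced for illustration, since the text does not specify them.

```python
def keep_read(seq, quals, adapter, max_n_frac=0.10, max_lowq_frac=0.50, lowq_phred=5):
    """Return True if a read passes the three filters described in the text.

    seq        : read sequence as a string
    quals      : list of Phred quality scores, one per base
    adapter    : adapter sequence to screen for (hypothetical input)
    lowq_phred : bases with quality <= this value count as low-quality
                 (assumed cutoff; not stated in the text)
    """
    if not seq or len(seq) != len(quals):
        return False
    # 1. Discard reads containing adapter sequence.
    if adapter and adapter in seq:
        return False
    # 2. Discard reads with more than 10% N bases.
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # 3. Discard reads in which more than 50% of bases are low-quality.
    low = sum(1 for q in quals if q <= lowq_phred)
    if low / len(seq) > max_lowq_frac:
        return False
    return True


def filter_reads(reads, adapter):
    """Keep only the (seq, quals) pairs that pass all filters (clean reads)."""
    return [(s, q) for s, q in reads if keep_read(s, q, adapter)]
```

In practice a dedicated trimming tool would also handle partial adapter matches, which this exact-substring check does not attempt.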
Mapping reads to known genome
Sequencing reads must be aligned to a reference genome before methylation analysis. Bismark software was used with default parameters to align the bisulfite-treated reads to the reference genome. Reads that aligned with the same region of the genome were regarded as duplicates. The number of duplicates was used to summarize the sequencing depth and coverage. The bisulfite conversion rate was calculated with Bismark from the proportion of methylated clean reads among all clean reads mapped to the lambda genome. Unmethylated cytosines are read as T after bisulfite treatment and PCR amplification, whereas methylated cytosines remain unchanged. By comparing the clean reads with the reference genome, Bismark extracted information on genomic cytosine sites, yielding cytosine site coverage statistics and the number of methylated cytosine reads of each type (CG, i.e. CpG, CHG and CHH). Because Bismark does not itself determine whether an individual C site is methylated, we applied a binomial distribution test to each C site and called a site methylated when its coverage was ≥4× and the false discovery rate (FDR) was < 0.05.
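A per-site binomial test with FDR control can be sketched as follows. This is an illustrative sketch, not the code used in this study. It assumes that the null probability (`error_rate`) is the non-conversion rate estimated from the lambda genome and that the FDR is controlled with the Benjamini-Hochberg procedure; neither detail is specified in the text, and all function names are hypothetical.

```python
from math import comb


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_methylated_sites(sites, error_rate, min_cov=4, fdr=0.05):
    """Call each C site methylated or not.

    sites      : list of (methylated_reads, unmethylated_reads) per C site
    error_rate : probability of observing a methylated read at a truly
                 unmethylated site (assumed: the non-conversion rate)
    """
    # Only sites with coverage >= min_cov are tested.
    covered = [(i, m, u) for i, (m, u) in enumerate(sites) if m + u >= min_cov]
    pvals = [binom_sf(m, m + u, error_rate) for _, m, u in covered]
    qvals = bh_adjust(pvals)
    calls = [False] * len(sites)
    for (i, _, _), q in zip(covered, qvals):
        calls[i] = q < fdr
    return calls
```

For example, `call_methylated_sites([(10, 0), (0, 10), (3, 0), (1, 9)], error_rate=0.01)` tests three sites (the third has only 3× coverage and is skipped) and calls only the first one methylated.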
Estimating methylation levels and the identification of DMRs
All cytosine sites with read coverage >10× were used for DMR analysis with MOABS [62]. First, to detect the methylated C sites in a region, the following quantities were defined: the number of methylated reads at a single C site, the number of unmethylated reads at a single C site, the position of the C site, and the total number of C positions. The methylation level of a C site was calculated as follows [63]:
[Due to technical limitations, the formula could not be displayed here. Please see the supplementary files section to access the formula.]
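As a reading aid, the site-level definition can be written out in the conventional form of a methylated-read fraction. The symbols are assumptions introduced here, because the original notation is not reproduced in this text: $m_i$ is the number of methylated reads at C position $i$, $u_i$ is the number of unmethylated reads at that position, and $n$ is the total number of C positions. The supplementary formula remains the authoritative version.

```latex
\[
  \mathrm{ML}_i \;=\; \frac{m_i}{m_i + u_i}, \qquad i = 1, \dots, n
\]
```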
The binomial distribution test was used to determine whether each C site was methylated. Subsequently, DMRs were defined as regions containing at least three different methylation sites in which the difference in methylation level was greater than 0.2 (0.3 for the CG type), with a p value from Fisher’s exact test of less than 0.05. The detailed DMRs are listed in supplementary file Tables 1 to 3. The methylation level of a region was calculated as follows [63].
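The DMR criteria can be sketched in Python under stated assumptions. This is not MOABS and not the code used in this study. The sketch assumes that the region-level methylation is the pooled fraction of methylated reads over all C sites in the region, that "at least three sites" refers to the number of C sites covered in the region in each sample, and that Fisher’s exact test is applied to the pooled methylated and unmethylated read counts of the two samples. All function names are hypothetical.

```python
from math import comb


def region_level(sites):
    """Pooled methylation level of a region (assumed definition).

    sites: list of (methylated_reads, unmethylated_reads), one per C site.
    """
    meth = sum(m for m, _ in sites)
    total = sum(m + u for m, u in sites)
    return meth / total if total else 0.0


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of a table whose top-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def is_dmr(sites_a, sites_b, context="CG", min_sites=3, alpha=0.05):
    """Apply the three DMR criteria to one region in two samples."""
    if len(sites_a) < min_sites or len(sites_b) < min_sites:
        return False
    min_diff = 0.3 if context == "CG" else 0.2
    if abs(region_level(sites_a) - region_level(sites_b)) <= min_diff:
        return False
    ma, ua = sum(m for m, _ in sites_a), sum(u for _, u in sites_a)
    mb, ub = sum(m for m, _ in sites_b), sum(u for _, u in sites_b)
    return fisher_exact_two_sided(ma, ua, mb, ub) < alpha
```

Pooling read counts weights each site by its coverage; averaging per-site levels instead would weight all sites equally and can give a different region-level value.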
Bioinformatics analysis of DMGs
The DMGs were compared by BLAST against functional databases, including GO, COG (Cluster of Orthologous Groups of proteins) and KEGG (Kyoto Encyclopedia of Genes and Genomes), to obtain functional annotations for these genes. GO enrichment analysis was performed with the GOseq R package [64], which is based on the Wallenius noncentral hypergeometric distribution. KOBAS software was used to test the statistical enrichment of the DMGs in KEGG pathways [65].
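The idea behind an enrichment test can be illustrated with the standard (central) hypergeometric test. This is a deliberately simpler technique than the Wallenius noncentral hypergeometric approach implemented in GOseq, which additionally weights genes to correct for selection bias, and it is not the procedure used in this study. The function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, dmg_count, overlap):
    """One-sided enrichment p-value P(X >= overlap).

    total_genes : number of annotated background genes
    term_genes  : background genes annotated to the term or pathway
    dmg_count   : number of DMGs among the background genes
    overlap     : DMGs annotated to the term or pathway
    """
    upper = min(term_genes, dmg_count)
    denom = comb(total_genes, dmg_count)
    # Sum the hypergeometric probabilities of overlaps at least as large.
    numer = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, dmg_count - k)
        for k in range(overlap, upper + 1)
    )
    return numer / denom
```

With 10 background genes, 5 of them in the term, and 5 DMGs that all fall in the term, the p-value is 1/252; p-values across many terms would still need multiple-testing correction.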