Construction of an HBX overexpression cell line and in vitro experiments
The HL-7702 cell line was firstly transfected with HBX lentivirus fluid. Fluorescence microscopy captured fluorescence intensity after Hygromycin B screening (Figure 1A). Then, real time PCR and western blot methods were used to detect HBX expression and found that HL-7702 cells with HBX stable-expression had been successfully constructed (all P<0.05, Figure 1B-C). We then conducted in vitro experiments and found that cellular invasion, clonal formation, proliferation and migratory capacity were reduced compared to the control group (all P<0.05, Figure 1D-G).
Determination of differential RNAs by transcriptional sequencing and enrichment analysis after HBX overexpression
Whole transcriptome sequencing identified 285 differential mRNAs, such as MEMO1, PDE10A and PSMD1, with 101 up-regulated mRNAs and 184 down-regulated mRNAs (Figure 2A), and nine differential miRNAs, such as hsa-miR-625-3p, hsa-miR-3615 and hsa-miR-1268a, with four up-regulated miRNAs and five down-regulated miRNAs (Figure 2B). Enrichment analysis of differential mRNAs in GO terms revealed angiogenesis, blood vessel morphogenesis and extracellular matrix, among others, as enhanced GO terms (Figure 2C), while enriched KEGG pathways included axon guidance, drug metabolism (CYP450 and other enzymes) and steroid hormone biosynthesis (Figure 2D). On the other hand, GO terms in miRNA targets consisted of nervous system development, single-organism process (BP, Figure 2E), intracellular, cytoplasm, organelle (CC, Figure 2F), binding, protein binding (MF, Figure 2G), and others. However, there were no significant KEGG pathways according to Q values (Figure 2H).
Whole transcriptome sequencing identified 78 differential lncRNAs, such as LINC00511, MIR924HG and LINC80, with 39 up-regulated lncRNAs and 39 down-regulated lncRNAs (Figure 3A), and 117 differential circRNAs, such as hsa_circ_0000657, hsa_circ_0000660 and hsa_circ_0007710, with 56 up-regulated circRNAs and 61 down-regulated circRNAs (Figure 3B). Enrichment analysis of co-expression targets in GO terms included wound healing, leukocyte migration, blood vessel development, and others (Figure 3C), and enriched KEGG pathways included pentose and glucuronate interconversions and extracellular matrix-receptor interaction (Figure 3D). On the other hand, GO terms of co-location targets consisted of serotonin receptor activity, response to xenobiotic stimulus and organic cation transmembrane transporter activity (Figure 3E), and KEGG pathways included steroid hormone biosynthesis and drug metabolism (CYP450 and other enzymes, Figure 3F). Meanwhile, GO terms of circRNAs included nuclear export, cellular protein localization, membrane-bounded organelle, and others, while KEGG pathways included only RNA transport (Figure 3G-H). Clustered heatmaps of mRNAs, miRNAs, lncRNAs and circRNAs are shown in Supplementary Figure 1A-D.
Validation of differential RNAs and construction of ceRNA networks associated with HBX expression
Real time PCR results demonstrated that ten differential mRNAs, including ACACA, GIPC1, ITPR3, MEMO1, PLCB1, and others, were consistently validated as showing differential expression in the HBX expression group (all P≤0.05, Supplementary Figure 2A-J). Concerning lncRNAs, all of the five lncRNAs, including LINC0080, LINC00511, MIR924HG, PTK7-215 and SLC2A3-205, were consistently validated with differential expression in the HBX expression group (all P≤0.05, Supplementary Figure 2K-O). In terms of miRNAs, eight miRNAs, such as miR-625-3P, miR-3625 and miR-1268a, were consistently validated with differential expression, except for miR-3180-3P (all P≤0.05, Figure 4A-I).
In the mRNA-miRNA-lncRNA constructed ceRNA network, mRNAs including FAM65A, PDE7A, PAK4, IP6K1, CD99L2 and DNMT3B; miRNAs including miR-625-5P, miR-3180, miR-3180-3P and miR-767-5P; and lncRNAs including AC116366.3, LNC00511, LINC00466 and LINC00670, were enrolled in the network (Figure 4J). At the same time, in the mRNA-miRNA-circRNA constructed ceRNA network, mRNAs including RPL29, RPTOR, IP6K1 and PDCD11; miRNAs including miR-1268a, miR-3180, miR-3180-3P and miR-6842-3P; and circRNAs including circ-0003726, circ-0002153 and circ-0004432, were enrolled in the network (Figure 4K).
Determination of interventional concentrations and sequencing of codon 249 and neighboring codons
Firstly, we set interventional gradients for AFB1 at 2, 1, 0.5, 0.2, 0.1 and 0.05 mmol/L. Then, the growth status of the above groups was assayed 24 h, 48 h and 72 h after intervention for 24 h (Supplementary Figure 3A-F). With the exception of AFB1 at 0.05 mmol/L for 72 h, cells did not present in a proliferative status. Therefore, we selected 0.05, 0.02, 0.01 and 0.005 mmol/L as the interventional gradients to be used in the next experiment. After 1 month of AFB1 intervention, sequencing of codon 249 and neighboring codons did not find any mutations in codons 247–250 in the HBX overexpression group (Figure 5A). In control group, however, AGG>AGA mutations were observed in codon 249 at 0.05 mmol/L, AGG>AGA mutations were found in codon 249, CCC>TCC mutations were found in codon 250 at 0.02 mmol/L, and AGG>AGA mutations were observed in codon 249 at 0.005 mmol/L (Figure 5B). In the HL-7702 group, AGG>AGT mutations were found in codon 249 at AFB1 concentrations of 0.05 and 0.02 mmol/L, while CCC>ACC mutations were observed in codon 250 at 0.01 mmol/L (Figure 5C).
After 2 months of AFB1 intervention, sequencing found that there were consistent AGG>AGC mutations in codon 249 for three groups: 0.005 mmol/L in the HBX overexpression group, 0.05 mmol/L in the control group, and 0.02 mmol/L in the HL-7702 group (Supplementary Figure 4A-C). After 3 months of AFB1 intervention, sequencing found AAC>AGC mutations in codon 247 at 0.02 mmol/L and AGG>AGC mutations in codon 249 at 0.005 mmol/L (Supplementary Figure 5A). In the control group, there were CCC>GCC mutations in codon 250 at 0.05 mmol/L (Supplementary Figure 5B). In the HL-7702 group, there were AGG>AGC mutations in codon 249 at 0.05 mmol/L, AAG>AGC mutations in codon 247 at 0.02 mmol/L, and AGG>AGA mutations in codon 249 at 0.01 mmol/L (Supplementary Figure 5C).
Determination of differential RNAs by transcriptional sequencing and enrichment analysis of differential RNAs after AFB1 intervention
Whole transcriptome sequencing identified 585 differential mRNAs, such as TRPS1, OXR1 and PEPD, with 176 up-regulated mRNAs and 409 down-regulated mRNAs (Figure 6A), and 143 differential miRNAs, such as hsa-miR-122-5p, hsa-miR-122b-3p and hsa-miR-296-3p, with 73 up-regulated miRNAs and 70 down-regulated miRNAs in AFB1 group (Figure 6C). GO terms of differential mRNAs were enriched in single-organism process, regulation of cell proliferation, cytoplasm, protein binding, and others (Figure 6B). GO terms of targets of differential miRNAs were enriched in cellular metabolic process, cell process, intracellular part, protein binding, and others (Figure 6D). Whole transcriptome sequencing identified 564 differential mRNAs, such as FERMT2, XRCC6 and PYGL, with 183 up-regulated mRNAs and 381 down-regulated mRNAs (Figure 6E), and 126 differential miRNAs, such as hsa-let-7b-5p, hsa-let-7f-1-3p and hsa-miR-122-5p, with 64 up-regulated miRNAs and 62 down-regulated miRNAs in the AFB1+HBX group (Figure 6G). GO terms of differential mRNAs were enriched in anatomic structure morphogenesis, developmental process, and others (Figure 6F). GO terms of targets of differential miRNAs were enriched in cellular process, single-organism process, intracellular part, protein binding, and others (Figure 6H). Clustered heatmaps of mRNAs, miRNAs, lncRNAs and circRNAs are shown in Supplementary Figure 1E-H.
Whole transcriptome sequencing identified 208 differential lncRNAs, such as LINC02208, AL592435.1 and MIR924HG, with 86 up-regulated lncRNAs and 122 down-regulated lncRNAs, and 58 differential circRNAs, such as has_circ_0042103, hsa_circ_0000825 and hsa_circ_0000816, with 20 up-regulated circRNAs and 38 down-regulated circRNAs in the AFB1 group (Supplementary Figure 6A, C). GO terms of targets of differential lncRNAs were enriched in mitochondrion organization, mitochondrial envelop, and others. GO terms of targets of differential circRNAs were enriched in regulation of cell cycle, nucleic acid metabolism process, enzyme binding, and others (Supplementary Figure 6B, D). Whole transcriptome sequencing identified 233 differential lncRNAs, such as LINC106, LINC125 and LINC142, with 112 up-regulated lncRNAs and 121 down-regulated lncRNAs, and 61 differential circRNAs, such as hsa_circ_0000117, hsa_circ_0000119 and hsa_circ_0000504, with 21 up-regulated circRNAs and 40 down-regulated circRNAs in the AFB1+HBX group (Supplementary Figure 6E, G). GO terms of targets of differential lncRNAs were enriched in cellular respiration, oxidative phosphorylation, mitochondrion, ribosome, and others. GO terms of targets of differential circRNAs were enriched in cellular response to stress, DNA repair, intracellular organelle, and others (Supplementary Figure 6F, H). Meanwhile, KEGG pathways of these differential RNAs were mainly enriched in non-alcoholic fatty liver diseases, oxidative phosphorylation, ribosome, Parkinson’s diseases, and others (Supplementary Figure 7A-F, H, J). GO terms of targets of differential circRNAs were enriched in nucleic acid metabolic process, regulation of mitosis, membrane-bound organelle, enzyme binding, and others (Supplementary Figure 7G, I).
Validation of differential RNAs and construction of ceRNA networks associated with AFB1 exposure
In AFB1 exposure group, real time PCR results showed that two differential mRNAs, OX4R1 and VAV3, were consistently validated with differential expression (both P≤0.05, Supplementary Figure 8A-D). Concerning miRNAs, two differential mRNAs, miR-122b-3P and miR-122b-5P, were consistently validated with differential expression (both P≤0.05, Supplementary Figure 8E-H). In the mRNA-miRNA-lncRNA constructed ceRNA network, mRNAs including RORC, HADHB and VAV3, miRNAs including miR-99b-5P, miR-100-5P and miR-99a-5P, and lncRNAs including AL592435.1, LINC02356 and LINC01885, were enrolled in the network (Supplementary Figure 8I). At the same time, in the mRNA-miRNA-circRNA constructed ceRNA network, mRNAs includingAGAP3, MED24 and FSD1, miRNAs including miR-193b-5P, miR-6860 and miR-185-3P, and lncRNAs including circ-0000117, circ-0007653 and circ-0001310, were enrolled in the network (Supplementary Figure 8J).
In AFB1+HBX exposure group, real time PCR results showed that two differential mRNAs, XCRR6 and FERMT2, were consistently validated with differential expression, while PYGL and TRIM24 were not validated (P≤0.05, Figure 7A-D). Concerning miRNAs, three differential mRNAs, including miR-122b-3P, miR-122-5P and let-7f-5P, were consistently validated with differential expression while let-7b-5P was not validated (P≤0.05, Figure 7E-H). In the mRNA-miRNA-lncRNA constructed ceRNA network, mRNAs including TNIP1, SPEF2 and HOXA3, miRNAs including miR-4677-3P, miR-486-3P and miR-615-3P, and lncRNAs including LINC115, LINC122 and LINC160, were enrolled in the network (Figure 7I). At the same time, in the mRNA-miRNA-circRNA constructed ceRNA network, mRNAs including MAPT, NEK11 and CTNNA1, miRNAs including miR-2277-5P, miR-671-5P and miR-615-3P, and lncRNAs including circ-0006990, circ-0001467 and circ-0003333, were enrolled in the network (Figure 7J).
Identification of HBX and AFB1-related RNAs
We then performed intersectional analysis to identify HBX expression-related RNAs, which found 32 mRNAs, three miRNAs, eight lncRNAs, and two circRNAs, including APBB2, TNIP1, has-circ_0007108 and has-miR-3615 (Figure 8A-D, Table 1). In addition, intersectional analysis identified AFB1 exposure-related RNAs, including 171 mRNAs, 48 miRNAs, 52 lncRNAs, and 19 circRNAs, such as APBB2, CTSD, has-miR-3615, LINC02211 and has-circ_0007108 (Figure 8A-D, Table 2).
Analysis of exome sequencing results
Exon sequencing results mainly included SNP, SNV, copy number variant (CNV) and InDel results. Visualized results of SNPs and somatic SNVs in coding sequences are shown in Figure 8E-F. Specifically, SNP data of genomic and coding regions in three groups is shown in Table 3. SNP data included downstream, intergenic, intronic, missense SNP, splicing, ncRNA exonic, ncRNA intronic and ncRNA splicing data. Intronic regions included the greatest amount of SNP data, while ncRNA untranslated 3’ and 5’regions did not provide any SNP data. There were 1068 CNV loci in exon regions of the AFB1 group, including PRMT6, AMY1C, AMY2A, OLFM3, and others, and 138719 CNV loci in the exon regions of the AFB1+HBX group, including EIF4G3, HP1BP3, DNAJC6, HFM1, among others.
InDel data from genomic and coding regions in the three groups are shown in Table 4. Groups of frame deletions and insertions, coding sequences, stopgain, stoploss and intronic splicing were included in the groups. Intronic regions showed the highest amount of InDel data, while ncRNA untranslated 3’ and 5’ regions did not provide any InDel data. Somatic SNV and InDel data from genomic regions in the three groups are shown in Table 5-6. Both coding sequence and intronic regions showed the majority of somatic SNV data, but coding sequence regions showed the majority of somatic InDel data. There were 1061 somatic SNV loci in the AFB1 group, including AGHGEF16 (T>C), NBPF1 (rs4127088, T>C), CYP4Z1 (rs868004924, G>A), and others, and 3601 somatic SNV loci in the AFB1+HBX group, including MORN1 (C>A), C1orf158 (G>T), PTCH2 (rs950611083, G>A), and others. Meanwhile, there were 30 InDel loci in the AFB1 group, such as EIF2AK3 (rs143427234, AAT>A), CD96 (rs200152394, GAC>G), DOCK9 (rs774391312, TTTTTG>T), and 40 loci in the AFB1+HBX group, such as TNNT2 (T>TG), CR1L (TTCTACAGCTGTGAGCCCAGCTATGACCTCA GAGGGGCTG CGTC>T) and NOL1 0 (AGGCG>A).