PCR amplification and molecular cloning
PCR amplification yielded an enJSRV LTR of 447 bp (Fig. 1a). The product was purified and cloned into the pTOPO-Blunt vector. The enJSRV LTR sequence, named NMJS1, was deposited in GenBank under accession number ON211930. For exJSRV, the amplification products of fragments 1 and 2 were 209 bp and 237 bp, respectively, and fusion PCR of the two fragments yielded a full-length exJSRV LTR of 396 bp (Fig. 1b). This product was also purified and cloned into the pTOPO-Blunt vector. The exJSRV LTR sequence, named NMJS2, was deposited in GenBank under accession number ON211931.
Phylogenetic analysis
A phylogenetic tree was constructed from NMJS1, NMJS2, and the available reference strains using the neighbor-joining method. The LTRs of enJSRV and exJSRV separated into two branches (Fig. 2). NMJS1 was closely related to enJSRV-20 (EF680302), enJSRV-NM (DQ838493), and enJS56A1 (AF153615). NMJS2 was closely related to JSRV-C1 (KP691837) and fell on the same branch as the American strains JS7 (AF357971) and JSRV21 (AF105220), suggesting that the American strain was prevalent in northern China.
Analysis of transcriptional regulatory elements and their conservation in the LTRs
Transcriptional regulatory elements and their conservation were analyzed using the sequences of NMJS1, NMJS2, and the reference strains. The hallmark features of proviral LTRs, namely the TATA box and the poly(A) signal, were found in the LTRs of both enJSRV and exJSRV. Other cis-regulatory sequences, commonly called enhancers, were located upstream of the transcription start site, mainly in the U3 region. Several enhancers were conserved between the LTRs of enJSRV and exJSRV, including the NF-κB, STAT1, Ets-I, NF-I, Ef-I, IK-2, SP-I, STAT5, AP-2, and Oct-I binding sites, which are labeled in Fig. 3. However, unlike the enJSRV LTR, the exJSRV LTR has two HNF3-β sites: the upstream HNF3-β site is shared with the progesterone receptor, and the downstream HNF3-β site is shared with the glucocorticoid receptor. In addition, the exJSRV LTR has a C/EBPβ site. In general, transcriptional regulatory elements are conserved in the LTRs of both enJSRV and exJSRV, and the single-base mutations observed within them are not expected to affect transcription factor binding.
CpG island prediction in the LTRs
CpG islands are primarily located in the promoter regions of genes and are readily modified by methylation, which can silence gene transcription. The CpG islands of NMJS1 and NMJS2 were predicted using the following criteria: island size >100 bp, GC percentage >30%, and observed/expected CpG ratio >0.6. Under these criteria, no CpG island was detected in NMJS1, whereas a 121-bp CpG island was predicted in the U3 region of NMJS2 (Fig. 4). Methylation of this island could be examined by chemically modifying genomic DNA with sodium bisulfite, amplifying the modified DNA with specific primers provided by the MethPrimer software, and detecting DNA methylation by sequencing.
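The three criteria above can be illustrated with a minimal Python sketch. It is not the prediction software used in this study. The sliding-window scan (100-bp window, 1-bp step), the merging of overlapping windows, and the function names `cpg_stats` and `find_cpg_islands` are assumptions made for illustration. The observed/expected ratio is computed with the commonly used formula (number of CpG × N) / (number of C × number of G).

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Obs/Exp = (#CpG * N) / (#C * #G); defined as 0 when C or G is absent.
    obs_exp = (cg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def find_cpg_islands(seq, window=100, min_gc=0.30, min_obs_exp=0.6, min_len=100):
    """Scan with a sliding window (assumed: 1-bp step) and merge overlapping
    qualifying windows. Returns 0-based, half-open (start, end) regions whose
    length exceeds min_len."""
    seq = seq.upper()
    merged = []
    for i in range(len(seq) - window + 1):
        gc, oe = cpg_stats(seq[i:i + window])
        if gc > min_gc and oe > min_obs_exp:
            if merged and i <= merged[-1][1]:
                merged[-1][1] = i + window      # extend the current region
            else:
                merged.append([i, i + window])  # start a new region
    return [(s, e) for s, e in merged if e - s > min_len]
```

Calling `find_cpg_islands(ltr_sequence)` on an LTR sequence string returns candidate regions under these thresholds. The boundaries may differ from those reported by dedicated tools, which apply their own windowing rules.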
Putative quadruplex-forming sequences in the LTRs
The G-quadruplex (G4) is a non-canonical nucleic acid structure that regulates important cellular processes. A sequence with the potential to form a G4 is called a putative quadruplex-forming sequence (PQS). Studies have shown a stable and conserved G4 in the LTR promoter of retroviruses [16]. A G4 formed in the LTR can act as a silencer element to regulate viral transcription. Here, the QGRS Mapper software was used to identify PQSs in NMJS1 and NMJS2. The search motif was set to GxNy1GxNy2GxNy3Gx, where x ≥ 2 is the number of guanines in each G tract, and y1, y2, and y3 are the lengths of the loops connecting the tetrads. Each loop ranged from 0 to 36 nucleotides, at most one loop was allowed to have a length of 0, and the maximum total length was set to 45 bases. The PQSs predicted for NMJS1 and NMJS2 are presented in Table 1. Both LTRs contain PQSs, and NMJS2 has one more PQS than NMJS1.
Table 1. PQS analysis performed within the LTRs.
Genes | Length | Potential quadruplex-forming sequences | G-Score
NMJS1 | 36 | GGTTAAGTCTTGGGAGCTCCCTGGCAGGTATGCCGG | 35
NMJS1 | 33 | GGTGCGACTCTTGGTTGTGCTGGCCGCGGCAGG | 33
NMJS2 | 39 | GGACGACCCGTGAAGGGTTAAGTCCTGGGAGCTCTTTGG | 33
NMJS2 | 44 | GGCTCGGATGTTTGCTTTTGGCACTGCTTCACAGAAATACCAGG | 18
NMJS2 | 33 | GGTGCGACTCTTGCTTGTGCTGGCCGCGGCAGG | 19
The underlined segments are the G2 tracts.
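The search motif GxNy1GxNy2GxNy3Gx described above can be sketched in Python as a simple enumeration. This is only an illustration of the pattern constraints and is not the QGRS Mapper implementation. It fixes the tract length x (default 2), does not compute G-scores, and reports every overlapping candidate. The function name `find_pqs` and its parameters are hypothetical.

```python
def find_pqs(seq, x=2, max_loop=36, max_len=45):
    """Enumerate matches of G{x} N{y1} G{x} N{y2} G{x} N{y3} G{x}.

    Constraints follow the text: each loop is 0..max_loop nt, at most one
    loop has length 0, and the whole motif is at most max_len nt.
    Returns a list of (start, end, (y1, y2, y3)) with 0-based, half-open
    coordinates.
    """
    seq = seq.upper()
    tract = "G" * x
    hits = []

    def extend(pos, start, loops):
        # pos is the index just after the most recently placed G tract.
        if len(loops) == 3:
            hits.append((start, pos, tuple(loops)))
            return
        for y in range(max_loop + 1):
            nxt = pos + y
            if nxt + x - start > max_len:
                break                       # motif would exceed max_len
            if y == 0 and 0 in loops:
                continue                    # only one zero-length loop allowed
            if seq[nxt:nxt + x] == tract:
                extend(nxt + x, start, loops + [y])

    for i in range(len(seq) - x + 1):
        if seq[i:i + x] == tract:
            extend(i + x, i, [])
    return hits
```

For example, the 33-nt NMJS1 sequence in Table 1, `GGTGCGACTCTTGGTTGTGCTGGCCGCGGCAGG`, yields at least one candidate spanning its full length (start 0, end 33), which is consistent with its listing as a PQS.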
Transcriptional activity of the LTRs in different cell types
To determine whether the transcriptional activity of JSRV is related to cell tropism, we measured the transcriptional activity of NMJS1 and NMJS2 in different cell types. Comparison between cell types showed that the relative luciferase activities of both NMJS1 and NMJS2 were highest in STC (Fig. 5). Comparison within each cell type showed that the relative luciferase activity of NMJS2 was significantly higher than that of NMJS1 in lung-derived epithelial cell lines such as A549, AEC, MLE-15, and NIH3T3 cells (P < 0.01). In contrast, the relative luciferase activity of NMJS1 was significantly higher than that of NMJS2 in STC and 293T cells (P < 0.01) (Fig. 5). These results suggest that the LTR of enJSRV is preferentially active in germ cell lines and that the LTR of exJSRV is preferentially active in lung epithelial cell lines.