It was not possible to establish an association between genetic variability in the BLV pX region and the development of lymphocytosis, tumor development, asymptomatic status and proviral load in Holstein Friesian cattle infected with BLV genotype 1.
We obtained 30 sequences, each 1156 nt in length, corresponding to the BLV pX region. Phylogenetic analysis showed that all of these sequences clustered with BLV genotype 1, which is widely distributed throughout Mexico [19, 20]. Furthermore, the sequences were distributed heterogeneously throughout the genotype 1 reference sequence clade and showed no relationship to infection phase (with or without lymphocytosis, or lymphoma) or to the geographic origin of the samples.
The R3 protein is made up of 44 amino acids, with a hydrophilic portion at the N-terminus followed by a hydrophobic region. Premature stop codons in the R3 gene have been described in two sequences obtained from tumor lesions, but we did not find any premature stop codons in the 30 sequences analysed for this study. While we did find nine non-synonymous mutations in five sequences across the three study groups, we did not identify their possible biological implications (Figure 3D).
Although the functions of the G4 and R3 proteins have not been fully clarified, they are known to be of great importance for viral propagation in vivo. The oncogenic potential of the G4 protein stands out among its known functions. This protein includes an amino-terminal stretch of hydrophobic residues, followed by possible proteolytic cleavage sites and an arginine-rich region located in the center of the protein. This region is key to the interaction of G4 with farnesyl pyrophosphate synthetase (FPPS), an enzyme that participates in the pathway leading to the lipidation of a wide variety of proteins, including nuclear lamins, Ras, other regulatory GTP-binding proteins, and various kinases and phosphatases (Lefebvre et al., 2002).
The role of the G4-FPPS interaction in cell transformation has been demonstrated. Mutations in the alpha helix of the arginine-rich region of the G4 protein prevent the immortalization of primary cells and the induction of tumors in nude mice. Interruption of the G4-FPPS interaction could therefore alter the process of oncogenesis. In our study sequences, we found four non-synonymous mutations in the arginine-rich region, two of them in abomasum and heart tumor tissues (Figure 3C). Although we identified positive selection in the G4 gene in all infected animal groups, the high conservation of the arginine-rich region in the G4 protein may be necessary for it to exert its oncogenic function. We also found many non-synonymous mutations in the MYB motif, and such a number of accumulated changes has not previously been described in the literature for this gene. Additionally, we found two sequences with AGU_7488L residue deletions, and three residues (AGU_18A) that coincide with descriptions by Murakami et al. These authors carried out in vitro analyses and reported a deletion of four amino acids in the G4 gene linked to a decrease in viral production and replication. The high number of changes identified in this region may affect viral replication, but in vitro studies are necessary to demonstrate this.
The Rex protein facilitates viral RNA export from the nucleus to the cytoplasm via nuclear localization (NLS) and nuclear export (NES) signals. The NLS directs Rex proteins to the nucleus, except when Rex binds to viral RNA. This binding masks the NLS, allowing the NES to direct the viral RNA to the cytoplasm through nuclear pores [32, 33]. In our study sequences we found ten non-synonymous mutations (Figure 3B): four were in the NES region, all in cattle with lymphocytosis, and six were in the NLS region. Of the latter, three were in cattle without lymphocytosis, two were in cattle with lymphocytosis, and one was in an animal with lymphoma. Mutations in the NES region were identified only in cattle with lymphocytosis, where a serine-to-leucine substitution predominated.
The tax gene is involved in the transcription of viral and cellular genes, and may allow oncogenic transformation through inhibition of DNA repair pathways in infected cells. A zinc finger motif, a transactivator motif and two phosphorylation sites have been identified in the Tax protein. One study found that Tax mutants with substitutions at amino acids 240 and 265 had a greater LTR-directed transcriptional capacity than that seen with the naturally occurring Tax protein. We did not find these substitutions in our study. Den Breoeke et al. reported a mutant with a single substitution (E303K) that turned out to be replication defective, but this substitution was also absent from the sequences we obtained.
Phosphorylation sites, sites associated with transactivation, and B-cell epitopes were conserved in all sequences. Only four mutations were observed in the zinc finger motif, three mutations in the leucine-rich domain in three animals without lymphocytosis, and 13 mutations in T-cell epitopes. None of the substitutions observed in the Tax protein were associated with the previously described regions that impact virus transcription and replication (Figure 3A). While other studies have described mutations in functional domains and important epitopes of this protein, phosphorylation sites are generally conserved.
Previous studies have revealed that different genetic and epigenetic mechanisms can silence the tax gene, which is essential for non-progression of tumoral processes. We found high degrees of conservation in our analysis of tax gene sequences, and it was not possible to identify sequence patterns that could be associated with the development of cell transformation, especially in animals with lymphocytosis and tumor tissues.
The genetic distances among the R3, G4, rex and tax nucleotide sequences obtained in our study were 0.2 - 2.09%, 0.94 - 1.18%, 0.5 - 0.8% and 0.73 - 0.8%, respectively. Other studies on genotype 1 BLV have identified genetic distances ranging from 0 - 12.1%, 0 - 6.5%, 0 - 9.4% and 0 - 6.1% for the R3, G4, rex and tax genes, respectively. The maximum genetic distance values described by Panei et al. exceeded the values obtained in our study by up to six times. This may be because some of the sequences analyzed there are phylogenetically related to genotype 2 (JF288766 and JF288767; Figure 2).
Zhao et al. found that the tax gene had the highest mean nucleotide variation rate (1.86%) compared with the R3, G4 and rex genes (1.24, 1.29 and 1.40%, respectively), whereas in our study tax had the second lowest average variation rate (0.77%) compared with rex, G4 and R3 (0.66, 1.1 and 1.12%, respectively). Genetic distances ranging from 0 - 2.8% and 0 - 4.7% have been identified for the rex and tax genes, respectively. These values are greater than the ones found in our study, and the differences may be explained by the inclusion of both genotype 1 and genotype 4 sequences in that analysis. The genetic distances found in our study for the pX region genes of genotype 1 indicate low variability, regardless of infection phase or geographical origin.
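The genetic distances compared above are pairwise measures between aligned nucleotide sequences. As a minimal sketch, assuming an uncorrected p-distance (the proportion of differing sites) rather than whichever substitution model was actually applied in the analyses cited, the minimum and maximum pairwise distance for a set of aligned sequences could be computed as follows. The function names and the toy alignment are hypothetical and serve only as illustration.

```python
from itertools import combinations
from typing import Dict, Tuple

VALID_BASES = set("ACGT")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_range_percent(alignment: Dict[str, str]) -> Tuple[float, float]:
    """Minimum and maximum pairwise p-distance, expressed as percentages."""
    distances = [
        p_distance(a, b) for a, b in combinations(alignment.values(), 2)
    ]
    return min(distances) * 100, max(distances) * 100


# Toy alignment (hypothetical data, not sequences from the study).
toy_alignment = {
    "s1": "ATGCATGCAT",
    "s2": "ATGCATGCAA",
    "s3": "ATGAATGCAA",
}
print(distance_range_percent(toy_alignment))
```

Because the p-distance ignores multiple substitutions at the same site, model-corrected distances would be slightly larger for divergent sequences; at the low divergence reported here the two would be expected to differ little.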
One method for determining genetic variability entails measuring the substitution rate, primarily in overlapping reading frames, because a synonymous change in one gene may not be neutral in the other. Purifying selection, also known as negative selection, is a type of natural selection in which genetic diversity decreases as a particular trait value (phenotype frequency) stabilizes in the population. In contrast, positive selection increases the frequency of certain variations and occurs when equilibrium in the population has not yet been reached. In our sequences, the ratio of non-synonymous to synonymous substitutions (dN/dS) indicated negative selection for the tax gene in all three cattle groups, for the R3 gene in cattle without lymphocytosis and in those with tumor development, and for the rex gene in cattle with and without lymphocytosis. Similar results for the tax and rex genes were described by McGirr and Buehring. Zhao et al. reported negative selection for the BLV tax gene, which could be due to a higher percentage of its sequence not overlapping other reading frames, in comparison with the other pX region genes.
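The dN/dS reasoning above can be illustrated with a short sketch. It assumes the standard genetic code and two aligned, in-frame coding sequences; it only classifies codon differences as synonymous or non-synonymous and interprets a given dN/dS ratio. It is not the estimation procedure used in the study, and the function names and example sequences are hypothetical.

```python
from typing import Tuple

# Standard genetic code, built in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def count_codon_changes(ref: str, query: str) -> Tuple[int, int]:
    """Return (synonymous, non_synonymous) codon differences.

    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(ref) != len(query) or len(ref) % 3 != 0:
        raise ValueError("sequences must be aligned and in frame")
    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(ref), 3):
        codon_ref = ref[i:i + 3].upper()
        codon_query = query[i:i + 3].upper()
        if codon_ref == codon_query:
            continue
        if codon_ref not in CODON_TABLE or codon_query not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_ref] == CODON_TABLE[codon_query]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


def interpret_omega(dn: float, ds: float) -> str:
    """Interpret the dN/dS ratio (omega) under the usual convention."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if omega < 1:
        return "negative (purifying) selection"
    if omega > 1:
        return "positive selection"
    return "neutral evolution"


# Hypothetical example: TCA (Ser) -> TTA (Leu) is non-synonymous,
# GGA (Gly) -> GGG (Gly) is synonymous.
print(count_codon_changes("ATGTCAGGA", "ATGTTAGGG"))
print(interpret_omega(dn=0.2, ds=0.8))
```

Raw counts of this kind are not dN/dS values: actual estimators normalize by the number of synonymous and non-synonymous sites and correct for multiple substitutions. In overlapping reading frames such as those of the pX region, a change that is synonymous in one frame may be non-synonymous in the other, so each gene has to be evaluated in its own frame.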
The tax gene has been found to be more conserved than rex in primate T-lymphotropic viruses 1 and 2, which are classified in the same genus as BLV. This is consistent with Tax being the most important regulatory protein for Deltaretrovirus behavior. We determined positive selection in the sequences obtained for the R3 gene (cattle with lymphocytosis) and the rex gene (cattle with lymphoma), in addition to all BLV G4 gene sequences obtained from cattle. This may indicate that the virus modifies its genome and thereby evades the host's immune mechanisms, including APOBEC.
A first approach to EBL diagnosis can be carried out through clinical signs in cattle with tumors and the subsequent histopathological study of biopsies from these tissues. Histopathological analyses of tumor tissue allowed the identification of lymphomas characterized by the proliferation of neoplastic lymphocytes with marked anisocytosis and anisokaryosis, as well as a large number of mitoses. Using the ISH technique, we identified the proviral genome and observed a positive signal in tumor tissues. Similar studies have detected BLV in organ samples using an ISH assay directed at non-coding RNA. Labeling in lymphomas was of low intensity (Figure 1C and F), indicating few infected cells. However, these data suggest that the ISH technique may be useful in the study of EBL.
We categorized cattle as having lymphocytosis based on an absolute count of 10,000 lymphocytes/mm3, and out of 405 BLV seropositive cattle, 54% (n = 221) had lymphocytosis. These results differ from those described in other studies, including Khudhair et al., who identified 29% of animals with lymphocytosis. This could be due to low infection rates, as only 7% of animals in that study were seropositive. It is important to mention that the high number of animals with lymphocytosis identified in our study cannot all be attributed to BLV infection, as at least three continuous samplings are required to determine persistent lymphocytosis, and this was not done in our study.
The use of real-time PCR allowed us to determine proviral loads across the study groups (cattle with and without lymphocytosis, and tissues with lymphoma); however, we did not find statistically significant differences between them. This may be related to the number of infected cells in animals with lymphocytosis, as was also observed in tumor cells, which did not allow us to identify differences between the study groups. The lack of difference in proviral load could also be related to the number of samples analyzed in each group. Previous studies have also been unable to correlate proviral load with different infection phases [45, 46]. On the other hand, Chieh-Wen showed that BLV-induced lymphoma and proviral load are associated with different alleles of BoLA-DRB3 in Holstein cows in Japan, and Kobayashi found that high proviral loads in blood are significant for identifying cattle at high risk of developing lymphomas. While the available information on proviral loads suggests that they are an important factor in disease progression, our study did not identify any relationship between proviral load, pathogenesis and disease.
Numerous diagnostic methods have been used in BLV infection detection studies (including seroneutralization, radioimmunoassay, agar gel immunodiffusion (AGID), ELISA, western blot and PCR), and ELISA and PCR have proven to be highly efficient diagnostic techniques at the herd level [20, 50]. We identified BLV infection in six regions across Mexico via antibody detection using a commercial ELISA test. We found 55.9% seropositivity in the study group of 724 Holstein Friesian dairy cattle. Across states, BLV seropositivity ranged between 41% and 80%, which reveals high infection levels in Mexican dairy herds, as well as a notable increase in BLV infection compared with previous serological studies. Increasing infection levels coincide with reports from other countries, such as the United States with 83.9% [45, 52] and Taiwan with 81.8%.
Most BLV phylogenetic work has focused on the env gene, and particularly on the gp51 region because of its importance as a ligand and antigen, among other functions [54, 55]. A detailed study in 2007 identified seven BLV genotypes through env gene analysis. Genotype 8 was first described in Croatia and later reported in other geographical regions. Genotypes 9 and 10 were described in Bolivia in 2016, Thailand and Myanmar. Our results show that the pX region may be useful in BLV phylogenetic analyses; however, it is necessary to generate sequences that include all the genotypes previously identified via env gene analysis to consolidate this proposal.
In conclusion, we did not find an association between the genetic variability of the BLV pX region (R3, G4, rex and tax) and disease status in infected cattle, whether with lymphoma or with or without lymphocytosis. Although we identified positive selection in three of the four genes that make up the pX genetic region, we could not find implications of this for BLV pathogenesis. Proviral load quantification did not show significant differences between the three study groups. We identified high BLV seropositivity in the study regions, and an overall high frequency of BLV genotype 1 infecting dairy cattle in Mexico.