Over the course of long-term evolution, organisms develop a characteristic pattern of codon usage that preserves the transfer of genetic information from nucleotides to amino acids [65–66]. Nevertheless, the frequency of synonymous codon usage varies among species, with certain codons used more often than others [67]. Many factors affect CUB, including codon base composition, CDS length, the number of synonymous codons, the number of amino acids, amino acid hydrophobicity, aromaticity, mutation pressure, and natural selection; of these, mutation pressure and natural selection are the most important [3, 14]. Natural selection tends to favor optimal codons, whereas mutation pressure accounts for the presence of non-optimal codons. Across long-term evolution, different genes of the same or different species come to display different codon usage preferences. Consequently, CUB analysis offers valuable insights into the regulation of transcription and translation, and it facilitates the prediction of exogenous genes and their optimization for improved expression in industrial applications [59, 68]. To date, however, the codon usage characteristics of thioredoxin (Trx) genes in apicomplexan protozoa have not been fully characterized.
Trx is a redox protein that plays an important role in metabolic redox regulation, parasite survival, host immune evasion, and the invasion process of apicomplexan protozoa [69–74]. The length and codon base composition of Trxs varied widely among apicomplexan protozoa, indicating that these genes have diverged. Differences among synonymous codons are reported to lie mainly in the third codon position. In this study, Trx codons of Cryptosporidium spp. and Plasmodium spp. tended to end with A/T, whereas those of Eimeria spp., Babesia spp., Hammondia hammondi, Neospora caninum, and Toxoplasma gondii were rich in C3/G3. This indicates that the same gene shows different base usage in different species, and the results are consistent with the codon usage features reported for other genes of apicomplexan protozoa [75–76]. Most high-frequency Trx codons identified by RSCU showed the same third-position tendency. In addition, the CAI, CBI, and Fop values of Eimeria necatrix were the highest, indicating a strong codon bias. An ENC value lower than 35 indicates a strong codon preference [77–78]. In this study, the average ENC of the 32 apicomplexan protozoa was 46.59, and all ENC values except that of Eimeria necatrix (30.77) were higher than 35, indicating a weak codon preference among apicomplexan protozoa. Furthermore, we detected correlations among codon base composition (GC1, GC2, GC3, GCs), CAI, CBI, Fop, ENC, GRAVY, AROMO, L_sym, and L_aa, indicating that base composition and codon usage indices influence CUB; these correlations were significant in Plasmodium spp. The neutrality plot, PR2-bias plot, and ENC-GC3 plot analyses further demonstrated that natural selection plays an important role in the codon bias of apicomplexan protozoa Trxs.
Although codon usage indices differed somewhat among apicomplexan protozoa, their common feature was that the CUB of Trx was shaped by strong natural selection.
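To make the RSCU and GC3 measures discussed above concrete, the following is a minimal Python sketch and not the pipeline used in this study. It assumes the standard genetic code, computes RSCU as the observed count of a codon divided by the mean count of its synonymous family, and computes GC3 over all sense codons (some tools instead report GC3s, restricted to synonymous third positions). The function names `split_codons`, `rscu`, and `gc3` and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (assumption: nuclear standard code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    """Split a coding sequence into complete, recognizable triplets."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    triplets = (cds[i:i + 3] for i in range(0, usable, 3))
    return [t for t in triplets if t in CODON_TABLE]


def rscu(cds):
    """Return RSCU for the 59 codons that have synonymous alternatives."""
    counts = Counter(split_codons(cds))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # skip stop codons, Met (ATG), and Trp (TGG)
        total = sum(counts[c] for c in family)
        for c in family:
            # Observed count divided by the family mean; 0.0 if the
            # amino acid does not occur in the sequence.
            values[c] = counts[c] * len(family) / total if total else 0.0
    return values


def gc3(cds):
    """Fraction of G or C at the third position of sense codons."""
    third = [c[2] for c in split_codons(cds) if CODON_TABLE[c] != "*"]
    return sum(b in "GC" for b in third) / len(third) if third else 0.0


toy_cds = "GCTGCTGCCGCA"  # hypothetical sequence encoding four alanines
toy_rscu = rscu(toy_cds)
print(toy_rscu["GCT"], toy_rscu["GCG"], gc3(toy_cds))
```

Under this definition, an RSCU above 1 marks a codon used more often than expected under equal synonymous usage, and a low GC3 corresponds to the A/T-ending tendency described for Cryptosporidium spp. and Plasmodium spp.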
Apicomplexans are a group of obligate intracellular parasitic protozoa with a wide geographical distribution. They are important pathogens of humans and animals and can cause serious zoonotic diseases such as malaria, toxoplasmosis, and cryptosporidiosis [35, 39–44, 79]. Apicomplexans are believed to have derived from Protista and are divided into Aconoidasida and Conoidasida; they include Toxoplasma gondii, Plasmodium spp., Cryptosporidium spp., Eimeria spp., Babesia spp., Theileria spp., and Neospora caninum. At present, RSCU clustering and CDS phylogenetic trees are widely used to analyze the evolutionary relationships of the same gene in different species. The two methods give consistent results for some species but differ significantly for others [2]. In this study, we analyzed the relationships of Trxs among apicomplexan protozoa based on coding sequences, and each clade had high support, indicating that phylogenetic analysis of apicomplexan protozoa Trxs based on coding sequences is reliable. In addition, we constructed a clustering based on the RSCU values of the different apicomplexan protozoa and found that it differed from the sequence-based relationships, especially for Babesia spp. and Theileria spp. However, the relationships between some species were correctly recovered from the RSCU values, which is consistent with other studies [60, 80]. These results show that RSCU-based clustering can be an important supplement to sequence-based phylogenetic analysis.
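As an illustration of how an RSCU-based clustering can be built, the sketch below groups species by the Euclidean distance between their RSCU vectors using average linkage. The distance measure and linkage method are assumptions chosen for illustration, since the exact settings are not stated here, and the labels `sp_A` to `sp_D` and their four-value profiles are hypothetical toy data, not results from this study.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length RSCU vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(profiles):
    """Agglomerative clustering; returns the merge history.

    Each history entry is (members_a, members_b, distance), where the
    distance between two clusters is the mean pairwise distance
    between their members.
    """
    clusters = [(name,) for name in sorted(profiles)]
    merges = []

    def cluster_distance(a, b):
        pairs = [(x, y) for x in a for y in b]
        total = sum(euclidean(profiles[x], profiles[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        d, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical RSCU profiles (four codons only, for brevity).
toy_profiles = {
    "sp_A": [2.0, 1.0, 0.5, 0.5],
    "sp_B": [1.9, 1.1, 0.5, 0.5],
    "sp_C": [0.4, 0.6, 1.5, 1.5],
    "sp_D": [0.5, 0.5, 1.4, 1.6],
}

for left, right, dist in average_linkage(toy_profiles):
    print(left, right, round(dist, 3))
```

Because such a clustering reflects similarity in codon preference and not shared substitutions, species with convergent base composition can group together regardless of ancestry, which is one reason an RSCU-based grouping may depart from a sequence-based tree.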