This study compared the degrees of identity and similarity of the pig TERT AAS and CDS with those of other species. As expected, the CDSs and AASs of the Cetartiodactyla species phylogenetically closest to the pig (cattle, goat, sheep) had the highest degrees of identity and similarity to the corresponding pig TERT sequences. At the same time, rather high degrees of identity and similarity were also found between the pig and human TERT CDS and AAS. This is significant because the structural and functional features of the human TERT gene and the protein it encodes are well studied and can be used in the analysis of pig TERT. Notably, the human TERT AAS is 1132 amino acid residues long, which is very close to the length of pig TERT. However, this similarity in length results from a number of indels in the TERT gene and does not indicate that pig TERT is more similar to human TERT than to the TERT of the Cetartiodactyla species (cattle, goat, sheep).
It is noteworthy that the degrees of identity of the TERT CDS were, in each case, higher than those of the TERT AAS, which indicates that there are more missense variants among the nucleotide substitutions, i.e., those that lead to substitutions in the AAS.
It is also worth noting that the pig TERT reference sequence deposited in the NCBI database [34] (NCBI Gene ID: 492280) differs somewhat from the corresponding sequence in Ensembl. According to NCBI data, the reference pig TERT protein consists of 1131 amino acid residues and, compared to the Ensembl reference sequence, carries the amino acid substitution R354G (the corresponding SNP, rs325294961, is also considered in this study) as well as a single amino acid insertion (P700_P701insA). This may explain some discrepancies in the reported length of the pig TERT AAS and in the localization of individual SNPs in other studies that use the NCBI sequence as a reference.
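The numbering shift implied by the insertion can be illustrated with a minimal Python sketch. It assumes that P700_P701insA is the only length difference between the two reference proteins and that the residue numbers in that notation follow the Ensembl sequence. The function names are hypothetical and are not part of any NCBI or Ensembl tool.

```python
# Assumption: a single one-residue insertion after Ensembl residue 700
# (P700_P701insA) is the only indel separating the two reference proteins.
INSERTION_AFTER = 700   # last Ensembl residue before the inserted alanine
INSERTED_LENGTH = 1     # number of inserted residues


def ensembl_to_ncbi_position(pos):
    """Map a 1-based Ensembl residue number to the NCBI numbering."""
    if pos < 1:
        raise ValueError("residue positions are 1-based")
    return pos if pos <= INSERTION_AFTER else pos + INSERTED_LENGTH


def ncbi_to_ensembl_position(pos):
    """Map a 1-based NCBI residue number back to Ensembl numbering.

    Returns None for the inserted residue, which has no Ensembl counterpart.
    """
    if pos < 1:
        raise ValueError("residue positions are 1-based")
    if pos <= INSERTION_AFTER:
        return pos
    if pos <= INSERTION_AFTER + INSERTED_LENGTH:
        return None
    return pos - INSERTED_LENGTH


if __name__ == "__main__":
    for p in (354, 629, 700, 701):
        print(p, "->", ensembl_to_ncbi_position(p))
```

Under these assumptions, substitutions upstream of residue 701 (for example R354G and R629W) keep the same number in both systems, while any residue from Ensembl position 701 onward is shifted by one in the NCBI numbering.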
According to the Ensembl data, there are four missense SNPs in the TEN domain of pig TERT (Table 2), while a total of six polymorphisms were detected in this gene region. For comparison, 336 polymorphisms, including 178 missense SNPs, were found in the same region of the human TERT gene. Such a large difference between the numbers of SNPs in the human and pig TERT genes is due to the different extent to which the two species have been studied. Synonymous mutations can also have an effect, not on the structure of the enzyme but on the level of its synthesis. This effect can be realized, for example, during mRNA translation at the level of codon-anticodon interaction when certain isoacceptor tRNAs are in short supply [35]. The TEN domain belongs to the hypomutable regions of the enzyme when only amino acid substitutions are considered [4], but its structure does not show high interspecies conservation [36]. Functional data confirm that the TEN domain is required for telomerase recruitment to telomeres [37]. In humans, mutations in the DAT (Dissociates Activity of Telomerase) region of the TEN domain render the enzyme unable to function in vivo, although telomerase retains some catalytic activity in vitro [38]. This indicates that missense SNPs located in the part of the gene encoding the TEN domain may influence the activity of the enzyme.
Five missense SNPs, out of a total of 12 known polymorphisms, fall within the region of the pig TERT gene encoding the TRBD domain. Because pig TERT has been studied so little, these data are unlikely to reflect the real level of polymorphism of this region: 330 polymorphisms, including 199 missense SNPs, have been established for the corresponding region of human TERT. The numbers of polymorphisms in the TEN and TRBD domains of human TERT are comparable, and the calculated SNP density is the same (0.487 SNPs per bp in both cases) (Table 2). This domain, like TEN, is classified as hypovariable.
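The density figure can be read as the number of polymorphisms divided by the length of the coding region in base pairs. The following Python sketch shows that calculation and its inverse. The region lengths themselves are not restated here from Table 2, so the example only derives the lengths implied by the reported counts and the rounded density; the function names are hypothetical.

```python
def snp_density(n_polymorphisms, region_length_bp):
    """SNP density as polymorphisms per base pair of the region."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return n_polymorphisms / region_length_bp


def implied_region_length(n_polymorphisms, density):
    """Region length (bp) implied by a polymorphism count and a density."""
    if density <= 0:
        raise ValueError("density must be positive")
    return n_polymorphisms / density


if __name__ == "__main__":
    # Counts and density are the human TERT values quoted in the text.
    for domain, count in (("TEN", 336), ("TRBD", 330)):
        length = implied_region_length(count, 0.487)
        print(f"{domain}: {count} polymorphisms -> about {length:.0f} bp")
```

Because the reported density is rounded to three decimal places, the implied lengths are approximate and should be checked against the domain boundaries used for Table 2.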
As for the RTD domain, five missense SNPs out of a total of 19 SNPs are known in the corresponding region of the pig TERT gene. According to data obtained for human TERT, this domain contains seven conserved motifs [4]. Mutations in this domain lead to a decrease in enzyme activity [39]. Moreover, for yeast TERT, it was shown that in addition to mutations that disrupt the function of the enzyme and the assembly of the telomerase complex [40, 41], there are mutations that lead to an increase in telomere length [42]. These data give reason to expect that some of the missense SNPs found in the region of the pig TERT gene encoding the RTD domain can significantly affect the structure of the enzyme and its activity.
Five missense polymorphisms are known in the CTE domain of pig telomerase, and this domain is classified as poorly conserved [4]. CTE is known to participate in telomerase recruitment, and mutations in this domain do not affect telomerase activity in vitro but prevent the enzyme from maintaining telomere length in vivo [43].
In this study, the expected effect of pig TERT gene missense SNPs on telomerase function and stability was assessed predictively using sequence-based and structure-based methods. The sequence-based methods showed that, of all the missense SNPs considered, rs325294961, rs705602819, rs789641834 and rs706045634 can have such an effect.
SNP rs325294961 causes an arginine-to-glycine substitution (R354G) in the TRBD domain of telomerase, which may affect the interaction of TERT with telomerase RNA. Arginine is a basic amino acid that, owing to its guanidino group, is strongly basic and can form multiple hydrogen bonds with the phosphate groups of nucleic acids. Glycine is a neutral amino acid whose chemical properties differ significantly from those of arginine [44]. The replacement of arginine with glycine (R354G) can therefore change the structural characteristics of telomerase and, probably, the affinity of the TRBD domain for telomerase RNA, which in turn can affect the catalytic activity of the enzyme. This assumption is supported by the evaluation of the R354G substitution with the bioinformatic tools SIFT, PROVEAN, PolyPhen-2 and SNAP2, all of which classify this mutation as having a significant effect on the enzyme. This agrees with the notion that substitutions of the first nucleotide of a codon have the greatest effect on protein structure; such substitutions often replace a charged amino acid with an amino acid of the opposite charge [45]. In this case, cytosine is replaced by guanine in the first position of the codon (CGG/GGG), so that positively charged arginine is replaced by glycine, which has an uncharged side chain.
SNP rs705602819 corresponds to the amino acid substitution R629W in the RTD domain. As with R354G, the replacement of arginine by tryptophan results from a nucleotide substitution in the first position of the codon (CGG/TGG). The two amino acids differ significantly in their chemical properties: arginine is a basic hydrophilic amino acid, whereas tryptophan is an aromatic amino acid with hydrophobic properties [44]. All of the prediction tools used indicate a significant effect of the R629W substitution on the structure of telomerase. Given that the RTD domain is responsible for the reverse transcriptase function of telomerase, SNP rs705602819 has potential as a genetic marker associated with the activity of the enzyme and, ultimately, possibly with the manifestation of certain biological and economic traits of animals.
As mentioned above, the predictive scores obtained for rs789641834, rs706045634 and rs705219838 with several programs also suggest a possible effect on the functional characteristics of the enzyme (Table 4). The first two of these SNPs are located in the gene region corresponding to the TEN domain of telomerase. rs789641834 is caused by a mutation of the first nucleotide of the codon (CTG/ATG) and results in the amino acid substitution L158M in the AAS of the enzyme. Leucine is a typical non-polar aliphatic α-amino acid; methionine is also non-polar but contains a sulfur atom in its side chain and likewise exhibits hydrophobic properties [44]. When leucine is replaced by methionine, the hydrophobicity of the latter may affect the spatial structure of the protein. The SNP rs706045634, which leads to the amino acid substitution R201P in the same TEN domain, is also predicted by SIFT, PolyPhen-2 and SNAP2 to influence the structural and functional properties of telomerase. The last of the substitutions with a possible effect, T270I, is caused by rs705219838, which is located in the Linker region of the TERT enzyme.
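The three codon changes cited for R354G, R629W and L158M (CGG/GGG, CGG/TGG and CTG/ATG) can be checked with a short Python sketch. The table below is only the subset of the standard genetic code needed for these codons, and the function name is hypothetical.

```python
# Subset of the standard genetic code covering only the codons cited above.
CODON_TO_AA = {
    "CGG": "R",  # arginine
    "GGG": "G",  # glycine
    "TGG": "W",  # tryptophan
    "CTG": "L",  # leucine
    "ATG": "M",  # methionine
}


def describe_codon_change(ref_codon, alt_codon):
    """Return the amino acid change and the 1-based codon positions that differ."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be exactly three nucleotides long")
    changed = [i + 1 for i, (r, a) in enumerate(zip(ref_codon, alt_codon)) if r != a]
    ref_aa = CODON_TO_AA[ref_codon]
    alt_aa = CODON_TO_AA[alt_codon]
    return {
        "ref_aa": ref_aa,
        "alt_aa": alt_aa,
        "changed_positions": changed,
        "missense": ref_aa != alt_aa,
    }


if __name__ == "__main__":
    cited = [
        ("rs325294961", "CGG", "GGG"),
        ("rs705602819", "CGG", "TGG"),
        ("rs789641834", "CTG", "ATG"),
    ]
    for snp_id, ref, alt in cited:
        info = describe_codon_change(ref, alt)
        print(snp_id, f"{info['ref_aa']}>{info['alt_aa']}",
              "codon position(s):", info["changed_positions"])
```

For all three SNPs the only differing nucleotide is in the first codon position, and each change alters the encoded amino acid.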
Comparing the impact of the missense SNPs predicted by structure-based methods with that predicted by sequence-based methods reveals certain regularities. The sequence-based methods indicate a potential "effect" of SNPs rs789641834 (L158M) and rs325294961 (R354G) on TERT functions (Table 4). At the same time, these mutations are predicted to destabilize the structure of the TERT protein (Table 5). This allows us to draw a general tentative conclusion that they have a significant, conditionally deleterious effect on telomerase activity.
According to the sequence-based methods, the amino acid substitution R201P (rs706045634) is among those that have a certain effect on the functional properties of telomerase, while according to the structure-based methods it is likely to increase the stability of the enzyme. It can be assumed that this substitution affects the functional properties of the enzyme by increasing the stability of the tertiary structure of TERT.
According to sequence-based methods, the substitutions R281W (rs330770291) and R606L (rs698738374) are estimated as neutral. This prediction is consistent with their assessment by structure-based methods, according to which they are likely to stabilize the tertiary structure of TERT. An increase in the stability of the TERT structure may nevertheless contribute to a change in the activity of the enzyme.
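The comparison of the two kinds of prediction can be summarized with a small Python sketch. The categorical labels for each substitution are taken from the text above; the combined category names and the function name are hypothetical choices made for illustration and do not come from any of the prediction tools.

```python
# Qualitative predictions as described in the text:
# (sequence-based result, structure-based result).
PREDICTIONS = {
    "L158M (rs789641834)": ("effect", "destabilizing"),
    "R354G (rs325294961)": ("effect", "destabilizing"),
    "R201P (rs706045634)": ("effect", "stabilizing"),
    "R281W (rs330770291)": ("neutral", "stabilizing"),
    "R606L (rs698738374)": ("neutral", "stabilizing"),
}


def combine_predictions(sequence_call, structure_call):
    """Combine the two calls into an illustrative, hypothetical category."""
    if sequence_call == "effect" and structure_call == "destabilizing":
        return "candidate deleterious"
    if sequence_call == "effect" and structure_call == "stabilizing":
        return "effect with increased stability"
    if sequence_call == "neutral":
        return "neutral by sequence"
    raise ValueError(f"unexpected calls: {sequence_call!r}, {structure_call!r}")


def summarize(predictions):
    """Return a dict mapping each substitution to its combined category."""
    return {name: combine_predictions(*calls) for name, calls in predictions.items()}


if __name__ == "__main__":
    for name, category in summarize(PREDICTIONS).items():
        print(f"{name}: {category}")
```

Keeping the two calls separate until the final step makes it easy to see which substitutions are supported by both approaches and which rest on only one of them.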
Changes in telomerase activity associated with SNPs rs789641834, rs325294961 and rs706045634 (supported by both sequence-based and structure-based prediction) and rs705602819 (supported by strong sequence-based prediction) can alter certain processes at the cellular level. Ultimately, they may affect physiological parameters associated with productive traits that depend on the health of animals, their resistance to stress factors and the duration of their economic use.
These SNPs should be tested in association studies to establish their actual effect on animal performance. If such studies demonstrate a significant association with these traits, the SNPs could be used in marker-assisted breeding and considered direct genetic markers. The corresponding mutations could then be classified as causative, since they directly affect the functional and structural characteristics of TERT, which in turn induce changes in productive qualities. SNPs whose amino acid substitutions showed no effect on the characteristics of TERT in the bioinformatic analysis, but which show an effect on productive traits in association analysis, can be considered linkage disequilibrium genetic markers.