Eleven novel sequences of the CLA-DRB3.2 gene of the Sangamneri goat breed were obtained in the present study. These sequences were deposited in GenBank (NCBI) under the following accession numbers:
MG765420, MG835447, MG897689, MG934562, MG986899, MG986900, MG986901, MG986902, MG986903, MH013230, MH013231.
In the present study, the Class II MHC CLA-DRB3.2 gene of the Sangamneri animals was 285 bp long. In silico translation of this 285 bp sequence yielded a 95 amino acid CLA-DRB3.2 peptide.
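The in silico translation step can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a reading frame that starts at the first base of the 285 bp sequence; the function name `translate` and the example input are hypothetical and are not taken from the software used in the study.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids are listed in TCAG order for each of the
# three codon positions, with the third position varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1 (assumed) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical input: two codons, CTC (Leu) followed by ATC (Ile).
print(translate("CTCATC"))  # LI
```

Under these assumptions, any 285 bp sequence yields 285 / 3 = 95 residues, which matches the peptide length reported above.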
The gene sequence obtained for each animal showed a unique pattern of single nucleotide polymorphisms (SNPs) along its length. For example, the sequence MG835447 (isolate 3) contains SNPs at positions 45(A > G), 80(A > T), 112(A > T), 122(A > T), 128(A > G), 143(A > T), 150(A > G), 172(A > G), 173(A > G), 178(A > G), 202(A > C), 204(C > T), 235(G > T) and 237(C > G). In the sequence MG897689 (isolate 4), single nucleotide variations occur at positions 95(A > T), 97(C > T), 112(A > T), 113(A > C), 115(C > G), 122(A > T), 143(A > T), 154(A > G), 172(A > G), 173(A > G), 178(A > G), 201(C > G), 202(C > T), 212(A > G), 213(C > G), 217(A > C), 224(A > C), 244(C > T) and 259(A > G). The sequence MG934562 (animal 5) has SNPs at positions 122(A > T), 177(C > G), 178(A > G) and 282(C > G). Similarly, the sequence MG986899 (isolate 7) shows SNPs at positions 81(C > T), 82(C > T), 83(T > G), 86(A > C) and 88(A > G) along the 285 bp gene length. In the sequence MG986901 (isolate 11), SNPs were observed at positions 47(T > G), 49(C > T), 53(A > T), 80(A > T), 112(A > T), 122(A > T), 128(A > G), 143(A > T), 150(A > G), 172(A > G), 173(A > G), 201(C > G), 202(C > T), 212(A > G), 216(C > G), 235(T > G) and 237(C > G).
Analysis of the gene sequences with BioEdit (Hall, T., 2011), MEGA 6 (Tamura et al., 2013) and DnaSP 5.0 (Librado and Rozas, 2009) revealed a total of 63 single nucleotide variations across the eleven CLA-DRB3.2 gene sequences of the Sangamneri animals.
Detailed statistical analysis of the eleven sequences with DnaSP 5.0 showed that, of the 63 SNPs observed, 52 were parsimony informative sites and the remaining eleven were singleton variable sites with two variants each. The eleven singleton variable sites were at positions 45, 47, 49, 81, 82, 83, 86, 88, 95, 128 and 177. Of the 52 parsimony informative sites, 45 had two variants; six, at positions 113, 174, 202, 211, 216 and 259, had three variants (tri-allelic); and one, at position 34, had four variants (tetra-allelic). The 45 parsimony informative sites with two variants were at positions 10, 35, 38, 41, 42, 50, 53, 56, 58, 66, 80, 97, 112, 115, 122, 133, 143, 150, 153, 154, 155, 172, 173, 178, 179, 190, 201, 204, 208, 212, 213, 215, 217, 220, 224, 225, 227, 235, 236, 237, 242, 256, 264, 266 and 282.
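The distinction between singleton variable and parsimony informative sites can be sketched as follows. The sketch assumes the usual definition, in which a site is parsimony informative when at least two of its variants each occur in at least two sequences, and any other variable site is a singleton site. The function `classify_sites` and the toy alignment are hypothetical; this is not the DnaSP implementation.

```python
from collections import Counter


def classify_sites(alignment):
    """Map each variable site (1-based position) to (site type, number of variants).

    `alignment` is a list of equal-length, already aligned sequences.
    Invariant sites are omitted from the result.
    """
    result = {}
    for position, column in enumerate(zip(*alignment), start=1):
        counts = Counter(column)
        if len(counts) < 2:
            continue  # invariant site
        # Variants that are present in at least two sequences.
        shared = sum(1 for count in counts.values() if count >= 2)
        kind = "parsimony-informative" if shared >= 2 else "singleton"
        result[position] = (kind, len(counts))
    return result


# Hypothetical five-sequence alignment, for illustration only.
toy = ["ACGT", "ACGT", "ATGA", "ATGA", "ATCA"]
for position, (kind, variants) in classify_sites(toy).items():
    print(position, kind, variants)
```

In the toy alignment, site 3 is a singleton site because its second variant occurs in one sequence only, whereas sites 2 and 4 are parsimony informative.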
The presence of tri-allelic sites suggests that the Class II MHC CLA-DRB3.2 gene locus is a mutation hotspot in which a large number of genetic polymorphisms are maintained (Hodgkinson and Walker, 2010).
The number of haplotypes (h) was sixteen and the haplotype diversity (Hd) was 0.974, with a variance of 0.00037.
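For reference, haplotype diversity is commonly estimated with Nei's formula, Hd = n/(n - 1) × (1 - Σ p_i²), where n is the sample size and p_i is the frequency of the i-th haplotype. The sketch below assumes this estimator; the function name and the toy haplotype labels are hypothetical, and the exact computation performed by DnaSP is not confirmed here.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a list of haplotype labels or sequences."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotype copies are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    homozygosity = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - homozygosity)


# Hypothetical sample: one haplotype seen twice, two haplotypes seen once.
print(round(haplotype_diversity(["h1", "h1", "h2", "h3"]), 4))  # 0.8333
```

Values close to 1, such as the 0.974 reported above, arise when almost every sampled copy carries a different haplotype.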
Tajima's D was 0.03362, which was not statistically significant (P > 0.10) (Tajima, 1989). The positive value of Tajima's D in the present study, although not significant, is consistent with balancing selection maintaining a large number of alleles in the population. In several cases, the dynamic selective pressures exerted by pathogens promote balanced polymorphism in host response genes. The best documented example is the major histocompatibility complex (MHC) in vertebrates (Hedrick, 1998; Edwards and Hedrick, 1998; Hughes and Yeager, 1998; Bernatchez and Landry, 2003).
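The statistic compares two estimates of diversity: the mean number of pairwise differences (π) and the number of segregating sites (S) scaled by a1 = Σ 1/i for i = 1 to n - 1. A minimal sketch following the formulas of Tajima (1989) is given below. It assumes an alignment without gaps and counts each variable site once, whatever its number of variants; the function name and toy data are hypothetical, and the result need not match DnaSP for multi-allelic sites.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(alignment):
    """Tajima's D for a list of equal-length aligned sequences (no gaps assumed)."""
    n = len(alignment)
    if n < 4:
        raise ValueError("need at least four sequences")
    # S: number of segregating (variable) sites.
    seg_sites = sum(1 for column in zip(*alignment) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites")
    # pi: mean number of pairwise nucleotide differences.
    pairs = list(combinations(alignment, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical data: variants at intermediate frequency give a positive D,
# variants carried by a single sequence give a negative D.
print(tajimas_d(["AA", "AA", "TT", "TT"]) > 0)  # True
print(tajimas_d(["AA", "AA", "AA", "TT"]) < 0)  # True
```

A positive D therefore reflects an excess of intermediate-frequency variants relative to neutral expectation, which is the pattern expected under balancing selection.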
Population indices computed with the POPGENE software (Yeh et al., 2000) gave observed and expected heterozygosities of 0.0792 ± 0.0953 and 0.3180 ± 0.1698, respectively. The overall Fis value (Wright, 1951) was 0.07391. An observed heterozygosity lower than the expected heterozygosity, together with a positive value of Fis (Wright's fixation index), indicated that the animals studied at the Class II MHC CLA-DRB3.2 gene locus had a certain degree of genetic relatedness among themselves.
Some of the SNPs observed in the CLA-DRB3.2 gene sequences of the Sangamneri goat breed lay very close to one another along the nucleotide sequence. Such SNPs occurred within the same triplet codon, so that variation at different positions of a single codon encoded more than one amino acid. For example, in the sequence MG835447 of animal 3, the SNPs at positions 172(A > G) and 173(A > G) occur in the same triplet codon GAT at position 57 and encode the amino acids Asp (GAT), Glu (GAG), Asn (AAT, AAC), Lys (AAG), Arg (AGG), Ser (AGC) and Gly (GGC). Similarly, the SNPs 202(C > A) and 204(C > T) occurred in the same triplet codon CUC, which encodes leucine at position 68 of the 95 amino acid CLA-DRB3.2 peptide of animal 3: the SNP 202(C > A) changed CUC (leucine) to AUC (isoleucine), whereas the SNP 204(C > T) is synonymous, as both CUC and CUU encode leucine. Likewise, the SNPs 235(T > G) and 237(C > G) in the animal 3 sequence MG835447 occur in the same triplet codon UUC, which encodes phenylalanine at position 79 of the 95 amino acid CLA-DRB3.2 peptide. The SNP 235(T > G) changed UUC (Phe) to GUC (Val), and the SNP 237(C > G) changed UUC (Phe) to UUG (Leu). Phe is an aromatic amino acid, whereas leucine and valine are aliphatic amino acids of hydrophobic nature. A similar situation was observed for the SNPs in the CLA-DRB3.2 gene sequences of the other Sangamneri animals studied.
When each of the eleven sequences obtained in the present study was submitted to SWISS-MODEL analysis (Gasteiger et al., 2005), all were identified as members of the broader category of MHC Class II antigen proteins.
Of the 63 SNPs observed in the CLA-DRB3.2 gene of the eleven Sangamneri animals, 50 were non-synonymous, i.e. they led to a change in the amino acid, and only thirteen were synonymous, i.e. they did not lead to an amino acid change. A summary of the nucleotide variations in the Class II MHC CLA-DRB3.2 gene of the Sangamneri goat and the corresponding amino acid variations is given in Table 1 (Supplementary file - available online only). These results indicate a very high degree of polymorphism at the caprine Class II Major Histocompatibility Complex CLA-DRB3.2 gene locus of the indigenous Sangamneri goat breed. Earlier studies have also reported a high degree of genetic polymorphism at the Class II MHC CLA-DRB3.2 locus (Amills et al., 1995; Takada et al., 1998; Zhao et al., 2011; Gowane et al., 2018). This genetic variability is maintained because the locus encodes the antigen binding site (ABS), which comes into direct contact with different antigens. The genetic variability produces amino acid variability, which allows the protein to adopt varying conformations that aid its interaction with the diverse antigenic peptides presented by different pathogens. This in turn elicits an appropriate immune response to counter infection and thereby contributes to the disease resistance/susceptibility of the host.
Table 1
Nucleotide variations in the Class II MHC CLA-DRB3.2 gene of the Sangamneri goat and the corresponding amino acid variations resulting from each nucleotide variation.
S. No. | Single nucleotide variation position | Codon(s) and encoded amino acid(s) | Type of variation (Syn./NS) |
1 | 10(T > C) | TCT-Ser, CCT-Pro | NS |
2 | 34(A > C > T > G) | GCT-Ala, ACT-Thr, TCT-Ser, CCT-Pro, CAT-His | NS |
3 | 35(C > A) | GCT-Ala, GAT-Asp | NS |
4 | 38(A > C) | AAG-Lys, ACG-Thr | NS |
5 | 41(A > G) | AGC-Ser, AAC-Asn, AAA-Lys | NS |
6 | 42(C > A) | AGC-Ser, AAA-Lys | NS |
7 | 45(A > G) | GAG-Glu, GAA-Glu | Syn. |
8 | 47(G > T) | TGT-Cys, TTT-Phe | NS |
9 | 49(T > C) | CAT-His, TAT-Tyr | NS |
10 | 50(A > G) | CAT-His, CGT-Arg | NS |
11 | 53(T > A) | TTC-Phe, TAC-Tyr | NS |
12 | 56(T > C) | TTC-Phe, TCC-Ser | NS |
13 | 58(A > C) | AAC-Asn, CAC-His | NS |
14 | 66(C > G) | ACC-Thr, ACG-Thr | Syn. |
15 | 80(A > T) | TAC-Tyr, TTC-Phe | NS |
16 | 81(C > T) | TAC-Tyr, TAT-Tyr | Syn. |
17 | 82(C > T) | CTG-Leu, TTG-Leu | Syn. |
18 | 83(T > G) | CTG-Leu, TGG-Trp | NS |
19 | 86(A > C) | GAC-Asp, GCC-Ala | NS |
20 | 88(A > G) | AGA-Arg, GGA-Gly | NS |
21 | 95(A > T) | TTC-Phe, TAC-Tyr | NS |
22 | 97(C > T) | CAT-His, TAT-Tyr | NS |
23 | 112(A > T) | ATC-Ile, AAC-Asn | NS |
24 | 113(A > T > C) | ATC-Ile, TAC-Tyr | NS |
25 | 115(G > C) | GTG-Val, CTG-Leu | NS |
26 | 122(A > T) | TTC-Phe, TAC-Tyr | NS |
27 | 128(A > G) | AGC-Ser, AAC-Asn | NS |
28 | 133(C > T) | TGG-Trp, CGG-Arg | NS |
29 | 143(A > T) | TTC-Phe, TAC-Tyr | NS |
30 | 150(A > G) | GCA-Ala, GCG-Ala | Syn. |
31 | 153(G > T) | GTG-Val, GTT-Val | Syn. |
32 | 154(A > G) | ACC-Thr, GCC-Ala | NS |
33 | 155(C > G) | ACC-Thr, GGC-Gly | NS |
34 | 172(A > G) | GAT-Asp, AAT-Asn, AAG-Lys | NS |
35 | 173(A > G) | GAT-Asp, GGT-Gly, AGT-Ser, AGG-Arg | NS |
36 | 174(C > T > G) | GAT-Asp, GAG-Glu, GAC-Asp, AGT-Ser, GGT-Gly | NS |
37 | 177(C > G) | GCC-Ala, GCG-Ala | Syn. |
38 | 178(A > G) | GAG-Glu, AAG-Lys | NS |
39 | 179(A > G) | GAG-Glu, GGG-Gly, AGG-Arg | NS |
40 | 190(A > G) | AGC-Ser, GGC-Gly | NS |
41 | 201(C > G) | GAG-Glu, GAC-Asp | NS |
42 | 202(A > T > C) | ATC-Ile, TTC-Phe, CTC-Leu | NS |
43 | 204(C > T) | ATC-Ile, ATT-Ile | Syn. |
44 | 208(A > G) | AAG-Lys, GAG-Glu | NS |
45 | 211(C > G > A) | CAG-Gln, GAG-Glu, AAG-Lys | NS |
46 | 212(A > G) | CAG-Gln, CGG-Arg | NS |
47 | 213(C > G) | CAG-Gln, GAC-His, AGC-Ser | NS |
48 | 215(A > G) | AGG-Arg, AAG-Lys | NS |
49 | 216(A > G > C) | AGG-Arg, AGA-Arg | Syn. |
50 | 217(A > C) | CGG-Arg, AGG-Arg | Syn. |
51 | 220(A > G) | GCC-Ala, ACC-Thr | NS |
52 | 224(C > A) | GCG-Ala, GAG-Glu | NS |
53 | 225(G > C) | GCG-Ala, GCC-Ala | Syn. |
54 | 227(T > C) | GTG-Val, GCG-Ala | NS |
55 | 235(T > G) | TAC-Tyr, GAC-Asp | NS |
56 | 236(A > T) | TAC-Tyr, GTC-Val | NS |
57 | 237(C > G) | TAC-Tyr, GTG-Val | NS |
58 | 242(C > G) | ACA-Thr, AGA-Arg | NS |
59 | 256(G > T) | GTC-Val, TTC-Phe | NS |
60 | 259(G > T > A) | GTT-Val, TTT-Phe, ATT-Ile | NS |
61 | 264(A > G) | GAG-Glu, GAA-Glu | Syn. |
62 | 266(G > T) | AGT-Ser, ATT-Ile | NS |
63 | 282(G > C) | CGG-Arg, CGC-Arg | Syn. |
NS: non-synonymous; Syn.: synonymous.
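The Syn./NS labels in Table 1 follow from comparing the amino acids encoded by the reference and variant codons. A minimal Python sketch of that comparison is given below. It assumes the standard genetic code and a reading frame starting at nucleotide 1, so that nucleotides 202 to 204 form codon 68; the function `classify_snp` is hypothetical and is not part of the software used in the study.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids in TCAG order (third position fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, position, alt_base):
    """Return (variant codon, reference amino acid, variant amino acid, label).

    `position` is the 1-based nucleotide position of the SNP in the gene;
    frame 1 is assumed, so the offset within the codon is (position - 1) % 3.
    """
    offset = (position - 1) % 3
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    label = "Syn." if ref_aa == alt_aa else "NS"
    return alt_codon, ref_aa, alt_aa, label


# The two SNPs described in the text for codon 68 (CTC, leucine) of animal 3.
print(classify_snp("CTC", 202, "A"))  # ('ATC', 'L', 'I', 'NS')
print(classify_snp("CTC", 204, "T"))  # ('CTT', 'L', 'L', 'Syn.')
```

The sketch treats each SNP independently against a single reference codon; codons carrying several SNPs at once, as in several rows of Table 1, would need every observed codon to be compared.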
A nonsynonymous (dN) to synonymous (dS) substitution rate ratio greater than one (ω = dN/dS > 1) in the present study implies strong positive selection acting on the caprine Class II MHC-DRB3.2 gene locus (Kryazhimskiy and Plotkin, 2008). Both conservative and non-conservative substitutions were observed among the non-synonymous variations in the CLA-DRB3.2 gene. Previous studies have also reported dN/dS > 1 at the Cahi DRB3.2/CLA-DRB3.2 gene locus (Ahmed and Othman, 2006; Gowane et al., 2018), and therefore positive selection acting on this locus. Positive selection is expected at this locus because it is involved in antigen presentation: it must interact with a wide variety of antigenic peptides and elicit an appropriate immune response to protect the animal from disease.