CR and HSPA5 sequence diversity
From the total of 116 samples collected for this study, 108 samples were successfully sequenced for the mtDNA CR and 100 samples for the HSPA5 gene region. A ~800 bp CR sequence was amplified from each of 108 kudu samples, before assemblies were trimmed to a final sequence length of 692 bp. A HSPA5 sequence of 900 - 1100 bp was amplified from 100 kudu samples and these were trimmed to a sequence length of 582 bp, which only included exon 8 and 30 bp of the 3’-UTR. Sequences were trimmed to this extent due to sequencing difficulties at the 5’ end of the target region. All unique haplotypes for CR and HSPA5 were uploaded to Genbank (Accession numbers: OK642751- OK642779).
The total sequencing success rate for the CR was 93.1%, and 86.2% for the HSPA5 region. A total of 60 polymorphic sites were observed for the mitochondrial CR sequences, with only two polymorphic sites for the HSPA5 amplicon (synonymous, n = 1; non-synonymous, n = 1). The non-synonymous polymorphic site was identified as an A/G single nucleotide polymorphism (SNP). The majority of samples (93%) had the A/A genotype, with 7% identified as the A/G heterozygote (KwaZulu-Natal Province, n =6; Free State Province, n =1), with no G/G genotypes observed. This heterozygous SNP is located at the 3’ end of the exon, 21 bp from the stop codon. It causes an amino acid change of glutamic acid (Glu) to glycine (Gly). The synonymous polymorphic site was identified as a C/T SNP, leading to no amino acid changes, and was identified in only three samples (Eastern Cape Province, n =2; Northern Cape Province, n = 1).
Population genetic diversity
The level of nucleotide diversity (π) at the CR for each population ranged from 0.002 in the Eastern Cape Province population, to 0.027 in the Free State Province population (Table 1). The Limpopo Province showed the highest number of haplotypes and haplotype diversity, with the Eastern Cape Province population showing the lowest level of haplotypes and haplotype diversity. For HSPA5, the nucleotide diversity ranged from no diversity to 0.00029. The highest haplotype diversity levels at HSPA5 were observed for the KwaZulu-Natal, Free State and Eastern Cape Provinces (Table 1).
An AMOVA was performed after grouping all individuals according to their province of origin. For the CR, slightly more genetic variation was found within the different groups (55%) than between populations (45%). This indicates some degree of genetic structuring within the overall kudu sample population at the CR. The HSPA5 AMOVA results showed that the majority of the genetic variation was found within the different groups at 86%, with 14% of the variation occurring between populations. A second AMOVA was performed after excluding the limited samples from the Free State and North West Provinces, producing similar results.
All CR-based pairwise FST comparisons (below diagonal in Table 2) of the populations from Limpopo, Northern Cape, Free State and North West Provinces showed no to very low genetic differentiation. Populations from both the KwaZulu-Natal and Eastern Cape Provinces showed significant differentiation compared to all other populations for the CR (p=xxx), with no statistically significant differentiation when compared to each other (p = 0.111). The HSPA5 pairwise combinations (above the diagonal in Table 2) showed that the North West Province differed significantly from all the other provincial groupings, although the corresponding FST values were low.
Phylogenetic analysis
A phylogenetic tree estimated from the ML analysis showed two major clades when analysing the combined CR sequence data (Fig. 2), with strong bootstrap support (99.1%). One clade (designated the Western clade) contained all the Eastern Cape Province sequences and most of the Northern Cape Province sequences. These samples grouped neatly with sequences from the Nersting and Arctander (2001) South-Western clade from Namibia and Botswana, as well as the single South African sample used in that study originating from northern parts of the Northern Cape Province. The second major clade (designated the Eastern clade) contained all the KwaZulu-Natal Province sequences and most of the Limpopo Province sequences. The sequences from this clade grouped with representative sequences from the Nersting and Arctander (2001) Intermediate clade from Botswana, Zimbabwe and Zambia. The sequences from the North West and Free State Provinces were spread across clades with no apparent trend.
The Bayesian phylogenetic tree estimated from Beast shows similar clustering of our South African kudu sequences than observed for the ML tree, with two main lineages observed. The divergence dates calculated for the Bovidae and Cervidae split was 20.748 Mya (95% highest posterior density (HPD) = 17.718-23.758), and the Bovini-Tragelaphini split was estimated at 14.752 Mya (95% HPD = 11.041-18.553 Mya; Fig. 3; Table 3). The T. strepsiceros split from the other Tragelaphus species was calculated at around 4.476 Mya (95% HPD = 3.449-5.644 Mya). The most recent common ancestor (MRCA) for the South African kudu was estimated at 2.237 Mya (95% HPD = 1.309-3.214), during the early Pleistocene, with the MRCAs for the Western and Eastern clades estimated at 1.641 (95% HPD = 0.879-2.488 Mya) and 0.996 Mya (95% HPD = 0.512-1.704 Mya), respectively.
For the haplotype network based on the CR sequences, 26 haplotypes were identified, which grouped to two major clusters, separated by 20 mutational steps (Supplementary Figure S1). Cluster 1 consisted of all samples from the populations from the Eastern and Northern Cape Provinces with some individuals from the Free State and North West Provinces. Cluster 2 consisted of all samples from the Limpopo and KwaZulu-Natal Provinces, and some samples from the Free State and North West Provinces. One Free State specimen sampled from the eastern Free State appeared as an outlier, with 25 mutational steps between this haplotype and the nearest neighbouring haplotypes sampled. These neighbouring haplotypes originated from Northern Cape, Limpopo and North-West Province. No clustering was shown between samples from the Free State, North West and Limpopo Provinces as they were spread throughout the haplotype network. Mitochondrial diversity from the CR thus reflects the western (Eastern and Northern Cape Provinces) and eastern (Limpopo and KwaZulu-Natal Provinces) geographical origin of the sample populations, with the exception of samples from the Free State and North West Provinces found in both clusters.
The haplotype network based on HSPA5 haplotypes (Supplementary Figure S2) showed one major haplotype (Haplotype 1) for the entire dataset. Two additional minor haplotypes were observed. Haplotype 2 consisted of two sequences from the Eastern Cape Province and one sequence from the Northern Cape Province. Haplotype 3 consisted of six sequences from KwaZulu-Natal Province and one sequence from the Free State Province, representing the non-synonymous mutation observed at this gene region. The MANOVA results identified a significant association between HSPA5 haplotype occurrence per locality and the tested bioclimatic variables (F(36, 154) = 1.505; Wilks’ lambda = 0.547, p-value = 0.047). Significant links were observed between HSPA5 haplotype occurrence and the total annual precipitation, precipitation of the wettest month, precipitation in the wettest quarter and precipitation in the warmest quarter of the sampling localities (p-value < 0.05). The post hoc multiple comparisons showed that HSPA5 Haplotype 3 occurrence was significantly different from the other two haplotypes when considering the total annual precipitation, precipitation of the wettest month, precipitation in the wettest quarter and precipitation in the warmest quarter bioclimatic variables (p-value < 0.05). The Spearman’s rank correlation (rho) analysis supported the MANOVA results, and indicated a significant positive correlation between HSPA5 Haplotype 3 occurrence and the total annual precipitation, precipitation of the wettest month, precipitation in the wettest quarter and precipitation in the warmest quarter bioclimatic variables (p-value < 0.05).
BMP4 STR data
A total of 111 samples were successfully genotyped for the BMP4 STR. These tandem repeats displayed a variable number of dinucleotide repeats of cytosine (C) and adenine (A) bases (CAn). Following fragment analysis, 14 distinct alleles were identified (Table 4). Private alleles were observed for the Northern Cape (n = 2), KwaZulu-Natal (n = 1) and North West Provinces (n = 1). The ANOVA results showed that the samples from the Northern Cape and Eastern Cape (western group) had significantly (p-value < 0.05) shorter alleles than samples from Limpopo and KwaZulu-Natal (eastern group; Fig. 4).
Populations from the Northern Cape, Eastern Cape, Limpopo, and KwaZulu-Natal Provinces had lower observed heterozygosity values (HO) than what was expected (HE). The KwaZulu-Natal Province population showed the lowest HO. Populations from both the Free State and North West Provinces showed higher HO levels, although this should be interpreted with caution as both these populations were represented by low sample sizes. Furthermore, with the samples taken from taxidermists, sampling error may be present leading to FIS values that are not representative of values within specific natural populations.
An AMOVA was performed after grouping all individuals according to their province of origin. It was observed that more genetic variation is found within the different individuals at 61%, with lower levels of variation between the individuals at 10%, and among the populations at 28%. Almost all pair-wise FST combinations showed statistically significant differentiation (p < 0.05; Table 5). Only pairwise combinations of the populations from Limpopo and KwaZulu-Natal Province, and Northern Cape and North West Provinces, did not show statistically significant differentiation. The PCoA based on the FST and Nei’s D, however, provided no clear patterns of genetic differentiation between groups (Supplementary Figure S3).