By analyzing 164 male samples from 62 localities throughout Malaysia (including five major islands), this work represents the most comprehensive effort to infer male-mediated gene flow of M. fascicularis in Malaysia based on the TSPY and SRY segments of the Y-DNA gene. Combining this dataset with the sequences of M. fascicularis from the other regional populations, a complete and broad analysis of the Y-DNA gene flow were able to be estimated, covering the entire range of the species in SEA. In general, the combined segments of the SRY and TSPY in this study were able to provide several paternal insights into the gene flow, phylogenetic relationships at the population level as well as divergence time estimation of M. fascicularis.
In this study, the phylogenetic relationship of Macaca based on the Y-DNA supported the four major species groupings congruent with the morphological classification by Delson [31]: sylvanus, silenus, sinica, and fascicularis species groups. Although supported with low bootstrap values (except for fascicularis species group which is supported by high bootstrap values of 99 and 100% for BI and ML, respectively), all Macaca species were clustered into their respective species groups. Similarly, previous nuclear DNA studies also supported the four major groupings [9,16,32-34].
Within the fascicularis species group, two major clades were observed: (1) a species complex group comprising of M. cyclopis, M. fuscata, M. mulatta, and the Indochinese M. fascicularis and (2) M. fascicularis haplotypes of Sundaic origin. Both clades shared an LCA at ~2.35 mya, which is in proximity to the previous estimation of 2.30 mya by Tosi, Morales, and Melnick [9]. This estimation coincides with the multiple major climate cooling events during Late Pliocene and early Pleistocene occurring between 2.8 and 2.4 mya [35-36] that influenced the paleoenvironments of SEA [24-25]. The cold and dry climate during the glacials gave rise to savannah-like vegetation or semi-open woodland [25,27,29,37], causing restriction of dispersal for primates, thus could lead to the separation between both clades.
Hybridization of M. mulatta Y-DNA into M. fascicularis in the Indochinese Region
The haplotype sharing between M. mulatta (N = 12) and M. fascicularis (N = 34) represents the male-mediated gene flow of M. mulatta into M. fascicularis populations from the Indochinese region. Several authors have previously reported on the possible occurrence of the hybridization events between the Indochinese M. fascicularis and M. mulatta [9,15-16,23,38-40]. The extent of the contamination of M. mulatta Y-DNA into the Indochinese M. fascicularis population is not only restricted to the contact zones between both species but possibly more widespread across the Indochinese region [23,38-39].
In this study, we confirmed the absence of the male-mediated gene flow of M. mulatta into M. fascicularis from the Sundaic region, which is restricted at the Surat Thani-Krabi depression region, as shown in Fig. 2b. This is similar to the separation observed between the southern and northern pig-tailed macaques [41]. It is possible that the Surat Thani-Krabi depression played a significant role as a barrier that limits the downwards dispersal of the contaminated Indochinese M. fascicularis into the Sundaic lineage. Thus, based on the current finding, we defined the Sundaic M. fascicularis as the original carrier of Y-DNA in M. fascicularis.
The region of Isthmus of Kra and Kangar-Patani marks the biogeographical barrier that separated the Indochinese and Sundaic zoogeographical regions [42]. These regions display the shift from mixed deciduous forest to the north to wet seasonal rainforest type to the south [43]. Situated along the modern Thailand–Malaysian border (see Fig. 2b), these biogeographical barriers has been documented to limit the gene flow between the Indochinese M. fascicularis from their Sundaic conspecifics [15,23,39,44] as well as in other vertebrate faunal species [43,45-50]. Therefore, to measure the extent of the M. mulatta male hybridization into M. fascicularis, further research to screen M. fascicularis from this area are needed.
Sundaic M. fascicularis: Continental and Insular Forms
The Sundaic M. fascicularis lineage, which is the original carrier of Y-DNA in M. fascicularis, was further separated into continental and insular forms and shared the LCA at about 2.00 mya. The continental form comprised of haplotypes from southern Thai Peninsular (south of Surat Thani-Krabi depression), Peninsular Malaysia, and Sumatra, including the introduced Mauritian population, while the insular form consisted of haplotypes from southern Sumatra, Java, Borneo, and the Philippines. Likewise, previous molecular studies have identified both forms using mtDNA and nuclear gene [15,23,51-56]. Furthermore, the MJN suggested that the insular form was the predecessors of the continental form (Fig. 4). This result would be consistent with the hypothesis that the ancestors of M. fascicularis evolved in insular SEA [31,57-58]. Wherever and whenever this species originated, the earliest M. fascicularis-like fossil was discovered in Java, suggesting that they have already inhibited the Sundaic region as early as 0.9 – 1.0 mya [58].
After diverging out from the insular form, the continental form diverged into 17 haplotypes, which all shared the LCA at around 1.65 mya (Fig. 3). The dominant Haplotype 1 characterized this form, which is shared with 103 individuals and is distributed as far north as Wat Suwankhuha, of the Thai Peninsular, downwards to entire Peninsular Malaysia until Sumatra of Indonesia. According to the MJN (Fig. 4), all the other continental haplotypes were derived from this dominant haplotype. Moreover, within Peninsular Malaysia, the distribution of haplotypes (Fig. 2a) further showed the segregation of several unique haplotypes that can be found only at the north-western region (Haplotype 4, 6 – 8, 11 – 14) as well as the east coast region (Haplotype 15 – 17). In contrast, the central-southern regions are relatively connected with no unique haplotype defining the region.
Compared to the 17 observed haplotypes within the continental form, only two haplotypes were observed within the insular form, which shared the LCA at around 0.71 mya (Fig. 3). M. fascicularis from southern Sumatra, Borneo, and the Sibuyan Island of northern Philippines were all represented by a single dominant insular haplotype, Haplotype 18. The island of Java, on the other hand, was represented by Haplotype 19. The small number of haplotypes observed shows that the insular form consists of significantly low variability and intact Y-DNA. The highly dispersal nature of males would move and homogenize Y-DNA across populations [9] and therefore contribute to this highly intact Y-DNA across the insular region.
Prolong and recurring connectivity that existed in the past could have facilitated the dispersal of males between the insular regions (Sumatra, Java, Borneo, and the Philippines). Several episodes of sea-level fluctuations in Sundaland during the Pliocene up until recently during the Holocene around seven thousand years ago (kya) would permit for such connectivity with some as low as -120 m below the present level [25,59]. In addition to the connectivity, a possible ancient bottleneck could explain for the intact Y-DNA in the insular region. The ancestors of the insular form could have been forced into a common refugium postulated to have existed in northern Borneo [24-25,29,60-61]. Over time, only a single dominant haplotype could have persevered. Alternatively, overhunting and exploitation by early human could also severely have reduced the effective Y-DNA gene pool. Remains of M. fascicularis-like species were discovered in cave settlements in Java (~0.9 – 1.0 mya) and Borneo (~30 – 40 kya) [58] giving evidence to early exploitations by human. Nevertheless, the low number of samples and available sequences (N = 27) could hinder the possibility of detecting other haplotypes within the insular form. Therefore, to further measure the diversity of the insular form, further research utilizing more samples should be conducted on the insular form.
Uniquely, M. fascicularis from Sumatra exhibited both the continental and insular Y-DNA haplotypes (Fig. 2b; Additional file 2: Table S2). A secondary contact hypothesis could account for this condition in Sumatra with two possible explanation. The first explanation offered by Tosi and Coke [62] hypothesized that Sumatra was already populated by the insular Sundaic form, prior to the immigrating males of the continental form from the Malay Peninsula. Due to the narrow and shallow waterways on the Sunda Shelf separating the Malay Peninsula and Sumatra [25,59], a recent dispersal of the dominant continental haplotype from the Malay Peninsula is postulated as being in the process of colonizing the native insular form.
In contrast, a second explanation similar to the first secondary contact hypothesis takes into account the biogeographical history of both regions. Several authors have postulated the formation of several refugia during the Pleistocene, whereas the Malay Peninsula was surrounded by a belt of savannah-like vegetation [24-26,63]. This implies that dispersal of the continental form occurred from a postulated refugium in Sumatra into the Malay Peninsula and the surrounding areas. Dispersal of the insular form from a refugium in western Java into Sumatra, which was already populated by the continental Sundaic haplotypes could explain for the presence of both forms in Sumatra.