Genetic relationship of Nepalese populations and their surrounding populations
We calculated diversity indices and tests of selective neutrality for the Nepalese populations based on the HVS-I sequences (Additional file 2: Table S5). We detected relatively low nucleotide diversity and mean pairwise differences among the Newar sub-castes, whereas the highest diversity was observed in the Kathmandu (Gayden et al. 2007) and other Nepalese (a mixed population from Kathmandu and Eastern Nepal) (Wang et al. 2012) samples. The pairwise Fst distances (Additional file 1: Figure S2) show relatively high genetic distances between the Newar and the Tamang (0.031), Nepali-other (0.029), Brahmin (0.025) and Magar (0.018), in decreasing order. The principal component analysis (PCA) of the Nepalese populations (Additional file 1: Figure S3), accounting for 43.27% of the total variation, gave a broad overview of population structure across the Southern Himalaya. All the Newar sub-castes clustered tightly and were placed close to the mixed populations from Kathmandu and Eastern Nepal. The Brahmin were positioned near the eastern Tharu (TH-E), Kathmandu and Nepali-other. Magar (MGR), Tamang (TMG) and Tharu from Chitwan (TH-CI and TH-CII) clustered closely in the PCA plot. Sherpa, who inhabit the highest mountainous regions close to the Tibetan plateau, remained distant from LAN, suggesting a low level of recent gene flow from surrounding Nepalese populations.
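To make the diversity and distance measures above concrete, the following is a minimal sketch of mean pairwise differences, nucleotide diversity and a pairwise Fst for aligned sequences. It is an illustration only: the function names are hypothetical, and the Fst shown is the simple Hudson-type estimator (1 minus the ratio of within- to between-population mean differences), which may differ from the estimator used to produce the values reported here.

```python
from itertools import combinations


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs within one population."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def mean_between_differences(pop1, pop2):
    """Average number of differences over all between-population pairs."""
    total = sum(hamming(a, b) for a in pop1 for b in pop2)
    return total / (len(pop1) * len(pop2))


def hudson_fst(pop1, pop2):
    """Hudson-type Fst: 1 - (mean within) / (mean between). Assumed estimator."""
    within = (mean_pairwise_differences(pop1) + mean_pairwise_differences(pop2)) / 2
    between = mean_between_differences(pop1, pop2)
    if between == 0:
        raise ValueError("populations are identical; Fst is undefined")
    return 1 - within / between
```

Under this sketch, two populations whose sequences differ more between groups than within groups yield an Fst approaching 1, while freely mixing populations yield values near 0.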
To further examine the genetic relationship of the Nepalese with other Asian populations, we performed a PCA (Fig. 2) based on the mtDNA haplogroup frequencies of Nepalese and Asian populations (Additional file 2: Table S4). PC1 largely separated the Tai-Kadai and Han Chinese from the other groups. PC2 delineated the South Asians, including the Nepalese, from the rest of the groups. The Tamang, Magar and Tharu (CI and CII) groups of Nepal are placed close to the high-altitude Himalayan cluster formed by Sherpa and Tibetans, suggesting that the high-altitude barrier to gene flow is more permeable in the north-to-south direction across the Himalayan arc. A Southern cluster consisted mostly of IE and Dravidian-speaking populations of India and Nepal. Newar, Nepali-mix (Nepali-other), Tharu (Eastern), Tibeto-Burman groups (Myanmar, Bangladesh and northeast India) and IE groups (Kshatriya and Shah of North India) appear as potentially admixed populations forming a central Himalayan cluster in the PCA plot.
Figure 2 | Principal component analysis of the Nepalese and 127 other Asian populations. Non-Nepalese populations were mostly grouped by language family. The first (PC1) and second (PC2) principal components explain 24.37% and 21.4% of the genetic variance, respectively.
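As an illustration of how a PCA on haplogroup frequencies yields population scores and a "variance explained" figure, the sketch below extracts only the first principal component by power iteration on the covariance matrix. It is a simplified stand-in under stated assumptions: the function names and the toy frequency matrix in the usage note are hypothetical, and the actual analysis was presumably run with dedicated statistical software.

```python
def center_columns(rows):
    """Subtract each column mean (one column per haplogroup)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix between haplogroup columns."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov by power iteration."""
    # Non-uniform start: a uniform vector lies in the null space of cov
    # when every row of frequencies sums to 1.
    v = [1.0 + 0.1 * i for i in range(len(cov))]
    for _ in range(iterations):
        w = matvec(cov, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
    return v, eigenvalue


def pc1_summary(freqs):
    """Return PC1 scores per population and the fraction of variance explained."""
    centered = center_columns(freqs)
    cov = covariance(centered)
    axis, eigenvalue = first_component(cov)
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    scores = matvec(centered, axis)
    return scores, eigenvalue / total_variance
```

For a toy matrix with one row per population and one column per haplogroup, such as `[[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.1, 0.2, 0.7], [0.2, 0.1, 0.7]]`, `pc1_summary` places the first two populations on one side of PC1 and the last two on the other. The sign of the axis is arbitrary, so only relative positions are meaningful.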
This differentiation across Nepalese subgroups was further tested by analyzing the derived variant rs3827760 of the EDAR gene (Fig. 3). Genome-wide scans of positive selection in global human populations have identified the derived coding variant of the EDAR gene as a candidate for positive selection in East Asians (Fujimoto et al. 2008; Kamberov et al. 2013). This has resulted in a very high frequency of the non-synonymous SNP rs3827760 (370A; 1540C) in populations of East Asian and Native American origin, whereas it is virtually absent among Indian (Dravidian and Indo-European), European and African populations (Chaubey et al. 2011; Fujimoto et al. 2008). Interestingly, this derived allele was present in both the Tibeto-Burman (Magar and Newar) and IE (Brahmin) populations of Nepal. The highest 1540C allele frequency was observed in Magar (71%), followed by Newar (30%) and Brahmin (20%). The frequency observed in Magar was similar to those reported in HAAPs (Himalayan and adjoining populations) (Tamang et al. 2018) and Indian Tibeto-Burman populations (Chaubey et al. 2011). However, despite having nearly equal proportions of East Eurasian maternal components, Newar showed less than half the derived EDAR allele frequency of Magar, reflecting their distinct population histories. The presence of a substantial 1540C allele frequency in Brahmin implies admixture with the local Tibeto-Burman populations.
Figure 3 | EDAR 1540C allele frequency distribution. (A) Geographic distribution of the EDAR 1540C allele frequency worldwide. The map was generated using the Kriging linear model of Surfer 16.0.3 (Golden Software, LLC). (B) Geographic distribution of the EDAR 1540C allele frequency in different groups of Asia. Bubble size is proportional to the frequency.
Phylogeographic patterns of the dominant East Eurasian haplogroups
To explore the migratory and admixture events that might have occurred between the aboriginal Nepalese and other East Eurasians, we compared the dominant East Eurasian lineages in Nepalese (Table 2) and East Eurasians, and analyzed haplotype networks and maximum-parsimony phylogenetic trees of the major East Eurasian haplogroups. In addition, frequency contour maps of the shared haplogroups were constructed.
Table 3 | Estimated ages of the basal East Eurasian lineages in the Nepalese population. ρ ± SE and TMRCA are based on complete genome substitutions and the Soares rate (Soares et al. 2009).

| Haplogroup | n | ρ ± SE | TMRCA (Kya) | Haplogroup | n | ρ ± SE | TMRCA (Kya) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| A + 152 + 16362 | 53 | 8.9 ± 0.2 | 24.5 (22.9, 26.0) | F1d1 | 29 | 2.9 ± 0.2 | 7.6 (6.4, 8.8) |
| A17 | 41 | 5.3 ± 0.1 | 13.9 (13.4, 14.9) | F1d1a*+15204 | 14 | 2.8 ± 0.2 | 7.3 (6.0, 8.6) |
| A17a* | 8 | 3.2 ± 0.3 | 8.4 (6.8, 10) | F1d1a1* | 14 | 1.8 ± 0.2 | 4.6 (3.4, 5.9) |
| A17a1* | 2 | 1.5 ± 0.5 | 3.8 (1.3, 6.4) | F1d + 5460 (F1d2*) | 7 | 3.5 ± 0.1 | 9.2 (8.28, 10.2) |
| A27* | 11 | 3.2 ± 0.2 | 8.4 (7.3, 9.4) | F1c1a2a* | 19 | 3.6 ± 0.2 | 9.5 (8.4, 10.5) |
| Z3 | 97 | 9.2 ± 0.1 | 25.4 (24.4, 26.3) | F1c1a2a2* | 5 | 2 ± 0.07 | 5.2 (4.8, 5.5) |
| Z3a | 59 | 6.9 ± 0.2 | 18.7 (17.6, 19.8) | F1c1a2a1 | 9 | 1 ± 0.1 | 2.5 (2, 3) |
| Z3a1 | 51 | 6.8 ± 0.2 | 18.4 (16.8, 20.0) | G2a1d2 | 14 | 5.4 ± 0.3 | 14.4 (12.5, 16.1) |
| Z3a1a | 45 | 4.8 ± 0.2 | 12.8 (11.7, 13.9) | G2a1d2b* | 7 | 5.1 ± 0.5 | 13.6 (10.9, 16.3) |
| Z3a1a1* | 43 | 3.2 ± 0.1 | 8.4 (7.9, 8.9) | G2a1d2b1* | 6 | 4.8 ± 0.4 | 12.8 (10.6, 15) |
| Z3a2 | 7 | 4.1 ± 0.5 | 10.8 (8.2, 13.5) | G2a1d2a | 5 | 1.8 ± 0.5 | 4.6 (2.1, 7.2) |
| Z3a2a | 5 | 1 ± 0.3 | 2.5 (1.0, 4.1) | G2b2a1* | 3 | 3.3 ± 0.1 | 8.7 (8.1, 9.2) |
| Z3b | 9 | 2.1 ± 0.3 | 5.4 (3.9, 7.0) | G3a3* | 4 | 2.2 ± 0.4 | 5.7 (3.6, 7.8) |
| Z1 | 22 | 7.9 ± 0.4 | 21.6 (19.3, 23.8) | D4i | 41 | 5 ± 0.1 | 13.3 (12.8, 13.9) |
| Z1a | 21 | 4 ± 0.2 | 10.6 (9.5, 11.6) | D4i2 | 21 | 3.1 ± 0.3 | 8.1 (6.5, 9.7) |
| Z1a1 | 15 | 3.3 ± 0.3 | 8.7 (7.1, 10.3) | D4i2a* | 21 | 1.1 ± 0.1 | 2.8 (2.3, 3.3) |
| Z1a1a | 13 | 1.5 ± 0.1 | 3.8 (3.3, 4.4) | D4i3 | 5 | 3 ± 0.4 | 7.8 (5.7, 10) |
|  | 18 | 0.8 ± 0.07 | 2.0 (1.7, 2.4) | D4i6* | 4 | 5.5 ± 0.8 | 14.7 (10.4, 19.2) |
| Z7 | 7 | 2.1 ± 0.2 | 5.4 (4.1, 6.8) | D4i6a* | 3 | 2.7 ± 0.6 | 7 (3.9, 10.2) |
| Z7 + 7471 | 7 | 2.1 ± 0.2 | 5.4 (4.0, 6.8) | D4h2b* | 5 | 1.6 ± 0.1 | 4.1 (3.6, 4.6) |
| Z8* | 2 | 1.5 ± 0.6 | 3.8 (0.8, 7.0) | D5a2b | 11 | 4.7 ± 0.3 | 12.5 (11, 14.1) |
| F1d | 59 | 3.6 ± 0.1 | 9.5 (8.7, 10.2) | D5a2a1a | 7 | 6.6 ± 0.5 | 17.8 (15, 20.6) |

TMRCA: time to most recent common ancestor.
The symbol * indicates new basal lineages identified in this study.
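The relationship between ρ and TMRCA in Table 3 can be sketched as follows. This is a hedged illustration, assuming the complete-genome calculator of Soares et al. (2009) as commonly quoted: one substitution per 3,624 years, with a nonlinear correction for purifying selection. The constants and function names are assumptions to be checked against the original publication, and the sketch does not reproduce the standard errors or intervals shown in the table.

```python
import math

# Assumed constants from the complete-genome mtDNA clock of Soares et al. (2009).
YEARS_PER_SUBSTITUTION = 3624.0
CORRECTION_SLOPE = 0.0263
CORRECTION_OFFSET = 40.2806


def rho_statistic(substitutions_from_root):
    """Mean number of substitutions separating each sampled sequence from the root haplotype."""
    if not substitutions_from_root:
        raise ValueError("need at least one sequence")
    return sum(substitutions_from_root) / len(substitutions_from_root)


def rho_to_age_kya(rho):
    """Convert rho to an age in thousand years (Kya) with the selection correction."""
    correction = math.exp(-math.exp(-CORRECTION_SLOPE * (rho + CORRECTION_OFFSET)))
    return correction * rho * YEARS_PER_SUBSTITUTION / 1000.0
```

Under these assumptions, ρ = 8.9 converts to roughly 24.5 Kya and ρ = 3.2 to roughly 8.4 Kya, in line with the corresponding point estimates in Table 3. Small deviations for other rows are expected because the tabulated ρ values are rounded.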
The phylogeography of haplogroup Z
Figure 4 | Diversity and distribution pattern of haplogroup Z. (A) Phylogenetic tree of sub-clade Z3 based on complete mtDNA sequences. Symbol * represents the new haplogroup defined in this study. (B) Network of sub-haplogroup Z3 based on HVR I sequence, color-coded by geographic origin. (C) Phylogenetic tree of haplogroup Z7 and Z8 based on complete mtDNA sequences. (D) Spatial frequency distribution map of haplogroup Z. Populations (Additional file 2: Table S4) having a frequency of Z > 0 were included in the analysis and were marked by the blue dots on the contour map.
Haplogroup Z is an infrequent mtDNA haplogroup with a sporadic distribution in Russia, East Asia, Mainland Southeast Asia (MSEA), Central Asia, Eastern and Northern Europe, and South Asia (Fig. 4) (Chandrasekar et al. 2009; Fedorova et al. 2013). Among the members of haplogroup Z (Z1, Z2, Z3, Z4 and Z7), Nepalese populations were characterized by the rare clades Z3a1a and Z7, of which Z3a1a was the most frequent sub-clade in Newar (Additional file 1: Figure S4). Lineage Z1 accounts for most of the representatives of haplogroup Z in North Asia and Europe (Fedorova et al. 2013). In this region, we estimated a coalescent age of ~ 21.6 thousand years ago (Kya) for clade Z1 (Table 3), similar to the earlier estimate of ~ 20.4 Kya (Fedorova et al. 2013). Another clade, Z3, found in East Asia, North Asia and MSEA, is the oldest member of haplogroup Z, with an estimated age of ~ 25.4 Kya. Haplogroup Z has its highest clade diversity in China, with an overall prevalence of 2.8% (620/21668), and is slightly more frequent in Northern Han Chinese (3.1%). Likewise, the overall frequency of haplogroup Z in Tibetans was 1.92% (of 6109 samples) (Qi et al. 2013). In another recent study of Tibetans, with a comparatively smaller sample, haplogroup Z was represented by clades Z3a (5/682), Z3a1a (1/682), Z4a (1/682) and Z4 (1/682) (Qi et al. 2013). Overall, haplogroup Z reaches its highest incidence in Tibeto-Burman-speaking populations from Thailand (Lisu_2, 20%) (Oota et al. 2001), Nepal (Newar, 18.6%) and Myanmar (Chin2, 12%) (Li et al. 2015), in which clade Z3a1a accounts for most of the representatives of this haplogroup.
After constructing a phylogenetic tree (Fig. 4A and Additional file 1: Figure S5) based on 161 complete mtDNA sequences, we found that the ancestral nodes of Z3a were distributed mainly in Northern and Eastern Han Chinese, so the clade most likely originated in these regions. This inference was further supported by the presence of the Z3a1 lineage in the Yakut of the Sakha Republic (Additional file 1: Figure S6). The terminal clade Z3a1a1, detected in Tibet, Myanmar, Nepal, India, Thai-Laos and Vietnam, traces its ancestral roots to China, with a coalescent age of ~ 8.4 Kya (Table 3). The basal variant 9452A, with a coalescent age of ~ 4.4 Kya, was shared only among the members of Z3a1a1 from North-East India, Thailand and Vietnam, and not with the Nepalese and Tibetans. The Z3a1a1 lineages of Tibet, Nepal and India do not share any further basal variants, and lineage expansion, as denoted by a star-like pattern, was confined to their respective geographic locations (Fig. 4B). Sub-haplogroup Z7 (Fig. 4C), observed so far only in Tibet, Northeast India and Nepal, coalesces at ~ 5.4 Kya (Table 3). The newly defined clade Z8 (Fig. 4C), detected so far only in the Newar ethnic group of Nepal, has a coalescent age of ~ 3.8 Kya (Table 3).
The phylogeography of haplogroup F
Figure 5 | Diversity and distribution pattern of haplogroup F1. (A) Network of haplogroup F1 (F1d, F1d1, F1c1a2a, F1g, and F1g1) complete mtDNA sequences, color-coded by geographic origin. (B) Network of sub-haplogroup F1d1, color-coded by geographic origin. (C, D) Spatial frequency distribution maps of sub-haplogroups F1d and F1c in Asia. Out of 308 Asian populations (Additional file 2: Table S4), only the populations having a frequency of F1d > 0 were included in the analysis and are marked by blue dots on the map.
Haplogroup F, mainly clade F1a, is one of the most common haplogroups throughout Asia, with high frequencies and haplotype diversities reported in MSEA and China, indicating a dispersal center for this haplogroup (Duong et al. 2018; Kutanan et al. 2017; Li et al. 2015; Li et al. 2019b; Summerer et al. 2014; Zhang et al. 2013). Although the overall incidence of haplogroup F in Han Chinese is 16.5%, its sub-clades F1d (0.5%, 111/21668), F1c (0.85%) and F1g (0.8%) were scarce (< 1%) and relatively more common in Northern Han Chinese (Li et al. 2019b). In the present study, Nepalese populations were characterized by rare clades of F1 (F1c, F1d1, and F1g) and F2b, of which F1d1 was the second most frequent sub-clade in Newar (Fig. 5A and Additional file 1: Figure S4). Tibetan mtDNA data also revealed similar frequency patterns, in which F1g (4.5%) (Kang et al. 2016) was more frequent than F1d (2.6%, 18/671) (Li et al. 2019a) and F1c (2.16% of 6109 samples) (Qi et al. 2013). However, the Lhobas, one of the most isolated Himalayan tribes of southeastern Tibet, harbor a high frequency of both clade F1d (11%, 10/91) and F1c (7.2%, 7/96) (Kang et al. 2016). Likewise, in MSEA, among the Burmese of Myanmar and the Mon people of west Thailand, the frequency of F1d reaches up to 7% and 8%, respectively (Kutanan et al. 2017; Summerer et al. 2014). Overall, haplogroup F1d reaches its greatest proportion in the Newar (11.97%) of Nepal and the Kshatriya (16%) (Negi et al. 2016) of North India.
When we constructed a maximum-parsimony phylogenetic tree (Additional file 1: Figure S6) based on 121 complete mtDNA sequences, we found that the ancestral node of clade F1d was distributed mainly in Han Chinese, with an estimated coalescent age of ~ 9.5 Kya (Table 3). Clade F1c, another sub-lineage of haplogroup F, attains its highest frequency in Tibeto-Burman-speaking groups from Northeast India (Apatani, 16.8% and Nishi, 9.1%) and Myanmar (Chin 3: 15.4%, Naga 1: 6.3% and Rakhine 1: 8.3%). The estimated coalescent ages of sub-clades F1c1a2a and F1d1 were ~ 9.5 Kya and 7.8 Kya, respectively (Table 3). After the divergence from the ancestral Chinese F1d1 sequence, Tibeto-Burman groups (of Myanmar, Tibet, and Nepal) and Austroasiatic (AA) groups (of Thailand) shared a common variant, 16284G (Fig. 5B), implying a common Tibeto-Burman ancestry. Intriguingly, these groups do not share any further basal variants, and lineage expansion, as denoted by a star-like shape, was confined to their respective geographic regions (Fig. 5B). The Nepalese-specific branch (F1d1a1) coalesces at ~ 4.4 Kya, suggesting an ancient origin and lineage expansion in Nepal, most probably via southeastern Tibet (Fig. 5C). Likewise, after bifurcating from the ancestral Chinese/Vietnamese sequence, four basal variants (3970T, 5300T, 5663T, and 12892C) defining clade F1g1a remained restricted to Nepalese populations (Additional file 1: Figure S8). Analysis of the other clades, F1c and F2b, revealed a similar mode of distribution, further supporting the notion of an early entry into Nepal (Fig. 5D and Additional file 1: Figures S7 and S9).
The phylogeography of haplogroup A
Figure 6 | Maximum parsimony tree of haplogroup A17 and the newly defined haplogroup A27. The complete phylogenetic tree based on 52 complete mtDNA sequences is given in Additional file 1: Figure S10.
Haplogroup A is one of the most prevalent haplogroups in northern and eastern Asia. Its frequency is higher in Tibetans (14.63%) (Qi et al. 2013) than in Northern (8.53%) and Southern Han Chinese (6.54%) (Li et al. 2019b). Members of haplogroup A in Tibet are dominated by sub-clades A4 (7.8%) and A11 (5.5%), whereas the other sub-clades are absent or present at low frequency in Tibetans (< 2%) (Kang et al. 2016; Qi et al. 2013). Haplogroup A, dominated by its sub-lineage A15c1, was the most common maternal lineage in the Sherpa of Nepal, with a frequency of 27.5% (Bhandari et al. 2015). This clade was not observed in the other Nepalese populations studied so far. In the examined LAN, lineage A was represented by its clades A27, A14 and A17, of which A27 was the most abundant clade in Newar (3.99%). The newly defined clade A27, detected so far only in Newar and Nepali-mix, coalesces at ~ 8.4 Kya (Fig. 6 and Table 3), suggesting an ancient origin and potentially in-situ differentiation in Nepal. The other sub-clades, A14 and A17, were seen at low frequency in Brahmin, Newar, Kathmandu and Nepali-mix. The A17a lineages of Nepalese, Chinese and Vietnamese share a most recent common ancestor dating back to ~ 8.4 Kya (Fig. 6 and Table 3). Its sub-clade A17a1, which branches off directly from the nodes occupied by the Chinese/Vietnamese lineages, is peculiar to Nepal, with a coalescence time of ~ 3.8 Kya.
The phylogeography of haplogroup D
Haplogroup D is found in Northeast Asia and Central Asia, and is the second most dominant haplogroup in Tibetans, with an average frequency of 16.5% on the Plateau (Comas et al. 2004; Kang et al. 2013; Kutanan et al. 2017; Li et al. 2015; Li et al. 2019b). Detected at noticeable frequency among the Nepalese, it is also the most dominant maternal lineage in Tamang (26.1%) and Magar (24.3%). Lineage D4 and its representatives D4i, D4j and D4e were the major sub-haplogroups in Nepalese, whereas the D5 lineage was dominated by D5a2 and its sub-clades D5a2b and D5a2a1 (Additional file 1: Figure S4). We detected distinct sub-clades of D4i distributed among the Nepalese populations. Magar and Nepali-mix shared a most recent common ancestor with Chinese and Taiwanese D4i3, with a coalescent time of ~ 7.8 Kya (Table 3 and Additional file 1: Figure S11). In contrast, the D4i lineage detected in Newar, newly named D4i6, shared its ancestral root with the Turkic ethnic group (Uyghur) of northwest China (near Central Asia) and coalesces at ~ 14.7 Kya (Table 3). Its sub-clade D4i6a, sighted so far only in the Nepalese populations, coalesces at ~ 7 Kya. Interestingly, the newly defined clade D4h2b, detected in Thailand and Nepal, shares a most recent common ancestor and coalesces at ~ 4.1 Kya. Three Nepalese samples shared a most recent common ancestor with the Chinese/Vietnamese and Tibetan D5a2b lineage (Additional file 1: Figure S12).
The phylogeography of haplogroup M9a
M9a is the most common haplogroup in Tibet, with an overall occurrence of 22.4% (Qi et al. 2013). In the Nepalese populations, it is prevalent mainly in Sherpa (27.4%), Tharu-CI (19.6%), Tamang (15.5%), Magar (13.5%), and Tharu-CII (Additional file 1: Figure S4). Among the four sub-haplogroups of M9a (M9a1a, M9a1a2, M9a1b1 and M9a1a1c1b1a), M9a1a1c1b1a is predominant (58% of the total M9a individuals) in both Tibetans (Peng et al. 2011) and Sherpa (Bhandari et al. 2015) but virtually absent in other Nepalese populations. A previous study estimated a shallow divergence between Tibetans and Sherpa, indicating recent gene flow from Tibetans to Sherpa (Bhandari et al. 2015). Likewise, M9a1a2 is prevalent in Tharu and Magar, whereas Tamang encompasses the M9a1a, M9a1a2 and M9a1b sub-clades. The shared prevalence of M9a1a2 among Nepalese populations (Tharu, Magar, Tamang and Newar) and its estimated age of 5.6 Kya (Table 3) indicate a rather ancient origin and, most plausibly, a de novo differentiation in Nepal (Wang et al. 2012).
The phylogeography of haplogroup G
Haplogroup G is distributed at a significant frequency across East Asia, Central Asia, and MSEA (Asari et al. 2007; Kutanan et al. 2017; Li et al. 2015; Li et al. 2019b; Qi et al. 2013). Among Nepalese, haplogroup G reaches its highest frequency in the Tharu ethnic group (Th-CII, 28%; Th-CI, 26.7%; and Th-E, 9.7%) and is also detected at considerable frequency in Tamang (11%) and Brahmin (8.3%). Complete mtDNA analysis shows that the Nepalese Brahmin G3 lineage clusters with those identified in China, Tibet and Kashmir. Moreover, the Brahmin and Kashmiri G3a3a lineage diverged from the ancestral Tibetan/Chinese G3 lineage ~ 5.7 Kya (Table 3 and Additional file 1: Figure S13). Similar to G3a, we identified several new lineages of G2 (G2a1d2b, G2b2a1 and G2c) characterized by population expansion within the Nepalese groups (Additional file 1: Figure S13).
Nepal occupies a crucial location in the geology of the Himalaya, serving as a bridge between the Western and Eastern Himalaya (Dhital 2015). It stretches from low-lying foothills in the south to the Great Himalayas that extend to the Tibetan plateau in the north, featuring several of the world’s highest peaks (Dhital 2015). At the genetic level, the Himalayan arc has also been suggested to have played a pivotal role in shaping human population history and genetic diversity in Asia, acting as a natural barrier that helped maintain the large cultural and genetic distinctions observable between South Asian and East Asian populations (Bhandari et al. 2015; Gayden et al. 2007; Gayden et al. 2013; Wang et al. 2012). The high desert plateau of Tibet and the high-altitude transverse valleys (> 3,000 masl) of Nepal were among the most challenging places inhabited by prehistoric humans. According to previous genetic studies, most of the Tibetan genetic components can be attributed mainly to Neolithic immigrations from northern China ~ 7 Kya (Qi et al. 2013; Zhao 2009). Archaeological data from the Tibetan plateau suggest an initial extensive human settlement of the northeastern plateau (< 2500 masl) ~ 5.2 Kya, long before the further expansion to high-altitude plateau areas (> 2500 masl) ~ 3.6 Kya (Chen et al. 2015). Among the Nepalese Tibeto-Burmans, the Sherpa migration from Eastern to Central Tibet and later, in approximately the 16th century, to the Solukhumbu region of Nepal is well documented (Oppitz ; Whelpton 2005). These historical records are consistent with both uniparental and genome-wide evidence (Bhandari et al. 2015; Jeong et al. 2016; Zhang et al. 2017). A recent study showed that two clades, M9a1a1c1b1a and A11a1a, played significant roles in shaping the genetic landscape of contemporary Tibetans (Li et al. 2019a).
However, our findings show that these lineages were limited to Tibetans and Sherpas, and were essentially non-existent in the LAN. Except for a few LAN lineages that show connections to northeastern Tibet (e.g., D4i3, G2a1d2b and G3a1), most of them point to settlement in the Eastern Himalayas, which extend from Eastern Nepal across Northeast India, Bhutan and Southeastern Tibet to Yunnan in China and Northern Myanmar.
Hence, phylogeographic analysis of the East Asian haplogroups detected in LAN shows characteristics different from those of the high-altitude Sherpa/Tibetans: (1) LAN lineages share a common Tibeto-Burman precursor with a few ethnic groups of Southeastern Tibet, including Lhasa, and are prevalent in Myanmar, Thailand and Northeast India; (2) the majority of the LAN maternal components can be traced back to Neolithic immigrations from China ~ 8 Kya, which is in good agreement with previous studies (Qi et al. 2013; Zhao 2009); (3) significantly, after the divergence from the ancestral Chinese sequence, the majority of LAN mtDNA lineages do not share any additional variants with Tibetans, Northeast Indians and Burmese.
This genetic differentiation between Sherpa/Tibetans and LAN is further supported by the mtDNA profiles of eight ancient dental samples from the high-altitude Annapurna Conservation Area (ACA) of Nepal (Mustang District), situated near the Tibetan plateau (Jeong et al. 2016). These samples, spanning 3,150 to 1,250 years before present, yielded sub-clades (Z3a1a, F1c1a, F1d1, D4j1b, and M9a) that were mostly present in the contemporary LAN and virtually absent in the high-altitude Sherpa (except M9a, whose terminal haplogroup was uncharacterized). Therefore, it is conceivable that the precursors of the earliest inhabitants of the southern slopes of the Himalaya (e.g., those who occupied the Mustang region of Nepal) were genetically closer to the contemporary low-altitude Nepalese and plausibly crossed the high-altitude Tibetan/Nepalese passes. Our results do not support the previous claim of an affiliation of the ACA samples with the contemporary high-altitude Sherpas (Jeong et al. 2016). This sheds new light on the migratory routes and admixture events that might have occurred on the southern slopes of the Himalayas. Furthermore, phylogeographic analysis of the majority of the East Eurasian lineages detected in LAN indicated de novo differentiation in contemporary Nepalese, although a few show recent genetic admixture with northeast Indians. Among the de novo differentiated lineages, most were older than 3.8 Kya (Table 3), suggesting entry into Nepal after the extensive human settlement of the plateau began.