Reassessing genetic relationships of Homo heidelbergensis among primates based on mitochondrial and nuclear DNA

DOI: https://doi.org/10.21203/rs.3.rs-2097114/v1

Abstract

Mitochondrial and nuclear DNA isolated from the fossils of Homo heidelbergensis are a real godsend. Morphological assessments of a given fossil can be highly subjective, sometimes having to transfer species from Homo to Australopithecus. However, sequencing data makes the accurate genetic reassessment of this fossil among primates possible.

The mtDNA for 36 primates including Neanderthal, Denisovan, and Homo heidelbergensis were downloaded from NCBI and aligned and visualized in a heat map. Homo heidelbergensis appears to be a member of the human clade, separating sharply from all other primates. Comparisons between 21 modern humans, three archaic humans, and three Pan sequences show that archaic humans fit within the variation of modern humans, meaning that together with Neanderthal and Denisovans, Homo heidelbergensis is also part of the human clade.

Reads from nine Homo heidelbergensis SRAs were downloaded from BioProject PRJEB10957 and ten for an archaic human from PRJEB22592. These reads were mapped to the hg38, the Neanderthal, and the panTro6 genomes, and variants were called using the samtools pipeline. The same proportion of reads mapped to hg38 as did from the archaic human genome, and more than to panTro6. Furthermore, the variant density was also the same between Homo heidelbergensis and archaic human, when mapping to hg38. BLASTing read sequences against hg38 and panTro6 gave similar results. Finally, the proportion of C > T/G > A point mutations in Homo heidelbergensis (40.1%) was statistically significantly greater than in modern human (33.2%). This indicates that 6.9% of these mutations stem from deamination.

Homo heidelbergensis genetically behaves very much like archaic human, and thus can be considered to be human, just like Neanderthal and Denisovan. It would be of tremendous value if either mtDNA and/or nuclear DNA could be found in the fossils of other hominids so as to make a more precise assessment of human phylogenetics possible.

Introduction

How should we classify Homo heidelbergensis? Until recently, H. heidelbergensis was known only from fossil evidence. But what can genetic evidence tell us about its relationship to humans and other primates?

The first fossil of this ancient hominid was discovered in 1907 near Heidelberg, Germany. With its relatively large brain, it fits into the lower variation of modern human variation. The face is also less robust with a smaller postorbital constriction, a steeper forehead, relatively small teeth and no chin (Park, 2010). Large hand axes are also clues pointing to the humanity of H. heidelbergensis. It is held to be the common ancestor of Eurasian Neanderthals and modern humans (Buck and Stringer, 2014), but also to be more ancestral to Denisovans (Reich, 2018), and thus called super-archaic humans (Stringer, 2012). Some researchers think that H. heidelbergensis is the predecessor of Neanderthals based on a continuum of traits between the two in fossils from Sima de los Huesos (Fleagle, 1999).

Ancient mitochondrial DNA in fossils is a godsend. With it we are able to open a window into the past and assess human genetic relationships. The mitochondrial DNA (mtDNA) is a very tiny fraction of the human genome, only around 5.5∙10− 4%. Yet, it possesses the quality of being easy to isolate, and are also present in fossils. With its short sequence length, it is also easy to align it and infer species relationships among individual species.

Isolating ancient DA (aDNA) from fossils is a rare feat, and provides researchers very valuable information. Today, four individuals from the genus Homo all have had their mtDNA sequenced: Homo sapiens, H. sapiens neanderthalensis, H. sapiens subsp. ‘Denisova’, and H. heidelbergensis (Meyer et al., 2014). Furthermore, H. heidelbergensis also has nuclear DNA in nine SRA data sets available at NCBI. This gives us an exciting new opportunity to make a molecular-based analysis of H. heidelbergensis in order to assess its relationship with other members of the genus Homo more precisely.

Principle of investigation

With these aDNA sequences from H. heidelbergensis in hand, we can perform several types of sequence comparisons. Aligning the mtDNA sequences to one another can reveal how similar two species are. This can help us in understanding phylogenetic relationships better between species. To this end, thirty-six mtDNA sequences from seven primate genera (Homo, Pan, Gorilla, Pongo, Hylobates, Papio and Macaca) were downloaded from the NCBI database and aligned with one another (Thompson et al., 2002).

In a second study comparing the sequence similarity of H. heidelbergensis with human, 21 mtDNA sequences from modern humans were downloaded from NCBI together with one from Neanderthal, Denisovan and H. heidelbergenesis, respectively, and two mtDNA sequences from Pan paniscus and one from P. troglodytes.

Nine SRA data sets for H. heidelbergensis were downloaded from the NCBI SRA website (BioProject PRJEB10957, samples ERR995357–65), as well as ten SRA data sets for an ancient “Paleolithic” human (BioProject PRJEB22592, samples ERR2117984–93). Aligning these reads to the whole genome sequence (WGS) of modern human, Neanderthal and chimpanzee (Pan troglodytes) also allows us to measure species relatedness based on molecular genetics besides the mtDNA sequences.

Materials And Methods

Data sets

A list of NCBI accession numbers for the mtDNA sequences of 36 primates and Monodelphis domestica is available in Supplementary File 1 and 28 Homo and 3 Pan NCBI accessions in Supplementary File 2. From the second data set Homo sapiens isolate NA24143 and NA24149 and EgyprRef1 were recoded into their reverse complement sequence. Nine H. heidelbergensis SRA data sets from BioProject PRJEB10597 and ten SRA data sets from archaic (so-called “Paleolithic”) humans from BioProject PRJEB22592 were aligned to the hg38 WGS of modern human and the panTrog6 (or pt6) WGS of P. troglodytes (Sikora at el., 2017). These genomes were downloaded from the UCSC browser, whereas the Neanderthal WGS was downloaded from the website of the Bioinformatics Core of UNMC (University of Nebraska Medical Center). A bowtie2 index was created for all three WGS using the bowtie2-index command. The vcf file containing 1,112,554,591 SNPs from the dbSNP database were downloaded from https://ftp.ncbi.nih.gov/snp/latest_release/VCF/GCF_000001405.39.gz.

The SRA data sets for PRJEB10597 are listed in Supplementary File 3, along with the number of reads in each data set and several statistics calculated during the analysis. The SRA data sets and the results for PRJEB22592 are listed in Supplementary File 4. All Supplementary files are available on Zenodo at https://zenodo.org/record/6330907#.YpG8bFTMLrc.

Generation of plots

For the generation of the heatmap (Fig. 1) showing the baraminic relationships between the 36 primate species and the outlier, the heatmap.2 function using the ‘single’ clustering algorithm. The MDS plot was created using the cmdscale function to create MDS plot coordinates. Figure 2 was created using the hist function in R, and Fig. 4 was created using the barplot function. The MEGA-X software (Kumar et al., 2018) was used to create alignments for the mtDNA for the 28 human mtDNA sequences and also the baraminic tree in Fig. 3 using the Neighbor Joining method (Saitou and Nei, 1987; Tamura et al., 2004). Default parameters were used to generate the tree. Supplementary Figs. 2a and 2b were generated using the treemap tool in R.

Data processing pipeline

When aligning the archaic reads from H. heidelbergensis, the bowtie2 aligner was used with the --local and --end-to-end flags over against bwa-mem, since bowtie2 is faster and increases genome coverage when aligning aDNA reads (Poullet and Orlando, 2020). The bwa aligner was not used due to the fact that it presumes that there are few differences between the query and the target sequence within the first 32 bp, which frequently is the case with aDNA (Schubert et al., 2014). The resulting sam files were transformed into bam files and sorted for the mpileup function of samtools. Finally, for each sample, variants were called using bcftools with the following command: bcftools call -P 1e-3 -mv -Ob. The -P flag means that variants were called at a p-value of 0.001.

For each sample, the number of reads and the proportion of reads aligning to hg38 and pt6 were noted in separate columns in Supplementary Files 3 and 4. Then, the variant density was calculated for hg38 and pt6 WGS by dividing the number of reads by the number of variants found for that sample. Next, the hg38/pt6 density proportion was calculated by dividing the variant density of pt6 with the variant density of hg38 to see whether there are more variants in the P. troglodytes genome compared to human.

The variant calling pipeline was run on an Ubuntu 18.04.1 operating system. Student’s t-tests were run in R, version 4.1.0. using the t.test command. Two t-tests were run to calculate the p-values reported in Table 4: a normal t-test and a second one where the ‘alternative’ parameter was set to ‘less’. This means that the values in the first data set are less than those in the second data set used in the t-test.

Read sequence similarity analysis

For each sample from PRJEB10957 and PRJEB22952, 10,000 reads from the fastq file were converted to fasta files using the fastq_to_fasta tool. They were then BLASTed against the hg38 and pt6 genomes using blastn. Z-scores were calculated to tell how statistically significantly similar two normal distributions were, based on the following equation: \(Z=\frac{\left({\stackrel{-}{X}}_{1}-{\stackrel{-}{X}}_{2}\right)}{\sqrt{{\sigma }_{{X}_{1}}^{2}+{\sigma }_{{X}_{2}}^{2}}}\).

Results And Discussion

Mitochondrial DNA sequence similarity of 36 primates

The result of the alignment of the 36 primate mtDNA sequences (plus Monodelphis domestica as an outlier) can be seen in the heatmap in Fig. 1. The Hopkins clustering statistic is 0.904, which indicates very good clustering. The species visibly form compact clusters. Based on k-means clustering, there are seven clusters corresponding to the seven primate genera selected for the study. H. heidelbergensis clearly clusters with modern H. sapiens and also Neanderthal and Denisovan, forming a distinct monophyletic group.

The mean sequence identity value between all four species in this group is 97.33 ± 0.77% (p-value = 2.5∙10− 30). In contrast, the mean sequence similarity between the four Homo individuals with Pan paniscus and P. troglodytes is only 90.7 ± 0.58%.

In the MDS plot in Supplementary Fig. 1, the seven monophyletic groups found in this study (plus M. domestica as an outlier) are coded with different colored dots. To the lower left, we have Homo, Gorilla, Pan, Hylobates, and Pongo. To the lower right, we have Macaca and Papio. The outlier, M. domestica is at the center top of the plot. The MDS coordinates for all 37 species are provided in Supplementary File 1.

Homo versus Pan

In a second study comparing the sequence similarity of H. heidelbergensis with human, 21 mtDNA sequences from modern humans were downloaded from NCBI and were analyzed with Neanderthal, Denisovan, and H. heidelbergenesis, and two mtDNA sequences from Pan paniscus and one from P. troglodytes.

In Fig. 2, two histograms showing the frequencies of mtDNA sequence similarities (MSS) are shown. In Fig. 2A, we see the frequency of MSS of 21 modern humans among modern humans, Neanderthal, Denisovan, H. heidelbergensis, and Pan. H. heidelbergensis is arguably the least similar archaic human compared to modern humans. It is the farthest away from modern humans, followed by Denisovan and then Neanderthal. This is to be expected, since H. heidelbergensis is the oldest individual from the Homo group, being a super-archaic human (Brunner and Lombard, 2020).

However, in Fig. 2B, we see a different picture. The MSS between P. troglodytes and P. paniscus is a mere 97.1%, with a variation of 2.5%. This appears to be the upper limit of mtDNA variation between members of the genus Pan. Despite their reproductive and morphological differences, chimpanzees and bonobos can hybridize in captivity (Vervaecke and Van Elsacker, 1992), and there is evidence of introgression of a small portion of nuclear genetic material between the two species based on a study of 75 wild-born chimpanzees and bonobos (de Manuel et al., 2016). The MSS variation of Homo seen in Fig. 2A is only around 3.5%, less than that within Pan. Modern humans, Neandertals, and Denisovans are thought to have interbred with one another in the past (Pennisi, 2013; Coll Macià et al., 2021; Gopalan et al., 2021; Larena et al., 2021). Furthermore, as we can see from Table 1, the MSS between H. heidelbergensis and Pan is the least among Homo, at 89.8%. 

 

Group

Mean ± st.dev.

No. comparisons

Table 1

Mean sequence similarity value between modern humans, Neanderthal, Denisovan, H. heidelbergensis, and Pan.

All modern humans

0.997 ± 0.0017

210

Modern humans and Neanderthal

0.986 ± 7E-4

21

Modern humans and Denisovan

0.975 ± 5E-4

21

Modern humans and H. heidelbergensis

0.967 ± 6e-04

21

Modern humans and Pan

0.91 ± 9e-04

63

All Pan

0.971 ± 0.0248

3

Pan versus Neanderthal

0.912 ± 6e-04

3

Pan versus Denisovan

0.909 ± 6e-04

3

Pan versus H. heidelbergensis

0.898 ± 0.0012

3

To illustrate relationships within the human clade, a phylogenetic tree was created using the MEGA-X software shown in Fig. 3. The 25 modern human individuals are closely grouped together, whereas Neanderthal, Denisovan, and H. heidelbergensis are more basal in the tree, as they are more ancient humans, separated from all other sequences by more mutations.

Variant analysis of H. heidelbergensis and modern human

For each of the nine SRA samples from BioProject PRJEB10957 several variant statistics were called for H. heidelbergensis, as reported in Table 2. These same statistics are also provided for BioProject PRJEB22592 in Table 3, and are also provided in Supplementary Data File 3. 

 
 

% aln hg38

% aln ntal

% aln pt6

No. hg38 vars

hg38 var. density

No. ntal vars

ntal var. density

No. pt6 vars

pt6 var. density

pt6/hg38 dens. prop.

pt6/ntal dens. prop.

hg38/ntal dens. prop.

Table 2

Variant calling statistics for PRJEB10957: reads from H. heidelbergensis to hg38, Neanderthal and pt6

Mean

0.966

0.959

0.742

16356.4

54.6

16904.556

51.1

35767.8

28.1

2.440

2.300

0.946

Std. dev.

0.034

0.035

0.161

32127.7

66.5

32951.654

62.2

59506

29

1.267

1.154

0.021


 

% aln hg38

% aln ntal

% aln pt6

No. hg38 vars

hg38 var. density

No. ntal vars

ntal var. density

No. pt6 vars

pt6 var. density

pt6/hg38 dens. prop.

pt6/ntal dens. prop.

hg38/ntal dens. prop.

Table 3

Variant calling statistics for PRJEB22592: reads from an archaic human to hg38, Neanderthal and pt6 (with USER treatment)

Mean

0.999

0.996

0.945

2521616.000

70.737

2975644.800

59.190

20786182.8

8.211

74.870

1.180

0.847418

Std. dev.

0.000

0.001

0.006

1066303.035

58.051

1322310.477

46.696

11109092.69

5.688

2.081

1.240

0.806394

Table 4. P-values coming from Student t-test comparing variant densities from SRA samples from H. heidelbergensis, and archaic human to hg38, ntal and pt6. The term “less” means that the variant density values from column 1 are less than those from column 2.

variant density 1

variant density 2

p-value

p-value (“less”)

H. heidelbergensis vs. hg38

archaic human vs. hg38

0.647

0.677

H. heidelbergensis vs. ntal

archaic human vs. ntal

0.790

0.395

H. heidelbergensis vs. pt6

archaic human vs. pt6

0.078

0.961

Proportion of aligned reads to WGS

The proportion of aligned reads from a DNA sample to the WGS of another species should be high if the other genome comes from a species from the same monophyletic group and lower if from another monophyletic group. The proportion of reads from H. heidelbergensis mapping to hg38 and the proportion of reads from the archaic human to hg38 sample should not differ significantly. For PRJEB10957, the mean proportion of H. heidelbergensis reads mapping to hg38 (96.6 ± 3.4%) is much higher than that mapping to pt6 (74.2 ± 16.1%). This proportion is also very similar to the mean proportion of reads mapping to the Neanderthal genome (95.9 ± 3.5%).

When comparing the proportion of reads mapping from H. heidelbergensis to hg38 to the proportion of the same reads mapping to pt6, there is a statistically significant difference, with a p-value of 0.003. When comparing the proportion of reads mapping from H. heidelbergensis to the Neanderthal WGS to the proportion of the same reads mapping to pt6, the p-value is 0.004. However, when the proportion of H. heidelbergensis reads mapping to hg38 is compared to the proportion of these reads mapping to the Neanderthal WGS, the p-value is insignificant at 0.675. This means that the H. heidelbergensis reads map in the same manner to both modern (hg38) and the archaic genome (Neanderthal); there is no statistically significant difference between the two modes of mapping.

Variant density

Another important measurement is the number of variants in the nine H. heidelbergensis SRA samples compared to the human genome. The more variants and the kinds of variants between two species’ genomes could help us tell whether they are from the same or different monophyletic groups. If the two species are from the same group, their genome would harbor fewer differences than if they were from different monophyletic groups.

The variant number can be converted to a variant density by dividing the number of reads in a given sample by the number of variants found in the sample when mapping the reads to a given genome. This number corresponds to the average length of DNA between two neighboring variants. The variant density has an inverted relationship to the number of variants. A lower density value corresponds to a higher number of variants, and vice-versa. This was done because the H. heidelbergensis reads do not cover the whole genome. Since we are also looking at the number of reads instead of the total length of the DNA from a given SRA sample, this is only a rough estimate of the variant density.

As can be seen from Table 2, the mean number of variants is about the same when the H. heidelbergensis reads are mapped to modern human (16,356.4 ± 32,127.7) and to Neanderthal (16,904.6 ± 32,951.77). However, the mean number of variants found when mapping the H. heidelbergensis reads to the chimpanzee genome is much greater (35,767.8 ± 59,506.8). The variant density when mapping the H. heidelbergensis reads to the genome of modern humans and Neanderthals is 54.6 ± 66.5 and 51.1 ± 62.2 as opposed to the same reads mapping to the chimp genome, 28.1 ± 29.0.

If H. heidelbergensis belongs to a clade defined by the continuity of archaic and modern humans, then the variant densities calculated from its nine SRA samples (when mapped to a particular genome) should correlate highly with the variant density of the five SRA samples (ERR2117984–ERR2117988) from PRJEB22592, which contains DNA from an archaic “Paleolithic” human, when both are mapped to the same genome.

On the other hand, if H. heidelbergensis comes from a non-human group, then its pattern of DNA variant differences should differ from that of a human-to-human comparison. In order to measure this, three Student’s t-tests were performed, to see whether there is a statistically significant difference when a.) mapping H. heidelbergensis reads to hg38 compared to mapping ancient human reads to hg38, b.) mapping H. heidelbergensis reads to the Neanderthal genome compared to mapping ancient human reads to the Neanderthal genome, and c.) mapping H. heidelbergensis reads to pt6 compared to mapping ancient human reads to pt6. These comparisons are summed up in Table 4.

When reads from H. heidelbergensis archaic human were both mapped to hg38, the difference in variant density between these two comparisons has a p-value of 0.677. When the target genome is the Neanderthal genome, the p-value is 0.395. When the target genome is the pt6 genome, the p-value is 0.961.

This means that there is no statistically significant difference between the two variant densities; the mapping behavior of the H. heidelbergensis reads behaves very much like the ones from archaic humans. This implies that H. heidelbergensis is just as human as the archaic human from PRJEB22592. All three of these comparisons imply that the H. heidelbergensis reads behave much like those of ancient human, meaning that H. heidelbergensis is human.

Lastly, the different variant densities can be compared to one another by dividing one by the other. This way we can calculate a relative variant density measure, to see how much denser a given genome is in the number of variants compared to another genome. These proportions were calculated for pt6/hg38, pt6/ntal, and hg38/ntal over all nine H. heidelbergensis SRA samples (see Fig. 4). Overall, variants called in the pt6 genome are 2.4 and 2.3 times denser than those variants in the hg38 and Neanderthal genomes. Samples ERR995362 and ERR996364 may have a variant density proportion less than 1 for pt6/hg38 and pt6/ntal, because these samples have a very small number of reads in them. This may be due merely to biological variation.

Skewed variant frequency

The results presented here must be taken with caution since sequencing aDNA does have its caveats. These include contamination with DNA from microbes or modern humans, and degradation of the aDNA (Thomas and Tomkins, 2014).

Proportionately slightly fewer H. heidelbergensis reads map to the hg38 genome, compared to reads from ancient (“Paleolithic”) humans. Also, there are relatively more variants when mapping H. heidelbergensis reads to hg38 as opposed to ancient human reads. There are two reasons for this. The first is that over time, DNA variants could have accumulated in the modern human genome compared to the ancient genome of H. heidelbergensis. The second could be due to deamination from C to T (mirrored on the reverse strand as G to A). The research group that isolated the DNA from the H. heidelbergensis samples claimed that the frequency of C > T deaminations rose from 12–17% at the 5’ end of the reads to 55–62% at the 3’ end of the reads! They found that this rate of deamination is also similar to the same rate as found in bear remains from the same site at Sima de los Huesos (Meyer et al., 2014).

The frequency of each of the twelve possible SNPs was counted for each of the nine H. heidelbergensis SRA samples when mapped to both the hg38 and the Neanderthal genomes. This information can be found in Supplementary File 3. The proportion of each of the twelve SNPs between the H. heidelbergensis reads and both hg38 and the Neanderthal genome for SRA samples ERR995367–ERR995361 were plotted and can be found in Supplementary Figs. 2a and 2b online. These five SRA samples were chosen, because the other four samples had a very low number of reads, and some of the twelve SNPs did not occur in those samples. It is clear from the results that the C > T/G > A SNPs are the most common SNPs, making up 33.2–47% for hg38 and 33.1–46.7% for Neanderthal. These proportions cannot be by mere chance.

Guo and Jamison (2005) calculated that about 33.2% of all SNPs are C > T/G > A. The present study also calculated the frequency of C > T/G > A substitutions from the dbSNP to be 32%. The average C > T/G > A substitution rate over the five H. heidelbergensis SRA samples is 40.1 ± 2.9%. Thus, the proportion of C > T/G > A substitutions between the H. heidelbergensis and modern human genomes is slightly elevated. This corresponds to a Z-score of 2.69, which denotes that the two distributions are significantly different. It means that around 6.9% (40.1% – 33.2%) of the C > T/G > A transitions are due to deamination.

The mean number and standard deviation of SNPs excluding C > T and G > A were calculated for these five samples (see “PRJEB10957 SNP hg38 dist.” tab, rows “z (C > T)” and “z (G > A)” in Supplementary File 3). A z-score was calculated for C > T and G > T in these five samples to see how extreme they are. For all five samples, the z-score was greater than 1.65, meaning that these results are significant at the 5% level. This means that the higher number of C > T and G > A variants are not occurring by random chance. Since the number of C > T and G > A variants are skewed, this means that the genetic distance between H. heidelbergensis and modern humans decreases, and strengthens the conclusion that H. heidelbergensis is human.

Percent dissimilarity of reads mapped to hg38 and pt6

As the last step, 10,000 read sequences from the fastq file of the samples from PRJEB10957 and PRJEB22952 were mapped to both the hg38 and the pt6 genomes using BLASTN. The mean percent mismatches were noted as well as their standard deviations. The normal curves with these mean and standard deviation values are plotted in Fig. 5, and the mean and standard deviations are shown in Table 5.

Table 5

Mean % ± std. dev dissimilarity of 10,000 sequencing reads from SRA samples PRJEB10957 and PRJEB22952 mapped to hg38 and pt6,

 

hg38

pt6

Homo heidelbergensis

0.0049 ± 0.0022

0.0283 ± 0.016

‘Paleolithic’ archaic human

0.0045 ± 0.003

0.0283 ± 0.014

Z-score

0.108

1.9E-4

The z-score, described in the Materials and Methods section describes how similar two normal distributions are. Both values listed in the last row of Table 5 show that the z-scores comparing H. heidelbergensis to hg38 and pt6 and also the ‘paleolithic’ archaic human to hg38 and pt6 are well below 2.0, meaning that these distributions are almost identical to one another. This means that the genetic distance according to read sequence mismatches between modern human and ‘paleolithic’ archaic human and between modern human H. heidlbergensis are virtually the same, also supporting the idea that H. heidelbergensis is human.

Summary

From these analyses, we have seen that H. heidelbergensis behaves genetically similar to modern and archaic humans. Its mtDNA sequence is only slightly different, being an archaic human. H. heidelbergensis clusters together with modern humans, Neanderthal and Denisovan in the baraminogram (heatmap), as well as the phylogenetic tree, albeit at the base.

When nuclear data is examined in the form of SRA reads, about the same proportion of H. heidelbergensis reads map to the genomes of modern humans. The density of variants produced by mapping its reads to the genomes of modern and archaic humans and chimpanzee is also similar to that of archaic (“Paleolithic”) human. A similar inference can be made when examining the average sequence mismatch between read sequences from H. heidelbergensis and archaic human when mapped to hg38 and pt6.

All of these considerations confidently demonstrate that H. heidelbergensis is human, only another archaic human individual. It is therefore suggested that Homo heidelbergensis be collapsed into Homo sapiens, and could be renamed Homo sapiens heidelbergensis. This study also highlights the utility aDNA from fossil humans to determine their phylogenetic placement. Here the mtDNA from only four human subgroups were compared, but if genomes from more fossils could be isolated, such as Homo erectus, Homo naledi, Homo floresiensis, and others, this would sharpen the picture of human phylogenetic relationships even further.

Declarations

Acknowledgments

No specific acknowledgments.

 Data availability

All Supplementary files are available on Zenodo at https://zenodo.org/record/6330907#.YpG8bFTMLrc. The data sets used in this study are available from NCBI. The vcf file containing 1,112,554,591 SNPs from the dbSNP database were downloaded from https://ftp.ncbi.nih.gov/snp/latest_release/VCF/GCF_000001405.39.gz. The SRA data sets can be retrieved from the NCBI Sequencing Read Archives (SRA) at https://www.ncbi.nlm.nih.gov/sra. The mtDNA sequences used in this study can be retrieved from the NCBI Organelle Database at https://www.ncbi.nlm.nih.gov/genome/browse#!/organelles.

 Authors’ contributions

M.C. designed the whole study, wrote and ran all of the scripts, downloaded all of the data, analyzed all of the results and wrote the paper.

 Conflict of interest

The author declares no conflict of interest.

References

  1. Bruner, E., & Lombard, M. (2020). The skull from Florisbad: a paleoneurological report. Journal of Anthropological Sciences, 98, 10.4436/JASS.98014. doi.org/10.4436/JASS.98014
  2. Buck, L. T., & Stringer, C. B. (2014). Homo heidelbergensis. Current Biology 24(6), R214–R215. doi.org/10.1016/j.cub.2013.12.048
  3. Coll Macià, M., Skov, L., Peter, B. M., & Schierup, M. H. (2021). Different historical generation intervals in human populations inferred from Neanderthal fragment lengths and mutation signatures. Nature Communications, 12(1), 5317. doi.org/10.1038/s41467-021-25524-4
  4. Cserhati, M. F., Mooter, M. E., Peterson, L., Wicks, B., Xiao, P., Pauley, M., & Guda, C. (2018). Motifome comparison between modern human, Neanderthal and Denisovan. BMC Genomics 19(1), 472. doi.org/10.1186/s12864-018-4710-1
  5. de Manuel, M., Kuhlwilm, M., Frandsen, P., Sousa, V. C., Desai, T., Prado-Martinez, J., Hernandez-Rodriguez, J., et al. (2016). Chimpanzee genomic diversity reveals ancient admixture with bonobos. Science 354(6311), 477–481. doi.org/10.1126/science.aag2602
  6. Fleagle, J. G. Primate Adaptation and Evolution. 2nd ed. New York: Academic Press, 1999.
  7. Gopalan, S., Atkinson, E. G., Buck, L. T., Weaver, T. D., & Henn, B. M. (2021). Inferring archaic introgression from hominin genetic data. Evolutionary Anthropology 30(3), 199–220. doi.org/10.1002/evan.21895
  8. Guo, Y., & Jamison, D. C. (2005). The distribution of SNPs in human gene regulatory regions. BMC Genomics 6, 140. doi.org/10.1186/1471-2164-6-140
  9. Kumar S., Stecher G., Li M., Knyaz C., & Tamura K. (2018). MEGA X: Molecular Evolutionary Genetics Analysis across computing platforms. Molecular Biology and Evolution 35, 1547-1549. doi.org/10.1093/molbev/msy096
  10. Larena, M., McKenna, J., Sanchez-Quinto, F., Bernhardsson, C., Ebeo, C., Reyes, R., Casel, O., et al. (2021). Philippine Ayta possess the highest level of Denisovan ancestry in the world. Current Biology 31(19), 4219–4230. doi.org/10.1016/j.cub.2021.07.022
  11. Meyer, M., Fu, Q., Aximu-Petri, A., Glocke, I., Nickel, B., Arsuaga, J. L., Martínez, I., Gracia, A., de Castro, J. M., Carbonell, E., & Pääbo, S. (2014). A mitochondrial genome sequence of a hominin from Sima de los Huesos. Nature 505(7483), 403–406. doi.org/10.1038/nature12788
  12. Park, M. A. Biological Anthropology. 6th ed. New York: McGraw-Hill, 2010.
  13. Pennisi, E. (2013). Human evolution. More genomes from Denisovan cave show mixing of early human groups. Science 340, 799. doi.org/10.1126/science.340.6134.799
  14. Poullet, M., & Orlando, L. (2020). Assessing DNA Sequence Alignment Methods for Characterizing Ancient Genomes and Methylomes. Frontiers in Ecology and Evolution 8, 105.
  15. Reich, D. Who We Are and How We Got Here, Ancient DNA and the New Science of the Human Past. New York: Pantheon Books, 2018.
  16. Saitou N. & Nei M. (1987). The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4, 406-425. doi.org/10.1093/oxfordjournals.molbev.a040454
  17. Schubert, M., Ermini, L., Der Sarkissian, C., Jónsson, H., Ginolhac, A., Schaefer, R., Martin, M. D., et al. (2014). Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX. Nature Protocols 9(5), 1056–1082. doi.org/10.1038/nprot.2014.063
  18. Sikora, M., Seguin-Orlando, A., Sousa, V. C., Albrechtsen, A., Korneliussen, T., Ko, A., et al. (2017). Ancient genomes show social and reproductive behavior of early upper paleolithic foragers. Science 358, 659–662. doi.org/10.1126/science.aao1807
  19. Stringer, C. (2012). The status of Homo heidelbergensis (Schoetensack 1908). Evolutionary Anthropology 21(3), 101–107. doi.org/10.1002/evan.21311
  20. Tamura K., Nei M., & Kumar S. (2004). Prospects for inferring very large phylogenies by using the neighbor-joining method. Proceedings of the National Academy of Sciences (USA) 101, 11030-11035. doi.org/10.1073/pnas.0404206101
  21. Thompson, J. D., Gibson, T. J., & Higgins, D. G. Multiple sequence alignment using ClustalW and ClustalX. Current Protocols in Bioinformatics, Chapter 2, 2002.
  22. Vervaecke, H, & Van Elsacker, L. (1992). Hybrids Between Common Chimpanzees (Pan Troglodytes) And Pygmy Chimpanzees (Pan Paniscus) In Captivity. Mammalia 56, 667–669.
  23. Wood, B. & Collard, M. (1999). The human genus. Science284(5411), 65–71. doi.org/10.1126/science.284.5411.65