Background: Single nucleotide polymorphism (SNP) markers have great potential for identifying individuals and for inferring family relations, biogeographical ancestry, and phenotypic traits. Many forensic cases involve a DNA mixture of a victim and an unknown suspect. Extracting the suspect’s SNP profile from such a mixture can assist an investigation or support intelligence gathering. Computational tools exist to determine whether a known individual is included in or excluded from a mixture, but no algorithm is available for extracting an unknown SNP profile without a list of suspects.
Results: We present AH-HA, a novel computational approach for extracting an unknown SNP profile from whole genome sequencing (WGS) data of a two-person mixture. AH-HA uses techniques similar to those used in haplotype phasing: it constructs the inferred genotype as an imperfect mosaic of haplotypes from a reference panel of the target population. It outperforms simpler approaches and maintains high performance across a wide range of sequencing depths (from 500x down to 5x).
Conclusions: AH-HA can be applied to victim-suspect mixtures to improve the capabilities of investigators. The approach can be extended to more complex mixtures with more donors and less prior information, further motivating the development of SNP-based forensic technologies.
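
Illustrative sketch: to make the two-person mixture problem concrete, the Python snippet below shows what a naive per-site caller might look like. It is not AH-HA and is not claimed to be any of the simpler approaches evaluated here. It assumes a biallelic SNP, a known victim genotype, and a known mixing fraction (all assumptions for illustration), and the function name and parameters are hypothetical. It picks the suspect genotype whose expected alternate-allele read fraction is closest to the observed one.

```python
def naive_suspect_genotype(alt_reads, total_reads, victim_genotype,
                           suspect_fraction=0.5):
    """Guess the suspect's alt-allele count (0, 1, or 2) at one biallelic SNP.

    Hypothetical helper for illustration only. Assumes the victim's genotype
    (alt-allele count) and the suspect's share of the mixture are known.
    """
    if total_reads == 0:
        return None  # no coverage at this site, so no call is made

    observed = alt_reads / total_reads
    best_genotype, best_error = None, float("inf")
    for candidate in (0, 1, 2):
        # Expected alt-allele fraction in the mixture: each donor contributes
        # genotype / 2, weighted by that donor's share of the DNA.
        expected = ((1 - suspect_fraction) * victim_genotype
                    + suspect_fraction * candidate) / 2
        error = abs(observed - expected)
        if error < best_error:
            best_genotype, best_error = candidate, error
    return best_genotype


# Example: victim is homozygous reference (0), 25 of 100 reads carry the
# alternate allele, equal mixing: the closest fit is a heterozygous suspect.
print(naive_suspect_genotype(25, 100, victim_genotype=0))
```

A caller of this kind treats every site independently, so at low sequencing depth the observed read fraction becomes noisy and the calls are likely to degrade. Using haplotypes from a reference panel, as described in the Results, is one way to share information across neighbouring sites.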