Repetitive structures in the genome often make it difficult to characterize variation accurately across sequencing technologies and variant detection methods. To address this, the Genome in a Bottle consortium maintains “stratification BED files” for error analysis in “difficult” regions such as homopolymers and segmental duplications. However, this strategy represents genomic context in discrete bins, which sacrifices precision when quantifying difficulty; a data-driven model could improve on it. To this end, we developed StratoMod, which uses an interpretable machine learning classifier to predict variant calling errors using features derived from genomic data. StratoMod identified distinct associations with errors for A/T vs. G/C homopolymer lengths, and quantified sources of error for a new sequencing technology. We also demonstrated that the model could predict clinically relevant variants that may be missed by certain methods, using DeepVariant calls from Illumina as an example. From this analysis, we also produced a resource of difficult-to-map genes with challenging variants and large challenging INDELs. In each use case, the interpretability of StratoMod enables one to understand how each feature contributed to a prediction. We anticipate this will be useful for method developers and clinicians who desire a quantitative understanding of sources of variant-calling errors.