Background: Current peak callers for identifying RNA-binding protein (RBP) binding sites from CLIP-seq data take into account genomic read profiles, but they ignore the underlying transcript information, that is, information regarding splicing events. So far, no studies have examined this issue more closely.
Results: Here we show that current peak callers are susceptible to false peak calling near exon borders. We further quantify the extent of this problem in publicly available datasets and find it to be substantial. Finally, by providing a tool called CLIPcontext for automatic transcript and genomic context sequence extraction, we demonstrate that context choice also affects the performance of RBP binding site prediction tools.
Conclusions: Our results demonstrate the importance of incorporating transcript information in CLIP-seq data analysis. Taking advantage of the underlying transcript information should therefore become an integral part of future peak calling and downstream analysis tools.
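To make the distinction between genomic and transcript context concrete, the following is a minimal sketch of how the two can diverge for a binding site close to an exon border. It is not the CLIPcontext implementation: the function names, the toy sequence, and the simplifications (plus strand only, 0-based half-open coordinates, site fully contained in one exon) are assumptions made for illustration.

```python
def genomic_context(genome, start, end, ext):
    """Extend a site by `ext` nt on both sides along the genome."""
    s = max(0, start - ext)
    e = min(len(genome), end + ext)
    return genome[s:e]


def transcript_context(genome, exons, start, end, ext):
    """Extend a site by `ext` nt on both sides along the spliced transcript.

    `exons` is a sorted list of (start, end) genomic intervals on the plus
    strand. The site is assumed to lie entirely within one exon.
    """
    # Spliced transcript sequence: exons joined, introns removed.
    transcript = "".join(genome[s:e] for s, e in exons)

    def to_transcript(pos):
        # Map a genomic position inside an exon to a transcript position.
        offset = 0
        for s, e in exons:
            if s <= pos < e:
                return offset + (pos - s)
            offset += e - s
        raise ValueError(f"position {pos} is not exonic")

    t_start = to_transcript(start)
    t_end = to_transcript(end - 1) + 1
    s = max(0, t_start - ext)
    e = min(len(transcript), t_end + ext)
    return transcript[s:e]


# Toy locus: exon 1 (uppercase A), intron (lowercase g), exon 2 (uppercase C).
genome = "A" * 10 + "g" * 10 + "C" * 10
exons = [(0, 10), (20, 30)]

# Hypothetical binding site covering the last 4 nt of exon 1.
site_start, site_end = 6, 10

print(genomic_context(genome, site_start, site_end, 3))            # AAAAAAAggg
print(transcript_context(genome, exons, site_start, site_end, 3))  # AAAAAAACCC
```

In this toy case the genomic context runs into the intron, while the transcript context continues into the downstream exon, which is the sequence a protein bound to the spliced RNA would be adjacent to.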