Background: The integration of multi-omics data can greatly advance research in the Life Sciences by providing new insights into how biological systems interact. However, there is currently no widely adopted procedure for robust, efficient and meaningful multi-omics data integration. The approach presented here is a first step towards more reliable data discovery than is possible when each biodata set is processed individually.
Results: Here, we propose InterTADs, a high-speed framework for integrating multi-omics data from the same physical source (e.g. patient) that takes into account the chromatin configuration of the genome, i.e. the topologically associating domains (TADs). The main concept of the methodology is to create a single matrix that combines all the different events (e.g. DNA methylation, expression, mutation) with their genome coordinates and their respective quantitative metrics, after appropriate scaling. The events are then assigned to TADs according to their chromosomal location, and each TAD is evaluated for statistically significant differences between the groups of interest (e.g. normal cells vs cancer cells). Finally, several visualization approaches are available, including the mapping of the events onto the chromosomal location of the TAD and the distribution of the counts within a given TAD across the different study groups.
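The workflow described above can be illustrated with a minimal Python sketch. It is not the InterTADs implementation (which is written in R). The toy TADs, events and sample groups, the function names, the min-max scaling to a 0-100 range, and the half-open TAD intervals are all assumptions made for illustration. The per-TAD difference in group means is only a placeholder for the statistical test applied by the tool.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean

# Hypothetical TADs as (chromosome, start, end); half-open and non-overlapping.
TADS = [("chr1", 0, 1000), ("chr1", 1000, 2500), ("chr2", 0, 800)]

# Hypothetical event matrix: one row per event, one value per sample.
EVENTS = [
    {"type": "methylation", "chrom": "chr1", "pos": 150, "values": [0.1, 0.2, 0.8, 0.9]},
    {"type": "expression", "chrom": "chr1", "pos": 400, "values": [5.0, 7.0, 20.0, 25.0]},
    {"type": "expression", "chrom": "chr1", "pos": 1800, "values": [10.0, 12.0, 11.0, 9.0]},
    {"type": "methylation", "chrom": "chr2", "pos": 300, "values": [0.5, 0.4, 0.5, 0.6]},
]
# Group label of each sample column (assumed design: two normal, two cancer).
GROUPS = ["normal", "normal", "cancer", "cancer"]


def scale_by_type(events, upper=100.0):
    """Min-max scale values to [0, upper], separately for each event type."""
    bounds = {}
    for ev in events:
        lo, hi = bounds.get(ev["type"], (float("inf"), float("-inf")))
        bounds[ev["type"]] = (min(lo, *ev["values"]), max(hi, *ev["values"]))
    scaled = []
    for ev in events:
        lo, hi = bounds[ev["type"]]
        span = (hi - lo) or 1.0  # avoid division by zero for constant data
        scaled.append({**ev, "values": [upper * (v - lo) / span for v in ev["values"]]})
    return scaled


def assign_to_tads(events, tads):
    """Map each TAD to the events whose position falls inside it."""
    by_chrom = defaultdict(list)
    for chrom, start, end in sorted(tads):
        by_chrom[chrom].append((start, end))
    assigned = defaultdict(list)
    for ev in events:
        intervals = by_chrom.get(ev["chrom"], [])
        starts = [s for s, _ in intervals]
        i = bisect_right(starts, ev["pos"]) - 1  # last TAD starting at or before pos
        if i >= 0 and ev["pos"] < intervals[i][1]:
            start, end = intervals[i]
            assigned[(ev["chrom"], start, end)].append(ev)
    return dict(assigned)


def group_mean_difference(assigned, groups, a, b):
    """Per TAD, mean scaled value in group b minus mean scaled value in group a."""
    diffs = {}
    for tad, evs in assigned.items():
        in_a = [v for ev in evs for v, g in zip(ev["values"], groups) if g == a]
        in_b = [v for ev in evs for v, g in zip(ev["values"], groups) if g == b]
        diffs[tad] = mean(in_b) - mean(in_a)
    return diffs


scaled = scale_by_type(EVENTS)
assigned = assign_to_tads(scaled, TADS)
differences = group_mean_difference(assigned, GROUPS, "normal", "cancer")
for tad, diff in sorted(differences.items()):
    print(tad, round(diff, 2))
```

Scaling each event type separately puts methylation and expression values on a common range before they are pooled within a TAD. Events on a chromosome without TAD annotation, or outside every TAD, are simply dropped in this sketch.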
Conclusions: InterTADs provides a general framework for integrating multi-omics data and relating them to TADs, which could yield new biological insights into the case study under examination. InterTADs is an open-source tool implemented in R and licensed under the MIT License. The source code is freely available from https://github.com/nikopech/InterTADs.