Background: Many scientific studies have sought a better understanding of specific medical conditions. For Alzheimer’s Disease, reliable diagnostics are lacking, which can be related to the fact that only small-scale ongoing biomarker studies and longitudinal cohorts including these subjects are available. To generate more substantial clinical evidence, researchers have started to perform multi-cohort analyses. Although this is currently possible by harmonising the cohorts into a common data model, the migration pipelines are usually implemented in programming languages. As a result, cohort owners may have difficulty contributing to the validation stage of these pipelines.
Results: To reduce the dependency on technical teams’ support when validating the data transformations, we propose the use of an extract, transform, load (ETL) tool with visual features. BIcenter is a collaborative web platform designed to implement ETL tasks through the browser. Pipelines are constructed using drag-and-drop features and intuitive forms to customise the ETL steps. The tool is open source and accessible at https://bioinformatics-ua.github.io/BIcenter/.
Conclusions: Our methodology produces interoperable cohorts for multicentric disease-specific studies. We validated it using Alzheimer’s Disease cohorts from several countries, which together comprise 6,669 subjects and 172 medical attributes. The harmonised cohorts now enable multi-cohort querying and analysis, supporting the execution of new studies.