Differential Architecture Search in Deep Learning for DNA Splice Site Classification

doi:10.21203/rs.3.rs-39558/v1


Research


This work is licensed under a CC BY 4.0 License

Journal Publication: published 15 Feb, 2021 in BioData Mining


Background: The data explosion caused by unprecedented advances in genomics constantly challenges the conventional methods used to interpret the human genome. In recent years, the demand for robust algorithms has brought Deep Learning (DL) major success in solving many difficult tasks in image, speech and natural language processing, including by automating the manual process of architecture design. This success has been fueled by the development of new DL architectures. Yet genomics poses unique challenges, because we expect DL to provide superhuman intelligence that readily interprets a human genome.

Methods: We adapted a differential architecture search method for the interpretation of biological sequences and applied it to the splice site recognition task on DNA sequences to discover new high-performance convolutional architectures in an automated manner. The discovered architecture was benchmarked on CPU and multiple GPU architectures in terms of computational time and classification performance.
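To make the search idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the method follows the usual continuous relaxation used in gradient-based architecture search: each connection computes a softmax-weighted mixture of candidate operations, and the strongest candidate is kept once the search ends. The one-hot encoding, the parameter-free candidate set (`identity`, `avg_pool3`, `max_pool3`, `zero`) and all function names are illustrative assumptions; a real search space would also contain convolutions with learned weights.

```python
import math

BASES = "ACGT"  # assumed channel order for the one-hot encoding


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats (A, C, G, T).

    Bases outside ACGT (for example N) become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations. Each maps a list of per-position
# feature rows to a list of the same shape.
def identity(x):
    return [row[:] for row in x]


def avg_pool3(x):
    """Average over a window of 3 positions (stride 1, truncated at the edges)."""
    n = len(x)
    out = []
    for i in range(n):
        window = x[max(0, i - 1):min(n, i + 2)]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


def max_pool3(x):
    """Maximum over a window of 3 positions (stride 1, truncated at the edges)."""
    n = len(x)
    out = []
    for i in range(n):
        window = x[max(0, i - 1):min(n, i + 2)]
        out.append([max(col) for col in zip(*window)])
    return out


def zero(x):
    return [[0.0] * len(row) for row in x]


CANDIDATES = [identity, avg_pool3, max_pool3, zero]


def mixed_op(x, alphas):
    """Softmax-weighted sum of all candidate outputs (the relaxed connection)."""
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATES]
    return [
        [sum(w * out[i][j] for w, out in zip(weights, outputs))
         for j in range(len(x[0]))]
        for i in range(len(x))
    ]


def derive_op(alphas):
    """After the search, keep the candidate with the largest architecture weight."""
    best = max(range(len(alphas)), key=lambda k: alphas[k])
    return CANDIDATES[best]
```

In a full search the architecture weights `alphas` would be updated by gradient descent on validation loss while the operation weights are trained on training data; this sketch only shows the forward mixture and the final discretisation step.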

Results: Our experimental evaluation demonstrated that the discovered architecture outperformed fixed baseline architectures for classification of splice sites. Benchmarking the execution time and precision of the architecture search and evaluation processes showed that both performed better on recently available GPU models.
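A benchmark of this kind can be sketched as a small harness that records wall-clock time together with a classification score. The code below is an illustrative assumption, not the authors' benchmarking setup: `benchmark` and `toy_classifier` are hypothetical names, the toy rule (a `GT` dinucleotide at the centre of the window, as at canonical donor sites) merely stands in for a trained model, and accuracy is used as the score for simplicity.

```python
import time


def benchmark(predict, inputs, labels, repeats=3):
    """Run `predict` over `inputs` several times.

    Returns the best wall-clock time in seconds and the accuracy
    of the predictions against `labels`.
    """
    best = float("inf")
    preds = []
    for _ in range(repeats):
        start = time.perf_counter()
        preds = [predict(x) for x in inputs]
        best = min(best, time.perf_counter() - start)
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return {"seconds": best, "accuracy": correct / len(labels)}


def toy_classifier(seq):
    """Hypothetical stand-in model: 1 if 'GT' sits at the window centre, else 0."""
    mid = len(seq) // 2
    return 1 if seq[mid:mid + 2] == "GT" else 0


result = benchmark(toy_classifier, ["AAAGTA", "AACCAA", "CCCGTC"], [1, 0, 0])
```

When timing on a GPU, frameworks that execute asynchronously typically need an explicit synchronisation before the clock is read, otherwise the measured time can understate the real cost.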

Conclusions: We applied a differential architecture search mechanism to perform splice site classification on raw DNA sequences, and discovered new models that perform better than major baseline models. The results show the potential of this automated architecture search mechanism for solving various problems in the genomics domain.

Bioinformatics

Deep Learning

Splice Site

Genomics

Neural Architecture Search

Convolutional Neural Networks