Background: The data explosion caused by unprecedented advances in genomics constantly challenges the conventional methods used to interpret the human genome. In recent years, the demand for robust algorithms has driven the great success of Deep Learning (DL) on many difficult tasks in image, speech and natural language processing, where it automates the manual process of architecture design. This success has been fueled by the development of new DL architectures. Yet genomics poses unique challenges, because we expect DL to provide superhuman intelligence that can readily interpret a human genome.
Methods: We propose a new model, DASSI, obtained by adapting a differential architecture search method and applying it to the Splice Site (SS) recognition task on DNA sequences, in order to discover new high-performance convolutional architectures automatically. We evaluated the discovered model against state-of-the-art tools for classifying true and false SS in Homo sapiens (Human), Arabidopsis thaliana (Plant), Caenorhabditis elegans (Worm) and Drosophila melanogaster (Fly).
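As an illustration of the idea behind architecture search on DNA input, the sketch below one-hot encodes a sequence and blends a few candidate operations with softmax weights, which is the continuous relaxation commonly used in gradient-based architecture search. It is a minimal sketch, not the DASSI implementation: the candidate set, the fixed averaging kernels, and the names `one_hot`, `mixed_op` and `select_op` are assumptions made for illustration, and a real search would learn both the kernel weights and the architecture weights by gradient descent.

```python
import math

BASES = "ACGT"  # assumed channel order for the one-hot encoding


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d(x, kernel):
    """Zero-padded 1D convolution that preserves length (odd kernel sizes only)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


# Hypothetical candidate operations; fixed averaging kernels stand in for learned ones.
CANDIDATE_OPS = {
    "identity": lambda x: list(x),
    "conv3": lambda x: conv1d(x, [1 / 3] * 3),
    "conv5": lambda x: conv1d(x, [1 / 5] * 5),
    "zero": lambda x: [0.0] * len(x),
}


def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of every candidate operation's output."""
    weights = softmax([alphas[name] for name in CANDIDATE_OPS])
    out = [0.0] * len(x)
    for weight, op in zip(weights, CANDIDATE_OPS.values()):
        out = [acc + weight * value for acc, value in zip(out, op(x))]
    return out


def select_op(alphas):
    """Discretization step: keep the strongest candidate, ignoring the 'zero' operation."""
    return max((name for name in alphas if name != "zero"), key=alphas.get)


if __name__ == "__main__":
    encoded = one_hot("AAGGTAAGT")             # toy sequence, not real data
    g_channel = [row[2] for row in encoded]     # track marking positions of G
    alphas = {"identity": 0.1, "conv3": 1.2, "conv5": -0.4, "zero": 0.0}
    print(mixed_op(g_channel, alphas))
    print(select_op(alphas))
```

During the search phase every candidate contributes to the output, so the architecture weights can be optimized jointly with the network weights; the compact final architecture is then read off by keeping only the strongest candidate on each connection.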
Results: Our experimental evaluation demonstrated that the discovered architecture outperformed baseline models and fixed architectures, and achieved competitive results against state-of-the-art models for splice site classification. The proposed model, DASSI, has a compact architecture and performed very well on a transfer learning task. Benchmarking of execution time and precision for the architecture search and evaluation processes showed better performance on recently available GPUs, making it feasible to adopt such architecture-search-based methods for large datasets.
Conclusions: We applied a differential architecture search mechanism to SS classification on raw DNA sequences and discovered new convolutional models with a low number of tunable parameters and performance competitive with manually engineered architectures. The results show the potential of this automated architecture search mechanism for solving various problems in genomics.

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5
Posted 19 Oct, 2020