Background: Bioinformatics currently offers a wide variety of classification methods, including deep neural network approaches. Deep neural networks have proven effective for classification tasks and have outperformed classical methods, but they suffer from a lack of interpretability. As a result, they are poorly suited to decision support systems in healthcare, where the algorithm should supply the main pieces of information behind a computed diagnosis or prognosis so that the clinician can make the final decision. To address this lack of interpretability, we used a new supervised autoencoder (SAE). The main advantage of our supervised autoencoder is that it provides, for each patient, a diagnosis with a confidence score computed by a softmax classifier, together with a meaningful visualization of the latent space. This confidence score and visual evaluation are crucial for clinical interpretability. Moreover, we used a new, efficient feature selection method with a structured constraint to obtain biologically interpretable results. Experimental results on three metabolomics datasets of clinical samples illustrate the effectiveness of our confidence score diagnostic method.
Results: The supervised autoencoder locates patients accurately in the latent space and provides an efficient confidence score. Experiments show that the SAE outperforms classical methods: PLS-DA, Random Forests, SVM, and neural networks (NN). Furthermore, the metabolites selected by the SAE were found to be relevant.
Conclusion: We have proposed a new, efficient method for diagnosis or prognosis support based on clinical metabolomics analyses.
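As an illustration of the per-patient confidence score described in the Background, the following minimal sketch shows how a softmax over class scores could yield both a predicted class and a confidence value. It is not the authors' implementation: the function names `softmax` and `diagnose` and the example scores are hypothetical, and the sketch assumes the confidence is simply the largest softmax probability.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    # without changing the resulting probabilities.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def diagnose(scores):
    """Return (predicted_class_index, confidence) for one patient.

    Assumption: the confidence score is the softmax probability
    of the predicted class.
    """
    probs = softmax(scores)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


# Hypothetical classifier outputs for two patients and two classes.
clear_case = diagnose([2.0, -1.0])      # scores far apart: high confidence
ambiguous_case = diagnose([0.1, 0.0])   # scores close together: confidence near 0.5
```

Under this assumption, a patient whose class scores are close together receives a confidence near 0.5 in a two-class problem, which signals a case the clinician may want to examine more closely.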