Hybrid Semantic Recommender System for Chemical Compounds in Large-Scale Datasets
The large and growing number of chemical compounds makes such datasets difficult to explore. In this work, we propose using recommender systems to identify compounds of interest to scientific researchers. Our approach is a hybrid recommender model suited to implicit-feedback datasets and focused on retrieving a list of items ranked by relevance. The model integrates collaborative-filtering algorithms for implicit feedback (Alternating Least Squares and Bayesian Personalized Ranking) with a new content-based algorithm that uses the semantic similarity between chemical compounds in the ChEBI ontology. The algorithms were assessed on CheRM-20, an implicit-feedback dataset of chemical compounds with more than 16,000 items. The hybrid model improved on the results of the collaborative-filtering algorithms by more than ten percentage points in most of the evaluation metrics assessed.
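The abstract does not state how the collaborative-filtering and content-based scores are combined, so the following is only a minimal sketch of one plausible hybridization, not the authors' method. It assumes a weighted sum of min-max normalized scores, and it assumes the content-based score of a compound is its mean semantic similarity to the compounds the user has already interacted with. The function names, the `weight` parameter, and the toy data are all hypothetical. In practice, `cf` would come from a trained ALS or BPR model and `sim` from a semantic similarity measure computed over ChEBI.

```python
import numpy as np


def min_max(x):
    """Rescale scores to [0, 1]; a constant vector maps to all zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def content_scores(similarity, seen):
    """Assumed content-based score: mean similarity to the user's seen items."""
    return similarity[:, seen].mean(axis=1)


def hybrid_top_k(cf_scores, similarity, seen, k=3, weight=0.5):
    """Rank unseen items by a weighted sum of CF and content-based scores.

    weight=1.0 reduces to pure collaborative filtering,
    weight=0.0 to the pure content-based ranking.
    """
    cb = content_scores(similarity, seen)
    combined = weight * min_max(cf_scores) + (1 - weight) * min_max(cb)
    combined[seen] = -np.inf  # push already-seen items to the end
    order = np.argsort(-combined, kind="stable")
    return [int(i) for i in order if i not in seen][:k]


# Toy data (hypothetical): 5 compounds, pairwise semantic similarity in [0, 1].
sim = np.array([
    [1.0, 0.9, 0.2, 0.1, 0.3],
    [0.9, 1.0, 0.3, 0.2, 0.1],
    [0.2, 0.3, 1.0, 0.8, 0.4],
    [0.1, 0.2, 0.8, 1.0, 0.5],
    [0.3, 0.1, 0.4, 0.5, 1.0],
])
cf = np.array([0.9, 0.2, 0.6, 0.1, 0.4])  # stand-in for ALS or BPR scores
seen = [0]  # the user has interacted with compound 0 only

ranking = hybrid_top_k(cf, sim, seen, k=3)
print(ranking)
```

Varying `weight` between 0 and 1 shows how the content-based signal can reorder a list produced by collaborative filtering alone, which is the kind of effect a hybrid model relies on.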
Figures 1-11
Due to technical limitations, full-text HTML conversion of this manuscript could not be completed. However, the latest manuscript can be downloaded and accessed as a PDF.
Posted 29 Dec, 2020