Discovering critical proteins in the learning process in a Down syndrome mouse model through machine learning

Caused by an extra copy of human chromosome 21 (Hsa21), Down syndrome produces an intellectual disability whose mechanisms are still not fully understood and require further research. One study in this area analysed the levels of different proteins in the Ts65Dn mouse model of DS. Many researchers have tried to find, through machine learning, the critical proteins that accurately categorize the mouse classes. In this study, we extend the problem by searching for the critical proteins that affect different types of learning. The protein subsets are found with a forward feature selection method, ReliefF, and four different supervised learning algorithms are used for classification. The experimental results are compared with previous related work and demonstrate that the proposed method outperforms, or is comparable to, its competitors in terms of accuracy. A thorough analysis is then done to identify the critical proteins for each learning case, lowering the number to 9 critical proteins that allow a better categorization of the mice. We hope that our work will help scientists in their further research on finding a treatment that may help the learning process and ease the intellectual disability caused by Down syndrome.


Introduction
Down syndrome (DS), or trisomy 21, is caused by an extra copy of human chromosome 21 (Hsa21). The central feature of DS is impaired intellectual function and intellectual disability (ID) 1-6. Currently, Hsa21 is estimated to contain 234 protein-coding genes 7. DS also shares many features with Alzheimer's disease (AD) 3,8,9, such as the deposition of both amyloid plaques and neurofibrillary tangles 10. In their work, Ahmed et al. 11,12 and Costa et al. 13 have shown that Ts65Dn, a Down syndrome mouse model, expresses a learning recovery capability under context fear conditioning when treated with memantine, a pharmaceutical drug used in the treatment of AD. More recent studies focus on investigating the effects of memantine on the learning process 14-17, and Smalheiser 18 proposes ketamine as a neglected therapy for AD. Since DS invariably leads to early-onset AD 19 and the dementia phenomenon in DS has been analysed in depth 20, it is crucial to understand which proteins affect the learning process. Ahmed et al. 12 measured 85 protein levels in the hippocampus and cortex of the Ts65Dn mouse model and of normal mice to identify how these proteins affect the learning process. The dataset was published online by [21], where the total number of proteins is 77. Since then, different analysis methods have been used to study the relationship between the proteins and the learning process. The first study, by Ahmed et al., used statistical models. Later, Higuera et al. 21 applied the unsupervised Self-Organizing Maps (SOM) algorithm, rather than statistical models, to identify the critical proteins. Since then, however, the problem has been treated as a classification problem rather than the clustering problem considered by Higuera et al. In their paper, Eicher et al. 22 used linear Support Vector Machines (SVM) to identify the proteins that can discriminate two different classes of mice. In 23, B.
Feng et al. used the adaptive boosted decision tree (AdaBoost) method for forward feature selection to identify the most correlated proteins, then used Random Forest (RF), SVM, and Decision Tree (DT) algorithms for classification. In their paper, Kulan et al. 24 followed the same procedure as B. Feng et al., changing only the feature selection method. They expected Naïve Bayes to discriminate the proteins better than AdaBoost, and indeed they obtained a higher accuracy than the previous work. In their recent paper, Kulan et al. 24 compared their results with those of Higuera et al. on finding the critical proteins that affect three different types of learning: successful, rescued and failed learning. They again used Naïve Bayes for feature selection to identify the correlated proteins, and their results were higher than the previous ones.
In this work, we aim to find the critical proteins by applying a different forward feature selection method. The subset of features is selected from 77 protein expression levels obtained from the hippocampus and cortex of normal and Ts65Dn trisomic mice. After feature selection, RF, SVM, Neural Network (NN), and K-Nearest Neighbour (KNN) classification algorithms are applied to find the critical proteins in two different problems: multiclass classification and the learning process. For multiclass classification, we compare our results with the works of B. Feng et al. and Kulan et al. 24, in which AdaBoost and Naïve Bayes were used for feature selection. Regarding the learning process, we compare our results with the works of Higuera et al. and Kulan et al. 25, which used SOM and supervised learning respectively. Our results show that the selected protein subsets yield a higher classification accuracy than the previous related models. The protein subsets selected in our work can help researchers better understand the involvement of these proteins in the learning process, providing solid grounds for drug development for treating the ID.
The rest of the paper is organized as follows: the second section is dedicated to related work; the third section describes the materials and methods used in this paper. Results are shown in the fourth section, and a deep analysis is done in the discussion, the fifth section. We conclude our findings in the sixth section.

Exploring the dataset
The dataset used in this paper is the UCI (University of California Irvine) Machine Learning Repository dataset, publicly available at [21]. It consists of 77 numerical values (protein expression levels) and 3 categorical values: genotype, behaviour, and treatment. In total, there are 72 mice, of which 38 are control mice and 34 are trisomic (Down syndrome) mice. For each mouse, 15 measurements are collected, thus 570 measurements in total for control mice and 510 for trisomic mice. The dataset therefore contains 1080 samples, where each measurement can be considered an independent sample. As mentioned before, there are two different classes for genotype: control and trisomic. There are also two different classes for behaviour: context-shock CS (stimulated to learn) and shock-context SC (not stimulated to learn). For treatment there are likewise two classes: the first group of mice was injected with memantine and the second group with saline. Table 1 shows the summary of the mice classes.
Class    Description                                                      # of mice
c-CS-s   control mice, stimulated to learn, injected with saline          9
c-CS-m   control mice, stimulated to learn, injected with memantine       10
c-SC-s   control mice, not stimulated to learn, injected with saline      9
c-SC-m   control mice, not stimulated to learn, injected with memantine   10
t-CS-s   trisomy mice, stimulated to learn, injected with saline          7
t-CS-m   trisomy mice, stimulated to learn, injected with memantine       9
t-SC-s   trisomy mice, not stimulated to learn, injected with saline      9
t-SC-m   trisomy mice, not stimulated to learn, injected with memantine   9

Table 1. Summary of classes on the dataset

Data Preprocessing
Missing data

Most of the related works have used the mean to fill the missing values. We try a new method by filling the missing values with the most frequent value of each mouse class. To compare these methods, two datasets are prepared: D1, where the missing values are filled with the most frequent value (mode) of the class, and D2, where they are filled with the class mean.
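As a sketch of the two imputation strategies, the per-class mode fill (D1) and mean fill (D2) can be written with pandas; the class labels and protein values below are invented for illustration, not taken from the dataset:

```python
import pandas as pd

# Hypothetical mini-frame: one protein column, a class column, missing values.
df = pd.DataFrame({
    "class":   ["c-CS-s", "c-CS-s", "c-CS-s", "t-CS-m", "t-CS-m", "t-CS-m"],
    "protein": [0.5, 0.5, None, 0.9, 0.7, None],
})

# D1: fill each class's missing values with that class's most frequent value.
d1 = df.copy()
d1["protein"] = d1.groupby("class")["protein"].transform(
    lambda s: s.fillna(s.mode().iloc[0]))

# D2: fill each class's missing values with that class's mean.
d2 = df.copy()
d2["protein"] = d2.groupby("class")["protein"].transform(
    lambda s: s.fillna(s.mean()))
```

Grouping by the full class label keeps the fill value specific to each of the eight mouse classes, which is the point of the per-class strategy.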

Data Normalization
To prevent proteins with higher values from dominating, the data must be normalized. Higuera et al. 21 applied max-min normalization, which in fact does not preserve the range of the original values. Because of this, Kulan et al. 25 propose Z-score normalization, subtracting the mean of the values from each value and then dividing by the standard deviation. In this paper, a different normalization technique is used, in which each value is normalized to the range [−1, 1].
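A minimal sketch of such a scaling, using scikit-learn's MinMaxScaler with feature_range=(-1, 1); the toy matrix stands in for the real protein values:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy matrix: rows are measurements, columns are protein expression levels.
# The second protein has much larger raw values than the first.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Scale every protein to [-1, 1] so high-valued proteins do not dominate.
scaler = MinMaxScaler(feature_range=(-1, 1))
X_scaled = scaler.fit_transform(X)  # each column now spans [-1, 1]
```

After scaling, both proteins contribute on the same scale to any distance-based method such as KNN or ReliefF.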

Feature Selection
Since our work will be compared with other related works, the number of features selected is the same as in the literature. As mentioned before, the comparison is done on two different problems. The first problem is multiclass classification of all the data, where our results are compared with Kulan et al. 24. To allow this comparison, in this work 30 features are also selected out of the 77 proteins. To go a little deeper, we also perform multiclass classification while reducing the dimensionality further, to 11, 10, and 9 features. The second problem is finding the critical protein subsets related to normal learning, rescued learning, and failed learning. Here our results are compared with those of Higuera et al. 21 and Kulan et al. 25. Higuera et al. applied SOM (Self-Organizing Maps) for these cases, and Kulan et al. used forward feature selection with a Naïve Bayes classifier. They selected 11 features for successful learning, 9 features for rescued learning and 9 features for failed learning. We use the same number of features for each case.
The feature selection is done with the Relief-based method ReliefF. Relief-based algorithms (RBAs), a family of filter-style feature selection algorithms, are the only individual-evaluation filter algorithms capable of detecting feature dependencies 29. They use information taken from the nearest neighbours to estimate the feature weights. No search over feature combinations is involved; instead, feature interactions are captured indirectly through the neighbourhood structure. Moreover, RBAs have an asymptotic time complexity of O(instances² × features), which makes them relatively fast compared to other algorithms and may save a lot of computational effort.
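The neighbour-based weighting idea can be sketched in a few lines of NumPy. This is a deliberately simplified single-pass ReliefF (Manhattan distances, no distance weighting or range normalization), not the exact implementation used in the paper:

```python
import numpy as np

def relieff_weights(X, y, n_neighbors=5):
    """Simplified ReliefF: reward features that differ on nearest misses
    (other classes) and penalize features that differ on nearest hits
    (same class)."""
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(X - X[i]).sum(axis=1)  # Manhattan distance to all samples
        dist[i] = np.inf                     # exclude the sample itself
        hits = np.where(y == y[i])[0]
        misses = np.where(y != y[i])[0]
        hits = hits[np.argsort(dist[hits])][:n_neighbors]
        misses = misses[np.argsort(dist[misses])][:n_neighbors]
        w -= np.abs(X[hits] - X[i]).mean(axis=0)    # near hits should be similar
        w += np.abs(X[misses] - X[i]).mean(axis=0)  # near misses should differ
    return w / n

# Demo: feature 0 carries the class signal, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.column_stack([y + 0.1 * rng.normal(size=200), rng.normal(size=200)])
w = relieff_weights(X, y)
```

Features are then ranked by weight and the top k are kept, which is what makes the method a filter: the ranking is computed once, independently of any classifier.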

Classification
After feature selection, different classification algorithms, comparable with those of the other works, are selected. We use four different classifiers: KNN (K-Nearest Neighbour), NN (Neural Network), SVM (Support Vector Machines) and RF (Random Forest). KNN is not used in any of the compared works; we add it in order to check the accuracy of this algorithm with our methods. The software used for classification is Orange 30.
KNN (K-Nearest Neighbour)

KNN 30 is a supervised machine learning algorithm that can be used for both classification 31-36 and regression 37,38 problems. However, it is generally used for classification rather than regression. It is one of the simplest classification algorithms, taking into consideration the distance of the to-be-classified object from its k nearest neighbours. The KNN algorithm assumes that similar things tend to stay close to each other; in other words, similar things have a short distance between them. To identify similar objects, KNN calculates the distances between objects and classifies each one according to its nearest neighbours. Even though it is a very simple algorithm, it is still used in classification problems and gives highly competitive results. In this work, we use K (the number of nearest neighbours) equal to 5 and the Manhattan distance as input parameters.
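With scikit-learn as a stand-in for Orange, the stated configuration (k = 5, Manhattan distance) looks like this; the 2-D points are invented for illustration:

```python
from sklearn.neighbors import KNeighborsClassifier

# k = 5 neighbours and Manhattan (L1) distance, matching the parameters above.
knn = KNeighborsClassifier(n_neighbors=5, metric="manhattan")

# Two made-up clusters standing in for two mouse classes.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
knn.fit(X, y)
pred = knn.predict([[0.5, 0.5], [5.5, 5.5]])  # majority vote of the 5 nearest
```

Note that with k = 5 and two 3-point clusters, every vote mixes both classes, yet the majority still tracks the nearer cluster, which is the intuition described above.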

NN (Neural Network)
A neural network 39 is another method used for protein classification 40-43. An NN finds correlations between inputs and outputs by mapping them through a process carried out in multiple hidden layers. These layers are made up of nodes; a node is where computations, such as assigning different weights to inputs, are executed. The results are then passed through an activation function. Through the training process, the network adapts itself to the data and changes the weights appropriately without any explicit intervention. NNs are very robust towards noise and missing values and can achieve better results when more layers are added, but at the cost of increased computational time, which can make them slow. In our case, we use a neural network with 100 neurons in one hidden layer. The activation function is the rectified linear unit (ReLU) 44 and the solver is the Adam solver. The maximum number of iterations is 200.
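A sketch of this configuration (100-neuron hidden layer, ReLU, Adam, at most 200 iterations) using scikit-learn's MLPClassifier rather than Orange's implementation; the toy data are invented:

```python
from sklearn.neural_network import MLPClassifier

# One hidden layer of 100 neurons, ReLU activation, Adam solver,
# at most 200 iterations -- the parameters stated above.
nn = MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                   solver="adam", max_iter=200, random_state=0)

# Toy two-cluster data standing in for the normalized protein matrix.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
nn.fit(X, y)
pred = nn.predict(X)
```

On real data the 77 (or fewer, after feature selection) protein levels would form the input layer, with one output per mouse class.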

SVM (Support Vector Machines)
SVM 45,46 classification is done by determining a decision plane that best discriminates a set of objects of different classes. The data are mapped by a kernel, usually into a higher dimension, which is very useful in cases where the data are difficult to separate in lower dimensions. The distance between the decision plane and the nearest data point on either side of the plane is known as the margin; a larger margin gives a better classification. The classification of an object is done by checking on which side of the decision plane it falls. SVM works well on small datasets; however, it is not very robust on noisy data with many overlapping objects. It has been found to be a practical method for protein classification 47-50. In our work, the kernel is the radial basis function (RBF) 51 and the maximum number of iterations is 100.
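The stated configuration (RBF kernel, iteration cap of 100) can be sketched with scikit-learn's SVC in place of Orange; the clusters are made up:

```python
from sklearn.svm import SVC

# RBF kernel with a cap of 100 solver iterations, as stated above.
svm = SVC(kernel="rbf", max_iter=100)

# Two invented, well-separated clusters.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
svm.fit(X, y)
pred = svm.predict([[0, 0.5], [5.5, 5.5]])
```

The RBF kernel makes the implicit mapping to a higher dimension described above: similarity decays with squared distance, so each cluster is tightly grouped in kernel space even if no linear separator exists in the original space.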

RF (Random Forest)
A random forest 52 (a strong learner) is a collection of many decision trees (weak learners), each built from a random subset of the training set. For each classification, a vote is taken from each tree; the result is the mode of the individual trees' votes for classification, or the average for regression. Random forest is especially robust to missing values. It has been used in classification problems 53-56 and regression problems 57-59. Our parameters are: the number of trees is 10, and subsets smaller than 5 samples are not split further.
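These parameters (10 trees, no split below 5 samples) map directly onto scikit-learn's RandomForestClassifier, shown here on invented toy data:

```python
from sklearn.ensemble import RandomForestClassifier

# 10 trees; nodes with fewer than 5 samples are left as leaves.
rf = RandomForestClassifier(n_estimators=10, min_samples_split=5,
                            random_state=0)

# Toy two-cluster data standing in for the protein matrix.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
rf.fit(X, y)
pred = rf.predict([[0, 0], [6, 6]])  # mode of the 10 trees' votes
```

Each tree sees a bootstrap sample of the training set, so individual trees differ; the final label is the majority vote across all 10.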

Validation method
Cross validation [62] is used to check how accurately the predictive models will perform in practice. Here, k-fold cross validation is used for evaluation: the dataset is split randomly into k folds (subsets) of equal size, and the model is trained and tested k times. The accuracy is the ratio of correct classifications to the total number of samples. In our case, depending on the experiment, 5-fold cross validation, 10-fold cross validation, or random sampling is used, where in random sampling every sample is selected randomly by a random seed.
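A sketch of 10-fold cross validation with scikit-learn on synthetic data (the real experiments use the 1080-sample protein matrix and Orange's evaluation tools):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the normalized protein matrix.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Train and test 10 times; each fold serves once as the held-out test set.
scores = cross_val_score(
    KNeighborsClassifier(n_neighbors=5, metric="manhattan"), X, y, cv=10)
mean_accuracy = scores.mean()  # the figure reported in the result tables
```

Changing cv=10 to cv=5 gives the 5-fold variant; random sampling corresponds to a ShuffleSplit with a fixed random seed.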

Multiclass classification problem
As mentioned above, for this problem we compare our results with Kulan et al. [25] and B. Feng et al. [24]. For validation, 10-fold cross validation is used, which is also the method used by the other authors. We add an extra random sampling validation where 90% of the data is used for training and 10% for testing. The precision results are reported in Table 2. As we can see from the table, our results (in percentage) are higher than those reported in the other works, except for Random Forest. Comparing D1 with D2, we have a difference of 0.01% in precision.

Table 2. Results compared to other works

In order to find the critical proteins, we reduced the dimensionality of the data even further: to 11, 10 and 9 features respectively. Here we used only random sampling for result validation. In Table 3 we can see the results for D1 with 11 features. We still have a high precision, 99.5% from KNN, which is greater than the results reported in the related works with 30 features; the difference in precision from 30 features is only 0.5%. So, even with a dataset reduced to this subset of features, we would still have a high classification accuracy. Table 5 shows the results obtained from D1 when selecting 10 features. The accuracy of the other methods starts dropping, but the KNN accuracy (99.4%) is still higher than that of the related work with 30 features, with only a 0.1% difference in precision from the 11-feature case. On the other hand, on D2 we see an increase in the accuracy of KNN (99.6%, or 0.1% greater than with 11 features and 0.2% greater than the D1 result) when selecting only 10 features (see Table 6). This means that the protein subset used for classification in this case is more critical and more important. We notice an increase in the accuracy of NN and SVM, but a decrease in the accuracy of RF compared to the D1 results.
The last results shown for this problem are those obtained after selecting only 9 features. Table 7 shows the results from the D1 dataset. Again, the KNN classification accuracy (99.1%) is greater than the best result in the related work, which Kulan et al. [25] reported for NN when selecting 30 features (99%). This is an important observation, since we obtain a greater accuracy with a number of features that is very small compared to the dataset (9 out of 77 proteins) and smaller than the 30 features reported in the previous related work.

However, the other methods have a lower accuracy than KNN. SVM, NN and RF report a lower accuracy when the dimensionality is reduced; this is also true for KNN, but the drop caused by dimensionality reduction is smaller for KNN than for the other methods. Table 8 shows the results of D2 with 9 features selected. We notice the same trend in these results as when selecting 10 features: NN and SVM report an increase in accuracy, whereas RF reports a decrease. In conclusion, we may say that applying ReliefF for feature selection and KNN for classification yields a better accuracy than the other methods so far, even when selecting a very small number of features (12% of the proteins in the dataset).

Critical proteins subset of multiclass classification problem
The results showed an increase in precision, with the KNN algorithm performing best. This means that our method for finding the critical proteins is effective. The first run was done with a selection of 30 proteins, then with 11, 10 and 9 features. We notice that the accuracy of D1 and D2 differs, which means that different protein subsets must have been selected as features. We start by comparing the subsets when selecting 11 features. Table 9 shows the protein subsets selected as 11 features on D1 and D2. Out of 11 proteins, 2 differ between the datasets: H3MeK4_N and H3AcK18_N for D1, and AcetylH3K9_N and APP_N for D2; the rest are the same. Depending on the method used, either D1 or D2 gives the highest accuracy. For example, KNN on D1 gives an accuracy of 99.5%, whereas on D2 it gives a lower accuracy of 99.4%. However, SVM on D1 gives an accuracy of 92.8%, whereas on D2 it gives a higher accuracy of 97.1%.
Next, consider the protein subsets determined when selecting 10 features. Again, we see from the subsets shown in Table 10 that they differ by 2 proteins: PKCA_N and H3MeK4_N on D1, and Ubiquitin_N and Braf_N on D2; the rest are the same. Comparing the accuracy, we have an increase on D2 for KNN, NN and SVM, but a decrease for RF.
The last comparison is between the subsets selected with 9 features. Here too, only two proteins differ between the subsets (see Table 11) and all the rest are the same. On D1 we have the proteins BRAF_N and H3MeK4_N, whereas on D2 we have APP_N and pGSK3B_N. The classification accuracy changes with the method used and with the dataset.
To find the most critical proteins that need to be in every feature vector, the subsets of 11, 10 and 9 features are compared in Table 12. From these subsets we found the 9 critical proteins, which include CaNA_N, pPKCG_N, SOD1_N, pCAMKII_N, S6_N, H3MeK4_N, pP70S6_N, and APP_N. In conclusion, we may say that the pre-processing method used for filling the missing values affects the protein subsets selected as features for classification. The KNN algorithm has given the highest accuracy so far, and the difference in its accuracy between the pre-processing methods is very small compared to the other methods. Using ReliefF for feature selection proved to be an effective method for finding the critical proteins. After merging the proteins selected on D1 and D2 for 11, 10 and 9 features, we found the 9 most critical proteins. Even using only these 9 features, we obtain an accuracy of 99.1% in multiclass classification with KNN.

Successful learning, Rescued learning and Failed learning
As mentioned above, the second problem is finding the critical protein subsets related to normal learning, rescued learning and failed learning. Higuera et al. 21 and Kulan et al. 25 have done work on this problem. Higuera et al. applied the SOM (also known as Kohonen map) approach to identify the protein subsets that make the most critical contribution to learning, treating the problem as a clustering problem. Kulan et al. used 5-fold cross validation; we add 10-fold cross validation and random sampling as additional validation methods. Table 13 shows the accuracy results of our methods and the comparison with the other methods for successful learning. Checking the results from 5-fold cross validation, we notice an increase in accuracy using both D1 and D2. We also notice that with 10-fold cross validation we have greater accuracy, with KNN the highest at 99.8%. Comparing D1 and D2, the results differ depending on the method: for KNN, NN and SVM we see an increase in accuracy, whereas for RF it changes with the validation method. Still, the comparison indicates that our subset contains more critical proteins than those found by the other methods. For rescued learning, 9 features are selected. The results of our method and the comparison with the other related work are shown in Table 14. Using D1, we have a higher accuracy for all the methods compared to the others using 5-fold cross validation. This differs when using D2, since the SVM accuracy is lower than in the previous work, and NN reports the same accuracy. Comparing D1 with D2, the results of D1 are greater than those of D2, which means that the subset selected from D1 contains more critical proteins than that of D2. The highest accuracy is given by KNN. For failed learning, 10 features are selected. The accuracy results are reported in Table 15.
The first thing we notice is that using KNN for classification we obtain an accuracy of 100%, meaning that all the data are classified correctly; the subset chosen in this case is thus the best one so far. Comparing the results from 5-fold cross validation, we see an increase in accuracy, with the biggest difference (9.7%) for RF, from 89.2% to 98.9%, using D2. KNN and SVM give the best results, whereas RF on average reports the lowest. The proteins found in common with the critical proteins of the other methods include SOD1_N, pNUMB_N, pGSK3B_N, S6_N and CaNA_N. The difference in accuracy values is also explained by the low number of proteins found as critical by the other methods as well. We may say that the protein subsets found with our methods are more critical for successful learning, and really affect the ability to learn successfully.
The rescued learning feature subsets have a fixed length of 9 proteins. Again, given the varying accuracy of D1 and D2 across methods, the feature subsets must differ from each other. They differ by only one protein: D1 contains pPKCAB_N and D2 contains pP70S6 instead; the other proteins are the same. The protein subset of each dataset is shown in Table 18.
We compare our protein subset with those of the other related works in Table 19, noticing fewer common proteins than for successful learning, particularly with the subset of Kulan et al. 24. Failed learning has a fixed protein subset of length 10. Using different pre-processing methods gives different protein subsets, as was also shown above by the varying accuracy between datasets D1 and D2. In Table 20 the protein subsets are shown, and it can be seen that they differ by only 1 protein: on D1 the protein BAD_N is present, whereas on D2 we have pRSK_N instead.
The proteins that we have found appear to be the critical ones, given the high classification accuracy obtained.

By observing the subsets shown in Table 21, we notice that there are only 2 proteins in common with the subsets of Kulan et al., namely pPKCAB_N and pCAMKII_N, while with Higuera et al.'s subset there is only one common protein, pS6_N. The fact that very few proteins are in common explains the difference in accuracy between our work and theirs. After finding the protein subsets for each type of learning, we compared them in order to identify the critical proteins across the different types of learning. We were surprised to notice that the sets of proteins unique to each type of learning were all of the same size, 3. Table 22 shows these critical proteins. For successful learning, the critical proteins not found in the subsets of the other types of learning are pNUMB_N, Ubiquitin_N and ADARB1_N. For rescued learning the distinct proteins are BRAF_N, pERK_N and DYRK1A_N. Finally, for failed learning the distinct proteins are ARC_N, BAD_N and pRSK_N. It is important to mention that some of these proteins are among the proteins found in common with the other works' subsets, such as Ubiquitin_N and pNUMB_N for successful learning and BRAF_N, pERK_N and DYRK1A_N for rescued learning. It is also worth noting that the rescued learning critical proteins are among the proteins our subset shares with Higuera et al.'s for failed learning (Table 21 lists the critical proteins on failed learning found among the common proteins). These subsets may help researchers advance their work by studying them in detail.

Successful learning   Rescued learning   Failed learning
pNUMB_N               BRAF_N             ARC_N
Ubiquitin_N           pERK_N             BAD_N
ADARB1_N              DYRK1A_N           pRSK_N

Table 22. The proteins that are different among the subsets of different types of learning

Analysis of the methods used
The relationship of the pre-processing with the methods used

We mentioned earlier that by applying two different methods to fill the missing values (the most frequent value and the mean, respectively), we obtained two different datasets, D1 and D2. We also noticed when presenting the accuracy results that the accuracy varies between the different methods and datasets. So, we would like to observe how the pre-processing step chosen at the beginning affects the classification results at the end. We start analysing this relationship by comparing the results of multiclass classification with 9, 10, and 11 features respectively. The compared values are area under the curve (AUC), classification accuracy (CA), F1, precision and recall. The first comparison is done for the KNN and NN algorithms. As can be seen in Figures 1 and 2, the values vary between datasets: in some cases D1 gives the highest values, in others D2. We can conclude that these methods are not much affected by the pre-processing step used to fill the missing values. Next, RF and SVM are compared under the same conditions; the respective graphs are shown in Figures 3 and 4. Observing them, we notice that, contrary to KNN and NN, there is a trend for RF and SVM: the accuracy values are greater for D2 than for D1, which means that for these algorithms the pre-processing step influences the classification accuracy. If we want to use RF or SVM for classification, it is better to fill the missing values with the mean instead of the mode. In summary, KNN and NN are not affected by the pre-processing step used to fill the missing values, whereas RF and SVM are, with a higher accuracy when filling with the mean instead of the mode. So the choice of value used to fill the missing values depends on the classification algorithm.
The next observation concerns the relationship between the validation method and the algorithm used for the different datasets D1 and D2. For this case, the results of the different types of learning are taken for comparison. The validation methods compared are 10-fold, 5-fold and random sampling. The first algorithms compared are KNN and NN; the graphs are shown in Figures 5 and 6. As can be noticed, there is a trend among the validation method, the algorithm and the dataset used. For successful learning, D2 gives the highest value for all validation methods. For rescued learning, it instead gives lower values than D1. For failed learning, the values are the same for 10-fold and 5-fold, but lower than D1 for random sampling. RF and SVM are also compared, with the respective graphs shown in Figures 7 and 8. Looking at them, one cannot determine a clear relationship between the validation method, the algorithm used and the dataset. We may say, however, that for the RF algorithm the values of D1 and D2 for the different validation methods are the opposite of each other: whenever D1 gives a higher accuracy, D2 gives a lower one, and vice versa. SVM shows a different picture: for successful learning the values of D1 and D2 are the same for 10-fold, and then D2 gives higher values; for rescued learning D2 gives lower values for each validation method; for failed learning it gives lower values for 10-fold and 5-fold and the same value for random sampling. We need to keep in mind that the accuracy of the algorithms is also related to the feature subset used for classification. We may conclude that KNN and NN follow the same trend in the relationship between the validation method and the dataset used, whereas for RF and SVM there is no clear relationship that can be described.

Conclusion
The aim of this paper was to identify the critical proteins that discriminate the mouse classes most accurately, and the critical proteins related to different types of learning. Two different methods are used to fill the missing values: the mean of each class and the most frequent value of each class. The algorithms used for classification are RF, SVM, NN and KNN. Data normalization is performed by mapping all values to the interval [-1, 1]. Forward feature selection using ReliefF is used to identify the protein subsets, and the classification results are compared with those of the other related works. For the multiclass classification problem, protein subsets of different lengths are used: 30, 11, 10 and 9. The first subset is used to compare our method with the results reported in the related papers, B. Feng et al. and Kulan et al. [25] respectively. Our method achieves a higher accuracy than theirs, except for RF, with the highest accuracy of 99.8% achieved by KNN. Even when the protein subset length is lowered to 11, 10 and 9 features, KNN shows an accuracy (99.1%) higher than the best accuracy reported in these papers. The same procedure is followed to identify the critical proteins related to the different types of learning: successful, rescued and failed learning. We compared our classification results with the related work, and our method shows a higher accuracy than those reported by Higuera et al. and Kulan et al. [26]. The best-performing algorithm was again KNN, with an accuracy of 99.8% for successful learning, 99.5% for rescued learning and 100% for failed learning. Furthermore, we compared the protein subsets of each case and each problem in order to identify the critical proteins.
We found the nine critical proteins for the multiclass classification problem, which include CaNA_N, pPKCG_N, SOD1_N, pCAMKII_N, S6_N, H3MeK4_N, pP70S6_N, and APP_N. For the different types of learning, we found the critical proteins for each type and, in the end, the proteins that are unique to each type. For successful learning, the critical proteins not found in the subsets of the other types of learning are pNUMB_N, Ubiquitin_N and ADARB1_N. For rescued learning the distinct proteins are BRAF_N, pERK_N and DYRK1A_N. Finally, for failed learning the distinct proteins are ARC_N, BAD_N and pRSK_N. The pre-processing method used to fill the missing values may or may not affect the classification accuracy, depending on the algorithm. The subsets of proteins selected provide a better understanding of the proteins' involvement in the learning process, which may lead to better drug development for treating the ID. Finally, we may conclude that our goal of identifying the critical proteins is achieved, and we hope that this study will help scientists find a treatment that may aid the learning process and ease the intellectual disability caused by Down syndrome.