In data mining, classification problems are among the most common and frequently encountered problems, and they are best solved using classification and prediction applications. The development of an accurate classification and prediction application, however, relies on the use of an accurate classification algorithm. In the machine learning area, researchers have designed a large number of classifiers, which can broadly be categorized as decision tree-based (e.g., ID3 [1], C4.5 [2], and CART [3]), probability-based (e.g., Naive Bayes [4] and AODE [5]), rule-based (e.g., OneR [6] and Ripper [7]), fuzzy logic-based [8], [9], [10], instance-based, and ensemble-based (e.g., Bagging and Boosting). Moreover, the research community continuously designs new classifiers [11], [12], [13], and as a result, for a given classification problem a number of competing classifiers can be found, each with almost equal likelihood of being the best classifier for that problem.

Selecting an appropriate classifier from this large list of alternatives is a challenging task. One approach is the empirical evaluation of all available classifiers on the given classification problem and selection of the classifier with the best results. However, this approach suffers from the problem of exhaustive search, i.e., high computational complexity [2], and a number of studies [14], [15] have shown that there is no single classification algorithm applicable to all classification problems. For example, if the same classifier is used for another problem, it may produce worse results, which validates the well-known "No Free Lunch" theorem [1]. The reason is that a classifier's results depend on the characteristics of the given problem and consequently change from problem to problem; therefore, the classifier selection problem can be viewed as a meta-learning problem [3], [27]. In the meta-learning approach, meta-characteristics of the classification problems are computed and the performance of classifiers is measured on these problems. After this, a mapping between problem features and the classifier(s) with the best performance is learned for recommending an appropriate classifier [4]. Thus, automatic algorithm selection using meta-learning is essentially a four-fold process model, with the processes listed below (a minimal sketch of this pipeline follows the list).
- Classifier characterization: evaluating classifier performance.
- Problem characterization: extracting the problem's inherent meta-characteristics.
- Mapping and learning: associating problems' meta-characteristics with classifiers' performance.
- Recommendation: recommending appropriate classifier(s) for a new problem.
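For illustration only, the following Python sketch outlines these four steps using a handful of scikit-learn classifiers and deliberately simple meta-features. The function names, candidate classifiers, and feature set are assumptions made for the sketch; they are not the components of the framework proposed in this paper.

```python
# Illustrative sketch of the four-step meta-learning pipeline (assumed names;
# not the implementation proposed in this paper).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

candidate_classifiers = {
    "tree": DecisionTreeClassifier(),
    "nb": GaussianNB(),
    "knn": KNeighborsClassifier(),
}

def characterize_classifiers(X, y):
    """Step 1: evaluate each candidate classifier on one problem."""
    return {name: cross_val_score(clf, X, y, cv=10).mean()
            for name, clf in candidate_classifiers.items()}

def characterize_problem(X, y):
    """Step 2: extract simple meta-characteristics of the problem."""
    n_instances, n_attributes = X.shape
    return np.array([n_instances, n_attributes, len(np.unique(y))], dtype=float)

def build_meta_dataset(problems):
    """Step 3: map each problem's meta-characteristics to its best classifier."""
    meta_X, meta_y = [], []
    for X, y in problems:
        meta_X.append(characterize_problem(X, y))
        scores = characterize_classifiers(X, y)
        meta_y.append(max(scores, key=scores.get))
    return np.array(meta_X), np.array(meta_y)

def recommend(meta_X, meta_y, new_problem_features, k=3):
    """Step 4: recommend a classifier for a new problem via k-NN retrieval."""
    recommender = KNeighborsClassifier(n_neighbors=k).fit(meta_X, meta_y)
    return recommender.predict([new_problem_features])[0]
```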
Classifier characterization captures the goal set by the user for the development of the application, e.g., the most accurate or the least computationally expensive classifier. It is measured in terms of classifier performance metrics obtained through performance evaluation; the research community has characterized classifiers from both uni-metric and multi-metric perspectives, and this is sometimes called the meta-target. Problem characterization is the process of extracting the inherent behaviors of the data that reveal its hidden nature. It is measured in terms of meta-features or meta-characteristics of the problem. The research community has extracted different types of meta-characteristics, which can be categorized into statistical, information-theoretic, model-based, landmarking, and complexity measures [9]. Recently, Q. Song et al. [10] used a new dataset characterization method for computing dataset features and measured the performance of seventeen classification algorithms over 84 publicly available UCI datasets [11]. Mapping meta-characteristics to classifier performance is the process of aligning each problem with the appropriate classifier. The objective of this process is to cast algorithm selection as a machine learning problem, where the meta-characteristics form a feature vector and the label(s) of the best-performing classifier(s) serve as the class label. Identifying the class label is a challenging task, and researchers have approached the issue in various ways, such as the multiple comparison method. As a result of these methods, some problems have more than one applicable classifier as the class label. This makes algorithm selection either a single-label or a multi-label problem, and the research community has addressed it using single-label and multi-label learning. To learn the association or mapping function between problem meta-characteristics and class label(s), researchers have used different approaches, which can broadly be categorized as decision tree-based learners (e.g., C4.5 [5]), rule-based learners [6], linear regression [7], and instance-based learners (e.g., k-NN [8], [10]). Finally, for selecting the appropriate classifier(s) on the fly, researchers have used different approaches.
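As a concrete, minimal illustration of problem characterization, the snippet below computes a few widely used statistical and information-theoretic meta-features (instance and attribute counts, mean skewness and kurtosis, and class entropy). It assumes numeric attributes and is only an example; it is not the set of feature families used in this work.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def class_entropy(y):
    """Information-theoretic meta-feature: entropy of the class distribution."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def meta_features(X, y):
    """A minimal, illustrative set of general, statistical, and
    information-theoretic meta-features (assumes numeric attributes)."""
    n_instances, n_attributes = X.shape
    return {
        "n_instances": n_instances,
        "n_attributes": n_attributes,
        "n_classes": len(np.unique(y)),
        "mean_skewness": float(np.mean(skew(X, axis=0))),
        "mean_kurtosis": float(np.mean(kurtosis(X, axis=0))),
        "class_entropy": float(class_entropy(y)),
    }
```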
Summarizing the research work done so far, the key issues that still need to be addressed are the unavailability of sufficient training data, the use of uni-metric evaluation criteria for classifier evaluation, and the selection among conflicting classifiers, i.e., classifiers that have an equal likelihood of being the best. Viewing algorithm selection as a machine learning problem, it is impractical, if not impossible, to find a sufficient number of classification datasets to act as training data for building an accurate algorithm recommendation model. At the same time, it is very hard to assign an accurate class label to an instance in the training dataset based on only a single evaluation metric of classifier performance, because a classifier may perform best on a dataset under one metric, e.g., accuracy, yet poorly under another, e.g., time complexity.
Therefore, this research work proposes an edge ML-based CBR methodology to overcome the aforementioned issues. In brief, the work (a) designs an incremental learning framework using an edge ML-based CBR methodology, (b) designs a multi-metrics classifier evaluation criterion, (c) designs an efficient algorithm conflict resolution (ACR) criterion, and (d) implements the CBR methodology integrated with ACR to automatically select an appropriate classifier for a new classification problem.
In this paper, the Case-Based Meta-Learning and Reasoning (CB-MLR) framework is extended by introducing a multi-metrics criterion for classifier evaluation and integrating it with the CBR and algorithm conflict resolution (ACR) methodologies. The CBR methodology incrementally learns mappings between problems' meta-characteristics and the labels of the best classifiers. For the problems' meta-characteristics, general, basic statistical, advanced statistical, and information-theoretic characteristics are considered. For choosing the label of the best classifier(s), a multi-metrics performance evaluation criterion consisting of classifier accuracy and consistency is used. For mapping problem characteristics to best-classifier labels, a propositional feature-vector scheme is used. To learn these mappings, the CBR incremental learning methodology is adopted, which ultimately recommends the appropriate classifier for a given new classification problem. Hence, the key contributions of this research work are as follows.
- Extends CB-MLR to a flexible and incremental meta-learning and reasoning framework using an edge ML- and CBR-based methodology, integrated with multi-criteria decision making for classifier evaluation and with data characterization based on multi-view meta-feature extraction.
- Extends the conventional ML algorithm selection problem to an edge ML-based methodology for efficient and faster recommendation with the least utilization of computational resources.
- Proposes a new multi-metrics criterion for the evaluation of decision tree classifiers to select the best classifier as the class label for the cases in the training dataset (i.e., the resolved cases in the proposed CBR methodology). Classifiers are analyzed on the basis of their predictive accuracy and its standard deviation, referred to as consistency, to select the best classifier as the class label (a minimal sketch of this criterion is given after the list).
- Proposes the idea of multi-view learning to learn the data from multiple perspectives, with each perspective representing a set of similar meta-features that reflects one kind of behavior of the data. Each set of features is called a family and forms one view of the dataset.
- Proposes a CBR-based meta-reasoning methodology with a flexible and incremental learning model that integrates CBR with the algorithm conflict resolution (ACR) method to accurately recommend the most similar case, and hence the suggested classifier(s), for a given new dataset. For the CBR retrieval phase, accurate similarity matching functions are defined, while for the ACR method, a weighted sum score and the AMD method are employed.
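The sketch below, referenced in the multi-metrics contribution above, shows one plausible way to implement accuracy-plus-consistency label selection with a weighted-sum tie-break. The candidate classifiers, tolerance, fold count, and weights are illustrative assumptions, not the exact configuration of the proposed multi-metrics criterion or ACR method.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier

# Assumed candidate pool of decision tree classifiers (illustrative only).
candidates = {
    "CART": DecisionTreeClassifier(),
    "ExtraTree": ExtraTreeClassifier(),
}

def evaluate(clf, X, y, cv=10):
    """Multi-metrics evaluation: mean cross-validated accuracy and its
    standard deviation (the latter treated as the classifier's consistency)."""
    scores = cross_val_score(clf, X, y, cv=cv)
    return scores.mean(), scores.std()

def best_label(X, y, w_acc=0.7, w_con=0.3, tol=1e-3):
    """Pick the class label (best classifier) for one training case.
    Classifiers whose accuracy lies within `tol` of the best are treated as
    conflicting, and the conflict is resolved by a weighted sum of accuracy
    and consistency (weights are illustrative assumptions)."""
    results = {name: evaluate(clf, X, y) for name, clf in candidates.items()}
    top_acc = max(acc for acc, _ in results.values())
    conflicting = {n: r for n, r in results.items() if top_acc - r[0] <= tol}
    # Lower standard deviation means higher consistency.
    scores = {n: w_acc * acc + w_con * (1.0 - std)
              for n, (acc, std) in conflicting.items()}
    return max(scores, key=scores.get)
```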
The remainder of this paper is organized as follows. Section 2 briefly overviews edge ML computing for algorithm selection. Section 3 describes the edge ML- and CBR-based methodology. Section 4 describes the implementation and evaluation of the proposed methodology. Section 5 concludes the work.