Incremental method of generating decision implication canonical basis

Decision implication is an elementary representation of decision knowledge in formal concept analysis. Decision implication canonical basis (DICB), a set of decision implications with completeness and non-redundancy, is the most compact representation of decision implications. The method based on true premises (MBTP) for DICB generation is the most efficient one at present. In practical applications, however, data are always changing dynamically, and MBTP, lacking an update mechanism for DICB, has to inefficiently re-generate the whole DICB. This paper proposes an incremental algorithm for DICB generation, which obtains a new DICB just by modifying and updating the existing one. Experimental results verify that when the samples in data are much more numerous than the condition attributes, which is actually the general case in practical applications, the incremental algorithm is significantly superior to MBTP. Furthermore, we conclude that even for data in which samples are fewer than condition attributes, when new samples are continually added, the incremental algorithm is also more efficient than MBTP, because it just needs to modify the existing DICB, which is only a part of the work of MBTP.


A brief review of formal concept analysis
Formal Concept Analysis (FCA) is an order-theoretic method for concept analysis and visualization, pioneered by Wille in the mid-1980s (Wille 2009). In essence, FCA comes from a philosophical understanding of a concept, which is viewed as a unit of thought constituted by its extent and intent. The extent of a concept is the collection of all objects belonging to the concept, and the intent is the set of attributes shared by these objects. FCA has been applied to areas of information acquisition, text mining, software engineering and machine learning (Carpineto and Romano 2004; Bernhard et al. 2005). Decision implication, revealing the dependency between conditions and consequences (causes and effects), is an elementary representation of decision knowledge in FCA. A decision implication is defined as a formula A → B, meaning that if the conditions in A are satisfied, then the conclusions in B hold. Compared with other classifiers, decision implication has an equal or better classification ability (Mehran 1995; Hu 2000; Wang et al. 1999).
In practical applications, however, even a small-scale dataset may produce a large number of decision implications. Thus, for the sake of easy storage and efficient processing, it is widely recognized that decision implications should be deduced from a decision implication basis (a complete set of decision implications) rather than being computed from data (Qu et al. 2007; Zhai et al. 2015a, b). To achieve this, Qu et al. (2007) introduced the α-decision inference rule, meaning that new decision implications can be deduced from decision implications by amplifying their premises or reducing their consequences. By using the α-decision inference rule, one can obtain a decision implication set with relative completeness and non-redundancy. Li et al. (2011, 2012a, b, 2013a) discussed the application of the α-decision inference rule in decision contexts, incomplete decision contexts and real decision contexts. Nevertheless, all the above studies fail to present an integrated logic description of decision implication. Hence, Zhai et al. (2014, 2015b) and Li et al. (2017b) researched decision implication logic from the semantical and syntactical aspects. The semantical aspect accounts for the soundness of decision implications and the non-redundancy and completeness of decision implication sets. In the syntactical aspect, two inference rules, namely Augmentation and Combination, were proposed and proven to be sound, complete and non-redundant w.r.t. the semantical aspect.

A brief review of decision implications in decision contexts
Based on decision implication logic, Zhai et al. (2014, 2015a, b) proposed the most compact set of decision implications, the decision implication canonical basis (DICB), in decision contexts. This basis takes decision premises as its premises and the closures of decision premises as its consequences. DICB keeps all decision implications in data with the least number of decision implications: this basis is complete and non-redundant w.r.t. decision implication logic and, more importantly, it is optimal, i.e., it is of minimal cardinality among all complete sets of decision implications (Zhai et al. 2014, 2015a, b). Starting from DICB, one can obtain all decision implications in data by iteratively applying Augmentation and Combination. Furthermore, DICB has been proven to have the strongest strength of knowledge representation, compared with other models of decision implications, such as concept rules and granule rules (Wu et al. 2009; Li and Weizhi 2017). Other research on decision implications and fuzzy decision implications can be found in Mingwen (2007), Zhai et al. (2013), Wu et al. (2009), Zhai et al. (2018a), Zhai et al. (2018b) and Jia (2021).

Motivations and contributions of the paper
An efficient method for DICB generation is essential for decision implication-based decision knowledge representation and reasoning. Zhai et al. (2015a) proposed a minimal generator-based method to generate DICB, which is, however, of exponential time complexity (Li et al. 2017b). Considering this shortcoming, Li et al. (2017b) put forward a method based on true premises (abbreviated to MBTP). It can be verified that MBTP has polynomial time complexity, and experiments show that it is more efficient than the minimal generator-based method in Zhai et al. (2015a). In practical applications, however, data are always changing dynamically and DICB changes simultaneously. For example, in a shopping database, new purchase records are added moment by moment. In this case, MBTP, due to the lack of an update mechanism of DICB, always has to inefficiently re-generate the whole DICB. Hence, to improve the efficiency of DICB generation, an approach that just iteratively updates the current DICB when new records come needs to be developed. This paper proposes an incremental method for DICB generation, which obtains DICB just by updating the existing one. In this method, decision premises are classified into four categories: unchanged decision premises, modified decision premises, invalid decision premises and new decision premises. We study their properties and renewal mechanisms, by which the existing DICB can be modified so that a new DICB is achieved. Experimental results verify that when the samples are much more numerous than the condition attributes, which is actually the general case in practical applications, the incremental algorithm is significantly superior to MBTP.
Furthermore, we conclude that, even for data in which samples are fewer than condition attributes, when new samples are continually added into the data, MBTP still re-generates the whole DICB; by contrast, the incremental algorithm just needs to modify the existing DICB, which is only a part of the work of MBTP, and is thus also more efficient than MBTP.

The arrangement of the paper
This article is organized as follows. Section 2 introduces decision implication logic. Section 3 presents decision implication in decision contexts. Section 4 studies the incremental generation of decision premises. Section 5 proposes an incremental algorithm of generating DICB. Section 6 conducts an experiment to compare the performance of the incremental algorithm and MBTP. Conclusions and further work end the paper in Sect. 7. In what follows, we give the overall structure diagram of this article, as shown in Fig. 1.

Decision implication logic
Decision implication (Zhai et al. 2013, 2014, 2015a, b, 2018b; Li et al. 2017b), revealing the dependency between premises and consequences (causes and effects), is defined as a formula between condition attributes and decision attributes.
Definition 1 (Zhai et al. 2014) Let C be a set of condition attributes and D be a set of decision attributes such that C ∩ D = ∅. A decision implication is a formula A → B, where A ⊆ C and B ⊆ D.

Decision implication logic gives the semantical and syntactical description of decision implications. The semantical aspect studies the completeness and non-redundancy of decision implication sets (Zhai et al. 2014, 2015b).

Definition 2 (Zhai et al. 2014) Let C be a set of condition attributes, D be a set of decision attributes, and L and L_1 be sets of decision implications.
(1) For a set T ⊆ C ∪ D and a decision implication A → B, if A ⊆ T implies B ⊆ T, then we say T satisfies A → B. (2) If every set T satisfying all decision implications in L also satisfies A → B, then A → B can be semantically deduced from L. (3) If no decision implication in L can be semantically deduced from the others in L, then L is non-redundant. (4) If any A → B that can be semantically deduced from L is contained in L, then L is closed. (5) If L is closed, L_1 ⊆ L and every decision implication in L can be semantically deduced from L_1, then L_1 is complete w.r.t. L.
For a given dataset, the soundness of a decision implication means that the decision implication is valid in the dataset. The completeness of a set of decision implications means that all valid decision implications can be deduced from the set. A set of decision implications is non-redundant if, in the set, no valid decision implication can be deduced from the other decision implications. In the syntactical aspect (Zhai et al. 2014, 2015b), two inference rules, Augmentation and Combination, are proposed, and their soundness, completeness and non-redundancy w.r.t. the semantical aspect are proved.
Augmentation: from A → B, one can infer A' → B' for any A ⊆ A' ⊆ C and B' ⊆ B.

Combination: from A_1 → B_1 and A_2 → B_2, one can infer A_1 ∪ A_2 → B_1 ∪ B_2.

Theorem 1 (Zhai et al. 2014) Augmentation and Combination are sound, i.e., every decision implication derived by them from a set L of decision implications can also be semantically deduced from L.

Theorem 2 (Zhai et al. 2014) Augmentation and Combination are complete w.r.t. the semantical aspect, i.e., for any closed set of decision implications L and a complete set L_1 ⊆ L, all decision implications in L can be obtained from L_1 by applying Augmentation and Combination.
Theorem 3 (Zhai et al. 2014) Augmentation and Combination are non-redundant, i.e., neither can be replaced by the other in the deduction process.

Zhai et al. (2015a), Qu and Zhai (2008), Qu et al. (2007) and Li et al. (2011, 2012a, 2013b, 2017a) studied decision implications in decision contexts and proposed the most compact set of decision implications, i.e., the decision implication canonical basis (Zhai et al. 2015a). The notion of decision context is first introduced in the following.

Decision implication in decision contexts
Definition 3 (Zhai et al. 2014) A decision context is a triple K = (G, C ∪ D, I_C ∪ I_D), where G is the object set, C is the condition attribute set, and D is the decision attribute set such that C ∩ D = ∅. In this case, I_C ⊆ G × C is the set of condition incidence relations and I_D ⊆ G × D is the set of decision incidence relations. For g ∈ G and m ∈ C ∪ D, (g, m) ∈ I_C or (g, m) ∈ I_D denotes "the object g has the attribute m." A decision context can also be represented by a two-dimensional table, in which row headers are object names, column headers are attribute names, and a "1" indicates that the row object g has the column attribute m, i.e., (g, m) ∈ I_C or (g, m) ∈ I_D.
Example 1 We take the transaction data of an AllElectronics branch (Han et al. 2012) as a decision context, as shown in Table 1.
Definition 4 (Zhai et al. 2014) Let K = (G, C ∪ D, I_C ∪ I_D) be a decision context. For sets A_1 ⊆ C, A_2 ⊆ D and B ⊆ G, the derivation operators are defined as:

A_1^C = {g ∈ G | (g, m) ∈ I_C for all m ∈ A_1},
B^C = {m ∈ C | (g, m) ∈ I_C for all g ∈ B},
A_2^D = {g ∈ G | (g, m) ∈ I_D for all m ∈ A_2},
B^D = {m ∈ D | (g, m) ∈ I_D for all g ∈ B}.

For A ⊆ C, we write A^{CD} for (A^C)^D. The operator (.)^D has properties similar to those of (.)^C in Proposition 1. Definition 5 introduces decision implications in decision contexts.

Proposition 2 Let K = (G, C ∪ D, I_C ∪ I_D) be a decision context and A ⊆ C. Then: (1) A → A^{CD} is a decision implication of K; (2) A → B is a decision implication of K if and only if B ⊆ A^{CD}.
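To make the derivation operators and the decision-implication test of Proposition 2 concrete, they can be sketched in Python on a toy decision context; the object and attribute names below are illustrative, not those of Table 1.

```python
# Objects mapped to their condition attributes (I_C) and decision
# attributes (I_D) in a toy decision context.
IC = {"g1": {"a", "b"}, "g2": {"a"}, "g3": {"b", "c"}}
ID = {"g1": {"d1"}, "g2": {"d1"}, "g3": {"d2"}}

def up_C(A):
    """A^C: objects possessing every condition attribute in A."""
    return {g for g, attrs in IC.items() if A <= attrs}

def down_D(B):
    """B^D: decision attributes shared by every object in B."""
    if not B:
        # Every decision attribute is (vacuously) common to the empty set.
        return set().union(*ID.values())
    return set.intersection(*(ID[g] for g in B))

def closure_CD(A):
    """A^{CD} = (A^C)^D, the maximal consequence of A (Proposition 2)."""
    return down_D(up_C(A))

def is_decision_implication(A, B):
    """A -> B holds in the context iff B ⊆ A^{CD} (Proposition 2 (2))."""
    return B <= closure_CD(A)
```

For instance, {a}^C = {g1, g2} and {a}^{CD} = {d1}, so {a} → {d1} is a decision implication of this toy context, while {b} → {d2} is not.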
By Proposition 2, we can see that for a set A ⊆ C, A^{CD} is the maximal consequence of A. Zhai et al. (2015a) and Li et al. (2017b) defined the most compact set of decision implications, i.e., the decision implication canonical basis.

Definition 6 (Zhai et al. 2015a) Let K be a decision context and A ⊆ C. Denote by (A) the set of decisions collected from the decision premise proper subsets of A, i.e., (A) = ∪{A_i^{CD} | A_i ⊂ A and A_i is a decision premise of K} (formula 1). A is a decision premise of K if A^{CD} is not contained in (A).

Proposition 3 Let K be a decision context and A ⊆ C. Then A is not a decision premise of K if and only if A^{CD} = (A).
Proof It is easy to see that A^{CD} ⊇ (A). Then, by Definition 6, the proof is straightforward.
By Definition 6, we know that A is a decision premise if and only if A^{CD} contains more decisions than (A). In other words, if A is a decision premise, one can collect only (A) from the decision premise subsets of A, but the consequence of A is A^{CD}, which contains more conclusions than (A). In this case, the decision implication A → A^{CD} is indispensable to derive more decisions. On the contrary, if A is not a decision premise, by Proposition 3, A^{CD} is equal to (A), meaning that A^{CD} can be collected from the decision premise subsets of A, and hence A → A^{CD} is not necessary.
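The decision-premise test and the resulting canonical basis can be computed by brute force, enumerating subsets from smallest to largest so that (A) is already known when A is examined. This is only an illustrative sketch on toy data (not the paper's Table 1); its subset enumeration is exponential in |C|, unlike MBTP or the incremental method.

```python
from itertools import combinations

# Toy decision context with illustrative names.
IC = {"g1": {"a", "b"}, "g2": {"a"}, "g3": {"b", "c"}}
ID = {"g1": {"d1"}, "g2": {"d1"}, "g3": {"d2"}}
C = {"a", "b", "c"}

def closure_CD(A):
    """A^{CD} in the toy context; D (all decisions) if no object matches A."""
    objs = [g for g in IC if A <= IC[g]]
    if not objs:
        return frozenset().union(*map(frozenset, ID.values()))
    out = set(ID[objs[0]])
    for g in objs[1:]:
        out &= ID[g]
    return frozenset(out)

def dicb():
    """DICB as a map: decision premise -> its consequence A^{CD}."""
    premises = {}
    for k in range(len(C) + 1):               # smallest subsets first
        for A in map(frozenset, combinations(sorted(C), k)):
            # (A): decisions collected from decision premise subsets of A.
            got = frozenset().union(*(cl for B, cl in premises.items() if B < A))
            cl = closure_CD(A)
            if cl > got:                      # strictly more decisions
                premises[A] = cl
    return premises
```

On this toy context the DICB is {{a} → {d1}, {c} → {d2}}; every other A ⊆ C adds no decision beyond what its premise subsets already collect.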
The decision implication canonical basis is the set of decision implications which take each decision premise A as a premise and A^{CD} as its consequence.

Definition 7 (Zhai et al. 2015a; Li et al. 2017b) Let K = (G, C ∪ D, I_C ∪ I_D) be a decision context. The decision implication canonical basis (DICB) of K is the set {A → A^{CD} | A is a decision premise of K}.

The decision implication canonical basis is proven to be complete, non-redundant and optimal w.r.t. decision implication logic (Zhai et al. 2015a), as shown in Theorem 4.
Theorem 4 (Zhai et al. 2015a) Let K = (G, C ∪ D, I_C ∪ I_D) be a decision context and O be the DICB of K. Then:

(1) O is complete, i.e., all decision implications in K can be obtained from O by applying Augmentation and Combination.
(2) O is non-redundant, i.e., any decision implication in O cannot be obtained from the others in O by applying Augmentation and Combination.
(3) O is optimal, i.e., it is of minimal cardinality among all complete sets of decision implications.
Example 3 (Continuing Example 1) By calculation, we can obtain the DICB O of Table 1. Starting from O, all decision implications in Table 1 can be deduced by applying Augmentation and Combination.
In practical applications, however, data always change as new objects/samples are continuously added, and DICB also changes simultaneously. In this paper, we develop an incremental method for DICB generation, which just updates the existing DICB to obtain a new one when new objects come.

Incremental generation of decision premises
By Definition 7, we can see that DICB is defined based on decision premise. Hence, to obtain a new DICB by updating the existing DICB, the key is to update the existing decision premises.

Categories of decision premises
For the given decision context K = (G, C ∪ D, I_C ∪ I_D), when a new object g is added into K, we denote the new decision context by K_g = (G ∪ {g}, C ∪ D, I'_C ∪ I'_D), where I'_C ⊆ (G ∪ {g}) × C and I'_D ⊆ (G ∪ {g}) × D, and write the derivation operators of K_g as (.)^{C'} and (.)^{D'}, respectively. Since decision premise is defined via the operator (.)^{CD} (Definition 6), in order to check the changes of decision premises from K to K_g, it is necessary to check the change from A^{CD} to A^{C'D'}.

Proposition 4 For decision contexts K and K_g, let A ⊆ C. Then, we have the following conclusions:

(1) A^{C'D'} ⊆ A^{CD};
(2) if A is not contained in g^C, then A^{C'D'} = A^{CD};
(3) if A ⊆ g^C, then A^{C'D'} = A^{CD} ∩ g^D;
(4) A ⊆ g^C and A^{CD} is not contained in g^D if and only if A^{C'D'} ⊂ A^{CD}.
Proof (1) There are two cases to be considered:
• A is not contained in g^C. In this case, by the definitions of (.)^C and (.)^{C'}, we have A^{C'} = A^C, and hence A^{C'D'} = A^{CD}.
• A ⊆ g^C. In this case, A^{C'} = A^C ∪ {g}, and hence A^{C'D'} = A^{CD} ∩ g^D ⊆ A^{CD}.
In conclusion, A^{C'D'} ⊆ A^{CD} holds. Conclusions (2) and (3) follow from the two cases, respectively.
(4) "⇐". Assume that A is not contained in g^C. By conclusion (2), A^{C'D'} = A^{CD}, contradicting A^{C'D'} ⊂ A^{CD}; hence A ⊆ g^C. Then, by conclusion (3), A^{C'D'} = A^{CD} ∩ g^D ⊂ A^{CD} implies that A^{CD} is not contained in g^D. "⇒". If A ⊆ g^C and A^{CD} is not contained in g^D, then, by conclusion (3), A^{C'D'} = A^{CD} ∩ g^D ⊂ A^{CD}.

To generate the decision premises of K_g based on the existing decision premises of K, we classify the decision premises of K and K_g as follows:

(1) A is a decision premise of K, and A is also a decision premise of K_g. Even so, one may not obtain the same decision implication, since the consequence may change. Thus, by (1) of Proposition 4, we divide A^{C'D'} ⊆ A^{CD} into two cases:
(1.1) A^{C'D'} = A^{CD}, i.e., the consequence of A is unchanged. In this case, we call A an unchanged decision premise;
(1.2) A^{C'D'} ⊂ A^{CD}, i.e., the consequence of A needs to be changed into A^{C'D'}. In this case, we call A a modified decision premise;
(2) A is a decision premise of K, but A is not a decision premise of K_g. In this case, we call A an invalid decision premise;
(3) A is not a decision premise of K, but A is a decision premise of K_g. In this case, we call A a new decision premise.

From the above, the decision premises of K and K_g can be classified into four categories: unchanged decision premises, modified decision premises, invalid decision premises and new decision premises.
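The four categories can be observed by comparing the DICBs of K and K_g directly. The sketch below recomputes both bases by brute force on a toy context with hypothetical names and splits the premises accordingly; the incremental algorithm of Sect. 5 is designed to avoid exactly this recomputation.

```python
from itertools import combinations

def closure_CD(IC, ID, A):
    """A^{CD} in the context given by (IC, ID)."""
    objs = [g for g in IC if A <= IC[g]]
    if not objs:
        return frozenset().union(*map(frozenset, ID.values()))
    out = set(ID[objs[0]])
    for g in objs[1:]:
        out &= ID[g]
    return frozenset(out)

def dicb(IC, ID, C):
    """Brute-force DICB: premise -> consequence, smallest premises first."""
    prem = {}
    for k in range(len(C) + 1):
        for A in map(frozenset, combinations(sorted(C), k)):
            got = frozenset().union(*(cl for B, cl in prem.items() if B < A))
            cl = closure_CD(IC, ID, A)
            if cl > got:
                prem[A] = cl
    return prem

def classify(old, new):
    """Four categories of decision premises between K and K_g."""
    unchanged = {A for A in old if A in new and old[A] == new[A]}
    modified = {A for A in old if A in new and old[A] != new[A]}
    invalid = set(old) - set(new)
    fresh = set(new) - set(old)      # the "new decision premises"
    return unchanged, modified, invalid, fresh

C = {"a", "b"}
IC1 = {"g1": {"a", "b"}, "g2": {"a"}}
ID1 = {"g1": {"d1", "d2"}, "g2": {"d1", "d2"}}
IC2 = {**IC1, "g3": {"a"}}           # add object g3 with g3^C = {a}
ID2 = {**ID1, "g3": {"d1"}}          # and g3^D = {d1}
old, new = dicb(IC1, ID1, C), dicb(IC2, ID2, C)
unchanged, modified, invalid, fresh = classify(old, new)
```

Here ∅ is a modified decision premise (its consequence shrinks from {d1, d2} to {d1}, since the new object g3 matches ∅ but lacks d2), and {b} becomes a new decision premise.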
Example 4 (Continuing Examples 1 and 3) Consider the decision context K in Example 1. We add a new object T900 and obtain the new decision context K_T900, as shown in Table 2.
The DICB of K_T900 is shown in Table 3. By comparing the DICBs of K (see Example 3) and K_T900, the categories of decision premises can be recognized, as shown in Table 4.

Properties and update of decision premises
In this section, we will study the properties of the four types of decision premises, by which one can determine which category they belong to and how to modify or update them.
We rewrite the set (A) (formula 1) in K_g as (A)' = ∪{A_i^{C'D'} | A_i ⊂ A and A_i is a decision premise of K_g}.

Proposition 5 For decision contexts K and K_g, let O be the DICB of K and A → A^{CD} ∈ O. Then, A is an unchanged decision premise if and only if A^{C'D'} = A^{CD}.
Proof "⇐". We firstly prove that if A is a modified decision premise or an invalid decision premise, then A C D ⊂ A C D holds.
(1) A is a modified decision premise, then by the definition of modified decision premise, we have and holds by (1) of Proposition 4, and hence, (A) ⊆ (A) holds. Because A is not a decision premise of K g , and hence , A is neither a modified decision premise nor an invalid decision premise by Proposition 6. Because A → A C D ∈ O, A is not a new decision premise. In conclusion, A is an unchanged decision premise.
Example 5 (Continuing Example 4) Take the unchanged decision premise {I_5} of Example 4 as an example.

Proposition 6 For decision contexts K and K_g, let O be the DICB of K and A → A^{CD} ∈ O. Then, A is a modified decision premise if and only if A^{C'D'} ⊂ A^{CD} and A^{C'D'} ⊃ (A)', where (A)' is the set (A) computed in K_g.
Proof "⇒". It has been proven in the sufficiency proof of Example 6 (Continuing Example 4) Take the modified decision premise {I 4 } as an example. We have

Proposition 7 For decision contexts K and K_g, let O be the DICB of K and A → A^{CD} ∈ O. Then, A is an invalid decision premise if and only if A^{C'D'} ⊂ A^{CD} and A^{C'D'} = (A)', where (A)' is the set (A) computed in K_g.
Proof "⇒". It has been proven in the sufficiency proof of

Proposition 8 For decision contexts K and K_g, if A is a new decision premise, then:

(1) A is not contained in g^C.

Proof (1) Assume A ⊆ g^C. If we can prove that A is not a decision premise of K_g, i.e., A^{C'D'} = (A)' by Proposition 3, then this contradicts the fact that A is a new decision premise; in this case, the assumption A ⊆ g^C is wrong, and hence A is not contained in g^C. We first prove (A)' = (A); by formulas 3 and 4, we just need to check the decision premise proper subsets of A. (4) Assume the contrary; then A is not a new decision premise, contradicting the fact that A is a new decision premise. Thus, the assumption is wrong, and there must exist a generator of A among the decision premises of K.

Example 7 (Continuing Examples 3 and 4) Take the new decision premise {I_3, I_4} as an example. It can be verified that {I_3, I_4} satisfies the conditions in Proposition 8. By Definition 8, we know that there may be more than one generator of A. Proposition 9 further clarifies the relationship between A and its generators.
Proposition 9 For decision contexts K and K_g, let A be a new decision premise. Then, for any generator A_m of A, A has exactly one more attribute m than A_m, and m does not belong to g^C, i.e., m ∈ C − g^C.

Proof Because A_m is a generator of A, we have A_m ⊂ A and A_m^{C'D'} ⊂ A_m^{CD} by Definition 8, and A_m ⊆ g^C by (4) of Proposition 4. Considering A_m ⊆ g^C and A_m ⊂ A, A can be written as A = A_m ∪ S ∪ T, where A_m ∪ S = g^C ∩ A and T = A − g^C (so that T ∩ g^C = ∅).

(1) We prove |T| = 1 in the following.

Assume |T| = 0. We have A = A_m ∪ S ∪ T = A_m ∪ S, and by A_m ∪ S = g^C ∩ A, we have A = g^C ∩ A ⊆ g^C, i.e., A ⊆ g^C. Because A is a new decision premise, A is not contained in g^C by (1) of Proposition 8, which contradicts A ⊆ g^C. Thus, |T| ≠ 0.

Assume |T| > 1. For any A_i ⊂ A, consider A_j = A_i ∪ {a} for some a ∈ T. Because T ∩ g^C = ∅ and a ∈ T, we have a ∉ g^C, and hence A_j = A_i ∪ {a} is not contained in g^C, so A_j^{C'D'} = A_j^{CD} by (2) of Proposition 4. In conclusion, for any A_i ⊂ A, A_i^{C'D'} ⊆ (A) holds, i.e., (A)' ⊆ (A) by formula 3, where (A)' is the set (A) computed in K_g, implying that A is not a new decision premise by (3) of Proposition 8, which contradicts the fact that A is a new decision premise. Thus, the assumption |T| > 1 is wrong; and considering |T| ≠ 0, we have |T| = 1.

(2) We prove |S| = 0. Assume that |S| > 0, and assume further that A_m ∪ S is a decision premise of K. Because |S| > 0 and |T| = 1, by the definitions of S and T, it is clear that A_m ⊂ A_m ∪ S ⊂ A, implying that A_m is not a generator of A, which contradicts the fact that A_m is a generator of A. Thus, the assumption that A_m ∪ S is a decision premise of K is wrong, i.e., A_m ∪ S is not a decision premise of K.

For any decision premise A_i of K such that A_i ⊂ A, there are two cases to be considered. If A_i ⊂ A_j = A_i ∪ T, then A_i^{CD} ⊆ A_j^{CD} by Proposition 1; because T ∩ g^C = ∅, A_j = A_i ∪ T is not contained in g^C, and then A_j^{C'D'} = A_j^{CD} by (2) of Proposition 4. Because A_i is a decision premise of K and we have proven that A_m ∪ S is not a decision premise of K, we can conclude that, for any decision premise A_i of K satisfying A_i ⊂ A, A_i^{CD} ⊆ (A) holds, and hence (A)' ⊆ (A) holds by formula 1, implying that A is not a new decision premise by (3) of Proposition 8, contradicting the fact that A is a new decision premise. Thus, the assumption |S| > 0 is wrong, and hence |S| = 0. We illustrate Proposition 9 by Example 8.

Example 8 (Continuing Example 4) Take the new decision premise {I_3, I_4} in Example 4 as an example. By calculation, in K_T900, we have ({I_3, I_4}) = {{I_4}}, implying that {I_4} is a generator of {I_3, I_4} by Definition 8. Furthermore, it can be verified by Proposition 9 that {I_3, I_4} has exactly one more attribute than {I_4}, and that this attribute I_3 does not belong to T900^C (T900^C = {I_4}).
By Theorem 5, one can obtain all candidate new decision premises.

Theorem 5 For decision contexts K and K_g, let O be the DICB of K. If A is a new decision premise, then there must exist A_i → A_i^{CD} ∈ O satisfying the following conditions: (1) A_i ⊆ g^C and A_i^{CD} is not contained in g^D; (2) A has exactly one more attribute than A_i, and the extra attribute does not belong to g^C.
Proof By (4) of Proposition 8 and Proposition 9, the conclusions are straightforward.
By Theorem 5, for a new decision premise A, there must exist a premise A_i of O satisfying conditions (1) and (2) in Theorem 5 and a condition attribute d ∈ C − g^C such that A = A_i ∪ {d}. Thus, we can obtain all the candidate new decision premises A_i ∪ {d}, where d ∈ C − g^C, by searching all A_i → A_i^{CD} in O that satisfy condition (1) and discarding the candidates that do not satisfy condition (2).
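Candidate generation in the spirit of Theorem 5 can be sketched as follows. The filter A_i ⊆ g^C used below is an assumption standing in for condition (1) of the theorem, and the candidates must in any case still pass the decision-premise test of Definition 6 afterwards.

```python
def candidate_new_premises(O, C, g_C):
    """Candidates A_i ∪ {d} for the new decision premises of K_g.

    O   : old DICB as a map premise -> consequence.
    g_C : condition attributes of the new object g (its g^C).
    The subset filter below is our reading of condition (1) of
    Theorem 5 (an assumption, not the paper's verbatim condition).
    """
    cands = set()
    for A in O:
        if A <= g_C:                       # assumed condition (1)
            for d in set(C) - set(g_C):
                cands.add(A | {d})         # condition (2): one extra attribute
    return cands

# With a toy old DICB {∅ -> {d1, d2}}, C = {a, b} and g^C = {a},
# the only candidate new premise is {b}.
cands = candidate_new_premises({frozenset(): frozenset({"d1", "d2"})},
                               {"a", "b"}, frozenset({"a"}))
```

Each candidate has exactly one attribute outside g^C, mirroring Proposition 9's description of generators.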

Incremental algorithm for DICB generation
In this section, we give an incremental algorithm for generating DICB. The incremental algorithm starts with an empty decision context, i.e., a decision context containing no objects.

Proposition 10 For a decision context K = (G, C ∪ D, I_C ∪ I_D) with G = ∅, the DICB of K is {∅ → D}.

Proof Because no object exists, ∅^{CD} = D, and hence ∅ is a decision premise with consequence D. We then prove that for any ∅ ⊂ A ⊆ C, A is not a decision premise of K. Because ∅ ⊂ A and ∅ is a decision premise, by the definition of (A), we have ∅^{CD} ⊆ (A), and hence D ⊆ (A) because ∅^{CD} = D. Considering D ⊆ (A) and A^{CD} ⊆ D, we have A^{CD} ⊆ (A), i.e., A is not a decision premise by Definition 6.

From the above, the DICB of K is {∅ → D}.
The incremental algorithm for generating DICB is shown in Algorithm 1.

Algorithm 1 Incremental algorithm for DICB generation
For the given decision context K = (G, C ∪ D, I_C ∪ I_D), we start with an empty object set and, by Proposition 10, initialize O with {∅ → D} (steps 1-2). At each iteration (steps 3-6), we add an object g to the existing decision context K_current and update the existing DICB by the function Update-CanoBasis(K_current, g, O) (Algorithm 2).
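The loop of Algorithm 1, together with a simplified stand-in for Update-CanoBasis, can be sketched as follows. This is not the paper's exact pseudocode: following Propositions 4 and 10 and Theorem 5, consequences of matching old premises are intersected with g^D, candidate new premises A ∪ {d} are added, and a final smallest-first pass keeps only the genuine decision premises of K_g (which also discards the invalid ones).

```python
def closure_CD(IC, ID, A):
    """A^{CD} in the context (IC, ID); D if no object matches A."""
    objs = [g for g in IC if A <= IC[g]]
    if not objs:
        return frozenset().union(*map(frozenset, ID.values()))
    out = set(ID[objs[0]])
    for g in objs[1:]:
        out &= ID[g]
    return frozenset(out)

def update_dicb(O, IC, ID, C, g, g_C, g_D):
    """Fold one new object g into the DICB O of the context (IC, ID)."""
    IC2, ID2 = {**IC, g: set(g_C)}, {**ID, g: set(g_D)}
    # (a) Old premises: A^{C'D'} = A^{CD} ∩ g^D when A ⊆ g^C (Prop. 4).
    cand = {A: (B & frozenset(g_D) if A <= set(g_C) else B)
            for A, B in O.items()}
    # (b) Candidate new premises A ∪ {d}, d ∈ C − g^C (cf. Theorem 5).
    for A in list(O):
        for d in set(C) - set(g_C):
            cand.setdefault(A | {d}, closure_CD(IC2, ID2, A | {d}))
    # (c) Keep genuine premises of K_g, smallest first; invalid drop out.
    new = {}
    for A in sorted(cand, key=len):
        got = frozenset().union(*(cl for B, cl in new.items() if B < A))
        if cand[A] > got:
            new[A] = cand[A]
    return new, IC2, ID2

# Start from the empty context: DICB = {∅ -> D} (Proposition 10),
# then add three toy objects one by one.
C, D = {"a", "b"}, {"d1", "d2"}
O, IC, ID = {frozenset(): frozenset(D)}, {}, {}
for g, gC, gD in [("g1", {"a", "b"}, {"d1", "d2"}),
                  ("g2", {"a"}, {"d1", "d2"}),
                  ("g3", {"a"}, {"d1"})]:
    O, IC, ID = update_dicb(O, IC, ID, C, g, gC, gD)
```

After the three insertions the maintained basis is {∅ → {d1}, {b} → {d1, d2}}, which matches a brute-force recomputation on the final context.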

Algorithm 2 Update-CanoBasis function

We will also illustrate MBTP to make a comparison between the incremental method and MBTP. In MBTP, decision premise was proved to be equivalent to true premise of decision attributes. Thus, MBTP generates DICB by generating the true premises of each decision attribute. By calculation, for the decision context K_T800, I_1 and I_2 have the same true premises, I_4 and I_5. The corresponding decision implications {I_4} → {I_1, I_2} and {I_5} → {I_1, I_2} constitute the DICB of K_T800. When T900 is added into K_T800, because MBTP lacks an update mechanism of DICB, it needs to re-generate the true premises of I_1 and I_2, as listed in Table 6. The corresponding decision implications in Table 6 constitute the DICB of K_T900 (see Table 3).

Now, we analyze the time complexity of Algorithm 2. Let |C| be the number of condition attributes, |D| be the number of decision attributes, and o_n be the number of decision implications in the DICB of K_current. The time complexity of steps 1-9 is O(o_n · |C| · o_n · (|C| + |D|)) = O(|C| · o_n^2 · (|C| + |D|)). Denote the number of decision implications after step 9 by ō_n. Then, the time complexity of steps 10-20 is O(ō_n^2 · (|C| + |D|)). Thus, the time complexity of Algorithm 2 is O(|C| · o_n^2 · (|C| + |D|) + ō_n^2 · (|C| + |D|)).

Let |G| be the number of objects in K. Then, the time complexity of Algorithm 1 is O(|G| · |C| · o_n^2 · (|C| + |D|) + |G| · ō_n^2 · (|C| + |D|)). Let |O| be the number of decision implications in the DICB of K. We approximate o_n by |O| and ō_n by |O| + |O| · |C|. The time complexity of Algorithm 1 becomes O(|G| · |C|^2 · |O|^2 + |G| · |C| · |D| · |O|^2 + |G| · |C|^3 · |O|^2 + |G| · |C|^2 · |D| · |O|^2), i.e., O(|G| · |C|^3 · |O|^2 + |G| · |C|^2 · |D| · |O|^2). (5)

Li et al. (2017b) put forward a true premise-based algorithm (abbreviated to MBTP), whose time complexity is O(|G|^2 · |C| · |D| + |G| · |C|^3 · |D| · |O|^2 + |G| · |D|^2 · |O|), (6) and proved its absolute advantages compared with the minimal generator-based algorithm in Zhai et al. (2015a).
Section 6 makes a further comparative experiment between our proposed algorithm and MBTP algorithm.

Experimental verification
In this section, we conduct an experiment to compare the time consumption of the incremental method and MBTP. An advantage of the incremental algorithm over MBTP is that it is able to update, instead of re-generating, DICB when new objects are added to decision contexts. Taking this advantage into account, there are two ways to record the time consumption of the incremental algorithm: (1) a decision context is given and we record the time consumption of generating the whole DICB, or (2) new objects are added into decision contexts and we just record the time consumption of updating the existing DICB. Obviously, the time consumption of the incremental algorithm in the second way is only a part of that in the first way. Thus, if the incremental algorithm is more efficient than MBTP in the first way, it must be also more efficient than MBTP in the second way. In this case, we compare their performances in the first way.

Experiment data
We selected 8 UCI datasets with different scales, carried out some necessary pre-processing, such as removing missing values and normalizing the continuous attributes, and finally obtained 8 formal contexts according to the threshold value 0.5. The summary information of the formal contexts is shown in Table 7, in which |G| and |M| denote the numbers of objects and attributes, respectively.

Experiment approach and results
For a dataset, we generated the first decision context by randomly selecting one condition attribute from all the attributes and taking the remaining attributes as the decision attributes. Subsequently, we evenly increased the number of condition attributes, which were also randomly selected from all the attributes, with the rest being taken as the decision attributes. This process was repeated 10 times. It is noted that when the number of condition attributes was not an integer, we rounded it to the nearest integer. Take the dataset "cloud" as an example. We obtained 10 decision contexts by randomly choosing 1, 3, 5, 7, 9, 11, 13, 16, 18, 20 condition attributes from all the attributes and taking the rest as decision attributes.
The results are shown in Tables 8, 9, 10, 11, 12, 13, 14 and 15, in which: 1. "Incre" represents the incremental algorithm; 2. If the time consumption of the incremental algorithm is less than that of MBTP, we highlight it in bold; 3. If the time consumption of one algorithm is more than 3600 seconds and 10 times that of the other one, the algorithm is terminated and its time consumption is appended with the symbol "+", e.g., "9922.8+"; 4. If an algorithm has run more than 24 hours, it is also terminated and its time consumption is denoted as "24h+". It is noted that when both algorithms are terminated, |O| is unknown, and it is then denoted as "*".

Experiment analysis
The time complexity of MBTP and the incremental algorithm is determined by |G|, |C|, |D| and |O| (Equations 5 and 6), where |G|, |C| and |D| are known for a given decision context but |O| is not. To explain the results in Tables 8, 9 , 10, 11, 12, 13, 14 and 15, we firstly explore the factors determining |O|.
From Tables 8, 9 , 10, 11, 12, 13, 14 and 15, it can be seen that as |C| increases, |O| grows in most cases, meaning that |O| is largely determined by |C|. Although the result is obtained as |D| decreases, we will explain in the following that, compared with |C|, the impact of |D| on |O| can be negligible in most cases.
For a decision context K = (G, C ∪ D, I C ∪ I D ), by Definition 7, |O| is equal to the number of decision premises; by Theorem 3 in (Li et al. 2017b), A is a decision premise if and only if A is a true premise of a decision attribute, and thus, the number of decision premises is equal to the number of true premises, i.e., |O| is equal to the number of true premises. Now, we assume that for a decision context with |C| condition attributes, each decision attribute has n true premises on average. Then, K , which has |C| condition attributes and |D| decision attributes, has |D|·n true premises, i.e., |O| = |D| · n.
Thus, when |C| increases to |C| + 1 and |D| decreases to |D| − 1, the increase of |C| leads to an increment |C| · |D| · n of |O| and the decrease of |D| leads to a decrement n of |O|, thus leading to an increment |C| · |D| · n − n of |O|. Thus, in our experiments, when |C| increases to |C| + i and |D| decreases to |D| − i, |O| will get the increment of about n · i · (|C| · |D| − 1).
We then evaluate the effect of |G| on |O| by selecting four datasets from Tables 8, 9 , 10, 11, 12, 13, 14 and 15. For each decision context listed in these tables, we divided the objects into 10 equal groups and incrementally added one group to the original decision context. Figure 2 records the change of |O| as |G| increases. Figure 2 shows that as |G| increases, |O| grows slowly in most cases; and when |G| is large enough, |O| holds steady.
By the above experiments, we conclude that |C| has a major impact on |O|, whereas both |D| and |G| have a limited impact on |O|.
We then discuss the key factors that affect the performances of the two algorithms. From Tables 8, 9, 10, 11, 12, 13, 14 and 15, we can see that, for each dataset, as |C| increases, with |D| decreasing and |G| unchanged, the time consumption of the incremental algorithm grows in most cases, which means that the incremental algorithm is mainly affected by |C|.
By Li et al. (2017b), MBTP mainly includes two subfunctions, getAllgd and DPgenerator, whose time complexities are, respectively, O(|G|^2 · |C| · |D|) and O(|G| · |C|^3 · |D| · |O|^2), meaning that getAllgd is mainly affected by |G|, and DPgenerator is mainly affected by |C| and |O|. As analyzed previously, |O| is mainly affected by |C|, and hence, DPgenerator is also mainly affected by |C|. In this sense, when |G|/|C| is small, i.e., |G| is small but |C| is large, DPgenerator will take more time than getAllgd. Thus, as |C| increases, the time consumption of DPgenerator increases, and hence that of MBTP also grows. When |G|/|C| is large, i.e., |G| is large but |C| is small, getAllgd will take more time than DPgenerator. Figure 3 further shows that when |G|/|C| is large, such as for "hou", "bank8FM" and "dplanes", as |C| increases, the time consumption of getAllgd is very close to that of the whole algorithm MBTP, and hence they keep a coincident variation, i.e., as the time consumption of getAllgd decreases, the time consumption of MBTP also drops.

Based on the above analysis, we can further compare the performances of the two algorithms by Tables 8, 9, 10, 11, 12, 13, 14 and 15. First, we notice that, when |G|/|C| is large, the incremental algorithm is more efficient than MBTP. Exemplary datasets are "bank8FM" and "dplanes" (Tables 12 and 13). When |G|/|C| > 300 in "bank8FM" and |G|/|C| > 1000 in "dplanes", compared with MBTP, the incremental algorithm has a remarkable advantage. Especially for "dplanes", where |G| = 4076, |C| = 4 and |D| = 29, MBTP takes 41349.31 s but the incremental algorithm just takes 11.19 s, with the efficiency being increased by more than 99%. Another example is "hou" (Table 9), where |G|/|C| > 30 and both algorithms are of low time consumption, with the incremental algorithm being more efficient. When |G|/|C| is small, however, the incremental algorithm loses this advantage.
Take "ion" and "triazines" (Tables 10 and 11) as examples. While the incremental algorithm is more efficient than MBTP at first, once |G|/|C| < 40 or |G|/|C| < 5, respectively, MBTP becomes the more efficient. Another example is "cloud" (Table 8). When |G|/|C| < 30, both algorithms have low time consumption, with MBTP being the more efficient.
In addition, we observe that when |C| is large, both algorithms are time-consuming. For example, in "bank32nh" and "waveform" (Tables 14 and 15), when |C| ≥ 33 or |C| ≥ 41, respectively, both algorithms take more than 24 hours and were terminated. The reason is that when |C| is large, O contains a great number of decision implications, as analyzed before, and |O| may grow explosively as |C| increases. For example, in "bank32nh", when |C| increases from 11 to 22, |O| increases from 271 to 9149; in "waveform", when |C| increases from 14 to 27, |O| increases from 1465 to 50619. Thus, when |O| is large, traversing O is time-consuming for both algorithms. Note that in most practical cases objects greatly outnumber condition attributes, i.e., |G|/|C| is large, and as analyzed before, the incremental algorithm is then superior to MBTP. In fact, even for data in which objects are fewer than condition attributes, new objects may be continually added. For example, new product records are added moment by moment to the database of a supermarket. In this case, |G|/|C| increases and remains large. MBTP, however, still regenerates the whole DICB; by contrast, the incremental algorithm only needs to modify the existing DICB to obtain the new one, which is only a part of the work of MBTP. Hence, the incremental algorithm will also be more efficient than MBTP.
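The explosive growth of |O| with |C| is unsurprising given a simple combinatorial bound. Since a premise of a decision implication is a subset of the condition attribute set C, the number of distinct premises, and hence a crude upper bound on |O|, is 2^|C|. The sketch below is only this illustrative bound, not the paper's counting of decision implications with true premises:

```python
# Illustrative upper bound (assumption: each premise is a subset of the
# condition attribute set C, so at most 2^|C| distinct premises exist;
# the actual |O| reported in the tables is typically far below this bound).
def premise_bound(num_condition_attrs: int) -> int:
    """Number of subsets of a |C|-element attribute set."""
    return 2 ** num_condition_attrs

# Doubling |C| squares the bound, which is why moderate increases in |C|
# (e.g. 11 -> 22 in "bank32nh") can coincide with a jump in |O|.
print(premise_bound(11))  # bound for |C| = 11
print(premise_bound(22))  # bound for |C| = 22
```

The observed jumps (271 to 9149, 1465 to 50619) are far below these bounds but grow super-linearly in the same fashion.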

Conclusion and further work
In this paper, we proposed an incremental algorithm that produces a new DICB by modifying and updating the existing DICB when new objects are continually added to the data. Experimental results verified that when the samples in the data greatly outnumber the condition attributes, the time consumption of generating the whole DICB by the incremental algorithm is remarkably less than that by MBTP. In addition, we concluded that, even for data in which samples are fewer than condition attributes, when new objects are continually added, the incremental algorithm will also be more efficient than MBTP.
In practical cases, multiple samples, rather than a single one, may be added to the data simultaneously. Hence, how to modify the existing DICB to obtain a new one when a batch of samples arrives deserves further exploration. Furthermore, an improved distributed algorithm for DICB generation will be designed in our future study.
DICB is a complete and extremely compact representation of the decision information in data. Hence, DICB-based knowledge representation and reasoning is a valuable topic. It includes studying the effects of inference rules when they are applied to knowledge inference in different orders and at different times, designing optimal inference strategies, and constructing a system of knowledge representation and inference that takes a DICB as its knowledge base and the inference rules as its inference engine.