

4. Knowledge processing

This section shows how standard inductive learning algorithms can be extended to complex data processing by using domain knowledge to generate accurate and meaningful decision trees from pre-classified examples.

4.1 Tree-based classification using domain knowledge

Starting from the well-known decision tree induction algorithm C4.5 [17], which handles discrete and continuous attributes, IKBS extends this algorithm to deal with:

  1. Structured objects
  2. Taxonomic attribute-values
  3. Multi-valued attributes
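As a rough illustration of points 2 and 3: a taxonomic value can satisfy a test on any of its ancestor taxa, and a multi-valued attribute satisfies a test as soon as one of its values does. The sketch below is a hypothetical illustration of this matching behaviour; the taxonomy, value names, and function names are assumptions, not IKBS code.

```python
# Hypothetical taxonomy over attribute values: each child value maps to
# its parent taxon (leaf values "round"/"oval" generalize to "compact").
SHAPE_TAXONOMY = {
    "round": "compact", "oval": "compact",
    "star": "branched", "cross": "branched",
}

def matches(observed_value, test_value):
    """An observed leaf value satisfies a test on itself or any ancestor."""
    v = observed_value
    while v is not None:
        if v == test_value:
            return True
        v = SHAPE_TAXONOMY.get(v)   # climb one level up the taxonomy
    return False

def matches_multi(observed_values, test_value):
    """A multi-valued attribute satisfies a test if any of its values does."""
    return any(matches(v, test_value) for v in observed_values)

# e.g. matches("round", "compact") holds by taxonomic generalization,
# and matches_multi({"star", "oval"}, "branched") holds via "star".
```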

Let IMAGE imgs/img4.gif be a set of observed examples and M = {N, A} the set of observable components and attributes defined in the descriptive model, with N = {n1, ..., nm} a set of structured components and A = {A1, ..., Ap} a set of attributes depending on N. Let dom(A) be the definition domain (range) of A.

Structured objects
The algorithm for building decision trees from structured objects is the following:

IMAGE imgs/img5.gif

IMAGE imgs/img6.gif

The original aspect of the algorithm is the classifier's selection function. The tree of the descriptive model is traversed from root to leaves, component by component, in depth-first order. If a component can be absent (e.g. calices on verrucæ of Fig. 1), an "exist component" test with values "Present" or "Absent" is dynamically generated and placed in the list of eligible classifiers. The sub-tree of this component is not visited yet, so that inapplicable sub-objects and their attributes are not proposed as classifiers. If, on the other hand, a component is always present (e.g. septa), its dependent attributes are placed in this list.
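The selection function just described might be sketched as follows. The `Node` structure, the function name, and the coral example are assumptions made for illustration, not the actual IKBS implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One component of the descriptive model tree."""
    name: str
    optional: bool = False                       # True if the component may be absent
    attributes: list = field(default_factory=list)   # (attr_name, domain) pairs
    children: list = field(default_factory=list)     # sub-components

def eligible_classifiers(node, known_present=frozenset()):
    """Collect candidate tests by a depth-first walk of the model tree.

    An optional component yields a dynamically generated 'exist' test
    (Present/Absent) and its sub-tree is skipped until the component is
    known to be present; mandatory components contribute their
    attributes directly and are traversed recursively.
    """
    tests = []
    for child in node.children:
        if child.optional and child.name not in known_present:
            tests.append(("exist", child.name, ["Present", "Absent"]))
        else:
            tests += [("attr", child.name, a, dom)
                      for a, dom in child.attributes]
            tests += eligible_classifiers(child, known_present)
    return tests

# Tiny model echoing the paper's coral example: verrucae (optional)
# carry calices, while septa are always present.
model = Node("colony", children=[
    Node("verruca", optional=True,
         children=[Node("calice", attributes=[("shape", ["round", "oval"])])]),
    Node("septa", attributes=[("count", ["<10", "10-20", ">20"])]),
])

# Initially only the exist test on verruca and septa's attribute are
# eligible; once verruca is known present, the calice sub-tree opens up.
first_pass = eligible_classifiers(model)
second_pass = eligible_classifiers(model, known_present={"verruca"})
```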

In the identification process (described below), if the exist test of a component is chosen as the "best" one and the user answers that the component is indeed present, the classifier's selection function is recursively called on the sub-tree of the descriptive model.
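This recursive re-invocation on a confirmed component could be sketched as below; the data structures are hypothetical, and the `answer` callback stands in for the user's replies during identification.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    optional: bool = False
    attributes: list = field(default_factory=list)   # (attr_name, domain) pairs
    children: list = field(default_factory=list)

def identify(node, answer, known_present=None):
    """Ask exist tests for optional components; on a 'Present' answer,
    recursively apply the selection to that component's sub-tree.
    Returns the trace of questions asked (a simplification for the sketch)."""
    known_present = set() if known_present is None else known_present
    asked = []
    for child in node.children:
        if child.optional and child.name not in known_present:
            asked.append(("exist", child.name))
            if answer(("exist", child.name)) == "Present":
                known_present.add(child.name)
                asked += identify(child, answer, known_present)  # recurse
        else:
            asked += [("attr", child.name, a) for a, _ in child.attributes]
            asked += identify(child, answer, known_present)
    return asked

model = Node("colony", children=[
    Node("verruca", optional=True,
         children=[Node("calice",
                        attributes=[("shape", ["round", "oval"])])])])

# If the user confirms the verrucae are present, the calice sub-tree
# becomes eligible and its attribute is asked in turn; an 'Absent'
# answer stops the descent at the exist test.
trace_present = identify(model, lambda q: "Present")
trace_absent = identify(model, lambda q: "Absent")
```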