Classification by Decision Tree Induction

Scalable Decision Tree Induction


•Partition the data into subsets and build a decision tree for each subset?
•SLIQ (EDBT’96 — Mehta et al.)
–builds an index for each attribute and only the class list and the current attribute list reside in memory
•SPRINT (VLDB’96 — J. Shafer et al.)
–constructs an attribute list data structure
•PUBLIC (VLDB’98 — Rastogi & Shim)
–integrates tree splitting and tree pruning: stops growing branches that would be pruned anyway
•RainForest  (VLDB’98 — Gehrke, Ramakrishnan & Ganti)
–separates the scalability aspects from the criteria that determine the quality of the tree
–builds an AVC-list (attribute, value, class label)
•Gini Index (IBM IntelligentMiner)
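As an illustrative sketch only (not code from any of the cited systems), the following Python shows how a RainForest-style AVC-set — per-attribute-value class counts — can be built in one scan and then used to evaluate a candidate split with the Gini index; the names `build_avc_set`, `gini`, and `gini_split` are chosen here for illustration.

```python
from collections import Counter, defaultdict

def build_avc_set(records, attribute, class_key="class"):
    """RainForest-style AVC-set for one attribute:
    maps each attribute value to its class-label counts.
    Built in a single pass over the records."""
    avc = defaultdict(Counter)
    for rec in records:
        avc[rec[attribute]][rec[class_key]] += 1
    return dict(avc)

def gini(counts):
    """Gini index of a class-count distribution: 1 - sum(p_i^2)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def gini_split(avc):
    """Weighted Gini index of splitting on this AVC-set's attribute:
    each branch's Gini, weighted by its share of the records."""
    total = sum(sum(c.values()) for c in avc.values())
    return sum(sum(c.values()) / total * gini(c) for c in avc.values())
```

The point of the AVC-set is that split quality depends only on these compact counts, not on the raw records — which is how RainForest separates scalability from the split criterion.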


Data Cube-Based Decision-Tree Induction


•Integration of generalization with decision-tree induction (Kamber et al. ’97).
•Classification at primitive concept levels
–E.g., precise temperature, humidity, outlook, etc.
–Low-level concepts, scattered classes, bushy classification trees
–Semantic interpretation problems.
•Cube-based multi-level classification
–Relevance analysis at multiple levels.
–Information-gain analysis with dimension and level.
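A minimal sketch of the multi-level idea, under assumed names (`entropy`, `info_gain`, `generalize`, and the `hierarchy` mapping are illustrative, not from the cited work): information gain is computed once on precise attribute values and once after raising values to a higher concept level via a concept hierarchy.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def info_gain(records, attribute, class_key="class"):
    """Information gain of splitting the records on one attribute."""
    labels = [r[class_key] for r in records]
    groups = defaultdict(list)
    for r in records:
        groups[r[attribute]].append(r[class_key])
    remainder = sum(len(g) / len(records) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder

def generalize(records, attribute, hierarchy):
    """Replace low-level attribute values by their higher-level
    concept (e.g. precise temperature -> hot/mild/cool)."""
    return [{**r, attribute: hierarchy.get(r[attribute], r[attribute])}
            for r in records]
```

Comparing `info_gain` before and after `generalize` shows the trade-off the slide alludes to: precise values give a high apparent gain but scattered classes and bushy trees, while generalized levels give fewer, more interpretable branches.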


Presentation of Classification Results



Classification and Prediction by V. Vanthana