标题:
多值决策表的最小决策树生成Minimal Decision Tree Generation for Multi-Label Decision Tables
作者:
乔莹, 许美玲, 钟发荣, 曾静, 莫毓昌
关键字:
多值决策表, 决策树, 动态规划算法Multi-Label Decision Tables, Decision Trees, Dynamic Programming Algorithm
期刊名称:
《Computer Science and Application》, Vol.6 No.10, 2016-10-28
摘要:
决策树技术在数据挖掘的分类领域应用极其广泛,可以从普通决策表(每行记录包含一个决策值)中挖掘有价值的信息,但是要从多值决策表(每行记录包含多个决策值)中挖掘潜在的信息则比较困难。多值决策表中每行记录包含多个决策值,多个决策属性用一个集合表示。针对已有的启发式算法,如贪心算法,由于性能不稳定的特点,该算法获得的决策树规模变化较大,本文基于动态规划的思想,提出了使决策树规模最小化的算法。该算法将多值决策表分解为多个子表,通过多值决策表的子表进行构造最小决策树,进而对多值决策表进行数据挖掘。
Decision tree is a widely used classification in data mining. It can discover the essential knowledge from the common decision tables (each row has a decision). However, it is difficult to do data mining from the multi-label decision tables (each row has a set of decisions). In a multi-label decision tables, each row contains several decisions, and several decision attributes are represented using a set. By testing the existing heuristic algorithms, such as greedy algorithms, their performance is not stable, i.e., the size of the decision tree might become very large. In this paper, we propose a dynamic programming algorithm to minimize the size of the decision trees for a multi- label decision table. In our algorithm, the multi-label decision table is divided into several subtables, and the decision tree is constructed by using all subtables of the multi-label decision table, then useful information can be discovered from the multi-label decision tables.