A greedy algorithm for construction of decision trees for tables with many-valued decisions - A comparative study
KAUST DepartmentApplied Mathematics and Computational Science Program
Computational Bioscience Research Center (CBRC)
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Extensions of Dynamic Programming, Machine Learning and Discrete Optimization Research Group
Office of the VP
Permanent link to this recordhttp://hdl.handle.net/10754/564819
MetadataShow full item record
AbstractIn the paper, we study a greedy algorithm for construction of decision trees. This algorithm is applicable to decision tables with many-valued decisions where each row is labeled with a set of decisions. For a given row, we should find a decision from the set attached to this row. Experimental results for data sets from UCI Machine Learning Repository and randomly generated tables are presented. We make a comparative study of the depth and average depth of the constructed decision trees for proposed approach and approach based on generalized decision. The obtained results show that the proposed approach can be useful from the point of view of knowledge representation and algorithm construction.
Showing items related by title, author, creator and subject.
Decision and Inhibitory Trees for Decision Tables with Many-Valued DecisionsAzad, Mohammad (2018-06-06) [Dissertation]
Advisor: Moshkov, Mikhail
Committee members: Bajic, Vladimir B.; Zhang, Xiangliang; Boros, EndreDecision trees are one of the most commonly used tools in decision analysis, knowledge representation, machine learning, etc., for its simplicity and interpretability. We consider an extension of dynamic programming approach to process the whole set of decision trees for the given decision table which was previously only attainable by brute-force algorithms. We study decision tables with many-valued decisions (each row may contain multiple decisions) because they are more reasonable models of data in many cases. To address this problem in a broad sense, we consider not only decision trees but also inhibitory trees where terminal nodes are labeled with “̸= decision”. Inhibitory trees can sometimes describe more knowledge from datasets than decision trees. As for cost functions, we consider depth or average depth to minimize time complexity of trees, and the number of nodes or the number of the terminal, or nonterminal nodes to minimize the space complexity of trees. We investigate the multi-stage optimization of trees relative to some cost functions, and also the possibility to describe the whole set of strictly optimal trees. Furthermore, we study the bi-criteria optimization cost vs. cost and cost vs. uncertainty for decision trees, and cost vs. cost and cost vs. completeness for inhibitory trees. The most interesting application of the developed technique is the creation of multi-pruning and restricted multi-pruning approaches which are useful for knowledge representation and prediction. The experimental results show that decision trees constructed by these approaches can often outperform the decision trees constructed by the CART algorithm. Another application includes the comparison of 12 greedy heuristics for single- and bi-criteria optimization (cost vs. cost) of trees. We also study the three approaches (decision tables with many-valued decisions, decision tables with most common decisions, and decision tables with generalized decisions) to handle inconsistency of decision tables. We also analyze the time complexity of decision and inhibitory trees over arbitrary sets of attributes represented by information systems in the frameworks of local (when we can use in trees only attributes from problem description) and global (when we can use in trees arbitrary attributes from the information system) approaches.
Decision and Inhibitory Rule Optimization for Decision Tables with Many-valued DecisionsAlsolami, Fawaz (2016-04-25) [Dissertation]
Advisor: Moshkov, Mikhail
Committee members: Bajic, Vladimir B.; Suraj, Zbigniew; Zhang, Xiangliang‘If-then’ rule sets are one of the most expressive and human-readable knowledge representations. This thesis deals with optimization and analysis of decision and inhibitory rules for decision tables with many-valued decisions. The most important areas of applications are knowledge extraction and representation. The benefit of considering inhibitory rules is connected with the fact that in some situations they can describe more knowledge than the decision ones. Decision tables with many-valued decisions arise in combinatorial optimization, computational geometry, fault diagnosis, and especially under the processing of data sets. In this thesis, various examples of real-life problems are considered which help to understand the motivation of the investigation. We extend relatively simple results obtained earlier for decision rules over decision tables with many-valued decisions to the case of inhibitory rules. The behavior of Shannon functions (which characterize complexity of rule systems) is studied for finite and infinite information systems, for global and local approaches, and for decision and inhibitory rules. The extensions of dynamic programming for the study of decision rules over decision tables with single-valued decisions are generalized to the case of decision tables with many-valued decisions. These results are also extended to the case of inhibitory rules. As a result, we have algorithms (i) for multi-stage optimization of rules relative to such criteria as length or coverage, (ii) for counting the number of optimal rules, (iii) for construction of Pareto optimal points for bi-criteria optimization problems, (iv) for construction of graphs describing relationships between two cost functions, and (v) for construction of graphs describing relationships between cost and accuracy of rules. The applications of created tools include comparison (based on information about Pareto optimal points) of greedy heuristics for bi-criteria optimization of rules, and construction (based on multi-stage optimization of rules) of relatively short systems of rules that can be used for knowledge representation.
Decision rules for decision tables with many-valued decisionsChikalov, Igor; Zielosko, Beata (Rough Sets and Knowledge Technology, Springer Nature, 2011) [Conference Paper]In the paper, authors presents a greedy algorithm for construction of exact and partial decision rules for decision tables with many-valued decisions. Exact decision rules can be 'over-fitted', so instead of exact decision rules with many attributes, it is more appropriate to work with partial decision rules with smaller number of attributes. Based on results for set cover problem authors study bounds on accuracy of greedy algorithm for exact and partial decision rule construction, and complexity of the problem of minimization of decision rule length. © 2011 Springer-Verlag.