A greedy algorithm for construction of decision trees for tables with many-valued decisions - A comparative study
KAUST Department
Applied Mathematics and Computational Science Program
Computational Bioscience Research Center (CBRC)
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Extensions of Dynamic Programming, Machine Learning and Discrete Optimization Research Group
Office of the VP
Permanent link to this record
http://hdl.handle.net/10754/564819
Abstract
In the paper, we study a greedy algorithm for the construction of decision trees. This algorithm is applicable to decision tables with many-valued decisions, where each row is labeled with a set of decisions. For a given row, we should find a decision from the set attached to this row. Experimental results for data sets from the UCI Machine Learning Repository and for randomly generated tables are presented. We make a comparative study of the depth and average depth of the decision trees constructed by the proposed approach and by the approach based on the generalized decision. The obtained results show that the proposed approach can be useful from the point of view of knowledge representation and algorithm construction.
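The setting described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the abstract does not state which greedy heuristic is used, so the `uncertainty` measure below (rows not covered by the most frequent decision) is an assumption, and all function names and the toy table are hypothetical. The sketch builds one tree for the many-valued table and one for the same table under the generalized-decision approach, where each decision set is treated as a single label, and compares their depth.

```python
from collections import Counter

# A row is (attribute_values, decision_set). Names and data are illustrative.

def common_decision(rows):
    """Return a decision contained in every row's decision set, or None."""
    shared = set.intersection(*(set(decisions) for _, decisions in rows))
    return min(shared) if shared else None

def uncertainty(rows):
    """Assumed impurity: number of rows not covered by the most frequent decision."""
    counts = Counter(d for _, decisions in rows for d in decisions)
    return len(rows) - max(counts.values())

def split(rows, attr):
    """Partition rows by the value of attribute index attr."""
    parts = {}
    for values, decisions in rows:
        parts.setdefault(values[attr], []).append((values, decisions))
    return parts

def build_tree(rows, attributes):
    """Greedy construction: stop when one decision fits all rows, else split."""
    decision = common_decision(rows)
    if decision is not None:
        return decision  # terminal node
    candidates = [a for a in attributes if len(split(rows, a)) > 1]
    if not candidates:
        raise ValueError("rows with equal attribute values share no decision")
    # Choose the attribute whose split leaves the least total uncertainty.
    best = min(candidates,
               key=lambda a: sum(uncertainty(p) for p in split(rows, a).values()))
    return (best, {v: build_tree(p, attributes)
                   for v, p in split(rows, best).items()})

def depth(tree):
    """Depth of a tree; terminal nodes (plain decisions) have depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(sub) for sub in tree[1].values())

# Toy table with two binary attributes and a set of decisions per row.
table = [
    ((0, 0), {1, 2}),
    ((0, 1), {2}),
    ((1, 0), {2, 3}),
    ((1, 1), {3}),
]

many_valued_tree = build_tree(table, [0, 1])

# Generalized-decision approach: each whole decision set becomes one label.
generalized = [(values, {frozenset(ds)}) for values, ds in table]
generalized_tree = build_tree(generalized, [0, 1])

print(depth(many_valued_tree), depth(generalized_tree))
```

On this toy table the many-valued tree needs a single attribute test, while the generalized-decision tree must separate all four distinct labels and is deeper. This only illustrates why the two approaches can differ in depth; it says nothing about the experimental results reported in the paper.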
Showing items related by title, author, creator and subject.
Decision and Inhibitory Trees for Decision Tables with Many-Valued Decisions
Azad, Mohammad (2018-06-06) [Dissertation]
Advisor: Moshkov, Mikhail
Committee members: Bajic, Vladimir B.; Zhang, Xiangliang; Boros, Endre
Decision trees are among the most commonly used tools in decision analysis, knowledge representation, machine learning, etc., because of their simplicity and interpretability. We consider an extension of the dynamic programming approach that processes the whole set of decision trees for a given decision table, which was previously attainable only by brute-force algorithms. We study decision tables with many-valued decisions (each row may contain multiple decisions) because they are more reasonable models of data in many cases. To address this problem in a broad sense, we consider not only decision trees but also inhibitory trees, in which terminal nodes are labeled with “≠ decision”. Inhibitory trees can sometimes describe more knowledge from datasets than decision trees. As cost functions, we consider the depth or average depth to minimize the time complexity of trees, and the number of nodes or the number of terminal or nonterminal nodes to minimize the space complexity of trees. We investigate the multi-stage optimization of trees relative to some cost functions, and also the possibility of describing the whole set of strictly optimal trees. Furthermore, we study the bi-criteria optimization cost vs. cost and cost vs. uncertainty for decision trees, and cost vs. cost and cost vs. completeness for inhibitory trees. The most interesting application of the developed technique is the creation of multi-pruning and restricted multi-pruning approaches, which are useful for knowledge representation and prediction. The experimental results show that decision trees constructed by these approaches can often outperform the decision trees constructed by the CART algorithm. Another application is the comparison of 12 greedy heuristics for single- and bi-criteria optimization (cost vs. cost) of trees.
We also study three approaches to handling inconsistency of decision tables: decision tables with many-valued decisions, decision tables with most common decisions, and decision tables with generalized decisions. We also analyze the time complexity of decision and inhibitory trees over arbitrary sets of attributes represented by information systems in the frameworks of the local approach (when trees can use only attributes from the problem description) and the global approach (when trees can use arbitrary attributes from the information system).
Decision rules for decision tables with many-valued decisions
Chikalov, Igor; Zielosko, Beata (Rough Sets and Knowledge Technology, Springer Nature, 2011) [Conference Paper]
In the paper, the authors present a greedy algorithm for the construction of exact and partial decision rules for decision tables with many-valued decisions. Exact decision rules can be 'over-fitted', so instead of exact decision rules with many attributes, it is more appropriate to work with partial decision rules with a smaller number of attributes. Based on results for the set cover problem, the authors study bounds on the accuracy of the greedy algorithm for exact and partial decision rule construction, and the complexity of the problem of minimizing decision rule length. © 2011 Springer-Verlag.
Optimization of decision rule complexity for decision tables with many-valued decisions
Azad, Mohammad; Chikalov, Igor; Moshkov, Mikhail (2013 IEEE International Conference on Systems, Man, and Cybernetics, Institute of Electrical and Electronics Engineers (IEEE), 2013-10) [Conference Paper]
We describe new heuristics for constructing decision rules for decision tables with many-valued decisions that are good enough from the point of view of length and coverage. We use a statistical test to find the leaders among the heuristics. After that, we compare our results with the optimal results obtained by dynamic programming algorithms. The average percentage of relative difference between the length (coverage) of constructed and optimal rules is at most 6.89% (15.89%, respectively) for the leaders, which seems to be a promising result. © 2013 IEEE.