Total Path Length and Number of Terminal Nodes for Decision Trees

Type
Conference Paper

Authors
Hussain, Shahid

KAUST Department
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Online Publication Date
2014-09-13

Print Publication Date
2014

Date
2014-09-13

Abstract
This paper presents a new tool for studying the relationships between total path length (average depth) and number of terminal nodes for decision trees. These relationships are important for the optimization of decision trees. For total path length and number of terminal nodes in particular, the relationships between the two cost functions are closely tied to the space-time trade-off. In addition to an algorithm for computing the relationships, the paper presents results of experiments with datasets from the UCI ML Repository. These experiments show how the two cost functions behave for a given decision table, and the resulting plots show the Pareto frontier, that is, the set of Pareto-optimal points. Furthermore, in some cases the Pareto frontier is a singleton, which shows that a totally optimal decision tree, one that minimizes both cost functions at once, exists for the given decision table.

Citation
Total Path Length and Number of Terminal Nodes for Decision Trees. Procedia Computer Science, 2014, 35:514.

Publisher
Elsevier BV

Journal
Procedia Computer Science

Conference/Event Name
International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, KES 2014

DOI
10.1016/j.procs.2014.08.132

Additional Links
http://linkinghub.elsevier.com/retrieve/pii/S1877050914010977

Permanent link to this record