Selected Data Mining Tools for Data Analysis in Distributed Environment
Type
Article
KAUST Department
Applied Mathematics and Computational Science Program
Computational Bioscience Research Center (CBRC)
Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
Computer, Electrical and Mathematical Sciences and Engineering Division and Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
Extensions of Dynamic Programming, Machine Learning and Discrete Optimization Research Group
Date
2022-10-01
Permanent link to this record
http://hdl.handle.net/10754/682316
Abstract
In this paper, we deal with distributed data represented either as a finite set T of decision tables with equal sets of attributes or as a finite set I of information systems with equal sets of attributes. In the former case, we discuss a way to study the decision trees common to all tables from the set T: building a decision table for which the set of decision trees coincides with the set of decision trees common to all tables from T. We show when such a decision table can be built and how to build it in polynomial time. Given such a table, we can apply various decision tree learning algorithms to it. We extend the considered approach to the study of tests (reducts) and decision rules common to all tables from T. In the latter case, we discuss a way to study the association rules common to all information systems from the set I: building a joint information system for which the set of true association rules that are realizable for a given row ρ and have a given attribute a on the right-hand side coincides with the set of association rules that are true for all information systems from I, have the attribute a on the right-hand side, and are realizable for the row ρ. We then show how to build such a joint information system in polynomial time. Once it is built, we can apply various association rule learning algorithms to it.
Citation
Moshkov, M., Zielosko, B., & Tetteh, E. T. (2022). Selected Data Mining Tools for Data Analysis in Distributed Environment. Entropy, 24(10), 1401. https://doi.org/10.3390/e24101401
Sponsors
Research funded by King Abdullah University of Science and Technology
Publisher
MDPI AG
Journal
Entropy
Additional Links
https://www.mdpi.com/1099-4300/24/10/1401
DOI
10.3390/e24101401
Except where otherwise noted, this item's license is described as Archived with thanks to Entropy under a Creative Commons license, details at: https://creativecommons.org/licenses/by/4.0/