An Up-to-Date Comparison of State-of-the-Art Classification Algorithms

Handle URI:
http://hdl.handle.net/10754/623791
Title:
An Up-to-Date Comparison of State-of-the-Art Classification Algorithms
Authors:
Zhang, Chongsheng; Liu, Changchang; Zhang, Xiangliang (0000-0002-3574-5665); Almpanidis, George
Abstract:
Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on the number of classes and features, and the CPU running time, are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at the UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top five alongside GBDT, RF, SVM, and C4.5, but its performance varies widely across data sets. Unsurprisingly, top accuracy performers have average or slow training times. DL is the worst performer in terms of accuracy but the second fastest in prediction efficiency. SRC shows good accuracy, but it is the slowest classifier in both training and testing.
KAUST Department:
King Abdullah University of Science & Technology, Thuwal 23955-6900, Saudi Arabia
Citation:
Zhang C, Liu C, Zhang X, Almpanidis G (2017) An up-to-date comparison of state-of-the-art classification algorithms. Expert Systems with Applications 82: 128–150. Available: http://dx.doi.org/10.1016/j.eswa.2017.04.003.
Publisher:
Elsevier BV
Journal:
Expert Systems with Applications
Issue Date:
5-Apr-2017
DOI:
10.1016/j.eswa.2017.04.003
Type:
Article
ISSN:
0957-4174
Sponsors:
This work is partially funded by the National Science Foundation of China (NSFC) under Grant no. 41401466 and 61300215, as well as Henan Science and Technology Project under Grant no. 132102210188. It is also supported by Henan University under Grant no. xxjc20140005 and 2013YBZR014. The authors acknowledge the help of Ms. Jingjun Bi on reorganising the experimental results.
Additional Links:
http://www.sciencedirect.com/science/article/pii/S0957417417302397
Appears in Collections:
Articles

Full metadata record

DC Field | Value | Language
dc.contributor.author | Zhang, Chongsheng | en
dc.contributor.author | Liu, Changchang | en
dc.contributor.author | Zhang, Xiangliang | en
dc.contributor.author | Almpanidis, George | en
dc.date.accessioned | 2017-05-31T11:23:05Z | -
dc.date.available | 2017-05-31T11:23:05Z | -
dc.date.issued | 2017-04-05 | en
dc.identifier.citation | Zhang C, Liu C, Zhang X, Almpanidis G (2017) An up-to-date comparison of state-of-the-art classification algorithms. Expert Systems with Applications 82: 128–150. Available: http://dx.doi.org/10.1016/j.eswa.2017.04.003. | en
dc.identifier.issn | 0957-4174 | en
dc.identifier.doi | 10.1016/j.eswa.2017.04.003 | en
dc.identifier.uri | http://hdl.handle.net/10754/623791 | -
dc.description.abstract | Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on the number of classes and features, and the CPU running time, are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at the UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top five alongside GBDT, RF, SVM, and C4.5, but its performance varies widely across data sets. Unsurprisingly, top accuracy performers have average or slow training times. DL is the worst performer in terms of accuracy but the second fastest in prediction efficiency. SRC shows good accuracy, but it is the slowest classifier in both training and testing. | en
dc.description.sponsorship | This work is partially funded by the National Science Foundation of China (NSFC) under Grant no. 41401466 and 61300215, as well as Henan Science and Technology Project under Grant no. 132102210188. It is also supported by Henan University under Grant no. xxjc20140005 and 2013YBZR014. The authors acknowledge the help of Ms. Jingjun Bi on reorganising the experimental results. | en
dc.publisher | Elsevier BV | en
dc.relation.url | http://www.sciencedirect.com/science/article/pii/S0957417417302397 | en
dc.subject | Classification benchmarking | en
dc.subject | Classifier comparison | en
dc.subject | Classifier evaluation | en
dc.title | An Up-to-Date Comparison of State-of-the-Art Classification Algorithms | en
dc.type | Article | en
dc.contributor.department | King Abdullah University of Science & Technology, Thuwal 23955-6900, Saudi Arabia | en
dc.identifier.journal | Expert Systems with Applications | en
dc.contributor.institution | School of Computer and Information Engineering, Henan University, KaiFeng, 475001, China | en
kaust.author | Zhang, Xiangliang | en