Data Stream Clustering With Affinity Propagation

Handle URI:
http://hdl.handle.net/10754/556655
Title:
Data Stream Clustering With Affinity Propagation
Authors:
Zhang, Xiangliang (0000-0002-3574-5665); Furtlehner, Cyril; Germain-Renaud, Cecile; Sebag, Michele
Abstract:
Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Citation:
Data Stream Clustering With Affinity Propagation 2014, 26 (7):1644 IEEE Transactions on Knowledge and Data Engineering
Journal:
IEEE Transactions on Knowledge and Data Engineering
Issue Date:
9-Jul-2014
DOI:
10.1109/TKDE.2013.146
Type:
Article
ISSN:
1041-4347
Additional Links:
http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6585253
Appears in Collections:
Articles; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
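
Illustrative sketch:
The abstract describes rebuilding the clustering model whenever a statistical change point detection test signals a change. A minimal sketch of that control loop follows. It rests on several assumptions not stated in this record: a Page-Hinkley test is used as the detector, the monitored stream is some numeric signal (for example, each item's distance to its nearest exemplar), and the names PageHinkley, monitor, delta, threshold, and rebuild are hypothetical. The clustering step itself is left as a placeholder callback.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    Assumption: chosen for illustration; the record above does not
    name the change point detection test.
    """

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated drift magnitude
        self.threshold = threshold  # alarm level
        self.reset()

    def reset(self):
        self.n = 0          # number of values seen since last reset
        self.mean = 0.0     # running mean of the values
        self.cum = 0.0      # cumulative sum of (x - mean - delta)
        self.cum_min = 0.0  # smallest cumulative sum seen so far

    def update(self, x):
        """Feed one value; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def monitor(stream, rebuild, detector=None):
    """Call rebuild(index) each time the detector signals a change."""
    detector = detector or PageHinkley()
    change_points = []
    for i, x in enumerate(stream):
        if detector.update(x):
            change_points.append(i)
            rebuild(i)        # placeholder: re-run the clustering here
            detector.reset()  # start a fresh test on the new regime
    return change_points
```

In use, rebuild would be replaced by whatever routine recomputes the exemplars; delta and threshold trade off detection delay against false alarms and would need tuning for a given stream.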

Full metadata record

DC Field | Value | Language
dc.contributor.author | Zhang, Xiangliang | en
dc.contributor.author | Furtlehner, Cyril | en
dc.contributor.author | Germain-Renaud, Cecile | en
dc.contributor.author | Sebag, Michele | en
dc.date.accessioned | 2015-06-10T11:43:01Z | en
dc.date.available | 2015-06-10T11:43:01Z | en
dc.date.issued | 2014-07-09 | en
dc.identifier.citation | Data Stream Clustering With Affinity Propagation 2014, 26 (7):1644 IEEE Transactions on Knowledge and Data Engineering | en
dc.identifier.issn | 1041-4347 | en
dc.identifier.doi | 10.1109/TKDE.2013.146 | en
dc.identifier.uri | http://hdl.handle.net/10754/556655 | en
dc.description.abstract | Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid. | en
dc.relation.url | http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6585253 | en
dc.rights | (c) 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. | en
dc.subject | Affinity Propagation | en
dc.subject | Data Stream Clustering | en
dc.subject | Streaming data clustering | en
dc.subject | affinity propagation | en
dc.subject | autonomic computing | en
dc.subject | grid monitoring | en
dc.title | Data Stream Clustering With Affinity Propagation | en
dc.type | Article | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.identifier.journal | IEEE Transactions on Knowledge and Data Engineering | en
dc.eprint.version | Post-print | en
dc.contributor.institution | TAO-INRIA, CNRS, University of Paris-Sud 11, Orsay, France | en
kaust.author | Zhang, Xiangliang | en