Self-adaptive change detection in streaming data with non-stationary distribution

Handle URI:
http://hdl.handle.net/10754/564264
Title:
Self-adaptive change detection in streaming data with non-stationary distribution
Authors:
Zhang, Xiangliang (0000-0002-3574-5665); Wang, Wei
Abstract:
A non-stationary distribution, in which the data distribution evolves over time, is a common issue in many application fields, e.g., intrusion detection and grid computing. Detecting changes in massive streaming data with a non-stationary distribution helps to raise alarms on anomalies, to clean out noise, and to report new patterns. In this paper, we employ a novel approach for detecting changes in streaming data, with the purpose of improving the quality of data stream modeling. By observing the outliers, this change detection approach uses a weighted standard deviation to monitor the evolution of the distribution of data streams. A cumulative statistical test, Page-Hinkley, is employed to collect evidence of changes in distribution. The parameter used for reporting changes is self-adaptively adjusted according to the distribution of the data streams, rather than set to a fixed empirical value. The self-adaptability of the approach enhances the effectiveness of data stream modeling by catching changes in distribution in a timely manner. We validated the approach on an online clustering framework with the benchmark KDDcup 1999 intrusion detection data set as well as with a real-world grid data set. The validation results demonstrate its better performance, achieving higher accuracy and a lower percentage of outliers compared with other change detection approaches. © 2010 Springer-Verlag.
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science Program; Machine Intelligence & kNowledge Engineering Lab
Publisher:
Springer Science + Business Media
Journal:
Advanced Data Mining and Applications
Conference/Event name:
6th International Conference on Advanced Data Mining and Applications, ADMA 2010
Issue Date:
2010
DOI:
10.1007/978-3-642-17316-5_33
Type:
Conference Paper
ISSN:
03029743
ISBN:
3642173152; 9783642173158
Appears in Collections:
Conference Papers; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
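
The abstract names the Page-Hinkley test as the cumulative statistic that collects evidence of a change. As a rough illustration only, the sketch below implements the classical one-sided Page-Hinkley test for an upward shift in the mean of a stream. It is not the authors' method: the threshold here is a fixed value, whereas the paper adjusts it self-adaptively, and the monitored input here is the raw value rather than the weighted standard deviation described in the abstract. The names `PageHinkley`, `delta`, `threshold`, and `first_alarm`, and the default parameter values, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Classical one-sided Page-Hinkley test (detects an increase in the mean)."""
    delta: float = 0.005    # tolerated drift magnitude (assumed value)
    threshold: float = 5.0  # fixed alarm threshold (assumed; the paper adapts it)
    n: int = 0              # number of observations seen so far
    mean: float = 0.0       # running mean of the observations
    cum: float = 0.0        # cumulative deviation m_t
    cum_min: float = 0.0    # running minimum M_t of the cumulative deviation

    def update(self, x: float) -> bool:
        """Consume one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        # Alarm when the statistic PH_t = m_t - M_t exceeds the threshold.
        return self.cum - self.cum_min > self.threshold


def first_alarm(stream, detector):
    """Return the 0-based index of the first alarm, or None if none is raised."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Synthetic stream: the mean jumps from 0 to 1 at index 100.
    data = [0.0] * 100 + [1.0] * 100
    print(first_alarm(data, PageHinkley()))
```

A smaller `threshold` makes the detector react sooner at the cost of more false alarms, which is the trade-off a self-adaptive setting is meant to manage.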

Full metadata record

DC Field | Value | Language
dc.contributor.author | Zhang, Xiangliang | en
dc.contributor.author | Wang, Wei | en
dc.date.accessioned | 2015-08-04T06:21:15Z | en
dc.date.available | 2015-08-04T06:21:15Z | en
dc.date.issued | 2010 | en
dc.identifier.isbn | 3642173152; 9783642173158 | en
dc.identifier.issn | 03029743 | en
dc.identifier.doi | 10.1007/978-3-642-17316-5_33 | en
dc.identifier.uri | http://hdl.handle.net/10754/564264 | en
dc.description.abstract | A non-stationary distribution, in which the data distribution evolves over time, is a common issue in many application fields, e.g., intrusion detection and grid computing. Detecting changes in massive streaming data with a non-stationary distribution helps to raise alarms on anomalies, to clean out noise, and to report new patterns. In this paper, we employ a novel approach for detecting changes in streaming data, with the purpose of improving the quality of data stream modeling. By observing the outliers, this change detection approach uses a weighted standard deviation to monitor the evolution of the distribution of data streams. A cumulative statistical test, Page-Hinkley, is employed to collect evidence of changes in distribution. The parameter used for reporting changes is self-adaptively adjusted according to the distribution of the data streams, rather than set to a fixed empirical value. The self-adaptability of the approach enhances the effectiveness of data stream modeling by catching changes in distribution in a timely manner. We validated the approach on an online clustering framework with the benchmark KDDcup 1999 intrusion detection data set as well as with a real-world grid data set. The validation results demonstrate its better performance, achieving higher accuracy and a lower percentage of outliers compared with other change detection approaches. © 2010 Springer-Verlag. | en
dc.publisher | Springer Science + Business Media | en
dc.subject | Change detection | en
dc.subject | Data stream | en
dc.subject | Non-stationary distribution | en
dc.subject | Self-adaptive parameter setting | en
dc.title | Self-adaptive change detection in streaming data with non-stationary distribution | en
dc.type | Conference Paper | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Computer Science Program | en
dc.contributor.department | Machine Intelligence & kNowledge Engineering Lab | en
dc.identifier.journal | Advanced Data Mining and Applications | en
dc.conference.date | 19 November 2010 through 21 November 2010 | en
dc.conference.name | 6th International Conference on Advanced Data Mining and Applications, ADMA 2010 | en
dc.conference.location | Chongqing | en
dc.contributor.institution | Interdisciplinary Centre for Security, Reliability and Trust (SnT Centre), University of Luxembourg, Luxembourg, Luxembourg | en
kaust.author | Zhang, Xiangliang | en