Efficient Estimation of Dynamic Density Functions with Applications in Data Streams
KAUST Department: Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Online Publication Date: 2018-07-29
Print Publication Date: 2019
Permanent link to this record: http://hdl.handle.net/10754/628248
Abstract: Recently, many applications such as network monitoring, traffic management, and environmental studies generate huge amounts of data that cannot fit in computer memory. Data from such applications arrive continuously in the form of streams. The main challenges in mining data streams are the high speed and the large volume of the arriving data. A typical solution is to learn a model that fits in memory. However, the underlying distributions of the streaming data change over time in unpredictable ways. The learned models should therefore be updated continuously and rely more on the most recent data in the streams.

In this chapter, we present an online density estimator that builds a model called KDE-Track for characterizing the dynamic density of data streams. KDE-Track summarizes the distribution of a data stream by estimating the Probability Density Function (PDF) of the stream at a set of resampling points. KDE-Track is shown to be more accurate (as reflected by smaller error values) and more computationally efficient (as reflected by shorter running time) than existing density estimation techniques. We demonstrate the usefulness of KDE-Track in visualizing the dynamic density of data streams and in change detection.
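The general idea of keeping a density estimate at a fixed set of points and weighting recent data more heavily can be illustrated with a small sketch. This is not the KDE-Track algorithm from the chapter: the class name `OnlineGridKDE`, the Gaussian kernel, the fixed bandwidth, the evenly spaced grid, and the exponential forgetting factor `decay` are all assumptions chosen for illustration.

```python
import math


class OnlineGridKDE:
    """Toy online density estimator on a fixed grid (illustration only).

    Hypothetical helper, not the authors' KDE-Track implementation.
    """

    def __init__(self, lo, hi, num_points, bandwidth, decay=0.01):
        self.step = (hi - lo) / (num_points - 1)
        # Points at which the density is maintained.
        self.grid = [lo + i * self.step for i in range(num_points)]
        self.h = bandwidth
        self.decay = decay  # assumed forgetting factor in (0, 1]
        self.density = [0.0] * num_points
        self.n_seen = 0

    def update(self, x):
        """Fold one new stream value into the estimate in O(num_points)."""
        self.n_seen += 1
        # Plain running mean at first, then exponential forgetting so
        # that recent samples dominate the estimate.
        w = max(1.0 / self.n_seen, self.decay)
        norm = 1.0 / (self.h * math.sqrt(2.0 * math.pi))
        for i, g in enumerate(self.grid):
            u = (g - x) / self.h
            kernel = norm * math.exp(-0.5 * u * u)
            self.density[i] = (1.0 - w) * self.density[i] + w * kernel

    def pdf(self, x):
        """Estimate the density at x by linear interpolation on the grid."""
        if x <= self.grid[0]:
            return self.density[0]
        if x >= self.grid[-1]:
            return self.density[-1]
        i = min(int((x - self.grid[0]) / self.step), len(self.grid) - 2)
        t = (x - self.grid[i]) / self.step
        return (1.0 - t) * self.density[i] + t * self.density[i + 1]
```

Memory use depends only on the number of grid points, not on the length of the stream. If the stream's values move from around 0 to around 4, the estimate near 0 fades at a rate set by `decay` while the estimate near 4 grows.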
Citation: Qahtan A, Wang S, Zhang X (2018) Efficient Estimation of Dynamic Density Functions with Applications in Data Streams. Learning from Data Streams in Evolving Environments: 247–278. Available: http://dx.doi.org/10.1007/978-3-319-89803-2_11.