KAUST Department: Computer Science
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Permanent link to this record: http://hdl.handle.net/10754/656131
Abstract: Geo-textual data that contain spatial, textual, and temporal information are being generated at a very high rate. These geo-textual data cover a wide range of topics. Users may be interested in receiving local popular topics from geo-textual messages. We study the cluster-based subscription matching (CSM) problem. Given a stream of geo-textual messages, we maintain up-to-date clustering results using a threshold-based online clustering algorithm. Based on the clustering results, we feed subscribers with their preferred geo-textual message clusters according to their specified keywords and location. Moreover, we summarize each cluster by selecting a set of representative messages. The CSM problem considers spatial proximity, textual relevance, and message freshness during the clustering, cluster feeding, and summarization processes. To solve the CSM problem, we propose a novel solution to cluster, feed, and summarize a stream of geo-textual messages efficiently. We evaluate the efficiency of our solution on two real-world datasets, and the experimental results demonstrate that our solution achieves high efficiency compared with the baselines.
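The abstract does not specify the clustering algorithm in detail, so the following is only a minimal sketch of what a threshold-based online clustering step could look like, not the authors' method. The scoring function (a weighted sum of spatial proximity and Jaccard term overlap, scaled by an exponential freshness decay), the names `Message`, `Cluster`, `score`, and `assign`, and all parameter values (`alpha`, `max_dist`, `decay`, `threshold`) are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Message:
    x: float
    y: float
    terms: frozenset
    time: float


@dataclass
class Cluster:
    x: float                 # centroid coordinates
    y: float
    terms: set               # union of member terms
    last_update: float       # timestamp of the newest member
    members: list = field(default_factory=list)

    def add(self, msg):
        # Incrementally update the centroid, term set, and timestamp.
        n = len(self.members)
        self.x = (self.x * n + msg.x) / (n + 1)
        self.y = (self.y * n + msg.y) / (n + 1)
        self.terms |= msg.terms
        self.last_update = max(self.last_update, msg.time)
        self.members.append(msg)


def score(msg, cluster, alpha=0.5, max_dist=10.0, decay=0.01):
    # Hypothetical score: spatial proximity and textual relevance,
    # discounted by how stale the cluster is relative to the message.
    dist = math.hypot(msg.x - cluster.x, msg.y - cluster.y)
    spatial = max(0.0, 1.0 - dist / max_dist)
    union = msg.terms | cluster.terms
    textual = len(msg.terms & cluster.terms) / len(union) if union else 0.0
    freshness = math.exp(-decay * max(0.0, msg.time - cluster.last_update))
    return (alpha * spatial + (1 - alpha) * textual) * freshness


def assign(msg, clusters, threshold=0.5):
    # Join the best-scoring cluster if it clears the threshold;
    # otherwise the message starts a new cluster.
    best = max(clusters, key=lambda c: score(msg, c), default=None)
    if best is not None and score(msg, best) >= threshold:
        best.add(msg)
        return best
    new = Cluster(msg.x, msg.y, set(msg.terms), msg.time, [msg])
    clusters.append(new)
    return new
```

In this sketch, each arriving message is processed once with `assign(msg, clusters)`. A linear scan over all clusters is used for simplicity; an efficient solution would presumably rely on spatial and textual indexing instead.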
Sponsors: This work is supported in part by grants awarded by the National Natural Science Foundation of China (NSFC) (No. 61832017, 61836007, 61532018).
Conference/Event name: 2019 IEEE 35th International Conference on Data Engineering (ICDE)