Accurately Estimating User Cardinalities and Detecting Super Spreaders over Time
Name:
[submit]cardinality_TKDE (1).pdf
Size:
5.145Mb
Format:
PDF
Description:
Accepted manuscript
Type
Article
Authors
Jia, Peng
Wang, Pinghui
Zhang, Yuchao
Zhang, Xiangliang
Tao, Jing
Ding, Jianwei
Guan, Xiaohong
Towsley, Don
KAUST Department
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Machine Intelligence & kNowledge Engineering Lab
Date
2020
Permanent link to this record
http://hdl.handle.net/10754/661680
Abstract
Online monitoring of user cardinalities in graph streams is fundamental to many applications, such as anomaly detection. These graph streams may contain duplicate edges and a large number of user-item pairs, which makes it infeasible to compute user cardinalities exactly with limited computational and memory resources. Existing methods estimate user cardinalities approximately, but their accuracy depends heavily on complex parameters, and they cannot provide estimates that are available at any time. To address these problems, we develop novel bit/register sharing algorithms, which use a bit/register array to build a compact sketch of all users' connected items. Our algorithms exploit the dynamic properties of the bit/register arrays (e.g., the fraction of zero bits in the bit array) to significantly improve estimation accuracy, and they update the estimates for each new user-item pair in O(1) time. In addition, our algorithms are simple and easy to use, and require no parameter tuning. Furthermore, we extend our methods to detect super spreaders, that is, users with large cardinalities, in real time. We evaluate the performance of our methods on real-world datasets. The experimental results demonstrate that our methods are several times more accurate and faster than state-of-the-art methods when using the same amount of memory.
Citation
Jia, P., Wang, P., Zhang, Y., Zhang, X., Tao, J., Ding, J., … Towsley, D. (2020). Accurately Estimating User Cardinalities and Detecting Super Spreaders over Time. IEEE Transactions on Knowledge and Data Engineering, 1–1. doi:10.1109/tkde.2020.2975625
Sponsors
The research presented in this paper is supported in part by the National Key R&D Program of China (2018YFC0830500), the National Natural Science Foundation of China (U1736205, 61603290), the Shenzhen Basic Research Grant (JCYJ20170816100819428), and the Natural Science Basic Research Plan in Shaanxi Province of China (2016JQ6034).
Additional Links
https://ieeexplore.ieee.org/document/9007395/
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9007395
DOI
10.1109/TKDE.2020.2975625