Inferring ground truth from crowdsourced data under local attribute differential privacy
Name: TCS2021_truth.pdf
Size: 499.1 KB
Format: PDF
Description: Accepted manuscript
Type: Article
Authors: Wang, Di; Xu, Jinhui
Date: 2021-02
Embargo End Date: 2023-02-01
Submitted Date: 2020-06-21
Permanent link to this record: http://hdl.handle.net/10754/668012
Abstract
Nowadays, crowdsourcing has gained increasing popularity, as it can be used to solve many challenging question-answering tasks that are easy for humans but difficult for computers. Due to the variability in the quality of users, it is important to infer not only the underlying ground truth of these tasks but also the ability of each user from the answers they provide. This problem, called Ground Truth Inference, has been studied for many years. However, since the answers collected from users may contain sensitive information, ground truth inference raises serious privacy concerns. For this reason, the problem of ground truth inference under the local differential privacy (LDP) model has recently been studied. However, this problem is still not well understood, and even some basic questions remain unsolved. First, the average error of the private estimators with respect to the underlying ground truth is still unknown. Second, we do not know whether the ability of each user can be inferred under the LDP model, nor what the estimation error with respect to the underlying users' abilities is. Finally, previous work only shows experimentally that its methods outperform the private majority voting algorithm; there is still no theoretical result that establishes this superiority formally or mathematically. In this paper, we partially address these problems by studying the ground truth inference problem under the local attribute differential privacy (LADP) model, a relaxation of the LDP model, and propose a new algorithm, called the private Dawid-Skene method, which is motivated by the classical Dawid-Skene method. Specifically, we first provide the estimation errors for both the abilities of users and the ground truth under some assumptions on the problem, provided that the algorithm starts with an appropriate initial vector. Moreover, we construct an explicit instance and show that the estimation error of the ground truth achieved by the private majority voting algorithm is always greater than the error achieved by our method.
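To make the objects in the abstract concrete, the following is a minimal, illustrative sketch rather than the paper's private Dawid-Skene method or its LADP mechanism: binary worker answers are perturbed with the standard randomized-response mechanism (a common local-privacy primitive), and the labels are then recovered both by majority voting and by the classical one-coin Dawid-Skene EM on which the paper's algorithm builds. All function names, parameters, and the simulated data here are assumptions made only for the example.

```python
import numpy as np

def randomized_response(answers, epsilon, rng):
    """Perturb binary answers in {0, 1} with standard randomized response:
    keep the true answer with probability e^eps / (e^eps + 1), flip it
    otherwise.  (A generic LDP primitive; the paper's LADP mechanism
    may differ.)"""
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    flips = rng.random(answers.shape) > p_keep
    return np.where(flips, 1 - answers, answers)

def dawid_skene_one_coin(answers, n_iter=50):
    """Classical one-coin Dawid-Skene EM for binary tasks.

    answers: (n_workers, n_tasks) array in {0, 1}.
    Returns (posterior probability that each task's label is 1,
             estimated accuracy of each worker)."""
    # E-step initialisation: soft majority vote as the label estimate.
    q = answers.mean(axis=0)                      # P(label = 1) per task
    for _ in range(n_iter):
        # M-step: worker accuracy = expected agreement with current labels.
        agree = answers * q + (1 - answers) * (1 - q)
        ability = agree.mean(axis=1).clip(1e-3, 1 - 1e-3)
        # E-step: posterior of label 1 given answers and accuracies.
        log_like_1 = np.where(answers == 1,
                              np.log(ability)[:, None],
                              np.log(1 - ability)[:, None]).sum(axis=0)
        log_like_0 = np.where(answers == 0,
                              np.log(ability)[:, None],
                              np.log(1 - ability)[:, None]).sum(axis=0)
        q = 1.0 / (1.0 + np.exp(log_like_0 - log_like_1))
    return q, ability

if __name__ == "__main__":
    # Hypothetical simulation: workers with varying accuracy answer binary tasks,
    # answers are privatised locally, then labels are inferred from the noisy data.
    rng = np.random.default_rng(0)
    n_workers, n_tasks, epsilon = 30, 200, 2.0
    truth = rng.integers(0, 2, n_tasks)
    worker_acc = rng.uniform(0.6, 0.95, n_workers)
    correct = rng.random((n_workers, n_tasks)) < worker_acc[:, None]
    answers = np.where(correct, truth, 1 - truth)

    private = randomized_response(answers, epsilon, rng)
    q, est_acc = dawid_skene_one_coin(private)
    mv = (private.mean(axis=0) > 0.5).astype(int)   # private majority vote
    ds = (q > 0.5).astype(int)                      # Dawid-Skene estimate
    print("majority-vote error:", np.mean(mv != truth))
    print("Dawid-Skene error:  ", np.mean(ds != truth))
```

Because the randomized-response noise is symmetric, it simply lowers each worker's effective accuracy; the EM iteration can still weight reliable workers more heavily than plain majority voting does, which is the kind of comparison the abstract discusses.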
Citation: Wang, D., & Xu, J. (2021). Inferring ground truth from crowdsourced data under local attribute differential privacy. Theoretical Computer Science. doi:10.1016/j.tcs.2021.02.039
Sponsors: This research was supported in part by the baseline funding of KAUST and by the National Science Foundation (NSF) through grants CCF-1422324 and CCF-1716400.
Publisher: Elsevier BV
Journal: Theoretical Computer Science
Additional Links: https://linkinghub.elsevier.com/retrieve/pii/S0304397521001237
DOI: 10.1016/j.tcs.2021.02.039