Web-ADARE: A Web-Aided Data Repairing System

Handle URI:
http://hdl.handle.net/10754/623010
Title:
Web-ADARE: A Web-Aided Data Repairing System
Authors:
Gu, Binbin; Li, Zhixu; Yang, Qiang; Xie, Qing; Liu, An; Liu, Guanfeng; Zheng, Kai; Zhang, Xiangliang (0000-0002-3574-5665)
Abstract:
Data repairing aims to discover and correct erroneous data in databases. In this paper, we develop Web-ADARE, an end-to-end web-aided data repairing system, to provide a feasible way to involve the vast data sources on the Web in data repairing. In developing Web-ADARE, we focus on the interaction problem between web-aided repairing and rule-based repairing, in order to minimize the Web consultation cost while meeting predefined quality requirements. The same interaction problem also exists in crowd-based methods, but it has not yet been formally defined or addressed. We first prove theoretically that the optimal interaction scheme is infeasible to achieve, and then propose an algorithm that identifies an efficient interaction scheme by investigating the inconsistencies and the dependencies between values in the repairing process. Extensive experiments on three data collections demonstrate the high repairing precision and recall of Web-ADARE, and the efficiency of the generated interaction scheme compared with several baseline schemes.
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science Program
Citation:
Gu B, Li Z, Yang Q, Xie Q, Liu A, et al. (2017) Web-ADARE: A Web-Aided Data Repairing System. Neurocomputing. Available: http://dx.doi.org/10.1016/j.neucom.2016.09.132.
Publisher:
Elsevier BV
Journal:
Neurocomputing
Issue Date:
8-Mar-2017
DOI:
10.1016/j.neucom.2016.09.132
Type:
Article
ISSN:
0925-2312
Sponsors:
This research is partially supported by the Natural Science Foundation of China (Grant No. 61303019, 61402313, 61472263, 61572336), Postdoctoral scientific research funding of Jiangsu Province (No. 1501090B), National 58 batch of postdoctoral funding (No. 2015M581859), Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu, China, and the King Abdullah University of Science and Technology.
Additional Links:
http://www.sciencedirect.com/science/article/pii/S0925231217304642
Appears in Collections:
Articles; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Gu, Binbin | en
dc.contributor.author | Li, Zhixu | en
dc.contributor.author | Yang, Qiang | en
dc.contributor.author | Xie, Qing | en
dc.contributor.author | Liu, An | en
dc.contributor.author | Liu, Guanfeng | en
dc.contributor.author | Zheng, Kai | en
dc.contributor.author | Zhang, Xiangliang | en
dc.date.accessioned | 2017-03-15T07:15:28Z | -
dc.date.available | 2017-03-15T07:15:28Z | -
dc.date.issued | 2017-03-08 | en
dc.identifier.citation | Gu B, Li Z, Yang Q, Xie Q, Liu A, et al. (2017) Web-ADARE: A Web-Aided Data Repairing System. Neurocomputing. Available: http://dx.doi.org/10.1016/j.neucom.2016.09.132. | en
dc.identifier.issn | 0925-2312 | en
dc.identifier.doi | 10.1016/j.neucom.2016.09.132 | en
dc.identifier.uri | http://hdl.handle.net/10754/623010 | -
dc.description.abstract | Data repairing aims to discover and correct erroneous data in databases. In this paper, we develop Web-ADARE, an end-to-end web-aided data repairing system, to provide a feasible way to involve the vast data sources on the Web in data repairing. In developing Web-ADARE, we focus on the interaction problem between web-aided repairing and rule-based repairing, in order to minimize the Web consultation cost while meeting predefined quality requirements. The same interaction problem also exists in crowd-based methods, but it has not yet been formally defined or addressed. We first prove theoretically that the optimal interaction scheme is infeasible to achieve, and then propose an algorithm that identifies an efficient interaction scheme by investigating the inconsistencies and the dependencies between values in the repairing process. Extensive experiments on three data collections demonstrate the high repairing precision and recall of Web-ADARE, and the efficiency of the generated interaction scheme compared with several baseline schemes. | en
dc.description.sponsorship | This research is partially supported by the Natural Science Foundation of China (Grant No. 61303019, 61402313, 61472263, 61572336), Postdoctoral scientific research funding of Jiangsu Province (No. 1501090B), National 58 batch of postdoctoral funding (No. 2015M581859), Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu, China, and the King Abdullah University of Science and Technology. | en
dc.publisher | Elsevier BV | en
dc.relation.url | http://www.sciencedirect.com/science/article/pii/S0925231217304642 | en
dc.rights | NOTICE: this is the author’s version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neurocomputing, [, , (2017-03-08)] DOI: 10.1016/j.neucom.2016.09.132. © 2017. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ | en
dc.subject | Data Repairing | en
dc.subject | Rule-based Method | en
dc.subject | Web-aided Method | en
dc.subject | Multiple Sources | en
dc.title | Web-ADARE: A Web-Aided Data Repairing System | en
dc.type | Article | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Computer Science Program | en
dc.identifier.journal | Neurocomputing | en
dc.eprint.version | Post-print | en
dc.contributor.institution | School of Computer Science and Technology, Soochow University, China | en
dc.contributor.institution | School of Computer Science and Technology, Wuhan University of Technology, China | en
kaust.author | Zhang, Xiangliang | en