Controlling attribute effect in linear regression

Handle URI:
http://hdl.handle.net/10754/564826
Title:
Controlling attribute effect in linear regression
Authors:
Calders, Toon; Karim, Asim A.; Kamiran, Faisal; Ali, Wasif Mohammad; Zhang, Xiangliang (0000-0002-3574-5665)
Abstract:
In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science Program; Machine Intelligence & kNowledge Engineering Lab
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Journal:
2013 IEEE 13th International Conference on Data Mining
Conference/Event name:
13th IEEE International Conference on Data Mining, ICDM 2013
Issue Date:
Dec-2013
DOI:
10.1109/ICDM.2013.114
Type:
Conference Paper
ISSN:
15504786
Appears in Collections:
Conference Papers; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
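
The abstract mentions linear models that minimize squared error under a constraint on the mean outcome. The sketch below illustrates that general idea only; it is not the paper's derivation. It assumes a binary attribute `s`, adds an intercept, and requires the two groups to have equal mean predictions, solving the resulting equality-constrained least-squares problem through its KKT linear system. The function names `fit_mean_constrained` and `mean_gap` and the synthetic data are hypothetical, and the propensity-based treatment of explanatory attributes described in the abstract is not modeled here.

```python
import numpy as np


def fit_mean_constrained(X, y, s):
    """Least squares with equal mean prediction in the two groups given by s.

    Illustrative assumption: s is a binary attribute (e.g. a batch or group
    flag) and the constraint is d . w = 0, where d is the difference of the
    group mean feature vectors (intercept column included).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    s = np.asarray(s, dtype=bool)
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    d = A[s].mean(axis=0) - A[~s].mean(axis=0)  # group mean difference
    k = A.shape[1]
    # KKT system for: minimize ||A w - y||^2 subject to d . w = 0
    kkt = np.zeros((k + 1, k + 1))
    kkt[:k, :k] = 2.0 * A.T @ A
    kkt[:k, k] = d
    kkt[k, :k] = d
    rhs = np.append(2.0 * A.T @ y, 0.0)
    return np.linalg.solve(kkt, rhs)[:k]  # drop the Lagrange multiplier


def mean_gap(X, w, s):
    """Difference in mean prediction between the s = 1 and s = 0 groups."""
    s = np.asarray(s, dtype=bool)
    pred = w[0] + np.asarray(X, dtype=float) @ w[1:]
    return pred[s].mean() - pred[~s].mean()


# Synthetic data (made up for illustration): features and target both
# shift with the group flag s.
rng = np.random.default_rng(0)
n = 200
s = rng.random(n) < 0.5
X = rng.normal(size=(n, 3)) + 1.5 * s[:, None]
y = X @ np.array([1.0, -2.0, 0.5]) + 2.0 * s + rng.normal(scale=0.1, size=n)

w = fit_mean_constrained(X, y, s)
print("mean prediction gap:", mean_gap(X, w, s))
```

On the training data the reported gap should be zero up to floating-point error, because the constraint is enforced exactly; ordinary least squares fitted on the same features would generally leave a nonzero gap.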

Full metadata record

DC Field | Value | Language
dc.contributor.author | Calders, Toon | en
dc.contributor.author | Karim, Asim A. | en
dc.contributor.author | Kamiran, Faisal | en
dc.contributor.author | Ali, Wasif Mohammad | en
dc.contributor.author | Zhang, Xiangliang | en
dc.date.accessioned | 2015-08-04T07:17:26Z | en
dc.date.available | 2015-08-04T07:17:26Z | en
dc.date.issued | 2013-12 | en
dc.identifier.issn | 15504786 | en
dc.identifier.doi | 10.1109/ICDM.2013.114 | en
dc.identifier.uri | http://hdl.handle.net/10754/564826 | en
dc.description.abstract | In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE. | en
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en
dc.subject | Batch Effects | en
dc.subject | Fair Data Mining | en
dc.subject | Linear Regression | en
dc.subject | Propensity Score | en
dc.title | Controlling attribute effect in linear regression | en
dc.type | Conference Paper | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Computer Science Program | en
dc.contributor.department | Machine Intelligence & kNowledge Engineering Lab | en
dc.identifier.journal | 2013 IEEE 13th International Conference on Data Mining | en
dc.conference.date | 7 December 2013 through 10 December 2013 | en
dc.conference.name | 13th IEEE International Conference on Data Mining, ICDM 2013 | en
dc.conference.location | Dallas, TX | en
dc.contributor.institution | Computer and Decision Engineering Dept., Universite Libre de Bruxelles (ULB), Belgium | en
dc.contributor.institution | Dept. of Computer Science, SBASSE, Lahore University of Management Sciences (LUMS), Pakistan | en
kaust.author | Zhang, Xiangliang | en
kaust.author | Kamiran, Faisal | en
All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.