Handle URI:
http://hdl.handle.net/10754/597281
Title:
A Heckman Selection-t Model
Authors:
Marchenko, Yulia V.; Genton, Marc G.
Abstract:
Sample selection arises often in practice as a result of the partial observability of the outcome of interest in a study. In the presence of sample selection, the observed data do not represent a random sample from the population, even after controlling for explanatory variables. That is, data are missing not at random. Thus, standard analysis using only complete cases will lead to biased results. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. The method was criticized in the literature because of its sensitivity to the normality assumption. In practice, data, such as income or expenditure data, often violate the normality assumption because of heavier tails. We first establish a new link between sample selection models and recently studied families of extended skew-elliptical distributions. Then, this allows us to introduce a selection-t (SLt) model, which models the error distribution using a Student's t distribution. We study its properties and investigate the finite-sample performance of the maximum likelihood estimators for this model. We compare the performance of the SLt model to the conventional Heckman selection-normal (SLN) model and apply it to analyze ambulatory expenditures. Unlike the SLN model, our analysis using the SLt model provides statistical evidence for the existence of sample selection bias in these data. We also investigate the performance of the test for sample selection bias based on the SLt model and compare it with the performances of several tests used with the SLN model. Our findings indicate that the latter tests can be misleading in the presence of heavy-tailed data. © 2012 American Statistical Association.
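Illustrative note (not part of the published abstract): the claim that a complete-case analysis is biased under sample selection can be checked with a small simulation. The sketch below is a minimal example under assumed settings, not the authors' SLt or SLN estimator: it draws bivariate normal errors with a hypothetical correlation `rho`, observes the outcome only when the selection error is positive, and compares the complete-case mean with the population mean. The function name `complete_case_mean` and all parameter values are invented for illustration.

```python
import math
import random


def complete_case_mean(n=20000, mu=1.0, rho=0.7, seed=0):
    """Mean of y over the selected units only (hypothetical toy setup).

    The outcome is y = mu + e and is observed only when u > 0,
    where (e, u) are standard normal with correlation rho.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # selection-equation error
        # Outcome error constructed so that corr(e, u) = rho and var(e) = 1.
        e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        if u > 0:  # outcome is recorded only for selected units
            observed.append(mu + e)
    return sum(observed) / len(observed)


mu, rho = 1.0, 0.7
observed_mean = complete_case_mean(mu=mu, rho=rho)
# Under bivariate normality, E[y | u > 0] = mu + rho * phi(0) / Phi(0),
# and phi(0) / Phi(0) = sqrt(2 / pi).
theoretical_mean = mu + rho * math.sqrt(2.0 / math.pi)
print(f"population mean:     {mu:.3f}")
print(f"complete-case mean:  {observed_mean:.3f}")
print(f"normal-theory value: {theoretical_mean:.3f}")
```

With `rho = 0` the selection is ignorable and the complete-case mean should be close to `mu`; the gap grows with `|rho|`. The normal-theory correction used here is exactly the kind of result that depends on the normality assumption the abstract describes as fragile under heavy tails.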
Citation:
Marchenko YV, Genton MG (2012) A Heckman Selection-t Model. Journal of the American Statistical Association 107: 304–317. Available: http://dx.doi.org/10.1080/01621459.2012.656011.
Publisher:
Informa UK Limited
Journal:
Journal of the American Statistical Association
KAUST Grant Number:
KUS-C1-016-04
Issue Date:
Mar-2012
DOI:
10.1080/01621459.2012.656011
Type:
Article
ISSN:
0162-1459; 1537-274X
Sponsors:
The authors thank the editor, the associate editor, and two referees for a careful review of the manuscript and valuable suggestions. The authors also thank Adelchi Azzalini and Reinaldo B. Arellano-Valle for helpful comments on an earlier version of the article, and David Drukker for useful discussions about selection models. Genton's research was partially supported by the National Science Foundation (NSF) grant DMS-1007504 and by award number KUS-C1-016-04 made by King Abdullah University of Science and Technology (KAUST).
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Marchenko, Yulia V. | en
dc.contributor.author | Genton, Marc G. | en
dc.date.accessioned | 2016-02-25T12:29:44Z | en
dc.date.available | 2016-02-25T12:29:44Z | en
dc.date.issued | 2012-03 | en
dc.identifier.citation | Marchenko YV, Genton MG (2012) A Heckman Selection-t Model. Journal of the American Statistical Association 107: 304–317. Available: http://dx.doi.org/10.1080/01621459.2012.656011. | en
dc.identifier.issn | 0162-1459 | en
dc.identifier.issn | 1537-274X | en
dc.identifier.doi | 10.1080/01621459.2012.656011 | en
dc.identifier.uri | http://hdl.handle.net/10754/597281 | en
dc.description.abstract | Sample selection arises often in practice as a result of the partial observability of the outcome of interest in a study. In the presence of sample selection, the observed data do not represent a random sample from the population, even after controlling for explanatory variables. That is, data are missing not at random. Thus, standard analysis using only complete cases will lead to biased results. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. The method was criticized in the literature because of its sensitivity to the normality assumption. In practice, data, such as income or expenditure data, often violate the normality assumption because of heavier tails. We first establish a new link between sample selection models and recently studied families of extended skew-elliptical distributions. Then, this allows us to introduce a selection-t (SLt) model, which models the error distribution using a Student's t distribution. We study its properties and investigate the finite-sample performance of the maximum likelihood estimators for this model. We compare the performance of the SLt model to the conventional Heckman selection-normal (SLN) model and apply it to analyze ambulatory expenditures. Unlike the SLN model, our analysis using the SLt model provides statistical evidence for the existence of sample selection bias in these data. We also investigate the performance of the test for sample selection bias based on the SLt model and compare it with the performances of several tests used with the SLN model. Our findings indicate that the latter tests can be misleading in the presence of heavy-tailed data. © 2012 American Statistical Association. | en
dc.description.sponsorship | The authors thank the editor, the associate editor, and two referees for a careful review of the manuscript and valuable suggestions. The authors also thank Adelchi Azzalini and Reinaldo B. Arellano-Valle for helpful comments on an earlier version of the article, and David Drukker for useful discussions about selection models. Genton's research was partially supported by the National Science Foundation (NSF) grant DMS-1007504 and by award number KUS-C1-016-04 made by King Abdullah University of Science and Technology (KAUST). | en
dc.publisher | Informa UK Limited | en
dc.subject | Heavy tails | en
dc.subject | Heckman model | en
dc.subject | Missing not at random | en
dc.subject | Sample selection | en
dc.subject | Selection-normal | en
dc.subject | Selection-t | en
dc.subject | Skew-normal | en
dc.subject | Skew-t | en
dc.subject | Two-step | en
dc.title | A Heckman Selection-t Model | en
dc.type | Article | en
dc.identifier.journal | Journal of the American Statistical Association | en
dc.contributor.institution | StataCorp LP, College Station, United States | en
dc.contributor.institution | Texas A and M University, College Station, United States | en
kaust.grant.number | KUS-C1-016-04 | en