Variable selection and estimation for longitudinal survey data

Handle URI:
http://hdl.handle.net/10754/600160
Title:
Variable selection and estimation for longitudinal survey data
Authors:
Wang, Li (0000-0001-6387-0415); Wang, Suojin; Wang, Guannan
Abstract:
There is wide interest in studying longitudinal surveys where sample subjects are observed successively over time. Longitudinal surveys have been used in many areas today, for example, in the health and social sciences, to explore relationships or to identify significant variables in regression settings. This paper develops a general strategy for the model selection problem in longitudinal sample surveys. A survey weighted penalized estimating equation approach is proposed to select significant variables and estimate the coefficients simultaneously. The proposed estimators are design consistent and perform as well as the oracle procedure would if the correct submodel were known. The estimating function bootstrap is applied to obtain the standard errors of the estimated parameters with good accuracy. A fast and efficient variable selection algorithm is developed to identify significant variables for complex longitudinal survey data. Simulated examples are presented to show the usefulness of the proposed methodology under various model settings and sampling designs. © 2014 Elsevier Inc.
Citation:
Wang L, Wang S, Wang G (2014) Variable selection and estimation for longitudinal survey data. Journal of Multivariate Analysis 130: 409–424. Available: http://dx.doi.org/10.1016/j.jmva.2014.05.006.
Publisher:
Elsevier BV
Journal:
Journal of Multivariate Analysis
KAUST Grant Number:
KUS-CI-016-04
Issue Date:
Sep-2014
DOI:
10.1016/j.jmva.2014.05.006
Type:
Article
ISSN:
0047-259X
Sponsors:
The research of L. Wang was partially supported by NSF grants DMS-0905730, DMS-1106816, DMS-1309800 and the ASA/NSF/BLS research fellow program. The research of S. Wang was partially supported by Award Number KUS-CI-016-04, made by King Abdullah University of Science and Technology (KAUST). The views expressed in this paper are those of the authors and do not necessarily reflect the policies of the US Bureau of Labor Statistics.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wang, Li | en
dc.contributor.author | Wang, Suojin | en
dc.contributor.author | Wang, Guannan | en
dc.date.accessioned | 2016-02-28T06:44:00Z | en
dc.date.available | 2016-02-28T06:44:00Z | en
dc.date.issued | 2014-09 | en
dc.identifier.citation | Wang L, Wang S, Wang G (2014) Variable selection and estimation for longitudinal survey data. Journal of Multivariate Analysis 130: 409–424. Available: http://dx.doi.org/10.1016/j.jmva.2014.05.006. | en
dc.identifier.issn | 0047-259X | en
dc.identifier.doi | 10.1016/j.jmva.2014.05.006 | en
dc.identifier.uri | http://hdl.handle.net/10754/600160 | en
dc.description.abstract | There is wide interest in studying longitudinal surveys where sample subjects are observed successively over time. Longitudinal surveys have been used in many areas today, for example, in the health and social sciences, to explore relationships or to identify significant variables in regression settings. This paper develops a general strategy for the model selection problem in longitudinal sample surveys. A survey weighted penalized estimating equation approach is proposed to select significant variables and estimate the coefficients simultaneously. The proposed estimators are design consistent and perform as well as the oracle procedure would if the correct submodel were known. The estimating function bootstrap is applied to obtain the standard errors of the estimated parameters with good accuracy. A fast and efficient variable selection algorithm is developed to identify significant variables for complex longitudinal survey data. Simulated examples are presented to show the usefulness of the proposed methodology under various model settings and sampling designs. © 2014 Elsevier Inc. | en
dc.description.sponsorship | The research of L. Wang was partially supported by NSF grants DMS-0905730, DMS-1106816, DMS-1309800 and the ASA/NSF/BLS research fellow program. The research of S. Wang was partially supported by Award Number KUS-CI-016-04, made by King Abdullah University of Science and Technology (KAUST). The views expressed in this paper are those of the authors and do not necessarily reflect the policies of the US Bureau of Labor Statistics. | en
dc.publisher | Elsevier BV | en
dc.subject | Bootstrap | en
dc.subject | Generalized estimating equations | en
dc.subject | Penalty | en
dc.subject | Primary | en
dc.subject | Sampling weights | en
dc.subject | Superpopulation | en
dc.title | Variable selection and estimation for longitudinal survey data | en
dc.type | Article | en
dc.identifier.journal | Journal of Multivariate Analysis | en
dc.contributor.institution | The University of Georgia, Athens, United States | en
dc.contributor.institution | Texas A&M University, College Station, United States | en
kaust.grant.number | KUS-CI-016-04 | en