Integrating Data Transformation in Principal Components Analysis

Handle URI:
http://hdl.handle.net/10754/598638
Title:
Integrating Data Transformation in Principal Components Analysis
Authors:
Maadooliat, Mehdi; Huang, Jianhua Z.; Hu, Jianhua
Abstract:
Principal component analysis (PCA) is a popular dimension reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior to applying PCA. Such transformation is usually obtained from previous studies, prior knowledge, or trial-and-error. In this work, we develop a model-based method that integrates data transformation in PCA and finds an appropriate data transformation using the maximum profile likelihood. Extensions of the method to handle functional data and missing values are also developed. Several numerical algorithms are provided for efficient computation. The proposed method is illustrated using simulated and real-world data examples.
Citation:
Maadooliat M, Huang JZ, Hu J (2015) Integrating Data Transformation in Principal Components Analysis. Journal of Computational and Graphical Statistics 24: 84–103. Available: http://dx.doi.org/10.1080/10618600.2014.891461.
Publisher:
Informa UK Limited
Journal:
Journal of Computational and Graphical Statistics
KAUST Grant Number:
KUS-CI-016-04
Issue Date:
2-Jan-2015
DOI:
10.1080/10618600.2014.891461
PubMed ID:
25914514
PubMed Central ID:
PMC4408558
Type:
Article
ISSN:
1061-8600; 1537-2715
Sponsors:
We thank an associate editor and two anonymous referees for their constructive and thoughtful comments that helped us tremendously in revising the manuscript. Maadooliat and Hu were partially supported by the National Science Foundation (grants DMS-0706818), the National Institutes of Health (grants R01GM080503-01A1, R21CA129671), and the National Cancer Institute (grant CA97007). Huang was partially supported by the National Science Foundation (grants DMS-0606580, DMS-0907170). Huang and Maadooliat were partially supported by King Abdullah University of Science and Technology (grant KUS-CI-016-04).
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Maadooliat, Mehdi | en
dc.contributor.author | Huang, Jianhua Z. | en
dc.contributor.author | Hu, Jianhua | en
dc.date.accessioned | 2016-02-25T13:33:35Z | en
dc.date.available | 2016-02-25T13:33:35Z | en
dc.date.issued | 2015-01-02 | en
dc.identifier.citation | Maadooliat M, Huang JZ, Hu J (2015) Integrating Data Transformation in Principal Components Analysis. Journal of Computational and Graphical Statistics 24: 84–103. Available: http://dx.doi.org/10.1080/10618600.2014.891461. | en
dc.identifier.issn | 1061-8600 | en
dc.identifier.issn | 1537-2715 | en
dc.identifier.pmid | 25914514 | en
dc.identifier.doi | 10.1080/10618600.2014.891461 | en
dc.identifier.uri | http://hdl.handle.net/10754/598638 | en
dc.description.abstract | Principal component analysis (PCA) is a popular dimension reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior to applying PCA. Such transformation is usually obtained from previous studies, prior knowledge, or trial-and-error. In this work, we develop a model-based method that integrates data transformation in PCA and finds an appropriate data transformation using the maximum profile likelihood. Extensions of the method to handle functional data and missing values are also developed. Several numerical algorithms are provided for efficient computation. The proposed method is illustrated using simulated and real-world data examples. | en
dc.description.sponsorship | We thank an associate editor and two anonymous referees for their constructive and thoughtful comments that helped us tremendously in revising the manuscript. Maadooliat and Hu were partially supported by the National Science Foundation (grants DMS-0706818), the National Institutes of Health (grants R01GM080503-01A1, R21CA129671), and the National Cancer Institute (grant CA97007). Huang was partially supported by the National Science Foundation (grants DMS-0606580, DMS-0907170). Huang and Maadooliat were partially supported by King Abdullah University of Science and Technology (grant KUS-CI-016-04). | en
dc.publisher | Informa UK Limited | en
dc.subject | PCA | en
dc.subject | Missing Data | en
dc.subject | Profile Likelihood | en
dc.subject | Transformation Model | en
dc.subject | Functional Pca | en
dc.title | Integrating Data Transformation in Principal Components Analysis | en
dc.type | Article | en
dc.identifier.journal | Journal of Computational and Graphical Statistics | en
dc.identifier.pmcid | PMC4408558 | en
dc.contributor.institution | Department of Mathematics, Statistics and Computer Science, Marquette University, WI. | en
dc.contributor.institution | Department of Statistics, Texas A&M University, College Station, TX. | en
dc.contributor.institution | Department of Biostatistics, Division of Quantitative Sciences, The University of Texas MD Anderson Cancer Center, Houston, TX. | en
kaust.grant.number | KUS-CI-016-04 | en

