Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data

Handle URI:
http://hdl.handle.net/10754/597597
Title:
Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data
Authors:
Tekwe, C. D.; Carroll, R. J.; Dabney, A. R.
Abstract:
MOTIVATION: Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and parametric survival and accelerated failure time (AFT) models with log-normal, log-logistic and Weibull distributions, were used to detect any differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. RESULTS: Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the gap widening as the proportion of missing data increases. AVAILABILITY: The testing procedures discussed in this article can all be performed using readily available software such as R. The R code is provided as supplemental material. CONTACT: ctekwe@stat.tamu.edu.
Citation:
Tekwe CD, Carroll RJ, Dabney AR (2012) Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data. Bioinformatics 28: 1998–2003. Available: http://dx.doi.org/10.1093/bioinformatics/bts306.
Publisher:
Oxford University Press (OUP)
Journal:
Bioinformatics
KAUST Grant Number:
KUS-C1-016-04
Issue Date:
24-May-2012
DOI:
10.1093/bioinformatics/bts306
PubMed ID:
22628520
PubMed Central ID:
PMC3400956
Type:
Article
ISSN:
1367-4803; 1460-2059
Sponsors:
C.D.T. was supported by a postdoctoral training grant from the National Cancer Institute (R25T - 090301). R.J.C. was supported by a grant from the National Cancer Institute (R27-CA057030). This publication is based in part on work supported by Award No. KUS-C1-016-04, made by King Abdullah University of Science and Technology (KAUST).
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Tekwe, C. D. | en
dc.contributor.author | Carroll, R. J. | en
dc.contributor.author | Dabney, A. R. | en
dc.date.accessioned | 2016-02-25T12:42:47Z | en
dc.date.available | 2016-02-25T12:42:47Z | en
dc.date.issued | 2012-05-24 | en
dc.identifier.citation | Tekwe CD, Carroll RJ, Dabney AR (2012) Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data. Bioinformatics 28: 1998–2003. Available: http://dx.doi.org/10.1093/bioinformatics/bts306. | en
dc.identifier.issn | 1367-4803 | en
dc.identifier.issn | 1460-2059 | en
dc.identifier.pmid | 22628520 | en
dc.identifier.doi | 10.1093/bioinformatics/bts306 | en
dc.identifier.uri | http://hdl.handle.net/10754/597597 | en
dc.description.abstract | MOTIVATION: Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and parametric survival and accelerated failure time (AFT) models with log-normal, log-logistic and Weibull distributions, were used to detect any differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. RESULTS: Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the gap widening as the proportion of missing data increases. AVAILABILITY: The testing procedures discussed in this article can all be performed using readily available software such as R. The R code is provided as supplemental material. CONTACT: ctekwe@stat.tamu.edu. | en
dc.description.sponsorship | C.D.T. was supported by a postdoctoral training grant from the National Cancer Institute (R25T - 090301). R.J.C. was supported by a grant from the National Cancer Institute (R27-CA057030). This publication is based in part on work supported by Award No. KUS-C1-016-04, made by King Abdullah University of Science and Technology (KAUST). | en
dc.publisher | Oxford University Press (OUP) | en
dc.subject.mesh | Models, Statistical | en
dc.subject.mesh | Statistics, Nonparametric | en
dc.subject.mesh | Software | en
dc.title | Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data | en
dc.type | Article | en
dc.identifier.journal | Bioinformatics | en
dc.identifier.pmcid | PMC3400956 | en
dc.contributor.institution | Department of Statistics, College Station TX 77843-3143, United States | en
kaust.grant.number | KUS-C1-016-04 | en
