Handle URI:
http://hdl.handle.net/10754/600161
Title:
Variance estimation in the analysis of microarray data
Authors:
Wang, Yuedong; Ma, Yanyuan; Carroll, Raymond J.
Abstract:
Microarrays are one of the most widely used high throughput technologies. One of the main problems in the area is that conventional estimates of the variances that are required in the t-statistic and other statistics are unreliable owing to the small number of replications. Various methods have been proposed in the literature to overcome this lack of degrees of freedom problem. In this context, it is commonly observed that the variance increases proportionally with the intensity level, which has led many researchers to assume that the variance is a function of the mean. Here we concentrate on estimation of the variance as a function of an unknown mean in two models: the constant coefficient of variation model and the quadratic variance-mean model. Because the means are unknown and estimated with few degrees of freedom, naive methods that use the sample mean in place of the true mean are generally biased because of the errors-in-variables phenomenon. We propose three methods for overcoming this bias. The first two are variations on the theme of the so-called heteroscedastic simulation-extrapolation estimator, modified to estimate the variance function consistently. The third class of estimators is entirely different, being based on semiparametric information calculations. Simulations show the power of our methods and their lack of bias compared with the naive method that ignores the measurement error. The methodology is illustrated by using microarray data from leukaemia patients.
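Note (illustrative, not part of the published record): the errors-in-variables bias described in the abstract can be seen in a small simulation. The sketch below assumes a constant coefficient of variation model, Var(Y | X) = theta * X^2, with normal errors, and uses a simple ratio-of-moments naive estimator that plugs the sample mean in for the true mean. The function name, the uniform range of true means, and all parameter values are hypothetical choices made for this example; this is not the authors' simulation-extrapolation or semiparametric estimator. Under these assumptions the naive ratio converges to theta / (1 + theta / m) rather than to theta, where m is the number of replicates.

```python
import random

def simulate_naive_theta(theta=0.25, n_genes=50000, m=3, seed=0):
    """Naive ratio-of-moments estimate of theta in Var(Y | X) = theta * X**2.

    The unknown true mean X of each gene is replaced by its sample mean,
    which is what makes the estimator 'naive'.
    """
    rng = random.Random(seed)
    sum_s2 = 0.0      # running sum of per-gene sample variances
    sum_ybar2 = 0.0   # running sum of squared per-gene sample means
    for _ in range(n_genes):
        x = rng.uniform(1.0, 5.0)      # hypothetical true mean intensity
        sd = (theta ** 0.5) * x        # constant coefficient of variation
        y = [rng.gauss(x, sd) for _ in range(m)]
        ybar = sum(y) / m
        s2 = sum((v - ybar) ** 2 for v in y) / (m - 1)  # unbiased variance
        sum_s2 += s2
        sum_ybar2 += ybar ** 2
    return sum_s2 / sum_ybar2

theta, m = 0.25, 3
naive = simulate_naive_theta(theta=theta, m=m)
limit = theta / (1 + theta / m)   # large-sample limit of the naive ratio
print(f"true theta = {theta}, naive estimate = {naive:.4f}, limit = {limit:.4f}")
```

With only a few replicates per gene, the squared sample mean overestimates the squared true mean on average, so the naive ratio is pulled below the true theta. Adding more genes does not remove this bias; only more replicates per gene (larger m) shrinks it.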
Citation:
Wang Y, Ma Y, Carroll RJ (2009) Variance estimation in the analysis of microarray data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71: 425–445. Available: http://dx.doi.org/10.1111/j.1467-9868.2008.00690.x.
Publisher:
Wiley-Blackwell
Journal:
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
KAUST Grant Number:
KUS-CI-016-04
Issue Date:
Apr-2009
DOI:
10.1111/j.1467-9868.2008.00690.x
PubMed ID:
19750023
PubMed Central ID:
PMC2740938
Type:
Article
ISSN:
1369-7412; 1467-9868
Sponsors:
Wang's research was supported by a grant from the National Science Foundation (DMS-0706886). Ma's research was supported by the National Science Foundation of Switzerland. Carroll's research was supported by grants from the National Cancer Institute (CA-57030 and CA104620). Part of the work was based on work supported by award KUS-CI-016-04, made by King Abdullah University of Science and Technology. We thank Dr Strimmer for sending us the leukaemia data. We also thank the Joint Editor, Associate Editor and two referees for constructive comments that substantially improved an earlier draft.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wang, Yuedong | en
dc.contributor.author | Ma, Yanyuan | en
dc.contributor.author | Carroll, Raymond J. | en
dc.date.accessioned | 2016-02-28T06:44:02Z | en
dc.date.available | 2016-02-28T06:44:02Z | en
dc.date.issued | 2009-04 | en
dc.identifier.citation | Wang Y, Ma Y, Carroll RJ (2009) Variance estimation in the analysis of microarray data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71: 425–445. Available: http://dx.doi.org/10.1111/j.1467-9868.2008.00690.x. | en
dc.identifier.issn | 1369-7412 | en
dc.identifier.issn | 1467-9868 | en
dc.identifier.pmid | 19750023 | en
dc.identifier.doi | 10.1111/j.1467-9868.2008.00690.x | en
dc.identifier.uri | http://hdl.handle.net/10754/600161 | en
dc.publisher | Wiley-Blackwell | en
dc.subject | Heteroscedasticity | en
dc.subject | Measurement error | en
dc.subject | Microarray | en
dc.subject | Semiparametric methods | en
dc.subject | Simulation-extrapolation | en
dc.subject | Variance function estimation | en
dc.title | Variance estimation in the analysis of microarray data | en
dc.type | Article | en
dc.identifier.journal | Journal of the Royal Statistical Society: Series B (Statistical Methodology) | en
dc.identifier.pmcid | PMC2740938 | en
dc.contributor.institution | University of California, Santa Barbara, USA. | en
kaust.grant.number | KUS-CI-016-04 | en