Bayesian Modeling of MPSS Data: Gene Expression Analysis of Bovine Salmonella Infection
Type
Article
Authors
Dhavala, Soma S.
Datta, Sujay
Mallick, Bani K.
Carroll, Raymond J.
Khare, Sangeeta
Lawhon, Sara D.
Adams, L. Garry
KAUST Grant Number
KUS-CI-016-04
Date
2010-09
Permanent link to this record
http://hdl.handle.net/10754/597652
Abstract
Massively Parallel Signature Sequencing (MPSS) is a high-throughput, counting-based technology available for gene expression profiling. It produces output that is similar to Serial Analysis of Gene Expression and is ideal for building complex relational databases for gene expression. Our goal is to compare the in vivo global gene expression profiles of tissues infected with different strains of Salmonella obtained using the MPSS technology. In this article, we develop an exact ANOVA-type model for this count data using a zero-inflated Poisson distribution, different from existing methods that assume continuous densities. We adopt two Bayesian hierarchical models, one parametric and the other semiparametric with a Dirichlet process prior that has the ability to "borrow strength" across related signatures, where a signature is a specific arrangement of the nucleotides, usually 16-21 base pairs long. We utilize the discreteness of the Dirichlet process prior to cluster signatures that exhibit similar differential expression profiles. Tests for differential expression are carried out using nonparametric approaches, while controlling the false discovery rate. We identify several differentially expressed genes that have important biological significance and conclude with a summary of the biological discoveries. This article has supplementary materials online. © 2010 American Statistical Association.
Citation
Dhavala SS, Datta S, Mallick BK, Carroll RJ, Khare S, et al. (2010) Bayesian Modeling of MPSS Data: Gene Expression Analysis of Bovine Salmonella Infection. Journal of the American Statistical Association 105: 956–967. Available: http://dx.doi.org/10.1198/jasa.2010.ap08327.
Sponsors
Soma S. Dhavala is a Doctoral Candidate, Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX 77843 (E-mail: soma@stat.tamu.edu). Sujay Datta is Senior Scientist and Faculty Member, Statistical Center for HIV/AIDS Research and Prevention, Fred Hutchinson Cancer Research Center, M2-C125, 1100 Fairview Avenue N., Seattle, WA 98109 (E-mail: sdatta@fhcrc.org). Bani K. Mallick is Professor, Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX 77843 (E-mail: bmallick@stat.tamu.edu). Raymond J. Carroll is Distinguished Professor, Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX 77843 (E-mail: carroll@stat.tamu.edu). Sangeeta Khare is Research Assistant Professor, Department of Veterinary Pathobiology, Texas A&M University, 4467 TAMU, College Station, TX 77843 (E-mail: skhare@cvm.tamu.edu). Sara D. Lawhon is Assistant Professor, Department of Veterinary Pathobiology, Texas A&M University, 4467 TAMU, College Station, TX 77843 (E-mail: slawhon@cvm.tamu.edu). L. Garry Adams is Professor, Department of Veterinary Pathobiology, Texas A&M University, 4467 TAMU, College Station, TX 77843 (E-mail: gadams@cvm.tamu.edu). The research of Bani K. Mallick and Raymond J. Carroll was supported by National Cancer Institute grants (CA 104620 and CA57030, respectively), National Science Foundation grant DMS 0914951, and by award KUS-CI-016-04, made by King Abdullah University of Science and Technology (KAUST). The research of Sujay Datta was supported by a postdoctoral training grant from the National Cancer Institute (CA90301). The research of L. Garry Adams was supported by the grants NIAID 1 RO1 A144170-01A1, USDA 2002-35204-12247, and NSF DMS 0914951. Public Health Service grant AI060933 supported the research of Sara D. Lawhon. The authors are grateful to Dr. David Dahl for discussions, and to the editors and the two anonymous referees for their suggestions and constructive comments.
Publisher
Informa UK Limited
DOI
10.1198/jasa.2010.ap08327