HMCan: A method for detecting chromatin modifications in cancer samples using ChIP-seq data
Bajic, Vladimir B.
KAUST Department: Computational Bioscience Research Center (CBRC)
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Abstract
Motivation: Cancer cells are often characterized by epigenetic changes, which include aberrant histone modifications. In particular, local or regional epigenetic silencing is a common mechanism in cancer for silencing expression of tumor suppressor genes. Though several tools have been created to enable detection of histone marks in ChIP-seq data from normal samples, it is unclear whether these tools can be efficiently applied to ChIP-seq data generated from cancer samples. Indeed, cancer genomes are often characterized by frequent copy number alterations: gains and losses of large regions of chromosomal material. Copy number alterations may create a substantial statistical bias in the evaluation of histone mark signal enrichment and result in underdetection of the signal in the regions of loss and overdetection of the signal in the regions of gain.
Results: We present HMCan (Histone modifications in cancer), a tool specially designed to analyze histone modification ChIP-seq data produced from cancer genomes. HMCan corrects for the GC-content and copy number bias and then applies Hidden Markov Models to detect the signal from the corrected data. On simulated data, HMCan outperformed several commonly used tools developed to analyze histone modification data produced from genomes without copy number alterations. HMCan also showed superior results on a ChIP-seq dataset generated for the repressive histone mark H3K27me3 in a bladder cancer cell line. HMCan predictions matched well with experimental data (qPCR validated regions) and included, for example, the previously detected H3K27me3 mark in the promoter of the DLEC1 gene, missed by other tools we tested.
© The Author 2013. Published by Oxford University Press. All rights reserved.
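To make the copy number bias described in the abstract concrete, the following is a minimal sketch of one simple way such a bias could be removed: rescaling per-bin read counts by the local copy number relative to normal ploidy. This is an illustrative assumption, not HMCan's actual correction procedure; the function name `normalize_by_copy_number`, the `ploidy` parameter, and the toy numbers are all hypothetical.

```python
def normalize_by_copy_number(counts, copy_numbers, ploidy=2):
    """Rescale per-bin read counts to the level expected at normal ploidy.

    Illustrative only: assumes the ChIP signal scales linearly with the
    number of DNA copies in each bin. Not taken from the HMCan paper.
    """
    if len(counts) != len(copy_numbers):
        raise ValueError("counts and copy_numbers must have the same length")
    corrected = []
    for count, cn in zip(counts, copy_numbers):
        if cn <= 0:
            # No DNA present in this bin (homozygous deletion): no signal.
            corrected.append(0.0)
        else:
            corrected.append(count * ploidy / cn)
    return corrected


# Toy example: the same underlying enrichment observed in a normal region,
# a gained region (4 copies), a lost region (1 copy), and a normal region.
counts = [20, 40, 10, 20]
copy_numbers = [2, 4, 1, 2]
corrected = normalize_by_copy_number(counts, copy_numbers)
print(corrected)  # [20.0, 20.0, 20.0, 20.0]
```

In this toy case the raw counts would suggest stronger signal in the gained region and weaker signal in the lost region, while the rescaled counts are equal, which is the overdetection and underdetection effect the abstract refers to.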
Citation: Ashoor H, Herault A, Kamoun A, Radvanyi F, Bajic VB, et al. (2013) HMCan: a method for detecting chromatin modifications in cancer samples using ChIP-seq data. Bioinformatics 29: 2979-2986. doi:10.1093/bioinformatics/btt524.
Publisher: Oxford University Press (OUP)
PubMed Central ID: PMC3834794
The following license files are associated with this item:
Except where otherwise noted, this item's license is described as follows: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
- HMCan-diff: a method to detect changes in histone modifications in cells with different genetic characteristics.
- Authors: Ashoor H, Louis-Brennetot C, Janoueix-Lerosey I, Bajic VB, Boeva V
- Issue date: 2017 May 5
- Genome-wide localization of protein-DNA binding and histone modification by a Bayesian change-point method with ChIP-seq data.
- Authors: Xing H, Mo Y, Liao W, Zhang MQ
- Issue date: 2012
- Identifying dispersed epigenomic domains from ChIP-Seq data.
- Authors: Song Q, Smith AD
- Issue date: 2011 Mar 15
- Stability of histone modifications across mammalian genomes: implications for 'epigenetic' marking.
- Authors: Lee BM, Mahadevan LC
- Issue date: 2009 Sep 1
- histoneHMM: Differential analysis of histone modifications with broad genomic footprints.
- Authors: Heinig M, Colomé-Tatché M, Taudt A, Rintisch C, Schafer S, Pravenec M, Hubner N, Vingron M, Johannes F
- Issue date: 2015 Feb 22
Showing items related by title, author, creator and subject.
HOCOMOCO: A comprehensive collection of human transcription factor binding sites models
Kulakovskiy, Ivan V.; Medvedeva, Yulia A.; Schaefer, Ulf; Kasianov, Artem S.; Vorontsov, Ilya E.; Bajic, Vladimir B.; Makeev, Vsevolod J. (Oxford University Press (OUP), 2012-11-21)
Transcription factor (TF) binding site (TFBS) models are crucial for computational reconstruction of transcription regulatory networks. In existing repositories, a TF often has several models (also called binding profiles or motifs), obtained from different experimental data. Having a single TFBS model for a TF is more pragmatic for practical applications. We show that integration of TFBS data from various types of experiments into a single model typically results in the improved model quality probably due to partial correction of source specific technique bias. We present the Homo sapiens comprehensive model collection (HOCOMOCO, http://autosome.ru/HOCOMOCO/, http://cbrc.kaust.edu.sa/hocomoco/) containing carefully hand-curated TFBS models constructed by integration of binding sequences obtained by both low- and high-throughput methods. To construct position weight matrices to represent these TFBS models, we used ChIPMunk software in four computational modes, including newly developed periodic positional prior mode associated with DNA helix pitch. We selected only one TFBS model per TF, unless there was a clear experimental evidence for two rather distinct TFBS models. We assigned a quality rating to each model. HOCOMOCO contains 426 systematically curated TFBS models for 401 human TFs, where 172 models are based on more than one data source. © The Author(s) 2012.
Chromatin architecture: A new dimension in the dynamic control of gene expression
Ramirez Prado, Juan Sebastian; Rodriguez-Granados, Natalia Yaneth; Ariel, Federico; Raynaud, Cécile; Benhamed, Moussa (Informa UK Limited, 2016-09-10)
As the most recent evidence of eukaryotic cell complexity, genome architecture has astounded the scientific community and prompted a variety of technical and cognitive challenges. Several technologies have emerged and evidenced the integration of chromatin packaging and topology, epigenetic processes, and transcription for the pertinent regulation of gene expression. In the present addendum we present and discuss some of our recent research, directed toward the holistic comprehension of the processes by which plants respond to environmental and developmental stimuli. We propose that the study of genome topology and genomic interactions is essential for the understanding of the molecular mechanisms behind a phenotype. Even though our knowledge and understanding of genome architecture and hierarchy has improved substantially in the last few years (in Arabidopsis and other eukaryotes), there is still a long way ahead in this relatively new field of study. For this, it is necessary to take advantage of the high resolution of the emerging available techniques, and perform integrative approaches with which it will be possible to depict the role of chromatin architecture in the regulation of transcription and ultimately, physiological processes.
PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES
Bahabri, Rihab R. (2013-06)
Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modifications of histones point to different functional regions of the DNA, determining the so-called chromatin states. However, the characterization of chromatin states by the DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded best results among those we evaluated. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.