Advisors: Bajic, Vladimir B.
Permanent link to this record: http://hdl.handle.net/10754/244613
Abstract: Modeling transcription factor binding sites (TFBSs) and predicting TFBSs in genomic sequences are important steps toward elucidating transcriptional regulatory mechanisms. Transcription regulation depends on a great number of factors, such as chemical specificity, molecular structure, genomic and epigenetic characteristics, and long-distance interactions, which makes this a challenging problem. Different experimental procedures provide evidence that the DNA-binding domains of transcription factors show considerable DNA sequence specificity. Probabilistic modeling of TFBSs has been moderately successful in identifying patterns from a family of sequences. In this study, we compare the performance of different probabilistic models and estimate their efficacy on experimental TFBS data. We build a pipeline that calculates sensitivity and specificity from aligned TFBS sequences for several probabilistic models, such as Markov chains, hidden Markov models, and Bayesian networks. Our work, which contains relevant statistics and evaluations for the models, can help researchers choose the most appropriate model for the problem at hand.
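The abstract does not describe how the pipeline is implemented, so the following is only a minimal sketch of the kind of evaluation it mentions, for one of the listed model types (a first-order Markov chain). The function names, the pseudocount, the score threshold, and the toy sequences are all assumptions made for illustration and are not taken from the study.

```python
import math

ALPHABET = "ACGT"

def train_markov_chain(sequences, pseudocount=1.0):
    """Estimate a first-order Markov chain from aligned DNA sequences.

    Returns (initial, transitions), both as probability dictionaries.
    The pseudocount is an illustrative smoothing choice, not the study's.
    """
    initial = {b: pseudocount for b in ALPHABET}
    transitions = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        initial[seq[0]] += 1
        for prev, cur in zip(seq, seq[1:]):
            transitions[prev][cur] += 1
    total = sum(initial.values())
    initial = {b: c / total for b, c in initial.items()}
    for a in ALPHABET:
        row_total = sum(transitions[a].values())
        transitions[a] = {b: c / row_total for b, c in transitions[a].items()}
    return initial, transitions

def log_likelihood(seq, model):
    """Log-probability of a sequence under the Markov chain."""
    initial, transitions = model
    score = math.log(initial[seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        score += math.log(transitions[prev][cur])
    return score

def sensitivity_specificity(model, positives, negatives, threshold):
    """Call a sequence a binding site if its score is >= threshold.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    tp = sum(log_likelihood(s, model) >= threshold for s in positives)
    tn = sum(log_likelihood(s, model) < threshold for s in negatives)
    return tp / len(positives), tn / len(negatives)

# Hypothetical toy data: binding-site-like positives and background negatives.
positives = ["TATAAA", "TATAAT", "TATATA"]
negatives = ["GCGCGC", "CCGGCC"]

model = train_markov_chain(positives)
sens, spec = sensitivity_specificity(model, positives, negatives, threshold=-6.0)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

For brevity, the sketch scores the same sequences it was trained on. A real comparison would evaluate on held-out sequences, for example by cross-validation, and would sweep the threshold instead of fixing it.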