dc.contributor.advisor: Bajic, Vladimir B.
dc.contributor.author: Alam, Tanvir
dc.date.accessioned: 2012-09-18T09:09:27Z
dc.date.available: 2013-07-30T00:00:00Z
dc.date.issued: 2012-07
dc.identifier.citation: Alam, T. (2012). Distinguishing the Transcription Regulation Patterns in Promoters of Human Genes with Different Function or Evolutionary Age. KAUST Research Repository. https://doi.org/10.25781/KAUST-85H4D
dc.identifier.doi: 10.25781/KAUST-85H4D
dc.identifier.uri: http://hdl.handle.net/10754/244611
dc.description.abstract: Distinguishing the transcription regulatory patterns of different gene groups is a common problem in bioinformatics studies. In this work we developed a methodology, based on machine learning techniques, to address this problem. We applied our method to two biologically important problems: detecting differences in transcription regulation (a) between protein-coding and long non-coding RNAs (lncRNAs) in human, and (b) between primate-specific and non-primate-specific lncRNAs. Our method is capable of classifying RNAs using various regulatory features of the genes that are transcribed into these RNAs, such as nucleotide frequencies, transcription factor binding sites, de novo sequence motifs, CpG islands, repetitive elements, histone modification marks, and others. Ten-fold cross-validation tests suggest that our model can distinguish protein-coding from non-coding RNAs with accuracy above 80%. Twenty-fold cross-validation tests suggest that our model can distinguish primate-specific from non-primate-specific promoters of lncRNAs with accuracy above 80%. Consequently, we can hypothesize that transcription of the groups of genes mentioned above is regulated by different mechanisms. Feature selection techniques allowed us to reduce the number of features significantly while keeping the accuracy around 80%. Consequently, we can conclude that the selected features play a significant role in transcription regulation of coding and non-coding genes, as well as of primate-specific and non-primate-specific lncRNA genes.
dc.language.iso: en
dc.title: Distinguishing the Transcription Regulation Patterns in Promoters of Human Genes with Different Function or Evolutionary Age
dc.type: Thesis
dc.contributor.department: Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
dc.rights.embargodate: 2013-07-30
thesis.degree.grantor: King Abdullah University of Science and Technology
dc.contributor.committeemember: Gao, Xin
dc.contributor.committeemember: Zhang, Xiangliang
thesis.degree.discipline: Computer Science
thesis.degree.name: Master of Science
dc.rights.accessrights: At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis became available to the public after the expiration of the embargo on 2013-07-30.
refterms.dateFOA: 2013-07-30T00:00:00Z


Files in this item

Name: Tanvir Alam Thesis.pdf
Size: 2.794Mb
Format: PDF
Description: PDF file
