• Login
    View Item 
    •   Home
    • Theses and Dissertations
    • Theses
    • View Item
    •   Home
    • Theses and Dissertations
    • Theses
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Browse

    All of KAUSTCommunitiesIssue DateSubmit DateThis CollectionIssue DateSubmit Date

    My Account

    Login

    Quick Links

    Open Access PolicyORCID LibguidePlumX LibguideSubmit an Item

    Statistics

    Display statistics

    Pipeline for Efficient Mapping of Transcription Factor Binding Sites and Comparison of Their Models

    • CSV
    • RefMan
    • EndNote
    • BibTex
    • RefWorks
    Thumbnail
    Name:
    Wail Ba Alawi Thesis.pdf
    Size:
    1.126Mb
    Format:
    PDF
    Description:
    PDF file
    Download
    Type
    Thesis
    Authors
    Ba Alawi, Wail cc
    Advisors
    Bajic, Vladimir B. cc
    Committee members
    Moshkov, Mikhail cc
    Zhang, Xiangliang cc
    Program
    Computer Science
    KAUST Department
    Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
    Date
    2011-06
    Permanent link to this record
    http://hdl.handle.net/10754/136709
    
    Metadata
    Show full item record
    Abstract
    The control of genes in every living organism is based on activities of transcription factor (TF) proteins. These TFs interact with DNA by binding to the TF binding sites (TFBSs) and in that way create conditions for the genes to activate. Of the approximately 1500 TFs in human, TFBSs are experimentally derived only for less than 300 TFs and only in generally limited portions of the genome. To be able to associate TF to genes they control we need to know if TFs will have a potential to interact with the control region of the gene. For this we need to have models of TFBS families. The existing models are not sufficiently accurate or they are too complex for use by ordinary biologists. To remove some of the deficiencies of these models, in this study we developed a pipeline through which we achieved the following: 1. Through a comparison analysis of the performance we identified the best models with optimized thresholds among the four different types of models of TFBS families. 2. Using the best models we mapped TFBSs to the human genome in an efficient way. The study shows that a new scoring function used with TFBS models based on the position weight matrix of dinucleotides with remote dependency results in better accuracy than the other three types of the TFBS models. The speed of mapping has been improved by developing a parallelized code and shows a significant speed up of 4x when going from 1 CPU to 8 CPUs. To verify if the predicted TFBSs are more accurate than what can be expected with the conventional models, we identified the most frequent pairs of TFBSs (for TFs E4F1 and ATF6) that appeared close to each other (within the distance of 200 nucleotides) over the human genome. We show unexpectedly that the genes that are most close to the multiple pairs of E4F1/ATF6 binding sites have a co-expression of over 90%. This indirectly supports our hypothesis that the TFBS models we use are more accurate and also suggests that the E4F1/ATF6 pair is exerting the control over these genes.
    Citation
    Ba Alawi, W. (2011). Pipeline for Efficient Mapping of Transcription Factor Binding Sites and Comparison of Their Models. KAUST Research Repository. https://doi.org/10.25781/KAUST-N98Q0
    DOI
    10.25781/KAUST-N98Q0
    ae974a485f413a2113503eed53cd6c53
    10.25781/KAUST-N98Q0
    Scopus Count
    Collections
    Theses; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

    entitlement

     
    DSpace software copyright © 2002-2021  DuraSpace
    Quick Guide | Contact Us | KAUST University Library
    Open Repository is a service hosted by 
    Atmire NV
     

    Export search results

    The export option will allow you to export the current search results of the entered query to a file. Different formats are available for download. To export the items, click on the button corresponding with the preferred download format.

    By default, clicking on the export buttons will result in a download of the allowed maximum amount of items. For anonymous users the allowed maximum amount is 50 search results.

    To select a subset of the search results, click "Selective Export" button and make a selection of the items you want to export. The amount of items that can be exported at once is similarly restricted as the full export.

    After making a selection, click one of the export format buttons. The amount of items that will be exported is indicated in the bubble next to export format.