    NNfold: RNA secondary structure prediction by deep learning

    Type
    Poster
    Authors
    Umarov, Ramzan
    Li, Yu
    Van Neste, Christophe
    Date
    2020-1-20
    Permanent link to this record
    http://hdl.handle.net/10754/661203
    
    Abstract
    RNA molecules have a plethora of functions within the cell. These functions can be divided into information-carrying, catalytic, and structural (scaffolding of other molecules) roles, or a combination of these. For catalytic and regulatory functions, the structure of the RNA molecule is pivotal, so predicting the structure into which it is most likely to fold is essential to fully understanding its biological role. In general, RNA extensively affects protein regulation through its control of gene expression, post-transcriptional modifications, and translational regulation. RNA secondary structure can be obtained by techniques such as X-ray diffraction and NMR. However, such experimental methods are still inefficient and expensive, so computational prediction algorithms remain widely used for predicting RNA secondary structures.
    Taking the raw sequence represented as a string, we first apply a one-hot encoding, which yields an encoded matrix of dimension L by 4, where L is the sequence length. The encoding then passes through two models, the local model and the global model, which extract local and global contact information, respectively. Regarding the local model, the input is two chunks of the raw encoding, whose dimensions are 20 by 20. We then concatenate those 20 by 20 chunk matrices into the L by L local contact information matrix. We used six 1D convolutional layers and one fully-connected layer to model the local information. In the global model, we use three 1D convolutional layers to predict whether a base can pair with any other base; its output is a vector of length L. In this vector, 1 means that the corresponding base may pair with another base and 0 means that it does not pair with any base.
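The one-hot encoding step can be illustrated with a minimal sketch. The column order A, C, G, U and the all-zero row for unrecognised symbols are assumptions made here for illustration; the abstract only states that the encoded matrix is L by 4.

```python
# Hypothetical helper, not taken from the NNfold code base.
# Assumption: columns are ordered A, C, G, U.
ALPHABET = "ACGU"


def one_hot(sequence):
    """Return an L x 4 matrix (list of rows) encoding an RNA sequence."""
    matrix = []
    for base in sequence.upper():
        # Assumption: symbols outside A/C/G/U (e.g. N) become an all-zero row.
        row = [1 if base == symbol else 0 for symbol in ALPHABET]
        matrix.append(row)
    return matrix


encoded = one_hot("GAUC")
# One row per nucleotide, four columns per row.
print(len(encoded), len(encoded[0]))
```

Each row contains a single 1 marking the nucleotide at that position, so the matrix has exactly one row per base of the input string.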
    To combine the local and global information, we convert the global vector into a symmetric L by L matrix and multiply it element-wise with the local contact matrix, enforcing the global constraint on the preliminary contact map. After this combination, the resulting contact map may still violate the two constraints mentioned above. We used a greedy sorting algorithm to resolve these conflicts.
    We introduce NNfold, a sequence-based deep learning method for predicting RNA secondary structure. Predictions are made in two steps. First, we construct a matrix of pairing likelihoods for each nucleotide by predicting all potential interactions with a convolutional deep learning model. Next, we refine the list of base pairs obtained from the matrix using a second model, whose output is used to ensure the validity of the final secondary structure. NNfold performed much better than thermodynamics-based methods on a diverse set of RNA sequences, improving the average F1 score by 0.20. It is also capable of predicting pseudoknots, which is a challenging task for other approaches.
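The combination of global and local information, followed by a greedy conflict resolution, might look roughly like the sketch below. The abstract does not spell out either step, so several details are assumptions: the symmetric global matrix is taken to be the outer product of the global vector with itself, the only constraint enforced is that each base joins at most one pair, and the 0.5 score threshold and all function names are hypothetical.

```python
def combine(local_scores, global_vec):
    """Mask an L x L local score matrix with a length-L global pairing vector."""
    n = len(global_vec)
    # Assumption: entry (i, j) of the symmetric global matrix is g[i] * g[j].
    return [
        [local_scores[i][j] * global_vec[i] * global_vec[j] for j in range(n)]
        for i in range(n)
    ]


def greedy_pairs(scores, threshold=0.5):
    """Pick base pairs greedily, highest score first (illustrative only)."""
    n = len(scores)
    candidates = [
        (scores[i][j], i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if scores[i][j] > threshold  # hypothetical cut-off
    ]
    candidates.sort(key=lambda c: c[0], reverse=True)
    used = set()
    pairs = []
    for _score, i, j in candidates:
        # Assumed constraint: a base may take part in at most one pair.
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return sorted(pairs)


# Toy symmetric local score matrix for a sequence of length 4.
local_scores = [
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.8, 0.6],
    [0.7, 0.8, 0.0, 0.1],
    [0.9, 0.6, 0.1, 0.0],
]
print(greedy_pairs(combine(local_scores, [1, 1, 1, 1])))
print(greedy_pairs(combine(local_scores, [1, 1, 1, 0])))
```

In the second call the global vector marks the last base as unpaired, so every candidate involving it is zeroed out before the greedy pass. This selection rule does not forbid crossing pairs, which is compatible with predicting pseudoknots, although the authors' actual procedure may differ.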
    Conference/Event name
    Digital Health 2020
    Additional Links
    https://epostersonline.com//dh2020/node/34
    Collections
    Digital Health 2020; Posters
