    Parallel motif extraction from very long sequences

    Type
    Conference Paper
    Authors
    Sahli, Majed
    Mansour, Essam
    Kalnis, Panos
    KAUST Department
    Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
    Computer Science Program
    Date
    2013
    Permanent link to this record
    http://hdl.handle.net/10754/564651
    
    Abstract
    Motifs are frequent patterns used to identify biological functionality in genomic sequences, periodicity in time series, or user trends in web logs. In contrast to much existing work, which focuses on collections of many short sequences, modern applications require mining motifs in one very long sequence (i.e., on the order of several gigabytes). For this case, there exist statistical approaches that are fast but inaccurate, and combinatorial methods that are sound and complete. Unfortunately, existing combinatorial methods are serial and very slow. Consequently, they are limited to very short sequences (i.e., a few megabytes), small alphabets (typically 4 symbols for DNA sequences), and restricted types of motifs. This paper presents ACME, a combinatorial method for extracting motifs from a single very long sequence. ACME arranges the search space in contiguous blocks that take advantage of the cache hierarchy in modern architectures, and achieves almost an order of magnitude performance gain in serial execution. It also decomposes the search space in a way that allows scalability to thousands of processors with more than 90% speedup. ACME is the only method that: (i) scales to gigabyte-long sequences; (ii) handles large alphabets; (iii) supports interesting types of motifs with minimal additional cost; and (iv) is optimized for a variety of architectures such as multi-core systems, clusters in the cloud, and supercomputers. ACME reduces the extraction time for an exact-length query from 4 hours to 7 minutes on a typical workstation; handles sequences 3 orders of magnitude longer; and scales up to 16,384 cores on a supercomputer. Copyright is held by the owner/author(s).
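To make the notion of an exact-length motif query concrete, the following is a minimal sketch of a naive serial baseline. It is not ACME and does not reflect the paper's data structures or parallel decomposition. It assumes exact matching only (no mismatches allowed), and the names `frequent_exact_motifs`, `length`, and `min_support` are hypothetical, chosen for illustration.

```python
from collections import Counter


def frequent_exact_motifs(sequence: str, length: int, min_support: int) -> dict[str, int]:
    """Return every substring of the given length that occurs at least
    min_support times in a single sequence (overlapping occurrences count).

    Assumption: exact matching only. This is a naive baseline for
    illustration, not the ACME method.
    """
    if length <= 0 or length > len(sequence):
        return {}
    # Slide a fixed-size window over the sequence and count each substring.
    counts = Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )
    # Keep only the substrings that meet the frequency threshold.
    return {motif: n for motif, n in counts.items() if n >= min_support}


result = frequent_exact_motifs("ACGTACGTAC", length=3, min_support=2)
print(result)
```

This sketch stores one counter entry per distinct window, so its memory use grows with the number of distinct substrings; that cost is one reason a naive approach does not carry over to gigabyte-long sequences or large alphabets.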
    Citation
    Sahli, M., Mansour, E., & Kalnis, P. (2013). Parallel motif extraction from very long sequences. Proceedings of the 22nd ACM International Conference on Information & Knowledge Management - CIKM ’13. doi:10.1145/2505515.2505575
    Publisher
    Association for Computing Machinery (ACM)
    Journal
    Proceedings of the 22nd ACM International Conference on Information & Knowledge Management - CIKM '13
    Conference/Event name
    22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
    ISBN
    9781450322638
    DOI
    10.1145/2505515.2505575
    Collections
    Conference Papers; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
