    Mining a database of single amplified genomes from Red Sea brine pool extremophiles - improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA).

    Name:
    Article - Frontiers in Microbiology - Mining a database of single amplified genomes - 2013.pdf
    Size:
    1.947Mb
    Format:
    PDF
    Description:
    Article - Full Text
    Name:
    Data Sheet 1.DOCX
    Size:
    6.647Mb
    Format:
    Microsoft Word 2007
    Description:
    Supplemental Data Sheet 1
    Name:
    Data Sheet 2.DOCX
    Size:
    85.91Kb
    Format:
    Microsoft Word 2007
    Description:
    Supplemental Data Sheet 2
    Name:
    Data Sheet 3.DOCX
    Size:
    138.8Kb
    Format:
    Microsoft Word 2007
    Description:
    Supplemental Data Sheet 3
    Name:
    Data Sheet 4.DOCX
    Size:
    151.5Kb
    Format:
    Microsoft Word 2007
    Description:
    Supplemental Data Sheet 4
    Name:
    Data Sheet 5.DOCX
    Size:
    139.8Kb
    Format:
    Microsoft Word 2007
    Description:
    Supplemental Data Sheet 5
    Type
    Article
    Authors
    Grötzinger, Stefan W.
    Alam, Intikhab
    Ba Alawi, Wail
    Bajic, Vladimir B.
    Stingl, Ulrich
    Eppinger, Jörg
    KAUST Department
    Applied Mathematics and Computational Science Program
    Biological & Organometallic Catalysis Laboratories
    Biological and Environmental Sciences and Engineering (BESE) Division
    Bioscience Program
    Chemical Science Program
    Computational Bioscience Research Center (CBRC)
    Computer Science Program
    Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
    KAUST Catalysis Center (KCC)
    Marine Microbial Ecology Research Group
    Marine Science Program
    Physical Science and Engineering (PSE) Division
    Red Sea Research Center (RSRC)
    Date
    2014-04-07
    Permanent link to this record
    http://hdl.handle.net/10754/323510
    Abstract
    Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and the poor homology of novel extremophiles' genomes pose significant challenges for attributing functions to the identified coding sequences. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO). Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO)-terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO-terms and 49 consensus patterns. A subset of INDIGO sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO-terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO-terms and/or consensus patterns) are present.
Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.
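
    The classification rule stated in the abstract (a hit counts as reliable if at least two relevant descriptors, GO-terms and/or consensus patterns, are present) can be illustrated with a minimal sketch. This is not the authors' script, which is distributed through the INDIGO website; the class, the function names, and the identifiers such as "GO:demo_hydrolase" are hypothetical placeholders, and the "stringent reliability filtering" step mentioned in the abstract is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedGene:
    """One annotated gene; all field names are assumptions for this sketch."""
    gene_id: str
    go_terms: frozenset = frozenset()      # InterPro-derived GO-terms (profile side)
    prosite_ids: frozenset = frozenset()   # annotated PROSITE IDs (pattern side)


def relevant_descriptors(gene, target_go, target_prosite):
    """Return the gene's descriptors that match the target lists."""
    profile_hits = gene.go_terms & target_go          # profile filter
    pattern_hits = gene.prosite_ids & target_prosite  # pattern filter
    return profile_hits | pattern_hits


def reliable_genes(genes, target_go, target_prosite, min_descriptors=2):
    """Keep genes carrying at least `min_descriptors` relevant descriptors."""
    reliable = {}
    for gene in genes:
        hits = relevant_descriptors(gene, target_go, target_prosite)
        if len(hits) >= min_descriptors:
            reliable[gene.gene_id] = sorted(hits)
    return reliable


# Placeholder identifiers, not real GO-terms or PROSITE IDs.
TARGET_GO = frozenset({"GO:demo_hydrolase", "GO:demo_esterase"})
TARGET_PROSITE = frozenset({"PS:demo_hydrolase_motif"})

demo_genes = [
    AnnotatedGene("sag1_gene1",
                  frozenset({"GO:demo_hydrolase"}),
                  frozenset({"PS:demo_hydrolase_motif"})),
    AnnotatedGene("sag1_gene2", frozenset({"GO:demo_hydrolase"})),
    AnnotatedGene("sag2_gene1",
                  frozenset({"GO:unrelated"}),
                  frozenset({"PS:unrelated"})),
    AnnotatedGene("sag2_gene2",
                  frozenset({"GO:demo_hydrolase", "GO:demo_esterase"})),
]

result = reliable_genes(demo_genes, TARGET_GO, TARGET_PROSITE)
```

    In this toy data set, a gene with only one matching GO-term is rejected, while a gene with one GO-term plus one pattern, or with two GO-terms, is retained, following the "GO-terms and/or consensus patterns" wording of the abstract.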
    Citation
    Grötzinger SW, Alam I, Ba Alawi W, Bajic VB, Stingl U, et al. (2014) Mining a database of single amplified genomes from Red Sea brine pool extremophiles -- improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA). Frontiers in Microbiology 5. doi:10.3389/fmicb.2014.00134.
    Publisher
    Frontiers Media SA
    Journal
    Frontiers in Microbiology
    DOI
    10.3389/fmicb.2014.00134
    PubMed ID
    24778629
    PubMed Central ID
    PMC3985023
    Additional Links
    http://journal.frontiersin.org/Journal/10.3389/fmicb.2014.00134/abstract
    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3985023/
    Collections
    Articles; Biological and Environmental Science and Engineering (BESE) Division; Red Sea Research Center (RSRC); Bioscience Program; Marine Science Program; Applied Mathematics and Computational Science Program; Physical Science and Engineering (PSE) Division; Computer Science Program; Chemical Science Program; KAUST Catalysis Center (KCC); Computational Bioscience Research Center (CBRC); Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division


    Related articles

    • INDIGO - INtegrated data warehouse of microbial genomes with examples from the red sea extremophiles.
    • Authors: Alam I, Antunes A, Kamau AA, Ba Alawi W, Kalkatawi M, Stingl U, Bajic VB
    • Issue date: 2013
    • The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation.
    • Authors: Yu C, Zavaljevski N, Desai V, Johnson S, Stevens FJ, Reifman J
    • Issue date: 2008 Jan 25
    • Issue date: 2004
    • Insights into Red Sea Brine Pool Specialized Metabolism Gene Clusters Encoding Potential Metabolites for Biotechnological Applications and Extremophile Survival.
    • Authors: Ziko L, Adel M, Malash MN, Siam R
    • Issue date: 2019 May 8
    • [Analysis, identification and correction of some errors of model refseqs appeared in NCBI Human Gene Database by in silico cloning and experimental verification of novel human genes].
    • Authors: Zhang DL, Ji L, Li YD
    • Issue date: 2004 May