Show simple item record

dc.contributor.authorDimonaco, Nicholas
dc.contributor.authorCreevey, Chris
dc.contributor.authorHoehndorf, Robert
dc.contributor.authorKulmanov, Maxat
dc.contributor.authorLiuwei, Wang
dc.contributor.authorClare, Amanda
dc.contributor.authorAubrey, Wayne
dc.contributor.authorKenobi, Kim
dc.date.accessioned2019-04-25T06:59:00Z
dc.date.available2019-04-25T06:59:00Z
dc.date.issued2019-04-24
dc.identifier.citationDimonaco N, Creevey C, Hoehndorf R, Kulmanov M, Liuwei W, et al. (2019) Uncovering the dark matter of the metagenome one read at a time. Access Microbiology 1. Available: http://dx.doi.org/10.1099/acmi.ac2019.po0557.
dc.identifier.issn2516-8290
dc.identifier.doi10.1099/acmi.ac2019.po0557
dc.identifier.urihttp://hdl.handle.net/10754/631990
dc.description.abstractContemporary metagenomic annotation methods have proven insufficient in our attempts to better understand the complex environments around us. We call the yet-to-be-annotated part of a metagenome its ‘dark matter’. The Gene Ontology (GO) is a hierarchical vocabulary used to describe gene product function, and a large collection of curated genes with GO annotations already exists. DeepGO utilises deep learning to build models from these curated genes and gene products to predict GO categories for novel proteins. One of the major problems with metagenomic studies today is the process of assembling the environmental DNA sequences into their original genomes. This is difficult, and chimeric metagenomically assembled genomes are common. To avoid this problem, along with the computational and time expense of assembly, we have modified DeepGO to perform protein function prediction directly from sequence reads with limited protein coding sequence prediction. Three independent models were trained: (1) on the first 50 amino acids of each protein; (2) on the last 50 amino acids of each protein; and (3) on a phasing window of 50 amino acids applied across the entirety of each protein sequence. These models were chosen to learn from the different parts of a protein sequence that we are likely to capture from short, unassembled sequence reads alone. We compared the three models using a mock metagenomic community consisting of six model bacterial genomes. We evaluated the functions predicted from the unassembled sequence reads against those predicted from the protein coding sequences of the assembled metagenome.
dc.publisherMicrobiology Society
dc.titleUncovering the dark matter of the metagenome one read at a time
dc.typePoster
dc.contributor.departmentComputer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.departmentComputer Science Program
dc.contributor.departmentComputational Bioscience Research Center (CBRC)
dc.identifier.journalAccess Microbiology
dc.contributor.institutionAberystwyth University, Aberystwyth, United Kingdom
dc.contributor.institutionQueen’s University Belfast, Belfast, United Kingdom
kaust.personHoehndorf, Robert
kaust.personKulmanov, Maxat
kaust.personLiuwei, Wang
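
The three training schemes described in the abstract (first 50 amino acids, last 50 amino acids, and a 50-residue window across the whole protein) can be illustrated with a minimal Python sketch. This is not the authors' code and is not part of DeepGO: the function names, the `step` parameter, and the handling of proteins shorter than 50 residues are assumptions made for illustration, and the abstract's "phasing window" is interpreted here as a fixed-length window moved along the sequence.

```python
from typing import List

FRAGMENT_LENGTH = 50  # fragment size in amino acids, as stated in the abstract


def first_fragment(protein: str, length: int = FRAGMENT_LENGTH) -> str:
    """Return the first `length` residues (the whole sequence if it is shorter)."""
    return protein[:length]


def last_fragment(protein: str, length: int = FRAGMENT_LENGTH) -> str:
    """Return the last `length` residues (the whole sequence if it is shorter)."""
    return protein[-length:]


def window_fragments(
    protein: str, length: int = FRAGMENT_LENGTH, step: int = 1
) -> List[str]:
    """Return fixed-length windows taken every `step` residues.

    `step` is a hypothetical parameter; the abstract does not state how far
    the window moves between fragments. Sequences no longer than `length`
    are returned as a single fragment (also an assumption).
    """
    if len(protein) <= length:
        return [protein]
    return [
        protein[start:start + length]
        for start in range(0, len(protein) - length + 1, step)
    ]


if __name__ == "__main__":
    # Toy 60-residue sequence built from the 20 standard amino acid letters.
    toy_protein = "ACDEFGHIKLMNPQRSTVWY" * 3
    print(first_fragment(toy_protein))
    print(last_fragment(toy_protein))
    print(len(window_fragments(toy_protein, step=5)))
```

Under these assumptions, each scheme would produce its own set of 50-residue training fragments from the same curated proteins, which reflects the abstract's rationale that a short read may capture only the start, the end, or an interior portion of a protein.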

