Unifying the global coding sequence space enables the study of genes with unknown function across biomes
Type
Preprint
Authors
Vanni, Chiara
Schechter, Matthew S.

Acinas, Silvia G.
Barberán, Albert
Buttigieg, Pier Luigi
Casamayor, Emilio O.
Delmont, Tom O.
Duarte, Carlos M.

Eren, A. Murat

Finn, Robert D.
Kottmann, Renzo
Mitchell, Alex
Sanchez, Pablo

Siren, Kimmo

Steinegger, Martin

Glöckner, Frank Oliver
Fernandez-Guerra, Antonio

KAUST Department
Biological and Environmental Science and Engineering (BESE) Division
Marine Science Program
Red Sea Research Center (RSRC)
Date
2020-07-01
Permanent link to this record
http://hdl.handle.net/10754/667614
Abstract
One of the biggest challenges in molecular biology is bridging the gap between the known and the unknown coding sequence space. This challenge is especially extreme in microbial systems, where between 40% and 60% of the predicted genes are of unknown function. Discarding this uncharacterized fraction should not be an option anymore. Here, we present a conceptual framework and a computational workflow that bridges this gap and provides a powerful strategy to contextualize the investigations of genes of unknown function. Our approach partitions the coding sequence space removing the known-unknown dichotomy, unifies genomic and metagenomic data and provides a framework to expand those investigations across environments and organisms. By analyzing 415,971,742 genes predicted from 1,749 metagenomes and 28,941 bacterial and archaeal genomes we showcase our approach and its application in ecological, evolutionary and biotechnological investigations. As a result, we put into perspective the extent of the unknown fraction, its diversity, and its relevance in genomic and environmental contexts. By identifying a target gene of unknown function for antibiotic resistance, we demonstrate how a contextualized unknown coding sequence space enables the generation of hypotheses that can be used to augment experimental data.
Citation
Vanni, C., Schechter, M. S., Acinas, S. G., Barberán, A., Buttigieg, P. L., Casamayor, E. O., … Fernandez-Guerra, A. (2020). Unifying the global coding sequence space enables the study of genes with unknown function across biomes. doi:10.1101/2020.06.30.180448
Sponsors
The authors thankfully acknowledge the computer resources at MareNostrum and the technical support provided by Barcelona Supercomputing Center (RES-AECT-2014-2-0085), the BMBF-funded de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) (031A537B, 031A533A, 031A538A, 031A533B, 031A535A, 031A537C, 031A534A, 031A532B), the University of Oxford Advanced Research Computing (http://dx.doi.org/10.5281/zenodo.22558) and the MARBITS bioinformatics core at ICM-CSIC. CV was supported by the Max Planck Society. AFG received funding from the European Union’s Horizon 2020 research and innovation program Blue Growth: Unlocking the potential of Seas and Oceans under grant agreement no. 634486 (project acronym INMARE). AM was supported by the Biotechnology and Biological Sciences Research Council [BB/M011755/1, BB/R015228/1] and RDF by the European Molecular Biology Laboratory core funds. EOC was supported by project INTERACTOMA RTI2018-101205-B-I00 from the Spanish Agency of Science MICIU/AEI. SGA and PS received additional funding by the project MAGGY (CTM2017-87736-R) from the Spanish Ministry of Economy and Competitiveness. The Malaspina 2010 Expedition was supported by the Spanish Ministry of Economy and Competitiveness (MINECO) through the Consolider-Ingenio program (ref. CSD2008-00077). The authors thank Johannes Söding and Alex Bateman for helpful discussions.
Publisher
Cold Spring Harbor Laboratory
Additional Links
http://biorxiv.org/lookup/doi/10.1101/2020.06.30.180448
https://www.biorxiv.org/content/biorxiv/early/2020/11/18/2020.06.30.180448.full.pdf
Relations
Is Supplemented By:
- [Software] Title: functional-dark-side/vanni_et_al-figures: Code for the figures in Vanni et al. manuscript. Publication Date: 2020-07-03. github: functional-dark-side/vanni_et_al-figures Handle: 10754/667789
- [Software] Vanni, C., & Fernandez-Guerra, A. (2021). functional-dark-side/agnostos-wf: First release (Version v1.0) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.4557846. DOI: 10.5281/zenodo.4557846 Handle: 10754/667791
DOI
10.1101/2020.06.30.180448
Except where otherwise noted, this item's license is described as Archived with thanks to Cold Spring Harbor Laboratory