Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology

Handle URI:
http://hdl.handle.net/10754/622919
Title:
Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology
Authors:
Heinson, Ashley; Gunawardana, Yawwani; Moesker, Bastiaan; Hume, Carmen; Vataga, Elena; Hall, Yper; Stylianou, Elena; McShane, Helen (0000-0002-2126-5142); Williams, Ann; Niranjan, Mahesan; Woelk, Christopher
Abstract:
Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future.
KAUST Department:
King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
Citation:
Heinson A, Gunawardana Y, Moesker B, Hume C, Vataga E, et al. (2017) Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology. International Journal of Molecular Sciences 18: 312. Available: http://dx.doi.org/10.3390/ijms18020312.
Publisher:
MDPI AG
Journal:
International Journal of Molecular Sciences
Issue Date:
1-Feb-2017
DOI:
10.3390/ijms18020312
Type:
Article
ISSN:
1422-0067
Sponsors:
This work was performed with the support of the IRIDIS High Performance Computing Facility and the Bioinformatics Core at the University of Southampton and was funded by a Marie Curie Career Integration Grant (CIG, PCIG13-GA2013-618334).
Additional Links:
http://www.mdpi.com/1422-0067/18/2/312
Appears in Collections:
Articles

Full metadata record

DC Field | Value | Language
dc.contributor.author | Heinson, Ashley | en
dc.contributor.author | Gunawardana, Yawwani | en
dc.contributor.author | Moesker, Bastiaan | en
dc.contributor.author | Hume, Carmen | en
dc.contributor.author | Vataga, Elena | en
dc.contributor.author | Hall, Yper | en
dc.contributor.author | Stylianou, Elena | en
dc.contributor.author | McShane, Helen | en
dc.contributor.author | Williams, Ann | en
dc.contributor.author | Niranjan, Mahesan | en
dc.contributor.author | Woelk, Christopher | en
dc.date.accessioned | 2017-02-26T06:34:20Z | -
dc.date.available | 2017-02-26T06:34:20Z | -
dc.date.issued | 2017-02-01 | en
dc.identifier.citation | Heinson A, Gunawardana Y, Moesker B, Hume C, Vataga E, et al. (2017) Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology. International Journal of Molecular Sciences 18: 312. Available: http://dx.doi.org/10.3390/ijms18020312. | en
dc.identifier.issn | 1422-0067 | en
dc.identifier.doi | 10.3390/ijms18020312 | en
dc.identifier.uri | http://hdl.handle.net/10754/622919 | -
dc.description.abstract | Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future. | en
dc.description.sponsorship | This work was performed with the support of the IRIDIS High Performance Computing Facility and the Bioinformatics Core at the University of Southampton and was funded by a Marie Curie Career Integration Grant (CIG, PCIG13-GA2013-618334). | en
dc.publisher | MDPI AG | en
dc.relation.url | http://www.mdpi.com/1422-0067/18/2/312 | en
dc.rights | This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. (CC BY 4.0). | en
dc.subject | Bacterial pathogen | en
dc.subject | Bacterial protective antigen | en
dc.subject | Machine learning | en
dc.subject | Reverse vaccinology | en
dc.subject | Support vector machine | en
dc.title | Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology | en
dc.type | Article | en
dc.contributor.department | King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia | en
dc.identifier.journal | International Journal of Molecular Sciences | en
dc.eprint.version | Publisher's Version/PDF | en
dc.contributor.institution | Faculty of Medicine, University of Southampton, Southampton, SO17 1BJ, United Kingdom | en
dc.contributor.institution | Thermo Fisher Scientific, Inchinnan Business Park, 3 Fountain Drive, Paisley, PA4 9RF, United Kingdom | en
dc.contributor.institution | London School of Hygiene and Tropical Medicine (LSHTM), Department of Pathogen Molecular Biology, London, WC1E 7HT, United Kingdom | en
dc.contributor.institution | iSolutions, University of Southampton, Southampton, SO17 1BJ, United Kingdom | en
dc.contributor.institution | Public Health England, National Infection Service, Porton Down, Salisbury, SP4 0JG, United Kingdom | en
dc.contributor.institution | The Jenner Institute, University of Oxford, Oxford, OX3 7DQ, United Kingdom | en
dc.contributor.institution | Department of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, United Kingdom | en
kaust.author | Hume, Carmen | en