DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm

Handle URI:
http://hdl.handle.net/10754/346688
Title:
DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm
Authors:
Soufan, Othman (0000-0002-4410-1853); Kleftogiannis, Dimitrios A. (0000-0003-1086-821X); Kalnis, Panos (0000-0002-5060-1360); Bajic, Vladimir B. (0000-0001-5435-4750)
Abstract:
Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
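The wrapper idea summarized in the abstract can be illustrated with a minimal sketch. This is not the DWFS implementation: the toy data, the nearest-centroid classifier, the function names (`make_data`, `centroid_accuracy`, `fitness`, `ga_select`), and the weights `w_acc` and `w_size` are all assumptions chosen for illustration. Each candidate is a bit mask over the features, and its fitness trades hold-out accuracy against the fraction of features kept. The sketch evaluates candidates serially; because each fitness evaluation is independent, a parallel variant could distribute them across workers.

```python
import random

def make_data(n=200, n_features=10, seed=0):
    """Toy data (assumption): only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 3.0 * label
        row[1] -= 3.0 * label
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_te, y_te, mask):
    """Hold-out accuracy of a nearest-centroid classifier on the masked features."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # no features selected: treat as useless
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [sum(r[i] for r in rows) / len(rows) for i in idx]
    correct = 0
    for x, t in zip(X_te, y_te):
        dist = {c: sum((x[i] - m) ** 2 for i, m in zip(idx, cents[c])) for c in cents}
        correct += (min(dist, key=dist.get) == t)
    return correct / len(y_te)

def fitness(mask, data, w_acc=1.0, w_size=0.2):
    """Weighted trade-off (hypothetical weights): reward accuracy, penalize subset size."""
    X_tr, y_tr, X_te, y_te = data
    acc = centroid_accuracy(X_tr, y_tr, X_te, y_te, mask)
    return w_acc * acc - w_size * sum(mask) / len(mask)

def ga_select(data, n_features, pop_size=20, generations=30, p_mut=0.05, seed=1):
    """Serial GA: tournament selection, uniform crossover, bit-flip mutation, elitism."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(m, data))
    for _ in range(generations):
        scored = [(fitness(m, data), m) for m in pop]

        def pick():
            # Binary tournament: the fitter of two random candidates wins.
            a, b = rng.sample(scored, 2)
            return (a if a[0] >= b[0] else b)[1]

        children = [best[:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=lambda m: fitness(m, data))
    return best

X, y = make_data()
data = (X[:100], y[:100], X[100:], y[100:])  # simple train/test split
best = ga_select(data, n_features=10)
selected = [i for i, bit in enumerate(best) if bit]
print("selected features:", selected, "fitness:", round(fitness(best, data), 3))
```

Raising `w_size` in this sketch pushes the search toward smaller feature subsets, while raising `w_acc` favors accuracy; this mirrors, in a simplified way, the adjustable fitness weights mentioned in the abstract.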
KAUST Department:
Computational Bioscience Research Center (CBRC); Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Citation:
DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm 2015, 10 (2):e0117988 PLOS ONE
Publisher:
Public Library of Science (PLoS)
Journal:
PLoS ONE
Issue Date:
26-Feb-2015
DOI:
10.1371/journal.pone.0117988
PubMed ID:
25719748
PubMed Central ID:
PMC4342225
Type:
Article
ISSN:
1932-6203
Additional Links:
http://dx.plos.org/10.1371/journal.pone.0117988
Appears in Collections:
Articles; Computational Bioscience Research Center (CBRC); Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Soufan, Othman | en
dc.contributor.author | Kleftogiannis, Dimitrios A. | en
dc.contributor.author | Kalnis, Panos | en
dc.contributor.author | Bajic, Vladimir B. | en
dc.date.accessioned | 2015-03-16T05:59:05Z | en
dc.date.available | 2015-03-16T05:59:05Z | en
dc.date.issued | 2015-02-26 | en
dc.identifier.citation | DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm 2015, 10 (2):e0117988 PLOS ONE | en
dc.identifier.issn | 1932-6203 | en
dc.identifier.pmid | 25719748 | en
dc.identifier.doi | 10.1371/journal.pone.0117988 | en
dc.identifier.uri | http://hdl.handle.net/10754/346688 | en
dc.description.abstract | Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs. | en
dc.publisher | Public Library of Science (PLoS) | en
dc.relation.url | http://dx.plos.org/10.1371/journal.pone.0117988 | en
dc.rights | This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. | en
dc.title | DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm | en
dc.type | Article | en
dc.contributor.department | Computational Bioscience Research Center (CBRC) | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.identifier.journal | PLoS ONE | en
dc.identifier.pmcid | PMC4342225 | en
dc.eprint.version | Publisher's Version/PDF | en
dc.contributor.affiliation | King Abdullah University of Science and Technology (KAUST) | en
kaust.author | Soufan, Othman | en
kaust.author | Kleftogiannis, Dimitrios A. | en
kaust.author | Kalnis, Panos | en
kaust.author | Bajic, Vladimir B. | en

