DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm
Type
Article
KAUST Department
Computational Bioscience Research Center (CBRC)
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Computer Science Program
Applied Mathematics and Computational Science Program
Date
2015-02-26
Permanent link to this record
http://hdl.handle.net/10754/346688
Abstract
Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
Citation
DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm 2015, 10 (2):e0117988 PLOS ONE
Publisher
Public Library of Science (PLoS)
Journal
PLoS ONE
PubMed ID
25719748
PubMed Central ID
PMC4342225
Additional Links
http://dx.plos.org/10.1371/journal.pone.0117988
DOI
10.1371/journal.pone.0117988
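The wrapper approach summarized in the abstract (a GA searching over feature subsets, with a weighted fitness that trades classification performance against subset size) can be illustrated with a minimal sketch. This is not the DWFS implementation: the toy data, the nearest-centroid classifier, the weights `w_acc` and `w_size`, the GA operators, and all function names are assumptions chosen for illustration. A thread pool stands in for the parallel evaluation of candidates; for CPU-bound work in Python a process pool would be the more realistic choice.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def make_data(rng, n=60, informative=2, noise=8):
    """Toy data (assumption): first `informative` features track the class, rest are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [label * 3.0 + rng.gauss(0, 0.5) for _ in range(informative)]
        row += [rng.gauss(0, 1.0) for _ in range(noise)]
        X.append(row)
        y.append(label)
    split = 2 * n // 3
    return X[:split], y[:split], X[split:], y[split:]

def centroid_accuracy(mask, X_train, y_train, X_test, y_test):
    """Hold-out accuracy of a nearest-centroid classifier on the masked features."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    cents = {}
    for c in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    def predict(x):
        return min(cents, key=lambda c: sum((x[j] - m) ** 2
                                            for j, m in zip(idx, cents[c])))
    return sum(predict(x) == t for x, t in zip(X_test, y_test)) / len(y_test)

def fitness(mask, data, w_acc=1.0, w_size=0.2):
    """Weighted fitness (hypothetical form): reward accuracy, penalize subset size."""
    return w_acc * centroid_accuracy(mask, *data) - w_size * sum(mask) / len(mask)

def select_features(data, n_features, rng, pop_size=20, generations=15, p_mut=0.1):
    """Simple GA: truncation selection, one-point crossover, bit-flip mutation."""
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    with ThreadPoolExecutor() as pool:
        for _ in range(generations):
            # Candidates are scored concurrently (stand-in for a parallel GA).
            scores = list(pool.map(lambda m: fitness(m, data), pop))
            ranked = [m for _, m in sorted(zip(scores, pop),
                                           key=lambda p: p[0], reverse=True)]
            parents = ranked[: pop_size // 2]  # the fitter half survives unchanged
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n_features)
                child = a[:cut] + b[cut:]
                children.append([bit ^ (rng.random() < p_mut) for bit in child])
            pop = parents + children
        scores = list(pool.map(lambda m: fitness(m, data), pop))
    return max(zip(scores, pop), key=lambda p: p[0])[1]

if __name__ == "__main__":
    rng = random.Random(0)
    data = make_data(rng)
    best = select_features(data, n_features=10, rng=rng)
    print("selected feature indices:", [j for j, bit in enumerate(best) if bit])
    print("fitness:", round(fitness(best, data), 3))
```

Raising `w_size` in this sketch pushes the search toward smaller subsets, while raising `w_acc` favors accuracy; this mirrors, in a simplified way, the adjustable weights the abstract mentions.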
Related articles
- A novel feature selection approach for biomedical data classification.
- Authors: Peng Y, Wu Z, Jiang J
- Issue date: 2010 Feb
- Using machine learning techniques and genomic/proteomic information from known databases for defining relevant features for PPI classification.
- Authors: Urquiza JM, Rojas I, Pomares H, Herrera J, Florido JP, Valenzuela O, Cepero M
- Issue date: 2012 Jun
- Feature selection for splice site prediction: a new method using EDA-based feature ranking.
- Authors: Saeys Y, Degroeve S, Aeyels D, Rouzé P, Van de Peer Y
- Issue date: 2004 May 21
- Filter versus wrapper gene selection approaches in DNA microarray domains.
- Authors: Inza I, Larrañaga P, Blanco R, Cerrolaza AJ
- Issue date: 2004 Jun
- A scalable memetic algorithm for simultaneous instance and feature selection.
- Authors: García-Pedrajas N, de Haro-García A, Pérez-Rodríguez J
- Issue date: 2014 Spring