BLProt: Prediction of bioluminescent proteins based on support vector machine and relieff feature selection
Authors: Kandaswamy, Krishna Kumar
Hazrati, Mehrnaz Khodam
Permanent link to this record: http://hdl.handle.net/10754/325467
Abstract:

Background: Bioluminescence is a process in which light is emitted by a living organism. Most creatures that emit light are sea creatures, but some insects, plants, fungi, etc. also emit light. The biotechnological application of bioluminescence has become routine and is considered essential for many medical and general technological advances. Identifying bioluminescent proteins is challenging because of their low sequence similarity. So far, no specific method has been reported to identify bioluminescent proteins from primary sequence.

Results: In this paper, we propose a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained on a dataset consisting of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated on an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches: ReliefF, infogain, and mRMR. We selected five different feature subsets by decreasing the number of features, and the performance of each feature subset was evaluated.

Conclusion: BLProt achieves 80% accuracy in training (5-fold cross-validation) and 80.06% accuracy in testing. The performance of BLProt was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggest that BLProt can be a useful approach to identify bioluminescent proteins from sequence information, irrespective of their sequence similarity.

© 2011 Kandaswamy et al; licensee BioMed Central Ltd.
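The abstract names ReliefF as one of the filter methods used to rank features. The following is a minimal sketch of a ReliefF-style weighting for a two-class problem, written in plain Python. It is not the authors' implementation: the function names (`relieff_weights`, `top_features`), the neighbour count `k`, the range-normalised Manhattan distance, the choice to visit every instance instead of a random sample, and the toy data are all assumptions made for illustration.

```python
def relieff_weights(X, y, k=5):
    """Score features with a ReliefF-style update for two classes.

    X is a list of numeric feature vectors, y the matching class labels.
    A higher weight means the feature separates the classes better.
    """
    n, d = len(X), len(X[0])
    # Per-feature ranges normalise differences to [0, 1].
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):  # assumption: visit every instance once
        others = sorted((t for t in range(n) if t != i),
                        key=lambda t: dist(X[i], X[t]))
        hits = [t for t in others if y[t] == y[i]][:k]    # nearest same-class
        misses = [t for t in others if y[t] != y[i]][:k]  # nearest other-class
        for j in range(d):
            if hits:    # penalise features that differ among near hits
                weights[j] -= sum(diff(X[i], X[t], j) for t in hits) / (n * len(hits))
            if misses:  # reward features that differ among near misses
                weights[j] += sum(diff(X[i], X[t], j) for t in misses) / (n * len(misses))
    return weights


def top_features(weights, count):
    """Return the indices of the `count` highest-weighted features."""
    return sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:count]


# Hypothetical toy data: feature 0 tracks the class, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.1, 0.6],
     [0.9, 0.2], [1.0, 0.8], [0.8, 0.5], [0.9, 0.4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
weights = relieff_weights(X, y, k=2)
selected = top_features(weights, 1)
```

In a pipeline like the one the abstract describes, the selected feature indices would then define the reduced feature subset passed to the SVM. The classifier and cross-validation steps are omitted here.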
Citation: Kandaswamy K, Pugalenthi G, Hazrati M, Kalies K-U, Martinetz T (2011) BLProt: prediction of bioluminescent proteins based on support vector machine and relieff feature selection. BMC Bioinformatics 12: 345. doi:10.1186/1471-2105-12-345.
PubMed Central ID: PMC3176267
Except where otherwise noted, this item's license is described as: "This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited."
- Prediction of bioluminescent proteins using auto covariance transformation of evolutional profiles.
- Authors: Zhao X, Li J, Huang Y, Ma Z, Yin M
- Issue date: 2012
- Prediction of apoptosis protein locations with genetic algorithms and support vector machines through a new mode of pseudo amino acid composition.
- Authors: Kandaswamy KK, Pugalenthi G, Möller S, Hartmann E, Kalies KU, Suganthan PN, Martinetz T
- Issue date: 2010 Dec
- Identification of functionally diverse lipocalin proteins from sequence information using support vector machine.
- Authors: Pugalenthi G, Kandaswamy KK, Suganthan PN, Archunan G, Sowdhamini R
- Issue date: 2010 Aug
- Prediction of bioluminescent proteins by using sequence-derived features and lineage-specific scheme.
- Authors: Zhang J, Chai H, Yang G, Ma Z
- Issue date: 2017 Jun 5
- Seminal quality prediction using data mining methods.
- Authors: Sahoo AJ, Kumar Y
- Issue date: 2014