YamiPred: A novel evolutionary method for predicting pre-miRNAs and selecting relevant features
AuthorsKleftogiannis, Dimitrios A.
KAUST DepartmentComputer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Online Publication Date2015-01-23
Print Publication Date2015-09-01
Permanent link to this recordhttp://hdl.handle.net/10754/550520
MetadataShow full item record
AbstractMicroRNAs (miRNAs) are small non-coding RNAs, which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of Support Vector Machines (SVM) with Genetic Algorithms (GA) for feature selection and parameters optimization. YamiPred was tested in a new and realistic human dataset and was compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and of geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset has achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data has successfully predicted pre-miRNAs to other organisms including the category of viruses.
CitationYamiPred: A novel evolutionary method for predicting pre-miRNAs and selecting relevant features 2015:1 IEEE/ACM Transactions on Computational Biology and Bioinformatics