PPI-IRO: A two-stage method for protein-protein interaction extraction based on interaction relation ontology

Abstract
Mining Protein-Protein Interactions (PPIs) from the fast-growing biomedical literature has proven to be an effective approach for identifying biological regulatory networks. This paper presents a novel method based on the idea of an Interaction Relation Ontology (IRO), which specifies and organises the words that describe the various interaction relationships between proteins. Our method extracts PPIs in two stages. First, IRO is applied in a binary classifier to determine whether a sentence contains a relation. Then, IRO is used to guide PPI extraction by building a dependency parse tree for each sentence. Comprehensive quantitative evaluations and detailed analyses demonstrate the significant performance of IRO on relation sentence classification and PPI extraction. Our PPI extraction method yielded a recall of around 80% and 90% and an F1 of around 54% and 66% on the AIMed and BioInfer corpora, respectively, which is superior to most existing extraction methods. Copyright © 2014 Inderscience Enterprises Ltd.
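
The abstract reports recall and F1 but not precision. Since F1 is the harmonic mean of precision and recall, the precision implied by the reported figures can be estimated by solving F1 = 2PR / (P + R) for P. The sketch below does this arithmetic. The function name and the dictionary are illustrative only, and because the abstract gives its figures as "around" values, the results (roughly 41% for AIMed and 52% for BioInfer) are rough estimates rather than numbers reported by the authors.

```python
def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * recall")
    return f1 * recall / denom


# Approximate (recall, F1) pairs as stated in the abstract.
approx_scores = {"AIMed": (0.80, 0.54), "BioInfer": (0.90, 0.66)}

# Hypothetical helper output: estimated precision per corpus.
implied = {name: implied_precision(r, f) for name, (r, f) in approx_scores.items()}

for name, p in implied.items():
    print(f"{name}: implied precision ~ {p:.2f}")
```

Under these assumptions the method appears to favour recall over precision on both corpora.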

Citation
Li, C. X., Chen, P., Wang, R. J., Wang, X. J., Su, Y. R., & Li, J. (2014). PPI-IRO: a two-stage method for protein-protein interaction extraction based on interaction relation ontology. International Journal of Data Mining and Bioinformatics, 10(1), 98. doi:10.1504/ijdmb.2014.062890

Acknowledgements
We thank the anonymous reviewers for their constructive comments on the paper. This research was supported in part by the National Natural Science Foundation of China (No. 60774096) and the National Key Technology R&D Program of China (No. 2008BAK49B05). This work was also supported in part by the National Science Foundation of China (No. 60803107).

Publisher
Inderscience Publishers

Journal
International Journal of Data Mining and Bioinformatics

DOI
10.1504/IJDMB.2014.062890

Permanent link to this record