Protein Function Prediction Based on Sequence and Structure Information

Handle URI:
http://hdl.handle.net/10754/610703
Title:
Protein Function Prediction Based on Sequence and Structure Information
Authors:
Smaili, Fatima Z. (0000-0001-6439-0659)
Abstract:
The number of protein sequences available in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation, which is essential to our understanding of how biological systems and processes operate. In this master's thesis project, we infer protein function from the primary protein sequence. In our approach, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using both global and local similarities, against three independent Enzyme Commission (EC) and Gene Ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both the structure and function libraries. The results show that this method outperforms conventional prediction methods based on sequence similarity or threading. Additionally, the method could be improved further by incorporating protein-protein interaction information. Overall, it provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.
Advisors:
Gao, Xin (0000-0002-7108-3574)
Committee Member:
Arold, Stefan T. (0000-0001-5278-0668); Bajic, Vladimir B. (0000-0001-5435-4750)
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science
Program:
Computer Science
Issue Date:
25-May-2016
Type:
Thesis
Appears in Collections:
Theses; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.advisor | Gao, Xin | en
dc.contributor.author | Smaili, Fatima Z. | en
dc.date.accessioned | 2016-05-25T12:40:39Z | -
dc.date.available | 2016-05-25T12:40:39Z | -
dc.date.issued | 2016-05-25 | -
dc.identifier.uri | http://hdl.handle.net/10754/610703 | -
dc.description.abstract | (same text as the Abstract field above) | en
dc.language.iso | en | en
dc.subject | protein functions | en
dc.subject | protein structures | en
dc.subject | prediction | en
dc.subject | similarity search | en
dc.title | Protein Function Prediction Based on Sequence and Structure Information | en
dc.type | Thesis | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Computer Science | en
thesis.degree.grantor | King Abdullah University of Science and Technology | en_GB
dc.contributor.committeemember | Arold, Stefan T. | en
dc.contributor.committeemember | Bajic, Vladimir B. | en
thesis.degree.discipline | Computer Science | en
thesis.degree.name | Master of Science | en
dc.person.id | 132942 | en