Authors: Muttakin, Md Nurul
Embargo End Date: 2024-05-11
Permanent link to this record: http://hdl.handle.net/10754/691654
Access Restrictions: At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis will become available to the public after the expiration of the embargo on 2024-05-11.
Abstract: Machine learning models such as AlphaFold can predict a protein's 3D conformation from its primary sequence with near-experimental accuracy, which has prompted a wave of research on predicting protein function from 3D structure. Almost all of these works use graph neural networks (GNNs) to learn the 3D structure of proteins from 2D contact maps (graphs). Most of them also use rich 1D features, such as ESM and LSTM embeddings, in addition to the contact graph. These rich 1D features obscure how much the GNN itself is learning. In this thesis, we evaluate how well graph convolutional networks (GCNs) learn from contact map graphs in the existing framework, and we attempt to incorporate distance information to improve predictive performance. We found that, without language models, GCNs fall far short of a 1D-CNN, even with distance information. Consequently, we further investigate whether GCNs can distinguish the subgraph patterns corresponding to InterPro domains. We found that GCNs recognize these structural patterns better than an MLP trained on rich sequence embeddings. Finally, we investigate the capability of GCNs to predict GO terms (functions) individually. We found that GCNs perform almost on par in identifying GO terms when only hard positive and hard negative examples are present. We also identified some GO terms that neither GCNs nor ESM2-based MLP models can distinguish. These findings raise new research questions for future work.
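The idea of learning from a contact map graph with distance information can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `contact_graph` and `gcn_layer`, the 10 Å cutoff, the exponential distance weighting, and the toy coordinates are all assumptions chosen for illustration. It builds a contact map from residue coordinates and applies one symmetric-normalized graph convolution step.

```python
import numpy as np

def contact_graph(coords, threshold=10.0):
    """Return (binary contact map, distance-weighted map) for residue coordinates.

    threshold is a hypothetical cutoff in angstroms; the thesis may use another.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances via broadcasting: shape (n, n).
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Residues closer than the cutoff are in contact (diagonal gives self-loops).
    adj = (dist < threshold).astype(float)
    # Assumed weighting: closer contacts get larger edge weights.
    weights = adj * np.exp(-dist / threshold)
    return adj, weights

def gcn_layer(adj, features, weight):
    """One graph convolution step: ReLU(D^-1/2 A D^-1/2 X W)."""
    deg = adj.sum(axis=1)               # positive because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)

# Toy example: four residues on a line, 6 angstroms apart.
coords = [[0, 0, 0], [6, 0, 0], [12, 0, 0], [18, 0, 0]]
adj, weights = contact_graph(coords)
rng = np.random.default_rng(0)
features = np.eye(4)                    # one-hot node features, no language model
hidden = gcn_layer(weights, features, rng.normal(size=(4, 2)))
```

Passing `weights` instead of `adj` to `gcn_layer` is one simple way to inject distance information into the propagation step. One-hot features stand in for the setting without language-model embeddings.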
Citation: Muttakin, Md Nurul. (2023). Learning 3D structures for protein function prediction [KAUST Research Repository]. https://doi.org/10.25781/KAUST-VXC7P