dc.contributor.advisor: Bajic, Vladimir B.
dc.contributor.author: Islam, Elaf J.
dc.date.accessioned: 2018-11-28T12:04:50Z
dc.date.available: 2018-11-28T12:04:50Z
dc.date.issued: 2018-11-28
dc.identifier.citation: Islam, E. J. (2018). Prediction of Active and Inactive Chemical Compounds from High-Throughput Assays. KAUST Research Repository. https://doi.org/10.25781/KAUST-4Y962
dc.identifier.doi: 10.25781/KAUST-4Y962
dc.identifier.uri: http://hdl.handle.net/10754/630092
dc.description.abstract: This study considers chemical compounds that exert their activity by interacting with a target protein or other molecular receptor. Our aim is to develop machine learning models that predict whether a chemical compound will be active in a particular assay. We will use assay data from the PubChem knowledgebase, specifically its BioAssays segment, which reports the results of many high-throughput screening experiments. A single assay may test hundreds or even thousands of chemicals, and its results record both the chemicals that are active and those that are inactive in the assay, which makes these experimental data well suited to classification. We will approach the problem by evaluating different ways of describing chemical compounds numerically by means of so-called fingerprints, and then applying different machine learning (ML) and deep learning (DL) models to classify active and inactive chemicals for a number of assays. We will compare the types of ML/DL models and the types of fingerprint features comprehensively and determine which combinations of model and fingerprint best distinguish active from inactive compounds in single PubChem assays. We will evaluate the methods across 10 assays and examine the effects of 11 types of fingerprints, for example PubChem fingerprints and MACCS keys fingerprints. So far, we have performed 88 experiments for each dataset and 968 in total for all 10 PubChem assays. These experiments involved approximately 6,000 interactions between chemicals and their targets.
The project was implemented in MATLAB. Based on these and additional experiments, we will be in a position to propose which combination of fingerprints and ML/DL models works best for the above-mentioned task. Such models will be useful for predicting the activity of chemicals that have not yet been tested.
dc.language.iso: en
dc.subject: machine learning
dc.subject: bioinformatics
dc.subject: prediction chemical
dc.subject: compounds activities
dc.title: Prediction of Active and Inactive Chemical Compounds from High-Throughput Assays
dc.type: Thesis
dc.contributor.department: Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
thesis.degree.grantor: King Abdullah University of Science and Technology
dc.contributor.committeemember: Laleg-Kirati, Taous-Meriem
dc.contributor.committeemember: Moshkov, Mikhail
thesis.degree.discipline: Computer Science
thesis.degree.name: Master of Science
refterms.dateFOA: 2018-11-28T12:04:51Z


Files in this item

Name: Thesis_final_submision.pdf
Size: 2.895Mb
Format: PDF
