    Fragment Graphical Variational AutoEncoding for Screening Molecules with Small Data

    Name:
    Preprintfile1.pdf
    Size:
    1.063Mb
    Format:
    PDF
    Description:
    Pre-print
    Type
    Preprint
    Authors
    Armitage, John
    Spalek, Leszek J.
    Nguyen, Malgorzata
    Nikolka, Mark
    Jacobs, Ian E.
    Marañón, Lorena
    Nasrallah, Iyad
    Schweicher, Guillaume
    Dimov, Ivan
    Simatos, Dimitrios
    McCulloch, Iain
    Nielsen, Christian B.
    Conduit, Gareth
    Sirringhaus, Henning
    KAUST Department
    Chemical Science Program
    KAUST Solar Center (KSC)
    Physical Science and Engineering (PSE) Division
    Date
    2019-10-21
    Permanent link to this record
    http://hdl.handle.net/10754/660689
    
    Abstract
    In the majority of molecular optimization tasks, predictive machine learning (ML) models are limited due to the unavailability and cost of generating big experimental datasets on the specific task. To circumvent this limitation, ML models are trained on big theoretical datasets or experimental indicators of molecular suitability that are either publicly available or inexpensive to acquire. These approaches produce a set of candidate molecules which have to be ranked using limited experimental data or expert knowledge. Under the assumption that structure is related to functionality, here we use a molecular fragment-based graphical autoencoder to generate unique structural fingerprints to efficiently search through the candidate set. We demonstrate that fragment-based graphical autoencoding reduces the error in predicting physical characteristics such as the solubility and partition coefficient in the small-data regime compared to other extended circular fingerprints and string-based approaches. We further demonstrate that this approach is capable of providing insight into real-world molecular optimization problems, such as searching for stabilization additives in organic semiconductors by accurately predicting 92% of test molecules given 69 training examples. This task is a model example of black-box molecular optimization as there is minimal theoretical and experimental knowledge to accurately predict the suitability of the additives.
    Sponsors
    J.W.A. acknowledges doctoral support from the Canadian Centennial Scholarship Fund, Christ's College Cambridge and FlexEnable. L.J.S. is grateful to Marcin Abram for insightful discussions. I.E.J. acknowledges funding from a Royal Society Newton International Fellowship. G.S. acknowledges postdoctoral fellowship support from the Wiener-Anspach Foundation and The Leverhulme Trust (Early Career Fellowship supported by the Isaac Newton Trust). I.D. acknowledges NanoDTC ERC and Cambridge Philosophical Society.
    Publisher
    arXiv
    arXiv
    1910.13325
    Additional Links
    https://arxiv.org/pdf/1910.13325
    Collections
    Preprints; Physical Science and Engineering (PSE) Division; Chemical Science Program; KAUST Solar Center (KSC)
