For more information visit: https://cemse.kaust.edu.sa/cs

Recent Submissions

  • Transcriptomic analysis identifies organ-specific metastasis genes and pathways across different primary sites.

    Zhang, Lin; Fan, Ming; Napolitano, Francesco; Gao, Xin; Xu, Ying; Li, Lihua (Journal of translational medicine, Springer Science and Business Media LLC, 2021-01-08) [Article]
    Background: Metastasis is the most devastating stage of cancer progression and often shows a preference for specific organs. Methods: To reveal the mechanisms underlying organ-specific metastasis, we systematically analyzed gene expression profiles for three common metastasis sites across all available primary origins. A rank-based method was used to detect differentially expressed genes between metastatic tumor tissues and corresponding control tissues. For each metastasis site, the common differentially expressed genes across all primary origins were identified as organ-specific metastasis genes. Results: Pathways enriched by these genes reveal an interplay between the molecular characteristics of the cancer cells and those of the target organ. Specifically, the neuroactive ligand-receptor interaction pathway and HIF-1 signaling pathway were found to have prominent roles in adapting to the target organ environment in brain and liver metastases, respectively. Finally, the identified organ-specific metastasis genes and pathways were validated using a primary breast tumor dataset. Survival and cluster analysis showed that organ-specific metastasis genes and pathways tended to be expressed uniquely by a subgroup of patients having metastasis to the target organ, and were associated with the clinical outcome. Conclusions: Elucidating the genes and pathways underlying organ-specific metastasis may help to identify drug targets and develop treatment strategies to benefit patients.
  • FENet: A Frequency Extraction Network for Obstructive Sleep Apnea Detection

    Ye, Guanhua; Yin, Hongzhi; Chen, Tong; Chen, Hongxu; Cui, Lizhen; Zhang, Xiangliang (IEEE Journal of Biomedical and Health Informatics, IEEE, 2021) [Article]
    Obstructive Sleep Apnea (OSA) is a highly prevalent but inconspicuous disease that seriously jeopardizes the health of human beings. Polysomnography (PSG), the gold standard of detecting OSA, requires multiple specialized sensors for signal collection, hence patients have to physically visit hospitals and bear the costly treatment for a single detection. Recently, many single-sensor alternatives have been proposed to improve the cost efficiency and convenience. Among these methods, solutions based on RR-interval (i.e., the interval between two consecutive pulses) signals reach a satisfactory balance among comfort, portability and detection accuracy.
  • A Siamese neural network model for the prioritization of metabolic disorders by integrating real and simulated data.

    Messa, Gian Marco; Napolitano, Francesco; Elsea, Sarah H; di Bernardo, Diego; Gao, Xin (Bioinformatics (Oxford, England), Oxford University Press (OUP), 2020-12-31) [Article]
    Motivation: Untargeted metabolomic approaches hold great promise as a diagnostic tool for inborn errors of metabolism (IEMs) in the near future. However, the complexity of the involved data makes its application difficult and time consuming. Computational approaches, such as metabolic network simulations and machine learning, could significantly help to exploit metabolomic data to aid the diagnostic process. While the former suffers from limited predictive accuracy, the latter is normally able to generalize only to IEMs for which sufficient data are available. Here, we propose a hybrid approach that exploits the best of both worlds by building a mapping between simulated and real metabolic data through a novel method based on Siamese neural networks (SNN). Results: The proposed SNN model is able to perform disease prioritization for the metabolic profiles of IEM patients even for diseases that it was not trained to identify. To the best of our knowledge, this has not been attempted before. The developed model is able to significantly outperform a baseline model that relies on metabolic simulations only. The prioritization performances demonstrate the feasibility of the method, suggesting that the integration of metabolic models and data could significantly aid the IEM diagnosis process in the near future. Availability and implementation: Metabolic datasets used in this study are publicly available from the cited sources. The original data produced in this study, including the trained models and the simulated metabolic profiles, are also publicly available (Messa et al., 2020).
  • DFT Study of NO Reduction Process on Ag/γ-Al2O3 Catalyst: Some Aspects of Mechanism and Catalyst Structure

    Matulis, Vitaly E.; Ragoyja, Ekaterina G.; Ivashkevich, Oleg A.; Lyakhov, Dmitry; Michels, Dominik L. (The Journal of Physical Chemistry C, American Chemical Society (ACS), 2020-12-23) [Article]
    Catalysts based on Ag/γ-Al2O3 are promising systems for the practical implementation of catalytic NO reduction. Nevertheless, the mechanism and regularities of this process have still not been fully investigated. Herein, we present the results of a quantum-chemical study of the Ag/γ-Al2O3 catalyst surface and some aspects of the NO reduction mechanism on it. The proposed calculation methods using DFT and cluster models of the catalyst surface are compared and verified. The possible existence of small adsorbed neutral and cationic silver clusters on the surface of the catalyst is shown. It is demonstrated that NO adsorption on these clusters is energetically favorable, in the form of both monomers and dimers. A scheme of NO selective catalytic reduction (SCR) is proposed that explains the increased amount of the N2O side product on catalysts with a silver fraction of more than 2 wt %. The feasibility of this scheme is justified with calculated data. Some recommendations for decreasing the amount of N2O are developed.
  • Characterizing the Evasion Attackability of Multi-label Classifiers

    Yang, Zhuo; Han, Yufei; Zhang, Xiangliang (arXiv, 2020-12-17) [Preprint]
    Evasion attack in multi-label learning systems is an interesting, widely witnessed, yet rarely explored research topic. Characterizing the crucial factors determining the attackability of the multi-label adversarial threat is the key to interpreting the origin of the adversarial vulnerability and to understanding how to mitigate it. Our study is inspired by the theory of adversarial risk bound. We associate the attackability of a targeted multi-label classifier with the regularity of the classifier and the training data distribution. Beyond the theoretical attackability analysis, we further propose an efficient empirical attackability estimator via greedy label space exploration. It provides provable computational efficiency and approximation accuracy. Substantial experimental results on real-world datasets validate the unveiled attackability factors and the effectiveness of the proposed empirical attackability indicator.
  • Predicting functions of maize proteins using graph convolutional network

    Zhou, Guangjie; Wang, Jun; Zhang, Xiangliang; Guo, Maozu; Yu, Guoxian (BMC Bioinformatics, Springer Science and Business Media LLC, 2020-12-16) [Article]
    Background: Maize (Zea mays ssp. mays L.) is the most widely grown and highest-yield crop in the world, as well as an important model organism for fundamental research on the function of genes. The functions of Maize proteins are annotated using the Gene Ontology (GO), which has more than 40000 terms and organizes GO terms in a directed acyclic graph (DAG). It is a huge challenge to accurately annotate relevant GO terms to a Maize protein from such a large number of candidate GO terms. Some deep learning models have been proposed to predict protein function, but the effectiveness of these approaches is unsatisfactory. One major reason is that they inadequately utilize the GO hierarchy. Results: To use the knowledge encoded in the GO hierarchy, we propose a deep Graph Convolutional Network (GCN) based model (DeepGOA) to predict GO annotations of proteins. DeepGOA first quantifies the correlations (or edges) between GO terms and updates the edge weights of the DAG by leveraging GO annotations and hierarchy, then learns the semantic representation and latent inter-relations of GO terms by applying GCN on the updated DAG. Meanwhile, a Convolutional Neural Network (CNN) is used to learn the feature representation of amino acid sequences with respect to the semantic representations. After that, DeepGOA computes the dot product of the two representations, which enables the whole network to be trained end-to-end coherently. Extensive experiments show that DeepGOA can effectively integrate GO structural information and amino acid information, and then annotate proteins accurately. Conclusions: Experiments on the Maize PH207 inbred line and a Human protein sequence dataset show that DeepGOA outperforms the state-of-the-art deep learning based methods. The ablation study shows that GCN can employ the knowledge of GO and boost the performance. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=DeepGOA.
  • Improved StyleGAN Embedding: Where are the Good Latents?

    Zhu, Peihao; Abdal, Rameen; Qin, Yipeng; Wonka, Peter (arXiv, 2020-12-13) [Preprint]
    StyleGAN is able to produce photorealistic images almost indistinguishable from real ones. Embedding images into the StyleGAN latent space is not a trivial task due to the reconstruction quality and editing quality trade-off. In this paper, we first introduce a new normalized space to analyze the diversity and the quality of the reconstructed latent codes. This space can help answer the question of where good latent codes are located in latent space. Second, we propose a framework to analyze the quality of different embedding algorithms. Third, we propose an improved embedding algorithm based on our analysis. We compare our results with the current state-of-the-art methods and achieve a better trade-off between reconstruction quality and editing quality.
  • Exploring binary relations for ontology extension and improved adaptation to clinical text

    Slater, Luke T; Hoehndorf, Robert; Karwath, Andreas; Gkoutos, Georgios (Cold Spring Harbor Laboratory, 2020-12-05) [Preprint]
    Background: The controlled domain vocabularies provided by ontologies make them an indispensable tool for text mining. Ontologies also include semantic features in the form of taxonomy and axioms, which make annotated entities in text corpora useful for semantic analysis. Extending those semantic features may improve performance for characterisation and analytic tasks. Ontology learning techniques have previously been explored for novel ontology construction from text, though most recent approaches have focused on literature, with applications in information retrieval or human interaction tasks. We hypothesise that extension of existing ontologies using information mined from clinical narrative text may help to adapt those ontologies such that they better characterise those texts, and lead to improved classification performance. Results: We develop and present a framework for identifying new classes in text corpora, which can be integrated into existing ontology hierarchies. To do this, we employ the Stanford Open Information Extraction algorithm and integrate its implementation into the Komenti semantic text mining framework. To identify whether our approach leads to better characterisation of text, we present a case study, using the method to learn an adaptation to the Disease Ontology using text associated with a sample of 1,000 patient visits from the MIMIC-III critical care database. We use the adapted ontology to annotate and classify shared first diagnosis on patient visits with semantic similarity, revealing an improved performance over use of the base Disease Ontology on the set of visits the ontology was constructed from. Moreover, we show that the adapted ontology also improved performance for the same task over two additional unseen samples of 1,000 and 2,500 patient visits. Conclusions: We report a promising new method for ontology learning and extension from text. We demonstrate that we can successfully use the method to adapt an existing ontology to a textual dataset, improving its ability to characterise the dataset, and leading to improved analytic performance, even on unseen portions of the dataset.
  • Optimal correlation order in superresolution optical fluctuation microscopy

    Vlasenko, S.; Mikhalychev, A. B.; Karuseichyk, I. L.; Lyakhov, D. A.; Michels, Dominik L.; Mogilevtsev, D. (Physical Review A, American Physical Society (APS), 2020-12-04) [Article]
    Here, we show that superresolution optical fluctuation microscopy might not lead to ideally infinite superresolution enhancement as the order of the measured cumulants increases. Using information analysis for estimating error bounds on the determination of point-source positions, we show that the reachable precision per measurement might saturate as the order of the measured cumulants increases in the superresolution regime. In fact, there is an optimal correlation order beyond which there is practically no improvement for objects of three or more point sources. However, for objects of just two sources, one still has the intuitively expected resolution increase with the cumulant order.
  • Spatial and Hyperfine Characteristics of SiV– and SiV0 Color Centers in Diamond: DFT Simulation

    Nizovtsev, A. P.; Kilin, S. Ya; Pushkarchuk, A. L.; Kuten, S. A.; Poklonski, N. A.; Michels, Dominik L.; Lyakhov, Dmitry; Jelezko, F. (Semiconductors, Pleiades Publishing Ltd, 2020-12-04) [Article]
    Abstract: One of the most promising platforms for implementing quantum technologies is coupled electron-nuclear spins in diamond, in which the electrons of paramagnetic color centers play the role of “fast” qubits, while nuclear spins of nearby 13C atoms can store quantum information for a very long time due to their exceptionally high isolation from the environment. An essential prerequisite for high-fidelity spin manipulation in these systems with tailored control pulse sequences is complete knowledge of the hyperfine interactions. Developing this understanding for the negatively charged “silicon-vacancy” (SiV–) and neutral (SiV0) color centers is the primary goal of this article, in which we briefly present our recent results of computer simulation of the spatial and hyperfine characteristics of these SiV centers in the H-terminated cluster C128[SiV]H98, along with their comparison with available experimental data.
  • DFT Simulation of Electronic and Spin Properties of GeV– Color Center in Volume and Near-Surface of Nanodiamond for Temperature Sensor Applications

    Pushkarchuk, A. L.; Nizovtsev, A. P.; Kilin, S. Ya; Kuten, S. A.; Pushkarchuk, V. A.; Michels, Dominik L.; Lyakhov, Dmitry; Jelezko, F. (Semiconductors, Pleiades Publishing Ltd, 2020-12-04) [Article]
    Abstract: The “germanium-vacancy” (GeV) center in diamond can be used as a temperature sensor. The idea of GeV-based thermometry is based on optical measurements of the spectral shift of the zero-phonon line and its spectral width as a function of temperature changes. At the same time, the optical characteristics of a GeV center located near the surface could be modified by the formation of defect states in the band gap arising from surface impurities and dangling bonds. The electronic structure of the GeV center determines its optical properties. The goal of this study was to comparatively investigate the geometric characteristics and electronic structure of the GeV center in the volume and near the (100) surface of nanodiamond in the cluster approximation. It was shown for the first time that formation of an isolated dangling bond on the (100) diamond surface leads to formation of an unoccupied state in the band gap in the vicinity of 1 eV, which is located at a distance of 1.9 eV from the conduction band edge. This state in the band gap may influence the optical properties of GeV in diamond.
  • Geometric analysis of shape variability of lower jaws of prehistoric humans / Analyse géométrique de la variabilité de la forme des mâchoires inférieures des humains préhistoriques

    Ren, Jing; Wonka, Peter; Harihara, Gowtham; Ovsjanikov, Maks (Anthropologie (France), Elsevier BV, 2020-12) [Article]
    In this document we describe our method and the results obtained for comparing jaws of prehistoric humans. Our main goal was twofold: (1) establish a methodology for comparing the structure of 3D shapes of scans of jaws using geometric data analysis techniques, and (2) use this methodology for comparing and clustering individual objects according to their geometric similarity. Moreover, we also applied geometric modeling techniques to establish a “clean” version of the dataset, without significant artefacts present in the original data, such as large missing parts. We then applied our analysis techniques both on the original and the clean dataset in order to validate our comparison results. For our core task of shape comparison, we used recent state-of-the-art shape matching methods and we present similarity results using different comparison metrics.
  • Huffman Coding Based Encoding Techniques for Fast Distributed Deep Learning

    Gajjala, Rishikesh R.; Banchhor, Shashwat; Abdelmoniem, Ahmed M.; Dutta, Aritra; Canini, Marco; Kalnis, Panos (ACM, 2020-12) [Conference Paper]
    Distributed stochastic algorithms, equipped with gradient compression techniques, such as codebook quantization, are becoming increasingly popular and considered state-of-the-art in training large deep neural network (DNN) models. However, communicating the quantized gradients in a network requires efficient encoding techniques. For this, practitioners generally use Elias encoding-based techniques without considering their computational overhead or data-volume. In this paper, based on Huffman coding, we propose several lossless encoding techniques that exploit different characteristics of the quantized gradients during distributed DNN training. Then, we show their effectiveness on 5 different DNN models across three different data-sets, and compare them with classic state-of-the-art Elias-based encoding techniques. Our results show that the proposed Huffman-based encoders (i.e., RLH, SH, and SHS) can reduce the encoded data-volume by up to 5.1×, 4.32×, and 3.8×, respectively, compared to the Elias-based encoders.
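    As a rough illustration of why Huffman coding suits skewed quantized gradients, the sketch below builds a plain textbook Huffman code over a stream of quantization levels. It is not the RLH, SH, or SHS encoder from the paper; the function names and the example symbol distribution are assumptions made for illustration only.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each distinct symbol to a prefix-free bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the code words of all symbols into one bit string."""
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Invert encode(); works because the code is prefix-free."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Hypothetical ternary-quantized gradient: mostly zeros, few +1 / -1 entries.
gradient = [0] * 90 + [1] * 6 + [-1] * 4
code = huffman_code(gradient)
bits = encode(gradient, code)
```

    With this assumed distribution the frequent level 0 receives a one-bit code word, so the stream is shorter than a fixed two-bits-per-symbol encoding, and `decode(bits, code)` recovers the original levels losslessly.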
  • Exploring Entity Relationship in Pairwise Ranking: Adaptive Sampler and Beyond

    Yu, Lu (2020-12) [Dissertation]
    Advisor: Zhang, Xiangliang
    Committee members: Moshkov, Mikhail; Hoehndorf, Robert; Karypis, George
    Living in the booming age of information, we have to rely on powerful information retrieval tools, such as personalized search engines and recommendation systems, to seek the unique piece of desired knowledge from such a big data world. As one of the core components, a ranking model appears almost everywhere a relative order of desired/relevant entities is needed. Based on the most general and intuitive assumption that entities without user actions (e.g., clicks, purchases, comments) are of less interest than those with user actions, the objective function of pairwise ranking models is formulated by measuring the contrast between positive (with actions) and negative (without actions) entities. This contrastive relationship is the core of pairwise ranking models. The construction of these positive-negative pairs has great influence on the model inference accuracy. It is especially challenging to explore the entity relationships in heterogeneous information networks. In this thesis, we aim at advancing the development of the methodologies and principles of mining heterogeneous information networks through learning entity relations from a pairwise learning-to-rank optimization perspective. More specifically, we first show the connections between different relation learning objectives modified from different ranking metrics, including both pairwise and list-wise objectives. We prove that most popular ranking metrics can be optimized in the same lower bound. Secondly, we raise the class-imbalance problem imposed by entity relation comparison in ranking objectives, and prove that the class-imbalance problem can lead to frequency clustering and gradient vanishing problems. In response, we point out that developing a fast adaptive sampling method is essential to boost the pairwise ranking model. To model the entity dynamic dependency, we propose to unify the individual-level and union-level interactions, resulting in a multi-order attentive ranking model that improves preference inference from multiple views.
  • AdaBins: Depth Estimation using Adaptive Bins

    Bhat, Shariq Farooq; Alhashim, Ibraheem; Wonka, Peter (arXiv, 2020-11-28) [Preprint]
    We address the problem of estimating a high quality dense depth map from a single RGB input image. We start out with a baseline encoder-decoder convolutional neural network architecture and pose the question of how the global processing of information can help improve overall depth estimation. To this end, we propose a transformer-based architecture block that divides the depth range into bins whose center value is estimated adaptively per image. The final depth values are estimated as linear combinations of the bin centers. We call our new building block AdaBins. Our results show a decisive improvement over the state-of-the-art on several popular depth datasets across all metrics. We also validate the effectiveness of the proposed block with an ablation study and provide the code and corresponding pre-trained weights of the new state-of-the-art model.
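    The statement that depth values are linear combinations of adaptively placed bin centers can be sketched in a few lines. The abstract does not give the exact formulas, so the details below are assumptions: bin widths are normalized to partition the depth range, and the per-pixel combination weights come from a softmax. All function names are hypothetical.

```python
import math


def bin_centers(widths, d_min, d_max):
    """Turn per-image bin widths into bin centers that partition [d_min, d_max].

    Assumption: the widths are positive and are normalized to sum to one.
    """
    total = sum(widths)
    centers, left = [], d_min
    for w in widths:
        span = (d_max - d_min) * w / total
        centers.append(left + span / 2)
        left += span
    return centers


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def pixel_depth(logits, centers):
    """Depth of one pixel as a probability-weighted sum of the bin centers."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, centers))


# Hypothetical image with three bins over a 0 m to 8 m depth range.
centers = bin_centers([1.0, 1.0, 2.0], d_min=0.0, d_max=8.0)
depth = pixel_depth([0.0, 0.0, 0.0], centers)
```

    Because the weights are a convex combination, the predicted depth always stays within the range spanned by the bin centers, while the centers themselves can shift from image to image.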
  • Generative Layout Modeling using Constraint Graphs

    Para, Wamiq; Guerrero, Paul; Kelly, Tom; Guibas, Leonidas; Wonka, Peter (arXiv, 2020-11-26) [Preprint]
    We propose a new generative model for layout generation. We generate layouts in three steps. First, we generate the layout elements as nodes in a layout graph. Second, we compute constraints between layout elements as edges in the layout graph. Third, we solve for the final layout using constrained optimization. For the first two steps, we build on recent transformer architectures. The layout optimization implements the constraints efficiently. We show three practical contributions compared to the state of the art: our work requires no user input, produces higher quality layouts, and enables many novel capabilities for conditional layout generation.
  • Surface-only ferrofluids

    Huang, Libo; Michels, Dominik L. (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2020-11-26) [Article]
    We devise a novel surface-only approach for simulating the three dimensional free-surface flow of incompressible, inviscid, and linearly magnetizable ferrofluids. A Lagrangian velocity field is stored on a triangle mesh capturing the fluid's surface. The two key problems associated with the dynamic simulation of the fluid's interesting geometry are the magnetization process transitioning the fluid from a non-magnetic into a magnetic material, and the evaluation of magnetic forces. In this regard, our key observation is that for linearly incompressible ferrofluids, their magnetization and application of magnetic forces only require knowledge about the position of the fluids' boundary. Consequently, our approach employs a boundary element method solving the magnetization problem and evaluating the so-called magnetic pressure required for the force evaluation. The magnetic pressure is added to the Dirichlet boundary condition of a surface-only liquids solver carrying out the dynamical simulation. By only considering the fluid's surface in contrast to its whole volume, we end up with an efficient approach enabling more complex and realistic ferrofluids to be explored in the digital domain without compromising efficiency. Our approach allows for the use of physical parameters leading to accurate simulations as demonstrated in qualitative and quantitative evaluations.
  • Stormscapes: Simulating cloud dynamics in the now

    Hadrich, Torsten; Makowski, Miłosz; Pałubicki, Wojtek; Banuti, Daniel T.; Pirk, Sören; Michels, Dominik L. (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2020-11-26) [Article]
    The complex interplay of a number of physical and meteorological phenomena makes simulating clouds a challenging and open research problem. We explore a physically accurate model for simulating clouds and the dynamics of their transitions. We propose first-principle formulations for computing buoyancy and air pressure that allow us to simulate the variations of atmospheric density and varying temperature gradients. Our simulation allows us to model various cloud types, such as cumulus, stratus, and stratocumulus, and their realistic formations caused by changes in the atmosphere. Moreover, we are able to simulate large-scale cloud super cells - clusters of cumulonimbus formations - that are commonly present during thunderstorms. To enable the efficient exploration of these stormscapes, we propose a lightweight set of high-level parameters that allow us to intuitively explore cloud formations and dynamics. Our method allows us to simulate cloud formations of up to about 20 km × 20 km extents at interactive rates. We explore the capabilities of physically accurate and yet interactive cloud simulations by showing numerous examples and by coupling our model with atmosphere measurements of real-time weather services to simulate cloud formations in the now. Finally, we quantitatively assess our model with cloud fraction profiles, a common measure for comparing cloud types.
  • MapTree: Recovering multiple solutions in the space of maps

    Ren, Jing; Melzi, Simone; Ovsjanikov, Maks; Wonka, Peter (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2020-11-26) [Article]
    In this paper we propose an approach for computing multiple high-quality near-isometric dense correspondences between a pair of 3D shapes. Our method is fully automatic and does not rely on user-provided landmarks or descriptors. This allows us to analyze the full space of maps and extract multiple diverse and accurate solutions, rather than optimizing for a single optimal correspondence as done in most previous approaches. To achieve this, we propose a compact tree structure based on the spectral map representation for encoding and enumerating possible rough initializations, and a novel efficient approach for refining them to dense pointwise maps. This leads to a new method capable of both producing multiple high-quality correspondences across shapes and revealing the symmetry structure of a shape without a priori information. In addition, we demonstrate through extensive experiments that our method is robust and results in more accurate correspondences than state-of-the-art for shape matching and symmetry detection.
  • Screen-space blue-noise diffusion of monte carlo sampling error via hierarchical ordering of pixels

    Ahmed, Abdalla G. M.; Wonka, Peter (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2020-11-26) [Article]
    We present a novel technique for diffusing Monte Carlo sampling error as a blue noise in screen space. We show that automatic diffusion of sampling error can be achieved by ordering the pixels in a way that preserves locality, such as Morton's Z-ordering, and assigning the samples to the pixels from successive sub-sequences of a single low-discrepancy sequence, thus securing well-distributed samples for each pixel, local neighborhoods, and the whole image. We further show that a blue-noise distribution of the error is attainable by scrambling the Z-ordering to induce isotropy. We present an efficient technique to implement this hierarchical scrambling by defining a context-free grammar that describes infinite self-similar lookup trees. Our concept is scalable to arbitrary image resolutions, sample dimensions, and sample count, and supports progressive and adaptive sampling.
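    The basic, unscrambled version of the pixel-ordering idea can be sketched as follows: pixels are visited in Morton (Z) order, and each pixel takes the next consecutive block of a single low-discrepancy sequence. The sketch uses a 1D van der Corput sequence as a stand-in and omits the scrambling and grammar-based lookup described in the abstract; all function names are assumptions.

```python
def morton_index(x, y, bits=16):
    """Interleave the bits of x and y (x in the even positions) into a Z-order index."""
    idx = 0
    for i in range(bits):
        idx |= ((x >> i) & 1) << (2 * i)
        idx |= ((y >> i) & 1) << (2 * i + 1)
    return idx


def van_der_corput(n, base=2):
    """Radical inverse of n: a simple 1D low-discrepancy sequence in [0, 1)."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, rem = divmod(n, base)
        denom *= base
        q += rem / denom
    return q


def sample_ranges(width, height, spp):
    """Map each pixel to the index range of its samples in the global sequence.

    Pixels are ranked by Morton index, and the pixel of rank k receives the
    indices k * spp up to (k + 1) * spp - 1.
    """
    pixels = sorted(
        ((x, y) for y in range(height) for x in range(width)),
        key=lambda p: morton_index(p[0], p[1]),
    )
    return {p: range(rank * spp, (rank + 1) * spp) for rank, p in enumerate(pixels)}


# Hypothetical 2x2 image with two samples per pixel.
ranges = sample_ranges(2, 2, spp=2)
samples_10 = [van_der_corput(i) for i in ranges[(1, 0)]]
```

    Since Z-ordering keeps nearby pixels close in the ordering, neighboring pixels draw from adjacent sub-sequences, which is what lets one sequence stay well distributed per pixel, per neighborhood, and over the whole image.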
