Recent Submissions

  • Non-Linear Anisotropic Diffusion for Memory-Efficient Computed Tomography Super-Resolution Reconstruction

    Abu Jbara, Khaled F.; Idoughi, Ramzi; Heidrich, Wolfgang (IEEE, 2022-01-06) [Conference Paper]
    Despite the impressive performance of Computed Tomography (CT) hardware, there is still a need to push the boundaries of the CT spatial resolution. Super-resolution techniques have been widely used in computer vision to enhance the resolution of 2D and 3D images. They have also been introduced to improve the CT volume resolution. In this work, we propose a flexible framework that produces a higher-resolution 3D volume from low-resolution 2D projections. This framework can be applied to any CT data regardless of the original physical scale and regardless of the target application. It is based on the regularization by denoising (RED) approach, where a Non-Linear Anisotropic Diffusion filter is used as the denoiser. We demonstrate our framework on both simulated and captured data, and show good quality reconstruction and a huge memory-footprint improvement in comparison to the state-of-the-art algorithm.
  • Channel Characterization of IRS-based Visible Light Communication Systems

    Abdelhady, Amr Mohamed Abdelaziz; Amin, Osama; Salem, Ahmed Sultan; Alouini, Mohamed-Slim; Shihada, Basem (Accepted by IEEE Transactions on Communications, Accepted by IEEE, 2022) [Article]
    This paper studies the temporal characteristics of the intelligent reflecting surface (IRS)-based visible light communication (VLC) channel using radiometric concepts. Throughout this study, we account for the delays experienced by the transmitted power along the continuum of paths originating at the source, passing through the IRS, and reaching the detector. Then, we derive the impulse response of multi-element phase-tunable metasurface and orientation-tunable mirror array-based reflector setups for a general setting of source, reflector, and detector dimensions and relative positions. In addition, we derive simpler expressions for two special cases, namely, the point source and the large-source small-reflector cases. Moreover, we present the exact expression for the delay spread and derive lower and upper bounds as well as asymptotic expressions as the number of reflecting elements increases for both reflector types. Finally, we study the impact of several system parameters on the temporal characterization of the two IRS-based VLC systems.
  • Semantic guide for semi-supervised few-shot multi-label node classification

    Xiao, Lin; Xu, Pengyu; Jing, Liping; Akujuobi, Uchenna Thankgod; Zhang, Xiangliang (Information Sciences, Elsevier BV, 2022-01) [Article]
    We study a new research problem named semi-supervised few-shot multi-label node classification, which has the following characteristics: 1) the extreme imbalance between the number of labeled and unlabeled nodes that are connected on graphs (handled by semi-supervised node learning); 2) the few labeled nodes per label (few-shot learning); and 3) the semantic correlations among labels, since they share the same subsets of nodes (multi-label classification). In this paper, we propose a Label-Aware Representation Network (LARN) model to tackle this problem, by taking advantage of the semantic knowledge of labels to characterize nodes and their neighbors. Such a label-aware feature learning process allows a node to prepare its representation by knowing how it will be classified. The learned rich representations can thus combat the scarcity of labeled training nodes. A label correlation scanner is then proposed to adaptively capture the label correlation and extract the useful information to generate the final node representation. Experimental results demonstrate that LARN consistently outperforms the state-of-the-art methods by significant margins, especially when only a few labeled nodes are available.
  • DeepSVP: Integration of genotype and phenotype for structural variant prioritization using deep learning

    Althagafi, Azza Th.; Alsubaie, Lamia; Kathiresan, Nagarajan; Mineta, Katsuhiko; Aloraini, Taghrid; Almutairi, Fuad; Alfadhel, Majid; Gojobori, Takashi; Alfares, Ahmad; Hoehndorf, Robert (Bioinformatics, Oxford University Press (OUP), 2021-12-24) [Article]
    Motivation: Structural genomic variants account for much of human variability and are involved in several diseases. Structural variants are complex and may affect coding regions of multiple genes, or affect the functions of genomic regions in different ways from single nucleotide variants. Interpreting the phenotypic consequences of structural variants relies on information about gene functions, haploinsufficiency or triplosensitivity, and other genomic features. Phenotype-based methods for identifying variants that are involved in genetic diseases combine molecular features with prior knowledge about the phenotypic consequences of altering gene functions. While phenotype-based methods have been applied successfully to single nucleotide variants as well as short insertions and deletions, the complexity of structural variants makes it more challenging to link them to phenotypes. Furthermore, structural variants can affect a large number of coding regions, and phenotype information may not be available for all of them. Results: We developed DeepSVP, a computational method to prioritize structural variants involved in genetic diseases by combining genomic and gene function information. We incorporate phenotypes linked to genes, functions of gene products, gene expression in individual cell types, and anatomical sites of expression, and systematically relate them to their phenotypic consequences through ontologies and machine learning. DeepSVP significantly improves the success rate of finding causative variants in several benchmarks and can identify novel pathogenic structural variants in consanguineous families.
  • Protein-RNA interaction prediction with deep learning: structure matters.

    Wei, Junkang; Chen, Siyuan; Zong, Licheng; Gao, Xin; Li, Yu (Briefings in bioinformatics, Oxford University Press (OUP), 2021-12-21) [Article]
    Protein-RNA interactions are of vital importance to a variety of cellular activities. Both experimental and computational techniques have been developed to study the interactions. Because of the limitation of the previous database, especially the lack of protein structure data, most of the existing computational methods rely heavily on the sequence data, with only a small portion of the methods utilizing the structural information. Recently, AlphaFold has revolutionized the entire protein and biology field. Foreseeably, the protein-RNA interaction prediction will also be promoted significantly in the upcoming years. In this work, we give a thorough review of this field, surveying both the binding site and binding preference prediction problems and covering the commonly used datasets, features and models. We also point out the potential challenges and opportunities in this field. This survey summarizes the development of the RNA-binding protein-RNA interaction field in the past and foresees its future development in the post-AlphaFold era.
  • Accelerating bioactive peptide discovery via mutual information-based meta-learning.

    He, Wenjia; Jiang, Yi; Jin, Junru; Li, Zhongshen; Zhao, Jiaojiao; Manavalan, Balachandran; Su, Ran; Gao, Xin; Wei, Leyi (Briefings in bioinformatics, Oxford University Press (OUP), 2021-12-10) [Article]
    Recently, machine learning methods have been developed to identify various peptide bioactivities. However, due to the lack of experimentally validated peptides, machine learning methods cannot provide a sufficiently trained model, easily resulting in poor generalizability. Furthermore, there is no generic computational framework to predict the bioactivities of different peptides. Thus, a natural question is whether we can use limited samples to build an effective predictive model for different kinds of peptides. To address this question, we propose Mutual Information Maximization Meta-Learning (MIMML), a novel meta-learning-based predictive model for bioactive peptide discovery. Using few samples from various functional peptides, MIMML can sufficiently learn the discriminative information amongst various functions and characterize functional differences. Experimental results show excellent performance of MIMML despite using far fewer training samples than the state-of-the-art methods. We also decipher the latent relationships among different kinds of functions to understand what the meta-model learned to improve a specific task. In summary, this study is a pioneering work in the field of functional peptide mining and provides the first-of-its-kind solution for few-sample learning problems in biological sequence analysis, accelerating the discovery of new functional peptides. The source code and datasets are available.
  • Intuitive and efficient roof modeling for reconstruction and synthesis

    Ren, Jing; Zhang, Biao; Wu, Bojian; Huang, Jianqiang; Fan, Lubin; Ovsjanikov, Maks; Wonka, Peter (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2021-12-10) [Article]
    We propose a novel and flexible roof modeling approach that can be used for constructing planar 3D polygon roof meshes. Our method uses a graph structure to encode roof topology and enforces the roof validity by optimizing a simple but effective planarity metric we propose. This approach is significantly more efficient than using general purpose 3D modeling tools such as 3ds Max or SketchUp, and more powerful and expressive than specialized tools such as the straight skeleton. Our optimization-based formulation is also flexible and can accommodate different styles and user preferences for roof modeling. We showcase two applications. The first application is an interactive roof editing framework that can be used for roof design or roof reconstruction from aerial images. We highlight the efficiency and generality of our approach by constructing a mesh-image paired dataset consisting of 2539 roofs. Our second application is a generative model to synthesize new roof meshes from scratch. We use our novel dataset to combine machine learning and our roof optimization techniques, by using transformers and graph convolutional networks to model roof topology, and our roof optimization methods to enforce the planarity constraint.
  • Ships, splashes, and waves on a vast ocean

    Huang, Libo; Qu, Ziyin; Tan, Xun; Zhang, Xinxin; Michels, Dominik L.; Jiang, Chenfanfu (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2021-12-10) [Article]
    The simulation of large open water surfaces is challenging using a uniform volumetric discretization of the Navier-Stokes equations. Simulating water splashes near moving objects, which height field methods for water waves cannot capture, necessitates high resolutions. Such simulations can be carried out using the Fluid-Implicit-Particle (FLIP) method. However, the FLIP method is not efficient for long-lasting water waves that propagate over long distances, which require sufficient depth for a correct dispersion relationship. This paper presents a new method to tackle this dilemma through an efficient hybridization of volumetric and surface-based advection-projection discretizations. We design a hybrid time-stepping algorithm that combines a FLIP domain and an adaptively remeshed Boundary Element Method (BEM) domain for the incompressible Euler equations. The resulting framework captures the detailed water splashes near moving objects with the FLIP method, and produces convincing water waves with correct dispersion relationships at modest additional costs.
  • CLIP2StyleGAN: Unsupervised Extraction of StyleGAN Edit Directions

    Abdal, Rameen; Zhu, Peihao; Femiani, John; Mitra, Niloy J.; Wonka, Peter (arXiv, 2021-12-09) [Preprint]
    The success of StyleGAN has enabled unprecedented semantic editing capabilities, on both synthesized and real images. However, such editing operations are either trained with semantic supervision or described using human guidance. In another development, the CLIP architecture has been trained with internet-scale image and text pairings and has been shown to be useful in several zero-shot learning settings. In this work, we investigate how to effectively link the pretrained latent spaces of StyleGAN and CLIP, which in turn allows us to automatically extract semantically labeled edit directions from StyleGAN, finding and naming meaningful edit operations without any additional human guidance. Technically, we propose two novel building blocks: one for finding interesting CLIP directions and one for labeling arbitrary directions in CLIP latent space. The setup does not assume any pre-determined labels and hence we do not require any additional supervised text/attributes to build the editing framework. We evaluate the effectiveness of the proposed method and demonstrate that extraction of disentangled labeled StyleGAN edit directions is indeed possible, and reveals interesting and non-trivial edit directions.
  • Effects of Negation and Uncertainty Stratification on Text-Derived Patient Profile Similarity.

    Slater, Luke T; Karwath, Andreas; Hoehndorf, Robert; Gkoutos, Georgios (Frontiers in digital health, Frontiers Media SA, 2021-12-06) [Article]
    Semantic similarity is a useful approach for comparing patient phenotypes, and has the potential to be an effective method for exploiting text-derived phenotypes for differential diagnosis, text and document classification, and outcome prediction. While approaches for context disambiguation are commonly used in text mining applications, forming a standard component of information extraction pipelines, their effects on semantic similarity calculations have not been widely explored. In this work, we evaluate how inclusion and exclusion of negated and uncertain mentions of concepts from text-derived phenotypes affect similarity of patients, and the use of those profiles to predict diagnosis. We report on the effectiveness of these approaches and report a very small, yet significant, improvement in performance when classifying primary diagnosis over MIMIC-III patient visits.
  • Snapshot HDR Video Construction Using Coded Mask

    Alghamdi, Masheal; Fu, Qiang; Thabet, Ali Kassem; Heidrich, Wolfgang (arXiv, 2021-12-05) [Preprint]
    This paper studies the reconstruction of High Dynamic Range (HDR) video from snapshot-coded LDR video. Constructing an HDR video requires restoring the HDR values for each frame and maintaining the consistency between successive frames. HDR image acquisition from single image capture, also known as snapshot HDR imaging, can be achieved in several ways. For example, the reconfigurable snapshot HDR camera is realized by introducing an optical element into the optical stack of the camera, namely by placing a coded mask at a small standoff distance in front of the sensor. A high-quality HDR image can be recovered from the captured coded image using deep learning methods. This study utilizes 3D-CNNs to perform joint demosaicking, denoising, and HDR video reconstruction from coded LDR video. We enforce more temporally consistent HDR video reconstruction by introducing a temporal loss function that considers the short-term and long-term consistency. The obtained results are promising and could lead to affordable HDR video capture using conventional cameras.
  • Barbershop

    Zhu, Peihao; Abdal, Rameen; Femiani, John; Wonka, Peter (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2021-12) [Article]
    Seamlessly blending features from multiple images is extremely challenging because of complex relationships in lighting, geometry, and partial occlusion which cause coupling between different parts of the image. Even though recent work on GANs enables synthesis of realistic hair or faces, it remains difficult to combine them into a single, coherent, and plausible image rather than a disjointed set of image patches. We present a novel solution to image blending, particularly for the problem of hairstyle transfer, based on GAN-inversion. We propose a novel latent space for image blending which is better at preserving detail and encoding spatial information, and propose a new GAN-embedding algorithm which is able to slightly modify images to conform to a common segmentation mask. Our novel representation enables the transfer of the visual properties from multiple reference images including specific details such as moles and wrinkles, and because we do image blending in a latent space we are able to synthesize images that are coherent. Our approach avoids blending artifacts present in other approaches and finds a globally consistent image. Our results demonstrate a significant improvement over the current state of the art in a user study, with users preferring our blending solution over 95 percent of the time. Source code for the new approach is available.
  • Learning to reconstruct botanical trees from single images

    Li, Bosheng; Kałużny, Jacek; Klein, Jonathan; Michels, Dominik L.; Pałubicki, Wojtek; Benes, Bedrich; Pirk, Sören (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2021-12) [Article]
    We introduce a novel method for reconstructing the 3D geometry of botanical trees from single photographs. Faithfully reconstructing a tree from single-view sensor data is a challenging and open problem because many possible 3D trees exist that fit the tree's shape observed from a single view. We address this challenge by defining a reconstruction pipeline based on three neural networks. The networks simultaneously mask out trees in input photographs, identify a tree's species, and obtain its 3D radial bounding volume, our novel 3D representation for botanical trees. Radial bounding volumes (RBV) are used to orchestrate a procedural model primed on learned parameters to grow a tree that matches the main branching structure and the overall shape of the captured tree. While the RBV allows us to faithfully reconstruct the main branching structure, we use the procedural model's morphological constraints to generate realistic branching for the tree crown. This constrains the number of solutions of tree models for a given photograph of a tree. We show that our method reconstructs various tree species even when the trees are captured in front of complex backgrounds. Moreover, although our neural networks have been trained on synthetic data with data augmentation, we show that our pipeline performs well for real tree photographs. We evaluate the reconstructed geometries with several metrics, including leaf area index and maximum radial tree distances.
  • Weatherscapes: nowcasting heat transfer and water continuity

    Herrera, Jorge Alejandro Amador; Hadrich, Torsten; Pałubicki, Wojtek; Banuti, Daniel T.; Pirk, Sören; Michels, Dominik L. (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2021-12) [Article]
    Due to the complex interplay of various meteorological phenomena, simulating weather is a challenging and open research problem. In this contribution, we propose a novel physics-based model that enables simulating weather at interactive rates. By considering atmosphere and pedosphere we can define the hydrologic cycle - and consequently weather - in unprecedented detail. Specifically, our model captures different warm and cold clouds, such as mammatus, hole-punch, multi-layer, and cumulonimbus clouds as well as their dynamic transitions. We also model different precipitation types, such as rain, snow, and graupel by introducing a comprehensive microphysics scheme. The Wegener-Bergeron-Findeisen process is incorporated into our Kessler-type microphysics formulation covering ice crystal growth occurring in mixed-phase clouds. Moreover, we model the water run-off from the ground surface, the infiltration into the soil, and its subsequent evaporation back to the atmosphere. We account for daily temperature changes, as well as heat transfer between pedosphere and atmosphere leading to a complex feedback loop. Our framework enables us to interactively explore various complex weather phenomena. Our results are assessed visually and validated by simulating weatherscapes for various setups covering different precipitation events and environments, by showcasing the hydrologic cycle, and by reproducing common effects such as Foehn winds. We also provide quantitative evaluations creating high-precipitation cumulonimbus clouds by prescribing atmospheric conditions based on infrared satellite observations. With our model we can generate dynamic 3D scenes of weatherscapes with high visual fidelity and even nowcast real weather conditions as simulations by streaming weather data into our framework.
  • msRepDB: a comprehensive repetitive sequence database of over 80 000 species.

    Liao, Xingyu; Hu, Kang; Salhi, Adil; Zou, You; Wang, Jianxin; Gao, Xin (Nucleic acids research, Oxford University Press (OUP), 2021-12-01) [Article]
    Repeats are prevalent in the genomes of all bacteria, plants and animals, and they cover nearly half of the human genome. They play indispensable roles in evolution, inheritance, variation and genomic instability, and serve as substrates for chromosomal rearrangements that include disease-causing deletions, inversions, and translocations. Comprehensive identification, classification and annotation of repeats in genomes can provide accurate and targeted solutions towards understanding and diagnosis of complex diseases, optimization of plant properties and development of new drugs. RepBase and Dfam are the two most frequently used repeat databases, but they are not sufficiently complete. Due to the lack of a comprehensive repeat database of multiple species, the current research in this field is far from being satisfactory. LongRepMarker is a new framework developed recently by our group for comprehensive identification of genomic repeats. We here propose msRepDB based on LongRepMarker, which is currently the most comprehensive multi-species repeat database, covering >80 000 species. Comprehensive evaluations show that msRepDB contains more species, and more complete repeats and families than the RepBase and Dfam databases.
  • Integral Methods for Versatile Fluid Simulation

    Huang, Libo (2021-11-30) [Dissertation]
    Advisor: Michels, Dominik L.
    Committee members: Pottmann, Helmut; Heidrich, Wolfgang; Batty, Christopher
    Physical simulations of natural phenomena usually boil down to solving an ordinary or partial differential equation system. Partial differential equation systems can be formulated either in differential form or in integral form. This dissertation explores integral methods for the simulation of magnetic fluids, so-called ferrofluids, and the surface of the vast ocean. The first two parts of this dissertation aim to contribute to the development of accurate and efficient methods for simulating ferrofluids on the macroscopic scale (on the order of millimeters). The magnetic nature of these fluids imposes challenges for the simulation. The two most important challenges are, first, to model the influence of ferrofluids on surrounding magnetic fields and, second, to model the influence of magnetic forces on the fluids’ dynamics. To tackle these challenges, two Lagrangian simulation methods have been proposed. The first method discretizes the magnetic substance as clusters of particles carrying radial basis functions and applies magnetic forces between these particles. This is a mesh-free method suitable for particle-based fluid simulation frameworks such as smoothed-particle hydrodynamics. The second method follows another direction, only discretizing the fluid’s surface as triangles and vertices. A surface-based simulation for the fluid part is employed, and a boundary element method is utilized for the magnetic part. The magnetic forces are added as gradients of the magnetic energy defined on the fluid’s surface. The second approach has to solve for significantly fewer unknowns in the underlying equations, and uses a more accurate surface tension model compared to the radial basis function approach. The proposed methods are able to reproduce a series of characteristic phenomena of magnetic fluids, both qualitatively and in some cases even quantitatively, which leads to a better understanding of such materials.
    The boundary element method employed in the second part shows advantages beyond ferrofluids. In the third part of this thesis, a boundary element method is coupled with a particle-based fluid simulator for ocean simulation. The wavy motion of the ocean is simulated using large triangle meshes, while water splashes are simulated using particles. This approach is much more efficient in terms of computation time and memory consumption.
  • Towards Affective Vision and Language

    Haydarov, Kilichbek (2021-11-30) [Thesis]
    Advisor: Elhoseiny, Mohamed
    Committee members: Wonka, Peter; Michels, Dominik
    Developing intelligent systems that can recognize and express human affects is essential to bridge the gap between human and artificial intelligence. This thesis explores the creative and emotional frontiers of artificial intelligence. Specifically, in this thesis, we investigate the relation between the affective impact of visual stimuli and natural language by collecting and analyzing a new dataset called ArtEmis. Furthermore, capitalizing on this dataset, we demonstrate affective AI models that can talk emotionally about artworks and generate them given their affective descriptions. For the text-to-image generation task, we present HyperCGAN: a conceptually simple and general approach for text-to-image synthesis that uses hypernetworks to condition a GAN model on text. In our setting, the generator and the discriminator weights are controlled by their corresponding hypernetworks, which modulate weight parameters based on the provided text query. We explore different mechanisms to modulate the layers depending on the underlying architecture of a target network and the structure of the conditioning variable.
  • Radiogenomic Signatures of Oncotype DX Recurrence Score Enable Prediction of Survival in Estrogen Receptor–Positive Breast Cancer: A Multicohort Study

    Fan, Ming; Cui, Yajing; You, Chao; Liu, Li; Gu, Yajia; Peng, Weijun; Bai, Qianming; Gao, Xin; Li, Lihua (Radiology, Radiological Society of North America (RSNA), 2021-11-30) [Article]
    Radiogenomic signatures associated with genomic assays (Oncotype DX) were identified as independent predictors after adjusting for clinical factors for survival and neoadjuvant chemotherapy response in estrogen receptor–positive breast cancer.
  • Critical role of backbone coordination in the mRNA recognition by RNA induced silencing complex

    Zhu, Lizhe; Jiang, Hanlun; Cao, Siqin; Unarta, Ilona Christy; Gao, Xin; Huang, Xuhui (Communications Biology, Springer Science and Business Media LLC, 2021-11-30) [Article]
    Despite its functional importance, the molecular mechanism underlying target mRNA recognition by Argonaute (Ago) remains largely elusive. Based on extensive all-atom molecular dynamics simulations, we constructed a quasi-Markov State Model (qMSM) to reveal the dynamics during recognition at position 6-7 in the seed region of human Argonaute 2 (hAgo2). Interestingly, we found that the slowest mode of motion therein is not the gRNA-target base-pairing, but the coordination of the target phosphate groups with a set of positively charged residues of hAgo2. Moreover, the ability of Helix-7 to approach the PIWI and MID domains was found to reduce the effective volume accessible to the target mRNA and therefore facilitate both the backbone coordination and base-pair formation. Further mutant simulations revealed that alanine mutation of the D358 residue on Helix-7 enhanced a trap state to slow down the loading of target mRNA. A similar trap state was also observed when wobble pairs were introduced in g6 and g7, indicating the role of Helix-7 in suppressing non-canonical base-pairing. Our study pointed to a general mechanism for mRNA recognition by eukaryotic Agos and demonstrated the promise of qMSM in investigating complex conformational changes of biomolecular systems.
  • Domain-Aware Continual Zero-Shot Learning

    Yi, Kai (2021-11-29) [Thesis]
    Advisor: Elhoseiny, Mohamed
    Committee members: Wonka, Peter; Ghanem, Bernard; Michels, Dominik
    We introduce Domain Aware Continual Zero-Shot Learning (DACZSL), the task of visually recognizing images of unseen categories in unseen domains sequentially. We created DACZSL on top of the DomainNet dataset by dividing it into a sequence of tasks, where classes are incrementally provided on seen domains during training and evaluation is conducted on unseen domains for both seen and unseen classes. We also propose a novel Domain-Invariant CZSL Network (DIN), which outperforms state-of-the-art baseline models that we adapted to the DACZSL setting. We adopt a structure-based approach to alleviate forgetting knowledge from previous tasks with a small per-task private network in addition to a global shared network. To encourage the private network to capture the domain and task-specific representation, we train our model with a novel adversarial knowledge disentanglement setting to make our global network task-invariant and domain-invariant over all the tasks. Our method also learns a class-wise learnable prompt to obtain better class-level text representation, which is used to represent side information to enable zero-shot prediction of future unseen classes. Our code and benchmarks are made available.
