### Recent Submissions

• #### Intuitive and Efficient Roof Modeling for Reconstruction and Synthesis

(arXiv, 2021-09-16) [Preprint]
We propose a novel and flexible roof modeling approach that can be used for constructing planar 3D polygon roof meshes. Our method uses a graph structure to encode roof topology and enforces the roof validity by optimizing a simple but effective planarity metric we propose. This approach is significantly more efficient than using general purpose 3D modeling tools such as 3ds Max or SketchUp, and more powerful and expressive than specialized tools such as the straight skeleton. Our optimization-based formulation is also flexible and can accommodate different styles and user preferences for roof modeling. We showcase two applications. The first application is an interactive roof editing framework that can be used for roof design or roof reconstruction from aerial images. We highlight the efficiency and generality of our approach by constructing a mesh-image paired dataset consisting of 2539 roofs. Our second application is a generative model to synthesize new roof meshes from scratch. We use our novel dataset to combine machine learning and our roof optimization techniques, by using transformers and graph convolutional networks to model roof topology, and our roof optimization methods to enforce the planarity constraint.
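The planarity metric itself is not reproduced in the abstract; a generic way to score how planar a roof face is (a hypothetical stand-in for the paper's metric, not their actual formulation) is the residual of the best-fit plane through the face's vertices:

```python
import numpy as np

def planarity_error(vertices):
    """Sum of squared out-of-plane deviations of a face's vertices.

    The best-fit plane through the centroid is found via SVD of the
    centered points; the smallest singular value measures the residual
    spread orthogonal to that plane.
    """
    pts = np.asarray(vertices, dtype=float)
    centered = pts - pts.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return float(s[-1] ** 2)

# A planar quad scores (numerically) zero; lifting one corner does not.
flat = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
bent = [(0, 0, 0), (1, 0, 0), (1, 1, 0.3), (0, 1, 0)]
print(planarity_error(flat))  # ~0
print(planarity_error(bent))  # clearly positive
```

Summing such a term over all roof faces gives a differentiable objective an optimizer can drive to zero while respecting the topology encoded in the roof graph.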
• #### Features of structure, magnetic state and electrodynamic performance of SrFe12−xInxO19

(Scientific reports, Springer Science and Business Media LLC, 2021-09-16) [Article]
Indium-substituted strontium hexaferrites were prepared by the conventional solid-phase reaction method. Neutron diffraction patterns were obtained at room temperature and analyzed using the Rietveld method. A linear dependence of the unit cell parameters on the substitution level is found. In3+ cations are located mainly in the octahedral 4fVI and 12k positions. The average crystallite size varies within 0.84–0.65 μm. With increasing substitution, the Curie temperature TC decreases monotonically down to ~520 K. ZFC and FC measurements showed a frustrated state. Upon substitution, the average and maximum sizes of ferrimagnetic clusters change in opposite directions. The remanent magnetization Mr decreases down to ~20.2 emu/g at room temperature. The spontaneous magnetization Ms and the effective magnetocrystalline anisotropy constant keff are determined. With increasing substitution, the maximum of the real part of permittivity ε′ decreases in magnitude from ~3.3 to ~1.9 and shifts towards lower frequencies from ~45.5 GHz to ~37.4 GHz. The maximum of the dielectric loss tangent tg(α) decreases from ~1.0 to ~0.7 and shifts towards lower frequencies from ~40.6 GHz to ~37.3 GHz. The low-frequency maximum of the real part of permeability μ′ decreases from ~1.8 to ~0.9 and shifts slightly towards higher frequencies up to ~34.7 GHz. The maximum of the magnetic loss tangent tg(δ) decreases from ~0.7 to ~0.5 and shifts slightly towards lower frequencies from ~40.5 GHz to ~37.7 GHz. The discussion of microwave properties is based on the saturation magnetization, natural ferromagnetic resonance, and dielectric polarization types.
• #### Neural Étendue Expander for Ultra-Wide-Angle High-Fidelity Holographic Display

(arXiv, 2021-09-16) [Preprint]
Holographic displays can generate light fields by dynamically modulating the wavefront of a coherent beam of light using a spatial light modulator, promising rich virtual and augmented reality applications. However, the limited spatial resolution of existing dynamic spatial light modulators imposes a tight bound on the diffraction angle. As a result, today's holographic displays possess low étendue, which is the product of the display area and the maximum solid angle of diffracted light. The low étendue forces a sacrifice of either the field of view (FOV) or the display size. In this work, we lift this limitation by presenting neural étendue expanders. This new breed of optical elements, which is learned from a natural image dataset, enables higher diffraction angles for ultra-wide FOV while maintaining both a compact form factor and the fidelity of displayed contents to human viewers. With neural étendue expanders, we achieve 64× étendue expansion of natural images with reconstruction quality (measured in PSNR) over 29 dB on simulated retinal-resolution images. As a result, the proposed approach with expansion factor 64× enables high-fidelity ultra-wide-angle holographic projection of natural images using an 8K-pixel SLM, resulting in an 18.5 mm eyebox size and 2.18 steradians FOV, covering 85% of the human stereo FOV.
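Étendue, as defined above, is the product of display area and the solid angle of diffracted light; a toy calculation (illustrative numbers only, not values from the paper) shows what a 64× expansion means at a fixed display area:

```python
# Étendue = (display area) x (solid angle of diffracted light).
# All numbers below are illustrative assumptions, not from the paper.
def etendue(area_mm2, solid_angle_sr):
    return area_mm2 * solid_angle_sr

base_omega = 0.034   # assumed small diffraction solid angle of a bare SLM (sr)
area = 100.0         # assumed display area (mm^2)

base = etendue(area, base_omega)
expanded = etendue(area, base_omega * 64)  # expander enlarges the solid angle
print(expanded / base)  # 64.0: same area, 64x the diffracted solid angle
```

The point of the trade-off in the abstract is visible here: with a conventional SLM, raising the solid angle 64× would require shrinking the area 64× to keep the étendue fixed.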
• #### Computational Imaging and Its Applications in Fluids

(2021-09-13) [Dissertation]
Committee members: Ghanem, Bernard; Wonka, Peter; Schindler, Konrad
Computational imaging differs from traditional imaging systems by integrating an encoded measurement system and a tailored computational algorithm to extract interesting scene features. This dissertation demonstrates two approaches which apply computational imaging methods to the fluid domain. In the first approach, we study the problem of reconstructing time-varying 3D-3C fluid velocity vector fields. We extend 2D Particle Imaging Velocimetry to three dimensions by encoding depth into color (a "rainbow"). For reconstruction, we derive an image formation model for recovering stationary 3D particle positions. 3D velocity estimation is achieved with a variant of 3D optical flow that accounts for both physical constraints and the rainbow image formation model. This velocity field can be used to refine the position estimate by adding physical priors that tie together all the time steps, forming a joint reconstruction scheme. In the second approach, we study the problem of reconstructing the 3D shape of underwater environments. The distortions from the moving water surface provide a changing parallax for each point on the underwater surface. We utilize this observation by jointly estimating both the underwater geometry and the dynamic shape of the water surface. To this end, we propose a novel differentiable framework to tie together all parameters in an integrated image formation model. To our knowledge, this is the first solution capable of simultaneously retrieving the structure of dynamic water surfaces and static underwater scene geometry in the wild.
• #### H2Opus: A distributed-memory multi-GPU software package for non-local operators

(arXiv, 2021-09-12) [Preprint]
Hierarchical $\mathcal{H}^2$-matrices are asymptotically optimal representations for the discretizations of non-local operators such as those arising in integral equations or from kernel functions. Their $O(N)$ complexity in both memory and operator application makes them particularly suited for large-scale problems. As a result, there is a need for software that provides support for distributed operations on these matrices to allow large-scale problems to be represented. In this paper, we present high-performance, distributed-memory GPU-accelerated algorithms and implementations for matrix-vector multiplication and matrix recompression of hierarchical matrices in the $\mathcal{H}^2$ format. The algorithms are a new module of H2Opus, a performance-oriented package that supports a broad variety of $\mathcal{H}^2$-matrix operations on CPUs and GPUs. Performance in the distributed GPU setting is achieved by marshaling the tree data of the hierarchical matrix representation to allow batched kernels to be executed on the individual GPUs. MPI is used for inter-process communication. We optimize the communication data volume and hide much of the communication cost with local compute phases of the algorithms. Results show near-ideal scalability up to 1024 NVIDIA V100 GPUs on Summit, with performance exceeding 2.3 Tflop/s/GPU for the matrix-vector multiplication, and 670 Gflop/s/GPU for matrix compression, which involves batched QR and SVD operations. We illustrate the flexibility and efficiency of the library by solving a 2D variable diffusivity integral fractional diffusion problem with an algebraic multigrid-preconditioned Krylov solver and demonstrate scalability to problems with up to 16M degrees of freedom on 64 GPUs.
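The $O(N)$ operator application rests on storing admissible blocks in factored low-rank form. A minimal sketch of why a factored matrix-vector product is cheap, for a single low-rank block rather than the full nested $\mathcal{H}^2$ structure:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 512, 8  # block size and numerical rank (toy values)

# An off-diagonal block stored in factored form, A ≈ U @ V.T
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))
x = rng.standard_normal(n)

# Factored matvec: two thin products, O(n*k) work instead of O(n^2).
y_fast = U @ (V.T @ x)

# Reference: forming the dense block first costs O(n^2) memory and time.
y_dense = (U @ V.T) @ x
print(np.allclose(y_fast, y_dense))  # True
```

The distributed algorithms in the paper apply this idea in batches across the block tree, which is what makes marshaling the tree data for batched GPU kernels the key performance step.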
• #### Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information

(arXiv, 2021-09-11) [Preprint]
We present a novel adaptive optimization algorithm for large-scale machine learning problems. Equipped with a low-cost estimate of local curvature and Lipschitz smoothness, our method dynamically adapts the search direction and step-size. The search direction contains gradient information preconditioned by a well-scaled diagonal preconditioning matrix that captures the local curvature information. Our methodology does not require the tedious task of learning rate tuning, as the learning rate is updated automatically without adding an extra hyperparameter. We provide convergence guarantees on a comprehensive collection of optimization problems, including convex, strongly convex, and nonconvex problems, in both deterministic and stochastic regimes. We also conduct an extensive empirical evaluation on standard machine learning problems, justifying our algorithm's versatility and demonstrating its strong performance compared to other state-of-the-art first-order and second-order methods.
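The combination of a diagonal curvature preconditioner with an automatically chosen step can be illustrated with a generic secant-based diagonal estimate; this is a sketch of the idea, not the paper's exact update rule:

```python
import numpy as np

def diag_precond_step(x, grad_fn, x_prev, g_prev, eps=1e-8):
    """One diagonally preconditioned gradient step (generic sketch).

    The per-coordinate curvature is estimated with a secant
    approximation |Δg| / |Δx|, then used to rescale the gradient.
    """
    g = grad_fn(x)
    if x_prev is None:
        d = np.ones_like(x)  # no curvature information yet
    else:
        dx, dg = x - x_prev, g - g_prev
        d = np.maximum(np.abs(dg) / (np.abs(dx) + eps), eps)
    return x - g / d, g

# Badly scaled quadratic f(x) = 0.5 * (x1^2 + 100 * x2^2): plain gradient
# descent with unit step diverges, but the diagonal rescaling adapts.
A = np.array([1.0, 100.0])
grad = lambda x: A * x

x_prev = g_prev = None
x = np.array([1.0, 1.0])
for _ in range(10):
    x_new, g = diag_precond_step(x, grad, x_prev, g_prev)
    x_prev, g_prev, x = x, g, x_new
print(np.linalg.norm(x))  # near 0: the secant estimate recovers diag(A)
```

On a diagonal quadratic the secant estimate is exact, so the iteration lands on the minimizer after the first preconditioned step; the paper's method additionally estimates Lipschitz smoothness to set the step size in the general case.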
• #### IntraTomo: Self-supervised Learning-based Tomography via Sinogram Synthesis and Prediction

(IEEE, 2021-09-10) [Conference Paper]
We propose IntraTomo, a powerful framework that combines the benefits of learning-based and model-based approaches for solving highly ill-posed inverse problems in the Computed Tomography (CT) context. IntraTomo is composed of two core modules: a novel sinogram prediction module and a geometry refinement module, which are applied iteratively. In the first module, the unknown density field is represented as a continuous and differentiable function, parameterized by a deep neural network. This network is learned, in a self-supervised fashion, from the incomplete and/or degraded input sinogram. Once estimated by the sinogram prediction module, the density field is then refined in the second module using local and non-local geometrical priors. With these two core modules, we show that IntraTomo significantly outperforms existing approaches on several ill-posed inverse problems, such as limited angle tomography with a range of 45 degrees, sparse view tomographic reconstruction with as few as eight views, or super-resolution tomography with an eight-fold increase in resolution. The experiments on simulated and real data show that our approach can achieve results of unprecedented quality.
• #### AppQ: Warm-starting App Recommendation Based on View Graphs

(arXiv, 2021-09-08) [Preprint]
Current app ranking and recommendation systems are mainly based on user-generated information, e.g., number of downloads and ratings. However, new apps often have few (or even no) user feedback, suffering from the classic cold-start problem. How to quickly identify and then recommend new apps of high quality is a challenging issue. Here, a fundamental requirement is the capability to accurately measure an app's quality based on its inborn features, rather than user-generated features. Since users obtain first-hand experience of an app by interacting with its views, we speculate that the inborn features are largely related to the visual quality of individual views in an app and the ways the views switch to one another. In this work, we propose AppQ, a novel app quality grading and recommendation system that extracts inborn features of apps based on app source code. In particular, AppQ works in parallel to perform code analysis to extract app-level features as well as dynamic analysis to capture view-level layout hierarchy and the switching among views. Each app is then expressed as an attributed view graph, which is converted into a vector and fed to classifiers for recognizing its quality classes. Our evaluation on an app dataset from Google Play shows that AppQ achieves the best performance, with an accuracy of 85.0%. This shows great promise for warm-starting app grading and recommendation systems with AppQ.
• #### ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation

(PLOS Computational Biology, Public Library of Science (PLoS), 2021-09-07) [Article]
Regulatory elements control gene expression through transcription initiation (promoters) and by enhancing transcription at distant regions (enhancers). Accurate identification of regulatory elements is fundamental for annotating genomes and understanding gene expression patterns. While there are many attempts to develop computational promoter and enhancer identification methods, reliable tools to analyze long genomic sequences are still lacking. Prediction methods often perform poorly on the genome-wide scale because the proportion of negatives there is much higher than in the training sets. To address this issue, we propose a dynamic negative set updating scheme with a two-model approach, using one model for scanning the genome and the other for testing candidate positions. The developed method achieves good genome-level performance and maintains robust performance when applied to other vertebrate species, without re-training. Moreover, the predicted regulatory regions on the human genome that lack annotation are enriched for disease-associated variants, suggesting that they are potentially true regulatory elements rather than false positives. We validated high-scoring "false positive" predictions using a reporter assay; all tested candidates were successfully validated, demonstrating the ability of our method to discover novel human regulatory regions.
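The two-model, dynamic negative set updating scheme can be sketched with a toy one-dimensional "genome" and a trivial threshold classifier; all names and numbers here are illustrative, not the paper's models:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for genome-wide scanning: positive sites score high,
# the vast background scores low.
positives = rng.normal(2.0, 0.5, 200)
genome = rng.normal(0.0, 1.0, 20000)  # overwhelmingly negative positions

def train(pos, neg):
    # "Model" = midpoint threshold between class means (a toy classifier
    # standing in for the scanning/testing models of the paper).
    return (pos.mean() + neg.mean()) / 2.0

negatives = rng.choice(genome, 200, replace=False)  # initial random negatives
counts = []
for _ in range(3):
    thr = train(positives, negatives)
    false_pos = genome[genome > thr]  # background the model still accepts
    counts.append(false_pos.size)
    # Dynamic update: fold the hard negatives back into the training set.
    negatives = np.concatenate([negatives, false_pos])

print(counts)  # false-positive count shrinks as hard negatives accumulate
```

A model trained only on the small random negative set accepts a large slice of the background; feeding its genome-scale false positives back as negatives is what closes the gap between training-set and genome-wide class balance.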
• #### Towards self-calibrated lens metrology by differentiable refractive deflectometry

(Optics Express, The Optical Society, 2021-09-02) [Article]
Deflectometry, as a non-contact, fully optical metrology method, is difficult to apply to refractive elements due to multi-surface entanglement and precise pose alignment. Here, we present a computational self-calibration approach to measure parametric lenses using dual-camera refractive deflectometry, achieved by an accurate, differentiable, and efficient ray tracing framework for modeling the metrology setup, based on which damped least squares is utilized to estimate unknown lens shape and pose parameters. We successfully demonstrate both synthetic and experimental results on singlet lens surface curvature and asphere-freeform metrology in a transmissive setting.
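Damped least squares, which the abstract names as the estimator for the unknown shape and pose parameters, takes steps of the form Δp = (JᵀJ + λI)⁻¹Jᵀr. A minimal sketch on a toy curve-fitting problem (the paper's actual residuals come from differentiable ray tracing through the metrology setup, which is far more involved):

```python
import numpy as np

def dls_step(residual_fn, jac_fn, p, lam=1e-2):
    """One damped least squares (Levenberg-Marquardt style) update:
    p <- p - (J^T J + lam*I)^{-1} J^T r."""
    r = residual_fn(p)
    J = jac_fn(p)
    H = J.T @ J + lam * np.eye(p.size)  # damping regularizes the normal matrix
    return p - np.linalg.solve(H, J.T @ r)

# Toy stand-in: fit z = c0*x^2 + c1*x + c2 to noiseless samples, a crude
# analogue of fitting parametric surface coefficients to measurements.
true_c = np.array([0.5, -1.0, 2.0])
xs = np.linspace(-1, 1, 50)
zs = np.polyval(true_c, xs)

residual = lambda c: np.polyval(c, xs) - zs
jac = lambda c: np.vander(xs, 3)  # columns x^2, x, 1

c = np.zeros(3)
for _ in range(20):
    c = dls_step(residual, jac, c)
print(np.round(c, 3))  # recovers [0.5, -1.0, 2.0]
```

The damping term λI keeps each step well-conditioned when JᵀJ is nearly singular, which is the usual motivation for preferring damped least squares over plain Gauss-Newton in pose/shape estimation.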
• #### Machine Learning and Deep Learning Methods that use Omics Data for Metastasis Prediction

(Computational and Structural Biotechnology Journal, Elsevier BV, 2021-09) [Article]
Metastasis is the primary cause of cancer-related deaths, which has incentivized research directed towards unraveling the complex cellular processes that drive it. Advances in technology, and specifically the advent of high-throughput sequencing, provide knowledge of such processes. This knowledge led to the development of therapeutic and clinical applications, and is now being used to predict the onset of metastasis to improve diagnostics and disease therapies. In this regard, predicting the onset of metastasis has also been explored using artificial intelligence approaches that are machine learning-based and, more recently, deep learning-based. This review summarizes the different machine learning and deep learning-based metastasis prediction methods developed to date. We also detail the different types of molecular data used to build the models and the critical signatures derived from the different methods. We further highlight the challenges associated with using machine learning and deep learning methods, and provide suggestions for improving the predictive performance of such methods.
• #### Flow-Guided Video Inpainting with Scene Templates

(arXiv, 2021-08-29) [Preprint]
We consider the problem of filling in missing spatio-temporal regions of a video. We provide a novel flow-based solution by introducing a generative model of images in relation to the scene (without missing regions) and mappings from the scene to images. We use the model to jointly infer the scene template, a 2D representation of the scene, and the mappings. This ensures that the generated frame-to-frame flows are consistent with the underlying scene, reducing geometric distortions in flow-based inpainting. The template is mapped to the missing regions in the video by a new L2-L1 interpolation scheme, creating crisp inpaintings and reducing common blur and distortion artifacts. We show on two benchmark datasets that our approach outperforms the state of the art quantitatively and in user studies.
• #### H2OPUS-TLR: High Performance Tile Low Rank Symmetric Factorizations using Adaptive Randomized Approximation

(arXiv, 2021-08-26) [Preprint]
Tile low rank (TLR) representations of dense matrices partition them into blocks of roughly uniform size, where each off-diagonal tile is compressed and stored as its own low rank factorization. They offer an attractive representation for many data-sparse dense operators that appear in practical applications, where substantial compression and a much smaller memory footprint can be achieved. TLR matrices are a compromise between the simplicity of a regular perfectly-strided data structure and the optimal complexity of the unbalanced trees of hierarchically low rank matrices, and provide a convenient performance-tuning parameter through their tile size, which can be proportioned to the cache size where the tiles reside in the memory hierarchy. There are currently no high-performance algorithms that can generate Cholesky and $LDL^T$ factorizations of TLR matrices, particularly on GPUs. The difficulties in achieving high performance when factoring TLR matrices come from the expensive compression operations that must be performed during the factorization process and the adaptive rank distribution of the tiles, which causes an irregular work pattern for the processing cores. In this work, we develop a dynamic batching operation and combine it with batched adaptive randomized approximations to achieve high performance both on GPUs and CPUs. Our implementation attains over 1.2 TFLOP/s in double precision on the V100 GPU, and is limited by the performance of batched GEMM operations. The Cholesky factorization of a covariance matrix of size $N = 131K$ arising in spatial statistics can be computed to an accuracy $\epsilon=10^{-2}$ in just a few seconds. We believe the proposed GEMM-centric design allows the algorithm to be readily ported to newer hardware such as tensor cores that are optimized for small GEMM operations.
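Each off-diagonal tile is stored as its own low-rank factorization. A plain truncated-SVD sketch of tile compression follows; note the library itself uses batched adaptive *randomized* approximation rather than a full SVD per tile:

```python
import numpy as np

def compress_tile(tile, eps=1e-2):
    """Compress a tile to low rank, truncating at relative accuracy eps.

    Plain-SVD sketch of the compression step; the factored tile is
    returned as (U*s, Vt) so that tile ≈ (U*s) @ Vt.
    """
    U, s, Vt = np.linalg.svd(tile, full_matrices=False)
    k = max(1, int(np.sum(s > eps * s[0])))  # rank adapted per tile
    return U[:, :k] * s[:k], Vt[:k]

rng = np.random.default_rng(2)
n, true_rank = 256, 6
# Off-diagonal tiles of data-sparse operators are numerically low rank;
# here we build one with exact rank 6 for illustration.
tile = rng.standard_normal((n, true_rank)) @ rng.standard_normal((true_rank, n))

Uk, Vk = compress_tile(tile)
print(Uk.shape[1])  # recovered rank: 6
print(np.linalg.norm(tile - Uk @ Vk) / np.linalg.norm(tile))  # tiny residual
```

Because the truncation rank adapts per tile, different tiles end up with different amounts of work, which is exactly the irregularity the dynamic batching in the paper is designed to absorb.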
• #### Linking common human diseases to their phenotypes; development of a resource for human phenomics

(Journal of Biomedical Semantics, Springer Science and Business Media LLC, 2021-08-23) [Article]
Abstract Background In recent years a large volume of clinical genomics data has become available due to rapid advances in sequencing technologies. Efficient exploitation of this genomics data requires linkage to patient phenotype profiles. Current resources providing disease-phenotype associations are not comprehensive, and they often do not have broad coverage of the disease terminologies, particularly ICD-10, which is still the primary terminology used in clinical settings. Methods We developed two approaches to gather disease-phenotype associations. First, we used a text mining method that utilizes semantic relations in phenotype ontologies, and applies statistical methods to extract associations between diseases in ICD-10 and phenotype ontology classes from the literature. Second, we developed a semi-automatic way to collect ICD-10–phenotype associations from existing resources containing known relationships. Results We generated four datasets. Two of them are independent datasets linking diseases to their phenotypes based on text mining and semi-automatic strategies. The remaining two datasets are generated from these datasets and cover a subset of ICD-10 classes of common diseases contained in UK Biobank. We extensively validated our text mined and semi-automatically curated datasets by: comparing them against an expert-curated validation dataset containing disease–phenotype associations, measuring their similarity to disease–phenotype associations found in public databases, and assessing how well they could be used to recover gene–disease associations using phenotype similarity. Conclusion We find that our text mining method can produce phenotype annotations of diseases that are correct but often too general to have significant information content, or too specific to accurately reflect the typical manifestations of the sporadic disease. 
On the other hand, the datasets generated from integrating multiple knowledgebases are more complete (i.e., cover more of the required phenotype annotations for a given disease). We make all data freely available at 10.5281/zenodo.4726713.
• #### Discrete Optimization for Shape Matching

(Computer Graphics Forum, Wiley, 2021-08-23) [Article]
We propose a novel discrete solver for optimizing functional map-based energies, including descriptor preservation and promoting structural properties such as area-preservation, bijectivity and Laplacian commutativity among others. Unlike the commonly-used continuous optimization methods, our approach enforces the functional map to be associated with a pointwise correspondence as a hard constraint, which provides a stronger link between optimized properties of functional and point-to-point maps. Under this hard constraint, our solver obtains functional maps with lower energy values compared to the standard continuous strategies. Perhaps more importantly, the recovered pointwise maps from our discrete solver preserve the optimized-for functional properties and are thus of higher overall quality. We demonstrate the advantages of our discrete solver on a range of energies and shape categories, compared to existing techniques for promoting pointwise maps within the functional map framework. Finally, with this solver in hand, we introduce a novel Effective Functional Map Refinement (EFMR) method which achieves state-of-the-art accuracy on the SHREC'19 benchmark.
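The hard constraint above ties the functional map to a pointwise correspondence. The standard way to recover a point-to-point map from a functional map C is nearest-neighbor search between spectral embeddings; a minimal sketch (conventions for the direction of C vary across the literature):

```python
import numpy as np

def fmap_to_pointwise(C, basis_src, basis_tgt):
    """Convert a functional map C into a point-to-point map.

    Each source point's spectral embedding (a row of basis_src) is
    transported by C, then matched to its nearest row of basis_tgt.
    """
    transported = basis_src @ C.T  # rows are C @ phi_src(i)
    d2 = ((transported[:, None, :] - basis_tgt[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)  # index of closest target point per source point

# Toy sanity check: two "shapes" with identical bases up to a known
# permutation should be matched exactly by the identity functional map.
rng = np.random.default_rng(3)
k, n = 4, 30
basis_tgt = rng.standard_normal((n, k))
perm = rng.permutation(n)
basis_src = basis_tgt[perm]  # source = permuted target
C = np.eye(k)

print(np.array_equal(fmap_to_pointwise(C, basis_src, basis_tgt), perm))  # True
```

Continuous solvers typically apply this conversion only after optimization; enforcing it at every iteration, as the paper does, is what keeps the optimized functional properties intact in the recovered pointwise map.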
• #### Stylistic and Spatial Disentanglement in GANs

(2021-08-17) [Dissertation]
Committee members: Michels, Dominik; Ghanem, Bernard; Yang, Ming-Hsuan
This dissertation tackles the problem of entanglement in Generative Adversarial Networks (GANs). The key insight is that disentanglement in GANs can be improved by differentiating between the content and the operations performed on that content. For example, the identity of a generated face can be thought of as the content, while the lighting conditions can be thought of as the operations. We examine disentanglement in several kinds of deep networks: image-to-image translation GANs, unconditional GANs, and sketch extraction networks. The task in image-to-image translation GANs is to translate images from one domain to another. It is immediately clear that disentanglement is necessary in this case: the network must maintain the core contents of the image while changing the stylistic appearance to match the target domain. We propose latent filter scaling to achieve multimodality and disentanglement. Previous methods require complicated network architectures to enforce that disentanglement; our approach, on the other hand, maintains the traditional GAN loss with only a minor change in architecture. Unlike image-to-image GANs, unconditional GANs are generally entangled: they offer only one way of changing the generated output, namely changing the input noise code, so it is very difficult to resample only some parts of the generated images. We propose structured noise injection to achieve disentanglement in unconditional GANs, using two input codes: one to specify spatially-variable details, and one to specify spatially-invariable details. In addition to allowing content and style to be changed independently, this also allows users to change the content only at certain locations. Combining our previous findings, we improve the performance of sketch-to-image translation networks. A crucial problem is how to correct input sketches before feeding them to the generator. 
By extracting sketches in an unsupervised way only from the spatially-variable branch of the image, we are able to produce sketches that show the content in many different styles. Those sketches can serve as a dataset to train a sketch-to-image translation GAN.
• #### A Graph-based Approach for Trajectory Similarity Computation in Spatial Networks

(ACM, 2021-08-14) [Conference Paper]
Trajectory similarity computation is an essential operation in many applications of spatial data analysis. In this paper, we study the problem of trajectory similarity computation over spatial networks, where the real distances between objects are reflected by the network distance. Unlike previous studies which learn the representation of trajectories in Euclidean space, this setting requires capturing not only the sequence information of the trajectory but also the structure of the spatial network. To this end, we propose GTS, a new framework that can jointly learn both factors so as to accurately compute the similarity. It first learns the representation of each point-of-interest (POI) in the road network along with the trajectory information. This is realized by incorporating the distances between POIs and trajectories into the random walk over the spatial network as well as into the loss function. The trajectory representation is then learned by a Graph Neural Network model that identifies neighboring POIs within the same trajectory, together with an LSTM model that captures the sequence information in the trajectory. We conduct a comprehensive evaluation on several real-world datasets. The experimental results demonstrate that our model substantially outperforms all existing approaches.
• #### Socially-Aware Self-Supervised Tri-Training for Recommendation

(ACM, 2021-08-14) [Conference Paper]
Self-supervised learning (SSL), which can automatically generate ground-truth samples from raw data, holds vast potential to improve recommender systems. Most existing SSL-based methods perturb the raw data graph with uniform node/edge dropout to generate new data views and then conduct self-discrimination-based contrastive learning over the different views to learn generalizable representations. Under this scheme, only a bijective mapping is built between nodes in two different views, which means that the self-supervision signals from other nodes are neglected. Due to the widely observed homophily in recommender systems, we argue that the supervisory signals from other nodes are also highly likely to benefit representation learning for recommendation. To capture these signals, a general socially-aware SSL framework that integrates tri-training is proposed in this paper. Technically, our framework first augments the user data views with user social information. Then, under the regime of tri-training for multi-view encoding, the framework builds three graph encoders (one for recommendation) upon the augmented views and iteratively improves each encoder with self-supervision signals from other users, generated by the other two encoders. Since the tri-training operates on the augmented views of the same data sources for self-supervision signals, we name it self-supervised tri-training. Extensive experiments on multiple real-world datasets consistently validate the effectiveness of the self-supervised tri-training framework for improving recommendation. The code is released at https://github.com/Coder-Yu/QRec.
• #### Ships, Splashes, and Waves on a Vast Ocean

(arXiv, 2021-08-12) [Preprint]
The simulation of large open water surfaces is challenging for a uniform volumetric discretization of the Navier-Stokes equations. The water splashes near moving objects, which height field methods for water waves cannot capture, necessitate high-resolution simulation such as the Fluid-Implicit-Particle (FLIP) method. On the other hand, FLIP is not efficient for long-lasting water waves that propagate over long distances, which require sufficient depth for a correct dispersion relationship. This paper presents a new method to tackle this dilemma through an efficient hybridization of volumetric and surface-based advection-projection discretizations. We design a hybrid time-stepping algorithm that combines a FLIP domain and an adaptively remeshed Boundary Element Method (BEM) domain for the incompressible Euler equations. The resulting framework captures the detailed water splashes near moving objects with FLIP, and produces convincing water waves with the correct dispersion relationship at modest additional cost.
• #### The COVID-19 epidemiology and monitoring ontology

(Center for Open Science, 2021-08-11) [Preprint]
The novel COVID-19 infectious disease emerged and spread, causing high mortality and morbidity rates worldwide. The OBO Foundry contains more than one hundred ontologies for sharing and analysing large-scale datasets in the biological and biomedical sciences. However, this pandemic revealed that we lack tools for an efficient and timely exchange of the epidemiological data that are necessary to assess the impact of disease outbreaks and the efficacy of mitigating interventions, and to provide a rapid response. In this study we present our findings and contributions for the bio-ontologies community.