Preprints

Recent Submissions

  • Preprint

    Spatial Latent Gaussian Modelling with Change of Support

    (arXiv, 2024-03-13) Chacon Montalvan, Erick; Atkinson, Peter M.; Nemeth, Christopher; Taylor, Benjamin M.; Moraga, Paula; Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia; Statistics Program; Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division; Lancaster Environment Centre, Lancaster University, United Kingdom; Department of Mathematics and Statistics, Lancaster University, United Kingdom; School of Mathematical Sciences, University College Cork, Ireland

    Spatial data are often derived from multiple sources (e.g. satellites, in-situ sensors, survey samples) with different supports, but associated with the same properties of a spatial phenomenon of interest. It is common for predictors to also be measured on different spatial supports than the response variables. Although there is no standard way to work with spatial data with different supports, a prevalent approach among practitioners has been to use downscaling or interpolation to project all the variables of analysis onto a common support, and then to use standard spatial models. The main disadvantage of this approach is that simple interpolation can introduce biases and, more importantly, the uncertainty associated with the change of support is not taken into account in parameter estimation. In this article, we propose a Bayesian spatial latent Gaussian model that can handle data with different rectilinear supports in both the response variable and predictors. Our approach handles changes of support more naturally, according to the properties of the spatial stochastic process being used, and takes into account the uncertainty from the change of support in parameter estimation and prediction. We represent spatial stochastic processes as linear combinations of basis functions whose weights are defined by Gaussian Markov random fields. Our hierarchical modelling approach can be described by the following steps: (i) define a latent model where response variables and predictors are considered as latent stochastic processes with continuous support, (ii) link the continuous-index-set stochastic processes with their projections onto the support of the observed data, (iii) link the projected processes with the observed data. We show the applicability of our approach through simulation studies and by modelling land suitability for improved grassland in Rhondda Cynon Taf, a county borough in Wales.

  • Preprint

    GPU-Accelerated Vecchia Approximations of Gaussian Processes for Geospatial Data using Batched Matrix Computations

    (arXiv, 2024-03-12) Pan, Qilong; Abdulah, Sameh; Genton, Marc G.; Keyes, David E.; Ltaief, Hatem; Sun, Ying; Division of Computer, Electrical, and Mathematical Sciences and Engineering (CEMSE), Extreme Computing Research Center Technology, Thuwal, Jeddah 23955, Saudi Arabia; Extreme Computing Research Center; Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division; Statistics Program; Applied Mathematics and Computational Science Program; Office of the President; Division of Computer, Electrical, and Mathematical Sciences and Engineering (CEMSE), Statistics Program

    Gaussian processes (GPs) are commonly used for geospatial analysis, but they suffer from high computational complexity when dealing with massive data. For instance, evaluating the log-likelihood function required to estimate the statistical model parameters for geospatial data is computationally intensive because it involves inverting a covariance matrix of size n × n, where n is the number of geographical locations. As a result, studies in the literature have shifted towards approximation methods that handle larger values of n effectively while maintaining high accuracy. These methods encompass a range of techniques, including low-rank and sparse approximations. The Vecchia approximation is one of the most promising methods for speeding up evaluation of the log-likelihood function. This study presents a parallel implementation of the Vecchia approximation, utilizing batched matrix computations on contemporary GPUs. The proposed implementation relies on batched linear algebra routines to efficiently evaluate the individual conditional distributions in the Vecchia algorithm. We rely on the KBLAS linear algebra library to perform batched linear algebra operations, reducing the time to solution compared to the state-of-the-art parallel implementation of the likelihood estimation operation in the ExaGeoStat software by up to 700X, 833X, and 1380X on 32GB GV100, 80GB A100, and 80GB H100 GPUs, respectively. We also successfully manage larger problem sizes on a single NVIDIA GPU, accommodating up to 1M locations with 80GB A100 and H100 GPUs while maintaining the necessary application accuracy. We further assess the accuracy of the implemented algorithm, identifying the optimal settings for the Vecchia approximation algorithm to preserve accuracy on two real geospatial datasets: soil moisture data in the Mississippi Basin area and wind speed data in the Middle East.
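
    As a rough illustration of the idea summarized in this abstract, the sketch below evaluates a Vecchia-style log-likelihood on the CPU with NumPy. It is not the authors' GPU implementation and does not use KBLAS or ExaGeoStat. The exponential covariance, the zero mean, the nearest-previous-neighbour conditioning rule, and the names `exp_cov`, `vecchia_loglik`, and `exact_loglik` are all assumptions made for the example. Each location is conditioned on at most `m` earlier locations, so every term needs only a small m × m solve instead of one n × n factorization.

```python
import numpy as np

def exp_cov(a, b, variance=1.0, length_scale=0.3):
    """Exponential covariance between two point sets (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return variance * np.exp(-d / length_scale)

def vecchia_loglik(locs, y, m):
    """Sum of Gaussian conditional log-densities, zero-mean GP.

    Location i is conditioned on its (at most) m nearest neighbours
    among the locations that precede it in the given ordering.
    """
    total = 0.0
    for i in range(len(y)):
        var = exp_cov(locs[i:i + 1], locs[i:i + 1])[0, 0]
        mean = 0.0
        if i > 0 and m > 0:
            dist = np.linalg.norm(locs[:i] - locs[i], axis=1)
            nbrs = np.argsort(dist)[:m]            # conditioning set
            k_nn = exp_cov(locs[nbrs], locs[nbrs])  # small m x m block
            k_in = exp_cov(locs[i:i + 1], locs[nbrs])[0]
            w = np.linalg.solve(k_nn, k_in)
            mean = w @ y[nbrs]                      # conditional mean
            var -= w @ k_in                         # conditional variance
        total += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total

def exact_loglik(locs, y):
    """Dense reference log-likelihood using the full n x n covariance."""
    k = exp_cov(locs, locs)
    _, logdet = np.linalg.slogdet(k)
    quad = y @ np.linalg.solve(k, y)
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + quad)
```

    When `m` is at least n - 1 every location conditions on all earlier ones, so the sum reproduces the exact log-likelihood; smaller `m` trades accuracy for cost. The per-location solves are independent of each other, which is the property that makes batched execution possible.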

  • Preprint

    Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction

    (arXiv, 2024-03-11) Demidovich, Yury; Malinovsky, Grigory; Richtarik, Peter; King Abdullah University of Science and Technology (KAUST) Thuwal, Saudi Arabia; Computer Science Program; Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division; Visual Computing Center (VCC)

    In this study, we investigate stochastic optimization on Riemannian manifolds, focusing on the crucial variance reduction mechanism used in both Euclidean and Riemannian settings. Riemannian variance-reduced methods usually involve a double-loop structure, computing a full gradient at the start of each loop. Determining the optimal inner loop length is challenging in practice, as it depends on strong convexity or smoothness constants, which are often unknown or hard to estimate. Motivated by Euclidean methods, we introduce the Riemannian Loopless SVRG (R-LSVRG) and PAGE (R-PAGE) methods. These methods replace the outer loop with probabilistic gradient computation triggered by a coin flip in each iteration, ensuring simpler proofs, efficient hyperparameter selection, and sharp convergence guarantees. Using R-PAGE as a framework for non-convex Riemannian optimization, we demonstrate its applicability to various important settings. For example, we derive Riemannian MARINA (R-MARINA) for distributed settings with communication compression, providing the best theoretical communication complexity guarantees for non-convex distributed optimization over Riemannian manifolds. Experimental results support our theoretical findings.
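
    To make the coin-flip mechanism concrete, here is a minimal sketch of loopless SVRG in the flat Euclidean case, applied to a least-squares objective. It is not R-LSVRG itself: a Riemannian version would replace the vector addition with an exponential map or retraction and would transport gradients between tangent spaces, and neither is shown here. The objective, the function name `lsvrg_least_squares`, and all parameter names are assumptions chosen for the example.

```python
import numpy as np

def lsvrg_least_squares(A, b, step, p, iters, seed=0):
    """Loopless SVRG for f(x) = (1/2n) * sum_i (a_i @ x - b_i)**2.

    A reference point w and its full gradient are refreshed with
    probability p at each iteration, instead of at the start of an
    outer loop of fixed length.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    w = x.copy()
    full_grad = A.T @ (A @ w - b) / n
    for _ in range(iters):
        i = rng.integers(n)
        g_x = A[i] * (A[i] @ x - b[i])   # stochastic gradient at x
        g_w = A[i] * (A[i] @ w - b[i])   # same sample, at reference point
        g = g_x - g_w + full_grad        # variance-reduced estimator
        x_next = x - step * g
        if rng.random() < p:             # coin flip replaces the outer loop
            w = x.copy()
            full_grad = A.T @ (A @ w - b) / n
        x = x_next
    return x
```

    With `p` of order 1/n the expected cost per iteration stays close to that of plain stochastic gradient descent, and there is no inner loop length to tune.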

  • Preprint

    Observation of tunable topological polaritons in a cavity waveguide

    (Research Square Platform LLC, 2024-03-11) Gao, Zhen; Zhao, Dong; Wang, Ziyao; Yang, Linyun; Zhong, Yuxin; Xi, Xiang; Zhu, Zhenxiao; Gong, Maohua; Tu, Qingan; Meng, Yan; Yan, Bei; Shang, Ce; Physical Science and Engineering (PSE) Division; Southern University of Science and Technology; Southern University of Science and Technology, Shenzhen 518055, China

    Topological polaritons characterized by light-matter interactions have become a pivotal platform in exploring new topological phases of matter. Recent theoretical advances unveiled a novel mechanism for tuning topological phases of polaritons by modifying the surrounding photonic environment (light-matter interactions) without altering the lattice structure. Here, by embedding a dimerized chain of microwave helical resonators (electric dipole emitters) in a metallic cavity waveguide, we report the experimental observation of tunable topological phases of polaritons by varying the cavity width which governs the surrounding photonic environment and the strength of light-matter interactions. Moreover, we experimentally identified a new type of topological phase transition which includes three non-coincident critical points in the parameter space: the closure of the polaritonic bandgap, the transition of the Zak phase, and the hybridization of the topological edge states with the bulk states. These results reveal some remarkable and uncharted properties of topological matter when strongly coupled to light and provide an innovative design principle for tunable topological photonic devices.

  • Preprint

    Task-Oriented GNNs Training on Large Knowledge Graphs for Accurate and Efficient Modeling

    (arXiv, 2024-03-09) Abdallah, Hussein; Afandi, Waleed; Kalnis, Panos; Mansour, Essam; Computer Science Program; Extreme Computing Research Center; Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division; Concordia University

    A Knowledge Graph (KG) is a heterogeneous graph encompassing a diverse range of node and edge types. Heterogeneous Graph Neural Networks (HGNNs) are popular for training machine learning tasks like node classification and link prediction on KGs. However, HGNN methods exhibit excessive complexity influenced by the KG's size, density, and the number of node and edge types. AI practitioners handcraft a subgraph of a KG G relevant to a specific task. We refer to this subgraph as a task-oriented subgraph (TOSG), which contains a subset of task-related node and edge types in G. Training the task using the TOSG instead of G alleviates the excessive computation required for a large KG. Crafting the TOSG demands a deep understanding of the KG's structure and the task's objectives, which makes it challenging and time-consuming. This paper proposes KG-TOSA, an approach to automate TOSG extraction for task-oriented HGNN training on a large KG. In KG-TOSA, we define a generic graph pattern that captures the KG's local and global structure relevant to a specific task. We explore different techniques to extract subgraphs matching our graph pattern: (i) two techniques that sample around targeted nodes using biased random walks or influence scores, and (ii) a SPARQL-based extraction method that leverages RDF engines' built-in indices and therefore achieves negligible preprocessing overhead compared to the sampling techniques. We develop a benchmark of large real KGs and various tasks for node classification and link prediction. Our experiments show that KG-TOSA helps state-of-the-art HGNN methods reduce training time and memory usage by up to 70% while improving the model performance, e.g., accuracy and inference time.