Recent Submissions

  • Nonseparable Space-Time Stationary Covariance Functions on Networks cross Time

    Porcu, Emilio; White, Philip A.; Genton, Marc G. (arXiv, 2022-08-09) [Preprint]
    The advent of data science has provided an increasing number of challenges with high data complexity. This paper addresses the challenge of space-time data where the spatial domain is not a planar surface, a sphere, or a linear network, but a generalized network (termed a graph with Euclidean edges). Additionally, data are repeatedly measured over different temporal instants. We provide new classes of nonseparable space-time stationary covariance functions where space can be a generalized network, a Euclidean tree, or a linear network, and where time can be linear or circular (seasonal). Because the construction principles are technical, we focus on illustrations that guide the reader through the construction of statistically interpretable examples. A simulation study demonstrates that we can recover the correct model when compared to misspecified models. In addition, our simulation studies show that we effectively recover simulation parameters. In our data analysis, we consider a traffic accident dataset that shows improved model performance based on covariance specifications and network-based metrics.
  • Granger Causality using Neural Networks

    Horvath, Samuel; Sultan, Malik Shahid; Ombao, Hernando (arXiv, 2022-08-07) [Preprint]
    The Granger Causality (GC) test is a famous statistical hypothesis test for investigating if the past of one time series affects the future of the other. It helps in answering the question whether one time series is helpful in forecasting. Standard traditional approaches to Granger causality detection commonly assume linear dynamics, but such simplification does not hold in many real-world applications, e.g., neuroscience or genomics that are inherently non-linear. In such cases, imposing linear models such as Vector Autoregressive (VAR) models can lead to inconsistent estimation of true Granger Causal interactions. Machine Learning (ML) can learn the hidden patterns in datasets; Deep Learning (DL) in particular has shown tremendous promise in learning the non-linear dynamics of complex systems. Recent work of Tank et al. proposes to overcome the issue of linear simplification in VAR models by using neural networks combined with sparsity-inducing penalties on the learnable weights. In this work, we build upon ideas introduced by Tank et al. We propose several new classes of models that can handle underlying non-linearity. Firstly, we present the Learned Kernel VAR (LeKVAR) model, an extension of VAR models that also learns a kernel parametrized by a neural net. Secondly, we show one can directly decouple lags and individual time series importance via decoupled penalties. This decoupling provides better scaling and allows us to embed lag selection into RNNs. Lastly, we propose a new training algorithm that supports mini-batching and is compatible with commonly used adaptive optimizers such as Adam. The proposed techniques are evaluated on several simulated datasets inspired by real-world applications. We also apply these methods to the Electro-Encephalogram (EEG) data for an epilepsy patient to study the evolution of GC before, during and after seizure across the 19 EEG channels.
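
For orientation, the linear baseline that the abstract contrasts against can be sketched as a classical F-test: regress one series on its own lags, then add lags of the other series and check how much the residual sum of squares drops. This is only the textbook linear test, not the LeKVAR model or any other method from the paper; the function name `granger_f_stat` and the simulated data are illustrative assumptions.

```python
import numpy as np

def granger_f_stat(x, y, lag=1):
    """F-statistic for 'x Granger-causes y' in a linear model with `lag` lags."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y) - lag
    target = y[lag:]
    # Column k holds the series shifted back by k + 1 time steps.
    y_lags = np.column_stack([y[lag - k - 1: len(y) - k - 1] for k in range(lag)])
    x_lags = np.column_stack([x[lag - k - 1: len(x) - k - 1] for k in range(lag)])
    ones = np.ones((n, 1))
    restricted = np.hstack([ones, y_lags])       # own past only
    full = np.hstack([ones, y_lags, x_lags])     # own past plus past of x

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_f = rss(restricted), rss(full)
    dof = n - full.shape[1]
    return ((rss_r - rss_f) / lag) / (rss_f / dof)

# Hypothetical toy data: x drives y with a one-step delay, not the reverse.
rng = np.random.default_rng(0)
T = 500
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

f_xy = granger_f_stat(x, y, lag=1)  # large: past x helps predict y
f_yx = granger_f_stat(y, x, lag=1)  # small: past y does not help predict x
```

When the true dynamics are non-linear, this linear comparison can miss or misattribute dependence, which is the limitation the abstract sets out to address.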
  • Visually Evaluating Generative Adversarial Networks Using Itself under Multivariate Time Series

    Pan, Qilong (arXiv, 2022-08-04) [Preprint]
    Visually evaluating the goodness of generated Multivariate Time Series (MTS) is difficult to implement, especially when the generative model is a Generative Adversarial Network (GAN). We present a general framework named Gaussian GANs to visually evaluate GANs using itself under the MTS generation task. Firstly, we attempt to find the transformation function in the multivariate Kolmogorov-Smirnov (MKS) test by explicitly reconstructing the architecture of GANs. Secondly, we conduct the normality test of the transformed MTS, where the Gaussian GANs serve as the transformation function in the MKS test. In order to simplify the normality test, an efficient visualization is proposed using the chi-square distribution. In the experiment, we use the UniMiB dataset and provide empirical evidence showing that the normality test using Gaussian GANs and chi-square visualization is effective and credible.
  • Time-Varying Dispersion Integer-Valued GARCH Models

    Barreto-Souza, Wagner; Piancastelli, Luiza S. C.; Fokianos, Konstantinos; Ombao, Hernando (arXiv, 2022-08-04) [Preprint]
    We propose a general class of INteger-valued Generalized AutoRegressive Conditionally Heteroskedastic (INGARCH) processes by allowing time-varying mean and dispersion parameters, which we call time-varying dispersion INGARCH (tv-DINGARCH) models. More specifically, we consider mixed Poisson INGARCH models and allow for a dynamic modeling of the dispersion parameter (as well as the mean), similarly to the spirit of the ordinary GARCH models. We derive conditions to obtain first and second order stationarity, and ergodicity as well. Estimation of the parameters is addressed and their associated asymptotic properties established as well. A restricted bootstrap procedure is proposed for testing constant dispersion against time-varying dispersion. Monte Carlo simulation studies are presented for checking point estimation, standard errors, and the performance of the restricted bootstrap approach. The inclusion of covariates is also addressed and applied to the daily number of deaths due to COVID-19 in Ireland. Insightful results were obtained in the data analysis, including a superior performance of the tv-DINGARCH processes over the ordinary INGARCH models.
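
As background for the model class, the ordinary Poisson INGARCH(1,1) recursion that the paper generalizes can be simulated in a few lines: the conditional mean follows lambda_t = omega + alpha * y_{t-1} + beta * lambda_{t-1}, and y_t is Poisson with that mean. This sketch keeps the dispersion fixed by the Poisson assumption, so it is not the tv-DINGARCH model; the function name and parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_poisson_ingarch(n, omega, alpha, beta, seed=0):
    """Simulate y_t | past ~ Poisson(lam_t), lam_t = omega + alpha*y_{t-1} + beta*lam_{t-1}."""
    rng = np.random.default_rng(seed)
    lam = np.empty(n)
    y = np.empty(n, dtype=np.int64)
    lam[0] = omega / (1.0 - alpha - beta)  # start at the stationary mean
    y[0] = rng.poisson(lam[0])
    for t in range(1, n):
        lam[t] = omega + alpha * y[t - 1] + beta * lam[t - 1]
        y[t] = rng.poisson(lam[t])
    return y, lam

# Hypothetical parameter values with alpha + beta < 1, the usual stationarity condition.
counts, intensity = simulate_poisson_ingarch(20000, omega=1.0, alpha=0.3, beta=0.4)
stationary_mean = 1.0 / (1.0 - 0.3 - 0.4)
```

With these values the sample mean of `counts` should sit close to `stationary_mean`; letting the dispersion evolve over time, as the abstract describes, would require an additional recursion that is not shown here.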
  • Functional-Coefficient Models for Multivariate Time Series in Designed Experiments: with Applications to Brain Signals

    Redondo, Paolo Victor; Huser, Raphaël; Ombao, Hernando (arXiv, 2022-07-30) [Preprint]
    To study the neurophysiological basis of attention deficit hyperactivity disorder (ADHD), clinicians use electroencephalography (EEG), which records neuronal electrical activity on the cortex. The most commonly used metric in ADHD is the theta-to-beta spectral power ratio (TBR) that is based on a single-channel analysis. However, initial findings for this measure have not been replicated in other studies. Thus, instead of focusing on single-channel spectral power, a novel model for investigating interactions (dependence) between channels in the entire network is proposed. Although dependence measures such as coherence and partial directed coherence (PDC) are well explored in studying brain connectivity, these measures only capture linear dependence. Moreover, in designed clinical experiments, these dependence measures are observed to vary across subjects even within a homogeneous group. To address these limitations, we propose the mixed-effects functional-coefficient autoregressive (MX-FAR) model which captures between-subject variation by incorporating subject-specific random effects. The advantages of MX-FAR are the following: (1.) it captures potential non-linear dependence between channels; (2.) it is nonparametric and hence flexible and robust to model mis-specification; (3.) it can capture differences between groups when they exist; (4.) it accounts for variation across subjects; (5.) the framework easily incorporates well-known inference methods from mixed-effects models; (6.) it can be generalized to accommodate various covariates and factors. Finally, we apply the proposed MX-FAR model to analyze multichannel EEG signals and report novel findings on altered brain functional networks in ADHD.
  • Multivariate Functional Outlier Detection using the FastMUOD Indices

    Ojo, Oluwasegun Taiwo; Anta, Antonio Fernández; Genton, Marc G.; Lillo, Rosa E. (arXiv, 2022-07-26) [Preprint]
    We present definitions and properties of the fast massive unsupervised outlier detection (FastMUOD) indices, used for outlier detection (OD) in functional data. FastMUOD detects outliers by computing, for each curve, an amplitude, magnitude and shape index meant to target the corresponding types of outliers. Some methods adapting FastMUOD to outlier detection in multivariate functional data are then proposed. These include applying FastMUOD on the components of the multivariate data and using random projections. Moreover, these techniques are tested on various simulated and real multivariate functional datasets. Compared with the state of the art in multivariate functional OD, the use of random projections showed the most effective results with similar, and in some cases improved, OD performance.
  • Large-Scale Low-Rank Gaussian Process Prediction with Support Points

    Song, Yan; Dai, Wenlin; Genton, Marc G. (arXiv, 2022-07-26) [Preprint]
    Low-rank approximation is a popular strategy to tackle the "big n problem" associated with large-scale Gaussian process regressions. Basis functions for developing low-rank structures are crucial and should be carefully specified. Predictive processes simplify the problem by inducing basis functions with a covariance function and a set of knots. The existing literature suggests certain practical implementations of knot selection and covariance estimation; however, theoretical foundations explaining the influence of these two factors on predictive processes are lacking. In this paper, the asymptotic prediction performance of the predictive process and Gaussian process predictions is derived and the impacts of the selected knots and estimated covariance are studied. We suggest the use of support points as knots, which best represent data locations. Extensive simulation studies demonstrate the superiority of support points and verify our theoretical results. Real data of precipitation and ozone are used as examples, and the efficiency of our method over other widely used low-rank approximation methods is verified.
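
The predictive-process idea mentioned above can be sketched numerically: given a covariance function C and a set of knots, the low-rank covariance is C(s, knots) C(knots, knots)^{-1} C(knots, s). The exponential covariance, its range value, and the random choice of knots below are assumptions made for illustration only; the paper argues for support points as knots, which this sketch does not compute.

```python
import numpy as np

def exp_cov(a, b, range_=0.3, variance=1.0):
    """Exponential covariance between two sets of 2-D locations."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return variance * np.exp(-d / range_)

def predictive_process_cov(locs, knots, **kw):
    """Low-rank covariance induced by the knots: C_sk C_kk^{-1} C_ks."""
    c_sk = exp_cov(locs, knots, **kw)
    c_kk = exp_cov(knots, knots, **kw)
    return c_sk @ np.linalg.solve(c_kk, c_sk.T)

rng = np.random.default_rng(1)
locs = rng.uniform(size=(200, 2))
# Random subset used as knots purely for illustration.
knots = locs[rng.choice(200, size=25, replace=False)]

full = exp_cov(locs, locs)
low_rank = predictive_process_cov(locs, knots)
# Frobenius-norm error of the approximation relative to the full covariance.
rel_error = np.linalg.norm(full - low_rank) / np.linalg.norm(full)
```

Using every data location as a knot reproduces the full covariance, and moving or reducing the knots changes `rel_error`, which is one informal way to see why knot selection matters.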
  • Template independent component analysis with spatial priors for accurate subject-level brain network estimation and inference

    Mejia, Amanda F.; Bolin, David; Yue, Yu Ryan; Wang, Jiongran; Caffo, Brian S.; Nebel, Mary Beth (Journal of Computational and Graphical Statistics, Informa UK Limited, 2022-07-22) [Article]
    Independent component analysis is commonly applied to functional magnetic resonance imaging (fMRI) data to extract independent components (ICs) representing functional brain networks. While ICA produces reliable group-level estimates, single-subject ICA often produces noisy results. Template ICA is a hierarchical ICA model using empirical population priors to produce more reliable subject-level estimates. However, this and other hierarchical ICA models assume unrealistically that subject effects are spatially independent. Here, we propose spatial template ICA (stICA), which incorporates spatial priors into the template ICA framework for greater estimation efficiency. Additionally, the joint posterior distribution can be used to identify brain regions engaged in each network using an excursions set approach. By leveraging spatial dependencies and avoiding massive multiple comparisons, stICA has high power to detect true effects. We derive an efficient expectation-maximization algorithm to obtain maximum likelihood estimates of the model parameters and posterior moments of the latent fields. Based on analysis of simulated data and fMRI data from the Human Connectome Project, we find that stICA produces estimates that are more accurate and reliable than benchmark approaches, and identifies larger and more reliable areas of engagement. The algorithm is computationally tractable, achieving convergence within 12 hours for whole-cortex fMRI analysis.
  • Automatic detection of unbalanced sitting postures in wheelchairs using unlabeled sensor data

    Kini, K Ramakrishna; Harrou, Fouzi; Madakyaru, Muddu; Kadri, Farid; Sun, Ying (IEEE Sensors Letters, Institute of Electrical and Electronics Engineers (IEEE), 2022-07-22) [Article]
    This letter presents an effective data-driven anomaly detection scheme for automatically recognizing unbalanced sitting posture in a wheelchair using data from pressure sensors embedded in the wheelchair. Essentially, the designed approach merges the desirable features of the kernel principal components analysis (KPCA) as a feature extractor with the Kantorovich Distance (KD)-driven monitoring chart to detect abnormal sitting posture in a wheelchair. It is worth noting that this approach does not require labeled data and only employs normal events data to train the detection model, which makes it more appealing in practice. Specifically, we used the KPCA model to exploit its capacity to reduce the dimensionality of nonlinear data to obtain good detection. At the same time, the KD monitoring scheme is an efficient distribution-driven anomaly detection approach in multivariate data. Furthermore, a nonparametric decision threshold using kernel density estimation is adopted to extend the flexibility of the proposed approach. Publicly available data has been used to verify the detection capacity of the proposed approach. The overall detection system proved promising, outperforming some commonly used monitoring methods.
  • High-resolution Bayesian mapping of landslide hazard with unobserved trigger event

    Opitz, Thomas; Bakka, Haakon; Huser, Raphaël; Lombardo, Luigi (The Annals of Applied Statistics, Institute of Mathematical Statistics, 2022-07-19) [Article]
    Statistical models for landslide hazard enable mapping of risk factors and landslide occurrence intensity by using geomorphological covariates available at high spatial resolution. However, the spatial distribution of the triggering event (e.g., precipitation or earthquakes) is often not directly observed. In this paper we develop Bayesian spatial hierarchical models for point patterns of landslide occurrences using different types of log-Gaussian Cox processes. Starting from a competitive baseline model that captures the unobserved precipitation trigger through a spatial random effect at slope unit resolution, we explore novel complex model structures that take clusters of events arising at small spatial scales into account as well as nonlinear or spatially-varying covariate effects. For a 2009 event of around 5000 precipitation-triggered landslides in Sicily, Italy, we show how to fit our proposed models efficiently, using the integrated nested Laplace approximation (INLA), and rigorously compare the performance of our models both from a statistical and applied perspective. In this context we argue that model comparison should not be based on a single criterion and that different models of various complexity may provide insights into complementary aspects of the same applied problem. In our application our models are found to have mostly the same spatial predictive performance, implying that key to successful prediction is the inclusion of a slope-unit resolved random effect capturing the precipitation trigger. Interestingly, a parsimonious formulation of space-varying slope effects reflects a physical interpretation of the precipitation trigger: in subareas with weak trigger, the slope steepness is shown to be mostly irrelevant.

    Gong, Yan; Huser, Raphaël (Annals of Applied Statistics, Institute of Mathematical Statistics, 2022-07-19) [Article]
    Since the inception of Bitcoin in 2008, cryptocurrencies have played an increasing role in the world of e-commerce, but the recent turbulence in the cryptocurrency market in 2018 has raised some concerns about their stability and associated risks. For investors it is crucial to uncover the dependence relationships between cryptocurrencies for a more resilient portfolio diversification. Moreover, the stochastic behavior in both tails is important, as long positions are sensitive to a decrease in prices (lower tail), while short positions are sensitive to an increase in prices (upper tail). In order to assess both risk types, we develop in this paper a flexible copula model which is able to distinctively capture asymptotic dependence or independence in its lower and upper tails simultaneously. Our proposed model is parsimonious and smoothly bridges (in each tail) both extremal dependence classes in the interior of the parameter space. Inference is performed using a full or censored likelihood approach, and we investigate by simulation the estimators’ efficiency under three different censoring schemes which reduce the impact of nonextreme observations. We also develop a local likelihood approach to capture the temporal dynamics of extremal dependence among pairs of leading cryptocurrencies. We here apply our model to historical closing prices of five leading cryptocurrencies which share large cryptocurrency market capitalizations. The results show that our proposed copula model outperforms alternative copula models and that the lower-tail dependence level between most pairs of leading cryptocurrencies and, in particular, Bitcoin and Ethereum has become stronger over time, smoothly transitioning from an asymptotic independence regime to an asymptotic dependence regime in recent years, whilst the upper tail has been relatively more stable overall at a weaker dependence level.
  • A semi-supervised modulation identification in MIMO systems: A deep learning strategy

    Bouchenak, Sofya; Merzougui, Rachid; Harrou, Fouzi; Dairi, Abdelkader; Sun, Ying (IEEE Access, Institute of Electrical and Electronics Engineers (IEEE), 2022-07-19) [Article]
    Accurate modulation identification of the received signals is undoubtedly a central component in multiple-input multiple-output (MIMO) communication systems, facilitating the demodulation task. This study presents a flexible and semi-supervised deep learning-driven strategy for automatic modulation identification. To this end, the multiclass classification problem is treated as multiple binary discrimination problems to address modulation identification challenges. Here, we merge the feature extraction ability of the Generative Adversarial Network (GAN) model and the semi-supervised anomaly detection scheme, the one-class Support Vector Machine (1SVM). Essentially, a single GAN-based 1SVM detector is trained using training data of each class, with the samples of that class as inliers and all other samples as anomalies (i.e., one-vs.-rest). The 1SVM is trained using the features learned by the GAN model. A dataset consisting of three digital modulations (i.e., BFSK, CPFSK, and PAM4) and three analog modulations (i.e., AM-DSB, AM-SSB, and WB-FM), widely used in wireless communications systems, is employed to demonstrate the performance of the considered deep learning-based methods. Compared to Restricted Boltzmann Machine (RBM) and Deep Belief Network (DBN)-based 1SVM, the conventional GAN, DBN, and RBM with softmax layer as discriminator layer, the proposed GAN-based 1SVM detector offers superior discrimination performance of modulation types by achieving an averaged accuracy of 0.951 and F1-Score of 0.954. Results also showed that the GAN-1SVM detector dominates the state-of-the-art modulation classification techniques.
  • Flexible Bivariate INGARCH Process With a Broad Range of Contemporaneous Correlation

    Piancastelli, Luiza S.C.; Barreto-Souza, Wagner; Ombao, Hernando (Journal of Time Series Analysis, Wiley, 2022-07-16) [Article]
    We propose a novel flexible bivariate conditional Poisson (BCP) INteger-valued Generalized AutoRegressive Conditional Heteroscedastic (INGARCH) model for correlated count time series data. Our proposed BCP-INGARCH model is mathematically tractable and has as the main advantage over existing bivariate INGARCH models its ability to capture a broad range (both negative and positive) of contemporaneous cross-correlation which is a non-trivial advancement. Properties of stationarity and ergodicity for the BCP-INGARCH process are developed. Estimation of the parameters is performed through conditional maximum likelihood (CML) and the finite sample behavior of the estimators is investigated through simulation studies. Asymptotic properties of the CML estimators are derived. Hypothesis testing methods for the presence of contemporaneous correlation between the time series are presented and evaluated. A Granger causality test is also addressed. We apply our methodology to monthly counts of hepatitis cases at two nearby Brazilian cities, which are highly cross-correlated. The data analysis demonstrates the importance of considering a bivariate model allowing for a wide range of contemporaneous correlation in real-life applications.
  • Parallel space-time likelihood optimization for air pollution prediction on large-scale systems

    Salvaña, Mary Lai O.; Abdulah, Sameh; Ltaief, Hatem; Sun, Ying; Genton, Marc G.; Keyes, David E. (ACM, 2022-07-12) [Conference Paper]
    Gaussian geostatistical space-time modeling is an effective tool for performing statistical inference of field data evolving in space and time, generalizing spatial modeling alone at the cost of the greater complexity of operations and storage, and pushing geostatistical modeling even further into the arms of high-performance computing. It makes inferences for missing data by leveraging space-time measurements of one or more fields. We propose a high-performance implementation of a widely applied space-time model for large-scale systems using a two-level parallelization technique. At the inner level, we rely on state-of-the-art dense linear algebra libraries and parallel runtime systems to perform complex matrix operations required to evaluate the maximum likelihood estimation (MLE). At the outer level, we parallelize the optimization process using a distributed implementation of the particle swarm optimization (PSO) algorithm. At this level, parallelization is accomplished using MPI sub-communicators, such that the nodes in each sub-communicator perform a single MLE iteration at a time. To evaluate the effectiveness of the proposed methodology, we assess the accuracy of the newly implemented space-time model on a set of large-scale synthetic space-time datasets. Moreover, we use the proposed implementation to model two air pollution datasets from the Middle East and US regions with 550 spatial locations × 730 time slots and 945 spatial locations × 500 time slots, respectively. The evaluation shows that the proposed approach achieves high prediction accuracy on both synthetic datasets and real particulate matter (PM) datasets in the context of the air pollution problem. We achieve up to 757.16 TFLOPS/s using 1024 nodes (75% of the peak performance) using 490K geospatial locations on the Shaheen-II Cray XC40 system.
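
The outer optimization level described above relies on particle swarm optimization. A minimal serial sketch of a standard PSO loop is given below to show why it parallelizes naturally: within one iteration, every particle's objective evaluation is independent of the others. The hyperparameters, the function name, and the toy quadratic objective (standing in for a negative log-likelihood) are assumptions for illustration; the paper's distributed MPI implementation is not reproduced here.

```python
import numpy as np

def particle_swarm_minimize(objective, bounds, n_particles=30, n_iter=200,
                            inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Basic global-best PSO over a box; returns (best position, best value)."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(bounds, dtype=float).T
    pos = rng.uniform(lower, upper, size=(n_particles, lower.size))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()
    best_val = np.array([objective(p) for p in pos])
    g = best_val.argmin()
    g_pos, g_val = best_pos[g].copy(), best_val[g]
    for _ in range(n_iter):
        r1 = rng.uniform(size=pos.shape)
        r2 = rng.uniform(size=pos.shape)
        vel = (inertia * vel
               + c_personal * r1 * (best_pos - pos)
               + c_global * r2 * (g_pos - pos))
        pos = np.clip(pos + vel, lower, upper)
        # Each evaluation is independent, so this loop is the part that
        # could be distributed across workers.
        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g = best_val.argmin()
        if best_val[g] < g_val:
            g_pos, g_val = best_pos[g].copy(), best_val[g]
    return g_pos, g_val

# Hypothetical stand-in for a negative log-likelihood with minimum at (1, -2).
def toy_objective(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best, best_value = particle_swarm_minimize(toy_objective, [(-5, 5), (-5, 5)])
```

In the setting of the abstract, each objective evaluation would itself be an expensive likelihood computation, which is why assigning evaluations to separate groups of nodes pays off.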
  • rspatialdata: a collection of data sources and tutorials on downloading and visualising spatial data using R

    Moraga, Paula; Baker, Laurie (F1000Research, F1000 Research Ltd, 2022-07-11) [Article]
    Spatial and spatio-temporal data are used in a wide range of fields including environmental, health and social disciplines. Several packages in the statistical software R have been recently developed as clients for various databases to meet the growing demands for easily accessible and reliable spatial data. While documentation on how to use many of these packages exists, there is an increasing need for a one-stop repository for tutorials on this information. In this paper, we present rspatialdata, a website that provides a collection of data sources and tutorials on downloading and visualising spatial data using R. The website includes a wide range of datasets including administrative boundaries of countries, Open Street Map data, population, temperature, vegetation, air pollution, and malaria data. The goal of the website is to equip researchers and communities with the tools to engage in spatial data analysis and visualisation so that they can address important local issues, such as estimating air pollution, quantifying disease burdens, and evaluating and monitoring the United Nations’ sustainable development goals.
  • Are You All Normal? It Depends!

    Chen, Wanfang; Genton, Marc G. (International Statistical Review, Wiley, 2022-07-07) [Article]
    The assumption of normality has underlain much of the development of statistics, including spatial statistics, and many tests have been proposed. In this work, we focus on the multivariate setting and first review the recent advances in multivariate normality tests for i.i.d. data, with emphasis on the skewness and kurtosis approaches. We show through simulation studies that some of these tests cannot be used directly for testing normality of spatial data. We further review briefly the few existing univariate tests under dependence (time or space), and then propose a new multivariate normality test for spatial data by accounting for the spatial dependence. The new test utilises the union-intersection principle to decompose the null hypothesis into intersections of univariate normality hypotheses for projection data, and it rejects the multivariate normality if any individual hypothesis is rejected. The individual hypotheses for univariate normality are conducted using a Jarque–Bera type test statistic that accounts for the spatial dependence in the data. We also show in simulation studies that the new test has a good control of the type I error and a high empirical power, especially for large sample sizes. We further illustrate our test on bivariate wind data over the Arabian Peninsula.
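
For readers unfamiliar with the statistic named above, the classical Jarque-Bera statistic for i.i.d. data combines sample skewness S and kurtosis K as JB = n/6 * (S^2 + (K - 3)^2 / 4), which is asymptotically chi-square with 2 degrees of freedom under normality. The sketch below implements only this classical i.i.d. form; the spatially adjusted statistic proposed in the paper is not reproduced, and the sample data are illustrative assumptions.

```python
import numpy as np

def jarque_bera(x):
    """Classical Jarque-Bera statistic for an i.i.d. univariate sample."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    skew = np.mean(centered ** 3) / m2 ** 1.5
    kurt = np.mean(centered ** 4) / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

rng = np.random.default_rng(0)
jb_normal = jarque_bera(rng.normal(size=2000))       # should be modest
jb_skewed = jarque_bera(rng.exponential(size=2000))  # should be very large
```

Applying this i.i.d. version directly to dependent data is the practice the abstract cautions against, since dependence changes the sampling behaviour of the skewness and kurtosis estimates.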
  • Practical strategies for generalized extreme value-based regression models for extremes

    Castro-Camilo, Daniela; Huser, Raphaël; Rue, Haavard (Environmetrics, Wiley, 2022-06-28) [Article]
    The generalized extreme value (GEV) distribution is the only possible limiting distribution of properly normalized maxima of a sequence of independent and identically distributed random variables. As such, it has been widely applied to approximate the distribution of maxima over blocks. In these applications, GEV properties such as finite lower endpoint when the shape parameter ξ is positive or the loss of moments due to the magnitude of ξ are inherited by the finite-sample maxima distribution. The extent to which these properties are realistic for the data at hand has been widely ignored. Motivated by these overlooked consequences in a regression setting, we here make three contributions. First, we propose a blended GEV (bGEV) distribution, which smoothly combines the left tail of a Gumbel distribution (GEV with ξ=0) with the right tail of a Fréchet distribution (GEV with ξ>0). Our resulting distribution has, therefore, unbounded support. Second, we propose a principled method called property-preserving penalized complexity (P3C) prior to decide on the existence of the GEV distribution first and second moments a priori. Third, we propose a reparametrization of the GEV distribution that provides a more natural interpretation of the (possibly covariate-dependent) model parameters, which in turn helps define meaningful priors. We implement the bGEV distribution with the new parameterization and the P3C prior approach in the R-INLA package to make it readily available to users. We illustrate our methods with a simulation study that reveals that the GEV and bGEV distributions are comparable when estimating the right tail under large-sample settings. Moreover, some small-sample settings show that the bGEV fit slightly outperforms the GEV fit. Finally, we conclude with an application to NO2 pollution levels in California that illustrates the suitability of the new parameterization and the P3C prior approach in the Bayesian framework.
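
The two GEV properties mentioned in the abstract (finite lower endpoint and loss of moments) follow from the standard form of the distribution function. The display below uses the usual location-scale-shape notation (μ, σ, ξ), which is an assumption here and is not the reparametrization proposed in the paper.

```latex
G(y) =
\begin{cases}
\exp\left\{-\left[1 + \xi\,\dfrac{y-\mu}{\sigma}\right]_{+}^{-1/\xi}\right\}, & \xi \neq 0,\\[2ex]
\exp\left\{-\exp\left(-\dfrac{y-\mu}{\sigma}\right)\right\}, & \xi = 0,
\end{cases}
\qquad \sigma > 0,\quad [a]_{+} = \max(a, 0).
```

For ξ > 0 the support is y > μ − σ/ξ, which is the finite lower endpoint, and the k-th moment is finite only when ξ < 1/k, so the mean requires ξ < 1 and the variance requires ξ < 1/2. The Gumbel case ξ = 0 has support on the whole real line, which is why blending in a Gumbel left tail removes the lower bound.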
  • Better Methods and Theory for Federated Learning: Compression, Client Selection and Heterogeneity

    Horvath, Samuel (2022-06-27) [Dissertation]
    Advisor: Richtarik, Peter
    Committee members: Canini, Marco; Avestimehr, Salman; Rabbat, Michael; Genton, Marc G.
    Federated learning (FL) is an emerging machine learning paradigm involving multiple clients, e.g., mobile phone devices, with an incentive to collaborate in solving a machine learning problem coordinated by a central server. FL was proposed in 2016 by Konecny et al. and McMahan et al. as a viable privacy-preserving alternative to traditional centralized machine learning since, by construction, the training data points are decentralized and never transferred by the clients to a central server. Therefore, to a certain degree, FL mitigates the privacy risks associated with centralized data collection. Unfortunately, optimization for FL faces several specific issues that centralized optimization usually does not need to handle. In this thesis, we identify several of these challenges and propose new methods and algorithms to address them, with the ultimate goal of enabling practical FL solutions supported with mathematically rigorous guarantees. In particular, in the first four chapters after the introduction, we focus on the communication bottleneck and devise novel compression mechanisms and tools that can provably accelerate the training process. In the sixth chapter, we address another significant challenge of FL: partial participation of clients in each round of the training process. More concretely, we propose the first importance client sampling strategy that is compatible with two core privacy requirements of FL: secure aggregation and statelessness of clients. The seventh chapter is dedicated to another challenge in the cross-device FL setting: system heterogeneity, i.e., the diversity in clients’ processing capabilities and network bandwidth, and the communication overhead caused by slow connections. To tackle this, we introduce the ordered dropout (OD) mechanism. OD promotes an ordered, nested representation of knowledge in neural networks and enables the extraction of lower-footprint sub-models without retraining, which offers fair and accurate learning in this challenging FL setting. Lastly, in the eighth chapter, we study several key algorithmic ingredients behind some of the most popular methods for cross-device FL aimed to tackle heterogeneity and communication bottleneck. In particular, we propose a general framework for analyzing methods employing all these techniques simultaneously, which helps us better understand their combined effect. Our approach identifies several inconsistencies and enables better utilization of these components, including the popular practice of running multiple local training steps before aggregation.
  • Cyber-attacks detection in industrial systems using artificial intelligence-driven methods

    Wang, Wu; Harrou, Fouzi; Bouyeddou, Benamar; Senouci, Sidi Mohammed; Sun, Ying (International Journal of Critical Infrastructure Protection, Elsevier BV, 2022-06-22) [Article]
    Modern industrial systems and critical infrastructures are constantly exposed to malicious cyber-attacks that are challenging and difficult to identify. Cyber-attacks can cause severe economic losses and damage the attacked system if not detected accurately and promptly. Therefore, designing an accurate and sensitive intrusion detection system is undoubtedly necessary to ensure the productivity and safety of industrial systems against cyber-attacks. This paper first introduces a stacked deep learning method to detect malicious attacks in SCADA systems. We also consider eleven machine learning models, including the eXtreme Gradient Boosting (XGBoost), Random forest, Bagging, support vector machines with different kernels, classification tree pruned by the minimum cross-validation and by 1-standard error rule, linear discriminant analysis, conditional inference tree, and the C5.0 tree. Real data sets with different kinds of cyber-attacks from two laboratory-scale SCADA systems, gas pipeline and water storage tank systems, are employed to evaluate the performance of the investigated methods. Seven evaluation metrics have been used to compare the investigated models (accuracy, sensitivity, specificity, precision, recall, F1-score, and area under curve, or AUC). Overall, results show that the XGBoost approach achieved better detection performance than all other investigated methods. This could be due to its desirable characteristics: avoidance of overfitting, reduced complexity of individual trees, robustness to outliers, and invariance to scaling and monotonic transformations of the features. Unexpectedly, the deep learning models do not provide the best performance in this case study, even with their extended capacity to capture complex feature interactions.
  • Approximate Bayesian Inference for the Interaction Types 1, 2, 3 and 4 with Application in Disease Mapping

    Fattah, Esmail Abdul; Rue, Haavard (arXiv, 2022-06-18) [Preprint]
    We address in this paper a new approach for fitting spatiotemporal models with application in disease mapping using the interaction types 1, 2, 3, and 4. When we account for the spatiotemporal interactions in disease-mapping models, inference becomes more useful in revealing unknown patterns in the data. However, when the number of locations and/or the number of time points is large, the inference gets computationally challenging due to the high number of required constraints necessary for inference, and this holds for various inference architectures including Markov chain Monte Carlo (MCMC) and Integrated Nested Laplace Approximations (INLA). We reformulate the INLA approach based on dense matrices to fit the intrinsic spatiotemporal models with the four interaction types and account for the sum-to-zero constraints, and discuss how the new approach can be implemented in a high-performance computing framework. The computing time using the new approach does not depend on the number of constraints and can reach a 40-fold faster speed compared to INLA in realistic scenarios. This approach is verified by a simulation study and a real data application, and it is implemented in the R package INLAPLUS and the Python header function: inla1234().