Spatial and Spatio-temporal Statistical Methods for Environment and Public Health Applications

Embargo End Date
2024-11-23


Amaral, André Victor Ribeiro

Moraga, Paula

Committee Members
Sun, Ying
Gomes, Diogo
Gómez-Rubio, Virgilio


KAUST Department
Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division


Access Restrictions
At the time of archiving, the student author of this dissertation opted to temporarily restrict access to it. The full text of this dissertation will become available to the public after the expiration of the embargo on 2024-11-23.

This thesis proposes spatial and spatio-temporal statistical methods for addressing real-world challenges in the public health and environmental domains.

First, we introduce a new method that integrates compartmental and spatial point process models to describe the propagation of infectious diseases over space and time. We apply this method to the analysis of COVID-19 data in Cali, Colombia, in 2020. Second, we implement a new class of statistical hazard models and Bayesian inference tools for studying spatially dependent survival data under the assumption of competing risks and unknown cause of death (the "relative survival framework"). This framework accounts for the spatial dependence among neighboring sub-regions by means of a conditional autoregressive model. We employ this method in the analysis of colon cancer data in England. Third, in the context of model-based geostatistics, we extend a well-known model for geostatistical data that accounts for preferential sampling by allowing the degree of preferentiality to vary over space. We use this model to analyze the levels of air pollution in the United States. Lastly, we explore different post-processing and ensemble techniques for the nowcasting of the "7-day COVID-19 hospitalization incidence" in Germany between 2021 and 2022. In this setting, additional challenges arise because the data are constantly being revised, which prevents us from using common methods. We address this issue by training our models on a modified version of the original incidence counts.

For Bayesian inference, we implement our models in Stan and with the integrated nested Laplace approximation (INLA) method. The latter can reduce the computational burden of parameter estimation and enable fast inference. The corresponding chapters provide details of the methodology and links to our scripts, so that our methods and results are fully reproducible.