Space-Time Landslide Predictive Modelling

Landslides are nearly ubiquitous phenomena and pose severe threats to people, properties, and the environment. Investigators have for long attempted to estimate landslide hazard to determine where, when, and how destructive landslides are expected to be in an area. This information is useful to design landslide mitigation strategies, and to reduce landslide risk and societal and economic losses. In the geomorphology literature, most attempts at predicting the occurrence of populations of landslides rely on the observation that landslides are the result of multiple interacting, conditioning and triggering factors. Here, we propose a novel Bayesian modelling framework for the prediction of space-time landslide occurrences of the slide type caused by weather triggers. We consider log-Gaussian cox processes, assuming that individual landslides stem from a point process described by an unknown intensity function. We tested our prediction framework in the Collazzone area, Umbria, Central Italy, for which a detailed multi-temporal landslide inventory spanning 1941-2014 is available together with lithological and bedding data. We tested five models of increasing complexity. Our most complex model includes fixed effects and latent spatio-temporal effects, thus largely fulfilling the common definition of landslide hazard in the literature. We quantified the spatio-temporal predictive skill of our model and found that it performed better than simpler alternatives. We then developed a novel classification strategy and prepared an intensity-susceptibility landslide map, providing more information than traditional susceptibility zonations for land planning and management. We expect our novel approach to lead to better projections of future landslides, and to improve our collective understanding of the evolution of landscapes dominated by mass-wasting processes under geophysical and weather triggers.

elling equations that may not capture the complex interactions controlling the slope stability/instability conditions. Furthermore, the physically-based models require large datasets to describe the surface and subsurface mechanical and hydrological properties of the terrain, which are difficult and expensive to obtain. As a result, physically-based models are used chiefly for small or very small areas (e.g., Montgomery and Dietrich, 1994;Chakraborty and Goswami, 2016;Seyed-Kolbadi et al., 2019), albeit a few examples also exist of applications for large areas (e.g., Gorsevski et al., 2006;Raia et al., 2014;Alvioli and Baum, 2016).
Lastly, statistical approaches aim at exploiting the "functional" relationships existing between a set of instability factors, and the past and present distribution of landslides obtained typically from a landslide inventory map (Carrara, 1983;Duman et al., 2005;Guzzetti et al., 2012), or a landslide catalogue (Van Den Eeckhaut and Hervás, 2012). The large number of statistically-based approaches proposed in the literature (Reichenbach et al., 2018) almost invariably exploit classification methods, and provide probabilistic estimates suited for quantitative hazard assessments. Statistical models can be constructed using a variety of thematic and environmental variables obtained from existing maps or by processing remotely sensed images and data, in different landscape and environmental settings, covering a broad range of scales and study-area sizes. The dependent variable is obtained from different types of landslide inventory maps (Guzzetti et al., 2012) or landslide catalogues (Van Den Eeckhaut and Hervás, 2012), and is typically used in a binary structure, expressing the presence or absence of landslides in each mapping unit, where a terrain mapping unit is a regular or irregular geographical subdivision (e.g., a pixel, unique condition, slope or hydrological unit, administrative subdivision (Guzzetti et al., 1999;Guzzetti, 2005;Van Westen et al., 2006)) used to partition a study area. The fitted model is then used to assess the landslide susceptibility for each mapping unit (Guzzetti, 2005).
Focusing on statistically-based susceptibility approaches, a limitation of the traditional and of most of the current models is that they predict only whether a mapping unit is expected to have (or not have) landslides, regardless of the number or size of the expected failures in each unit. In a population of landslides, the size (i.e., length, width, area, volume) of the slope failures is linked to the number of the failures. Hovius et al. (1997) and Malamud et al. (2004), among others, have shown that the area of a landslide obeys empirical probability distributions that control the average size and relative proportion of the landslides in an area. Similar empirical dependencies have been found for landslide volume (Brunetti et al., 2009), for the landslide area to volume ratio Larsen et al., 2010), and more recently for the landslide width to length ratio (Taylor et al., 2018).
In an attempt to overcome the inherent inability of landslide susceptibility models to predict the number of the expected landslides, Lombardo et al. (2018a) have proposed a novel framework for landslide intensity assessment. The approach treats single landslides in a population of landslides as individual realisations from a continuous-space point process (Cox and Isham, 1980;Chiu et al., 2013). This is different from the discrete-space binary presence-absence model adopted by the traditional statistically-based susceptibility models (Carrara, 1983;Guzzetti et al., 1999;Guzzetti, 2005;Lombardo and Mai, 2018;Reichenbach et al., 2018). As a result, the approach aims at predicting the number of the expected landslides in each mapping unit adopting a Poisson distribution, whose mean is linked to the unknown landslide intensity that can be estimated from a landslide dataset. The approach was applied successfully to model populations of rainfall-induced (Lombardo et al., 2018a(Lombardo et al., , 2019b and seismically-triggered (Lombardo et al., 2019a) landslides in different morphological, geological, and climatic settings.
A second limitation of most statistically-based models is that they typically do not consider the spatial relationships between landslide occurrences in different terrain mapping units. Landslide occurrences are often assumed to be independent given the terrain conditions. In other words, the geographical location of the mapping units is not explicitly taken into account, so that adjacent, neighbouring, and distant units are considered equally by the models. Not considering the spatial dependencies (or lack thereof) among units placed in different locations in a study area may result in poorer susceptibility models and associated zonations (Reichenbach et al., 2018;Lombardo et al., 2018a).
A third limitation of most statistically-based models is the fact that they consider the likelihood of landslides occurring in an area to be constant in (i.e., independent of) time (e.g., Meusburger and Alewell, 2009;Cama et al., 2015). In the long run (i.e., hundreds to thousands of years), the assumption does not hold because the landslide triggering conditions (e.g., the frequency of high or prolonged rainfall periods, of snow melt events, or of earthquakes), as well as the predisposing conditions (e.g., land use or land cover) may change with time, inevitably changing the landslide susceptibility. Recent empirical evidence shows that in some landscapes, even for short periods (i.e., tens of years), when a landslide occurs it may become an "attractor" for future landslides, with new landslides being more likely to occur inside or in the immediate vicinity of the previous landslides. Samia et al. (2017a,b) called this effect "landslide path dependency", and found that the spatio-temporal dependency of new landslides on old landslides disappears in their study area, in Umbria, Central Italy-the same study area of this work-approximately after 10-15 years. The evidence violates the assumption that susceptibility is constant in time, even for short periods.
In the literature, approaches to predict the temporal occurrence of landslides are equally if not more diversified. Depending on the scope, the temporal coverage, the return time of the predictions, and the extent of the study area, methods include (i) the use of empirical rainfall thresholds for the possible occurrence of landslides (Glade et al., 2000;Dai and Lee, 2001;Crosta and Frattini, 2000;Aleotti, 2004;Guzzetti et al., 2007Guzzetti et al., , 2008Saito et al., 2010;Ko and Lo, 2018;Segoni et al., 2018;Guzzetti et al., 2019), (ii) physically-based, coupled, distributed rainfall and infiltration slope stability models (Montgomery and Dietrich, 1994;Van Asch et al., 1996;Baum et al., 2008;Lanni et al., 2013;Formetta et al., 2014;Reid et al., 2015;Formetta et al., 2016;Alvioli and Baum, 2016;Bout et al., 2018), and (iii) the analysis of time series of historical landslides and landslide events (Crovelli and Coe, 2009;Rossi et al., 2010b;Witt et al., 2010). Only physically-based models (inherently) consider the 8 spatio-temporal interactions that condition landslide occurrence, which are not considered by the threshold models or by the analysis of the historical records. However, the physicallybased models are generally applicable only to small areas and for short periods of time (a few hours to a few days), and are not suited for the spatio-temporal modelling of landslide occurrence over large areas and for long periods; is essential for medium to long term land planning and management.
In this work, we construct innovative geostatistical models that consider (i) spatial, (ii) temporal, and (iii) spatio-temporal landslide latent effects among: adjacent terrain mapping units, same mapping units but subsequent in time and both conditions together, respectively. We consider this an improvement over the existing approaches to predict the spatio-temporal occurrence of landslides in an area.

Study area
The area of Collazzone, Umbria, Central Italy, is well studied and represents a unique site where landslides have been mapped repeatedly over a large timespan. A description of the area can be found in Guzzetti et al. (2006a,b), Ardizzone et al. (2007), Galli et al. (2008) and other references therein. Overall, our study area extends over ≈ 79 km 2 , with terrain elevation between 145 m along the Tiber River flood plain NNW of Todi, and 634 m at Monte di Grutti (Figure 1). The landscape is predominantly hilly, and lithology and the attitude of bedding planes control the morphology of the slopes, and the presence and abundance of the landslides. In the area sedimentary rocks crop out, Cretaceous to Recent in age, including recent fluvial deposits, continental gravel, sand and clay, travertine, layered sandstone and marl, and thinly layered limestone. The climate is Mediterranean, with precipitation falling mostly in the period from October to December, and from February to May. Intense rainfall events or prolonged rainfall periods are the primary natural triggers of landslides in the area (Ardizzone et al., 2013), followed by the rapid melting of snow . Landslides are abundant and cover 17.05 km 2 -corresponding to a density of ≈ 43 landslides per square kilometre-and range in age, type, morphology, and volume from very old, partly eroded, large and deep-seated slides and slide-earth flows, to young and shallow slides and flows (Hungr et al., 2014). Recent landslides are most abundant in the cultivated areas and are rare in the forested terrain, indicating a dependence of the recent landslides on the agricultural practices (Mergili et al., 2014). photographs taken in 1941, 1954, 1977, 1985, and 1997, and of five stereoscopic, panchromatic and multi-spectral satellite image pairs acquired in August 2009, March 2010, May 2010, April 2013, and April 2014 (Table 1), supplemented by field checks and surveys executed in various periods from 1998, in May 2004, and in December 2005(Guzzetti et al., 2006aArdizzone et al., 2007;Galli et al., 2008;Ardizzone et al., 2013). The multi-temporal inventory shows the location, surface area, type (Hungr et al., 2014), and estimated age (Cruden and Varnes, 1996) of 3,379 landslides, ranging in size from A L = 5.8 × 10 3 m 2 to A L = 1.5 × 10 6 m 2 (mean, µ = 6.9 × 10 3 m 2 , standard deviation, σ = 3.2 × 10 4 m 2 ), for a total area covered by landslides of A LT = 17.05 km 2 . In the inventory, landslide age was defined as very old (relict), old (predating 1941), or recent (from 1941 to 2014), using photointerpretation criteria and field evidence, despite some ambiguity in the definition of the age of a landslide based on its morphological appearance (McCalpin, 1984;Guzzetti et al., 2006a). Key to our study is the fact that in the multi-temporal inventory landslides are separated into nineteen, irregularly spaced "temporal slices", each shown by a different colour in Figure 1. Landslides in the temporal slices before 2000 were detected and mapped using black and white (panchromatic) and colour (1977) aerial photographs, whereas landslides between 2009 and 2014 were detected and mapped using VHR stereoscopic satellite images (Table 1), and directly in the field. Visual interpretation of the stereoscopic satellite images was performed using different software, including: (i) ERDAS IMAGINE R and Leica Photogrammetry Suite (LPS) for block orientation of the stereo images; (ii) Stereo Analyst for ArcGIS R for image visualization and landslide mapping; and (iii) StereoMirror TM hardware technology to obtain 3D views of the VHR satellite images. The same interpretation criteria were adopted to identify and map the landslides on aerial photographs and the satellite images. (Guzzetti et al., 2012;Ardizzone et al., 2013;Murillo-Garcia et al., 2015). In each set of aerial photographs and in each pair of satellite images, landslides that appeared "fresh" were given the date (i.e., year) of the aerial photographs, or the date (i.e., month and year) of the satellite images. The "non-fresh" landslides were attributed to the period (i.e., the "temporal slice") between two successive sets of aerial photographs or satellite image pairs. Landslides mapped in the field after single or multiple rainfall events between 1998 and 2013 were given the date (i.e., month and year) of the field surveys. The different methods and instruments used to interpret the aerial photographs and the satellite images, and the fact that some of the landslides were mapped in the field, have conditioned the completeness and accuracy of the landslide information in the multitemporal inventory, which is therefore not constant throughout the time slices. In general, more recent time slices showing event or seasonal inventories (E and S, respectively, in Figure 1) have more numerous landslides of small sizes, whereas historical inventories (H, in Figure 1) show larger landslides, and lack landslides of very small size. This is a known bias of the multi-temporal inventory used in our study (Guzzetti et al., 2006a;Ardizzone et al., 2007;Galli et al., 2008;Ardizzone et al., 2013).

Mapping unit
Prediction of landslide occurrence in an area requires the preliminary selection of a suitable terrain mapping unit, i.e., a subdivision of the terrain that maximises the within-unit (internal) homogeneity and the between-unit (external) heterogeneity across distinct physical or geographical boundaries (Carrara et al., 1991;Guzzetti, 2005). In agreement with previous studies in the same (Guzzetti et al., 2006b), in similar (Carrara et al., 2003), and in different (Van Den Eeckhaut et al., 2009;Erener and Düzgün, 2012;Castro Camilo et al., 2017;Amato et al., 2019) areas, we select the "slope unit" (SU) as the mapping unit of reference. By definition, a SU is a terrain geomorphological unit bounded by drainage and divide lines, and corresponds to a slope, a combination of adjacent slopes, or a small catchment (Carrara et al., 1991;. We use a subdivision of the study area (i.e., our spatial domain) into 889 SUs ranging in size from A L = 6.17 × 10 2 m 2 to A L = 1.4 × 10 6 m 2 (mean, µ = 8.9 × 10 4 m 2 , standard deviation, σ = 1.1 × 10 5 m 2 ), corresponding to an average density of one SU approximately every 0.1 km 2 . Panel A of Figure 2 portrays the geographical distribution of the SUs used in the study.

Thematic data and explanatory variables
To support the modelling procedure, we used a set of morphometric and thematic variables (also called "covariates"), which we list in Table 2. The 16 morphometric covariates were derived from a 10 m × 10 m resolution digital elevation model (DEM) obtained through the automatic interpolation of 10 m and 5 m interval contour lines taken from 1:10,000 scale topographic base maps made accessible by the Umbria Region government (link here). To represent morphometric covariates at the SU scale, we computed the two main statistical properties (mean µ, and standard deviation σ) of each covariate for each respective SU. We further obtained the thematic covariates (Table 2) from a geological map prepared at 1:10,000 scale by Guzzetti et al. (2006a), and used in previous landslide susceptibility and hazard studies in the same area (Guzzetti et al., 2006a;Ardizzone et al., 2007;Galli et al., 2008). From the geological map we extracted information on nine lithological units, and four bedding domain classes. We selected these covariates because bedrock geology and the attitude of bedding planes have been shown to control the presence (or absence) and the abundance of landslides in our study area (Guzzetti et al., 2006a;Marchesini et al., 2015). To summarise the geologic and bedding information for each SU, we computed the ratio between the relative extent of each categorical class in a given SU and the total extent of the SU. As a result, we transformed the original categorical information into a continuous one, expressing the proportion of area coverage per class in each SU.

Pre-processing
We initially tested our modelling framework (see Section 5) using the 19 separate time slices shown in Figure 1, where the original 19 landslide count distributions represented our target variable. However, these preliminary tests did not produce satisfying predictive performances (unreported results). Possible reasons for the unsatisfactory results included (i) the irregular time-span of the landslide inventories, ranging from 1 to 23 years, (ii) the highly variable Table 2: Morphometric (M), lithological (L), and bedding-related (structural, S) explanatory variables (covariates) used in the study for space-time landslide predictive modelling in the Collazzone area, Umbria, Central Italy (see Figure 1). References: 1, http://www.umbriageo.regione.umbria.it/pagina/distribuzione-carta-tecnicaregionale-vettoriale-1; 2, Zevenbergen and Thorne (1987); 3, Lombardo et al. (2018b); 4, Heerdegen and Beran (1982); 5, Böhner and Selige (2006); 6, Beven and Kirkby (1979); 7, Guzzetti et al. (2006a).  19411941-195419541954-197719771977-198519851985Snow 1998 May  19411941-195419541954-197719771977-198519851985Snow 1998 May  Figure 3: Collazzone study area, Umbria, Central Italy. Spatial and temporal aggregation schemes of landslide counts. (A): Temporal distribution of the original counts obtained by considering landslides as pure points (dark grey); and, spatial aggregation by considering the areal extent of landslides (light grey). Hence, where landslides covered more than one SU, we repeated the count over multiple SUs to respect the geomorphological and areal realization of the instability process. (B): Spatially aggregated landslide count per SU displayed per "temporal slices" for the 19 inventories shown in Figure 1; and, associated temporal aggregation scheme to merge the 19 time slices (single bars) into six "time periods" (groups of bars shown by the same colour). (C): Summary staked bar chart of the aggregated landslide counts distribution per SU for six near-regular time periods, from T1 to T6: the x-axis reports the SUs' rank, from SUs having zero landslides (left) to SUs having up to 25 landslides (right). The y-axis shows the proportion of landslide counts across the six different time periods, shown by different colours. White characters correspond to the counts per time interval. More information on SUs is provided in Section 4.2.
To aggregate the data over space, we looked into the inventories and realized that some of the landslides have a large areal extent at certain times, leading the instability to affect several SUs simultaneously. In such cases, treating these landslides as a single points (i.e., assigning one count to a single SU) would neglect the areal extent of the landslide phenomenon. Therefore, in cases where a landslide covers more than one SU, we assigned a count to multiple SUs under the constraint that the intersections between the landslide and the SUs are larger than 2% of the area of the SU (Carrara and Guzzetti, 2013). In turn, this procedure generated a moderately larger landslide count per SU than the original landslide count, as shown in Figure 3A.
The second aggregation scheme involved the temporal dimension of the spatially-aggregated landslide counts. Specifically, we further aggregated the 19 temporal slices in a nearly regular temporal grid with intervals of approximately 15 years, with the exception of the stand-alone snow-melt event. The procedure produced six "time intervals", each composed of one to eight time slices, which we will refer to as T1 to T6 in the rest of the manuscript. Panel B of Figure 3 illustrates the distribution of landslide counts per SU for each time interval, and Panel C of Figure 3 summarises our final aggregation scheme.
In addition to the covariate preparation and the spatio-temporal aggregation, the preprocessing phase involved creating the so-called adjacency matrix (Zhang et al., 2010). The (i 1 , i 2 )-th entry of the adjacency matrix is equal to one if the i 1 -th SU shares a border with the i 2 -th SU, and it is zero otherwise. Therefore, it is a symmetric matrix (i.e., if SU1 is adjacent to SU2, then SU2 is also adjacent to SU1), which indicates the neighbourhood structure between SUs, and it provides fundamental information to build the space and space-time models presented in this work. Maps (B) and (C) in Figure 2 summarise the main steps involved in the calculation of the adjacency matrix, and illustrate the adjacency graph structure.

Fundamentals of point processes
Conceptually, we identify a landslide with a single point (s i , t i ), where s i is the spatial location and t i is time. In practice, we may also consider the spatial dimension of large landslides by counting them more than once. The time resolution here corresponds to the six time intervals, such that t i ∈ {1, 2, . . . , 6}. For the purpose of estimation and mapping at the resolution of the SUs, we do not need the exact position s i of landslides, but only the SU.
Our fundamental modelling assumption is that the space-time points (s i , t i ) identifying landslide occurrences stem from an unobserved intensity function λ(s, t) that varies over space and time. The interpretation of this intensity function is that we observe (approximately) λ(s, t) events on average per spatio-temporal unit around (s, t). For an arbitrary domain D stretching over space and time, the mean number of observed events is given by the integral D λ(s, t)dsdt. The natural distribution of such event counts is the Poisson distribution with intensity given by this integral. We assume that this stochastic mechanism generates the observed spatio-temporal point pattern, and we call it a Poisson point process.
In practice, we construct a model for the intensity function λ(s, t) which incorporates covariate effects, and which may further capture structured variations of the space-time intensity surface of random nature, i.e., not explained by the available covariate information. Such random effects can be considered as "unobserved" effects, hence covariates during the modelling process, whose signals influence the landslide space-time pattern in the data. For instance, in case of event-based landslide studies, the precipitation or seismic triggers may not be included among the covariate set (e.g., if they are unobserved), although they influence the concentration of landslides at specific locations. Nevertheless, advanced spatial models can retrieve the pattern of the unobserved trigger from the landslide distribution itself, as demonstrated by Lombardo et al. (2018aLombardo et al. ( , 2019b for storm-induced and by Lombardo et al. (2019a) for earthquake-induced landslides. As for temporal unobserved effects, advanced temporal models can retrieve the influence of the "landslide path dependency" recognised by Samia et al. (2018).
When allowing for such random components in the intensity function λ, the resulting stochastic model for the observed point pattern is called a Cox point process (Cox, 1965), and we use the uppercase notation Λ(s, t) for the intensity function to highlight its stochastic behaviour. More specifically, if we opt for the flexible and convenient choice of additive random effects in the log-intensity log(Λ(s, t)) possessing Gaussian process distributions, we obtain a Log-Gaussian Cox Process (Møller et al., 1998;Basu and Dassios, 2002;Diggle et al., 2013), LGCP in short.
For our dataset, we can rewrite the general structure of such models by taking into account the spatial (889 SUs) and temporal (6 intervals) resolution and by using the surface area |A i |, i = 1, . . . , 889, of SUs. We make the natural assumption that the average number of landslides observed within a SU s i is proportional to its surface area |A i |, given that all other predictor components are the same. Furthermore, we model a separate spatial intensity for each of the temporal intervals j = 1, . . . , 6, and so we write Λ j (s i ) = |A i | ×Λ j (s i ) for the integrated intensity of the i-th SU in the j-th temporal interval, whereΛ j (s i ) can be interpreted as the intensity of a (rescaled) unit-area SU with the same characteristics.
We further write N ij ∈ {0, 1, 2, . . .} for the number of landslides observed in SU i during temporal interval j. Then, given the intensity values Λ j (s), the model has the following structure: N ij | Λ j (s) ∼ Poisson(|A i | ×Λ j (s i )), i = 1, . . . , 889, j = 1, . . . , 6. (1) We will formulate five regression models (technically speaking, Poisson regressions with a log-link function) in §5.4- §5.7, with two variants of a baseline model without random effects presented in §5.4. These models integrate observed covariate effects and random effects into the log-intensity log(Λ j (s)) in an additive way. The models follow the general structure log(Λ j (s i )) = log(|A i |) + fixed covariate effects + random effects where the random effect component are not accounted for in our baseline models. The term log(|A i |) represents a fixed deterministic offset.

The Bayesian modelling paradigm
Before presenting the specific structure of the five models we tested, we first recall the general idea of fully Bayesian modelling as implemented here. We aim to simultaneously estimate the latent intensity function Λ j (s), and more specifically its components such as the coefficients of the fixed covariate effects and random effects, if present. Moreover, a relatively small number of so-called hyperparameters governing the smoothness and variance of the random effects is also calibrated automatically. In Bayesian modelling, we specify so-called prior probability distributions for the components and parameters to estimate. In general, prior distributions allow incorporating expert knowledge and can help stabilise estimated models when these have a very complex structure and/or when data are very noisy. Bayes' famous Theorem then tells us how we can construct the posterior distributions (densities, expected values, etc.), i.e., parameter estimates and precise probabilistic uncertainty statements, by confronting prior information with observed data. Therefore, the Bayesian mechanism allows us to move from the specification of "data distribution given the model parameters and their prior information" to the estimation target corresponding to the "distribution of model parameters given the data and prior information". The fitted posterior distributions can then be exploited to make inference (e.g., on model parameters and their uncertainty), derive any model-based diagnostics, and draw practical conclusions. Through the specification of priors, the Bayesian paradigm allows us to resolve potential parameter identifiability issues by naturally integrating constraints in the prior distributions. For example, if some covariates have a tendency towards collinearity (e.g., if they are strongly correlated, hence providing redundant information-a common instance in landslide susceptibility and hazard studies), then the prior structure can reduce numerical instabilities and keep the model and its parameters well identifiable from the data. Therefore, the specification of priors is crucial in models where the number of unknown parameters and/or latent random variables to infer is large compared to the observed sample size.
Here, we will specify different types of prior distributions for distinct model components (i.e., fixed effects and random effects). In particular, we assume that the continuous covariates have been standardised to have mean 0 and variance 1, such that the importance of estimated coefficients can be interpreted and compared more easily, and we expect estimated coefficients to be significant if they are moderately large in absolute value. In our models, we then specify a moderately informative normal distribution with mean 0 and variance 1 independently for the global intercept and all covariate coefficients. This implies that, a priori, the probability of having an absolute covariate coefficient value β larger than 2 is less than 5%. However, if the data provide clear evidence that the true coefficient is higher, the final posterior estimate of the coefficient can still be much larger without any difficulty.
As for latent random effects composed of numerous random variables, a general principle is to avoid overly complex models by constructing prior distributions that penalise the model complexity. If the stochastic behavior of such complex model components is not properly controlled by suitably chosen prior distributions, this can lead to overfitting and poor estimation and prediction performances. To penalise model complexity, a strategy is to use informative priors, which shrink complex model components towards simpler reference models, making sure that they can be reliably estimated. This general principle has been formalised recently and leads to an approach based on so-called Penalised Complexity (PC) priors (Simpson et al., 2016). We will make systematic use of PC priors in our implementa-tion. For instance, the reference model for a latent spatial random effect (i.e., the LSE) is simply taken to be its absence, such that the prior distribution of the precision parameter of this effect is designed to avoid overly large spatial variability. In our baseline models, the random effect components are replaced by their reference distribution, i.e., they are absent.

Latent Gaussian modelling approach, and the R-INLA library
Within the last decade since its publication in 2009, the Integrated Nested Laplace Approximation (INLA) method (Rue et al., 2009) has become the most popular tool for Bayesian spatial modelling thanks to its implementation in the R-INLA library (Lindgren et al., 2015, http://www.r-inla.org/) of the statistical software R (R Core Team, 2014). The general framework of INLA is that of Bayesian latent Gaussian models, which has found widespread interest in a diverse range of applications (Lombardo et al., 2018a;Opitz et al., 2018;Krainski et al., 2018;Moraga, 2019). Essentially, the data are assumed to follow a "well-behaved" distribution function, and to be conditionally independent given some latent (multivariate) Gaussian random effects. In our context, landslide counts are modelled using the Poisson distribution, whose mean is expressed on the log scale in terms of various fixed effects and latent random effects that are correlated over space and/or time. Each of the covariate coefficients simply has a normal distribution prior, independent from the others. As for hyperparameters, the prior distributions are chosen more specifically depending on the role of the parameter. More details on each of our models are given in the following sections.
The success of INLA relies on the systematic use of random effects with sparse precision (i.e., inverse covariance) matrices within this latent Gaussian modelling framework, coupled with astute analytical and numerical approximation schemes (Illian et al., 2012;Rue et al., 2009Rue et al., , 2017, which provide exceptional speeds-up for fitting large and complex models compared to more traditional Markov Chain Monte Carlo (MCMC) methods. For details on the theory and practice of R-INLA, we refer the interested reader to the landslide tutorial paper by Lombardo et al. (2019b) and to Bakka et al. (2018).

Baseline LGCP models with fixed effects only
First, we consider two "baseline" models, which we call Mod1 and Mod2, where the first one is purely spatial, in the sense that it does not include any information about the structure of time intervals, whereas the second allows for time-interval-specific regression constants. In the model Mod1, we only include the spatially-indexed covariates presented in Section 4.3 in the log-intensity: according to the SU-wise means and standard deviations of morphometric variables obtained from the DEM, with superscript "morph" in equation (3), and the 13 thematic properties, with superscript "them" in equation (3), expressed through SU-wise proportions (Table 2). This is a purely spatial model, which assumes that the spatial intensity is the same for each of the 6 temporal intervals (here treated as independent replicates). Moreover, this model does not account for any additional spatially-correlated nor temporally-correlated unobserved effects.
Model Mod2, with intensity Λ Mod2 j (s), has the following structure: where β j , j = 1, . . . , 6 are additional time-interval-specific intercepts. We resolve the nonidentifiability of β 0 in (3) and β j in (4) by imposing the sum-to-zero constraint 6 j=1 β j = 0, such that estimated β j -coefficients (j = 1, . . . , 6) indicate how strongly the overall number of landslides in a time interval deviates from the global average measured through β 0 . As indicated in §5.2, we assign independent zero-mean Gaussian priors for each regression coefficient. However, unlike the global intercept β 0 and the covariate coefficients β morph where the prior variance is set to one, the additional intercepts β j specific to each time interval do not have a fixed prior variance; instead, we specify a PC prior for the variance of β j which corresponds to a 50%-probability of being below or above 1, and we then let data decide how strongly the β j values should be allowed to vary between time intervals.

Spatial LGCP with replicated spatially structured random effects
To extend the baseline models Mod1 and Mod2, we now add a spatial random effect, with 6 replicates in time, to explain variations in the landslide intensity that cannot be explained by the observed covariates, and we write Λ Mod3 j (s) for the spatial intensity in the j-th temporal interval in model Mod3. Our prior assumption is that the spatial random effect should differ between SUs and between different temporal events, but that it tends to be similar for SUs sharing a boundary or being close in space; recall the adjacency graph structure in Figure 2C.
We first write the general model formula, and we then explain how we encode this prior assumption on spatial dependence for the random effect. The model may be written as where Λ Mod2 j (s) is the baseline intensity of model Mod2 defined in (4), and where W Mod3 j (s) is the latent spatial effect (LSE) with 6 replicates, one for each of the temporal intervals.
We use the notation i 1 ∼ i 2 if the SUs i 1 and i 2 are not the same but share a boundary, and we write nb(i) = {ĩ |ĩ ∼ i} for the set of all the neighbouring SUs that are adjacent to the i-th SU, with |nb(i)| indicating their number. Since the study area is contiguous, we obtain |nb(i)| ≥ 2 for all SUs i = 1, . . . , 889 ( Figure 2B). The model for W Mod3 j (s i ) is now specified by the following two conditions: 1. The spatial fields W Mod3 j 1 (s) and W Mod3 j 2 (s) are independent for different times j 1 = j 2 .
2. The value of the spatial random effect W Mod3 j (s i ) at the i-th SU, given all the other values W Mod3 j (s˜i),ĩ = i, follows a normal distribution whose mean value corresponds to the mean of the adjacent values, i.e., where τ Mod3 > 0 is a precision parameter to be estimated, which controls the dependence strength between neighbouring SUs. The parameter τ Mod3 of the conditional spatial distributions in (6) determines the value of the variance 1/κ Mod3 of the unconditional effects W Mod3 j (s i ); internally, R-INLA implements a parameterisation using the marginal precision κ Mod3 averaged over all SUs, which may be simpler to interpret in practice.
The mechanism of this model prescribes that there is less uncertainty about the random effect value in a SU if we know the values in the adjacent SUs. This prior assumption is valid when adjacent slope units have a similar behaviour, e.g., when the sliding surface at depth affects more than one SU, or when the landslide rainfall/seismic trigger has a clear dominating spatial pattern. However, the observed data can also counteract this prior assumption if necessary, i.e., in case two adjacent slope units have strongly different behaviours. This might occur in our study area, e.g., with the common case of bedding planes dipping out of, or nearly parallel to the slope ("cataclinal" slope) in one SU, and dipping into the slope ("anaclinal" slope) in an adjacent SU across a drainage line Santangelo et al., 2015). The inclusion of the LSE, W Mod3 j (s), in Mod3 induces spatial dependence within each temporal interval, but keeps the different temporal intervals independent, be they consecutive in time or not.
A priori, we assume that the LSEs have moderate absolute values and are relatively similar between adjacent SUs, such that we construct a prior model whose complexity remains moderate when compared to the baseline Mod2 in (4). To achieve this, we use a PC prior for the standard deviation parameter sd Mod3 = 1/ √ κ Mod3 , which corresponds to an exponential distribution; in mathematical notation, we assume that where λ > 0 is a penalty rate to be defined. Here we fix λ such that Pr(1/ √ κ Mod3 > 1) = 0.5, i.e., we give the standard deviation parameter a 50% chance of exceeding 1 a priori.

Temporal
LGCP with slope-unit-based temporal random effect Our next model, Mod4, has a similar structure as Mod3 in (5) at first sight, but we now make an assumption about the temporal dependence of SU-based random effects within the same SU, while disregarding any direct spatial relationship between adjacent SUs. Writing Λ Mod4 j (s) for the spatial intensity in the j-th temporal interval, the model formula is given as where Λ Mod2 j (s) is the baseline intensity of model Mod2 defined in (4), and where W Mod4 i (t) is a latent temporal effect (LTE) with 889 replicates (one for each of the SUs in our study region), defined by the following three conditions: where the temporal autocorrelation parameter −1 < β Mod4 < 1 and the precision parameter τ Mod4 > 0 have to be estimated.
3. The value at the first time point W Mod4 i (t 1 ) follows a normal distribution with mean 0 and variance 1 Equivalently to the conditions 2 and 3 above, we can write W Mod4 . From an interpretation perspective, the most interesting aspect is to have direct control over the standard deviation sd Mod4 = 1/κ Mod4 and the autoregression parameter β Mod4 . Here, we proceed as with the spatial model, and we therefore use the same PC prior as in (7), setting λ such that sd Mod4 has 50% chance to exceed the value 1. When specifying a prior model for β Mod4 , we assume that consecutive events are only weakly linked to each other, and we implement a PC prior penalizing absolute values of β Mod4 close to 1. Our choice of prior is such that there is a 50% chance for β Mod4 to exceed 0.5 in absolute value.

Space-time LGCP with combined spatial and temporal structures
Finally, considering both the spatial and temporal structures in the data, we construct our most complex model, Mod5, that combines explicit assumptions for the temporal dependence of random effects between consecutive inventories in time, and for spatial dependence based on spatial adjacency relations between SUs ( Figure 2C) for contemporaneous landslides, i.e., for landslides in the same time period (Figure 3). Similar to the spatial model Mod3 in (5), we have a single parameter κ Mod5 > 0 that simultaneously governs the spatial variability and dependence strength of the random effects. Additionally, the parameter β Mod5 controls the strength of association between consecutive time intervals, in analogy to the preceding temporal model Mod4 in (8). Therefore, this space-time model keeps a parsimonious parameterisation with only two hyperparameters β Mod5 and κ Mod5 to be estimated. Writing now Λ Mod5 j (s) for the spatial intensity in the j-th temporal interval in model Mod5, we define where the latent space-time effect (LSTE), W Mod5 (s, t), now combines the dependence relationships of the purely spatial model Mod3 in (5) and of the purely temporal model Mod4 in (8). Specifically, we assume the following structure: 1. The LSTE obeys the following temporal dynamics: where the spatial "innovation fields" ε j (s) are mutually independent in time and have the same distribution as the LSE W Mod3 1 in (6), with τ Mod3 replaced by τ Mod5 ε . We write κ Mod5 ε > 0 for the corresponding unconditional precision parameter used internally by R-INLA, in analogy with Model Mod3 in (5).
2. The field at the first time point W Mod5 (s, t 1 ) has the same probability distribution as the LSE W Mod3 1 in (6), but now we denote the unconditional precision parameter by ). This assures that the model is stationary in time for each SU with unconditional precision parameter κ Mod5 for all 6 fields W Mod5 (s, t j ), j = 1, . . . , 6.
The prior distribution of the precision parameter κ Mod5 is fixed as in the spatial model Mod3 in (5), and the prior of the temporal autocorrelation parameter β Mod5 is fixed similarly to the temporal model Mod4 in (8).
In the spatio-temporal model Mod5, the data can determine if landslide counts, not well explained by the observed covariates, tend to be similar in space between nearby SUs: (i) in case of small mass movements, which separately affect different SUs; or (ii) because of single landslide bodies, whose areal extents affect more than one SU. Similarly, the temporal component of the model captures the effect, not explained by the observed covariates, that lead to landslide counts being similar through time between consecutive events for the same SU. In particular, if estimated spatial and temporal dependencies are non-negligible, then the average number of landslides in a SU for a given temporal event is often related to the average number for SUs that are located "close" in space, and for all the events "close" in time. Therefore, this model can learn about clustering in space and persistence in time in the structure of the landslide-triggering mechanism. A general understanding of this concept could be practically translated in study areas where, due to orographic conditions, the occurrence of critical precipitation amounts may always tend to arise in the same, relatively confined area in space. The model Mod5 in (10) could capture this trigger structure through spatial dependence (confined area) and temporal dependence (same area for different events), even if no observed data are available for the relevant precipitation events. The latter can be a single trigger in case of event-based inventories or the cumulative effect of several triggers for inventories associated with a large time span. This assumption can be valid when storms have a clear spatial pattern characterised by a transition in precipitation regimes from the "epicentre" to the peripheries of the cloudburst (Lombardo et al., 2018a). However, the same assumption may not hold in our study area because of its size (less than 10 × 10km 2 ) and the absence of significant orographic gradients. An alternative explanation that may relate more closely to our study case could be that the spatial dependence captures the unknown effect of land use or land cover, driven, e.g., by changing economy and agricultural practices, or the dependence induced by large landslides destabilizing more than one SU.

Implementation and model validation
To assess the models' performance and their interpretability from an explanatory perspective, we chose to implement an initial modelling stage where we fitted the baseline LGCPs (Mod1 and Mod2) and the three extensions featuring the latent fields (Mod3, Mod4, and Mod5) using 100% of the dataset. Subsequently, we assessed the predictive performance of each model. To do this, we implemented two separate cross-validation (CV) schemes. The first approach is a spatial 10-fold CV, whereby we extract 10% of the dataset for testing and leave out the complementary 90% for fitting. The proportion of held-out data may seem relatively small in comparison to commonly used values in the literature (e.g., 30% to 50%, Reichenbach et al. (2018)), but we consider it as a sensible value to allow the model to learn about spatial structure in the training data. We constrained the random extractions to avoid any SU to be sampled more than once, and we used the same SUs across time intervals. In other words, the combination of the ten CV complementary subsets reproduces the original dataset. The second CV approach is a temporal "leave-one-out" procedure whereby six models are fitted, each one calibrated on five time intervals and tested on the remaining one, regardless of the temporal position of the time periods used, i.e., the fitted model does not account for possible landslide heritage effects (Samia et al., 2017a,b).
We assessed the accuracy of the estimated landslide intensities both for the fit and crossvalidation procedures by comparing observed and predicted landslide counts for each model, and for each temporal interval. Here, predicted counts for SU i and time interval j are defined as the posterior meanΛ j (s i ) of the corresponding intensity model Λ j (s i ). We also maintained a link to the "traditional" landslide susceptibility, i.e., the probability of observing one or more landslides in a given SU at a given time (Chung and Fabbri, 1999;Guzzetti et al., 1999;Van Westen et al., 1999). Thanks to the Poisson distribution of landslide counts, the fitted intensityΛ j (s i ) for the i-th SU and j-th temporal interval can be converted into the landslide susceptibility S j (s i ) via the following equation: Hence, we computed performance metrics for models Mod1 to Mod5, both in terms of landslide intensity and susceptibility, for each of the two CV schemes.

Classification of intensity and susceptibility estimates
In the statistically-based landslide susceptibility literature, there is no agreement on how to reclassify and show in map form the continuous spectrum of probability values that result from a classification model into few meaningful classes (Reichenbach et al., 2018); a fundamental step for a susceptibility zonation to be used in practical applications Galli et al., 2008). The simplest option is to divide the entire probability range [0, 1] into two classes of predicted "stable" and "unstable" conditions; with the stable conditions predicted not to have landslides, and the unstable conditions predicted to have landslides. Even for this simple twofold division, several approaches are found in the literature with authors arbitrarily setting the probability cutoff. For presence-absence balanced datasets, examples exist where the cutoff is set to 0.5 (Dai and Lee, 2002) without providing an explanation (e.g., Süzen and Kaya, 2012), or because it corresponds to the mean value between the two extremes of the theoretical probability range (e.g., Lombardo et al., 2016a). The approach is problematic, because it sets the cutoff where the model is most uncertain (Rossi et al., 2010a;Reichenbach et al., 2018). Frattini et al. (2010) pointed out that the choice of a cutoff value depends on the proportion of the presence-absence cases, leading to (much) smaller probability cutoff in datasets with a larger prevalence. For instance, Van Den Eeckhaut et al. (2006) reported an upper cutoff of 0.0012 for the high susceptibility class in a study case in the Flemish Ardennes where landslides were rare. In an attempt to address the issue, Castro Camilo et al. (2017) proposed to select a cutoff value that maximises the model accuracy, obtained testing thousands of probability values from the estimated susceptibility.
The problem becomes even more complex when the classification scheme involves more than two classes, typically three to seven (Reichenbach et al., 2018). A number of authors have used quantiles to segment the continuous probability estimates into discrete susceptibility classes, e.g., from "low" to "high" susceptibility. As an example, Lombardo and Mai (2018) used 2.5, 25, 50, 75, and 97.5 percentiles (i.e., where the three central values are motivated from the classical "boxplot"). Other authors segmented their probability estimates by counting the number of observed landslides in each probability class. As an example, Petschko et al. (2014) counted the number of landslides in each susceptibility "bin", and set probability thresholds corresponding to 5%, 25%, and 70% of the total observed landslides. An alternative approach was proposed by Chung and Fabbri (2003), who ranked their probability estimates based on an "effectiveness ratio", i.e., the ratio between the proportion of total landslide area, A LT in each susceptibility class and the proportion of the susceptibility class in the study area. Guzzetti et al. (2006b) adopted this approach to show the result of their susceptibility zonation for the Collazzone study area. The mentioned approaches have their rationale, but examples exist in the literature for which there is no clear justification for the adopted classification system, especially when the aim is to produce a predictive map rather than measuring the model performance. For instance, Ayalew and Yamagishi (2005) reported that slicing the probability domain into equally-spaced bins was not optimal in their case, and suggested using bins equal to the standard deviation of the probability. Similarly, Pourghasemi et al. (2012) used the natural break tool in ArcGIS R without providing an explanation for this choice. This is the current, controversial, and unclear state of play in the landslide susceptibility literature. We further note that all these methods fail to recognise that probability values near 0.5 reveal the inability of the model to determine if a mapping unit (e.g., a SU) is stable or unstable, and do not represent "medium" or "intermediate" susceptibility conditions, or levels (Rossi et al., 2010a;Reichenbach et al., 2018).
In this work, we propose an innovative strategy to classify and rank the intensity and the susceptibility estimates, and theirs associated uncertainties, provided by our five models. We adopt the following four-classes ranking scheme: • Clearly stable SUs: intensity ≤ 0.05, equivalent to susceptibility ≤ 0.05. This class corresponds to an intensity/susceptibility range, where the model predicts a lack of landslide occurrences with high probability. More precisely, for the SUs in this class, the probability of having no landslide is estimated to be more than 95%.
• Uncertain Type 1: 0.05 < intensity ≤ 1, equivalent to 0.05 < susceptibility ≤ 0.63. This class characterises SUs that have a probability of landslide occurrence being greater than 5%, while their expected landslide count is less than one (on average).
• Uncertain Type 2: 1 < intensity ≤ 3, equivalent to 0.63 < susceptibility ≤ 0.95. This class characterises SUs that have a probability of landslide occurrence being less than 95%, while their expected landslide count is more than one (on average).
• Clearly unstable SUs: intensity > 3, equivalent to susceptibility > 0.95. This class corresponds to an intensity/susceptibility range, where the model predicts landslide occurrences with high probability. More precisely, for the SUs in this class, the probability of having at least one landslide is estimated to be more than 95%, and the expected landslide count is more than 3 per SU.
This new classification is applied and discussed below in §6.7.

Results
In this section, we present the results of our modelling effort. First, in §6.1, we outline the results of our simplest, baseline model Mod1 in (3), showing the model intensity and susceptibility estimates. Model Mod1 is the closest LGCP counterpart to more "traditional" susceptibility models, and it represents a good reference against which to compare the more advanced models, which we do in §6.2. This is followed by the presentation of the models' fitting performance ( §6.3), of the latent temporal ( §6.4) and linear covariates ( §6.5) effects, of the models prediction skills ( §6.6), and of what we consider our best intensity-susceptibility model (Mod5) ( §6.7). Lastly, in §6.8, we provide information on the computational requirements to fit our models.

Baseline intensity and susceptibility estimates
For each of the 889 SUs that partition our study area, Figure 4 shows, in map form, the fitted estimates obtained by model Mod1 exploiting the morphometric and thematic covariates listed in Table 2, and without considering any spatial or temporal latent dependency in the data. Figure 4A shows the spatial distribution of the landslide intensity, i.e., of the number of landslides that, on average, can be expected in each SU, based on the used covariates.
More precisely, the map shows the temporal average, from T1 to T6 (see Figure 3), of the modelled intensities for each time interval, i.e., their posterior means. We note that model Mod1 estimated up to 6.4 landslides per SU; a Figure that is significantly lower than the maximum number of landslides (25, see Figure 3) in a SU in the multi-temporal inventory (Figure 1). This is a limitation of model Mod1. Figure 4B shows the spatial distribution of landslide susceptibility, i.e., the propensity (proneness) of each SU to generate landslides, based on the considered local terrain conditions (Table 2). In the map, the susceptibility estimates were obtained from the intensity estimates of Figure 4A using equation (12). In the intensity ( Figure 4A) and the susceptibility ( Figure 4B) maps, the histograms show the frequency distributions of the modelled intensity and susceptibility estimates. Visual comparison of the two histograms reveals that the distributions are significantly different; with the distribution of the intensity estimates positively skewed, and the distribution for the corresponding susceptibility estimates more uniform. This was expected, as the the number of SUs that can generate a large or very large number of landslides is limited in the study area, whereas the number of SUs that can generate landslides, i.e., that are potentially "susceptible" to landslides, is large and geographically distributed. Lastly, Figure 4C portrays a joint landslide intensity-susceptibility map prepared adopting the four-class ranking scheme proposed in §5.9, which gives a combined view of the other two maps. In the combined map, out of the 889 SUs, 82 (9.2%) are classified as "clearly stable" (blue), and 47 (5.3%) as "clearly unstable" (red). Overall, the terrain estimated to be "clearly stable" covers 1.9 km 2 , 2.4% of the study area, and the terrain estimated to be "clearly unstable" covers 13.3 km 2 , 16.9% of the study area. In the "clearly stable" SUs, the estimated landslide intensity is expected to be ≤ 0.05, and the corresponding susceptibility also ≤ 0.05, i.e., very low. Conversely, in the "clearly unstable" SUs, the landslide intensity is expected to be > 3, i.e., three or more expected landslides, and the susceptibility is > 0.95, i.e., very high. Further inspection of Figure 4C reveals that the majority of the SUs (58.8%), covering 28.5 km 2 , 36.1% of the study area, is considered of "Uncertain Type 1", followed by 26.7% of "Uncertain Type 2", covering 35.2 km 2 or 44.7% of the Collazzone area. In these areas, on average, less (respectively more) than one landslide is expected in each SU.

Advanced intensity and susceptibility estimates
To highlight the higher flexibility and the better performance of the more advanced models Mod3, Mod4, and Mod5, featuring spatial, temporal, and spatio-temporal latent effects, respectively, we show in Figure 5 the relative intensity maps, i.e., the ratio of the intensity estimates obtained for each time interval (T1 to T6) obtained by models Mod3, Mod4, and Mod5, to the baseline model Mod1, and in Figure 6 the corresponding relative susceptibility maps, showing the ratio of the susceptibility estimates obtained by the three advanced models to the baseline model Mod1.
Interestingly, the values portrayed in the maps shown in the two Figures 5 and 6 represent the factors by which the intensity and susceptibility baseline estimates should be multiplied to account for the space, time, and space-time dependencies. More precisely, the maps in the 18 panels of Figure 5 showΛ Mod3 estimated from each respective model for the different time intervals j = 1, . . . , 6. Similarly, using the intensity-susceptibility conversion equation (12) Figure 4A) have to be multiplied to get the estimated intensity for models Mod3, Mod4, and Mod 5, respectively. Color bar is uniform but limited to values greater than or equal to a factor of five for graphical purposes. Small graphs show density of intensity ratios (IRs) for each model and temporal interval. Note that x-axes and y-axes in the individual graphs cover different ranges. See text for explanation. respect to model Mod1, facilitating the interpretation of the performance of the advanced models Mod3, Mod4, and Mod5. To facilitate the visual comparison of the patterns of the estimated latent effects, when preparing the maps we choose different colour bars valid for each map in the two Figures 5 and 6. This was obtained saturating the colour scales at a factor of 5 for the Intensity ratios (IRs), and at a factor of 3 for the Susceptibility ratios (SRs). To further help the comparison, in each map we show the probability density distributions of the estimated values.
For each of the advanced models Mod3, Mod4, and Mod5, the estimated intensity and susceptibility ratios strongly differ from one, in several areas and time intervals, suggesting  Figure 4B) have to be multiplied to get the estimated intensity for Mod3, Mod4, and Mod 5, respectively. Color bar is uniform but limited to values greater than or equal to a factor of five, for graphical purposes. Small graphs show density of susceptibility ratios (SRs) for each model and temporal interval. Note that x-axes and y-axes in the individual graphs cover different ranges. See text for explanation.
that the inclusion of latent random effects in these complex models is necessary to capture the large variations in intensity and susceptibility across space and time. In other words, these variations cannot be explained solely by the available covariates included in model Mod1. The higher flexibility of the random effect models may thus improve the goodness-offit and prediction skills. This will be investigated in more detail in §6.3 and §6.6. We further note a clear resemblance between the intensity and susceptibility ratios of the spatial and spatio-temporal models, namely Mod3 and Mod5. This hints at a greater effect of the spatial dimension with respect to the temporal dimension in explaining the known distribution of landslides ( Figure 1).

Fitting models performance
For each SU in the study area, Figure 7 compares the observed to the estimated landslide counts, for the five models (Mod1 to Mod5). It also shows the susceptibility estimates calculated using equation (12), to facilitate comparison with traditional landslide predictive studies. Visual inspection of Figure 7 reveals a clear pattern where the baseline models, Mod1 and Mod2, strongly underestimate the observed counts larger than one, whereas they often tend to largely overestimate the zeros (i.e., no landslides). By contrast, the advanced models, Mod3, Mod4 and Mod5 that account for space, time, and space-time dependencies, respectively, significantly improve the goodness-of-fit, with points aligned closer to the diagonal, the latter corresponding to a perfect fit. Similarly, the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) computed for the five models show a similar situation where Mod1 is the weakest, followed by Mod2, whose performance is slightly better due to the contribution of the multiple temporal intercept. Overall, Mod3, Mod4, and Mod5 perform much better, improving the two baseline results from acceptable to outstanding, according to Hosmer and Lemeshow (2000).

Temporal effects
To investigate the temporal dynamics driving landslide occurrence in our study area, we now focus on model Mod4 in (9), in which we decompose the temporal effect into global multiple intercepts (with one coefficient for each time interval), assumed to be a priori independent across time, and latent temporal effects (LTEs) for each SU, assumed to be driven by an autoregressive temporal dependence structure (Blangiardo and Cameletti, 2015;Opitz, 2017). While the multiple intercepts are constant in space and capture abrupt changes in the overall landslide intensity over time (e.g., due to triggers of different magnitudes), the LTE is designed to capture local, SU-specific changes that are smoother in time, and we thus make the assumption that the LTE carries information about "clustering" and "repellency" effects in each SU.
The plot in Figure 8A shows the posterior distribution of the multiple intercepts. Inspection of the plot reveals a sudden increase of the multiple intercept during T2 (1941)(1942)(1943)(1944)(1945)(1946)(1947)(1948)(1949)(1950)(1951)(1952)(1953)(1954). This is the result of a severe regional rainfall event that hit Central Italy in December 1937 (Reichenbach et al., 1998), resulting in numerous landslides in the Collazzone area, which was captured in the multi-temporal inventory interpreting aerial photographs taken in 1941. In this sense, the multiple intercepts carry the strength of the overall effect of the triggers in the six time periods. Conversely, the LTE captures more localised effects in each SU, estimating the relation between landslide counts in a given time period and the number of landslides in the following time period.
In Figure 8B, we show the temporal evolution of the posterior means of the LTE for  the 889 SUs in the study area, in a single plot. Inspection of the plot reveals that most of the SUs (845, i.e., 95.0%) exhibit erratic temporal trends, with LTE increasing or de-creasing "randomly" in time (grey lines in Figure 8B). Closer inspection of the plot reveals that (i) a small number of SUs (34, i.e., 3.8%) exhibits a monotonically decreasing trend of the LTE (blue lines in Figure 8B), and (ii) an even smaller number of SUs (10, i.e., Figure 8B). From a geomorphological perspective, the first group encompasses SUs characterised by landslide temporal "repellency", where the presence of a landslide in a time period has hampered the occurrence of new landslides in the future periods in the same SU, whereas the second group encompasses SUs characterised by temporal "clustering", where new landslides have continually followed previous landslides in the same SU, for the entire considered period (T1-T6).

1.2%) exhibit a monotonically increasing trend of the LTE (red lines in
The later result agrees with the findings of Samia et al. (2018) who have identified a "temporal path dependency" of new landslides on pre-existing landslides in the Collazzone area. Interestingly, the temporal response of landslide path dependency identified by Samia et al. (2018) disappears after about 10-15 year in the study area, i.e., within most of the time periods considered in this study. This explains the reduced number of SUs characterised by distinct temporal "clustering" found in this study. Figure 9 shows a summary of the posterior distribution of all the estimated regression coefficients that appeared to be significant in at least one of the five models. In the plots, we also show the estimated coefficients for Mod1 and Mod2 to highlight an issue common to all regression models in which residual dependence is not accounted for appropriately. In fact, the 95% credible interval of the regression coefficients estimated for Mod1 and Mod2 (with no latent effects included) is narrower than the credible intervals for the models with the LSE (orange), LTE (yellow) and LSTE (brown). This is a result of the model structure, where the simpler models (Mod1 and Mod2) are overconfident of the information carried by the observations, whereas the models that incorporate spatial (Mod3), temporal (Mod4) and spatio-temporal (Mod5) dependencies estimate more realistic credible intervals. In our case, the differences are small, and the pattern shows that the five models assign analogous posterior mean values to each covariate, both in amplitude and sign. We consider this an evidence of the overall goodness-of-fit of the different models.

Covariates effects
Determining whether a predictive model is useful in practice depends, among other factors, on the interpretability of the estimated covariates' effects. Out of all the 29 covariates used to construct the models (see Table 2), 15 were significant in at least one model (see Figure 9). As the estimated regression coefficients were similar across the five models, we now provide a unique interpretation. The Mean Slope and Mean TWI variables gave the strongest contribution to the models, with coefficients much larger (in absolute value) than the coefficients of all other covariates. Both variables contributed to increase the landslide intensity (hence the susceptibility) in all the SUs. From a geomorphological perspective, terrain slope controls the balance of the retaining and the destabilising forces acting on a slope (Taylor, 1948;Wu and Sidle, 1995;Donnarumma et al., 2013), and in many areas and   Figure 9: Regression coefficients of all covariates, whose coefficients were significant in at least one of the fitted models, Mod1 (pink), Mod2 (blue), Mod3 (orange), Mod4 (yellow), and Mod5 (brown). Diamonds show posterior means; triangles show pointwise 97.5 and 2.5 posterior percentiles bracketing 95% pointwise credible intervals.

SD of TWI
for landslides of the slide type (Hungr et al., 2014) like the one prevalent in the Collazzone study area, terrain slope and its derivatives (e.g., mean slope, slope range, standard deviation of slope) are known to be positively (albeit not necessarily linearly) correlated to the presence and abundance of landslides, and hence to landslide susceptibility (Carrara et al., 1991(Carrara et al., , 1995aFabbri et al., 2003;Budimir et al., 2015;Lombardo and Mai, 2018). TWI measures the ability of a given area to retain surface water as a function of the terrain gradient and the upslope contributing area, favouring infiltration and the increase of the pore water pressure at depth, and, hence, slope instability (e.g., Yilmaz, 2009;Cama et al., 2017). Curvature primarily controls convergence and divergence of overland flows which is often linked to slope stability (e.g., Ohlmacher, 2007). Here, laterally-concave Planar Curvature is estimated to contribute to landslide-prone conditions, whereas mean upwardly-concave Profile Curvature conditions and their variability within a SU increase the expected number of landslides. This effect is exacerbated by the standard deviation of Elevation which is a proxy for terrain roughness. Our five models concurred that an increase in the standard deviation of Elevation within a SU contributed to an increase in the estimated number of landslides (i.e., a larger intensity), and hence to a larger susceptibility. The mean Relative Slope Position (RSP) was significant only for Mod1 and Mod2, although larger RSP values contributed to increasing the estimated intensity, for all five models. The RSP is a continuous index which essentially assigns 0 to lowland and flat areas, and up to 1 to mountain tops. Thus, a positive regression coefficient suggests that SUs located mostly in the high portion of the local topography are more prone to landslides than SUs located chiefly in the lower part of the local topography.
Terrain aspect, jointly measured by the Eastness and Northness covariates, both expressed by their mean and standard deviations, played a significant role albeit with a small amplitude (i.e., small absolute coefficient values). According to their sign, SUs facing North or West were related to larger landslide intensities and larger susceptibility estimates. We note here that the decomposition of the terrain aspect into its two main components (Eastness and Northness) was a numerically convenient way to handle the nonlinear and cyclic exposition signal on slope stability/instability conditions. However, by decoupling the aspect into a linear combination of sine and cosine components, we lost the original interpretation of the overall effect of the aspect expressed in degrees within the [0, 360) • range. To compensate for this, in Figure 10 we show the overall reconstructed effect of the terrain aspect, for each model. Precisely, in the Figure we plot β Eastness cos(θ) + β Northness sin(θ) as a function of θ ∈ [0, 360] • , where β Eastness and β Northness denote the Eastness and Northness coefficients, respectively, estimated from each model. Inspection of the Figure reveals that the effect of terrain aspect on landslide intensity and susceptibility is significant, and when backtransformed to its original scale, the W-NW components mentioned above reveal a clear positive contribution to the landslide counts (i.e., landslide intensity), which changes to a negative effect when moving towards E-SE components. This was known in the study area, and depends on the geometric and geomorphological interaction between the prevalent atti-tude of the bedding planes that characterise the study area and the orientation and geometry of the slopes Santangelo et al., 2015).

Predictive performance of models
The goodness-of-fit for the baseline models Mod1 and Mod2 was weak in terms of fitted counts, but acceptable in terms of binary metrics. Conversely, the more advanced random effect models Mod3, Mod4, and Mod5 showed outstanding explanatory performances both in terms of expected landslide counts, and fitted presence-absence probabilities (see Figure 7). Thus, it is natural to wonder whether or not the more complex LGCP models overfit the data. In case of overfitting, their predictive performance would be low.
To assess this, and to quantify the predictive performance of each model, we designed two cross-validation (CV) procedures. Because the data are spatio-temporal in nature, we considered both a spatial CV scheme and a temporal CV scheme. We measured the spatial predictive performance using a 10-fold cross-validation procedure (abbreviated Space 10-Fold ), and the temporal predictive performance using a leave-one-out cross-validation procedure (abbreviated Time leave-one-out). More precisely, the spatial 10-fold CV consists in splitting the original dataset into 10 complementary subsets at random, each comprising 10% of the original SUs. The model is then fitted using nine subsets (i.e., 90% of the SUs), and subsequently validated using the left-out subset (i.e., 10% of the SUs). The procedure is repeated by leaving out each subset. The complementary constraint ensures that every SU in the Collazzone study area is predicted exactly once (and only once) for all time intervals during the CV routine. As for the temporal leave-one-out CV, we leave out the data from one of the six time intervals, then we fit the model using the five remaining intervals, and finally we validate the model using the data from the time interval that was left out. The procedure is repeated by leaving out the data from each time interval. Essentially, this corresponds to a temporal 6-fold CV scheme, where each test set consists of a single time interval.
For both CV schemes, we examined the models' performances both in terms of intensity (i.e., predicted counts) and susceptibility (i.e., predicted probability of landslide occurrence), using the same summary measures used in Figure 7. Results are summarised in Figure 11 where the two main vertical panels represent the intensity and susceptibility results, and the sub-columns summarise the results for the Space 10-fold and Time leave-one-out procedures. The different rows correspond to Mod1, Mod2, Mod3, Mod4, and Mod5 (from top to bottom). We opted to avoid colour coding the Time leave-one-out intensities by time to improve readability of the figure.
Inspection of Figure 11 reveals that the agreement between the observed and the predicted landslide counts increases significantly from our simple models (Mod1, Mod2) to the advanced models that include latent variables (Mod3, Mod4, Mod5), which are better capable of predicting the number of landslides in each SU. In Mod3 and Mod5, the match between observed and estimated counts is reasonably good, i.e., it is still confined within the theoretical uncertainty of the Poisson distribution, up to 14 landslides. Conversely, from this number to the maximum (25 landslides) in the dataset, the predicted counts tend to underestimate the actual observations. This is not a feature of the model, but rather a consequence from our dataset, which comprises a large number of SUs with no or a few landslides and very few SUs with many landslides. Inspection of Figure 3 reveals that the landslide dataset has only a few isolated samples larger than 10 counts, which is where the model starts to perform poorly. Also, the Time leave-one-out is the CV scheme that deviates most strongly from the performance obtained for the fit. An explanation is that the Space 10-fold removes 533 SUs per iteration (889 × 6/10 ≈ 533), whereas a temporal CV removes 889 SUs. Moreover, the temporal CV disrupts the coherence in the data more strongly, since removing one time interval removes either 50% or 100% of the direct temporal neighbours for a large number of data points. Therefore, it is the most challenging CV scenario we could devise. Nevertheless, we note that the overall performance shown in the susceptibility case still falls in the "excellent" class of Hosmer and Lemeshow (2000), with outstanding AUC metrics. The improvement from Mod1 to Mod5 is measured quantitatively by the models' AUCs, which justify the inclusion of the latent effects. Figure 11: Out-of-sample performance summary of each fitted LGCP model Mod1 to Mod5 (top to bottom). Predicted counts plotted against observed counts based on a spatial 10fold CV scheme (Column 1) and a temporal leave-one-out CV scheme (Column 2). For each observed count, we show boxplots of predicted counts to summarise their distribution across all SUs. Grey lines are 95% intervals for an exact Poisson distribution with mean between 0 and 25. ROC curves and corresponding AUC values for each test set based on a spatial 10-fold CV scheme (Column 3) and a temporal leave-one-out CV scheme (Column 4).

Best intensity-susceptibility predictive model-Mod5
Examining the results or our modelling effort in terms of (i) estimated landslide counts, i.e., of predicted landslide intensity (Lombardo et al., 2018a(Lombardo et al., , 2019b, and of (ii) estimated binary presence-absence of landslides, i.e., of predicted landslide susceptibility (Brabb, 1985;Guzzetti et al., 1999;Reichenbach et al., 2018), and considering the models' spatial and temporal structures (Figures 5, 6), and their fitting (Figure 7) and predicting (Figure 11) performances, we maintain that Mod5-which jointly accounts for the spatial and temporal dependencies-has a better overall performance than the other four models (Mod1 to Mod4).
Mod5 provides comparable patterns to Mod3 in terms of the predicted landslide counts over time, keeping the flexibility of Mod4 in the binary (i.e., susceptibility) predictions. We conclude that Mod5 is our best model, and we select it to generate a combined landslide intensity-susceptibility classification for our study area, adopting the ranking scheme proposed in §5.9. We summarise the results of Mod5, for each of the six time periods (T1-T6), in Table 3 and Figure 12. Inspection of results reveals some degree of temporal variability in the combined intensity-susceptibility patterns. However, the general spatio-temporal pattern remains about the same. Similarly to the outcomes of our baseline model Mod1 ( §6.1), the number of SUs that Mod5 predicts capable of generating a large or very large number of landslides is limited, whereas the number of SUs that can generate landslides, i.e., that are potentially "susceptible" to slope failures, is large and geographically distributed. This is reasonable from a geomorphological and landscape evolution perspectives.
Overall, and for the entire considered period (Figure 3), model Mod5 classifies the (relative or absolute) majority of the SUs, from 409 (T2, 46.0%) to 602 (T4, 67.5%), as of "Uncertain Type 1" ( §5.9). In each of these SUs, covering collectively between 25.6 (32.4%) and 49.8 (63.1%) km 2 , the estimated landslide intensity, i.e., the expected number of landslides, is in the range (0.05, 1], on average. The second class encompasses from 100 (T4, 11.3%) to 227 (T1, 25.5%) SUs classified as of "Uncertain Type 2". In these SUs, covering collectively between 17.8 and 25.5 km 2 (22.5 to 32.3%), the number of expected landslides is in the range (1, 3], on average (Table 3). With a few exceptions, model Mod5 classifies the smallest number of SUs, from only 18 (T4, 2.0%) to 140 (T6, 15.8%), as "Clearly Unstable", covering between 4.6 (5.8%) and 23.6 (29.9%) km 2 , followed by from 97 (T2, 10.9%) to 169 (T4, 19.0%) SUs classified as "Clearly Stable", covering between 3.5 (4.4%) and 6.8 (8.7%) km 2 . We maintain that the variability in the predicted classification estimates ( Figure 12, Table 3) measures (i) the spatio-temporal uncertainty of the landslide intensitysusceptibility estimates at the spatial scale and in the temporal range considered by our modelling experiment, i.e., the aleatory uncertainty inherent in the landslide processes, and (ii) the epistemic model uncertainty introduced by the adopted modelling framework and the available landslide ( Figure 1) and thematic data (Table 2).  Figure 12: Combined intensity-susceptibility classification of the Collazzone study area, Umbria, Central Italy, based on model Mod5 constructed using morphometric, geologic, bedding attitude, and space-time dependencies. Each map corresponds to one of the six time periods shown in Figure 3. Pie-charts show the percentage of SUs falling into one of the four considered classes. See §5.9 for an explanation of the adopted classification and ranking scheme. 42

Computational requirements
All models presented in this paper fitted on any state-of-the-art computer. Running times were less than one minute (baseline models Mod1, Mod2), several minutes (Mod4), and around 6 hours (Mod3, Mod5). For estimation of the full model with all data and for the cross-validation models with some of the landslide counts held out, running times were of comparable order of magnitude in all cases. In particular, we consistently observed long running times of several hours for Mod3 and Mod5 due to longer computations related to Laplace approximations, but the diagnostic output of R-INLA did not indicate any instabilities. Memory requirements were less than 1Gb in all cases when using 2 cores in parallel for each job.
The main computational cost of our modelling procedure stems from the Laplace approximations, performed repeatedly during an estimation run with INLA. A general rule of thumb is that computation times increase with the number of observations (5334 with our dataset), the number of latent variables (29 parameters for Mod1, 34 parameters for Mod2, and 5368 parameters in the Mod3, Mod4 and Mod5 comprising random effects), and the complexity of the additive predictor model comprising components for fixed and random effects. When there is a tendency towards confounding of effects, i.e., when different additive components can provide similar contributions to the predictor, computations can become less stable such that computation times increase, or estimation may even fail, which did not happen with our models.
The mapping units for which event counts are recorded may have higher resolution or may span larger areas than here, with up to several hundreds of thousands of observations (e.g., Lombardo et al., 2018a). Running INLA on such datasets, with models comprising several thousands of latent variables, then typically takes several hours, or even several days in extreme cases. With memory requirements of R-INLA easily exceeding 16Gb in such highdimensional models, estimation is usually carried out on machines dedicated to scientific computing. In general, Bayesian hierarchical modelling demands considerably higher computing resources than classical frequentist approaches but also provides significant benefits as shown in this work. In the context of geomorphological applications, the estimation results usually have to be established only once, without any need to reestimate models for continuously updated data, such that this increased computational burden remains manageable in practice. Moreover, the library R-INLA provides several choices of less accurate approximation schemes to speed up estimation and reduce memory usage. Its recent integration of the powerful PARDISO library for numerical matrix computations further increases its potential for solving very high-dimensional problems (van Niekerk et al., 2019).

Discussion
We now discuss the results of our modelling experiment. We first focus on what we consider the main advantages and the limitations of the new modelling framework, and its potential applicability to other areas ( §7.1). Next, we make specific and general considerations on the results of our work for landslide hazard assessments ( §7.3). This is followed by a critical analysis of the modelling approach for geomorphological and slope stability inference ( §7.4). Lastly, we provide a perspective on further developments of landslide predictive modelling and zoning ( §7.5).

A new landslide predictive modelling framework
In our work, we experimented an innovative, Bayesian modelling framework for the spatiotemporal prediction of landslides of the slide type (Hungr et al., 2014) ( §5). Results proved that the adopted framework was capable of predicting the temporal, the spatial, and the spatio-temporal distributions of known landslides that occurred in our study area in the period from before 1941 to 2014 (Figure 1). Results also showed that considering the existing, albeit not identified explicitly (i.e., "latent"), landslide dependencies in space and time improved significantly the model predictive performance ( §6.6 and Figure 11), when compared to a simpler ("traditional") model, exemplified by our baseline model Mod1 (Figure 4) which does not consider the spatio-temporal dependencies among landslides.
Our Log-Gaussian Cox Process (LGCP) model is a "doubly" stochastic process, with the stochasticity given by (i) a Poisson component, describing the number of landslides in each SU, and by (ii) a Gaussian component, which describes the landslide intensity (i.e., the expected count per SU) on the logarithmic scale and allows to incorporate several types of random effects, namely linear, nonlinear and nonlinear, at a latent level (Lombardo et al., 2018a). Therefore, the LGCP approach provides a convenient framework to investigate and interpret morphometric and thematic covariates' influence on slope instability in the study area ( Figure 9). It also brings to light unobserved dependencies that influence the landslide intensity function Λ j (s) ( §5.2), and the derived landslide susceptibility estimates. We consider this a significant improvement over existing, statistically-based landslide prediction modelling tools currently available in the literature (Reichenbach et al., 2018), which do not cope with latent effects, and typically do not consider complex temporal or spatio-temporal landslide dependencies.
To construct our models, we exploited a multi-temporal landslide inventory comprised of numerous (3,379) landslides, which occurred in a significantly long period over our study area (Figures 1, 3). The geographical ("cartographic") and thematic ("geomorphological") detail and the accuracy of the landslide mapping were key to inform properly our models, and to evaluate their performances (Figure 7) and prediction skills ( Figure 11). This may be seen as a limitation of the proposed framework, which requires accurate landslide data to provide reliable intensity and susceptibility estimates. However, we maintain that in order to predict landslides in space and time such detailed and accurate information is necessary (mandatory), albeit it may be costly and time consuming to obtain it (Galli et al., 2008;Guzzetti et al., 2012). While accurate multi-temporal landslide information is available, together with relevant thematic data (e.g., Table 2), the added effort to construct and run an advanced model (e.g., our model Mod5) compared to a simpler model (e.g., Mod1) is negligible, both in terms of GIS pre-processing and data preparation, and for the statistical modelling. Indeed, with the R-INLA library, we can run very complex models using a simple syntax. In our case, the difference between Mod1 and Mod5 was two additional lines of code. Inspection of the model fitting ( Figure 7) and predictive ( Figure 11) performances further reveals that the more advanced models (Mod3, Mod4, Mod5) where generally better at predicting the spatial (i.e., "where") rather than the temporal (i.e., "when") component. We maintain that this is due to the combined effect of (i) the inherent short-term temporal viability-and related unpredictability-of landslide phenomena at the SU scale, at least in our study area (Samia et al., 2017a(Samia et al., ,b, 2018, and (ii) the number and length of the considered temporal periods and the number of landslides in each period (Figure 3), which depend on the temporal frequency of landslides in our study area. The latter, is in turn controlled by the frequency of the landslide triggering forcing events (e.g., severe or prolonged rainfall periods, rapid snow-melt events) (Rossi et al., 2010b;Witt et al., 2010). This has hazard and geomorphological consequences, which we address below in §7.3 and §7.4.
The models accounting of the spatial landslide dependencies involve smoothing residuals across adjacent SUs. This may have introduced some local inconsistency or error at the boundary between the SUs. However, as mentioned in §5.5, the models are flexible, and let the data prevail without introducing too much noise to the intensity estimates. Should one need to account for the effect of a "rigid" barrier (e.g., a river, a major divide, a main lithologic or tectonic discontinuity) on landslide intensity or susceptibility, two solutions are possible. Bakka et al. (2019) have developed a model for incorporating physical barriers, which can be fitted using R-INLA, although their method relies on a different type of spatial effect as the one exploited in this work. Alternatively, one can remove manually the links between adjacent SUs in the adjacency matrix ( Figure 2). Since the residual smoothing process is governed by the adjacency matrix, removing appropriate links will prevent the latent effect from "propagating" from one SU to its direct neighbors.
The computational burden of a space-time LGCP model in R-INLA depends on, and scales with, the size of the dataset by exploiting random effects with sparse precision (i.e., inverse covariance) matrices. Relatively small areas like the one used for our experiment can be investigated effectively with a standard, modern personal computer even for spatiotemporal models (e.g., model Mod5). Larger datasets covering large and very large areas need proportionally larger computer facilities. We note here that the adoption of the SUs as the mapping unit of reference, or of other similar terrain mapping units (Guzzetti et al., 1999;Guzzetti, 2005;Van Westen et al., 2006), as opposed to "grid cells" or pixels, has reduced greatly the computational burden. In general, use of SUs facilitates the construction of complex models even for large and very large areas covering thousands of square kilometres . We conclude that the main limitation for performing complex, space-time LGCP landslide modelling is mostly due to the lack of accurate datasets and detailed multi-temporal landslide inventories, and not to the computational requirements.
This should guide geomorphologists interested in landslide prediction modelling in their time allocation and resource investment (Guzzetti et al., 1999).
We further note that we successfully tested the new framework ( Figure 11) for landslides predominantly of the "slide" type (Hungr et al., 2014), which are common and abundant in our study area, and in similar areas in Central Italy and elsewhere in similar physiographical settings. We acknowledge that further efforts are required to test the framework with different landslide types, since their temporal and spatial dependencies may vary. However, we do not see any geomorphological or statistical reason that should limit or hamper the applicability of the proposed framework to other landslide types. Lastly, we stress that the predictions made by all our models are valid under the general assumption that (i) the driving forces that control the landslide processes in the study area are known and captured through the covariates used in the models, and (ii) the driving forces will remain nearly the same in the foreseeable future Guzzetti et al., 2006a). We maintain that both assumptions hold in our study area, but the same key assumptions should be considered thoroughly when similar models are constructed in other areas.

Statistical considerations
A rigorous implementation of a model based on spatial point pattern theory would have required to treat each landslide as a precisely geolocated point. For practical implementation, it is usually satisfactory to know in which mapping unit a landslide occurred. However, a few landslides in the multi-temporal inventory (Figure 1), mostly present in the T1 and T2 periods (Figure 3), have a large or very large area (for T1 and T2: A L ≥ 2.2×10 2 m 2 ; for T3, T4 and T5: A L ≥ 5.8 × 10 2 m 2 ), and intersect multiple SUs. Treating such large landslides as single "points" would have been a severe forcing from a geomorphological perspective. We were then faced with the choice of whether to conflict (i) with the conditional independence assumption of points in our modelling tool, or (ii) with the empirical, geomorphological field evidence.
The conditional independence assumption states that observed landslide counts are independent if we know the value of the predictor comprising the covariate information and random effects (the latter only if they are part of the model). This assumption is common to all well-established spatial statistical models for discrete data, irrespective of the choice of a susceptility model (i.e., using a Bernoulli distribution for presence-absence data) or an intensity model (i.e., using a Poisson distribution for count data), and seems difficult to abandon, especially as it is a critical assumption for using INLA.
By analogy with susceptibility studies in the existing literature (e.g., Guzzetti et al., 2006a), we chose to respect the field evidence, and we counted the presence of a landslideor of a portion of a landslide-in each SU if the landslide area exceeded 2% of the SU area, a percentage that accounts for possible mapping errors (Carrara et al., 1991(Carrara et al., , 1995b. We acknowledge that this approach creates some dependence between events in nearby SUs and can lead to local clustering patterns of events that cannot be fully captured by models that obey the conditional independence assumption. This entails two major problems: first, the model lacks realistic small-scale behavior and may underestimate components of landslide risk at relatively small spatial scales; second, statistical inference is flawed by considering dependent observations as independent, which will cause an underestimation of uncertainty, for instance by declaring certain covariates as significant while they are not.
However, by using random effects as in our models we can substantially alleviate these problems by capturing local dependence and clustering structures that cannot be explained by geomorphological covariates alone. In other words, the latent spatial random effect can capture (part of) the dependence induced by the largest landslides affecting several SUs. While classical generalized linear models (GLMs) have only fixed effects and assume complete independence of observations, our models are based on the less restrictive assumption of conditional independence with respect to the combination of fixed and random effects. In this Bayesian framework, we can model the propagation of landslide counts over neighboring SUs, which also includes how far a single landslide extends in space (i.e., "how large" it is). By working with intensities instead of susceptibilities, i.e., with count data instead of presence-absence data, we further reduce the loss of information in data and models when opting for larger mapping units, where the phenomenon of single landslides stretching over several units becomes less frequent. As a result, the procedure of using an intensity framework with random effects brings our models closer to the accepted definition of landslide hazard (Varnes, 1984;Guzzetti et al., 1999Guzzetti et al., , 2005.

Hazard considerations
As mentioned in §2, the prediction of landslide hazard proposed by Varnes (1984), and later modified by Guzzetti et al. (1999) and Guzzetti et al. (2005), requires the anticipation-in probabilistic terms-of "where" (spatial component), "when" (temporal component), and "how large" or destructive (magnitude component) landslides are expected to be in an area (Guzzetti, 2005). Our new modelling framework, and specifically our more advanced model Mod5, fulfils the definition, to a large extent. Model Mod5 accounts for the spatial and temporal dependencies of landslides, including latent effects not explicitly described by other covariates. The magnitude component of the hazard is also present in model Mod5, given by the expected number landslides in each SU, i.e., by the landslide intensity. We note here that the number of landslides was used as a measure of landslide event magnitude, e.g., by Keefer (1984) for earthquake-induced landslides and by Malamud et al. (2004) for landslides caused by weather and geophysical triggers. Furthermore, landslide size characteristics, including landslide area (Hovius et al., 1997;Guzzetti et al., 2002;Malamud et al., 2004), volume (Malamud et al., 2004;Brunetti et al., 2009), area-to-volume ratio Larsen et al., 2010), and length-to-width ratio (Taylor et al., 2018), which can all be associated to the vulnerability to landslides  and hence to the landslide destructive power, are known to be empirically related to the number of landslides in an area. And, because we also considered the landslides' extent when counting slope instabilities per slope units, our models Mod3 and Mod5 are even informed by the latent fields on the persistence (a proxy for size) of landslide counts over space.
We conclude that the landslide intensity framework proposed by Lombardo et al. (2018aLombardo et al. ( , 2019b for spatial predictions, and extended here in time and space-time, is well suited to fulfil the requirements given by the standard definition of landslide hazard, and capable to do so within a single model. This is a significant advancement over previous hazard models that considered the spatial, temporal, and the landslide area (a proxy for magnitude) components separately (Guzzetti et al., , 2006a, and had to further assume the independence of the three components to properly estimate landslide hazard in probabilistic terms. Our experiment revealed that only a few SUs in our study area exhibited a constant landslide clustering or repellency trend over the entire considered period, and that most of the SUs exhibited a (randomly) varying clustering/repellency signal (Figure 8). We take this as empirical evidence of the fact that the temporal prediction of landslides over relatively long periods, longer than about 15 years in our case, is problematic and inherently uncertain. Samia et al. (2017aSamia et al. ( ,b, 2018, working in the same study area, identified a landslide heritage effect-which they called "landslide path dependency"-that conditions the occurrence of new landslides dependent on the location of previous landslides in the same SU, over periods of less than 15 years. Both empirical findings have consequences for hazard assessment. Neglecting the temporal dependence on landslides will underestimate hazard in SUs characterised by a clustering effect (inflating "type II" errors), and will overestimate hazard in SUs characterised by a repellent effect (inflating "type I" errors). We further note that common approaches to predict future landslide occurrences over large areas, including the definition of empirical landslide thresholds for the possible initiation of landslides from landslide and rainfall records (Aleotti, 2004;Guzzetti et al., 2007Guzzetti et al., , 2008Saito et al., 2010;Ko and Lo, 2018;Segoni et al., 2018), and the calculation of return periods from time-series of triggering events, chiefly rainfall or precipitation events (Frattini et al., 2009), assume the stationarity of the landslide processes over time. However, evidence shows that landslide processes are not stationary in our study area, and arguably in other similar areas in Central Italy and in other similar physiographical and climatic settings. The finding poses questions on the reliability of landslide forecast and prediction models based on past landslide and rainfall records (Rossi et al., 2010b;Witt et al., 2010;Segoni et al., 2018;Guzzetti et al., 2019).
Lastly, we note that our work introduced two novel advancements in landslide hazard modelling. First, we provided a robust way to classify landslide susceptibility, obtained from the landslide intensity using equation (12). The approach avoids the common problem of interpreting intermediate probability estimates as a measure of intermediate or "mean" or "moderate" susceptibility levels, which is incorrect, conceptually and operationally, and can lead to serious problems if the susceptibility models and associated zonings are used for practical applications Galli et al., 2008;Reichenbach et al., 2018). Second, we propose a new way of portraying in a single map the information provided jointly by the landslide intensity and the landslide susceptibility estimates. We maintain that the use of a single cartographic representation to show the combined intensity-susceptibility information facilitates the use of the zonation for practical applications, and the design of landslide protocols for land planning and management Reichenbach et al., 2018). To the best of our knowledge, the combined intensity-susceptibility classification proposed in §5.9 and exemplified in Figure 12 for our best model Mod5, are unique in the landslide hazard modelling literature.

Geomorphological considerations
The performance of statistically-based landslide prediction models depend entirely on the models structure and on the data used to inform them. If the data (the "covariates") are accurate and meaningful, an analysis of the model results can provide valuable insights on the geomorphic processes that control the spatio-temporal distribution of landslides in area, provided the modelling framework is geomorphologically sound.
Concerning data, to inform our models we used covariates that are known to represent geomorphic conditions that favour or hamper the formation of landslides in our study area (Guzzetti et al., 2006a,b;Ardizzone et al., 2007;Galli et al., 2008), in similar geologic, physiographic, and climatic settings (Carrara et al., 1991(Carrara et al., , 2003Carro et al., 2003;Guzzetti, 2005;Marchesini et al., 2014), and even in very different landscapes (Budimir et al., 2015;Goetz et al., 2015;Lombardo et al., 2016a;Reichenbach et al., 2018). With this respect, we maintain that our morphometric, lithological, and structural covariates (Table 2) are sound, accurate, and meaningful landslide predictors, and that they contribute to explain the known spatio-temporal distribution of landslides in our ( Figure 1) and in similar study areas.
Concerning the model structure, the LGCP framework assumes that individual landslides in a complex landscape are the result of a point process, in space and time. In this framework, a single landslide, i.e., a single element of a large population of landslides, is represented by a "point" (s i , t i ) defined by its spatial (s i ) and temporal (t i ) location, i.e., "where" and "when" the "point" landslide occurred in the investigated area ( Figure 1) and period ( Figure 3). The model further assumes that the spatio-temporal distribution of landslide points is the result of an unobserved intensity function (λ(s, t)) that varies over space and time. It is the stochastic variation of this intensity function that determined the location and temporal occurrence of the landslides. The last assumption is that the model incorporates effects carried directly by the data, i.e., by the covariates, and by unobserved random effects not explained directly by the covariates. In our case, such random effects include, e.g., the fact that geomorphologically similar and adjacent SUs behave similarly in their ability to generate landslides, and the fact that landslides tend to repeat in time in the same places where they occurred in the past (Samia et al., 2018). Overall, these modelling assumptions are reasonable, from a geomorphological perspective.
Most of the landslides in the Collazzone study area have an area smaller than M o = 5, 648m 2 , about 0.01% of the size of the study area. Even the largest landslide, extending for 1.5×10 6 m 2 , covers less than 2% of the study area. In the study area landslides are caused chiefly by severe weather events, each covering a small or very small fraction of a year, and hence an even much smaller fraction of the multi-decadal period considered by our analyses. We conclude that for the spatial and temporal evolution of the landscape that characterises the Collazzone study area, individual landslides are-or can be safely considered as-"point" events, both in space and time.
It is known that landslides in the study area are not distributed randomly in space (Figure 1) (Guzzetti et al., 2006a,b;Ardizzone et al., 2007;Galli et al., 2008), and that the size and type of the landslides are controlled by the interaction between the geometry of the slopes (chiefly terrain gradient and aspect) and the attitude of the main lithological layers (i.e., the strike and dip of sand and gravel levels, and clay laminations) (Guzzetti et al., 2006a;Marchesini et al., 2015;Santangelo et al., 2015). Thus, adjacent "anaclinal" slopes tend to generate similar, large, deep-seated slides, or complex and compound landslides, whereas adjacent "cataclinal" slopes tend to generate similar, small shallow slides and minor rotational landslides. It is also known that landslides in the area do not occur randomly in time. As mentioned before, Samia et al. (2017aSamia et al. ( ,b, 2018, who worked in the same area, identified a landslide heritage effect that conditions the occurrence of new landslides dependent on the location of previous landslides over periods of less than 15 years. Our own results confirm that this heritage effect is limited in time, with only a minority of the SUs exhibiting a constant, long term clustering or repellency trend, with the vast majority of the SUs showing fluctuating dependence signals through time (Figure 8). Indeed, this is a reasonable and expected behaviour for the medium to long-term evolution of slopes, and more generally of landscapes shaped by mass wasting processes. We conclude that the assumption that there exist "unobserved" latent effects that control and explain the spatio-temporal distribution of landslides is geomorphologically sound, and it matches and explains the existing empirical evidences.
We see two main limitations of geomorphological relevance of our current LGCP framework. First, we do not explicitly consider the size (e.g., area, volume) of the predicted landslides in each SU, with consequences on the possibility to exploit the modelling results for erosion, sediment and landscape evolution modelling. Second, the applicability of the model over very long periods (centuries or millennia) remains to be determined. For the former, new modelling frameworks will have to be devised, and tested. For the latter, not only we lack long-term past landslide data to train sound models, but we also lack a proper understanding of how climate may change and influence future slope instabilities, in the same general area (Alvioli et al., 2018), and in other areas (Gariano and Guzzetti, 2016). The main problem to overcome both limitations lays in the lack of accurate, spatially distributed, multi-temporal landslide datasets. However, the rapidly improving methods and techniques for the automatic or semi-automatic detection and mapping of landslides over large areas from remotely sensed data promise to bridge this data acquisition gap (Guzzetti et al., 2012).

Perspective
We see a number of possible future improvements to our work, with further specific and general modelling, hazard, and geomorphological implications. For the specific case of the Collazzone study area, we envision adding new covariates to the model, including covariates describing (i) land use and land cover types, which are known to influence the size, abundance, and frequency of slope failures in the study area, and (ii) the morphometric and hydrological settings of the individual SUs, which can also influence the presence and evolution of landslides in layered sediments (Carrara et al., 1991). An additional improvement will be to add covariates describing spatio-temporal environmental variations, including, e.g., space-time changes in land use and land cover driven by different agricultural or forest practices. We also envision improving our modelling of the spatial latent effect introduced by SUs with similar or different lithological, hydrological, or structural characteristics. For the purpose, we could experiment the incorporation of physical barriers (e.g., lithological or structural domain boundaries) using the advanced modelling proposed by Bakka et al. (2019); or we could select/deselect manually the links between adjacent SUs ( Figure 2) to consider local physical-strong (permeable) or weak (impermeable)-barriers. However, the latter solution will be tedious to implement, and may introduce unnecessary subjectivity to the modelling. Lastly, we envision using information on the size of the landslides in each SU, a relevant information not currently used by our models.
More generally, we envision testing our proposed modelling framework in other areas, considering similar and different landslide types, and similar and different spatio-temporal environmental information. This will measure the applicability and flexibility of the modelling framework in different physiographic and climatic settings. As an example, where a multi-temporal landslide inventory is available for a large area, even with a coarser temporal resolution than the multi-temporal inventory available for Collazzone, we envision using covariates describing the spatio-temporal evolution of precipitation (e.g., rainfall totals, rainfall duration, rainfall intensity, number of rainy days) to establish a complex functional link between the medium to long term evolution of the precipitation characteristics, and the occurrence (or lack of occurrence) of landslides. We expect this to improve the currently limited ability to understand landslides in the changing climate, and to provide better climate-driven landslide projections (Gariano and Guzzetti, 2016). Similarly, we foresee the possibility to test the modelling framework in areas where landslides are caused by repeated geophysical (e.g., earthquake) and severe meteorological (e.g., typhoons) triggers. Where event inventory maps can be prepared after each main triggering event, which is now feasible over large areas with the existing remote sensing and image processing technologies (Guzzetti et al., 2012), we expect this to improve our ability to model the evolution of complex landscapes dominated by mass-wasting processes under multiple geophysical and weather forcing (Burbank et al., 2003;Dadson et al., 2003;Lavé and Burbank, 2004;Gabet, 2007;Larsen et al., 2010;Booth et al., 2013).

Conclusions
We proposed a novel Bayesian modelling framework for the spatio-temporal prediction of landslides. The framework exploits a Log-Gaussian Cox Process (LGCP), which assumes that individual landslides in an area are the result of a stochastic point process driven by an unknown intensity function. We tested the modelling framework in the Collazzone area, Umbria, Central Italy, for which a detailed multi-temporal landslide inventory covering the period from before 1941 to 2014, and lithological and bedding data are available. We used this complex space-time geomorphological and geological information to prepare five statistical models of increasing complexity. Our "baseline" model (Mod1) solely relies on the information carried by morphometric and thematic properties, and does not account for the relative influence of spatial and temporal clustering of the landslide process. The second model (Mod2) is similar, but it allows for time-interval-specific regression constants. The next two models are more complex, and account for spatial (Mod3) and temporal (Mod4) latent effects. Lastly, our model Mod5 jointly accounts for latent temporal effects between consecutive inventories and latent spatial effects between adjacent SUs. We maintain that our most complex model Mod5 fulfils the definition of landslide hazard given in the literature. Quantification of the spatial and the temporal predictive performances of the five models revealed that our most advanced Mod5 performed better than the others model. We concluded that Mod5 is our best model, and we selected it to generate a combined landslide intensity-susceptibility classification for our study area, providing more information than traditional susceptibility zonations for land planning and management.
Based on the results of our complex modelling experiment, we draw the following general conclusions.
• The landslide intensity framework introduced by Lombardo et al. (2018aLombardo et al. ( , 2019a for spatial predictions, and extended in this work for time and space-time domains, performs well, and it fulfils the requirements of the standard definition of landslide hazard within a single model. This is a significant advancement over previous landslide hazard modelling frameworks (Guzzetti et al., , 2006a. • For regional geomorphological evaluations or hazard assessments, individual landslides can be considered as "point" events, both in space and time, and a Log-Gaussian Cox Process (LGCP), or a similar model, is fully adequate for the statistical modelling of the spatial and temporal evolution of landslides in landscapes dominated by mass-wasting processes.
• Our experiment proves that latent or "unobserved" effects exist and they control the spatio-temporal distribution of landslides. Considering these latent space-time landslide dependencies significantly improves the model predictive performance, compared to simpler models that neglect the space-time structure of the landslide process.
• The main limitation for complex, space-time landslide modelling resides in the availability of accurate data, and chiefly of detailed multi-temporal landslide inventories, and not in the availability of complex statistical modelling tools, which are available, or in the computational requirements, which can be relatively easily fulfilled in typical applications. This consideration should guide those interested in space-time predictive modelling of landslides in the allocation of their research time and their resource investments (Guzzetti et al., 1999(Guzzetti et al., , 2012.
We expect our novel approach to the spatio-temporal prediction of landslides to enhance the ability to evaluate landslide hazard and its temporal and spatial variations, to lead to better projections of future landslides, and to improve our collective understanding of the evolution of landscapes dominated by mass-wasting processes under geophysical and weather drivers. To promote reproducible analysis and replicable experiments in different geomorphological contexts, we share as Supplementary Material the dataset, the adjacency matrix and the R code used in this study.