# 9 Bayesian spatial models

Bayesian hierarchical models can be used to analyze areal data that arise when an outcome variable is aggregated into areas that form a partition of the study region. Models can be specified to describe the variability in the response variable as a function of a number of covariates known to affect the outcome, as well as random effects to model residual variation not explained by the covariates. This provides a flexible and robust approach that allows us to assess the effects of explanatory variables, accommodate spatial autocorrelation, and quantify the uncertainty in the estimates obtained.

A commonly used spatial model is the Besag-York-Mollié (BYM) model . Let us assume that $$Y_i$$ are observed outcomes at regions $$i=1,\ldots,n$$ that can be modeled using a Normal distribution. The BYM model is specified as

$Y_i \sim Normal(\mu_i, \sigma^2), i = 1,\ldots,n,$

$\mu_i = \boldsymbol{z_i \beta} + u_i + v_i.$

Here, the fixed effects $$\boldsymbol{z_i \beta}$$ are expressed using a vector of intercept and $$p$$ covariates corresponding to area $$i$$, $$\boldsymbol{z_i} = (1, z_{i1}, \ldots, z_{ip})$$, and a coefficient vector $$\boldsymbol{\beta} = (\beta_0, \ldots, \beta_p)'$$.

The model includes a spatial random effect $$u_i$$ that accounts for the spatial dependence between outcomes indicating that areas that are close to each other may have similar values, and an unstructured exchangeable component $$v_i$$ to model uncorrelated noise. The spatial random effect $$u_i$$ can be modeled with an intrinsic conditional autoregressive model (CAR) that smooths the data according to a certain neighborhood structure. Specifically,

$u_i| \boldsymbol{u_{-i}} \sim N\left(\bar u_{\delta_i}, \frac{\sigma_u^2}{n_{\delta_i}}\right),$ where $$\bar u_{\delta_i}= n_{\delta_i}^{-1} \sum_{j \in \delta_i} u_j$$, with $$\delta_i$$ and $$n_{\delta_i}$$ representing, respectively, the set of neighbors and the number of neighbors of area $$i$$. The unstructured component $$v_i$$ is modeled as independent and identically distributed normal variables with zero mean and variance $$\sigma_v^2$$, $$v_i \sim N(0, \sigma_v^2).$$

## 9.1 Bayesian inference with INLA

Bayesian hierarchical models can be fitted using a number of approaches such as integrated nested Laplace approximation (INLA) and Markov chain Monte Carlo (MCMC) methods . INLA is a computational approach to perform approximate Bayesian inference in latent Gaussian models. This includes a wide range of models such as generalized linear mixed models and spatial and spatio-temporal models. INLA uses a combination of analytical approximations and numerical integration to obtain approximated posterior distributions of the parameters and is very fast compared to MCMC methods.

The R-INLA package can be used to fit models using INLA. The INLA website (http://www.r-inla.org) includes documentation, examples, and other resources about INLA and the R-INLA package, including books that provide an introduction to Bayesian data analysis using INLA as well as practical examples in a variety of settings .

To install R-INLA, we use the install.packages() function specifying the R-INLA repository since the package is not on CRAN.

install.packages("INLA",
library(INLA)

Then, to fit a model, we write the linear predictor as a formula object in R, and call the inla() function passing the formula, the family distribution, the data, and other options. The object returned by inla() contains the fitted model. This object can be inspected, and the posterior distributions can be post-processed using a set of functions provided by R-INLA. R-INLA also provides functionality to specify priors, as well as to obtain a number of criteria that allow us to assess and compare Bayesian models such as the deviance information criterion (DIC) , the Watanabe-Akaike information criterion (WAIC) , and the conditional predictive ordinate (CPO) .

## 9.2 Spatial modeling of housing prices

Here, we provide an example on how to specify and fit a Bayesian hierarchical model to estimate housing prices in Boston, Massachusetts, USA, using the R-INLA package.

### 9.2.1 Housing prices in Boston, Massachusetts, USA

The Boston housing prices are in the spData package , and can be obtained with the st_read() function of the sf package as follows.

package = "spData"), quiet = TRUE)

map$re_v <- 1:nrow(map) formula <- vble ~ CRIM + RM + f(re_u, model = "besag", graph = g, scale.model = TRUE) + f(re_v, model = "iid") Note that in R-INLA, the BYM model can also be specified with model = "bym" and this comprises both the spatial and unstructured components. Alternatively, we can use the BYM2 model which is a new parametrization of the BYM model that uses a scaled spatial component $$\boldsymbol{u_*}$$ and an unstructured component $$\boldsymbol{v_*}$$: $\boldsymbol{b} = \frac{1}{\sqrt{\tau_b}}(\sqrt{1-\phi}\boldsymbol{v_*} + \sqrt{\phi}\boldsymbol{u_*}).$ In this model, the precision parameter $$\tau_b>0$$ controls the marginal variance contribution of the weighted sum of $$\boldsymbol{u_*}$$ and $$\boldsymbol{v_*}$$. The mixing parameter $$0 \leq \phi \leq 1$$ measures the proportion of the marginal variance explained by the spatial component $$\boldsymbol{u_*}$$. Thus, the BYM2 model is equal to an only spatial model when $$\phi=1$$, and an only unstructured spatial noise when $$\phi=0$$ . The formula of the model using the BYM2 component is specified as follows. formula <- vble ~ CRIM + RM + f(re_u, model = "bym2", graph = g) Then, we fit the model by calling the inla() function specifying the formula, the family, the data, and using the default priors in R-INLA. We also set control.predictor = list(compute = TRUE) and control.compute = list(return.marginals.predictor = TRUE) to compute and return the posterior means of the predictors. res <- inla(formula, family = "gaussian", data = map, control.predictor = list(compute = TRUE), control.compute = list(return.marginals.predictor = TRUE)) ### 9.2.5 Results The resulting object res contains the fit of the model. We can use summary(res) to obtain a summary of the fitted model. res$summary.fixed contains a summary of the fixed effects.

res$summary.fixed mean sd 0.025quant 0.5quant (Intercept) 1.426719 0.088586 1.25287 1.426742 CRIM -0.007846 0.001293 -0.01038 -0.007846 RM 0.260341 0.014052 0.23279 0.260337 0.975quant mode kld (Intercept) 1.60044 1.426742 2.146e-10 CRIM -0.00531 -0.007846 2.188e-10 RM 0.28792 0.260337 2.154e-10 We observe the intercept $$\hat \beta_0=$$ 1.427 with a 95% credible interval equal to (1.253, 1.6). We observe the intercept $$\hat \beta_0=$$ –0.008 with a 95% credible interval equal to (–0.01, –0.005). This indicates crime is significantly negatively related to housing price. The coefficient of rooms is $$\hat \beta_2=$$ 0.26 with a 95% credible interval equal to (0.233, 0.288) indicating number of rooms is significantly positively related to housing price. Thus, the results suggest both crime and number of rooms are important in explaining the spatial pattern of housing prices. We can type res$summary.fitted.values to obtain a summary of the posterior distributions of the response $$\mu_i$$ for each of the areas. Column mean indicates the posterior mean, and columns 0.025quant and 0.975quant are, respectively, the lower and upper limits of 95% credible intervals representing the uncertainty of the estimates obtained.

summary(res$summary.fitted.values) mean sd 0.025quant Min. :1.61 Min. :0.00994 Min. :1.59 1st Qu.:2.83 1st Qu.:0.00997 1st Qu.:2.81 Median :3.05 Median :0.00998 Median :3.03 Mean :3.04 Mean :0.00999 Mean :3.01 3rd Qu.:3.22 3rd Qu.:0.00999 3rd Qu.:3.20 Max. :3.91 Max. :0.01124 Max. :3.89 0.5quant 0.975quant mode Min. :1.61 Min. :1.63 Min. :1.61 1st Qu.:2.83 1st Qu.:2.86 1st Qu.:2.83 Median :3.05 Median :3.08 Median :3.05 Mean :3.04 Mean :3.06 Mean :3.04 3rd Qu.:3.22 3rd Qu.:3.24 3rd Qu.:3.22 Max. :3.91 Max. :3.93 Max. :3.91 We can create variables with the posterior mean (PM) and lower (LL) and upper (UL) limits of 95% credible intervals. # Posterior mean and 95% CI map$PM <- res$summary.fitted.values[, "mean"] map$LL <- res$summary.fitted.values[, "0.025quant"] map$UL <- res$summary.fitted.values[, "0.975quant"] Then, we create maps of these variables with mapview specifying a common legend for the three maps, and using a popup table with the name, the logarithm of the housing prices, the covariates, and the posterior mean and 95% credible intervals. We use the sync() function of leafsync to plot synchronized maps. # common legend at <- seq(min(c(map$PM, map$LL, map$UL)),
max(c(map$PM, map$LL, map\$UL)),
length.out = 8)

# popup table
popuptable <- leafpop::popupTable(dplyr::mutate_if(map,
is.numeric, round, digits = 2),
zcol = c("TOWN", "vble", "CRIM", "RM", "PM", "LL", "UL"),
row.numbers = FALSE, feature.id = FALSE)

m1 <- mapview(map, zcol = "PM", map.types = "CartoDB.Positron",
at = at, popup = popuptable)
m2 <- mapview(map, zcol = "LL", map.types = "CartoDB.Positron",
at = at, popup = popuptable)
m3 <- mapview(map, zcol = "UL", map.types = "CartoDB.Positron",
at = at, popup = popuptable)

m <- leafsync::sync(m1, m2, m3, ncol = 3)
m