Books


My book Spatial Statistics for Data Science: Theory and Practice with R has been published by Chapman & Hall/CRC Data Science Series, and can be bought from CRC Press or Amazon. The online version of the book can be read here, and it is licensed under CC BY-NC-ND 4.0.
Spatial data is crucial to improve decision-making in a wide range of fields including environment, health, ecology, urban planning, economy, and society. Spatial Statistics for Data Science: Theory and Practice with R describes statistical methods, modeling approaches, and visualization techniques to analyze spatial data using R. The book provides a comprehensive overview of the varying types of spatial data, and detailed explanations of the theoretical concepts of spatial statistics, alongside fully reproducible examples which demonstrate how to simulate, describe, and analyze spatial data in various applications. Combining theory and practice, the book includes real-world data science examples such as disease risk mapping, air pollution prediction, species distribution modeling, crime mapping, and real state analyses. The book utilizes publicly available data and offers clear explanations of the R code for importing, manipulating, analyzing, and visualizing data, as well as the interpretation of the results. This ensures contents are easily accessible and fully reproducible for students, researchers, and practitioners.
Key Features:

  • Describes R packages for retrieval, manipulation, and visualization of spatial data
  • Offers a comprehensive overview of spatial statistical methods including spatial autocorrelation, clustering, spatial interpolation, model-based geostatistics, and spatial point processes
  • Provides detailed explanations on how to fit and interpret Bayesian spatial models using the integrated nested Laplace approximation (INLA) and stochastic partial differential equation (SPDE) approaches

The Spatial Statistics for Data Science book cover



My book Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny has been published by Chapman & Hall/CRC Biostatistics Series, and can be bought from CRC Press or Amazon. The online version of the book can be read here, and it is licensed under CC BY-NC-ND 4.0. The book has also been translated to Chinese.
Geospatial health data are essential to inform public health and policy. These data can be used to quantify disease burden, understand geographic and temporal patterns, identify risk factors, and measure inequalities. Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny describes spatial and spatio-temporal statistical methods and visualization techniques to analyze georeferenced health data in R. The book covers the following topics:

  • Manipulating and transforming point, areal and raster data,
  • Bayesian hierarchical models for disease mapping using areal and geostatistical data,
  • Fitting and interpreting spatial and spatio-temporal models with the integrated nested Laplace approximation (INLA) and the stochastic partial differential equation (SPDE) approaches, and the R-INLA package,
  • Creating interactive and static visualizations such as disease maps and time plots,
  • Reproducible R Markdown reports, interactive dashboards, and Shiny web applications that facilitate the communication of insights to collaborators and policymakers.
The book features fully reproducible examples of several disease and environmental applications using real-world data such as malaria in The Gambia, cancer in Scotland and USA, and air pollution in Spain. Examples in the book focus on health applications, but the approaches covered are also applicable to other fields that use georeferenced data including epidemiology, ecology, demography or criminology. The book provides clear descriptions of the R code for data importing, manipulation, modeling and visualization, as well as the interpretation of the results. This ensures contents are fully reproducible and accessible for students, researchers and practitioners.

The Geospatial Health book cover




The Geospatial Health book cover (Chinese)



Data


Chapter 10 Spatio-temporal modeling of geostatistical data. Air pollution in Spain

To reproduce the example of this chapter, we need to download the file `dataPM25.csv` which contains PM 2.5 levels at monitoring stations in Spain in years 2015, 2016 and 2017. Data are obtained from the European Environment Agency.

Download PM2.5 data

Chapter 15 Building a Shiny app to upload and visualize spatio-temporal data

To build the Shiny app shown in this chapter, we need to download the folder `appdir`. This folder contains the following subfolders:
  • `data` which contains a file with the data of lung cancer in Ohio, and a folder with the shapefile of Ohio
  • `www` with an image of a Shiny logo
Ohio data and map are obtained from the SpatialEpiApp package (Moraga, 2017).

Download appdir folder