Digital surveillance systems for climate-sensitive diseases


Paula Moraga, Ph.D. 

Assistant Professor of Statistics

King Abdullah University of Science
and Technology (KAUST), Saudi Arabia

   paulamoraga.bsky.social
   www.PaulaMoraga.com


Dengue emergency 2024


Dengue

Dengue is a mosquito-borne disease caused by four dengue virus serotypes (DENV1–4) and transmitted by female mosquitoes of the Aedes species

Dengue poses significant public health challenges in tropical and sub-tropical regions of the world, causing considerable health and economic losses

Mosquito-borne diseases

Diseases spread because when a mosquito bites an infected person it also swallows any viruses or parasites living in the blood of the infected person, and these can be transferred to the next person the mosquito bites

Dengue symptoms

Many dengue cases only result in mild, flu-like illness, but some can be severe and even fatal

Dengue virus has four serotypes. Infection with one serotype does not provide immunity against the others, and subsequent infections are often more severe

There is no specific treatment for dengue. Care usually consists of rest, hydration and pain relievers

Early detection and timely access to proper medical care significantly reduce the fatality rates associated with severe cases

Dengue prevention

Dengue prevention focuses on personal protection using mosquito repellents and long-sleeve shirts, and mosquito control to eliminate breeding sites
(e.g., eliminating standing water, larvicides, insecticides)

Mosquito-borne diseases

Leta et al., International Journal of Infectious Diseases, 2018

Why are diseases spreading?

Rapid and unplanned urbanization and deforestation

Climate change









Ryan et al., PLOS NTDs, 2019
https://showyourstripes.info/

Global travel

What can we do to prevent the spread of infectious diseases?

Need to acknowledge connectivity between people, animals, and their shared environment and work together to prevent disease outbreaks and save lives

Access to healthcare and education

Vaccine development and mosquito control

Early warning and response systems

Disease surveillance systems


Disease surveillance systems are critical to early detection of epidemics and the design of control strategies

Traditional surveillance systems rely on data gathered with considerable delay, which makes them ineffective for real-time surveillance


Digital data sources

Real-time digital information may enable earlier detection of outbreaks

“Flu plus fever, not a good way to start the weekend”

“I’m so irritated at this cough and fever”

“This flu, fever & throat ache won’t let me sleep”

Demographic and environmental risk factors

Digital health surveillance system

  • Data-gathering platform
  • Modelling framework that integrates multiple data sources so as to produce local probabilistic predictions of disease activity
  • Interactive dashboard that alerts public health officials when elevated disease levels are anticipated, and provides insights about disease drivers

Disease surveillance systems

Overview of my research to help inform disease surveillance

  • Use of digital data for nowcasting dengue in Brazil
  • Dengue forecasting models to inform policymaking
  • Methodology projects to improve disease surveillance
  • Conclusions

Dengue surveillance in Brazil

Brazil faced a severe dengue epidemic in 2024

2024 has been the worst year for dengue cases on record, with over 10 million cases reported globally. Brazil has been one of the most affected countries with over 6 million confirmed cases and 6,000 deaths

Dengue epidemic in Brazil

During this time, health systems were overwhelmed, making timely case reporting difficult. As a result, the official case numbers that were being reported in Brazil were underestimating the real number of cases.
This limited the effectiveness of public health decisions

Reporting delays in official dengue cases

In Brazil, the InfoDengue system collects and generates indicators of dengue and other arboviruses:

In principle, dengue is meant to be reported within seven days of case identification. In practice,

  • Fewer than 50% of cases are reported within one week
  • Fewer than 75% of cases are reported within four weeks
  • No more than 90% of cases are reported within nine weeks
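
To make the effect of these delays concrete, the following minimal sketch estimates the share of cases reported within a given number of weeks and uses it to scale up a partially reported count. This simple multiplicative correction is only an illustration and is not the InfoDengue model; the function names and the toy delay values are hypothetical.

```python
def fraction_reported_by(delays_in_weeks, max_delay):
    """Share of past cases whose reporting delay was at most max_delay weeks."""
    if not delays_in_weeks:
        raise ValueError("at least one observed delay is required")
    on_time = sum(1 for d in delays_in_weeks if d <= max_delay)
    return on_time / len(delays_in_weeks)


def scale_up_count(observed_count, fraction_reported):
    """Inflate a partially reported count by the expected reporting fraction."""
    if not 0 < fraction_reported <= 1:
        raise ValueError("fraction_reported must be in (0, 1]")
    return observed_count / fraction_reported


# Hypothetical reporting delays (in weeks) for six past cases
past_delays = [0, 0, 1, 2, 5, 9]
share_after_one_week = fraction_reported_by(past_delays, 1)

# If 100 cases are visible one week after onset, a rough corrected count is:
corrected = scale_up_count(100, share_after_one_week)
```

With a reporting fraction below 50% in the first week, the corrected count is more than double the number initially visible, which is why uncorrected counts understate current activity.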

Reported dengue cases in Rio de Janeiro, January 2011 to April 2012. Red line: cases reported for those weeks at the time. Black line: cases eventually reported after 10 weeks.

Bastos et al., Statistics in Medicine, 2019

Dengue nowcasting by combining official and alternative data sources

We decided to investigate alternative data sources to complement official surveillance data to produce more accurate dengue predictions that help support decision-making

Fritz et al., Nature Sustainability, 2019

Aim: improving dengue nowcasting in Brazil using real-time search query data

Dengue nowcasting in Brazil

We assessed the value of Google Trends for weekly dengue nowcasting in the 27 Brazilian states

Each week from March 2024 to January 2025:

  • Download official number of dengue cases reported in InfoDengue and Google Trends indices
  • Fit several nowcasting models using different information
  • Evaluate performance by comparing nowcasts with the actual cases (cases reported after 15 weeks) using error and uncertainty measures

Google Trends index for the keyword dengue

We calculated the correlation between dengue cases and Google Trends indices for several dengue-related keywords in Brazil, Jan 2013 - Dec 2023

Highest correlation observed between dengue cases and keywords
sintomas dengue (0.93), dengue (0.90), and sintomas de dengue (0.89)
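
As an illustration of this step, the sketch below computes a correlation between a weekly case series and a search index. It assumes Pearson correlation (the slides do not state which coefficient was used), and the two toy series are hypothetical.

```python
from math import sqrt


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly values, for illustration only
weekly_cases = [120, 150, 310, 560, 480, 260]
gt_index = [10, 14, 35, 60, 52, 30]
r = pearson_correlation(weekly_cases, gt_index)
```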

The keywords are highly intercorrelated, and the series for the keyword dengue is less sparse

We decided to use the Google Trends index for the keyword dengue

Weekly Google Trends index for keyword ‘dengue’ in Brazil, 2019 to 2024.

Models

  • Only dengue cases
    SARIMA: Seasonal Autoregressive Integrated Moving-Average
  • Only Google Trends data
    Linear model with number of cases as response and GT index as a covariate
  • Both cases and Google Trends
    SARIMAX: SARIMA with eXogenous factors (GT index)
  • InfoDengue model
    Joint model for reported cases and delay distribution
  • Naive baseline
    Nowcasts are cases reported in the previous week: \(\hat{c}_t = y_{t-1}\)
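
The two simplest models in this list can be sketched in a few lines. The code below is a minimal illustration, assuming ordinary least squares for the linear model with the GT index as the only covariate; the function names are hypothetical and this is not the code used in the study.

```python
def naive_nowcast(reported_cases):
    """Naive baseline: the nowcast for week t is the count reported in week t-1."""
    if not reported_cases:
        raise ValueError("at least one past week is required")
    return reported_cases[-1]


def fit_simple_linear_model(gt_index, cases):
    """Least-squares fit of cases = intercept + slope * gt_index."""
    if len(gt_index) != len(cases) or len(cases) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(cases)
    mean_x = sum(gt_index) / n
    mean_y = sum(cases) / n
    var_x = sum((x - mean_x) ** 2 for x in gt_index)
    if var_x == 0:
        raise ValueError("the GT index must vary to estimate a slope")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(gt_index, cases))
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


def nowcast_from_gt(intercept, slope, current_gt_index):
    """Nowcast the current week's cases from the current GT index."""
    return intercept + slope * current_gt_index
```

A typical use would fit the linear model on past weeks where final case counts are known, then apply it to the GT index of the current week, which is available without reporting delay.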

Results

Results vary by state. In general, the Google Trends models and the InfoDengue joint model for reported cases and delay distribution are the best-performing approaches

Xiao et al., PLOS Neglected Tropical Diseases, 2025

Weekly dengue nowcasts March to June, 2024

Weekly dengue nowcasts in Rio de Janeiro, March to June, 2024

Dengue tracker in Brazil

Dengue-tracker provides weekly updates on the number of dengue cases per state in Brazil

We present official and corrected case counts incorporating information from Google Trends

Reports assist policymakers and the general public in understanding dengue levels and guide their decisions


Impact of Dengue-tracker

Dengue-tracker has been used to inform the Brazilian Ministry of Health and has been crucial in several situations:

  • For several weeks, official dengue cases were not reported and the InfoDengue model was unable to produce nowcasts
  • Espírito Santo state stopped reporting at the beginning of the season
  • In these situations, Dengue-tracker was the only source of information

Dengue-tracker has been crucial to keep the Brazilian Ministry of Health informed. We also disseminated Dengue-tracker on social media and among our contacts so that the general population could be better informed about dengue activity levels

Conclusions

  • We demonstrated the value of Google Trends data for dengue surveillance during Brazil’s 2024 dengue epidemic

  • Further research is needed to understand the use of digital data for disease information (e.g., ChatGPT)

  • Need to understand biases in digital data (not all individuals use search engines, users tend to be younger and more educated, and internet penetration is not the same in all regions)

  • This study highlights the need for multiple, complementary data sources rather than a single data source for disease surveillance, especially during disease outbreaks

Forecasting

Disease forecasting

Nowcasting methods allow us to understand current disease activity levels and make better informed decisions

It is also important to predict the number of cases that will occur in the future so we have more time to be prepared and reduce disease impacts

During the 2024 epidemic, the Brazilian Ministry of Health asked for 2025 dengue predictions to help inform their response and surveillance activities

Disease forecasting

We developed a neural network model for dengue forecasting. The model accounts for complex delayed and non-linear effects of climate variables, and spatial information to obtain improved predictions of future dengue cases

Chen and Moraga. BMC Public Health, 2025

Weekly dengue incidence rate (cases per 100K) in 27 Brazilian states

Climate covariates

Dengue forecasting methods improve their accuracy by including risk factors such as climate and environmental variables known to affect transmission

Climate covariates

We use a suite of covariates known to affect dengue transmission, obtained from the Copernicus ERA5 reanalysis data and summarized by week

| Variable | Unit | Description |
| --- | --- | --- |
| Minimum Temperature | °C | Lowest temperature recorded within the week, based on reanalysis hourly data. |
| Mean Temperature | °C | Average temperature across the week. |
| Maximum Temperature | °C | Highest temperature recorded within the week. |
| Minimum Precipitation Rate | mm/h | Lowest hourly precipitation rate recorded during the week. |
| Average Precipitation Rate | mm/h | Weekly average of hourly precipitation rates. |
| Maximum Precipitation Rate | mm/h | Highest hourly precipitation rate recorded during the week. |
| Total Precipitation | mm | Cumulative precipitation over the week. |
| Minimum Atmospheric Pressure | atm | Lowest atmospheric pressure measured at sea level during the week. |
| Average Atmospheric Pressure | atm | Weekly mean atmospheric pressure at sea level. |
| Maximum Atmospheric Pressure | atm | Highest atmospheric pressure measured at sea level during the week. |
| Minimum Relative Humidity | % | Lowest relative humidity value recorded during the week. |
| Mean Relative Humidity | % | Weekly average relative humidity. |
| Maximum Relative Humidity | % | Highest relative humidity recorded during the week. |
| Thermal Range | °C | Difference between the daily maximum and minimum temperatures. |
| Rainy Days | Days | Number of days within the week where the total precipitation exceeded 0.03 mm. |

Borrowing information from neighbors

Neighbors assumed to be regions sharing a common boundary

Example: the neighbors of Goiás are Tocantins, Bahia, Minas Gerais, Mato Grosso, Mato Grosso do Sul and Distrito Federal
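
One simple way to represent this neighborhood structure is an adjacency dictionary, from which a neighbor summary can be computed as an extra model input. The sketch below is only an illustration: the slides do not specify how neighbor information enters the neural network, so the neighbor average used here is an assumption, and the incidence values are hypothetical.

```python
# Adjacency list keyed by state code; only the Goiás entry from the slide is shown
neighbors = {
    "GO": ["TO", "BA", "MG", "MT", "MS", "DF"],
}


def neighbor_mean(region, neighbors, values):
    """Average of a quantity (e.g., weekly incidence) over a region's neighbors.

    Returns None when the region has no neighbors with available values.
    """
    available = [values[n] for n in neighbors.get(region, []) if n in values]
    if not available:
        return None
    return sum(available) / len(available)


# Hypothetical weekly incidence rates (cases per 100K) in neighboring states
incidence = {"TO": 10.0, "BA": 20.0, "MG": 30.0, "MT": 40.0, "MS": 50.0, "DF": 60.0}
go_neighbor_incidence = neighbor_mean("GO", neighbors, incidence)
```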

Model’s performance assessment

We use the first 6 years (i.e., 2016-01-03 to 2021-12-26) to train the model and predict the number of dengue cases 1, 2, 3, 4, 8 and 12 weeks ahead. Then, we move the window forward one week at a time, always keeping 6 years for training, and repeat the predictions until 2023-12-24
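
The rolling-window scheme can be sketched as a generator of training periods and forecast target dates. This is a minimal illustration under one assumption that the slides leave open: forecast targets are kept only if they fall on or before the final date (2023-12-24). The function name is hypothetical.

```python
from datetime import date, timedelta


def rolling_windows(train_start, train_end, last_date, horizons=(1, 2, 3, 4, 8, 12)):
    """Yield (train_start, train_end, target_dates) for a fixed-length sliding window.

    The window moves forward one week at a time. Targets beyond last_date are
    dropped, and iteration stops when no target remains.
    """
    week = timedelta(weeks=1)
    while train_end + min(horizons) * week <= last_date:
        targets = [
            train_end + h * week
            for h in horizons
            if train_end + h * week <= last_date
        ]
        yield train_start, train_end, targets
        train_start += week
        train_end += week


windows = list(
    rolling_windows(date(2016, 1, 3), date(2021, 12, 26), date(2023, 12, 24))
)
first_start, first_end, first_targets = windows[0]
```

Each window keeps the same training length, so every model fit sees the same amount of history and the forecasts at a given horizon are comparable across dates.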

We assessed the model’s performance using error and uncertainty measures in comparison with other approaches that only use cases or climate information
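
The error and uncertainty measures reported in the results (MAE, MAPE and CRPS) can be computed as in the sketch below. It assumes that probabilistic forecasts are available as samples from the predictive distribution and uses the standard sample-based CRPS estimator; the slides do not describe how these measures were computed, so this is an illustration only.

```python
def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    ratios = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def crps_from_samples(samples, observation):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    X and X' are independent draws from the forecast distribution,
    approximated here by all pairs of forecast samples.
    """
    m = len(samples)
    term_obs = sum(abs(x - observation) for x in samples) / m
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (m * m)
    return term_obs - 0.5 * term_spread
```

Lower values are better for all three measures. For a forecast with no spread, the CRPS reduces to the absolute error, which makes it comparable with the MAE.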

Results

The proposed model performs well overall, except in northern states. These are regions in the Amazon that are less connected with their neighbors

Performance measures for forecasts 4 weeks ahead

| Federal Unit (FU) | Code | LSTM-Cases MAE | LSTM-Cases MAPE | LSTM-Cases CRPS | LSTM-Climate MAE | LSTM-Climate MAPE | LSTM-Climate CRPS | LSTM-Climate-Spatial MAE | LSTM-Climate-Spatial MAPE | LSTM-Climate-Spatial CRPS | Bayesian Baseline MAE | Bayesian Baseline MAPE | Bayesian Baseline CRPS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Acre (AC) | 12 | 305.19 | 45.50% | 90.91 | 129.76 | 22.30% | 35.68 | 136.83 | 24.89% | 37.34 | 382.77 | 47.23% | 96.13 |
| Alagoas (AL) | 27 | 177.96 | 43.29% | 38.14 | 79.24 | 30.54% | 16.27 | 61.08 | 23.17% | 12.98 | 69.39 | 24.28% | 13.41 |
| Amapá (AP) | 16 | 51.21 | 47.90% | 34.05 | 22.45 | 23.49% | 5.35 | 27.45 | 26.98% | 6.02 | 30.53 | 34.09% | 7.12 |
| Amazonas (AM) | 13 | 188.17 | 41.56% | 32.14 | 100.21 | 19.63% | 19.23 | 111.60 | 21.64% | 22.44 | 143.57 | 28.79% | 31.40 |
| Bahia (BA) | 29 | 886.64 | 29.94% | 165.30 | 639.44 | 23.20% | 123.86 | 532.46 | 17.13% | 120.50 | 718.63 | 22.84% | 137.74 |
| Ceará (CE) | 23 | 562.67 | 46.52% | 108.09 | 245.17 | 27.54% | 52.99 | 187.56 | 15.51% | 35.01 | 315.69 | 30.16% | 60.26 |
| Distrito Federal (DF) | 53 | 1040.21 | 26.69% | 244.60 | 926.73 | 23.24% | 219.97 | 767.30 | 16.72% | 211.70 | 997.42 | 24.57% | 249.25 |
| Espírito Santo (ES) | 32 | 8431.94 | 30.90% | 1713.56 | 7262.14 | 30.35% | 1310.95 | 6300.78 | 23.06% | 1308.43 | 6967.74 | 25.93% | 1552.96 |
| Goiás (GO) | 52 | 1708.00 | 30.34% | 310.44 | 1277.24 | 27.36% | 226.00 | 1195.70 | 19.87% | 222.87 | 1722.08 | 29.75% | 321.36 |
| Maranhão (MA) | 21 | 143.59 | 56.31% | 26.44 | 102.87 | 38.07% | 18.93 | 59.27 | 23.91% | 10.88 | 147.31 | 53.05% | 28.14 |
| Mato Grosso (MT) | 51 | 657.81 | 34.65% | 189.42 | 563.40 | 26.97% | 142.56 | 340.72 | 16.69% | 72.73 | 624.21 | 28.36% | 125.27 |
| Mato Grosso do Sul (MS) | 50 | 1711.05 | 75.23% | 342.47 | 568.10 | 59.97% | 108.71 | 404.48 | 40.11% | 81.94 | 1646.17 | 50.03% | 344.61 |
| Minas Gerais (MG) | 31 | 15099.46 | 52.28% | 3253.80 | 7730.47 | 33.33% | 1648.53 | 5088.71 | 24.52% | 1035.86 | 14220.67 | 40.19% | 3472.85 |
| Pará (PA) | 15 | 319.85 | 47.03% | 72.72 | 256.88 | 26.23% | 53.97 | 159.61 | 19.43% | 34.75 | 210.77 | 21.89% | 56.14 |
| Paraná (PR) | 41 | 651.62 | 44.56% | 145.98 | 532.44 | 26.62% | 104.80 | 391.02 | 20.01% | 81.55 | 603.78 | 22.63% | 117.92 |
| Pernambuco (PE) | 26 | 501.76 | 41.95% | 96.53 | 358.33 | 32.90% | 69.27 | 257.65 | 19.53% | 58.65 | 355.72 | 26.02% | 72.60 |
| Piauí (PI) | 22 | 319.75 | 43.87% | 57.96 | 263.10 | 30.81% | 50.76 | 194.54 | 20.75% | 41.58 | 298.91 | 28.17% | 57.16 |
| Rio de Janeiro (RJ) | 33 | 1034.84 | 32.60% | 217.36 | 861.35 | 25.10% | 194.52 | 717.22 | 18.02% | 175.08 | 910.87 | 22.58% | 210.46 |
| Rio Grande do Norte (RN) | 24 | 313.87 | 49.58% | 68.70 | 252.68 | 30.67% | 47.31 | 171.49 | 19.76% | 35.35 | 259.23 | 24.48% | 49.58 |
| Rio Grande do Sul (RS) | 43 | 823.57 | 31.88% | 155.09 | 679.61 | 28.42% | 122.36 | 548.03 | 21.42% | 98.53 | 736.82 | 25.74% | 149.61 |
| Rondônia (RO) | 11 | 371.50 | 79.49% | 103.44 | 300.93 | 42.08% | 79.71 | 285.61 | 40.93% | 69.17 | 323.33 | 43.03% | 104.34 |
| Roraima (RR) | 14 | 9.33 | 40.63% | 6.03 | 6.52 | 43.27% | 5.02 | 6.36 | 44.31% | 5.93 | 8.66 | 52.55% | 2.55 |
| Santa Catarina (SC) | 42 | 7381.10 | 79.16% | 1765.31 | 1585.71 | 56.28% | 395.28 | 1556.58 | 15.21% | 294.09 | 3028.00 | 40.68% | 831.25 |
| São Paulo (SP) | 35 | 9544.39 | 49.61% | 2196.34 | 4088.67 | 31.97% | 921.95 | 3068.46 | 17.28% | 612.34 | 8468.53 | 31.64% | 1961.50 |
| Sergipe (SE) | 28 | 54.35 | 20.82% | 13.07 | 45.92 | 17.48% | 9.43 | 41.52 | 16.50% | 8.11 | 86.08 | 31.38% | 18.57 |
| Tocantins (TO) | 17 | 124.25 | 49.07% | 31.51 | 103.21 | 37.89% | 28.46 | 92.55 | 29.12% | 20.44 | 128.97 | 45.07% | 27.20 |

Mobility

Instead of assuming there was connectivity between adjacent regions, we assumed connectivity if there were people traveling between regions

We consider a mobility dataset spanning air, road, and waterway transport (Oliveira et al., The Lancet Digital Health, 2024)

Spatial modeling including mobility data

Consider the contribution of cases imported into each city \(i\) from other cities in week \(t\):

\[ \text{Imported Cases}_{i, t} = \sum_{j \in \mathcal{N}_i} \text{Mobility}_{ji} \cdot \frac{\text{Cases}_{j, t}}{\text{Population}_j} \]

  • \(\mathcal{N}_i\): set of cities with connections to city \(i\)
  • \(\text{Mobility}_{ji}\): number of people traveling from city \(j\) to city \(i\)
  • \(\text{Cases}_{j, t}\): number of cases in city \(j\) in week \(t\)
  • \(\text{Population}_j\): population of city \(j\)
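
The imported-cases term can be computed directly from its definition. The sketch below stores mobility as a dictionary keyed by (origin, destination) pairs; this data layout, the function name and the toy numbers are assumptions for illustration.

```python
def imported_cases(city, mobility, cases, population):
    """Imported cases into `city` for one week.

    mobility: dict mapping (origin, destination) to number of travellers
    cases: dict mapping each city to its case count in the given week
    population: dict mapping each city to its population
    """
    total = 0.0
    for (origin, destination), travellers in mobility.items():
        if destination == city and origin != city:
            # Travellers from the origin, weighted by the origin's per-capita case rate
            total += travellers * cases[origin] / population[origin]
    return total


# Hypothetical values for three cities in one week
mobility = {("A", "C"): 1000, ("B", "C"): 500, ("C", "A"): 200}
cases = {"A": 50, "B": 20, "C": 5}
population = {"A": 100_000, "B": 50_000, "C": 20_000}

imported_into_c = imported_cases("C", mobility, cases, population)
```

In this toy example, city C receives 1000 travellers from A (rate 50 per 100,000) and 500 from B (rate 20 per 50,000), giving 0.5 + 0.2 = 0.7 expected imported cases.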

Results

We assessed the performance of the improved model in selected cities representing different climatic zones and disease dynamics profiles. Results show that incorporating human movement patterns improves predictions compared with models that use spatial adjacency structures

Chen and Moraga, under review, 2025

Conclusions

  • Developed a forecasting model that improves accuracy by integrating dengue cases, climate and human mobility
  • Model is generalizable and can be applied to forecast other diseases influenced by climate and mobility to improve health surveillance
  • Findings in open access papers, and code freely available for reproducibility

Translating research into action to inform policymaking

Forecasting 2025 dengue cases in Brazil

Brazilian Ministry of Health asked for 2025 dengue predictions to help inform their response and surveillance activities

We participated in the Infodengue-Mosqlimate Dengue Challenge (IMDC) to produce actionable forecasts of the 2025 dengue season

We collaborated with 6 teams from different countries. Each team provided forecasts using a number of statistical and machine learning approaches that leveraged historical data as well as information on climate and environment. Then, individual forecasts were combined to produce a final dengue forecast ensemble for 2025. GitHub Dengue-Forecast-Ensemble
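
As a rough illustration of how individual forecasts can be combined, the sketch below takes the week-by-week median across teams. The slides do not state the combination rule used for the 2025 ensemble, so the median is an assumption, and the team names and values are hypothetical.

```python
from statistics import median


def median_ensemble(team_forecasts):
    """Combine point forecasts by taking the median across teams for each week.

    team_forecasts: dict mapping a team name to a list of weekly forecasts,
    all of the same length.
    """
    series = list(team_forecasts.values())
    if not series:
        raise ValueError("at least one team forecast is required")
    horizon = len(series[0])
    if any(len(s) != horizon for s in series):
        raise ValueError("all forecasts must cover the same weeks")
    return [median(s[week] for s in series) for week in range(horizon)]


# Hypothetical forecasts from three teams for two weeks
forecasts = {
    "team_a": [10, 20],
    "team_b": [20, 40],
    "team_c": [60, 30],
}
ensemble = median_ensemble(forecasts)
```

A median is less sensitive than a mean to a single extreme forecast, which is one common reason to prefer it when combining heterogeneous models.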

Ensemble forecast for dengue in Brazil, 2025

Predictions for the states of Amazonas (AM), Ceará (CE), Goiás (GO), Paraná (PR), and Minas Gerais (MG)

Dengue forecasting challenge results

The results of the challenge were published in September 2024 as a technical report in Portuguese, ensuring they reached key decision-makers in the Ministry of Health

Dengue forecasting challenge results

  • Code for the forecasting methods and the results are publicly available on GitHub
  • Webinars were organized to discuss the challenge and its results, and were attended by a large number of public health professionals from Brazil
  • We continue our collaboration working on better models to provide improved projections for the 2026 dengue season that help inform prevention and control strategies by the Brazilian Ministry of Health

Methodology projects to improve disease surveillance

Spatio-temporal modeling of infectious diseases

Spatio-temporal disease prediction integrating compartment and point process models

          COVID-19 cases in Cali, Colombia, 2020

LGCP model

Fit a log-Gaussian Cox process for the locations of infected individuals in the studied region and time, with mean depending on population at risk, number of infected over time and random effects

\[\begin{align} N(A, t) &\sim \text{Poisson}\left(\int_{A}\Lambda(\mathbf{x}, t)\,d\mathbf{x}\right) \\ \Lambda(\mathbf{x}, t) &= \lambda_{0}(\mathbf{x}, t)\ I(t)\ \exp(S(\mathbf{x}, t)) \end{align}\]

  • \(\lambda_{0}(\mathbf{x}, t)\): proportional to the population density and integrates to 1
  • \(I(t)\): number of infected people at time \(t\), obtained from the SIR model
  • \(S(\mathbf{x}, t)\): spatial Gaussian random field with Matérn covariance function


Diggle et al., Environmetrics, 2005

SIR model

Fit an SIR (Susceptible, Infected, Recovered) compartment model to the aggregated data to predict the number of infectious individuals at each time

The number of individuals in the population, \(S(t) + I(t) + R(t) = N\), is constant

Disease parameters:
\(\beta > 0\) infection rate
\(\gamma > 0\) recovery rate

\[\begin{align*} \frac{dS(t)}{dt} &= -\beta S(t) \frac{I(t)}{N} \\ \frac{dI(t)}{dt} &= +\beta S(t) \frac{I(t)}{N} - \gamma I(t) \\ \frac{dR(t)}{dt} &= +\gamma I(t) \end{align*}\]
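
The SIR equations can be solved numerically to obtain \(I(t)\). The sketch below uses a simple forward Euler scheme with a fixed step size; the solver choice, the function name and the parameter values are assumptions for illustration and are not taken from the cited work.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    Returns a list of (time, S, I, R) tuples. The total S + I + R stays
    constant because the three derivatives sum to zero.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt  # flow from S to I
        new_recoveries = gamma * i * dt         # flow from I to R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((k * dt, s, i, r))
    return trajectory


# Hypothetical parameters: beta / gamma = 5, so the outbreak grows at first
trajectory = simulate_sir(beta=0.5, gamma=0.1, s0=990, i0=10, r0=0, days=100)
peak_infected = max(i for _, _, i, _ in trajectory)
```

In practice \(\beta\) and \(\gamma\) would be estimated by fitting the simulated curve to the observed aggregated counts rather than fixed in advance.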

The SIR model could be extended to consider more compartments (e.g., \(I_S(t)\) symptomatic, \(I_A(t)\) asymptomatic)


Kermack and McKendrick, Proceedings of the Royal Society, 1927

SIR with age-stratified contact information

We extend the SIR model to allow the incorporation of age-stratified contact information, and the estimation of the spatio-temporal intensity for each population group

Contact matrix with the average number of contacts of individuals with different age groups

Individuals in all age groups tend to mix with others of similar age.
This pattern is most pronounced in those aged 5–24 years and least pronounced in those aged 55–69

Children mix with adults aged 30–39. Middle-aged adults mix with the elderly

Mossong et al., PLOS Medicine, 2008

SIR + LGCP with age contact information

Using simulations and real data, we showed that the SIR+LGCP model performs better than LGCP models that do not use information from the SIR model, especially when making predictions

Ribeiro Amaral et al., SERRA, 2022

Spatio-temporal disease intensity

SIR+LGCP model allows us to identify high-risk locations and vulnerable populations to develop better strategies for disease prevention and control

Velocities of disease spread


The spatio-temporal disease intensity obtained can be used to calculate the velocities of disease spread

Directions and magnitudes of the velocities can be mapped at specific times to better examine the spread of the disease throughout the region



Rodriguez et al., under review, 2025

Spatial data misalignment

Spatial data

In disease surveillance we need to analyze data at different spatial and spatio-temporal resolutions and from different sources

ArcGIS Blog

Spatial data misalignment

The analysis of spatial data at different spatial resolutions entails a number of statistical challenges. These may occur in several inference problems:

Data fusion

Better predict a variable by combining data available at several spatial resolutions

Estimate air pollution by combining point- and area-level data

Interpolation

Predict a variable at locations or areal units different from those of its original collection

Downscale health outcomes from state to municipality level

Regression

Relationship between response variable and explanatory variables at different spatial scales

Relationship between dengue at county level and temperature given at point locations

Data fusion

   Ground measurements (points)       +      Satellite derived measurements (grid)

      European Environment Agency (EEA) https://www.eea.europa.eu
      NASA Socioeconomic Data and Applications Center (SEDAC) https://sedac.ciesin.columbia.edu

Fast and flexible spatial modeling by assuming a spatially continuous variable underlying all observations modeled using a Gaussian random field

Zhong et al., Journal of the Royal Statistical Society Series A, 2024

Precision disease mapping

Disease mapping is important to understand geographic and temporal patterns of diseases and allocate resources where most needed

Often, maps are given at an areal resolution, which hinders decision-making

The map shows malaria prevalence in Mozambique. However, disease risk varies continuously in space, and areal data are unable to show how risk varies within areas

Areal estimates make it difficult to target health interventions and direct resources where they are most needed

Disaggregate area-level data

High-resolution estimates make it possible to find differences in disease risk within study regions, and to identify areas and groups of people at higher risk

Alahmadi and Moraga, SERRA, 2025

Open-Source Disease Surveillance System


Acknowledgements

Team working on dengue surveillance


Paula Moraga
KAUST

Xiang Chen
KAUST

Yang Xiao
KAUST

Guilherme Soares
University of São Paulo

Rafael Izbicki
Federal University of São Carlos

Leo Bastos
Fiocruz

Visit Fiocruz and InfoDengue group, Brazil

Visit Ministry of Health and University, Malaysia


Capacity Building

Courses equip researchers with methods and tools to quantify disease burden, understand geographic and temporal patterns, identify risk factors, and measure inequalities

They also show how to easily turn analyses into visually informative and interactive reports and dashboards that facilitate the communication of insights to collaborators and policymakers

Books

Conclusions


Data crucial for public health decision-making

We need data for public health decision-making. But not just any data

We need reliable, relevant, timely and detailed data to understand how different populations and regions are doing and be able to take efficient and effective actions to reduce disease burden and protect all populations

Collaborative research, data, and analytical tools are crucial for solving health challenges, achieving sustainable development, and leaving no one behind

Thank you!

