Sunday, 23 February 2020

On the predictability of infectious disease outbreaks

Scarpino and Petri, 2019.


Infectious disease outbreaks recapitulate biology: they emerge from the multi-level interaction of hosts, pathogens, and environment. Therefore, outbreak forecasting requires an integrative approach to modeling. While specific components of outbreaks are predictable, it remains unclear whether fundamental limits to outbreak prediction exist. Here, adopting permutation entropy as a model-independent measure of predictability, we study the predictability of a diverse collection of outbreaks and identify a fundamental entropy barrier for disease time series forecasting. However, this barrier is often beyond the time scale of single outbreaks, implying prediction is likely to succeed. We show that forecast horizons vary by disease and that both shifting model structures and social network heterogeneity are likely mechanisms for differences in predictability. Our results highlight the importance of embracing dynamic modeling approaches, suggest challenges for performing model selection across long time series, and may relate more broadly to the predictability of complex adaptive systems.


Single outbreaks are often predictable. (a) The average predictability (1 − Hp) for weekly, state-level data from nine diseases is plotted as a function of time-series length in weeks. For each disease, we selected 1000 random starting locations in each time series and calculated the permutation entropy in rolling windows with lengths ranging from 2 to 104 weeks. The solid lines indicate the mean value and the shaded region marks the interquartile range across all states and starting locations in the time series. Although the slopes are different for each disease, in all cases, longer time series result in lower predictability. However, most diseases are predictable across single outbreaks and disease time series cluster together, i.e., there are disease-specific slopes on the relationship between predictability and time-series length. To aid in interpretation, the black dashed line plots the median permutation entropy across 20,000 stochastic simulations of a Susceptible-Infectious-Recovered (SIR) model, as described in the Supplement. This SIR model would be considered predictable, thus values above the black line might be thought of as in the range where model-based forecasts are expected to outperform forecasts based solely on statistical properties of the time-series data. The dark brown, dashed vertical line indicates the time period selected for (b). (b) The predictability is shown after 4 months, i.e., 16 weeks, of data for each pathogen. The same procedure was used to generate the permutation entropy as in (a). The mean predictability differed both by disease and by geographic location, i.e., state (analysis of variance with post hoc Tukey honest significant differences test and correction for multiple comparisons, sum of squares (SS) disease = 98.22, degrees of freedom (DF) disease = 8, p-value disease < 0.001; SS location = 94.7, DF location = 53, p-value location < 0.001). The solid line represents the median, boxes enclose the 25th to 75th percentiles of the distributions, and whiskers cover the entire distribution.
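The procedure in the caption (permutation entropy computed in windows that start at random positions, then averaged as 1 − Hp) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the fixed embedding dimension `d`, the delay `tau`, and the rule that ties are broken by position are all assumptions made for illustration, and the paper may choose these parameters differently. Note that a window must contain at least (d − 1)·tau + 1 points for an ordinal pattern to exist.

```python
import math
import random
from collections import Counter


def permutation_entropy(series, d=3, tau=1):
    """Normalized permutation entropy Hp in [0, 1].

    d (embedding dimension) and tau (delay) are illustrative defaults,
    not values taken from the paper.
    """
    n_windows = len(series) - (d - 1) * tau
    if n_windows < 1:
        raise ValueError("series too short for the chosen d and tau")
    patterns = Counter()
    for i in range(n_windows):
        window = [series[i + j * tau] for j in range(d)]
        # Ordinal pattern: positions of the window sorted by value.
        # sorted() is stable, so ties are broken by position (an assumption).
        pattern = tuple(sorted(range(d), key=lambda k: window[k]))
        patterns[pattern] += 1
    probs = [count / n_windows for count in patterns.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Divide by the maximum possible entropy, log2(d!), to normalize.
    return entropy / math.log2(math.factorial(d))


def mean_predictability(series, window_len, n_starts=1000, d=3, tau=1, seed=0):
    """Average 1 - Hp over windows of length window_len at random starts."""
    if window_len > len(series):
        raise ValueError("window_len exceeds series length")
    rng = random.Random(seed)
    max_start = len(series) - window_len
    values = []
    for _ in range(n_starts):
        start = rng.randint(0, max_start)  # inclusive on both ends
        window = series[start:start + window_len]
        values.append(1.0 - permutation_entropy(window, d, tau))
    return sum(values) / len(values)
```

For weekly case counts stored in a list, calling `mean_predictability(counts, window_len=16)` would give one point comparable in spirit to panel (b). Repeating the call over a range of window lengths would trace a curve like those in panel (a).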
