DEFENDER: Detecting and Forecasting Epidemics Using Novel Data-Analytics for Enhanced Response
File(s)
journal.pone.0155417.PDF (1.86 MB)
Published version
Author(s)
Hankin, CL
Thapen, N
Simmie, D
Gillard, J
Type
Journal Article
Abstract
In recent years social and news media have increasingly been used to explain patterns in
disease activity and progression. Social media data, principally from the Twitter network,
has been shown to correlate well with official disease case counts. This fact has been
exploited to provide advance warning of outbreak detection, forecasting of disease levels
and the ability to predict the likelihood of individuals developing symptoms. In this paper we
introduce DEFENDER, a software system that integrates data from social and news media
and incorporates algorithms for outbreak detection, situational awareness and forecasting.
As part of this system we have developed a technique for creating a location network for
any country or region based purely on Twitter data. We also present a disease nowcasting
(forecasting the current but still unknown level) approach which leverages counts from multiple
symptoms, which was found to improve the nowcasting accuracy by 37 percent over
a model that used only previous case data. Finally we attempt to forecast future levels of
symptom activity based on observed user movement on Twitter, finding a moderate gain of
5 percent over a time series forecasting model.
Date Issued
2016-05-18
Date Acceptance
2016-04-28
Citation
PLOS One, 2016, 11 (5)
ISSN
1932-6203
Publisher
Public Library of Science
Journal / Book Title
PLOS One
Volume
11
Issue
5
Copyright Statement
© 2016 Thapen et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any
medium, provided the original author and source are
credited
License URL
Sponsor
Defence Science and Technology Laboratory (DSTL)
Grant Number
DSTLX-1000085033
Subjects
General Science & Technology
MD Multidisciplinary
Publication Status
Published
Article Number
e0155417