Sequential Bayesian updating for Big Data
Abstract
The velocity, volume, and variety of big data present both challenges and opportunities for cognitive science.
We introduce sequential Bayesian updating as a tool to mine these three core properties. In the Bayesian
approach, we summarize the current state of knowledge regarding parameters in terms of their posterior
distributions, and use these as prior distributions when new data become available. Crucially, we construct
posterior distributions in such a way that the likelihood of old data need not be recomputed as new data
become available, allowing information to propagate without great computational demand.
As a result, these Bayesian methods allow continuous inference on voluminous information streams in a
timely manner. We illustrate the advantages of sequential Bayesian updating with data from the MindCrowd
project, in which crowd-sourced data are used to study Alzheimer's dementia. We fit an extension of the
Linear Approach to Threshold with Ergodic Rate (LATER) model to reaction time data from the project in order
to separate two distinct aspects of cognitive functioning: speed of information accumulation and caution.
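To make the posterior-as-prior idea concrete, here is a minimal sketch in Python (an illustrative conjugate beta-binomial model with simulated data, not the chapter's LATER model). Because the beta parameters are sufficient statistics for all past observations, each new batch can be folded into the posterior without ever revisiting the old data:

    # Illustrative sketch of sequential Bayesian updating with a
    # conjugate beta-binomial model (hypothetical example, not the
    # chapter's LATER model). The posterior after each batch serves
    # as the prior for the next; old data are never revisited.
    import numpy as np

    rng = np.random.default_rng(1)

    alpha, beta = 1.0, 1.0   # uniform Beta(1, 1) prior on the rate
    true_rate = 0.7          # hypothetical data-generating parameter

    for batch in range(5):   # data arrive in batches, as in a stream
        y = rng.binomial(1, true_rate, size=1000)  # new batch of trials
        alpha += y.sum()           # posterior Beta(alpha, beta) becomes
        beta += len(y) - y.sum()   # the prior for the next batch
        mean = alpha / (alpha + beta)
        print(f"batch {batch + 1}: posterior mean = {mean:.3f}")

The same logic carries over to non-conjugate models, where the posterior must instead be summarized (for example, by a sample or an approximating density) before being reused as a prior.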
Citation
Oravecz, Z., Huentelman, M., & Vandekerckhove, J. (2016). Sequential Bayesian updating for Big Data. Big Data in Cognitive Science: From Methods to Insights, 13–33.
Bibtex
@incollection{oravecz_etal:2016:Sequential,
  title     = {{S}equential {B}ayesian updating for {B}ig {D}ata},
  author    = {Oravecz, Zita and Huentelman, Matt and Vandekerckhove, Joachim},
  year      = {2016},
  booktitle = {Big Data in Cognitive Science: From Methods to Insights},
  pages     = {13--33}
}