Survey Methodology
Social media as a data source for official statistics; the Dutch Consumer Confidence Index

by Jan van den Brakel, Emily Söhler, Piet Daas and Bart Buelens

  • Release date: December 21, 2017


This paper addresses how alternative data sources, such as administrative and social media data, can be used in the production of official statistics. Since most surveys at national statistical institutes are conducted repeatedly over time, a multivariate structural time series modelling approach is proposed to model the series observed by a repeated survey jointly with related series obtained from such alternative data sources. Generally, this improves the precision of the direct survey estimates by using sample information observed in preceding periods and information from related auxiliary series. The model also makes it possible to exploit the higher frequency of social media data to produce more precise survey estimates in real time, at the moment that statistics for the social media are available but the sample data are not yet. The concept of cointegration is applied to assess to what extent the alternative series represent the same phenomena as the series observed with the repeated survey. The methodology is applied to the Dutch Consumer Confidence Survey and a sentiment index derived from social media.
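The nowcasting idea can be illustrated with a minimal sketch. The abstract does not give the paper's model specification, so everything below is an assumption made for illustration: a bivariate local level model (each series is a random-walk level plus noise), simulated data, and arbitrary variance values. The function name `kalman_filter_local_level` is hypothetical. The sketch runs a Kalman filter that treats missing values as unobserved, and compares the uncertainty of the survey level in the latest period when only the auxiliary series is available against the case where neither series is available. In models of this type, a correlation close to one between the level disturbances is one way to express that two series share a common trend.

```python
import numpy as np


def kalman_filter_local_level(y, Q, H, a0, P0):
    """Filter a multivariate local level model; NaN entries in y are missing.

    y_t = mu_t + eps_t,  mu_t = mu_{t-1} + eta_t,
    with Var(eta_t) = Q and Var(eps_t) = H (illustrative model, assumed here).
    """
    n, k = y.shape
    a, P = a0.astype(float).copy(), P0.astype(float).copy()
    filtered = np.empty((n, k))
    variances = np.empty((n, k))
    for t in range(n):
        # Prediction step: random-walk levels keep their mean, variance grows by Q.
        P = P + Q
        obs = ~np.isnan(y[t])
        if obs.any():
            Z = np.eye(k)[obs]                      # rows selecting observed series
            v = y[t, obs] - Z @ a                   # innovation
            F = Z @ P @ Z.T + H[np.ix_(obs, obs)]   # innovation variance
            K = P @ Z.T @ np.linalg.inv(F)          # Kalman gain
            a = a + K @ v
            P = P - K @ Z @ P
        filtered[t] = a
        variances[t] = np.diag(P)
    return filtered, variances


# Simulated data; all numbers are assumptions chosen for illustration.
rng = np.random.default_rng(0)
n = 60
Q = np.array([[1.0, 0.9], [0.9, 1.0]])  # strongly correlated level disturbances
H = np.diag([4.0, 1.0])                 # column 0: survey, column 1: auxiliary series
levels = np.cumsum(rng.multivariate_normal(np.zeros(2), Q, size=n), axis=0)
y = levels + rng.multivariate_normal(np.zeros(2), H, size=n)

# Latest period: survey value not yet available, auxiliary series available.
y_nowcast = y.copy()
y_nowcast[-1, 0] = np.nan

# Latest period: neither series available (pure one-step-ahead prediction).
y_no_aux = y.copy()
y_no_aux[-1, :] = np.nan

a0, P0 = np.zeros(2), np.eye(2) * 1e4   # vague initial state
est_aux, var_aux = kalman_filter_local_level(y_nowcast, Q, H, a0, P0)
est_none, var_none = kalman_filter_local_level(y_no_aux, Q, H, a0, P0)

print("survey-level variance, auxiliary series observed:", var_aux[-1, 0])
print("survey-level variance, nothing observed:         ", var_none[-1, 0])
```

Because the level disturbances are correlated, observing the auxiliary series in the latest period reduces the variance of the survey level for that period. How large the gain is depends entirely on the assumed variances and correlation.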

Key Words: Big data; Design-based inference; Model-based inference; Nowcasting; Structural time series modelling; Cointegration.

How to cite

van den Brakel, J., Söhler, E., Daas, P. and Buelens, B. (2017). Social media as a data source for official statistics; the Dutch Consumer Confidence Index. Survey Methodology, Statistics Canada, Catalogue No. 12-001-X, Vol. 43, No. 2. Paper available at

