Session 1 – Keynote Address

Survey Sampling in Official Statistics - Some Thoughts on Directions
Ray Chambers, National Institute for Applied Statistics Research Australia (NIASRA), University of Wollongong, Australia

In this talk I will focus on what I expect will be the main survey-related methodological issues that National Statistical Institutes (NSIs) will face over the next few years, and how methodologists in NSIs will need to change the way they view 'official statistics inference' in order to deal with them effectively. Kreuter (2013) eloquently captures the essence of these emerging issues, pointing out that "in recent years large survey organizations have made considerable efforts to enhance information on all sample cases with paradata, data from commercial vendors, and through linkage to administrative data to allow for improved field operations or non-response adjustments." Design-based survey sampling inference, as first described in Neyman (1934), cannot deal with this new breed of statistical inference problem, and the traditional methodological focus within NSIs on sampling design and estimation is of little use when the reality is more about integrating information from population registers and sample data drawn from quite different sources. Sampling inference will inevitably have to adapt to this new data collection environment, and in doing so will undergo a paradigm shift, because it will no longer be possible to characterise inferential uncertainty via variability under repeated sampling from a fixed, finite and well-specified population. The emphasis will instead be on building models for these different sources of uncertainty and basing inference on how they interact. As a consequence, application of Bayesian ideas (e.g. calibrated Bayes) will become a serious option for NSIs, and integration of variability associated with the process of interest, the data capture process and the measurement error process will become the norm. My aim in this overview talk is to present a modeler's perspective on the current state of play.
In doing so, I will identify the strengths and drawbacks of the design-based and model-based inferential positions in which survey sampling, at least as far as the official statistics world is concerned, currently finds itself. I will then use examples from linked data analysis, coverage error assessment, small area estimation, adaptive survey design, network modeling and micro-simulation to illustrate why taking a model-based perspective (either frequentist or Bayesian) represents the best way for NSIs to avoid acute and debilitating 'inferential schizophrenia' when dealing with the emerging information requirements of today (and possibly even tomorrow).

Session 2A – Web Surveys 1

Explorations in Non-Probability Sampling Using the Web
J. Michael Brick, Westat, USA

Producing estimates of finite populations using probability sampling has a long tradition and has proven to be very successful for large samples. Other forms of making inferences from samples that are not probability samples have also been used for many years, but these have been criticized due to self-selection bias and because they often do not provide information about the precision of the estimates. In recent years, widespread access to the Web and the ability to do very inexpensive data collection on the Web have reinvigorated interest in this topic.

In this talk, I give a brief review of some methods commonly used for doing surveys with non-probability samples. Reviews of non-probability sampling methodologies and evaluations of those methods are summarized. These findings suggest that there may be conditions under which non-probability samples could be considered as an alternative to probability samples. Finally, a research agenda for studying inference from Web panel samples is discussed.

Web Panels for Official Statistics?
Jelke Bethlehem, Statistics Netherlands and Leiden University, Netherlands

New developments in computer technology, as well as new challenges in society such as increasing nonresponse rates and decreasing budgets, may lead to changes in survey methodology for official statistics. Web panels have become very popular in the world of market research.

Almost all opinion polls in the Netherlands are based on web panels. At first sight, web panels seem attractive: they are a means to collect data quickly and cheaply. So, why not use web panels for official statistics? This presentation explores the possibilities. An attempt is made to answer the question of whether a web panel can be used for compiling accurate statistics about the general population.

To obtain accurate statistics, panel recruitment must be based on probability sampling. This is already a first complication. Furthermore, web panels may be affected by under-coverage, various types of nonresponse, and measurement errors. Also, panel maintenance is an issue. These methodological issues are discussed in some more detail.

To find out how realistic it is to use a web panel in official statistics, Statistics Netherlands carried out a pilot project in which it set up its own web panel. The objective of the project was not to recruit a representative panel but merely to gain some first experiences with building a web panel. Some results of this pilot project will be discussed.

Measurement Properties of Web Surveys
Roger Tourangeau, Westat, USA

Web surveys have serious shortcomings in terms of their representativeness, but they appear to have some good measurement properties. This talk focuses on the general features of web surveys that affect data quality, especially the fact that web surveys are self-administered but, unlike paper questionnaires, allow feedback to respondents in real time. A number of experiments have compared web surveys with other modes of data collection. A meta-analysis of these studies shows that web surveys maintain the advantages of traditional forms of self-administration; in particular, they reduce social desirability bias relative to interviewer administration of the questions. In addition, web surveys allow feedback to respondents in a variety of forms, including running tallies, prompts to provide answers to items that were skipped, and progress indicators. Some of these interactive features seem to be effective, but some seem to backfire. For example, by creating the sense that someone is paying attention to the respondent, interactive questionnaires may reduce the gains from self-administration. Similarly, progress indicators may actually increase breakoffs. This talk reviews the research evidence on these potential drawbacks to interactivity in web surveys. It concludes by discussing some likely future developments in web surveys—their incorporation of avatars as “virtual interviewers” and the increasing use of mobile devices (such as tablet computers and smart phones) to access and complete web surveys.

Session 2B – Data Collection 1

The Challenges of Producing Statistics for the Web: Sampling and Automated Data Collection of Webpage Information in the Brazilian Web
Emerson Gomes dos Santos, Isabela Bertolini Coelho and Suzana Jaize Alves da Silva, Núcleo de Informação e Coordenação do Ponto, Brazil and Pedro Luis do Nascimento Silva, IBGE - Escola Nacional de Ciências Estatísticas, Brazil

The Internet is probably the most sophisticated information and communication technology (ICT) currently available to society. Its structure and applications have clear social, cultural, economic and political implications. The Web has become the most widely known application on the Internet and may be defined as the part of the Internet that can be accessed through browsers. Studies on the characteristics and dimensions of the web require collecting and analyzing information from a dynamic and complex environment.

The Brazilian Network Information Center (NIC.br) has designed and carried out a pilot project to collect data from the Web in order to produce statistics about webpage characteristics, such as the size and age of the pages, the languages used, the types of objects embedded in the pages, technical data including protocols and standards (IPv4, IPv6, HTML), and accessibility, among others.

This pilot project was a first step towards establishing a methodology to collect the data in a dynamic environment without a frame. The core idea was to collect data from a sample of webpages automatically, using software known as a web crawler. Several methodological challenges related to sampling procedures were tackled in this project. The motivation for this paper is to disseminate the methods and results of this study, as well as to show current developments related to sampling techniques in a dynamic environment.

Effect of Using Mobile Devices to Complete the ACS on Data Quality and Respondent Burden
Rachel Horwitz, U.S. Census Bureau, USA

The American Community Survey (ACS) added an Internet data collection mode as part of a sequential mode design in 2013. The ACS currently uses a single web application for all Internet respondents, regardless of whether they respond on a PC or on a mobile device. However, as market penetration of mobile devices increases, more survey respondents are using tablets and smartphones to take surveys that are designed for personal computers. Using mobile devices to complete these surveys can be more difficult for respondents due to longer load times, small font sizes, using a finger to select the proper response option, and increased scrolling. These difficulties may translate to reduced data quality if respondents become frustrated or cannot navigate around the issues.

The ACS provides a unique opportunity to measure the impact of answering survey questions on a mobile device across a national probability sample. Specifically, this study uses breakoffs, completion time, how often respondents switch to a different device, average number of changed answers, and average number of error messages rendered to compare data quality across computers, tablets, and smartphones. Using a large, national sample also allows us to explore which demographic groups use mobile devices to answer the survey. Some of the traditionally hard-to-interview groups have higher mobile device penetration. If a survey focuses on these populations, it may be even more important to ensure the survey has high usability on all devices.

The Canadian Vehicle Use Study: an Electronic Data Collection
Émile Allie, Transport Canada

In the last quarter of 2011, Transport Canada, with the participation of Environment Canada and Natural Resources Canada, initiated a quarterly survey, the Canadian Vehicle Use Study, starting with its light-vehicle component (cars, minivans, light trucks carrying less than 4.5 metric tons, and SUVs). The data collection proceeds in two steps. First, when the owner of a selected vehicle agrees to participate in the survey, we collect information about potential drivers (gender, age group) and the vehicle (primary vehicle, number of vehicles owned...) either electronically or on paper.

The second step, at the trip level, is fully electronic, using an electronic device linked to the vehicle information system. At the beginning of each trip, the driver provides some basic information through a sequence of touch screens (driver id, purpose of the trip, number of passengers). The remaining information is collected by the device every second: GPS location, speed, distance, time, fuel consumption, engine temperature, intake temperature... At the end of a trip, the driver is invited to provide a reason for the stop. After 21 days, the participant returns the device to the survey manager in a pre-paid return box.

Challenges and Lessons Learned with the Implementation of Car Chips in the Fuel Consumption Survey
Agnes Waye, Serge Godbout and Pierre Daoust, Statistics Canada

The National Fuel Consumption Survey (FCS) was created in 2013 and is a quarterly survey designed to analyze distance driven and fuel consumption for passenger cars and other vehicles weighing less than 4,500 kilograms. The sampling frame consists of vehicles extracted from the vehicle registration files, which are maintained by provincial ministries. For part of its sampled units, FCS uses car chips as the mode of collection to gather information about trips and fuel consumed. There are numerous advantages to using this new technology: for example, reduced response burden, lower collection costs and positive effects on data quality. For the quarters in 2013, 95% of sampled units were surveyed via paper questionnaires and 5% with car chips; in Q1 2014, 40% of sampled units were surveyed with car chips. This study outlines the methodology of the survey process, examines the advantages and challenges in processing and imputation for the two collection modes, presents some initial results and concludes with a summary of the lessons learned.

Can we Produce Reliable Field Crop Statistics Using Remote Sensing Approaches?
Jim Brisbane and Chris Mohl, Statistics Canada

Statistics Canada conducts six field crop surveys during the year to measure seeding intentions, actual seeding, harvested crop areas, yield and other statistics. Approximately 100,000 farms are contacted annually for these purposes. The agency is continually looking for ways to reduce the survey burden put on farm operators. One option under consideration is the estimation of crop area and yield through the use of satellite images and remote sensing approaches. If successful, it would allow statistics to be generated without any contact with the operators. Statistics Canada has experimented with this technology in the past on a small scale, but has never attempted to replace an actual survey occasion in this manner. The success of such an approach depends upon numerous factors, including the quality and frequency of the satellite images, the availability of good ground-truth information that can be used to help distinguish individual crop types in the images, and the performance of the prediction models.

This presentation will highlight some of Statistics Canada's past experiments with satellite and remote sensing approaches and will focus on the current methods being considered as possible replacements for crop area and yield estimates for one of the survey occasions.

Session 3A – Data Collection Mode Effects

Inference in Surveys with Mixed-mode Data Collection
Jan van den Brakel, Statistics Netherlands and Maastricht University, Netherlands; Bart Buelens, Maastricht University, Netherlands

Mixing modes of data collection in survey sampling is increasingly attractive. Driving factors are the pressure to reduce administration costs, attempts to reduce non-sampling errors, and technological developments leading to new data collection procedures. National Statistical Institutes produce official statistics which are often based on sample surveys repeated at regular intervals. A problem with sequential mixed-mode data collection is that the distribution of the respondents over the different data collection modes will generally not be constant in consecutive editions of a repeatedly conducted survey. This may cause effects associated with these modes, such as measurement bias, to vary over time. Time series based on repeatedly conducted surveys employing a mixed-mode design will therefore yield more severely biased estimates of change over time in the variables of interest than uni-mode surveys.

In this paper two estimation procedures are compared that are robust to variations in the distribution of respondents over the different data collection modes. The first approach is based on the general regression (GREG) estimator. Measurement bias between the subsequent editions of a repeated survey is stabilized by calibrating the response to fixed distributions over the data-collection modes (Buelens and Van den Brakel, 2014). The use of this predominantly design-based approach is motivated with a measurement error model for the observations obtained in the sample. The second approach uses a linear model to estimate measurement errors and predict individual responses under different modes. These predictions are used in the GREG estimator to obtain parameter estimates under different modes. This approach is based on Suzer-Gurtekin et al. (2012). Both approaches are compared and applied to the Dutch Labour Force Survey.
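The key step in the first approach, calibrating the response to fixed distributions over the data-collection modes, can be illustrated with a minimal sketch. The weights, mode labels and targets below are hypothetical, and the adjustment shown is post-stratification by mode, the simplest special case of the GREG calibration used in the paper:

```python
# Sketch: stabilize mode-dependent measurement bias across survey editions by
# adjusting respondents' weights so the weighted distribution over data
# collection modes matches a fixed target in every edition.
# All numbers below are hypothetical.

def calibrate_to_mode_targets(weights, modes, targets):
    """Scale weights per mode so that weighted mode shares equal `targets`."""
    total = sum(weights)
    shares = {m: sum(w for w, md in zip(weights, modes) if md == m) / total
              for m in targets}                       # observed mode shares
    g = {m: targets[m] / shares[m] for m in targets}  # per-mode adjustment
    return [w * g[md] for w, md in zip(weights, modes)]

weights = [1.0, 1.0, 2.0, 1.5, 1.5, 3.0]   # design weights of respondents
modes = ["web", "web", "phone", "web", "phone", "phone"]
targets = {"web": 0.5, "phone": 0.5}       # held fixed across editions

new_w = calibrate_to_mode_targets(weights, modes, targets)
web_share = sum(w for w, m in zip(new_w, modes) if m == "web") / sum(new_w)
```

Because the targets are held fixed from one edition to the next, mode-specific measurement biases enter every edition's estimate with the same mixing proportions, so estimates of change over time are protected.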

Buelens, B. and J. van den Brakel (2014). Measurement error calibration in mixed-mode sample surveys. Sociological Methods & Research. Accepted for publication.

Suzer-Gurtekin, Z., S. Heeringa, and R. Valliant (2012). Investigating the bias of alternative statistical inference methods in sequential mixed-mode surveys. In Proceedings of the Joint Statistical Meetings, Section on Survey Research Methods, pp. 4711-4725.

Integration of the Electronic Questionnaire: Impact on the Collection Process and Results of the Survey of Employment, Payrolls and Hours
Danielle Léger and Leon Jang, Statistics Canada

The Survey of Employment, Payrolls and Hours (SEPH) produces monthly estimates and determines the month-to-month changes for variables such as employment, earnings and hours at detailed industrial levels for Canada, the provinces and territories. In order to improve the efficiency of collection activities for this survey, the electronic questionnaire (EQ) was introduced in the fall of 2012. Given the timeframe allowed for this transition as well as the production calendar of the survey, a conversion strategy was developed for the integration of this new mode. The goal of the strategy was to ensure a good adaptation of the collection environment and also to allow the implementation of a plan of analysis that would evaluate the impact of this change on the results of the survey. This paper will give an overview of the conversion strategy, the different adjustments that were made during the transition period and the results of various evaluations that were conducted. For example, the impact of the integration of the EQ on the collection process, the response rate and the follow-up rate will be presented. In addition, the effect that this new collection mode has on the survey estimates will also be discussed. More specifically, the results of a randomized experiment that was conducted in order to determine the presence of a mode effect will be presented.

Multimode Surveys through the Lens of Total Survey Error
Gaël de Peretti and Tiaray Razafindranovona, Institut National de la Statistique et des Etudes Economiques, France

National statistical institutes are subject to two requirements that are difficult to reconcile. On one hand, they must provide increasingly precise information on specific subjects and hard-to-reach or minority populations, with innovative methods that make the measurement more objective or ensure its confidentiality, and so on. On the other hand, they must deal with budget restrictions in a context where households are increasingly difficult to contact. This two-fold demand has an impact on survey quality in the broad sense; that is, not only in terms of precision, but also in terms of relevance, comparability, coherence, clarity and timeliness. Because the cost of Internet collection is low and a large proportion of the population has an Internet connection, statistical offices see this modern collection mode as a solution to their problems. Consequently, the development of Internet collection and, more generally, multimode collection is supposedly the solution for maximizing survey quality (Lyberg 2012), in particular in terms of total survey error (Groves and Lyberg 2012), because they address the problems of coverage, sampling, non-response or measurement, while remaining within tight budgets. However, while Internet collection is an inexpensive mode, it presents serious methodological problems: coverage, self-selection or selection bias, non-response and non-response adjustment difficulties, “satisficing” and so on. As a result, before developing or generalizing the use of multimode collection, INSEE launched a wide-ranging set of experiments to study these various methodological issues, and the initial results show that multimode collection is a source of both solutions and new methodological problems.

Session 3B – Nonsampling Errors

Fieldwork Effort, Response Rate and the Distribution of Survey Outcomes: A Multi-level Meta-analysis
Patrick Sturgis, University of Southampton, United Kingdom; Franz Buschs, University of Westminster, United Kingdom; Joel Williams, TNS-BMRB, United Kingdom

As fieldwork agencies devote ever greater resources to mitigate falling response rates in face-to-face interview surveys, the need to better understand the relationship between level of effort, response rate, and nonresponse bias grows ever more pressing. In this study we assess how response rates and outcome distributions change over the number of calls made to a household. Our approach is comprehensive rather than selective: we analyse change in the response distribution over repeated calls for over 500 survey variables, across four different major surveys in the UK. The four surveys cover different topic areas and have response rates varying between 54% and 76%. Comparisons are made for both unweighted and post-stratified estimates. We code each question on a number of different attribute dimensions to produce a broad typology of question types and then analyse nonresponse bias (defined as the difference between the point estimate at call n and the final response distribution for the full sample) within a multi-level meta-analytic framework, where estimates of bias are nested within calls and within questions, and questions are nested within surveys. This approach enables us to model how estimated bias varies systematically as a function of call number (fieldwork effort), question type, and survey topic, as well as interactions between these characteristics. In addition to contributing to our understanding of how fieldwork effort is related to nonresponse bias, our study also includes an assessment of the cost-effectiveness of additional fieldwork effort at different points in the fieldwork cycle.
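The bias measure used in the study, the point estimate at call n minus the final estimate from the full responding sample, is simple to compute per variable. The toy data below are hypothetical:

```python
# Sketch: nonresponse bias at call n = estimate using respondents interviewed
# by call n, minus the final estimate from the full responding sample.
# Values and call numbers below are hypothetical.

def estimate_at_call(values, call_of_response, n):
    """Mean of a survey variable among respondents obtained by call n."""
    obs = [v for v, c in zip(values, call_of_response) if c <= n]
    return sum(obs) / len(obs)

values = [1, 0, 1, 1, 0, 1, 0, 1]   # a binary survey outcome per respondent
calls  = [1, 1, 1, 2, 2, 3, 3, 4]   # call at which each respondent replied

final = estimate_at_call(values, calls, max(calls))
bias_by_call = {n: estimate_at_call(values, calls, n) - final
                for n in range(1, max(calls) + 1)}
```

In the paper this quantity is computed for each of the 500+ variables and then modeled within the multi-level meta-analytic framework; the sketch shows only the building block.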

Measurement Error for Welfare Receipt and its Impact on Fixed-effects Models
Johannes Eggs, Institute for Employment Research (IAB), Germany

Most research on the influence and extent of measurement error in surveys is conducted cross-sectionally rather than longitudinally. The lack of longitudinal research on the impact of measurement error is related to the lack of longitudinal validation data. In this work, the extent and impact of measurement error can be evaluated for up to five panel waves. This study focuses on measurement error for welfare receipt. The extent of underreporting of welfare receipt is known to be considerable in surveys. However, as respondent characteristics can change over time, so can measurement error. Previous research has shown that measurement error decreases over subsequent panel waves. Yet the change in measurement error over time can particularly bias the parameters of fixed-effects models, as these rely on transitions from one state to another. Survey data from the German household panel study "Labour Market and Social Security" (PASS) are used for this study. The survey data are linked at the individual level to register data provided by the German employment agency.

This paper focuses on three research questions. (1) Are the classic assumptions about the distributions and correlations of measurement error met in the case of welfare receipt? (2) A range of measurement error models has been introduced over time to correct for bias; are the assumptions underlying these models met? (3) Can measurement error for welfare receipt distort estimates from fixed-effects models, and if so, in which direction? For this purpose, analyses from an earlier study are recalculated with administrative information.

Modeling Self-enumeration and Follow-up Response Indicators as Discrete-time Survival While Keeping Self-enumeration Late Responses
Abdellatif Demnati, Statistics Canada

Collecting information from sampled units by mail or over the Internet is much cheaper than conducting interviews. These methods make self-enumeration an attractive data collection method for surveys and censuses. Despite the benefits associated with self-enumeration data collection, in particular Internet-based data collection, self-enumeration can produce higher nonresponse rates than interviews. Sub-sampling from nonrespondents can yield unbiased estimates. Sub-sampled units that had not responded when follow-up activities started are exposed to two data collection factors, which influence the resulting probability of response. Factors and interactions are commonly treated in the context of regression analysis, and have important implications for the interpretation of statistical models. Because response occurrence is intrinsically conditional, we first record response occurrence in discrete intervals, and we characterize the probability of response by a discrete-time hazard. This approach facilitates examining when a response is most likely to occur and how the probability of response varies over time. In practice, however, data collection from self-enumeration and from follow-up are usually done in parallel, which makes sub-sampling from nonrespondents difficult to apply in some situations. In this case, it is common to exclude late self-enumeration responses (those received after follow-up has started from units not in the follow-up subsample) in order to avoid nonresponse bias. Finally, we propose an estimator of the population total, and an associated variance estimator, that use all the observed responses under the setup described above. Variance estimation takes into account the correlation over time of responses from the same unit.
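The discrete-time hazard formulation can be sketched as follows, with hypothetical response times; the actual models in the talk additionally condition on the data collection factors:

```python
# Sketch: record response occurrence in discrete intervals and characterize
# the probability of response by a discrete-time hazard:
# hazard(t) = responses in interval t / units still unresponded at start of t.
# Data below are hypothetical.

def discrete_time_hazard(response_intervals, n_intervals):
    """response_intervals: interval of response per unit (None = no response)."""
    at_risk = len(response_intervals)
    hazards = []
    for t in range(1, n_intervals + 1):
        events = sum(1 for r in response_intervals if r == t)
        hazards.append(events / at_risk if at_risk else 0.0)
        at_risk -= events   # responders leave the risk set
    return hazards

times = [1, 1, 2, 2, 2, 3, None, None, None, None]  # 10 sampled units
h = discrete_time_hazard(times, 3)

# cumulative probability of having responded by the end of interval 3
p_resp = 1.0
for ht in h:
    p_resp *= (1.0 - ht)
p_resp = 1.0 - p_resp
```

The hazards show directly when a response is most likely to occur, and the cumulative response probability is recovered by multiplying the interval-wise survival probabilities.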

Dealing with Administrative, Survey and Big Data: An Assessment of the Quality of Canadian Wetland Databases
Herbert Nkwimi Tchahou, Claude Girard and Martin Hamel, Statistics Canada

While wetlands represent only 6.4% of the world's surface area, they are essential to the survival of terrestrial species. These ecosystems require special attention in Canada, since that is where nearly 25% of the world's wetlands are found.

Environment Canada (EC) has massive databases containing all kinds of wetland information from various sources. Before the information in these databases could be used for any environmental initiative, it had to be classified and its quality assessed.

In this presentation, we will give an overview of a joint pilot project conducted by EC and Statistics Canada to assess the quality of the database information, which has characteristics specific to big data, administrative data and survey data.

Session 4A – Record Linkage 1

Statistical Analyses of Regression Models with Linked Data
Partha Lahiri, University of Maryland, USA

Computerized record linkage (CRL) methods are frequently used by government statistical agencies to quickly and accurately link two large files that contain information on the same individuals or entities using available information, which typically does not include unique, error-free identification codes. Because CRL utilizes already existing databases, it enables new statistical analyses without the substantial time and resources needed to collect new data. The possibility of errors in linkage causes problems for estimating the relationships between variables in the linked dataset. We will present a simple method to correct the mismatch bias of least squares estimators of regression coefficients, using an enhancement of existing mixture models on measures of similarity among pairs of records to estimate the probabilities used in calculating record linkage weights. A simulation study is conducted to compare the performance of the proposed estimator and alternatives. The talk is based on my joint work with Michael Larsen and Judith Law.

Quality and Analysis of Sets of National Files
William E. Winkler, U.S. Census Bureau, USA

The goal of various clean-up methods is to improve the quality of files to make them suitable for economic and statistical analyses. To fill in missing data and ‘correct’ fields, we need generalized software that implements the Fellegi-Holt model (JASA 1976) to preserve joint distributions and ensure that records satisfy edits. To identify and correct duplicates within and across files, we need generalized software that implements the Fellegi-Sunter model (JASA 1969). The goal of the clean-up procedures is to reduce the error in files to at most 1% (not currently attainable in many situations). In this presentation, we cover methods of modeling/edit/imputation and record linkage that naturally morph into methods of adjusting statistical analyses of linked files for linkage error. The modeling/edit/imputation software has four algorithms that may each be 100 times as fast as algorithms in commercial or experimental university software. The record linkage software used in the 2010 Decennial Census matches 10^17 pairs (300 million x 300 million) in 30 hours using 40 CPUs on an SGI Linux machine. It is 50 times as fast as recent parallel software from Stanford (Kawai et al. 2006) and 500 times as fast as software used in some agencies (Wright 2010). The main parameter-estimation methods apply the EMH algorithm (Winkler 1993), which generalizes the ECM algorithm (Meng and Rubin 1993) from linear to convex constraints. Following the introduction of the two quality methods, we cover some of the research into adjusting statistical analyses for linkage error that began with Scheuren and Winkler (1993), an area needing considerable additional research. A linkage error can be thought of as a type of edit failure for which we need an auxiliary data source or a significantly enhanced model to ‘correct’ the error.
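As background, the Fellegi-Sunter (1969) decision rule that such record linkage software implements can be sketched in a few lines. The m- and u-probabilities and thresholds below are illustrative inventions, not values from the Census Bureau software:

```python
# Sketch of Fellegi-Sunter scoring: each field agreement adds log2(m/u) to the
# match weight, each disagreement adds log2((1-m)/(1-u)); the total weight is
# compared to upper/lower thresholds. All probabilities here are illustrative.
import math

FIELDS = {
    # field: (m = P(agree | true match), u = P(agree | non-match))
    "surname":    (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "postcode":   (0.85, 0.02),
}

def match_weight(agreements):
    """agreements: dict field -> True/False (do the two records agree?)"""
    w = 0.0
    for field, (m, u) in FIELDS.items():
        w += math.log2(m / u) if agreements[field] else math.log2((1 - m) / (1 - u))
    return w

def classify(w, upper=8.0, lower=0.0):
    if w >= upper:
        return "link"
    if w <= lower:
        return "non-link"
    return "clerical review"   # sent to manual resolution

w_all = match_weight({"surname": True, "birth_year": True, "postcode": True})
```

In practice the m and u probabilities are estimated rather than fixed by hand, e.g. with the EM-type parameter-estimation methods mentioned above.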

Design-based Estimation with Record-Linked Administrative Files
Abel Dasylva, Statistics Canada

Exact record linkage is an essential tool for exploiting administrative files, especially when one is studying the relationships among many variables that are not contained in a single administrative file. It is aimed at identifying pairs of records associated with the same individual or entity. The result is a linked file that may be used to estimate population parameters including totals and ratios. Unfortunately, the linkage process is complex and error-prone because it usually relies on linkage variables that are non-unique and recorded with errors. As a result, the linked file contains linkage errors, including bad links between unrelated records, and missing links between related records. These errors may lead to biased estimators when they are ignored in the estimation process.

Previous work in this area has accounted for these errors using assumptions about their distribution. In general, the assumed distribution is in fact a very coarse approximation of the true distribution because the linkage process is inherently complex. Consequently, the resulting estimators may be subject to bias.

A new methodological framework, grounded in traditional survey sampling, is proposed for obtaining design-based estimators from linked administrative files. It consists of three steps. First, a probabilistic sample of record-pairs is selected. Second, a manual review is carried out for all sampled pairs. Finally, design-based estimators are computed based on the review results. This methodology leads to estimators with a design-based sampling error, even when the process is based solely on two administrative files. It departs from the previous model-based work and provides more robust estimators. This result is achieved by placing manual reviews at the center of the estimation process. Effectively using manual reviews is crucial because they are a de facto gold standard regarding the quality of linkage decisions. The proposed framework may also be applied when estimating from linked administrative and survey data.
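A toy version of the third step, under hypothetical pair inclusion probabilities and review verdicts, is a Horvitz-Thompson-type estimator over the reviewed pairs:

```python
# Sketch: design-based estimation from a probability sample of record-pairs.
# Each sampled pair carries its known inclusion probability and the manual
# reviewer's verdict; here we estimate the total of y over truly matched pairs.
# All numbers are hypothetical.

def ht_total_over_true_links(sampled_pairs):
    """sampled_pairs: list of (inclusion_prob, is_true_match, y_value)."""
    return sum(y / pi for pi, ok, y in sampled_pairs if ok)

reviewed = [
    # (inclusion probability, reviewer verdict, y of the pair)
    (0.5, True,  10.0),
    (0.5, False,  7.0),   # bad link caught by review: contributes nothing
    (0.2, True,   4.0),
    (0.2, True,   6.0),
    (0.1, False,  3.0),
]
estimate = ht_total_over_true_links(reviewed)
```

Because the inclusion probabilities are known by design and the review verdicts act as gold-standard observations, the estimator's error is design-based sampling error rather than linkage-model error.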

Session 4B – Multiple Sources of Data

Model-Assisted Domain Estimation When Combining Multiple Data Sources including Survey Data and Administrative Records
Dan Liao and Phillip S. Kott, RTI International, USA

In this paper, we will examine domain estimation with the use of auxiliary information, when combining multiple data sources including survey data and administrative records. Two competing approaches are considered: calibration weighting and probability-weighted linear prediction. When there is a domain indicator among the calibration targets, these two approaches will produce the same results. But what if there isn't? Comparisons will be made between the validity (bias) and reliability (variance) of these two methods through a simulation study based on the 2012 US birth data file. A bias test will be proposed to determine whether or not the bias of a domain estimate derived from the weighted prediction method is significantly different from zero. If it is not, the variance of this domain estimate can be measured and compared with the variance of the corresponding domain estimate derived using calibration weighting.
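As a hedged sketch of the calibration-weighting side of this comparison, the following shows one-dimensional linear calibration (the ratio-estimator special case) and a domain estimate computed without a domain indicator among the targets. The weights, auxiliary totals and domain are invented:

```python
# Minimal sketch of linear calibration with a single auxiliary variable
# (the ratio-estimator special case); all numbers are illustrative.

def calibrate_weights(design_weights, x, x_pop_total):
    """Scale design weights so the weighted auxiliary total hits the
    known population benchmark (one-dimensional linear calibration)."""
    x_hat = sum(d * xi for d, xi in zip(design_weights, x))
    g = x_pop_total / x_hat  # common calibration factor
    return [d * g for d in design_weights]

def domain_total(weights, y, in_domain):
    """Weighted total of y restricted to a domain."""
    return sum(w * yi for w, yi, dom in zip(weights, y, in_domain) if dom)

d = [10.0, 10.0, 10.0]          # design weights
x = [2.0, 4.0, 6.0]             # auxiliary variable (known pop total 150)
y = [1.0, 3.0, 5.0]             # study variable
w = calibrate_weights(d, x, 150.0)

# Calibration property: the weighted x-total now equals the benchmark.
assert abs(sum(wi * xi for wi, xi in zip(w, x)) - 150.0) < 1e-9

# Domain estimate when no domain indicator is among the calibration
# targets -- the situation in which the two approaches can diverge.
print(domain_total(w, y, [True, True, False]))  # -> 50.0
```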

These rival methods are also frequently used when there is a two-phase sample and the calibration targets for the final sample are computed from the first-phase sample. We will discuss the additional complications in variance estimation caused by the existence of two sampling phases.

Augmenting and Improving Survey Research through the Multi-Level, Multi-Source (ML-MS) Approach
Tom W. Smith, NORC at the University of Chicago, USA

To more fully understand human society, surveys need to collect and analyze multi-level and multi-source data (ML-MS data). Methodologically, the use of ML-MS data in general and the augmenting of respondent-supplied information with auxiliary data (AD) from sample frames, other sources, and paradata in particular can notably help to both measure and reduce total survey error. For example, it can be employed to detect and reduce nonresponse bias, to verify interviews, to validate information supplied by respondents, and in other ways. Substantively, ML-MS data can greatly expand theory-driven research such as by allowing multi-level, contextual analysis of neighborhood, community, and other aggregate-level effects and by adding in case-level data that either cannot be supplied by respondents or is not as accurate and reliable as information from AD (e.g. health information from medical records vs. recall reports of medical care). Thus, the ML-MS approach will boost both the methodological rigor and substantive power of survey research. It is a general framework for conducting and improving survey research.

The Future of Total Survey Design
Kees Zeelenberg, Statistics Netherlands

Quality, and total survey error (TSE), are of fundamental importance to official statistics and national statistical institutes (NSIs). But there are three challenges that need to be addressed: the use of administrative data, the integration of statistical production processes, and the advent of big data.

With these new data sources, in contrast with survey data, statistical institutes no longer have control over the quality, concepts and content of their statistical input. We address the consequences of these developments for the TSE paradigm. We argue that TSE is still the relevant principle and that statistical methodology is still relevant for applying TSE principles to these new forms of raw statistical data, but that it is urgent for NSIs to address these new areas. We discuss various aspects of, and ways in which, this might be done. For example, we need to know how to make statistics that are representative of the population from these kinds of data, and how to measure their quality.

The integration of statistical production processes within NSIs also leads to new views of quality management and total survey error. We discuss how to manage the quality in the chain from basic statistics to integrated final statistics, by some kind of total quality management or chain management.

We also sketch what these developments might mean for the organization of NSIs and their human resource management.

Session 5 – Waksberg Award Winner Address

From Multiple Modes for Surveys to Multiple Data Sources for Estimates
Constance F. Citro, Committee on National Statistics of the U.S. National Research Council/National Academy of Sciences, USA

Users, funders, and providers of official statistics want estimates that are “wider, deeper, quicker, better, cheaper” (channeling Tim Holt, former head of the UK Office for National Statistics), to which I would add “more relevant” and “less burdensome.” Each of these adjectives poses challenges and opportunities for those who produce statistics. Since World War II, we have relied heavily on the probability sample survey as the best we could do—and that best being very good, indeed—to meet these goals for estimates in many areas, including household income and unemployment, self-reported health status, time use, crime victimization, business activity, commodity flows, consumer and business expenditures, and so on. Faced with secularly declining unit and item response rates and evidence of reporting error, we have responded in many ways, including the use of multiple survey modes, more sophisticated weighting and imputation methods, adaptive design, cognitive testing of survey items, and other means to maintain data quality. For statistics on the business sector, in order to reduce burden and costs, we long ago moved away from relying solely on surveys to produce needed estimates, but, to date, we have not done that for household surveys, at least not in the United States. I argue that we can and must move from a paradigm of producing the best estimates possible from a survey to that of producing the best possible estimates to meet user needs from multiple data sources. Such sources include administrative records and, increasingly, transaction and Internet-based data. I provide several examples, including household income and household plumbing facilities, to illustrate my thesis. I conclude by suggesting ways to inculcate a culture of official statistics that focuses on the end result of relevant, timely, accurate, and cost-effective statistics and treats surveys, along with other data sources, as means to that end.

Session 6A – Total Survey Error

Estimation of Variances for Instruments Used to Measure Physical Activity
Wayne A. Fuller and Dave Osthus, Iowa State University, USA

In surveys using a collection instrument subject to large measurement error, a second instrument with smaller error variance is sometimes observed on a subsample of the original sample. This permits calibration of the large-error instrument against the small-error instrument. The Physical Activity Measurement Survey is unique in that replicate measures were obtained for two measuring methods. One measurement is a personal activity interview and the other is a monitor worn by the respondent. Multiple measurements enable one to identify the day-to-day variance and the instrument variance. Estimates of the variance components and the estimated calibration equation are presented for the sample of women.
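For readers unfamiliar with variance-component decompositions, the following is a generic balanced one-way random-effects sketch, not the PAMS estimation procedure itself; the replicate measurements are invented:

```python
# Illustrative one-way random-effects decomposition for replicate
# measurements: splits total variation into between-person variance and
# within-person (day-to-day plus instrument) variance. Generic textbook
# sketch only; not the PAMS methodology. Data are invented.

def variance_components(replicates):
    """replicates: list of equal-length lists, one per person."""
    k = len(replicates)          # number of persons
    n = len(replicates[0])       # replicate measures per person
    person_means = [sum(r) / n for r in replicates]
    grand_mean = sum(person_means) / k

    # Mean squares within and between persons (balanced design)
    msw = sum((y - m) ** 2 for r, m in zip(replicates, person_means)
              for y in r) / (k * (n - 1))
    msb = n * sum((m - grand_mean) ** 2 for m in person_means) / (k - 1)

    within_var = msw                # day-to-day + instrument variance
    between_var = (msb - msw) / n   # person-to-person variance
    return between_var, within_var

# Toy data: three persons, two replicate measures each.
data = [[10.0, 12.0], [20.0, 22.0], [30.0, 32.0]]
print(variance_components(data))  # -> (99.0, 2.0)
```

With replicate measures from two instruments, analogous components can be estimated for each instrument, which is what makes calibration of the large-error instrument against the small-error one possible.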

Evaluation of Total Survey Error Components for Integration of Multiple Data Sources
John L. Eltinge, U.S. Bureau of Labor Statistics, USA

For many years, large-scale statistical organizations have combined data from multiple sources to produce estimates of population means, totals and other aggregate quantities. Some standard examples include estimation based on ratio, regression, composite and calibration weighting. Customary examples of “multiple sources” include data from multiple sample surveys, as well as microdata or aggregated data from administrative records.

More recently, there has been interest in expanding the abovementioned approaches to use data from additional sources, e.g., commercial transaction records or other forms of “organic data.” Efficient uses of these sources require reliable information on several phenomena, including the following.

  1. The propensity of data from a given population unit to be covered by a given group of data sources. For a single data source, the resulting propensity models can be viewed as extensions of traditional models developed to assess incomplete frame coverage, sampling errors, unit nonresponse, wave nonresponse and item nonresponse in traditional sample surveys; and related models used in the literature on observational studies. In addition, models for the joint propensity of a given unit to be included in each of several data sources lead to extensions of previous literature on estimation from multiple-frame surveys.
  2. Error properties for measurements recorded for a given population unit through a given data source. Models for these errors can be viewed as extensions of customary unit-level and interviewer-level models for survey measurement errors.
  3. Relationships among the underlying true outcome variables and related auxiliary variables considered in (1) and (2).

This paper extends previous linear-model approaches for sample survey data to produce a general framework to incorporate each of (1)-(3) in development of combined-data estimators, and related diagnostics. Special attention is directed toward tools that can help a statistical organization evaluate the extent to which inclusion of a given additional data source may reduce the mean squared error of a particular class of combined-data estimators.

Managing Quality in a Statistical Agency – A Rocky Road
Lilli Japec, Statistics Sweden

Statistics Sweden has, like many other National Statistical Institutes (NSIs), a long history of working with quality. More recently, the agency decided to start using a number of frameworks to address organizational, process and product quality. It is important to consider all three levels, since we know that the way we do things, e.g. when asking questions, affects product quality and therefore process quality is an important part of the quality concept. Further, organizational quality, i.e. systematically managing aspects like training of staff and leadership, is fundamental for achieving process quality.

Statistics Sweden uses EFQM (European Foundation for Quality Management) as a framework for organizational quality and ISO 20252 for market, opinion and social research as a standard for process quality. In April 2014, Statistics Sweden became the first National Statistical Institute to be certified according to ISO 20252.

One challenge that Statistics Sweden faced in 2011 was to systematically measure and monitor changes in product quality and to clearly present them to stakeholders. Together with external consultants, Paul Biemer and Dennis Trewin, Statistics Sweden developed a tool for this called ASPIRE.

To assure that quality is maintained and improved, Statistics Sweden has also built an organization for quality comprising a quality manager, quality coaches, and internal and external quality auditors.

In my presentation I will talk about the components of Statistics Sweden's quality management system and the challenges we have faced.

Session 6B – Record Linkage 2

Assessing and Improving the Quality, Analytic Potential and Accessibility of Data by Linking Administrative, Survey and Open Data
Manfred Antoni and Alexandra Schmucker, Institute for Employment Research (IAB), Germany

Surveys increasingly face the problem of unit nonresponse due to growing data protection concerns, panel attrition, and the declining reachability and cooperation of respondents. Quality issues arise with item nonresponse or misreporting, especially when recall error occurs in retrospective interviews. Longitudinal interviews in particular entail high costs and response burden.

One potential remedy for these quality and cost issues is linkage with administrative or open data. Although such data may originally have been collected for purposes other than creating research data, they usually offer precise and reliable information covering long periods of time. Data linkage thus yields higher cost efficiency and data quality. Linked data also provide higher analytic potential for substantive analyses than their separate parts, either by combining their sets of variables or by adding observational levels (e.g. employees within establishments within companies).

Moreover, research on the quality of either data source becomes possible by applying validation or unit- and item-nonresponse analyses, or by examining the selectivity of consent to, and success of, record linkage.

Our presentation will focus on the potential, quality and accessibility of linked data of the Research Data Centre of the German Federal Employment Agency. They comprise administrative, survey and open data on people, enterprises and companies.

Chronic Disease Surveillance in Québec Using Administrative File Linkage
Louis Rochette and Valérie Émond, Institut national de santé publique du Québec, Canada

Information collection is critical for chronic disease surveillance, to measure the extent of diseases, assess the use of services, identify risk groups and track the course of illnesses and risk factors over time to plan and implement public health programs for disease prevention. It is in this context that the Système intégré de surveillance des maladies chroniques du Québec [Quebec integrated chronic disease surveillance system] (SISMACQ) was created. SISMACQ is a database created by linking administrative files covering the period from 1996 to 2012. It is a useful alternative to survey data, since it covers the entire population, is not affected by recall bias and makes it possible to monitor the population over time and space. However, the amount of data processed, the linkage of files from different sources and the requirement to maintain confidentiality present a challenge that calls for the adoption of a series of methodological and technological measures. The purpose of this presentation is to discuss the methods selected to construct the population cohort from various raw data sources and to describe the processing performed to minimize bias. We will also discuss the effect of changes that can arise throughout the survey period and that may affect the results, such as changes in coding, practices or organization of care.

Overcoverage in the 2011 Canadian Census
Abel Dasylva, Robert-Charles Titus and Christian Thibault, Statistics Canada

The Census Overcoverage Study (COS) is a critical post-census coverage measurement study. Its main objective is to produce estimates of the number of individuals counted multiple times, by province and territory, to study their characteristics and to identify possible reasons for the errors. The COS is based on the sampling and clerical review of groups of connected records that are built by linking the census response database to an administrative frame, and to itself. In this paper we describe the new 2011 COS methodology. This methodology has incorporated numerous improvements including a greater use of probabilistic record linkage, the estimation of linking parameters with an Expectation-Maximization (EM) algorithm, and the efficient use of household information to detect more overcoverage cases.
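The EM estimation of linking parameters mentioned above can be sketched for a Fellegi-Sunter-style mixture under conditional independence of the comparison fields. This is a generic textbook formulation, not Statistics Canada's implementation; the agreement vectors and starting values are illustrative:

```python
# Hedged sketch of EM estimation of linkage parameters in a
# Fellegi-Sunter-style two-class mixture, assuming conditional
# independence of comparison fields. Generic formulation; data and
# starting values are invented for illustration.

def em_linkage(gammas, n_iter=50):
    """gammas: list of 0/1 agreement vectors, one per candidate pair.
    Returns (p, m, u): the match proportion and per-field agreement
    probabilities among matches (m) and non-matches (u)."""
    n_fields = len(gammas[0])
    p, m, u = 0.5, [0.9] * n_fields, [0.1] * n_fields
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a match
        w = []
        for g in gammas:
            lm, lu = p, 1.0 - p
            for gk, mk, uk in zip(g, m, u):
                lm *= mk if gk else (1.0 - mk)
                lu *= uk if gk else (1.0 - uk)
            w.append(lm / (lm + lu))
        # M-step: update parameters from the posterior weights
        sw = sum(w)
        p = sw / len(gammas)
        m = [sum(wi * g[k] for wi, g in zip(w, gammas)) / sw
             for k in range(n_fields)]
        u = [sum((1 - wi) * g[k] for wi, g in zip(w, gammas))
             / (len(gammas) - sw) for k in range(n_fields)]
    return p, m, u

# Toy data: 3 pairs agreeing on most fields, 7 agreeing on few.
pairs = [[1, 1, 1], [1, 1, 0], [1, 1, 1]] + [[0, 0, 0]] * 5 + [[1, 0, 0]] * 2
p, m, u = em_linkage(pairs)
# With this toy data the estimated p is near 0.3 and each m[k]
# exceeds the corresponding u[k].
```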

Useful Functionalities for Record Linkage
Martin Lachance, Statistics Canada

In the field of record linkage, there is a wide range of character string comparators, the most obvious one being an exact match between two sequences of words. Comparison problems arise when factors affect the composition of the strings (for example, the use of a nickname in place of an individual's given name, word inversion and typographical errors). Therefore, more sophisticated comparators (for example, the Jaro-Winkler comparator) are required. These tools make it possible to establish links that would otherwise be hard to establish, thereby reducing the number of links that might be missed. Unfortunately, some of these links may be false links, commonly known as false positives.

To achieve better record linkage, a significant number of sophisticated string comparators have been developed, some able to handle long sequences of words and some able to be combined. This range of tools is currently available through a deterministic record linkage prototype, MixMatch, which also makes it possible to use prior knowledge to reduce the number of false positives generated during linkage. The prototype's aim is twofold: to increase the match rate while attempting to minimize the number of false positives.
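For concreteness, a standard textbook formulation of the Jaro-Winkler comparator mentioned above can be written in a few lines (this is an illustrative sketch, not MixMatch code):

```python
def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro-Winkler similarity between two strings (0 = no similarity,
    1 = identical). Standard textbook formulation, for illustration."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if not len1 or not len2:
        return 0.0
    window = max(len1, len2) // 2 - 1
    match1, match2 = [False] * len1, [False] * len2
    matches = 0
    # Characters match if equal and within the search window.
    for i, c in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not match2[j] and s2[j] == c:
                match1[i] = match2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count transpositions between the matched characters.
    t, k = 0, 0
    for i in range(len1):
        if match1[i]:
            while not match2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    t //= 2
    jaro = (matches / len1 + matches / len2 + (matches - t) / matches) / 3
    # Winkler boost for a shared prefix of up to 4 characters.
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1 - jaro)

print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # -> 0.9611
```

The transposed H/T pair still scores highly, which is exactly the behaviour that lets such comparators recover links an exact match would miss.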

Administrative Records in the U.S. Census for Group Quarters: Potential Uses and Limitations
Asaph Young Chun, U.S. Census Bureau and Jessica Gan, Rice University, USA

The purpose of this paper is to present possible statistical uses of administrative records in the U.S. Census for group quarters (GQ). GQ enumeration involves collecting data from such hard-to-access places as correctional facilities, skilled nursing facilities, and military barracks. We illustrate the utility of administrative records in constructing the GQ frame for coverage improvement. We examine the availability and potential usage of administrative records in GQ enumeration. We analyze the results of the 2010 Census to investigate the extent to which administrative records were potentially used for GQ frame construction and enumeration, paying due attention to their merits and limitations. We discuss the pros and cons of using administrative records and their implications for conceptualizing data quality indicators for administrative records in GQ.

Session 7A – Big Data

What Big Data May Mean for Surveys
Mick P. Couper, Survey Research Center, University of Michigan, USA

Two converging trends raise questions about the future of large-scale probability surveys conducted by National Statistical Institutes (NSIs). First, increasing costs and rising rates of nonresponse potentially threaten the cost-effectiveness and inferential value of surveys. Second, there is growing interest in Big Data as a replacement for surveys. There are many different types of Big Data, but my particular focus is on data generated through social media. In this talk I will review some of the concerns about Big Data, particularly from the survey perspective. I will argue that there is a role for both high-quality surveys and big data analytics in the work of NSIs. But while Big Data is unlikely to replace high-quality surveys, I believe the two methods can serve complementary functions. I will attempt to identify some of the criteria that need to be met, and questions that need to be answered, before big data can be used for reliable population-based inference.

Big Data as a Data Source for Official Statistics: Experiences at Statistics Netherlands
Piet J.H. Daas, Statistics Netherlands

More and more data are being produced by an increasing number of electronic devices physically surrounding us and on the internet. The large amount of data and the high frequency at which they are produced have resulted in the introduction of the term ‘Big Data’. Because these data reflect many different aspects of our daily lives, and because of their abundance and availability, Big Data sources are very interesting from an official statistics point of view. However, first experiences obtained with analyses of large amounts of Dutch traffic loop detection records, call detail records of mobile phones and Dutch social media messages reveal that a number of challenges need to be addressed to enable the application of these data sources for official statistics. These challenges and the lessons learned during these initial studies will be addressed and illustrated by examples. More specifically, the following topics are discussed: the three general types of Big Data we distinguish, the need to access and analyse large amounts of data, how we deal with noisy data and look at selectivity (and our own bias towards this topic), how to go beyond correlation, how we found people with the right skills and mind-set to perform the work, and how we have dealt with privacy and security issues.

A Big Data Pilot Project, with Smart Meters
Lily Ma, Statistics Canada

The United Nations Economic Commission for Europe has identified Big Data as a key issue for official statistics. Statistics Canada's most recent Corporate Business Plan includes a comprehensive review of alternative sources of data to replace, complement or supplement its existing programs. What exactly is Big Data? What can it offer official statistics? What are the risks? What are the challenges? What are the privacy concerns? What are some of the tools that we need? What are some of the skills that we need? Can it potentially replace and/or supplement surveys? Last fall, Statistics Canada invested in a Big Data pilot project to answer some of these questions. This was the first business survey project of its kind. I will be sharing some of the lessons learned from this pilot project, which used smart-meter electricity data.

Session 7B – Data Collection 2

Mode Effects in the 2011 UK Census Data: Will a 2021 UK Census Need a Differential Imputation Strategy?
Steven Rogers, Office for National Statistics, United Kingdom

Like many National Statistical Institutes, the Office for National Statistics (ONS) is exploring new ways to meet the needs and demands of a 21st-century consumer of statistical data. Ongoing initiatives such as the Beyond 2011 project (B2011) and the Electronic Data Collection programme (EDC) have been designed specifically to investigate the methodological issues associated with data sources and collection methods beyond those of a traditional survey or Census. As imputation plays an important role in any survey cycle by serving to reduce non-response bias in survey estimates, the ONS Edit & Imputation team is working to understand the potential impact that alternative data collection methods may have on the design of appropriate imputation strategies. Here we present some preliminary results of research based on 2011 Census data. The B2011 and EDC programmes suggest that in 2021 the ONS may conduct another Census, but one driven primarily by an internet-based questionnaire. We ask: in that event, is there any evidence from the 2011 UK Census data indicating that an appropriate imputation strategy will need to include a discrete mode-effect mechanism?

One Requirement: Collect less. Our Mission: Do the best we can.
Olivier Haag, Pierre-Arnaud Pendoli and Sébastien Faivre, Institut National de la Statistique et des Études Économiques (INSEE), France

In France, budgetary constraints have made it virtually impossible to hire casual interviewers to handle local collection issues. As a result, collection must be managed within a predetermined annual work quota.

In INSEE surveys, which are carried out using a master sample, problems arise when an interviewer is absent for an extended period throughout the collection for a survey. When that occurs, an area may in fact cease to be covered by the survey, which generates bias in the estimates.

In response to this new problem, two methods have been implemented, depending on when the problem is identified:

  • If an area becomes “left out” before or at the start of collection, an “under-allocation” procedure is carried out: a minimum number of households are interviewed in the “struggling” collection area at the expense of other areas in which no collection problems have been identified. The goal is to minimize the dispersion of the weights while staying within the collection resources initially allocated to the survey.
  • If an area becomes “left out” during the collection, the surveys that remain are prioritized. The prioritization is based on a representativeness indicator (R indicator) that makes it possible to measure the degree of similarity between a sample and the base population. The R indicator is based on the dispersion of the estimated response probabilities of the households sampled, and it consists of partial R indicators that measure representativeness variable by variable. These R indicators are tools that can be used to analyze the collection by isolating underrepresented population groups. Collection efforts can be increased for groups that have been identified beforehand.
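The R indicator described in the second bullet is commonly computed as R = 1 − 2·S(ρ̂), where S is the (weighted) standard deviation of the estimated response propensities. A minimal sketch, with invented propensities:

```python
import math

# Sketch of the representativeness (R) indicator: R = 1 - 2 * S(rho_hat),
# where S is the (optionally design-weighted) standard deviation of the
# estimated response propensities. Values near 1 indicate respondents
# that resemble the full sample; lower values indicate selectivity.
# The propensities below are invented for illustration.

def r_indicator(propensities, weights=None):
    if weights is None:
        weights = [1.0] * len(propensities)
    wsum = sum(weights)
    mean = sum(w * p for w, p in zip(weights, propensities)) / wsum
    var = sum(w * (p - mean) ** 2
              for w, p in zip(weights, propensities)) / wsum
    return 1.0 - 2.0 * math.sqrt(var)

# Uniform propensities: fully "representative" response.
print(r_indicator([0.6, 0.6, 0.6]))  # -> 1.0
# Dispersed propensities lower the indicator (here to about 0.51).
print(r_indicator([0.2, 0.5, 0.8]))
```

Partial R indicators apply the same idea variable by variable, which is what lets collection effort be directed at underrepresented groups.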

Introducing Adaptive Design Elements in the Panel Study “Labour Market and Social Security” (PASS)
Mark Trappmann, Gerrit Müller, Frauke Kreuter, Institute for Employment Research (IAB), Germany

PASS is one of the major German panel surveys. It focuses on unemployment and poverty dynamics. Since 2007, about 15,000 persons in about 10,000 households have been interviewed each year. PASS uses a sequential mixed-mode design of CAPI and CATI. The data can be linked to detailed administrative records on employment histories for all respondents who provide informed consent.

Since Wave 4 detailed paradata have been available on a biweekly basis during fieldwork. Since Wave 6 (2012) these have been used for informed interventions into the fieldwork of the panel. The presentation gives an overview of the elements of this adaptive survey design with a focus on two experiments concerning optimal contact times and interviewer incentives for low propensity cases.

In the first experiment, contact times in the CATI part of the study were tailored to the day of the week and time of day of the successful interview in the previous wave. While 80% of the households received this treatment, 20% were scheduled randomly. The tailoring slightly reduces the average number of contact attempts until contact, but has no significant effect on cooperation at first contact.

For the second experiment response propensities were estimated for CAPI cases during fieldwork based on contact histories and frame data. In the last phase of data collection, interviewers were promised considerable premiums for completing cases with a low predicted response propensity. The premium was offered for a random half of the low propensity cases. We find that incentives lead to a higher probability of receiving a final status (interview or refusal) while the number of cases still open at the end of the fieldwork (address problems, noncontacts, broken appointments) decreases. However, response rates are not significantly higher for the experimental group. Based on these results we implemented an experiment combining interviewer and respondent incentives in the current wave.

Testing Collection Strategies for Online Self-Reporting Surveys
Margaret Wu, Lecily Hunter and François Brisebois, Statistics Canada

In January and February 2014, Statistics Canada conducted a test to measure the effectiveness of different collection strategies for an online self-reporting survey. Sampled units were contacted by mailed introductory letters and asked to complete the online survey without any interviewer contact. The objectives of this test were to measure take-up rates for completing an online survey and to profile respondents and non-respondents. Different samples and letters were tested to determine the relative effectiveness of the different approaches. The results of this project will be used to inform various social surveys that are preparing to include an internet response option. The poster will present the general methodology of the test as well as results observed from collection and the analysis of profiles.

Innovative Collection and Analysis Management in the Integrated Business Statistics Program
Fraser Mills, Serge Godbout, Frédéric Picard and Keven Bosa, Statistics Canada

Statistics Canada has undertaken a major redesign of its business statistics surveys, the Integrated Business Statistics Program (IBSP), to replace the Unified Enterprise Survey (UES). One key component of the new framework is the adaptive collection and analysis management methodology. It was developed to require less manual intervention while still achieving similar quality at a lower cost. It uses historical and partially collected data, estimates and quality indicators that are produced while collection is still underway. Scores are calculated for each collection unit in order to gauge its impact on the quality indicators. The scores are then aggregated within each collection unit, creating a global impact measure. Based on these measures, priority lists are produced in order to drive decisions regarding non-response follow-up, selective editing, and failed-edit follow-up.

This talk will focus on quality indicators and measure of impact scores. More precisely, it will describe the methodology behind the measure of impact scores and link it to the theory of variance due to imputation. Assumptions to achieve the creation of the measure of impact scores will also be discussed.

Session 8A – Microsimulations 1

Modelling Complex and Atypical Households: Example of Demo4, the Demographic Model of the European Project SustainCity
Sophie Pennec, Elisabeth Morand and Laurent Toulemon, Institut National d'Etudes Démographiques, France

As part of the European project SustainCity, a microsimulation model of individuals and households was created to simulate the population of various European cities. The project's aim was to combine several transportation and land-use microsimulation models, to add a dynamic population module and to apply these microsimulation approaches to three geographic areas of Europe (the Île-de-France region and the Brussels and Zurich agglomerations).

In the SustainCity project, the number and structure of households as basic agents are deduced from a specific demographic model that simulates individual behaviours. For the sake of design simplicity, the dynamic model was developed as an autonomous module within the SustainCity project and can therefore be used on its own for strictly demographic applications.

The proposed model is a cross-sectional model (based on an initial population for a given year) that is closed (the individuals are explicitly associated with one another) and uses annual transitions to simulate behaviours. To be of service to the various members of the SustainCity consortium, the demographic model had to be user-friendly (with a graphic interface and parameter modification menus, for example). For this reason, we employed the language Modgen, developed by Statistics Canada. The model simulates individual-level events (mortality, fertility, formation and dissolution of unions, departure of children from the parental home) and deduces the corresponding changes at the household level. In addition to simple households composed of no more than one family (couple or adults and children), the model also looks at complex households (all other modes of cohabitation, including multi-generation households and co-tenancy) and atypical households. The presentation focuses more specifically on the definition and modelling of complex households and atypical households (collectives) with an application to the case of the Île-de-France region.

Modelling the Early Life-course (MEL-C): A Decision-Support Tool for Policy Makers
Barry Milne, Roy Lay-Yee, Jessica Thomas, Martin von Randow and Peter Davis, Centre of Methods and Policy Application in the Social Sciences (COMPASS), University of Auckland, New Zealand

Micro-simulation relies on data from the real world to create an artificial one that mimics the original but upon which virtual experiments can be carried out. Modelling the Early Life-Course (MEL-C) is a micro-simulation model which uses estimates derived from New Zealand longitudinal studies to determine transitions from birth to age 13 for a representative, synthetic sample of New Zealand children. MEL-C focuses on the simulation of three main outcomes: health service use, early literacy, and antisocial behaviour. Potential predictors that have been modelled include demographic characteristics, family characteristics, pre- and peri-natal influences, and participation in early childhood education. I will describe how the model can be interrogated to test “what if?” scenarios, e.g., What if rates of smoking in pregnancy were lower? Which interventions have the greatest benefit for deprived or minority groups? I will also demonstrate the software that has been developed for manipulating the model.

Microsimulation at Statistics Canada: Past, Present, Future
Chantal Hicks and Martin Spielauer, Statistics Canada

Statistics Canada has a long history of developing microsimulation models, as well as methodologies and tools that facilitate their construction. Models are used for socio-economic analysis, health analysis, population projections and personnel projections. What do these models have in common? They all integrate various data sources into coherent platforms that can answer questions no single dataset can answer. As a result, microsimulation models increase the relevance of available data and improve data consistency and quality. Additionally, many models create and use synthetic databases that are non-confidential; these can be shared with the public, thus improving data accessibility. Most of the microsimulation activities at Statistics Canada are funded by external clients, including other government departments, provincial governments, academia, and think-tanks. Besides developing and maintaining microsimulation models, the agency also shares its technology and expertise worldwide. This contribution gives an overview of microsimulation activities at Statistics Canada: their history, rationale, current challenges, and goals for the future.

Session 8B – Web Surveys 2

Web Panel Surveys – A Challenge for Official Statistics
Jörgen Svensson, Statistics Sweden

During the last decade, web panel surveys have become established as a fast and cost-efficient method in market research. The rationale for this lies in new developments in information technology, in particular the continued rapid growth of internet and computer use among the public. Growing nonresponse rates and downward price pressure in the survey industry also lie behind this change. However, there are some serious risks inherent in web panel surveys, not least selection bias due to the self-selection of respondents. There are also risks of coverage and measurement errors. The absence of an inferential framework and of data quality indicators is an obstacle to using the web panel approach for high-quality statistics about general populations. Still, some national statistical institutes appear to face increasing challenges from a new form of competition: ad hoc statistics, and even official statistics, produced from web panel surveys.

This paper explores how web panels can be designed and used in a scientifically sound way. An outline is given of a standard from the Swedish Survey Society for performance metrics to assess some quality aspects of results from web panel surveys. The decomposition of bias and the mitigation of bias risks are discussed in some detail. Some ideas are presented for combining web panel surveys and traditional surveys to achieve controlled, cost-efficient inference.

On Bias Adjustments for Web Surveys
Lingling Fan, Wendy Lou and Victoria Landsman, University of Toronto, Canada

Web surveys have become an attractive data collection mode over recent decades, but by design they exclude the entire non-internet population, and they also tend to have low response rates; non-coverage and non-response biases are therefore particularly worrisome in web surveys. Imputation is a commonly used method for dealing with item non-response, by which a complete data set is created by filling in missing values. In this study, we use imputation methods, including hot-deck imputation, tree-based imputation and Bayesian logistic regression imputation, to address non-coverage and non-response bias in web surveys. We present simulation results, which look promising, illustrating the performance of these methods under different scenarios that depend on the availability of additional information about the reference population. Possible extensions of these approaches and directions for future work will also be discussed.
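As a rough illustration of the first of these methods, a random hot-deck imputes a missing value by copying the value of a randomly chosen donor from the same adjustment class. The variable names and classes below are hypothetical, not taken from the study.

```python
import random
from collections import defaultdict

# Minimal random hot-deck sketch: a missing value is filled by copying
# the value of a randomly chosen donor from the same adjustment class.

def hot_deck_impute(records, class_var, target_var, rng):
    # Build donor pools of observed values, one pool per class.
    donors = defaultdict(list)
    for r in records:
        if r[target_var] is not None:
            donors[r[class_var]].append(r[target_var])
    imputed = []
    for r in records:
        r = dict(r)
        if r[target_var] is None:
            pool = donors.get(r[class_var])
            if pool:                      # leave missing if no donor exists
                r[target_var] = rng.choice(pool)
        imputed.append(r)
    return imputed

rng = random.Random(1)
data = [
    {"region": "A", "income": 30}, {"region": "A", "income": None},
    {"region": "B", "income": 50}, {"region": "B", "income": None},
]
print(hot_deck_impute(data, "region", "income", rng))
```

The tree-based and Bayesian logistic regression variants replace the donor-pool step with model-based predictions, but the fill-in-the-gaps structure is the same.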

Nonresponse Bias in a Probability-Based Internet Panel: The Effect of (Un)conditional Cash Incentives
Ulrich Krieger, University of Mannheim, Germany

The German Internet Panel (GIP) is a new large-scale online panel based on a probability sample of individuals living within households in Germany. In 2012 households were approached offline, with a short face-to-face interview. Subsequently, all household members were invited to complete the bi-monthly GIP questionnaires. To minimize non-coverage bias, households without access to the internet were provided with the necessary hardware and/or a broadband internet connection.

Recruitment into the GIP consisted of various stages: the face-to-face household interview, mailed invitations to the online survey, reminder letters, a phone follow-up, and final mailed reminders. During the face-to-face phase we conducted an experiment with €5 unconditional vs. €10 conditional household incentives. In addition, an experiment with €5 unconditional personal incentives was conducted during the first reminder.

We examine the effects of experimental variation in the recruitment process on the sample composition of the GIP. We will use data from the German census as a benchmark to evaluate the representativeness of the panel and how this is affected by different recruitment measures and incentive experiments. We answer the question of whether a carefully recruited, probability-based online sample is suitable for social and economic research.

Instant versus Delayed Interactive Feedback on Speeding and Nondifferentiation in Grid Questions
Tanja Kunz and Marek Fuchs, Darmstadt University of Technology, Germany

In Web surveys, interactive feedback can be used to improve the quality of respondents' answers. Previous research has demonstrated its effectiveness in terms of reduced speeding and nondifferentiation in grid questions (Conrad et al., 2009, 2011; Zhang, 2012, 2013). Interactive feedback can be provided either (1) after a respondent has already submitted the entire grid, as was the case in previous studies (delayed feedback), or (2) while a respondent is still in the process of answering the grid items (instant feedback).

In a randomized between-subjects experiment embedded in a Web survey among university freshmen (n = 1,696), the effectiveness of providing instant feedback on speeding (Experiment 1) and nondifferentiation (Experiment 2) in grids was compared to providing delayed feedback or no feedback at all. Findings demonstrate the benefits of instant feedback: instant prompts to differentiate more between items reduce both nondifferentiation and speeding more reliably than delayed feedback. Furthermore, delayed feedback on speeding actually results in longer completion times. Since these longer completion times are not accompanied by a decrease in nondifferentiation compared with instant feedback, the additional time does not appear to be spent productively. Thus, the findings indicate that the precise timing of interactive feedback is decisive for reducing satisficing behaviors, since instant feedback is more effective than delayed feedback in reducing speeding and nondifferentiation in grid questions.

Are They Willing to Use the Web? First Results of a Possible Switch from PAPI to CAPI/CAWI in an Establishment Panel Survey
Peter Ellguth and Susanne Kohaut, Institute for Employment Research (IAB), Germany

The IAB-Establishment Panel is the most comprehensive establishment survey in Germany, with almost 16,000 firms participating every year. Face-to-face interviews with paper and pencil (PAPI) have been conducted since the beginning of the survey in 1993. As for all panel surveys, high response rates are of special importance; in our case they are supported by the option of leaving the questionnaire with the respondents to complete themselves (about 20% of the interviews).

Meanwhile, alternative computer-aided survey methods with indisputable advantages have become available. In changing the survey mode of the IAB-Establishment Panel, one challenge is to ensure that it will still be possible to complete the questionnaire without an interviewer being present; otherwise, nonresponse rates (unit and item) would increase. To meet this challenge, computer-aided personal interviews (CAPI) combined with a web-based version of the questionnaire (CAWI) seem a promising solution.

So far, little is known about the ability or willingness of establishments to participate in such a survey. Therefore, questions about internet access, willingness to complete the questionnaire online, and reasons for refusal were included in the 2012 wave of the IAB-Establishment Panel.

In this paper, we present some key results that may be of interest for the general debate on survey methodology. First results indicate widespread refusal to take part in a web survey. We take a closer look at the establishments to determine the characteristics of firms able or willing to participate in a web survey, and what might be learned concerning sample selection, fieldwork and data collection.

Session 9A – Quality Indicators for Administrative Data

Different Informative Context for the Statistical Use of Administrative Data
Loredana Di Consiglio and Piero Demetrio Falorsi, Italian National Statistical Institute ISTAT, Italy

The SN-MIAD project of the Statistical Network is responsible for developing methodologies for the integrated use of administrative data (AD) in the statistical process. The project is chaired by the Italian National Institute of Statistics (Istat) and also comprises the Australian Bureau of Statistics (ABS), Statistics Canada (StatCan) and Statistics New Zealand (SNZ). SN-MIAD aims to provide guidelines for exploiting AD for statistical purposes. In particular, a quality framework is developed, a mapping of possible uses is provided, and a schema of the possible informative context is proposed.

This paper focuses on the latter aspect. In particular, we distinguish between dimensions relating to features of the source connected with accessibility, and characteristics connected to the structure of the AD and its relationship with the statistical concepts. We call the first class of features the framework for access and the second the data framework. In this paper we concentrate mainly on the second class: characteristics related specifically to the kind of information that can be obtained from the secondary source. These features concern the target administrative population, the measurements taken on that population, and how they are (or may be) connected with the target population and the target statistical concepts. The connection between informative contexts and the quality framework will also be highlighted.

A Framework to Evaluate Administrative Data
Mylène Lavigne, Martin Lessard and Christian Nadeau, Statistics Canada

Several factors explain why national statistical institutes are turning increasingly to administrative data to produce statistical information. These factors include budget reductions, declining response rates, increased response burden, improved record linkage techniques and more powerful computers. However, the institutes must consider a number of aspects related to the use of administrative data during the process of acquiring new sources: confidentiality and privacy issues, legal and financial aspects, and the impact on the quality of the statistical products. The quality of these products depends partly on the quality of the inputs. Therefore, when deciding to acquire new administrative data sources to use them for statistical purposes, an evaluation is needed to determine whether those data are suitable for our planned use.

In this vein, a framework to evaluate the quality of administrative data is being developed at Statistics Canada. Drawing on the agency's Quality Assurance Framework, this evaluation framework consists of two phases. The first phase, carried out without access to the data in question, seeks to evaluate the data's relevance, coherence, timeliness, interpretability and accessibility. The second phase, which relies on a partial or preliminary version of the data, focuses on accuracy. This paper presents the framework and the suggested evaluation tools for the two phases.

Developing Quality Indicators for Business Statistics Involving Administrative Data: Outcome of a Cross-European Collaboration
Daniel Lewis and John-Mark Frost, Office for National Statistics, United Kingdom

With the increasing use of administrative data in the production of business statistics comes the challenge for statistical producers of how to assess quality. A cross-European project (ESSnet Admin Data) set out to aid producers of official statistics in meeting this challenge by developing quality indicators for business statistics involving administrative data. The team, from across a number of National Statistical Institutes (NSI), developed:

  • a list of basic quality indicators including quantitative indicators and complementary qualitative indicators
  • a set of composite indicators which draws together the basic quality indicators into ‘themes’ in line with the European dimensions of output quality – providing a more holistic view of the quality of a statistical output; and
  • guidance on the accuracy of mixed source (survey and administrative data) statistics.

These can be implemented as part of a quality management system, aiming to assess and improve quality, as well as being used to inform users of the quality of the statistics produced.

This paper will review the overall findings of the project and pull together the three main elements which, we believe, provide a valuable resource for producers of statistics. The paper will also provide information on the other work of the ESSnet Admin Data, including methods developed and identified as best practice for aiding NSIs in maximising their use of administrative data.

Session 9B – Sampling and Estimation

The New Generalized Sampling System, G-Sam
Carlos Leon, Statistics Canada

In recent years, a number of the computer systems in Statistics Canada's generalized systems program have been given major facelifts. In particular, GSAM/SGE is now G-Sam, with new functionality that includes sample selection and coordination, and optimized stratification and distribution. In this presentation, we will take a look at the various G-Sam modules, the underlying methodology, practical issues on use in a production environment, operations research methods used and a real-time demonstration of the system's capabilities.

The 2011 National Household Survey Public Use Micro-data File Methodology – “How to balance the requirements for more information and the requirements for low risk of disclosure in the micro data?”
William Liu and François Verret, Statistics Canada

The 2011 National Household Survey (NHS) is a voluntary survey that replaced the traditional mandatory long-form questionnaire of the Canadian Census of Population. The NHS sampled about 30% of Canadian households and achieved an un-weighted response rate of 69%. In comparison, the last census long form was sent to 20% of households and achieved a response rate of 94%. Based on the long-form data, two Public Use Micro-data Files (PUMF) are traditionally produced: the individual PUMF and the hierarchical PUMF. Both give information on individuals, but the hierarchical PUMF further provides information on the household and on the family relationships between individuals. In order to produce two PUMFs that cover the whole country evenly and do not overlap, a special sub-sampling strategy has been applied. The confidentiality analyses were considerably more difficult for the 2011 production because of the numerous new variables, the more detailed geographic information and the voluntary nature of the NHS. This presentation will describe the 2011 NHS PUMF methodology and how it balances the requirements for more information against the requirement for low risk of disclosure in the micro-data.

Study of the “Product” Sampling Scheme as Illustrated by the ELFE Survey
Guillaume Chauvet, Ecole Nationale de la Statistique et de l'Analyse de l'Information (Crest/Ensai), France; Hélène Juillard and Anne Ruiz-Gazen, Université Toulouse, France

The Étude longitudinale française depuis l'enfance (ELFE), which began in 2011, involves over 18,300 infants in maternity wards who were included with the consent of their parents. In each of the randomly selected maternity wards, all infants of the target population born on one of 25 days distributed across the four seasons of 2011 were selected. This sample is the result of a non-standard sampling scheme that we call échantillonnage produit, or “product” sampling. In this survey, it takes the form of the crossing of two independent samples, namely the sample of maternity wards and the sample of days. Although there may be a maternity-ward cluster effect, there may also be a day cluster effect. Unlike the classic two-stage sampling scheme, product sampling does not satisfy the usual assumption of independence between sampling stages.

We will present an overall and in-depth study of product sampling and the estimation of simple and complex parameters for this scheme. The estimation of variance will also be examined in detail, and simplified variance estimators will be proposed and compared. The case in which both sampling schemes are simple random sampling without replacement or stratified random sampling without replacement will be considered. Lastly, we will consider a comparison between product sampling and classic two-stage sampling from a theoretical point of view and with the use of simulations.
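A minimal sketch of the product scheme, assuming both draws are simple random samples without replacement (the paper also considers stratified designs); the population sizes below are illustrative.

```python
import random

# "Product" sampling sketch: two independent SRSWOR draws (maternity
# wards and days) are crossed to form the final sample. Sizes invented.

def srswor(units, n, rng):
    """Simple random sample of n units without replacement."""
    return rng.sample(units, n)

rng = random.Random(7)
wards = [f"ward_{i}" for i in range(100)]
days = list(range(1, 366))

sampled_wards = srswor(wards, 10, rng)
sampled_days = srswor(days, 25, rng)

# The product sample is every (ward, day) pair of the two draws; the
# same day sample is reused in every ward, which is why the two stages
# are not independent in the classic two-stage sense.
product_sample = [(w, d) for w in sampled_wards for d in sampled_days]
print(len(product_sample))  # 10 wards x 25 days = 250 crossed units
```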

Sample Size Optimization with Sample Frame Data
Noriki Armando Ibarra Medina, Instituto Nacional de Estadística y Geografía, Mexico

When designing a probability sample survey, special care should be taken in defining the sampling units, so that they are unambiguously determined and the survey variables can be measured. A sample selection method and a sample size are needed to determine which sampling units will be included in the sample. Sample size calculation also includes non-response adjustments, so that even if observations of the study variables cannot be obtained for some units, sufficient information can be measured on the remaining units to meet the requirements of the proposed sampling design.

The National Institute of Statistics and Geography (INEGI, México) designs such sample surveys, selecting samples from a sampling frame of dwellings built from clusters of dwellings and organized into five panels; each quarter a specific panel is updated, while the others remain fixed. Within each cluster, the housing type (inhabited or uninhabited) is updated. Thus, the panels have different update reference times.

Housing-type dynamics (changes between housing types) increase over time; therefore, an adjustment to the effective sample size is proposed:

n_eff = Σ_i c_i · n_i

where

c_i := correction factor, namely the expected response rate for dwellings selected within panels with i quarter-periods elapsed since the last update. It is based on historical statistics and time series of expected response rates by panel update period.

n_i := number of dwellings selected within panels with i quarter-periods elapsed since the last update.

As a consequence, it will be possible to budget potential savings derived from reductions in the sample size calculation.
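Under the reading n_eff = Σ_i c_i · n_i of the adjustment (an assumption on our part, with hypothetical response rates and counts), the computation is straightforward:

```python
# Sketch of the effective-sample-size adjustment: c_i is the expected
# response rate for panels last updated i quarters ago, n_i the number
# of dwellings selected in those panels. All numbers are hypothetical.

def effective_sample_size(selected, response_rates):
    """selected[i] and response_rates[i] are keyed by quarters since update."""
    return sum(response_rates[i] * n for i, n in selected.items())

selected = {0: 400, 1: 300, 2: 200, 3: 100}      # n_i, illustrative
rates = {0: 0.95, 1: 0.90, 2: 0.85, 3: 0.80}     # c_i, illustrative

n_eff = effective_sample_size(selected, rates)
print(round(n_eff, 2))  # 900.0
```

Comparing n_eff with the target sample size shows where selection can be reduced, which is the source of the budget savings mentioned above.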

Organization of Data Collection and Record Linkage for the Epidemiological Surveillance of Workers in France Who May Be Exposed to Nanomaterials (EpiNano)
Delphine Jezewski-Serra, Laurène Delabre, Stéphane Ducamp, Yuriko Iwatsubo and Irina Guseva Canu, Institut de Veille Sanitaire, France

The EpiNano surveillance system currently being developed within the occupational health department of the Institut de veille sanitaire (InVS) involves implementing a prospective cohort to track changes in the health status of individuals who may be exposed to manufactured nanomaterials in the workplace.

EpiNano is based on industrial hygiene data collected to characterize nanomaterial exposure for various types of jobs within companies. The data, captured through e-questionnaires by industrial hygienists, will be combined with data from self-administered questionnaires completed by cohort members (for example, type of work, working conditions, use of personal protective equipment and self-rated health). The data thus collected will then be matched (once they have been de-identified) with health data from national medical administrative databases (health insurance (SNIIRAM), hospitalizations (PMSI) and death certificates (CépiDc)), making it possible to passively monitor health events. Follow-up questionnaires will be sent regularly to participants. To increase responsiveness and speed up the processing of the information collected, electronic data collection is being considered.

It should take three years to construct the cohort, which will then be generally tracked for at least 20 years. Ad hoc studies of specific research hypotheses by teams external to the InVS are being considered.

Ultimately, we hope to be able to track approximately 2,000 workers in France who may be exposed to manufactured nanomaterials, to identify the potential medium- and long-term health effects of occupational exposure to nanomaterials.

Session 10A – Alternative Sampling Methods

Inference and Diagnostics for Respondent-Driven Sampling Data
Krista J. Gile, University of Massachusetts, Amherst, USA

Respondent-Driven Sampling is a type of link-tracing network sampling used to study hard-to-reach populations. Beginning with a convenience sample, each person sampled is given 2-3 uniquely identified coupons to distribute to other members of the target population, who thereby become eligible for enrollment in the study. This approach is effective at collecting large, diverse samples from many populations.

Unfortunately, sampling is affected by many features of the network and sampling process. In this talk, we present advances in sample diagnostics for these features, as well as advances in inference adjusting for such features.
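The coupon mechanism can be mimicked on a known toy network. The ring network, seed choice and coupon limit below are invented for illustration; real RDS of course operates without knowledge of the underlying network, which is precisely why diagnostics and adjusted inference are needed.

```python
import random
from collections import deque

# Toy respondent-driven sampling on a known network: starting from
# convenience seeds, each recruit passes up to `coupons` coupons to
# not-yet-sampled neighbours. Network and parameters are invented.

def rds_sample(neighbours, seeds, max_size, coupons, rng):
    sampled, queue = set(seeds), deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        eligible = [v for v in neighbours[person] if v not in sampled]
        rng.shuffle(eligible)                 # coupons go to random contacts
        for recruit in eligible[:coupons]:
            if len(sampled) >= max_size:
                break
            sampled.add(recruit)
            queue.append(recruit)
    return sampled

rng = random.Random(3)
n = 20
# Small ring network: each node knows its two neighbours.
net = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
sample = rds_sample(net, seeds=[0], max_size=10, coupons=3, rng=rng)
print(sorted(sample))
```

Even in this toy setting, the sample composition clearly depends on the seeds and the network structure, which is the source of the features the diagnostics target.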

Estimation with Nonprobability Surveys and the Question of External Validity
Jill A. Dever, RTI International, USA

Probability sampling designs, in which samples are drawn from the target population through a known random mechanism, are considered by many to be the gold standard for surveys. Theory has existed since the early 1930s to produce population estimates from these samples, under the labels of design-based, randomization-based, and model-assisted estimation.

Two key requirements for probability sampling are that (1) the linkage of the sample to the target population is known, and (2) any nonresponse resulting from the sample is inconsequential. The former is captured through the base weights, calculated as the inverse of the selection probability. The latter requires that nonresponding sampled units be missing at random, at least once the base weights have been adjusted. Both requirements are needed so that the external validity of the estimates is not compromised. Ever-increasing nonresponse in probability surveys, however, is one criticism not easily refuted. Enter nonprobability sampling.
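The two requirements translate into a standard weighting computation: base weights as inverse selection probabilities, followed by a nonresponse adjustment within weighting classes. The probabilities, cells and outcome below are illustrative.

```python
# Base weights as inverse selection probabilities, then a nonresponse
# adjustment that spreads the weight of nonrespondents over respondents
# within the same weighting class. All numbers are illustrative.

def base_weight(selection_prob):
    return 1.0 / selection_prob

def nr_adjusted_weights(units):
    """units: list of dicts with 'prob', 'responded', 'cell'."""
    for u in units:
        u["w"] = base_weight(u["prob"])
    for c in {u["cell"] for u in units}:
        in_cell = [u for u in units if u["cell"] == c]
        total = sum(u["w"] for u in in_cell)
        resp = sum(u["w"] for u in in_cell if u["responded"])
        factor = total / resp            # respondents absorb the cell's weight
        for u in in_cell:
            u["w"] = u["w"] * factor if u["responded"] else 0.0
    return units

units = [
    {"prob": 0.1, "responded": True,  "cell": "urban"},
    {"prob": 0.1, "responded": False, "cell": "urban"},
    {"prob": 0.2, "responded": True,  "cell": "rural"},
]
out = nr_adjusted_weights(units)
print([u["w"] for u in out])  # [20.0, 0.0, 5.0]
```

For a nonprobability sample, neither `prob` nor a defensible response mechanism is available, which is exactly the external-validity problem the paper addresses.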

Studies using nonprobability samples have gained more attention in recent years, but they are not new. Touted as cheaper and faster (even better) than probability designs, these surveys capture participants through various methods, such as volunteer panel surveys. Both the linkage to the target population and the probability of participation in the survey must be addressed to answer the question of external validity.

This paper first summarizes the work to date on analyses with nonprobability designs and its implications for external validity. With this in mind, we expand the research by providing cautionary conditions under which external validity is most likely in question and, conversely, conditions under which it is not.

Examining Some Aspects of Balanced Sampling in Surveys
Guillaume Chauvet, CREST-ENSAI Ecole Nationale de la Statistique et de l'Analyse de l'Information, France; David Haziza, Université de Montréal, Canada and Éric Lesage, CREST-ENSAI Ecole Nationale de la Statistique et de l'Analyse de l'Information, France

Balanced sampling has received some attention in recent years. A number of procedures exist that lead to a balanced or approximately balanced sample. These procedures can be divided into two families: rejective methods and the Cube method (Deville and Tillé, 2004). The goal of such procedures is to prevent the selection of undesirable samples with respect to the auxiliary information and to reduce the variance of estimators of totals for variables of interest that are correlated with the auxiliary variables involved in the balancing constraints.

With a rejective procedure, the balancing constraint is well respected but the inclusion probabilities are complex and unknown. Conversely, with the Cube method, the inclusion probabilities are exactly respected, but the selected sample could be unbalanced.

In this presentation, we examine the properties of several estimation procedures under the rejective sampling procedure of Fuller (2009) and the Cube method and we comment on the results of an extensive simulation study.
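As a toy version of a rejective procedure (a simplified stand-in, not Fuller's exact method), one can redraw simple random samples until the expansion estimator of an auxiliary total falls within a tolerance of the known population total. The data and tolerance below are invented.

```python
import random

# Crude rejective balanced sampling: keep drawing SRSWOR samples until
# the expanded auxiliary total is close to the known population total.

def rejective_balanced(x, n, tol, rng, max_tries=10_000):
    N, total_x = len(x), sum(x)
    for _ in range(max_tries):
        s = rng.sample(range(N), n)
        estimate = (N / n) * sum(x[i] for i in s)   # expansion estimator
        if abs(estimate - total_x) <= tol * total_x:
            return s
    raise RuntimeError("no balanced sample found")

rng = random.Random(11)
x = [rng.uniform(10, 100) for _ in range(200)]       # auxiliary variable
sample = rejective_balanced(x, n=20, tol=0.01, rng=rng)
est = (len(x) / 20) * sum(x[i] for i in sample)
print(abs(est - sum(x)) / sum(x) <= 0.01)  # True
```

The sketch makes the trade-off visible: accepted samples are well balanced, but the acceptance step distorts the inclusion probabilities away from the nominal n/N, which is the complication the presentation examines.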

Session 10B – Microsimulations 2

Simario: an R Package for Dynamic Microsimulation
Jessica McLay, Oliver Mannion, Janet Pearson and Barry Milne, University of Auckland, New Zealand

There are a variety of software packages and programming languages used for microsimulation. Simario is the first R package created to perform dynamic microsimulation. Simario comprises a collection of R functions that enable one to build a dynamic (or non-dynamic) microsimulation. R is renowned for data manipulation, with the language built around vectors, matrices, arrays, lists, and objects. It is also renowned for easily simulating from numerous statistical distributions, including the binomial, normal, Poisson and negative binomial. These capabilities are harnessed to create a flexible framework for microsimulation. A simple microsimulation model is used to demonstrate how to use the simario functions to create a working microsimulation. The process covers all aspects of programming a microsimulation model, including: the starting population, variable definition files, the inclusion of transition probabilities and parameter estimates from statistical models, simulating different types of variables through time, summarising data from individual runs, collating the summaries over multiple runs, functions for viewing results, and running ‘what if’ scenarios. Unique features of the simario framework are highlighted, including the flexibility of parameter estimates and variable manipulation, the ability to run scenarios by changing the distribution of a variable in the population, the ability to run scenarios on a subgroup of the population, and specific functions for viewing simulation results by any user-specified subgroup. The current limitations of simario are also reported. It is expected that the simario package will be published on the CRAN website in late 2014.
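The core pattern simario implements in R (transition probabilities applied through time, multiple runs, collated per-year summaries) can be sketched generically. The Python below illustrates the pattern only, with invented parameters; it is not simario code.

```python
import random
from statistics import mean

# Generic dynamic-microsimulation loop: a binary state evolves through
# time via a transition probability; runs are repeated and per-year
# summaries are collated across runs. All parameters are invented.

P_ONSET = 0.15    # hypothetical annual probability of entering the state

def one_run(n_people, n_years, rng):
    state = [False] * n_people          # absorbing state, everyone starts out
    yearly_rates = []
    for _ in range(n_years):
        state = [s or rng.random() < P_ONSET for s in state]
        yearly_rates.append(sum(state) / n_people)
    return yearly_rates

rng = random.Random(5)
runs = [one_run(n_people=500, n_years=10, rng=rng) for _ in range(20)]
# Collate: mean prevalence per year across the 20 runs.
collated = [mean(r[y] for r in runs) for y in range(10)]
print([round(v, 2) for v in collated])
```

A "what if" scenario in this framework is simply a second set of runs with a modified parameter (for example, a lower `P_ONSET` for a subgroup), with the two collated series then compared.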

Demosim Population Projection Model: Update and New Developments
Éric Caron-Malenfant, Statistics Canada

Demosim is a microsimulation model by Statistics Canada that uses population census microdata to generate population projections for various characteristics (for example, visible minority group, Aboriginal identity or education) for selected geographic areas including census metropolitan areas and Indian reserves. Dynamically taking into account a wide range of characteristics associated with the occurrence of simulated events, Demosim has led to the release of the Projections of the Diversity of the Canadian Population, 2006-2031, and the Population Projections by Aboriginal Identity in Canada, 2006-2031.

The purpose of this presentation is to give an overview of the new developments that will be introduced into Demosim for new population projections to be produced using the model in partnership with other federal departments. Aside from updating the model's base population using microdata from the 2011 National Household Survey, the new developments include adding new geographic areas, new variables (for example, immigrant admission class, Aboriginal families and households, and language variables) and new events that make it possible to project these variables. As well, this presentation will describe the possibilities arising from the use of new data sources, including record linkage, to construct certain model parameters, such as those associated with changes in education level or in Aboriginal identity reported over a person's lifetime.

The Relation Between Education and Labour Force Participation of Aboriginal Peoples: a Simulation Analysis Using the Demosim Population Projection Model
Martin Spielauer, Statistics Canada

This study aims to quantify the impact of educational attainment on the future labour force participation of Aboriginal peoples. Using Statistics Canada's Demosim population projection model, we are able to simulate alternative scenarios of educational change and the resulting effects on the future labour force until 2056. About half of the observed difference in labour force participation rates between Aboriginal peoples and the Canadian-born population belonging neither to an Aboriginal group nor to a visible minority group can be attributed to educational differences. Following a “medium growth – recent trend” scenario, over the next four decades population growth among Aboriginal peoples would result in a 45% increase in the size of their labour force if relative educational differences persist. In education scenarios that close the educational gap, this number would increase by almost 70%. At the same time, the composition of the future Aboriginal labour force would be dramatically different. While the impact of educational improvements on the future labour force is significant, the change is found to be a slow, gradual process, as successive young school-age cohorts have yet to enter the labour market and renew the workforce.

Projections of Aboriginal and Non-Aboriginal Families and Households Using the Demosim Population Projection Model
Jean-Dominique Morency, Statistics Canada

In this presentation, we will discuss how a new module works for projecting Aboriginal and non-Aboriginal families and households with the Demosim Population Projection Model. Although projections of Aboriginal and non-Aboriginal families and households have been produced in the past in separate projection exercises, this is the first time in Canada that such projections have been produced simultaneously (by Demosim projecting the total population of Canada), and the first time that they have been produced using a microsimulation model.

After explaining the objectives of the projections, we will focus on what we mean by Aboriginal and non-Aboriginal families and households. We will then give details on the methodology used to produce the projections: assigning family and household characteristics to each individual in the base population and applying family reference person rates and household maintainer rates to obtain a count of the number of families and households for each projection year.
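The rate-based step described in the last sentence amounts to multiplying projected population counts by maintainer rates and summing. The age groups, counts and rates below are hypothetical, not model values.

```python
# Household maintainer-rate sketch: projected population counts by group
# times the group's household maintainer rate, summed over groups.
# All groups and numbers are hypothetical.

def project_households(pop_by_group, maintainer_rates):
    return sum(pop_by_group[g] * maintainer_rates[g] for g in pop_by_group)

pop = {"25-44": 9_000_000, "45-64": 8_000_000, "65+": 6_000_000}
rates = {"25-44": 0.45, "45-64": 0.55, "65+": 0.60}

households = project_households(pop, rates)
print(round(households))  # 12050000
```

The same computation with family reference person rates in place of maintainer rates yields the family counts; repeating it for each projection year gives the projected series.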

Labour Force Projections using Demosim: New developments
Laurent Martel, Statistics Canada

Labour force projections are popular with the media, banks and financial institutions. Projections generated with the microsimulation model Demosim have the benefit of explicitly taking into account future changes in the composition of the population with respect to education and ethnocultural diversity. Demosim is also used to carry out unique sensitivity studies, such as measuring the effect on future labour force participation of higher educational attainment of certain populations or of better economic integration of immigrants.

This presentation will describe the proposed new developments related to the labour force, as part of the update of the Demosim model using the National Household Survey (NHS) database. In particular, we will show the methods used to calculate the differentials in the labour force participation rates for a number of variables, including education; immigrant, visible minority and Aboriginal status; regions including Indian reserves; place of birth, etc. We will also demonstrate how recent data linkages can improve this Demosim module, particularly the option of adding variables not found in the NHS, such as immigrant class, a key variable in the economic integration of immigrants.

Session 11 – Approaches to Inference in Survey Sampling

On the Calibrated Bayesian in Design and Analysis
Donald B. Rubin, Harvard University, USA

A variety of statisticians have argued for using Bayesian thinking to create procedures and frequentist operating characteristics to ensure that selected procedures are calibrated across a variety of realistic situations. This presentation supports that approach at the design stage, meaning before the actual data set has been collected and observed, but distinguishes that calibration from the more refined and conditional calibration that a statistician should employ after seeing the data set and selecting procedures to trust for drawing inferences from that specific data set.
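The design-stage notion of calibration can be illustrated with a small simulation: a Bayesian interval procedure is judged, before any data are seen, by its frequentist coverage over repeated samples. The model below (a known-variance normal mean with a flat prior, under which the 95% credible interval coincides with the classical confidence interval) and all settings are illustrative assumptions, not the speaker's example.

```python
# Design-stage calibration check (illustrative): simulate repeated
# samples and record how often the 95% posterior credible interval
# for the mean covers the true value.
import random

random.seed(12345)
TRUE_MU, SIGMA, N, REPS = 0.0, 1.0, 20, 2000
half_width = 1.96 * SIGMA / N ** 0.5  # 95% interval half-width

hits = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MU, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    # Flat-prior posterior for the mean is N(xbar, SIGMA^2 / N), so the
    # central 95% credible interval is xbar +/- half_width.
    if abs(xbar - TRUE_MU) <= half_width:
        hits += 1

coverage = hits / REPS
print(coverage)  # close to the nominal 0.95
```

A procedure passing such checks across a variety of realistic scenarios is calibrated in the design-stage sense; the conditional, post-data calibration the abstract distinguishes is a finer requirement.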

J.N.K. Rao, Carleton University, Canada
Ray Chambers, National Institute for Applied Statistics Research Australia (NIASRA), University of Wollongong, Australia

Poster Session

Automatic Coding of Occupations
Arne Bethmann, Manfred Antoni, Malte Schierholz, Markus Zielonka, Daniel Bela and Knut Wenzig, Institute for Employment Research (IAB), Germany

In recent years, several German large-scale panel studies have demonstrated the demand for coding open-ended survey questions on respondents' occupations (e.g. NEPS, SOEP and PASS). So far, occupational coding in Germany has mostly been done semi-automatically, employing dictionary approaches with subsequent manual coding of the cases that could not be coded automatically.

Since the manual coding of occupations generates considerably higher costs than automatic coding, it is highly desirable from a survey cost perspective to increase the proportion of coding that can be done automatically. At the same time, the quality of the coding is of paramount importance and calls for close scrutiny: the quality of the automatic coding must at least match that of the manual coding if survey cost is not to be traded for survey error. From a total survey error perspective, this would free resources formerly spent on the reduction of processing error and offer the opportunity to employ those resources to reduce other error sources.

In contrast to the dictionary approaches mainly used for automatic occupational coding in German surveys, we will employ different machine learning algorithms (e.g. naïve Bayes or k-nearest neighbours) for the task. Since we have a substantial amount of manually coded occupations from recent studies at our disposal, we will use these as training data for the automatic classification. This enables us to evaluate the performance as well as the quality – and hence the feasibility – of machine learning algorithms for the task of automatically coding open-ended survey questions on occupations.
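The supervised approach can be sketched in miniature. This is a hedged toy, not the IAB implementation: a new open-ended answer receives the code of its most similar manually coded training answer (1-nearest-neighbour with Jaccard similarity on word tokens); the answer texts and the three-digit codes below are invented.

```python
# Toy 1-nearest-neighbour occupation coder (illustrative, not the
# authors' system). Manually coded answers act as training data.

training = [
    ("nurse in a hospital ward", "322"),
    ("registered nurse intensive care", "322"),
    ("software developer web applications", "434"),
    ("programmer embedded software", "434"),
    ("truck driver long distance", "714"),
]

def jaccard(a, b):
    """Word-level Jaccard similarity between two answer strings."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def code_answer(answer, coded):
    """Assign the code of the most similar manually coded answer."""
    return max(coded, key=lambda pair: jaccard(answer, pair[0]))[1]

print(code_answer("nurse in intensive care", training))  # "322"
print(code_answer("java software developer", training))  # "434"
```

Holding out part of the manually coded data as a test set, as the abstract proposes, lets one compare the automatic coder's accuracy directly against manual coding quality.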

Secure Record Linkage: Encrypt, Match, Analyze
Aleksander Essex, Western University; Khaled El Emam, CHEO Research Institute; Luk Arbuckle, Privacy Analytics; Matthew Tucciarone, CHEO Research Institute, Canada

Imagine you want to find records in a data set that you don't have access to, because you don't have the appropriate authority or consent. You also don't want the data holder, or a third party, to know who you're looking for, so asking them for certain records is not an option. You want to find John Doe, but you don't care to know if Jane Buck is in the data set, and you don't want the data holder to know you're looking for John Doe. You may want to link records for research, fraud detection, de-duplication, or surveillance. But without a way to bring these records together while ensuring the privacy and confidentiality of sensitive information, you will be hard-pressed to share data between organizations.

What if you could instead find the records you're looking for without seeing the other records in the data set, which are of no interest to you, and without disclosing to the data holder the records you're trying to find? This can be achieved with strong security using public-key cryptosystems (specifically, homomorphic encryption), with matches returned and decrypted only by the data requestor. No fully trusted third party is needed and no data is decrypted except by the requestor: a semi-trusted third party acts as the key holder, but all it learns from participating in the secure record linkage is how many records matched between the two data sets. We will use a case study to explain the protocol.
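The core homomorphic trick can be sketched with a private equality test under the Paillier cryptosystem. This is an assumption-laden toy, not the authors' production protocol: the primes are insecurely small, and the one-round query/response below omits the key-holder role entirely. The requestor sends an encrypted identifier; the holder returns a blinded encrypted difference that decrypts to zero exactly when the identifiers match, revealing nothing else.

```python
# Toy Paillier-based private equality test (illustrative only; tiny,
# insecure primes). Requires Python 3.9+ (math.lcm, modular pow).
import math
import random

# --- Paillier key generation ---
p, q = 293, 433            # toy primes; real keys use ~2048-bit moduli
n = p * q
n2 = n * n
g = n + 1
lam = math.lcm(p - 1, q - 1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# --- Private equality test ---
# The data holder computes E(k * (query - record)) for a random blind k:
# it decrypts to 0 iff the identifiers match, and to random noise otherwise.
def blinded_difference(c_query, record_id):
    k = random.randrange(1, n)
    c_diff = (c_query * pow(encrypt(record_id), -1, n2)) % n2  # E(q - r)
    return pow(c_diff, k, n2)                                  # E(k(q - r))

c = encrypt(1234)  # requestor's encrypted identifier
print(decrypt(blinded_difference(c, 1234)) == 0)  # True: match
print(decrypt(blinded_difference(c, 9999)) == 0)  # False: no match
```

The additive homomorphism (multiplying ciphertexts adds plaintexts) is what lets the holder compute on data it cannot read; the random blind k hides the actual difference from the requestor on a non-match.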

Interviewers' Influence on Bias in Reported Income
Manfred Antoni, Basha Vicari and Daniel Bela, Institute for Employment Research (IAB), Germany

Questions on sensitive topics like income often produce relatively high rates of item nonresponse or measurement error. In this context, several analyses of item nonresponse have been carried out, but little is known about misreporting. One possible explanation for such misreporting is social desirability bias, which may lead to over-reporting of desirable attributes or under-reporting of undesirable ones. However, a competent interviewer may be able to inhibit such behaviour. We therefore examine the influence of respondent and interviewer characteristics on the accuracy of reported income.

Using linked survey and administrative data, we are able to detect the extent of social desirability bias in reported incomes. The starting point for the linkage is data from the German National Educational Panel Study (NEPS). In addition to survey data, NEPS provides rich paradata, including interviewer characteristics and context data. About 90% of the respondents consented to a linkage of their survey information with administrative data from the German Federal Employment Agency. These longitudinal earnings data are highly reliable, as they are based on mandatory notifications by employers to the social security system. The data sources were combined using record linkage techniques for non-unique identifiers.
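Linkage on non-unique identifiers can be sketched as scoring agreement across several fields and accepting the best-scoring candidate above a threshold. This is a hedged illustration, not the NEPS linkage itself: the fields, weights and threshold below are invented assumptions.

```python
# Toy linkage on non-unique identifiers (illustrative): a weighted
# agreement score over fuzzy name similarity and exact agreement on
# other fields picks the best candidate above a threshold.
import difflib

survey_record = {"name": "anna schmidt", "birth_year": 1975, "region": "09"}

admin_records = [
    {"id": 1, "name": "anna schmitt", "birth_year": 1975, "region": "09"},
    {"id": 2, "name": "anna schmidt", "birth_year": 1962, "region": "03"},
    {"id": 3, "name": "bernd maier",  "birth_year": 1975, "region": "09"},
]

def match_score(a, b):
    """Weighted agreement score over non-unique identifiers."""
    name_sim = difflib.SequenceMatcher(None, a["name"], b["name"]).ratio()
    score = 2.0 * name_sim                                   # fuzzy name
    score += 1.0 if a["birth_year"] == b["birth_year"] else 0.0
    score += 0.5 if a["region"] == b["region"] else 0.0
    return score

def best_link(rec, candidates, threshold=2.5):
    """Return the id of the best candidate, or None below the threshold."""
    best = max(candidates, key=lambda c: match_score(rec, c))
    return best["id"] if match_score(rec, best) >= threshold else None

print(best_link(survey_record, admin_records))  # 1 (typo in name tolerated)
```

Note how record 1 wins despite a misspelled surname, while record 2, with an identical name but a different birth year and region, falls short; that trade-off between field weights is the crux of linkage with non-unique identifiers.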

We include interviewer and respondent characteristics, as well as their interactions, in our model to estimate their respective impact on the incidence and size of any bias in reported income. This allows us to control for latent interviewer traits that might have influenced respondents' answering behaviour in each interview conducted by a given interviewer.

Investment Banking Services Price Index: A New Approach to Using Administrative Data
Min Xie, Nael Hajjar and Lucy Opsitnik, Statistics Canada

The Investment Banking Services Price Index (IBSPI) will measure changes in the prices of investment banking services in order to deflate the output of the investment banking industry (part of NAICS 52311). Investment banking services include the underwriting of securities (debt and equity) and merger and acquisition (M&A) advisory services.

This industry is heavily regulated in Canada, and publicly traded companies are required to report on new issues and M&As; there is therefore a large amount of administrative data available. Some private organizations, such as Bloomberg and the Financial Post, track these activities via regulatory sources like the System for Electronic Document Analysis and Retrieval (SEDAR) as well as corporate announcements. The information is categorized in longitudinal databases that are made available to the public on a subscription basis. In order to build a price index for the activities of the industry and ensure sufficient coverage, several options are available, including exclusive use of administrative data, a mix of administrative data and a model-based approach, or a purely model-based approach.

This paper considers these options by examining the relevance and usefulness of the available administrative data. First, we present an analysis of the industry from three perspectives: industry activity, NAICS and SNA output. We then examine the available administrative data, their coverage and their limitations. Next, we summarize options for building price indexes from the available administrative data and examine their limitations, proposing mitigation strategies (when to consider supplemental information and how to integrate it). This mixed approach goes beyond traditional survey taking and explores an alternative to the model pricing approach in order to create this new Canadian financial services price index.
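The administrative-data-only option amounts to computing a price index directly from observed transaction fees. A minimal sketch, under illustrative assumptions (the service categories, fee levels and base-period revenue weights below are invented, and a fixed-base Laspeyres-type formula is assumed for simplicity):

```python
# Toy fixed-base (Laspeyres-type) price index computed from
# administrative fee data (illustrative values only).

# Base-period revenue weights and average fees (%) per service category.
base = {
    "equity_underwriting": (100.0, 2.0),  # (revenue weight, fee %)
    "debt_underwriting":   (80.0, 0.8),
    "ma_advisory":         (60.0, 1.5),
}

# Current-period average fees observed in the administrative data.
current_fees = {
    "equity_underwriting": 2.2,
    "debt_underwriting":   0.9,
    "ma_advisory":         1.5,
}

def laspeyres_index(base, current):
    """Base-revenue-weighted mean of price relatives, scaled to 100."""
    num = sum(w * current[k] / p0 for k, (w, p0) in base.items())
    den = sum(w for w, _ in base.values())
    return 100.0 * num / den

print(round(laspeyres_index(base, current_fees), 1))  # 108.3
```

The mixed option in the abstract would replace missing or thin cells in `current_fees` with model-based prices rather than dropping them, which is where the coverage and limitation questions arise.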