# Statistics by subject – Statistical methods


## All (185): showing 25 of 185 results

• Articles and reports: 82-003-X201601214687
Description:

This study describes record linkage of the Canadian Community Health Survey and the Canadian Mortality Database. The article explains the record linkage process and presents results about associations between health behaviours and mortality among a representative sample of Canadians.

Release date: 2016-12-21

• Articles and reports: 12-001-X201600214676
Description:

Winsorization procedures replace extreme values with less extreme values, effectively moving the original extreme values toward the center of the distribution. Winsorization therefore both detects and treats influential values. Mulry, Oliver and Kaputa (2014) compare the performance of the one-sided Winsorization method developed by Clark (1995) and described by Chambers, Kokic, Smith and Cruddas (2000) to the performance of M-estimation (Beaumont and Alavi 2004) in highly skewed business population data. One aspect of particular interest for methods that detect and treat influential values is the range of values designated as influential, called the detection region. The Clark Winsorization algorithm is easy to implement and can be extremely effective. However, the resultant detection region is highly dependent on the number of influential values in the sample, especially when the survey totals are expected to vary greatly by collection period. In this note, we examine the effect of the number and magnitude of influential values on the detection regions from Clark Winsorization using data simulated to realistically reflect the properties of the population for the Monthly Retail Trade Survey (MRTS) conducted by the U.S. Census Bureau. Estimates from the MRTS and other economic surveys are used in economic indicators, such as the Gross Domestic Product (GDP).

Release date: 2016-12-20
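A minimal sketch of the one-sided Winsorization the note studies: values above a threshold (the boundary of the detection region) are replaced by the threshold itself. The threshold value here is illustrative, not the Clark (1995) tuning.

```python
def winsorize_one_sided(values, threshold):
    """Replace any value above the threshold with the threshold itself,
    pulling extreme values toward the center of the distribution."""
    return [min(v, threshold) for v in values]

sample = [12, 15, 9, 480, 14]  # one influential value
treated = winsorize_one_sided(sample, threshold=100)
# → [12, 15, 9, 100, 14]: the extreme value 480 falls in the detection
# region and is treated; all other values pass through unchanged.
```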

• Articles and reports: 12-001-X201600214677
Description:

How do we tell whether weighting adjustments reduce nonresponse bias? If a variable is measured for everyone in the selected sample, then the design weights can be used to calculate an approximately unbiased estimate of the population mean or total for that variable. A second estimate of the population mean or total can be calculated using the survey respondents only, with weights that have been adjusted for nonresponse. If the two estimates disagree, then there is evidence that the weight adjustments may not have removed the nonresponse bias for that variable. In this paper we develop the theoretical properties of linearization and jackknife variance estimators for evaluating the bias of an estimated population mean or total by comparing estimates calculated from overlapping subsets of the same data with different sets of weights, when poststratification or inverse propensity weighting is used for the nonresponse adjustments to the weights. We provide sufficient conditions on the population, sample, and response mechanism for the variance estimators to be consistent, and demonstrate their small-sample properties through a simulation study.

Release date: 2016-12-20
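The comparison the paper builds on can be sketched as follows: one estimate from the full selected sample with design weights, one from respondents only with nonresponse-adjusted weights, and their gap as evidence of residual bias. The values and the single-adjustment-cell weighting are illustrative assumptions.

```python
def weighted_mean(y, w):
    return sum(yi * wi for yi, wi in zip(y, w)) / sum(w)

# Variable measured for everyone in the selected sample (toy values).
y    = [10.0, 12.0, 8.0, 20.0, 15.0]
d    = [100.0] * 5                       # design weights
resp = [True, True, False, True, False]  # response indicators

# Estimate 1: full sample with design weights (approximately unbiased).
full_est = weighted_mean(y, d)

# Estimate 2: respondents only, with weights adjusted for nonresponse
# (a single adjustment cell here, so each weight is scaled by n / r).
r    = sum(resp)
yr   = [yi for yi, ri in zip(y, resp) if ri]
wadj = [di * len(resp) / r for di, ri in zip(d, resp) if ri]
resp_est = weighted_mean(yr, wadj)

# A nonzero gap is evidence that the adjustment did not remove the bias.
gap = resp_est - full_est
```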

• Technical products: 11-522-X201700014711
Description:

After the 2010 Census, the U.S. Census Bureau conducted two separate research projects matching survey data to databases. One study matched to the third-party database Accurint, and the other matched to U.S. Postal Service National Change of Address (NCOA) files. In both projects, we evaluated response error in reported move dates by comparing the self-reported move date to records in the database. We encountered similar challenges in the two projects. This paper discusses our experience using “big data” as a comparison source for survey data and our lessons learned for future projects similar to the ones we conducted.

Release date: 2016-03-24

• Technical products: 11-522-X201700014745
Description:

In the design of surveys, a number of parameters, such as contact propensities, participation propensities and costs per sample unit, play a decisive role. In ongoing surveys, these survey design parameters are usually estimated from previous experience and updated gradually as new experience accumulates. In new surveys, they are estimated from expert opinion and from experience with similar surveys. Although survey institutes have considerable expertise and experience, the postulation, estimation and updating of survey design parameters are rarely done in a systematic way. This paper presents a Bayesian framework for including and updating prior knowledge and expert opinion about the parameters. The framework is set in the context of adaptive survey designs, in which different population units may receive different treatments given quality and cost objectives. For this type of survey, the accuracy of the design parameters becomes even more crucial to effective design decisions. The framework allows for a Bayesian analysis of the performance of a survey during data collection and between waves of a survey. We demonstrate the Bayesian analysis using a realistic simulation study.

Release date: 2016-03-24
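The kind of updating described above can be sketched with a conjugate Beta-Binomial model for a single contact propensity; the prior and the observed counts below are illustrative assumptions, not values from the paper.

```python
def update_propensity(alpha, beta, successes, attempts):
    """Beta-Binomial update of a design parameter such as a contact
    propensity: the Beta(alpha, beta) prior encodes expert opinion or
    experience with similar surveys; observed outcomes update it."""
    a = alpha + successes
    b = beta + (attempts - successes)
    return a, b, a / (a + b)

# Prior centred near 0.6 (Beta(6, 4)); then 70 contacts are observed
# in 100 attempts during data collection.
a, b, post_mean = update_propensity(6, 4, 70, 100)
# The posterior Beta(76, 34) can itself serve as the prior for the
# next wave, giving the gradual, systematic updating the paper seeks.
```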

• Technical products: 11-522-X201700014710
Description:

The Data Warehouse has modernized the way the Canadian System of Macroeconomic Accounts (MEA) is produced and analyzed today. Its continuing evolution expands the amount and types of analytical work that can be done within the MEA. It brings in the needed elements of harmonization and confrontation as the macroeconomic accounts move toward full integration. The improvements in quality, transparency and timeliness have strengthened the statistics being disseminated.

Release date: 2016-03-24

• Technical products: 11-522-X201700014738
Description:

In the standard design approach to missing observations, the construction of weight classes and calibration are used to adjust the design weights for the respondents in the sample. Here we use these adjusted weights to define a Dirichlet distribution which can be used to make inferences about the population. Examples show that the resulting procedures have better performance properties than the standard methods when the population is skewed.

Release date: 2016-03-24
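One way to read the proposal is as a Bayesian-bootstrap-style procedure: Dirichlet weights with parameters given by the adjusted respondent weights generate posterior draws of the population mean. This is a sketch under that reading; the values are illustrative.

```python
import random

def dirichlet_draw(alphas, rng):
    """Draw from Dirichlet(alphas) via normalized Gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alphas]
    s = sum(g)
    return [gi / s for gi in g]

def posterior_mean_draws(y, adj_weights, n_draws=1000, seed=42):
    """Posterior draws of the population mean under a Dirichlet
    distribution parameterized by the adjusted respondent weights."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        p = dirichlet_draw(adj_weights, rng)
        draws.append(sum(pi * yi for pi, yi in zip(p, y)))
    return draws

y = [3.0, 5.0, 40.0, 4.0, 6.0]  # skewed respondent values
w = [2.0, 2.0, 1.0, 2.0, 2.0]   # calibrated / adjusted weights
draws = posterior_mean_draws(y, w)
# Quantiles of `draws` give credible intervals for the population mean.
```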

• Technical products: 11-522-X201700014754
Description:

Background: There is increasing interest in measuring and benchmarking health system performance. We compared Canada’s health system with those of other countries in the Organisation for Economic Co-operation and Development (OECD), at both the national and provincial levels, across 50 indicators of health system performance. This analysis can help provinces identify potential areas for improvement, considering an optimal comparator for international comparisons. Methods: OECD Health Data from 2013 were used to compare Canada’s results internationally. We also calculated provincial results for the OECD’s indicators of health system performance, using OECD methodology. We normalized the indicator results to present multiple indicators on the same scale and compared them with the OECD average and the 25th and 75th percentiles. Results: Presenting normalized values allows Canada’s results to be compared across multiple OECD indicators on the same scale. No country or province consistently has higher results than the others. For most indicators, Canadian results are similar to those of other countries, but there remain areas where Canada performs particularly well (e.g., smoking rates) or poorly (e.g., patient safety). These data were presented in an interactive eTool. Conclusion: Comparing Canada’s provinces internationally can highlight areas where improvement is needed and help identify potential strategies for improvement.

Release date: 2016-03-24
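The normalization step can be sketched as follows. The abstract does not specify the exact transformation, so min-max scaling is an illustrative assumption, and the rates are hypothetical.

```python
def normalize(values):
    """Min-max rescale one indicator's results to a common 0-1 range so
    several indicators with different units can share one scale."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical smoking rates (%) for four jurisdictions.
scaled = normalize([14.0, 18.0, 22.0, 26.0])
# After rescaling, this indicator can be plotted alongside others (e.g.,
# patient-safety rates) against the normalized OECD average and the
# 25th/75th percentile bands.
```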

• Technical products: 11-522-X201700014732
Description:

The Institute for Employment Research (IAB) is the research unit of the German Federal Employment Agency. Via the Research Data Centre (FDZ) at the IAB, administrative and survey data on individuals and establishments are provided to researchers. In cooperation with the Institute for the Study of Labor (IZA), the FDZ has implemented the Job Submission Application (JoSuA) environment which enables researchers to submit jobs for remote data execution through a custom-built web interface. Moreover, two types of user-generated output files may be distinguished within the JoSuA environment which allows for faster and more efficient disclosure review services.

Release date: 2016-03-24

• Articles and reports: 82-003-X201600314338
Description:

This paper describes the methods and data used in the development and implementation of the POHEM-Neurological meta-model.

Release date: 2016-03-16

• Articles and reports: 12-001-X201500214230
Description:

This paper develops allocation methods for stratified sample surveys where composite small area estimators are a priority, and areas are used as strata. Longford (2006) proposed an objective criterion for this situation, based on a weighted combination of the mean squared errors of small area means and a grand mean. Here, we redefine this approach within a model-assisted framework, allowing regressor variables and a more natural interpretation of results using an intra-class correlation parameter. We also consider several uses of power allocation, and allow the placing of other constraints such as maximum relative root mean squared errors for stratum estimators. We find that a simple power allocation can perform very nearly as well as the optimal design even when the objective is to minimize Longford’s (2006) criterion.

Release date: 2015-12-17
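A minimal sketch of the power allocation mentioned above: stratum h receives sample proportional to (N_h S_h)^q, so q = 1 recovers Neyman allocation and smaller q shifts sample toward small strata (areas). The stratum sizes and the choice q = 0.5 are illustrative.

```python
def power_allocation(sizes, sds, n_total, q=0.5):
    """Power allocation for stratified sampling: stratum h gets sample
    proportional to (N_h * S_h) ** q. q = 1 is Neyman allocation;
    q < 1 favours small strata, which helps small area estimation."""
    raw = [(N * S) ** q for N, S in zip(sizes, sds)]
    total = sum(raw)
    return [n_total * r / total for r in raw]

# Three areas used as strata: one large, two small, equal variability.
alloc = power_allocation(sizes=[9000, 500, 500],
                         sds=[10.0, 10.0, 10.0],
                         n_total=300, q=0.5)
# With q = 0.5 the large stratum is under-sampled relative to Neyman,
# leaving more sample for the small areas.
```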

• Articles and reports: 12-001-X201500214238
Description:

Félix-Medina and Thompson (2004) proposed a variant of link-tracing sampling to sample hidden and/or hard-to-detect human populations such as drug users and sex workers. In their variant, an initial sample of venues is selected and the people found in the sampled venues are asked to name other members of the population to be included in the sample. Those authors derived maximum likelihood estimators of the population size under the assumption that the probability that a person is named by another in a sampled venue (link-probability) does not depend on the named person (homogeneity assumption). In this work we extend their research to the case of heterogeneous link-probabilities and derive unconditional and conditional maximum likelihood estimators of the population size. We also propose profile likelihood and bootstrap confidence intervals for the size of the population. The results of simulation studies we carried out show that, in the presence of heterogeneous link-probabilities, the proposed estimators perform reasonably well provided that relatively large sampling fractions, say larger than 0.5, are used, whereas the estimators derived under the homogeneity assumption perform badly. The outcomes also show that the proposed confidence intervals are not very robust to deviations from the assumed models.

Release date: 2015-12-17

• Articles and reports: 12-001-X201500214237
Description:

Careful design of a dual-frame random digit dial (RDD) telephone survey requires selecting from among many options that have varying impacts on cost, precision, and coverage in order to obtain the best possible implementation of the study goals. One such consideration is whether to screen cell-phone households in order to interview cell-phone only (CPO) households and exclude dual-user households, or to take all interviews obtained via the cell-phone sample. We present a framework in which to consider the tradeoffs between these two options and a method to select the optimal design. We derive and discuss the optimum allocation of sample size between the two sampling frames and explore the choice of optimum p, the mixing parameter for the dual-user domain. We illustrate our methods using the National Immunization Survey, sponsored by the Centers for Disease Control and Prevention.

Release date: 2015-12-17
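The role of the mixing parameter p can be sketched as follows: the dual-user domain is estimated by a p-weighted composite of the two frames' estimates, while the single-frame domains enter directly. The totals below are illustrative assumptions.

```python
def dual_frame_total(y_cell_only, y_ll_only, y_dual_cell, y_dual_ll, p):
    """Dual-frame estimator of a total: cell-only and landline-only
    domain estimates enter directly; the dual-user domain is a
    p-weighted composite of the cell-frame and landline-frame
    estimates of that domain."""
    return y_cell_only + y_ll_only + p * y_dual_cell + (1 - p) * y_dual_ll

# p = 0 corresponds to the screening design (dual users taken only via
# the landline frame); p = 1 takes all dual-user interviews from the
# cell sample. Intermediate p blends the two.
est = dual_frame_total(1200.0, 800.0, 500.0, 540.0, p=0.4)
```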

• Articles and reports: 12-001-X201500214249
Description:

The problem of optimal allocation of samples in surveys using a stratified sampling plan was first discussed by Neyman in 1934. Since then, many researchers have studied the problem of sample allocation in multivariate surveys, and several methods have been proposed. These methods fall into two classes: the first comprises methods that seek an allocation which minimizes survey costs while keeping the coefficients of variation of estimators of totals below specified thresholds for all survey variables of interest; the second aims to minimize a weighted average of the relative variances of the estimators of totals given a maximum overall sample size or a maximum cost. This paper proposes a new optimization approach to the sample allocation problem in multivariate surveys, based on a binary integer programming formulation. Several numerical experiments showed that the proposed approach provides efficient solutions to this problem, which improve upon a ‘textbook algorithm’ and can be more efficient than the algorithm by Bethel (1985, 1989).

Release date: 2015-12-17

• Articles and reports: 82-003-X201501214295
Description:

Using the Wisconsin Cancer Intervention and Surveillance Monitoring Network breast cancer simulation model adapted to the Canadian context, costs and quality-adjusted life years were evaluated for 11 mammography screening strategies that varied by start/stop age and screening frequency for the general population. Incremental cost-effectiveness ratios are presented, and sensitivity analyses are used to assess the robustness of model conclusions.

Release date: 2015-12-16

• Articles and reports: 82-003-X201501114243
Description:

A surveillance tool was developed to assess dietary intake collected by surveys in relation to Eating Well with Canada’s Food Guide (CFG). The tool classifies foods in the Canadian Nutrient File (CNF) according to how closely they reflect CFG. This article describes the validation exercise conducted to ensure that CNF foods determined to be “in line with CFG” were appropriately classified.

Release date: 2015-11-18

• Articles and reports: 12-001-X201500114199
Description:

In business surveys, it is not unusual to collect economic variables for which the distribution is highly skewed. In this context, winsorization is often used to treat the problem of influential values. This technique requires the determination of a constant that corresponds to the threshold above which large values are reduced. In this paper, we consider a method of determining the constant which involves minimizing the largest estimated conditional bias in the sample. In the context of domain estimation, we also propose a method of ensuring consistency between the domain-level winsorized estimates and the population-level winsorized estimate. The results of two simulation studies suggest that the proposed methods lead to winsorized estimators that have good bias and relative efficiency properties.

Release date: 2015-06-29

• Articles and reports: 12-001-X201500114174
Description:

Matrix sampling, often referred to as split-questionnaire, is a sampling design that involves dividing a questionnaire into subsets of questions, possibly overlapping, and then administering each subset to one or more different random subsamples of an initial sample. This increasingly appealing design addresses concerns related to data collection costs, respondent burden and data quality, but reduces the number of sample units that are asked each question. A broadened concept of matrix design includes the integration of samples from separate surveys for the benefit of streamlined survey operations and consistency of outputs. For matrix survey sampling with overlapping subsets of questions, we propose an efficient estimation method that exploits correlations among items surveyed in the various subsamples in order to improve the precision of the survey estimates. The proposed method, based on the principle of best linear unbiased estimation, generates composite optimal regression estimators of population totals using a suitable calibration scheme for the sampling weights of the full sample. A variant of this calibration scheme, of more general use, produces composite generalized regression estimators that are also computationally very efficient.

Release date: 2015-06-29

• Articles and reports: 12-001-X201500114173
Description:

Nonresponse is present in almost all surveys and can severely bias estimates. A distinction is usually made between unit and item nonresponse. Noting that for a particular survey variable we simply have observed and unobserved values, in this work we exploit the connection between unit and item nonresponse. In particular, we assume that the factors that drive unit response are the same as those that drive item response on selected variables of interest. Response probabilities are then estimated using a latent covariate that measures the willingness to respond to the survey and that can explain part of the unknown behaviour of a unit in deciding whether to participate in the survey. This latent covariate is estimated using latent trait models. This approach is particularly relevant for sensitive items and can therefore handle non-ignorable nonresponse. Auxiliary information known for both respondents and nonrespondents can be included either in the latent variable model or in the response probability estimation process. The approach can also be used when auxiliary information is not available, and we focus here on that case. We propose an estimator using a reweighting system based on the latent covariate when no other observed auxiliary information is available. Simulation studies on both real and simulated data give encouraging results on its performance.

Release date: 2015-06-29

• Articles and reports: 12-001-X201500114161
Description:

A popular area level model used for the estimation of small area means is the Fay-Herriot model. This model involves unobservable random effects for the areas apart from the (fixed) linear regression based on area level covariates. Empirical best linear unbiased predictors of small area means are obtained by estimating the area random effects, and they can be expressed as a weighted average of area-specific direct estimators and regression-synthetic estimators. In some cases the observed data do not support the inclusion of the area random effects in the model. Excluding these area effects leads to the regression-synthetic estimator, that is, a zero weight is attached to the direct estimator. A preliminary test estimator of a small area mean obtained after testing for the presence of area random effects is studied. On the other hand, empirical best linear unbiased predictors of small area means that always give non-zero weights to the direct estimators in all areas together with alternative estimators based on the preliminary test are also studied. The preliminary testing procedure is also used to define new mean squared error estimators of the point estimators of small area means. Results of a limited simulation study show that, for a small number of areas, the preliminary testing procedure leads to mean squared error estimators with considerably smaller average absolute relative bias than the usual mean squared error estimators, especially when the variance of the area effects is small relative to the sampling variances.

Release date: 2015-06-29
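The weighted-average form described above can be sketched as follows, with the usual shrinkage weight gamma = sigma_v^2 / (sigma_v^2 + psi_i); the numbers are illustrative, not from the paper.

```python
def fh_predictor(direct, synthetic, sigma2_v, psi):
    """Fay-Herriot predictor of a small area mean: a weighted average of
    the area's direct estimator and the regression-synthetic estimator,
    with weight gamma = sigma2_v / (sigma2_v + psi), where sigma2_v is
    the area-effects variance and psi the area's sampling variance."""
    gamma = sigma2_v / (sigma2_v + psi)
    return gamma * direct + (1 - gamma) * synthetic, gamma

est, gamma = fh_predictor(direct=50.0, synthetic=44.0,
                          sigma2_v=4.0, psi=12.0)
# When the estimated area-effects variance is zero, gamma = 0 and the
# predictor collapses to the regression-synthetic estimator, which is
# the case the preliminary test is designed to detect.
```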

• Articles and reports: 82-003-X201500614196
Description:

This study investigates the feasibility and validity of using personal health insurance numbers to deterministically link the CCR and the Discharge Abstract Database to obtain hospitalization information about people with primary cancers.

Release date: 2015-06-17

• Technical products: 11-522-X201300014259
Description:

In an effort to reduce response burden on farm operators, Statistics Canada is studying alternative approaches to telephone surveys for producing field crop estimates. One option is to publish harvested area and yield estimates in September as is currently done, but to calculate them using models based on satellite and weather data, and data from the July telephone survey. However before adopting such an approach, a method must be found which produces estimates with a sufficient level of accuracy. Research is taking place to investigate different possibilities. Initial research results and issues to consider are discussed in this paper.

Release date: 2014-10-31

• Technical products: 11-522-X201300014290
Description:

This paper describes a new module that will project families and households by Aboriginal status using the Demosim microsimulation model. The methodology being considered would assign a household/family headship status annually to each individual and would use the headship rate method to calculate the number of annual families and households by various characteristics and geographies associated with Aboriginal populations.

Release date: 2014-10-31

• Technical products: 11-522-X201300014274
Description:

What is big data? Can it replace and/or supplement official surveys? What are some of the challenges associated with using big data for official statistics? What are some of the possible solutions? Last fall, Statistics Canada invested in a Big Data Pilot Project to answer some of these questions, the first business survey project of its kind. This paper covers some of the lessons learned from the Big Data Pilot Project using smart meter data.

Release date: 2014-10-31

• Technical products: 11-522-X201300014279
Description:

As part of the European SustainCity project, a microsimulation model of individuals and households was created to simulate the population of various European cities. The aim of the project was to combine several transportation and land-use microsimulation models, add a dynamic population module, and apply these microsimulation approaches to three geographic areas of Europe (the Île-de-France region and the Brussels and Zurich agglomerations).

Release date: 2014-10-31

Data (0)

## Data (0) (0 results)

Your search for "" found no results in this section of the site.

You may try:

Analysis (95)

## Analysis (95) (25 of 95 results)

• Articles and reports: 82-003-X201601214687
Description:

This study describes record linkage of the Canadian Community Health Survey and the Canadian Mortality Database. The article explains the record linkage process and presents results about associations between health behaviours and mortality among a representative sample of Canadians.

Release date: 2016-12-21

• Articles and reports: 12-001-X201600214676
Description:

Winsorization procedures replace extreme values with less extreme values, effectively moving the original extreme values toward the center of the distribution. Winsorization therefore both detects and treats influential values. Mulry, Oliver and Kaputa (2014) compare the performance of the one-sided Winsorization method developed by Clark (1995) and described by Chambers, Kokic, Smith and Cruddas (2000) to the performance of M-estimation (Beaumont and Alavi 2004) in highly skewed business population data. One aspect of particular interest for methods that detect and treat influential values is the range of values designated as influential, called the detection region. The Clark Winsorization algorithm is easy to implement and can be extremely effective. However, the resultant detection region is highly dependent on the number of influential values in the sample, especially when the survey totals are expected to vary greatly by collection period. In this note, we examine the effect of the number and magnitude of influential values on the detection regions from Clark Winsorization using data simulated to realistically reflect the properties of the population for the Monthly Retail Trade Survey (MRTS) conducted by the U.S. Census Bureau. Estimates from the MRTS and other economic surveys are used in economic indicators, such as the Gross Domestic Product (GDP).

Release date: 2016-12-20

• Articles and reports: 12-001-X201600214677
Description:

How do we tell whether weighting adjustments reduce nonresponse bias? If a variable is measured for everyone in the selected sample, then the design weights can be used to calculate an approximately unbiased estimate of the population mean or total for that variable. A second estimate of the population mean or total can be calculated using the survey respondents only, with weights that have been adjusted for nonresponse. If the two estimates disagree, then there is evidence that the weight adjustments may not have removed the nonresponse bias for that variable. In this paper we develop the theoretical properties of linearization and jackknife variance estimators for evaluating the bias of an estimated population mean or total by comparing estimates calculated from overlapping subsets of the same data with different sets of weights, when poststratification or inverse propensity weighting is used for the nonresponse adjustments to the weights. We provide sufficient conditions on the population, sample, and response mechanism for the variance estimators to be consistent, and demonstrate their small-sample properties through a simulation study.

Release date: 2016-12-20

• Articles and reports: 82-003-X201600314338
Description:

This paper describes the methods and data used in the development and implementation of the POHEM-Neurological meta-model.

Release date: 2016-03-16

• Articles and reports: 12-001-X201500214230
Description:

This paper develops allocation methods for stratified sample surveys where composite small area estimators are a priority, and areas are used as strata. Longford (2006) proposed an objective criterion for this situation, based on a weighted combination of the mean squared errors of small area means and a grand mean. Here, we redefine this approach within a model-assisted framework, allowing regressor variables and a more natural interpretation of results using an intra-class correlation parameter. We also consider several uses of power allocation, and allow the placing of other constraints such as maximum relative root mean squared errors for stratum estimators. We find that a simple power allocation can perform very nearly as well as the optimal design even when the objective is to minimize Longford’s (2006) criterion.

Release date: 2015-12-17

• Articles and reports: 12-001-X201500214238
Description:

Félix-Medina and Thompson (2004) proposed a variant of link-tracing sampling to sample hidden and/or hard-to-detect human populations such as drug users and sex workers. In their variant, an initial sample of venues is selected and the people found in the sampled venues are asked to name other members of the population to be included in the sample. Those authors derived maximum likelihood estimators of the population size under the assumption that the probability that a person is named by another in a sampled venue (link-probability) does not depend on the named person (homogeneity assumption). In this work we extend their research to the case of heterogeneous link-probabilities and derive unconditional and conditional maximum likelihood estimators of the population size. We also propose profile likelihood and bootstrap confidence intervals for the size of the population. The results of simulations studies carried out by us show that in presence of heterogeneous link-probabilities the proposed estimators perform reasonably well provided that relatively large sampling fractions, say larger than 0.5, be used, whereas the estimators derived under the homogeneity assumption perform badly. The outcomes also show that the proposed confidence intervals are not very robust to deviations from the assumed models.

Release date: 2015-12-17

• Articles and reports: 12-001-X201500214237
Description:

Careful design of a dual-frame random digit dial (RDD) telephone survey requires selecting from among many options that have varying impacts on cost, precision, and coverage in order to obtain the best possible implementation of the study goals. One such consideration is whether to screen cell-phone households in order to interview cell-phone only (CPO) households and exclude dual-user household, or to take all interviews obtained via the cell-phone sample. We present a framework in which to consider the tradeoffs between these two options and a method to select the optimal design. We derive and discuss the optimum allocation of sample size between the two sampling frames and explore the choice of optimum p, the mixing parameter for the dual-user domain. We illustrate our methods using the National Immunization Survey, sponsored by the Centers for Disease Control and Prevention.

Release date: 2015-12-17

• Articles and reports: 12-001-X201500214249
Description:

The problem of optimal allocation of samples in surveys using a stratified sampling plan was first discussed by Neyman in 1934. Since then, many researchers have studied the problem of the sample allocation in multivariate surveys and several methods have been proposed. Basically, these methods are divided into two classes: The first class comprises methods that seek an allocation which minimizes survey costs while keeping the coefficients of variation of estimators of totals below specified thresholds for all survey variables of interest. The second aims to minimize a weighted average of the relative variances of the estimators of totals given a maximum overall sample size or a maximum cost. This paper proposes a new optimization approach for the sample allocation problem in multivariate surveys. This approach is based on a binary integer programming formulation. Several numerical experiments showed that the proposed approach provides efficient solutions to this problem, which improve upon a ‘textbook algorithm’ and can be more efficient than the algorithm by Bethel (1985, 1989).

Release date: 2015-12-17

• Articles and reports: 82-003-X201501214295
Description:

Using the Wisconsin Cancer Intervention and Surveillance Monitoring Network breast cancer simulation model adapted to the Canadian context, costs and quality-adjusted life years were evaluated for 11 mammography screening strategies that varied by start/stop age and screening frequency for the general population. Incremental cost-effectiveness ratios are presented, and sensitivity analyses are used to assess the robustness of model conclusions.

Release date: 2015-12-16

• Articles and reports: 82-003-X201501114243
Description:

A surveillance tool was developed to assess dietary intake collected by surveys in relation to Eating Well with Canada’s Food Guide (CFG). The tool classifies foods in the Canadian Nutrient File (CNF) according to how closely they reflect CFG. This article describes the validation exercise conducted to ensure that CNF foods determined to be “in line with CFG” were appropriately classified.

Release date: 2015-11-18

• Articles and reports: 12-001-X201500114199
Description:

In business surveys, it is not unusual to collect economic variables for which the distribution is highly skewed. In this context, winsorization is often used to treat the problem of influential values. This technique requires the determination of a constant that corresponds to the threshold above which large values are reduced. In this paper, we consider a method of determining the constant which involves minimizing the largest estimated conditional bias in the sample. In the context of domain estimation, we also propose a method of ensuring consistency between the domain-level winsorized estimates and the population-level winsorized estimate. The results of two simulation studies suggest that the proposed methods lead to winsorized estimators that have good bias and relative efficiency properties.

Release date: 2015-06-29

• Articles and reports: 12-001-X201500114174
Description:

Matrix sampling, often referred to as split-questionnaire, is a sampling design that involves dividing a questionnaire into subsets of questions, possibly overlapping, and then administering each subset to one or more different random subsamples of an initial sample. This increasingly appealing design addresses concerns related to data collection costs, respondent burden and data quality, but reduces the number of sample units that are asked each question. A broadened concept of matrix design includes the integration of samples from separate surveys for the benefit of streamlined survey operations and consistency of outputs. For matrix survey sampling with overlapping subsets of questions, we propose an efficient estimation method that exploits correlations among items surveyed in the various subsamples in order to improve the precision of the survey estimates. The proposed method, based on the principle of best linear unbiased estimation, generates composite optimal regression estimators of population totals using a suitable calibration scheme for the sampling weights of the full sample. A variant of this calibration scheme, of more general use, produces composite generalized regression estimators that are also computationally very efficient.
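A sketch of the building block behind these composite estimators: a generalized regression (GREG) estimator of a total with a single auxiliary variable. The calibration scheme for matrix sampling itself is not reproduced here, and the function name and figures are hypothetical:

```python
# GREG estimator of a population total: the Horvitz-Thompson total of y is
# adjusted toward the known population total of an auxiliary variable x,
# using a no-intercept weighted least-squares working model.

def greg_total(y, x, w, tx_pop):
    """Design-weighted total of y, calibrated to the known total tx_pop of x."""
    B = sum(wi * xi * yi for yi, xi, wi in zip(y, x, w)) / \
        sum(wi * xi * xi for xi, wi in zip(x, w))
    ht_y = sum(wi * yi for yi, wi in zip(y, w))
    ht_x = sum(wi * xi for xi, wi in zip(x, w))
    return ht_y + (tx_pop - ht_x) * B

# If y is exactly proportional to x (y = 2x), the GREG total recovers the
# exact population total 2 * tx_pop regardless of which units were sampled.
print(greg_total([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [5.0, 5.0, 5.0], 100.0))  # 200.0
```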

Release date: 2015-06-29

• Articles and reports: 12-001-X201500114173
Description:

Nonresponse is present in almost all surveys and can severely bias estimates. A distinction is usually drawn between unit and item nonresponse. Noting that for a particular survey variable we simply have observed and unobserved values, in this work we exploit the connection between unit and item nonresponse. In particular, we assume that the factors that drive unit response are the same as those that drive item response on selected variables of interest. Response probabilities are then estimated using a latent covariate that measures the will to respond to the survey and that can explain part of a unit's otherwise unexplained decision to participate in the survey. This latent covariate is estimated using latent trait models. This approach is particularly relevant for sensitive items and, therefore, can handle non-ignorable nonresponse. Auxiliary information known for both respondents and nonrespondents can be included either in the latent variable model or in the response probability estimation process. The approach can also be used when auxiliary information is not available, and we focus here on this case. We propose an estimator using a reweighting system based on the previous latent covariate when no other observed auxiliary information is available. Simulation studies on both real and simulated data give encouraging results on its performance.
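A crude sketch of the underlying idea: use item-response behaviour as a proxy for a latent "will to respond", then reweight respondents by the inverse of an estimated response probability. The paper estimates the latent trait with proper latent trait models; here a simple item-response rate stands in, purely for illustration, and the function name is ours:

```python
# Inverse-propensity reweighting where each unit's item response rate serves
# as a stand-in for its latent response propensity (illustrative only; not
# the latent trait model of the paper).

def propensity_weights(item_matrix):
    """item_matrix[i][j] = 1 if unit i answered item j, else 0.
    Returns inverse-propensity weights based on each unit's item response rate."""
    weights = []
    for row in item_matrix:
        p_hat = sum(row) / len(row)          # proxy propensity, in [0, 1]
        weights.append(1.0 / p_hat if p_hat > 0 else 0.0)
    return weights

# A reluctant respondent (few items answered) gets a larger weight.
print(propensity_weights([[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 0, 0]]))  # [1.0, 2.0, 4.0]
```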

Release date: 2015-06-29

• Articles and reports: 12-001-X201500114161
Description:

A popular area level model used for the estimation of small area means is the Fay-Herriot model. This model involves unobservable random effects for the areas apart from the (fixed) linear regression based on area level covariates. Empirical best linear unbiased predictors of small area means are obtained by estimating the area random effects, and they can be expressed as a weighted average of area-specific direct estimators and regression-synthetic estimators. In some cases the observed data do not support the inclusion of the area random effects in the model. Excluding these area effects leads to the regression-synthetic estimator, that is, a zero weight is attached to the direct estimator. A preliminary test estimator of a small area mean obtained after testing for the presence of area random effects is studied. On the other hand, empirical best linear unbiased predictors of small area means that always give non-zero weights to the direct estimators in all areas together with alternative estimators based on the preliminary test are also studied. The preliminary testing procedure is also used to define new mean squared error estimators of the point estimators of small area means. Results of a limited simulation study show that, for a small number of areas, the preliminary testing procedure leads to mean squared error estimators with considerably smaller average absolute relative bias than the usual mean squared error estimators, especially when the variance of the area effects is small relative to the sampling variances.
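The weighted-average form described above can be sketched directly. The shrinkage weight is gamma_i = sigma_v^2 / (sigma_v^2 + psi_i) in the usual Fay-Herriot notation; estimation of sigma_v^2 and the preliminary test studied in the paper are not shown, and the function name is ours:

```python
# Fay-Herriot EBLUP of a small area mean: a weighted average of the
# area-specific direct estimator and the regression-synthetic estimator.

def fh_eblup(direct, synthetic, psi, sigma2_v):
    """EBLUP given the direct and synthetic estimates, the sampling
    variance psi of the direct estimator, and the area-effect variance."""
    gamma = sigma2_v / (sigma2_v + psi)
    return gamma * direct + (1.0 - gamma) * synthetic

# sigma2_v = 0 (no area effects) attaches zero weight to the direct
# estimator, giving the regression-synthetic estimator:
print(fh_eblup(direct=12.0, synthetic=10.0, psi=4.0, sigma2_v=0.0))  # 10.0
print(fh_eblup(direct=12.0, synthetic=10.0, psi=4.0, sigma2_v=4.0))  # 11.0
```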

Release date: 2015-06-29

• Articles and reports: 82-003-X201500614196
Description:

This study investigates the feasibility and validity of using personal health insurance numbers to deterministically link the CCR and the Discharge Abstract Database to obtain hospitalization information about people with primary cancers.

Release date: 2015-06-17

• Articles and reports: 12-001-X201400114001
Description:

This article addresses the impact of different sampling procedures on realised sample quality in the case of probability samples. This impact was expected to result from varying degrees of freedom on the part of interviewers to interview easily available or cooperative individuals (thus producing substitutions). The analysis was conducted in a cross-cultural context using data from the first four rounds of the European Social Survey (ESS). Substitutions are measured as deviations from a 50/50 gender ratio in subsamples with heterosexual couples. Significant deviations were found in numerous ESS countries. They were lowest for samples that used official registers of residents as the sampling frame (individual person register samples) if one partner was more difficult to contact than the other. The extent of substitution did not differ across ESS rounds and was only weakly correlated with payment and control procedures. It can be concluded from the results that individual person register samples are associated with higher sample quality.
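The substitution measure described above can be sketched very simply: the deviation of the realised male share from 0.5 in subsamples of heterosexual couples. Details of the ESS analysis are simplified away, and the function name is ours:

```python
# Deviation of the realised gender composition from a 50/50 split, used as a
# proxy for interviewer substitution in couple subsamples.

def gender_deviation(n_male, n_female):
    """Absolute deviation of the male share from 0.5."""
    total = n_male + n_female
    return abs(n_male / total - 0.5)

print(gender_deviation(60, 40))   # 0.1: a 60/40 split deviates by 10 points
print(gender_deviation(50, 50))   # 0.0: a perfectly balanced subsample
```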

Release date: 2014-06-27

• Articles and reports: 12-001-X201400114002
Description:

We propose an approach for multiple imputation of items missing at random in large-scale surveys with exclusively categorical variables that have structural zeros. Our approach is to use mixtures of multinomial distributions as imputation engines, accounting for structural zeros by conceiving of the observed data as a truncated sample from a hypothetical population without structural zeros. This approach has several appealing features: imputations are generated from coherent, Bayesian joint models that automatically capture complex dependencies and readily scale to large numbers of variables. We outline a Gibbs sampling algorithm for implementing the approach, and we illustrate its potential with a repeated sampling study using public use census microdata from the state of New York, U.S.A.

Release date: 2014-06-27

• Articles and reports: 12-001-X201400111886
Description:

A Bayes linear estimator for a finite population is obtained from a two-stage regression model, specified only by the means and variances of some model parameters associated with each stage of the hierarchy. Many common design-based estimators found in the literature can be obtained as particular cases. A new ratio estimator is also proposed for the practical situation in which auxiliary information is available. The same Bayes linear approach is proposed for estimating proportions for multiple categorical data associated with finite population units, which is the main contribution of this work. A numerical example illustrates the approach.

Release date: 2014-06-27

• Articles and reports: 12-001-X201300211885
Description:

Web surveys are generally associated with low response rates. Common suggestions in textbooks on Web survey research highlight the importance of the welcome screen in encouraging respondents to take part. The importance of this screen has been empirically demonstrated, with research showing that most respondents break off at the welcome screen. However, there has been little research on how the design of this screen affects the breakoff rate. In a study conducted at the University of Konstanz, three experimental treatments were added to a survey of the first-year student population (2,629 students) to assess the impact of different design features of this screen on breakoff rates. The experiments varied the background color of the welcome screen, the promised task duration stated on this first screen, and the length of the information explaining the privacy rights of the respondents. The analyses show that the longer the stated task duration and the more extensive the privacy-rights information on the welcome screen, the fewer respondents started and completed the survey. The use of a different background color, however, did not produce the expected significant difference.

Release date: 2014-01-15

• Articles and reports: 12-001-X201300211871
Description:

Regression models are routinely used in the analysis of survey data, where one common issue of interest is to identify influential factors that are associated with certain behavioral, social, or economic indices within a target population. When data are collected through complex surveys, the properties of classical variable selection approaches developed in i.i.d. non-survey settings need to be re-examined. In this paper, we derive a pseudo-likelihood-based BIC criterion for variable selection in the analysis of survey data and suggest a sample-based penalized likelihood approach for its implementation. The sampling weights are appropriately assigned to correct the biased selection result caused by the distortion between the sample and the target population. Under a joint randomization framework, we establish the consistency of the proposed selection procedure. The finite-sample performance of the approach is assessed through analysis and computer simulations based on data from the hypertension component of the 2009 Survey on Living with Chronic Diseases in Canada.
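A sketch of the general form of a pseudo-likelihood BIC for weighted survey data: the log-likelihood contributions are weighted by the sampling weights before the usual complexity penalty is applied. The exact penalty and effective sample size used in the paper may differ, and the function name is ours:

```python
import math

# Pseudo-likelihood BIC: -2 * (weighted log-likelihood) + p * log(n).
# The sampling weights correct for the distortion between the sample and
# the target population.

def pseudo_bic(loglik_terms, weights, n_params):
    """loglik_terms[i] is unit i's log-likelihood contribution;
    weights[i] is its sampling weight; n_params is the model size p."""
    pseudo_loglik = sum(w * l for w, l in zip(weights, loglik_terms))
    n = len(loglik_terms)
    return -2.0 * pseudo_loglik + n_params * math.log(n)

# With no parameters the criterion is just -2 times the weighted log-likelihood.
print(pseudo_bic([-1.0, -2.0], [1.0, 2.0], 0))  # 10.0
```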

Release date: 2014-01-15

• Articles and reports: 82-003-X201301011873
Description:

A computer simulation model of physical activity was developed for the Canadian adult population using longitudinal data from the National Population Health Survey and cross-sectional data from the Canadian Community Health Survey. The model is based on the Population Health Model (POHEM) platform developed by Statistics Canada. This article presents an overview of POHEM and describes the additions that were made to create the physical activity module (POHEM-PA). These additions include changes in physical activity over time, and the relationship between physical activity levels and health-adjusted life expectancy, life expectancy and the onset of selected chronic conditions. Estimates from simulation projections are compared with nationally representative survey data to provide an indication of the validity of POHEM-PA.

Release date: 2013-10-16

• Articles and reports: 12-001-X201300111823
Description:

Although weights are widely used in survey sampling, their ultimate justification from the design perspective is often problematic. Here we will argue for a stepwise Bayes justification for weights that does not depend explicitly on the sampling design. This approach will make use of the standard kind of information present in auxiliary variables; however, it will not assume a model relating the auxiliary variables to the characteristic of interest. The resulting weight for a unit in the sample can be given the usual interpretation as the number of units in the population which it represents.
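The usual design-based interpretation mentioned above, a weight as the number of population units a sample unit represents, is the Horvitz-Thompson one. A minimal sketch (the function name is ours):

```python
# Horvitz-Thompson estimate of a population total: each unit's value is
# scaled by the inverse of its inclusion probability pi, i.e. by the number
# of population units the sampled unit represents.

def ht_total(y, pi):
    return sum(y_i / pi_i for y_i, pi_i in zip(y, pi))

# A unit sampled with probability 0.25 represents 1 / 0.25 = 4 population units.
print(ht_total([3.0, 7.0], [0.25, 0.5]))  # 3/0.25 + 7/0.5 = 26.0
```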

Release date: 2013-06-28

• Articles and reports: 12-001-X201300111827
Description:

SILC (Statistics on Income and Living Conditions) is an annual European survey that measures the population's income distribution, poverty and living conditions. It has been conducted in Switzerland since 2007, based on a four-panel rotation scheme that yields both cross-sectional and longitudinal estimates. This article examines the problem of estimating the variance of the cross-sectional poverty and social exclusion indicators selected by Eurostat. Our calculations take into account the non-linearity of the estimators, total non-response at different survey stages, indirect sampling and calibration. We adapt the method proposed by Lavallée (2002) for estimating variance in cases of non-response after weight sharing, and we obtain a variance estimator that is asymptotically unbiased and very easy to program.

Release date: 2013-06-28

• Articles and reports: 12-001-X201200211756
Description:

We propose a new approach to small area estimation based on joint modelling of means and variances. The proposed model and methodology not only improve small area estimators but also yield "smoothed" estimators of the true sampling variances. Maximum likelihood estimation of model parameters is carried out using EM algorithm due to the non-standard form of the likelihood function. Confidence intervals of small area parameters are derived using a more general decision theory approach, unlike the traditional way based on minimizing the squared error loss. Numerical properties of the proposed method are investigated via simulation studies and compared with other competitive methods in the literature. Theoretical justification for the effective performance of the resulting estimators and confidence intervals is also provided.

Release date: 2012-12-19

• Articles and reports: 82-003-X201200111625
Description:

This study compares estimates of the prevalence of cigarette smoking based on self-report with estimates based on urinary cotinine concentrations. The data are from the 2007 to 2009 Canadian Health Measures Survey, which included self-reported smoking status and the first nationally representative measures of urinary cotinine.

Release date: 2012-02-15

Reference (90)

## Reference (90) (25 of 90 results)

• Technical products: 11-522-X201700014711
Description:

After the 2010 Census, the U.S. Census Bureau conducted two separate research projects matching survey data to databases. One study matched to the third-party database Accurint, and the other matched to U.S. Postal Service National Change of Address (NCOA) files. In both projects, we evaluated response error in reported move dates by comparing the self-reported move date to records in the database. We encountered similar challenges in the two projects. This paper discusses our experience using “big data” as a comparison source for survey data and our lessons learned for future projects similar to the ones we conducted.

Release date: 2016-03-24

• Technical products: 11-522-X201700014745
Description:

In the design of surveys a number of parameters like contact propensities, participation propensities and costs per sample unit play a decisive role. In on-going surveys, these survey design parameters are usually estimated from previous experience and updated gradually with new experience. In new surveys, these parameters are estimated from expert opinion and experience with similar surveys. Although survey institutes have considerable expertise and experience, the postulation, estimation and updating of survey design parameters is rarely done in a systematic way. This paper presents a Bayesian framework to include and update prior knowledge and expert opinion about the parameters. This framework is set in the context of adaptive survey designs in which different population units may receive different treatment given quality and cost objectives. For this type of survey, the accuracy of design parameters becomes even more crucial to effective design decisions. The framework allows for a Bayesian analysis of the performance of a survey during data collection and in between waves of a survey. We demonstrate the Bayesian analysis using a realistic simulation study.
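A minimal sketch of Bayesian updating of one such design parameter: a contact propensity with a Beta prior (encoding expert opinion or a previous wave) updated by Binomial fieldwork outcomes. The paper's framework covers far richer designs; the function name and numbers here are illustrative:

```python
# Conjugate Beta-Binomial updating of a contact propensity:
# Beta(alpha, beta) prior + (contacts successes out of attempts trials)
# -> Beta(alpha + contacts, beta + attempts - contacts) posterior.

def update_contact_propensity(alpha, beta, contacts, attempts):
    """Returns the posterior Beta parameters and the posterior mean."""
    a_post = alpha + contacts
    b_post = beta + (attempts - contacts)
    return a_post, b_post, a_post / (a_post + b_post)

# Prior belief of roughly a 0.6 contact rate (Beta(6, 4)); the first month of
# fieldwork yields 45 contacts in 90 attempts, pulling the estimate down.
print(update_contact_propensity(6, 4, 45, 90))  # (51, 49, 0.51)
```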

Release date: 2016-03-24

• Technical products: 11-522-X201700014710
Description:

The Data Warehouse has modernized the way the Canadian System of Macroeconomic Accounts (MEA) is produced and analyzed today. Its continuing evolution expands the amount and types of analytical work that can be done within the MEA. It brings in the needed elements of harmonization and confrontation as the macroeconomic accounts move toward full integration. The improvements in quality, transparency and timeliness have strengthened the statistics that are disseminated.

Release date: 2016-03-24

• Technical products: 11-522-X201700014738
Description:

In the standard design approach to missing observations, the construction of weight classes and calibration are used to adjust the design weights for the respondents in the sample. Here we use these adjusted weights to define a Dirichlet distribution which can be used to make inferences about the population. Examples show that the resulting procedures have better performance properties than the standard methods when the population is skewed.
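A sketch of using adjusted weights to drive Dirichlet-weighted inference, in the spirit of the Bayesian bootstrap; the exact construction in the paper may differ, and the function name is ours. Each draw re-weights respondents by Dirichlet variates proportional to their adjusted weights:

```python
import random

# Dirichlet-weighted posterior draws of a population mean: Gamma(w_i, 1)
# variates normalize to a Dirichlet(w) vector, which re-weights the
# respondents' values in each draw.

def dirichlet_mean_draws(y, w, n_draws=1000, seed=1):
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        g = [rng.gammavariate(w_i, 1.0) for w_i in w]
        total = sum(g)
        draws.append(sum(g_i / total * y_i for g_i, y_i in zip(g, y)))
    return draws

# The spread of the draws can then be used for interval inference.
draws = dirichlet_mean_draws([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
print(min(draws) >= 1.0 and max(draws) <= 3.0)  # True: draws stay in the data range
```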

Release date: 2016-03-24

• Technical products: 11-522-X201700014754
Description:

Background: There is increasing interest in measuring and benchmarking health system performance. We compared Canada’s health system with those of other countries in the Organisation for Economic Co-operation and Development (OECD) at both the national and provincial levels, across 50 indicators of health system performance. This analysis can help provinces identify potential areas for improvement, considering an optimal comparator for international comparisons. Methods: OECD Health Data from 2013 was used to compare Canada’s results internationally. We also calculated provincial results for the OECD’s indicators on health system performance, using OECD methodology. We normalized the indicator results to present multiple indicators on the same scale and compared them to the OECD average and the 25th and 75th percentiles. Results: Presenting normalized values allows Canada’s results to be compared across multiple OECD indicators on the same scale. No country or province consistently has higher results than the others. For most indicators, Canadian results are similar to those of other countries, but there remain areas where Canada performs particularly well (e.g. smoking rates) or poorly (e.g. patient safety). These data were presented in an interactive eTool. Conclusion: Comparing Canada’s provinces internationally can highlight areas where improvement is needed and help to identify potential strategies for improvement.
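One standard way to put indicators measured in different units on a common scale, as the normalization step above requires, is a z-score against the peer distribution; the report's exact normalization may differ, and the function name is ours:

```python
# Z-score normalization of an indicator value against a set of peer values,
# so that indicators in different units can share one scale.

def normalize(value, peer_values):
    """(value - peer mean) / peer standard deviation (population SD)."""
    n = len(peer_values)
    mean = sum(peer_values) / n
    var = sum((v - mean) ** 2 for v in peer_values) / n
    return (value - mean) / var ** 0.5

# A value at the peer mean normalizes to 0; values above it are positive.
print(normalize(80.0, [70.0, 75.0, 80.0, 85.0, 90.0]))  # 0.0
```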

Release date: 2016-03-24

• Technical products: 11-522-X201700014732
Description:

The Institute for Employment Research (IAB) is the research unit of the German Federal Employment Agency. Via the Research Data Centre (FDZ) at the IAB, administrative and survey data on individuals and establishments are provided to researchers. In cooperation with the Institute for the Study of Labor (IZA), the FDZ has implemented the Job Submission Application (JoSuA) environment which enables researchers to submit jobs for remote data execution through a custom-built web interface. Moreover, two types of user-generated output files may be distinguished within the JoSuA environment, which allows for faster and more efficient disclosure review services.

Release date: 2016-03-24

• Technical products: 11-522-X201300014259
Description:

In an effort to reduce response burden on farm operators, Statistics Canada is studying alternative approaches to telephone surveys for producing field crop estimates. One option is to publish harvested area and yield estimates in September as is currently done, but to calculate them using models based on satellite and weather data, and data from the July telephone survey. However before adopting such an approach, a method must be found which produces estimates with a sufficient level of accuracy. Research is taking place to investigate different possibilities. Initial research results and issues to consider are discussed in this paper.

Release date: 2014-10-31

• Technical products: 11-522-X201300014290
Description:

This paper describes a new module that will project families and households by Aboriginal status using the Demosim microsimulation model. The methodology being considered would assign a household/family headship status annually to each individual and would use the headship rate method to calculate the annual number of families and households by various characteristics and geographies associated with Aboriginal populations.

Release date: 2014-10-31

• Technical products: 11-522-X201300014274
Description:

What is big data? Can it replace and/or supplement official surveys? What are some of the challenges associated with utilizing big data for official statistics? What are some of the possible solutions? Last fall, Statistics Canada invested in a Big Data Pilot project to answer some of these questions. This was the first business survey project of its kind. This paper will cover some of the lessons learned from the Big Data Pilot Project using Smart Meter Data.

Release date: 2014-10-31

• Technical products: 11-522-X201300014279
Description:

As part of the European SustainCity project, a microsimulation model of individuals and households was created to simulate the population of various European cities. The aim of the project was to combine several transportation and land-use microsimulation models (land-use modelling), add on a dynamic population module and apply these microsimulation approaches to three geographic areas of Europe (the Île-de-France region and the Brussels and Zurich agglomerations).

Release date: 2014-10-31

• Technical products: 11-522-X201300014277
Description:

This article gives an overview of adaptive design elements introduced to the PASS panel survey in waves four to seven. The main focus is on experimental interventions in later phases of the fieldwork. These interventions aim at balancing the sample by prioritizing low-propensity sample members. In wave 7, interviewers received a double premium for completion of interviews with low-propensity cases in the final phase of the fieldwork. This premium was restricted to a random half of the cases with low estimated response propensity and no final status after four months of prior fieldwork. The incentive was effective in increasing interviewer effort but led to no significant increase in response rates.

Release date: 2014-10-31

• Technical products: 11-522-X200800011000
Description:

The present report reviews the results of a mailing experiment that took place within a large scale demonstration project. A postcard and stickers were sent to a random group of project participants in the period between a contact call and a survey. The researchers hypothesized that, because of the additional mailing (the treatment), the response rates to the upcoming survey would increase. There was, however, no difference between the response rates of the treatment group that received the additional mailing and the control group. In the specific circumstances of the mailing experiment, sending project participants a postcard and stickers as a reminder of the upcoming survey and of their participation in the pilot project was not an efficient way to increase response rates.

Release date: 2009-12-03

• Technical products: 11-522-X200800010983
Description:

The US Census Bureau conducts monthly, quarterly, and annual surveys of the American economy and a census every 5 years. These programs require significant business effort. New technologies, new forms of organization, and scarce resources affect the ability of businesses to respond. Changes also affect what businesses expect from the Census Bureau, the Census Bureau's internal systems, and the way businesses interact with the Census Bureau.

For several years, the Census Bureau has maintained a special relationship with large companies to help them prepare for the census. We have also worked toward company-centric communication across all programs. A relationship model has emerged that focuses on infrastructure and business practices, and allows the Census Bureau to be more responsive.

This paper focuses on the Census Bureau's company-centric communications and systems. We describe important initiatives and challenges, and we review their impact on Census Bureau practices and respondent behavior.

Release date: 2009-12-03

• Technical products: 11-522-X200800011009
Description:

The National Routing System is a multi-jurisdictional effort to improve the collection and validation of birth and death information from provincial vital event registries. Instead of having to wait for batch files to be sent at various points during the year, provinces send individual records as an event is registered. Timeliness is further enhanced by the adoption of data and technical standards. Data users no longer have to deal with multiple data formats and transfer media when compiling data from multiple sources. Similarly, data providers need to transform their data only once in order to satisfy multiple clients.

Release date: 2009-12-03

• Technical products: 11-522-X200800011003
Description:

This study examined the feasibility of developing correction factors to adjust self-reported measures of Body Mass Index to more closely approximate measured values. Data are from the 2005 Canadian Community Health Survey where respondents were asked to report their height and weight and were subsequently measured. Regression analyses were used to determine which socio-demographic and health characteristics were associated with the discrepancies between reported and measured values. The sample was then split into two groups. In the first, the self-reported BMI and the predictors of the discrepancies were regressed on the measured BMI. Correction equations were generated using all predictor variables that were significant at the p<0.05 level. These correction equations were then tested in the second group to derive estimates of sensitivity, specificity and of obesity prevalence. Logistic regression was used to examine the relationship between measured, reported and corrected BMI and obesity-related health conditions. Corrected estimates provided more accurate measures of obesity prevalence, mean BMI and sensitivity levels. Self-reported data exaggerated the relationship between BMI and health conditions, while in most cases the corrected estimates provided odds ratios that were more similar to those generated with the measured BMI.
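A minimal sketch of the correction-equation idea: regress measured BMI on self-reported BMI in a calibration sample, then apply the fitted equation to new self-reports. The study also included socio-demographic predictors and a split-sample validation, which are omitted here; the function name and all values are made up:

```python
# Simple least-squares correction line: measured ~ a + b * self-reported.
# The fitted (a, b) are then applied to new self-reports to "correct" them
# toward the measured scale.

def fit_correction(self_rep, measured):
    n = len(self_rep)
    mx = sum(self_rep) / n
    my = sum(measured) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(self_rep, measured))
    sxx = sum((x - mx) ** 2 for x in self_rep)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Fit on a (made-up) calibration sample, then correct a new self-report of 24.0.
a, b = fit_correction([22.0, 25.0, 28.0], [23.0, 26.5, 30.0])
print(round(a + b * 24.0, 2))
```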

Release date: 2009-12-03

• Technical products: 11-522-X200800010958
Description:

Telephone Data Entry (TDE) is a system by which survey respondents can return their data to the Office for National Statistics (ONS) using the keypad on their telephone; it currently accounts for approximately 12% of total responses to ONS business surveys. ONS is currently increasing the number of surveys which use TDE as the primary mode of response, and this paper gives an overview of the redevelopment project, covering the redevelopment of the paper questionnaire, enhancements made to the TDE system, and the results from piloting these changes. Improvements in the quality of the data received and increased response via TDE as a result of these developments suggest that further data quality improvements and cost savings are possible by promoting TDE as the primary mode of response to short-term surveys.

Release date: 2009-12-03

• Technical products: 11-522-X200800010920
Description:

On behalf of Statistics Canada, I would like to welcome you all, friends and colleagues, to Symposium 2008. This is the 24th International Symposium on survey methodology organized by Statistics Canada.

Release date: 2009-12-03

• Technical products: 11-522-X200800010974
Description:

This paper will focus on establishment survey questionnaire design guidelines. More specifically, it will discuss the process involved in transitioning a set of guidelines written for a broad, survey methodological audience to a more narrow, agency-specific audience of survey managers and analysts. The process involved the work of a team comprised of individuals from across the Census Bureau's Economic Directorate, working in a cooperative and collaborative manner. The team decided what needed to be added, modified, and deleted from the broad starting point, and determined how much of the theory and experimental evidence found in the literature was necessary to include in the guidelines. In addition to discussing the process, the paper will also describe the end result: a set of questionnaire design guidelines for the Economic Directorate.

Release date: 2009-12-03

• Technical products: 11-522-X200800010956
Description:

The use of Computer Audio-Recorded Interviewing (CARI) as a tool to identify interview falsification is quickly growing in survey research (Biemer, 2000, 2003; Thissen, 2007). Similarly, survey researchers are starting to expand the usefulness of CARI by combining recordings with coding to address data quality (Herget, 2001; Hansen, 2005; McGee, 2007). This paper presents results from a study, included as part of the establishment-based National Center for Health Statistics' National Home and Hospice Care Survey (NHHCS), which used CARI behavior coding and CARI-specific paradata to: 1) identify and correct problematic interviewer behavior or question issues early in the data collection period, before either negatively impacts data quality; and 2) identify ways to diminish measurement error in future implementations of the NHHCS. During the first 9 weeks of the 30-week field period, CARI recorded a subset of questions from the NHHCS application for all interviewers. Recordings were linked with the interview application and output and then coded in one of two modes: Code by Interviewer or Code by Question. The Code by Interviewer method provided visibility into problems specific to an interviewer as well as more generalized problems potentially applicable to all interviewers. The Code by Question method yielded data that spoke to the understandability of the questions and other response problems. In this mode, coders coded multiple implementations of the same question across multiple interviewers. Using the Code by Question approach, researchers identified issues with three key survey questions in the first few weeks of data collection and provided guidance to interviewers on how to handle those questions as data collection continued.
Results from coding the audio recordings (which were linked with the survey application and output) will inform question wording and interviewer training in the next implementation of the NHHCS, and guide future enhancement of CARI and the coding system.

Release date: 2009-12-03

• Technical products: 11-522-X200800010946
Description:

In the mid-1990s the first question testing unit was set up in the UK Office for National Statistics (ONS). The key objective of the unit was to develop and test the questions and questionnaire for the 2001 Census. Since the establishment of this unit the area has been expanded into a Data Collection Methodology (DCM) Centre of Expertise which now sits in the Methodology Directorate. The DCM centre has three branches which support DCM work for social surveys, business surveys, the Census and external organisations.

In the past ten years DCM has achieved a variety of things. For example, it has introduced survey methodology involvement in the development and testing of business survey question(naire)s; introduced a mixed-method approach to the development of questions and questionnaires; developed and implemented standards, e.g. for the 2011 Census questionnaire and showcards; and developed and delivered DCM training events.

This paper will provide an overview of data collection methodology at the ONS from the perspective of achievements and challenges. It will cover areas such as methods, staff (e.g. recruitment, development and field security), and integration with the survey process.

Release date: 2009-12-03

• Technical products: 11-522-X200800010968
Description:

Statistics Canada has embarked on a program of increasing and improving the use of imaging technology for paper survey questionnaires. The goal is to make the process an efficient, reliable and cost-effective method of capturing survey data. The objective is to continue using Optical Character Recognition (OCR) to capture the data from questionnaires, documents and faxes received, while improving process integration and the quality assurance and quality control (QA/QC) of the data capture process. These improvements are discussed in this paper.
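The abstract does not describe the QA/QC mechanism itself. A minimal sketch of one common pattern for OCR quality control, routing low-confidence fields to manual key-from-image verification, might look like the following; the threshold value, field structure and function names are illustrative assumptions, not details from the paper.

```python
# Hypothetical QC routing for OCR-captured survey fields: fields whose
# recognition confidence falls below a threshold are flagged for
# manual key-from-image verification instead of being accepted as-is.

CONFIDENCE_THRESHOLD = 0.90  # assumed cut-off, not from the paper


def route_fields(fields):
    """Split OCR output into accepted fields and fields needing manual review.

    `fields` is a list of dicts: {"name": str, "value": str, "confidence": float}.
    """
    accepted, needs_review = [], []
    for field in fields:
        if field["confidence"] >= CONFIDENCE_THRESHOLD:
            accepted.append(field)
        else:
            needs_review.append(field)
    return accepted, needs_review


# Example OCR output for one questionnaire page.
ocr_output = [
    {"name": "total_revenue", "value": "125000", "confidence": 0.98},
    {"name": "employee_count", "value": "4?", "confidence": 0.55},
]
accepted, needs_review = route_fields(ocr_output)
```

In a production capture system the review queue would feed an operator screen showing the scanned image region alongside the OCR guess; the sketch only shows the routing decision.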

Release date: 2009-12-03

• Technical products: 11-522-X200800010970
Description:

RTI International is currently conducting a longitudinal education study. One component of the study involved collecting transcripts and course catalogs from the high schools that sample members attended; information from these documents then needed to be keyed and coded. This presented a challenge because the transcripts and course catalogs were collected from different types of schools (public, private and religious) from across the nation, and they varied widely in both content and format. The challenge called for a sophisticated system that could be used by multiple users simultaneously. RTI developed such a system: a web-based, multi-user, user-friendly and low-maintenance transcript and course catalog keying and coding system. It has three major functions: transcript and catalog keying and coding, keying quality control (keyer-coder end), and coding quality control (management end). Given the complex nature of the work, the system was designed to be flexible: it can transport keyed and coded data throughout the system to reduce keying time, logically guide users through all the pages a given activity requires, display appropriate information to support keying performance, and track all keying, coding and QC activities. Hundreds of catalogs and thousands of transcripts were successfully keyed, coded and verified using the system. This paper reports on the system's needs and design, implementation tips, problems faced and their solutions, and lessons learned.

Release date: 2009-12-03

• Technical products: 11-522-X200800010978
Description:

Census developers and social researchers are at a critical juncture in determining the collection modes of the future. Internet data collection is technically feasible, but the initial investment in hardware and software is costly. Given the great divide in computer knowledge and access, internet data collection is viable for some, but not for all. Therefore, the internet cannot fully replace the existing paper questionnaire, at least not in the near future.

Canada, Australia and New Zealand are pioneers in internet data collection as an option for completing the census. This paper studies four driving forces behind this collection mode: 1) responding to social/public expectations; 2) longer term economic benefits; 3) improved data quality; and 4) improved coverage.

Issues currently being faced are: 1) estimating internet uptake and maximizing benefits without undue risk; 2) designing a questionnaire for multiple modes; 3) producing multiple public communication approaches; and 4) gaining positive public reaction and trust in using the internet.

This paper summarizes the countries' collective thinking and experiences on the benefits and limitations of internet data collection for a census of population and dwellings. It also outlines where countries are heading with internet data collection in the future.

Release date: 2009-12-03

• Technical products: 11-522-X200800010990
Description:

The purpose of the Quebec Health and Social Services User Satisfaction Survey was to provide estimates of user satisfaction for three types of health care institutions (hospitals, medical clinics and CLSCs). Since a user could have visited one, two or all three types, and since the questionnaire could cover only one type, a procedure was established to select the type of institution at random. The selection procedure, which required variable selection probabilities, was unusual in that it was adjusted during the collection process to adapt increasingly to regional disparities in the use of health and social services.
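The abstract does not give the selection weights or the adjustment rule. A minimal sketch of the core idea, randomly assigning each respondent one institution type among those actually visited, with unequal probabilities that can be updated during collection, might look like this (the weight values and function names are placeholders, not survey specifications):

```python
# Illustrative unequal-probability selection of one institution type
# per respondent. A respondent is only eligible for types they visited;
# the weights are adjustable, mirroring the survey's mid-collection
# adaptation to regional disparities in service use.
import random


def select_institution(visited, probs, rng=random):
    """Pick one institution type among those the respondent visited,
    with probability proportional to the current weights."""
    eligible = [t for t in probs if t in visited]
    weights = [probs[t] for t in eligible]
    return rng.choices(eligible, weights=weights, k=1)[0]


# Placeholder weights; in the survey these were adjusted during
# collection, not fixed in advance.
probs = {"hospital": 0.5, "clinic": 0.3, "CLSC": 0.2}
choice = select_institution({"hospital", "CLSC"}, probs, random.Random(1))
```

Because a respondent who visited only one type must receive that type with certainty, restricting the draw to the eligible set (rather than redrawing on a miss) keeps the procedure well defined for every visit pattern.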

Release date: 2009-12-03

• Technical products: 11-522-X200800010994
Description:

The growing difficulty of reaching respondents has a general impact on non-response in telephone surveys, especially those that use random digit dialling (RDD), such as the General Social Survey (GSS). The GSS is an annual multipurpose survey with 25,000 respondents. Its aim is to monitor the characteristics of and major changes in Canada's social structure. GSS Cycle 21 (2007) was about the family, social support and retirement. Its target population consisted of persons aged 45 and over living in the 10 Canadian provinces. For more effective coverage, part of the sample was taken from a follow-up with the respondents of GSS Cycle 20 (2006), which was on family transitions. The remainder was a new RDD sample. In this paper, we describe the survey's sampling plan and the random digit dialling method used. Then we discuss the challenges of calculating the non-response rate in an RDD survey that targets a subset of a population, for which the in-scope population must be estimated or modelled. This is done primarily through the use of paradata. The methodology used in GSS Cycle 21 is presented in detail.
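The GSS formula itself is not given in the abstract. A hedged sketch of the general approach, applying an estimated in-scope proportion to telephone numbers whose eligibility was never resolved, might look like the following; the function, variable names and example figures are illustrative, not the survey's actual counts.

```python
# Sketch of a response-rate calculation when the eligibility of
# unresolved telephone numbers must be estimated. An estimated
# in-scope rate `e` (which paradata can help model) discounts the
# unknown-eligibility cases in the denominator.


def response_rate(completes, eligible_nonresponse, unknown, e):
    """Estimated response rate for an RDD survey targeting a subset
    of the population.

    completes            -- completed interviews with in-scope persons
    eligible_nonresponse -- known in-scope cases that did not respond
    unknown              -- numbers whose eligibility was never resolved
    e                    -- estimated proportion of unknowns that are in scope
    """
    denominator = completes + eligible_nonresponse + e * unknown
    return completes / denominator


# Illustrative figures only (the abstract reports 25,000 respondents;
# the other counts and e are invented for the example).
rate = response_rate(completes=25000, eligible_nonresponse=10000,
                     unknown=20000, e=0.4)
# 25000 / (25000 + 10000 + 0.4 * 20000) = 25000 / 43000
```

The difficulty the paper addresses is precisely that `e` is unobserved when the target is a population subset (here, persons 45 and over), so it must be estimated or modelled, primarily from paradata.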

Release date: 2009-12-03
